Day 0 - Setup: GPT from Scratch with MLX

Reference Material:

Livestreams

Day 0: Setting up the Python environment and going over some basics to prepare for Day 1.
Stream on X

Setup Python Environments with MLX

  1. Install Conda (Miniconda): https://docs.conda.io/en/latest/miniconda.html
  1. Choose your IDE. After going through this stream, I want to use PyCharm from JetBrains unless I can figure out how to configure this with Cursor.sh.
  1. Make a project directory at ~/workspace/GPTfromScratch.
  1. From the terminal window in the IDE, create the environment: conda create -n GPTfromScratch python=3.10
  1. Activate the environment: conda activate GPTfromScratch
  1. Install MLX: conda install -c conda-forge mlx
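
Once the steps above are done, a quick sanity check can confirm that the activated environment is the one you expect. The snippet below is a minimal sketch, not part of the stream: the function name `check_environment` is hypothetical, and it only looks for an importable package named `mlx` without loading it, so it also runs (and reports `False`) on a machine where MLX is not installed.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 10)):
    """Report the active Python version and whether MLX can be found.

    `min_version` is an assumption taken from the `python=3.10` used
    when creating the conda environment.
    """
    return {
        # Full interpreter version, e.g. (3, 10, 13)
        "python_version": tuple(sys.version_info[:3]),
        # True when the interpreter is at least `min_version`
        "python_ok": tuple(sys.version_info[:2]) >= min_version,
        # find_spec returns None when no top-level package named "mlx" exists
        "mlx_installed": importlib.util.find_spec("mlx") is not None,
    }


report = check_environment()
print(report)
```

Run it from the IDE terminal after `conda activate GPTfromScratch`. If `mlx_installed` comes back `False`, the terminal is probably still using a different environment.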
 

Preparing Data - LLM Tokenization Explained

Breaking down the following statement for a 5th grader:
The first step to training an LLM is collecting a large corpus of text data and then tokenizing it. Tokenization is the process of mapping text to integers, which can be fed into the LLM.
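
To make "mapping text to integers" concrete, here is a minimal sketch of a character-level tokenizer in plain Python. This is an illustration only: the helper names (`build_vocab`, `encode`, `decode`) and the tiny corpus are made up for this example, and real LLM tokenizers usually work on sub-word pieces rather than single characters.

```python
def build_vocab(text):
    """Assign one integer to every distinct character in the text."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer
    itos = {i: ch for ch, i in stoi.items()}      # integer -> character
    return stoi, itos


def encode(text, stoi):
    """Turn a string into a list of integers."""
    return [stoi[ch] for ch in text]


def decode(ids, itos):
    """Turn a list of integers back into a string."""
    return "".join(itos[i] for i in ids)


corpus = "hello world"  # stand-in for a large text corpus
stoi, itos = build_vocab(corpus)

ids = encode("hello", stoi)
print(ids)                # [3, 2, 4, 4, 5]
print(decode(ids, itos))  # hello
```

Think of it as a secret code: every letter gets its own number, the model only ever sees the numbers, and `decode` turns the numbers back into letters.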

To Do

Figure out how to manage environments using https://mise.jdx.dev

Next up: Tackling data setup in depth

