https://jaykmody.com/blog/gpt-from-scratch/
In this post, we'll implement a GPT from scratch in just 60 lines of numpy. We'll then load the trained GPT-2 model weights released by OpenAI into our implementation and generate some text.
Note:
EDIT (Feb 9th, 2023): Added a "What's Next" section and updated the intro with some notes.
EDIT (Feb 28th, 2023): Added some additional sections to "What's Next".