Demo
Model: 50M parameters
Dataset: SimpleStories
Training: 5M tokens
Device: U200 FPGA
Clock: 125 MHz
Proof of concept: the model will make mistakes.