⭐ TERA V2

A language model built from scratch — no pretrained weights, no transformers.

Architecture: Time Mix + Token Shift + GroupNorm + Channel Mix + Squared ReLU
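
The component names above match an RWKV-style design, but the page gives no formulas, so the following is only a minimal sketch of how Token Shift, Squared ReLU, and Channel Mix might fit together. The function names, the zero vector used before the first token, and the sigmoid receptance gate are all assumptions for illustration, not TERA V2's actual code. Time Mix and GroupNorm are left out.

```python
import math

def token_shift(x, mu):
    """Blend each token with the previous one, per channel.

    x:  list of T token vectors, each a list of C floats
    mu: list of C mix weights in [0, 1] (1.0 keeps only the current token)
    """
    prev = [0.0] * len(mu)  # assumption: zeros stand in for "before the first token"
    out = []
    for cur in x:
        out.append([m * c + (1.0 - m) * p for m, c, p in zip(mu, cur, prev)])
        prev = cur
    return out

def squared_relu(v):
    """Elementwise max(a, 0) ** 2."""
    return [max(a, 0.0) ** 2 for a in v]

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def channel_mix(x, mu_k, mu_r, W_k, W_v, W_r):
    """RWKV-style channel mix (assumed form): gate * W_v(relu(W_k xk) ** 2)."""
    xk = token_shift(x, mu_k)  # shifted input for the key path
    xr = token_shift(x, mu_r)  # shifted input for the gate path
    out = []
    for k_in, r_in in zip(xk, xr):
        hidden = squared_relu(matvec(W_k, k_in))
        value = matvec(W_v, hidden)
        gate = [sigmoid(a) for a in matvec(W_r, r_in)]
        out.append([g * v for g, v in zip(gate, value)])
    return out
```

Because the shift only looks one token back, each position depends on the current and previous token and no attention matrix is needed. That is consistent with the "no transformers" claim, assuming it refers to the architecture.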

TERA V2 by Vedaco • ~929K parameters • Trained from scratch