⭐ TERA V2

A language model built from scratch — no pretrained weights, no transformers.

Architecture: Time Mix + Token Shift + GroupNorm + Channel Mix + Squared ReLU
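
The line above names the components but not how they are wired together. As a rough sketch, assuming a layout similar to RWKV-style blocks (an assumption, since TERA V2's actual code is not shown here), token shift and a channel mix with squared ReLU might look like the NumPy code below. Every function name, shape, and weight is hypothetical and randomly initialized for illustration. Time Mix and GroupNorm are left out to keep the example short.

```python
import numpy as np

def token_shift(x, mix):
    """Blend each token with the previous one. x: (T, C), mix: (C,) in [0, 1]."""
    # Shift the sequence right by one step; the first token sees zeros.
    prev = np.vstack([np.zeros((1, x.shape[1])), x[:-1]])
    return x * mix + prev * (1.0 - mix)

def channel_mix(x, mix_k, mix_r, w_k, w_v, w_r):
    """Hypothetical RWKV-style channel mix with a squared-ReLU hidden layer."""
    k = token_shift(x, mix_k) @ w_k        # (T, H) hidden pre-activation
    r = token_shift(x, mix_r) @ w_r        # (T, C) gate pre-activation
    h = np.maximum(k, 0.0) ** 2            # squared ReLU
    gate = 1.0 / (1.0 + np.exp(-r))        # sigmoid gate
    return gate * (h @ w_v)                # (T, C)

# Illustrative sizes only: 4 tokens, 8 channels, 32 hidden units.
rng = np.random.default_rng(0)
T, C, H = 4, 8, 32
x = rng.standard_normal((T, C))
mix_k = rng.uniform(size=C)
mix_r = rng.uniform(size=C)
w_k = rng.standard_normal((C, H)) * 0.1
w_v = rng.standard_normal((H, C)) * 0.1
w_r = rng.standard_normal((C, C)) * 0.1

out = channel_mix(x, mix_k, mix_r, w_k, w_v, w_r)
print(out.shape)
```

Because token shift only looks one step back, each output row depends on the current and previous token alone, which keeps the layer causal without any attention.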
