appvoid 
posted an update 1 day ago
Let's keep the momentum going for small models. I just published dot: the first pretrained causal model trained on math/symbols rather than English. The goal is an agnostic few-shot meta-learner that learns from reality itself instead of from language.

It's already decent at some tasks, with the next version coming in a few weeks.


appvoid/dot

So my understanding is that the data used to train this model from scratch is not an English corpus, nor ordinary text, so its tokenizer must also differ from a traditional one. I'm curious how that part is handled and how the model itself understands things. Does it still work the same way as a traditional model, i.e. on a one-dimensional token sequence?


Correct! It's causal modeling (for now) with a char-level tokenizer that has only 8 tokens.

The model learns by looking for relationships across sequences, one token at a time, so the only way it learns is literally by nudging its weights toward a generalized solution using pure sequences.

In short, it learns to learn.
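As a rough sketch of the setup described above: a char-level tokenizer with an 8-symbol vocabulary maps each character to an id, producing the one-dimensional token sequence a causal model trains on. The actual symbol set dot uses isn't stated in this thread, so the vocabulary below is a hypothetical example.

```python
# Minimal char-level tokenizer sketch. The 8 symbols here are a
# hypothetical choice for illustration; dot's real vocabulary is
# not specified in the post.
VOCAB = list("01+-*/= ")                      # exactly 8 tokens
STOI = {ch: i for i, ch in enumerate(VOCAB)}  # char -> token id
ITOS = {i: ch for ch, i in STOI.items()}      # token id -> char

def encode(text: str) -> list[int]:
    """Map each character to its token id (a 1-D token sequence)."""
    return [STOI[ch] for ch in text]

def decode(ids: list[int]) -> str:
    """Invert encode: map token ids back to characters."""
    return "".join(ITOS[i] for i in ids)

seq = encode("1+1=10")
# A causal model would be trained to predict seq[t] from seq[:t],
# nudging weights from the raw symbol sequences alone.
```

With so few tokens there is essentially no linguistic prior baked into the vocabulary; whatever structure the model picks up has to come from the sequences themselves.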
