Veyra AI
AI & ML interests
Building tiny English language models for practical local AI. Veyra AI focuses on CPU-friendly inference, function calling, tool use, Python-oriented small models, distillation, RLVR, and lightweight fine-tuning. The goal is to make compact models that are easy to run, inspect, adapt, and use in real workflows without large hardware.
Welcome to Veyra AI
Tiny English language models built for fast local inference. Veyra AI focuses on compact, CPU-friendly language models that are easy to run, fine-tune, and experiment with. Our work is centered on small English models, function calling, Python-oriented variants, distillation, RLVR, tool use, and local AI. The goal is simple: make capable small models that are practical for local workflows, research, and lightweight deployment.
Current Model Families:
- Veyra2 30M — Next-generation 30M-parameter base model optimized for low-latency, CPU-friendly inference and on-device deployment.
- Veyra2 15M — Ultra-lightweight 15M-parameter model for highly resource-constrained environments such as edge devices, where ultra-fast inference matters most.
- Veyra 30M (Legacy) — Proven 30M base model with strong instruction-following and balanced general capabilities. Still reliable for many use cases.
Planned Model Families:
- Veyra2 80M — Fast, efficient 80M-parameter model with significantly improved reasoning, coherence, and instruction adherence, while keeping speed high and resource usage low.
- Veyra SmolLM2 135M — Compact yet highly capable 135M model. A custom instruction-tuned version of SmolLM2 135M, offering strong performance in a small footprint.
- Kairo 30M — Experimental architecture for validating next-generation design choices across the entire Veyra lineup. Initial Kairo models are already available.
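Since these models are meant to be easy to run locally, here is a minimal sketch of CPU inference with the `transformers` library. It assumes the base repos are standard causal-LM checkpoints (the "Text Generation" tag suggests this, but it is not verified here; check the model cards):

```python
# Minimal local-inference sketch for a Veyra2 base checkpoint.
# Assumption (not verified): the repo is a standard transformers
# causal-LM checkpoint loadable via the Auto* classes.

def generate_local(prompt: str,
                   repo: str = "veyra-ai/veyra2-30m-base-2b-tokens",
                   max_new_tokens: int = 32) -> str:
    """Download the checkpoint once, then run greedy generation on CPU."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # pip install transformers torch
    tok = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo)  # ~35M params: fine on CPU
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tok.decode(out[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate_local("The quick brown fox"))
```

The model and tokenizer are loaded inside the function so importing this module stays cheap; at ~35M parameters the download and CPU generation should take seconds, not minutes.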
Models:
- veyra-ai/veyra2-30m-base-2b-tokens — Text Generation • 34.6M • Updated • 202
- veyra-ai/veyra2-30m-base-2b-tokens-gguf — Text Generation • 34.6M • Updated • 231
- veyra-ai/veyra2-30m-base-2b-tokens-onnx-int8 — Text Generation • Updated • 48
- veyra-ai/veyra2-15m-base-1b-tokens — Text Generation • 14.7M • Updated • 80
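The GGUF variant targets fully local llama.cpp-based inference. A hedged sketch using `llama-cpp-python` follows; the glob filename is an assumption about the repo's file layout (narrow it to a specific quantization if several files match):

```python
# Sketch for running the GGUF variant on CPU with llama-cpp-python.
# Assumptions (not verified): the -gguf repo hosts llama.cpp-compatible
# files, and exactly one filename matches the "*.gguf" glob below.

def generate_gguf(prompt: str,
                  repo: str = "veyra-ai/veyra2-30m-base-2b-tokens-gguf",
                  max_tokens: int = 32) -> str:
    """Fetch a GGUF file from the Hub and run local inference via llama.cpp."""
    from llama_cpp import Llama  # pip install llama-cpp-python
    llm = Llama.from_pretrained(repo_id=repo, filename="*.gguf")  # glob selects a file
    out = llm(prompt, max_tokens=max_tokens)
    return out["choices"][0]["text"]

if __name__ == "__main__":
    print(generate_gguf("Tiny models are"))
```

Compared with the transformers path, the GGUF route avoids a PyTorch dependency entirely, which suits the CPU-first, lightweight-deployment goal described above.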
