Efficient Drop-In Replacement for the Classification Head in Language Model Inference. https://github.com/embedl/flash-head
Organization Card
Embedl
Efficient AI for the edge.
Embedl develops advanced tools and algorithms for Edge AI. Our mission is to make AI models run faster, more energy-efficiently, and more reliably across diverse hardware platforms, while significantly reducing development time.
We help teams deploy high-performance AI on real-world, resource-constrained devices.
Models (24)
embedl/all-MiniLM-L6-v2-quantized-trt
Sentence Similarity • Updated
embedl/mobilevit-small-quantized
Image Classification • Updated • 22
embedl/gemma-3-270m-it-FlashHead
0.3B • Updated • 216 • 3
embedl/gemma-3-1b-it-FlashHead-W4A16
1B • Updated • 148 • 3
embedl/gemma-3-1b-it-FlashHead
1.0B • Updated • 145 • 3
embedl/Qwen3.5-9B-FlashHead
Image-Text-to-Text • 10B • Updated • 445
embedl/Qwen3.5-4B-FlashHead
Image-Text-to-Text • 5B • Updated • 457
embedl/Qwen3.5-2B-FlashHead
Image-Text-to-Text • 2B • Updated • 472
embedl/Qwen3.5-27B-FlashHead
Image-Text-to-Text • 28B • Updated • 286
embedl/Qwen3.5-0.8B-FlashHead
Image-Text-to-Text • 0.9B • Updated • 430 • 1