Jonna Matthiesen

JonnaMat

AI & ML interests

None yet

Recent Activity

updated a model about 2 months ago

embedl/Cosmos-Reason2-2B-W4A16-Edge2

updated a collection about 2 months ago

Cosmos-Reason2

Organizations

updated a model about 2 months ago

embedl/Cosmos-Reason2-2B-W4A16-Edge2

Image-Text-to-Text • 2B • Updated May 19 • 227 • 13

updated a collection about 2 months ago

Cosmos-Reason2

Collection

nvidia/Cosmos-Reason2 multi-modal reasoning models optimized by Embedl. • 13 items • Updated May 18 • 4

reacted to HannesVonEssen's post with 🔥 about 2 months ago

Post

260

📣 I made a visualizer for Hugging Face models: https://hfviewer.com

✨ Simply paste a Hugging Face URL to get an interactive visualization of the architecture!

🔗 The recent Qwen3.6-27B model as an example: https://hfviewer.com/Qwen/Qwen3.6-27B

Feel free to try it out and give me feedback on how it can be improved! ❤️

1 reply

reacted to HannesVonEssen's post with 🔥❤️ about 2 months ago

Post

11661

📣 Hugging Face Visualizer, now as Chrome extension!
https://hfviewer.com

✨ After installing, Hugging Face model pages show an architecture visualization directly on the page!

🔗 Link:
https://chromewebstore.google.com/detail/hugging-face-viewer/mmadlggmpkpiockpjfepaohcllbnakej

Thanks for all the nice feedback so far! ❤️

5 replies

updated 4 models 2 months ago

posted an update 2 months ago

Post

142

⚡ Qwen3.5, up to 1.4× faster. Same quality. Less latency.

We applied FlashHead to the Qwen3.5 family: a novel drop-in replacement for the LM head with measurably lower latency on edge hardware. Benchmarks and models are linked below.

📊 embedl/Edge-Inference-Benchmarks

🤗 https://huggingface.co/collections/embedl/qwen35

updated 3 collections 2 months ago

NVIDIA Jetson AGX Orin

Collection

Models optimized and benchmarked for NVIDIA Jetson AGX Orin. Memory-efficient and latency-optimized variants designed for real-time edge inference. • 8 items • Updated Apr 29 • 4

NVIDIA Jetson AGX Thor

Collection

Models validated and performance-optimized for NVIDIA Jetson AGX Thor. Tailored for high-performance edge AI workloads. • 7 items • Updated Apr 29 • 1

FlashHead

Collection

Efficient Drop-In Replacement for the Classification Head in Language Model Inference. https://github.com/embedl/flash-head • 24 items • Updated Apr 29 • 3
