Jonna Matthiesen

JonnaMat

AI & ML interests

None yet

Recent Activity

updated a model 27 days ago

embedl/Cosmos-Reason2-2B-W4A16-Edge2

Image-Text-to-Text • 2B • Updated 26 days ago • 336 • 12

updated a collection 27 days ago

Cosmos-Reason2

Collection

nvidia/Cosmos-Reason2 multi-modal reasoning models optimized by Embedl. • 13 items • Updated 27 days ago • 4

reacted to HannesVonEssen's post with 🔥 30 days ago

Post

242

📣 I made a visualizer for Hugging Face models: https://hfviewer.com

✨ Simply paste a Hugging Face URL to get an interactive visualization of the architecture!

🔗 The recent Qwen3.6-27B model as an example: https://hfviewer.com/Qwen/Qwen3.6-27B

Feel free to try it out and give me feedback on how it can be improved! ❤️

1 reply

reacted to HannesVonEssen's post with 🔥❤️ 30 days ago

Post

11640

📣 Hugging Face Visualizer, now as a Chrome extension!
https://hfviewer.com

✨ After installing, Hugging Face model pages will show an architecture visualization directly on the page!

🔗 Link:
https://chromewebstore.google.com/detail/hugging-face-viewer/mmadlggmpkpiockpjfepaohcllbnakej

Thanks for all the nice feedback so far! ❤️

5 replies

updated a collection about 1 month ago

Cosmos-Reason2

Collection

nvidia/Cosmos-Reason2 multi-modal reasoning models optimized by Embedl. • 13 items • Updated 27 days ago • 4

updated 4 models about 2 months ago

posted an update about 2 months ago

Post

135

⚡ Qwen3.5, up to 1.4× faster. Same quality. Less latency.

We applied FlashHead to the Qwen3.5 family: a novel drop-in replacement for the LM head with measurably lower latency on edge hardware. Benchmarks and models below.

📊 embedl/Edge-Inference-Benchmarks

🤗 https://huggingface.co/collections/embedl/qwen35

updated 3 collections about 2 months ago

NVIDIA Jetson AGX Orin

Collection

Models optimized and benchmarked for NVIDIA Jetson AGX Orin. Memory-efficient and latency-optimized variants designed for real-time edge inference. • 8 items • Updated Apr 29 • 3

NVIDIA Jetson AGX Thor

Collection

Models validated and performance-optimized for NVIDIA Jetson AGX Thor. Tailored for high-performance edge AI workloads. • 7 items • Updated Apr 29 • 1

FlashHead

Collection

Efficient Drop-In Replacement for the Classification Head in Language Model Inference. https://github.com/embedl/flash-head • 24 items • Updated Apr 29 • 2