embedl/Cosmos-Reason2-2B-W4A16-Edge2 Image-Text-to-Text • 2B • Updated 26 days ago • 336 • 12
Cosmos-Reason2 Collection nvidia/Cosmos-Reason2 multi-modal reasoning models optimized by Embedl. • 13 items • Updated 27 days ago • 4
Post 242 📣 I made a visualizer for Hugging Face models: https://hfviewer.com ✨ Simply paste a Hugging Face URL to get an interactive visualization of the architecture! The recent Qwen3.6-27B model as an example: https://hfviewer.com/Qwen/Qwen3.6-27B Feel free to try it out and give me feedback on how it can be improved! ❤️
Post 11640 📣 Hugging Face Visualizer, now as a Chrome extension! https://hfviewer.com ✨ After installing, Hugging Face model pages will show an architecture visualization on the model page itself! Link: https://chromewebstore.google.com/detail/hugging-face-viewer/mmadlggmpkpiockpjfepaohcllbnakej Thanks for all the nice feedback so far! ❤️
Post 135 ⚡ Qwen3.5, up to 1.4× faster. Same quality. Less latency. We applied FlashHead to the Qwen3.5 family: a novel drop-in replacement for the LM head with measurably lower latency on edge hardware. Benchmarks and models below. Benchmarks: embedl/Edge-Inference-Benchmarks Models: https://huggingface.co/collections/embedl/qwen35
NVIDIA Jetson AGX Orin Collection Models optimized and benchmarked for NVIDIA Jetson AGX Orin. Memory-efficient and latency-optimized variants designed for real-time edge inference. • 8 items • Updated Apr 29 • 3
NVIDIA Jetson AGX Thor Collection Models validated and performance-optimized for NVIDIA Jetson AGX Thor. Tailored for high-performance edge AI workloads. • 7 items • Updated Apr 29 • 1
FlashHead Collection Efficient Drop-In Replacement for the Classification Head in Language Model Inference. https://github.com/embedl/flash-head • 24 items • Updated Apr 29 • 2