OpenVINO (OpenVINO Toolkit)

posted an update 4 days ago

Post

3567

Update: TRELLIS.2 (Text to 3D, Image to 3D) Gradio with Rerun Embedded demo with improved visualization of the 3D model previewer is now available on Hugging Face. Generate assets and view them in the 3D viewer, powered and streamlined with Microsoft’s TRELLIS.2 and Tongyi-MAI’s Z-Image-Turbo models.

🤗 TRELLIS.2 (Demo): prithivMLmods/TRELLIS.2-Text-to-3D
🕹️ GitHub: https://github.com/PRITHIVSAKTHIUR/TRELLIS.2-Text-to-3D-RERUN
🕹️ Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

To know more about it, visit the app page or the respective model page!

inoculatemedia

posted an update 5 days ago

Post

228

Waitlist is open for Lilikoi and I need a couple people to help test now though.
inoculatemedia/lilikoi-view

prithivMLmods

posted an update 5 days ago

Post

4086

Introducing the Qwen-Image-Edit-2511-LoRAs-Fast demo, featuring image property comparison and contrast, built on top of Gradio and the combined Rerun SDK. It supports single and multi-image edits with existing LoRAs that are lazily loaded. (Note: This is still an experimental Space for Qwen-Image-Edit-2511.)

⭐ Space Demo: prithivMLmods/Qwen-Image-Edit-2511-LoRAs-Fast
⭐ GitHub: https://github.com/PRITHIVSAKTHIUR/Qwen-Image-Edit-2511-LoRAs-Fast-Multi-Image-Rerun
⭐ Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection

To know more about it, visit the app page or the respective model page!

2 replies

·

inoculatemedia

posted an update 9 days ago

Post

1442

I’m opening the waitlist for what I believe to be the most advanced multimodal bridge for A/V professionals. Txt2img, img2video, editing, export to ProRes, apply Luts, Pexels and TouchDesigner integrations, music and voice gen, multichannel mixing.

Announcing: Lilikoi by Haawke AI

Teaser video made entirely with Lilikoi:
https://youtu.be/-O7DH7vFkYg?si=q2t5t6WjQCk2Cp0w

Https://Lilikoi.haawke.com

Technical brief:
https://haawke.com/technical_brief.html

prithivMLmods

posted an update 12 days ago

Post

3655

Introducing demos for new SOTA models from AI2: SAGE-MM (Smart Any-Horizon Agents for Long-Video Reasoning) and Molmo-2, an open vision-language model that supports multi-image (QA and pointing) and video (QA, pointing, and tracking). The respective demo-related collections are listed below. 🎃🔥

✨ SAGE-MM [Video-Reasoning]: prithivMLmods/SAGE-MM-Video-Reasoning
✨ Molmo2 [Demo]: prithivMLmods/Molmo2-HF-Demo

🎃 GitHub[SAGE-MM]: https://github.com/PRITHIVSAKTHIUR/SAGE-MM-Video-Reasoning
🎃 GitHub[Molmo2]: https://github.com/PRITHIVSAKTHIUR/Molmo2-HF-Demo
🎃 Multimodal Implementations: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

To know more about it, visit the app page or the respective model page!

1 reply

·

prithivMLmods

posted an update 13 days ago

Post

2026

Introducing TRELLIS.2 Text-to-3D. The demo for the TRELLIS.2-4B (Image-to-3D) model is streamlined with the Z-Image Turbo image generation model to enable Text-to-3D functionality. There is no need for input assets, making a small leap forward for ideation. Optionally, it also includes default support for Image-to-3D inference using direct image assets. Find the demo and related collections below... 🤗🔥

✨ TRELLIS.2-Text-to-3D [Demo]: prithivMLmods/TRELLIS.2-Text-to-3D
✨ Multimodal Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
✨ Github: https://github.com/PRITHIVSAKTHIUR/TRELLIS.2-Text-to-3D

To know more about it, visit the app page or the respective model page!

Nymbo

posted an update 14 days ago

Post

1830

🚨 New tool for the Nymbo/Tools MCP server: The new Agent_Skills tool provides full support for Agent Skills (Claude Skills but open-source).

How it works: The tool exposes the standard discover/info/resources/validate actions. Skills live in /Skills under the same File_System root, and any bundled scripts run through Shell_Command, no new infrastructure required.

Agent_Skills(action="discover")  # List all available skills
Agent_Skills(action="info", skill_name="music-downloader")  # Full SKILL.md
Agent_Skills(action="resources", skill_name="music-downloader")  # Scripts, refs, assets

I've included a music-downloader skill as a working demo, it wraps yt-dlp for YouTube/SoundCloud audio extraction.

Caveat: On HF Spaces, Shell_Command works for most tasks, but some operations (like YouTube downloads) are restricted due to the container environment. For full functionality, run the server locally on your machine.

Try it out ~ https://www.nymbo.net/nymbot

prithivMLmods

posted an update 15 days ago

Post

1997

Demo for Molmo2 on Hugging Face is live now, including Single/Multi-Image VQA, Visual Pointing/Grounding, Video VQA, and Video Point Tracking. Find the demo and related collections below. 🔥🤗

● Molmo2 HF Demo🖥️: prithivMLmods/Molmo2-HF-Demo
● Model Collection: https://huggingface.co/collections/allenai/molmo2
● Related Multimodal Space Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations

To know more about it, visit the app page or the respective model page!

amokrov

published a model 16 days ago

OpenVINO/gemma-3-4b-it-int4-cw-ov

Image-Text-to-Text • Updated 16 days ago • 124 • 2

amokrov

updated a collection 16 days ago

LLMs optimized for NPU

Collection

10 items • Updated 16 days ago • 13

amokrov

updated a model 16 days ago

OpenVINO/gemma-3-4b-it-int4-cw-ov

Image-Text-to-Text • Updated 16 days ago • 124 • 2

openvino-ci

updated a model 16 days ago

OpenVINO/gemma-3-4b-it-int4-cw-ov

Image-Text-to-Text • Updated 16 days ago • 124 • 2

amokrov

published 2 models 16 days ago

OpenVINO/gemma-3-4b-it-int4-ov

Image-Text-to-Text • Updated 16 days ago • 18

OpenVINO/gemma-3-4b-it-fp16-ov

Image-Text-to-Text • Updated 16 days ago • 7

amokrov

updated a model 16 days ago

OpenVINO/gemma-3-4b-it-fp16-ov

Image-Text-to-Text • Updated 16 days ago • 7

openvino-ci

updated 2 models 16 days ago

OpenVINO/gemma-3-4b-it-int4-ov

Image-Text-to-Text • Updated 16 days ago • 18

OpenVINO/gemma-3-4b-it-fp16-ov

Image-Text-to-Text • Updated 16 days ago • 7

prithivMLmods

posted an update 16 days ago

Post

5518

Introducing the Z Image Turbo LoRA DLC App, a gallery space for plug-and-play Z-Image-Turbo LoRAs. It features a curated collection of impressive LoRAs for generating high-quality images. By default, it runs on the base model. Simply choose a LoRA, type your prompt, and generate images. You can find the app and more details below. 🤗🧪

● Space [Demo]: prithivMLmods/Z-Image-Turbo-LoRA-DLC
● Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
● Check the list of Z-Image LoRA's: https://huggingface.co/models?other=base_model:adapter:Tongyi-MAI/Z-Image-Turbo
● Github: https://github.com/PRITHIVSAKTHIUR/Z-Image-Turbo-LoRA-DLC

Other related image gen spaces:-

● FLUX-LoRA-DLC2: prithivMLmods/FLUX-LoRA-DLC2
● FLUX-LoRA-DLC: prithivMLmods/FLUX-LoRA-DLC
● Qwen-Image-LoRA-DLC: prithivMLmods/Qwen-Image-LoRA-DLC
● Qwen-Image-Edit-2509-LoRAs-Fast: prithivMLmods/Qwen-Image-Edit-2509-LoRAs-Fast
● Qwen-Image-Edit-2509-LoRAs-Fast-Fusion: prithivMLmods/Qwen-Image-Edit-2509-LoRAs-Fast-Fusion

& more...

To know more about it, visit the app page or the respective model page!