nvidia/omnivinci
Tags: Feature Extraction, Transformers, Safetensors, vila, omni-modal, multimodal, vision, audio, video, llm, custom_code, Eval Results (legacy)
arxiv: 2510.15870
License: apache-2.0
Branch: main
Path: omnivinci/sound_tower
Size: 1.27 GB
2 contributors
History: 1 commit
Hanrong Ye, commit c48c32c, 26 days ago
config.json | Safe | 1.29 kB | commit | 26 days ago
model.safetensors | Safe | 1.27 GB | xet | commit | 26 days ago