- Woodpecker: Hallucination Correction for Multimodal Large Language Models
  Paper • 2310.16045 • Published • 17
- HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
  Paper • 2310.14566 • Published • 27
- SILC: Improving Vision Language Pretraining with Self-Distillation
  Paper • 2310.13355 • Published • 9
- Conditional Diffusion Distillation
  Paper • 2310.01407 • Published • 20
Collections
Discover the best community collections!
Collections including paper arxiv:2401.09417
- Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
  Paper • 2401.09417 • Published • 62
- VMamba: Visual State Space Model
  Paper • 2401.10166 • Published • 39
- DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
  Paper • 2405.14224 • Published • 15
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
  Paper • 2312.00752 • Published • 148

- StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
  Paper • 2311.14495 • Published • 1
- Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
  Paper • 2401.09417 • Published • 62
- SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation
  Paper • 2401.13560 • Published • 1
- Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces
  Paper • 2402.00789 • Published • 2

- ZigMa: Zigzag Mamba Diffusion Model
  Paper • 2403.13802 • Published • 18
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 111
- VMamba: Visual State Space Model
  Paper • 2401.10166 • Published • 39
- Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
  Paper • 2401.09417 • Published • 62

- Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
  Paper • 2405.19893 • Published • 33
- Is Cosine-Similarity of Embeddings Really About Similarity?
  Paper • 2403.05440 • Published • 3
- Evaluating Unsupervised Text Classification: Zero-shot and Similarity-based Approaches
  Paper • 2211.16285 • Published
- Similarity-Based Domain Adaptation with LLMs
  Paper • 2503.05281 • Published

- Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
  Paper • 2311.08046 • Published • 2
- nvidia/GR00T-N1-2B
  Robotics • 2B • Updated • 133 • 341
- nvidia/Eagle2-1B
  Image-Text-to-Text • 1B • Updated • 157 • 26
- nvidia/PhysicalAI-Robotics-GR00T-X-Embodiment-Sim
  Updated • 849k • 182