- Woodpecker: Hallucination Correction for Multimodal Large Language Models
  Paper • 2310.16045 • Published • 17
- HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
  Paper • 2310.14566 • Published • 27
- SILC: Improving Vision Language Pretraining with Self-Distillation
  Paper • 2310.13355 • Published • 9
- Conditional Diffusion Distillation
  Paper • 2310.01407 • Published • 20
Collections
Collections including paper arxiv:2312.11556
- VecFusion: Vector Font Generation with Diffusion
  Paper • 2312.10540 • Published • 22
- StarVector: Generating Scalable Vector Graphics Code from Images
  Paper • 2312.11556 • Published • 36
- HuggingFaceM4/datikz
  Viewer • Updated • 48.3k • 43 • 4
- OmniSVG: A Unified Scalable Vector Graphics Generation Model
  Paper • 2504.06263 • Published • 182

- aMUSEd: An Open MUSE Reproduction
  Paper • 2401.01808 • Published • 31
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 28
- SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
  Paper • 2401.00604 • Published • 6
- LARP: Language-Agent Role Play for Open-World Games
  Paper • 2312.17653 • Published • 33

- SVGDreamer: Text Guided SVG Generation with Diffusion Model
  Paper • 2312.16476 • Published • 2
- DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models
  Paper • 2306.14685 • Published • 2
- Beyond Pixels: Exploring Human-Readable SVG Generation for Simple Images with Vision Language Models
  Paper • 2311.15543 • Published
- StarVector: Generating Scalable Vector Graphics Code from Images
  Paper • 2312.11556 • Published • 36

- Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding
  Paper • 2306.06094 • Published • 1
- IconShop: Text-Guided Vector Icon Synthesis with Autoregressive Transformers
  Paper • 2304.14400 • Published • 4
- VecFusion: Vector Font Generation with Diffusion
  Paper • 2312.10540 • Published • 22
- StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis
  Paper • 2401.17093 • Published • 20