Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2511.04217

The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms

Paper • 2511.04217 • Published Nov 6, 2025 • 17

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 127
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms

Paper • 2511.04217 • Published Nov 6, 2025 • 17

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 190 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 38
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Paper • 2410.14059 • Published Oct 17, 2024 • 63
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7, 2025 • 46
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13, 2025 • 53

Lotteryticket Hypothesis Multi-Head Attention Mechanisms

The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms

Paper • 2511.04217 • Published Nov 6, 2025 • 17

Bugai's Collection

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28, 2025 • 89
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

Paper • 2508.17445 • Published Aug 24, 2025 • 80
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space

Paper • 2508.19247 • Published Aug 26, 2025 • 43
VibeVoice Technical Report

Paper • 2508.19205 • Published Aug 26, 2025 • 143

Theory and Representation learning

I-Con: A Unifying Framework for Representation Learning

Paper • 2504.16929 • Published Apr 23, 2025 • 30
SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens

Paper • 2508.05305 • Published Aug 7, 2025 • 47
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms

Paper • 2511.04217 • Published Nov 6, 2025 • 17

The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms

Paper • 2511.04217 • Published Nov 6, 2025 • 17

Lotteryticket Hypothesis Multi-Head Attention Mechanisms

The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms

Paper • 2511.04217 • Published Nov 6, 2025 • 17

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 127
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms

Paper • 2511.04217 • Published Nov 6, 2025 • 17

Bugai's Collection

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28, 2025 • 89
TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

Paper • 2508.17445 • Published Aug 24, 2025 • 80
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space

Paper • 2508.19247 • Published Aug 26, 2025 • 43
VibeVoice Technical Report

Paper • 2508.19205 • Published Aug 26, 2025 • 143

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8, 2025 • 190 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12, 2025 • 38
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30, 2025 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23, 2025 • 88

Theory and Representation learning

I-Con: A Unifying Framework for Representation Learning

Paper • 2504.16929 • Published Apr 23, 2025 • 30
SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens

Paper • 2508.05305 • Published Aug 7, 2025 • 47
The Strong Lottery Ticket Hypothesis for Multi-Head Attention Mechanisms

Paper • 2511.04217 • Published Nov 6, 2025 • 17

UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Paper • 2410.14059 • Published Oct 17, 2024 • 63
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7, 2025 • 46
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6, 2025 • 96
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Paper • 2503.10639 • Published Mar 13, 2025 • 53

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs