M Saad Salman

MSS444

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Agentic Code Reasoning

upvoted a paper 1 day ago

Learn Hard Problems During RL with Reference Guided Fine-tuning

upvoted a paper 1 day ago

Tool Verification for Test-Time Reinforcement Learning

Organizations

None yet

upvoted 3 papers 1 day ago

upvoted 2 papers 2 days ago

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

Paper • 2602.23008 • Published 6 days ago • 34

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

Paper • 2602.24286 • Published 5 days ago • 69

upvoted 5 papers 9 days ago

Discovering Multiagent Learning Algorithms with Large Language Models

Paper • 2602.16928 • Published 14 days ago • 16

"What Are You Doing?": Effects of Intermediate Feedback from Agentic LLM In-Car Assistants During Multi-Step Processing

Paper • 2602.15569 • Published 15 days ago • 13

Arcee Trinity Large Technical Report

Paper • 2602.17004 • Published 13 days ago • 17

Unified Latents (UL): How to train your latents

Paper • 2602.17270 • Published 13 days ago • 57

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published 23 days ago • 256

upvoted 3 papers 15 days ago

Principled Synthetic Data Enables the First Scaling Laws for LLMs in Recommendation

Paper • 2602.07298 • Published 26 days ago • 4

AIDev: Studying AI Coding Agents on GitHub

Paper • 2602.09185 • Published 23 days ago • 3

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Paper • 2602.14041 • Published 17 days ago • 52

upvoted 5 papers 16 days ago

Detecting RLVR Training Data via Structural Convergence of Reasoning

Paper • 2602.11792 • Published 20 days ago • 2

Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

Paper • 2602.11748 • Published 20 days ago • 30

FLAC: Maximum Entropy RL via Kinetic Energy Regularized Bridge Matching

Paper • 2602.12829 • Published 19 days ago • 4

Intelligent AI Delegation

Paper • 2602.11865 • Published 20 days ago • 14

Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs

Paper • 2602.10388 • Published 22 days ago • 237

upvoted 2 papers 19 days ago

Large Language Lobotomy: Jailbreaking Mixture-of-Experts via Expert Silencing

Paper • 2602.08741 • Published 23 days ago • 2

GoodVibe: Security-by-Vibe for LLM-Based Code Generation

Paper • 2602.10778 • Published 21 days ago • 3
