Somshubra Majumdar

smajumdar94

AI & ML interests

None yet

Recent Activity

upvoted a paper 8 days ago

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

updated a collection 18 days ago

Nemotron-Post-Training-v3

Organizations

upvoted a paper 8 days ago

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published 9 days ago • 92

upvoted a paper 25 days ago

daVinci-Env: Open SWE Environment Synthesis at Scale

Paper • 2603.13023 • Published 29 days ago • 30

upvoted a collection about 1 month ago

Nemotron-Post-Training-v3

Collection

Collection of datasets used in the post-training phase of Nemotron Nano and Super v3. • 28 items • Updated 4 days ago • 117

upvoted a paper about 1 month ago

BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?

Paper • 2603.03194 • Published Mar 3 • 57

upvoted 2 articles about 2 months ago

Custom Kernels for All from Codex and Claude

Article • Feb 13

Forge: Scalable Agent RL Framework and Algorithm

Article • Feb 13 • 145

upvoted an article 2 months ago

We Got Claude to Build CUDA Kernels and teach open models!

Article • Jan 28 • 153

upvoted a collection 4 months ago

Openhands Trajectories

Collection

Dataset of 67,074 OpenHands trajectories collected with Qwen3-Coder-480B-A35B-Instruct and two RFT checkpoints trained on the data • 3 items • Updated Dec 23, 2025 • 8

upvoted a paper 4 months ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 303

upvoted a paper 6 months ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 148

upvoted 3 papers 7 months ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 192

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9, 2025 • 105

Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published Sep 4, 2025 • 76

upvoted a paper 8 months ago

Understanding Tool-Integrated Reasoning

Paper • 2508.19201 • Published Aug 26, 2025 • 32

upvoted an article 8 months ago

Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B

Article • Aug 18, 2025

upvoted 2 papers 9 months ago

Replacing thinking with tool usage enables reasoning in small language models

Paper • 2507.05065 • Published Jul 7, 2025 • 16

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Paper • 2507.12415 • Published Jul 16, 2025 • 43

upvoted an article 9 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8, 2025 • 768

upvoted a paper 9 months ago

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1, 2025 • 79

upvoted a paper 10 months ago

Seedance 1.0: Exploring the Boundaries of Video Generation Models

Paper • 2506.09113 • Published Jun 10, 2025 • 108
