Quentin Tardif

ntnq

AI & ML interests

None yet

Recent Activity

liked a Space about 8 hours ago

HuggingFaceFW/FinePDFsBlog

liked a Space 22 days ago

lvwerra/jagged-data-frontier

upvoted an article 22 days ago

Saving Memory Using Padding-Free Transformer Layers during Finetuning

Organizations

liked a Space about 8 hours ago

FinePDFs: Liberating 3T of the finest tokens from PDFs

📄

liked a Space 22 days ago

The Jagged AI Frontier is a Data Frontier

🧭

Why AI capabilities are shaped by data availability

upvoted 2 articles 22 days ago

Article

Saving Memory Using Padding-Free Transformer Layers during Finetuning

Jun 11, 2024

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

23 days ago • 104

liked a model 29 days ago

mistralai/Devstral-Small-2-24B-Instruct-2512

24B • Updated 16 days ago • 248k • 463

liked a model about 1 month ago

EssentialAI/rnj-1-instruct

Text Generation • 8B • Updated 15 days ago • 14.2k • 293

upvoted an article about 1 month ago

Article

Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand

Dec 4, 2025

liked a Space about 1 month ago

Evaluation Guidebook

📝

230

Display benchmark evaluation data for LLMs

upvoted an article about 1 month ago

Article

Continuous batching from first principles

Nov 25, 2025 • 297

upvoted a collection about 2 months ago

Olmo 3

Collection

Artifacts for the Olmo 3 release. • 9 items • Updated 15 days ago • 158

upvoted 2 papers about 2 months ago

Fantastic Pretraining Optimizers and Where to Find Them

Paper • 2509.02046 • Published Sep 2, 2025 • 13

DoPE: Denoising Rotary Position Embedding

Paper • 2511.09146 • Published Nov 12, 2025 • 95

upvoted an article 2 months ago

Article

What makes good reasoning data

Oct 30, 2025

liked 2 Spaces 2 months ago

The Smol Training Playbook

📚

2.81k

The secrets to building world-class LLMs

Unlocking On-Policy Distillation for Any Model Family

📝

Apply on-policy distillation to any model family

upvoted an article 2 months ago

Article

On the Shifting Global Compute Landscape

Oct 29, 2025

upvoted a paper 3 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 501

liked 2 models 3 months ago

ServiceNow-AI/Apriel-1.5-15b-Thinker

Image-Text-to-Text • 15B • Updated Oct 6, 2025 • 2.39k • 461

facebook/cwm

33B • Updated Oct 15, 2025 • 40.9k • 255

upvoted a paper 4 months ago

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4, 2025 • 195
