Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

Recent Activity

published a model about 1 hour ago

pszemraj/0xProto-368K-GGUF

updated a model about 1 hour ago

pszemraj/0xProto-368K-GGUF



upvoted an article 1 day ago

Article

Introducing the Ettin Reranker Family

tomaarsen • 4 days ago • 41

upvoted a paper 2 days ago

Hierarchical Reasoning Model

Paper • 2506.21734 • Published Jun 26, 2025 • 54

upvoted 3 papers 9 days ago

upvoted an article 10 days ago

Article

Adding Benchmaxxer Repellant to the Open ASR Leaderboard

bezzam, Steveeeeeeen, eustlb, SBruccoleriAppen, jmss-appen, c-e-ford-appen, wgb14, YukaiHuang, like2026, logicbean, ally-lxl • 17 days ago • 16

upvoted a paper 11 days ago

Investigating Efficiently Extending Transformers for Long Input Summarization

Paper • 2208.04347 • Published Aug 8, 2022 • 1

upvoted an article 11 days ago

Article

EMO: Pretraining mixture of experts for emergent modularity

allenai • 14 days ago • 37

upvoted an article 18 days ago

Article

Multimodal Embedding & Reranker Models with Sentence Transformers

tomaarsen • Apr 9 • 59

upvoted a collection 20 days ago

OlmPool

Collection

Collection of models from the paper "Cracks in the Foundation: Seemingly Minor Architectural Choices Impact Long Context Extension". • 26 items • Updated 22 days ago • 5

upvoted 2 papers 21 days ago

A Survey on LLM-based Conversational User Simulation

Paper • 2604.24977 • Published 26 days ago • 8

Efficient Training on Multiple Consumer GPUs with RoundPipe

Paper • 2604.27085 • Published 24 days ago • 40

upvoted a paper 23 days ago

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Paper • 2604.15574 • Published Apr 16 • 23

upvoted a collection 23 days ago

Olmo 3.1

Collection

The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated Dec 23, 2025 • 52

upvoted an article 23 days ago

Article

Granite 4.1 LLMs: How They’re Built

ibm-granite • 23 days ago • 73

upvoted a paper 23 days ago

Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora

Paper • 2604.24819 • Published 26 days ago • 88

upvoted a collection 24 days ago

Laguna XS.2

Collection

Designed for agentic coding and long-horizon work on a local machine. Apache 2.0. • 5 items • Updated 15 days ago • 21

upvoted a collection 26 days ago

Parakeet ASR

Collection

NeMo Parakeet ASR Models attain strong speech recognition accuracy while being efficient for inference. Available in CTC and RNN-Transducer variants. • 16 items • Updated 3 days ago • 74

upvoted 2 papers about 1 month ago

Multi-User Large Language Model Agents

Paper • 2604.08567 • Published Mar 19 • 27

BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation

Paper • 2604.09497 • Published Apr 10 • 29
