Daniel Khashabi

danyaljj

AI & ML interests

None yet

Recent Activity

liked a dataset 9 days ago

NIH-CARD/CARDBiomedBench

upvoted a paper about 1 month ago

ChiKhaPo: A Large-Scale Multilingual Benchmark for Evaluating Lexical Comprehension and Generation in Large Language Models

Paper • 2510.16928 • Published Oct 19, 2025 • 4

upvoted 2 papers about 2 months ago

Genomic Next-Token Predictors are In-Context Learners

Paper • 2511.12797 • Published Nov 16, 2025 • 7

SynthTextEval: Synthetic Text Data Generation and Evaluation for High-Stakes Domains

Paper • 2507.07229 • Published Jul 9, 2025 • 11

upvoted 2 papers 2 months ago

World-in-World: World Models in a Closed-Loop World

Paper • 2510.18135 • Published Oct 20, 2025 • 76

MedScore: Generalizable Factuality Evaluation of Free-Form Medical Answers by Domain-adapted Claim Decomposition and Verification

Paper • 2505.18452 • Published May 24, 2025 • 4

upvoted 3 papers 3 months ago

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published Oct 9, 2025 • 41

IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning

Paper • 2509.22621 • Published Sep 26, 2025 • 8

The Flaw of Averages: Quantifying Uniformity of Performance on Benchmarks

Paper • 2509.25671 • Published Sep 30, 2025 • 6

upvoted 4 papers 4 months ago

mmBERT: A Modern Multilingual Encoder with Annealed Language Learning

Paper • 2509.06888 • Published Sep 8, 2025 • 12

The Trickle-down Impact of Reward (In-)consistency on RLHF

Paper • 2309.16155 • Published Sep 28, 2023 • 1

Jailbreak Distillation: Renewable Safety Benchmarking

Paper • 2505.22037 • Published May 28, 2025 • 1

Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2, 2025 • 24

upvoted 3 papers 6 months ago

Seq vs Seq: An Open Suite of Paired Encoders and Decoders

Paper • 2507.11412 • Published Jul 15, 2025 • 30

The Translation Barrier Hypothesis: Multilingual Generation with Large Language Models Suffers from Implicit Translation Failure

Paper • 2506.22724 • Published Jun 28, 2025 • 10

Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning

Paper • 2506.02327 • Published Jun 2, 2025 • 20

upvoted 3 papers 7 months ago

Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback

Paper • 2506.11930 • Published Jun 13, 2025 • 53

BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases

Paper • 2505.20321 • Published May 23, 2025 • 5

Lost in the Haystack: Smaller Needles are More Difficult for LLMs to Find

Paper • 2505.18148 • Published May 23, 2025 • 5

upvoted 2 papers 8 months ago

Certified Mitigation of Worst-Case LLM Copyright Infringement

Paper • 2504.16046 • Published Apr 22, 2025 • 13

ICL CIPHERS: Quantifying "Learning" in In-Context Learning via Substitution Ciphers

Paper • 2504.19395 • Published Apr 28, 2025 • 5