Article TRL v1.0: Post-Training Library Built to Move with the Field • 11 days ago • 47
MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes Paper • 2510.16380 • Published Oct 18, 2025 • 2
JudgeBoard: Benchmarking and Enhancing Small Language Models for Reasoning Evaluation Paper • 2511.15958 • Published Nov 20, 2025 • 1
VeriCoT: Neuro-symbolic Chain-of-Thought Validation via Logical Consistency Checks Paper • 2511.04662 • Published Nov 6, 2025 • 36
Article Building the Open Agent Ecosystem Together: Introducing OpenEnv • Oct 23, 2025 • 159
Consciousness in Artificial Intelligence: Insights from the Science of Consciousness Paper • 2308.08708 • Published Aug 17, 2023 • 5
Multi-Level Aware Preference Learning: Enhancing RLHF for Complex Multi-Instruction Tasks Paper • 2505.12845 • Published May 19, 2025 • 1
Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus Paper • 2411.12498 • Published Nov 19, 2024 • 2
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning Paper • 2410.02884 • Published Oct 3, 2024 • 54
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2, 2024 • 69
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model Paper • 2312.11370 • Published Dec 18, 2023 • 20
Prompting Is Programming: A Query Language for Large Language Models Paper • 2212.06094 • Published Dec 12, 2022 • 1
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies Paper • 2308.03188 • Published Aug 6, 2023 • 2