🤖 Agents - a m-ric Collection

updated Dec 31, 2024

DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines

Paper • 2310.03714 • Published Oct 5, 2023 • 37

Note I'm not a fan of the implementation, but I think the ideas behind DSPy are interesting.
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent

Paper • 2312.10003 • Published Dec 15, 2023 • 44
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework

Paper • 2308.08155 • Published Aug 16, 2023 • 11

Note The paper that popularized multi-agent conversation frameworks for LLMs!
GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 245

Note GAIA is among the most challenging benchmarks for generalist agents: it requires a good web browser, multimodal capabilities, and complex multi-step task solving.
Self-Discover: Large Language Models Self-Compose Reasoning Structures

Paper • 2402.03620 • Published Feb 6, 2024 • 117
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

Paper • 2402.07456 • Published Feb 12, 2024 • 46
Self-Refine: Iterative Refinement with Self-Feedback

Paper • 2303.17651 • Published Mar 30, 2023 • 2
Reflexion: Language Agents with Verbal Reinforcement Learning

Paper • 2303.11366 • Published Mar 20, 2023 • 5
Gorilla: Large Language Model Connected with Massive APIs

Paper • 2305.15334 • Published May 24, 2023 • 6
MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action

Paper • 2303.11381 • Published Mar 20, 2023 • 2
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace

Paper • 2303.17580 • Published Mar 30, 2023 • 15
Communicative Agents for Software Development

Paper • 2307.07924 • Published Jul 16, 2023 • 6
More Agents Is All You Need

Paper • 2402.05120 • Published Feb 3, 2024 • 57
ReAct: Synergizing Reasoning and Acting in Language Models

Paper • 2210.03629 • Published Oct 6, 2022 • 33

Note This paper is the basis for the Thought -> Action -> Observation cycle used in most agent frameworks nowadays.
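
To make the Thought -> Action -> Observation cycle concrete, here is a minimal sketch of such a loop. It is not the implementation from the paper or from any particular framework: `scripted_model`, `TOOLS`, and `run_agent` are hypothetical names, and the "model" is a hard-coded stub standing in for a real LLM call.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tool registry: tool name -> function taking one string argument.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split(","))),
}


def scripted_model(history: List[str]) -> Tuple[str, str, str]:
    """Stub standing in for an LLM: returns (thought, action, argument)."""
    observations = [line for line in history if line.startswith("Observation:")]
    if not observations:
        return ("I should add the two numbers.", "add", "2,3")
    # Once an observation exists, report it as the final answer.
    return ("I have the result.", "final_answer", observations[-1].split(": ", 1)[1])


def run_agent(
    task: str,
    model: Callable[[List[str]], Tuple[str, str, str]],
    max_steps: int = 5,
) -> Tuple[Optional[str], List[str]]:
    """Run the Thought -> Action -> Observation loop until a final answer."""
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        thought, action, arg = model(history)
        history.append(f"Thought: {thought}")
        history.append(f"Action: {action}[{arg}]")
        if action == "final_answer":
            return arg, history
        # Execute the chosen tool and feed its result back to the model.
        history.append(f"Observation: {TOOLS[action](arg)}")
    return None, history  # step budget exhausted without a final answer


answer, trace = run_agent("What is 2 + 3?", scripted_model)
print("\n".join(trace))
```

Each observation is appended to the history the model sees on its next call, which is what lets later thoughts depend on earlier tool results.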
Executable Code Actions Elicit Better LLM Agents

Paper • 2402.01030 • Published Feb 1, 2024 • 188

Note Has nice explanations as to why writing agent actions in code is better.
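
As a rough illustration of the contrast, the sketch below compares a JSON-style action, which names one tool call per step, with a code action, which can compose several calls in a single step. It is a toy under stated assumptions and not the paper's implementation: `get_price`, `run_json_action`, and `run_code_action` are hypothetical, and `exec` with a trimmed namespace is not a real sandbox.

```python
import json
from typing import Any, Callable, Dict


def get_price(item: str) -> float:
    """Hypothetical tool: look up a price in a fixed table."""
    return {"apple": 0.5, "pear": 0.75}[item]


TOOLS: Dict[str, Callable[..., Any]] = {"get_price": get_price}


def run_json_action(action_json: str) -> Any:
    """JSON-style action: exactly one tool call per agent step."""
    action = json.loads(action_json)
    return TOOLS[action["tool"]](**action["args"])


def run_code_action(code: str) -> Any:
    """Code action: the snippet may loop over and combine tool calls."""
    # Single namespace so nested scopes (e.g. generator expressions) see the tools.
    # Only `sum` is exposed as a builtin. This is NOT a secure sandbox.
    namespace: Dict[str, Any] = {"__builtins__": {"sum": sum}, **TOOLS}
    exec(code, namespace)
    return namespace["result"]


# JSON actions: two separate steps, with the total computed outside the action.
apple = run_json_action('{"tool": "get_price", "args": {"item": "apple"}}')
pear = run_json_action('{"tool": "get_price", "args": {"item": "pear"}}')

# Code action: one step that calls the tool twice and aggregates the results.
total = run_code_action("result = sum(get_price(i) for i in ['apple', 'pear'])")
print(apple + pear, total)
```

The code action expresses iteration and aggregation inside one step, where the JSON format needs one round trip per call.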
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

Paper • 2310.06770 • Published Oct 10, 2023 • 9
Agent Data Analyst

Space • 145

Need to analyze data? Let a Llama-3.1 agent do it for you!
DynaSaur: Large Language Agents Beyond Predefined Actions

Paper • 2411.01747 • Published Nov 4, 2024 • 37
ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 89
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction

Paper • 2412.04454 • Published Dec 5, 2024 • 71

Note This paper reports much more impressive scores than ShowUI, but the VLMs used are also much larger (7B and 72B vs. 2B) and based on the better Qwen2.5 instead of Qwen2.
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents

Paper • 2401.00812 • Published Jan 1, 2024 • 11