VoladorLuYu's Collections: Research on LLM
When can transformers reason with abstract symbols?
Paper
• 2310.09753
• Published
• 3
In-Context Pretraining: Language Modeling Beyond Document Boundaries
Paper
• 2310.10638
• Published
• 30
Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model
Paper
• 2310.09520
• Published
• 11
Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers
Paper
• 2309.08532
• Published
• 54
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
Paper
• 2310.11441
• Published
• 29
ControlLLM: Augment Language Models with Tools by Searching on Graphs
Paper
• 2310.17796
• Published
• 18
Ultra-Long Sequence Distributed Transformer
Paper
• 2311.02382
• Published
• 6
Can LLMs Follow Simple Rules?
Paper
• 2311.04235
• Published
• 13
Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation
Paper
• 2311.04254
• Published
• 15
mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration
Paper
• 2311.04257
• Published
• 22
FlashDecoding++: Faster Large Language Model Inference on GPUs
Paper
• 2311.01282
• Published
• 37
Language Models can be Logical Solvers
Paper
• 2311.06158
• Published
• 20
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Paper
• 2310.11511
• Published
• 79
Adapting Large Language Models via Reading Comprehension
Paper
• 2309.09530
• Published
• 82
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster
Paper
• 2311.08263
• Published
• 16
Contrastive Chain-of-Thought Prompting
Paper
• 2311.09277
• Published
• 35
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models
Paper
• 2309.03883
• Published
• 36
FLM-101B: An Open LLM and How to Train It with $100K Budget
Paper
• 2309.03852
• Published
• 45
Effective Long-Context Scaling of Foundation Models
Paper
• 2309.16039
• Published
• 31
Zephyr: Direct Distillation of LM Alignment
Paper
• 2310.16944
• Published
• 123
Textbooks Are All You Need II: phi-1.5 technical report
Paper
• 2309.05463
• Published
• 89
The ART of LLM Refinement: Ask, Refine, and Trust
Paper
• 2311.07961
• Published
• 11
Interpreting Pretrained Language Models via Concept Bottlenecks
Paper
• 2311.05014
• Published
• 1
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper
• 2307.09288
• Published
• 250
From Complex to Simple: Unraveling the Cognitive Tree for Reasoning with Small Language Models
Paper
• 2311.06754
• Published
• 1
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
Paper
• 2312.00752
• Published
• 150
Beyond ChatBots: ExploreLLM for Structured Thoughts and Personalized Model Responses
Paper
• 2312.00763
• Published
• 23
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper
• 2312.11514
• Published
• 260
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Paper
• 2304.13712
• Published
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Paper
• 2401.01335
• Published
• 68
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper
• 2401.02954
• Published
• 53
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Paper
• 2401.12954
• Published
• 33
Improving Text Embeddings with Large Language Models
Paper
• 2401.00368
• Published
• 82
H2O-Danube-1.8B Technical Report
Paper
• 2401.16818
• Published
• 18
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
Paper
• 2402.10524
• Published
• 23
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders
Paper
• 2404.05961
• Published
• 66