
BuiDoan

AI & ML interests

None yet

Recent Activity

reacted to unmodeled-tyler's post with 👀 4 days ago

PSA: LiteLLM has been compromised on PyPI - if you have it installed, CHECK NOW.

LiteLLM is used as a dependency in A LOT of AI tooling, so there's a pretty good chance that you have it installed somewhere on your machine (my instance was part of Hermes Agent, but I was unaffected by the hack).

Versions 1.82.7 & 1.82.8 on PyPI have been compromised with a multi-stage credential stealer.

- Version 1.82.8 uses a .pth file that executes on EVERY Python process startup. You don't even need to import litellm. Just having it installed is enough.
- The payload harvests SSH keys, .env files, AWS/GCP/Azure credentials, Kubernetes configs, database passwords, crypto wallets, shell history - basically every secret on your machine.
- Stolen data is encrypted with a hardcoded RSA key and exfiltrated to a domain that is NOT part of legitimate litellm infrastructure.
- If you're running Kubernetes, it attempts lateral movement across the entire cluster.
- The C2 is hosted on the Internet Computer blockchain, making it essentially impossible to take down.

This is part of a coordinated campaign by a threat actor called TeamPCP, who have also hit Trivy (Aqua Security), Checkmarx KICS, and multiple npm packages in the last week ALONE.

What to do:

1. Run 'pip show litellm' in every environment you have.
2. If you're on 1.82.7 or 1.82.8 - rotate EVERY secret on that machine immediately.
3. Check for the persistence artifacts ~/.config/sysmon/sysmon.py & ~/.config/systemd/user/sysmon.service.

I was lucky in this case that my litellm version was out of date, but if you've installed litellm as a dependency in ANY package within the last 24ish hours, you're gonna want to check.

SOURCES
https://futuresearch.ai/blog/litellm-pypi-supply-chain-attack/

Same group, different attack a couple of days ago:
https://www.stepsecurity.io/blog/canisterworm-how-a-self-propagating-npm-worm-is-spreading-backdoors-across-the-ecosystem
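The checks listed in the post can be scripted. What follows is a minimal sketch, not a complete detector: the version numbers and artifact paths are taken from the post as stated, while the function names (`installed_litellm_version`, `is_compromised`, `find_persistence_artifacts`) are hypothetical helpers introduced here for illustration. It only inspects the Python environment it runs in, so it would need to be run once per virtualenv, and finding nothing does not prove a machine is clean.

```python
from importlib import metadata
from pathlib import Path

# Versions and paths as reported in the post; treat them as assumptions
# and confirm against the linked sources before relying on them.
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}
PERSISTENCE_PATHS = (
    ".config/sysmon/sysmon.py",
    ".config/systemd/user/sysmon.service",
)


def installed_litellm_version():
    """Return the litellm version in this environment, or None if absent."""
    try:
        return metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None


def is_compromised(version):
    """True if the given version string is one the post flags as compromised."""
    return version in COMPROMISED_VERSIONS


def find_persistence_artifacts(home=None):
    """Return the reported persistence files that exist under the home directory."""
    base = Path(home) if home is not None else Path.home()
    return [base / rel for rel in PERSISTENCE_PATHS if (base / rel).exists()]


if __name__ == "__main__":
    version = installed_litellm_version()
    if version is None:
        print("litellm is not installed in this environment")
    elif is_compromised(version):
        print(f"litellm {version} is a flagged version: rotate every secret")
    else:
        print(f"litellm {version} is not one of the flagged versions")

    for path in find_persistence_artifacts():
        print(f"persistence artifact found: {path}")
```

Reading the version through `importlib.metadata` avoids importing litellm itself. According to the post this does not make the check safe on an affected machine, because the .pth file runs at interpreter startup regardless of imports.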

liked a model 4 days ago

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF

liked a model 13 days ago

mistralai/Mistral-Small-4-119B-2603



upvoted a collection 4 months ago

📙 LLM Engineer's Handbook

Collection

Models and datasets from my book. All the code is freely available at https://github.com/PacktPublishing/LLM-Engineers-Handbook • 6 items • Updated Apr 7, 2025 • 16

upvoted a collection 9 months ago

Kimi-K2

Collection

Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 5 items • Updated Jan 27 • 172

upvoted an article 11 months ago

Article

The 4 Things Qwen-3’s Chat Template Teaches Us

Apr 30, 2025


upvoted 3 papers 11 months ago

MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining

Paper • 2505.07608 • Published May 12, 2025 • 82

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

Paper • 2505.04921 • Published May 8, 2025 • 187

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6, 2025 • 191

upvoted an article 11 months ago

Article

Yes, Transformers are Effective for Time Series Forecasting (+ Autoformer)

Jun 16, 2023


upvoted 2 papers 11 months ago

100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models

Paper • 2505.00551 • Published May 1, 2025 • 36

ReasonIR: Training Retrievers for Reasoning Tasks

Paper • 2504.20595 • Published Apr 29, 2025 • 54

upvoted an article 11 months ago

Article

What is MoE 2.0? Update Your Knowledge about Mixture-of-experts

Apr 27, 2025


upvoted a paper 11 months ago

R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

Paper • 2505.02835 • Published May 5, 2025 • 28

upvoted 2 articles 11 months ago

Article

🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It?

Mar 17, 2025 • 355

Article

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

Jan 31, 2025


upvoted 3 papers 11 months ago

Phi-4-reasoning Technical Report

Paper • 2504.21318 • Published Apr 30, 2025 • 54

BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs

Paper • 2504.18415 • Published Apr 25, 2025 • 49

Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Paper • 2504.21233 • Published Apr 30, 2025 • 49

upvoted 2 articles 11 months ago

Article

Open R1: Update #3

Mar 11, 2025 • 297

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28, 2025 • 888

upvoted 2 papers 11 months ago

Trillion 7B Technical Report

Paper • 2504.15431 • Published Apr 21, 2025 • 38

Tina: Tiny Reasoning Models via LoRA

Paper • 2504.15777 • Published Apr 22, 2025 • 56
