Ji-Xiang
AI & ML interests
None yet
Recent Activity
liked a dataset about 4 hours ago: wenbopan/Chinese-dpo-pairs
liked a dataset about 4 hours ago: Blaze7451/Wiki-zh-20250601
liked a model 3 days ago: deepseek-ai/DeepSeek-V4-Pro-DSpark
Organizations
Image-Editing
Robotics
Reasoning models
Taiwanese Taigi Datasets
Image-Text-to-Text
Text Generation Inference
- Qwen/QwQ-32B
  Text Generation • 33B • 83.5k • 2.93k
- deepseek-ai/DeepSeek-R1-Distill-Llama-70B
  Text Generation • 71B • 463k • 782
- deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
  Text Generation • 33B • 808k • 1.57k
- microsoft/Phi-4-mini-flash-reasoning
  Text Generation • 4B • 810 • 281
Video generation
General screen parsing tool
Reasoning datasets
RLVR Datasets
Reinforcement Learning from Verifiable Rewards (RLVR) Datasets
WebGPU
HTML to Markdown
Logical Reasoning Datasets
Object Detection
Image-to-Video
SFT Datasets
Coder LLM
- Qwen2.5 Coder Artifacts
  Paused • Agents • Featured • 1.73k
  🐢 Generate and preview app code from a text description
- Qwen/Qwen2.5-72B-Instruct
  Text Generation • 73B • 626k • 960
- Qwen/Qwen2.5-32B-Instruct
  Text Generation • 33B • 3.01M • 354
- Qwen/Qwen2.5-14B-Instruct
  Text Generation • 15B • 2.12M • 351
Multimodal Language Models
Suggestion Models
- deepseek-ai/DeepSeek-R1
  Text Generation • 685B • 7.85M • 13.4k
- deepseek-ai/DeepSeek-R1-Distill-Qwen-32B
  Text Generation • 33B • 808k • 1.57k
- deepseek-ai/DeepSeek-R1-Distill-Llama-70B
  Text Generation • 71B • 463k • 782
- allenai/Llama-3.1-Tulu-3-70B
  Text Generation • 71B • 458 • 61
China models
China-dataset
unfiltered dataset
Edge Computing
Medical
GGUF Models
Visual Question Answering
Multi Tasks
DPO datasets
SLM (small language models)
- HuggingFaceTB/SmolLM-135M
  Text Generation • 0.1B • 180k • 261
- HuggingFaceTB/SmolLM-135M-Instruct
  Text Generation • 0.1B • 31.5k • 140
- HuggingFaceTB/SmolLM-360M-Instruct
  Text Generation • 0.4B • 6.24k • 85
- HuggingFaceTB/SmolLM-360M
  Text Generation • 0.4B • 7.71k • 70
Vision-Language dataset
Dense Passage Retrieval (DPR) Datasets
background-removal
Try on
Docling
V-JEPA 2
- facebook/vjepa2-vitl-fpc64-256
  Video Classification • 0.3B • 173k • 203
- facebook/vjepa2-vith-fpc64-256
  Video Classification • 0.7B • 150k • 20
- facebook/vjepa2-vitg-fpc64-256
  Video Classification • 1B • 163k • 56
- facebook/vjepa2-vitg-fpc64-384
  Video Classification • 1B • 6k • 42
GUI-Actor
1-bit Large Language Model (LLM)
GRPO datasets
Conversational Speech Model
OCR tools
Images Datasets
Critique Fine-Tuning (CFT) Datasets
Test-time scaling Datasets
Please see below:
https://medium.com/@techsachin/s1-simple-test-time-scaling-approach-to-exceed-openais-o1-preview-performance-ec5a624c5d2f
Thinking/Reasoning Datasets
RLHF Datasets
Math Datasets
Multilingual-dataset
Retrieval-Augmented Generation (RAG) Dataset
Multilingual Large Language Models
- meta-llama/Llama-3.2-1B
  Text Generation • 1B • 1.85M • 2.47k
- meta-llama/Llama-3.2-3B-Instruct
  Text Generation • 3B • 2.22M • 2.29k
- meta-llama/Llama-3.1-8B-Instruct
  Text Generation • 8B • 9.69M • 6.2k
- mistralai/Mistral-Small-24B-Base-2501
  24B • 3.95k • 262
Recommended Datasets
Text-to-Video
Traditional-chinese-dataset
Chinese models
- MediaTek-Research/Breeze-7B-Instruct-v0_1
  Text Generation • 7B • 433 • 90
- MediaTek-Research/Breeze-7B-Base-v0_1
  Text Generation • 7B • 11 • 23
- MediaTek-Research/Breeze-7B-Instruct-v1_0
  Text Generation • 7B • 1.47k • 67
- YC-Chen/Breeze-7B-Instruct-v1_0-GGUF
  Text Generation • 7B • 71 • 22
Uncensored models
common-dataset
Image Generator
- Playground V2.5
  Running on Zero • Agents • Featured • 1.14k
  🌍 Generate highly aesthetic images
- FLUX.1 [dev]
  Running on Zero • Agents • Featured • 9.48k
  🖥 Generate images from text prompts
- Kolors Character With Flux
  Running • Agents • Featured • 914
  🤹 Kolors Character to keep character developed with Flux
- franciszzj/Leffa
  Image-to-Image • 344
Voice
Big Language Models
- deepseek-ai/DeepSeek-V2-Chat
  Text Generation • 236B • 12.8k • 462
- deepseek-ai/DeepSeek-V2
  Text Generation • 236B • 5.34k • 334
- zai-org/cogvlm2-llama3-chat-19B
  Text Generation • 20B • 5.91k • 220
- ibm-granite/granite-34b-code-base-8k-GGUF
  Text Generation • 34B • 21 • 3
text-to-speech (TTS)
Chat
Vision
ORPO-DPO datasets
automatic speech recognition (ASR)
MoE
Audio-To-Text
Extreme Quantization
Agentic Coding
- mistralai/Devstral-2-123B-Instruct-2512
  125B • 24.3k • 325
- mistralai/Devstral-Small-2-24B-Instruct-2512
  24B • 345k • 625
- deepreinforce-ai/Ornith-1.0-397B
  Text Generation • 397B • 7.36k • 192
- deepreinforce-ai/Ornith-1.0-35B
  Text Generation • 665k • 186k • 297