arxiv:2603.10178
Taiwei Shi
MaksimSTW
AI & ML interests
reinforcement learning, alignment, human-AI collaboration, and computational social science
Recent Activity
authored a paper about 23 hours ago
Video-Based Reward Modeling for Computer-Use Agents
upvoted a paper 11 days ago
Video-Based Reward Modeling for Computer-Use Agents
authored a paper 20 days ago
DP-RFT: Learning to Generate Synthetic Text via Differentially Private Reinforcement Fine-Tuning