Paper: Multimodal Autoregressive Pre-training of Large Vision Encoders (arXiv:2411.14402)
timm-compatible AIM-v2 (https://huggingface.co/papers/2411.14402) image encoder weights, taken from https://huggingface.co/apple/aimv2-large-patch14-336-distilled.
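As a rough illustration of what "timm compatible" could mean in practice, here is a minimal sketch of loading such weights as a feature extractor. The timm model identifier is not given above, so `model_name` is left as a parameter the caller must supply; `load_encoder` and `patch_grid` are hypothetical helper names introduced only for this example. The defaults of 336 and 14 assume that the `patch14-336` part of the source repository name refers to a patch size of 14 and a 336x336 input.

```python
from typing import Tuple


def patch_grid(image_size: int = 336, patch_size: int = 14) -> Tuple[int, int]:
    """Return (patches per side, total patches) for a square input.

    Defaults are an assumption read off the 'patch14-336' repository name.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side, side * side


def load_encoder(model_name: str):
    """Load a timm model without a classifier head, plus its eval transform.

    model_name is whatever identifier timm exposes for these weights;
    it is not stated in the text above, so nothing is hard-coded here.
    """
    import timm  # imported lazily so patch_grid works without timm installed

    # num_classes=0 asks timm for pooled features instead of class logits.
    model = timm.create_model(model_name, pretrained=True, num_classes=0)
    model.eval()

    # Build the preprocessing pipeline from the model's pretrained config.
    data_config = timm.data.resolve_model_data_config(model)
    transform = timm.data.create_transform(**data_config, is_training=False)
    return model, transform
```

With the assumed defaults, `patch_grid()` gives a 24x24 grid of patches. Calling `load_encoder(...)` downloads weights, so it needs network access and an installed `timm`; the returned `transform` can be applied to a PIL image before passing a batched tensor to `model`.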