AI & ML interests

AI, ML, LLM, NLP

jjokah posted an update about 1 month ago
TranslateGemma: Open Translation Models (Jan 2026)

Google introduces TranslateGemma, a new suite of open translation models based on Gemma 3, available in 4B, 12B, and 27B parameter sizes.

Key Highlights:
• Supports 55 languages, with high-quality translation across high-, mid-, and low-resource languages
• Exceptional efficiency: the 12B model outperforms the 27B baseline on the WMT24++ benchmark
• Built with a two-stage fine-tuning process that distills knowledge from Gemini models
• Retains strong multimodal capabilities (it can translate text within images)
• Trained on nearly 500 additional language pairs for research adaptation
• Designed for diverse deployment environments, from mobile to cloud

The result is state-of-the-art translation quality at relatively small model sizes, making high-quality translation accessible across different devices and use cases.

https://huggingface.co/collections/google/translategemma
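
For a quick sense of how such a checkpoint could be used, here is a minimal sketch with the transformers library. The model id and prompt wording below are assumptions for illustration; check the model cards in the collection above for the actual identifiers and the recommended translation prompt format.

```python
# Minimal sketch: prompting a TranslateGemma-style checkpoint via transformers.
# NOTE: the model id below is hypothetical; see the collection above for the real
# names, and the model card for the recommended translation prompt format.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/translategemma-4b-it",  # hypothetical id, for illustration only
)

messages = [
    {"role": "user",
     "content": "Translate from English to French: The weather is nice today."}
]
result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])
```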
jjokah posted an update 6 months ago
The combination of Gemini Nano (Google's on-device AI model) and the Tensor G5 chip (Google's AI processor), built into the Pixel 10 (Google's smartphone), gives Google a strong foundation to keep pushing the limits of edge AI. A prime example is 🔮 Magic Cue.

Magic Cue digs through your device (Gmail, Calendar, Messages, Photos, screenshots, notes, and more) to surface what’s useful at that moment.

Ref (Magic Cue):
https://store.google.com/intl/en/ideas/articles/magic-cue/
jjokah posted an update 11 months ago
Video Tokenization for efficient AI video processing

Meet 𝐕𝐢𝐝𝐓𝐨𝐤, a new open-source video tokenization technique developed by Microsoft Research to address the computational challenges of processing large volumes of video data. The core problem VidTok tackles is the inefficiency caused by redundant information in raw video pixels.

VidTok converts complex video footage into compact, structured units called tokens, making it easier and more efficient for AI systems to analyze, understand, and generate video content.
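
To make the idea concrete, here is a toy sketch of the generic patch-to-token step. This illustrates the concept only and is not VidTok's actual API or architecture (see the paper and repository below for those); the patch sizes and dimensions are illustrative assumptions.

```python
# Conceptual sketch of video tokenization (NOT VidTok's actual API): a 3D patch
# embedding turns a raw video tensor into a compact grid of latent tokens,
# collapsing redundant per-pixel information into far fewer units.
import torch
import torch.nn as nn

class ToyVideoTokenizer(nn.Module):
    def __init__(self, in_channels=3, latent_dim=256,
                 temporal_patch=4, spatial_patch=8):
        super().__init__()
        # Each (temporal_patch x spatial_patch x spatial_patch) block of pixels
        # becomes a single latent token.
        self.patchify = nn.Conv3d(
            in_channels, latent_dim,
            kernel_size=(temporal_patch, spatial_patch, spatial_patch),
            stride=(temporal_patch, spatial_patch, spatial_patch),
        )

    def forward(self, video):  # video: (batch, channels, frames, height, width)
        latent = self.patchify(video)               # (batch, latent_dim, T', H', W')
        tokens = latent.flatten(2).transpose(1, 2)  # (batch, num_tokens, latent_dim)
        return tokens

video = torch.randn(1, 3, 16, 128, 128)   # 16 RGB frames at 128x128
tokens = ToyVideoTokenizer()(video)
print(tokens.shape)  # (1, 1024, 256): ~1K tokens instead of ~786K raw pixels per channel
```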

Research Paper: https://arxiv.org/abs/2412.13061
VidTok Code: https://github.com/microsoft/VidTok
jjokah posted an update 12 months ago
The past few years have been a blast for artificial intelligence, with large language models (LLMs) stunning everyone with their capabilities and powering everything from chatbots to code assistants. However, not all applications demand the massive size and complexity of LLMs; the computational power they require makes them impractical for many use cases. This is why Small Language Models (SLMs) entered the scene, making capable AI more accessible by shrinking model size.

In this article we go through what SLMs are, how they are made small, their benefits and limitations, real-world use cases, and how to run them on mobile and desktop devices.
https://huggingface.co/blog/jjokah/small-language-model
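
As a taste of what "small" buys you in practice, the sketch below loads a compact instruction-tuned checkpoint and runs it locally with transformers. The specific model id is just one example of a small open model (not one the article endorses); any similarly sized checkpoint can be swapped in.

```python
# Minimal sketch: running a small language model locally with transformers.
# "HuggingFaceTB/SmolLM2-135M-Instruct" is one example of a small open checkpoint;
# swap in any similarly sized model from the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM2-135M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # small enough to run on CPU

messages = [{"role": "user",
             "content": "In one sentence, why do small language models matter?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```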
jjokah posted an update over 1 year ago
Google's revamped Machine Learning Crash Course covers recent advances in AI, with an increased focus on interactive learning.

📝 100+ exercises
🗂 12 modules
🕒 15 hours
📹 Video explainers of ML concepts
🌎 Real-world examples
📊 Interactive visualizations

Ref:
https://developers.google.com/machine-learning/crash-course
jjokah posted an update over 1 year ago
🔗 Neural Network  (1 Byte explainer for everybody)

Just like our brain, a Neural Network is made up of interconnected "neurons". These neurons work together: they learn from (input) data, get better at their task (in the hidden layer), and produce (output) predictions or decisions.
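
For readers who want to see the explainer as code, here is a toy forward pass in NumPy. The weights are random placeholders for illustration; a real network would learn them from data.

```python
# A tiny feedforward neural network in plain NumPy, mirroring the explainer above:
# input data flows through a hidden layer of "neurons" to produce an output prediction.
# Weights here are random for illustration; real networks learn them from data.
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)
x = np.array([0.5, 0.2, 0.1])    # input features
W1 = rng.normal(size=(3, 4))     # input -> hidden weights (4 hidden neurons)
W2 = rng.normal(size=(4, 1))     # hidden -> output weights

hidden = sigmoid(x @ W1)         # hidden layer activations
output = sigmoid(hidden @ W2)    # the network's prediction
print(output)
```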