StevenMartin
June 4, 2026

Google has released Gemma 4 12B, a unified multimodal model that drops traditional separate encoders and processes visual and audio data directly. It needs only 16GB of VRAM for local deployment on consumer hardware. By using lightweight embedding layers, it reduces computational complexity while approaching the performance of the 26B MoE model. Open-sourced under Apache 2.0, it supports multiple inference frameworks and edge deployment, and has been downloaded more than 150 million times.
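
To put the 16GB figure in context, here is a back-of-the-envelope sketch of how much memory the weights alone might occupy at common numeric precisions. It assumes that "12B" means roughly 12 billion parameters and uses generic bytes-per-parameter values; it does not reflect the precision or quantization scheme Google actually ships, and it ignores the KV cache, activations, and runtime overhead.

```python
# Rough weight-memory estimate only; KV cache, activations, and framework
# overhead are NOT included, so real usage will be higher.

PARAMS = 12e9  # assumption: "12B" is read as about 12 billion parameters

# Generic storage cost per parameter for common formats (illustrative only).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def fits(params: float, bytes_per_param: float, vram_gb: float = 16.0) -> bool:
    """True if the weights alone fit within the given VRAM budget."""
    return weight_gb(params, bytes_per_param) <= vram_gb


for name, nbytes in BYTES_PER_PARAM.items():
    size = weight_gb(PARAMS, nbytes)
    verdict = "fits" if fits(PARAMS, nbytes) else "does not fit"
    print(f"{name}: {size:.1f} GB of weights, {verdict} in 16 GB")
```

Under these assumptions, 16-bit weights alone would exceed a 16GB budget while 8-bit or 4-bit weights would leave headroom, so the quoted requirement presumably depends on the precision used for deployment.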