Google DeepMind launched the Gemma series in February this year, releasing two open models with 2 billion and 7 billion parameters. At today's Google I/O developer conference, Google unveiled the next generation, the Gemma 2 series, whose first member is a considerably larger model with 27 billion parameters. The model is not available yet, however; it is expected to be released in June.
Josh Woodward, Vice President of Google Labs, explained the decision at last week's press roundtable: "We deliberately chose the 27-billion-parameter size. It has been optimized to run efficiently on Nvidia's next-generation GPUs or on a single TPU host in Vertex AI, which is one of the reasons it is easy to use. We have already seen outstanding performance from it, surpassing even models twice its size."
Gemma is a family of lightweight models that Google built specifically for developers, to help them integrate AI into their applications and devices without consuming excessive memory or processing power. This makes the Gemma series well suited to resource-constrained hardware such as smartphones, IoT devices, and personal computers. Since the series launched earlier this year, Google has added several variants, including CodeGemma for code completion, RecurrentGemma for improved memory efficiency, and the recently launched vision-language model PaliGemma.
At 27 billion parameters, Gemma 2 promises developers more accurate results and better performance than its predecessors, along with the ability to handle more complex tasks. Trained on larger datasets, it is expected to deliver higher-quality responses in less time.
When Woodward says that Gemma 2 is designed to run on a single TPU, he is referring specifically to TPUv5e, Google's latest-generation chip, which was released in August last year. In other words, on Google's hardware Gemma 2 relies on a dedicated AI chip to carry out its computations, which reduces latency and makes it more efficient at complex tasks such as image recognition and natural language processing. For developers, that translates into saved resources that can be reinvested in their applications.
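To give a rough sense of why parameter count matters for fitting a model on a single host, the following minimal sketch estimates the memory needed just to hold a model's weights. It is back-of-the-envelope arithmetic, not a statement about how Google deploys Gemma 2: the function name, the numeric precisions listed, and the use of the nominal 2B, 7B, and 27B parameter counts are all assumptions made for illustration, and the estimate ignores activations, caches, and runtime overhead.

```python
# Rough estimate of the memory needed to store model weights only.
# Assumptions (not from the article): nominal parameter counts and the
# listed numeric precisions; activations and caches are ignored.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return the approximate weight footprint in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Nominal sizes mentioned in the article.
MODEL_SIZES = {"2B": 2e9, "7B": 7e9, "27B": 27e9}

# Common storage precisions and their bytes per parameter (illustrative).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def footprint_table() -> dict:
    """Build {model: {precision: gigabytes}} for every combination."""
    return {
        name: {
            precision: weight_memory_gb(params, nbytes)
            for precision, nbytes in BYTES_PER_PARAM.items()
        }
        for name, params in MODEL_SIZES.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        cells = ", ".join(f"{p}: {gb:.1f} GB" for p, gb in row.items())
        print(f"{name:>4} -> {cells}")
```

Under these assumptions, a 27-billion-parameter model stored in 16-bit precision needs about 54 GB for its weights alone, which is why the choice of size is tied so closely to the accelerator memory available on one host.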
Gemma 2's debut comes in the same week that OpenAI announced its multimodal LLM, GPT-4o. GPT-4o is being described as a "major upgrade" to the current user experience, especially for people using the free version of ChatGPT. Google DeepMind's Gemma 2 series, for its part, offers developers a new option for building and deploying AI applications.