Google Unveils Its Most Powerful Text Embedding Model Yet: Gemini Embedding

2025-03-10

On Friday, Google introduced a brand-new experimental text "embedding" model called Gemini Embedding to its Gemini developer API.

Embedding models transform text inputs, such as words and phrases, into numerical representations known as embeddings, which capture the semantic meaning of the text. These embeddings are utilized in various applications like document retrieval and classification, largely because they can reduce costs and improve latency.
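To make the retrieval use case concrete, here is a minimal sketch of how embeddings are typically compared. It uses cosine similarity, a common choice for this task. The three-dimensional vectors, the document titles, and the helper names `cosine_similarity` and `rank_documents` are all hypothetical and chosen for illustration; they are not output from Gemini Embedding or part of any Google API, and real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_embedding, documents):
    """Return document names ordered from most to least similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), name)
        for name, embedding in documents.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Hypothetical toy embeddings; a real model would produce these from the text.
documents = {
    "quarterly earnings report": [0.9, 0.1, 0.0],
    "protein folding study": [0.1, 0.9, 0.1],
    "court ruling summary": [0.0, 0.2, 0.9],
}

# Hypothetical embedding for a finance-related query.
query_embedding = [0.8, 0.2, 0.1]

ranking = rank_documents(query_embedding, documents)
print(ranking)
```

In this toy setup, the document whose vector points in nearly the same direction as the query ranks first. A production system would usually replace the linear scan with a vector index.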

Companies such as Amazon, Cohere, and OpenAI already offer embedding models through their respective APIs. While Google has previously provided embedding models, Gemini Embedding marks its first model trained on the Gemini series of AI models.

In a blog post, Google stated, "This embedding model is trained on the Gemini model, inheriting Gemini's understanding of language and nuanced context, making it suitable for a wide range of applications." The company added, "We have trained our model to be highly versatile, performing exceptionally well across different domains such as finance, science, law, and search."

Google claims that Gemini Embedding outperforms its previous state-of-the-art embedding model, text-embedding-004, and delivers competitive performance on popular embedding benchmarks. Compared to text-embedding-004, Gemini Embedding can process larger chunks of text and code at once and supports twice as many languages (over 100).

Google noted that Gemini Embedding is still in the "experimental stage" with limited capacity and may undergo changes. The company wrote in its blog post, "We are working towards launching a stable and widely accessible version in the coming months."