Google announces Gemini 1.5 Flash, a rapid multimodal model with a 1M context window

May 15, 2024 | by

Google is announcing the release of Gemini 1.5 Flash, a small multimodal model built for scale and tackling narrow high-frequency tasks. It has a one million token context window and is available today in public preview through the Gemini API within Google AI Studio.

However, that’s not the only Gemini news. Gemini 1.5 Pro, which debuted in February, is receiving an enlarged context window, expanding to two million tokens from one million. Developers interested in this update will have to sign up for the waitlist.

There are some notable differences between Gemini 1.5 Flash and Gemini 1.5 Pro. The former is intended for those who care about output speed, while the latter has more weight and performs similarly to Google’s large 1.0 Ultra model. Josh Woodward, Google’s vice president of Google Labs, points out that developers should use Gemini 1.5 Flash if they are looking to address quick tasks where low latency matters. On the other hand, he explains Gemini 1.5 Pro is geared towards “more general or complex, often multi-step reasoning tasks.”

Developers now have a wider selection of AI from which to choose versus a one-size-fits-all approach. Not all apps require the same data and AI capabilities and having variations can make the difference in how users experience an AI-powered service. What may be appealing is that Google found a way to essentially bring a state-of-the-art AI model to developers while accelerating its performance. Perhaps the biggest downside is that it’s not trained on large enough datasets that developers may want. In that case, the next option is to move up to Gemini 1.5 Pro.

Google’s models span the spectrum from the most lightweight with Gemma and Gemma 2 to Gemini Nano, Gemini 1.5 Flash, Gemini 1.5 Pro, and Gemini 1.0 Ultra. “Developers can move between the different sizes, depending on the use case. That’s why it’s got the same multimodal input abilities, the same long context, and, of course, runs as well in the same sort of backend,” Woodward points out.

This new small language model was revealed 24 hours after one of Google’s biggest AI competitors, OpenAI, unveiled GPT-4o, a multimodal LLM that will be available for all users and includes a desktop app.

Both Gemini 1.5 models are available in public preview in over 200 countries and territories worldwide, including the European Economic Area, the UK and Switzerland.

Updated as of May 14 at 12:06 p.m. PT: Corrected to state only Gemini 1.5 Pro will receive a two million context window, not Gemini 1.5 Flash.

The post Google announces Gemini 1.5 Flash, a rapid multimodal model with a 1M context window appeared first on Venture Beat.


View all

view all