Hands on with Gemini 2.5 Pro: This might be the most useful reasoning model yet




Unfortunately for Google, the release of its latest flagship language model, Gemini 2.5 Pro, was buried under the Studio Ghibli AI image storm that sucked the air out of the AI space. And perhaps wary of its previous failed launches, Google cautiously presented it as "our most intelligent AI model," in contrast to the approach of other AI labs, which introduce their new models as the best in the world.

However, practical experiments with real-world examples show that Gemini 2.5 Pro is really impressive and might currently be the best reasoning model. This opens the way for many new applications and possibly puts Google at the forefront of the generative AI race.

Figure: Polymarket AI race
Source: Polymarket

Long context with good coding capabilities

The outstanding feature of Gemini 2.5 Pro is its very long context window and output length. The model can process up to 1 million tokens in a prompt (with 2 million planned), enough to fit multiple long documents and entire code repositories when needed. The model also has an output limit of 64,000 tokens, compared with about 8,000 for other Gemini models.

The long context window also allows for extended conversations, because each interaction with a reasoning model can generate tens of thousands of tokens, especially if code, images and video are involved (for comparison, Claude 3.7 Sonnet has a 200,000-token context window).
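To make the context budget concrete, here is a minimal sketch of a pre-flight check on whether a set of files is likely to fit in the window. The 1-million-token limit comes from the figures above; the four-characters-per-token ratio is only a crude assumption (a real tokenizer will give different counts), and the function names are hypothetical.

```python
# Rough context-budget check. The ratio below is an illustrative assumption,
# not the model's real tokenizer.
CONTEXT_LIMIT = 1_000_000   # input window cited in the article
CHARS_PER_TOKEN = 4         # crude heuristic for English text and code


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents: list[str], reserved_for_history: int = 0) -> bool:
    """Return True if the documents plus reserved conversation tokens fit."""
    total = sum(estimate_tokens(doc) for doc in documents) + reserved_for_history
    return total <= CONTEXT_LIMIT


# Stand-ins for real file contents (hypothetical sizes).
repo_files = ["x" * 400_000, "y" * 1_200_000]

print(fits_in_context(repo_files))
print(fits_in_context(repo_files, reserved_for_history=700_000))
```

Reserving part of the budget for conversation history matters in long sessions, since every turn with a reasoning model adds to the running total.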

For example, software engineer Simon Willison used Gemini 2.5 Pro to create a new feature for his website. Willison wrote on his blog that the model worked through his entire codebase and identified every place that needed to change, 18 files in total, and that the whole project took about 45 minutes.

Effective multimodal reasoning

Gemini 2.5 Pro also has impressive reasoning abilities over unstructured text, images and video. For example, I provided it with the text of my recent article about sampling-based search and asked it to create an SVG graphic that depicts the algorithm described in the text. Gemini 2.5 Pro correctly extracted the key information from the article and created a flowchart for the sampling and search process, even getting the conditional steps right. (For reference, the same task took multiple interactions with Claude 3.7 Sonnet, and I eventually hit the token limit.)

The rendered image had some visual errors (the arrowheads were misplaced) and could use a facelift, so I next gave Gemini 2.5 Pro a multimodal prompt that combined the rendered image with the code and asked it to improve the diagram. The results were impressive. It fixed the arrowheads and improved the visual quality of the diagram.

Other users have had similar experiences with multimodal prompts. For example, in their testing, DataCamp replicated the runner game example presented in the Google Blog, then provided Gemini 2.5 Pro with the code and a video recording of the game and asked for some changes to the game's code. The model could reason over the visuals, find the part of the code that needed to change, and make the correct modifications.

It is worth noting, however, that like other generative models, Gemini 2.5 Pro is prone to mistakes such as modifying unrelated files and code segments. The more precise your instructions, the lower the risk of the model making incorrect changes.

Data analysis with a useful reasoning trace

Finally, I tested Gemini 2.5 Pro on my classic messy data analysis test for reasoning models. I provided it with a file containing a mix of plain text and raw HTML data that I had copied and pasted from various stock history pages on Yahoo! Finance. Then I asked it to calculate the value of a portfolio that invests $140 every month, split equally across the stocks, starting in early January 2024 and running through the latest date in the document.

The model correctly identified which stocks it had to pick from the file (Apple, Nvidia, Microsoft, Tesla, Alphabet and Meta) and calculated the value of each stock holding and of the portfolio.
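For readers who want to check this kind of result by hand, the arithmetic behind the test can be sketched in a few lines. This is only an illustration of equal-split monthly investing, not the model's output: the tickers and prices below are made up, and the function name is hypothetical.

```python
# Equal-split monthly investing. All tickers and prices are hypothetical
# placeholders, NOT real market data.
MONTHLY_BUDGET = 140.0

# Price of each stock on each monthly purchase date.
monthly_prices = {
    "AAA": [100.0, 125.0],
    "BBB": [50.0, 40.0],
}

# Price of each stock on the final valuation date.
final_prices = {"AAA": 125.0, "BBB": 50.0}


def portfolio_value(
    monthly_prices: dict[str, list[float]],
    final_prices: dict[str, float],
    monthly_budget: float,
) -> float:
    """Value of the shares bought with an equal monthly split, at final prices."""
    per_stock = monthly_budget / len(monthly_prices)
    total = 0.0
    for ticker, prices in monthly_prices.items():
        # Fractional shares accumulated across all purchase dates.
        shares = sum(per_stock / price for price in prices)
        total += shares * final_prices[ticker]
    return total


invested = MONTHLY_BUDGET * len(next(iter(monthly_prices.values())))
value = portfolio_value(monthly_prices, final_prices, MONTHLY_BUDGET)
print(f"invested={invested:.2f} value={value:.2f}")
```

The sketch assumes fractional shares and ignores fees and dividends. The harder part of the real test, which the model handled, is extracting clean prices from the mixed text and HTML in the first place.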

More importantly, I found the reasoning trace very useful. It is not clear whether Google reveals the raw chain-of-thought (CoT) tokens for Gemini 2.5 Pro, but the reasoning trace is very detailed. You can clearly see how the model reasons over the data, extracts different pieces of information and calculates the results before generating the answer. This can help you troubleshoot the model's behavior and steer it in the right direction when it makes mistakes.

Enterprise-grade reasoning?

One concern about Gemini 2.5 Pro is that it is only available in reasoning mode, which means the model will always go through the "thinking" process, even for very simple prompts that could be answered directly.

Gemini 2.5 Pro is currently in preview. Once the full model is released and pricing information is available, we will have a better understanding of how much it will cost to build enterprise applications on the model. However, as inference costs continue to fall, we can expect it to become practical at scale.

Gemini 2.5 Pro might not have had the splashiest debut, but its capabilities demand attention. Its massive context window, impressive multimodal reasoning and detailed reasoning chain offer tangible advantages for complex enterprise workloads, including nuanced data analysis.


