


A new model challenging OpenAI is now free to use: performance approaching GPT-4 with 40% of the training compute
On Thursday, American AI startup Inflection AI officially released its new-generation large language model, Inflection-2.5.
According to the company, Inflection-2.5 combines powerful LLM technology with Inflection’s distinctive “empathy fine-tuning”, pairing high emotional intelligence with high IQ. It can retrieve factual information from the Internet, and its performance is comparable to leading large models such as GPT-4 and Gemini.
Inflection-2.5 is now available for free to all Pi users on PC and in the iOS and Android apps. In a quick test, Heart of the Machine found that it still lags GPT-4 somewhat, but it is worth a try. Interested users can experience it themselves.
Link: https://pi.ai/talk
It is worth noting that Inflection-2.5 achieves performance close to GPT-4 while using only 40% of the computing power that was used to train GPT-4.
Inflection AI points out that the new-generation model has made significant progress in areas such as coding and mathematics. These advances translate into concrete improvements on key industry benchmarks, helping Pi remain at the forefront of the technology. In addition, Pi integrates world-class real-time web search, giving users access to high-quality breaking news and up-to-date information.
Inflection-2.5 vs GPT-4
Inflection-1 was trained with about 4% of the FLOPs used for GPT-4 and, across a range of "IQ-oriented" tasks, averaged about 72% of GPT-4's performance. Inflection-2.5 now averages over 94% of GPT-4's performance while using only 40% of GPT-4’s training FLOPs. As shown in the figure below, Inflection-2.5 improves significantly across the board, with the largest gains in STEM domain knowledge.
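To put the quoted compute and performance figures side by side, here is a small Python sketch. It uses only the percentages stated above; the names `QUOTED` and `compare` are made up for illustration, and "over 94%" is treated as exactly 94, so the computed gain is a lower bound.

```python
# Figures quoted in the article, as percentages of GPT-4.
# "Over 94%" is treated as exactly 94 here (an assumption), so the
# gain is a lower bound and the remaining gap is an upper bound.
QUOTED = {
    "Inflection-1": {"train_flops_pct": 4, "avg_performance_pct": 72},
    "Inflection-2.5": {"train_flops_pct": 40, "avg_performance_pct": 94},
}


def compare(old: str, new: str) -> dict:
    """Return the compute multiple and performance change between two models."""
    a, b = QUOTED[old], QUOTED[new]
    return {
        # How many times more training compute the newer model used.
        "compute_multiple": b["train_flops_pct"] / a["train_flops_pct"],
        # Gain in average performance, in percentage points of GPT-4's level.
        "gain_points": b["avg_performance_pct"] - a["avg_performance_pct"],
        # Distance still separating the newer model from GPT-4's level.
        "remaining_gap_points": 100 - b["avg_performance_pct"],
    }


result = compare("Inflection-1", "Inflection-2.5")
print(
    f"{result['compute_multiple']:.0f}x compute, "
    f"+{result['gain_points']} points, "
    f"{result['remaining_gap_points']} points short of GPT-4"
)
```

By these figures, Inflection-2.5 used ten times the training compute of Inflection-1 and gained at least 22 percentage points of average GPT-4-relative performance, leaving a gap of under 6 points.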
The results of Inflection-2.5 on two STEM exams, the Hungarian Mathematics Examination and the Physics Graduate Record Examination (GRE), are as follows:
As shown in the table below, the team also evaluated Inflection-2.5 on the MMLU and GPQA Diamond benchmarks. MMLU covers 57 subjects across STEM, the humanities, the social sciences, and more, testing the breadth of an LLM’s knowledge, while GPQA Diamond is an extremely difficult expert-level benchmark.
On the BIG-Bench-Hard benchmark, Inflection-2.5 improves on Inflection-1 by more than 10% and is comparable to GPT-4. BIG-Bench-Hard mainly covers problems that are difficult for large language models to solve.
The team also evaluated the model on the MT-Bench benchmark. However, they found that a large portion (nearly 25%) of the examples in the Reasoning, Mathematics, and Coding categories had incorrect reference solutions or flawed premises. They therefore corrected these examples and ran the evaluation again; the results are shown in the following table:
Evaluation results on the GSM8k and MATH benchmarks show that Inflection-2.5 is a significant improvement over Inflection-1 in math and coding capabilities:
To further test the coding ability of Inflection-2.5, the team ran evaluations on two coding benchmarks, MBPP and HumanEval, and the results are shown in the following table:
The research team also evaluated Inflection-2.5 on HellaSwag and ARC-C, common-sense and science benchmarks on which many models report results. Judging from the results below, Inflection-2.5 performs strongly on these benchmarks.
All of the above evaluations were done with the model that now powers Pi. However, the user experience may differ slightly because of web retrieval (the benchmarks above do not use web retrieval), the structure of the few-shot prompts, and other production-side factors.
Overall, Inflection-2.5 keeps Pi’s empathetic character and very high safety standards while becoming a more capable and more useful model.
Recently, competition in large language model technology has become fierce. Among the many companies involved, Mistral AI (Mistral Large) and Anthropic (Claude 3) stand out, with new models that reach capabilities close to GPT-4 and Gemini Ultra. Inflection-2.5, which appeared yesterday, seems to be joining the first echelon.
Inflection AI, a star startup in Silicon Valley, has a notable pedigree. It was established in 2022 by three co-founders: Mustafa Suleyman, a co-founder of DeepMind; Reid Hoffman, a co-founder of LinkedIn; and Karen Simonyan, former chief scientist at DeepMind.
In June last year, Inflection AI announced that it had raised US$1.3 billion in a financing round led by Microsoft, Nvidia, Reid Hoffman, Bill Gates, and former Google CEO Eric Schmidt. Inflection AI is currently the fourth largest generative AI startup in the world.
