


AI solves college mathematics problems in seconds with more than 80% accuracy, and writes new questions too
Maybe the math test questions you took were machine-generated.
MIT students can easily solve problems in subjects such as multivariable calculus, differential equations, and linear algebra, but those same problems have stumped machine learning models. Until now, such models could answer only elementary or high school level math questions, and they did not always find the right answer.
Now, researchers from MIT, Columbia University, Harvard University, and the University of Waterloo have used few-shot learning and OpenAI's Codex to automatically synthesize programs that solve college mathematics problems in a few seconds, reaching human-level performance. The research was published in the Proceedings of the National Academy of Sciences (PNAS).
In addition, the model can explain the generated solutions and quickly generate new college mathematics problems. When the researchers showed these machine-generated questions to students, the students couldn't even tell whether the questions were generated by an algorithm or a human.
This research can also be used to streamline course content generation, which is especially useful for schools with thousands of students and massive open online courses (MOOCs). The system can also act as an online tutor, showing students the steps to solve math problems.
Paper address: https://www.pnas.org/doi/epdf/10.1073/pnas.2123433119
The method of this study combines three innovations:
- Unlike models pre-trained only on text, the model in this study is pre-trained on text and then fine-tuned on code;
- Few-shot learning is used to synthesize programs that correctly solve the mathematical problems;
- The system can solve problems, explain the solutions, and generate new questions.
Examples of new questions generated by this research are as follows.
## A model that can answer, solve and pose questions
The study randomly selected 25 problems from each of seven courses: MIT's 18.01 Single Variable Calculus, 18.02 Multivariable Calculus, 18.03 Differential Equations, 18.05 Introduction to Probability and Statistics, 18.06 Linear Algebra, 6.042 Mathematics for Computer Science, and Columbia University's COMS3251 Computational Linear Algebra.
For the MATH dataset, the study randomly sampled 15 questions from each of six topics (Prealgebra, Algebra, Counting and Probability, Intermediate Algebra, Number Theory, and Precalculus).
Unlike networks such as GPT-3 that are pre-trained only on text, the model used here is also trained on code. The researchers transformed the problems into programming tasks and applied program synthesis and few-shot learning. Turning a mathematical problem into a programming task can be as simple as rewriting "find the distance between two points" as "write a program that finds the difference between two points."
Notably, the Codex model used in this research is not only pre-trained on text but also fine-tuned on code, which lets it generate programs that solve mathematical problems at scale.
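To make the distance example concrete, here is a minimal sketch of the kind of short program such a rewritten prompt might yield. The function name `distance` and the sample points are illustrative assumptions and are not taken from the paper.

```python
import math


def distance(p, q):
    """Euclidean distance between two points given as coordinate tuples."""
    # Take the coordinate-wise difference between the two points,
    # then return the length of that difference vector.
    diff = [b - a for a, b in zip(p, q)]
    return math.sqrt(sum(d * d for d in diff))


if __name__ == "__main__":
    # Illustrative input only: the points (1, 2) and (4, 6).
    print(distance((1, 2), (4, 6)))
```

The point of the rewrite is that the question becomes something a code model can answer by producing and running a program, rather than by predicting the numeric answer directly as text.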
During pre-training, the model is shown millions of code examples from online repositories. Because its training data includes millions of natural-language words as well as millions of lines of code, it learns the relationships between snippets of text and snippets of code.

As shown in the figure below, this study uses zero-shot and few-shot learning to automatically generate programs that solve 81% of the mathematical problems. The researchers then use Codex to explain the generated programs. These programs can output answers in many forms. For example, a program that computes and plots the geometry of a singular value decomposition (SVD) gives not only the correct answer but also a corresponding explanation.

Drori, one of the authors of the paper, explained that many mathematical problems can be solved using graphs or trees, but problems written as text are difficult to convert into such a representation. However, because the model has learned the relationship between text and code, it can convert a text question into code when given just a few question-code examples, and then run the code to answer the question.

"When you ask questions using only text, it is difficult for machine learning models to give answers, even though the answer may be in the text. This work fills in the missing piece: code and program synthesis," Drori said.

Drori added that this work is the first to solve undergraduate mathematics problems, and that it improves accuracy from 8% to more than 80%.

Converting mathematical problems into programming tasks is not always easy, however. Some problems require the researchers to add context so that the neural network can handle them correctly. A student picks up this background knowledge while taking the course, but the neural network does not have it unless the researchers state it explicitly. For example, they may need to explain that "network" in a question refers to a neural network and not a communication network.
Or they may need to tell the model which programming package to use. They may also need to provide certain definitions. For example, in a question about playing cards, they may need to tell the model that each deck contains 52 cards.

The study automatically feeds these programming tasks, along with the added context and examples, into the pre-trained and fine-tuned neural network, which outputs a program that typically produces the correct answer. The answers were correct for more than 80% of the questions.

The researchers also used their model to generate questions, by giving the neural network a series of mathematical questions on a topic and then letting it create a new one. For example, given questions about quantum detection of horizontal and vertical lines, it produced a new question about quantum detection of diagonal lines. So it is not merely creating new problems by replacing values and variables in existing ones.

The researchers tested the machine-generated questions by showing them to college students. They gave students 10 problems from an undergraduate mathematics course in random order; five were written by humans and five were generated by the machine. The students were unable to tell whether a question had been generated by an algorithm or written by a human, and they rated the two kinds of question similarly for difficulty and appropriateness for the course.

However, Drori noted that this work is not intended to replace human professors.

"Now the accuracy has reached 80%, but it will not reach 100%. Every time you solve a problem, someone will ask a harder problem. But this work opens up the field for people to start using machine learning to solve increasingly difficult problems. We think this will have a huge impact on higher education," Drori said.

The research team is excited about the success of their approach and is extending the work to handle mathematical proofs. They also plan to address some limitations.
Currently, the model cannot answer questions that have a visual component, and it cannot solve problems that are computationally intractable. In addition to overcoming these obstacles, the researchers aim to scale the model to hundreds of courses. With those courses, they will generate more data to improve automation and provide insights into course design and curricula.

Figure: Applying a neural network, OpenAI Codex, to solve, explain, and generate mathematical problems.
Figure: Adding context
Figure: Human-written questions vs. machine-generated questions
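The few-shot setup with added context that the article describes can be sketched roughly as follows. This is only a sketch under stated assumptions: `build_prompt`, `run_program`, the prompt layout, and the example questions are hypothetical and do not come from the paper. The call to the code-generation model is left out, and a hand-written program stands in for its output.

```python
import contextlib
import io


def build_prompt(examples, question, context=""):
    """Assemble a few-shot prompt from (question, program) pairs and a new question.

    The layout (question as a docstring, then its program) is an assumption
    made for this sketch, not the format used in the paper.
    """
    parts = []
    for example_question, program in examples:
        parts.append(f'"""{example_question}"""\n{program}\n')
    # Prepend the extra context (for example, a definition) to the new question.
    new_question = f"{context} {question}".strip()
    parts.append(f'"""{new_question}"""\n')
    return "\n".join(parts)


def run_program(source):
    """Execute Python source and return whatever it prints.

    Model-generated code is untrusted; a real system would sandbox this step.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue().strip()


# One illustrative question-code pair used as the few-shot example.
examples = [
    (
        "Write a program that computes 3 factorial.",
        "import math\nprint(math.factorial(3))",
    ),
]
# Context a student would already know but the model must be told.
context = "A standard deck contains 52 cards, 4 of which are aces."
question = "Write a program that computes the probability of drawing an ace."
prompt = build_prompt(examples, question, context)

# Hand-written stand-in for the program a code model might return for `prompt`.
generated = "from fractions import Fraction\nprint(Fraction(4, 52))"
answer = run_program(generated)
print(answer)
```

Keeping prompt construction separate from execution mirrors the two stages the article describes: first the question, context, and examples are turned into a programming task, then the resulting program is run to obtain the answer.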