
Unleashing the power of code: large models and agents will wield even more powerful capabilities

Jan 16, 2024, 01:12 PM
Tags: project, UIUC

Just as the wand once forged the legends of extraordinary wizards such as Dumbledore, traditional large language models with huge potential have, after pre-training or fine-tuning on code corpora, mastered capabilities far beyond their original ones.

Specifically, these code-enhanced large models show improvements in writing code, stronger reasoning, autonomous invocation of execution interfaces, autonomous self-improvement, and more, all of which benefit them in every respect when they serve as AI agents performing downstream tasks.

Recently, a research team from the University of Illinois at Urbana-Champaign (UIUC) published an important review.


Paper link: https://arxiv.org/abs/2401.00812

This review explores how code gives large language models (LLMs) and the intelligent agents (Intelligent Agents) built on them their powerful capabilities.
Here, code specifically refers to formal languages that are both machine-executable and human-readable, such as programming languages and predefined function sets. Just as we guide LLMs to understand and generate traditional natural language, making LLMs proficient in code only requires applying the same language-modeling training objectives to code data.
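
To make this concrete, here is a minimal sketch of applying the same next-token language-modeling objective to code text. The character-level vocabulary and the tiny embedding-plus-linear "model" are illustrative assumptions standing in for a real tokenizer and Transformer, not the training setup of any model discussed in the review.

```python
# Minimal sketch (assumptions: toy character-level vocab, toy model):
# the same next-token prediction objective, applied to code data.
import torch
import torch.nn as nn

code_snippet = "def add(a, b):\n    return a + b\n"
vocab = sorted(set(code_snippet))                # toy character-level vocabulary
stoi = {ch: i for i, ch in enumerate(vocab)}
ids = torch.tensor([stoi[ch] for ch in code_snippet])

embed = nn.Embedding(len(vocab), 32)             # a real LLM would be a Transformer
head = nn.Linear(32, len(vocab))

inputs, targets = ids[:-1], ids[1:]              # shift by one: predict the next token
logits = head(embed(inputs))
loss = nn.functional.cross_entropy(logits, targets)
print(f"next-token LM loss on code: {loss.item():.3f}")
```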

Unlike traditional language models, today's commonly used LLMs, such as Llama 2 and GPT-4, have not only grown significantly in size, but have also been trained on code corpora in addition to typical natural language corpora. Code has standardized syntax, logical consistency, abstraction, and modularity, and can translate high-level goals into executable steps, making it an ideal medium for connecting humans and computers.

As shown in Figure 2, in this review the researchers compiled relevant work and analyzed in detail the various advantages of incorporating code into LLM training data.


Specifically, the researchers observed that the unique properties of code help to:

1. Enhance LLMs' code-writing, reasoning, and structured-information-processing capabilities, enabling them to tackle more complex natural language tasks;
2. Guide LLMs to generate structured and accurate intermediate steps, which can then be connected to external execution ends through function calls;
3. Use the code compilation and execution environment to provide diverse feedback for autonomous model improvement.

In addition, the researchers investigated in depth how these code-derived improvements strengthen LLMs as the decision-making centers of intelligent agents, where they must understand instructions, decompose goals, plan and execute actions, and improve from feedback.

As shown in Figure 3, in the first part, the researchers found that pre-training LLMs on code expands their task scope beyond natural language. These models can support a variety of applications, including code generation for mathematical theories, general programming tasks, and data retrieval. Code must form a logically coherent, ordered sequence of steps, which is essential for effective execution. Moreover, the executability of each step in code allows the logic to be verified step by step. Exploiting and embedding these code attributes in pre-training improves the chain-of-thought (CoT) performance of LLMs on many traditional natural language downstream tasks, validating their improved complex reasoning skills. At the same time, by implicitly learning the structured format of code, code-trained LLMs perform better on commonsense structured reasoning tasks, such as those involving markup languages, HTML, and chart understanding.
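
As a minimal sketch of why executability matters, consider program-style chain-of-thought: the reasoning is emitted as code, so each intermediate step can be checked by actually running it. The hard-coded `generated_steps` string below is an assumed stand-in for what a code-trained LLM might generate for a simple word problem.

```python
# Minimal sketch (assumption: `generated_steps` stands in for LLM output):
# reasoning emitted as code, so every step is executable and verifiable.
generated_steps = """
apples = 23          # the cafeteria starts with 23 apples
apples -= 20         # 20 apples are used for lunch
apples += 6          # 6 more apples are bought
answer = apples
"""

namespace = {}
exec(generated_steps, namespace)   # running the steps verifies the logic
print(namespace["answer"])         # -> 9
```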
As shown in Figure 4, connecting LLMs with other functional ends (that is, extending LLMs' capabilities through external tools and execution modules) helps LLMs perform tasks more accurately and reliably.


In the second part, as shown in Table 1, the researchers observed a general trend: LLMs establish connections with other functional ends by generating programming language or leveraging predefined functions. This "code-centric paradigm" differs from the rigid approach of strictly hard-coding tool calls into the inference mechanism of LLMs: it allows LLMs to dynamically generate tokens that invoke execution modules, with adjustable parameters.


This paradigm provides LLMs with a simple and clear way to interact with other functional ends, enhancing the flexibility and scalability of their applications. More importantly, it also allows LLMs to interact with numerous functional ends spanning multiple modalities and domains. By expanding the number and variety of functional ends accessible to them, LLMs can handle more complex tasks.
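
The sketch below illustrates this code-centric paradigm under stated assumptions: a toy tool registry and a canned model output. The LLM emits a function call as plain text, which is parsed and dispatched to the matching functional end with adjustable parameters; the tool names and call format are invented for illustration, not taken from the survey.

```python
# Minimal sketch (assumptions: toy tools, canned LLM output, simple call format).
import re

def get_weather(city: str) -> str:
    return f"Sunny in {city}"            # stand-in for a real weather API

def run_sql(query: str) -> str:
    return f"rows for: {query}"          # stand-in for a real database end

TOOLS = {"get_weather": get_weather, "run_sql": run_sql}

llm_output = 'get_weather(city="Paris")'   # tokens dynamically generated by the LLM

match = re.fullmatch(r'(\w+)\((\w+)="([^"]*)"\)', llm_output)
name, arg, value = match.groups()
result = TOOLS[name](**{arg: value})       # dispatch with adjustable parameters
print(result)                              # -> Sunny in Paris
```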

As shown in Figure 5, embedding LLMs into a code execution environment enables automated feedback and autonomous model improvement. LLMs can perform beyond what their training parameters alone allow, partly because they are able to incorporate feedback. However, feedback must be chosen carefully, as noisy prompts may impair the performance of LLMs on downstream tasks. Furthermore, since human labor is expensive, feedback needs to be collected automatically while remaining faithful. In the third part, the researchers found that embedding LLMs into a code execution environment yields feedback that meets all of these criteria.


First of all, since code execution is deterministic, feedback obtained from execution results directly and faithfully reflects the task the LLM performed. Additionally, code interpreters provide LLMs with a way to automatically query internal feedback, eliminating the need for expensive human annotation when using LLMs to debug or optimize faulty code. The code compilation and execution environment also allows LLMs to incorporate diverse and comprehensive forms of external feedback, such as simple binary correct/incorrect judgments, slightly more complex natural-language explanations of execution results, and various ranking methods over feedback values, all of which make performance-improvement methods highly customizable.
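
The loop below sketches this kind of automatic execution feedback under assumptions: `ask_llm` is a hypothetical stand-in for a real model call, and the buggy first attempt is contrived. The interpreter's deterministic traceback is fed back to the model so it can revise the code without human annotation.

```python
# Minimal sketch (assumption: `ask_llm` is a hypothetical model call).
import traceback

def ask_llm(prompt: str) -> str:
    # A real system would query an LLM; here we just return a "repaired" program.
    return "result = sum([1, 2, 3])"

code = "result = sum(1, 2, 3)"             # first attempt: buggy generation
for attempt in range(3):
    env = {}
    try:
        exec(code, env)
        print("success:", env["result"])   # -> success: 6
        break
    except Exception:
        feedback = traceback.format_exc()  # faithful, automatically collected feedback
        code = ask_llm(f"Fix this code:\n{code}\nError:\n{feedback}")
```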

By analyzing the various ways in which integrating code training data enhances the capabilities of LLMs, the researchers further found that the advantages of code-empowered LLMs are particularly evident in intelligent agents, a key area of LLM application development.

Figure 6 shows the standard workflow of an intelligent agent (IA). The researchers observed that the improvements brought by code training in LLMs also affect the concrete steps these LLMs perform when serving as intelligent agents.


These steps include: (1) enhancing the IA's decision-making abilities in environment perception and planning; (2) grounding actions in modular action primitives and organizing memory efficiently to optimize policy execution; and (3) optimizing performance through feedback automatically derived from the code execution environment.
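
A minimal sketch of that perceive-plan-act-improve cycle follows; every function here is a hypothetical placeholder (a real agent would prompt an LLM inside `plan` and draw feedback from a code execution environment), not an interface defined in the survey.

```python
# Minimal sketch (assumptions: all components are toy placeholders).
def perceive(env: dict) -> str:
    return f"state={env['state']}"                  # environment perception

def plan(observation: str, memory: list) -> str:
    return "increment"                              # a real agent would prompt an LLM

ACTIONS = {"increment": lambda env: env.update(state=env["state"] + 1)}

env, memory = {"state": 0}, []
for step in range(3):
    obs = perceive(env)
    action = plan(obs, memory)
    ACTIONS[action](env)                            # modular action primitive
    memory.append((obs, action, env["state"]))      # memory of actions plus feedback
print(env)                                          # -> {'state': 3}
```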

In summary, in this review the researchers analyze and clarify how code gives LLMs their powerful capabilities and how code helps LLMs serve as the decision-making centers of intelligent agents.

Through a comprehensive literature review, the researchers observed that after code training, LLMs improve their programming skills and reasoning capabilities, gain the ability to connect flexibly to multiple functional ends across modalities and domains, and become better at interacting with evaluation modules integrated into the code execution environment to achieve automatic self-improvement.

In addition, the improved capabilities that code training brings to LLMs help them perform as intelligent agents in downstream applications, as reflected in specific steps such as decision-making, execution, and self-improvement. Beyond reviewing previous research, the researchers also identify several challenges in this field as guideposts for potential future directions.

Please refer to the original article for more details!
