


Google unveils DIDACT, its in-house "AI + software engineering" framework: thousands of developers have tried it internally and report higher productivity
No large piece of software is fully conceived from the start. Developers improve it step by step: they edit, run unit tests, fix build errors, address code review comments, and repeat until the code meets the requirements and can be merged into the repository.
The discipline of managing this entire process is called software engineering.
Software engineering is not an isolated process. It is a dialogue among developers, code reviewers, bug reporters, software architects, and a range of development tools (such as compilers, unit tests, linkers, and static analyzers).
Recently, Google announced its DIDACT (Dynamic Integrated Developer ACTivity) framework, which uses AI to enhance software engineering. It takes the intermediate states of software development as training data, helps developers write and modify code, and learns the dynamics of software development as it happens.
DIDACT is a multi-task model trained on development activities including editing, debugging, fixing, and code review.
The researchers built and deployed three DIDACT tools internally, Comment Resolution, Build Repair, and Tip Prediction, each integrated at a different stage of the development workflow.
Software Engineering = Interaction Log
For decades, Google's software engineering toolchain has stored every code-related operation as a log of interactions among tools and developers.
In principle, these records can be used to replay in detail the key changes in the software development process, that is, how Google's code base came to be, including every code edit, compilation, comment, variable rename, and so on.
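The replay idea can be sketched in a few lines of Python. This is only an illustration: the `Edit` record and the `replay` function are hypothetical names, and the record layout is an assumption, since the article does not describe Google's actual log format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """One logged edit (hypothetical format): replace text[start:end] with new_text."""
    start: int
    end: int
    new_text: str


def replay(initial: str, log: list[Edit]) -> list[str]:
    """Apply the logged edits in order and return every intermediate state."""
    states = [initial]
    for edit in log:
        current = states[-1]
        states.append(current[:edit.start] + edit.new_text + current[edit.end:])
    return states


# A tiny made-up log: create a function, then rename it.
log = [
    Edit(0, 0, "def area(w, h):\n    return w * h\n"),
    Edit(4, 8, "rect_area"),
]
states = replay("", log)
print(states[-1])
```

Each entry in `states` is one intermediate snapshot of the file, which is the kind of step-by-step data the article says DIDACT is trained on.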
Google's developers store code in a monorepo, a single repository that contains all tools and systems.
Software developers typically experiment with code changes in a local copy-on-write workspace managed by a system called Clients in the Cloud (CitC).
When a developer is ready to package a set of code changes for a specific purpose (such as fixing a bug), they create a changelist (CL) in Critique, Google's code review system.
As with other common code review systems, developers discuss functionality and style with peer reviewers, and then edit the CL to address the issues raised in review comments.
Eventually, the reviewer declares "LGTM!" and the CL is merged into the code base.
Of course, besides conversing with code reviewers, developers also maintain a great many "dialogues" with other software engineering tools, including compilers, test frameworks, linkers, static analyzers, fuzzing tools, and so on.
An illustration of the complex network of activities involved in software development: the activities of developers, interactions with code reviewers, and invocations of tools such as compilers.
Multi-task models in software engineering
DIDACT uses the interactions between engineers and tools to train machine learning models. These models assist Google developers by suggesting or enhancing the actions they take while carrying out software engineering tasks.
To this end, the researchers defined a number of tasks about individual developer activities: repairing a broken build, predicting a code review comment, addressing a code review comment, renaming a variable, editing a file, and so on.
They then defined a common formalism for each activity: it takes a State (a code file) and an Intent (an annotation specific to the activity, such as a code review comment or a compiler error) and produces an Action (the operation taken to address the task).
An Action is like a mini programming language that can be extended for newly added activities. It covers editing, adding comments, renaming variables, marking up code with errors, and so on. The researchers call this language DevScript.
The input prompt of the DIDACT model consists of a task, code snippets, and annotations related to that task; the output is a development action, such as an edit or a comment.
The State-Intent-Action formalism captures many different tasks in a common way. More importantly, DevScript can express complex actions concisely, without having to output the entire state (the original code) as it would be after the action takes place. This makes the model more efficient and more interpretable.
For example, renaming may modify multiple places in the code file, but the model only needs to predict one renaming operation.
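A minimal Python sketch of the State-Intent-Action idea follows. All class and function names here are hypothetical, and the article does not show DevScript's real syntax, so `Rename` is only a stand-in for one kind of action. The rename is a naive whole-word text substitution, not a scope-aware refactoring.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """The code the developer is working on (hypothetical structure)."""
    path: str
    code: str


@dataclass(frozen=True)
class Intent:
    """An activity-specific annotation, e.g. a review comment or compiler error."""
    kind: str
    text: str


@dataclass(frozen=True)
class Rename:
    """One compact action: rename every whole-word occurrence of `old` to `new`."""
    old: str
    new: str


def apply_rename(state: State, action: Rename) -> State:
    """Expand the single rename action into all the text changes it implies."""
    pattern = r"\b" + re.escape(action.old) + r"\b"
    new_code = re.sub(pattern, lambda _match: action.new, state.code)
    return State(state.path, new_code)


state = State("geometry.py", "def area(w, h):\n    tmp = w * h\n    return tmp\n")
intent = Intent("code_review_comment", "Please give `tmp` a descriptive name.")
action = Rename("tmp", "product")  # the kind of output a model would predict
new_state = apply_rename(state, action)
print(new_state.code)
```

The model's output here is the short `Rename("tmp", "product")` action rather than the whole rewritten file; the two changed lines follow from applying it.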
An AI peer programmer
DIDACT performs well on individual assistive tasks. The example below shows DIDACT doing code cleanup after functionality is mostly complete: it takes the code reviewer's final comments as input (marked "human" in the figure) and predicts the edits needed to address them (shown as a diff).
Given an initial snippet of code and the comments the code reviewer attached to it, DIDACT's Pre-Submit Cleanup task produces the edits (insertions and deletions of text) that address those comments.
The multimodal nature of DIDACT also gives rise to some new capabilities that emerge with scale. One of them is history augmentation, which can be enabled via prompting. Knowing what the developer did recently lets the model make a better guess about what the developer should do next.
Demonstration of history-augmented code completion
The history-augmented code completion task demonstrates this capability. In the example above, the developer adds a new function parameter (1) and moves the cursor into the documentation (2). Conditioned on the developer's edit history and the cursor position, the model completes the line (3) by correctly predicting the docstring entry for the new parameter.
In the more difficult task of history-augmented edit prediction, the model is able to select the location of the next edit in a historically consistent manner.
Demonstration of edit prediction over multiple chained iterations
If a developer removes a function parameter (1), the model can use the history to correctly predict an update to the docstring (2) that removes the parameter, without requiring the developer to place the cursor there manually. It can then update a statement in the function (3) in a syntactically (and arguably semantically) correct way.
With the history, the model can decide unambiguously how to continue the "editing process" correctly. Without the history, the model has no way of knowing whether the missing function parameter is intentional (because the developer is in the middle of a longer edit that removes it) or accidental (in which case the model should re-add the parameter to fix the problem).
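One way to picture how history could be supplied to a model is to render recent edits as diffs and place them in the prompt ahead of the current state. The sketch below does this with Python's standard `difflib`. The `[TASK]`, `[HISTORY]`, and `[STATE]` markers and both function names are hypothetical; the article does not describe DIDACT's actual prompt format.

```python
import difflib


def history_diffs(snapshots: list[str]) -> list[str]:
    """Render each consecutive pair of file snapshots as a compact unified diff."""
    diffs = []
    for before, after in zip(snapshots, snapshots[1:]):
        lines = difflib.unified_diff(
            before.splitlines(), after.splitlines(),
            fromfile="before", tofile="after", lineterm="", n=0,
        )
        diffs.append("\n".join(lines))
    return diffs


def build_prompt(snapshots: list[str], task: str) -> str:
    """Assemble a prompt: task name, recent edit history, then the current state."""
    parts = [f"[TASK] {task}"]
    for index, diff in enumerate(history_diffs(snapshots), start=1):
        parts.append(f"[HISTORY {index}]\n{diff}")
    parts.append(f"[STATE]\n{snapshots[-1]}")
    return "\n".join(parts)


BEFORE = (
    "def scale(x, factor):\n"
    '    """Scale x.\n'
    "\n"
    "    Args:\n"
    "      x: value to scale.\n"
    "      factor: multiplier.\n"
    '    """\n'
    "    return x * factor\n"
)
# The developer has just removed the `factor` parameter from the signature.
AFTER = BEFORE.replace("def scale(x, factor):", "def scale(x):")

prompt = build_prompt([BEFORE, AFTER], task="predict_next_edit")
print(prompt)
```

With the history section present, the prompt records that `factor` was deliberately removed from the signature, which is the signal the article says lets the model go on to update the docstring and the function body instead of re-adding the parameter.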
The model can also handle broader tasks, such as starting from a blank file and repeatedly predicting the next edit until a complete code file has been written.
Most importantly, the model assists in writing code in a step-by-step manner that is natural to developers:
It starts by creating a complete working skeleton with imports, flags, and a basic main function, and then gradually adds new functionality, such as reading from a file, writing results, and filtering out certain lines based on a user-supplied regular expression.
Conclusion
DIDACT turns Google's software development process into training demonstrations for machine learning developer assistants, and uses those demonstrations to train models that construct code step by step while interacting with tools and code reviewers.
The DIDACT approach complements the great strides made by large language models from Google and elsewhere, with the aim of reducing toil, increasing productivity, and improving the quality of software engineers' work.