


Intelligent data annotation solution: a crowdsourcing platform that welcomes the era of large models
On May 26, NetEase's Fuxi Youling crowdsourcing platform made its debut at the China International Big Data Industry Expo. The platform is an online task platform for human-computer collaboration, developed in-house by NetEase Fuxi. It is currently the only crowdsourcing platform on the market that supports real-time human-computer interactive annotation. Its goal is to ease labor shortages across industries and to offer the public more convenient and engaging online employment opportunities. Enterprise customers can quickly model and publish tasks through the platform, while gig workers can pick up tasks freely, without restrictions on time or location. In this way, the Fuxi Youling crowdsourcing platform gives enterprises and individuals a more efficient and flexible way of working.

Artificial intelligence is rapidly changing the way people live and work. With the rapid development of large language models and multi-modal large models, the field of data annotation has entered a period of vigorous growth, and large volumes of data keep emerging in every field. Yet both the demand side and the supply side face huge challenges: they need an efficient way to provide high-quality, low-cost data support. This affects not only the accuracy and practicality of artificial intelligence technology but also the development prospects of the entire industry. The data annotation industry therefore needs continuous innovation and improvement to meet the needs of artificial intelligence technology and to develop sustainably.
To adapt to the big data era, many artificial intelligence companies have begun to establish training and management systems for data trainers, and continue to innovate technically and improve data quality. However, as labor costs rise, more and more organizations are looking for more efficient and economical ways to annotate data. The NetEase Fuxi Youling crowdsourcing platform emerged in response, built on the idea of HITL (Human-in-the-Loop).
The idea of human-machine collaboration injects new vitality into the data annotation industry
At this Data Expo, the Fuxi Youling crowdsourcing platform demonstrated its distinctive capabilities: combining human intelligence and decision-making with the computing power of machine learning to achieve high-quality data annotation. Through a detailed and rigorous annotation process and a scientific scoring system, the platform maintains the accuracy and reliability of the data. Fuxi Youling has also adopted a series of technical measures aimed at reducing costs, shortening the annotation cycle, and ensuring data quality.

Data closed loop
After annotators complete their data annotation, the platform supports feeding the results back into model training in real time. The task issuer can compare the model's performance before and after training to see how the annotation results improve the model, and the model is updated automatically. The updated model can then assist subsequent annotation tasks, further improving the quality and efficiency of data annotation.
Full data inspection
The platform supports automatic quality inspection of all task data, and the task issuer can flexibly configure the inspection process. The platform conducts quality inspection by combining users' historical task levels with their user portraits. Models are also brought into the inspection, so that AI and people take part in quality control together, ultimately achieving high-accuracy task delivery.
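As a rough illustration of how a model and an annotator's historical level might jointly drive quality inspection, here is a minimal sketch. The article does not describe the platform's actual rules; the function name, the numeric annotator levels, and both thresholds are assumptions made purely for illustration.

```python
def inspect_item(human_label, model_label, model_confidence, annotator_level,
                 trusted_level=3, confidence_threshold=0.9):
    """Decide whether a submitted label is accepted or sent to manual review.

    All parameters and thresholds are hypothetical; they are not taken
    from the Fuxi Youling platform.
    """
    # The model and the annotator agree: accept without further checks.
    if human_label == model_label:
        return "accept"
    # They disagree. Escalate when the model is confident, or when the
    # annotator's historical level is below the trusted level.
    if model_confidence >= confidence_threshold or annotator_level < trusted_level:
        return "review"
    # The model is unsure and the annotator has a strong track record:
    # keep the human label.
    return "accept"
```

In this sketch every item is checked, but only disagreements that look risky are escalated, which is one way to keep full inspection affordable.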
User Portraits
The platform has a complete user portrait and task matching mechanism. Based on each user's past task performance, combined with the user's personal label data, it matches users to the diverse needs of different task types and assigns each task to the people best suited to it, so as to meet the quality, efficiency, and cost requirements of data annotation tasks.
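The matching idea can be sketched as a simple scoring function over annotator profiles. This is only an illustrative assumption, not the platform's mechanism: the `AnnotatorProfile` fields, the 0.7/0.3 weighting, and the default accuracy for unseen task types are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotatorProfile:
    """Hypothetical user portrait: personal tags plus accuracy per task type."""
    annotator_id: str
    tags: set = field(default_factory=set)
    accuracy_by_type: dict = field(default_factory=dict)


def match_score(profile, task_type, required_tags, default_accuracy=0.5):
    """Blend past accuracy on this task type with tag overlap (weights assumed)."""
    accuracy = profile.accuracy_by_type.get(task_type, default_accuracy)
    if required_tags:
        overlap = len(profile.tags & required_tags) / len(required_tags)
    else:
        overlap = 1.0
    return 0.7 * accuracy + 0.3 * overlap


def pick_annotators(profiles, task_type, required_tags, k=3):
    """Return the ids of the k best-matching annotators, best first."""
    ranked = sorted(
        profiles,
        key=lambda p: match_score(p, task_type, required_tags),
        reverse=True,
    )
    return [p.annotator_id for p in ranked[:k]]


profiles = [
    AnnotatorProfile("A", {"mandarin"}, {"speech": 0.9}),
    AnnotatorProfile("B", set(), {"speech": 0.95}),
    AnnotatorProfile("C", {"mandarin"}, {"text": 0.99}),
]
best = pick_annotators(profiles, "speech", {"mandarin"}, k=2)
```

Here annotator A ranks first because it combines strong speech accuracy with the required tag, while B's slightly higher accuracy does not make up for the missing tag.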
Swarm Intelligence
The platform selects a diverse set of annotators based on user portraits, introduces redundant annotation, and uses algorithmic methods such as interval estimation and truth inference so that the annotators jointly take part in labeling decisions. The final labeling results are derived from their combined input, ensuring objectivity and accuracy.
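The article names interval estimation and truth inference without giving details, so the following is only one plausible sketch. It assumes that each annotator has a history of checked items: the lower bound of a Wilson score interval gives a conservative accuracy estimate, which then weights that annotator's vote by its log-odds. The function names and the clamping range are assumptions, and the log-odds weight is a simplification borrowed from the two-class case.

```python
import math
from collections import defaultdict


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    if total == 0:
        return (0.0, 1.0)
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return (max(0.0, center - half), min(1.0, center + half))


def vote_weight(correct, total):
    """Log-odds weight from the conservative (lower-bound) accuracy estimate.

    Annotators whose lower bound is at or below 0.5 get zero weight.
    """
    lower, _ = wilson_interval(correct, total)
    reliability = min(max(lower, 0.5), 0.99)
    return math.log(reliability / (1 - reliability))


def infer_label(votes, history):
    """votes: {annotator: label}; history: {annotator: (correct, total)}."""
    scores = defaultdict(float)
    for annotator, label in votes.items():
        correct, total = history.get(annotator, (0, 0))
        scores[label] += vote_weight(correct, total)
    # Ties resolve to the label that was seen first.
    return max(scores, key=scores.get)


votes = {"a": "cat", "b": "dog", "c": "dog"}
history = {"a": (95, 100), "b": (6, 10), "c": (6, 10)}
result = infer_label(votes, history)
```

In this example the single well-established annotator outweighs two annotators with short, mediocre histories, so `result` is "cat" even though "dog" has more raw votes. With equally reliable annotators the same function reduces to a plain majority vote.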

According to the person in charge of the platform, the platform currently focuses on cognitive work, driven by the demand from AIGC and other artificial intelligence technologies for collecting and labeling multi-modal data such as text, images, and speech. With the widespread adoption of communication technologies such as 5G, the platform will take on more decision-making tasks, such as remote control, in the future. Based on digital twin technology, offline work will be digitized and moved online, allowing users to complete tasks, and enjoy their work, in a gamified digital twin environment.
The NetEase Fuxi Youling platform combines AI technology with manual annotation to ensure the quality and accuracy of data annotation and to improve annotation efficiency. It not only provides reliable and efficient data services for enterprises, but also contributes to the vigorous development of AI technology.
The Youling crowdsourcing platform helps AI technology flourish
During the exhibition, Dr. Wu Runze of NetEase Fuxi Lab also gave a talk on the theme "Application Practice of NetEase Fuxi Data Crowdsourcing Empowering Large Models".

Dr. Wu said that NetEase Fuxi has been deeply involved in large model technology since 2019, taking text pre-training and multi-modal pre-training as the main entry points and relying on the data crowdsourcing platform to provide a closed loop of high-quality data feedback. It has overcome key technical challenges such as unified representation construction, distributed object storage, and large-scale vector engines. The work was selected as a "Pioneer Project" of Zhejiang Province and received official funding, and it has successfully incubated two major game-vertical products: the Danqingyue art platform and intelligent game NPCs.
Currently, the Fuxi Youling crowdsourcing platform has been applied in multiple products and scenarios within NetEase Group. In the open world of the "Nishuihan" mobile game, smart NPCs with delicate emotions, sensitive reactions, realistic movements, and rich expressions are deeply loved by players. These smart NPCs require massive amounts of high-quality human feedback data to support them.
NetEase Fuxi Youling Crowdsourcing provides a range of data services for the game's intelligent NPC model, including voice collection, text annotation, emotion judgment, and image annotation, and ultimately supports the creation of intelligent game NPCs that are multi-dimensional in text, voice, facial expressions, and more. This reflects the deep integration NetEase has built up across game engines and AI to solve the closed-loop problem of large-scale computing power, data, and pre-trained models.
At present, the NetEase Fuxi Youling crowdsourcing platform has processed hundreds of millions of data items. While ensuring the performance of game AI, it can collect feedback from game players more efficiently and further improve AI performance, so that the technology can be applied in more diverse scenarios. Based on the concepts of openness, cooperation, and win-win, NetEase Fuxi will invite partners from upstream and downstream of the industry chain to jointly create a new era of AI digitalization.