No money, let 'Love Saint' ChatGPT teach you how to chase Musk!

Apr 08, 2023 pm 02:51 PM
OpenAI recently released ChatGPT, arguably the most powerful chatbot to date. Based on GPT-3, it quickly went viral online.

To say it knows everything would be an exaggeration, but it can talk with you about almost any topic. Whether or not the answers are accurate, it at least sounds the part.

Interestingly, although co-founder Musk resigned from the board of directors back in 2018, his attention to OpenAI has not waned at all, and he remains one of its funders.

So, what does ChatGPT think of this "funder father"?

ChatGPT teaches you how to chase Musk

Hmm... a perfectly safe answer. Nothing to fault, good job.

Let’s go one step further and ask ChatGPT to teach us: how can we "chase" Musk?

It seems ChatGPT doesn’t quite grasp the romantic sense of the Chinese word for “chase”, so we need to be more direct.

This time ChatGPT gets it, and its answer is very "sane": it suggests that you not try.

Yeah, a very fair opinion.

Let’s put it another way: how about simply becoming his husband?

Wow, ChatGPT actually understands this!

Forget it. Let’s give up and talk about something else.

Recently, Musk moved beds and washing machines into the Twitter headquarters, vowing to make Twitter employees treat the company as their home.

Why not play to ChatGPT’s strengths and have it sing the praises of the 996 work schedule?

Well written, but please don’t write any more...

How about having ChatGPT write a poem that belittles itself?

"They don't tremble when they speak, and they don't need to think deeply..." The editor admits that this is indeed a poem!

OpenAI: Seven years, do you know how I spent these seven years?

After this wave of popularity, ChatGPT has rekindled people’s confidence in, and expectations for, the development of AI. Whether they have renewed faith in AGI or believe that AI will replace humans in more fields, people have found fresh hope in ChatGPT.

What journey did OpenAI, the creator of ChatGPT, take from GPT-1 to GPT-3? From its co-founding by Musk in 2015 to the arrival of ChatGPT at the end of 2022, how did OpenAI spend these seven years?

A recent retrospective article in Business Insider briefly reviews the "seven years" of OpenAI.

In 2015, Musk co-founded OpenAI with Sam Altman, the former president of the famous incubator Y Combinator.

In 2015, Musk, Altman and other prominent Silicon Valley figures, including Peter Thiel and LinkedIn co-founder Reid Hoffman, pledged $1 billion to the project.

According to a statement on the OpenAI website on December 11, 2015, the group aims to create a non-profit organization focused on developing artificial intelligence "in a manner most likely to benefit humanity as a whole."

At that time, Musk said that artificial intelligence was the "biggest existential threat" to mankind.

At the time, Musk was not the only one to warn of the potential dangers of artificial intelligence.

In 2014, the famous physicist Stephen Hawking also warned that artificial intelligence may end humanity.

The statement announcing the establishment of OpenAI reads: "It's hard to fathom how much human-level AI could benefit society, and it's equally hard to imagine how much it could damage society if built or used incorrectly."

Over the next year, OpenAI released two products.

In 2016, OpenAI launched Gym, a platform that allows researchers to develop and compare reinforcement learning AI systems. These systems teach artificial intelligence to make decisions with the best cumulative returns.

Later that year, OpenAI released Universe, a toolkit for training intelligent agents across websites and gaming platforms.

In 2018, Musk resigned from the OpenAI board of directors, three years after co-founding the company.

In a 2018 blog post, OpenAI said Musk resigned from the board to "eliminate a potential future conflict" as Tesla became more focused on artificial intelligence.

For years, Musk has been pitching Tesla investors on plans to develop autonomous electric vehicles.

However, Musk later said that he quit because he "did not agree with some of the things the OpenAI team wanted to do" at the time.

In 2019, Musk said on Twitter that Tesla was also competing with OpenAI for some of the same employees, adding that he had not been involved in the company’s business for more than a year.

He said: "It seems that it is best to part ways on mutually satisfactory terms."

Musk has continuously raised objections to some of OpenAI's practices in recent years.

In 2020, Musk said on Twitter that when it comes to safety, his confidence in OpenAI executives is "not high".

Responding to an MIT Technology Review investigation of OpenAI, Musk said that OpenAI should be more open. The report found a "culture of secrecy" within OpenAI, at odds with the open and transparent strategy the organization professes.

Recently, Musk said that he had paused OpenAI’s access to Twitter’s database, which OpenAI had been using to train its software.

Musk said he needed to understand more about OpenAI’s governance structure and future revenue plans, noting that OpenAI was founded as open source and non-profit, and that neither is still true.

In 2019, OpenAI built an artificial intelligence tool that could generate fake news reports.

At first, OpenAI said the bot was so good at writing fake news that it decided not to publish it. But later that year, the company released a version of the tool called GPT-2.

In 2020, OpenAI released another language model, GPT-3. The year before, in 2019, it had given up its status as a purely “non-profit organization”.

The company announced in a blog post that OpenAI had become a "capped-profit" company.

OpenAI stated: "We want to increase our ability to raise capital while still serving our mission, and no pre-existing legal structure we know of strikes the right balance. Our solution is to create OpenAI LP as a hybrid of a for-profit and nonprofit, which we are calling a 'capped-profit' company."

Under the new structure, investors in OpenAI can earn up to 100 times their original investment, with any returns above that cap going to the non-profit.

In 2019, OpenAI announced a partnership with Microsoft, under which Microsoft invested US$1 billion in the company. OpenAI said it would license its technology exclusively to Microsoft.

Microsoft stated that the commercial and creative potential unlocked by the GPT-3 model is profound, and that many of the possible new capabilities and applications are beyond what we have yet imagined.

For example, in areas such as writing and composition, describing and summarizing large blocks of long-form data (including code), and converting natural language into another language, GPT-3 can directly aid human creativity and ingenuity. In the future, the only limits may lie in our own ideas and plans.

This partnership allows Microsoft to compete with Google's equally popular AI company DeepMind.

Last year, OpenAI released an artificial intelligence painting generation tool: Dall-E.

Dall-E is an artificial intelligence system that can create realistic images from a text description, and can even reach a considerable artistic level. In November, OpenAI released an updated version of the program, Dall-E 2.

While OpenAI’s chatbot has “taken off” over the past week, an updated version of the software may not be released until next year at the earliest.

ChatGPT, released as a demonstration model on November 30, can be regarded as OpenAI’s “GPT-3.5”. The company plans to release a full version of GPT-4 next.

At the same time, Musk is still commenting:

He was replying to Sam Altman’s tweet about ChatGPT, saying that we are not far from dangerously strong AI.

Revealing the hero behind the explosion of ChatGPT: RLHF

ChatGPT’s popularity is inseparable from the hero behind it: RLHF.

To train the ChatGPT model, OpenAI researchers used the same method as for InstructGPT: reinforcement learning from human feedback (RLHF).

ChatGPT explains in Chinese what RLHF is

Why turn to reinforcement learning from human feedback? That starts with some background on how language models are trained.

For the past few years, language models have been generating text from human-written prompts.

However, what makes a text “good”? That is difficult to define, because the criteria are subjective and highly context-dependent.

In many applications, we need models to write creative stories, pieces of informational text, or snippets of executable code.

Capturing these properties in a hand-written loss function is very tricky, and most language models are still trained with a simple next-token prediction loss (such as cross-entropy).
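As a rough illustration of what a next-token prediction loss measures, the following minimal sketch computes the average cross-entropy over a short sequence in plain Python. The function names and the toy vocabularies are assumptions made for illustration, not taken from any particular model's training code.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_cross_entropy(logits_per_step, target_ids):
    """Average negative log-probability assigned to each observed next token.

    logits_per_step: one list of vocabulary scores per position (hypothetical model output)
    target_ids: the index of the token that actually came next at each position
    """
    losses = []
    for logits, target in zip(logits_per_step, target_ids):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)

# A model that is unsure between four tokens pays log(4) per token.
uniform_loss = next_token_cross_entropy([[0.0, 0.0, 0.0, 0.0]], [2])
```

The loss only rewards putting probability on the token that was observed next. It carries no notion of whether the resulting text is creative, informative, or executable, which is the gap described above.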

To make up for the shortcomings of the loss itself, some people have defined metrics that better capture human preferences, such as BLEU or ROUGE.

But even these simply compare the generated text to a reference text, and therefore have significant limitations.
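To make that limitation concrete, here is a deliberately simplified n-gram overlap score in the spirit of BLEU. It is not the real BLEU or ROUGE formula (there is no brevity penalty and no combination of several n-gram orders), and the function names are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-token windows of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Counts are clipped, so repeating a matching word does not inflate the score.
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total

reference = "the cat sat"
exact_copy = ngram_precision("the cat sat", reference)
paraphrase = ngram_precision("a feline rested", reference)
```

A paraphrase that a human might judge perfectly acceptable shares no words with the reference and so scores zero, while a verbatim copy scores one. Word overlap is a weak proxy for human preference.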

In that case, wouldn’t it be great to use human feedback on generated text as the loss for optimizing the model?

Thus the idea of reinforcement learning from human feedback (RLHF) was born: we can use reinforcement learning to directly optimize language models with human feedback.

ChatGPT explains in English what RLHF is

Indeed, RLHF makes it possible to begin aligning a model trained on a general text corpus with complex human values.

In the wildly popular ChatGPT, we can see the great success of RLHF.

The training process of RLHF can be broken down into three core steps:

  • Pre-training language model (LM),
  • Collecting data and training reward model,
  • Fine-tuning LM through reinforcement learning.

Pre-training language model

In the first step, RLHF starts from a language model that has already been pre-trained with a classic pre-training objective.

For example, OpenAI used a smaller version of GPT-3 for InstructGPT, its first popular RLHF model.

This initial model can also be fine-tuned on additional text or conditions, but this is not required.

Generally speaking, there is no clear answer as to which model is the best starting point for RLHF.

Next, with a language model in hand, we need to generate data to train the reward model. This is how human preferences are integrated into the system.

Reward model training

Generating a reward model (RM, also known as a preference model) calibrated to human preferences is where the relatively new research in RLHF begins.

Our basic goal is to obtain a model or system that takes a sequence of text and returns a scalar reward that numerically represents human preference.

This system can be an end-to-end LM, or a modular system that outputs a reward (e.g., a model ranks the outputs and the rankings are converted into rewards). That the output is a scalar reward is crucial for existing RL algorithms to be integrated seamlessly later in the RLHF process.

The LM used for reward modeling can be another fine-tuned LM or an LM trained from scratch on the preference data.

The RM’s training set of prompt-generation pairs is built by sampling a set of prompts from a predefined data set and passing them through the initial language model to generate new text.

The LM-generated texts are then ranked by human annotators. One might think humans could instead score each piece of text directly to train the reward model, but this is difficult in practice: because people have different values, such scores are uncalibrated and noisy.

There are many ways to rank text. One successful approach is to have users compare texts generated by two language models from the same prompt. Whatever the method, the rankings are normalized into a scalar reward signal used for training.
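One way such pairwise comparisons can be turned into a training signal is a Bradley-Terry style loss, which pushes the reward of the human-preferred text above that of the rejected one. The article does not say which formulation is used, so treat the sketch below as an assumption; the function name is hypothetical and the rewards are plain floats standing in for reward-model outputs.

```python
import math

def pairwise_preference_loss(reward_chosen, reward_rejected):
    """-log(sigmoid(reward_chosen - reward_rejected)), computed stably.

    The loss is small when the preferred text already gets the higher reward
    and large when the ordering is wrong.
    """
    diff = reward_chosen - reward_rejected
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    # Same quantity rewritten to avoid overflow when diff is very negative.
    return -diff + math.log1p(math.exp(diff))

# The annotator preferred text A over text B for the same prompt.
loss_when_rm_agrees = pairwise_preference_loss(2.0, 0.0)
loss_when_rm_disagrees = pairwise_preference_loss(0.0, 2.0)
```

Because only the difference between the two rewards matters, annotators never have to agree on an absolute scale, which sidesteps the calibration problem of direct scoring.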

Interestingly, the successful RLHF systems to date have used reward language models of varying sizes relative to the text-generation model. Intuitively, a preference model needs roughly the same capacity to understand the text it is given as a model needs to generate that text.

At this point, the RLHF system has an initial language model that can generate text and a preference model that takes any text and assigns it a score reflecting how humans perceive it. Next, reinforcement learning (RL) is used to optimize the original language model against the reward model.

Using reinforcement learning to fine-tune

This fine-tuning task can be formulated as an RL problem.

First, the policy is a language model that takes a prompt and returns a sequence of text (or just a probability distribution over text).

The action space of this policy is all the tokens in the language model’s vocabulary (usually on the order of 50k tokens). The observation space is the set of possible input token sequences, so it is very large (on the order of vocabulary size ^ number of input tokens).
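To get a feel for these sizes, the short calculation below counts the distinct token sequences of a given length, on the reasoning that each position can hold any token in the vocabulary. The 50,000-token vocabulary comes from the text; the sequence length of 10 and the function name are assumptions chosen only to keep the number printable.

```python
def count_token_sequences(vocab_size, sequence_length):
    """Number of distinct token sequences: one independent choice per position."""
    return vocab_size ** sequence_length

vocab_size = 50_000       # order of magnitude quoted in the text
actions_per_step = vocab_size  # the policy picks one token per step

# Even a 10-token prompt already has an astronomically large number of variants.
possible_prompts = count_token_sequences(vocab_size, 10)
digits = len(str(possible_prompts))
```

The action space stays at 50,000 choices per step, but the number of possible inputs grows exponentially with prompt length, which is why this setting is large compared with classic RL benchmarks.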

The reward function combines the preference model with a constraint on how far the policy may shift.

In the reward function, the system combines all the models we have discussed into the RLHF process.

Given a prompt x from the data set, two texts y1 and y2 are generated: one from the initial language model and one from the current iteration of the fine-tuned policy.

The text from the current policy is passed to the preference model, which returns a scalar notion of "preference", rθ.

This text is then compared with the text from the initial model to compute a penalty on the difference between them.
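The text does not say how this penalty is computed. A common choice, assumed here, is a scaled estimate of the KL divergence between the current policy and the initial model, subtracted from the preference score rθ. The function name, the `kl_coef` weight, and the log-probability values below are all hypothetical.

```python
def kl_penalized_reward(preference_score, policy_logprobs, initial_logprobs, kl_coef=0.1):
    """Combine the preference score with a penalty for drifting from the initial model.

    policy_logprobs / initial_logprobs: log-probability each model assigns to every
    token of the text sampled from the current policy.
    kl_coef: hypothetical weight controlling how strongly drift is punished.
    """
    # Sample-based KL estimate: sum of log-probability differences over the tokens.
    kl_estimate = sum(p - q for p, q in zip(policy_logprobs, initial_logprobs))
    return preference_score - kl_coef * kl_estimate

# The policy has not moved: both models assign the same log-probabilities.
unchanged = kl_penalized_reward(1.0, [-1.0, -1.0], [-1.0, -1.0])

# The policy now favors its own text far more than the initial model did.
drifted = kl_penalized_reward(1.0, [-1.0, -1.0], [-2.0, -2.0], kl_coef=0.5)
```

With no drift the reward is just the preference score. As the policy moves away from the initial model the penalty grows, which discourages it from producing degenerate text that merely exploits the reward model.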

RLHF can continue from this point by iteratively updating the reward model and the policy.

As the RL policy is updated, users can continue to rank its outputs against those of earlier versions of the model.

This introduces complex dynamics as the policy and reward model evolve together. Research here is complex and still very much open.

Reference:

https://www.4gamers.com.tw/news/detail/56185/chatgpt-can-have-a-good-conversation-with-you-among-acg-and-trpg-mostly

https://www.businessinsider.com/history-of-openai-company-chatgpt-elon-musk-founded-2022-12#musk-has-continued-to-take-issue-with-openai-in-recent-years-7
