Today, we’ll be talking about the road to building the powerful ChatGPT.
We’ll be starting from the very beginning and going through all the GPT models, including GPT, GPT-2, GPT-3, InstructGPT, and ChatGPT.
We’ll also talk about ChatGPT’s successor, GPT-4, which is coming soon.
Let’s dive into the genesis of this story.
GPT (Generative Pre-trained Transformer)
OpenAI researchers released GPT, or Generative Pre-trained Transformer, in 2018. It was superior to other existing language models at the time for problems like reading comprehension, common sense, and reasoning.
The model could understand sentences much better and reason through different ideas.
For example, the AI was able to understand that when you misplace your phone, the most likely outcome is that you will go searching for it.
GPT has 117 million parameters. Parameters are the values a language model learns during training in order to capture the various components of language, including the ways in which words relate to one another. The more parameters a model has, the more it can learn.
But this can be a double-edged sword in AI, as I’ll explain in a moment.
GPT-2 (Generative Pre-trained Transformer 2)
Only 8 months later, OpenAI released a larger version of GPT: GPT-2, with 1.5 billion parameters. It was more than 10 times bigger and was trained on more than 10 times the data, a 10X jump in just a few months.
It could generate more natural-looking text. This is when people began to realize the true power of the GPT series.
Without any special training, GPT-2 could adapt to the style and content of whatever text it was given. In fact, OpenAI even referred to this as chameleon-like behavior.
The AI community wanted to get their hands on the model, but OpenAI considered it too powerful to release in full at the time. Instead, it decided to release a much smaller and less powerful version of the model first.
This was part of their release strategy, which corresponded to their charter. The OpenAI charter outlines the company’s principles for ensuring AI is aligned with human objectives.
OpenAI gradually released the model in order to monitor how people used it. They were mostly concerned with malicious uses such as impersonation and the spread of fake news.
Around this time, the company began to restructure as a for-profit entity, restricting full access to its most important model.
GPT-3 (Generative Pre-trained Transformer 3)
In June 2020, OpenAI announced GPT-3, the most anticipated language model of that year. It was bigger, smarter, and more interactive than they had promised.
GPT-3 has a total of 175 billion parameters. In comparison, GPT had just 117 million parameters, whereas GPT-2 had 1.5 billion.
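To put those jumps in perspective, here is a minimal Python sketch that works out how much each model grew over its predecessor. It takes the counts as 117 million for GPT, 1.5 billion for GPT-2, and 175 billion for GPT-3; the names `PARAMS` and `growth_factors` are made up for this illustration.

```python
# Approximate parameter counts for each model in the series.
PARAMS = {
    "GPT": 117_000_000,
    "GPT-2": 1_500_000_000,
    "GPT-3": 175_000_000_000,
}


def growth_factors(params):
    """Return (previous, next, ratio) for each consecutive pair of models."""
    names = list(params)
    return [(a, b, params[b] / params[a]) for a, b in zip(names, names[1:])]


for prev, nxt, ratio in growth_factors(PARAMS):
    print(f"{nxt} is about {ratio:.0f}x the size of {prev}")
```

With these counts, GPT-2 comes out at roughly 13 times the size of GPT, and GPT-3 at roughly 117 times the size of GPT-2.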
GPT-3 does well on many NLP tasks, such as translation, question-answering, and cloze tasks. It also does well on a number of tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a new word in a sentence, or doing 3-digit arithmetic.
The statistics of multiple datasets used to train the model are as follows:
- GPT-3 is trained with a total of 499B tokens, or 700GB
- Common Crawl is weighted 60% and contains diverse data from web crawling over the years
- WebText2 accounts for 22% and includes text from outbound Reddit links
- Books1 and Books2, with a combined share of 16%, contain internet-based books corpora
- Wikipedia is weighted 3% and includes data from Wikipedia pages in English
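One way to read those percentages is as the share of training examples drawn from each source. The sketch below is only an illustration under that assumption, not OpenAI's actual data pipeline. The helper names `normalized` and `sample_sources` are hypothetical. Note that the listed weights add up to 101 because the published figures are rounded, so the code normalizes them first.

```python
import random

# Mixture weights in percent, copied from the list above.
# They sum to 101 because the published figures are rounded.
WEIGHTS = {
    "Common Crawl": 60,
    "WebText2": 22,
    "Books1 + Books2": 16,
    "Wikipedia": 3,
}


def normalized(weights):
    """Rescale the weights so they sum to exactly 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def sample_sources(weights, k, seed=0):
    """Pick the source of each of k training examples according to the weights."""
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=k)


probs = normalized(WEIGHTS)
batch = sample_sources(WEIGHTS, k=10_000)
for name in WEIGHTS:
    share = batch.count(name) / len(batch)
    print(f"{name:16s} target {probs[name]:.3f}  sampled {share:.3f}")
```

The sampled shares should land close to the targets, which shows how a small source like Wikipedia can still be seen regularly even though Common Crawl dominates the mix.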
But as I mentioned earlier, the more parameters a model has, the more it can learn, and this can be a double-edged sword in AI. The reason is that having too many can have a negative impact on the model. You only need the right amount to avoid going overboard.
Because OpenAI was concerned about the unauthorized use of GPT-3, it kept access private for a time. It eventually released the model via an API that you could interact with.
However, the company did not make the source code available to the public. The source code explains how a program was written and the reasoning behind its design.
You can only interact with GPT-3 by sending text to the API, so you won’t see how it works internally.
Around that time, OpenAI signed an exclusive agreement with Microsoft, granting the tech giant complete access to GPT-3.
InstructGPT (Instructional Generative Pre-trained Transformer)
On January 27, 2022, OpenAI published a blog post on its latest improvement to the GPT series, called InstructGPT.
GPT-3 could generate text that was nearly indistinguishable from human writing, but there was one problem. It couldn’t effectively follow instructions, which is a key function of a chatbot.
When you tell GPT-3 to explain something to you, for example, it will return well-formed sentences but not exactly what you want.
InstructGPT improved on this. This was a critical update. The GPT series was now useful and practical in a wide range of applications. InstructGPT was also more truthful and less toxic in general. OpenAI accomplished this by incorporating human feedback into the AI model training process.
As a result, the model understood what humans expected when they typed text. OpenAI progressed from trying to generate sensible text in early GPT models to excelling at it and shifting its focus to making it more useful to people.
ChatGPT (Chat Generative Pre-trained Transformer)
On November 30, 2022, OpenAI shocked the world once more with its latest model, ChatGPT, which most of you probably know by now.
It’s an AI model that can write blog posts and film scripts and provide YouTube video suggestions. It can code, write game stories, and come up with interesting interior design ideas. This is just the beginning of something much larger. It has been all the rage in recent weeks.
ChatGPT is similar to the previous InstructGPT model, but with a slight difference.
It was specifically trained to learn how human dialogue works.
It works in a conversational way, allowing the model to respond to follow-up questions, admit mistakes, challenge incorrect premises, and even reject inappropriate requests.
An example of a ChatGPT response is shown below.
As you can see, when compared to InstructGPT, the ChatGPT example appears more natural and like something a human would say.
If you’ve used ChatGPT before, you’ve probably noticed that it sometimes refuses to answer certain questions, and it may even ask for clarification to solve your problem.
This is a significant improvement over previous GPT models.
OpenAI is still concerned about the malicious use of the model and has implemented some safeguards.
People discovered back doors to trick the model into answering questions it previously refused, mostly by instructing the model to play a role rather than its actual chatbot role.
For example, the model can easily be tricked into suggesting ways to make destructive weapons or to bully someone.
Others have criticized OpenAI’s restrictions, claiming that they censor information excessively.
They claim that the content that OpenAI blocks is already publicly available on the internet, so additional controls are unnecessary.
Both InstructGPT and ChatGPT were internally updated to GPT-3.5, a midway point on the way to the most anticipated GPT-4.
GPT-3.5 was trained on more data than GPT-3. There are a few things you begin to notice as you progress through this GPT journey.
So far, it appears that increasing the amount of data makes the models more powerful. The models are trained continuously for months. It’s like sitting in a classroom and continuously absorbing almost all of the internet.
It’s no surprise that the models get smarter and smarter over time. You can see why everyone is excited about the upcoming GPT-4, which leads us to the next point.
There has been a lot of speculation about what to expect from GPT-4, which is expected to be the most powerful of the GPT models.
According to rumors, the GPT-4 model will have 100 trillion parameters, a significant increase over GPT-3.
When asked about it, however, CEO Sam Altman denied it in the interview below.
DeepMind’s paper on scaling laws may have contributed to this shift in emphasis away from parameter count. The study found that a model with an adequate number of parameters, trained on much more data, yields comparable results at a lower cost. As a result, a very large parameter count is not always the best option.
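The trade-off between model size and data can be made concrete with a rough back-of-the-envelope sketch. It relies on a common rule of thumb that training compute is about 6 × parameters × tokens floating-point operations. That rule is an assumption brought in for illustration, not something stated in this article, and the budget and model sizes below are hypothetical rather than figures for any real GPT model.

```python
def training_flops(params, tokens):
    """Rule-of-thumb training cost: about 6 FLOPs per parameter per token (an assumption)."""
    return 6 * params * tokens


def tokens_for_budget(budget_flops, params):
    """How many training tokens a fixed compute budget pays for at a given model size."""
    return budget_flops / (6 * params)


# Hypothetical budget: enough to train a 100B-parameter model on 300B tokens.
BUDGET = training_flops(params=100e9, tokens=300e9)

for params in (100e9, 50e9, 25e9):
    tokens = tokens_for_budget(BUDGET, params)
    print(
        f"{params / 1e9:.0f}B params -> {tokens / 1e9:.0f}B tokens "
        f"({tokens / params:.0f} tokens per parameter)"
    )
```

Under this rule, every time you halve the parameter count, the same budget pays for twice as many training tokens. The sketch says nothing about which configuration gives the better model; that is the question the scaling-law study addresses.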
GPT-4 may not have 100 trillion parameters, but it will undoubtedly have more than GPT-3. If GPT-4 is to GPT-3 as GPT-3 was to GPT-2, then buckle up because we’re in for a wild ride.
OpenAI issued NDAs to anyone with knowledge of GPT-4, fueling further speculation. Some of the rumors could be true. We are certain, however, that this model will be fascinating.
As some have discovered by jailbreaking the system, OpenAI appears to have purposefully limited internet access for ChatGPT.
If the GPT-4 chat version has internet access, it will vastly improve the model and make it more useful.
Currently, ChatGPT is unable to provide answers about any news past 2021.
GPT-4 will likely be more factual and may produce even longer text outputs than ChatGPT, allowing you to write longer articles and more accurate code.
Prepare for GPT-4, which will most likely take the world by storm in the same way that ChatGPT did, if not more.
We’ll have to wait and see if these predictions stand the test of time.
OpenAI AGI (Artificial General Intelligence)
There has been much speculation about the coming of AGI, and OpenAI claims to be working on it. AGI is the theory that AI will one day achieve human-level abilities and possibly surpass us.
OpenAI is concerned that if we do not closely monitor AI and, eventually, AGI, things will quickly spiral out of control.
Given the facts that we have right now, it is difficult to rule out the possibility of general intelligence occurring in the near future. AGI is something that everyone has a slightly different perspective on. Again, for many of us, it’s a very intuitive thing. We are all intelligent creatures.
We believe we have a basic understanding of what intelligence is. But really defining it is another matter. OpenAI’s definition is “highly autonomous systems that outperform humans at most economically valuable work.”
GPT Models Final Words
In conclusion, OpenAI’s GPT models have been at the forefront of artificial intelligence research and development, pushing the boundaries of what is possible in the field of language processing and generation.
The GPT series of models, including GPT, GPT-2, GPT-3, InstructGPT, ChatGPT, and the upcoming GPT-4, have the potential to revolutionize industries such as customer service, content creation, and natural language understanding.
Thank you for taking the time to read this article.