GPT-3 (Generative Pre-trained Transformer 3) is a language model. OpenAI presented it in a pioneering paper, ‘Language Models Are Few-Shot Learners’, in May 2020.
Introduction
GPT-3 is a machine learning model built to achieve strong, accurate results on a wide range of natural language benchmarks. When it was introduced to the tech world, it grabbed immediate attention due to its sheer magnitude. With 175 billion parameters, it was the largest neural network of its time, exceeding all previous language models by a wide margin. It can carry out a slew of natural language processing (NLP) tasks, including translation and question answering, and it can deliver highly accurate results even on tasks it has never been trained on.
GPT-3 is an autoregressive language model: it generates text one token at a time, with each new token conditioned on everything that came before. Unlike AI systems designed for a single use case, GPT-3 can produce text that closely simulates human writing, on virtually any suggested subject. Its simple ‘text in, text out’ interface lets users try it on almost any English-language task.
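To make that interface concrete, here is a minimal sketch of a call against it, assuming the Completion endpoint of OpenAI's Python client as it existed at launch; the engine name and sampling parameters are illustrative placeholders, not recommendations.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; issued via the OpenAI dashboard

# Text in, text out: the prompt describes (or demonstrates) the task,
# and the model autoregressively continues it, one token at a time.
response = openai.Completion.create(
    engine="davinci",        # the largest GPT-3 engine at launch (illustrative)
    prompt="Translate English to French:\n\ncheese =>",
    max_tokens=16,
    temperature=0.0,         # low temperature for a near-deterministic answer
)

print(response["choices"][0]["text"].strip())
```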
“GPT-3 is living confirmation that the Natural Language Processing area is advancing more than ever by leaps and bounds,” says Nerea Luis, an expert in artificial intelligence.
The Foundation of GPT Models
All the groundwork concepts are connected to the GPT model in one way or another. These include the transformer architecture, language models, generative models, semi-supervised learning, zero-/one-/few-shot learning, transfer learning, and multitask learning. Together, these concepts define the GPT model. What is the thread connecting them all?
Models of the GPT family are language models based on the transformer architecture, pre-trained in a generative manner, and capable of strong performance in zero-, one-, and few-shot multitask settings.
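The zero-/one-/few-shot distinction is simpler than it sounds: the variants differ only in how many worked examples the prompt contains, and the model's weights are never updated. A sketch, using the English-to-French task from the GPT-3 paper:

```python
# Zero-, one-, and few-shot differ only in how many worked examples the
# prompt contains; no fine-tuning or weight update takes place.

zero_shot = "Translate English to French:\nsea otter =>"

one_shot = (
    "Translate English to French:\n"
    "cheese => fromage\n"
    "sea otter =>"
)

few_shot = (
    "Translate English to French:\n"
    "cheese => fromage\n"
    "plush giraffe => girafe en peluche\n"
    "mint => menthe\n"
    "sea otter =>"
)
# Each string would be passed as the `prompt` in the call shown earlier.
```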
OpenAI presented the initial GPT model in June 2018 in a paper titled ‘Improving Language Understanding by Generative Pre-Training’. It showed how combining the transformer architecture with unsupervised pre-training yields promising results. GPT-1 paired that generative pre-training with supervised fine-tuning for each specific downstream task.
OpenAI published the second paper, ‘Language Models are Unsupervised Multitask Learners’, in February 2019, introducing GPT-2. Although larger by an order of magnitude, GPT-2's main difference from its predecessor is that it is designed for multitasking: it proved that a single language model could perform several tasks competently without any task-specific training.
GPT-3 Settles a Specific AI Point of Debate
It is essential to highlight that GPT-3 is around 100x bigger than GPT-2 (175 billion parameters versus 1.5 billion), yet not fundamentally different from the earlier GPTs: the underlying principles are, by and large, the same. Even so, its performance has surpassed expectations, and the results it has produced are stunning.
There has long been a debate in AI about how to achieve better results: should we simply build bigger models with more parameters, or should we engineer purpose-built modules for capabilities such as intuitive physics, commonsense reasoning, and theory of mind? The incredible performance of the GPT-3 model seems to have settled this debate: bigger is better has won!
Skills, Reasoning Abilities and Other Possibilities
Unrivaled Conversational Skills
GPT-3 was trained on colossal volumes of internet data drawn from sources such as Common Crawl, WebText, Wikipedia, and a large corpus of books. It knows a great deal about public and historical figures, and it can even emulate them. Used as a chatbot, the results are impressive. Here are some instances where GPT-3 was used in strange but exciting ways (a sketch of the underlying prompt pattern follows the list):
- Arram Sabeti, the Chief Executive Officer of ZeroCater, used GPT-3 to conduct an interview about Stoicism.
- Mckay Wrigley designed an app called Bionicai to help people learn anything, from philosophy with Aristotle to writing skills with Shakespeare.
- Jordan Moore talked with the GPT-3 versions of Steve Jobs, Elon Musk, Cleopatra, and Kurt Cobain and posted the results on a Twitter thread.
- Gwern did an excellent job further exploring the possibilities of the model regarding conversations and personification.
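All of these personification projects rest on the same basic trick: a prompt that frames the exchange as a dialogue with a named figure, so the model continues in character. A hedged sketch of the pattern; the persona wording and stop sequence here are my own illustration, not the exact prompts any of the projects above used:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# A hypothetical persona prompt: the framing text casts the model as a
# named historical figure, and the completion continues in character.
prompt = (
    "The following is a conversation with Marcus Aurelius, Roman emperor "
    "and Stoic philosopher. He answers thoughtfully and in character.\n\n"
    "Human: How should I deal with setbacks at work?\n"
    "Marcus Aurelius:"
)

response = openai.Completion.create(
    engine="davinci",
    prompt=prompt,
    max_tokens=80,
    temperature=0.8,
    stop=["Human:"],   # stop before the model writes the next question itself
)
print(response["choices"][0]["text"].strip())
```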
Reasoning Abilities
Many people tested GPT-3 in areas where humans excel, but its commonsense reasoning and logic were probed as well, and it kept up amazingly well. One caveat emerged: GPT-3 may need prompts that explicitly allow it to express uncertainty, otherwise it tends to answer nonsense questions with confident-sounding fabrications.
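To illustrate what such an uncertainty prompt might look like (the exact wording below is illustrative, not taken from any published experiment):

```python
# An "uncertainty prompt": the preamble explicitly gives the model an
# out, so nonsense questions are refused rather than answered with a
# confident fabrication. The wording here is illustrative.
prompt = (
    "I answer questions truthfully. If a question is nonsense or has no "
    'real answer, I reply with "Unknown".\n\n'
    "Q: How many eyes does a giraffe have?\n"
    "A: Two.\n"
    "Q: How many eyes does my foot have?\n"
    "A:"
)
# Fed to the Completion call shown earlier, a well-behaved response is
# "Unknown" rather than an invented number.
```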
Other Unique Abilities
Gwern conducted a wide array of experiments with GPT-3 in a display of deep research. He had GPT-3 write its own meta-prompts, complete an arXiv paper, and clean PDF-extracted text by rejoining broken words and fixing hyphenation. He also got GPT-3 to design new board games. Every time, it came up trumps. It does appear that imagination is the only limit when listing what GPT-3 can do.
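The PDF-cleaning task is easy to picture: text extracted from a PDF arrives with hard line breaks and hyphenated word fragments, and the model can be prompted to undo the damage. A minimal sketch, with prompt wording that is mine rather than Gwern's:

```python
# Text as it typically comes out of a PDF extractor: hard line breaks
# and words split by end-of-line hyphens.
raw = (
    "The experi-\n"
    "ment demonstrates that large language mod-\n"
    "els can restore text mangled by PDF ex-\n"
    "traction."
)

# Instruction plus input: GPT-3 is asked to rejoin hyphenated words and
# remove the spurious line breaks without changing the wording.
prompt = (
    "Fix the hyphenation and line breaks in the following passage, "
    "keeping the wording unchanged:\n\n" + raw + "\n\nFixed passage:"
)
# Sent through the same Completion call as above, the expected output is:
# "The experiment demonstrates that large language models can restore
#  text mangled by PDF extraction."
```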
The Microsoft and GPT-3 Connection
Undoubtedly, there is ample evidence that GPT-3 is the next-generation artificial intelligence technology that can be leveraged to usher in a new breed of startups and applications. Developers are discovering innovative uses for the language model. Startups are keen to use GPT-3 to build new or augment existing products.
However, creating a profitable business around GPT-3 could pose a challenge. That is why OpenAI and Microsoft have partnered, creating a shortcut to profitability for OpenAI. Microsoft is aiming to create a marketing advantage with its first GPT-3-powered application: targeted at a non-technical audience, it converts natural language descriptions into Power Fx formulas. The association should help OpenAI become a commercially viable entity sooner.
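Microsoft has not published the prompts behind that feature, but the general natural-language-to-formula pattern is easy to sketch; the table names, columns, and example formulas below are hypothetical:

```python
# A sketch of the natural-language-to-Power-Fx pattern. The actual
# Power Apps integration is internal to Microsoft; the tables, columns,
# and formulas here are hypothetical.
prompt = (
    "Translate the request into a Power Fx formula.\n\n"
    "Request: show orders over 100 dollars\n"
    "Formula: Filter(Orders, Amount > 100)\n\n"
    "Request: show customers from Seattle\n"
    "Formula:"
)
# A plausible completion: Filter(Customers, City = "Seattle")
```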
Conclusion
GPT-3 has produced results that are way beyond expectations. We don’t yet know what to expect from the future; what is certain is that GPT-3 is one of a kind and remains among the most powerful neural networks available. Every industry is watching it with bated breath and heightened expectations. The Center for Research on Foundation Models (CRFM), a new initiative from the Stanford Institute for Human-Centered Artificial Intelligence (HAI), has been laying the groundwork for studying path-breaking foundation models such as GPT-3. CRFM is committed to interdisciplinary research into building foundation models that are efficient, robust, multimodal, and ethically sound.
WinWire is leveraging foundation models and evaluating GPT-3 for various healthcare use cases for providers, payors, and medical technology companies.