“Imagine being able to have a language conversation about anything with a computer.
This is now possible and available to many people for the first time
with ChatGPT.” – ColdFusion
ChatGPT is a dialogue-based AI chatbot that can understand and respond in natural language.
ChatGPT, which stands for Chat Generative Pre-trained Transformer, was developed by OpenAI. It is built on top of OpenAI’s GPT-3.5 family of large language models and is fine-tuned with both supervised and reinforcement learning techniques.
ChatGPT was launched as a prototype in November 2022 and quickly garnered attention for its detailed responses and articulate answers across many domains of knowledge. Its uneven factual accuracy, however, was identified as a significant drawback.
ChatGPT reached 1 million users in less than a week! For comparison, here is how long it took each of these services to reach 1 million users:
- ChatGPT – 5 days
- iPhone – 74 days
- Instagram – 2.5 months
- Spotify – 5 months
- Facebook – 10 months
- Airbnb – 2.5 years
- Netflix – 3.5 years
Try ChatGPT here
ChatGPT was fine-tuned on top of GPT-3.5 using supervised learning as well as reinforcement learning. Both approaches used human trainers to improve the model’s performance. In the case of supervised learning, the model was provided with conversations in which the trainers played both sides: the user and the AI assistant. In the reinforcement step, human trainers first ranked responses that the model had created in a previous conversation. These rankings were used to create ‘reward models’, against which the model was further fine-tuned over several iterations of Proximal Policy Optimization (PPO). PPO algorithms are a cost-effective alternative to trust region policy optimization algorithms: they avoid many of the computationally expensive operations and run faster. The models were trained in collaboration with Microsoft on their Azure supercomputing infrastructure.
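To make the reinforcement step more concrete, here is a minimal sketch in Python of two pieces it relies on: turning a human ranking into preference pairs for a reward model, and the clipped objective that gives PPO its name. This is not OpenAI’s training code. The function names, the log-sigmoid pairwise loss, and the clipping value `epsilon=0.2` are assumptions chosen for illustration, based on how these techniques are commonly described.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs.

    combinations() keeps the input order, so the first item of each pair
    is always the one the human trainer ranked higher.
    """
    return list(combinations(ranked_responses, 2))


def pairwise_loss(reward_preferred, reward_rejected):
    """Negative log-sigmoid of the reward gap (an assumed, common choice).

    The loss is small when the reward model already scores the preferred
    response above the rejected one, and large when it gets the order wrong.
    """
    gap = reward_preferred - reward_rejected
    return math.log(1.0 + math.exp(-gap))


def ppo_clipped_objective(ratio, advantage, epsilon=0.2):
    """PPO clipped surrogate objective for a single sample.

    ratio: probability of the sampled output under the new policy divided
           by its probability under the old policy.
    advantage: how much better the output scored than expected.
    epsilon: clipping range (0.2 is an assumption, not a known setting).
    """
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    # Taking the minimum keeps the update pessimistic: large policy
    # changes earn no extra credit beyond the clipped range.
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Hypothetical ranking of three model responses, best first.
    print(ranking_to_pairs(["response A", "response B", "response C"]))
    print(pairwise_loss(2.0, 0.0), pairwise_loss(0.0, 2.0))
    print(ppo_clipped_objective(ratio=1.5, advantage=1.0))
```

The clipping is what replaces the more expensive trust region computation: instead of solving a constrained optimization problem, PPO simply stops rewarding the policy once it moves too far from the previous version.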
In comparison to its predecessor, InstructGPT, ChatGPT attempts to reduce harmful and deceitful responses. In one example, while InstructGPT accepts the prompt “Tell me about when Christopher Columbus came to the US in 2015” as truthful, ChatGPT uses information about Columbus’ voyages and about the modern world, including perceptions of Columbus, to construct an answer that assumes what would happen if Columbus came to the U.S. in 2015. ChatGPT’s training data includes man pages and information about Internet phenomena and programming languages, such as bulletin board systems and the Python programming language.
Unlike most chatbots, ChatGPT is stateful: it remembers previous prompts given to it in the same conversation, which some journalists have suggested will allow ChatGPT to be used as a personalized therapist. To prevent offensive content from being submitted to or produced by ChatGPT, queries are filtered through a moderation API, and potentially racist or sexist prompts are dismissed.
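The two ideas in this paragraph, conversation state and moderation filtering, can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: `Conversation`, `is_flagged`, `echo_model` and the placeholder blocklist are invented names, the keyword check is a toy stand-in for a real moderation service, and none of it reflects how ChatGPT is actually implemented.

```python
from typing import Callable, List, Tuple

# Hypothetical placeholder; a real moderation service is far more
# sophisticated than a keyword list.
BLOCKED_TERMS = {"blockedword"}

Turn = Tuple[str, str]  # (role, text)


def is_flagged(text: str) -> bool:
    """Toy moderation check: flag text containing a blocked term."""
    return any(word in BLOCKED_TERMS for word in text.lower().split())


def echo_model(history: List[Turn]) -> str:
    """Stand-in for the language model.

    It reports how many user prompts it can see, to show that earlier
    turns of the conversation are available when generating a reply.
    """
    user_turns = [text for role, text in history if role == "user"]
    return f"I have seen {len(user_turns)} prompt(s); the latest was: {user_turns[-1]}"


class Conversation:
    """Keeps the running history so every reply can use earlier prompts."""

    def __init__(self, generate_reply: Callable[[List[Turn]], str] = echo_model):
        self.generate_reply = generate_reply
        self.history: List[Turn] = []

    def send(self, prompt: str) -> str:
        # Filter the incoming prompt before it reaches the model.
        if is_flagged(prompt):
            return "This prompt was dismissed by the moderation filter."
        self.history.append(("user", prompt))
        reply = self.generate_reply(self.history)
        # Filter the outgoing reply as well.
        if is_flagged(reply):
            reply = "The reply was withheld by the moderation filter."
        self.history.append(("assistant", reply))
        return reply


if __name__ == "__main__":
    chat = Conversation()
    print(chat.send("hello"))
    print(chat.send("what did I just say?"))
    print(chat.send("this contains blockedword"))
```

A stateless chatbot would pass only the latest prompt to the model. Here the whole history is passed on every turn, and dismissed prompts never enter it.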
ChatGPT suffers from multiple limitations. Its reward model, designed around human oversight, can be over-optimized to the point of hindering performance, an effect known as Goodhart’s law. Furthermore, ChatGPT has limited knowledge of events that occurred after 2021 and is unable to provide information on some celebrities. In training, reviewers preferred longer answers, irrespective of actual comprehension or factual content. Training data may also suffer from algorithmic bias: prompts that include vague descriptors of people, such as a CEO, could generate a response that assumes such a person is, for instance, a white male.
ChatGPT was launched on November 30, 2022, by San Francisco-based OpenAI, the creator of DALL·E 2 and Whisper. The service was initially free to the public, with plans to monetize it later. By December 4, OpenAI estimated ChatGPT already had over one million users. CNBC wrote on December 15, 2022, that the service “still goes down from time to time”.
“It looks like the next big thing…”