Large Language Models that Follow Instructions

Learn about the new version of GPT-3 that follows instructions and see the demo of the application that uses it in action. This could be a game changer for how we interact with machines.

Nov 01, 2022


TL;DR: Build software using large language models that follow your instructions.

Outline

Large Language Models that follow instructions
InstructGPT model in action with a working example
Extensions and variations
What’s next?
Human Language Technology is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

Hi all,

Continuing our series on building language applications, this post covers an emerging way of using large language models (LLMs) in your application.

In the previous posts, we covered what large language models (LLMs) like GPT-3 are and how to use them when building AI software applications.

Previous post: Building AI products using Large Language Models (the first in a series of guest posts by Shubham Saboo)

In a nutshell, LLMs (like GPT-3) are neural networks that have been trained to predict the next word on a large corpus of text (terabytes of it) crawled from the internet. Once they are trained, we can use these LLMs through a simple text-in, text-out interface.

That is, we provide a handful of examples (sometimes 2-3 are sufficient) as a so-called prompt, and GPT-3 uses them to infer and perform the task.

Prompt Example:

Convert movie titles into emoji.
Back to the Future: 👨👴🚗🕒
Batman: 🤵🦇
Transformers: 🚗🤖
Star Wars:
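
A prompt like the one above is just a string assembled from a task line, a few solved examples, and one unsolved query. Here is a minimal Python sketch of that assembly. The function name `build_few_shot_prompt` is hypothetical and not part of any library; the resulting string would be sent as-is to whichever text-in, text-out model you use.

```python
def build_few_shot_prompt(task, examples, query):
    """Assemble a few-shot prompt: task line, solved examples, one open query."""
    lines = [task]
    for source, target in examples:
        lines.append(f"{source}: {target}")
    # Leave the last item unanswered so the model completes it.
    lines.append(f"{query}:")
    return "\n".join(lines)


examples = [
    ("Back to the Future", "👨👴🚗🕒"),
    ("Batman", "🤵🦇"),
    ("Transformers", "🚗🤖"),
]
prompt = build_few_shot_prompt(
    "Convert movie titles into emoji.", examples, "Star Wars"
)
print(prompt)
```

Swapping the examples in and out of the list is the "playing around" step: the task line and query stay fixed while you test which examples give the best completions.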

Prompting has been the de facto way of using large language models in production (see writing assistant software like Jasper and Copy.ai). Engineers play around with different examples in the prompt until they get a satisfactory output from GPT-3.

What classical prompting lacks is a way to give the model the narrative and instructions behind a task. GPT-3 has to infer the task you want it to perform from input/output pairs alone.

Wouldn’t it be nice to provide some instructions to GPT-3 and ask it to follow these instructions in addition to providing some examples? Enter InstructGPT and other instruction-based large language models.

Large Language Models that follow instructions

What do the large language models (LLMs) that follow instructions offer to the developer? How can the developer utilize them to build software better and faster?

New capabilities unlocked with LLMs that follow instructions:

Perform a task by explaining to the model what you would like it to generate.
- Example: “Complete the paragraph summarizing what has been written before”.
In addition to relying on examples, the large language model can now use clues from the instructions to solve the task.
- Chain-of-thought prompting is one way to instruct the model: the prompt spells out the step-by-step reasoning that leads to the final output (we will talk about it more later in the post).
- Example: “Q: Take the last letters of the words in "Elon Musk" and concatenate them. A: The last letter of "Elon" is "n". The last letter of "Musk" is "k". Concatenating them is "nk". The answer is nk.”
For certain tasks, the ground-truth output is ambiguous. Providing instructions lets us work around this.
- For example, take news summarization. What counts as a good summary of a news article is ambiguous and varies from person to person. With instruction-based LLMs, we can simply ask “generate a summary for the following news article” and steer the model toward the desired output.
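
The chain-of-thought example above can be generated programmatically, which is handy when you need several worked exemplars in one prompt. The sketch below is only an illustration: both function names are hypothetical, and "Bill Gates" is an arbitrary new question chosen for the demo.

```python
def last_letter_rationale(phrase):
    """Return (rationale, answer) for the last-letter concatenation task."""
    words = phrase.split()
    letters = [word[-1] for word in words]
    answer = "".join(letters)
    steps = [
        f'The last letter of "{word}" is "{letter}".'
        for word, letter in zip(words, letters)
    ]
    steps.append(f'Concatenating them is "{answer}".')
    steps.append(f"The answer is {answer}.")
    return " ".join(steps), answer


def build_cot_prompt(solved_phrases, new_phrase):
    """Worked exemplars with reasoning, followed by one unanswered question."""
    question = 'Q: Take the last letters of the words in "{}" and concatenate them.'
    blocks = []
    for phrase in solved_phrases:
        rationale, _ = last_letter_rationale(phrase)
        blocks.append(f"{question.format(phrase)}\nA: {rationale}")
    # The final question is left open for the model to reason through.
    blocks.append(f"{question.format(new_phrase)}\nA:")
    return "\n\n".join(blocks)


rationale, answer = last_letter_rationale("Elon Musk")
prompt = build_cot_prompt(["Elon Musk"], "Bill Gates")
print(prompt)
```

The point of the exemplar is the reasoning text, not the answer alone: the model is shown how to get to "nk", so it can imitate the same steps on the open question.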

InstructGPT is a new iteration of the original GPT-3 model by OpenAI. At the beginning of 2022, OpenAI researchers released a paper showing that GPT-3 can be trained to follow instructions using human feedback.

Figure: Three steps outlining how the InstructGPT model is trained (source: Aligning Language Models to Follow Instructions)

OpenAI researchers used the following three steps to make the original GPT-3 model follow instructions:

Hire human labelers to write examples of instructions paired with appropriate responses, then fine-tune GPT-3 on the collected dataset.
(Figure: the set of use cases OpenAI used to collect the instruction/response dataset)
Train a reward model that ranks different responses from the fine-tuned GPT-3 according to human preference.
Further fine-tune GPT-3 on an additional set of instructions, using the trained reward model as a critic.
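
To make the reward model step more concrete, here is a minimal sketch of a pairwise preference loss of the kind used to train such rankers: the loss is small when the response the labelers preferred gets the higher score and large when the ranking is inverted. It assumes scalar reward scores are already available. The function names and the toy scores are illustrative only and are not taken from OpenAI's code.

```python
import math


def pairwise_preference_loss(score_preferred, score_rejected):
    """-log(sigmoid(preferred - rejected)), written as log(1 + exp(-margin))."""
    margin = score_preferred - score_rejected
    return math.log1p(math.exp(-margin))


def mean_preference_loss(comparisons):
    """Average the pairwise loss over (preferred, rejected) score pairs."""
    total = sum(pairwise_preference_loss(p, r) for p, r in comparisons)
    return total / len(comparisons)


# Toy scores: the reward model agrees with the labelers in the first case
# and disagrees in the second.
agrees = pairwise_preference_loss(2.0, -1.0)
disagrees = pairwise_preference_loss(-1.0, 2.0)
print(agrees, disagrees, mean_preference_loss([(2.0, -1.0), (-1.0, 2.0)]))
```

Minimizing this loss pushes the score of the preferred response above the rejected one, which is what lets the trained reward model act as a critic in the third step.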

While it is not exactly clear which large language model is behind the current text-davinci-002 in the OpenAI console, recent remarks from OpenAI staff and the model's performance on instruction-based tasks make me believe that text-davinci-002 is a GPT-3 model tuned to follow human instructions, with some secret sauce not released in their paper.

Jan Leike @janleike

@sarahwiegreffe I agree. While OpenAI doesn't like talking about exact model sizes / parameter counts anymore, documentation should definitely be better. text-davinci-002 isn't the model from the InstructGPT paper. The closest to the paper is text-davinciplus-002.

Jan Leike @janleike

PSA: If you want to compare InstructGPT to a base model in your research, the closest comparison is "text-davinciplus-002" with "davinci" (you might need to request access to the former). It's not a super clean comparison, because we haven't deployed the exact paper models.

InstructGPT model in action with a working example

Now let’s put the InstructGPT model into action!

Come follow me as I show you how to build software using InstructGPT that can help e-commerce owners increase their outreach and write more engaging ad copy for all social media networks, even if they have little to no copywriting experience.

This prototype can be further improved and developed into a full SaaS, or it can be used as a tool to help you stand out when applying for jobs as an NLP engineer.
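
As a rough idea of what the core of such a tool could look like, here is a hedged sketch that turns product details into an instruction prompt and packages it as a request for a completions-style endpoint. This is not the author's implementation: the function names, the instruction wording, and the sample product are made up for illustration. The payload field names (`model`, `prompt`, `max_tokens`, `temperature`) are assumed to mirror the OpenAI completions parameters, and no network call is made.

```python
def build_ad_copy_prompt(product, description, platform):
    """Instruction-style prompt: say what to write, then give the inputs."""
    return (
        f"Write an engaging ad for the following product to run on {platform}.\n\n"
        f"Product: {product}\n"
        f"Description: {description}\n\n"
        "Ad copy:"
    )


def build_completion_request(prompt, model="text-davinci-002",
                             max_tokens=100, temperature=0.7):
    """Package the prompt as a payload dict (assumed field names, no API call)."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


# Hypothetical product used only for this demo.
request = build_completion_request(
    build_ad_copy_prompt(
        product="Trail Mug",
        description="An insulated steel mug that keeps coffee hot for six hours.",
        platform="Instagram",
    )
)
print(request["prompt"])
```

Note that the prompt contains no input/output examples at all: the instruction alone carries the task, which is the capability instruction-following models add over classical few-shot prompting.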

