How Generative AI Works (Part II)

Some key concepts behind Generative AI and ChatGPT, explained to help attorneys grasp the essentials of how it works.

Last month, I discussed how important it is for attorneys to understand the inner workings of Generative AI tools, including ChatGPT.

I explained how the index at the back of a book is a great metaphor for how full-text search engines work, and I covered how incorporating structure within a search index improves search results. Defining terms such as “compensatory damages” in the search index provides better results, as does adding other intelligence to the text, such as incorporating synonyms from a thesaurus.

And lastly, I explained how “plain English” searching was an early application of natural language processing, pioneered in the legal industry.

This month, I’ll walk through more analogies to explain some key concepts related to Generative AI and ChatGPT to help attorneys grasp the essential concepts that make it work.

Note: Generative AI is highly nuanced. A data scientist steeped in machine learning and artificial intelligence can provide deeper and more technically accurate explanations.

Type Ahead On Your Smartphone And ChatGPT

Who has ever received (or sent) a funny or perhaps embarrassing text message in which a smartphone autocorrected a word? Most everyone who texts has pushed send a moment too early!

In addition to autocorrect, most smartphones also have a “type ahead” feature that provides three suggestions for the next word in a text message. It’s also often the source of many autocorrect errors.

So what does “type ahead” have to do with ChatGPT?

Both are Generative AI. They both generate content.

When someone texts their spouse and says, “Hey, I’m out walking the <______>”, the smartphone may suggest:

“Dog”

“Daisy” (if your dog’s name is Daisy)

“Kids”

The smartphone is simply guessing, based upon probabilities, the likely next word to be typed. My dog was named Daisy, so “Daisy” would come up as a suggestion for me in the context of texting my wife that I was out walking.

ChatGPT is also a Generative AI that creates content. When you type a question into ChatGPT, it behaves like type ahead on a smartphone, but at massive scale. It generates the first word, then the second word, and so on until it answers the question. ChatGPT does this based upon probabilities and vectors that form a set of connections similar to neurons in a brain — a neural network, if you will. And because the AI has been trained at such massive scale, it “guesses” extremely well.
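To make this concrete, here is a deliberately tiny sketch in Python. The word choices and probabilities are invented for illustration and are nothing like a real model, but the basic loop is the same idea: look up how likely each next word is, pick one, and repeat.

```python
import random

# A toy "type ahead" model: for each word, hand-entered probabilities for
# what the next word might be. (Hypothetical numbers for illustration;
# a real model learns these from billions of examples.)
next_word_probs = {
    "walking": {"the": 0.7, "to": 0.2, "home": 0.1},
    "the":     {"dog": 0.5, "kids": 0.3, "Daisy": 0.2},
    "dog":     {".": 1.0},
    "kids":    {".": 1.0},
    "Daisy":   {".": 1.0},
}

def suggest(word, k=3):
    """Return up to k suggestions for the next word, most likely first."""
    options = next_word_probs.get(word, {})
    return sorted(options, key=options.get, reverse=True)[:k]

def generate(word, max_words=6):
    """Repeatedly pick a likely next word until a period or the limit is hit."""
    text = [word]
    while len(text) < max_words:
        options = next_word_probs.get(text[-1])
        if not options:
            break
        nxt = random.choices(list(options), weights=list(options.values()))[0]
        if nxt == ".":
            break
        text.append(nxt)
    return " ".join(text)

print(suggest("the"))        # ['dog', 'kids', 'Daisy']
print(generate("walking"))   # e.g. "walking the dog"
```

ChatGPT does the same kind of thing, except the “lookup table” is a neural network with billions of parameters rather than a handful of hand-entered numbers.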

To oversimplify, an attorney can start by visualizing a massive decision tree. But rather than a simple tree, picture the decision points as interconnected, so the path navigated through them is not linear.

The result is a striking ability to generate content that is meaningful to a human and that can be remarkably accurate.

GPT-3.5 Turbo has over 175 billion parameters (GPT-4 is measured in trillions of parameters). A parameter can be loosely thought of as a decision point in the neural network. And the parameters help ChatGPT create language in a way similar to how a smartphone suggests three options for the next word in the type-ahead feature.
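For readers who want to see what a “parameter” looks like in code, here is a minimal, hypothetical sketch of a toy network in Python using NumPy. The sizes are made up; the point is only that every weight is one adjustable number, the layers are interconnected rather than a single linear path, and a model like GPT-3.5 has billions of these numbers instead of the couple dozen shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy network: 4 inputs -> 3 hidden units -> 2 outputs.
# Every weight and bias below is a "parameter" -- an adjustable number
# that training nudges until the outputs become useful.
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)   # 4*3 + 3 = 15 parameters
W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)   # 3*2 + 2 = 8 parameters

def forward(x):
    """Pass an input through the interconnected layers."""
    hidden = np.tanh(x @ W1 + b1)
    return hidden @ W2 + b2

total = sum(p.size for p in (W1, b1, W2, b2))
print(f"{total} parameters")   # 23 here; GPT-3.5 has roughly 175,000,000,000
print(forward(np.array([1.0, 0.0, 0.5, -0.5])))
```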

Large Language Models (LLMs)

Large Language Models use billions of examples of language to extract and catalog intelligence about words and the relationships between them. Similar to the earlier mention of how a search index can be enhanced with synonyms from a thesaurus, LLMs are enhanced with many other attributes that provide additional intelligence and relationships between words. An LLM catalogs items including (a short illustration follows the list):

  • Grammar and language structure. Sentences have a subject and a predicate; adjectives in English typically precede the nouns they describe; multiple sentences form a paragraph.
  • How a word is used in language. Is a word a noun? A verb? An adjective? All of the above?
  • Word meaning. The word “green” for example has many meanings. It can be a color, it can mean “inexperienced,” etc.
  • Context. The word “green” likely means a color when it appears near a word like “paint,” “art,” or “grass.”
  • Proper names. Microsoft, Bill Clinton, Shakira, Cincinnati, and Veuve Clicquot are all recognized as proper names or “entities.”
  • Emotion and tone. Feelings like frustration or infatuation, positive or negative sentiment, and types of humor such as sarcasm are discerned and categorized.
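As a small illustration of the kinds of attributes in the list above, the sketch below uses the open-source spaCy library, which is separate from ChatGPT, to pull parts of speech and proper names (“entities”) out of a sentence. It assumes spaCy and its small English model are installed (`pip install spacy`, then `python -m spacy download en_core_web_sm`).

```python
import spacy

# Load spaCy's small English model (assumes it has been downloaded).
nlp = spacy.load("en_core_web_sm")

doc = nlp("Bill Clinton painted the fence green outside Cincinnati.")

# How each word is used in the sentence (noun, verb, adjective, ...).
for token in doc:
    print(token.text, token.pos_)

# Proper names ("entities") that the model recognizes.
for ent in doc.ents:
    print(ent.text, ent.label_)
```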

The result is the ability to mathematically relate the meaning of a plain-language question entered into ChatGPT (the Input) and construct a response that comes across as a meaningful answer (the Output).
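One way to picture “mathematically relating meaning” is with word vectors. The Python sketch below assigns each word three hand-picked numbers, which is purely hypothetical; real systems learn vectors with hundreds or thousands of dimensions. Words used in similar contexts end up with similar vectors, and cosine similarity measures how close two meanings are.

```python
import numpy as np

# Hypothetical three-number "vectors" for a few words. Real vectors are
# learned from data and have hundreds or thousands of dimensions.
vectors = {
    "green":   np.array([0.9, 0.8, 0.1]),
    "grass":   np.array([0.8, 0.9, 0.0]),
    "paint":   np.array([0.7, 0.6, 0.2]),
    "verdict": np.array([0.1, 0.0, 0.9]),
}

def similarity(a, b):
    """Cosine similarity: near 1.0 means close in meaning, near 0 means unrelated."""
    va, vb = vectors[a], vectors[b]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

print(similarity("green", "grass"))    # high -- related words
print(similarity("green", "verdict"))  # low  -- unrelated words
```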

Tokens

Unlike the smartphone analogy, ChatGPT actually works on parts of words. The parts of words are called “tokens.” It is another simplification, but think of a token as the root of a word.

“Creat” is the root of many words, including Create, Creative, Creator, Creating, and Creation. “Creat” would be an example of a token.

When a user inputs a question into ChatGPT, the question is broken down into tokens for processing. Those word parts are then compared to the LLM that contains the billions of parameters (the neural network) to create the Output.
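To see tokens firsthand, the sketch below uses OpenAI’s open-source tiktoken library (installed separately with `pip install tiktoken`). The `cl100k_base` encoding is the one generally associated with the GPT-3.5/GPT-4 family of models; the exact splits it produces vary by encoding, so treat the output as illustrative.

```python
import tiktoken

# The encoding generally used by the GPT-3.5 / GPT-4 family of models.
enc = tiktoken.get_encoding("cl100k_base")

text = "Creating compensatory damages calculations"
token_ids = enc.encode(text)

print(token_ids)  # a list of integers, one per token
for tid in token_ids:
    # Show the piece of text each token stands for.
    print(tid, repr(enc.decode([tid])))
```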

GPT Defined

The “GPT” in ChatGPT stands for Generative Pre-Trained Transformer. In this article, I’ve covered the Generative aspect of ChatGPT; ChatGPT generates content.

I’ve also covered, in a simple way, the idea of the Transformer in ChatGPT. The LLM, the neural network of parameters, and the concept of tokens provide a basic understanding of the components of the Transformer that takes user Input and transforms it into Output.

I have yet to cover the “P,” or Pre-Trained, aspect of ChatGPT. Without training and the ability for the algorithm to learn, Generative AI systems like ChatGPT would answer many questions incorrectly, spew nonsense and hate, and hallucinate.

The training of ChatGPT is what really separates it from other Generative AI solutions.

Few people have heard of Meta’s Galactica LLM, even though it was released in demo form two weeks prior to OpenAI’s ChatGPT. Why? Because it was pulled down just three days later, after its responses exhibited bias and spewed nonsense. Galactica’s AI training was not as good as ChatGPT’s, and consumers of Galactica got to see the nonsense firsthand.

Next month, I’ll cover the techniques of training Generative AI, and how training made ChatGPT the fastest-growing consumer application in history.


Ken Crutchfield is Vice President and General Manager of Legal Markets at Wolters Kluwer Legal & Regulatory U.S., a leading provider of information, business intelligence, regulatory and legal workflow solutions. Ken has more than three decades of experience as a leader in information and software solutions across industries. He can be reached at ken.crutchfield@wolterskluwer.com.