
Understanding Large Language Models (LLMs)

Here's a breakdown written BY an LLM, ChatGPT:

Imagine trying to create a recipe from all the cooking shows you've ever watched. Your brain sifts through everything you've learned, combining bits of information to help you make a dish. Large Language Models (LLMs) work similarly, but in the world of words and information.

An LLM is a type of artificial intelligence that reads and absorbs vast amounts of text from the internet, books, articles—virtually anywhere you find words. It's like an incredibly well-read friend who's digested more books and sites than you could in a lifetime.

But what does an LLM do with all this information? It learns the art of conversation, writing, and even understanding human language. When you ask an LLM a question or seek its assistance in writing, it isn't pulling responses from thin air. Instead, it recalls all the text it's studied, understanding the context and the way language is used, to provide a response that makes sense.

So, LLMs are not creators of new ideas but rather expert communicators. They synthesize everything they've "read" and assist us in various tasks—from writing emails, generating reports, to answering complex questions, making our digital communications smoother and more efficient.

In shorter and simpler terms

At its base, a language model is a program that uses a probability-based algorithm to decide what words it should use to reply to its input (often called a “prompt”), based on its previous training.

A large language model, or LLM for short, is a powerful language model that has been trained on a massive amount of data so that it can interpret a very wide range of inputs and respond appropriately.

How does a language model use probability to decide how it should respond to our messages?

By training on example data, which can be anything from an article to someone’s social media message history, a language model learns which words tend to follow others. When it takes in our messages, it calculates which words are most likely to come next as a response, based on that training, and gives us that response. The fact that LLMs can respond to just about anything is a testament to just how much data they’ve trained on: no matter how uncommon your question might be, they will generally have a reasonable guess as to how to respond.
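To make the "which words follow others" idea concrete, here is a minimal sketch of a toy word-pair (bigram) counter. It is an illustration only: real LLMs use neural networks trained on far more text, and the function names and the tiny training sentence below are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    follower_counts = defaultdict(Counter)
    words = text.lower().split()
    for current_word, next_word in zip(words, words[1:]):
        follower_counts[current_word][next_word] += 1
    return follower_counts


def next_word_probabilities(model, word):
    """Turn the raw follower counts for `word` into probabilities."""
    counts = model.get(word.lower(), Counter())
    total = sum(counts.values())
    return {follower: count / total for follower, count in counts.items()}


# Tiny, made-up training text (purely illustrative).
training_text = "the cat sat on the mat the cat ate the fish"
model = train_bigram_model(training_text)

probabilities = next_word_probabilities(model, "the")
most_likely = max(probabilities, key=probabilities.get)
print(probabilities)  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
print(most_likely)    # cat
```

In this toy text, "cat" follows "the" in two of the four places "the" is followed by a word, so it is the most likely next word. A word the model has never seen returns an empty result, which is one reason real models need so much training data.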

Why are LLMs such a big deal?

Large Language Models are incredibly powerful

If you ask ChatGPT what information it was trained on in order to be able to converse with humans, it won’t hesitate to tell you: in its training process, ChatGPT has taken in information and human interactions from all kinds of books, articles, papers, and websites. It also retains a great deal of what it learned from that data: when ChatGPT is used without a live connection to the internet, every answer it gives comes from its own training. Whenever we ask an LLM to do something, it can draw on the enormous amount of data it was trained on to work out a response that is likely to help us. Few tools in human history, if any, have been able to aggregate this much information and make it usable as easily as sending a message.


Source: https://www.linkedin.com/pulse/corpus-used-large-language-models-llms-different-ankit-pareek/

Large Language Models are versatile

Because LLMs are trained on all types of data, they are also much more flexible than most other machine learning programs on the market. LLMs can respond to our requests in whatever form we ask for: we can tell one to give us only one-word answers, or structure its output in a table format, or even output code. This allows us to incorporate LLMs into our code and use their flexibility to our advantage, so that a task that would otherwise take an intensive amount of program logic or formatting can instead be handled by asking an LLM to process it. An LLM also takes a much more human approach to the problems we give it, which can make the resulting program more robust: an oddly written user input that would break conventional parsing code is usually not a problem for an LLM. Deliberately malicious input is a different matter, since an LLM can be misled by instructions hidden in the text it reads, so its output still needs to be validated.
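As a rough sketch of what "incorporating an LLM into our code" can look like, the example below asks for a fixed JSON format and then checks the reply before using it. Everything here is an assumption made for illustration: `fake_llm` is a stand-in for a real model call, and the prompt wording, `build_prompt`, and `classify` are hypothetical names, not part of any particular product or library.

```python
import json

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def build_prompt(review):
    """Ask for a strict, machine-readable answer format."""
    return (
        "Classify the sentiment of the review below. Reply with JSON only, "
        'in the form {"sentiment": "positive"}, using positive, negative, '
        "or neutral.\n\n"
        f"Review: {review}"
    )


def fake_llm(prompt):
    """Stand-in for a real LLM call (hypothetical; always gives one answer)."""
    return '{"sentiment": "positive"}'


def classify(review, llm=fake_llm):
    """Return the sentiment, or None if the reply is not in the agreed format."""
    raw_reply = llm(build_prompt(review))
    try:
        parsed = json.loads(raw_reply)
    except json.JSONDecodeError:
        return None
    sentiment = parsed.get("sentiment") if isinstance(parsed, dict) else None
    return sentiment if sentiment in ALLOWED_SENTIMENTS else None


print(classify("Loved it, would buy again"))  # positive
```

The validation step is the design choice worth noting: the model is asked for a structure, but the surrounding code still decides what to do when the reply does not match it.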


Large Language Models are only getting better

As LLMs have become the new darling of the tech industry, the race to make them even stronger and more accessible has continued over the last few years. This means as time goes on, the LLMs that are being used in our programs will get even better at understanding what we want them to do and performing their work in a fast, cheap way.

The Pros and Cons of Large Language Models

What they're great at:

Performing a wide variety of tasks, using tools we provide them as needed to do the job

One common application of LLMs is a question and answer bot, where an LLM will go through a database for you and answer any questions you have about the data.

A wide variety of tools are now available for the most popular LLMs, which give the LLMs the ability to do much more than just talk: for example, they can search the internet or save any important information a user provides into a database.
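A minimal sketch of how a tool can be wired up, assuming the model has been instructed to reply with a small JSON request naming the tool it wants. The tool names, the request format, and the pretend search function are all hypothetical; real tool frameworks differ in the details.

```python
import json

saved_notes = []


def search_web(query):
    """Hypothetical tool: a real version would call a search API."""
    return f"(pretend search results for '{query}')"


def save_note(text):
    """Hypothetical tool: a real version would write to a database."""
    saved_notes.append(text)
    return "saved"


TOOLS = {"search_web": search_web, "save_note": save_note}


def run_tool_request(llm_output):
    """Run the tool named in the model's JSON request and return its result."""
    request = json.loads(llm_output)
    tool = TOOLS.get(request["tool"])
    if tool is None:
        return f"unknown tool: {request['tool']}"
    return tool(request["argument"])


# Pretend the model replied with this request (made up for the example).
result = run_tool_request('{"tool": "save_note", "argument": "Trial starts in May"}')
print(result)       # saved
print(saved_notes)  # ['Trial starts in May']
```

The result of the tool is normally handed back to the LLM so that it can use it in its final answer to the user.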


Here’s an example pulled from a bot that’s designed to get information on upcoming clinical research trials using DuckDuckGo’s search capabilities as one of its tools.

Taking in constructive criticism and improving their output over multiple interactions

Most LLMs have the capability to “remember” their previous interactions with a user in a conversation, and use that information to help the user as the user gets more specific about what they need.

While this may seem like a process that requires an active back-and-forth from a human user and an LLM, there are also many tools now that utilize this process to the fullest without any further input from the user.

  • One example that’s become increasingly popular is a summarizing tool, which first provides an LLM with something to summarize (a meeting, a video, an article, etc.). It then asks the LLM to repeatedly revise its previous summary while bringing in any missing context or important information, a process that creates summaries with much more detail without making them significantly longer.


Source: https://jxnl.github.io/instructor/blog/2023/11/05/chain-of-density/
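The revise-in-a-loop idea can be sketched in a few lines. This is a simplified illustration under our own assumptions, not the implementation from the linked article: the prompt wording is invented, and `make_fake_llm` only produces a stand-in model that returns numbered drafts so the loop can run without a real LLM.

```python
def refine_summary(article, llm, rounds=3):
    """Ask for a first summary, then repeatedly ask for a denser rewrite."""
    summary = llm(f"Summarize this article in two sentences:\n{article}")
    drafts = [summary]
    for _ in range(rounds):
        summary = llm(
            "Rewrite the summary below at the same length, adding any "
            "important details from the article that it is missing.\n"
            f"Article:\n{article}\n\nCurrent summary:\n{summary}"
        )
        drafts.append(summary)
    return drafts


def make_fake_llm():
    """Build a stand-in LLM that records prompts and returns numbered drafts."""
    prompts_sent = []

    def fake_llm(prompt):
        prompts_sent.append(prompt)
        return f"summary draft {len(prompts_sent)}"

    return fake_llm, prompts_sent


fake_llm, prompts_sent = make_fake_llm()
drafts = refine_summary("Some long article text.", fake_llm, rounds=2)
print(drafts)  # ['summary draft 1', 'summary draft 2', 'summary draft 3']
```

Each revision prompt includes the previous draft, which is what lets the process run without further input from the user.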

Are there more applications than these?

Of course! These are just a few of the high-level strengths we’ve been able to put to work in our products. As we build more and more tools that attempt to utilize LLMs to their fullest, we’ll change and add to these pros and cons to give you a better sense of what they can do. If you don’t see your use case listed here, let us know and we can give you a better sense of how we’d approach it with LLMs.

What Large Language Models Struggle With

Completing complex tasks in a single call

The enormous computing power behind LLMs is ultimately aimed at making them much more capable of understanding and responding to a user. An LLM’s “problem solving” capabilities are more of a happy coincidence than an intended feature: in its efforts to respond to our input in as human a way as possible, it can give us the information we need. At the same time, no part of the LLM double-checks the answer it gives us. On its own, it doesn’t use a calculator when we give it a math question, and it doesn’t fact-check its answers to our questions.

  • Thus, if we’re not careful with how we ask for information from an LLM, it can often give us the wrong answer.


Source: https://chat.openai.com/share/c9bcda34-81fc-488b-a922-4c5e26388dd7
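One way to be careful is to have ordinary code do the double-checking. The sketch below, under our own assumptions, recomputes a simple arithmetic expression and compares it with the number an LLM claimed. The function names are hypothetical, the "claimed" answer is made up, and the evaluator only handles +, -, *, and / on plain numbers.

```python
import ast
import operator

OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculate(expression):
    """Evaluate +, -, *, / arithmetic on plain numbers without using eval()."""

    def evaluate(node):
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            return OPERATORS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression, mode="eval").body)


def check_answer(expression, llm_answer):
    """Compare the model's claimed answer with the computed one."""
    return calculate(expression) == llm_answer


# A made-up, slightly wrong answer of the kind a model might produce.
claimed_by_llm = 7006562
print(calculate("1234 * 5678"))                     # 7006652
print(check_answer("1234 * 5678", claimed_by_llm))  # False
```

When the check fails, the program can use the computed value instead or ask the LLM again.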

Performing creative tasks without help or examples

Another area where LLMs fall short is creativity: since LLMs are designed to replicate what humans have written in the past, they can struggle significantly when asked to do something that humans have never done before.

However, LLMs can perform well at creative tasks when given some initial inspiration and additional context in the form of examples, since that requires them to be less “creative”. Instead, they can use a previous example and the topic provided as a basis and fill in the gaps between the two, which produces a much more satisfactory result.
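Supplying examples usually just means putting them into the prompt. Here is a minimal sketch of a prompt builder that does this; the slogan task, the example pairs, and the function name are all invented for illustration.

```python
def build_few_shot_prompt(task, examples, topic):
    """Put worked examples ahead of the new topic so the model can imitate them."""
    lines = [task, ""]
    for example_topic, example_output in examples:
        lines.append(f"Topic: {example_topic}")
        lines.append(f"Slogan: {example_output}")
        lines.append("")
    lines.append(f"Topic: {topic}")
    lines.append("Slogan:")
    return "\n".join(lines)


# Made-up examples for a made-up slogan-writing task.
examples = [
    ("coffee", "Wake up to something better."),
    ("running shoes", "Every mile, a little lighter."),
]
prompt = build_few_shot_prompt("Write a short slogan for each topic.", examples, "bicycles")
print(prompt)
```

The prompt ends with an unfinished "Slogan:" line, so the most natural continuation for the model is a slogan in the same style as the examples.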

Speed

For each individual call to an LLM, getting a response takes at least a second as of this writing. More complex calls can take even longer, sometimes up to a full ten seconds, depending on whether we need to give the LLM special tools to do its job.

  • While we do all we can to offload any LLM calls to code that’s not actively part of interacting with the user, many LLM calls must be done before our programs respond to the user in order to provide a satisfactory response.

  • However, just as LLMs will only get more capable over time, they will also get faster, which makes technology built with them faster over time with no additional dev time.
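When several LLM calls do not depend on each other, one common way to reduce the wait is to start them at the same time instead of one after another. The sketch below assumes this situation and uses a simulated delay in place of a real model; `fake_llm_call` and `answer_all` are hypothetical names.

```python
import asyncio
import time


async def fake_llm_call(prompt, seconds=0.1):
    """Stand-in for a slow LLM request; the delay is simulated."""
    await asyncio.sleep(seconds)
    return f"response to: {prompt}"


async def answer_all(prompts):
    """Start every call at once and wait for all of them together."""
    return await asyncio.gather(*(fake_llm_call(prompt) for prompt in prompts))


start = time.perf_counter()
responses = asyncio.run(answer_all(["summarize", "translate", "classify"]))
elapsed = time.perf_counter() - start

print(responses)
print(f"finished in about {elapsed:.2f} seconds")
```

Run one after another, the three simulated calls would take about three times the delay of a single call; started together, the total is close to the time of the slowest one. This does not help when one call needs the result of another.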

Wrap-Up

In summary, large language models are extremely powerful tools for us as programmers. By leaning on their strengths and accounting for their weaknesses, we can build incredibly potent tools that work intuitively for both end users and other programmers who want to review or modify our products in the future. If you have any more questions about LLMs or how we use them to build products for you, please don’t be afraid to contact us!
