
Aman Pandey
Wed Jul 01 2026
4 min read
LLM stands for Large Language Model. It is called a large language model because it is trained on a very large amount of text data.
LLMs make it easier for computers to process and generate human language.
Some of the popular ones are Llama, ChatGPT, and Gemini.
A common application is answering questions. Instead of searching with a search engine and going through the results manually, we can put the question to the LLM. It does not look the answer up in its training data. It generates a response from the patterns it learned during training, so the answer is only likely to be reliable if the training data covered the topic you are asking about.
Consider what happens when we send a message to an LLM, using ChatGPT as an example.
Our prompt first goes through guardrails and some NLP preprocessing. Next, it gets tokenized (divided into tokens) and sent to the LLM, which predicts the next token.
The predicted token is appended to the sequence, and the extended sequence goes through the transformer again. This repeats until the LLM produces a special end token, which signals that the answer is complete. Special tokens like this mark the start and end of a message.
The response is streamed back token by token as it is generated, which is why we see it appear word by word. This is not the only thing that happens, but it is the high-level overview.
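The predict-append-repeat flow described above can be sketched in a few lines of Python. This is only a toy illustration: `TOY_NEXT_TOKEN`, `predict_next`, `stream_reply`, and the `<bos>`/`<eos>` marker names are made up for this example, and the lookup table stands in for a real model, which would compute probabilities over its whole vocabulary.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-in for a trained model: maps the last token to the next one.
# A real LLM scores every token in its vocabulary instead of using a lookup table.
TOY_NEXT_TOKEN: Dict[str, str] = {
    "<bos>": "LLMs",
    "LLMs": "predict",
    "predict": "tokens",
    "tokens": "<eos>",
}


def predict_next(tokens: List[str]) -> str:
    """Return the next token for the sequence so far (toy lookup on the last token)."""
    return TOY_NEXT_TOKEN.get(tokens[-1], "<eos>")


def stream_reply(prompt_tokens: List[str], max_new_tokens: int = 20) -> Iterator[str]:
    """Yield generated tokens one at a time until the end token appears."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_token = predict_next(tokens)
        if next_token == "<eos>":  # the end-of-message signal
            break
        tokens.append(next_token)  # feed the longer sequence back in
        yield next_token           # hand the token to the caller immediately


if __name__ == "__main__":
    for token in stream_reply(["<bos>"]):
        print(token, end=" ", flush=True)
    print()
```

Because `stream_reply` is a generator, the caller receives each token as soon as it is produced, which mirrors the word-by-word display in a chat window.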
Computers do not have any consciousness or built-in understanding of language. At the hardware level they work only with binary (0 and 1), which semiconductor devices represent as voltage levels, so text has to be turned into numbers before a model can work with it.
Tokens are the basic units of text that AI models are trained on and that they read and produce. Each model can have its own tokenization system.
Each token is mapped to a numeric ID, and the model turns that ID into a vector of numbers (an embedding) that it can compute with. A token is different from a word: it is whatever building block makes sense to the tokenization system, so it does not have to be a complete word in any language.
For example, take the sentence "He is dishonest".
A tokenizer might split it into the tokens "he", "is", "dis", and "honest".
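The example above can be reproduced with a small greedy tokenizer. This is a minimal sketch under loud assumptions: the `VOCAB` table and its IDs are invented for illustration, and real tokenizers learn much larger vocabularies from data and may split the same sentence differently.

```python
from typing import Dict, List

# Hypothetical vocabulary for illustration only; real tokenizers learn theirs from data.
VOCAB: Dict[str, int] = {"he": 0, "is": 1, "dis": 2, "honest": 3, "<unk>": 4}


def tokenize_word(word: str) -> List[str]:
    """Split one word into the longest known pieces, scanning left to right."""
    pieces: List[str] = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            if word[start:end] in VOCAB:
                pieces.append(word[start:end])
                start = end
                break
        else:
            # No known piece starts here: emit an unknown marker, skip one character.
            pieces.append("<unk>")
            start += 1
    return pieces


def tokenize(text: str) -> List[str]:
    """Lowercase the text, split on whitespace, then split each word into pieces."""
    tokens: List[str] = []
    for word in text.lower().split():
        tokens.extend(tokenize_word(word))
    return tokens


def encode(text: str) -> List[int]:
    """Map each token to its numeric ID."""
    return [VOCAB[token] for token in tokenize(text)]


if __name__ == "__main__":
    print(tokenize("He is dishonest"))  # ['he', 'is', 'dis', 'honest']
    print(encode("He is dishonest"))    # [0, 1, 2, 3]
```

The IDs returned by `encode` are the numbers a model would actually receive; "dishonest" is not in the vocabulary as a whole word, so it is built from the two pieces "dis" and "honest".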
The Transformer is a neural network architecture introduced by Google researchers in 2017, initially for machine translation.
It has since become one of the most important building blocks in today's AI world, and it underlies most modern LLMs.
At a high level, it takes some input, predicts the next token, adds that token to the input, and repeats until the answer for the user is complete.
Transformers are very popular and widely used because they rely on a mechanism called self-attention.
Self-attention lets the model look at a whole sequence of text at once and weigh how much attention each token should pay to every other token.
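To make the idea of attention weights concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is deliberately simplified: the token vectors are made-up numbers, and the queries, keys, and values are the raw vectors themselves, whereas a real transformer first passes them through learned projection matrices and uses many attention heads.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def self_attention(embeddings: List[Vector]) -> Tuple[List[Vector], List[Vector]]:
    """Return (attention weights, outputs) for a sequence of token vectors.

    Simplification: queries, keys, and values are all the input vectors,
    with no learned projections.
    """
    dim = len(embeddings[0])
    weights: List[Vector] = []
    outputs: List[Vector] = []
    for query in embeddings:
        # Score this token against every token in the sequence, itself included.
        scores = [dot(query, key) / math.sqrt(dim) for key in embeddings]
        row = softmax(scores)
        weights.append(row)
        # The output is a weighted average of all token vectors.
        outputs.append(
            [sum(w * value[i] for w, value in zip(row, embeddings)) for i in range(dim)]
        )
    return weights, outputs


if __name__ == "__main__":
    # Three made-up 2-dimensional token vectors.
    toy_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    attention, _ = self_attention(toy_tokens)
    for row in attention:
        print([round(weight, 3) for weight in row])
```

Each printed row shows how strongly one token attends to every token in the sequence, and each row sums to 1.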