cual.ai/Glossary

AI & Technology Glossary 📖

Welcome to the only AI glossary designed for people who attend meetings where someone says "we need to fine-tune the LLM with RAG" and everyone nods as if they know what that means.

Here you'll find the most common AI terms explained in plain language, without assuming you have a PhD in math or spend your weekends reading arXiv papers.

We promise to be accurate. We don't promise to be entirely serious.

41 terms · AI, models, infrastructure and key companies

A
🤝

A2A — Agent-to-Agent Protocol

Google's protocol for AI agents from different vendors to communicate with each other in a standardized way.

A2A (Agent-to-Agent) is an open protocol driven by Google that lets AI agents from different companies discover, communicate, and collaborate regardless of who built them. If MCP is the 'USB' connecting an agent to tools, A2A is the 'common language' that lets two agents talk to each other. Together, MCP and A2A form the infrastructure for an interoperable multi-agent ecosystem.

Example: An agent at your company (built with Claude) needs data from your vendor's agent (built with Gemini). With A2A, they communicate directly without anyone writing a custom integration.

🎯

AEO — Answer Engine Optimization

Optimizing your content so AI engines cite it as a direct answer.

AEO is the evolution of SEO for the AI era. While SEO aims to rank on Google, AEO aims to get ChatGPT, Claude, Perplexity, and AI-powered search engines to cite you directly when someone asks a relevant question. It requires clear, structured content with precise data and topic authority.

Example: If someone asks Perplexity 'what's the best AI tools directory?' and the AI responds with cual.ai, that's AEO working.

🦾

AI Agent

An AI that doesn't just answer — it takes actions, uses tools, and completes tasks on its own.

Unlike a chatbot that only answers questions, an agent acts. It works in a loop: observe the environment → decide what to do → execute an action using a tool (search the web, run code, send emails, read files) → observe the result → repeat until the goal is complete. The LLM is its 'brain' and the tools are its 'hands'. Claude Code, Manus, and Devin are examples of ready-to-use agents. There are also multi-agents (several specialized agents working as a team) and sub-agents (an orchestrator agent that delegates subtasks to other agents and consolidates results).

Example: You tell an agent 'research the top 5 investment funds for 2026 and send me a summary by email'. The agent searches the web, analyzes data, writes the summary, and sends it — all on its own, without you guiding it step by step.

🟠

Anthropic

The company behind Claude — founded by ex-OpenAI employees with a focus on AI safety.

Anthropic was founded in 2021 by Dario Amodei, Daniela Amodei, and others who left OpenAI concerned about AI safety. They created Claude, considered one of the safest, most honest, and most capable models on the market. Their focus on 'constitutional AI' (teaching values to the model) sets them apart. Amazon has invested billions in Anthropic.

Example: Claude by Anthropic is known for being more careful than ChatGPT — saying 'I don't know' when it doesn't, instead of confidently making things up.

🔌

API — Application Programming Interface

A standard bridge that lets two programs communicate with each other.

API stands for 'Application Programming Interface'. It's basically a contract between two systems: 'if you send me this request in this format, I'll respond with this data'. You don't need to know how the other system works — just how to talk to it. Almost everything on the internet runs on APIs: the weather on your phone, card payments, maps in delivery apps. In AI, it's how developers connect their apps to models like GPT or Claude.

Example: When a weather app shows you the temperature, it doesn't have its own satellites — it asks a weather API. You use the app, the app uses the API.
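That 'contract' idea can be sketched in a few lines of Python. Everything here is made up for illustration — the function stands in for a real HTTP endpoint, and the city data and field names are invented:

```python
import json

# A toy 'weather API'. The contract: send JSON with a city name,
# get JSON back with a temperature. The data is fake.
def weather_api(request_json: str) -> str:
    request = json.loads(request_json)
    fake_data = {"Madrid": 31, "Oslo": 12}
    temp = fake_data.get(request["city"])
    return json.dumps({"city": request["city"], "temp_celsius": temp})

# The 'app' side: it doesn't know HOW the API works — only the contract.
response = json.loads(weather_api(json.dumps({"city": "Madrid"})))
print(response["temp_celsius"])  # 31
```

The app never sees satellites or weather stations — just a request format in, a response format out. That's the whole deal.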

B
📏

Benchmark

A standardized test to measure and compare the performance of different AI models.

Like university rankings, benchmarks are tests that let you compare models objectively (or at least try to). MMLU, HumanEval, SWE-bench, and GPQA are some of the most popular. Each company showcases their models with their best scores, which sometimes makes apples-to-apples comparisons tricky.

Example: When OpenAI says GPT-5 beats Claude on MMLU, they're using a benchmark. It's like comparing students with the same exam — useful, but it doesn't tell the whole story.

C
⌨️

CLI — Command Line Interface

A tool you control by typing text commands in a terminal, with no graphical interface.

CLI (Command Line Interface) is any program you use from a terminal by typing commands — no buttons, no windows. Most developer tools are CLIs: Git, Docker, npm, the AWS CLI, Vercel CLI, etc. They're faster to use than a visual interface once you know them, and they can be easily chained and automated in scripts. The opposite is a GUI (Graphical User Interface) — what most people use day to day.

Example: Instead of opening the GitHub app and clicking 'Commit', you type `git commit -m 'message'` in the terminal. That's using Git's CLI.
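A CLI can also be built in a few lines. Here's a minimal Python sketch of a hypothetical `greet` command with one flag (the command and flag names are invented):

```python
import argparse

# A minimal CLI: one positional argument and one optional flag.
parser = argparse.ArgumentParser(prog="greet")
parser.add_argument("name")                          # positional argument
parser.add_argument("--shout", action="store_true")  # optional flag

# Simulate typing: greet world --shout
args = parser.parse_args(["world", "--shout"])
message = f"hello, {args.name}"
print(message.upper() if args.shout else message)  # HELLO, WORLD
```

Because everything is plain text in, plain text out, commands like this chain together easily in scripts — which is exactly why developers love them.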

E
🗺️

Embedding

A mathematical representation of a text's meaning that enables semantic comparison.

An embedding converts text into a vector of numbers (a list of coordinates) that captures semantic meaning. Two texts that are similar in meaning will have close vectors, even if they use different words. It's the foundation of semantic search, recommendations, and RAG.

Example: 'Car' and 'automobile' have very close embeddings even though they're different words. 'Car' and 'pizza' have very distant embeddings. That's how semantic search works.
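A toy sketch of that comparison in Python. The three-number 'embeddings' below are invented for illustration — real embeddings have hundreds or thousands of dimensions:

```python
import math

# Invented toy embeddings (real ones come from a trained model).
vectors = {
    "car":        [0.9, 0.8, 0.1],
    "automobile": [0.8, 0.9, 0.2],
    "pizza":      [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    # 1.0 = identical direction (same meaning), near 0 = unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

print(cosine_similarity(vectors["car"], vectors["automobile"]))  # close to 1
print(cosine_similarity(vectors["car"], vectors["pizza"]))       # much lower
```

Semantic search is essentially this comparison, run against millions of stored vectors at once.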

F
🎛️

Fine-tuning

Training an existing model with your own data to specialize it in your domain.

Fine-tuning is like hiring a generalist and paying for an intensive course in your industry. You take a base model and train it with specific examples from your use case (company emails, legal documents, medical transcriptions) so it performs better in that context.

Example: A clinic fine-tuning an LLM with thousands of anonymized medical records so the model understands the specific medical terminology of their specialty.
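Fine-tuning data usually looks like a plain file of input/output pairs. A sketch of the common JSONL shape (one JSON object per line) — the field names vary by provider, and the medical lines below are invented:

```python
import json

# Hypothetical training examples: raw clinical note in, plain-language out.
examples = [
    {"input": "Patient reports dyspnea on exertion.",
     "output": "Shortness of breath when physically active."},
    {"input": "Hx of HTN, on ACE inhibitors.",
     "output": "History of high blood pressure, treated with ACE inhibitors."},
]

# JSONL = one training example per line.
jsonl = "\n".join(json.dumps(e) for e in examples)
print(jsonl.splitlines()[0])
```

Thousands of pairs like these are what teach the base model your domain's vocabulary and style.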

G

GEO — Generative Engine Optimization

Strategies for getting generative AI engines to mention your content or brand.

GEO and AEO are used almost interchangeably. The idea is the same: adapt your content strategy to be relevant in a world where people ask AIs instead of searching Google. This includes using natural language, answering direct questions, and structuring content well with verifiable data.

Example: A company writing detailed articles answering specific questions in their industry is doing GEO, even if they don't call it that.

🎮

GPU — Graphics Processing Unit

The chip designed for graphics that turned out to be perfect for training AI models.

The GPU was created for video games — processing millions of pixels in parallel. It turns out that training AI models requires exactly the same kind of massive parallel math. That's why NVIDIA dominates the AI market: their GPUs (H100, A100) became the 'gold' of the AI era. Without enough GPUs, you can't train large models.

Example: Training GPT-4 required thousands of NVIDIA GPUs running for weeks. Running a small model like Llama 7B locally needs a GPU with ~8GB VRAM.

H
🌀

Hallucination

When AI makes up false information with total confidence, as if it were real.

Hallucinations are the Achilles' heel of LLMs. The model doesn't 'know' when it doesn't know something — it simply generates the most probable text, which sometimes is wrong but sounds convincing. Made-up citations, nonexistent statistics, wrong dates... all presented with the confidence of an expert.

Example: You ask an LLM about a scientific study and it gives you a title, authors, journal, and year — all fabricated. The reference doesn't exist. But it sounded great.

I

Inference

The process of using an already-trained model to generate a response.

Training a model can take months and cost millions. Inference is the moment that model answers your question — which takes seconds. When you pay for tokens via an API, you're paying for inference, not training. Companies invest heavily in making inference faster and cheaper.

Example: Every time you send a message to ChatGPT and it replies, that's an inference. Billions happen every day globally.

A
🤖

Artificial Intelligence (AI)

Software that mimics human abilities like understanding text, recognizing images, or making decisions.

AI is not a robot with feelings, and it's not going to take over the world (yet). It's software trained on enormous amounts of data to do tasks that only humans could do before: read, write, analyze, translate, generate images. There's narrow AI (does one thing well) and general AI (doesn't truly exist yet).

Example: ChatGPT writing an email for you is AI. Your phone's autocorrect is also AI — it just doesn't get invited to conferences.

L
🐧

Linux

The free, open-source operating system that runs most of the world's servers.

Linux is an operating system (like Windows or macOS) created in 1991 by Linus Torvalds and maintained by thousands of volunteers. It's free, open source, and brutally efficient. It's not very popular on personal desktops, but it dominates servers, smartphones (Android runs on the Linux kernel), and supercomputers. If you use the internet, you depend on Linux whether you know it or not.

Example: Your bank, Netflix, Google, Amazon — they all run on Linux servers. The robot in your factory probably does too.

🧠

LLM — Large Language Model

An AI model trained on massive text to understand and generate human language.

LLM stands for 'Large Language Model'. It's the engine behind ChatGPT, Claude, Gemini, and the like. It's trained by reading a huge fraction of the internet, books, and texts, learning to predict what word comes next. Simple in theory, overwhelming in practice.

Example: When you ask Claude 'what should I do if my boss calls me on Sundays?' and it gives you a coherent answer, that's an LLM at work.

R
🔄

Reasoning Loop / ReAct

The pattern where an AI agent alternates between thinking and acting until it solves the task — its way of 'thinking out loud'.

ReAct (Reason + Act) is the fundamental pattern of AI agents: the model reasons about what to do (Reason), executes an action like searching the web or reading a file (Act), observes the result, and repeats the cycle until the goal is complete. It's literally how an agent 'thinks'. Popular variants include Chain-of-Thought (reasoning step by step in a straight line), Tree-of-Thought (exploring multiple reasoning paths like branches of a tree), and Reflexion (the agent evaluates its own mistakes to improve).

Example: You ask the agent 'find cheap flights to Madrid for June'. The agent thinks: 'I need to search for flights' → searches the web → thinks: 'these prices are high, let me try flexible dates' → searches again → presents you with options. Each think→act cycle is a ReAct loop.
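That flight example can be sketched as a toy loop. Here the 'reasoning' is hardcoded where a real agent would call an LLM, and the only tool is a fake search with invented prices:

```python
# A toy ReAct loop: Reason -> Act -> Observe -> repeat.
def search_flights(dates):
    prices = {"fixed": 320, "flexible": 180}  # made-up data
    return prices[dates]

goal_met, dates, history = False, "fixed", []
while not goal_met:
    price = search_flights(dates)   # Act: use a tool
    history.append((dates, price))  # Observe: record the result
    # Reason: in a real agent, an LLM decides this next step.
    if price > 200 and dates == "fixed":
        dates = "flexible"          # too expensive — retry with flexible dates
    else:
        goal_met = True

print(history)  # [('fixed', 320), ('flexible', 180)]
```

Swap the hardcoded `if` for a model call and the fake search for real tools, and you have the skeleton of an actual agent.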

M
🔗

MCP — Model Context Protocol

An open standard created by Anthropic for connecting AI models to external tools and data sources.

MCP (Model Context Protocol) is like a universal USB for AI — a standard protocol that lets any AI model connect to any tool or data source in a uniform way. Before MCP, every integration was custom work. With MCP, if you build an 'MCP server' for your database or CRM, any compatible model can use it automatically. It was created by Anthropic and quickly adopted by OpenAI, Google, and most of the ecosystem.

Example: With an MCP server for Google Drive, you can tell Claude 'review my Drive documents and summarize last month's contracts' — and it does, with no special configuration.

📦

Model

The specific version of an AI — like GPT-4o, Claude 3.5 Sonnet, or Gemini 2.5 Pro.

A model is the result of training an AI with specific data and parameters. Two models from the same company can be very different in intelligence, speed, and price. Choosing the right model for your task is the art of not spending more than you need to.

Example: GPT-4o is a model from OpenAI. Claude Sonnet is a model from Anthropic. They're like different versions of software, but each one has its own personality.

👥

Multi-agent

A system where multiple AI agents work as a team — each specialized in their area, coordinated to solve complex tasks.

A multi-agent system is like a company where each employee has a specific role, but instead of people they're AI agents. One can research, another write, another review quality, another publish — working in parallel or sequence. It's more powerful than a single agent because each one specializes and the final result is better than what any one could achieve alone. An orchestrator agent usually coordinates the team, delegating subtasks and consolidating results.

Example: To create a market report: one agent searches for data online, another analyzes competitors, another writes the report, and another generates the charts. Four agents, one hour. A human alone: two days.

👁️

Multimodal

A model that processes multiple types of information: text, images, audio, or video.

The first LLMs only understood text. Multimodal models can also see images, listen to audio, and even analyze video. GPT-4o, Gemini 2.5 Pro, and Claude 3.5 Sonnet are multimodal. This makes them far more versatile for real-world tasks.

Example: You send a photo of your grocery receipt and say 'tell me how much I spent on drinks'. A multimodal model reads it and gives you the number.

O
🔓

Open Source

Models whose code and/or weights are publicly available for anyone to use.

In AI, open source means the model's weights (what it learned during training) are public and downloadable. You can run them on your own server, modify them, build on top of them. Meta's Llama and Mistral are the best-known examples. The counterpart is proprietary models like GPT or Claude, which can only be accessed via API.

Example: Meta's Llama 4 is open source: you can download and run it on your computer (if you have a decent GPU). GPT-5 is not: you can only use it by paying OpenAI.

🟢

OpenAI

The company behind ChatGPT and GPT — the one that put generative AI on the map for the general public.

OpenAI was founded in 2015 as a nonprofit by Sam Altman, Elon Musk, and others. It later became a capped-profit company. It created the GPT models, the ChatGPT assistant, and DALL-E for images. In November 2022, it launched ChatGPT and forever changed the public perception of AI. Today it's the most influential company in the AI ecosystem, with Microsoft as its main investor.

Example: When someone says 'I asked the AI' without specifying which one, they probably used OpenAI's ChatGPT.

🦞

OpenClaw

A personal AI assistant that lives on your server and connects to WhatsApp, Telegram, and more.

OpenClaw is the system that powers this assistant. It's a personal AI agent installed on your own server that connects to your messaging channels (WhatsApp, Telegram, Discord), has persistent memory, can execute automated tasks, manage your email, monitor systems, and much more. Think of it as an assistant that never sleeps and never forgets — unless you configure it to.

Example: If you send it a message at 3am saying 'remind me to review that contract on Monday', it does. That's OpenClaw.

🎼

Orchestrator

The main agent that coordinates other agents — like a conductor deciding who plays what and when.

In a multi-agent system, the orchestrator is the agent with the complete view of the task. It receives the goal, breaks it down into subtasks, decides which sub-agent handles each one, sends them instructions, monitors their progress, and consolidates the final results. It doesn't do the heavy lifting directly — its value lies in intelligent coordination. It's the key piece of the multi-agent pattern.

Example: You tell the orchestrator: 'prepare the product launch'. It delegates: the content agent writes the blog post, the design agent creates the images, the email agent prepares the campaign, and the social media agent schedules the posts. In the end, it consolidates everything into a timeline.
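A toy Python sketch of the pattern — each 'sub-agent' below is just a function standing in for a real LLM-backed agent with its own tools:

```python
# Sub-agents: each one does a specific job.
def research_agent(topic):
    return f"3 key facts about {topic}"

def writing_agent(notes):
    return f"Report based on: {notes}"

# The orchestrator: breaks the goal down, delegates, consolidates.
def orchestrator(goal):
    notes = research_agent(goal)   # delegate subtask 1
    report = writing_agent(notes)  # delegate subtask 2
    return report                  # consolidate the final result

print(orchestrator("the 2026 GPU market"))
```

Notice the orchestrator does none of the actual work — its whole value is deciding who does what, in what order, and with what inputs.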

P
🔍

Perplexity

An AI-powered search engine that answers questions with cited sources — the alternative to Google.

Perplexity AI is a search engine that, instead of giving you a list of links, gives you a direct AI-generated answer with its sources cited alongside. It combines real-time web search with language models. It's especially useful for questions that require synthesizing information from multiple sources. Many use it as a Google replacement for quick research.

Example: On Google you search 'best AI tools for marketing 2026' and get 10 links to review. On Perplexity you get an organized list with explanations and sources, ready to read.

💬

Prompt

The text you write to an AI to ask for something — your question, instruction, or context.

The prompt is simply what you tell the model. It can be a short question ('what is a token?') or a long block of text with detailed instructions, context, examples, and constraints. The quality of the prompt largely determines the quality of the response — hence the profession of 'prompt engineer' was born.

Example: Telling the AI 'summarize this' is a prompt. Telling it 'you are a finance expert, summarize this in 3 points for a non-technical CEO, no jargon' is a good prompt.
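In code, a prompt is often just a reusable text template with a slot for the variable part. A minimal sketch (the instructions and sample text are invented):

```python
# A reusable prompt template: fixed instructions + a {text} placeholder.
template = (
    "You are a finance expert.\n"
    "Summarize the text below in 3 bullet points for a non-technical CEO.\n"
    "No jargon.\n\n"
    "Text: {text}"
)

prompt = template.format(text="Q3 revenue grew 12%...")
print(prompt)
```

Same instructions every time, different text each time — that's most of 'prompt engineering' in practice.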

R
📚

RAG — Retrieval Augmented Generation

A technique for giving AI access to external documents before it responds.

RAG (Retrieval Augmented Generation) is the solution to one of the biggest problems with LLMs: they only know what they learned during training. With RAG, before responding, the model searches a custom document base (your manual, your knowledge base, your policies) and uses that information to give precise, up-to-date answers.

Example: A support chatbot that answers questions from your product manual using RAG. Without RAG, it would make up answers. With RAG, it cites the manual.
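A toy sketch of the pipeline. Retrieval here is naive word overlap, where real systems use embeddings and a vector database — but the shape is the same: find the relevant document, paste it into the prompt, then ask the model:

```python
# A tiny, invented knowledge base.
documents = [
    "To reset your password, go to Settings > Security.",
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
]

def retrieve(question, docs):
    # Pick the document sharing the most words with the question.
    def overlap(doc):
        return len(set(question.lower().split()) & set(doc.lower().split()))
    return max(docs, key=overlap)

question = "How do I reset my password?"
context = retrieve(question, documents)

# The retrieved text goes into the prompt BEFORE the model answers.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(context)
```

Because the model answers from the retrieved text rather than from memory, it cites your manual instead of improvising.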

🧩

RAM — Random Access Memory

Your computer's working memory — what's active right now.

RAM is where the computer stores everything it's currently using: open apps, loaded files, program states. It's different from the hard drive (permanent storage): RAM is fast but temporary — when you shut down, it's wiped. In AI, RAM (and especially GPU VRAM) determines how large the models you can run locally are.

Example: If you have tons of tabs open and your computer slows down, it's because RAM is full. Closing tabs frees RAM. The hard drive has nothing to do with it.

S
🗄️

Server

A computer that responds to requests from others — like the kitchen of a restaurant.

A server is simply a computer configured to receive and respond to requests from other devices. It can be physical (a machine in a datacenter) or virtual (a portion of a larger machine). Web servers serve pages, database servers store information, mail servers send emails. When you open cual.ai, a server somewhere in the world responds to your request in milliseconds.

Example: When you type cual.ai in your browser, your phone sends a request to a Vercel server in some datacenter, which responds with the page's HTML.

⚙️

System Prompt

Hidden instructions that configure how the AI behaves before you start talking to it.

When you use ChatGPT, Claude, or any AI app, there are instructions you can't see that tell the model how to behave: its personality, what it can and can't do, what language to respond in, etc. Companies use this to create specialized versions of a generic model.

Example: When a customer support chatbot 'only knows how to talk about the company's products', that's a system prompt telling it not to discuss anything else.
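Under the hood, most chat APIs receive a list of messages where the first one is the hidden system prompt. A sketch using the common role/content convention (the bot and its rules are invented):

```python
# What the model actually receives: the whole list, system prompt included.
messages = [
    {"role": "system",  # the user never sees this one
     "content": "You are a support bot for Acme. Only discuss Acme products."},
    {"role": "user",
     "content": "What's the meaning of life?"},
]

# The model reads both — which is why it politely declines off-topic questions.
print(messages[0]["role"])  # system
```

Swap the system message and the exact same model becomes a legal assistant, a pirate, or a tutor. That's how companies specialize generic models.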

🧩

Skill (Agent Capability)

A module that adds specific capabilities to an AI agent — like a specialized plugin.

In the context of AI agents like OpenClaw, a skill is a packaged set of instructions and tools that teaches the agent how to perform a specific task: check the weather, manage emails, search GitHub, transcribe audio, etc. It's like installing a new ability in the agent. Skills can be shared, updated, and combined — turning a generic agent into a specialist for your workflow.

Example: OpenClaw has a Gmail skill that teaches it how to read, search, and send emails. Without that skill, the agent wouldn't know how to connect to your inbox.

🐝

Sub-agent

A specialized agent that receives instructions from an orchestrator and executes a specific subtask — the worker of the team.

In a multi-agent system, sub-agents are the ones doing the fieldwork. Each one specializes in something: searching for information, writing text, analyzing data, generating code, etc. They receive instructions from the orchestrator, execute their task, and return the result. They operate under supervision — they don't decide what to do, only how to do it well. They're like the musicians in an orchestra: each one masters their instrument.

Example: The orchestrator says: 'I need a summary of this 200-page PDF'. The reading sub-agent processes it, extracts the key points, and returns the summary to the orchestrator — which combines it with the work of other sub-agents.

T
🌡️

Temperature

A parameter that controls how creative or predictable a model's response is.

Temperature ranges from 0 to 2 (depending on the model). Temperature 0 = very deterministic, consistent responses — always the same answer to the same question. High temperature = more variety, creativity, and also more risk of saying weird things. For technical or precision tasks, low temperature. For creativity, high temperature.

Example: For the AI to generate code or answer dates, temperature 0. For it to write you a surprising poem, temperature 0.9.
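A sketch of what temperature actually does to the model's next-word probabilities, using the standard softmax formula and invented scores:

```python
import math

# Toy raw scores ('logits') for three candidate next words.
logits = {"the": 2.0, "a": 1.0, "banana": 0.1}

def softmax(scores, temperature):
    # Dividing by temperature before exponentiating sharpens (low T)
    # or flattens (high T) the resulting probability distribution.
    exps = {w: math.exp(s / temperature) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

print(softmax(logits, 0.1))  # nearly all probability on 'the' — deterministic
print(softmax(logits, 1.5))  # spread out — 'banana' gets a real chance
```

Low temperature squeezes almost everything onto the most likely word; high temperature gives the long shots a chance. Hence: precision tasks low, poetry high.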

🖥️

Terminal / Command Line

A text interface for controlling your computer by typing commands, with no buttons or menus.

The terminal (also called console, shell, or command line) is the way to talk directly to the operating system in plain text. It looks intimidating, but it's simply a different language. Developers use it because it's faster and more powerful than clicking through menus. On Linux and Mac it's called Terminal; on Windows, PowerShell or CMD.

Example: Instead of opening the file explorer, clicking a folder, and searching for a file, in the terminal you type: `ls -la /folder`. Same result, 3 seconds.

🪙

Token

The smallest unit an LLM processes — roughly ¾ of a word in English.

Models don't read whole words but fragments called tokens. A short word can be 1 token, a long one can be 2 or 3. Spaces and punctuation count too. That's why API prices are measured in 'per million tokens'. This definition is roughly 50 tokens.

Example: The word 'automatically' might be 2-3 tokens. 'AI' is 1. So writing in English is slightly cheaper in tokens than Spanish (lucky you).
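A back-of-the-envelope sketch of the cost math. Real tokenizers give exact counts — the ≈4 characters per token rule and the price below are just illustrative:

```python
# Rough rule of thumb for English: 1 token is about 4 characters.
def estimate_tokens(text: str) -> int:
    return max(1, round(len(text) / 4))

price_per_million = 3.00  # hypothetical $ per 1M input tokens
text = "Summarize this contract..." * 1000
tokens = estimate_tokens(text)
print(tokens, "tokens, roughly $", tokens / 1_000_000 * price_per_million)
```

Useful for quick sanity checks like 'can this document even fit in the context window, and what will it cost me?'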

🔧

Tool (AI)

A function an AI agent can call to interact with the real world — its 'hands' for getting things done.

Without tools, an AI model can only generate text. With tools, it can act: search Google, execute code, read PDFs, send emails, query databases, make API calls, create files, post on social media — basically anything that can be done through software. Tools are what turn a chatbot into an agent. The MCP protocol standardizes how these tools connect to models, and each tool defines what parameters it needs and what it returns.

Example: You tell Claude 'look up the weather in Madrid'. Without tools, it would say 'I can't access the internet'. With a web search tool, it actually looks up the weather and gives you the current data.
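A sketch of the general shape: a tool is declared with a name, a description, and a parameter schema, and the agent executes the model's tool call. The names here are hypothetical, not any specific vendor's API:

```python
# What gets shown to the model: the tool's 'instruction manual'.
tool_spec = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {"city": {"type": "string"}},
}

# The actual implementation — a stub here; a real tool would call an API.
def get_weather(city):
    return {"city": city, "temp_celsius": 31}

# When the model decides to use the tool, it emits a call like this,
# the agent runs it, and feeds the result back into the conversation.
tool_call = {"name": "get_weather", "arguments": {"city": "Madrid"}}
result = get_weather(**tool_call["arguments"])
print(result)
```

The model never runs the code itself — it only asks. The agent loop does the running and reports back.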

C
🪟

Context Window

The amount of text a model can 'remember' within a single conversation.

The context window is the model's working memory — everything it can read and consider at any given time. If your conversation exceeds that limit, the model literally forgets what you said at the beginning. Modern models have windows of 128K, 200K, or even 1 million tokens, equivalent to several complete books.

Example: If you paste an entire novel into Claude with a 200K token window, it can read the whole thing. If you paste two long novels into a model with 32K, it'll start 'forgetting' the beginning.
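A sketch of what chat apps commonly do when the history outgrows the window: drop the oldest messages until everything fits. The messages and token counts below are invented:

```python
# (role, text, token_count) — counts are toy numbers.
history = [
    ("user", "Hi, my name is Ana", 6),
    ("assistant", "Nice to meet you, Ana!", 7),
    ("user", "...two hours of conversation later...", 4000),
    ("user", "What's my name?", 5),
]

def trim_to_fit(messages, max_tokens):
    # Walk from newest to oldest, keeping messages while they fit.
    kept, used = [], 0
    for role, text, tokens in reversed(messages):
        if used + tokens > max_tokens:
            break
        kept.append((role, text, tokens))
        used += tokens
    return list(reversed(kept))

trimmed = trim_to_fit(history, max_tokens=4010)
print([m[1] for m in trimmed])  # the greeting didn't fit — the model 'forgot' Ana
```

This is why a long chat suddenly can't remember your name: the greeting was silently trimmed to make room for newer messages.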

V
🔒

VPN — Virtual Private Network

An encrypted tunnel that protects your internet connection and can change your virtual location.

A VPN (Virtual Private Network) encrypts all your internet traffic and routes it through a server elsewhere. This does two things: protects your data on public WiFi networks (coffee shops, airports), and hides your real location (you can 'appear' in another country). People use them for privacy, corporate security, and yes — to watch Netflix from other countries.

Example: At an airport connected to public WiFi, without a VPN anyone on the network could intercept your traffic. With a VPN, they see useless encrypted noise.

☁️

VPS — Virtual Private Server

A cloud server you can rent by the hour or month — your own space in a datacenter.

A VPS (Virtual Private Server) is a virtual fraction of a large physical server. You rent a portion of a company's machines (like DigitalOcean, Hetzner, or AWS) and get full control: install whatever you want, configure as you need, access 24/7 via terminal. Ideal for running apps, bots, databases, or websites without paying for your own hardware.

Example: This AI assistant runs on a VPS. Instead of having a physical server at home, we rent space in a datacenter somewhere in the world for ~$5-20/month.

🎵

Vibe Coding

Coding by describing what you want in plain language — and letting AI write the code.

Vibe coding is a way of building software where you describe your idea in natural language and an AI agent writes, tests, and fixes the code for you. You don't need to know how to code — you just need to know what you want to build. The term was coined by Andrej Karpathy (ex-Tesla, ex-OpenAI) in 2025 and quickly became mainstream. Tools like Cursor, Claude Code, Bolt, Lovable, and Replit Agent are examples of vibe coding platforms. The key is not the code itself but the intent: the developer describes the 'vibe' (the essence of what they want) and the AI executes it. This doesn't replace good engineers — it amplifies them.

Example: Instead of spending hours learning React, you tell Claude Code: 'create a web app where users can sign up, upload product photos, and browse a catalog'. The AI writes the code, sets up the database, and deploys the app.

Missing a term?

The AI world invents new words every week. If there's something you don't understand, it's probably not your fault.

Explore AI tools →

Get the latest AI news for non-technical people

A weekly email with the best AI tools and news, explained simply. No spam.

cual.ai — Find the perfect AI tool · nivito.io

This page is created and managed by an AI agent — Help us improve it with your feedback