AI Development Glossary
Comprehensive dictionary of AI, machine learning, and LLM terms. Clear definitions, examples, and links to detailed guides.
A
AI Agent
An autonomous software system that uses artificial intelligence to perform tasks, make decisions, and interact with environments or users without continuous human intervention. AI agents can be simple (single-task chatbots) or complex (multi-agent systems working together).
API (Application Programming Interface)
A set of protocols and tools that allows different software applications to communicate. In an AI context, this often refers to cloud AI APIs such as the OpenAI API, Azure AI, and AWS Bedrock, which provide access to AI models via HTTP requests.
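For illustration, a minimal sketch of calling an OpenAI-style chat completions endpoint over HTTP with Python's requests library; the model name and the OPENAI_API_KEY environment variable are assumptions:

```python
import os
import requests

# Minimal HTTP call to an OpenAI-style chat completions endpoint.
# Assumes an API key is available in the OPENAI_API_KEY environment variable.
response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-4",
        "messages": [{"role": "user", "content": "Explain what an API is in one sentence."}],
    },
    timeout=30,
)
print(response.json()["choices"][0]["message"]["content"])
```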
Attention Mechanism
A neural network component that allows models to focus on specific parts of the input when generating output. Core technology behind Transformers and modern LLMs like GPT-4. Enables models to weigh the importance of different input tokens.
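A minimal NumPy sketch of scaled dot-product attention, the core computation behind the mechanism:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Weight the values V by how similar each query in Q is to each key in K."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over keys
    return weights @ V                                 # weighted sum of values

# Toy example: 3 tokens with 4-dimensional representations
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)     # (3, 4)
```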
AutoML (Automated Machine Learning)
Automated process of applying machine learning to real-world problems. AutoML tools automatically select algorithms, tune hyperparameters, and engineer features, making ML accessible to non-experts.
B
BERT (Bidirectional Encoder Representations from Transformers)
Google's pre-trained NLP model that understands context from both left and right sides of a word. Revolutionized natural language understanding tasks like question answering, sentiment analysis, and text classification. BioBERT and ClinicalBERT are domain-specific variants.
Bias (in AI)
Systematic errors in AI model predictions that favor certain outcomes over others, often reflecting biases in training data. Can lead to unfair treatment of demographic groups. Requires bias testing and mitigation strategies.
BM25
A ranking function used for information retrieval that scores documents based on query term frequency and inverse document frequency. Often combined with vector search in hybrid search systems for better RAG retrieval.
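A minimal sketch using the rank_bm25 package (assumed installed; the corpus is a toy example):

```python
from rank_bm25 import BM25Okapi   # pip install rank-bm25

corpus = [
    "vector databases store embeddings",
    "BM25 ranks documents by keyword relevance",
    "hybrid search combines BM25 with vector search",
]
tokenized_corpus = [doc.lower().split() for doc in corpus]

bm25 = BM25Okapi(tokenized_corpus)
query = "keyword search with BM25".lower().split()
print(bm25.get_scores(query))   # one relevance score per document
```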
C
ChatGPT
OpenAI's conversational AI system based on GPT-3.5 and GPT-4 models. Trained using reinforcement learning from human feedback (RLHF) to follow instructions and provide helpful, harmless, and honest responses.
Claude
Anthropic's family of large language models (Claude 1, 2, 3) designed for safety and helpfulness. Known for long context windows (100K-200K tokens) and strong reasoning capabilities. Competes with GPT-4.
Computer Vision
AI field focused on enabling computers to interpret and understand visual information from images and videos. Applications include object detection, image classification, facial recognition, and medical imaging analysis.
CNN (Convolutional Neural Network)
Deep learning architecture designed for processing grid-like data such as images. Uses convolutional layers to automatically learn spatial hierarchies of features. Backbone of most computer vision applications.
Context Window
Maximum number of tokens (words/subwords) an LLM can process at once. GPT-4: 8K-32K tokens, Claude 2: 100K tokens, GPT-4 Turbo: 128K tokens. Larger context windows enable processing longer documents.
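A sketch of checking input length against an assumed 8K-token limit using OpenAI's tiktoken tokenizer:

```python
import tiktoken   # pip install tiktoken

MAX_CONTEXT = 8_192   # example limit; the actual value depends on the model

enc = tiktoken.encoding_for_model("gpt-4")
document = "Long document text goes here..."
n_tokens = len(enc.encode(document))

if n_tokens > MAX_CONTEXT:
    print(f"{n_tokens} tokens exceed the context window; chunk or truncate the input.")
else:
    print(f"{n_tokens} tokens fit within the context window.")
```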
D
Data Augmentation
Technique to artificially expand training datasets by creating modified versions of existing data (rotation, cropping, noise addition). Improves model generalization and reduces overfitting.
Deep Learning
Subset of machine learning using multi-layer neural networks (deep networks) to learn hierarchical representations of data. Enables breakthroughs in computer vision, NLP, and speech recognition.
Diffusion Models
Generative AI models that create images by gradually removing noise from random data. Examples: Stable Diffusion, DALL-E 2, Midjourney. Used for text-to-image generation and image editing.
E
Embeddings
Dense vector representations of data (text, images, audio) in high-dimensional space where similar items are close together. Text embeddings from models like text-embedding-ada-002 enable semantic search and RAG systems.
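A toy NumPy sketch of why embeddings enable semantic comparison via cosine similarity; the vectors are invented for illustration, and real text embeddings typically have hundreds to thousands of dimensions:

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity in [-1, 1]; values near 1 mean the items are semantically close."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

cat = np.array([0.8, 0.1, 0.3, 0.0])
kitten = np.array([0.7, 0.2, 0.4, 0.1])
car = np.array([0.0, 0.9, 0.0, 0.4])

print(cosine_similarity(cat, kitten))   # high: related concepts
print(cosine_similarity(cat, car))      # lower: unrelated concepts
```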
Encoder-Decoder
Neural network architecture with two components: encoder processes input into representation, decoder generates output from that representation. Used in sequence-to-sequence tasks like translation.
Epoch
One complete pass through the entire training dataset. Weights are typically updated after each batch, so one epoch consists of many update steps. Typical training uses 3-10 epochs; more epochs risk overfitting.
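A minimal PyTorch sketch on toy data, showing that weights update once per batch while an epoch is one full pass over the data loader:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy regression data
X, y = torch.randn(256, 10), torch.randn(256, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(3):                      # one epoch = one full pass over the dataset
    for batch_x, batch_y in loader:         # weights are updated once per batch
        optimizer.zero_grad()
        loss = loss_fn(model(batch_x), batch_y)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch + 1}: loss {loss.item():.4f}")
```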
F
FAISS (Facebook AI Similarity Search)
Meta's library for efficient similarity search and clustering of dense vectors. Enables fast nearest neighbor search among millions/billions of vectors. Used in RAG systems and recommendation engines.
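A minimal sketch of indexing and querying random vectors with FAISS (exact L2 search; larger systems typically use approximate indexes such as IVF or HNSW):

```python
import faiss                 # pip install faiss-cpu
import numpy as np

d = 128                                     # embedding dimension
vectors = np.random.rand(10_000, d).astype("float32")

index = faiss.IndexFlatL2(d)                # exact L2 nearest-neighbor search
index.add(vectors)

query = np.random.rand(1, d).astype("float32")
distances, ids = index.search(query, 5)     # 5 nearest neighbors
print(ids[0], distances[0])
```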
Few-Shot Learning
ML technique where models learn from very few examples (2-10). LLMs excel at few-shot learning: providing examples in the prompt enables task performance without fine-tuning.
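An illustrative few-shot prompt for sentiment classification; the reviews are invented examples:

```python
# A few-shot sentiment-classification prompt: the examples teach the task in-context,
# with no fine-tuning. The final line is the new input the model should label.
prompt = """Classify the sentiment of each review as Positive or Negative.

Review: "The battery lasts all day and the screen is gorgeous."
Sentiment: Positive

Review: "Stopped working after a week and support never replied."
Sentiment: Negative

Review: "Setup took five minutes and it just works."
Sentiment:"""

# `prompt` would be sent to any LLM completion or chat API.
print(prompt)
```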
Fine-tuning
Process of adapting a pre-trained model to a specific task by training on domain-specific data. More efficient than training from scratch. Techniques include full fine-tuning, LoRA, and QLoRA.
G
GAN (Generative Adversarial Network)
Two neural networks (generator and discriminator) competing against each other. Generator creates fake data, discriminator tries to detect it. Used for image generation, style transfer, and data augmentation.
GPT (Generative Pre-trained Transformer)
OpenAI's family of large language models (GPT-2, GPT-3, GPT-3.5, GPT-4). Autoregressive models trained on vast text data to generate human-like text. GPT-4 is the most capable of this family.
Gradient Descent
Optimization algorithm that iteratively adjusts model parameters to minimize loss function. Variations include SGD (stochastic), Adam, AdamW. Core training mechanism for neural networks.
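A minimal NumPy sketch fitting a one-variable linear model by gradient descent on synthetic data:

```python
import numpy as np

# Fit y = w*x + b by gradient descent on mean squared error.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = 3.0 * x + 0.5 + rng.normal(0, 0.1, 100)   # true w=3.0, b=0.5 plus noise

w, b, lr = 0.0, 0.0, 0.1
for step in range(200):
    error = (w * x + b) - y
    grad_w = 2 * np.mean(error * x)   # d(loss)/dw
    grad_b = 2 * np.mean(error)       # d(loss)/db
    w -= lr * grad_w                  # step opposite the gradient
    b -= lr * grad_b

print(round(w, 2), round(b, 2))       # close to 3.0 and 0.5
```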
H
Hallucination
When AI models generate false or nonsensical information presented as fact. Major challenge in LLMs. Mitigation strategies: RAG, fine-tuning on factual data, lower temperature, citation requirements.
Hugging Face
Leading platform for open-source NLP/AI models, datasets, and libraries. Hosts 200K+ models including BERT, GPT-2, Llama, Mistral. Its Transformers library is the standard for using pre-trained models.
Hyperparameter
Model configuration set before training (learning rate, batch size, number of layers). Unlike model parameters (weights), hyperparameters are not learned from data. Tuning them optimizes performance.
I
Inference
Using a trained model to make predictions on new data. In production, inference speed and cost are critical. Techniques: quantization, model pruning, batch inference, caching.
Instruction Tuning
Fine-tuning LLMs on instruction-following datasets (e.g., 'Summarize this text:', 'Translate to Spanish:'). Creates models better at following user instructions. Used for ChatGPT, Claude.
K
K-Nearest Neighbors (KNN)
Simple ML algorithm that classifies data points based on majority class of k nearest neighbors. Used in recommendation systems and as baseline for complex models.
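A minimal scikit-learn sketch on the Iris dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

knn = KNeighborsClassifier(n_neighbors=5)   # classify by the 5 nearest training points
knn.fit(X_train, y_train)
print(f"accuracy: {knn.score(X_test, y_test):.2f}")
```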
Keras
High-level neural network API running on top of TensorFlow. Provides user-friendly interface for building and training deep learning models. Good for rapid prototyping.
L
LangChain
Framework for developing applications powered by language models. Provides chains, agents, memory, and tools for building RAG systems, chatbots, and AI agents. Supports multiple LLMs.
Latency
Time delay between input and output. Critical metric for production AI systems. Target: <100ms for most applications, <2s for LLM responses. Reduced via optimization, caching, edge deployment.
LLM (Large Language Model)
Neural network trained on massive text datasets (trillions of tokens) to understand and generate human language. Examples: GPT-4, Claude, Llama 2, Mistral. Size: 7B-175B+ parameters.
Llama (LLaMA)
Meta's family of open-source large language models (7B, 13B, 70B parameters). Llama 2 released July 2023 with commercial license. Strong alternative to proprietary models like GPT-4.
LoRA (Low-Rank Adaptation)
Parameter-efficient fine-tuning technique that trains small adapter matrices instead of full model weights. Dramatically reduces trainable parameters and GPU memory use, enabling fine-tuning of large models on consumer GPUs.
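A sketch using Hugging Face's peft library; the base model name and target module names are illustrative and depend on the architecture (Llama-2 weights also require access approval):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model   # pip install peft

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,                                   # rank of the adapter matrices
    lora_alpha=16,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt (model-specific)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()   # typically well under 1% of total parameters
```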
Loss Function
Mathematical function measuring difference between predicted and actual outputs. Model training minimizes loss. Common losses: cross-entropy (classification), MSE (regression), triplet loss (embeddings).
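A toy NumPy sketch of cross-entropy loss for a 3-class prediction, showing that confident correct predictions incur lower loss:

```python
import numpy as np

def cross_entropy(predicted_probs, true_class):
    """Penalizes low probability assigned to the correct class."""
    return -np.log(predicted_probs[true_class])

# Model A is confident and correct; model B spreads probability mass around.
probs_a = np.array([0.05, 0.90, 0.05])
probs_b = np.array([0.30, 0.40, 0.30])
true_class = 1

print(cross_entropy(probs_a, true_class))   # ~0.105  (low loss)
print(cross_entropy(probs_b, true_class))   # ~0.916  (higher loss)
```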
LSTM (Long Short-Term Memory)
Type of recurrent neural network that can learn long-term dependencies. Addresses vanishing gradient problem of basic RNNs. Used for time series, text, and sequential data.
M
Machine Learning
AI approach where systems learn patterns from data without explicit programming. Three main paradigms: supervised (labeled data), unsupervised (unlabeled), reinforcement (reward-based).
Milvus
Open-source vector database built for billion-scale similarity search. Supports horizontal scaling, multiple index types (IVF, HNSW), and various distance metrics. Enterprise-grade alternative to Pinecone.
Mistral
French AI startup's family of open-source LLMs. Mistral 7B outperforms Llama 2 13B. Mixtral 8x7B uses mixture-of-experts for efficient inference. Apache 2.0 license allows commercial use.
MLOps (Machine Learning Operations)
Practices for deploying and maintaining ML models in production. Includes CI/CD for ML, model monitoring, drift detection, retraining pipelines, and experiment tracking.
Model Drift
Degradation of model performance over time as real-world data distribution changes. Requires monitoring and periodic retraining. Types: concept drift, data drift, upstream drift.
N
NLP (Natural Language Processing)
AI field focused on interaction between computers and human language. Tasks: sentiment analysis, named entity recognition, machine translation, question answering, text generation.
Neural Network
Computing system inspired by biological neural networks. Consists of interconnected nodes (neurons) organized in layers. Learns by adjusting connection weights during training.
O
OpenAI
AI research company that created GPT models, DALL-E, Whisper, and ChatGPT. Provides API access to GPT-4, embeddings, and other models. Leader in large language model development.
Overfitting
When model learns training data too well, including noise and outliers, hurting performance on new data. Prevented via regularization, dropout, early stopping, more training data.
P
PEFT (Parameter-Efficient Fine-Tuning)
Family of techniques for fine-tuning large models by updating only a small subset of parameters. Includes LoRA, QLoRA, prefix tuning, adapter layers. Reduces memory and compute costs.
Pinecone
Managed vector database service for similarity search at scale. Fully managed, auto-scaling, <50ms latency. Best for fast prototyping. Pricing: $70-200/month for 10M vectors.
Prompt Engineering
Art and science of crafting effective prompts to get desired outputs from LLMs. Techniques: few-shot examples, chain-of-thought, role prompting, system messages, temperature tuning.
PyTorch
Meta's open-source deep learning framework. Dynamic computational graphs, Pythonic API, strong research community. Preferred for research and increasingly for production.
Q
QLoRA (Quantized LoRA)
Combines LoRA with 4-bit quantization for extreme memory efficiency. Fine-tune a 65B-parameter model on a single 48GB GPU, or ~33B models on a 24GB GPU. Far less memory than full fine-tuning with minimal performance loss.
Quantization
Reducing model precision from 32-bit floating point to 8-bit or 4-bit integers. Cuts memory roughly 75% (8-bit) to ~87% (4-bit) versus FP32 and speeds inference 2-4x with minimal accuracy loss.
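A toy NumPy sketch of symmetric 8-bit quantization of a weight matrix; production systems use libraries such as bitsandbytes or GPTQ rather than hand-rolled code:

```python
import numpy as np

# Symmetric 8-bit quantization of a float32 weight tensor.
weights = np.random.randn(1024, 1024).astype(np.float32)

scale = np.abs(weights).max() / 127.0            # map the largest weight to +/-127
q_weights = np.round(weights / scale).astype(np.int8)
dequantized = q_weights.astype(np.float32) * scale

print(weights.nbytes // q_weights.nbytes)        # 4x smaller in memory
print(np.abs(weights - dequantized).max())       # small round-off error
```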
Qdrant
Open-source vector database written in Rust. Fast performance, rich filtering, supports quantization. Good balance of features and cost. Self-host or use managed cloud.
R
RAG (Retrieval Augmented Generation)
Technique combining information retrieval with LLM generation. Retrieves relevant documents from knowledge base, injects into prompt, then generates response. Reduces hallucinations, enables up-to-date answers.
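A minimal sketch of the retrieve-augment-generate flow; embed, vector_store.search, and llm_complete are hypothetical placeholders for whatever embedding model, vector database, and LLM the application uses:

```python
# Minimal RAG sketch with placeholder components.
def answer_with_rag(question, vector_store, embed, llm_complete, top_k=3):
    # 1. Retrieve: embed the question and fetch the most similar documents.
    query_vector = embed(question)
    documents = vector_store.search(query_vector, top_k)

    # 2. Augment: inject the retrieved text into the prompt as grounding context.
    context = "\n\n".join(documents)
    prompt = (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

    # 3. Generate: let the LLM produce an answer grounded in the retrieved documents.
    return llm_complete(prompt)
```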
Reinforcement Learning
ML paradigm where agents learn optimal actions through trial-and-error interactions with environment. Used for robotics, game playing (AlphaGo), recommendation systems, and RLHF for LLMs.
RLHF (Reinforcement Learning from Human Feedback)
Training technique where humans rank model outputs, and model learns to generate outputs humans prefer. Used to align LLMs with human values. Core technology behind ChatGPT and Claude.
RNN (Recurrent Neural Network)
Neural network type designed for sequential data (text, time series, audio). Processes inputs sequentially, maintaining hidden state. Largely superseded by Transformers for NLP.
S
Semantic Search
Search by meaning rather than exact keyword matching. Uses embeddings to find documents semantically similar to query. Core technology in RAG systems and modern search engines.
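A minimal sketch with the sentence-transformers library (assumed installed; the model name is one common choice):

```python
from sentence_transformers import SentenceTransformer, util   # pip install sentence-transformers

model = SentenceTransformer("all-MiniLM-L6-v2")

corpus = [
    "How do I reset my password?",
    "Shipping usually takes 3-5 business days.",
    "Our refund policy allows returns within 30 days.",
]
corpus_embeddings = model.encode(corpus)

query = "I forgot my login credentials"            # no keyword overlap with the best match
scores = util.cos_sim(model.encode(query), corpus_embeddings)[0]
print(corpus[int(scores.argmax())])                # matches the password-reset document
```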
SHAP (SHapley Additive exPlanations)
Method for explaining ML model predictions by assigning importance value to each feature. Provides consistent and accurate feature attributions. Widely used for model interpretability.
Stable Diffusion
Open-source text-to-image diffusion model. Can generate, edit, and manipulate images from text prompts. Runs on consumer GPUs. Alternatives: DALL-E, Midjourney.
Supervised Learning
ML approach using labeled training data (input-output pairs). Model learns mapping from inputs to outputs. Used for classification and regression tasks.
Synthetic Data
Artificially generated data that mimics real data. Used when real data is scarce, expensive, or privacy-sensitive. Generated via GANs, simulation, or LLMs.
T
Temperature
LLM parameter controlling output randomness. Low (0-0.3): focused, deterministic. Medium (0.7-0.9): balanced. High (1.0+): creative, diverse. Adjust based on use case.
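A NumPy sketch showing how temperature scales logits before sampling:

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Lower temperature sharpens the distribution; higher temperature flattens it."""
    scaled = np.array(logits) / temperature
    exp = np.exp(scaled - scaled.max())
    return exp / exp.sum()

logits = [2.0, 1.0, 0.5]                      # raw model scores for three candidate tokens
print(softmax_with_temperature(logits, 0.2))  # nearly all probability on the top token
print(softmax_with_temperature(logits, 1.0))  # balanced
print(softmax_with_temperature(logits, 2.0))  # flatter, more random sampling
```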
TensorFlow
Google's open-source ML framework. Production-focused with TensorFlow Serving, TFX, and TensorFlow Lite for mobile/edge. Strong ecosystem and enterprise adoption.
Token
Smallest unit of text LLMs process. Roughly 0.75 words (1 token ≈ 4 characters). GPT-4 pricing based on tokens. Context limits measured in tokens (8K, 32K, 128K).
Transformer
Neural network architecture using self-attention mechanisms. Revolutionized NLP. Foundation of all modern LLMs (GPT, BERT, T5). Introduced in 'Attention is All You Need' (2017).
Transfer Learning
Using knowledge learned from one task to improve performance on a related task. Pre-training on large datasets, then fine-tuning on a specific task. Core approach in modern deep learning.
U
Unsupervised Learning
ML approach using unlabeled data to discover patterns. Tasks: clustering, dimensionality reduction, anomaly detection. Examples: K-means, PCA, autoencoders.
V
Vector Database
Specialized database for storing and querying high-dimensional vectors (embeddings). Enables fast similarity search via approximate nearest neighbor algorithms. Examples: Pinecone, Weaviate, Qdrant, Milvus.
Vision Transformer (ViT)
Transformer architecture adapted for computer vision. Treats image patches as tokens. Outperforms CNNs on many tasks. Examples: ViT, CLIP, DINO.
W
Weaviate
Open-source vector database with GraphQL API, built-in vectorization, and hybrid search. Supports multi-tenancy and complex filtering. Good for on-premise deployments.
Whisper
OpenAI's automatic speech recognition (ASR) model. Transcribes and translates audio to text with high accuracy. Supports 99 languages. Open-source and API available.
X
XGBoost
Gradient boosting library optimized for speed and performance. Dominates structured/tabular data ML competitions. Used for classification, regression, ranking tasks.
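A minimal sketch training an XGBoost classifier on a scikit-learn dataset (assumes xgboost is installed):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier   # pip install xgboost

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)
print(f"accuracy: {model.score(X_test, y_test):.3f}")
```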
Z
Zero-Shot Learning
Model performing tasks without specific training examples. LLMs can do zero-shot classification, translation, etc. by understanding instructions alone. Less accurate than few-shot but more flexible.
Need Help with AI Development?
Get a free consultation from our AI experts. We'll help you choose the right technologies and build production-ready AI solutions.
Schedule Free Consultation →