How Recommendation Systems Work (No Jargon)
The same idea powers Amazon, Netflix, and YouTube. Let's build it up slowly—from simple hand-made lists to a clean mental model that scales.
The Question
Have you ever wondered how websites like Amazon know exactly what products to recommend when you shop? Or how Netflix seems to suggest movies you end up enjoying?
The core question these companies solve every day is the same: how do we show each person things they'll probably like, whether that's products on Amazon, videos on YouTube, or movies on Netflix?
This guide will walk through exactly how modern recommendation systems work, starting from the simplest approach and building step by step to the powerful vector-based systems used today.
Attempt 1: Hand‑Made Lists
First idea: for each item, keep a small list of related items.
const related: Record<string, string[]> = {
banana: ["milk", "bread", "eggs"],
protein: ["shaker", "creatine"],
// ... and so on, by hand
};
// Show related["banana"] when user views a banana
Why this breaks
- Too many items to maintain by hand.
- New items have empty lists.
- Misses surprising combos users discover over time.
Walkthrough: Our Tiny Store
Imagine we launch a 6‑item store. We try to maintain related lists by hand. Then we add a new item and realize how fragile this gets.
const items = ["banana", "milk", "bread", "eggs", "protein", "shaker"];

const related: Record<string, string[]> = {
  banana: ["milk", "bread", "eggs"],
  protein: ["shaker"],
};

// Add "almond butter" → nothing recommends it; it recommends nothing.
// You'd have to edit many lists by hand to wire it in.
Attempt 2: Let Data Draw the Map
Let's improve our approach. Instead of manually creating lists, we can track what customers actually buy together and let the data guide us.
Every time two products are purchased together, we strengthen the connection between them. The more often they're bought together, the stronger their link becomes.
Real-world example: Beer and Diapers
A famous retail discovery found that on Friday nights, men would often buy beer and diapers together. This unexpected connection wasn't obvious until the data revealed it. The theory was that fathers, asked to pick up diapers on their way home, would also grab beer for the weekend.
This type of insight is almost impossible to discover with manual lists, but a data-driven approach catches it automatically.
How it works: Connection Strength
Picture the items as a graph: every product is a node, and every connection between two products is labeled with how many times they were bought together. To recommend products, we simply look for the strongest connections to whatever the customer is currently viewing.
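Here's a minimal sketch of that bookkeeping in TypeScript, assuming a simple in-memory map (the function names and items are just for illustration):
// Co-purchase counts: strength["banana"]["milk"] = times bought together
const strength: Record<string, Record<string, number>> = {};

// Call this for every pair of items in a completed order
function recordCoPurchase(a: string, b: string): void {
  strength[a] = strength[a] ?? {};
  strength[b] = strength[b] ?? {};
  strength[a][b] = (strength[a][b] ?? 0) + 1;
  strength[b][a] = (strength[b][a] ?? 0) + 1;
}

// Recommend the items most often bought together with `item`
function recommend(item: string, k = 3): string[] {
  return Object.entries(strength[item] ?? {})
    .sort(([, a], [, b]) => b - a) // strongest connections first
    .slice(0, k)
    .map(([name]) => name);
}

recordCoPurchase("banana", "milk");
recordCoPurchase("banana", "milk");
recordCoPurchase("banana", "bread");
console.log(recommend("banana")); // ["milk", "bread"]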
This approach works well and learns automatically from customer behavior. But it has problems at scale:
Problems with the Graph Approach
1. Storage Problem:
For N products, we need an N×N matrix to track all relationships. With millions of products, this becomes massive.
2. Cold Start Problem:
New products have no purchase history, so they have no connections and won't be recommended.
3. Computational Problem:
Finding the top-5 strongest connections means sorting thousands or millions of values for each product view.
// Adjacency matrix of co-purchase counts
// rows/cols: [banana, milk, bread, beer]
const coPurchases: number[][] = [
  [0, 42, 31, 1],  // banana & milk: 42, banana & bread: 31, banana & beer: 1
  [42, 0, 15, 0],  // milk & bread: 15
  [31, 15, 0, 0],
  [1, 0, 0, 0],
];
// New item "chia" → a new row & column full of 0s (cold start)
// With millions of products, this matrix is gigantic
Attempt 3: Organizing Items on a Number Line
Let's try a different approach. What if we assign each product a single number based on some important feature? Then we could place all products on a simple number line.
How it works:
1. Assign each product a number (maybe based on how sweet, expensive, or popular it is)
2. When a customer views a product, find other products with similar numbers
Number Line Example:
When viewing banana (13), recommend milk (12) and bread (14) as they're nearby
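In code, the whole scheme is tiny. A sketch, with invented positions:
// One position per product on the number line (values are made up)
const position: Record<string, number> = {
  milk: 12,
  banana: 13,
  bread: 14,
  beer: 40,
};

// Recommend the k items whose positions are closest to the viewed item
function nearby(item: string, k = 2): string[] {
  const p = position[item];
  return Object.keys(position)
    .filter((name) => name !== item)
    .sort((a, b) => Math.abs(position[a] - p) - Math.abs(position[b] - p))
    .slice(0, k);
}

console.log(nearby("banana")); // ["milk", "bread"]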
Problems with the Number Line Approach:
1. The Insertion Problem
What happens when a new product arrives? Where do you put it? If you have:
… 12 (milk) — 13 (banana) — 14 (bread) …
And need to add cereal that's similar to both milk and banana?
… 12 (milk) — 12.5 (cereal) — 13 (banana) — 14 (bread) …
You can squeeze it in at 12.5 this time, but every new arrival forces another round of renumbering, and similar items keep crowding into ever-smaller gaps between the existing ones.
2. The Edge Problem
Products at the ends of the line only have neighbors on one side, limiting recommendation options.
3. The Complexity Problem
Products have multiple attributes. A banana might be sweet (8/10), cheap (3/10), and healthy (7/10). How do you express all these attributes with just one number?
The Multi-Dimensional Solution: Movies Have Multiple Traits
We've tried organizing items with one number, but it's not enough. Let's think about Netflix movies as a real-world example:
Think about how you describe movies to friends. You might say:
- "It's mostly a comedy, but has some romance too"
- "It's super action-packed with a bit of humor"
- "It's very dramatic and quite realistic"
Movies have different amounts of different traits, all at the same time.
What if instead of one number, we give each movie several numbers - one for each important trait?
Example: Calm Romantic Comedy
Each trait scored from 0 (none) to 1 (maximum), something like [Funny: 0.7, Romantic: 0.9, Action: 0.1, Intense: 0.2].
Example: Action Comedy
A different blend of the same traits, perhaps [Funny: 0.6, Romantic: 0.2, Action: 0.9, Intense: 0.7].
The Big Idea:
Instead of giving each item one number and placing it on a line, we give it multiple numbers (one for each trait) and place it in a multi-dimensional space.
This means:
- Each movie has its own unique "fingerprint" of traits
- We can find movies with similar fingerprints
- We can place new movies anywhere in this space without disrupting others
- We can capture the full complexity of each item
Mapping Items in Multi-Dimensional Space
Let's visualize how this works with a simple example. Imagine we track just two traits for simplicity: Action (0-1) and Funny (0-1). Each movie becomes a point on a 2D map.
How It Works: The Coordinate System
Each movie gets coordinates based on its traits:
Rom-Com Movie:
[Action: 0.1, Funny: 0.8]
Low action, very funny
Action Comedy:
[Action: 0.9, Funny: 0.6]
High action, moderately funny
The Fundamental Idea:
Similar movies = Close points
Different movies = Far points
This works with any number of dimensions!
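What does "close" mean precisely? Plain straight-line (Euclidean) distance works in any number of dimensions. A minimal sketch (the function name is just for illustration):
// Straight-line (Euclidean) distance between two points in trait space
function distance(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

const romCom = [0.1, 0.8];       // [Action, Funny]
const actionComedy = [0.9, 0.6];
console.log(distance(romCom, actionComedy)); // ≈ 0.82: far apart on this map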
Finding Recommendations: Similarity Search
When a user watches and enjoys a movie with certain traits, we can find other movies with similar trait patterns:
Example: A user watches and likes several movies averaging around:
[Action: 0.2, Funny: 0.7]
Our recommendation system looks for other movies with similar coordinates - low action, high humor - perhaps a light-hearted comedy.
Simple Explanation of Similarity Search:
- Take the traits of movies the user likes
- Look around that area on our trait map
- Recommend the closest movies they haven't seen yet
That's all "similarity search" means!
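And in code, similarity search really is just a few lines. A sketch with an invented four-movie library:
interface Movie { title: string; traits: number[] } // [Action, Funny]

const library: Movie[] = [
  { title: "Rom-Com Movie",   traits: [0.1, 0.8]  },
  { title: "Action Comedy",   traits: [0.9, 0.6]  },
  { title: "Light Comedy",    traits: [0.2, 0.75] },
  { title: "Gritty Thriller", traits: [0.95, 0.1] },
];

const distance = (a: number[], b: number[]): number =>
  Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));

// The user's taste: the average traits of movies they liked
const userTaste = [0.2, 0.7];

// Rank the library by distance to the user's taste and keep the closest
// (a real system would also filter out already-watched titles)
const recommendations = library
  .map((m) => ({ title: m.title, d: distance(userTaste, m.traits) }))
  .sort((a, b) => a.d - b.d)
  .slice(0, 2)
  .map((m) => m.title);

console.log(recommendations); // ["Light Comedy", "Rom-Com Movie"]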
How Netflix Actually Uses This Approach
When you use Netflix, this multi-dimensional approach powers your recommendations behind the scenes:
- Trait Analysis: Netflix analyzes each movie/show across dozens or hundreds of traits (not just the few we've discussed)
- Learning Your Preferences: As you watch content, Netflix builds a profile of trait combinations you enjoy
- Finding Similar Content: The system searches its library for unwatched content with similar trait patterns
- Continuous Adaptation: Your recommendations adjust as you keep watching and rating content
Solving the Cold Start Problem
One huge advantage of this approach: when Netflix adds a brand new show or movie, they don't need to wait for people to watch it before they can recommend it.
They can analyze the content, assign trait values, and immediately place it in their multi-dimensional space. If it lands near other content you like, it may appear in your recommendations on day one!
Introducing Vectors and Embeddings
Now that you understand the concept, let's introduce the proper terminology that engineers and data scientists use:
Vector
A list of numbers like [0.1, 0.8, 0.7, 0.6, 0.3] is called a vector.
Each number in the vector represents a different dimension or trait. A vector with 5 numbers is called a 5-dimensional vector.
If you remember coordinate pairs (x,y) from math class, those are just 2-dimensional vectors!
Embedding
When AI or machine learning creates these vectors automatically by analyzing data, we call the result an embedding.
Instead of manually assigning trait values, the system learns them from patterns in the data.
Cosine Similarity
When comparing vectors, we often care more about their direction than their length (how strong the traits are overall). The technical name for this comparison is cosine similarity.
Why direction matters:
Consider two comedy-romance movies, one with stronger traits than the other:
Movie A: [Funny: 0.5, Romance: 0.4]
Movie B: [Funny: 0.9, Romance: 0.7]
Both have similar trait proportions (direction), so they're similar despite different intensities.
// Simple vector comparison
const A = [2, 2]; // direction is 45°
const B = [4, 4]; // same direction, 2x magnitude
const C = [4, 1]; // different direction
// A and B point the same way (similar movies)
// A and C point differently (different movies)
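For the curious, here is a minimal sketch of the actual computation (1 means identical direction, 0 means unrelated):
// cosine(a, b) = dot(a, b) / (|a| * |b|)
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const mag = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (mag(a) * mag(b));
}

console.log(cosine([2, 2], [4, 4]));         // 1.0: same direction
console.log(cosine([2, 2], [4, 1]));         // ≈ 0.86: different direction
console.log(cosine([0.5, 0.4], [0.9, 0.7])); // ≈ 0.9999: Movies A and B above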
Vector Databases: Storing and Searching Efficiently
There's one more problem to solve: how do we store millions of vectors and search through them quickly?
The Scale Problem
Imagine Netflix with 10,000+ movies and shows. Each item has a vector with 100+ dimensions. When a user wants recommendations, we need to find the closest vectors quickly.
The naive approach would be to compare the user's preference vector with every single item vector, but that would be too slow for real-time recommendations.
Enter Vector Databases
A vector database is specialized for storing and searching vectors efficiently. Here's what it typically stores:
For each item:
- Unique ID (movie_123)
- Vector ([0.2, 0.8, 0.5, ...])
- Metadata (title, image URL, etc.)
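As a rough sketch, one such record might look like this in TypeScript (field names are illustrative, not any particular database's schema):
interface VectorRecord {
  id: string;        // e.g. "movie_123"
  vector: number[];  // e.g. [0.2, 0.8, 0.5, ...], often 100+ dimensions
  metadata: {        // whatever you want returned with a search hit
    title: string;
    imageUrl?: string;
  };
}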
Specialized features:
- Efficient indexing for fast similarity search
- Support for various similarity metrics
- Optimized for high-dimensional data
Technical note: Vector databases use advanced indexing techniques like HNSW (Hierarchical Navigable Small World), and libraries such as Annoy or FAISS, to avoid comparing with every vector in the database.
Beyond Recommendations: Other Vector Database Applications
This same technology powers many modern AI applications:
Image Search
Convert images to vectors capturing visual features, then find visually similar images.
Semantic Text Search
Find documents with similar meaning, not just matching keywords.
Fraud Detection
Find transactions with similar patterns to known fraudulent ones.
AI Chatbots
Store knowledge as vectors to retrieve relevant information for answering questions.
Try It Yourself: Vector Thinking
Now that you understand the concepts, try applying vector thinking to a simple example:
Snack Recommendation Exercise:
Let's use just two dimensions for simplicity:
- Sweetness: from 0 (not sweet) to 1 (very sweet)
- Crunchiness: from 0 (soft) to 1 (very crunchy)
Here are some snacks with their vectors:
Potato Chips
[Sweet: 0.1, Crunchy: 0.9]
Chocolate Chip Cookie
[Sweet: 0.8, Crunchy: 0.3]
Granola Bar
[Sweet: 0.5, Crunchy: 0.6]
Questions to consider:
- Which two snacks are most similar?
- If someone likes sweet snacks with medium crunch, what would you recommend?
- Where would caramel popcorn go in this space? [Sweet: 0.7, Crunchy: 0.8]
This simple example demonstrates how vector representation and similarity work in practice.
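If you want to check your answers, a few lines of code settle the first question (same Euclidean distance as before):
const snacks: Record<string, number[]> = { // [Sweet, Crunchy]
  chips:   [0.1, 0.9],
  cookie:  [0.8, 0.3],
  granola: [0.5, 0.6],
};

const dist = (a: number[], b: number[]): number =>
  Math.sqrt(a.reduce((s, x, i) => s + (x - b[i]) ** 2, 0));

console.log(dist(snacks.chips, snacks.cookie).toFixed(2));   // 0.92
console.log(dist(snacks.chips, snacks.granola).toFixed(2));  // 0.50
console.log(dist(snacks.cookie, snacks.granola).toFixed(2)); // 0.42: closest pair
The cookie and the granola bar turn out to be the most similar pair.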
Summary: From Arrays to Vector Databases
We've covered a lot! Let's recap the journey we've taken from simple recommendation systems to modern vector databases:
1. We started with manual lists
Hand-created arrays of related items that quickly become unmanageable.
2. We tried data-driven connections
Tracking which items are bought together, but the matrices grow too large at scale.
3. We tried a simple number line
Placing items on a one-dimensional scale, but that couldn't capture complexity.
4. We moved to multi-dimensional vectors
Giving items multiple numbers to represent different traits, creating a rich feature space.
5. We introduced vector databases
Specialized storage systems that make it efficient to find similar vectors quickly.
Key Terms to Remember:
- Vector: A list of numbers representing an item's traits or features.
- Embedding: A vector learned from data by AI/ML systems.
- Similarity: How close two vectors are in direction/pattern (often measured by cosine similarity).
- Vector Database: Specialized storage optimized for vector similarity search.