
Design a URL Shortener


Who Asks This Question?

The URL shortener is arguably the most common system design interview question. It's a "warm-up" problem at senior levels but can be the entire round at mid-level. Companies that frequently ask it:

  • Amazon — Appears in SDE II and senior system design rounds
  • Google — Often as the first of two design questions in a 45-minute block
  • Oracle — Reported on Glassdoor for Principal Software Engineer interviews
  • Microsoft — Used for both SDE and senior-level system design
  • Meta — A classic in their system design loop
  • LinkedIn — The URL shortener maps directly to their link tracking infrastructure

It's also a popular choice at startups and mid-size companies because it's accessible enough to evaluate candidates who haven't memorized the "Design YouTube" playbook, while still testing distributed systems fundamentals.

Why do so many companies ask this? Because it's deceptively simple. "Store a mapping between two strings" sounds trivial — until you realize you need to generate unique keys across a distributed fleet, handle billions of reads, and the service must never go down. The gap between a junior answer and a senior answer is enormous.

What the Interviewer Is Really Testing

The URL shortener is not about URL shortening. It's a proxy for testing these core skills:

| Skill Being Tested | How It Manifests |
| --- | --- |
| Estimation | Can you size the system and justify your short code length? |
| Data modeling under constraints | Key-value access pattern → database and cache choices |
| Unique ID generation | The hardest distributed systems sub-problem hidden in a "simple" question |
| Read-heavy optimization | 10:1 or 100:1 read-to-write ratio → cache strategy |
| Trade-off communication | 301 vs 302, SQL vs NoSQL, hash vs sequential — every choice has a cost |

The #1 scoring differentiator: Can you explain why you chose something, or do you just state what you chose? "I'd use DynamoDB" gets zero points. "Our primary access pattern is a point lookup by short code — a key-value pattern. At billions of records, DynamoDB gives single-digit millisecond reads without us managing sharding" gets full marks.

Step 1: Clarify Requirements

Questions That Change Your Design

"What's the expected scale?" This determines everything. 1,000 URLs/day is a single PostgreSQL instance. 100 million URLs/day requires distributed ID generation, sharding, and a cache layer. Ask this first.

"Do we need analytics?" If yes, you must use HTTP 302 (temporary redirect) so every click hits your server. If no, HTTP 301 (permanent redirect) reduces load dramatically because browsers cache it. This single requirement changes your redirect strategy, server load calculations, and whether you need a message queue.

"Should the same long URL always produce the same short URL?" This sounds minor but changes your encoding strategy entirely. If yes, you need a hash-based approach (or a dedup lookup). If no, you can use sequential IDs which are simpler and collision-free.

"Do short URLs expire?" Expiration means you need a TTL column, a cleanup job, and the ability to recycle short codes. No expiration is simpler but means your storage grows forever.

State Your Requirements

After discussion, explicitly write these down (interviewers notice):

Functional:

  • Given a long URL, return a unique short URL
  • Given a short URL, redirect to the original
  • Optional: custom aliases, click analytics

Non-Functional:

  • High availability — redirects must work 99.99% of the time
  • Low latency — redirects under 100ms
  • Short codes must be as short as possible while supporting the required scale
  • Once created, a short URL must never break (durability)

Back-of-the-Envelope Estimation

This is where many candidates either skip (bad) or spend too long (also bad). Keep it to 2 minutes:

Given: 100M new URLs per day

| Metric | Calculation | Result |
| --- | --- | --- |
| Write QPS | 100M / 86,400 | ~1,200 writes/sec |
| Read QPS (10:1 ratio) | 1,200 * 10 | ~12,000 reads/sec |
| Storage per year | 100M * 365 * 500 bytes | ~18 TB/year |
| Short code length | 62^7 = 3.5 trillion codes | 7 chars lasts 95+ years |

Why 7 characters? With base62 encoding (a-z, A-Z, 0-9), each character has 62 possible values. 62^6 = 56 billion (would run out in ~1.5 years at 100M/day). 62^7 = 3.5 trillion (lasts nearly a century). Seven characters it is.

Strong candidates derive the short code length from the scale estimate. Weak candidates just say "7 characters" without showing why. The math takes 30 seconds and demonstrates analytical thinking.
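The derivation is quick to sanity-check in code. A minimal sketch (the class name is illustrative; the 100M/day figure is the estimate above):

```java
public class CodeLengthMath {
    public static void main(String[] args) {
        long urlsPerDay = 100_000_000L;
        for (int len = 6; len <= 7; len++) {
            double codes = Math.pow(62, len);           // size of the base62 keyspace
            double years = codes / urlsPerDay / 365.0;  // time until exhaustion
            System.out.printf("%d chars: %.2e codes, ~%.1f years%n", len, codes, years);
        }
    }
}
```

Running it shows 6 characters exhaust in under two years while 7 characters last close to a century, matching the table above.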

Step 2: High-Level Design

API Design

Keep it simple — two endpoints:

POST /api/v1/shorten
Body:     { "longUrl": "https://example.com/very/long/path..." }
Response: { "shortUrl": "https://short.ly/a3Bc9xZ" }

GET /{shortCode}
Response: HTTP 302 → Location: https://example.com/very/long/path...

The 301 vs 302 Decision

This is a trick question that interviewers love because it reveals whether you've thought about analytics:

| Aspect | 301 (Permanent) | 302 (Temporary) |
| --- | --- | --- |
| Browser behavior | Caches the redirect forever | Hits your server every time |
| Server load | Much lower | Higher |
| Analytics | Can't track repeat clicks | Tracks every click |
| SEO | Passes link equity to destination | Retains link equity on short URL |

The right answer depends on your requirements. If analytics matters (Bit.ly's entire business model), use 302. If you're building an internal link shortener and want minimal infrastructure, use 301.

Most candidates just say "302 for analytics." A stronger answer: "I'd default to 302 because analytics is typically expected, and it gives us flexibility to add features like A/B testing redirects or link expiration later. If we later find the server load is problematic, we can switch specific high-traffic links to 301."
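To make the choice concrete, here is a minimal sketch of the redirect endpoint using the JDK's built-in com.sun.net.httpserver. RedirectSketch is an illustrative name and the lookup is stubbed to a fixed URL; a real handler would resolve the short code first.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;

public class RedirectSketch {
    // Builds and starts a tiny redirect server on an ephemeral port.
    public static HttpServer start() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/", exchange -> {
            // A real service would look the short code up; stubbed to one URL here.
            exchange.getResponseHeaders().add("Location", "https://example.com/landing");
            // 302: the browser re-requests us on every click, so analytics sees each one.
            exchange.sendResponseHeaders(302, -1);
            exchange.close();
        });
        server.start();
        return server;
    }
}
```

Switching the status code to 301 is a one-character change, which is why deferring the decision behind a per-link setting is cheap.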

Architecture Overview

Write path:

Client → Load Balancer → API Server → ID Generator → Database → (warm cache) → Response

Read path:

Client → Load Balancer → API Server → Cache (hit?) → Database (miss) → Response + cache backfill

Components:

  1. Load Balancer — Round-robin across stateless API servers
  2. API Servers — Stateless, horizontally scalable
  3. ID Generator — The hardest sub-problem (see deep dive)
  4. Database — Persistent URL mappings
  5. Cache (Redis) — Hot URL lookups for the read path

Step 3: Deep Dive — The Encoding Problem

This is the core technical challenge and where the interview gets interesting. There are three approaches, each with real trade-offs.

Approach 1: Base62 Encode a Unique ID

Generate a unique numeric ID (e.g., 123456789), then convert it to a base62 string ("8m0Kx").

public class Base62 {
    private static final String CHARS =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    public static String encode(long num) {
        if (num == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (num > 0) {
            sb.append(CHARS.charAt((int)(num % 62)));
            num /= 62;
        }
        return sb.reverse().toString();
    }

    public static long decode(String code) {
        long result = 0;
        for (char c : code.toCharArray()) {
            result = result * 62 + CHARS.indexOf(c);
        }
        return result;
    }
}

Pros: Zero collisions (every ID maps to a unique string). Deterministic. Simple.

Cons: Short codes are sequential — users can guess nearby URLs by incrementing. This is a security concern if URLs contain private content.

When to pick it: When you have a reliable unique ID generator and predictability isn't a concern.

Approach 2: Hash the URL and Truncate

Hash the long URL (e.g., SHA-256) and take the first 7 characters of the base62-encoded hash.

Pros: Same URL always produces the same short code (built-in deduplication). No ID generator needed.

Cons: Collisions are inevitable. With 7 characters (3.5T combinations) and billions of URLs, you'll eventually get two different URLs producing the same 7-char prefix. You need collision resolution — check the database, and if the code exists, append a counter and rehash.

When to mention it: Show the interviewer you know this approach exists and why it's problematic at scale. Then explain why you prefer approach 1.
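A minimal sketch of the approach, assuming SHA-256 and a simple byte-to-alphabet mapping. HashCode7 is an illustrative name; the modulo mapping is slightly biased, which is fine for illustration, and collisions still require a database check before insert.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class HashCode7 {
    private static final String CHARS =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Deterministic: the same long URL always yields the same 7-char code.
    public static String shortCode(String longUrl) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(longUrl.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (int i = 0; sb.length() < 7; i++) {
                sb.append(CHARS.charAt((hash[i] & 0xFF) % 62)); // biased but illustrative
            }
            return sb.toString();
        } catch (java.security.NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }
}
```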

Approach 3: Pre-Generated Key Service

Generate millions of random 7-character codes offline, store them in a database with two tables: unused_keys and used_keys. When a URL needs shortening, grab a key from the unused pool.

Pros: Zero runtime computation. No collisions. Codes are random (not guessable).

Cons: Requires a separate service. What happens when two servers grab the same key simultaneously? You need synchronization — typically each server batch-loads keys into memory (e.g., 1,000 at a time) and marks them as reserved.

When to mention it: As an optimization for high-write scenarios where you need non-sequential codes and want to eliminate any possibility of collision at write time.
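The batch-reservation idea can be sketched in-process. KeyPool and KeyStore are hypothetical names here: KeyStore stands in for an atomic "mark N keys reserved" operation against the unused_keys table, and each server drains its reserved batch locally.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class KeyPool {
    // Stands in for the unused_keys table; must atomically reserve n keys.
    interface KeyStore {
        List<String> reserveBatch(int n);
    }

    private final Deque<String> local = new ArrayDeque<>();
    private final KeyStore store;
    private final int batchSize;

    public KeyPool(KeyStore store, int batchSize) {
        this.store = store;
        this.batchSize = batchSize;
    }

    // One round trip per batch instead of per key; no collisions possible
    // because each key is handed out exactly once.
    public synchronized String next() {
        if (local.isEmpty()) local.addAll(store.reserveBatch(batchSize));
        return local.poll();
    }
}
```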

In most interviews, pick Base62 + unique ID and explain why. Mention hashing to show awareness of alternatives. Bring up key pre-generation only if the interviewer pushes on predictability or write performance.

Step 4: Deep Dive — Unique ID Generation

If you chose base62 encoding (which you should as your default), the follow-up question is immediate: "How do you generate globally unique IDs across a distributed fleet?"

This sub-problem is where the interview really tests your distributed systems knowledge.

| Approach | Throughput | Coordination | Ordering | Drawback |
| --- | --- | --- | --- | --- |
| Auto-increment DB | Moderate | Centralized | Strict | Single point of failure |
| UUID | Unlimited | None | None | 128 bits — way too long for short URLs |
| Snowflake ID | High | Minimal | Time-sorted | Requires clock sync |
| Redis INCR | Very high | Centralized | Strict | Redis becomes a hard dependency |
| Range-based allocation | High | Periodic | Within range | Gaps in IDs, wasted ranges on crash |

Snowflake ID (The Standard Answer)

Twitter created Snowflake to generate 64-bit unique IDs without coordination between servers. The bit layout:

| 1 bit unused | 41 bits timestamp | 10 bits machine ID | 12 bits sequence |
  • 41-bit timestamp (ms since custom epoch): ~69 years of unique timestamps
  • 10-bit machine ID: 1,024 machines
  • 12-bit sequence: 4,096 IDs per millisecond per machine

That's 4 million IDs/second per machine with no coordination.

public class SnowflakeGenerator {
    private static final long EPOCH = 1704067200000L; // 2024-01-01
    private static final int MACHINE_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long machineId;
    private long lastTimestamp = -1;
    private long sequence = 0;

    public SnowflakeGenerator(long machineId) {
        if (machineId < 0 || machineId >= (1L << MACHINE_BITS)) {
            throw new IllegalArgumentException(
                "Machine ID must be between 0 and " + ((1L << MACHINE_BITS) - 1));
        }
        this.machineId = machineId;
    }

    public synchronized long nextId() {
        long timestamp = System.currentTimeMillis();

        if (timestamp < lastTimestamp) {
            throw new RuntimeException(
                "Clock moved backwards by " + (lastTimestamp - timestamp) + "ms");
        }

        if (timestamp == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted for this ms — wait
                while (timestamp <= lastTimestamp) {
                    timestamp = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0;
        }

        lastTimestamp = timestamp;

        return ((timestamp - EPOCH) << (MACHINE_BITS + SEQUENCE_BITS))
             | (machineId << SEQUENCE_BITS)
             | sequence;
    }
}

Notice the clock-backwards check. This is a real production concern — NTP adjustments can move the clock back. Weak candidates ignore this entirely. Strong candidates mention it and explain the mitigation: throw an exception and alert ops, or use a monotonic clock source.

Range-Based Allocation (Simpler Alternative)

If Snowflake feels complex for the interview, a simpler approach: a coordinator service hands out ranges. Server A gets IDs 1-10,000. Server B gets 10,001-20,000. Each server increments locally within its range. When a range runs out, request a new one.

Pros: Dead simple. No clock dependency. Cons: IDs aren't time-sorted. Wasted IDs if a server crashes mid-range.

Mention this as an alternative to show you can simplify when appropriate.
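A sketch of the server-local side, with the coordinator stubbed as a LongSupplier. In production the coordinator would be a transactional "bump the high-water mark" row in a database; the names here are illustrative.

```java
import java.util.function.LongSupplier;

public class RangeIdGenerator {
    private final LongSupplier coordinator; // hands out the start of each new range
    private final long rangeSize;
    private long current;
    private long end;

    public RangeIdGenerator(LongSupplier coordinator, long rangeSize) {
        this.coordinator = coordinator;
        this.rangeSize = rangeSize;
        refill();
    }

    private void refill() {
        current = coordinator.getAsLong(); // e.g. an atomic counter bump in the DB
        end = current + rangeSize;
    }

    // Purely local increments between refills: no clock, no per-ID coordination.
    public synchronized long nextId() {
        if (current == end) refill(); // range exhausted: request a fresh one
        return current++;
    }
}
```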

Step 5: Deep Dive — Database and Cache

Database Choice

Your access pattern is simple: write a (short_code, long_url) pair, read by short_code. This is a textbook key-value workload.

Relational DB (PostgreSQL/MySQL): Works fine up to a few hundred million records. Strong consistency, familiar tooling, easy joins for analytics. Shard by short_code hash when you outgrow a single instance.

NoSQL (DynamoDB/Cassandra): Natural fit for key-value at massive scale. DynamoDB gives single-digit millisecond reads at any scale without managing shards yourself. But you lose ad-hoc queries and joins.

The answer interviewers want: "I'd start with PostgreSQL because it handles our initial scale, supports analytics queries, and our team knows it. If we grow beyond a single instance, our access pattern is pure key-value, which makes migration to DynamoDB straightforward."

Schema

CREATE TABLE url_mappings (
    id           BIGINT PRIMARY KEY,  -- Snowflake ID
    short_code   VARCHAR(7) UNIQUE NOT NULL,
    long_url     TEXT NOT NULL,
    user_id      BIGINT,              -- nullable, for user-owned links
    created_at   TIMESTAMP DEFAULT NOW()
);

-- Optional: analytics
CREATE TABLE click_events (
    id           BIGINT PRIMARY KEY,
    short_code   VARCHAR(7) NOT NULL,
    clicked_at   TIMESTAMP NOT NULL,
    ip_address   INET,
    user_agent   TEXT,
    referrer     TEXT
);

Caching Strategy

Redirects follow a power-law distribution: the top 20% of URLs handle 80% of traffic. This is an ideal caching scenario.

Cache-aside pattern:

  1. Check Redis for short_code
  2. Cache hit → return the long URL immediately
  3. Cache miss → query database, store result in Redis with TTL, return

public class RedirectService {
    private final JedisPool redis;
    private final UrlRepository db;

    public String resolve(String shortCode) {
        try (Jedis jedis = redis.getResource()) {
            String cached = jedis.get("url:" + shortCode);
            if (cached != null) return cached;
        }

        String longUrl = db.findByShortCode(shortCode);
        if (longUrl == null) return null;

        try (Jedis jedis = redis.getResource()) {
            jedis.setex("url:" + shortCode, 86400, longUrl);
        }
        return longUrl;
    }
}

Key decisions to discuss:

  • TTL: 24 hours is reasonable. Popular URLs get refreshed before expiry from ongoing traffic.
  • Eviction: LRU — least popular URLs get evicted first, keeping the hot set in cache.
  • Cache size: With 20% of URLs serving 80% of reads, you need to cache far less than the total dataset. At 500 bytes per entry, 10 million cached URLs = ~5 GB — fits easily in a single Redis instance.
  • Cache warming on write: When a new URL is shortened, proactively write it to cache. First-time access won't be a cold miss.
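The write path ties these pieces together: generate an ID, base62-encode it, persist, then warm the cache. A sketch under assumptions: UrlStore and UrlCache are hypothetical interfaces standing in for the database and Redis, and the ID generator is injected (e.g. the Snowflake generator from Step 4).

```java
import java.util.function.LongSupplier;

public class ShortenService {
    interface UrlStore { void save(long id, String code, String longUrl); }
    interface UrlCache { void put(String code, String longUrl, int ttlSeconds); }

    private static final String CHARS =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private final LongSupplier idGen; // e.g. snowflakeGenerator::nextId
    private final UrlStore db;
    private final UrlCache cache;

    public ShortenService(LongSupplier idGen, UrlStore db, UrlCache cache) {
        this.idGen = idGen;
        this.db = db;
        this.cache = cache;
    }

    public String shorten(String longUrl) {
        long id = idGen.getAsLong();
        String code = base62(id);
        db.save(id, code, longUrl);       // durable write first
        cache.put(code, longUrl, 86_400); // then warm the cache (24h TTL)
        return code;
    }

    private static String base62(long n) {
        if (n == 0) return "0";
        StringBuilder sb = new StringBuilder();
        for (; n > 0; n /= 62) sb.append(CHARS.charAt((int) (n % 62)));
        return sb.reverse().toString();
    }
}
```

Writing to the database before the cache keeps the invariant simple: anything in cache is already durable.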

Step 6: Wrap Up and Trade-offs

The Summary Table

| Decision | Choice | Why |
| --- | --- | --- |
| Short code generation | Base62 of Snowflake ID | No collisions, distributed, simple |
| Code length | 7 characters | 3.5 trillion codes, lasts 95+ years at 100M/day |
| Redirect type | HTTP 302 | Enables analytics, maintains flexibility |
| Database | PostgreSQL → DynamoDB at scale | Start simple, migrate when access pattern validates |
| Cache | Redis, LRU, 24h TTL | Power-law distribution makes caching highly effective |
| Analytics | Async via message queue | Never slow down redirects for analytics writes |

Follow-Up Questions to Prepare For

"What if someone shortens a malicious URL?" URL validation matters. Check against a blocklist of known malicious domains (Google Safe Browsing API). Scan for phishing patterns. Rate-limit the shortening endpoint to prevent abuse.

"How would you handle a URL that goes viral — 1 million clicks per second?" Your cache handles this. Redis can serve 100K+ reads per second on a single instance. Multiple Redis replicas behind a load balancer can handle millions. The database never sees the traffic because the cache hit rate for a viral URL is effectively 100%.

"What about custom aliases — short.ly/my-brand?" Check if the alias is already taken (DB lookup). Validate format (alphanumeric, 3-30 chars). Reserve certain aliases to prevent brand abuse ("admin", "api", "help"). Custom aliases bypass the ID generator entirely — they're a direct insert with the user's chosen code.

"How would you implement link expiration?" Add a expires_at column. On redirect, check if the link has expired. Run a background cleanup job to delete expired entries and free cache space. Consider: should expired short codes be recyclable? Most services say no — broken links are worse than running out of codes.

"How do you shard the database?" Shard by the hash of short_code. This distributes reads and writes evenly because short codes are already distributed (base62 of Snowflake IDs). Don't shard by user_id — one power user could hot-spot a shard, and reads are by short_code, not user_id.

Common Mistakes

Mistake 1: Skipping Estimation

If you say "7 characters" without showing why, the interviewer doesn't know if you derived it or memorized it. Spend 90 seconds on the math — it's easy points.

Mistake 2: Using UUID as the Short Code

UUIDs are 128 bits. Base62-encoded, that's 22 characters — not "short" at all. If you suggest UUID, immediately acknowledge this problem and pivot to Snowflake.

Mistake 3: Ignoring the Read-Heavy Nature

This system is 10:1 or 100:1 reads to writes. If your design spends equal time on the write and read paths, you've misjudged the workload. The cache strategy deserves more airtime than the write path.

Mistake 4: "I'd Use NoSQL Because It Scales Better"

This is a red flag in any system design interview. Explain the access pattern first, then match it to a database. Key-value lookups happen to favor NoSQL at extreme scale — but say it that way, not as a blanket statement.

Mistake 5: Forgetting About Analytics Being Async

If you say "on every redirect, write a row to the analytics table," you've just made your redirect latency dependent on database write speed. Analytics must be async — publish to a queue (Kafka, SQS), consume and write in the background.

Your 35-Minute Interview Plan

| Time | What to Do |
| --- | --- |
| 0-3 min | Requirements: scale, analytics, expiration, custom aliases |
| 3-5 min | Estimation: QPS, storage, short code length derivation |
| 5-12 min | High-level design: API, 301/302 decision, architecture diagram |
| 12-22 min | Deep dive: encoding approach, unique ID generation (Snowflake) |
| 22-28 min | Database choice with justification, caching strategy |
| 28-32 min | Analytics (async), custom aliases, edge cases |
| 32-35 min | Summary table, state trade-offs, what you'd improve |
The URL shortener seems simple, but it touches ID generation, caching, database selection, and distributed systems. The companies that ask it want to see structured thinking and trade-off awareness — not a memorized solution. Practice explaining your reasoning out loud, not just the architecture.