Design a URL Shortener
Who Asks This Question?
The URL shortener is arguably the most common system design interview question. It's a "warm-up" problem at senior levels but can be the entire round at mid-level. Companies that frequently ask it:
- Amazon — Appears in SDE II and senior system design rounds
- Google — Often as the first of two design questions in a 45-minute block
- Oracle — Reported on Glassdoor for Principal Software Engineer interviews
- Microsoft — Used for both SDE and senior-level system design
- Meta — A classic in their system design loop
- LinkedIn — The URL shortener maps directly to their link tracking infrastructure
It's also a popular choice at startups and mid-size companies because it's accessible enough to evaluate candidates who haven't memorized the "Design YouTube" playbook, while still testing distributed systems fundamentals.
Why do so many companies ask this? Because it's deceptively simple. "Store a mapping between two strings" sounds trivial — until you realize you need to generate unique keys across a distributed fleet, handle billions of reads, and the service must never go down. The gap between a junior answer and a senior answer is enormous.
What the Interviewer Is Really Testing
The URL shortener is not about URL shortening. It's a proxy for testing these core skills:
| Skill Being Tested | How It Manifests |
|---|---|
| Estimation | Can you size the system and justify your short code length? |
| Data modeling under constraints | Key-value access pattern → database and cache choices |
| Unique ID generation | The hardest distributed systems sub-problem hidden in a "simple" question |
| Read-heavy optimization | 10:1 or 100:1 read-to-write ratio → cache strategy |
| Trade-off communication | 301 vs 302, SQL vs NoSQL, hash vs sequential — every choice has a cost |
The #1 scoring differentiator: Can you explain why you chose something, or do you just state what you chose? "I'd use DynamoDB" gets zero points. "Our primary access pattern is a point lookup by short code — a key-value pattern. At billions of records, DynamoDB gives single-digit millisecond reads without us managing sharding" gets full marks.
Step 1: Clarify Requirements
Questions That Change Your Design
"What's the expected scale?" This determines everything. 1,000 URLs/day is a single PostgreSQL instance. 100 million URLs/day requires distributed ID generation, sharding, and a cache layer. Ask this first.
"Do we need analytics?" If yes, you must use HTTP 302 (temporary redirect) so every click hits your server. If no, HTTP 301 (permanent redirect) reduces load dramatically because browsers cache it. This single requirement changes your redirect strategy, server load calculations, and whether you need a message queue.
"Should the same long URL always produce the same short URL?" This sounds minor but changes your encoding strategy entirely. If yes, you need a hash-based approach (or a dedup lookup). If no, you can use sequential IDs which are simpler and collision-free.
"Do short URLs expire?" Expiration means you need a TTL column, a cleanup job, and the ability to recycle short codes. No expiration is simpler but means your storage grows forever.
State Your Requirements
After discussion, explicitly write these down (interviewers notice):
Functional:
- Given a long URL, return a unique short URL
- Given a short URL, redirect to the original
- Optional: custom aliases, click analytics
Non-Functional:
- High availability — redirects must work 99.99% of the time
- Low latency — redirects under 100ms
- Short codes must be as short as possible while supporting the required scale
- Once created, a short URL must never break (durability)
Back-of-the-Envelope Estimation
This is a step many candidates either skip entirely (bad) or spend too long on (also bad). Keep it to 2 minutes:
Given: 100M new URLs per day
| Metric | Calculation | Result |
|---|---|---|
| Write QPS | 100M / 86,400 | ~1,200 writes/sec |
| Read QPS (10:1 ratio) | 1,200 * 10 | ~12,000 reads/sec |
| Storage per year | 100M * 365 * 500 bytes | ~18 TB/year |
| Short code length | 62^7 = 3.5 trillion codes | 7 chars lasts 95+ years |
Why 7 characters? With base62 encoding (a-z, A-Z, 0-9), each character has 62 possible values. 62^6 = 56 billion (would run out in ~1.5 years at 100M/day). 62^7 = 3.5 trillion (lasts nearly a century). Seven characters it is.
Strong candidates derive the short code length from the scale estimate. Weak candidates just say "7 characters" without showing why. The math takes 30 seconds and demonstrates analytical thinking.
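The derivation above can be sanity-checked in a few lines of Java (the 100M/day figure is the given from the estimate; the method name is ours):

```java
public class CapacityMath {
    // Years until a base62 code space of the given length is exhausted,
    // at a fixed rate of new URLs per day.
    public static double yearsOfCapacity(int codeLength, long urlsPerDay) {
        double capacity = Math.pow(62, codeLength);
        return capacity / ((double) urlsPerDay * 365);
    }

    public static void main(String[] args) {
        System.out.printf("6 chars: %.1f years%n", yearsOfCapacity(6, 100_000_000L)); // ~1.6
        System.out.printf("7 chars: %.1f years%n", yearsOfCapacity(7, 100_000_000L)); // ~96.5
    }
}
```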
Step 2: High-Level Design
API Design
Keep it simple — two endpoints:
POST /api/v1/shorten
Body: { "longUrl": "https://example.com/very/long/path..." }
Response: { "shortUrl": "https://short.ly/a3Bc9xZ" }
GET /{shortCode}
Response: HTTP 302 → Location: https://example.com/very/long/path...
The 301 vs 302 Decision
This is a trick question that interviewers love because it reveals whether you've thought about analytics:
| Aspect | 301 (Permanent) | 302 (Temporary) |
|---|---|---|
| Browser behavior | Caches the redirect forever | Hits your server every time |
| Server load | Much lower | Higher |
| Analytics | Can't track repeat clicks | Tracks every click |
| SEO | Passes link equity to destination | Retains link equity on short URL |
The right answer depends on your requirements. If analytics matters (Bit.ly's entire business model), use 302. If you're building an internal link shortener and want minimal infrastructure, use 301.
Most candidates just say "302 for analytics." A stronger answer: "I'd default to 302 because analytics is typically expected, and it gives us flexibility to add features like A/B testing redirects or link expiration later. If we later find the server load is problematic, we can switch specific high-traffic links to 301."
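As a sketch of the redirect endpoint itself, using only the JDK's built-in com.sun.net.httpserver (the in-memory map and the example mapping are illustrative stand-ins for the real cache/database lookup):

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.util.Map;

public class RedirectServer {
    // Illustrative in-memory mapping; production code would hit the cache/DB.
    static final Map<String, String> MAPPINGS =
        Map.of("a3Bc9xZ", "https://example.com/very/long/path");

    public static HttpServer start(int port) throws java.io.IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            String code = exchange.getRequestURI().getPath().substring(1);
            String target = MAPPINGS.get(code);
            if (target == null) {
                exchange.sendResponseHeaders(404, -1); // -1 means no response body
            } else {
                exchange.getResponseHeaders().add("Location", target);
                exchange.sendResponseHeaders(302, -1); // 302 so every click reaches us
            }
            exchange.close();
        });
        server.start();
        return server;
    }
}
```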
Architecture Overview
Write path:
Client → Load Balancer → API Server → ID Generator → Database → (warm cache) → Response
Read path:
Client → Load Balancer → API Server → Cache (hit?) → Database (miss) → Response + cache backfill
Components:
- Load Balancer — Round-robin across stateless API servers
- API Servers — Stateless, horizontally scalable
- ID Generator — The hardest sub-problem (see deep dive)
- Database — Persistent URL mappings
- Cache (Redis) — Hot URL lookups for the read path
Step 3: Deep Dive — The Encoding Problem
This is the core technical challenge and where the interview gets interesting. There are three approaches, each with real trade-offs.
Approach 1: Base62 Encode a Unique ID
Generate a unique numeric ID (e.g., 123456789), then convert it to a base62 string ("8m0Kx").
public class Base62 {
    private static final String CHARS =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Peel off the least-significant base62 digit each iteration, then
    // reverse, since digits come out lowest-first.
    public static String encode(long num) {
        if (num == 0) return "0";
        StringBuilder sb = new StringBuilder();
        while (num > 0) {
            sb.append(CHARS.charAt((int) (num % 62)));
            num /= 62;
        }
        return sb.reverse().toString();
    }

    // Horner's method: each character shifts the accumulated value left
    // by one base62 digit.
    public static long decode(String code) {
        long result = 0;
        for (char c : code.toCharArray()) {
            result = result * 62 + CHARS.indexOf(c);
        }
        return result;
    }
}
Pros: Zero collisions (every ID maps to a unique string). Deterministic. Simple.
Cons: Short codes are sequential — users can guess nearby URLs by incrementing. This is a security concern if URLs contain private content.
When to pick it: When you have a reliable unique ID generator and predictability isn't a concern.
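A quick round trip reproduces the "8m0Kx" example from the start of this section (the encode/decode logic is copied inline from the class above so the snippet runs standalone):

```java
public class Base62Demo {
    static final String CHARS =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Same algorithm as the Base62 class above, inlined for a standalone demo.
    static String encode(long num) {
        if (num == 0) return "0";
        StringBuilder sb = new StringBuilder();
        for (; num > 0; num /= 62) sb.append(CHARS.charAt((int) (num % 62)));
        return sb.reverse().toString();
    }

    static long decode(String code) {
        long result = 0;
        for (char c : code.toCharArray()) result = result * 62 + CHARS.indexOf(c);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode(123456789L)); // "8m0Kx"
        System.out.println(decode("8m0Kx"));    // 123456789
    }
}
```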
Approach 2: Hash the URL and Truncate
Hash the long URL (e.g., SHA-256) and take the first 7 characters of the base62-encoded hash.
Pros: Same URL always produces the same short code (built-in deduplication). No ID generator needed.
Cons: Collisions are inevitable. With 7 characters (3.5T combinations) and billions of URLs, you'll eventually get two different URLs producing the same 7-char prefix. You need collision resolution — check the database, and if the code exists, append a counter and rehash.
When to mention it: Show the interviewer you know this approach exists and why it's problematic at scale. Then explain why you prefer approach 1.
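A sketch of hash-and-truncate, assuming SHA-256 and one common retry scheme (append a counter to the input before rehashing). Mapping digest bytes directly into base62 characters is one simple variant of "truncate the encoded hash":

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashShortener {
    static final String CHARS =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // SHA-256 the URL (plus a retry salt on collision), then map the first
    // 7 digest bytes into base62 characters. Deterministic: the same URL
    // and retry count always yield the same code.
    public static String shortCode(String longUrl, int retry) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            String input = retry == 0 ? longUrl : longUrl + "#" + retry;
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(7);
            for (int i = 0; i < 7; i++) {
                sb.append(CHARS.charAt(Math.floorMod(digest[i], 62)));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }
}
```

The caller checks the database for the generated code and, on a collision, calls again with retry + 1.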
Approach 3: Pre-Generated Key Service
Generate millions of random 7-character codes offline, store them in a database with two tables: unused_keys and used_keys. When a URL needs shortening, grab a key from the unused pool.
Pros: Zero runtime computation. No collisions. Codes are random (not guessable).
Cons: Requires a separate service. What happens when two servers grab the same key simultaneously? You need synchronization — typically each server batch-loads keys into memory (e.g., 1,000 at a time) and marks them as reserved.
When to mention it: As an optimization for high-write scenarios where you need non-sequential codes and want to eliminate any possibility of collision at write time.
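The per-server batching can be sketched with a local pool. The batchSupplier below is a stand-in for a hypothetical query against the unused_keys table that atomically marks rows as reserved (e.g. SELECT ... FOR UPDATE SKIP LOCKED in PostgreSQL):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.IntFunction;

public class KeyPool {
    private final Deque<String> local = new ArrayDeque<>();
    private final IntFunction<List<String>> batchSupplier; // fetches + reserves n keys
    private final int batchSize;

    public KeyPool(IntFunction<List<String>> batchSupplier, int batchSize) {
        this.batchSupplier = batchSupplier;
        this.batchSize = batchSize;
    }

    // Hand out the next pre-generated key, refilling from the shared
    // store only when the local batch is exhausted.
    public synchronized String nextKey() {
        if (local.isEmpty()) {
            local.addAll(batchSupplier.apply(batchSize));
        }
        return local.poll();
    }
}
```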
In most interviews, pick Base62 + unique ID and explain why. Mention hashing to show awareness of alternatives. Bring up key pre-generation only if the interviewer pushes on predictability or write performance.
Step 4: Deep Dive — Unique ID Generation
If you chose base62 encoding (which you should as your default), the follow-up question is immediate: "How do you generate globally unique IDs across a distributed fleet?"
This sub-problem is where the interview really tests your distributed systems knowledge.
| Approach | Throughput | Coordination | Ordering | Drawback |
|---|---|---|---|---|
| Auto-increment DB | Moderate | Centralized | Strict | Single point of failure |
| UUID | Unlimited | None | None | 128 bits — way too long for short URLs |
| Snowflake ID | High | Minimal | Time-sorted | Requires clock sync |
| Redis INCR | Very high | Centralized | Strict | Redis becomes a hard dependency |
| Range-based allocation | High | Periodic | Within range | Gaps in IDs, wasted ranges on crash |
Snowflake ID (The Standard Answer)
Twitter created Snowflake to generate 64-bit unique IDs without coordination between servers. The bit layout:
| 1 bit unused | 41 bits timestamp | 10 bits machine ID | 12 bits sequence |
- 41-bit timestamp (ms since custom epoch): ~69 years of unique timestamps
- 10-bit machine ID: 1,024 machines
- 12-bit sequence: 4,096 IDs per millisecond per machine
That's 4 million IDs/second per machine with no coordination.
public class SnowflakeGenerator {
private static final long EPOCH = 1704067200000L; // 2024-01-01
private static final int MACHINE_BITS = 10;
private static final int SEQUENCE_BITS = 12;
private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;
private final long machineId;
private long lastTimestamp = -1;
private long sequence = 0;
public SnowflakeGenerator(long machineId) {
if (machineId < 0 || machineId >= (1L << MACHINE_BITS)) {
throw new IllegalArgumentException(
"Machine ID must be between 0 and " + ((1L << MACHINE_BITS) - 1));
}
this.machineId = machineId;
}
public synchronized long nextId() {
long timestamp = System.currentTimeMillis();
if (timestamp < lastTimestamp) {
throw new RuntimeException(
"Clock moved backwards by " + (lastTimestamp - timestamp) + "ms");
}
if (timestamp == lastTimestamp) {
sequence = (sequence + 1) & MAX_SEQUENCE;
if (sequence == 0) {
// Sequence exhausted for this ms — wait
while (timestamp <= lastTimestamp) {
timestamp = System.currentTimeMillis();
}
}
} else {
sequence = 0;
}
lastTimestamp = timestamp;
return ((timestamp - EPOCH) << (MACHINE_BITS + SEQUENCE_BITS))
| (machineId << SEQUENCE_BITS)
| sequence;
}
}
Notice the clock-backwards check. This is a real production concern — NTP adjustments can move the clock back. Weak candidates ignore this entirely. Strong candidates mention it and explain the mitigation: throw an exception and alert ops, or use a monotonic clock source.
Range-Based Allocation (Simpler Alternative)
If Snowflake feels complex for the interview, a simpler approach: a coordinator service hands out ranges. Server A gets IDs 1-10,000. Server B gets 10,001-20,000. Each server increments locally within its range. When a range runs out, request a new one.
Pros: Dead simple. No clock dependency. Cons: IDs aren't time-sorted. Wasted IDs if a server crashes mid-range.
Mention this as an alternative to show you can simplify when appropriate.
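A minimal sketch of range-based allocation, with a local AtomicLong standing in for the coordinator (in production this would be a transactional update against a counters table or a small coordination service):

```java
import java.util.concurrent.atomic.AtomicLong;

public class RangeIdGenerator {
    private final AtomicLong coordinator; // stand-in for the central range allocator
    private final long rangeSize;
    private long next;     // next ID to hand out
    private long rangeEnd; // exclusive end of the current range

    public RangeIdGenerator(AtomicLong coordinator, long rangeSize) {
        this.coordinator = coordinator;
        this.rangeSize = rangeSize;
        this.next = 0;
        this.rangeEnd = 0; // forces a range fetch on first use
    }

    public synchronized long nextId() {
        if (next >= rangeEnd) {
            // Claim the next block; only this call touches the coordinator.
            next = coordinator.getAndAdd(rangeSize);
            rangeEnd = next + rangeSize;
        }
        return next++;
    }
}
```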
Step 5: Deep Dive — Database and Cache
Database Choice
Your access pattern is simple: write a (short_code, long_url) pair, read by short_code. This is a textbook key-value workload.
Relational DB (PostgreSQL/MySQL):
Works fine up to a few hundred million records. Strong consistency, familiar tooling, easy joins for analytics. Shard by short_code hash when you outgrow a single instance.
NoSQL (DynamoDB/Cassandra): Natural fit for key-value at massive scale. DynamoDB gives single-digit millisecond reads at any scale without managing shards yourself. But you lose ad-hoc queries and joins.
The answer interviewers want: "I'd start with PostgreSQL because it handles our initial scale, supports analytics queries, and our team knows it. If we grow beyond a single instance, our access pattern is pure key-value, which makes migration to DynamoDB straightforward."
Schema
CREATE TABLE url_mappings (
id BIGINT PRIMARY KEY, -- Snowflake ID
  short_code VARCHAR(30) UNIQUE NOT NULL, -- 7 generated chars; up to 30 for custom aliases
long_url TEXT NOT NULL,
user_id BIGINT, -- nullable, for user-owned links
created_at TIMESTAMP DEFAULT NOW()
);
-- Optional: analytics
CREATE TABLE click_events (
id BIGINT PRIMARY KEY,
  short_code VARCHAR(30) NOT NULL,
clicked_at TIMESTAMP NOT NULL,
ip_address INET,
user_agent TEXT,
referrer TEXT
);
Caching Strategy
Redirects follow a power-law distribution: the top 20% of URLs handle 80% of traffic. This is an ideal caching scenario.
Cache-aside pattern:
- Check Redis for the short_code
- Cache hit → return the long URL immediately
- Cache miss → query database, store result in Redis with TTL, return
public class RedirectService {
    private final JedisPool redis;
    private final UrlRepository db;

    public RedirectService(JedisPool redis, UrlRepository db) {
        this.redis = redis;
        this.db = db;
    }

    public String resolve(String shortCode) {
        // 1. Try the cache first; most requests stop here.
        try (Jedis jedis = redis.getResource()) {
            String cached = jedis.get("url:" + shortCode);
            if (cached != null) return cached;
        }
        // 2. Cache miss: fall back to the database.
        String longUrl = db.findByShortCode(shortCode);
        if (longUrl == null) return null; // unknown code; caller returns 404
        // 3. Backfill the cache with a 24-hour TTL.
        try (Jedis jedis = redis.getResource()) {
            jedis.setex("url:" + shortCode, 86400, longUrl);
        }
        return longUrl;
    }
}
Key decisions to discuss:
- TTL: 24 hours is reasonable. Popular URLs get refreshed before expiry from ongoing traffic.
- Eviction: LRU — least popular URLs get evicted first, keeping the hot set in cache.
- Cache size: With 20% of URLs serving 80% of reads, you need to cache far less than the total dataset. At 500 bytes per entry, 10 million cached URLs = ~5 GB — fits easily in a single Redis instance.
- Cache warming on write: When a new URL is shortened, proactively write it to cache. First-time access won't be a cold miss.
Step 6: Wrap Up and Trade-offs
The Summary Table
| Decision | Choice | Why |
|---|---|---|
| Short code generation | Base62 of Snowflake ID | No collisions, distributed, simple |
| Code length | 7 characters | 3.5 trillion codes, lasts 95+ years at 100M/day |
| Redirect type | HTTP 302 | Enables analytics, maintains flexibility |
| Database | PostgreSQL → DynamoDB at scale | Start simple, migrate when access pattern validates |
| Cache | Redis, LRU, 24h TTL | Power-law distribution makes caching highly effective |
| Analytics | Async via message queue | Never slow down redirects for analytics writes |
Follow-Up Questions to Prepare For
"What if someone shortens a malicious URL?" URL validation matters. Check against a blocklist of known malicious domains (Google Safe Browsing API). Scan for phishing patterns. Rate-limit the shortening endpoint to prevent abuse.
"How would you handle a URL that goes viral — 1 million clicks per second?" Your cache handles this. Redis can serve 100K+ reads per second on a single instance. Multiple Redis replicas behind a load balancer can handle millions. The database never sees the traffic because the cache hit rate for a viral URL is effectively 100%.
"What about custom aliases — short.ly/my-brand?"
Check if the alias is already taken (DB lookup). Validate format (alphanumeric, 3-30 chars). Reserve certain aliases to prevent brand abuse ("admin", "api", "help"). Custom aliases bypass the ID generator entirely — they're a direct insert with the user's chosen code.
"How would you implement link expiration?"
Add an expires_at column. On redirect, check whether the link has expired. Run a background cleanup job to delete expired entries and free cache space. Consider: should expired short codes be recyclable? Most services say no — broken links are worse than running out of codes.
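The redirect-time check is a one-liner once the column exists. A sketch with a hypothetical Link record, where a null expires_at means "never expires":

```java
import java.time.Instant;

public class ExpiryCheck {
    // Hypothetical row shape: the long URL plus an optional expiry.
    public record Link(String longUrl, Instant expiresAt) {}

    // Returns the target URL, or null if the link is missing or expired
    // (the caller translates null into a 404/410 response).
    public static String resolve(Link link, Instant now) {
        if (link == null) return null;
        if (link.expiresAt() != null && now.isAfter(link.expiresAt())) return null;
        return link.longUrl();
    }
}
```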
"How do you shard the database?"
Shard by the hash of short_code. This distributes reads and writes evenly because short codes are already distributed (base62 of Snowflake IDs). Don't shard by user_id — one power user could hot-spot a shard, and reads are by short_code, not user_id.
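The routing function can be a stable hash modulo the shard count. Note that naive modulo forces a full reshuffle whenever numShards changes (consistent hashing avoids that), but it shows the idea:

```java
public class ShardRouter {
    // Stable, deterministic shard index for a short code.
    public static int shardFor(String shortCode, int numShards) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(shortCode.hashCode(), numShards);
    }
}
```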
Common Mistakes
Mistake 1: Skipping Estimation
If you say "7 characters" without showing why, the interviewer doesn't know if you derived it or memorized it. Spend 90 seconds on the math — it's easy points.
Mistake 2: Using UUID as the Short Code
UUIDs are 128 bits. Base62-encoded, that's 22 characters — not "short" at all. If you suggest UUID, immediately acknowledge this problem and pivot to Snowflake.
Mistake 3: Ignoring the Read-Heavy Nature
This system is 10:1 or 100:1 reads to writes. If your design spends equal time on the write and read paths, you've misjudged the workload. The cache strategy deserves more airtime than the write path.
Mistake 4: "I'd Use NoSQL Because It Scales Better"
This is a red flag in any system design interview. Explain the access pattern first, then match it to a database. Key-value lookups happen to favor NoSQL at extreme scale — but say it that way, not as a blanket statement.
Mistake 5: Forgetting About Analytics Being Async
If you say "on every redirect, write a row to the analytics table," you've just made your redirect latency dependent on database write speed. Analytics must be async — publish to a queue (Kafka, SQS), consume and write in the background.
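The decoupling can be sketched with a BlockingQueue standing in for Kafka or SQS: the redirect path only enqueues, and a background consumer does the slow write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ClickTracker {
    public record ClickEvent(String shortCode, long timestampMillis) {}

    private final BlockingQueue<ClickEvent> queue = new LinkedBlockingQueue<>(100_000);
    public final List<ClickEvent> persisted = new ArrayList<>(); // stand-in for the analytics DB

    // Called on the redirect path: O(1) and never blocks on the database.
    // offer() drops the event if the queue is full; losing a click beats
    // slowing down a redirect.
    public boolean track(String shortCode) {
        return queue.offer(new ClickEvent(shortCode, System.currentTimeMillis()));
    }

    // Run by a background consumer thread (or a Kafka consumer in production).
    public void drainOnce() throws InterruptedException {
        ClickEvent event = queue.take();
        persisted.add(event); // the slow DB write happens here, off the hot path
    }
}
```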
Your 35-Minute Interview Plan
| Time | What to Do |
|---|---|
| 0-3 min | Requirements: scale, analytics, expiration, custom aliases |
| 3-5 min | Estimation: QPS, storage, short code length derivation |
| 5-12 min | High-level design: API, 301/302 decision, architecture diagram |
| 12-22 min | Deep dive: encoding approach, unique ID generation (Snowflake) |
| 22-28 min | Database choice with justification, caching strategy |
| 28-32 min | Analytics (async), custom aliases, edge cases |
| 32-35 min | Summary table, state trade-offs, what you'd improve |
The URL shortener seems simple, but it touches ID generation, caching, database selection, and distributed systems. The companies that ask it want to see structured thinking and trade-off awareness — not a memorized solution. Practice explaining your reasoning out loud, not just the architecture.