Design a Notification System | System Design Course

Who Asks This Question?

The notification system is a deceptively complex question that appears at companies where real-time communication matters. Based on interview reports, it's frequently asked at:

Meta — Their entire ecosystem (Facebook, Instagram, WhatsApp) runs on notifications
Slack — Real-time messaging is their core business; they test notification fanout extensively
LinkedIn — Job alerts, connection requests, and activity notifications at massive scale
Airbnb — Booking confirmations, host communications, and travel reminders across multiple channels
Uber — Trip updates, driver matching, and payment notifications with tight delivery SLAs
Discord — Real-time chat notifications and presence updates for millions of concurrent users
Stripe — Payment confirmations, webhook deliveries, and fraud alerts via multiple channels
Zoom — Meeting reminders, recording notifications, and real-time collaboration alerts

This question tests whether you've dealt with real-time systems and understand the complexity of reliable delivery across different channels. Companies that ask it want to see that you can design for both scale and reliability — notifications must be fast but never lost.

What the Interviewer Is Really Testing

Most candidates focus on the basic "send message to user" flow and miss the hard distributed systems problems. Here's what interviewers actually evaluate:

Evaluation Area	Weight	What They're Looking For
Requirements gathering	20%	Do you ask about notification types, delivery guarantees, and failure scenarios?
Multi-channel design	25%	Push, email, SMS, in-app — each has different constraints and delivery mechanisms
Fan-out strategies	25%	How do you handle one message going to millions of users efficiently?
Delivery guarantees	20%	At-least-once vs exactly-once, handling failures, retry logic
Production concerns	10%	Rate limiting, priority queues, monitoring, user preferences

The #1 reason candidates struggle: they describe a single notification channel (usually push) and ignore the multi-channel complexity. Real notification systems must coordinate delivery across push notifications, email, SMS, and in-app messages — each with different latency, cost, and reliability characteristics.

Step 1: Clarify Requirements

Questions That Define Your Architecture

"What types of notifications do we need to support?" This determines your entire system design. Different channels have vastly different requirements:

Push notifications: Fast (seconds), mobile-focused, limited payload size
Email: Slower (minutes), rich content, high deliverability standards
SMS: Expensive, ultra-reliable, character limits, regulatory constraints
In-app notifications: Real-time, supports rich media, only works when user is active

"What's the expected scale?" Changes everything from database choice to fanout strategy:

10,000 users: Single server with simple message queues
10 million users: Distributed queues, database sharding, rate limiting
1 billion users: Global distribution, sophisticated fan-out, priority systems

"What are the delivery guarantee requirements?" This is the most technical question:

At-most-once: Fast, simple, but notifications might be lost
At-least-once: Reliable, but users might receive duplicates
Exactly-once: Complex distributed consensus, usually not worth it

"Do we need analytics and delivery tracking?" If yes, you need to track delivery status, open rates, and click-through rates. This adds event logging, analytics pipelines, and potentially webhook callbacks to third parties.

Functional Requirements

After clarifying, state what you're building:

Core functionality:

Send notifications via multiple channels (push, email, SMS, in-app)
Support different notification types: transactional, promotional, system alerts
Handle both single-user and broadcast notifications
Track delivery status and user engagement
User preference management (opt-out, channel selection, frequency limits)

Non-functional:

High availability — critical notifications (security alerts, payment confirmations) must be delivered
Low latency — real-time notifications within seconds for active users
Scale — handle millions of users and billions of notifications per day
Cost optimization — SMS is expensive (~$0.01 per message), optimize channel selection
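
To make the scale and cost targets concrete, a rough back-of-envelope calculation helps. The sketch below is illustrative only: the daily volume, peak multiplier, and SMS share are hypothetical inputs you would confirm with the interviewer, and only the ~$0.01 SMS price comes from the list above.

```java
// Back-of-envelope sizing. Every input except the SMS price is a hypothetical assumption.
class CapacityEstimate {
    static final long SECONDS_PER_DAY = 86_400L;

    // Average notifications per second for a given daily volume (integer division).
    static long averagePerSecond(long notificationsPerDay) {
        return notificationsPerDay / SECONDS_PER_DAY;
    }

    // Peak rate, assuming traffic bursts to peakMultiplier times the average.
    static long peakPerSecond(long notificationsPerDay, int peakMultiplier) {
        return averagePerSecond(notificationsPerDay) * peakMultiplier;
    }

    // Daily SMS spend in dollars if smsPercent percent of all notifications go out as SMS.
    static double dailySmsCostDollars(long notificationsPerDay, int smsPercent, double pricePerSms) {
        return notificationsPerDay * (smsPercent / 100.0) * pricePerSms;
    }
}
```

With 1 billion notifications per day this gives roughly 11,574 per second on average, and sending just 1% of them as SMS at $0.01 each would cost about $100,000 per day, which is why channel selection shows up as a requirement.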

Step 2: High-Level Design

API Design

POST /api/v1/notifications/send
Body: {
  "userId": "user123",
  "type": "payment_confirmation", 
  "channels": ["push", "email"],
  "priority": "high",
  "payload": {
    "title": "Payment Successful",
    "message": "Your payment of $25.99 has been processed",
    "deepLink": "/transactions/tx_abc123"
  }
}

POST /api/v1/notifications/broadcast  
Body: {
  "audience": {
    "segment": "premium_users",
    "filters": {"country": "US", "active_last_30_days": true}
  },
  "type": "product_announcement",
  "channels": ["push", "email"],
  "payload": {...}
}

GET /api/v1/notifications/{userId}
Response: [...] // user's notification history

PUT /api/v1/users/{userId}/preferences
Body: {
  "email_marketing": false,
  "push_transactional": true,
  "sms_critical_only": true
}

System Architecture

Notification API
    |
    v
[Message Queue] → [Channel Workers] → [Third-party Services]
    |                    |                     |
    |                    |→ Push Worker → FCM/APNs
    |                    |→ Email Worker → SendGrid/SES
    |                    |→ SMS Worker → Twilio/AWS SNS
    |                    |→ In-App Worker → WebSocket/SSE
    |
    v
[Database] ← [Analytics Service]

Request flow:

API server receives notification request
Validate user preferences and notification type
Enqueue messages to channel-specific queues
Workers consume from queues and call third-party APIs
Track delivery status and update analytics
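
The request flow above can be sketched in a few lines. This is a minimal in-memory sketch, not a production design: `Channel`, `ChannelMessage`, and `NotificationRouter` are hypothetical names, and the `ArrayDeque` queues stand in for whatever broker you choose (one topic or queue per channel).

```java
import java.util.ArrayDeque;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

enum Channel { PUSH, EMAIL, SMS, IN_APP }

// One unit of work for a channel worker (hypothetical shape).
record ChannelMessage(String userId, Channel channel, String body) {}

class NotificationRouter {
    // In-memory stand-ins for the per-channel queues.
    private final Map<Channel, Queue<ChannelMessage>> queues = new EnumMap<>(Channel.class);

    NotificationRouter() {
        for (Channel c : Channel.values()) {
            queues.put(c, new ArrayDeque<>());
        }
    }

    // Validate against preferences, then enqueue one message per remaining channel.
    // Returns how many messages were enqueued.
    int route(String userId, List<Channel> requested, Set<Channel> enabledByUser, String body) {
        int enqueued = 0;
        for (Channel c : requested) {
            if (!enabledByUser.contains(c)) {
                continue; // user opted out of this channel
            }
            queues.get(c).add(new ChannelMessage(userId, c, body));
            enqueued++;
        }
        return enqueued;
    }

    // A worker for the given channel takes its next message, or null if the queue is empty.
    ChannelMessage poll(Channel channel) {
        return queues.get(channel).poll();
    }
}
```

Keeping one queue per channel means a slow email provider cannot delay push delivery, since each worker pool drains its own queue.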

Channel-Specific Considerations

Channel	Latency	Cost	Payload Size	Reliability	Use Case
Push	~2 seconds	Free	4KB	95% delivery	Real-time alerts
Email	~30 seconds	$0.0001	Unlimited	99% delivery	Rich content, receipts
SMS	~5 seconds	$0.01	160 chars	99.9% delivery	Critical alerts
In-app	~1 second	Free	Unlimited	Only if user active	Activity feeds
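
One way to use this table is as input to a channel selector. The sketch below assumes the latency and cost figures from the table (treat them as rough planning numbers) and uses hypothetical names `DeliveryChannel` and `ChannelSelector`; it picks the cheapest allowed channel that can meet a latency deadline.

```java
import java.util.Comparator;
import java.util.Optional;
import java.util.Set;

// Latency and cost figures copied from the table above.
enum DeliveryChannel {
    PUSH(2, 0.0),
    EMAIL(30, 0.0001),
    SMS(5, 0.01),
    IN_APP(1, 0.0);

    final int typicalLatencySeconds;
    final double costPerMessageDollars;

    DeliveryChannel(int typicalLatencySeconds, double costPerMessageDollars) {
        this.typicalLatencySeconds = typicalLatencySeconds;
        this.costPerMessageDollars = costPerMessageDollars;
    }
}

class ChannelSelector {
    // Cheapest allowed channel whose typical latency fits the deadline.
    // Ties on cost are broken by lower latency. Empty if nothing qualifies.
    static Optional<DeliveryChannel> cheapestWithin(int maxLatencySeconds,
                                                    Set<DeliveryChannel> allowed) {
        return allowed.stream()
            .filter(c -> c.typicalLatencySeconds <= maxLatencySeconds)
            .min(Comparator
                .comparingDouble((DeliveryChannel c) -> c.costPerMessageDollars)
                .thenComparingInt(c -> c.typicalLatencySeconds));
    }
}
```

A real selector would also weigh reliability and whether the user is currently active, which this sketch leaves out.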

Step 3: Deep Dive — Fan-out Strategies

This is the core technical challenge that separates strong candidates from average ones. How do you efficiently deliver one message to millions of users?

Push Model vs Pull Model

Push Model (Write Fanout): When a notification is triggered, immediately write it to all recipients' individual queues.

Example: A popular user posts an update that needs to notify 10 million followers. Push model immediately writes 10 million entries.

User A posts → Fan out to 10M queues → Each follower's queue gets the message

Pros: Fast delivery, since users get notifications immediately
Cons: High write volume, expensive storage, potential hot-spotting

Pull Model (Read Fanout): Store the notification once, and users pull/fetch relevant notifications when they become active.

User A posts → Store once in global feed → Users pull relevant messages when active

Pros: Efficient storage, handles inactive users well
Cons: Higher latency, complex ranking/filtering logic
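
The difference in write cost between the two models is easy to see in a toy implementation. The sketch below is an in-memory illustration with hypothetical names (`FanoutDemo`, `pushFanout`, `pullStore`, `pullFetch`); real systems would back these maps with a database or cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FanoutDemo {
    // Push model: one inbox per follower, written at send time.
    final Map<String, List<String>> inboxes = new HashMap<>();
    // Pull model: one log per sender, read at fetch time.
    final Map<String, List<String>> senderLogs = new HashMap<>();
    // Counts storage writes so the two models can be compared.
    long writes = 0;

    // Write fanout: cost grows linearly with the number of followers.
    void pushFanout(List<String> followers, String message) {
        for (String follower : followers) {
            inboxes.computeIfAbsent(follower, k -> new ArrayList<>()).add(message);
            writes++;
        }
    }

    // Read fanout, write side: a single write regardless of follower count.
    void pullStore(String sender, String message) {
        senderLogs.computeIfAbsent(sender, k -> new ArrayList<>()).add(message);
        writes++;
    }

    // Read fanout, read side: merge the logs of everyone this user follows.
    List<String> pullFetch(List<String> following) {
        List<String> result = new ArrayList<>();
        for (String sender : following) {
            result.addAll(senderLogs.getOrDefault(sender, List.of()));
        }
        return result;
    }
}
```

Pushing to three followers costs three writes while the pull model costs one; at 10 million followers the same ratio is what makes write fanout impractical for celebrity accounts.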

Hybrid Approach (Production Reality)

Most production systems use a hybrid based on user activity and notification priority:

public class FanoutStrategy {
    private static final int ACTIVE_USER_THRESHOLD = 7; // days
    private static final int VIP_FOLLOWER_LIMIT = 1_000_000;
    
    public FanoutDecision decideFanout(User sender, NotificationType type) {
        if (type == NotificationType.CRITICAL) {
            return FanoutDecision.PUSH_ALL; // Security alerts, payments
        }
        
        List<User> followers = getFollowers(sender);
        List<User> activeFollowers = followers.stream()
            .filter(u -> u.lastActive().isAfter(now().minusDays(ACTIVE_USER_THRESHOLD)))
            .collect(toList());
            
        if (activeFollowers.size() < VIP_FOLLOWER_LIMIT) {
            return FanoutDecision.PUSH_TO_ACTIVE_USERS;
        } else {
            return FanoutDecision.PULL_MODEL; // Celebrity users
        }
    }
}

Strategy rules:

Active users (last 7 days): Push fanout for immediate delivery
Inactive users: Pull model — they'll see notifications when they return
High-follower accounts: Pull model to avoid write amplification
Critical notifications: Always push regardless of fanout cost

Strong candidates explain the trade-off explicitly: "For a user with 50k active followers, I'd use push fanout because the write cost is manageable and delivery is immediate. For a celebrity with 10M followers, I'd use pull model because writing 10M entries per post would overwhelm our write capacity."

Message Queue Architecture

[High Priority Queue] → [Critical Worker] (security, payments)
[Medium Priority Queue] → [Standard Worker] (social, updates)  
[Low Priority Queue] → [Batch Worker] (marketing, newsletters)
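
As a simplified illustration of priority ordering, the sketch below uses a single in-process `PriorityBlockingQueue` instead of three separate broker queues. That is a deliberate simplification, and `Priority`, `PrioritizedNotification`, and `PriorityDispatcher` are hypothetical names.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Declaration order doubles as dequeue order: HIGH is served first.
enum Priority { HIGH, MEDIUM, LOW }

class PrioritizedNotification implements Comparable<PrioritizedNotification> {
    private static final AtomicLong SEQUENCE = new AtomicLong();

    final Priority priority;
    final String body;
    // Monotonic sequence number keeps FIFO order within the same priority.
    final long seq = SEQUENCE.getAndIncrement();

    PrioritizedNotification(Priority priority, String body) {
        this.priority = priority;
        this.body = body;
    }

    @Override
    public int compareTo(PrioritizedNotification other) {
        int byPriority = priority.compareTo(other.priority);
        return byPriority != 0 ? byPriority : Long.compare(seq, other.seq);
    }
}

class PriorityDispatcher {
    private final PriorityBlockingQueue<PrioritizedNotification> queue =
        new PriorityBlockingQueue<>();

    void submit(Priority priority, String body) {
        queue.add(new PrioritizedNotification(priority, body));
    }

    // Returns the most urgent pending body, or null if nothing is queued.
    String next() {
        PrioritizedNotification n = queue.poll();
        return n == null ? null : n.body;
    }
}
```

Strict priority like this can starve low-priority traffic under sustained load, which is one reason to prefer separate queues with dedicated worker pools as drawn above.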

Queue configuration by channel:

queues:
  sms_critical:
    priority: high
    workers: 20
    rate_limit: 100/second  # Regulatory limits
    
  push_realtime:
    priority: medium  
    workers: 50
    rate_limit: 10000/second
    
  email_marketing:
    priority: low
    workers: 10
    rate_limit: 1000/second
    batch_size: 100  # Send in batches for efficiency
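
A `rate_limit` setting like the ones above is commonly enforced with a token bucket. The sketch below is one possible implementation, not the only one; `TokenBucket` is a hypothetical class, and the clock is passed in as a parameter so the behavior is deterministic (production code would typically pass `System.nanoTime()`).

```java
class TokenBucket {
    private final double capacity;        // maximum burst size
    private final double refillPerSecond; // sustained rate
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(double ratePerSecond, double burstCapacity, long nowNanos) {
        this.refillPerSecond = ratePerSecond;
        this.capacity = burstCapacity;
        this.tokens = burstCapacity; // start full
        this.lastRefillNanos = nowNanos;
    }

    // Returns true and consumes one token if a send is allowed at time nowNanos.
    synchronized boolean tryAcquire(long nowNanos) {
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        // Refill in proportion to elapsed time, never exceeding the burst capacity.
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefillNanos = nowNanos;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

A worker would call `tryAcquire` before each provider call and requeue or delay the message when it returns false.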

Step 4: Deep Dive — Delivery Guarantees and Failure Handling

The At-Least-Once Challenge

Most notification systems provide at-least-once delivery — notifications are guaranteed to be delivered but might arrive multiple times.

Implementation pattern:

Store notification in database with status "pending"
Send to third-party service (FCM, SendGrid, Twilio)
On success response, mark as "delivered"
On failure/timeout, retry with exponential backoff
After max retries, mark as "failed" and alert

@Service
public class NotificationDeliveryService {
    private static final int MAX_RETRIES = 3;
    private static final Duration BASE_DELAY = Duration.ofSeconds(5);
    
    public DeliveryResult deliver(Notification notification) {
        for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
            try {
                DeliveryResult result = sendToProvider(notification);
                if (result.isSuccess()) {
                    updateStatus(notification.getId(), NotificationStatus.DELIVERED);
                    return result;
                }
                // Provider answered but reported a failure; fall through to the backoff below
                log.warn("Delivery attempt {} for {} was rejected by the provider",
                    attempt, notification.getId());
            } catch (Exception e) {
                log.warn("Delivery attempt {} failed for {}: {}", 
                    attempt, notification.getId(), e.getMessage());
            }
            
            // Exponential backoff after any failed attempt (error result or exception): 5s, then 10s
            if (attempt < MAX_RETRIES) {
                sleep(BASE_DELAY.multipliedBy(1L << (attempt - 1)));
            }
        }
        
        // All attempts exhausted
        updateStatus(notification.getId(), NotificationStatus.FAILED);
        alertOnFailure(notification);
        return DeliveryResult.failed();
    }
}
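
Because retries can resend a message that the provider actually accepted, at-least-once delivery is usually paired with deduplication. The sketch below shows one way to do this with an idempotency key; `IdempotentSender` is a hypothetical class, and the in-memory set is a stand-in for a shared store with a TTL.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class IdempotentSender {
    // Keys that have already been sent. A real deployment would use a shared
    // store with expiry rather than process memory (assumption for this sketch).
    private final Set<String> sentKeys = ConcurrentHashMap.newKeySet();

    // Runs providerCall at most once per (notificationId, channel) pair.
    // Returns true if the call ran, false if it was skipped as a duplicate.
    boolean sendOnce(String notificationId, String channel, Runnable providerCall) {
        String key = notificationId + ":" + channel;
        if (!sentKeys.add(key)) {
            return false; // duplicate: this key was already claimed
        }
        try {
            providerCall.run();
            return true;
        } catch (RuntimeException e) {
            sentKeys.remove(key); // release the key so a retry can go through
            throw e;
        }
    }
}
```

Claiming the key before the call trades a small risk of a lost message (if the process crashes between the claim and the send) for no duplicates; claiming it after the call makes the opposite trade.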

Handling Third-Party Service Failures

Each notification channel depends on external services that can fail:

Service	Failure Mode	Mitigation
FCM/APNs	Rate limiting, invalid tokens	Retry with backoff, token cleanup
SendGrid/SES	Temporary outages	Multiple provider fallback
Twilio	Account suspension, regional blocks	Pre-approved backup providers

Circuit breaker pattern for external dependencies:

@Component
public class EmailServiceCircuitBreaker {
    // Assumes Resilience4j's CircuitBreaker and synchronous provider clients
    private final CircuitBreaker circuitBreaker = CircuitBreaker.ofDefaults("email-service");
    
    public EmailResult sendEmail(EmailNotification email) {
        try {
            // Route the call through the breaker; when the circuit is open it
            // throws immediately instead of calling the primary provider
            return circuitBreaker.executeSupplier(() -> primaryEmailProvider.send(email));
        } catch (Exception e) {
            log.error("Primary email provider failed, using fallback", e);
            return fallbackEmailProvider.send(email);
        }
    }
}

Dead Letter Queues and Manual Intervention

After all retries fail, notifications go to a dead letter queue for manual investigation:

Failed Notification → Dead Letter Queue → Admin Dashboard → Manual Retry/Investigation

Common failure scenarios requiring manual intervention:

Invalid user data (malformed email addresses, deregistered push tokens)
Third-party service account issues (billing, API limits exceeded)
Compliance violations (user in Do Not Call registry for SMS)
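
The path from exhausted retries to the dead letter queue can be sketched as follows. This is a minimal in-memory sketch with hypothetical names (`DeadLetter`, `RetryingWorker`); backoff between attempts is omitted for brevity, and the `Predicate` stands in for a provider call that returns true on success.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Predicate;

// What an operator would see on the admin dashboard (hypothetical shape).
record DeadLetter(String notificationId, int attempts, String reason) {}

class RetryingWorker {
    final Queue<DeadLetter> deadLetters = new ArrayDeque<>();
    private final int maxAttempts;

    RetryingWorker(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    // Tries the send up to maxAttempts times. On final failure the notification
    // is parked in the dead letter queue with the last error as the reason.
    boolean process(String notificationId, Predicate<String> send) {
        String lastError = "unknown";
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (send.test(notificationId)) {
                    return true;
                }
                lastError = "provider returned failure";
            } catch (RuntimeException e) {
                lastError = String.valueOf(e.getMessage());
            }
        }
        deadLetters.add(new DeadLetter(notificationId, maxAttempts, lastError));
        return false;
    }
}
```

Recording the reason alongside the notification lets the dashboard group failures, for example separating invalid tokens from provider billing problems.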

Step 5: Deep Dive — User Preferences and Rate Limiting

Preference Management

Users must control their notification experience to prevent spam and maintain engagement:

CREATE TABLE user_notification_preferences (
    user_id BIGINT PRIMARY KEY,
    email_marketing BOOLEAN DEFAULT false,
    email_transactional BOOLEAN DEFAULT true,
    push_social BOOLEAN DEFAULT true,
    push_marketing BOOLEAN DEFAULT false,
    sms_critical_only BOOLEAN DEFAULT true,
    frequency_limit_per_hour INTEGER DEFAULT 10,
    quiet_hours_start TIME DEFAULT '22:00',
    quiet_hours_end TIME DEFAULT '07:00',
    timezone VARCHAR(50) DEFAULT 'UTC'
);

Preference enforcement logic:

public boolean shouldSendNotification(User user, NotificationType type, 
                                    NotificationChannel channel) {
    UserPreferences prefs = getPreferences(user.getId());
    
    // Check channel opt-in
    if (!prefs.isChannelEnabled(channel, type)) {
        return false;
    }
    
    // Check frequency limits (critical notifications bypass them)
    int recentCount = countRecentNotifications(user.getId(), 
                                             Duration.ofHours(1));
    if (!type.isCritical() && recentCount >= prefs.getHourlyLimit()) {
        return false;
    }
    
    // Check quiet hours (convert to user's timezone)
    LocalTime now = LocalTime.now(ZoneId.of(prefs.getTimezone()));
    if (isInQuietHours(now, prefs) && !type.isCritical()) {
        return false;
    }
    
    return true;
}
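
The quiet-hours check deserves care because the default window (22:00 to 07:00) wraps past midnight, so a plain "start <= now < end" comparison would never match. The sketch below is one way to write the helper; the `QuietHours` class and the three-argument signature are assumptions for illustration, not the signature used in the snippet above.

```java
import java.time.LocalTime;

class QuietHours {
    // True if now falls in [start, end). Handles windows that wrap past midnight,
    // such as 22:00 to 07:00.
    static boolean isInQuietHours(LocalTime now, LocalTime start, LocalTime end) {
        if (start.equals(end)) {
            return false; // assumption: an empty window means quiet hours are disabled
        }
        if (start.isBefore(end)) {
            // Same-day window, e.g. 13:00 to 15:00
            return !now.isBefore(start) && now.isBefore(end);
        }
        // Overnight window: late evening OR early morning
        return !now.isBefore(start) || now.isBefore(end);
    }
}
```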

Rate Limiting Per User and Per Channel

Different rate limits prevent notification fatigue while ensuring critical messages get through:

public class NotificationRateLimiter {
    private final RedisTemplate<String, String> redis;
    
    public boolean allowNotification(String userId, NotificationChannel channel, 
                                   NotificationType type) {
        if (type.isCritical()) {
            return true; // Never rate limit critical notifications
        }
        
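        String key = String.format("rate_limit:%s:%s", userId, channel);
        
        // INCR is atomic, so concurrent workers cannot both slip under the limit
        // (a separate GET followed by INCR would race)
        Long currentCount = redis.opsForValue().increment(key);
        if (currentCount != null && currentCount == 1L) {
            // Start the one-hour window on the first hit only; refreshing the TTL
            // on every call would keep the counter alive indefinitely
            redis.expire(key, Duration.ofHours(1));
        }
        
        // Note: rejected attempts still increment the counter within the window
        return currentCount != null && currentCount <= getLimitForChannel(channel);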
    }
    
    private int getLimitForChannel(NotificationChannel channel) {
        return switch (channel) {
            case PUSH -> 20;      // 20 push notifications per hour max
            case EMAIL -> 5;      // 5 emails per hour max
            case SMS -> 2;        // SMS is expensive, very limited
            case IN_APP -> 50;    // In-app can be higher volume
        };
    }
}

Step 6: Deep Dive — Template System and Personalization

Template Management

Production notification systems use templates to separate content from delivery logic:

{
  "template_id": "payment_confirmation",
  "channels": {
    "push": {
      "title": "Payment successful",
      "body": "Your payment of {{amount}} has been processed"
    },
    "email": {
      "subject": "Payment confirmation - {{merchant_name}}",
      "template_url": "s3://templates/payment_confirmation.html",
      "variables": ["amount", "merchant_name", "transaction_id", "date"]
    },
    "sms": {
      "body": "{{merchant_name}}: Your {{amount}} payment was successful. Ref: {{transaction_id}}"
    }
  },
  "localization": {
    "es": {
      "push": {
        "title": "Pago exitoso",
        "body": "Tu pago de {{amount}} ha sido procesado"
      }
    }
  }
}

Template rendering service:

@Service
public class NotificationTemplateService {
    
    public RenderedNotification renderTemplate(String templateId, 
                                             NotificationChannel channel,
                                             Map<String, Object> variables,
                                             String locale) {
        Template template = templateRepository.findByIdAndChannel(templateId, channel);
        
        if (template == null) {
            throw new TemplateNotFoundException(templateId, channel);
        }
        
        // Apply localization if available
        Template localizedTemplate = getLocalizedTemplate(template, locale);
        
        // Render with variables using Mustache/Handlebars
        String renderedTitle = templateEngine.render(localizedTemplate.getTitle(), variables);
        String renderedBody = templateEngine.render(localizedTemplate.getBody(), variables);
        
        return RenderedNotification.builder()
            .title(renderedTitle)
            .body(renderedBody)
            .deepLink(renderDeepLink(localizedTemplate.getDeepLink(), variables))
            .build();
    }
}
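
To show what the `templateEngine.render` step does, here is a minimal placeholder renderer. It is a sketch only: `SimpleTemplateRenderer` is a hypothetical class that handles plain `{{name}}` substitution, whereas a real Mustache or Handlebars engine also supports sections, escaping, and partials.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleTemplateRenderer {
    // Matches {{name}} with optional whitespace inside the braces.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\{\\{\\s*(\\w+)\\s*\\}\\}");

    // Replaces each {{name}} with its value; fails fast on a missing variable
    // so a half-rendered notification is never sent.
    static String render(String template, Map<String, ?> variables) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1);
            Object value = variables.get(name);
            if (value == null) {
                throw new IllegalArgumentException("Missing template variable: " + name);
            }
            // quoteReplacement keeps characters like '$' in values literal
            m.appendReplacement(out, Matcher.quoteReplacement(value.toString()));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Failing on a missing variable is a design choice; some systems prefer to substitute an empty string or a default and log a warning instead.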

Personalization and Segmentation

Advanced notification systems support audience segmentation and personalized content:

public class NotificationAudienceService {
    
    public List<User> resolveAudience(AudienceDefinition definition) {
        QueryBuilder query = new QueryBuilder();
        
        // Apply demographic filters
        if (definition.getCountries() != null) {
            query.whereIn("country", definition.getCountries());
        }
        
        // Apply behavioral filters  
        if (definition.getLastActiveWithin() != null) {
            query.where("last_active_at", ">", 
                       now().minus(definition.getLastActiveWithin()));
        }
        
        // Apply engagement filters
        if (definition.getEngagementLevel() != null) {
            query.where("engagement_score", ">=", 
                       definition.getEngagementLevel().getMinScore());
        }
        
        return userRepository.findByQuery(query.build());
    }
    
    public Map<String, Object> getPersonalizationVariables(User user, 
                                                          NotificationType type) {
        Map<String, Object> variables = new HashMap<>();
        variables.put("first_name", user.getFirstName());
        variables.put("timezone", user.getTimezone());
        
        if (type == NotificationType.RECOMMENDATION) {
            variables.put("recommendations", 
                         recommendationService.getForUser(user.getId()));
        }
        
        return variables;
    }
}

Step 7: Common Mistakes and Follow-up Questions

Mistake 1: Ignoring Multi-Channel Complexity

Describing only push notifications while ignoring email, SMS, and in-app channels. Real systems must coordinate across all channels with different delivery guarantees and cost models.

Mistake 2: Missing User Preferences

Designing a system that bombards users without respecting opt-outs, frequency limits, and quiet hours. This leads to poor user experience and legal compliance issues.

Mistake 3: No Failure Handling Strategy

Assuming third-party services (FCM, SendGrid, Twilio) never fail. Production systems need circuit breakers, fallback providers, and dead letter queues.

Mistake 4: Inefficient Fan-out for High-Volume Users

Using push fanout for celebrity accounts with millions of followers, which overwhelms write capacity. Use a hybrid push/pull strategy based on follower count and user activity instead.

Mistake 5: Forgetting About Cost Optimization

Treating all notification channels equally when SMS costs ~$0.01 per message. Channel selection should consider cost, urgency, and user preferences.

Follow-up Questions to Prepare For

"How would you handle a notification going to 100 million users simultaneously?" This tests your understanding of write amplification. Use pull model for such large broadcasts, with push notifications only for recently active users. Consider staged rollout (10% → 50% → 100%) to detect issues early.

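The staged rollout mentioned for the 100-million-user broadcast can be implemented with stable hash bucketing. The sketch below is one possible approach under stated assumptions: `StagedRollout` and the campaign id are hypothetical, and CRC32 is used only as a cheap, deterministic hash.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class StagedRollout {
    // Maps a user to a stable bucket in [0, 100). Salting with the campaign id
    // gives each campaign its own split of the user base.
    static int bucket(String campaignId, String userId) {
        CRC32 crc = new CRC32();
        crc.update((campaignId + ":" + userId).getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    // True if this user is included at the given rollout percentage (0 to 100).
    static boolean inStage(String campaignId, String userId, int rolloutPercent) {
        return bucket(campaignId, userId) < rolloutPercent;
    }
}
```

Because the bucket is deterministic, everyone included at 10% is still included at 50% and 100%, so raising the percentage only adds recipients and never resends to a different subset.
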
"What if a user is offline for weeks and has thousands of pending notifications?" Implement notification expiration and consolidation. Expire old notifications, consolidate similar ones ("You have 50 new messages" instead of 50 individual notifications), and prioritize by importance when they return.

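The consolidation step for a returning user can be sketched as a simple group-and-summarize pass. The names `PendingNotification` and `NotificationConsolidator`, the threshold parameter, and the summary wording are all hypothetical; expiration and importance ranking are left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

record PendingNotification(String type, String text) {}

class NotificationConsolidator {
    // Collapses any type with more than threshold pending items into a single
    // summary line; smaller groups pass through unchanged. Types are emitted in
    // the order they first appear.
    static List<String> consolidate(List<PendingNotification> pending, int threshold) {
        Map<String, List<String>> byType = new LinkedHashMap<>();
        for (PendingNotification n : pending) {
            byType.computeIfAbsent(n.type(), k -> new ArrayList<>()).add(n.text());
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : byType.entrySet()) {
            List<String> texts = entry.getValue();
            if (texts.size() > threshold) {
                result.add("You have " + texts.size() + " new " + entry.getKey() + " notifications");
            } else {
                result.addAll(texts);
            }
        }
        return result;
    }
}
```
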
"How do you ensure exactly-once delivery?" Explain that exactly-once is expensive and often unnecessary. Most systems use at-least-once with idempotency keys — duplicate notifications are acceptable for most use cases. For critical cases, use distributed consensus (costly).

"How would you implement real-time in-app notifications?" WebSockets or Server-Sent Events for active connections. When a user comes online, establish a connection and deliver pending notifications immediately. Use connection pooling and heartbeat mechanisms to handle connection failures.

"What about international compliance (GDPR, CAN-SPAM)?" Implement explicit consent tracking, easy unsubscribe mechanisms, and data retention policies. Different regions have different rules (EU requires explicit opt-in, US allows opt-out). Store consent timestamps and audit trails.

Summary: Your 35-Minute Interview Plan

Time	What to Do
0-5 min	Clarify requirements: notification types, scale, delivery guarantees, analytics
5-12 min	High-level design: API, multi-channel architecture, message queue design
12-22 min	Deep dive: Fan-out strategies (push vs pull vs hybrid), handling high-volume users
22-28 min	Delivery guarantees: at-least-once implementation, failure handling, circuit breakers
28-32 min	User preferences, rate limiting, template system
32-35 min	Wrap up: trade-offs, cost optimization, compliance considerations

The notification system interview tests your ability to design for both scale and reliability across multiple delivery channels. Companies want to see that you understand the complexity of coordinating push, email, SMS, and in-app notifications — each with different constraints, costs, and user expectations.