#29 Designing a Scalable URL Shortener – Hashing, Database Choices, Redirection Optimization

The Need for a URL Shortener

A social media platform faced a challenge: users shared long URLs that cluttered posts and messages.

The solution? A URL shortener, like Bitly or TinyURL, that converts long URLs into short, easily shareable links.

But how do we design a scalable, efficient, and fault-tolerant URL shortener that handles millions of requests per second?

How a URL Shortener Works

A URL shortener maps a long URL to a short, unique identifier.

Example:

Original: https://example.com/articles/designing-url-shorteners
Shortened: https://short.ly/abc123

Key Steps:

  1. User submits a long URL.

  2. System generates a unique short code.

  3. Shortened URL is stored in a database.

  4. Users are redirected when they access the short URL.
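The four steps above can be sketched end to end in a few lines. This is a toy model, not a production design: a Python dict stands in for the database, a plain counter rendered as a string stands in for the short-code generator discussed in the next section, and the class name `InMemoryShortener` and the `short.ly` base URL are illustrative assumptions.

```python
import itertools


class InMemoryShortener:
    """Toy shortener: a dict stands in for the database (assumption)."""

    def __init__(self, base_url="https://short.ly"):
        self.base_url = base_url
        self._ids = itertools.count(1)  # step 2: source of unique IDs
        self._store = {}                # step 3: short code -> long URL

    def shorten(self, long_url):
        # Steps 1-3: accept the long URL, generate a code, store the mapping
        code = str(next(self._ids))
        self._store[code] = long_url
        return f"{self.base_url}/{code}"

    def resolve(self, code):
        # Step 4: answer with an HTTP-style (status, headers) pair
        long_url = self._store.get(code)
        if long_url is None:
            return 404, {}
        return 302, {"Location": long_url}
```

A 302 (temporary) redirect is used here so that every click reaches the service; a 301 lets browsers cache the redirect, which lowers load but hides repeat clicks from analytics.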

Generating Unique Short Codes

Short codes must be unique and as short as possible, so the generator has to either avoid collisions between different URLs or detect and resolve them.

1. Hashing-Based Approach

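  • Applies a hash function (e.g., MD5, SHA-256) to the long URL and keeps the first few characters of the digest as the short code.

  • Example: Hashing https://example.com/article-123 and truncating the hex digest to six characters gives a code of the form e9a2b5 (illustrative value).

  • Problem: Truncated hashes can collide, so two different URLs may map to the same code; the system needs a collision resolution strategy, such as re-hashing with a salt.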
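One possible collision resolution strategy is to re-hash with an attempt counter mixed into the input. The sketch below assumes SHA-256 truncated to seven hex characters and a dict standing in for the database; the function names `hash_code` and `assign_code` are hypothetical.

```python
import hashlib


def hash_code(long_url, attempt=0, length=7):
    # Mix the attempt number into the input so a retry yields a different digest
    payload = f"{long_url}#{attempt}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:length]


def assign_code(long_url, store, max_attempts=5):
    """Pick a free code for long_url; `store` maps code -> long URL."""
    for attempt in range(max_attempts):
        code = hash_code(long_url, attempt)
        existing = store.get(code)
        if existing is None:
            store[code] = long_url  # free slot: claim it
            return code
        if existing == long_url:
            return code             # same URL shortened before: reuse its code
        # otherwise a different URL owns this code: retry with the next attempt
    raise RuntimeError("could not find a free short code")
```

In a real database the check-then-write step would need to be atomic (for example, an insert guarded by a unique constraint), otherwise two concurrent requests could claim the same code.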

2. Base62 Encoding

  • Converts a numeric ID into an alphanumeric string (a-z, A-Z, 0-9).

  • Example: ID 123456789 → Base62 → 8m0Kx (using the digit order 0-9, a-z, A-Z).

  • Advantage: Shorter, human-readable, and URL-safe.
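How short can the codes be? A code of length n over 62 characters can represent 62^n distinct IDs, so the arithmetic is easy to check. The helper names below are illustrative.

```python
def capacity(length, alphabet_size=62):
    # Number of distinct codes of exactly `length` characters
    return alphabet_size ** length


def min_length(total_codes, alphabet_size=62):
    # Smallest code length whose capacity covers `total_codes`
    length = 1
    while capacity(length, alphabet_size) < total_codes:
        length += 1
    return length


for n in (5, 6, 7):
    print(n, capacity(n))
# 5 916132832
# 6 56800235584
# 7 3521614606208
```

Under these numbers, six characters cover roughly 56.8 billion links and seven cover about 3.5 trillion.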

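Example Code (JavaScript Base62 Encoding):

const encode = (number) => {
  const characters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
  const base = characters.length;
  // Zero never enters the loop below, so handle it explicitly
  if (number === 0) return characters[0];
  let encoded = "";

  while (number > 0) {
    const remainder = number % base; // value of the next least-significant digit
    number = Math.floor(number / base);
    encoded = characters[remainder] + encoded; // prepend: most-significant digit ends up first
  }

  return encoded;
};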
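The redirect path needs the inverse operation when the numeric ID is the database key. Here is a Python sketch of both directions, assuming the same alphabet order as the encoder above; `base62_encode` and `base62_decode` are illustrative names.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}  # character -> digit value


def base62_encode(number):
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))  # most-significant digit first


def base62_decode(code):
    number = 0
    for ch in code:
        number = number * BASE + INDEX[ch]  # KeyError on invalid characters
    return number


print(base62_encode(123456789))  # 8m0Kx
print(base62_decode("8m0Kx"))    # 123456789
```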

Choosing the Right Database

A URL shortener must store and retrieve short URLs efficiently, even at high scale.

Database Choices:

  • Relational Databases (PostgreSQL, MySQL) – Work well at small to moderate scale; very high read/write loads typically call for read replicas or sharding.

  • NoSQL Databases (Redis, DynamoDB, Cassandra) – Handle high traffic, fast key lookups, and distributed storage.

  • Key-Value Stores (Redis, Memcached) – Ideal for caching frequently accessed URLs.

Example Data Schema (SQL, MySQL syntax):

CREATE TABLE short_urls (
  id BIGINT PRIMARY KEY AUTO_INCREMENT,
  short_code VARCHAR(10) UNIQUE,
  original_url TEXT NOT NULL,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
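One way to tie this schema to Base62 is to insert the row first, let the database assign the numeric id, and then derive the short code from it. The sketch below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL or PostgreSQL, so the column types differ slightly from the schema above; the function names are hypothetical.

```python
import sqlite3

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base62_encode(number):
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits)) or ALPHABET[0]


def create_schema(conn):
    # SQLite flavour of the short_urls table (assumption: SQLite stands in for MySQL)
    conn.execute("""
        CREATE TABLE short_urls (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            short_code TEXT UNIQUE,
            original_url TEXT NOT NULL,
            created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
        )""")


def insert_url(conn, original_url):
    with conn:  # commit both statements together, roll back on error
        cur = conn.execute(
            "INSERT INTO short_urls (original_url) VALUES (?)", (original_url,))
        code = base62_encode(cur.lastrowid)  # derive the code from the new id
        conn.execute(
            "UPDATE short_urls SET short_code = ? WHERE id = ?",
            (code, cur.lastrowid))
    return code


def lookup(conn, short_code):
    row = conn.execute(
        "SELECT original_url FROM short_urls WHERE short_code = ?",
        (short_code,)).fetchone()
    return row[0] if row else None
```

Because the code is derived from a unique id, no collision check is needed; the trade-off is that sequential codes are guessable.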

Redirection Optimization – Handling High Traffic

A URL shortener must handle millions of redirects per second with minimal latency.

1. Caching for Faster Lookups

  • Store frequently accessed URLs in Redis for fast retrieval.

  • Cache eviction policies (LRU, LFU) manage memory efficiently.

  • Example: short.ly/abc123 → Redis cache → URL returned from memory without a database query.

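Example Redis Lookup (Python, redis-py):

import redis

# Assumes a Redis server on localhost:6379; decode_responses returns str instead of bytes
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Store URL in cache with a one-hour TTL so unused entries expire on their own
r.set("abc123", "https://example.com/articles/designing-url-shorteners", ex=3600)

# Retrieve URL (None on a cache miss; fall back to the database in that case)
url = r.get("abc123")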
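The lookup above covers the cache itself; the surrounding pattern is usually cache-aside with an eviction policy. The sketch below runs without a Redis server: a small in-process LRU cache built on `OrderedDict` stands in for Redis, a dict stands in for the database, and `LRUCache` and `resolve` are illustrative names.

```python
from collections import OrderedDict


class LRUCache:
    """Least-recently-used cache with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


def resolve(code, cache, database):
    # Cache-aside: try the cache, fall back to the database, then populate
    url = cache.get(code)
    if url is not None:
        return url, "hit"
    url = database.get(code)
    if url is not None:
        cache.set(code, url)
    return url, "miss"
```

With a capacity of two, looking up a, b, a, c evicts b, because a was touched more recently.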

2. Load Balancing

  • Use a load balancer such as Nginx or HAProxy to distribute traffic across multiple servers.

  • Round-robin or least-connections strategies spread the load evenly across backends.

Example:

User requests → Load Balancer → Redirect handled by multiple backend servers
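The two strategies differ only in how the next backend is chosen. This is a simplified model of the selection logic, not how Nginx or HAProxy implement it; the class names and the `app-1`, `app-2` server names are made up for illustration.

```python
import itertools


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Hands out the server with the fewest in-flight requests."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes
        self.active[server] -= 1
```

Redirects are short and uniform, so round-robin is often enough; least-connections helps more when request durations vary widely.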

3. Distributed Systems for Scalability

  • Sharding: Distributes data across multiple database nodes.

  • Replication: Ensures redundancy for high availability.

  • CDNs: Cache redirects at the edge for global performance.
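A minimal sketch of hash-based sharding, assuming the short code is the shard key and each shard is modelled as a dict. The names `shard_for` and `ShardedStore` are hypothetical.

```python
import hashlib


def shard_for(short_code, shard_count):
    # Use a stable hash: Python's built-in hash() is randomized per process
    digest = hashlib.sha256(short_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Routes each short code to exactly one shard (dicts stand in for DB nodes)."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def put(self, short_code, long_url):
        index = shard_for(short_code, len(self.shards))
        self.shards[index][short_code] = long_url

    def get(self, short_code):
        index = shard_for(short_code, len(self.shards))
        return self.shards[index].get(short_code)
```

Plain modulo sharding remaps most keys when the shard count changes; consistent hashing is a common way to limit that movement.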

Real-World Challenges & Solutions

1. Handling Expiring & Custom URLs

  • Allow users to set expiration times for short links.

  • Support custom short URLs (e.g., short.ly/my-product).
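Both features come down to a little extra state and validation per link. The sketch below assumes an alias rule of 3 to 30 letters, digits, hyphens, or underscores (an arbitrary choice for illustration) and keeps links in a dict; `LinkStore` is a hypothetical name.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed alias rule: 3-30 characters from letters, digits, '-' and '_'
ALIAS_PATTERN = re.compile(r"[A-Za-z0-9_-]{3,30}")


class LinkStore:
    def __init__(self):
        self._links = {}  # code -> (long_url, expires_at or None)

    def create_custom(self, alias, long_url, ttl=None, now=None):
        if not ALIAS_PATTERN.fullmatch(alias):
            raise ValueError("invalid alias")
        if alias in self._links:
            raise ValueError("alias already taken")
        if now is None:
            now = datetime.now(timezone.utc)
        expires_at = now + ttl if ttl is not None else None
        self._links[alias] = (long_url, expires_at)
        return alias

    def resolve(self, code, now=None):
        entry = self._links.get(code)
        if entry is None:
            return None
        long_url, expires_at = entry
        if now is None:
            now = datetime.now(timezone.utc)
        if expires_at is not None and now >= expires_at:
            return None  # expired links behave like missing ones
        return long_url
```

A usage example: `LinkStore().create_custom("my-product", "https://example.com/p", ttl=timedelta(days=7))` creates a link that stops resolving after a week.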

2. Preventing Abuse & Spam

  • Implement rate limiting to prevent bot-generated links.

  • Scan URLs for malicious content before shortening.
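Rate limiting is commonly implemented with a token bucket. The sketch below is one such limiter under assumed parameters (a refill rate and a burst size); the clock is injectable so the behaviour can be checked without sleeping, and `TokenBucket` is an illustrative name.

```python
import time


class TokenBucket:
    """Allows `burst` requests at once, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second, burst, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = float(burst)  # start with a full bucket
        self.clock = clock
        self.updated_at = clock()

    def allow(self):
        now = self.clock()
        elapsed = now - self.updated_at
        # Refill for the time that has passed, capped at the bucket size
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice there would be one bucket per client (keyed by API key or IP address), and in a multi-server deployment the counters would need to live in shared storage such as Redis.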

3. Analytics & Tracking

  • Track click-through rates, location, and device type.

  • Store logs in BigQuery, Snowflake, or Elasticsearch.

Example Tracking Schema:

CREATE TABLE url_clicks (
  short_code VARCHAR(10),
  clicked_at TIMESTAMP,
  user_ip VARCHAR(45),
  user_agent TEXT
);
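To show how this table might be used, the sketch below records a few clicks and counts them per short code. It uses Python's built-in sqlite3 module in memory as a stand-in for the analytics store; the function names and the sample rows are made up for illustration.

```python
import sqlite3


def create_clicks_table(conn):
    conn.execute("""
        CREATE TABLE url_clicks (
            short_code VARCHAR(10),
            clicked_at TIMESTAMP,
            user_ip VARCHAR(45),
            user_agent TEXT
        )""")


def record_click(conn, short_code, clicked_at, user_ip, user_agent):
    with conn:  # commit the insert
        conn.execute(
            "INSERT INTO url_clicks VALUES (?, ?, ?, ?)",
            (short_code, clicked_at, user_ip, user_agent))


def clicks_per_code(conn):
    # Most-clicked codes first; ties broken alphabetically
    return conn.execute("""
        SELECT short_code, COUNT(*) AS clicks
        FROM url_clicks
        GROUP BY short_code
        ORDER BY clicks DESC, short_code""").fetchall()
```

On the redirect path, writing clicks synchronously adds latency, so they are often pushed to a queue or log and loaded into the analytics store in batches.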

Choosing the Right Architecture

Component | Best Choice
Short Code Generation | Base62 Encoding
Database | Redis (cache), DynamoDB (persistent)
Redirection Optimization | Caching + Load Balancing
Scaling Strategy | Sharding + Replication
Security | Rate limiting + Spam detection

Real-World Use Cases

1. Social Media & Content Sharing

  • Twitter and LinkedIn use URL shorteners for compact links.

  • Bitly & TinyURL optimize social media sharing.

2. Marketing & Campaign Tracking

  • Marketers use shortened links to track engagement.

  • UTM parameters capture campaign effectiveness.
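UTM parameters are ordinary query-string parameters appended to the destination URL before it is shortened. The helper below is a sketch using only the standard library; the function name `add_utm` and the sample values are assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url, source, medium, campaign):
    parts = urlsplit(url)
    # Keep any existing query parameters, then append the UTM fields
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [
        ("utm_source", source),
        ("utm_medium", medium),
        ("utm_campaign", campaign),
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_utm("https://example.com/sale?ref=home", "newsletter", "email", "spring"))
# https://example.com/sale?ref=home&utm_source=newsletter&utm_medium=email&utm_campaign=spring
```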

3. Affiliate & Referral Links

  • E-commerce sites shorten URLs for affiliate tracking.

  • Amazon generates short URLs for product promotions.

Conclusion

A scalable URL shortener requires efficient hashing, optimized databases, and fast redirection handling.

  • Base62 Encoding generates short, unique codes.

  • Redis + NoSQL provides fast lookups and scalability.

  • Caching & Load Balancing keep redirects fast and the service available under heavy traffic.

Next, we’ll explore Designing a Video Streaming Service – Video Encoding, Storage, Caching, CDN, Bitrate Adaptation.


3/6/2025
© Rahul 2025
    #29 Designing a Scalable URL Shortener – Hashing, Database Choices, Redirection Optimization - Rahul Vijay