URL Shortener System Design | TinyURL System Design

4 min readApr 8, 2021

In this article we’ll be discussing about System design of URL shortening Service such as TinyURL where given a long URL it should return a short URL and given a short URL it should redirect to the corresponding long URL.

Let’s write down the functional and nonfunctional requirement of the system.

Functional Requirements
1. Given a long url, the system should return the short url.
2. Given a short url, the system should return the corresponding long url.

Nonfunctional Requirements
1. System should be Highly Scalable.
2. System should have Low Latency.
3. System should be highly available.

Capacity Estimation
Assume Create Short URL load to be 10000 rpm.
This system should be ready heavy. Read to write ratio will be 100:1.
Get long URL load will 1000000 rpm. The system should be designed in such a way that it can store urls for next 10 years.

What will be total Unique URL stored in the system as per the given estimation ??

we need 10k * 60 * 24 * 365 * 10 = 8,640,000,000 = 8.6 billion ~ round upto 10 billion unique keys.

To generate 10 billion unique keys, assume we use characters A-Z, a-z, 0–9 (62 characters in total).

What will be the minimum length of key where each character can take value from above mentioned 62 character set??

N ^ 62 ≥ 10 billion -> N = 6

Now, we know that with key of length of 6, we can generate more than 10 billion keys, 56 billion unique keys to be exact.

Let’s design the REST APIs for this system

For a long url the API should return the short url
POST /shorten-url/{long_url}
For a short URL, the API should return the long URL
GET /{short_unique_key}

Let’s look at the initial HLD of the system

Key Generator Service (KGS) generates and stores 10 billion unique keys. It is a single threaded service. When a request comes to Shortening Service it asks Token Service for the unique key and stores the long url against the key in it’s cassandra cluster.

Cassandra is used as database as it is horizontally scalable and can easily handle load of thousands of reads/writes per second.

What’s the drawback of this system??

Token Service(TS) becomes the single point of failure for this system as all the load comes to TS and being the single threaded system it can’t respond to all the requests from shortening service on time. So what if we make it multi-threaded…what’s issue in that? if we do so then conflicts may arise like same key is getting returned to two different requests.

What should we do now?

Rather than returning a single key for each request from shortening service, Token Service can return a set of keys (say 1 lakh keys or depending on the load) which Shortening Service can store in it’s local cache and don’t request TS everytime it receives a request from the client. This reduces the load on TS.

Let’s have look at overall workflow
1. KGS generates and stores keys in it’s cassandra cluster.
2. When Shortening Service is started, it asks for a set of keys from Token Service which it stores in its local cache and when remaining keys drops below a configured threshold, it asks Token Service to send the new set of keys.
3. Client sends request to Shortening Service, it gets the available key from it’s cache and stores long url against that key in it’s cassandra cluster.
4. Client sends short url to Shortening Service, it extracts the key from the request and gets the long url from it’s cassandra cluster.

This looks fine. All the components in the System are horizontally scalable. We have built a Highly Scalable System as per our Functional and Nonfunctional Requirements.

Is there a room for any optimisation ?

Remember KGS generates 10 billion keys. It’s a background service which keeps on generating keys and stores it in cassandra.
Token Service sends the set of keys to Shortening Service(SS). Rather than sending the keys, if Token Service sends the range to SS like [start: 100000 end: 200000] and SS converts the number to Base62 to generates the key. This will remove the Key Generator (KGS) component from the system and it need not store the 10 billion keys in it’s cassandra. If you see closely, we don’t need cassandra as well as we just need to store the last range returned to SS. Let’s look at new optimised HLD.

MySQL db will contain one table named token which will have one column named start

SS requests Token Service for 100000 numbers. Token service will run the following queries.

select start from token;
update token set start = start + 100000;

Token Service will then return range [ start: {start} + 1, end: {start} + 100000] to SS.

Shortening service will store the range in local cache.
For every request it’ll take one value from the cache and converts it in Base62 and stores long url against the key in cassandra.

What’s the downside of this design??

What if any of the SS goes down, we lose all the keys present in its local cache. But considering we have abundance of keys[56 billion] than our requirement[10 billion], we can afford to lose some keys. This is a trade off that we have to live with.

This is all about Scalable URL Shortening Service. I have tried to cover almost all the aspects which can be asked in a system design interview. Do let me know your feedbacks in the comment below.