System Design: Create a URL shortening service (Part 2): Design the write API

This is part of a blog series where we design, develop, optimize, deploy, and test a URL shortener service from scratch.

In this article, we’ll discuss:

The API signature is simple:

shorten(longUrl) // Returns a unique short url

How short should the short url be?

We can use the following characters in our short code:

  • A-Z(26)
  • a-z(26)
  • 0–9(10)
  • _, -(2)

That is a total of 64 characters. With 6 characters, something like http://ad.com/abcdef, we can store 64⁶ unique URLs, which is more than 68 billion (68,719,476,736). At 100 new URLs per second, it would take roughly 21 years to exhaust this limit.
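The arithmetic above is easy to check with a few lines of Node.js. This is only a sanity check; the constant names are mine, and the 100 requests per second figure is the assumption used in the text.

```javascript
// Capacity of a 6-character code over a 64-character alphabet.
const ALPHABET_SIZE = 26 + 26 + 10 + 2; // A-Z, a-z, 0-9, plus "_" and "-"
const CODE_LENGTH = 6;
const capacity = ALPHABET_SIZE ** CODE_LENGTH; // 64^6

// Assumed sustained write rate, as used in the article.
const WRITES_PER_SECOND = 100;
const SECONDS_PER_YEAR = 365 * 24 * 60 * 60;
const yearsToExhaust = capacity / (WRITES_PER_SECOND * SECONDS_PER_YEAR);

console.log(capacity);                  // 68719476736
console.log(yearsToExhaust.toFixed(1)); // 21.8
```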

How to generate short code?

There are several ways to do it. One way would be to generate a random 6-character string. The problem with random codes is that if different people submit the same longUrl, the system will generate a different code each time and store the URL multiple times. That wastes space.

To solve this, let's use hashing, specifically an MD5 hash, which always produces the same output for the same input. We'll base64-encode the hash and take the first 6 characters. There is one catch, however: standard Base64 is not URL safe, as it contains / and +. So we'll replace these two characters in the encoded string with the URL-safe _ and - from our character set.

Here is the code snippet in Node.js:

Create MD5 hash of a string
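A minimal sketch of this step, using only Node's built-in crypto module. The function names urlSafeHash and shortCode are hypothetical, and mapping + to - and / to _ is my assumption (it matches the common base64url convention); the text only says the two characters are replaced.

```javascript
const crypto = require('crypto');

const CODE_LENGTH = 6;

// Base64-encode the MD5 digest, then swap the two URL-unsafe characters
// for the "-" and "_" that are already part of the 64-character alphabet.
function urlSafeHash(longUrl) {
  return crypto
    .createHash('md5')
    .update(longUrl)
    .digest('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, ''); // drop the trailing "=" padding
}

// The short code is the first 6 characters of the URL-safe hash.
function shortCode(longUrl) {
  return urlSafeHash(longUrl).slice(0, CODE_LENGTH);
}

console.log(shortCode('https://example.com/some/long/path'));
```

An MD5 digest is 16 bytes, so the URL-safe string is 22 characters long once the padding is removed. That leaves room for a few non-overlapping 6-character windows if the first one is already taken.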

How to save the short code

As discussed earlier, we'll use Postgres. Other databases work as well.

The table needs two fields: code, which stores the 6-character short code, and originalurl, which stores the corresponding long URL.

Database schema for URL shortener

The following migration will generate the above schema:

Database migration for creating url shortener schema
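A migration along these lines would create the two fields. This sketch assumes Sequelize (suggested by the findOrCreate call used later) and a hypothetical table name urls; the column types and lengths are also my assumptions.

```javascript
// Sketch of a Sequelize-style migration. "urls" is a hypothetical table name.
const migration = {
  up: (queryInterface, Sequelize) =>
    queryInterface.createTable('urls', {
      // unique: true asks the database for a unique constraint on code
      code: { type: Sequelize.STRING(6), allowNull: false, unique: true },
      originalurl: { type: Sequelize.TEXT, allowNull: false },
    }),

  down: (queryInterface) => queryInterface.dropTable('urls'),
};

module.exports = migration;
```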

The unique constraint on code also creates an index, so lookups by code are fast. The database part is done.

There can be several Node servers running, so we need to handle concurrent writes: two servers may try to insert the same code at the same time, which is a race condition. We'll use findOrCreate to insert into Postgres. If you are using a NoSQL database, find out how it supports an atomic find-or-insert operation.

Workflow to write to the database

The model code should look like this:

Write into the database
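To illustrate the write path without a running database, here is a sketch with an in-memory stand-in. UrlStore and saveUrl are hypothetical names; the Map key plays the role of the unique constraint on code. With Sequelize, the equivalent call is Model.findOrCreate({ where, defaults }), which also resolves to a [record, created] pair.

```javascript
// In-memory stand-in for the table; not a real database client.
class UrlStore {
  constructor() {
    this.rows = new Map(); // keyed by code, like the unique constraint
  }

  // Returns [row, created], the same result shape as Sequelize's findOrCreate.
  async findOrCreate(code, originalurl) {
    const existing = this.rows.get(code);
    if (existing) return [existing, false];
    const row = { code, originalurl };
    this.rows.set(code, row);
    return [row, true];
  }
}

// Store the mapping once, even if several callers send it at the same time.
async function saveUrl(store, code, originalurl) {
  const [row, created] = await store.findOrCreate(code, originalurl);
  return { code: row.code, originalurl: row.originalurl, created };
}

// Usage: two concurrent writes for the same code end up as a single row.
const store = new UrlStore();
const writes = Promise.all([
  saveUrl(store, 'abcdef', 'https://example.com/a'),
  saveUrl(store, 'abcdef', 'https://example.com/a'),
]);
writes.then((results) => console.log(results.map((r) => r.created), store.rows.size));
```

In the real service, the guarantee comes from the database's unique constraint rather than from application code, since several Node processes share the same table.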

Since we are only using the first 6 characters of the MD5 hash, two different long URLs can end up with the same short code. To resolve this, we'll take the next 6 characters from our base64 string and use them instead.

Here is the code that, on a conflict, fetches the next 6 characters from the hash and uses them for the short URL:

Resolve MD5 hash collisions
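The idea can be sketched as a loop over 6-character windows of the hash. resolveCode and the injected findOrCreate function are hypothetical; the hash is passed in as a plain string so the example stays independent of how it was computed. Throwing once the windows run out is my assumption, as the text does not say what happens in that case.

```javascript
const CODE_LENGTH = 6;

// hash: the URL-safe base64 string derived from the long URL.
// findOrCreate: hypothetical async (code, originalurl) => [row, created].
async function resolveCode(hash, longUrl, findOrCreate) {
  for (let start = 0; start + CODE_LENGTH <= hash.length; start += CODE_LENGTH) {
    const code = hash.slice(start, start + CODE_LENGTH);
    const [row] = await findOrCreate(code, longUrl);
    // Either we just created the row, or the same URL was stored earlier.
    if (row.originalurl === longUrl) return code;
    // Otherwise another URL owns this code: try the next 6 characters.
  }
  throw new Error('No free short code left in this hash');
}

// Usage with a Map standing in for the table.
const table = new Map();
const fakeFindOrCreate = async (code, originalurl) => {
  if (table.has(code)) return [table.get(code), false];
  const row = { code, originalurl };
  table.set(code, row);
  return [row, true];
};

// Two made-up hashes that share their first 6 characters.
const demo = resolveCode('abcdefGHIJKL', 'https://a.example', fakeFindOrCreate)
  .then(() => resolveCode('abcdefMNOPQR', 'https://b.example', fakeFindOrCreate));
demo.then((code) => console.log(code)); // MNOPQR
```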

The only thing remaining now is to call this function from the routes.

Route to create a short URL
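A route for this could look roughly like the sketch below. The path /shorten, the payload field longUrl, the status codes, and the createShortenRoute helper are all assumptions of mine; shorten stands for whatever function produces and stores the short code. The handler uses Hapi's (request, h) signature, and the returned object is a plain route configuration that can be passed to server.route().

```javascript
// shorten: hypothetical async function (longUrl) => short code.
// baseUrl: the public prefix for short links, e.g. 'http://ad.com'.
function createShortenRoute(shorten, baseUrl) {
  return {
    method: 'POST',
    path: '/shorten',
    handler: async (request, h) => {
      const { longUrl } = request.payload || {};
      if (typeof longUrl !== 'string' || longUrl.length === 0) {
        return h.response({ error: 'longUrl is required' }).code(400);
      }
      const code = await shorten(longUrl);
      return h.response({ shortUrl: `${baseUrl}/${code}` }).code(201);
    },
  };
}

module.exports = createShortenRoute;
```

Keeping the route as a plain object built from an injected shorten function makes the handler easy to exercise without starting a server.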

If you are new to Hapi, here is how you can write your main server:

Hapi server

The complete code can be found on GitHub.

Senior Staff Engineer @freshworks. Ex-McKinsey/Microsoft/Slideshare/SAP, Tech Enthusiast, Passionate about India. Opinions are mine
