IPFS – The New Internet's Protocol

What Is the Internet?

Most people think the Internet is a set of services like Facebook, Amazon, Netflix, Google, Instagram, Snapchat, YouTube, and so on. But actually, they are simply web services that run on the internet’s infrastructure.

The Internet is a global computer communication system that has made all such services possible. Internet services have become deeply embedded in our daily lives, and that instantaneous access to information has changed just about everything.

Although the internet had been around for a while, the real revolution began with the launch of the World Wide Web. Before that, very few people knew what the internet was. After CERN made the World Wide Web freely available, things started to change rapidly and a number of browsers appeared, with Mosaic being the most famous.

The growth of easy-to-use Web browsers coincided with the growth of the commercial ISP (Internet service provider) businesses, with companies like CompuServe bringing an increasing number of people from outside the scientific community onto the Web – and that was the start of the Web we know today.

The Web has become part of our day-to-day life. At launch it was merely a network of static HTML documents. It has evolved over time, and today pages can display data dynamically, play media, and update in real time.

None of this would have been possible if computers could not communicate with each other. Just as two people need a shared language to exchange information, two computers need a common set of rules that define how and when information is transmitted. These rules are broadly known as communication protocols.

For example, if you visit France and do not know French, it will be hard to communicate with people. What you experience is the lack of a common communication protocol. This was the case with early computers: they were isolated until communication protocols were invented.

Communication protocols come in bundles of several layers. For example, the Internet protocol suite (TCP/IP) consists of four layers. Along with communication protocols, it is also important to understand the system architecture, which is the basic structure of how computers are interconnected. The two system architectures relevant to this article are client-server and peer-to-peer networks.

The client-server network is a widely used system architecture; on the Web it uses the HTTP protocol for communication. Here, the data is kept on a centralized server and accessed by location-based addressing, that is, by the address of the server that holds it. When the client makes an HTTP request to the server, the server sends the data back to the client.

IPFS – InterPlanetary File System

IPFS is content-addressed, unlike HTTP, which is location-addressed: an HTTP URL points to the server (ultimately an IP address) where the data lives. When you add a file to IPFS, the file is split into smaller chunks, cryptographically hashed, and given a unique fingerprint called a content identifier (CID). This CID acts as a permanent record of your file as it exists at that point in time. Anyone who has the CID can request the data it identifies.
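The idea of deriving an address from the content itself can be illustrated with a minimal sketch. It assumes a plain SHA-256 hex digest as a stand-in for a CID (real CIDs carry extra metadata and use a different text encoding), and the `store`, `add`, and `get` names are hypothetical, not part of any IPFS API.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Return a hex SHA-256 digest, used here as a simplified stand-in for a CID."""
    return hashlib.sha256(data).hexdigest()

# A toy content-addressed store: each key is computed from the value it maps to.
store = {}

def add(data: bytes) -> str:
    cid = content_id(data)
    store[cid] = data
    return cid

def get(cid: str) -> bytes:
    return store[cid]

cid = add(b"hello ipfs")
same_cid = add(b"hello ipfs")     # identical content gives the identical identifier
other_cid = add(b"hello ipfs!")   # one extra byte gives a different identifier
```

Because the identifier depends only on the bytes, adding the same content twice does not create a second entry, and there is no way to change the content behind an identifier without the identifier changing too.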

The InterPlanetary File System (IPFS) is a peer-to-peer distributed file system that seeks to connect all computing devices with the same system of files. Where HTTP downloads data from one server, peer-to-peer IPFS can download pieces of the data from multiple nodes at once, which can yield substantial bandwidth savings.

Today, if whoever controls a server decides to remove a piece of data or a website from the internet, they can do so and you cannot do anything about it. In IPFS, the data is distributed across nodes, so it is hard to remove it from the web as long as some node still has a copy.

As mentioned before, on IPFS the data remains accessible as long as some node has a copy of it. But a question may arise: can we trust the node from which we are requesting the data? The answer is that we do not have to. As we saw before, on IPFS we look up content by its hash, and each piece of content has a unique hash. If someone tampers with the data, its hash changes, so you can hash what you receive and compare it with the hash you asked for to see whether the data has been tampered with. Hence the security is built in.
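The hash comparison described above can be sketched in a few lines. This is an illustration only: it again assumes a bare SHA-256 digest in place of a real CID, and `fetch_verified` is a hypothetical helper, not an IPFS function.

```python
import hashlib

def content_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def fetch_verified(requested_id: str, data_from_peer: bytes) -> bytes:
    """Accept a peer's reply only if it hashes to the identifier we asked for."""
    if content_id(data_from_peer) != requested_id:
        raise ValueError("content does not match the requested hash")
    return data_from_peer

original = b"quarterly report v1"
wanted = content_id(original)

# An honest peer returns the exact bytes, so the check passes.
honest_reply = fetch_verified(wanted, original)

# A peer that altered the bytes is caught, because the hash no longer matches.
try:
    fetch_verified(wanted, b"quarterly report v1 (edited)")
    tampering_detected = False
except ValueError:
    tampering_detected = True
```

The requester never needs to know who the peer is; the check depends only on the bytes received and the identifier requested.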

How does IPFS Store Data?

IPFS stores files as IPFS objects, with each object holding up to 256 KB of data. An object can also contain links to other objects. If a file is larger than 256 KB, it is split into chunks of at most 256 KB each. IPFS then creates another IPFS object that holds no file data and links all the chunks together. Using this feature, you can build a file system that looks like the one shown below:

Since IPFS derives a unique identifier from the content of each object, it is impossible to add or remove content in a file without changing the file's hash. To deal with this, IPFS supports versioning of files.
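The chunk-and-link scheme, and the way a small edit changes the top-level hash, can be sketched as follows. This is a simplified model under stated assumptions: objects are plain dictionaries, identifiers are SHA-256 hex digests, and the root's identifier is computed from its list of links. Real IPFS encodes objects and identifiers differently, and `add_file` and `read_file` are hypothetical names.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # 256 KB, the per-object size quoted in the text

def content_id(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def add_file(data: bytes, objects: dict) -> str:
    """Store data as chunk objects plus one root object that links to them."""
    if len(data) <= CHUNK_SIZE:
        cid = content_id(data)
        objects[cid] = {"data": data, "links": []}
        return cid
    links = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        cid = content_id(chunk)
        objects[cid] = {"data": chunk, "links": []}
        links.append(cid)
    # The root holds no file data; its identifier is derived from its links.
    root_cid = content_id("\n".join(links).encode())
    objects[root_cid] = {"data": b"", "links": links}
    return root_cid

def read_file(cid: str, objects: dict) -> bytes:
    """Reassemble a file by following links from the given object."""
    obj = objects[cid]
    return obj["data"] + b"".join(read_file(link, objects) for link in obj["links"])

objects = {}
# 600,000 bytes of non-repeating data, which splits into three chunks.
payload = b"".join(i.to_bytes(4, "big") for i in range(150_000))
root = add_file(payload, objects)

# Changing only the last byte changes the last chunk and therefore the root.
edited_root = add_file(payload[:-1] + b"\x00", objects)
```

In this sketch the two roots share their first two chunk links and differ only in the third, so an edited copy reuses every chunk that did not change.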

If you edit some data in a file, IPFS creates a new file with its own unique hash, and that new file is linked to the file it was edited from. In layman’s terms, if version 1 is the original data, editing it creates version 2, which links back to version 1 as its history. In this way any number of versions can be created, each linked to the one before it.
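The version chain described above can be modeled as objects that each carry a link to their predecessor. The sketch below is an assumption-laden illustration, not IPFS's actual object format: versions are dictionaries serialized with JSON, identifiers are SHA-256 hex digests, and `commit` and `history` are hypothetical helper names.

```python
import hashlib
import json

def content_id(obj: dict) -> str:
    encoded = json.dumps(obj, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def commit(content: str, parent, objects: dict) -> str:
    """Store a new version that points at the version it was derived from."""
    version = {"content": content, "parent": parent}
    cid = content_id(version)
    objects[cid] = version
    return cid

def history(cid, objects: dict) -> list:
    """Walk the parent links from the newest version back to the first."""
    contents = []
    while cid is not None:
        version = objects[cid]
        contents.append(version["content"])
        cid = version["parent"]
    return contents

objects = {}
v1 = commit("original data", None, objects)
v2 = commit("edited data", v1, objects)
v3 = commit("edited again", v2, objects)

print(history(v3, objects))
```

Each edit adds a new object and leaves the older ones untouched, so every earlier version stays reachable under its original identifier.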

But why is it interplanetary? Suppose you live on Mars and want to access some data. If the server is on Earth, it might take up to 24 minutes to reach the data. Worse, if the data has been moved or removed, you have wasted that time looking for something that is unavailable. With IPFS, a few nodes on Mars may already have the data you need, which saves a lot of time.

Now here comes a drawback. We know that in IPFS data is distributed across nodes (a node here means a user connected to IPFS). If all the users who have a given piece of data happen to disconnect, that data can no longer be accessed by others. To guard against this, there must be a sufficient number of users on the network who hold a copy of that data.
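This availability problem can be made concrete with a toy model of a network. The node names, the `cid-report` identifier, and the `find_providers` helper are all made up for illustration; the sketch says nothing about how real IPFS nodes discover each other.

```python
def find_providers(cid, nodes):
    """Return the names of online nodes that hold a copy of the content."""
    return [name for name, node in nodes.items()
            if node["online"] and cid in node["stored"]]

nodes = {
    "alice": {"online": True, "stored": {"cid-report"}},
    "bob":   {"online": True, "stored": {"cid-report", "cid-photo"}},
    "carol": {"online": True, "stored": set()},
}

before = find_providers("cid-report", nodes)     # two nodes can serve the data

nodes["alice"]["online"] = False
one_left = find_providers("cid-report", nodes)   # still reachable through one node

nodes["bob"]["online"] = False
none_left = find_providers("cid-report", nodes)  # nobody online holds it any more
```

Carol is still online at the end, but she never stored the content, so the data is unreachable until a node that holds it comes back.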

IPFS could be seen as a single BitTorrent swarm, exchanging objects within one Git repository.

Summary

IPFS can be seen as a new decentralized Internet infrastructure on which various applications can be built. It can also be seen as a next-generation file-sharing system. With authority decentralized, users gain control of their own data. As more and more users connect to IPFS and form a distributed network, the content-based addressing that IPFS provides lets anyone verify the integrity of the data they receive. It is clear that IPFS is going to change the picture of the web we experience today.
