Almost all of us use a computer these days, or at least a mobile phone. Anyone who uses these devices has come across the word DATA. Data means different things to different people; here we will talk about the storage where that data is kept.
As technology advances and DATA becomes central to everything, we need strategies to store and access it. You may be wondering what the point of this is, since we can simply store data on hard disks and SSDs. You are right, but only for a limited set of use cases. The IT industry has many use cases, each with different priorities. For personal use, storage capacity (GBs) matters most. For databases, access speed is the priority. For booting a computer, we need storage that the system can boot from directly.
As storage demand grows, traditional file-based and block-based storage alone will not be enough, which is why object-based storage solutions were created. Let's understand the difference between these three types and why we need object storage.
File and Block-based storage:
These are the methods most of us already use.
When you attach a hard drive to a computer, it shows up as a neatly organized file system of files and folders. A file system also supports access rights, so you can grant access to particular files to particular users and prevent unauthorized access to sensitive files. In companies, you may have seen NAS, a type of network storage that you can reach over the company's network; it is another example of file-based storage. (A SAN, which is often mentioned alongside NAS, exposes block-level storage instead.) File-based storage works well for thousands of files. But what if you have billions? Imagine a thousand folders, each holding hundreds of files. Moving back and forth through those folders is tedious, and even a computer needs more time to locate a file in such a hierarchy.
- Operates at a higher level; manages data as a named hierarchy of files and folders
- You can mount it in the OS
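To make the folder-navigation cost concrete, here is a minimal sketch, assuming a throwaway directory tree built with Python's standard library. The folder names, the `secrets.txt` file and the `find_file` helper are hypothetical, chosen only for illustration. It restricts a file to its owner and then has to walk the hierarchy to locate the file by name.

```python
import os
import stat
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def find_file(root: Path, name: str) -> Tuple[Optional[Path], int]:
    """Walk the tree under root; return (path, number of directories visited)."""
    visited = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        visited += 1
        if name in filenames:
            return Path(dirpath) / name, visited
    return None, visited


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Build a small hierarchy: 20 folders, one of which holds the target file.
    for i in range(20):
        (root / f"folder_{i:02d}").mkdir()
    secret = root / "folder_13" / "secrets.txt"
    secret.write_text("top secret")

    # Access rights: allow only the owning user to read and write the file.
    secret.chmod(stat.S_IRUSR | stat.S_IWUSR)

    # Without knowing the folder, we have to search directory by directory.
    found, visited = find_file(root, "secrets.txt")
    found_name = found.name if found else None
    print(found_name, "located after visiting", visited, "directories")
```

The number of directories visited depends on the order in which the operating system lists them, and it grows with the size of the hierarchy, which is the scaling problem described above.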
Block storage serves a similar purpose to file-based storage, but it works at a lower level. Where file-based storage manages data as files and folders, block storage stores data in blocks. For example, a big file is split across multiple blocks that together make up the whole file. Each block has an address, so the system can find the next part of the file. We don't need to go deep into how the blocks are stored or where on the disk they sit; the storage application and its algorithms take care of that. So here we deal with blocks rather than files. Blocks carry no file-system metadata of their own. Performance-sensitive workloads, such as OS boot volumes and databases, need block-based storage.
- Operates at a lower level, manages data as fixed size blocks
- Can be used for OS root partition
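The block idea can be sketched as follows. This is a toy model, not how any real storage system lays out data: the 4-byte `BLOCK_SIZE`, the dictionary standing in for a device, and the `write_blocks` and `read_blocks` helpers are all assumptions made for readability.

```python
BLOCK_SIZE = 4  # bytes; unrealistically small so the output is easy to read


def write_blocks(data: bytes, device: dict, start_address: int = 0) -> list:
    """Split data into fixed-size blocks and store each at its own address.

    Returns the ordered list of block addresses that make up the data.
    """
    addresses = []
    for offset in range(0, len(data), BLOCK_SIZE):
        address = start_address + offset // BLOCK_SIZE
        # Pad the final block so that every block has the same size.
        block = data[offset:offset + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\x00")
        device[address] = block
        addresses.append(address)
    return addresses


def read_blocks(device: dict, addresses: list, length: int) -> bytes:
    """Reassemble data by reading its blocks in order and dropping the padding."""
    return b"".join(device[address] for address in addresses)[:length]


device = {}  # hypothetical block device: address -> fixed-size block
payload = b"hello block storage"
block_map = write_blocks(payload, device, start_address=100)
restored = read_blocks(device, block_map, len(payload))

print(block_map)  # [100, 101, 102, 103, 104]
print(restored)   # b'hello block storage'
```

Note that the device itself knows nothing about file names or owners; it only maps addresses to blocks, which is why a file system or database has to sit on top of it.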
These methods worked fine before cloud computing arrived. So why do we need something else now? It is only human to ask why we should care without a reason. You will find the answer soon.
Object-based storage:
An object is data together with its corresponding metadata. Every object has a UniqueId, which is used to identify that particular object. Again, we won't go deep into how objects are laid out on disk, since that depends on the storage algorithm. In the abstract, an object can be stored anywhere on the disk and still be found by its UniqueId. Objects are immutable, that is, an existing object is never modified in place. (Wait, I know you are wondering why we should use it then.) If you change an object, the system creates a new object with a new UniqueId, but you can still access it the same way you accessed the old one. It is the algorithm's job to let you "modify" the object while creating a new object behind the scenes.
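The immutability described above can be sketched as a toy in-memory store. The `ObjectStore` class and its methods are hypothetical and exist only for illustration, and using a random UUID as the UniqueId is an assumption; real systems choose their own identifier scheme.

```python
import uuid


class ObjectStore:
    """Toy in-memory object store in which stored objects are never modified."""

    def __init__(self):
        self._objects = {}  # UniqueId -> (data, metadata); entries never change
        self._current = {}  # name -> UniqueId of the latest version

    def put(self, name, data, metadata=None):
        """Store data as a brand-new object and point the name at it."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (bytes(data), dict(metadata or {}))
        self._current[name] = object_id
        return object_id

    def get(self, name):
        """Return (data, metadata) of the latest object stored under name."""
        return self._objects[self._current[name]]

    def get_by_id(self, object_id):
        """Return (data, metadata) of one specific object."""
        return self._objects[object_id]


store = ObjectStore()
first_id = store.put("report", b"draft", {"author": "alice"})
second_id = store.put("report", b"final", {"author": "alice"})

print(first_id != second_id)         # True: the "update" created a new object
print(store.get("report")[0])        # b'final'
print(store.get_by_id(first_id)[0])  # b'draft': the old object is untouched
```

From the caller's point of view the second `put` looks like a modification, yet the store only ever adds objects.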
Suppose you have an object at /Documents/Important/Secrets.txt. In a file system, we would say Secrets.txt is the file name. In object-based storage, the whole path is the KEY. That is, the object is not Secrets.txt but /Documents/Important/Secrets.txt. Likewise, /Documents/Important is not a directory; the name is only a convenience that lets us picture it as a file system.
This way, we access an object by its KEY NAME. We don't need to care about folders or blocks; the whole object is there, identified by its key, which makes it easy to manage and organize. This is especially useful for unstructured data, where we only want to run analytics on the data and don't care about anything else.
Seems confusing? Simply remember: "An object is like a file and its key is like the file name. However, it is not a traditional file system: it replaces the whole object rather than modifying it."
- It has both data and metadata
- Operates on the whole object at once
- You can't mount it like a file system and browse through folders. Keys can certainly look like folder/filename, but remember that the whole of folder/filename is the object's key name, not a folder with a file inside it.
- Privilege management is possible
- Can handle billions of objects in real time
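The flat key namespace can be illustrated with a plain dictionary. This is a sketch under the assumption that a key is just a string; the stored values and the `get_object` and `list_keys` helpers are hypothetical.

```python
# A flat namespace: every key is a complete string, not a nested folder.
objects = {
    "/Documents/Important/Secrets.txt": b"classified",
    "/Documents/Important/Plan.txt": b"roadmap",
    "/Documents/Notes.txt": b"misc",
}


def get_object(key: str) -> bytes:
    """Fetch an object directly by its full key; no folder traversal happens."""
    return objects[key]


def list_keys(prefix: str) -> list:
    """Emulate 'opening a folder' by filtering keys on a common prefix."""
    return sorted(key for key in objects if key.startswith(prefix))


print(get_object("/Documents/Important/Secrets.txt"))  # b'classified'
print("Secrets.txt" in objects)                        # False: not a key by itself
print(list_keys("/Documents/Important/"))
# ['/Documents/Important/Plan.txt', '/Documents/Important/Secrets.txt']
```

The "folder" view is only a prefix filter over keys, so a lookup by full key costs the same however deep the name appears to be.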
Amazon S3:
Amazon S3 is a good example of an object-based storage solution. Each object resides in a container called a bucket, has a unique name called the object key, and carries metadata. You cannot mount a bucket like a file system. A bucket can hold any number of objects, but a single object cannot exceed 5 TB.
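As a rough sketch of how buckets and keys appear in code, the helpers below wrap three calls of the boto3 S3 client (`put_object`, `get_object` and `list_objects_v2`). The helper names, the bucket name and the key in the usage comment are assumptions for illustration. The client is passed in as a parameter, so the functions themselves need no AWS credentials until they are called with a real client.

```python
def upload_text(s3_client, bucket: str, key: str, text: str) -> None:
    """Store text as a whole object under the given key."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))


def download_text(s3_client, bucket: str, key: str) -> str:
    """Fetch the whole object stored under the given key."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8")


def list_object_keys(s3_client, bucket: str, prefix: str) -> list:
    """List keys that start with prefix (one page only, up to 1,000 keys)."""
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [item["Key"] for item in response.get("Contents", [])]


# Usage with a real client (requires boto3, AWS credentials and an existing
# bucket; the bucket name below is hypothetical):
#
#   import boto3
#   s3 = boto3.client("s3")
#   upload_text(s3, "my-example-bucket", "Documents/Important/Secrets.txt", "hi")
#   print(download_text(s3, "my-example-bucket", "Documents/Important/Secrets.txt"))
#   print(list_object_keys(s3, "my-example-bucket", "Documents/"))
```

Uploading to an existing key replaces the whole object, which matches the "operates on the whole object at once" behaviour of object storage.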
Amazon S3 is highly scalable, highly durable, read-optimized storage. You don't need to do capacity planning or take care of availability and reliability yourself. The S3 Standard storage class is designed for 99.999999999% (eleven nines) durability and 99.99% availability, so you can simply use the storage without worrying about what lies underneath. Amazon itself replicates your data across multiple data centers to prevent data loss. AWS also takes care of scalability: as your request rate grows, S3 scales with it, automatically partitioning your bucket to handle the load.
Object storage:
- Suited to high streaming throughput and analytics
- Scales to petabytes
- Can be stored anywhere and accessed from anywhere

Block storage:
- Suited to database transactions
- Block addressing limits its scalability
- Needs to be close to the server that uses it to keep latency low