Storage Basics Part 3 - or Time Waits For No WAN

So far in this series we have looked at some of the social aspects of storage and started our journey into the terminology that surrounds the storage industry. As we progress into part three, the terminology drops down a gear and races quickly to the speed of light.

File System

A file system is generally a hierarchical structure that maps a human-interpretable name, such as /home/glenn, to a set of stored data. It is the piece of software that knows how to find the data in your files. In most operating systems you lay down a file system as part of formatting a disk, and there are lots of file system types, some of the more common being NTFS, FAT32, ext4 and HFS+.
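To make the "name to data" mapping concrete, here is a minimal sketch in Python. It is a toy model, not how any real file system stores its metadata: the nested dictionary, the file name notes.txt and the block numbers are all made up for illustration.

```python
# Toy model only: directories are dicts, files are lists of block numbers.
# The tree contents below are hypothetical and not taken from a real disk.
tree = {"home": {"glenn": {"notes.txt": [17, 18, 42]}}}


def lookup(root, path):
    """Walk the hierarchy one path component at a time.

    Returns whatever sits at the end of the path: a dict for a directory,
    or a list of block numbers for a file. Raises KeyError if a component
    does not exist.
    """
    node = root
    # Ignore empty components so "/" and "//home" behave sensibly.
    for part in (p for p in path.split("/") if p):
        node = node[part]
    return node


blocks = lookup(tree, "/home/glenn/notes.txt")
print(blocks)
```

The human only ever deals with the name; the file system does the walk and hands back the location of the data.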

Volume Manager

A volume manager, or logical volume manager (LVM), is a software abstraction layer that sits between the file system software and the software that talks to the storage device. It allows lots of actions to be taken on the data without the need to rewrite lots of downstream interfaces. Examples include host-based mirroring or RAID, journaling, replication and, probably the most widely used feature, non-disruptive resizing of volumes.
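One way to picture that abstraction layer is as a lookup table from logical blocks to physical locations. The sketch below is a simplified illustration, assuming a fixed extent size and made-up device names (diskA, diskB); real volume managers keep far richer metadata.

```python
EXTENT_BLOCKS = 1024  # assumed number of blocks per extent, for illustration


class LogicalVolume:
    """Toy logical volume: an ordered list of physical extents."""

    def __init__(self):
        # Each entry is (device_name, physical_extent_index).
        self.extents = []

    def extend(self, device, first_extent, count):
        """Grow the volume by appending extents; existing mappings are untouched."""
        self.extents.extend((device, first_extent + i) for i in range(count))

    def size_blocks(self):
        return len(self.extents) * EXTENT_BLOCKS

    def translate(self, logical_block):
        """Map a logical block number to (device, physical block number)."""
        if not 0 <= logical_block < self.size_blocks():
            raise IndexError("logical block is outside the volume")
        index, offset = divmod(logical_block, EXTENT_BLOCKS)
        device, physical_extent = self.extents[index]
        return device, physical_extent * EXTENT_BLOCKS + offset


lv = LogicalVolume()
lv.extend("diskA", first_extent=0, count=2)   # volume starts on one device
lv.extend("diskB", first_extent=10, count=1)  # later grown onto a second device
print(lv.size_blocks(), lv.translate(1500), lv.translate(2048))
```

Because the file system only ever sees logical block numbers, growing the volume is just a matter of appending to the table, which is why resizing can be non-disruptive.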

Capacity

Capacity is the size of a storage entity, usually measured in bytes and often prefixed by mega, giga, tera or peta, and it is something you hear a lot about in storage discussions. It is one of those measurements that always needs another question to qualify it, so it’s best not to make any assumptions on the number alone, though that is easily done. You will hear adjectives such as raw and usable in the context of storage capacity, and knowing the difference can help you have an informed discussion with your supplier. In simple terms, in most conversations, raw capacity is the sum of the individual formatted capacities of the physical drives before any form of virtualization is applied. Usable capacity is the space left over once two things are subtracted from the raw figure: the overhead of virtualization at the controller/disk level, and the capacity of the spare disks installed to replace failed units.
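The raw versus usable arithmetic can be sketched in a few lines of Python. The drive count, drive size, number of spares and the 25% protection overhead below are illustrative assumptions, not figures from any particular array, and real products account for overhead in more detailed ways.

```python
def capacity_tb(drive_count, drive_tb, hot_spares, protection_overhead):
    """Return (raw, usable) capacity in TB under simple assumptions.

    protection_overhead is the assumed fraction of the non-spare capacity
    consumed by virtualization/protection at the controller/disk level.
    """
    raw = drive_count * drive_tb
    # Spare drives sit idle waiting to replace failed units.
    in_service_tb = (drive_count - hot_spares) * drive_tb
    usable = in_service_tb * (1 - protection_overhead)
    return raw, usable


raw, usable = capacity_tb(
    drive_count=24, drive_tb=4.0, hot_spares=2, protection_overhead=0.25
)
print(f"raw={raw:.1f} TB, usable={usable:.1f} TB")  # raw=96.0 TB, usable=66.0 TB
```

In this made-up example nearly a third of the raw figure never reaches the application, which is exactly why the number on its own needs a follow-up question.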

Latency

Latency is an important term that is not used enough in discussions around storage. It loosely describes the amount of time it takes for a storage request to be actioned. In applications where time-sensitive responses are critical, this is a significant measurement. Latency can be affected by many things in the data path, including protocol conversion and the speed of light over distance. Applications designed for the internet/cloud tend to be more tolerant of latency because updates to data are performed in an asynchronous manner. Simply put, the application does not wait for confirmation that its data has been delivered to its final destination, only that the request has been handed to the next receiver in the chain; at a later point confirmation of delivery may be requested, if the app cares! The asynchronous approach is essential because these applications use internet service providers as a primary data store, and it takes a comparatively huge amount of time (in computing terms) to set up a storage request across the internet.
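To put a rough number on "the speed of light over distance", here is a small Python sketch. It assumes light in optical fibre travels at about two thirds of its vacuum speed and ignores every other source of delay (switching, protocol conversion, the storage device itself), so the result is a floor rather than a prediction. The 1,000 km distance is just an example.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBRE_VELOCITY_FACTOR = 2 / 3      # assumption: typical figure for optical fibre


def min_round_trip_ms(distance_km, velocity_factor=FIBRE_VELOCITY_FACTOR):
    """Best-case round-trip time in milliseconds from propagation delay alone."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * velocity_factor)
    return 2 * one_way_s * 1000


def max_sync_ops_per_second(round_trip_ms):
    """Upper bound for one writer that waits for each confirmation in turn."""
    return 1000 / round_trip_ms


rtt = min_round_trip_ms(1000)
print(f"round trip: {rtt:.2f} ms")
print(f"synchronous ceiling: {max_sync_ops_per_second(rtt):.0f} writes/s")
```

Over 1,000 km the round trip alone comes to roughly 10 ms, so a single writer that waits for every confirmation tops out at around 100 writes per second no matter how fast the disks are. An asynchronous writer hands off the request and moves on, which is why it tolerates the distance.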

Next Time…

We start to break out the storage acronyms (hooray!) and look at some common and more obscure forms of RAID protection.

Don’t forget to comment, get involved and get in touch using the boxes below, or reach out directly @glennaugustus.