Understanding data storage

Data storage has come a long way since the days of disk systems. Sure, those disk systems might still be used here and there—but now all that data is attached to a network and software-defined.

Data storage is the collection and retention of digital information—the bits and bytes behind applications, network protocols, documents, media, address books, user preferences, and more. Data storage is a central component of big data and data management.

Think about it like this. Computers are like brains. Both have short-term and long-term memories. Brains handle short-term memory in the prefrontal cortex, while computers handle it with random-access memory (RAM).

Brains and RAM process and remember things while awake, and both get tired after a while. Your brain converts working memories into long-term memories while you sleep, and a computer transfers active memory into storage volumes when it sleeps. Computers also distribute data by type in the same way brains distribute memories by kind: semantic, spatial, emotional, or procedural.

Perhaps the best consolidated history of data storage devices is contained within the first dozen pages of Gordon Haff and William Henry’s From Pots and Vats to Programs and Apps: How Software Learned to Package Itself.

In it, Haff and Henry describe how a 1725 textile worker programmed looms using punchcards that were inspired by automated organs’ cylinders. Punchcards fed information into a 19th-century computer as part of the 1890 U.S. Census and remained popular until the era of magnetic tape drives began in the 1950s. From there, magnetic tape drives shrank until they became cassette tapes.

Right before the 1970s, IBM released the floppy disk, which was used for almost everything. Floppies initialized mainframes, stored software applications, and were the only persistent storage device available until hard disk drives (HDDs) dropped in price. Compact discs (CDs) followed in the 1980s, and solid state drives (SSDs) later replaced spinning disks with solid chips and flash memory. Flash storage now fits in our pockets as flash drives that hold copies of everything we want or need.

Software-defined storage

Software-defined storage (SDS) uses abstraction management software to decouple data from hardware before reformatting and organizing it for network use. SDS works particularly well with container and microservice workloads that use unstructured data, since it can scale in ways hardwired storage solutions simply can’t.
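
To make the idea of decoupling data from hardware a little more concrete, here is a minimal sketch in Python. It is not how any particular SDS product works. The `Device` and `StoragePool` classes, the device names, and the "most free space" placement rule are all hypothetical, chosen only to show callers talking to a software layer rather than to a specific piece of hardware.

```python
class Device:
    """Hypothetical stand-in for one piece of storage hardware."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity  # capacity in bytes
        self.items = {}           # key -> bytes held on this device

    def free(self):
        return self.capacity - sum(len(v) for v in self.items.values())


class StoragePool:
    """Hypothetical software layer that hides which device holds the data."""

    def __init__(self):
        self.devices = []
        self._placement = {}  # key -> Device that holds it

    def add_device(self, device):
        # Scaling out is just registering more hardware with the pool.
        self.devices.append(device)

    def total_capacity(self):
        return sum(d.capacity for d in self.devices)

    def write(self, key, data):
        if key in self._placement:
            raise KeyError("this sketch does not support overwrites")
        if not self.devices:
            raise RuntimeError("pool has no devices")
        # Assumed placement rule: pick the device with the most free space.
        target = max(self.devices, key=lambda d: d.free())
        if target.free() < len(data):
            raise RuntimeError("no device has enough free space")
        target.items[key] = data
        self._placement[key] = target

    def read(self, key):
        # Callers never name a device; the pool looks it up for them.
        return self._placement[key].items[key]


pool = StoragePool()
pool.add_device(Device("hdd-a", 10))
pool.write("k1", b"12345678")
pool.add_device(Device("ssd-b", 20))   # grow the pool without touching callers
pool.write("k2", b"abcdefghijkl")
```

The caller reads `k1` and `k2` the same way even though they end up on different devices, which is the decoupling the paragraph above describes.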

Cloud storage

Cloud storage is the organization of data kept somewhere that can be accessed through the internet by anyone—given the right permissions. You don’t need to be connected to an internal network (that’s known as NAS) and aren’t accessing the data from hardware directly attached to your computer. Popular cloud storage providers include Microsoft, Google, and IBM.

Network-attached storage

Network-attached storage (NAS) makes data more accessible to internal networks by installing a lightweight operating system onto a server that turns it into something called a NAS box, unit, or head. The NAS box becomes an important part of intranets because it processes every single storage request.

Object storage

Object storage, also known as object-based storage, is a flat structure in which data is broken into discrete units called objects and spread out among hardware. The objects are kept in a single repository, instead of being kept as files in folders or as blocks on servers.
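
The flat structure can be sketched in a few lines of Python. This is a toy, in-memory illustration, not any vendor's API. The `ToyObjectStore` class, its `put` and `get` methods, the metadata fields, and the choice to derive IDs from a SHA-256 hash of the content are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: the payload plus descriptive metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ToyObjectStore:
    """Hypothetical in-memory object store with a single flat namespace."""

    def __init__(self):
        # One repository: object ID -> object. No folders, no hierarchy.
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        # Assumption: the ID is a hash of the content; real systems may
        # assign identifiers differently.
        object_id = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id: str) -> StoredObject:
        return self._objects[object_id]


store = ToyObjectStore()
photo_id = store.put(b"raw image bytes", content_type="image/png", owner="alice")
retrieved = store.get(photo_id)
```

Every object is retrieved by its ID alone; there is no path to walk and no folder to open.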

File storage

File storage arranges data as hierarchical files that users can open and navigate from top to bottom. Since files are stored on back ends and front ends the same way, users can request files by unique identifiers such as names, locations, or URLs. This is the predominant human-readable storage format.
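
As a small illustration of that hierarchy, the following Python sketch builds a throwaway directory tree, walks it from top to bottom, and then requests one file by its location. The folder and file names (`projects`, `2024`, `report.txt`) are made up for the example, and the tree lives in a temporary directory that is deleted when the block finishes.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Create a nested folder structure with one file at the bottom.
    report = root / "projects" / "2024" / "report.txt"
    report.parent.mkdir(parents=True)
    report.write_text("quarterly numbers")

    # Navigate from top to bottom: list every folder and file under the root.
    listing = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))

    # Request the file by its location in the hierarchy.
    contents = (root / "projects" / "2024" / "report.txt").read_text()
```

Here the path itself is the identifier: the same name and location that a person reads are what the system uses to find the data.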

Block storage

Block storage splits storage volumes into individual instances known as blocks. Each block exists independently, which gives users complete configuration autonomy. Because blocks aren’t burdened with the same unique identifier requirements as files, blocks are a faster storage system—making them ideal formats for rich media databases.
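
A rough way to picture this is a volume that only understands numbered, fixed-size blocks. The Python sketch below is a toy model under stated assumptions: the `ToyBlockVolume` class, the 512-byte block size, and the zero-byte padding are illustrative choices, not a description of any real block device or driver.

```python
BLOCK_SIZE = 512  # bytes per block; an illustrative value


class ToyBlockVolume:
    """Hypothetical volume made of fixed-size blocks addressed by number."""

    def __init__(self, num_blocks: int):
        self.num_blocks = num_blocks
        self._data = bytearray(num_blocks * BLOCK_SIZE)

    def _offset(self, index: int) -> int:
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index out of range")
        return index * BLOCK_SIZE

    def write_block(self, index: int, payload: bytes) -> None:
        if len(payload) > BLOCK_SIZE:
            raise ValueError("payload larger than one block")
        start = self._offset(index)
        # Pad with zero bytes so every block stays exactly BLOCK_SIZE long.
        self._data[start:start + BLOCK_SIZE] = payload.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, index: int) -> bytes:
        start = self._offset(index)
        return bytes(self._data[start:start + BLOCK_SIZE])


volume = ToyBlockVolume(num_blocks=8)
volume.write_block(3, b"hello")
block = volume.read_block(3)
```

The volume has no notion of file names or folders. Each block is read and written on its own by position, and anything like a file system or a database would be layered on top by whoever uses the volume.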

The way you learn to do anything else: practice. Deploying a new storage system is a lot smoother with training, and we have a ton of ways to make sure you’re ready. If you think you’re blessed with an innate knowledge of storage systems—or just want to see if you know enough to be dangerous—take this little storage quiz to assess your skill level. If you need some training, take a few courses from our cloud computing, virtualization, and storage curriculum, complete the whole thing, or take the ones required for you to get a Red Hat Certificate of Expertise in Hybrid Cloud Storage.

Software-defined storage is inherently open. It decouples hardware from software, freeing you from vendor lock-in. Red Hat has taken "open" a step further. Our software-defined storage is also open source. It draws on the innovations of a community of developers, partners, and customers. This gives you control over exactly how your storage is formatted and used—based on your business’ unique workloads, environments, and needs.

