Data Storage Hardware Guide for Software Engineers

Recently I have been working on lower-level optimizations for different Spark and Flink jobs, which led me down a rabbit hole of learning about the internals of storage hardware. A lot of what I learned is probably not applicable to most people, but I found a few tidbits that I think would be useful to most engineers. In this post we will focus on storage hardware, with examples of how we can use this knowledge to improve our software.

There are three types of storage hardware that we will focus on:

  1. Magnetic Disk Drives (HDDs)

  2. Solid State Drives (SSDs)

  3. RAM

Magnetic Disk Drives (HDD)

HDDs use magnetization to physically encode binary data, which requires a rotating disk and a mechanical arm that moves over it. A read/write head at the tip of the arm reads the data by detecting the transitions in magnetization on the disk.

Because of these mechanical parts, and how they are used to read and write data, we have the following limitations:

  • The data has to rotate under the read/write head before it can be accessed, which adds to read latency. This delay is called rotational latency and is generally ~4ms. Together with the time it takes to move the arm, it contributes to poor random access performance.

  • IOPS (I/O operations per second) range from 50-500, which limits read and write throughput.
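To see where a figure like ~4ms comes from, here is a rough back-of-the-envelope sketch. It assumes that, on average, the disk has to turn half a revolution before the data arrives under the head, and that each random operation pays one seek plus one rotational wait. The spindle speeds and the 9ms average seek time are illustrative assumptions, not numbers from this post or from any specific drive.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for data to rotate under the head: half a revolution."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def estimated_random_iops(rpm: float, avg_seek_ms: float) -> float:
    """Rough estimate: each random I/O costs one seek plus one rotational wait."""
    service_time_ms = avg_seek_ms + avg_rotational_latency_ms(rpm)
    return 1000 / service_time_ms


if __name__ == "__main__":
    # Assumed spindle speeds and seek time, for illustration only.
    for rpm in (5400, 7200, 15000):
        latency = avg_rotational_latency_ms(rpm)
        iops = estimated_random_iops(rpm, avg_seek_ms=9.0)
        print(f"{rpm} RPM: ~{latency:.2f} ms rotational latency, ~{iops:.0f} random IOPS")
```

With these assumptions a 7200 RPM drive lands at roughly 4.2ms of rotational latency and well under 100 random IOPS, which is consistent with the low end of the range above.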

Despite these limitations, HDDs are very cheap, ~3 cents per GB of storage.

Summary

  • Cheap storage, currently ~$0.03 per GB

  • Higher read latency than SSDs and RAM

  • Higher failure rates than SSDs, but this doesn’t usually matter since backups are quite cheap

  • Very slow random access

Solid State Drives (SSD)

SSDs store data as charges in flash memory cells. Because they use flash memory instead of a rotating disk and mechanical arm, SSDs provide lower latencies for both sequential and random access. The lack of moving parts also makes SSDs much more reliable than HDDs (4-10x lower failure rates). The most notable advantage of SSDs over HDDs is the ability to support thousands to tens of thousands of IOPS. Very popular databases like Postgres and MySQL leverage SSDs to support thousands of transactions per second.

Until recently, SSDs did not provide as much storage capacity as HDDs, but with recent improvements this is no longer much of an issue. In practice, the main noticeable downside of SSDs compared to HDDs is cost, with SSDs costing ~30 cents per GB (10x the cost of HDDs).

Summary

  • ~$0.30 per GB, 10x the price of magnetic disk drives

  • Support much higher transfer speeds and tens of thousands of IOPS

  • More reliable than HDDs because they don’t have mechanical parts that may wear out

  • Faster random access than HDDs

Random Access Memory (RAM)

RAM provides the lowest read latency and highest bandwidth of the three types of storage we discuss in this post. But RAM is also the most expensive and least reliable. RAM currently costs around $10/GB of capacity, and because it is volatile, it loses all its data in less than a second once power is cut.

Summary

  • $10 per GB

  • Not reliable: it is volatile and loses its data in less than a second when power is lost

  • Much faster retrieval than SSDs and HDDs

Comparison
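As a quick way to compare the three on cost, the sketch below multiplies a dataset size by the per-GB prices quoted in this post. The prices are the approximate figures from the sections above and will drift over time; the function names are just illustrative.

```python
from typing import Dict

# Approximate per-GB prices quoted in this post; real prices vary.
COST_PER_GB_USD: Dict[str, float] = {"HDD": 0.03, "SSD": 0.30, "RAM": 10.00}


def storage_cost_usd(size_gb: float, medium: str) -> float:
    """Cost of holding `size_gb` of data on the given medium."""
    return size_gb * COST_PER_GB_USD[medium]


def cost_table(size_gb: float) -> Dict[str, float]:
    """Cost of the same dataset on every medium, rounded to cents."""
    return {m: round(storage_cost_usd(size_gb, m), 2) for m in COST_PER_GB_USD}


if __name__ == "__main__":
    for medium, cost in cost_table(1000).items():
        print(f"1 TB on {medium}: ${cost:,.2f}")
```

For 1 TB (taken as 1000 GB) this works out to roughly $30 on HDD, $300 on SSD, and $10,000 in RAM, which is why the choice of medium matters so much once data volumes grow.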

Optimizing your System

Below are examples of how we can leverage the properties of these storage types depending on the use case.

Example: Apache Kafka

Random access on HDDs is slow because it takes time to physically move the mechanical arm to different locations on the magnetic disk. Kafka limits the need for random access by taking advantage of sequential IO. Kafka uses an append-only log where data is written sequentially, allowing for much higher write throughput than if random access were used. Similarly, data is usually read in the same order it was written, and Kafka can take advantage of batching to read a group of records quickly rather than reading them one by one.
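The append-only idea can be sketched in a few lines. This is a toy illustration, not Kafka's actual storage format: the `AppendOnlyLog` class, the length-prefixed record layout, and the in-memory offset list are all assumptions made for the example, and it assumes it starts from a fresh file.

```python
import os
import tempfile
from typing import List


class AppendOnlyLog:
    """Toy append-only log: records are only ever written at the end of one file."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.offsets: List[int] = []  # byte position where each record starts
        open(path, "ab").close()  # create the file if it does not exist

    def append(self, record: bytes) -> int:
        """Write one length-prefixed record at the end of the file; return its index."""
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            self.offsets.append(f.tell())
            f.write(len(record).to_bytes(4, "big") + record)
        return len(self.offsets) - 1

    def read_batch(self, start: int, max_records: int) -> List[bytes]:
        """Seek once to record `start`, then read up to `max_records` sequentially."""
        records: List[bytes] = []
        if start >= len(self.offsets):
            return records
        with open(self.path, "rb") as f:
            f.seek(self.offsets[start])  # the only seek in the whole batch
            for _ in range(max_records):
                header = f.read(4)
                if len(header) < 4:
                    break  # reached the end of the log
                size = int.from_bytes(header, "big")
                records.append(f.read(size))
        return records


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = AppendOnlyLog(os.path.join(tmp, "topic-0.log"))
        for i in range(5):
            log.append(f"event-{i}".encode())
        print(log.read_batch(start=2, max_records=2))
```

The point of the design is that a batch read pays for a single seek and then streams bytes in order, which is exactly the access pattern a spinning disk handles well.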

Although SSDs provide lower read latency, the gap between SSDs and HDDs is much smaller for sequential access than it is for random access. Assuming your system doesn’t have frequent lagging consumers (consumers that have fallen behind read older data, which requires random access), you may be able to run all your brokers on HDDs exclusively, saving up to 10x in storage costs.

Example: Caching Data

For data that is accessed very frequently with low latency requirements, it may make sense to cache your data in an in-memory data store like Redis or Memcached. Since RAM is the most expensive type of storage, you usually cannot keep everything in memory, so you will need an intelligent cache eviction strategy that fits your use case. The most common policy is LRU (least recently used), and most data stores used for caching can employ this strategy.
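To make the LRU idea concrete, here is a minimal in-process sketch built on Python's `collections.OrderedDict`. It is not how Redis or Memcached implement eviction internally; the `LRUCache` class and its `get`/`put` methods are hypothetical names chosen for this example.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[Any]:
        """Return the cached value, or None on a miss. A hit marks the key as recent."""
        if key not in self._items:
            return None
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        """Insert or update a value, evicting the oldest entry if over capacity."""
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used key


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    cache.put("user:1", "alice")
    cache.put("user:2", "bob")
    cache.get("user:1")            # touch user:1 so user:2 becomes the oldest
    cache.put("user:3", "carol")   # evicts user:2
    print(cache.get("user:2"), cache.get("user:1"), cache.get("user:3"))
```

In a real deployment you would normally configure this in the cache itself rather than write it yourself. Redis, for example, offers LRU-style eviction through its `maxmemory-policy` setting, implemented as an approximation of LRU.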

Example: Apache Druid

Apache Druid is an OLAP datastore that leverages all three types of storage in interesting ways. We could dedicate an entire post to Druid, but here we’ll briefly discuss the way Druid handles deep storage, metadata storage, and compute node storage.

A very common Druid setup is to use S3 for deep storage (uses HDD by default), a relational database like Postgres for metadata storage (SSD by default), and then very large nodes with lots of memory and SSDs for compute.

  • Deep storage serves as a backup for your data and is used to transfer data in the background between processes.

  • Metadata storage holds lots of system metadata that Druid uses to optimize queries and transformations of data.

  • Your regular compute nodes store recently written data, and other data that Druid believes will be accessed soon.

Druid tiers your data, intelligently moving it from more expensive storage, like the SSDs on Druid nodes, to cheaper storage like S3. This allows Druid to serve low-latency access for data that is likely to be used for OLAP workloads while saving costs on data that does not need to be accessed right away. When Druid requires data stored in deep storage, it shuffles this data from S3 back onto Druid’s compute nodes.
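The general shape of this kind of tiering can be sketched as a small, fast local tier sitting in front of a large, cheap deep tier. To be clear, this is not Druid's implementation or its actual placement rules: the `TieredSegmentStore` class is hypothetical, plain dictionaries stand in for S3 and node-local SSDs, and the "keep the most recently used segments locally" policy is an assumption chosen to keep the example short.

```python
from collections import OrderedDict
from typing import Dict


class TieredSegmentStore:
    """Toy two-tier store: every segment lives in deep storage, hot ones also locally."""

    def __init__(self, local_capacity: int) -> None:
        self.local_capacity = local_capacity
        self.deep: Dict[str, bytes] = {}  # stands in for S3 (cheap, slow)
        self.local: "OrderedDict[str, bytes]" = OrderedDict()  # stands in for node SSD/RAM
        self.deep_reads = 0  # how often we had to fall back to deep storage

    def write(self, segment_id: str, data: bytes) -> None:
        """New data is persisted to deep storage and kept locally while it is recent."""
        self.deep[segment_id] = data
        self._load_local(segment_id, data)

    def read(self, segment_id: str) -> bytes:
        """Serve from the local tier if possible, otherwise pull from deep storage."""
        if segment_id in self.local:
            self.local.move_to_end(segment_id)
            return self.local[segment_id]
        data = self.deep[segment_id]  # raises KeyError for unknown segments
        self.deep_reads += 1
        self._load_local(segment_id, data)
        return data

    def _load_local(self, segment_id: str, data: bytes) -> None:
        """Place a segment in the local tier, dropping the coldest ones if over capacity."""
        self.local[segment_id] = data
        self.local.move_to_end(segment_id)
        while len(self.local) > self.local_capacity:
            self.local.popitem(last=False)  # still safe in deep storage


if __name__ == "__main__":
    store = TieredSegmentStore(local_capacity=2)
    for day in ("2024-01-01", "2024-01-02", "2024-01-03"):
        store.write(day, f"rows for {day}".encode())
    store.read("2024-01-03")  # served from the local tier
    store.read("2024-01-01")  # not local any more, reloaded from deep storage
    print(store.deep_reads, list(store.local))
```

Dropping a segment from the local tier never loses data here, because the deep tier always holds a full copy. That is the property that makes it safe to keep only the hot subset on expensive storage.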
