RAID stands for Redundant Array of Independent Disks. It is a computer storage technology that aggregates several disks into one or more storage pools or volumes. By storing multiple copies of data or generated parity information, RAID protects against disk failures and improves data storage availability and reliability.
There are several RAID types or levels varying in functionality, performance and level of data protection. In this post we will review some of the most common RAID types and their characteristics.
RAID0 | RAID1 | RAID1E | 3-Way Mirror | RAID3 | RAID4 | RAID5 | RAID5E | RAID5EE | RAID6 | RAID10 | RAID50 | RAID60 | RAID-DP | RAID-TEC | RAIDZ
RAID0 is also known as Striped Volume or Stripe Set. Despite the word "Redundant" in the name, RAID0 does not actually provide any fault tolerance or redundancy. It works by simply striping data across two or more disks - i.e., evenly distributing blocks across all disks in the array. Read and write operations can be performed by all disks independently.
RAID0 neither creates multiple copies of data nor generates parity information, therefore 100% of the total disk capacity is available to store user data. RAID0 provides the best possible storage efficiency and read/write performance. On the other hand, a failure of a single disk in a RAID0 array will lead to data loss.
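To make the striping layout concrete, here is a minimal Python sketch of how a logical block number could be mapped to a member disk and an offset on that disk. The function name and the simple round-robin layout are illustrative assumptions, not a description of any particular controller:

```python
def raid0_location(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Return (disk_index, block_on_disk) for a simple RAID0 stripe set."""
    disk_index = logical_block % num_disks      # blocks are spread round-robin
    block_on_disk = logical_block // num_disks  # position of the block on that disk
    return disk_index, block_on_disk

# Example: 8 logical blocks striped across 3 disks
for block in range(8):
    disk, offset = raid0_location(block, num_disks=3)
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```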
RAID0 configurations with hard disk drives (HDD) are suited only for non-mission-critical systems tolerant to data loss.
Software-based RAID0 configurations are often used to create large storage volumes from multiple virtual disks or LUNs provided by highly reliable and redundant storage area network (SAN) or cloud storage infrastructure.
RAID1 is also known as Mirror Set or Mirroring. RAID1 requires two disks and works by storing two copies of each data block - one on each disk. Other mirroring RAID types capable of utilizing more than two disks (e.g., RAID1E, RAID10, Triple Mirror) are described further in this post.
Incoming read requests received by a RAID1 set can be distributed between both drives and served independently. In contrast, for each incoming write request, the received data block must be written to both drives in order to complete a single write operation. This effect is known as "IO multiplication". Another term used to describe it is "write penalty". The RAID1 write penalty is 2, which means each incoming write request results in two internal write operations. For more details see RAID1 Performance Calculator.
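As a rough illustration of the write penalty, the sketch below estimates RAID1 random IOPS from the IOPS of a single member disk. It is a simplified model in the spirit of such a calculator; the function name, the workload mix, and the per-disk IOPS figure are assumptions made here for illustration only:

```python
def raid1_iops(disk_iops: float, read_fraction: float) -> float:
    """Approximate random IOPS of a two-disk RAID1 set.

    Reads can be served by either disk, so read capacity is roughly 2x a
    single disk. Every logical write costs two internal writes (penalty = 2).
    """
    write_fraction = 1.0 - read_fraction
    raw_iops = 2 * disk_iops  # two disks working in parallel
    return raw_iops / (read_fraction + 2 * write_fraction)

# Example: 150 IOPS disks, 70% read / 30% write workload -> ~231 effective IOPS
print(round(raid1_iops(150, read_fraction=0.7)))
```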
As a result of data duplication, only 50% of the total RAID1 disk capacity is available for user data. Two copies of data safeguard against a disk failure. When a disk fails, the RAID1 array transitions into a degraded state and continues to operate by directing all read and write requests to the remaining operational disk. The array starts the rebuild process if a hot spare disk is available or as soon as the failed disk is replaced. The rebuild is performed by copying all data blocks from the operational disk to the replacement disk.
RAID1 is suitable for systems and applications requiring reliable storage, good read/write performance, and relatively low storage capacity.
RAID1E, also known as Enhanced RAID1, combines data striping and disk mirroring functionality. Similarly to RAID1, it maintains two copies of data. Data blocks are striped across all disks in the array but mirrored only to two of them. RAID1E requires a minimum of three disks and supports odd and even numbers of disks.
Since it stores two copies of each data block, RAID1E usable capacity is equal to 50% of the total capacity of all disks in the set.
Read requests can be distributed between the disks and served independently. Similarly to RAID1, each incoming write request requires two internal write operations. Therefore, RAID1E write penalty is also 2.
A RAID1E array can withstand a failure of one disk without data loss. When a disk fails, the RAID1E array transitions into a degraded state and continues to operate by directing all read and write requests to the remaining operational disks. The array starts the rebuild process if a hot spare disk is available or as soon as the failed disk is replaced. The rebuild is performed by copying data blocks from the operational disks to the replacement disk.
Overall, RAID1E capacity and performance characteristics are comparable to RAID10.
A 3-Way Mirror RAID set, also known as Triple Mirror, works in the same way as RAID1 but requires three disks and keeps three copies of each data block - one on each disk in the set. As a result, 3-Way Mirror usable capacity is equal to only one third (1/3), or 33.33%, of the total disk capacity.
A 3-Way Mirror can withstand a simultaneous failure of up to two disks without data loss. When a disk fails, the 3-Way Mirror array transitions into a degraded state and continues to operate by directing all read and write requests to the remaining operational disks. The rebuild is performed by copying data blocks from the operational disks to the replacement disk.
Read requests can be distributed between all disks in the 3-Way Mirror set and served independently. In contrast, each write request translates into three internal write operations - one per disk. Therefore, the write penalty for a 3-Way Mirror is 3.
RAID3 is a parity based RAID with byte-level striping and a dedicated parity disk. It works by striping incoming data blocks on the byte level across all data disks. In addition, RAID3 calculates parity information for each stripe and writes it to the dedicated disk. Each data disk stores only a portion of every data block; therefore, a single incoming read or write request requires corresponding parallel operations from each disk in the set.
Since one disk is dedicated for storing parity information, RAID3 usable storage capacity can be calculated as (N-1)*S, where N is the total number of disks in a set and S is the drive capacity.
A RAID3 array can withstand a failure of one disk without data loss. When the failed disk is replaced, RAID3 recreates the missing data by reading data blocks from the operational disks and performing parity calculations.
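The parity used by RAID3 (and by RAID4 and RAID5 below) is a bitwise XOR across the data in a stripe, which is what makes this reconstruction possible. Here is a minimal, hedged Python sketch in which short byte strings stand in for disk contents; the helper name and the example data are illustrative only:

```python
from functools import reduce

def xor_parity(chunks: list[bytes]) -> bytes:
    """XOR equally sized chunks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

# One stripe spread across three data disks plus a dedicated parity disk
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)

# Simulate losing the second disk: XORing the survivors with the parity restores it
recovered = xor_parity([data[0], data[2], parity])
assert recovered == data[1]
print("recovered:", recovered)
```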
RAID4 is a parity based RAID with block-level data striping and a dedicated parity disk. RAID4 is similar to RAID3 in that it uses a dedicated drive to store parity blocks, but it stripes data on the block level rather than the byte level.
Since one disk is dedicated for storing parity information, RAID4 usable storage capacity can be calculated as (N-1)*S, where N is the total number of disks in the set and S is the capacity of a single drive.
Incoming read requests received by a RAID4 set are directed to the disks containing the requested data blocks. Data striping helps distribute load between the data disks in the same way as in RAID0. Incoming write requests, however, require calculation and update of parity information in addition to modification of the actual data blocks. As a result, RAID4 must perform four internal input-output (IO) operations in order to modify a single data block: (1) read the old data block, (2) read the old parity block, (3) write the new data block, and (4) write the new parity block.
Therefore, the RAID4 write penalty is defined as 4 (four).
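The parity update itself does not require touching the other data disks, because the new parity can be derived from the old data block, the old parity block, and the new data block alone. A hedged Python sketch of this read-modify-write shortcut (the variable names and sample blocks are made up for illustration):

```python
# Stripe with three data blocks; the parity block is the XOR of all of them.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = bytes(a ^ b ^ c for a, b, c in zip(d0, d1, d2))

def updated_parity(old_block: bytes, new_block: bytes, old_parity: bytes) -> bytes:
    """new_parity = old_parity XOR old_block XOR new_block (per byte)."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_block, new_block))

# Overwriting d1 costs four internal IOs:
#   (1) read old data block d1    (2) read old parity block
#   (3) write new data block      (4) write new parity block
new_d1 = b"ZZZZ"
new_parity = updated_parity(d1, new_d1, parity)

# Sanity check: the shortcut matches recomputing parity from the full stripe
assert new_parity == bytes(a ^ b ^ c for a, b, c in zip(d0, new_d1, d2))
print("parity updated without reading d0 or d2")
```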
Since all parity blocks are stored on a dedicated disk, that disk can become a bottleneck during random write operations. As a result, overall RAID4 random write performance is limited by the performance of the parity disk.
RAID4 can withstand a failure of a single disk without data loss. In a degraded state (when one of the disks has failed), data will be read from the operational disks containing the required data blocks. If the required block was located on the failed disk, the missing data must be recalculated by using all remaining blocks in the stripe. This requires reading all blocks in the stripe from all remaining operational disks and, as a result, creates additional load on the RAID4 set.
When the failed disk is replaced, RAID4 recreates missing data in a similar way - by reading all data blocks from the operational disks, performing parity calculations, and writing restored data blocks to the replacement disk.
RAID4 is not commonly used due to its write performance shortcomings. NetApp is probably the only major storage vendor that has implemented RAID4 in commercial storage appliances. NetApp’s proprietary Write Anywhere File Layout (WAFL) file system mitigates RAID4 performance issues and optimizes disk write operations by utilizing caching, aggregating incoming write requests in non-volatile random-access memory (NVRAM), and writing data to disks in full stripes.
RAID5 is a parity based RAID with block-level data striping and distributed parity.
Each RAID5 data stripe contains one parity block calculated using all data blocks in the stripe. The placement of the parity block (i.e., the disk storing the parity information for a particular stripe) changes from stripe to stripe. In other words, parity information is distributed across all disks in a RAID5 set, allowing RAID5 to effectively avoid the parity drive bottleneck issue inherent to RAID4.
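To make the rotation concrete, here is a small sketch of one possible rotating parity layout. Real implementations use several different layouts (left/right, symmetric/asymmetric), so treat the function below purely as an illustration:

```python
def raid5_parity_disk(stripe: int, num_disks: int) -> int:
    """Disk index holding the parity block for a given stripe
    (simple rotating layout; actual layouts vary by implementation)."""
    return (num_disks - 1 - stripe) % num_disks

# With 4 disks, parity rotates: stripe 0 -> disk 3, stripe 1 -> disk 2, ...
for stripe in range(6):
    print(f"stripe {stripe}: parity on disk {raid5_parity_disk(stripe, 4)}")
```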
The amount of disk capacity required to store parity information is equivalent to the capacity of one drive. Therefore, RAID5 usable storage capacity can be calculated as (N-1)*S, where N is the total number of disks in the set and S is the capacity of a single drive.
RAID5 exhibits good read performance. However, its write performance is impacted by the so-called write penalty, or IO multiplication effect. More specifically, incoming write requests trigger additional internal disk operations required to update parity information. For example, in order to overwrite a single data block, RAID5 must complete the following operations: (1) read the old data block, (2) read the old parity block, (3) write the new data block, and (4) write the new parity block.
Therefore, the write penalty for RAID5 is defined as 4. For more details see RAID5 Performance Calculator.
RAID5 can withstand a failure of a single disk without data loss. In a degraded state (when one of the disks has failed) data will be read from the operational disks containing the required data blocks. If the required block was located on the failed disk, missing data will be recalculated using the remaining blocks in the stripe. This requires reading all blocks in the stripe, and as a result, creates additional load on the remaining operational disks in the RAID set.
The array rebuild operation is performed in a similar way. When the failed disk is replaced, RAID5 reads all data blocks from the operational disks, restores the missing data by performing parity calculations, and then writes the restored blocks to the replacement disk. The rebuild process may take significant time and puts additional stress on the RAID set. Any additional disk failure or read error during the rebuild process will likely cause data loss. Due to this high risk of data loss, RAID5 is not recommended for hard disk drives (HDD) with capacity over 1 TB.
RAID5E, or RAID5 Enhanced, extends the general capabilities of RAID5 by integrating an additional (hot spare) disk drive into RAID operations.
RAID5E works by distributing data and parity blocks across all disks in the array, including the additional hot spare disk. This additional disk becomes a part of the overall RAID5E configuration and participates in read and write operations. However, some capacity (hot spare space) is reserved at the end of each disk. This hot spare space is used to rebuild and re-stripe the data in the event of a disk failure.
The amount of storage reserved for the hot spare space is equal to the capacity of one disk. Taking into account also the space required to store parity information, RAID5E usable capacity can be calculated as (N-2)*S, where N is the total number of disks in the set and S is the capacity of a single drive.
Generally, RAID5E performs better than an equivalent RAID5 array simply because there is an additional disk in the RAID5E array actively serving IO requests (assuming that both arrays are built with disks of the same type/model).
RAID5E write penalty is defined as 4, same as for RAID5.
A RAID5E array can withstand a failure of a single drive. A second disk failure or a read error during the rebuild operation will lead to data loss.
When one of the disks fails, RAID5E immediately starts the rebuild process by utilizing the reserved hot spare space. It re-stripes data across all remaining disks to form a "regular" RAID5 set. This re-striping process can take significant time and creates additional load on the array.
RAID5EE, or Enhanced RAID5E, integrates the hot spare space within each stripe. Instead of reserving hot spare space at the end of each disk as is done in RAID5E, RAID5EE adds an empty block within each stripe. The empty blocks are used to store regenerated data blocks if one of the disks fails. As a result, the RAID5EE rebuild process is faster compared to RAID5E, since there is no need to re-stripe all data.
Similarly to RAID5E, RAID5EE usable capacity can be calculated as (N-2)*S, where N is the total number of disks in the set and S is the capacity of a single drive.
Performance and fault tolerance characteristics of RAID5EE are also comparable to those of RAID5E.
RAID6, also known as double-parity RAID, is a parity based array with block-level data striping and two independent parity blocks per stripe.
There are several double-parity RAID implementations, such as IBM’s EVENODD, NetApp’s Row-Diagonal Parity (see RAID-DP), or more generic Reed-Solomon encoding, varying in the methods of parity calculation and distribution. They all, however, demonstrate comparable levels of failure protection and storage efficiency.
Parity-based RAIDs use a portion of the total disk capacity (often called an overhead) to store parity blocks. For RAID6 this portion is equivalent to the capacity of 2 disks. Therefore, a RAID6 group with a total of N disks will have a usable capacity of (N-2)*S bytes, where S is the capacity of one disk.
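The same usable-capacity arithmetic applies across the parity-based levels in this post, so a small helper makes the comparison explicit. This is a hedged sketch; the helper name, the disk count, and the 4 TB disk size are example assumptions:

```python
def usable_capacity_tb(num_disks: int, disk_size_tb: float, parity_disks: int) -> float:
    """Usable capacity of a single parity RAID group: (N - parity) * S."""
    return (num_disks - parity_disks) * disk_size_tb

# Example: 8 x 4 TB disks
print("RAID5:", usable_capacity_tb(8, 4, parity_disks=1), "TB")  # 28 TB (87.5% efficiency)
print("RAID6:", usable_capacity_tb(8, 4, parity_disks=2), "TB")  # 24 TB (75% efficiency)
```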
RAID6 exhibits good read performance. On the other hand, its write performance is impacted by the so-called write penalty, or IO multiplication effect - the need to update parity information when a data block is changed. In other words, incoming write requests trigger additional disk operations that the RAID must perform in order to update parity information. For example, to modify a single data block, RAID6 must complete the following operations: (1) read the old data block, (2) read the first old parity block, (3) read the second old parity block, (4) write the new data block, (5) write the first new parity block, and (6) write the second new parity block.
This means that, in order to complete one write request, RAID6 needs to perform six internal IO operations. Therefore, the write penalty for RAID6 is defined as 6. As a result, RAID6 arrays are considered unsuitable for write-intensive workloads. For more details see RAID6 Performance Calculator.
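For a rough sense of what the different write penalties mean in practice, the same simplified model used in the RAID1 sketch above can be generalized. Again, this is only an illustrative estimate; the disk count, per-disk IOPS, and workload mix are assumptions, and real arrays are also affected by caching and full-stripe writes:

```python
def effective_iops(num_disks: int, disk_iops: float,
                   read_fraction: float, write_penalty: int) -> float:
    """Simplified front-end IOPS estimate for a RAID group."""
    write_fraction = 1.0 - read_fraction
    raw_iops = num_disks * disk_iops
    return raw_iops / (read_fraction + write_penalty * write_fraction)

# Example: 8 disks x 150 IOPS each, 70% read / 30% write workload
for level, penalty in [("RAID10", 2), ("RAID5", 4), ("RAID6", 6)]:
    print(level, round(effective_iops(8, 150, 0.7, penalty)), "IOPS")
```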
A RAID6 array can withstand a simultaneous failure of up to two disks, or failure of one disk and subsequent data read errors. In a degraded state (when one of the disks has failed) RAID6 operates similarly to RAID5.
When the failed disk is replaced, RAID6 reads all remaining blocks, performs calculations using parity information and writes data to the replacement drive. Rebuild process can take significant time and puts additional load on the RAID set.
RAID10, also referred to as RAID1+0, is a combination of both RAID0 and RAID1 techniques. It works by striping data blocks (as in RAID0) across two or more mirrored (RAID1) disk groups. This approach is often called nested RAID. RAID10 combines the capability of creating large data volumes typical for RAID0 with the level of fault tolerance inherent to RAID1.
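A hedged sketch of the nested layout: a logical block is first assigned to a mirrored group (the RAID0 part) and then written to both members of that group (the RAID1 part). The pairwise disk numbering below is an illustrative convention only:

```python
def raid10_targets(logical_block: int, num_pairs: int) -> tuple[int, int, int]:
    """Return (offset, disk_a, disk_b) for a RAID10 built from mirrored pairs."""
    pair = logical_block % num_pairs         # RAID0: pick a mirrored pair
    offset = logical_block // num_pairs      # position within that pair
    disk_a, disk_b = 2 * pair, 2 * pair + 1  # RAID1: both members get a copy
    return offset, disk_a, disk_b

# Example: 6 disks organized as 3 mirrored pairs
for block in range(6):
    offset, a, b = raid10_targets(block, num_pairs=3)
    print(f"block {block} -> disks {a} and {b}, offset {offset}")
```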
Similarly to RAID1, usable storage of RAID10 is 50% of the total capacity of all disks in the array.
RAID10 read and write performance characteristics are also similar to those of RAID1. RAID10 write penalty is defined as 2, as it must perform two write operations for each incoming write request. For more details see RAID10 Performance Calculator.
RAID10 can withstand a simultaneous failure of more than one disk, as long as the failed disks are not members of the same underlying mirrored set. RAID10 functions in the same way as RAID1 in a degraded state (when one of the disks has failed) and during the rebuild process.
RAID50, also referred to as RAID5+0, is a nested RAID combining RAID0 and RAID5 techniques. RAID50 works by striping data (as in RAID0) across two or more RAID5 groups. RAID50 combines the capability of creating large data volumes typical for RAID0 with the level of fault tolerance inherent to RAID5.
Each RAID5 group in a RAID50 set uses a portion of its total disk capacity to store parity blocks. For RAID5 this portion is equivalent to the capacity of one disk. Therefore, a RAID50 set with M RAID5 groups of N disks each will have a usable capacity of M*(N-1)*S bytes, where S is the capacity of one disk.
RAID50 read and write performance characteristics are also similar to those of RAID5. RAID50 write penalty is 4, as it must perform four IO operations for each incoming write request.
RAID50 can withstand a simultaneous failure of more than one disk, as long as the failed disks are not members of the same underlying RAID5 group. RAID50 operates similarly to RAID5 in a degraded state (when one of the disks has failed) and during the rebuild process.
RAID60, also referred to as RAID6+0, is a nested RAID combining both RAID0 and RAID6 techniques. RAID60 works by striping data (as in RAID0) across two or more RAID6 groups. RAID60 combines the capability of creating large data volumes typical for RAID0 with the level of fault tolerance inherent to RAID6.
Each RAID6 group in a RAID60 set uses a portion of its total disk capacity to store parity blocks. For RAID6 this portion is equivalent to the capacity of two disks. Therefore, a RAID60 set with M RAID6 groups of N disks each will have a usable capacity of M*(N-2)*S bytes, where S is the capacity of one disk.
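Both nested capacity formulas follow the same pattern, so a single hedged helper covers RAID50 and RAID60 alike. The group counts and disk sizes below are arbitrary example values:

```python
def nested_usable_tb(groups: int, disks_per_group: int,
                     parity_per_group: int, disk_size_tb: float) -> float:
    """Usable capacity of RAID50/RAID60: M * (N - parity) * S."""
    return groups * (disks_per_group - parity_per_group) * disk_size_tb

# Example: two groups of 8 x 4 TB disks each
print("RAID50:", nested_usable_tb(2, 8, 1, 4), "TB")  # 56 TB
print("RAID60:", nested_usable_tb(2, 8, 2, 4), "TB")  # 48 TB
```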
RAID60 read and write performance characteristics are also similar to those of RAID6. The RAID60 write penalty is defined as 6, as it must perform six internal IO operations for each incoming write request.
RAID60 can withstand a simultaneous failure of up to two disks in each underlying RAID6 group. RAID60 operates similarly to RAID6 in a degraded state (when one of the disks has failed) and during the rebuild process.
RAID-DP is a NetApp implementation of dual parity RAID. RAID-DP is based on RAID4 with the addition of a second, "diagonal" parity stored on a dedicated disk.
Performance shortcomings inherent to RAID arrays with dedicated parity drives are mitigated by NetApp’s proprietary Write Anywhere File Layout (WAFL) file system. WAFL never overwrites existing data blocks and stores new data using the available free storage space. In addition, WAFL optimizes disk write operations by first caching and aggregating incoming write requests in non-volatile random-access memory (NVRAM) and then writing data to disks in full stripes.
RAID-DP data protection and fault tolerance capabilities are similar to those of RAID6.
A RAID-DP group with a total of N disks with size of S bytes each will have usable capacity of approximately (N-2)*S bytes. Note that WAFL reserves some disk space for snapshots and for its internal operations, slightly decreasing the total usable space as a result. Typically, NetApp systems are capable of supporting relatively large RAID-DP groups - up to 28 disks per group for SAS/SSD, and up to 14 disks per group for large capacity NL-SAS disks. For more details see NetApp Usable Space Calculator.
RAID-TEC (Triple Erasure Coding) is a triple parity RAID configuration by NetApp. It is implemented by adding a third, "anti-diagonal" parity to RAID-DP. This additional parity is also stored on its own dedicated disk. Overall, RAID-TEC functionality is similar to RAID-DP.
A RAID-TEC group with a total of N disks with the size of S bytes each will have usable capacity of approximately (N-3)*S bytes. Note that some disk space is reserved for snapshots and for the needs of NetApp’s proprietary Write Anywhere File Layout (WAFL) file system. Typically, NetApp systems are capable of supporting relatively large RAID-TEC groups - up to 29 disks per group. For more details see NetApp Usable Space Calculator.
RAID-TEC can survive up to 3 simultaneous disk failures without data loss.
RAIDZ is a software-based RAID integrated with the ZFS file system. There are three types of RAIDZ: RAIDZ1, RAIDZ2, and RAIDZ3, with single parity, dual parity, and triple parity, respectively.
In ZFS, a RAID group comprised of one or more disks is called a virtual device (vdev). Each vdev can be a single disk, multiple disks in a mirrored configuration, or multiple disks in a RAIDZ (RAIDZ1, RAIDZ2, or RAIDZ3) configuration. One or multiple vdevs form a ZFS pool (zpool), which is used as a storage container for file systems and volumes. Data blocks are striped across all vdevs included in a zpool in a way similar to RAID0.
Similarly to other parity-based RAIDs, RAIDZ protects data by generating and storing parity information alongside data blocks. RAIDZ uses variable-width stripes: it independently stripes the data blocks from each write request across the disks in the vdev. Therefore, RAIDZ must access multiple disks in order to complete a single read/write request. As a result, the random IO performance of a RAIDZ vdev is roughly equivalent to the performance of a single drive.
A RAIDZ group with N disks of size S can store approximately (N-P)*S bytes, where P is the number of parity blocks per stripe. P=1 for RAIDZ1, P=2 for RAIDZ2, and P=3 for RAIDZ3. Note that ZFS reserves some disk space for its internal operations. For more details see ZFS / RAIDZ Capacity Calculator.
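A hedged Python version of that approximation (it deliberately ignores the ZFS internal reservations, metadata overhead, and stripe padding that the ZFS / RAIDZ Capacity Calculator accounts for; the function name and example values are assumptions):

```python
def raidz_usable_tb(num_disks: int, disk_size_tb: float, parity_level: int) -> float:
    """Rough usable capacity of a single RAIDZ vdev: (N - P) * S,
    where P is 1, 2, or 3 for RAIDZ1, RAIDZ2, RAIDZ3."""
    return (num_disks - parity_level) * disk_size_tb

# Example: 6 x 8 TB disks in a RAIDZ2 vdev
print(raidz_usable_tb(6, 8, parity_level=2), "TB")  # ~32 TB before ZFS overhead
```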
In addition, ZFS offers data compression and deduplication, data replication, self-healing mechanisms, and snapshots.