Which disks go into a RAID. RAID disk arrays: what they are and why they are needed. Building a disk array: from theory to practice

RAID arrays were developed to improve data storage reliability, increase processing speed, and make it possible to combine multiple disks into one large one. Different RAID types solve different tasks; here we will look at several of the most common RAID configurations built from disks of the same size.

RAID 0

RAID 0 (Stripe). The mode used to achieve maximum performance. The data is evenly distributed across the disks of the array, and the disks are combined into one logical disk, which can be divided into several partitions. Distributed read and write operations can significantly increase operating speed, since several disks read or write their portion of the data simultaneously. The entire capacity is available to the user, but this reduces the reliability of data storage: if one of the disks fails, the array is usually destroyed and it is almost impossible to restore the data. Scope of application: applications that require high-speed exchange with the disk, for example video capture and video editing. Recommended for use with highly reliable drives.
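
As a rough illustration of how striping spreads data, the sketch below maps a logical block number to a disk and an offset on that disk. The function name `stripe_location` and the plain round-robin layout are assumptions made for illustration; real controllers also apply a configurable stripe size.

```python
def stripe_location(block_index: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, block offset on that disk)."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    # Blocks are dealt out to the disks in turn, like cards to players.
    return block_index % num_disks, block_index // num_disks


# Layout of the first six logical blocks on a hypothetical 3-disk stripe.
layout = [stripe_location(block, 3) for block in range(6)]
print(layout)
```

Consecutive blocks land on different disks, which is why a large sequential read or write can keep all disks busy at once, and also why losing any one disk leaves holes in every large file.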

RAID 1

RAID 1 (Mirror). Several disks (usually 2) that work synchronously on writes, that is, completely duplicate each other. Performance improves only when reading. This is the most reliable way to protect information from the failure of one of the disks. Due to its high cost, it is usually used for storing very important data. The high cost is due to the fact that, with two disks, only half of the total capacity is available to the user.

RAID 10

RAID 10, also sometimes called RAID 1+0, is a combination of the first two options (a RAID0 array built from RAID1 arrays). It has all the speed advantages of RAID0 and the reliability advantage of RAID1, while retaining the disadvantage of high cost, since the effective capacity of the array is half the capacity of the disks used in it. To create such an array, a minimum of 4 disks is required, and their number must be even.

RAID 0+1 is a RAID1 array built from RAID0 arrays. In practice it is hardly used, because it has no advantages over RAID10 and lower fault tolerance.
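
The difference in fault tolerance can be checked by brute force. The sketch below is a simplified model, not a description of any particular controller: it takes four disks grouped as (0, 1) and (2, 3), treats the groups as mirrors for RAID 10 and as stripes for RAID 0+1, and counts how many two-disk failures each layout survives. The function names are hypothetical.

```python
from itertools import combinations


def raid10_survives(failed: set[int], mirrors: list[tuple[int, ...]]) -> bool:
    """Stripe over mirrors: every mirror must keep at least one working disk."""
    return all(any(d not in failed for d in mirror) for mirror in mirrors)


def raid01_survives(failed: set[int], stripes: list[tuple[int, ...]]) -> bool:
    """Mirror of stripes: at least one stripe must be completely intact."""
    return any(all(d not in failed for d in stripe) for stripe in stripes)


groups = [(0, 1), (2, 3)]
double_failures = [set(pair) for pair in combinations(range(4), 2)]

raid10_ok = sum(raid10_survives(f, groups) for f in double_failures)
raid01_ok = sum(raid01_survives(f, groups) for f in double_failures)
print(f"RAID 10 survives {raid10_ok} of {len(double_failures)} double failures")
print(f"RAID 0+1 survives {raid01_ok} of {len(double_failures)} double failures")
```

Under this model RAID 10 survives four of the six possible double failures, while RAID 0+1 survives only two, because a single failed disk takes its whole stripe out of service.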

RAID 1E

RAID 1E is a variant of data distribution across disks, similar to RAID10, that allows the use of an odd number of disks (the minimum number is 3).

RAID 2, 3, 4 are various options for distributed data storage with disks dedicated to parity codes and with different block sizes. Currently they are practically not used, due to low performance and the need to set aside a lot of disk capacity for storing ECC and/or parity codes.

RAID 5

RAID 5 is an array that also uses distributed data storage similar to RAID 0 (combining the disks into one large logical disk), plus distributed storage of parity codes for data recovery in case of failures. Compared to the previous configurations, the stripe block size has been increased even more. Both simultaneous reading and writing are possible. The advantage of this option is that the array capacity available to the user is reduced by the capacity of only one disk, although the reliability of data storage is lower than that of RAID 1. In essence, it is a compromise between RAID0 and RAID1, providing sufficiently high speed with good data storage reliability. If one disk in the array fails, data can be restored without loss in automatic mode. The minimum number of disks for such an array is 3.
"Software" implementations of RAID5 built into motherboard south bridges do not have a high write speed, so they are not suitable for all applications.

RAID 5EE

RAID 5EE is an array similar to RAID5; however, in addition to distributed storage of parity codes, it distributes the spare areas as well. In effect, it puts to work the disk that could otherwise be added to a RAID5 array as a spare (such arrays are called 5+ or 5+spare). In a RAID 5 array the spare disk remains idle until one of the main ones fails, while in a RAID 5EE array this disk is used together with the rest of the HDDs all the time, which has a positive effect on the performance of the array. For example, a RAID5EE array of 5 HDDs will be able to perform 25% more I/O operations per second than a RAID5 array of 4 primary HDDs and one spare. The minimum number of disks for such an array is 4.

RAID 6

RAID 6 is an analogue of RAID5 with a higher level of redundancy: information is not lost if any two disks fail; accordingly, the total capacity of the array is reduced by the capacity of two disks. The minimum number of disks required to create an array of this level is 4. The operating speed is, in the general case, roughly similar to RAID5. Recommended for applications where the highest possible reliability is important.

RAID 50

RAID 50 combines two (or more, but this is extremely rare) RAID5 arrays into a stripe, i.e. it is a combination of RAID5 and RAID0 that partially corrects the main disadvantage of RAID5, its low write speed, through the parallel use of several such arrays. The total capacity of the array is reduced by the capacity of two disks but, unlike RAID6, such an array is guaranteed to survive the failure of only one disk without data loss. The minimum number of disks required to create a RAID50 array is 6. Along with RAID10, this is the most recommended RAID level for use in applications where high performance combined with acceptable reliability is required.

RAID 60

RAID 60 combines two RAID6 arrays into a stripe. The write speed is approximately doubled compared to the write speed of RAID6. The minimum number of disks to create such an array is 8. Information is not lost if two disks from each RAID 6 array fail.

Matrix RAID is a technology implemented by Intel in its south bridges, starting with ICH6R, which allows you to organize several RAID0 and RAID1 arrays on just two disks, creating partitions with increased speed and partitions with increased reliability of data storage at the same time.

JBOD (from the English "Just a Bunch Of Disks") is a sequential combination of several physical disks into one logical disk. It does not affect performance (reliability drops in the same way as with RAID0), and the disks can have different sizes. Currently it is practically not used.

There are a lot of articles on the Internet describing RAID. For example, this one describes everything in great detail. But as usual, there is not enough time to read everything, so you need something short to understand whether RAID is necessary or not, and what is better to use when working with a DBMS (InterBase, Firebird or something else, it really doesn’t matter). What you have before you is exactly such material.

To a first approximation, RAID is a combination of disks into one array. SATA, SAS, SCSI, SSD: it doesn't matter. Moreover, almost every decent motherboard now supports SATA RAID. Let's go through the list of RAID levels and what they are for. (I would like to note right away that a RAID should be built from identical disks. Combining disks from different manufacturers, of different types from the same manufacturer, or of different sizes is an indulgence for someone sitting at a home computer.)

RAID 0 (Stripe)

Roughly speaking, this is a sequential combination of two (or more) physical disks into one "physical" disk. It is only suitable for organizing huge disk spaces, for example, for those who work with video editing. There is no point in keeping databases on such disks: even if your database is 50 gigabytes in size, why did you buy two disks of 40 gigabytes each rather than one 80-gigabyte disk? The worst thing is that in RAID 0 the failure of any one disk makes the whole RAID completely inoperable, because data is written alternately to both disks, and RAID 0 has no means of recovery in case of failures.

Of course, RAID 0 provides faster performance due to read/write striping.

RAID 0 is often used to host temporary files.

RAID 1 (Mirror)

Disk mirroring. If Shadow in IB/FB is software mirroring (see Operations Guide.pdf), then RAID 1 is hardware mirroring, and nothing more. God forbid you use software mirroring by means of the OS or third-party software. You need either a hardware RAID 1 or shadow.

If a failure occurs, carefully check which disk has failed. The most common cause of data loss on RAID 1 is incorrect actions during recovery (the wrong disk is specified as the “healthy” one).

As for performance: the gain for writing is 0, and for reading perhaps up to 1.5 times, since reading can be done “in parallel” (alternately from different disks). For databases the acceleration is small, while when different (!) parts (files) of the disk are accessed in parallel, the acceleration will be quite definite.

RAID 1+0

RAID 1+0 means the RAID 10 option, in which two RAID 1 arrays are combined into a RAID 0. The option in which two RAID 0 arrays are combined into a RAID 1 is called RAID 0+1, and from the outside it looks the same as RAID 10.

RAID 2-3-4

These RAID levels are rare because they use Hamming codes, or byte-level striping plus checksums, and so on; the general summary is that they only provide reliability, with zero performance gain and sometimes even a loss of performance.

RAID 5

It requires a minimum of 3 disks. Parity data is distributed across all disks in the array.
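
To make the idea of parity concrete, here is a minimal sketch of recovering a lost block with XOR parity, the operation commonly used for RAID 5. The helper name `xor_blocks` and the tiny four-byte "blocks" are assumptions for illustration; the sketch ignores how parity is rotated across disks.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks of one stripe, each living on a different disk.
data = [b"disk", b"raid", b"five"]
parity = xor_blocks(data)  # stored on a fourth disk

# Suppose the disk holding data[1] dies: XOR the survivors with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)
```

Because XOR is its own inverse, any single missing block of a stripe can be recomputed from the remaining blocks and the parity, which is why the array tolerates exactly one failed disk.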

It is commonly said that "RAID5 uses independent disk access so that requests to different disks can be executed in parallel." It should be kept in mind that this refers, of course, to parallel I/O requests. If such requests arrive sequentially (as in SuperServer), then of course you will not get the effect of parallel access on RAID 5. RAID5 will, however, give a performance boost if the operating system and other applications also work with the array (for example, if it holds virtual memory, TEMP, etc.).

In general, RAID 5 used to be the most commonly used disk array for working with DBMSs. Now such an array can be built on SATA drives, and it will be significantly cheaper than on SCSI. You can see prices and controllers in the articles listed below.
Moreover, you should pay attention to the capacity of the purchased disks: for example, in one of the mentioned articles, RAID5 is assembled from 4 disks with a capacity of 34 gigabytes each, while the capacity of the resulting “disk” is 103 gigabytes.

Testing five SATA RAID controllers - http://www.thg.ru/storage/20051102/index.html.

Adaptec SATA RAID 21610SA in RAID 5 arrays - http://www.ixbt.com/storage/adaptec21610raid5.shtml.

Why RAID 5 is bad - https://geektimes.ru/post/78311/

Attention! When purchasing disks for RAID5, people usually take the minimum of 3 disks (most likely because of the price). If one of the disks fails over time, it may turn out to be impossible to buy a disk similar to the ones in use (no longer produced, temporarily out of stock, etc.). Therefore a more interesting idea seems to be purchasing 4 disks, organizing a RAID5 of three, and connecting the 4th disk as a backup (for backups, other files and other needs).

The volume of a RAID5 disk array is calculated using the formula (n-1)*hddsize, where n is the number of disks in the array, and hddsize is the size of one disk. For example, for an array of 4 disks of 80 gigabytes, the total volume will be 240 gigabytes.
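
The formula is easy to turn into a small calculator. The sketch below is illustrative only: the function name `usable_capacity_gb` and the level labels are made up here, and the rules for the other levels are the ones stated elsewhere in this text (RAID 0 keeps the full capacity, RAID 1 gives the capacity of one disk, RAID 6 loses two disks, RAID 10 gives half).

```python
def usable_capacity_gb(level: str, disks: int, disk_size_gb: float) -> float:
    """Usable capacity of an array of identical disks (illustrative sketch)."""
    if level == "raid0":
        return disks * disk_size_gb
    if level == "raid1":
        return disk_size_gb
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_size_gb  # (n-1)*hddsize
    if level == "raid6":
        if disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return (disks - 2) * disk_size_gb
    if level == "raid10":
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, 4 or more")
        return disks * disk_size_gb / 2
    raise ValueError(f"unknown level: {level}")


# The example from the text: 4 disks of 80 GB in RAID 5.
print(usable_capacity_gb("raid5", 4, 80))
```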

There is a question about the “unsuitability” of RAID5 for databases. At a minimum, it can be viewed from the point of view that to get good RAID5 performance, you need to use a specialized controller, and not what is included by default on the motherboard.

See the article "RAID-5 must die", and more about data loss on RAID5.

Note. As of 09/05/2005, an 80 GB Hitachi SATA disk costs 60 dollars.

RAID 10, 50

Next come combinations of the listed options. For example, RAID 10 is RAID 0 + RAID 1. RAID 50 is RAID 5 + RAID 0.

Interestingly, the RAID 0+1 combination turns out to be worse in terms of reliability than RAID5. The database repair service has on record a case of one disk failing in a RAID0 (3 disks) + RAID1 (3 more of the same disks) system. The RAID1 could not bring up the backup disk, and the database turned out to be damaged without any chance of repair.

RAID 0+1 requires 4 drives, and RAID 5 requires 3. Think about it.

RAID 6

Unlike RAID 5, which uses parity to protect data against single faults, RAID 6 uses the same kind of parity to protect against double faults. Accordingly, a more powerful processor is needed than for RAID 5, and in the implementation described here at least 5 disks are required rather than 3 (three data disks and 2 parity disks). Moreover, in such implementations the number of disks is not as flexible as in RAID 5 and must be a prime number (5, 7, 11, 13, etc.). This restriction comes from specific parity schemes and does not apply to RAID 6 in general, which can be built from as few as 4 disks.

This covers the case where two disks fail at the same time, although such a case is very rare.

I haven’t seen any data on RAID 6 performance (I haven’t looked), but it may well be that due to redundant control, performance could be at the level of RAID 5.

Rebuild time

Any RAID array that remains operational if one drive fails has a concept called rebuild time. Of course, when you replace a dead disk with a new one, the controller must organize the functioning of the new disk in the array, and this will take some time.

When “connecting” a new disk, for example, for RAID 5, the controller can allow operation of the array. But the speed of the array in this case will be very low, at least because even if the new disk is “linearly” filled with information, writing to it will “distract” the controller and disk heads from synchronizing operations with the rest of the disks of the array.

The time needed to restore the array to normal operation depends directly on the size of the disks. For example, a Sun StorEdge 3510 FC Array with an array size of 2 terabytes does a rebuild in exclusive mode within 4.5 hours (at a hardware price of about $40,000). Therefore, when organizing an array and planning disaster recovery, you need to think first of all about rebuild time. If your database and backups occupy no more than 50 gigabytes, and the growth per year is 1-2 gigabytes, then it hardly makes sense to assemble an array of 500-gigabyte disks. 250 GB disks will be enough, and even for RAID 5 this gives at least 500 GB of space, enough to hold not only the database but also movies. But the rebuild time for 250 GB disks will be approximately half that for 500 GB disks.
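
A back-of-the-envelope estimate shows why disk size matters so much. The sketch below assumes the rebuild proceeds at a constant sustained rate, which is a simplification, and the figure of 50 MB/s is purely hypothetical; real rebuild rates depend on the controller, the disks and the load on the array.

```python
def rebuild_hours(disk_size_gb: float, rebuild_rate_mb_s: float) -> float:
    """Hours needed to rewrite one whole disk at a constant rate (rough model)."""
    if rebuild_rate_mb_s <= 0:
        raise ValueError("rebuild rate must be positive")
    seconds = disk_size_gb * 1000 / rebuild_rate_mb_s  # 1 GB taken as 1000 MB
    return seconds / 3600


ASSUMED_RATE_MB_S = 50  # hypothetical sustained rebuild rate

small = rebuild_hours(250, ASSUMED_RATE_MB_S)
large = rebuild_hours(500, ASSUMED_RATE_MB_S)
print(f"250 GB: {small:.1f} h, 500 GB: {large:.1f} h, ratio: {large / small:.1f}")
```

Whatever rate is assumed, the estimate scales linearly with disk size, so halving the disk size halves the window during which the array runs degraded.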

Summary

It turns out that the most sensible choice is to use either RAID 1 or RAID 5. However, the most common mistake, which almost everyone makes, is to use RAID “for everything”. That is, they set up a RAID, pile everything they have onto it, and... in the best case they get reliability, but no improvement in performance.

The write cache is also often left disabled, as a result of which writing to a RAID is slower than writing to a regular single disk. The fact is that for most controllers this option is disabled by default, because it is considered desirable, before enabling it, to have at least a battery on the RAID controller, as well as a UPS.

The old hddspeed.htm article (and doc_calford_1.htm) shows how you can get significant performance gains by using multiple physical disks, even with IDE. Accordingly, if you set up a RAID, put the database on it, and keep the rest (temp, OS, virtual disk) on other hard drives. After all, a RAID is still a single “disk”, even if it is more reliable and faster.
This advice is now declared obsolete. All of the above has a right to exist on RAID 5. However, before such placement, you need to find out how you can back up and restore the operating system and how long it will take, how long it will take to restore a “dead” disk, whether a disk is (or will be) at hand to replace the “dead” one, and so on; i.e. you will need to know in advance the answers to the most basic questions in case of a system failure.

I still recommend keeping the operating system on a separate SATA drive or, if you prefer, on two SATA drives connected in RAID 1. In any case, when placing the operating system on a RAID, you must plan what you will do if the motherboard suddenly stops working: sometimes moving the RAID array's drives to another motherboard (chipset, RAID controller) is not possible because the default RAID parameters are incompatible.

Placement of the database, shadow and backup

Despite all the advantages of RAID, it is strictly not recommended, for example, to make a backup on the same logical drive. Not only is this bad for performance, but it can also lead to a shortage of free space (on large databases): depending on the data, the backup file may be as large as the database, or even larger. Making a backup to the same physical disk is still acceptable, although the best option is to back up to a separate hard drive.

The explanation is very simple. Backup is reading data from a database file and writing to a backup file. If all of this is physically happening on one drive (even RAID 0 or RAID 1), then performance will be worse than if reading from one drive and writing to another. The benefit from this separation is even greater when backup is done while users are working with the database.

The same applies to shadow: there is no point in putting shadow, for example, on RAID 1 in the same place as the database, even on different logical drives. If shadow is present, the server writes data pages to both the database file and the shadow file. That is, instead of one write operation, two are performed. When the database and shadow are placed on different physical disks, write performance will be determined by the slowest drive.

The problem of increasing the reliability of information storage is always on the agenda. This is especially true for large amounts of data and for databases on which the operation of complex systems in a wide range of industries depends. It is especially important for high-performance servers.

As you know, the performance of modern processors is constantly growing, and modern hard drives clearly cannot keep up with it in their development. A single disk, be it SCSI or, even worse, IDE, can no longer handle the tasks relevant to our time. You need many disks that will complement each other, take over if one of them fails, store backup copies, and work efficiently and productively.

However, simply having several hard drives is not enough; they need to be integrated into a system that will work smoothly and will not allow data loss in the event of any disk-related failure.

You need to take care of creating such a system in advance because, as the well-known proverb goes, nobody lifts a finger until the roasted rooster pecks. You may lose your data irrevocably.

RAID can serve as such a system: it is a virtual storage technology that combines several disks into one logical element. The name stands for redundant array of independent disks. It is typically used to improve performance and reliability.

What is needed to create a raid? At least two hard drives. Depending on the array level, the number of storage devices used varies.

What types of raid arrays are there?

There are basic and combined RAID arrays. The University of California, Berkeley proposed dividing RAID into the following specification levels:

Basic:
- RAID 1 ;
- RAID 2 ;
- RAID 3 ;
- RAID 4 ;
- RAID 5 ;
- RAID 6 .
Combined:
- RAID 10 ;
- RAID 01 ;
- RAID 50 ;
- RAID 05 ;
- RAID 60 ;
- RAID 06 .

Let's look at the most commonly used ones.

Raid 0

RAID 0 is intended to increase read and write speed. It does not increase storage reliability and is therefore not redundant. Its other name is stripe (striping means “alternation”). Usually 2 to 4 disks are used.

The data is divided into blocks, which are written to the disks in turn. Read and write speed increases roughly in proportion to the number of disks. Among the shortcomings is the increased likelihood of data loss with such a system. It makes no sense to store databases on such disks, because any serious failure will make the RAID completely inoperable, since there are no recovery tools.

Raid 1

RAID 1 provides mirrored data storage at the hardware level. It is also called a Mirror array. That is, the disk data in this case is duplicated. It can be used with 2 to 4 storage devices.

Read and write speed hardly changes, which can be counted among the benefits. The array works as long as at least one disk of the RAID is operational, but the capacity of the system is equal to the capacity of one disk. In practice, when one of the hard drives fails, you will need to take steps to replace it as quickly as possible.

Raid 2

RAID 2 uses the so-called Hamming code. Data is split across hard drives as in RAID 0, and the remaining drives store error correction codes, from which the information can be regenerated in case of failure. This method makes it possible to find and then correct system failures on the fly.

Read/write speed in this case is higher than with a single disk. The downside is the large number of disks at which its use becomes rational, so that data redundancy is not excessive; usually this is 7 or more.

RAID 3: in this array, the data is split across all disks except one, which stores the parity bytes. It is resistant to system failures. If one of the disks fails, its information can easily be “raised” again using the parity checksums.

Compared to RAID 2, there is no possibility of error correction on the fly. This array is distinguished by high performance and the ability to use 3 or more disks.

The main drawback of such a system is the increased load on the disk that stores the parity bytes and, as a result, the low reliability of this disk.

Raid 4

In general, RAID 4 is similar to RAID 3, with the difference that parity data is stored in blocks rather than bytes, which increases the speed of small data transfers.

The drawback of this array is its write speed, because parity for every write is generated on one single disk, just as in RAID 3.

This seems to be a good solution for those servers where files are read more often than written.

Raid 5

RAID 2 to 4 have disadvantages due to the inability to parallelize write operations. RAID 5 eliminates this drawback. Parity blocks are written across all the disk devices of the array at once; there is no asymmetry in data distribution, which means parity is distributed.

The number of hard drives used is 3 or more. The array is very common due to its versatility and efficiency: the more disks are used, the more economically disk space is spent. Speed is high thanks to data parallelization, but performance is lower than that of RAID 10 because of the large number of operations. If one drive fails, reliability drops to that of RAID 0. It takes a long time to recover.

Raid 6

RAID 6 technology is similar to RAID 5, but reliability is higher thanks to an increased number of parity disks.

However, a more powerful processor is needed to handle the increased number of operations, and in some implementations at least 5 disks are required, with the number of disks equal to a prime number (5, 7, 11 and so on).

Raid 10, 50, 60

Next come combinations of the previously mentioned RAID levels. For example, RAID 10 is RAID 0 + RAID 1.

They inherit the properties of their component arrays in terms of reliability, performance, number of disks and, at the same time, cost-efficiency.

Creating a raid array on a home PC

The advantages of creating a RAID array at home are not obvious: it is uneconomical, data loss is not as critical as on servers, and information can be kept safe by making periodic backups.

For these purposes you will need a RAID controller, which has its own BIOS and its own settings. In modern motherboards, the RAID controller can be integrated into the south bridge of the chipset. But even on such boards, you can connect another controller through a PCI or PCI-E slot. Examples include devices from Silicon Image and JMicron.

Each controller can have its own configuration utility.

Let's look at creating a raid using the Intel Matrix Storage Manager Option ROM.

Transfer all data from your disks, otherwise during the creation of the array they will be cleared.

Go to the BIOS Setup of your motherboard and turn on RAID mode for your SATA hard drives.

To launch the utility, restart your PC and press Ctrl+I during the POST procedure. In the program window you will see a list of available disks. Choose the option to create an array, then select the required array level.

Then, following the intuitive interface, enter the array size and confirm its creation.

A RAID array (Redundant Array of Independent Disks) is a combination of several devices that increases performance and/or the reliability of data storage.

According to Moore's law, performance increases every year (more precisely, the number of transistors on a chip doubles every 2 years). This can be seen in almost every area of computer hardware. Processors increase the number of cores and transistors while shrinking the manufacturing process, RAM increases in frequency and throughput, and solid-state drives improve in wear resistance and read speed.

But ordinary hard drives (HDDs) have not advanced much over the past 10 years. The standard speed was 7200 rpm, and it remains so (not counting server HDDs that spin at 10,000 rpm or more). Slow 5400 rpm drives are still found in laptops. For most users who want to increase the performance of their computer, it is more convenient to buy an SSD, but the price per gigabyte of such media is much higher than that of an ordinary HDD. How do you increase the performance of drives without losing a lot of money and capacity? How do you protect your data or make it more secure? There is an answer to these questions: a RAID array.

Types of RAID arrays

At the moment, the following types of RAID arrays exist:

RAID 0 or "Striping"– an array of two or more disks to improve overall performance. The raid volume will be total (HDD 1 + HDD 2 = Total volume), the read/write speed will be higher (due to splitting the recording into 2 devices), but the reliability of information security will suffer. If one of the devices fails, all information in the array will be lost.

RAID 1 or "Mirror"– several disks copying each other to increase reliability. The write speed remains at the same level, the read speed increases, reliability increases many times over (even if one device fails, the second will work), but the cost of 1 Gigabyte of information increases by 2 times (if you make an array of two hdds).

RAID 2 is an array built from disks for storing information and disks for error correction. The number of HDDs for storing information is calculated using the formula “2^n-n-1”, where n is the number of error-correction HDDs. This type is used when there is a large number of HDDs; the minimum acceptable number is 7, of which 4 store information and 3 store error-correction codes. The advantage of this type is increased performance compared to a single disk.
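
The formula can be tabulated to see how the share of error-correction disks shrinks as the array grows. This is a small sketch; the function name `raid2_layout` is made up for illustration.

```python
def raid2_layout(ecc_disks: int) -> tuple[int, int]:
    """Return (data disks, total disks) for n error-correction disks."""
    if ecc_disks < 2:
        raise ValueError("need at least 2 error-correction disks")
    data_disks = 2 ** ecc_disks - ecc_disks - 1  # the 2^n-n-1 formula
    return data_disks, data_disks + ecc_disks


for n in (3, 4, 5):
    data, total = raid2_layout(n)
    print(f"n={n}: {data} data disks of {total}, overhead {n / total:.0%}")
```

With 3 error-correction disks the array has 4 data disks out of 7; with 4 it has 11 out of 15, so the relative cost of redundancy falls as disks are added.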

RAID 3 consists of n disks, where one disk stores parity blocks and the remaining n-1 are devices for storing information. Information is divided into pieces smaller than the sector size (into bytes); this is well suited for working with large files, but the reading speed for small files is very low. It is characterized by high performance, but low reliability and narrow specialization.

RAID 4 is similar to type 3, but is divided into blocks rather than bytes. This solution was able to correct the low reading speed of small files, but the writing speed remained low.

RAID 5 and 6: instead of a separate disk for error correction, as in the previous levels, parity blocks are distributed evenly across all devices. In this case, the speed of reading/writing information increases due to parallelization. The drawback of this type is the long recovery time in the event of failure of one of the disks. During recovery the other devices are under very high load, which reduces reliability and increases the risk that another device fails and all array data is lost. Type 6 improves overall reliability but reduces performance.

Combined types of RAID arrays:

RAID 01 (0+1) – Two Raid 0s are combined into Raid 1.

RAID 10 (1+0) – RAID 1 disk arrays, which are used in type 0 architecture. It is considered the most reliable data storage option, combining high reliability and performance.

You can also create an array from SSD drives. According to 3DNews testing, such a combination does not provide a significant gain. It is better to purchase a drive with a faster interface, such as PCI or eSATA.

Raid array: how to create

An array is created by connecting disks through a special RAID controller. At the moment there are 3 types of controllers:

Software – the array is emulated in software, and all calculations are performed by the CPU.
Integrated – mainly found on motherboards (outside the server segment). A small chip on the motherboard is responsible for emulating the array, and calculations are performed by the CPU.
Hardware – an expansion card (for desktop computers), usually with a PCI interface, that has its own memory and its own processor.

RAID hdd array: How to make it from 2 disks via IRST

Data Recovery

Some data recovery options:

If a RAID 0 or 5 fails, the RAID Reconstructor utility can help: it collects the available information from the drives and rewrites it to another device or medium as an image of the previous array. This option will help if the disks are working properly and the error is in software.
On Linux systems, recovery is done with mdadm (a utility for managing software RAID arrays).
Recovery of hardware arrays must be performed by specialized services, because without knowledge of how the controller works you can lose all the data, and getting it back will be very difficult or even impossible.

There are many nuances that need to be taken into account when creating a RAID on your computer. Most options are used mainly in the server segment, where data stability and security are important and necessary. If you have questions or additions, you can leave them in the comments.

Have a great day!

Hard drives play an important role in a computer. They store various user information, the OS is launched from them, and so on. Hard drives do not last forever and have a limited margin of safety. Each hard drive also has its own distinctive characteristics.

Most likely, you have heard at some point that so-called raid arrays can be made from ordinary hard drives. This is necessary in order to improve the performance of drives, as well as ensure the reliability of information storage. In addition, such arrays can have their own numbers (0, 1, 2, 3, 4, etc.). In this article we will tell you about RAID arrays.

RAID is a collection of hard drives, or a disk array. As we have already said, such an array ensures reliable data storage and also increases the speed of reading or writing information. There are various RAID array configurations, which are marked with numbers 1, 2, 3, 4, etc. and differ in the functions they perform. By using an array with configuration 0 you will get significant performance improvements. A RAID 1 array protects your data against the failure of one drive, since the information will still be present on the second hard drive.

Essentially, a RAID array is 2 or more hard drives connected to a motherboard that supports creating RAIDs. You can select the RAID configuration in software, that is, specify how these disks should work together. To do this, you will need to specify the settings in the BIOS.

To set up the array, we need a motherboard that supports RAID technology and 2 hard drives that are identical in all respects, which we connect to the motherboard. In the BIOS you need to set the parameter SATA Configuration: RAID. When the computer boots, press the key combination Ctrl+I and configure the RAID there. After that, install Windows as usual.

It is worth paying attention to the fact that if you create or delete a raid, then all information that is on the drives is deleted. Therefore, you must first make a copy of it.

Let's look at the RAID configurations we've already mentioned. There are several of them: RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5, RAID 6, etc.

RAID-0 (striping), also known as a zero-level array or “null array”. This level noticeably increases the speed of working with disks (in the best case, roughly in proportion to the number of disks) but provides no additional fault tolerance. Strictly speaking, this configuration is a RAID array only formally, because it has no redundancy. Data is written in blocks that are placed alternately on the different disks of the array. The main disadvantage is unreliable data storage: if one of the disks fails, all information is lost. This happens because each file can be spread in blocks across several hard drives at once, so if any one of them malfunctions, the file is no longer complete and cannot be restored. If you value performance and make regular backups, this level can be used on a home PC, where it gives a noticeable increase in performance.
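The alternating block placement described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: blocks are numbered from zero, every disk takes exactly one block per pass, and the function names `stripe_location` and `blocks_lost_with_disk` are made up for this example and do not come from any real RAID tool.

```python
def stripe_location(block_index, num_disks):
    """Return (disk, row) for a logical block in a simple round-robin stripe.

    Assumption for illustration: block 0 goes to disk 0, block 1 to disk 1,
    and so on, wrapping around to disk 0 after the last disk.
    """
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if block_index < 0:
        raise ValueError("block_index must not be negative")
    return block_index % num_disks, block_index // num_disks


def blocks_lost_with_disk(total_blocks, num_disks, failed_disk):
    """List the logical blocks that lived on the failed disk."""
    return [
        block
        for block in range(total_blocks)
        if stripe_location(block, num_disks)[0] == failed_disk
    ]


if __name__ == "__main__":
    # A six-block file on two disks: every second block sits on disk 0.
    print(blocks_lost_with_disk(6, 2, failed_disk=0))
```

With two disks, half of the blocks of any sufficiently large file sit on each disk, which is why losing a single disk leaves every such file incomplete.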

RAID-1 (mirroring), or “mirror mode”. You could call this the paranoid level of RAID: it gives almost no increase in system performance, but it protects your data against the failure of a disk. If one of the disks fails, an exact copy of the lost data remains on the other disk. Like the previous mode, it can be used on a home PC by people who value the data on their disks extremely highly.

RAID 2 arrays use an information recovery algorithm based on Hamming codes (named after Richard Hamming, the American mathematician who developed them in 1950 to correct errors in the operation of electromechanical computers). For this to work, the RAID controller creates two groups of disks: one for storing data, the other for storing error correction codes.

This type of RAID has not become widespread in home systems because it needs too many redundant hard drives: in an array of seven hard drives, for example, only four hold data. As the number of disks increases, the relative redundancy decreases, which is reflected in the table below.
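The seven-disk example corresponds to the classic Hamming(7,4) code. As a rough sketch, assuming a plain Hamming code with no extra overall parity bit, the number of check disks r for m data disks is the smallest r that satisfies 2^r >= m + r + 1. The function names below are illustrative only.

```python
def hamming_check_disks(data_disks):
    """Smallest r such that 2**r >= data_disks + r + 1 (plain Hamming code)."""
    if data_disks < 1:
        raise ValueError("data_disks must be at least 1")
    r = 1
    while 2 ** r < data_disks + r + 1:
        r += 1
    return r


def redundancy_share(data_disks):
    """Fraction of all disks in the array that hold correction codes."""
    r = hamming_check_disks(data_disks)
    return r / (data_disks + r)


if __name__ == "__main__":
    for m in (4, 11, 26):
        r = hamming_check_disks(m)
        print(f"{m} data disks + {r} check disks, "
              f"redundancy {redundancy_share(m):.0%}")
```

Four data disks need three check disks, while 26 data disks need only five, so the share of redundant disks falls as the array grows.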

The main advantage of RAID 2 is the ability to correct errors on the fly without reducing the speed of data exchange between the disk array and the central processor.

RAID 3 and RAID 4

These two types of disk arrays are very similar in design. Both use several hard drives to store information, one of which is used exclusively for storing checksums. Three hard drives are enough to create RAID 3 or RAID 4. Unlike RAID 2, data recovery “on the fly” is impossible: the information is rebuilt over some time after the failed hard drive has been replaced.

The difference between RAID 3 and RAID 4 is the level at which data is split. In RAID 3, information is split into individual bytes, which leads to serious slowdowns when writing or reading a large number of small files. RAID 4 splits data into separate blocks, the size of which does not exceed the size of one sector on the disk. As a result, small files are processed faster, which is critical for personal computers. For this reason, RAID 4 has become more widespread.

A significant disadvantage of these arrays is the increased load on the hard drive that stores the checksums, which significantly shortens its service life.

RAID-5. The so-called fault-tolerant array of independent disks with distributed storage of checksums. This means that in an array of n disks, capacity equal to n-1 disks is available for data, and the remaining disk's worth holds checksums. To explain more clearly, imagine that we need to write a file. It is divided into portions of equal length, which are written in turn to n-1 disks in each pass (stripe). The remaining disk receives a checksum of that pass's data portions, computed with a bitwise XOR operation. The disk that holds the checksum changes from pass to pass, so the checksums end up spread across all the disks.

It is worth warning right away that if any one of the disks fails, the whole array goes into emergency (degraded) mode. Performance drops significantly, because assembling a file then requires extra work to reconstruct its “missing” parts. If two or more disks fail at the same time, the information stored on them cannot be restored. In general, a level 5 RAID array provides fairly high access speed, parallel access to different files, and good fault tolerance.
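The XOR checksum and the reconstruction of a “missing” part can be illustrated with a short Python sketch. It is a simplified model rather than a controller implementation: each portion is a `bytes` object of equal length, and the rotation in `parity_disk` is just one possible layout chosen for the example (real controllers use several different layouts). All function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bitwise XOR of equally sized byte blocks."""
    blocks = list(blocks)
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one lost data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


def parity_disk(stripe_index, num_disks):
    """Illustrative rotation: parity moves one disk to the left per stripe."""
    return (num_disks - 1 - stripe_index) % num_disks


if __name__ == "__main__":
    data = [b"RAID", b"five", b"demo"]      # three data portions, one stripe
    parity = xor_blocks(data)               # checksum for the fourth disk
    restored = rebuild_missing([data[0], data[2]], parity)
    print(restored == data[1])
```

The recovery works because XOR-ing a value with itself cancels it out: combining the checksum with all surviving portions leaves exactly the portion that was lost. With two portions missing from the same stripe, a single checksum is no longer enough, which matches the limitation described above.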

The problem of losing data when two disks fail is largely solved by building arrays using the RAID 6 scheme. In these arrays, capacity equal to that of two hard drives is set aside for checksums, which are likewise distributed cyclically and evenly across the disks. Two checksums are calculated instead of one, which preserves data integrity if two hard drives in the array fail simultaneously.

The advantages of RAID 6 are a high degree of data protection and a smaller drop in performance than in RAID 5 during data recovery when a damaged disk is replaced.

The disadvantage of RAID 6 is that the overall data exchange speed is reduced by approximately 10% due to an increase in the volume of necessary checksum calculations, as well as due to an increase in the amount of information written/read.

Combined RAID types

In addition to the main types discussed above, various combinations of them are widely used, which compensate for certain disadvantages of simple RAID. In particular, the use of RAID 10 and RAID 0+1 schemes is widespread. In the first case, a pair of mirrored arrays are combined into RAID 0, in the second, on the contrary, two RAID 0 are combined into a mirror. In both cases, the increased performance of RAID 0 is added to the information security of RAID 1.
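The capacity cost of the levels discussed so far can be compared with a small calculator. This is a sketch under stated assumptions: all disks are the same size, capacity is counted in whole disks, RAID 1 is treated as a single mirror set, and the minimum disk counts are the commonly quoted ones. The function name `usable_disks` is made up for this example.

```python
def usable_disks(level, num_disks):
    """Return how many disks' worth of space is available for data."""
    minimum = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < minimum[level]:
        raise ValueError(
            f"RAID {level} needs at least {minimum[level]} disks"
        )
    if level == "10" and num_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")

    if level == "0":
        return num_disks          # no redundancy at all
    if level == "1":
        return 1                  # every disk holds the same copy
    if level == "5":
        return num_disks - 1      # one disk's worth of checksums
    if level == "6":
        return num_disks - 2      # two disks' worth of checksums
    return num_disks // 2         # RAID 10: mirrored pairs, then striped


if __name__ == "__main__":
    for level, count in (("0", 4), ("5", 4), ("6", 4), ("10", 4)):
        print(f"RAID {level} on {count} disks: "
              f"{usable_disks(level, count)} disks of usable space")
```

On four disks, RAID 5 leaves three disks' worth of space, while RAID 6 and RAID 10 both leave two, so the choice between those two comes down to speed and to which combinations of failed disks the array must survive.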

To increase the level of protection for important information, RAID 51 or RAID 61 schemes are often used: mirroring arrays that are already well protected ensures exceptional data safety in the event of failures. However, such arrays are impractical at home because of their excessive redundancy.

Building a disk array - from theory to practice

A specialized RAID controller is responsible for building and managing any RAID array. To the relief of the average personal computer user, on most modern motherboards these controllers are already built into the chipset's south bridge. So, to build an array of hard drives, it is enough to buy the required number of drives and select the RAID type you want in the appropriate section of the BIOS settings. After this, instead of several hard drives you will see only one in the system, which can be divided into partitions and logical drives if desired. Note that those who still use Windows XP will need to install an additional driver.

And finally, one more piece of advice: to create a RAID array, buy hard drives of the same capacity, from the same manufacturer, of the same model, and preferably from the same batch. They will then have the same controller logic, and an array built from them will run most stably.
