VMware Software-Defined Storage - Martin Hosken

Chapter 2
Classic Storage Models and Constructs
Classic Storage Concepts


Storage infrastructure is made up of a multitude of complex components and technologies, all of which need to interact seamlessly to provide high performance, continuous availability, and low latency across the environment. For students of vSphere storage, understanding the design and implementation complexities of mixed, multiplatform, multivendor enterprise or service provider–based storage can at first be overwhelming. Gaining the required understanding of all the components, technologies, and vendor-specific proprietary hardware takes time.

This chapter addresses each of these storage components and technologies, and their interactions in the classic storage environment. Upcoming chapters then move on to next-generation VMware storage solutions and the software-defined storage model.

This classic storage model employs intelligent but highly proprietary storage systems to group disks together and then partition and present those physical disks as discrete logical units. Because of the proprietary nature of these storage systems, my intention here is not to address the specific configuration of, for instance, HP, IBM, or EMC storage, but to demonstrate how the vSphere platform can use these types of classic storage devices.

In the classic storage model, the logical units, or storage devices, are assigned a logical unit number (LUN) before being presented to vSphere host clusters as physical storage devices. These LUNs are backed by a back-end physical disk array on the storage system, which is typically served by RAID (redundant array of independent disks) technology; depending on the hardware type, this technology can be applied at either the physical or logical disk layer, as shown in Figure 2.1.


Figure 2.1 Classic storage model


The LUN, or storage device, is a virtual representation of a portion of physical disk space within the storage array. The LUN aggregates a portion of disk space across the physical disks that make up the back-end system. However, as illustrated in the previous figure, the data is not written to a single physical device, but is instead spread across the drives. It is this mechanism that allows storage systems to provide fault tolerance and performance improvements over writing to a single physical disk.

This classic storage model has several limitations. To start with, all virtual disks (VMDKs) on a single LUN are treated the same, regardless of the LUN’s capabilities. For instance, you cannot replicate just a single virtual disk at the storage level; it is the whole LUN or nothing. Also, even though vSphere now supports LUNs of up to 64 terabytes, LUNs are still restricted in size, and you cannot attach more than 256 LUNs to a vSphere host or cluster.

In addition, with this classic storage approach, when a SCSI LUN is presented to the vSphere host or cluster, the underlying storage system has no knowledge of the hypervisor, filesystem, guest operating system, or application. It is left to the hypervisor and vCenter, or other management tools, to map objects and files (such as VMDKs) to the corresponding extents, pages, and logical block address (LBA) understood by the storage system. In the case of a NAS-based NFS solution, there is also a layer of abstraction placed over the underlying block storage to handle file management and the associated file-to-LBA mapping activity.

Other classic storage architecture challenges include the following:

• Proprietary technologies and not commodity hardware

• Low utilization of raw storage resources

• Frequent overprovisioning of storage resources

• Static, nonflexible classes of service

• Rigid provisioning methodologies

• Lack of granular control, at the virtual disk level

• Frequent data migrations required, due to changing workload requirements

• Time-consuming operational processes

• Lack of automation and common API-driven provisioning

• Slow storage-related requests requiring manual human interaction to perform maintenance and provisioning operations

Most storage systems offer two basic categories of LUN: the traditional model and disk pools. The traditional model was the standard mechanism in legacy storage systems for many years. Disk pools are a more recent development that gives compatible systems additional flexibility and scalability when provisioning virtual storage resources.

In the traditional model, when a LUN is created, the number and choice of disks directly correspond to the RAID type and disk device configured. This model has limitations, especially in virtual environments, which is why it was superseded by the more modern disk pool concept. The traditional model often imposed a fixed maximum on the number of physical disks that could be combined to form the logical disk. Storage array systems enforced this maximum as a hard limit, but it was also linked to practical considerations around availability and performance.

With this traditional disk-grouping method, it was often possible to expand a logical disk beyond its imposed physical limits by creating some sort of MetaLUN. However, this increased operational complexity and could often be difficult and time-consuming.

An additional consideration with this approach was that the amount of storage provisioned was often far greater than what was actually required, because of the tightly imposed array constraints. Storage administrators also overprovisioned deliberately, either to avoid the application outages often required when expanding storage, or to cover unknown workload requirements and growth patterns. Either way, the result was typically expensive disk storage sitting unused most of the time.

On the plus side, this traditional approach to provisioning LUNs provided fixed, predictable performance, based on the RAID and disk type employed. For this reason, this method of disk provisioning is still sometimes a good choice when storage requirements do not have large amounts of expected growth, or have fixed service-level agreements (SLAs) based on strict application I/O requirements.

In more recent years, storage vendors have moved almost uniformly to disk pools. Pools can use far larger groups of disks, from which LUNs can be provisioned. While the disk pool concept still comprises physical disks employing a RAID mechanism to stripe or mirror data, with a LUN carved out from the pool, this device type can be built across a far greater number of disks. As a result of this approach, storage administrators can provision significantly larger LUNs without sacrificing levels of availability.

However, the trade-off for this more flexible approach is a small degree of variability in performance. This is due both to the number of applications likely to share the storage of a single disk pool, which will inevitably increase over time, and to the potentially heterogeneous nature of disk pools, which impose no requirement for uniformity in the speed or capacity of individual physical disks (see Figure 2.2).


Figure 2.2 Storage LUN provisioning mechanisms


Also relevant from a classic storage design perspective are the trade-offs associated with choosing between provisioning a single disk pool or multiple disk pools. If choosing multiple pools, what criteria should a design use to define those pools?

We address tiering and autotiering in more detail later in this chapter, but this is one of the key design factors when considering whether to provision a single pool, with all the disk resources, or to deploy multiple storage pools on the array and to split storage resources accordingly.

Choosing a single pool provides simpler operational and capacity management of the environment. In addition, it allows LUNs or filesystems to be striped across a larger number of physical disks, which improves overall performance of the array system. However, a larger number of hosts and clusters are also likely to share the same underlying back-end disk system. This increases the possibility of resource contention, as well as the risk that specific applications will not use an optimal RAID configuration or maximize their I/O, which is likely to degrade performance for those workloads.

Using multiple disk pools offers the flexibility to customize storage resources to meet specific application I/O requirements, and also allows operational teams to isolate specific workloads to specific physical drives, reducing the risk of disk contention. However, as the pools are inevitably smaller in this type of architecture, some systems may experience lower levels of performance than with a single larger pool. In addition, with multiple smaller pools, capacity planning becomes more complex, as growth across disk pools may not be consistent, and there is likely to be an increase in overall disk resources not being used.

Neither of these options is without its advantages and drawbacks, and there is no one perfect solution. However, designing a solution that uses multiple smaller pools over one universal disk pool will likely come down to one or more of the following key design factors:

• Disk pools based on function, such as development, QA, production, and so on. This option may be preferred if you are concerned with performance for specific environments, and want to isolate them from impacting the production system.

• In multitenanted environments, whether public or based on internal business units, each tenant can be allocated its own pool. However, depending on the environment and SLAs, each tenant might end up with multiple pools in order to address specific I/O characteristics of various applications.

• Application-based pools, such as database or email systems. This can provide optimum performance as applications of similar type often have similar I/O characteristics. For this reason, it may be worth considering designing pools based on application type. However, this also carries the risk of some databases, for instance, generating very high volumes of I/O and potentially impacting other databases residing on the same disk pool.

• Drive technology and RAID type. This allows you to place data on the storage type that best matches the application I/O characteristics, such as reads versus writes versus sequential. However, this approach can also increase costs and does not address any specific application I/O intensity requirement.

• Storage tier–based pools (such as Gold, Silver, and Bronze) could allow you to mix drive technologies and/or RAID types within each pool, therefore reducing the number of pools required to support most application types, configurations, and SLAs.

RAID Sets

The term RAID has already been used multiple times in different contexts, so let’s address this technology next.

RAID (redundant array of independent disks) combines two or more disk drives into a logical grouping, typically known as a RAID set. Under the control of a RAID controller (or in the case of a storage system, the storage processors or controllers), the RAID set appears to the connected hosts as a single logical disk drive, even though it is made up of multiple physical disks. RAID sets provide four primary advantages to a storage system:

• Higher data availability

• Increased capacity

• Improved I/O performance

• Streamlined management of storage devices

Typically, the storage array management software handles the following aspects of RAID technology:

• Management and control of disk aggregation

• Translation of I/O requests between the logical and the physical entities

• Error correction if disk failures occur

The physical disks that make up a RAID set can be either traditional mechanical disks or solid-state flash drives (SSDs). RAID sets have various levels, each optimized for specific use cases. Unlike many other common technologies, RAID levels are not standardized by an industry group or standardization committee. As a result, some storage vendors provide their own unique implementation of RAID technology. However, the following common RAID levels are covered in this chapter:

• RAID 0–striping

• RAID 1–mirroring

• RAID 5–striping with parity

• RAID 6–striping with double parity

• RAID 10–combining mirroring and striping

Determining which type of RAID to use when building a storage solution largely depends on three factors: capacity, availability, and performance. This section addresses the basic concepts that provide a foundation for understanding disk arrays, and how RAID can enable increased capacity by combining physical disks, provide higher availability in case of a drive failure, and increase performance through parallel drive access.
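The capacity factor of that trade-off is easy to make concrete. The following Python sketch is illustrative only (the helper and its asserts are mine, not the book's) and assumes n equal-size disks per RAID set, ignoring metadata overhead and hot spares:

```python
def raid_usable_capacity(level, n_disks, disk_tb):
    """Usable capacity (TB) for common RAID levels, given n equal-size disks.

    Illustrative only; real arrays reserve additional space for
    metadata, vault drives, and hot spares.
    """
    if level == 0:                        # striping: no redundancy
        return n_disks * disk_tb
    if level == 1:                        # mirroring: two identical copies
        assert n_disks == 2
        return disk_tb
    if level == 5:                        # one disk's worth of parity
        assert n_disks >= 3
        return (n_disks - 1) * disk_tb
    if level == 6:                        # two disks' worth of parity
        assert n_disks >= 4
        return (n_disks - 2) * disk_tb
    if level == 10:                       # striped mirrors: half the raw space
        assert n_disks >= 4 and n_disks % 2 == 0
        return (n_disks // 2) * disk_tb
    raise ValueError(f"unsupported RAID level: {level}")

print(raid_usable_capacity(5, 8, 2))      # eight 2 TB disks in RAID 5 -> 14
```

The same shape of calculation, with availability (disk failures tolerated) in place of capacity, drives the RAID-level comparisons later in this section.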

A key element in RAID is redundancy, in order to improve fault tolerance. This can be achieved through two mechanisms, mirroring and striping, depending on the RAID set level configured. Before addressing the RAID set capabilities typically used in storage array systems, we must first explain these two terms and what they mean for availability, capacity, performance, and manageability.

NOTE

Some storage systems also provide a JBOD configuration, which is an acronym for just a bunch of disks. In this configuration, the disks do not use any specific RAID level, and instead act as stand-alone drives. This type of disk arrangement is most typically employed for storage devices that contain swap files or spooling data, where redundancy is not paramount.

Striping in RAID Sets

As highlighted previously, RAID sets are made up of multiple physical disks. Within each disk are groups of continuously addressed blocks, called strips. The set of aligned strips that spans across all disks within the RAID set is called the stripe (see Figure 2.3).


Figure 2.3 Strips and stripes
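The strip/stripe geometry determines where any given logical block lands. As a sketch (function name and the round-robin layout assumption are mine; real controllers vary), mapping a logical block address to a member disk in a simple striped set looks like this:

```python
def locate_block(lba, strip_blocks, n_disks):
    """Map a logical block address to (disk, stripe, offset) in a striped set.

    Assumes a simple RAID 0 layout where strips rotate round-robin
    across the member disks; strip_blocks is the strip size in blocks.
    """
    strip_index = lba // strip_blocks      # which strip, counted across the set
    disk = strip_index % n_disks           # strips rotate over the disks
    stripe = strip_index // n_disks        # aligned row of strips = the stripe
    offset = lba % strip_blocks            # block position inside the strip
    return disk, stripe, offset

# Block 130 with 128-block strips on four disks lands on disk 1:
print(locate_block(130, 128, 4))           # (1, 0, 2)
```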


Striping improves performance by distributing data across the disks in the RAID set (see Figure 2.4). This use of multiple independent disks allows multiple reads and writes to take place concurrently, providing one of the main advantages of disk striping: improved performance. For instance, striping data across three hard disks would provide three times the bandwidth of a single drive. Therefore, if each drive runs at 175 input/output operations per second (IOPS), disk striping would make available up to 525 IOPS for data reads and writes from that RAID set.


Figure 2.4 Performance in striping
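The arithmetic in the example above scales linearly, which a one-line helper (mine, for illustration) makes plain:

```python
def striped_iops(n_disks, iops_per_disk):
    # Reads and writes distribute across all members of the stripe,
    # so raw throughput scales linearly with the number of disks.
    return n_disks * iops_per_disk

print(striped_iops(3, 175))   # 525 IOPS, matching the three-disk example
```

Note this is the raw, best-case figure; the RAID write penalties discussed later in this chapter reduce the usable number for write-heavy workloads.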


Striping also provides performance and availability benefits by doing the following:

• Breaking large amounts of data into pieces as it is written: the first piece is sent to the first drive, the second piece to the second drive, and so on. The pieces are reassembled when the data is read.

• Increasing performance as physical disks are added to the RAID set, because more data can be read or written simultaneously.

• Scaling with stripe width: a higher stripe width means more drives and therefore better performance.

• Remaining fully transparent to the vSphere platform, because striping is managed by the storage controllers.

As part of the same mechanism, parity provides a redundancy check to ensure that data is protected without requiring a full set of duplicate drives, as illustrated in Figure 2.5. Parity is critical to striping, and provides the following functionality to a striped RAID set:


Figure 2.5 Redundancy through parity


• If a single disk in the array fails, the other disks have enough redundant data so that the data from the failed disk can be recovered.

• Like striping, parity is generally a function of the RAID controller or storage controller, and is therefore fully transparent to the vSphere platform.

• Parity information can be

• Stored on a separate, dedicated drive

• Distributed across all the drives in the RAID set
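The arithmetic behind parity is byte-wise XOR. This short Python sketch (names are mine, not the book's) computes a parity strip and then rebuilds a lost data strip from the survivors, which is exactly the recovery property described above:

```python
from functools import reduce

def parity(strips):
    """Parity strip = byte-wise XOR of all data strips in the stripe."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*strips))

def rebuild(surviving_strips, parity_strip):
    """Recover a single failed strip: XOR the parity with the survivors."""
    return parity(surviving_strips + [parity_strip])

data = [b"AAAA", b"BBBB", b"CCCC"]              # three data strips
p = parity(data)
# Disk holding the second strip fails; its contents are recoverable:
assert rebuild([data[0], data[2]], p) == b"BBBB"
```

Because XOR is associative and self-inverse, recovering the missing strip is just one more pass of the same operation used to create the parity, whether the parity lives on a dedicated drive or is distributed across the set.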

Mirroring in RAID Sets

Mirroring uses a mechanism that enables multiple physical disks, typically two, to hold identical copies of the data. Every write to one disk is also a write to the mirrored disk, so both physical disks contain exactly the same information at all times. This mechanism is once again fully transparent to the vSphere platform and is managed by the RAID controller or storage controller. If a disk fails, the RAID controller uses the surviving mirrored drive for data recovery while continuing to serve I/O operations; once the failed disk is replaced, its data is rebuilt from the mirrored drive in the background.

The primary benefits of mirroring are that it provides fast recovery from disk failure and improved read performance (see Figure 2.6). However, the main drawbacks include the following:

• Degraded write performance, as each block of data is written to multiple disks simultaneously

• A high financial cost for data protection, in that disk mirroring requires a 100 percent cost increase per gigabyte of data


Figure 2.6 Redundancy in disk mirroring
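The mirrored-write behavior can be sketched in a few lines of Python (an illustrative model, not a vendor implementation; the class and its names are mine): every write lands on both members, so a read survives the loss of either one.

```python
class MirroredPair:
    """Minimal RAID 1 model: two disks holding identical copies."""

    def __init__(self):
        self.disks = [dict(), dict()]    # block address -> data
        self.healthy = [True, True]

    def write(self, lba, data):
        # Write amplification: one front-end write = two back-end writes.
        for disk, ok in zip(self.disks, self.healthy):
            if ok:
                disk[lba] = data

    def read(self, lba):
        # Serve the read from any surviving copy.
        for disk, ok in zip(self.disks, self.healthy):
            if ok:
                return disk[lba]
        raise IOError("both mirror members have failed")

m = MirroredPair()
m.write(7, b"payload")
m.healthy[0] = False                     # simulate a disk failure
assert m.read(7) == b"payload"           # data still served from the mirror
```

The two drawbacks listed above are both visible here: the write loop touches every member (degraded write performance), and the pair stores each block twice (100 percent capacity overhead).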


Enterprise storage systems typically support multiple RAID levels, and these levels can be mixed within a single storage array. However, once a RAID type is assigned to a set of physical disks, all LUNs carved from that RAID set will be assigned that RAID type.

Nested RAID

Some RAID levels are referred to as nested RAID, as they are based on a combination of RAID levels. Examples of nested RAID levels include RAID 03 (RAID 0+3, also known as RAID 53, or RAID 5+3) and RAID 50 (RAID 5+0). However, the only two commonly implemented nested RAID levels are RAID 1+0, also commonly known as RAID 10, and RAID 01 (RAID 0+1). These two are similar, except the data organization methods are slightly different; rather than creating a mirror and then striping the mirror, as in RAID 1+0, RAID 0+1 creates a stripe set and then mirrors it.

Calculating I/O per Second RAID Penalty

One of the primary ways to measure disk performance is input/output operations per second, also referred to as I/O per second or, more commonly, IOPS. The measure is simple: one read request or one write request equals one I/O.

Each physical disk in the storage system is capable of providing a fixed number of IOPS. Disk manufacturers calculate this figure from the drive's rotational speed, average latency, and seek time. Table 2.1 shows examples of typical physical drive IOPS specifications for the most common drive types.


Table 2.1 Typical average I/O per second (per physical disk)


A storage device’s IOPS capability is calculated as the aggregate of the disks that make up the device. For instance, in a JBOD configuration, three disks rotating at 10,000 RPM provide the JBOD with a total of 375 IOPS. However, with the exception of RAID 0 (which is simply a set of disks aggregated together to create a larger storage device), all RAID set configurations are based on the fact that each front-end write operation results in multiple back-end writes to the RAID set, in order to provide the targeted level of availability and performance.

In a RAID 5 disk set, for example, each random write request requires the storage controller to perform multiple disk operations, which has a significant impact on the raw IOPS calculation: a RAID 5 disk set typically requires four I/O operations per write. RAID 6, which provides a higher level of protection through double fault tolerance, carries an even higher penalty of six operations per write. As the architect of such a solution, you must therefore plan for the I/O penalty associated with the RAID type used in the design.
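Folding the write penalty into the raw aggregate gives the front-end IOPS a RAID set can actually sustain. The formula below is the widely used sizing approximation, sketched with illustrative numbers (the function name and inputs are mine, not the book's):

```python
def functional_iops(n_disks, iops_per_disk, read_fraction, write_penalty):
    """Approximate front-end IOPS for a RAID set and a given read/write mix.

    raw = disks x per-disk IOPS; each front-end write costs `write_penalty`
    back-end operations (RAID 0 -> 1, RAID 1/10 -> 2, RAID 5 -> 4, RAID 6 -> 6).
    """
    raw = n_disks * iops_per_disk
    write_fraction = 1.0 - read_fraction
    return raw / (read_fraction + write_fraction * write_penalty)

# Eight 150 IOPS disks, a 70/30 read/write mix, RAID 5 (penalty of 4):
print(round(functional_iops(8, 150, 0.70, 4)))   # 632
```

Note how heavily the penalty bites: the same eight disks deliver 1,200 raw IOPS, but only around half of that to a write-heavy workload on parity RAID.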

Table 2.2 summarizes the read and write RAID penalties for the most common RAID levels. Notice that no parity has to be calculated for a read operation, so no penalty is associated with this type of I/O. The RAID penalty relates specifically to writes; it comes into play only in write calculations, even though a parity-based write operation performs reads as part of the write. For instance, a write to a RAID 5 disk set, where the data being written is smaller than a single block, requires the following actions:


Table 2.2 RAID I/O penalty impact

1. Read the old data block.

2. Read the old parity block.

3. Compare data in the old block with the newly arrived data. For every changed bit, change the corresponding bit in parity.

4. Write the new data block.

5. Write the new parity block.
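The five steps above can be condensed into a sketch. Assuming byte-wise XOR parity (a simplification of real controller firmware; the helper is mine, not the book's), the bit-comparison in step 3 collapses into a single XOR chain, and the two reads plus two writes are the four back-end operations behind RAID 5's write penalty:

```python
def raid5_small_write(old_data, old_parity, new_data):
    """Read-modify-write for a sub-stripe RAID 5 write.

    Reads the old data and old parity (steps 1-2), flips in parity
    exactly the bits that changed in the data (step 3), and returns
    the new data and parity blocks to be written (steps 4-5).
    """
    new_parity = bytes(p ^ od ^ nd
                       for p, od, nd in zip(old_parity, old_data, new_data))
    return new_data, new_parity

# Stripe of three data strips; parity is their XOR:
strips = [b"\x01", b"\x02", b"\x04"]
par = bytes(a ^ b ^ c for a, b, c in zip(*strips))       # 0x07
strips[1], par = raid5_small_write(strips[1], par, b"\x08")
assert par == bytes(a ^ b ^ c for a, b, c in zip(*strips))  # still consistent
```

The key point is that the untouched data strips never need to be read: XOR-ing out the old data and XOR-ing in the new is enough to keep the parity consistent across the whole stripe.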

As noted previously, a RAID 0 stripe has no write penalty associated with it because there is no parity to be calculated. In Table 2.2, no RAID penalty is expressed as a 1.

