Addressing Data Center Metrics Pain Points

As much as IT pros hate to be told, “We have to do more with less,” it’s doubtful that this directive will die in the near future. The unfortunate truth is that IT has to continue to do more with either no increase or with decreases in overall resources. This comes at the same time that increasing attention must be paid to various traditional data center pain points.

Let’s now learn about these pain points and how hyperconverged infrastructure can be leveraged to help address them.

The Relationship Between Performance and Virtual Machine Density

Return on investment. Total cost of ownership. These are phrases used to describe the economic impact of technology investments . . . or expenses, depending on your perspective. Regardless of the perspective, though, businesses want to squeeze as much return as possible out of their technology investments while spending as little as reasonably possible on them.

You might be wondering what this quick economic discussion has to do with workload performance in the data center. There is actually a direct link between these two topics and it revolves around overall virtual machine density. Virtual machine density refers to the number of virtual machines that you can cram onto a single host. The more virtual machines that you can fit onto a host, the fewer hosts you need to buy. Obviously, fewer hosts mean having to spend less money on hardware, but the potential savings go far beyond that measure.

When you have fewer hosts, you also spend less on licensing. For example, you don’t need to buy vSphere licenses for hosts that don’t exist! In addition, if you’ve taken Microsoft up on its Windows Server Datacenter licensing model, under which you license the virtualization hosts themselves and can then run as many Windows Server virtual machines on them as you like, you save even more.

The savings don’t stop there. Fewer hosts mean less electricity is needed to power the data center. Fewer hosts mean less cooling is needed. Fewer hosts mean freed-up rack space.

However, these benefits cannot come at the expense of workload performance. When workloads perform poorly, they actively cost the company money in the form of lost efficiency and customer dissatisfaction.

How do you maximize virtual machine density without impacting workload performance? It’s a balance you need to find, but when you’re initially specifying hardware for a new environment, you won’t necessarily know how your workloads will behave there, so outcomes can be tough to predict. Instead, look at the inputs: the resources atop which the new environment is built. Storage is one of the most critical.

Storage Performance in a Hyperconverged Infrastructure

In a hyperconverged infrastructure environment, one of the primary resources that must be considered is storage, and not just from a capacity perspective. Remember that storage and compute are combined in hyperconvergence, so this becomes a factor that is not present in more traditional environments. In a traditional environment, 100% of the available CPU and memory resources are dedicated to serving the needs of running virtual machines. In a hyperconverged infrastructure environment, some of those resources must be diverted to support the storage management function, usually in the form of a virtual storage appliance (VSA). This is one of the core trade-offs to consider when adopting a hyperconverged infrastructure.

This is where hardware acceleration can be a boon. Most hyperconverged infrastructure systems rely on commodity hardware to carry out all functions. With a system that uses hardware acceleration, more of the commodity Intel CPU horsepower can be directed at running virtual machines while the acceleration hardware handles processor-intensive data reduction operations, such as deduplication and compression.

 

Data Deduplication Explained

Consider this scenario: Your organization is running a virtual desktop environment with hundreds of identical workstations all stored on an expensive storage array purchased specifically to support this initiative. That means you’re running hundreds of copies of Windows, Office, ERP software, and any other tools that users require.
Let’s say that each workstation image consumes 25 GB of disk space. With just 200 such workstations, these images alone would consume 5 TB of capacity.

With deduplication, you can store just one copy of the data behind these individual virtual machines and allow the storage array to use pointers for the rest. Each time the deduplication engine comes across a piece of data that is already stored somewhere in the environment, rather than writing that full copy of data all over again, the system instead saves a small pointer in the data copy’s place, thus freeing up the blocks that would have otherwise been occupied.
In Figure 1, the graphic on the left shows what happens without deduplication. The graphic on the right shows deduplication in action. In this example, there are four copies of the blue block and two copies of the green block stored on the array. With deduplication, only one copy of each unique block is written, freeing up the other four blocks.
Figure 1: Deduplication vs. no deduplication
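
To put numbers on the sidebar’s example, here’s a quick back-of-the-envelope calculation in Python. The 200-desktop and 25 GB figures come from the scenario above; the per-desktop delta size is purely an illustrative assumption, not a measured result.

    # Hydrated capacity: every image stored in full
    images = 200
    image_gb = 25
    full_tb = images * image_gb / 1000   # 5.0 TB without deduplication

    # Deduplicated capacity: roughly one master image plus small
    # per-desktop deltas (the 0.5 GB delta is an assumption)
    delta_gb = 0.5
    deduped_tb = (image_gb + images * delta_gb) / 1000   # ~0.125 TB

    print(f"{full_tb:.1f} TB hydrated vs. roughly {deduped_tb:.3f} TB deduplicated")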

Now, let’s expand this example to a real-world environment. Imagine the deduplication possibilities in a VDI scenario: with hundreds of identical or close-to-identical desktop images, deduplication has the potential to significantly reduce the capacity needed to store all of those virtual machines.

Deduplication works by creating a data fingerprint for each object that is written to the storage array. As new data is written to the array, additional data copies beyond the first are saved as tiny pointers. If a completely new data item is written — one that the array has not seen before — the full copy of the data is stored.

Different vendors handle deduplication in different ways. In fact, there are two primary deduplication techniques that deserve discussion: inline deduplication and post-process deduplication.

Inline Deduplication

Inline deduplication takes place at the moment data is written to the storage device. While the data is in transit, the deduplication engine fingerprints it on the fly. As you might expect, this process does create some overhead.

First, the system has to constantly fingerprint incoming data and then quickly determine whether that new fingerprint already matches something in the system. If it does, a pointer to the existing data is written. If it does not, the block is saved as-is. This requires processors that can keep up with what might be a tremendous workload. Further, the process can introduce latency into the storage I/O stream.
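
To make those steps concrete, here is a minimal sketch of an inline deduplication write path in Python. It is not any vendor’s implementation; the 4 KB block size, the SHA-256 fingerprint, and the in-memory index are all illustrative assumptions.

    import hashlib

    BLOCK_SIZE = 4096  # illustrative; real systems choose their own block size

    class InlineDedupStore:
        def __init__(self):
            self.blocks = {}    # fingerprint -> unique block, written once
            self.volume = []    # logical volume: ordered fingerprints (pointers)

        def write(self, data):
            # Fingerprint each block while it is "in flight"
            for i in range(0, len(data), BLOCK_SIZE):
                block = data[i:i + BLOCK_SIZE]
                fp = hashlib.sha256(block).hexdigest()
                if fp not in self.blocks:
                    self.blocks[fp] = block   # new data: save the block as-is
                self.volume.append(fp)        # duplicates keep only a pointer

    store = InlineDedupStore()
    store.write(b"A" * BLOCK_SIZE * 4 + b"B" * BLOCK_SIZE * 2)
    print(len(store.volume), "logical blocks ->", len(store.blocks), "physical")
    # 6 logical blocks -> 2 physical, as in the blue/green example above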

A few years ago, this might have been a showstopper, since some storage controllers simply couldn’t keep up with the workload. Today, though, processors have evolved far beyond what they were just a few years ago, and these kinds of workloads don’t have the negative performance impact they once might have had. In fact, inline deduplication is a cornerstone feature of most new storage devices released in the past few years and, while it may introduce some overhead, it generally provides far more benefit than cost. With a hardware-accelerated hyperconverged infrastructure, inline deduplication is not only the norm, it’s a cornerstone of the value derived from the infrastructure.

Post-Process Deduplication

As mentioned, inline deduplication introduces potential processing overhead and latency. The problem with some deduplication engines is that they have to run constantly, which means the system needs to be configured with that constant load in mind. Making matters worse, it can be difficult to predict exactly how much processing power will be needed to achieve the deduplication goal, so it’s not always possible to plan overhead requirements perfectly.

This is where post-process deduplication comes into play. Whereas inline deduplication processes data as it flows through the storage controllers, post-process deduplication happens on a regular schedule, perhaps overnight. With post-process deduplication, all data is first written in its full form, copies and all. On that regular schedule, the system then fingerprints all new data and removes duplicate copies, replacing them with pointers to the original copy of the data.
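
Continuing the sketch from the inline example (same illustrative assumptions), the post-process variant moves the fingerprinting out of the write path and into a scheduled job:

    import hashlib

    class PostProcessDedupStore:
        def __init__(self):
            self.segments = []   # data lands fully hydrated, copies and all

        def write(self, block):
            self.segments.append(block)   # no fingerprinting at write time

        def dedupe_pass(self):
            # Scheduled job (e.g., overnight): fingerprint everything written
            # so far, keep one copy per fingerprint, replace the rest with
            # pointers to the original copy
            unique = {}
            volume = []
            for block in self.segments:
                fp = hashlib.sha256(block).hexdigest()
                unique.setdefault(fp, block)
                volume.append(fp)
            self.segments = list(unique.values())  # duplicate space reclaimed
            return volume                          # pointers to unique blocks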

Post-process deduplication enables organizations to use this data reduction service without the constant processing overhead of inline deduplication, since the dedupe work can be scheduled for off hours.

The biggest downside to post-process deduplication is that all data is initially stored fully hydrated (a technical term meaning the data has not yet been deduplicated) and, as such, requires all of the space that non-deduplicated data needs. Only after the scheduled process runs is the data shrunk. For those using post-process dedupe, bear in mind that, at least temporarily, you’ll need to plan on having extra capacity. A number of hyperconverged infrastructure systems use post-process deduplication, while others don’t deduplicate at all. The lack of full inline deduplication increases costs and reduces efficiency.

 

Hardware Acceleration to the Rescue

Hardware-accelerated hyperconverged infrastructure solutions solve the overhead challenges described above. All deduplication tasks are delegated to the accelerator, eliminating the need for the system to consume processor resources that are also needed by the virtual machines.

 

Tiering and Deduplication

In order to match storage performance needs with storage solutions, many companies turn to tiered storage: they run, for example, hard disk-based arrays for archival data and flash systems for performance needs, and they manage these resources separately. That also means deduplication is handled separately per tier. Each time deduplication is performed separately, additional CPU resources must be brought to bear, and multiple copies of deduplicated data persist across tiers. Neither is efficient. Hyperconverged systems that include comprehensive inline deduplication services deliver far more efficient outcomes.

Don’t underestimate the benefits of data reduction! These services have far more impact on the environment than might be obvious at first glance and the benefits go far beyond simple capacity gains, although capacity efficiency is important.

There are several metrics that benefit when dedicated and specialized hardware is brought to bear.

Capacity

The sidebar, Data Deduplication Explained, discusses generalized capacity benefits of deduplication, but let’s now consider this in the world of hyperconverged infrastructure. In order to do this, you need to consider your organization’s holistic data needs:
Primary storage — This is the storage that’s user- or application-facing. It’s where your users store their stuff, where email servers store your messages, and where your ERP database is housed. It’s the lifeblood of your day-to-day business operations.
Backup — An important pillar of the storage environment revolves around storage needs related to backup. As the primary storage environment grows, more storage has to be added to the backup environment, too.
Disaster recovery — For companies that have disaster recovery systems in place in which data is replicated to secondary data centers, there is continued need to grow disaster recovery-focused storage systems.

When people think about storage, they often focus just on primary storage, especially as users and applications demand more capacity. But when you look at the storage environment from the top down, storage growth happens across all of these storage tiers, not just the primary storage environment.

In other words, your capacity needs are growing far faster than it might appear. Hardware acceleration, when applied to all of the storage domains in aggregate, can have a tremendous impact on capacity. By treating all of these individual domains as one, and deduplicating across all of them, you can achieve pretty big capacity savings.
But deduplication, as mentioned before, can be CPU-intensive. By leveraging hardware acceleration, you can deduplicate all of this without taking CPU resources away from running workloads. By infusing the entire storage environment with global deduplication capabilities via hardware acceleration, you can get capacity benefits that were only the stuff of dreams just a few years ago. Hyperconvergence with great deduplication technology can attain great results while also simplifying the overall management needs in the data center.
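
As a back-of-the-envelope illustration of why global deduplication matters, consider the following sketch. Every number in it is hypothetical; actual ratios depend entirely on the data set.

    # Hypothetical logical capacity per storage domain, in TB
    primary, backup, dr = 20, 60, 20              # 100 TB logical overall

    # Dedupe each domain separately: copies shared ACROSS domains survive
    per_tier_ratio = 3.0                          # assumed within-tier ratio
    per_tier_tb = (primary + backup + dr) / per_tier_ratio    # ~33.3 TB

    # Global dedupe: backup and DR largely repeat primary's data, so
    # cross-domain duplicates collapse as well (assumed global ratio)
    global_ratio = 8.0
    global_tb = (primary + backup + dr) / global_ratio        # 12.5 TB

    print(f"per-tier: {per_tier_tb:.1f} TB physical; global: {global_tb:.1f} TB")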

IOPS

Imagine a world in which you don’t actually have to write 75% of the data that is injected into a system. That world becomes possible when hardware acceleration is used so that all workloads benefit from inline deduplication, not just some workloads.

The more the data can be deduplicated, the fewer write operations have to take place. For example, a deduplication ratio of 5:1 means that only one actual write to storage takes place for every five attempted write operations.
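
The arithmetic behind both figures above (the 75% example and the 5:1 ratio), as a quick sketch:

    def write_reduction(dedup_ratio):
        # Fraction of attempted writes that never have to hit storage
        return 1 - 1 / dedup_ratio

    print(write_reduction(4))   # 0.75 -> a 4:1 ratio avoids 75% of writes
    print(write_reduction(5))   # 0.8  -> a 5:1 ratio avoids 80% of writes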

Hardware acceleration allows this comprehensive data reduction process to take place in a way that doesn’t divert workload CPU resources. As a result, you continue to enjoy the IOPS benefits without having to give up workload density.

Latency

Latency is the enemy of data center performance. By offloading intensive tasks to a custom-developed hardware board that specifically handles these kinds of tasks, latency can be reduced to the point where it doesn’t affect application performance.

Application Performance

At the end of the day, all that matters is application performance. That’s the primary need in the data center environment and, while it can be difficult to measure, you will know very quickly if you’ve failed to hit this metric: the phones will start to ring off the hook. Hardware acceleration helps you keep this metric in the green.

Linear Scalability

Businesses grow all the time, and the data center has to grow along with them. Scaling “up” has been one of the primary accepted ways to grow, but it carries some risk. Remember, in the world of storage, scaling up occurs when you add capacity without also adding more CPU and networking capacity at the same time. The problem is that you run the risk of eventually overwhelming the shared resources. Figure 2 shows an example of a scale-up environment.

Figure 2: A scale-up environment relies on shared components

Scale-out has become a more popular option because it expands all resources at the same time. With hyperconverged infrastructure, this scaling method is referred to as linear scalability. Each node has all of the resources it needs — CPU, RAM, and storage — to stand alone. Figure 3 gives you a look at this kind of scalability.

Figure 3: A scale-out environment has nodes that can individually stand alone

For solutions that use hardware acceleration, accelerators are a critical part of the scaling story because they offload intensive functions that would otherwise impact workloads. This increases density but, more importantly, by freeing up resources it lets IT bring more predictability to overall application performance, even while maintaining high levels of density.
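
A simple model of linear scalability makes the contrast clear. The per-node resources below are hypothetical:

    # Hypothetical resources contributed by each hyperconverged node
    NODE = {"cpu_cores": 24, "ram_gb": 512, "storage_tb": 20}

    def cluster_capacity(node_count):
        # Every resource grows in lockstep with node count; there is no
        # shared controller to saturate, unlike a scale-up array
        return {name: amount * node_count for name, amount in NODE.items()}

    for n in (2, 4, 8):
        print(n, "nodes ->", cluster_capacity(n))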

Summary

The items discussed here are critically important to ensuring that the data center adequately (maybe even excellently!) supports business needs, but these metrics are just the very tip of the iceberg. Under the waterline are other requirements that can’t be ignored.



Scott D. Lowe, vExpert, MVP Hyper-V, MCSE
scott@actualtechmedia.com

Virtualization and storage expert Scott D. Lowe is a multi-year vExpert, MVP Hyper-V, frequent speaker for multiple organizations, and Co-Founder of ActualTech Media. Scott has been in the IT field for over twenty years, with ten of those years filling the CIO role for various organizations. He has authored several books and hundreds of whitepapers, research reports, and the like throughout his career. Over the years, he has regularly contributed to sites such as TechRepublic, Wikibon, and VirtualizationSoftware.com, and is currently the editor of EnterpriseStorageGuide.com.
