Condusiv Technologies Blog


Blogging @Condusiv

The Condusiv blog shares insight into the issues surrounding system and application performance—and how I/O optimization software is breaking new ground in solving those issues.

How To Get The Most Out Of Your Flash Storage Or Move To Cloud

by Rick Cadruvi, Chief Architect 2. January 2020 10:41

You just went out and upgraded your storage to all-flash. Or maybe you moved your systems to the cloud, where you can choose the SLA that gets you the performance you want. We can provide you with a secret weapon that will keep you looking like a hero and deliver the real performance you made these choices for.


Let's start with why you made those choices in the first place. Why did you make the change? Why not just upgrade the aging storage to a new-generation HDD or hybrid storage subsystem? After all, if you're like most of us, you're still experiencing explosive growth in data, and HDDs continue to be more cost-effective for whatever data requirements you're going to need in the future.

 

If you went to all-flash, perhaps it was the decreasing cost that made it more approachable from a budgetary point of view and the obvious gain in speed made it easy to justify.

 

If it was a move to the cloud, there may have been many reasons including:

   •  Not having to maintain the infrastructure anymore

   •  More flexibility to quickly add additional resources as needed

   •  Ability to pay for the SLA you need to match application needs to end user performance

Good choices. So, what can Diskeeper® and V-locity® do to make these choices even better and deliver the expected performance at peak times, when it's needed most?

 

Let’s start with a brief conversation about I/O bottlenecks.

 

If you have an All-Flash Array, you still have a network connection between your system and your storage array. If you have local flash storage, system memory is still faster, but your data size requirements make memory a limited resource.

 

If you're on the cloud, you're still competing for resources. And at peak times, you'll see slowdowns due to resource contention. Plus, you will experience issues because of File System and Operating System overhead.

 

File fragmentation significantly increases the number of I/Os your applications have to request to process the data they need. Free space fragmentation adds overhead to allocating file space and makes file fragmentation far more likely.
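To put rough numbers on that, here is a tiny illustrative sketch (hypothetical figures, not anything measured by our products) of how fragment count drives up the read I/Os needed to scan a single file, assuming each I/O can cover at most a fixed transfer size and never spans a fragment boundary:

```python
# Hypothetical illustration only: how fragmentation inflates the I/O count
# needed to read a file sequentially. Not Condusiv code.
import math

def read_ios_needed(file_size_mb: float, fragment_count: int, max_io_mb: float = 8) -> int:
    """Estimate read I/Os for a sequential scan, assuming one I/O can span a
    contiguous run of data but never crosses a fragment boundary."""
    fragment_size = file_size_mb / fragment_count
    ios_per_fragment = max(1, math.ceil(fragment_size / max_io_mb))
    return fragment_count * ios_per_fragment

print(read_ios_needed(64, 1))    # 8 I/Os when the file is one contiguous extent
print(read_ios_needed(64, 256))  # 256 I/Os when the same file sits in 256 small fragments
```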

 

Then there are all the I/Os that Windows creates that are not directly related to your application's data access. And then you have utilities to deal with anti-malware, data recovery, etc. And trust me, there are LOTS of those.

 

At Condusiv, we’ve watched the dramatic changes in storage and data for a long time.  The one constant we have seen is that your needs will always accelerate past the current generation of technologies you use.  We also handle the issues that aren’t handled by the next generation of hardware.  Let’s take just a minute and talk about that.

 

What about all the I/O overhead created in the background by Windows or your anti-malware and other system utility software packages?  What about the I/Os that your application doesn’t bother to optimize because it isn’t the primary data being accessed?  Those I/Os account for a LOT of I/O bandwidth.  We refer to those as “noisy” I/Os.  They are necessary, but not the data your application is actually trying to process.  And, what about all the I/Os to the storage subsystem from other compute nodes?  We refer to that problem as the I/O Blender Effect.

 

 

Our RAM caching technologies are highly optimized to use a small amount of RAM to eliminate the maximum amount of I/O overhead. They do this dynamically, so that when you need RAM the most, we free it up for your needs. Then, when RAM is available, we use it to remove the I/Os causing the most overhead. A small amount of free RAM goes a long way toward reducing the I/O overhead problem. That's because our caching algorithms look at how to eliminate the most I/O overhead effectively. We don't use LIFO or FIFO algorithms and simply hope I/Os get eliminated. Our algorithms use empirical data, in real time, to deliver maximum I/O overhead elimination while using minimal resources.
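To give a feel for what empirical, benefit-driven caching means compared to LIFO or FIFO, here is a deliberately simplified sketch. It is not the IntelliMemory implementation, just an illustration of scoring cached blocks by the storage latency they have actually been saving and releasing the least valuable ones first when the system wants its RAM back:

```python
# Simplified, hypothetical sketch of benefit-based caching (not the actual
# IntelliMemory algorithm): keep the blocks that are saving the most I/O
# overhead, and give RAM back by evicting the least valuable blocks first.
import time

class BenefitCache:
    def __init__(self, capacity_blocks: int):
        self.capacity = capacity_blocks
        self.blocks = {}  # block_id -> {"data", "hits", "last_cost_ms", "last_hit"}

    def insert(self, block_id: str, data: bytes, storage_cost_ms: float) -> None:
        if len(self.blocks) >= self.capacity:
            self.shrink(self.capacity - 1)
        self.blocks[block_id] = {"data": data, "hits": 1,
                                 "last_cost_ms": storage_cost_ms,
                                 "last_hit": time.monotonic()}

    def record_hit(self, block_id: str, storage_cost_ms: float) -> None:
        entry = self.blocks.get(block_id)
        if entry:  # every hit is an I/O that never went to storage
            entry["hits"] += 1
            entry["last_cost_ms"] = storage_cost_ms
            entry["last_hit"] = time.monotonic()

    def score(self, entry: dict) -> float:
        # Empirical benefit: storage latency this block keeps avoiding,
        # weighted toward recent activity rather than arrival order.
        age = time.monotonic() - entry["last_hit"]
        return entry["hits"] * entry["last_cost_ms"] / (1.0 + age)

    def shrink(self, target_blocks: int) -> None:
        # Called when the system needs RAM back: drop lowest-benefit blocks.
        while len(self.blocks) > target_blocks:
            victim = min(self.blocks, key=lambda b: self.score(self.blocks[b]))
            del self.blocks[victim]
```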

 

Defragmenting every file that is fragmented is not reasonable given the explosion of data. Plus, you didn't spend your money for our software to consume resources just making disk layouts look pretty. We knew this long before you ever did. As a result, we created technologies to prevent fragmentation in the first place. And we created technologies that empirically locate just those files that are causing extra overhead due to fragmentation, so we can address only those files and get the most bang for the buck in terms of I/O density.
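A very rough way to picture that selection idea (hypothetical, not our actual engine): rank files by the extra I/Os their fragmentation forces per unit of time, so a hot, moderately fragmented file outranks a badly fragmented file that nobody ever reads:

```python
# Hypothetical sketch of "defragment only what matters": rank files by the
# extra I/O their fragmentation causes, weighted by how hot the file is.
from dataclasses import dataclass

@dataclass
class FileStats:
    path: str
    fragments: int
    reads_per_hour: float

def overhead_score(f: FileStats) -> float:
    # Roughly (fragments - 1) extra I/Os per access, times the access rate,
    # so cold-but-fragmented files don't waste defragmentation effort.
    return max(0, f.fragments - 1) * f.reads_per_hour

def worth_defragmenting(files: list[FileStats], top_n: int = 10) -> list[FileStats]:
    return sorted(files, key=overhead_score, reverse=True)[:top_n]

files = [
    FileStats(r"C:\data\orders.db", fragments=4200, reads_per_hour=900.0),
    FileStats(r"C:\logs\archive_2017.zip", fragments=8000, reads_per_hour=0.1),
    FileStats(r"C:\data\catalog.xml", fragments=3, reads_per_hour=2500.0),
]
for f in worth_defragmenting(files, top_n=2):
    print(f.path)  # orders.db first, catalog.xml second; the cold archive never makes the list
```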

 

Between our caching and file optimization technologies, we will make sure you keep getting the performance you hoped for when you need it the most.  And, of course, you will continue to be the superstar to the end users and your boss.  I call that a Win-Win. 😊

 

Finally, we keep looking into our crystal ball for the next set of I/O performance issues, the ones others aren't yet thinking about, before they appear in the first place. You can rest assured we will have solutions for those problems long before you ever experience them.

 


 

Additional and related resources:

 

Windows is still Windows Whether in the Cloud, on Hyperconverged or All-flash

Why Faster Storage May NOT Fix It

How to make NVMe storage even faster

Trial Downloads

 


Do I Really Need V-locity on All VMs?

by Rick Cadruvi, Chief Architect 15. August 2019 04:12

V-locity® customers may wonder, “How many VMs do I need to install V-locity on for optimal results? What kind of benefits will I see with V-locity on one or two VMs versus all the VMs on a host?” 

As a refresher…

It is true that V-locity will likely provide significant benefit on a single VM. It may even be extraordinary. But loading V-locity on just one VM on a host with sometimes dozens of VMs won't give you the biggest bang for your buck. V-locity includes many technologies that address storage performance issues in an extremely intelligent manner. Part of the underlying design is to learn about the specific loads your system has and intelligently adapt to each specific environment presented to it. That's why we created V-locity especially for virtual environments in the first place.

As you have experienced, the beauty of V-locity is its ability to deal with the I/O Blender Effect. When there are multiple VMs on a host, or multiple hosts with VMs that use the same back-end storage system (e.g., a SAN), a “blender” effect occurs as all these VMs send I/O requests up and down the stack. As you can guess, it can create huge performance bottlenecks. In fact, perhaps the most significant issue that virtualized environments face is that there are MANY performance chokepoints in the ecosystem, especially the storage subsystem. These chokepoints are robbing 30-50% of your throughput. This is the dark side of virtualized systems.

Look at it this way.  VM “A” may have different resource requirements than VM “B” and so on.  Besides performing different tasks with different workloads, they may have different peak usage periods.  What happens when those peaks overlap?  Worse yet, what happens if several of your VMs have very similar resource requirements and workloads that constantly overlap? 

 

The answer is that the I/O Blender Effect takes over and now VM “A” is competing directly with VM “B” and VM “C” and so on.  The blender pours all those resource desires into a funnel, creating bottlenecks with unpredictable performance results.  What is predictable is that performance will suffer, and likely a LOT.
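If you want to picture the blender concretely, here is a toy model (illustrative only, not a measurement of any real hypervisor or SAN): each VM issues a perfectly sequential stream of block reads, but the shared storage only ever sees the interleaved merge, which is mostly non-sequential:

```python
# Toy model of the I/O Blender Effect. Each VM's own stream is sequential;
# the merged stream the shared storage sees is mostly not.
import random

def blended_stream(vm_streams: list[list[int]], seed: int = 42) -> list[int]:
    rng = random.Random(seed)
    cursors = [0] * len(vm_streams)
    merged = []
    while any(c < len(s) for c, s in zip(cursors, vm_streams)):
        ready = [i for i, (c, s) in enumerate(zip(cursors, vm_streams)) if c < len(s)]
        vm = rng.choice(ready)  # whichever VM's request happens to arrive next
        merged.append(vm_streams[vm][cursors[vm]])
        cursors[vm] += 1
    return merged

def sequential_fraction(stream: list[int]) -> float:
    steps = [b - a for a, b in zip(stream, stream[1:])]
    return sum(1 for s in steps if s == 1) / len(steps)

# Three VMs, each reading 1,000 consecutive blocks in its own region of the SAN.
vms = [list(range(base, base + 1000)) for base in (0, 100_000, 200_000)]
print(sequential_fraction(vms[0]))               # 1.0  -> perfectly sequential per VM
print(sequential_fraction(blended_stream(vms)))  # roughly 0.33 -> mostly random at the SAN
```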

V-locity was designed from the ground up to intelligently deal with these core issues. The guiding question in front of us as it was being designed and engineered was:

Given your workload and resources, how can V-locity help you overcome the I/O Blender Effect? 

By making sure that V-locity will adapt to your specific workload and having studied what kinds of I/Os amplify the I/O Blender Effect, we were able to add intelligence to specifically go after those I/Os.  We take a global view.  We aren’t limited to a specific application or workload.  While we do have technologies that shine under certain workloads, such as transactional SQL applications, our goal is to optimize the entire ecosystem.  That’s the only way to overcome the I/O Blender Effect.

So, while we can indeed give you great gains on a single VM, V-locity truly gets to shine and show off its purpose when it can intelligently deal with the chokepoints that create the I/O Blender Effect.  That means you should add V-locity to ALL your VMs.  With our no-reboot installation and a V-locity Management Console, it’s fast and easy to cover and manage your environment.

If you have V-locity on all the VMs on your host(s), let us know how it is going! If you don’t yet, contact your account manager who can get you set up!

For an in-depth refresher,  watch our 10-min whiteboard video

 

Overcoming the I/O Blender Effect with V-locity

by Rick Cadruvi, Chief Architect 5. August 2019 05:23

You’ve decided that you want to try out V-locity® software – kick the tires so to speak.  You’ll just load it on one of your Virtual Machines and see how it goes.  What kind of benefit will you see?

It is true that V-locity will likely provide significant benefit on that one VM.  It may even be extraordinary.  But loading V-locity on just one VM on a host with sometimes dozens of VMs won’t give you the biggest bang for your buck. V-locity includes many technologies that address storage performance issues in an extremely intelligent manner.  Part of the underlying design is to learn about the specific loads your system has and intelligently adapt to each specific environment presented to it.   That’s why we created a product especially for Virtual environments in the first place. 

The beauty of V-locity is its ability to deal with something called the I/O Blender Effect. This is the dark side of virtualized systems. When there are multiple VMs on a host, or multiple hosts with VMs that use the same back-end storage system (e.g., a SAN), a “blender” effect occurs as all these VMs send I/O requests up and down the stack. As you can guess, it can create huge performance bottlenecks. In fact, perhaps the most significant issue that virtualized environments face is that there are MANY performance chokepoints in the ecosystem, especially the storage subsystem. These chokepoints are robbing 30-50% of your throughput. That's what V-locity can recover.

Look at it this way.  VM “A” may have different resource requirements than VM “B” and so on.  Besides performing different tasks with different workloads, they may have different peak usage periods.  What happens when those peaks overlap?  Worse yet, what happens if several of your VMs have very similar resource requirements and workloads that constantly overlap? 

 

The answer is that the I/O Blender Effect takes over and now VM “A” is competing directly with VM “B” and VM “C” and so on.  The blender pours all those resource desires into a funnel, creating bottlenecks with unpredictable performance results.  What is predictable is that performance will suffer, and likely a LOT.

Enter V-locity. V-locity was designed from the ground up to intelligently deal with these core issues. The guiding question in front of us as it was being designed and engineered was:

Given your workload and resources, how can V-locity help you overcome the I/O Blender Effect?

By making sure that V-locity will adapt to your specific workload and having studied what kinds of I/Os amplify the I/O Blender Effect, we were able to add intelligence to specifically go after those I/Os.  We take a global view.  We aren’t limited to a specific application or workload.  While we do have technologies that shine under certain workloads, such as transactional SQL applications, our goal is to optimize the entire ecosystem.  That’s the only way to overcome the I/O Blender Effect.

So, while we can indeed give you great gains on a single VM, V-locity truly gets to shine and show off its purpose when it can intelligently deal with the chokepoints that create the I/O Blender Effect.  That means you should add V-locity to ALL your VMs.  With our no-reboot installation and a V-locity Management Console, it’s fast and easy to cover and manage your environment.

And yes, this same I/O Blender effect can occur in your physical environment with multiple physical systems all accessing different LUNs on the same SAN. Our Diskeeper® software is the answer here.

Go ahead and try V-locity on the VMs that are in the most competition for resources and you'll be amazed at the benefits. The chokepoints aren't obvious or right in front of your face, but they are real, and V-locity is the answer. After that, just add V-locity to all your VMs, then sit back and see how smart you were to so easily improve throughput across your ecosystem.

Video: Condusiv I/O Reduction Software Overview

Download a 30-day Free Trial

 

Why Faster Storage May NOT Fix It

by Rick Cadruvi, Chief Architect 20. September 2018 04:58

 

With the myriad of possible hardware solutions to storage I/O performance issues, the question people are starting to ask is something like:

         If I just buy newer, faster Storage, won’t that fix my application performance problem?

 The short answer is:

         Maybe Yes (for a while), Quite Possibly No.

I know – not a satisfying answer.  For the next couple of minutes, I want to take a 10,000-foot view of just three issues that affect I/O performance to shine some technical light on the question and hopefully give you a more satisfying answer (or maybe more questions) as you look to discover IT truth.  There are other issues, but let’s spend just a moment looking at the following three:

1.     Non-Application I/O Overhead

2.     Data Pipelines

3.     File System Overhead

These three issues by themselves can create I/O bottlenecks causing degradation to your applications of 30-50% or more.

Non-Application I/O Overhead:

One of the most commonly overlooked performance issues is that an awful lot of I/Os are NOT application generated. Maybe you can add enough DRAM and go to an NVMe direct-attached storage model and get your application data cached at an 80%+ rate. Of course, you still need to process Writes, and NVMe probably makes that a lot faster than what you can do today. But you still need to get the data to the storage. And there are lots of I/Os generated on your system that are not directly from your application. There are also lots of application-related I/Os that are not targeted for caching; they're simply non-essential overhead I/Os to manage metadata and such. People generally don't think about the management layers of the computer and application that have to perform storage I/O just to make sure everything can run. Those I/Os hit the data path to storage along with the I/Os your application has to send to storage, even if you have huge caches. They get in the way and make your application-specific I/Os stall and slow down responsiveness.

And let's face it, a full Hyper-Converged, NVMe-based storage infrastructure sounds great, but there are lots of issues with that besides the enormous cost. What about data redundancy and localization? That brings us to issue #2.

Data Pipelines: 

Since your data is exploding and you’re pushing 100s of Terabytes, perhaps Petabytes and in a few cases maybe even Exabytes of data, you’re not going to get that much data on your one server box, even if you didn’t care about hardware/data failures.  

Like it or not, you have an entire infrastructure of Servers, Switches, SANs, whatever.  Somehow, all that data needs to get to and from the application and wherever it is stored.  And if you add Cloud storage into the mix, it gets worse. At some point the data pipes themselves become the limiting factor.  Even with Converged infrastructures, and software technologies that stage data for you where it is supposedly needed most, data needs to be constantly shipped along a pipe that is nowhere close to the speed of access that your new high-speed storage can handle.  Then add lots of users and applications simultaneously beating on that pipe and you can quickly start to visualize the problem.

If this wasn’t enough, there are other factors and that takes us to issue #3.

File System Overhead:

You didn’t buy your computer to run an operating system.  You bought it to manipulate data.  Most likely, you don’t even really care about the actual application.  You care about doing some kind of work.  Most people use Microsoft Word to write documents.  I did to draft this blog.  But I didn’t really care about using Word.  I cared about writing this blog and Word was something I had, I knew how to use and was convenient for the task.  That’s your application, but manipulating the data is your real conquest.  The application is a tool to allow you to paint a beautiful picture of your data, so you can see it and accomplish your job better.

The Operating System (let’s say Windows), is one of a whole stack of tools between you, your application and your data.  Operating Systems have lots of layers of software to manage the flow from your user to the data and back.  Storage is a BLOB of stuff.  Whether it is spinning hard drives, SSDs, SANs, cloud-based storage, or you name it, it is just a canvas where the data can be stored.  One of the first strokes of the brush that will eventually allow you to create that picture you want from your data is the File System.  It brings some basic order.  You can see this by going into Windows File Explorer and perusing the various folders.  The file system abstracts that BLOB into pieces of data in a hierarchical structure with folders, files, file types, information about size/location/ownership/security, etc... you get the idea.  Before the painting you want to see from your data emerges, a lot of strokes need to be placed on the canvas and a lot of those strokes happen from the Operating and File Systems.  They try to manage that BLOB so your Application can turn it into usable data and eventually that beautiful (we hope) picture you desire to draw. 

Most people know there is an Operating System and those of you reading this know that Operating Systems use File Systems to organize raw data into useful components.  And there are other layers as well, but let’s focus.  The reality is there are lots of layers that have to be compensated for.  Ignoring file system overhead and focusing solely on application overhead is ignoring a really big Elephant in the room.

Let’s wrap this up and talk about the initial question.  If I just buy newer, faster Storage won’t that fix my application performance?  I suppose if you have enough money you might think you can.  You’ll still have data pipeline issues unless you have a very small amount of data, little if any data/compute redundancy requirements and a very limited number of users.  And yet, the File System overhead will still get in your way. 

When SSDs were starting to come out, Condusiv® worked with several OEMs to produce software to handle obvious issues like the fact that writes were slower and re-writes were limited in number. In doing that work, one of our surprise discoveries was that once you got beyond a certain level of file system fragmentation, the File System overhead of trying to collect and arrange the small pieces of data made a huge impact regardless of how fast the underlying storage was. Just making sure data wasn't broken down into too many pieces each time a need to manipulate it came along provided truly measurable and, in some instances, incredible performance gains.

Then there is that whole issue of I/Os that have nothing to do with your data/application. We also discovered that there was a path to finding and eliminating the I/Os that, while not obvious, made substantial differences in performance, because we could remove them from the flow and allow the I/Os your application wants to perform to happen without the noise. Think of traffic jams. Have you ever driven in stop-and-go traffic and noticed there aren't any accidents or other distractions to account for such slowness? It's just too many vehicles on the road with you. What if you could get all the people who were just out for a drive off the road? You'd get where you want to go a LOT faster. That's what we figured out how to do. And it turns out no one else is focused on that - not the Operating System, not the File System, and certainly not your application.

And then you got swamped with more data.  Okay, so you’re in an industry where regulations forced that decision on you.  Either way, you get the point.  There was a time when 1GB was more storage than you would ever need.  Not too long ago, 1TB was the ultimate.  Now that embedded SSD on your laptop is 1TB.  Before too long, your phone will have 1TB of storage.  Mine has 128GB, but hey I’m a geek and MicroSD cards are cheap.  My point is that the explosion of data in your computing environment strains File System Architectures.  The good news is that we’ve built technologies to compensate for and fix limitations in the File System.

Let me wrap this up by giving you a 10,000-foot view of us and our software.  The big picture is we have been focused on Storage Performance for a very long time and at all layers.  We’ve seen lots of hardware solutions that were going to fix Storage slowness.  And we’ve seen that about the time a new generation comes along, there will be reasons it will still not fix the problem.  Maybe it does today, but tomorrow you’ll overtax that solution as well.  As computing gets faster and storage gets denser, your needs/desires to use it will grow even faster.  We are constantly looking into the crystal ball knowing the future presents new challenges.  We know by looking into the rear-view mirror, the future doesn’t solve the problem, it just means the problems are different.  And that’s where I get to have fun.  I get to work on solving those problems before you even realize they exist.  That’s what turns us on.  That’s what we do, and we have been doing it for a long time and, with all due modesty, we’re really good at it! 

So yes, go ahead and buy that shiny new toy.  It will help, and your users will see improvements for a time.  But we’ll be there filling in those gaps and your users will get even greater improvements.  And that’s where we really shine.  We make you look like the true genius you are, and we love doing it.

  

 

Dashboard Analytics: 13 Metrics and Why They Matter

by Rick Cadruvi, Chief Architect 11. July 2018 09:12

 

Our latest V-locity®, Diskeeper® and SSDkeeper® products include a built-in dashboard that reports the benefits our software is providing.  There are tabs in the dashboard that allow users to view very granular data that can help them assess the impact of our software.  In the dashboard Analytics tab we display hourly data for 13 key metrics.  This document describes what those metrics are and why we chose them as key to understanding your storage performance, which directly translates to your application performance.

To start with, let's spend a moment trying to understand why 24-hour graphs matter. The times when you and/or your users really notice bottlenecks are generally peak usage periods. While some servers are truly at peak usage 24x7, most systems, including servers, have peak I/O periods. These almost always follow peak user activity.

Sometimes there will also be spikes in the overnight hours when you are doing backups, virus scans, large report/data maintenance jobs, etc. While these may not be your major concern, some of our customers find that these overlap their daytime production and therefore can easily be THE major source of concern. For some people, making these finish before the deluge of daytime work starts is the single biggest factor they deal with.

Regardless of what causes the peaks, it is at those peak moments when performance matters most.  When little is happening, performance rarely matters.  When a lot is happening, it is key.  The 24-hour graphs allow you to visually see the times when performance matters to you.  You can also match metrics during specific hours to see where the bottlenecks are and what technologies of ours are most effective during those hours. 

Let’s move on to the actual metrics.

 

Total I/Os Eliminated

 

Total I/Os Eliminated measures the number of I/Os that would have had to be sent to storage if our technologies were not eliminating them first. We eliminate I/Os in one of two ways. First, via our patented IntelliMemory® technology, we satisfy I/Os from memory without the request ever going out to the storage device. Second, several of our other technologies, such as IntelliWrite®, cause the data to be stored more efficiently and densely so that when data is requested, it takes fewer I/Os to get the same amount of data than would otherwise be required. The net effect is that your storage subsystem sees fewer actual I/Os because we eliminated the need for those extra I/Os. That allows the I/Os that do go to storage to finish faster because they aren't waiting on the eliminated I/Os to complete.
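As a simple illustration with made-up numbers (not real dashboard output), the metric is essentially a running tally of the I/Os avoided by both mechanisms:

```python
# Hypothetical example of the bookkeeping behind "Total I/Os Eliminated".
cache_hits = 1_840_000        # reads satisfied from RAM, never sent to storage
consolidated_ios = 260_000    # requests avoided because denser data needed fewer, larger I/Os

total_ios_eliminated = cache_hits + consolidated_ios   # 2,100,000
total_ios_issued = 3_100_000                           # I/Os that still went to storage

share_eliminated = total_ios_eliminated / (total_ios_eliminated + total_ios_issued)
print(total_ios_eliminated, f"{share_eliminated:.0%}")  # about 40% of all I/Os never hit storage
```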

 

IOPS

IOPS stands for I/Os Per Second. It is the number of I/Os that you are actually requesting. During the times with the most activity, the I/Os we eliminate actually cause this number to be much higher than would be possible with just your storage subsystem. It is also a measure of the total amount of work your applications/systems are able to accomplish.

 

Data from Cache (GB)

Data from Cache tells you how much of that total throughput was satisfied directly from cache. This can be deceiving. Our caching algorithms are aimed at eliminating a lot of small, noisy I/Os that jam up the works of the storage subsystem. By not having to process those, the data freeway is wide open. Think of a freeway with accidents: even though the cars have moved to the side, the traffic slows dramatically. Our cache is like accident avoidance. It may be just a subset of the total throughput, but you process a LOT more data because you aren't waiting for those noisy, necessary I/Os that hold your applications/systems back.

Throughput (GB Total)

Throughput is the total amount of data you process and is measured in GigaBytes.  Think of this like a freight train.  The more railcars, the more total freight being shipped.  The higher the throughput, the more work your system is doing.

 

Throughput (MB/Sec)

Throughput is a measure of the total volume of data flowing to/from your storage subsystem. This metric measures throughput in MegaBytes per second; think of it as your speedometer, where the GB Total metric above is your odometer.

I/O Time Saved (seconds)

The I/O Time Saved metric tells you how much time you didn't have to wait for I/Os to complete because of the physical I/Os we eliminated from going to storage. This can be extremely important during your busiest times. Because I/O requests overlap across multiple processes and threads, this time can actually be greater than elapsed clock time. And what that means to you is that the total amount of work that gets done can experience a multiplier effect, because systems and applications tend to multitask. It's like having 10 people working on sub-tasks at the same time: the project finishes much faster than if 1 person had to do all the tasks alone. By allowing pieces to be done by different people and then just plugging them all together, you get more done faster. This metric measures that effect.
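A quick worked example with hypothetical numbers shows how the multiplier works: ten threads all avoiding I/O wait at the same time can add up to more saved seconds than there are seconds in the reporting window:

```python
# Hypothetical arithmetic for I/O Time Saved: per-thread savings add up
# independently, so the total can exceed wall-clock time.
threads = 10                       # workers doing I/O in parallel
ios_eliminated_per_thread = 90_000
avg_latency_saved_ms = 5.0         # storage latency each eliminated I/O would have cost

io_time_saved_s = threads * ios_eliminated_per_thread * avg_latency_saved_ms / 1000
print(io_time_saved_s)             # 4500 seconds of I/O wait removed...

elapsed_window_s = 3600            # ...during a single one-hour reporting window
print(io_time_saved_s > elapsed_window_s)  # True: more I/O time saved than clock time elapsed
```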

 

I/O Response Time

I/O Response time is sometimes referred to as Latency.  It is how long it takes for I/Os to complete.  This is generally measured in milliseconds.  The lower the number, the better the performance.

Read/Write %

Read/Write % is the percentage of Reads to Writes. If it is at 75%, 3 out of every 4 I/Os are Reads. If it were 25%, it would signify that there are 3 Writes for each Read.

 

Read I/Os Eliminated

This metric tells you how many Read I/Os we eliminated. If your Read to Write ratio is very high, this may be one of the most important metrics for you. However, remember that eliminating Writes also helps: the Reads that do go to storage do NOT have to wait for those eliminated Writes to complete, so they finish faster. The same is true of eliminated Reads, which improve overall Read performance.

% Read I/Os Eliminated

 

% Read I/Os Eliminated tells you what percentage of your overall Reads were eliminated from having to be processed at all by your storage subsystem.

 

Write I/Os Eliminated

This metric tells you how many Write I/Os we eliminated.  This is due to our technologies that improve the efficiency and density of data being stored by the Windows NTFS file system.

% Write I/Os Eliminated 

 

% Write I/Os Eliminated tells you what percentage of your overall Writes were eliminated from having to be processed at all by your storage subsystem.

Fragments Prevented and Eliminated

Fragments Prevented and Eliminated gives you an idea of how we are causing data to be stored more efficiently and densely, thus allowing Windows to process the same amount of data with far fewer actual I/Os.

If you have our latest versions of V-locity, Diskeeper or SSDkeeper installed, you can open the Dashboard now and select the Analytics tab and see all of these metrics.

If you don’t have the latest version installed and you have a current maintenance agreement, login to your online account to download and install the software.

Not a customer yet and want to check out these dashboard metrics? Download a free trial at www.condusiv.com/try.
