Condusiv Technologies Blog

Blogging @Condusiv

The Condusiv blog shares insight into the issues surrounding system and application performance—and how I/O optimization software is breaking new ground in solving those issues.

Which Processes are Using All of My System Resources?

by Gary Quan 17. July 2018 05:50

Over time, as more files and applications are added to your system, you may notice that performance has degraded and want to find out what is causing it. A good starting point is to see how the system resources are being used and which processes and/or files are using them.

Both Diskeeper® and SSDkeeper® contain a lesser-known feature to assist you with this: the System Monitoring Report. It shows you how the CPU and I/O resources are being utilized and then, digging down a bit deeper, which processes or files are using them.

Under Reports on the Main Menu, the System Monitoring Report provides you with data on the system’s CPU usage and I/O Activity.

 

The CPU Usage report averages the CPU usage from the past 7 days and graphs the hourly usage on an average day. You can then see at which times the CPU is being hit the hardest and by how much.

Digging down some more, you can then see which processes utilized the most CPU resources.
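
If you want to approximate this kind of per-process CPU view outside of the product, a short script can get you part of the way there. The sketch below is only an illustration of the idea (it is not how the System Monitoring Report gathers its data) and assumes the third-party psutil package is installed (pip install psutil):

    import psutil

    # Prime the per-process CPU counters; the first cpu_percent() call returns 0.0.
    for p in psutil.process_iter():
        try:
            p.cpu_percent(None)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass

    overall = psutil.cpu_percent(interval=5)        # overall CPU % over 5 seconds
    print(f"Overall CPU usage: {overall:.1f}%")

    usage = []
    for p in psutil.process_iter(['name']):
        try:
            usage.append((p.cpu_percent(None), p.info['name'] or '?'))
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass

    print("Top 5 processes by CPU over that interval:")
    for pct, name in sorted(usage, reverse=True)[:5]:
        print(f"  {name:<30} {pct:5.1f}%")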

 

The Disk I/O Activity report averages the disk I/O activity from the past 7 days and graphs the hourly activity on an average day. You can then determine at which times the I/O activity is highest.

Digging down some more, you can then see which processes utilized the I/O resources the most, plus which processes are causing the most split (extra) I/Os.

 

You can also see which file types have the highest I/O utilization, as well as those causing the most split (extra) I/Os.  This can help indicate which files and related processes are causing this type of extra I/O activity.
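
For a rough, do-it-yourself look at which processes are generating disk I/O, a script along these lines can help. It is only a sketch (again assuming psutil is installed and, on Windows, administrator rights to read other processes' counters); it reports cumulative bytes and operation counts per process, and has no visibility into split I/Os:

    import psutil

    rows = []
    for p in psutil.process_iter(['name']):
        try:
            io = p.io_counters()                     # cumulative since process start
            rows.append((io.read_bytes + io.write_bytes, p.info['name'] or '?', io))
        except (psutil.NoSuchProcess, psutil.AccessDenied, AttributeError):
            pass

    print(f"{'Process':<30} {'Read MB':>10} {'Write MB':>10} {'Read ops':>10} {'Write ops':>10}")
    for total, name, io in sorted(rows, reverse=True)[:10]:
        print(f"{name:<30} {io.read_bytes / 2**20:>10.1f} {io.write_bytes / 2**20:>10.1f} "
              f"{io.read_count:>10} {io.write_count:>10}")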

 

So, if you are trying to see how your system is being used, perhaps to chase down a performance issue, this report gives you a quick and easy look at how the CPU and Disk I/O resources are being used on your system and which processes and file types are using them. This, along with other Microsoft utilities like Task Manager and Performance Monitor, can help you tune your system for optimum performance.

Dashboard Analytics: 13 Metrics and Why They Matter

by Rick Cadruvi, Chief Architect 11. July 2018 09:12

 

Our latest V-locity®, Diskeeper® and SSDkeeper® products include a built-in dashboard that reports the benefits our software is providing.  There are tabs in the dashboard that allow users to view very granular data that can help them assess the impact of our software.  In the dashboard Analytics tab we display hourly data for 13 key metrics.  This document describes what those metrics are and why we chose them as key to understanding your storage performance, which directly translates to your application performance.

To start with, let's spend a moment trying to understand why 24-hour graphs matter.  The times when you and/or your users really notice bottlenecks are generally the peak usage periods.  While some servers are truly at peak usage 24x7, most systems, including servers, have peak I/O periods.  These almost always follow peak user activity.

Sometimes there will also be spikes in the overnight hours, when you are doing backups, virus scans, large report/data maintenance jobs, etc.  While these may not be your major concern, some of our customers find that these jobs overlap their daytime production and can therefore easily be THE major source of concern.  For some people, making these jobs finish before the deluge of daytime work starts is the single biggest factor they deal with.

Regardless of what causes the peaks, it is at those peak moments when performance matters most.  When little is happening, performance rarely matters.  When a lot is happening, it is key.  The 24-hour graphs allow you to visually see the times when performance matters to you.  You can also match metrics during specific hours to see where the bottlenecks are and which of our technologies are most effective during those hours.

Let’s move on to the actual metrics.

 

Total I/Os Eliminated

 

Total I/Os Eliminated measures the number of I/Os that would have had to go to storage if our technologies were not eliminating them before they ever got sent there.  We eliminate I/Os in one of two ways.  First, via our patented IntelliMemory® technology, we satisfy I/Os from memory without the request ever going out to the storage device.  Second, several of our other technologies, such as IntelliWrite®, cause the data to be stored more efficiently and densely, so that when data is requested, it takes fewer I/Os to get the same amount of data than would otherwise be required.  The net effect is that your storage subsystem sees fewer actual I/Os because we eliminated the need for the extra ones.  That allows the I/Os that do go to storage to finish faster, because they aren't waiting on the eliminated I/Os to complete.
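
To make the idea concrete, here is a toy sketch of a read cache that counts eliminated I/Os. It bears no resemblance to IntelliMemory's actual algorithms; it only shows what "an I/O that never reached storage" means:

    # A toy illustration: repeat reads are served from RAM, and every hit is
    # counted as one I/O that the storage device never had to process.
    class CountingReadCache:
        def __init__(self, backing_read):
            self.backing_read = backing_read   # function that really hits storage
            self.cache = {}                    # block number -> data held in RAM
            self.ios_eliminated = 0
            self.ios_to_storage = 0

        def read(self, block):
            if block in self.cache:
                self.ios_eliminated += 1       # served from RAM; storage never sees it
                return self.cache[block]
            self.ios_to_storage += 1           # a real I/O goes to the device
            data = self.backing_read(block)
            self.cache[block] = data
            return data

    cache = CountingReadCache(lambda block: f"data-{block}")
    for b in [1, 2, 1, 3, 1, 2]:               # repeat reads of "hot" blocks
        cache.read(b)
    print(cache.ios_eliminated, "I/Os eliminated,", cache.ios_to_storage, "sent to storage")
    # -> 3 I/Os eliminated, 3 sent to storage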

 

IOPS

IOPS stands for I/Os Per Second.  It is the number of I/Os that you are actually requesting.  During the times with the most activity, eliminating I/Os actually allows this number to be much higher than would be possible with just your storage subsystem.  It is also a measure of the total amount of work your applications/systems are able to accomplish.
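
If you want to see your own IOPS number outside of the dashboard, a rough approximation (assuming the psutil package is installed) is to sample the system-wide disk counters twice and divide the delta by the interval:

    import time
    import psutil

    INTERVAL = 5                                   # seconds
    before = psutil.disk_io_counters()
    time.sleep(INTERVAL)
    after = psutil.disk_io_counters()

    iops = ((after.read_count - before.read_count) +
            (after.write_count - before.write_count)) / INTERVAL
    print(f"Observed IOPS over the last {INTERVAL}s: {iops:.0f}")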

 

Data from Cache (GB)

Data from Cache tells you how much of that total throughput was satisfied directly from cache.  This can be deceiving.  Our caching algorithms are aimed at eliminating a lot of the small, noisy I/Os that jam up the works of the storage subsystem.  By not having to process those, the data freeway is wide open.  Think of a freeway with accidents: even though the cars have moved to the side, traffic slows dramatically.  Our cache is like accident avoidance.  It may be just a subset of the total throughput, but you process a LOT more data because you aren't waiting on those small, noisy I/Os that hold your applications/systems back.

Throughput (GB Total)

Throughput is the total amount of data you process and is measured in gigabytes.  Think of this like a freight train: the more railcars, the more total freight being shipped.  The higher the throughput, the more work your system is doing.

 

Throughput (MB/Sec)

Throughput is a measure of the total volume of data flowing to/from your storage subsystem.  This metric measures throughput in megabytes per second; think of it as your speedometer, where the GB Total metric above is your odometer.

I/O Time Saved (seconds)

The I/O Time Saved metric tells you how much time you didn't have to wait for I/Os to complete because of the physical I/Os we eliminated from going to storage.  This can be extremely important during your busiest times.  Because I/O requests overlap across multiple processes and threads, this time can actually be greater than elapsed clock time.  What that means for you is that the total amount of work that gets done can experience a multiplier effect, because systems and applications tend to multitask.  It's like having 10 people working on sub-tasks at the same time: the project finishes much faster than if one person had to do all the tasks by themselves.  By allowing pieces to be done by different people and then just plugging them all together, you get more done faster.  This metric measures that effect.
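
A worked example with made-up numbers shows why the saved time can exceed the clock time:

    # Illustrative arithmetic only (hypothetical figures): because I/O waits
    # overlap across threads, the aggregate time saved can exceed the window.
    workers         = 10        # concurrent threads/processes doing I/O
    ios_per_worker  = 1_000     # I/Os each issued during the window
    ms_saved_per_io = 50        # average wait avoided per eliminated I/O (ms)
    window_seconds  = 300       # the wall-clock measurement window (5 minutes)

    total_saved = workers * ios_per_worker * ms_saved_per_io / 1000   # seconds
    print(f"I/O time saved: {total_saved:.0f}s inside a {window_seconds}s window")
    # -> I/O time saved: 500s inside a 300s window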

 

I/O Response Time

I/O Response time is sometimes referred to as Latency.  It is how long it takes for I/Os to complete.  This is generally measured in milliseconds.  The lower the number, the better the performance.

Read/Write %

Read/Write % is the percentage of Reads to Writes.  At 75%, three out of every four I/Os are Reads; at 25%, there are three Writes for every Read.
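
As a quick illustration of the arithmetic (again assuming psutil), the read percentage is simply reads as a share of all I/Os:

    import psutil

    io = psutil.disk_io_counters()                 # cumulative counts since boot
    read_pct = 100 * io.read_count / (io.read_count + io.write_count)
    print(f"Read/Write %: {read_pct:.0f}% reads / {100 - read_pct:.0f}% writes")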

 

Read I/Os Eliminated

This metric tells you how many Read I/Os we eliminated.  If your Read to Write ratio is very high, this may be one of the most important metrics for you.  Remember, though, that eliminating Writes also helps Reads: the Reads that do go to storage do NOT have to wait for those eliminated Writes to complete, so they finish faster.  And of course, the Reads we eliminate improve overall Read performance as well.

% Read I/Os Eliminated

 

% Read I/Os Eliminated tells you what percentage of your overall Reads were eliminated from having to be processed at all by your storage subsystem.

 

Write I/Os Eliminated

This metric tells you how many Write I/Os we eliminated.  This is due to our technologies that improve the efficiency and density of data being stored by the Windows NTFS file system.

% Write I/Os Eliminated 

 

% Write I/Os Eliminated tells you what percentage of your overall Writes were eliminated from having to be processed at all by your storage subsystem.

Fragments Prevented and Eliminated

Fragments Prevented and Eliminated gives you an idea of how we are causing data to be stored more efficiently and densely, thus allowing Windows to process the same amount of data with far fewer actual I/Os.
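
Some back-of-the-envelope arithmetic (illustrative numbers only) shows why contiguity matters:

    # The same 64 MB file read contiguously versus scattered across small
    # fragments, where each fragment costs at least one separate I/O.
    file_mb     = 64
    io_size_kb  = 1024          # one contiguous 1 MB transfer per I/O
    fragment_kb = 64            # badly fragmented: 64 KB pieces

    contiguous_ios = file_mb * 1024 // io_size_kb      # 64 I/Os
    fragmented_ios = file_mb * 1024 // fragment_kb     # 1,024 I/Os
    print(contiguous_ios, "I/Os when contiguous vs", fragmented_ios, "when fragmented")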

If you have our latest versions of V-locity, Diskeeper or SSDkeeper installed, you can open the Dashboard now and select the Analytics tab and see all of these metrics.

If you don't have the latest version installed and you have a current maintenance agreement, log in to your online account to download and install the software.

Not a customer yet and want to check out these dashboard metrics? Download a free trial at www.condusiv.com/try.

Solving the IO Blender Effect with Software-Based Caching

by Spencer Allingham 5. July 2018 07:30

First, let me explain exactly what the IO Blender Effect is, and why it causes a problem in virtualized environments such as those from VMware or Microsoft’s Hyper-V.



This is typically what storage IO traffic would look like when everything is working well. You have the least number of storage IO packets, each carrying a large payload of data down to the storage. Because the data is arriving in large chunks at a time, the storage controller has the opportunity to create large stripes across its media, using the least number of storage-level operations before being able to acknowledge that the write has been successful.



Unfortunately, all too often the Windows Write Driver is forced to split data that it’s writing into many more, much smaller IO packets. These split IO situations cause data to be transferred far less efficiently, and this adds overhead to each write and subsequent read. Now that the storage controller is only receiving data in much smaller chunks at a time, it can only create much smaller stripes across its media, meaning many more storage operations are required to process each gigabyte of storage IO traffic.


This is not only true when writing data, but also if you need to read that data back at some later time.

But what does this really mean in real-world terms?

It means that an average gigabyte of storage IO traffic that should take perhaps 2,000 or 3,000 storage IO packets to complete is now taking 30,000 or 40,000 storage IO packets instead. The data transfer has been split into many more, much smaller, fractured IO packets. Each storage IO operation that has to be generated takes a measurable amount of time and system resource to process, and so this is bad for performance! It will cause your workloads to run slower than they should, and this will worsen over time unless you perform some time- and resource-costly maintenance.
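
To put those packet counts into perspective, here is the same comparison expressed as an average transfer size per packet (simple arithmetic, nothing more):

    gigabyte_kb = 1024 * 1024
    for packets in (2_000, 3_000, 30_000, 40_000):
        print(f"{packets:>6} packets per GB -> about {gigabyte_kb / packets:.0f} KB per packet")
    # ->  2000 packets per GB -> about 524 KB per packet
    # ->  3000 packets per GB -> about 350 KB per packet
    # -> 30000 packets per GB -> about 35 KB per packet
    # -> 40000 packets per GB -> about 26 KB per packet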

So, what about the IO Blender Effect?

Well, the IO Blender Effect can amplify the performance penalty (or Windows IO Performance Tax) in a virtualized environment. Here’s how it works…

 

As the small, fractured IO traffic from several virtual machines passes through the physical host hypervisor (Hyper-V server or VMware ESX server), the hypervisor acts like a blender. It mixes these IO streams, which randomizes the storage IO packets, before sending what is now a chaotic mess of small, fractured and very random IO streams out to the storage controller.

It doesn't matter what type of storage you have on the back-end. Whether it is direct-attached disks in the physical host machine or a Storage Area Network (SAN), this type of storage IO profile couldn't be less storage-friendly.

The storage is now only receiving data in small chunks at a time and won't understand the relationship between the packets, so it only has the opportunity to create very small stripes across its media. That, unfortunately, means many more storage operations are required before it can send an acknowledgement of the data transfer back up to the Windows operating system that originated it.

How can RAM caching alleviate the problem?

 

Firstly, to be truly effective, the RAM caching needs to be done at the Windows operating system layer. This provides the shortest IO path for read IO requests that can be satisfied from server-side RAM provisioned to each virtual machine. By satisfying as many "Hot Reads" from RAM as possible, you now have a situation where not only are those read requests being satisfied faster, but they no longer have to go out to storage. That means fewer storage IO packets for the hypervisor to blend.
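
As a purely conceptual illustration (emphatically not V-locity's actual caching logic), a "hot read" cache can be as simple as a small LRU structure that keeps recently read blocks in RAM so repeat reads never become storage IO packets for the hypervisor to blend:

    from collections import OrderedDict

    class HotReadCache:
        def __init__(self, capacity, read_from_storage):
            self.capacity = capacity
            self.read_from_storage = read_from_storage
            self.blocks = OrderedDict()            # block -> data, oldest first

        def read(self, block):
            if block in self.blocks:
                self.blocks.move_to_end(block)     # hit: refresh recency, no storage IO
                return self.blocks[block]
            data = self.read_from_storage(block)   # miss: one real storage IO
            self.blocks[block] = data
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)    # evict the coldest block
            return data

    cache = HotReadCache(capacity=4, read_from_storage=lambda b: f"block-{b}")
    for b in [1, 2, 1, 1, 3, 2]:
        cache.read(b)                              # blocks 1 and 2 become "hot" and stay in RAM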

Furthermore, the V-locity® caching software from Condusiv Technologies also employs a patented technology called IntelliWrite®. This intelligently helps the Windows Write Driver make better choices when writing data out to disk, which avoids many of the split IO situations that would otherwise be made worse by the IO Blender Effect. You then get back to that ideal situation of healthy IO: large, sequential writes and reads.

Is RAM caching a disruptive solution?

 

No! Not at all, if done properly.

Condusiv's V-locity software for virtualised environments is completely non-disruptive to live, running workloads such as SQL Servers, Microsoft Dynamics, Business Intelligence (BI) solutions such as IBM Cognos, and other important workloads such as SAP and Oracle.

In fact, all you need to do to test this for yourself is download a free trialware copy from:

www.condusiv.com/try

Just install it! There are no reboots required, and it will start working in just a couple of minutes. If you decide that it isn’t for you, then uninstall it just as easily. No reboots, no disruption!


Help! I hit “Save” instead of “Save As”!!

by Gary Quan 19. June 2018 06:30

Need to get back to a previous version of a Microsoft Office file before the changes you just made?  Undelete has you covered with its Versioning feature.

Have you or your users ever made some changes to a Word document, Excel spreadsheet, or PowerPoint presentation, saved it, and then realized later that what was saved did not contain the previous work? Here is a true story: a CEO was working on a PowerPoint file he needed for a Board of Directors presentation that afternoon. He had worked about 4 hours that morning making changes, and he was careful to periodically save the changes as he worked. The trouble was that the last changes he saved had accidentally overwritten a large part of his previous changes.  The CEO panicked, as he had just lost the majority of the 4 hours of work he had put in and was not sure he could redo it in time for his presentation deadline. He immediately called up his IT Manager, who indicated the nightly backups would not help, as they would not contain any of the changes made that morning. The IT Manager then remembered he had Undelete installed on this file server, mainly to recover accidentally deleted files, but he recalled a Versioning feature that would allow recovery of previous versions of Microsoft Office files. He was then able to use Undelete to retrieve the previous version of the CEO's PowerPoint presentation and recover the work done that morning. The CEO was extremely happy, and the IT Manager was a 'hero' to the CEO!

Another very common scenario is a user making edits to an original file and then selecting "Save" instead of "Save As", so the original file is gone. As an example, a customer had a budget file in Excel, and several people had accessed it throughout the day. At some point, someone inadvertently made multiple changes to it for his department, including deleting sections that were not relevant to his department, all the while thinking he was working in his own Save-As copy. Boy, were the other department heads upset! The way our IT Admin customer tells the story, it sounded like a riot was about to erupt! Well, he swooped in just in time, recovered the earlier version in minutes and saved the day. We hear stories daily about Word document overwrites where IT Admins are able to recover the previous versions in just a few minutes, saving users hours of having to recreate their work.

While the most popular feature of Undelete is the ability to recover accidentally deleted files instantly with the click of a mouse, the Undelete Versioning feature is certainly the runner-up, so we wanted to remind users, and prospective users, that it's also here to save the day for you.

The Undelete Versioning feature automatically saves the previous versions of specific file types, including Microsoft Office files. The default is to save the last 5 versions, but this is configurable.  Undelete then lets you see which versions were saved and when, and recover them easily. A vital data protection feature to have.
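
Conceptually, versioning boils down to "snapshot the old copy before overwriting it, and keep the last N snapshots." The sketch below illustrates that idea only; it is not how Undelete is implemented, and the file name is hypothetical:

    import shutil
    from datetime import datetime
    from pathlib import Path

    MAX_VERSIONS = 5                                 # Undelete's default is also five

    def save_with_version(path, new_contents):
        """Overwrite 'path', keeping the last MAX_VERSIONS copies in a .versions folder."""
        path = Path(path)
        versions_dir = path.parent / ".versions"
        versions_dir.mkdir(exist_ok=True)
        if path.exists():
            # Snapshot the copy that is about to be overwritten.
            stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
            shutil.copy2(path, versions_dir / f"{path.name}.{stamp}")
            # Prune so that only the newest MAX_VERSIONS snapshots remain.
            snapshots = sorted(versions_dir.glob(path.name + ".*"))
            for old in snapshots[:-MAX_VERSIONS]:
                old.unlink()
        path.write_text(new_contents)                # then overwrite the "live" file

    save_with_version("quarterly-report.txt", "the new contents")   # hypothetical file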

If you already have Undelete Server installed on your file servers, check out the Versioning feature. If you have any of your own “hero” stories you would like to share, email custinfo@condusiv.com

If you don’t have Undelete Server or Undelete Pro yet, you can purchase them from your favorite online reseller or you can buy online from our store http://www.condusiv.com/purchase/Undelete/

 

Tags:

Data Protection | Data Recovery | File Recovery | General | Success Stories | Undelete

How to Improve Application Performance by Decreasing Disk Latency like an IT Engineer

by Spencer Allingham 13. June 2018 06:49

You might be responsible for a busy SQL server, for example, or a Web Server; perhaps a busy file and print server, the Finance Department's systems, documentation management, CRM, BI, or something else entirely.

Now, think about WHY these are the workloads that you care about the most.

 

Were YOU responsible for installing the application running the workload for your company? Is the workload business critical, or considered TOO BIG TO FAIL?

Or is it simply because users, or even worse, customers, complain about performance?

 

If the last question made you wince, because you know that YOU are responsible for some of the workloads running in your organisation that would benefit from additional performance, please read on. This article is just for you, even if you don't consider yourself a "Techie".

Before we get started, you should know that there are many variables that can affect the performance of the applications that you care about the most. The slowest, most restrictive of these is referred to as the "Bottleneck". Think of water being poured from a bottle. The water can only flow as fast as the neck of the bottle, the 'slowest' part of the bottle.

Don't worry though, in a computer the bottleneck will pretty much always fit into one of the following categories:

• CPU

• DISK

• MEMORY

• NETWORK

The good news is that if you're running Windows, it is usually very easy to find out which one the bottleneck is in, and here is how to do it (like an IT Engineer):

• Open Resource Monitor by clicking the Start menu, typing "resource monitor", and pressing Enter. Microsoft includes this as part of the Windows operating system, and it is already installed.

• Do you see the graphs in the right-hand pane? When your computer is running at peak load, or users are complaining about performance, which of the graphs are 'maxing out'?

This is a great indicator of where your workload's bottleneck is to be found. (A scripted alternative is sketched below.)
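
If you prefer a scripted check to eyeballing graphs, the sketch below (assuming the psutil package is installed) samples all four categories for a few seconds so the busiest one stands out:

    import psutil

    INTERVAL = 5                                              # seconds
    disk_before, net_before = psutil.disk_io_counters(), psutil.net_io_counters()
    cpu_pct = psutil.cpu_percent(interval=INTERVAL)           # blocks for INTERVAL seconds
    disk_after, net_after = psutil.disk_io_counters(), psutil.net_io_counters()

    mem_pct = psutil.virtual_memory().percent
    disk_mb = ((disk_after.read_bytes - disk_before.read_bytes) +
               (disk_after.write_bytes - disk_before.write_bytes)) / 2**20 / INTERVAL
    net_mb  = ((net_after.bytes_sent - net_before.bytes_sent) +
               (net_after.bytes_recv - net_before.bytes_recv)) / 2**20 / INTERVAL

    print(f"CPU:     {cpu_pct:.0f}% busy")
    print(f"MEMORY:  {mem_pct:.0f}% in use")
    print(f"DISK:    {disk_mb:.1f} MB/s transferred")
    print(f"NETWORK: {net_mb:.1f} MB/s transferred")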

 

So, now that you have identified the slowest part of your 'compute environment' (continue reading for more details), what can you do to improve it?

The traditional approach to solving computer performance issues has been to throw hardware at the solution. This could be treating yourself to a new laptop, or putting more RAM into your workstation, or on the more extreme end, buying new servers or expensive storage solutions.

BUT, how do you know when it is appropriate to spend money on new or additional hardware, and when it isn't? Well, the answer is: when you can already get the performance that you need with the existing hardware infrastructure that you have bought and paid for, don't spend more. You wouldn't replace your car just because it needed a service, would you?

Let's take disk speed as an example, and look at the Response Time column in Resource Monitor.  Make sure you open the monitor full screen, or large enough to see the data, then open the Disk Activity section so you can see the Response Time column.  Do it now on the computer you're using to read this. (You didn't close Resource Monitor yet, did you?) This shows the Disk Response Time or, put another way, how long the storage is taking to read and write data. Of course, slower disk speed = slower performance, but what is considered a good disk response time and what is a bad one?

To answer that question, I will refer to a great blog post by Scott Lowe, that you can read here...

https://www.techrepublic.com/blog/the-enterprise-cloud/use-resource-monitor-to-monitor-storage-performance/

In it, the author perfectly describes what to expect from faster and slower Disk Response Times:

"Response Time (ms). Disk response time in milliseconds. For this metric, a lower number is definitely better; in general, anything less than 10 ms is considered good performance. If you occasionally go beyond 10 ms, you should be okay, but if the system is consistently waiting more than 20 ms for response from the storage, then you may have a problem that needs attention, and it's likely that users will notice performance degradation. At 50 ms and greater, the problem is serious."

Hopefully when you checked on your computer, the Disk Response Time was below 20 milliseconds. BUT, what about those other workloads that you were thinking about earlier? What are the Disk Response Times on that busy SQL server, the CRM or BI platform, or those Windows servers that the users complain about?

If the Disk Response Times are often higher than 20 milliseconds, and you need to improve the performance, then it's choice time and there are basically two options:

• In my opinion as an IT Engineer, the most sensible option is to use storage workload reduction software like Diskeeper for physical Windows computers, or V-locity for virtualised Windows computers. These will reduce Disk Response Times by allowing a good percentage of the data that your applications need to read to come from a RAM cache, rather than from slower disk storage. This works because RAM is much faster than the media in your disk storage. Best of all, the only thing you need to do to try it is download a free copy of the 30-day trial. You don't even have to reboot the computer; just check and see if it is able to bring the Disk Response Times down for the workloads that you care about the most.

• If you have tried the Diskeeper or V-locity software and you STILL need faster disk access, then, I'm afraid, it's time to start getting quotations for new hardware. It does make sense, though, to take a couple of minutes to install Diskeeper or V-locity first, to see if this step can be avoided. A software solution that removes storage inefficiencies is typically much more cost-effective than buying new hardware!

Visit www.condusiv.com/try to download Diskeeper and V-locity now, for your free trial.

 
