Condusiv Technologies Blog

Blogging @Condusiv

The Condusiv blog shares insight into the issues surrounding system and application performance—and how I/O optimization software is breaking new ground in solving those issues.

Cultech Limited Solves ERP and SQL Troubles with Diskeeper 18 Server

by Spencer Allingham 8. October 2018 09:11

Before discovering Diskeeper®, Cultech Limited experienced sluggish ERP and SQL performance, unnecessary downtime, and lost valuable hours each day troubleshooting issues related to Windows write inefficiencies.

Cultech Limited is an internationally recognized innovator and premium quality manufacturer within the nutritional supplement industry, and for a company in that position the usual troubleshooting approaches just weren’t cutting it. “We were running a very demanding ERP system on legacy servers and network. A hardware refresh was the first step in troubleshooting our issues. As much as we did see some improvement, it did not solve the daily breakdowns associated with our Sage ERP,” said Rob, IT Manager, Cultech Limited.

After upgrading the network and replacing the ERP and SQL servers without seeing much improvement, Rob dug further into troubleshooting approaches and SQL optimizations. After months of this with no relief, he kept researching ways to improve performance, knowing that Cultech could not continue to interrupt productivity multiple times a day to fix corrupted records. As Rob explains, “I was on support calls with Sage literally day and night to solve issues that occurred daily. Files would not write properly to the database, and I would have to go through the tedious process of getting all users to logout of Sage then manually correct the problem – a 25-min exercise. That might not be a big deal every so often, but I found myself doing this 3-4 times a day at times.”

In doing his research, Rob found Condusiv’s® Diskeeper Server and decided to give it a try after reading customer testimonials on how it had solved similar performance issues. To Cultech’s surprise, just 24 hours after installation, they were no longer calling Sage support. “I installed Diskeeper and crossed my fingers, hoping it would solve at least some of our problems. It didn’t just solve some problems, it solved all of our problems. I was calling Sage support daily then suddenly I wasn’t calling them at all,” said Rob. Problems that Rob had been fixing outside of production hours were solved thanks to Diskeeper’s ability to prevent fragmentation from occurring. And in addition to recouping hours a day of downtime during production hours, Cultech was now able to focus this time and energy on innovation and producing quality products.

“Now that we have Diskeeper optimizing our Sage servers and SQL servers, we have it running on our other key systems to ensure peak performance and optimum reliability. Instead of considering Windows write inefficiencies as a culprit after trying all else, I would encourage administrators to think of it first,” said Rob.

Read the full case study                        Download 30-day trial

How to make NVMe storage even faster

by Spencer Allingham 4. September 2018 07:21

This is a blog to complement a vlog that I posted a few weeks ago, in which I demonstrated how to use the intelligent RAM caching technology found in the V-locity® software from Condusiv® Technologies to improve the performance that a computer can get from NVMe flash storage. You can view this video here:

 

 A question arose from a couple of long-term customers about whether the use of the V-locity software was still relevant if they started utilizing very fast, flash storage solutions. This was a fair question!

The V-locity software is designed to reduce the amount of unnecessary storage I/O traffic that actually has to go out and be processed by the underlying disk storage layer. It not only reduces the amount of I/O traffic, but it optimizes that which DOES have to go out to disk, and moreover, it further reduces the workload on the storage layer by employing a very intelligent RAM caching strategy.

Flash storage is becoming more prevalent in today’s compute environments. It can process storage I/O traffic VERY fast when compared to its spinning disk counterparts, and is capable of processing more I/Os per Second (IOPS) than ever before. So, the very sensible question was this:


"Can the use of Condusiv's V-locity software provide a significant performance increase when using very fast flash storage?"


As I was fortunate to have recently implemented some flash storage in my workstation, I was keen to run an experiment to find out.


SPOILER ALERT: For those of you who just want to have the question answered, the answer is a resounding YES!

The test showed beyond doubt that with Condusiv’s V-locity software installed, your Windows computer has the ability to process significantly more I/Os per Second, process a much higher throughput of data, and allow the storage I/O heavy workloads running in computers the opportunity to get significantly more work done in the same amount of time – even when using very fast flash storage.

 

For those of you true ‘techies’ that are as geeky as me, read on, and I will detail the testing methodology and results in more detail. 

The storage that I now had in my workstation (and am still happily using!) was a 1 terabyte SM961 POLARIS M.2-2280 PCI-E 3.0 X 4 NVMe solid state drive (SSD).

 

 Is it as fast as it’s made out to be? Well, in this engineer’s opinion – OMG YES!

 

It makes one hell of a difference, when compared to spinning disk drives. This is in part because it’s connected to the computer via a PCI Express (PCIe) bus, as opposed to a SATA bus. The bus is what you connect your disk to in the computer, and different types of buses have different capabilities, such as the speed at which data can be transferred. SATA-connected disks are significantly slower than today’s PCIe-connected storage using an NVMe device interface. There is a great Wiki about this here if you want to read more: 

https://en.wikipedia.org/wiki/NVM_Express

 

To give you an idea of the improvement though, consider that the Advanced Host Controller Interface (AHCI) that is used with the SATA connected disks has one command queue, in which it can process 32 commands. That’s up to 32 storage requests at a time, and that was okay for spinning disk technology, because the disks themselves could only cope with a certain number of storage requests at a time.

NVMe, on the other hand, doesn’t have one command queue; it supports up to 65,535 queues. AND, each of those command queues can itself accommodate up to 65,536 commands. That’s a lot more storage requests that can be processed at the same time! This is really important, because flash storage is capable of processing MANY more storage requests in parallel than its spinning disk cousins. Quite simply, NVMe was needed to really make the most of what flash disk hardware can do. You wouldn’t put a kitchen tap (faucet) on the end of a fire hose and expect the same amount of water to flow through it, right? Same principle!
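To put those queue figures side by side, here is a small arithmetic sketch. It only multiplies the limits quoted above; these are interface maximums, and a real drive or driver will usually expose far fewer queues, so treat the totals as theoretical ceilings rather than measured values.

```python
# Interface limits as quoted in the text above (theoretical maximums only).
AHCI_QUEUES = 1
AHCI_COMMANDS_PER_QUEUE = 32
NVME_MAX_QUEUES = 65_535
NVME_MAX_COMMANDS_PER_QUEUE = 65_536


def max_outstanding(queues: int, commands_per_queue: int) -> int:
    """Upper bound on commands that can be queued at the same time."""
    return queues * commands_per_queue


ahci_total = max_outstanding(AHCI_QUEUES, AHCI_COMMANDS_PER_QUEUE)
nvme_total = max_outstanding(NVME_MAX_QUEUES, NVME_MAX_COMMANDS_PER_QUEUE)

print(f"AHCI: {ahci_total:,} commands")
print(f"NVMe: {nvme_total:,} commands")
print(f"Ratio: {nvme_total // ahci_total:,}x")
```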

As you can probably tell, I’m quite excited by this boost in storage performance. (I’m strange like that!) And, I know I’m getting a little off topic (apologies), so back to the point!

I had this SUPER-FAST storage solution and needed to prove one way or another if Condusiv’s V-locity software could increase the ability of my computer to process even more workload.

Would my computer be able to process more storage I/Os per Second?

Would my computer be able to process a larger amount of storage I/O traffic (megabytes) every second?

 

Testing Methodology

To answer these questions, I took a virtual machine, and cloned it so that I had two virtual machines that were as identical as I could make them. I then installed Condusiv’s V-locity software on both and disabled V-locity on one of the machines, so that it would process storage I/O traffic, just as if V-locity wasn’t installed.

To generate a storage I/O traffic workload, I turned to my old friend IOMETER. For those of you who might not know IOMETER, this is a software utility originally designed by Intel, but it is now open source and available at SourceForge.net. It is designed as an I/O subsystem measurement tool and is great for generating I/O workloads of different types (very customizable!), and for measuring how quickly that I/O workload can be processed. Great for testing networks or, in this case, how fast you can process storage I/O traffic.

I configured IOMETER on both machines with the type of workload that one might find on a typical SQL database server. I KNOW, I know, there is no such thing as a ‘typical’ SQL database, but I wanted a storage I/O profile that was as meaningful as possible, rather than a workload that would just make V-locity look good. Here is the actual IOMETER configuration:

Worker 1 – 16 kilobyte I/O requests, 100% random, 33% Write / 67% Read

Worker 2 – 64 kilobyte I/O requests, 100% random, 33% Write / 67% Read
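For readers who want to reason about this access pattern outside of IOMETER, the two workers can be modelled roughly as follows. This is a sketch only: `WorkerSpec` and `generate_requests` are hypothetical names, this is not IOMETER's configuration file format, and the 1 GiB target size in the usage example is an assumption, since the size of the test file is not stated above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerSpec:
    """One worker's access specification (hypothetical structure)."""
    name: str
    request_bytes: int
    random_pct: int  # percentage of requests at a random offset
    write_pct: int   # percentage of requests that are writes


WORKERS = [
    WorkerSpec("Worker 1", 16 * 1024, random_pct=100, write_pct=33),
    WorkerSpec("Worker 2", 64 * 1024, random_pct=100, write_pct=33),
]


def generate_requests(spec, count, target_bytes, seed=0):
    """Return a list of (operation, offset, size) tuples matching the spec."""
    rng = random.Random(seed)
    slots = target_bytes // spec.request_bytes
    next_slot = 0
    requests = []
    for _ in range(count):
        if rng.randrange(100) < spec.random_pct:
            slot = rng.randrange(slots)   # random access
        else:
            slot = next_slot % slots      # sequential access
        next_slot = slot + 1
        op = "write" if rng.randrange(100) < spec.write_pct else "read"
        requests.append((op, slot * spec.request_bytes, spec.request_bytes))
    return requests


sample = generate_requests(WORKERS[0], count=1000, target_bytes=1024 ** 3)
writes = sum(1 for op, _, _ in sample if op == "write")
print(f"{WORKERS[0].name}: {writes / len(sample):.0%} writes in this sample")
```

Every request is aligned to the worker's request size, which keeps the model simple; a real benchmark tool may align differently.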

Test Results

V-locity Disabled

[Figure not available]

V-locity Enabled

[Figure not available]

Summary

[Figure not available]

Conclusion

 

In this lab test, the presence of V-locity reduced the average amount of time required to process storage I/O requests by around 65%, allowing a greater number of storage I/O requests to be processed per second and a greater amount of data to be transferred.
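As a rough guide to why a lower average response time translates into more I/Os per second, the relationship can be sketched with Little's Law. The assumption here, which is mine and not something measured in the test, is that the number of outstanding I/O requests stays constant; the function names and the example queue depth are hypothetical.

```python
def iops(outstanding_ios: float, avg_latency_ms: float) -> float:
    """Little's Law: throughput = requests in flight / time per request."""
    return outstanding_ios * 1000.0 / avg_latency_ms


def speedup_from_latency_reduction(reduction: float) -> float:
    """IOPS multiplier when average latency falls by the given fraction."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


# A 65% latency reduction with the same number of requests in flight:
print(f"{speedup_from_latency_reduction(0.65):.2f}x the I/Os per second")

# Illustrative only: 16 requests in flight at 1.0 ms versus 0.35 ms.
print(f"{iops(16, 1.0):,.0f} IOPS -> {iops(16, 0.35):,.0f} IOPS")
```

In practice the queue depth rarely stays perfectly constant, so this is an estimate of the direction and scale of the effect, not a prediction.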

To prove beyond doubt that it was indeed V-locity that caused the additional storage I/O traffic to be processed, I stopped the V-locity service. This immediately ‘turned off’ all of the RAM caching and other optimization engines that V-locity was providing, and the net result was that the IOPS and throughput dropped to normal as the underlying storage had to start processing ALL of the storage traffic that IOMETER was generating.

What value is there to reducing storage I/O traffic?

The more you can reduce storage I/O traffic that has to go out and be processed by your disk storage, the more storage I/O headroom you are handing back to your environment for use by additional workloads. It means that your current disk storage can now cope with:

• More computers sharing the storage. Great if you have a Storage Area Network (SAN) underpinning your virtualized environment, for example. More VMs running!

• More users accessing and manipulating the shared storage. The more users you have, the more storage I/O traffic is likely to be generated.

• Greater CPU utilization. CPU speeds and processing capacity keep increasing. Now that the available processing power typically exceeds what workloads need, V-locity can help your applications become more productive and use more of that processing power by not having to wait so much on the disk storage layer.

 

If you can achieve this without having to replace or upgrade your storage hardware, it not only increases the return on your current storage hardware investment, but also might allow you to keep that storage running for a longer period of time (if you’re not on a fixed refresh cycle).

Sweat the storage asset!

(I hate that term, but you get the idea)

When you do finally need to replace your current storage, perhaps it won’t be as costly as you thought because you’re not having to OVER-PROVISION the storage as much, to cope with all of the excess, unnecessary storage traffic that Condusiv’s V-locity software can eliminate.

I typically see a storage traffic reduction of at least 25% at customer sites.

AND, I haven’t even mentioned the performance boost that many workloads receive from the RAM caching technology provided by Condusiv’s V-locity software. It is worth remembering that as fast as today’s flash storage solutions are, the RAM that you have in your computers is faster! The greater the percentage of read I/O traffic that you can satisfy from RAM instead of the storage layer, the better performing those storage I/O-hungry applications are likely to be.
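The effect of satisfying reads from RAM can be estimated as a weighted average of the two latencies. The sketch below uses made-up illustrative figures (about 0.1 microseconds for RAM and 100 microseconds for flash); they are assumptions for the arithmetic, not measurements from my workstation or from V-locity.

```python
def effective_read_latency_us(hit_rate: float,
                              ram_latency_us: float,
                              storage_latency_us: float) -> float:
    """Average read latency when a fraction of reads is served from RAM."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * ram_latency_us + (1.0 - hit_rate) * storage_latency_us


RAM_US = 0.1      # assumed RAM access time, illustrative only
FLASH_US = 100.0  # assumed flash read time, illustrative only

for hit_rate in (0.0, 0.25, 0.5, 0.75):
    latency = effective_read_latency_us(hit_rate, RAM_US, FLASH_US)
    print(f"{hit_rate:.0%} of reads from RAM -> {latency:6.2f} us average")
```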

What type of applications benefit the most?

In the real world, V-locity is not a silver bullet for all types of workloads, and I wouldn’t insult your intelligence by saying that it was. If you have some workloads that don’t generate a great deal of storage I/O traffic, perhaps a DNS server, or DHCP server, well, V-locity isn’t likely to make a huge difference. That’s my honest opinion as an IT Engineer.

HOWEVER, if you are using storage I/O-hungry applications, then you really should give it a try.

Here are just some examples of the workloads that thousands of V-locity customers are ‘performance-boosting’ with Condusiv’s I/O reduction and RAM caching technologies:

  • Database solutions such as Microsoft SQL Server, Oracle, MySQL, SQL Express, and others.
  • Virtualization solutions such as Microsoft Hyper-V and VMware.
  • Enterprise Resource Planning (ERP) solutions like Epicor.
  • Business Intelligence (BI) solutions like IBM Cognos.
  • Finance and payroll solutions like SAGE Accounting.
  • Electronic Health Records (EHR) solutions, such as MEDITECH.
  • Customer Relationship Management (CRM) solutions, such as Microsoft Dynamics.
  • Learning Management System (LMS) solutions.
  • Not to mention email servers like Microsoft Exchange AND busy file servers.

 

 

Do you use any of these in your IT environment?

 

There are case studies on the Condusiv web site for all of these workload types (and more), here:

http://www.condusiv.com/knowledge-center/case-studies/default.aspx

 

Try it for yourself

You can experience the full power of Condusiv’s V-locity software for yourself, in YOUR Windows environment within a couple of minutes. Just go to www.condusiv.com/try and get a copy of the fully-featured 30-day trialware. You can check the dashboard reporting after a week or two and see just how much storage I/O traffic has been eliminated, and more importantly, how much storage time has been saved by doing so.

It really is that simple!

You don’t even need to reboot to make the software work. There is no disruption to live running workloads; you can just install and uninstall at will, and it only takes a minute or so.


You will typically start seeing results just minutes after installing.

I hope that this has been interesting and helpful. If you have any questions about the technologies within V-locity or about testing, feel free to email me directly at sallingham@condusiv.co.uk.

 

I will be delighted to hear from you!

 

 

Solving the IO Blender Effect with Software-Based Caching

by Spencer Allingham 5. July 2018 07:30

First, let me explain exactly what the IO Blender Effect is, and why it causes a problem in virtualized environments such as those from VMware or Microsoft’s Hyper-V.



This is typically what storage IO traffic would look like when everything is working well. You have the fewest possible storage IO packets, each carrying a large payload of data down to the storage. Because the data is arriving in large chunks at a time, the storage controller has the opportunity to create large stripes across its media, using the fewest possible storage-level operations before being able to acknowledge that the write has been successful.



Unfortunately, all too often the Windows Write Driver is forced to split data that it’s writing into many more, much smaller IO packets. These split IO situations cause data to be transferred far less efficiently, and this adds overhead to each write and subsequent read. Now that the storage controller is only receiving data in much smaller chunks at a time, it can only create much smaller stripes across its media, meaning many more storage operations are required to process each gigabyte of storage IO traffic.


This is not only true when writing data, but also if you need to read that data back at some later time.

But what does this really mean in real-world terms?

It means that an average gigabyte of storage IO traffic that should take perhaps 2,000 or 3,000 storage IO packets to complete is now taking 30,000 or 40,000 storage IO packets instead. The data transfer has been split into many more, much smaller, fractured IO packets. Each storage IO operation that has to be generated takes a measurable amount of time and system resource to process, and so this is bad for performance! It will cause your workloads to run slower than they should, and this will worsen over time unless you perform some maintenance that is costly in both time and resources.
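To make those packet counts more tangible, the average payload per packet can be worked out directly. This sketch assumes that "gigabyte" means 1 GiB (1,073,741,824 bytes); the function name is just for illustration.

```python
GIB = 1024 ** 3  # assumption: one gigabyte taken as 1 GiB


def average_packet_kib(total_bytes: int, packet_count: int) -> float:
    """Average payload carried by each IO packet, in KiB."""
    return total_bytes / packet_count / 1024


for packets in (2_000, 3_000, 30_000, 40_000):
    size = average_packet_kib(GIB, packets)
    print(f"{packets:>6,} packets per GiB -> {size:6.1f} KiB per packet")
```

At 2,000 packets each IO carries roughly half a megabyte; at 40,000 packets each one carries only about 26 KiB, so the same gigabyte needs twenty times as many operations.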

So, what about the IO Blender Effect?

Well, the IO Blender Effect can amplify the performance penalty (or Windows IO Performance Tax) in a virtualized environment. Here’s how it works…

 

As the small, fractured IO traffic from several virtual machines passes through the physical host hypervisor (Hyper-V server or VMware ESX server), the hypervisor acts like a blender. It mixes these IO streams, which causes a randomization of the storage IO packets, before sending what is now a chaotic mess of small, fractured and very random IO streams out to the storage controller.
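The blending can be illustrated with a toy simulation. In this sketch each virtual machine issues a perfectly sequential stream of block addresses in its own region of the disk, and a stand-in "hypervisor" picks the next request from a randomly chosen VM. This is a simplified model with hypothetical names, not how Hyper-V or ESX actually schedule IO, but it shows how sequential streams look random once they are mixed.

```python
import random


def vm_stream(vm_id: int, length: int, region_size: int = 1_000_000):
    """Sequential block addresses for one VM, inside its own disk region."""
    start = vm_id * region_size
    return [start + i for i in range(length)]


def sequential_fraction(blocks) -> float:
    """Share of requests that directly follow the previous block address."""
    if len(blocks) < 2:
        return 0.0
    hits = sum(1 for a, b in zip(blocks, blocks[1:]) if b == a + 1)
    return hits / (len(blocks) - 1)


def blend(streams, seed: int = 0):
    """Merge streams by repeatedly taking the next request from a random VM."""
    rng = random.Random(seed)
    cursors = [0] * len(streams)
    live = [i for i, stream in enumerate(streams) if stream]
    merged = []
    while live:
        i = rng.choice(live)
        merged.append(streams[i][cursors[i]])
        cursors[i] += 1
        if cursors[i] == len(streams[i]):
            live.remove(i)
    return merged


streams = [vm_stream(vm, 1000) for vm in range(8)]
blended = blend(streams)
print(f"One VM on its own: {sequential_fraction(streams[0]):.0%} sequential")
print(f"Eight VMs blended: {sequential_fraction(blended):.0%} sequential")
```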

It doesn’t matter what type of storage you have on the back-end. It could be direct attached disks in the physical host machine, or a Storage Area Network (SAN); either way, this type of storage IO profile couldn’t be less storage-friendly.

The storage is now only receiving data in small chunks at a time, and won’t understand the relationship between the packets, so it now only has the opportunity to create very small stripes across its media, and that unfortunately means many more storage operations are required before it can send an acknowledgement of the data transfer back up to the Windows operating system that originated it.

How can RAM caching alleviate the problem?

 

Firstly, to be truly effective the RAM caching needs to be done at the Windows operating system layer. This provides the shortest IO path for read IO requests that can be satisfied from server-side RAM, provisioned to each virtual machine. By satisfying as many “Hot Reads” from RAM as possible, you now have a situation where not only are those read requests being satisfied faster, but those requests are now not having to go out to storage. That means fewer storage IO packets for the hypervisor to blend.
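The idea of serving "Hot Reads" from RAM can be sketched with a very simple least-recently-used (LRU) cache. To be clear, this is a generic textbook technique used purely for illustration; it is not V-locity's caching algorithm, and the workload mix (80% of reads going to a small hot set) is an assumption.

```python
import random
from collections import OrderedDict


class LRUReadCache:
    """Minimal LRU read cache that counts hits and misses."""

    def __init__(self, capacity_blocks: int):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block: int) -> bool:
        """Return True if the read was served from the cache."""
        if block in self.blocks:
            self.blocks.move_to_end(block)   # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1                     # this read goes to storage
        self.blocks[block] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return False


def simulate(reads=50_000, hot_blocks=500, cold_blocks=100_000,
             hot_share=0.8, capacity=1_000, seed=0) -> LRUReadCache:
    """Replay a synthetic read workload with a small, frequently read hot set."""
    rng = random.Random(seed)
    cache = LRUReadCache(capacity)
    for _ in range(reads):
        if rng.random() < hot_share:
            block = rng.randrange(hot_blocks)
        else:
            block = hot_blocks + rng.randrange(cold_blocks)
        cache.read(block)
    return cache


cache = simulate()
total = cache.hits + cache.misses
print(f"Reads served from RAM: {cache.hits / total:.0%}")
print(f"Reads sent to storage: {cache.misses:,} of {total:,}")
```

Every hit is a read that never reaches the hypervisor, which is the point being made above.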

Furthermore, the V-locity® caching software from Condusiv Technologies also employs a patented technology called IntelliWrite®. This intelligently helps the Windows Write Driver make better choices when writing data out to disk, which avoids many of the split IO situations that would then be made worse by the IO Blender Effect. You now get back to that ideal situation of healthy IO; large, sequential writes and reads.

Is RAM caching a disruptive solution?

 

No! Not at all, if done properly.

Condusiv’s V-locity software for virtualised environments is completely non-disruptive to live, running workloads such as SQL Servers, Microsoft Dynamics, Business Intelligence (BI) solutions such as IBM Cognos, or other important workloads such as SAP, Oracle and the like.

In fact, all you need to do to test this for yourself is download a free trialware copy from:

www.condusiv.com/try

Just install it! There are no reboots required, and it will start working in just a couple of minutes. If you decide that it isn’t for you, then uninstall it just as easily. No reboots, no disruption!


How to Improve Application Performance by Decreasing Disk Latency like an IT Engineer

by Spencer Allingham 13. June 2018 06:49

You might be responsible for a busy SQL server, for example, or a Web Server; perhaps a busy file and print server, the Finance Department's systems, documentation management, CRM, BI, or something else entirely.

Now, think about WHY these are the workloads that you care about the most.

 

Were YOU responsible for installing the application running the workload for your company? Is the workload being run business critical, or considered TOO BIG TO FAIL?

Or is it simply because users, or even worse, customers, complain about performance?

 

If the last question made you wince, because you know that YOU are responsible for some of the workloads running in your organisation that would benefit from additional performance, please read on. This article is just for you, even if you don't consider yourself a "Techie".

Before we get started, you should know that there are many variables that can affect the performance of the applications that you care about the most. The slowest, most restrictive of these is referred to as the "Bottleneck". Think of water being poured from a bottle. The water can only flow as fast as the neck of the bottle, the 'slowest' part of the bottle.

Don't worry though, in a computer the bottleneck will pretty much always fit into one of the following categories:

•           CPU

•           DISK

•           MEMORY

•           NETWORK

The good news is that if you're running Windows, it is usually very easy to find out which one the bottleneck is in, and here is how to do it (like an IT Engineer):

 •          Open Resource Monitor by clicking the Start menu, typing "resource monitor", and pressing Enter. Microsoft includes this as part of the Windows operating system and it is already installed.

 •          Do you see the graphs in the right-hand pane? When your computer is running at peak load, or users are complaining about performance, which of the graphs are 'maxing out'?

This is a great indicator of where your workload's bottleneck is to be found.         
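The same "which graph is maxing out" reasoning can be written down as a tiny helper. It is a sketch only: you would type in the peak percentages you read from the Resource Monitor graphs yourself, the function name is hypothetical, and the 90% threshold for "maxing out" is my assumption rather than a Microsoft guideline.

```python
def likely_bottleneck(peak_utilization_pct: dict, threshold_pct: float = 90.0):
    """Return the busiest resource if it is at or above the threshold, else None."""
    resource, value = max(peak_utilization_pct.items(), key=lambda item: item[1])
    return resource if value >= threshold_pct else None


# Example readings noted down at peak load (illustrative values only).
readings = {"CPU": 35.0, "DISK": 98.0, "MEMORY": 60.0, "NETWORK": 12.0}
print(likely_bottleneck(readings))
```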

 

SO, now you have identified the slowest part of your 'compute environment' (continue reading for more details), what can you do to improve it?

The traditional approach to solving computer performance issues has been to throw hardware at the solution. This could be treating yourself to a new laptop, or putting more RAM into your workstation, or on the more extreme end, buying new servers or expensive storage solutions.

BUT, how do you know when it is appropriate to spend money on new or additional hardware, and when it isn't? Well, the answer is that it isn't appropriate when you can get the performance that you need with the existing hardware infrastructure that you have already bought and paid for. You wouldn't replace your car just because it needed a service, would you?

Let's take disk speed as an example. Take a look at the Response Time column in Resource Monitor. Make sure you open the monitor to full screen, or large enough to see the data. Then open the Disk Activity section so you can see the Response Time column. Do it now on the computer you're using to read this. (You didn't close Resource Monitor yet, did you?) This is showing the Disk Response Time, or put another way, how long the storage is taking to read and write data. Of course, slower disk speed = slower performance, but what is considered good disk speed and bad?

To answer that question, I will refer to a great blog post by Scott Lowe, that you can read here...

https://www.techrepublic.com/blog/the-enterprise-cloud/use-resource-monitor-to-monitor-storage-performance/

In it, the author perfectly describes what to expect from faster and slower Disk Response Times:

"Response Time (ms). Disk response time in milliseconds. For this metric, a lower number is definitely better; in general, anything less than 10 ms is considered good performance. If you occasionally go beyond 10 ms, you should be okay, but if the system is consistently waiting more than 20 ms for response from the storage, then you may have a problem that needs attention, and it's likely that users will notice performance degradation. At 50 ms and greater, the problem is serious."

Hopefully, when you checked on your computer, the Disk Response Time was below 20 milliseconds. BUT, what about those other workloads that you were thinking about earlier? What are the Disk Response Times on that busy SQL server, the CRM or BI platform, or those Windows servers that the users complain about?
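If you are noting down Response Time readings from several servers, the thresholds from the quoted article can be turned into a small classifier. This is a sketch: the band labels are my own wording, and the way the 10 to 20 ms range is treated is my reading of "occasionally" versus "consistently" in the quote.

```python
def classify_response_time(ms: float) -> str:
    """Map a Disk Response Time in milliseconds to the bands from the quote."""
    if ms < 10:
        return "good"
    if ms <= 20:
        return "okay if occasional"
    if ms < 50:
        return "needs attention if consistent"
    return "serious"


# Illustrative readings only; substitute the values you see on your servers.
for reading in (4, 15, 35, 80):
    print(f"{reading:>3} ms -> {classify_response_time(reading)}")
```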

If the Disk Response Times are often higher than 20 milliseconds, and you need to improve the performance, then it's choice time and there are basically two options:

 •          In my opinion as an IT Engineer, the most sensible option is to use storage workload reduction software like Diskeeper for physical Windows computers, or V-locity for virtualised Windows computers. These will reduce Disk Response Times by allowing a good percentage of the data that your applications need to read to come from a RAM cache, rather than slower disk storage. This works because RAM is much faster than the media in your disk storage. Best of all, the only thing you need to do to try it is download a free copy of the 30-day trial. You don't even have to reboot the computer; just check and see if it is able to bring the Disk Response Times down for the workloads that you care about the most.

 •          If you have tried the Diskeeper or V-locity software, and you STILL need faster disk access, then, I'm afraid, it's time to start getting quotations for new hardware. It does make sense, though, to take a couple of minutes to install Diskeeper or V-locity first, to see if this step can be avoided. The software solution to remove storage inefficiencies is typically much more cost-effective than having to buy hardware!

Visit www.condusiv.com/try to download Diskeeper and V-locity now, for your free trial.

 
