Condusiv Technologies Blog


Blogging @Condusiv

The Condusiv blog shares insight into the issues surrounding system and application performance—and how I/O optimization software is breaking new ground in solving those issues.

Cultech Limited Solves ERP and SQL Troubles with Diskeeper 18 Server

by Spencer Allingham 8. October 2018 09:11

Before discovering Diskeeper®, Cultech Limited experienced sluggish ERP and SQL performance, unnecessary downtime, and lost valuable hours each day troubleshooting issues related to Windows write inefficiencies.

As an internationally recognized innovator and premium-quality manufacturer in the nutritional supplement industry, Cultech found that the usual troubleshooting approaches just weren’t cutting it. “We were running a very demanding ERP system on legacy servers and network. A hardware refresh was the first step in troubleshooting our issues. As much as we did see some improvement, it did not solve the daily breakdowns associated with our Sage ERP,” said Rob, IT Manager, Cultech Limited.

After upgrading the network and replacing the ERP and SQL servers with little improvement to show for it, Rob dug further into troubleshooting approaches and SQL optimizations. Months of that work brought no relief, so he kept researching ways to improve performance, knowing that Cultech could not continue to interrupt productivity multiple times a day to fix corrupted records. As Rob explains, “I was on support calls with Sage literally day and night to solve issues that occurred daily. Files would not write properly to the database, and I would have to go through the tedious process of getting all users to logout of Sage then manually correct the problem – a 25-min exercise. That might not be a big deal every so often, but I found myself doing this 3-4 times a day at times.”

In doing his research, Rob found Condusiv’s® Diskeeper Server and decided to give it a try after reading customer testimonials on how it had solved similar performance issues. To Cultech’s surprise, just 24 hours after it was installed, they were no longer calling Sage support. “I installed Diskeeper and crossed my fingers, hoping it would solve at least some of our problems. It didn’t just solve some problems, it solved all of our problems. I was calling Sage support daily then suddenly I wasn’t calling them at all,” said Rob. Problems that Rob had been fixing outside of production hours were solved thanks to Diskeeper’s ability to prevent fragmentation from occurring. And in addition to recouping hours a day of downtime during production hours, Cultech was now able to focus that time and energy on innovation and producing quality products.

“Now that we have Diskeeper optimizing our Sage servers and SQL servers, we have it running on our other key systems to ensure peak performance and optimum reliability. Instead of considering Windows write inefficiencies as a culprit after trying all else, I would encourage administrators to think of it first,” said Rob.

Read the full case study | Download 30-day trial

Why Faster Storage May NOT Fix It

by Rick Cadruvi, Chief Architect 20. September 2018 04:58

 

With the myriad of possible hardware solutions to storage I/O performance issues, the question people are starting to ask is something like:

         If I just buy newer, faster Storage, won’t that fix my application performance problem?

 The short answer is:

         Maybe Yes (for a while), Quite Possibly No.

I know – not a satisfying answer.  For the next couple of minutes, I want to take a 10,000-foot view of just three issues that affect I/O performance to shine some technical light on the question and hopefully give you a more satisfying answer (or maybe more questions) as you look to discover IT truth.  There are other issues, but let’s spend just a moment looking at the following three:

1. Non-Application I/O Overhead

2. Data Pipelines

3. File System Overhead

These three issues by themselves can create I/O bottlenecks that degrade your application performance by 30-50% or more.

Non-Application I/O Overhead:

One of the most commonly overlooked performance issues is that an awful lot of I/Os are NOT application generated.  Maybe you can add enough DRAM and go to an NVMe direct-attached storage model and get your application data cached at an 80%+ rate.  Of course, you still need to process Writes, and the NVMe probably makes that a lot faster than what you can do today.  But you still need to get it to the Storage.  And there are lots of I/Os generated on your system that are not directly from your application.  There are also lots of application-related I/Os that are not targeted for caching – they’re simply non-essential overhead I/Os to manage metadata and such.  People generally don’t think about the management layers of the computer and application that have to perform Storage I/O just to make sure everything can run.  Those I/Os hit the data path to Storage along with the I/Os your application has to send to Storage, even if you have huge caches.  They get in the way and make your application-specific I/Os stall and slow down responsiveness.
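If you're curious how much of the I/O on one of your own Windows servers comes from processes other than your application, a few lines of Python give a rough picture. This is just an illustrative sketch using the open-source psutil library (my choice for the example, not something our products use or require):

```python
# Rough illustration: tally cumulative I/O operations per process to see how much
# I/O traffic is NOT generated by your application. Requires: pip install psutil
import psutil

totals = []
for proc in psutil.process_iter(['pid', 'name']):
    try:
        io = proc.io_counters()  # cumulative reads/writes since the process started
        totals.append((io.read_count + io.write_count, proc.info['name'], proc.info['pid']))
    except (psutil.AccessDenied, psutil.NoSuchProcess):
        continue  # skip protected system processes

# Print the ten busiest processes by I/O operation count.
for count, name, pid in sorted(totals, key=lambda t: t[0], reverse=True)[:10]:
    print(f"{count:>12,}  {name}  (pid {pid})")
```

Run it on a "quiet" server and you may be surprised how much of the list has nothing to do with your application at all.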

And let’s face it, a full Hyper-Converged, NVMe-based storage infrastructure sounds great, but there are lots of issues with that besides the enormous cost.  What about data redundancy and localization?  That brings us to issue #2.

Data Pipelines: 

Since your data is exploding and you’re pushing 100s of Terabytes, perhaps Petabytes and in a few cases maybe even Exabytes of data, you’re not going to get that much data on your one server box, even if you didn’t care about hardware/data failures.  

Like it or not, you have an entire infrastructure of Servers, Switches, SANs, whatever.  Somehow, all that data needs to get to and from the application and wherever it is stored.  And if you add Cloud storage into the mix, it gets worse. At some point the data pipes themselves become the limiting factor.  Even with Converged infrastructures, and software technologies that stage data for you where it is supposedly needed most, data needs to be constantly shipped along a pipe that is nowhere close to the speed of access that your new high-speed storage can handle.  Then add lots of users and applications simultaneously beating on that pipe and you can quickly start to visualize the problem.

If this wasn’t enough, there are other factors and that takes us to issue #3.

File System Overhead:

You didn’t buy your computer to run an operating system.  You bought it to manipulate data.  Most likely, you don’t even really care about the actual application.  You care about doing some kind of work.  Most people use Microsoft Word to write documents.  I did to draft this blog.  But I didn’t really care about using Word.  I cared about writing this blog and Word was something I had, I knew how to use and was convenient for the task.  That’s your application, but manipulating the data is your real conquest.  The application is a tool to allow you to paint a beautiful picture of your data, so you can see it and accomplish your job better.

The Operating System (let’s say Windows) is one of a whole stack of tools between you, your application and your data.  Operating Systems have lots of layers of software to manage the flow from your user to the data and back.  Storage is a BLOB of stuff.  Whether it is spinning hard drives, SSDs, SANs, cloud-based storage, or you name it, it is just a canvas where the data can be stored.  One of the first strokes of the brush that will eventually allow you to create that picture you want from your data is the File System.  It brings some basic order.  You can see this by going into Windows File Explorer and perusing the various folders.  The file system abstracts that BLOB into pieces of data in a hierarchical structure with folders, files, file types, information about size/location/ownership/security, etc... you get the idea.  Before the painting you want to see from your data emerges, a lot of strokes need to be placed on the canvas and a lot of those strokes happen from the Operating and File Systems.  They try to manage that BLOB so your Application can turn it into usable data and eventually that beautiful (we hope) picture you desire to draw.

Most people know there is an Operating System and those of you reading this know that Operating Systems use File Systems to organize raw data into useful components.  And there are other layers as well, but let’s focus.  The reality is there are lots of layers that have to be compensated for.  Ignoring file system overhead and focusing solely on application overhead is ignoring a really big Elephant in the room.

Let’s wrap this up and talk about the initial question.  If I just buy newer, faster Storage, won’t that fix my application performance?  I suppose if you have enough money, you might think so.  You’ll still have data pipeline issues unless you have a very small amount of data, little if any data/compute redundancy requirements and a very limited number of users.  And even then, the File System overhead will still get in your way.

When SSDs were starting to come out, Condusiv® worked with several OEMs to produce software to handle obvious issues like the fact that writes were slower and re-writes were limited in number. In doing that work, one of our surprise discoveries was that once you got beyond a certain level of file system fragmentation, the File System overhead of trying to collect/arrange the small pieces of data made a huge impact regardless of how fast the underlying storage was.  Just making sure data wasn’t broken down into too many pieces each time a need to manipulate it came along provided truly measurable and, in some instances, incredible performance gains.

Then there is that whole issue of I/Os that have nothing to do with your data/application. We also discovered a path to finding and eliminating those I/Os that, while not obvious, made substantial differences in performance, because we could remove them from the flow and allow the I/Os your application wants to perform to happen without the noise.  Think of traffic jams.  Have you ever driven in stop-and-go traffic and noticed there aren’t any accidents or other distractions to account for such slowness?  It’s just too many vehicles on the road with you.  What if you could get all the people who were just out for a drive off the road?  You’d get where you want to go a LOT faster.  That’s what we figured out how to do.  And it turns out no one else is focused on that - not the Operating System, not the File System, and certainly not your application.

And then you got swamped with more data.  Okay, so you’re in an industry where regulations forced that decision on you.  Either way, you get the point.  There was a time when 1GB was more storage than you would ever need.  Not too long ago, 1TB was the ultimate.  Now that embedded SSD on your laptop is 1TB.  Before too long, your phone will have 1TB of storage.  Mine has 128GB, but hey I’m a geek and MicroSD cards are cheap.  My point is that the explosion of data in your computing environment strains File System Architectures.  The good news is that we’ve built technologies to compensate for and fix limitations in the File System.

Let me wrap this up by giving you a 10,000-foot view of us and our software.  The big picture is we have been focused on Storage Performance for a very long time and at all layers.  We’ve seen lots of hardware solutions that were going to fix Storage slowness.  And we’ve seen that about the time a new generation comes along, there will be reasons it will still not fix the problem.  Maybe it does today, but tomorrow you’ll overtax that solution as well.  As computing gets faster and storage gets denser, your needs/desires to use it will grow even faster.  We are constantly looking into the crystal ball knowing the future presents new challenges.  We know by looking into the rear-view mirror, the future doesn’t solve the problem, it just means the problems are different.  And that’s where I get to have fun.  I get to work on solving those problems before you even realize they exist.  That’s what turns us on.  That’s what we do, and we have been doing it for a long time and, with all due modesty, we’re really good at it! 

So yes, go ahead and buy that shiny new toy.  It will help, and your users will see improvements for a time.  But we’ll be there filling in those gaps and your users will get even greater improvements.  And that’s where we really shine.  We make you look like the true genius you are, and we love doing it.

  

 

Doing it All: The Internet of Things and the Data Tsunami

by Dawn Richcreek 7. August 2018 15:44

“If you’re a CIO today, basically you have no choice. You have to do edge computing and cloud computing, and you have to do them within budgets that don’t allow for wholesale hardware replacement…”

For a while there, it looked like corporate IT resource planning was going to be easy. Organizations would move practically everything to the cloud, lean on their cloud service suppliers to maintain performance, cut back on operating expenses for local computing, and reduce—or at least stabilize—overall cost.

Unfortunately, that prediction didn’t reckon with the Internet of Things (IoT), which, in terms of both size and importance, is exploding.

What’s the “edge”?

It varies. To a telecom, the edge could be a cell phone, or a cell tower. To a manufacturer, it could be a machine on a shop floor. To a hospital, it could be a pacemaker. What’s important is that edge computing allows data to be analyzed in near real time, so actions can take place at a speed that would be impossible in a cloud-based environment.

(Consider, for example, a self-driving car. The onboard optics spot a baby carriage in an upcoming crosswalk. There isn’t time for that information to be sent upstream to a cloud-based application, processed, and an instruction returned before slamming on the brakes.)

Meanwhile, the need for massive data processing and analytics continues to grow, creating a kind of digital arms race between data creation and the ability to store and analyze it. In the life sciences, for instance, it’s estimated that only 5% of the data ever created has been analyzed.

Condusiv® CEO Jim D’Arezzo was interviewed by App Development magazine (which publishes news to 50,000 IT pros) on this very topic, in an article entitled “Edge computing has a need for speed.” Noting that edge computing is predicted to grow at a CAGR of 46% between now and 2022, Jim said, “If you’re a CIO today, basically you have no choice. You have to do edge computing and cloud computing, and you have to do them within budgets that don’t allow for wholesale hardware replacement. For that to happen, your I/O capacity and SQL performance need to be optimized. And, given the realities of edge computing, so do your desktops and laptops.”

At Condusiv, we’ve seen users of our I/O reduction software solutions increase the capability of their storage and servers, including SQL servers, by 30% to 50% or more. In some cases, we’ve seen results as high as 10X initial performance—without the need to purchase a single box of new hardware.

If you’re interested in working with a firm that can reduce your two biggest silent killers of SQL performance, request a demo with an I/O performance specialist now.

If you want to hear why your heaviest workloads are only processing half the throughput they should from VM to storage, view this short video.

A Deep Dive Into The I/O Performance Dashboard

by Howard Butler 2. August 2018 08:36

While most users are familiar with the main Diskeeper®/V-locity®/SSDkeeper™ Dashboard view, which focuses on the number of I/Os eliminated and Storage I/O Time Saved, the I/O Performance Dashboard tab takes a deeper look into the performance characteristics of I/O activity.  The data shown here is similar in nature to what other Windows performance monitoring utilities provide and offers a wealth of data on I/O traffic streams.

By default, the information displayed is from the time the product was installed. You can easily filter this down to a different time frame by clicking on the “Since Installation” picklist and choosing a different time frame such as Last 24 Hours, Last 7 Days, Last 30 Days, Last 60 Days, Last 90 Days, or Last 180 Days.  The data displayed will automatically be updated to reflect the time frame selected.

 

The first section of the display above is labeled “I/O Performance Metrics” and shows Average, Minimum, and Maximum values for I/Os Per Second (IOPS), throughput measured in Megabytes per Second (MB/Sec), and application I/O latency measured in milliseconds (msecs). Diskeeper, V-locity and SSDkeeper use the Windows high-performance system counters to gather this data, and it is measured down to the microsecond (1/1,000,000 second).

While most people are familiar with IOPS and throughput expressed in MB/Sec, I will give a short description just to make sure. 

IOPS is the number of I/Os completed in 1 second of time.  This is a measurement of both read and write I/O operations.  MB/Sec is a measurement that reflects the amount of data being worked on and passed through the system.  Taken together they represent speed and throughput efficiency.  One thing I want to point out is that the Latency value shown in the above report is not measured at the storage device, but instead is a much more accurate reflection of I/O response time at the application level.  This is where the rubber meets the road.  Each I/O that passes through the Windows storage driver has a start and a completion time stamp.  The difference between these two values measures the real-world elapsed time for how long it takes an I/O to complete and be handed back to the application for further processing.  Measurements at the storage device do not account for network, host, and hypervisor congestion.  Therefore, our Latency value is a much more meaningful value than typical hardware counters for I/O response time or latency.  In this display, we also provide meaningful data on the percentage of I/O traffic that is reads versus writes.  This helps to better gauge which of our technologies (IntelliMemory® or IntelliWrite®) is likely to provide the greatest benefit.
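To make the arithmetic behind those three numbers concrete, here is a minimal sketch (illustrative only, and in no way our storage driver code) showing how average IOPS, throughput, and latency fall out of per-I/O start and completion timestamps:

```python
# Illustrative arithmetic only, not our storage driver. Each record carries the
# timestamps (in microseconds) when an I/O entered the Windows storage stack and
# when it was handed back to the application, plus the bytes transferred.
from dataclasses import dataclass

@dataclass
class IoRecord:
    start_us: int       # time stamp at the start of the I/O
    complete_us: int    # time stamp at completion
    byte_count: int     # payload size in bytes

def summarize(ios, window_seconds):
    latencies_ms = [(io.complete_us - io.start_us) / 1000.0 for io in ios]
    return {
        "IOPS": len(ios) / window_seconds,
        "MB/Sec": sum(io.byte_count for io in ios) / window_seconds / (1024 * 1024),
        "Avg latency (ms)": sum(latencies_ms) / len(latencies_ms),
        "Min latency (ms)": min(latencies_ms),
        "Max latency (ms)": max(latencies_ms),
    }

# Three I/Os observed during a 1-second window.
sample = [IoRecord(0, 850, 4096), IoRecord(1000, 3200, 65536), IoRecord(5000, 5600, 8192)]
print(summarize(sample, window_seconds=1))
```

The key point carries over from the paragraph above: because latency is taken between the driver-level start and completion stamps, it reflects everything the I/O had to wait on, not just the storage device itself.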

The next section of the display measures the “Total Workload” in terms of the amount of data accessed for both reads and writes as well as any data satisfied from cache. 

 

Systems that have higher workloads compared to other systems in your environment are the ones that likely have higher I/O traffic, tend to cause more of the I/O blender effect when connected to shared SAN storage or a virtualized environment, and are prime candidates for the extra I/O capacity relief that Diskeeper, V-locity and SSDkeeper provide.

Now moving into the third section of the display, labeled “Memory Usage,” we see measurements that represent the Total Memory in the system and the total amount of I/O data that has been satisfied from the IntelliMemory cache.  The purpose of our patented read caching technology is twofold: satisfy frequently repeated read requests from cache, and recognize the small read operations that tend to cause excessive “noise” in the I/O stream to storage and satisfy them from cache as well.  So, it’s not uncommon for the ratio of “Data Satisfied from Cache” to “Total Workload” to be a bit lower than with other types of caching algorithms.  Storage arrays tend to do quite well when handed large sequential I/O traffic but choke when small random reads and writes are part of the mix.  Eliminating I/O traffic from going to storage is what it’s all about.  The fewer I/Os to storage, the faster and more data your applications will be able to access.

In addition, we show the average, minimum, and maximum values for free memory used by the cache.  For each of these values, the corresponding Total Free Memory in Cache for the system is shown (Total Free Memory is memory used by the cache plus memory reported by the system as free).  The memory values will be displayed in a yellow font if the size of the cache is being severely restricted by the current memory demands of other applications, preventing our product from providing maximum I/O benefit.  The memory values will be displayed in red if the Total Memory is less than 3GB.
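As a rough sketch of that bookkeeping (the 3GB red flag comes from the paragraph above; the "restricted" cutoff here is a made-up placeholder, not a product value):

```python
# Hypothetical illustration of the display logic described above.
GiB = 1024 ** 3

def memory_status(total_memory, cache_memory, system_free_memory,
                  restricted_cutoff=0.5 * GiB):        # placeholder value, not the product's
    # Total Free Memory in Cache = memory used by the cache + memory the system reports as free
    total_free_in_cache = cache_memory + system_free_memory
    if total_memory < 3 * GiB:
        flag = "red"          # documented case: total system memory under 3GB
    elif cache_memory < restricted_cutoff:
        flag = "yellow"       # cache squeezed by other applications' memory demands
    else:
        flag = "normal"
    return total_free_in_cache, flag

print(memory_status(total_memory=8 * GiB, cache_memory=2 * GiB, system_free_memory=1 * GiB))
```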

Read I/O traffic, which is potentially cacheable, can receive an additional benefit by adding more DRAM for the cache and allowing the IntelliMemory caching technology to satisfy a greater amount of that read I/O traffic at the speed of DRAM (10-15 times faster than SSD), offloading it away from the slower back-end storage. This would have the effect of further reducing average storage I/O latency and saving even more storage I/O time.
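A quick back-of-the-envelope illustration of why that helps (the latency figures below are assumptions for the sake of the example, not measurements):

```python
# Average read latency as a blend of cache hits served from DRAM and misses that
# still travel to back-end storage. The numbers are illustrative assumptions.
def avg_read_latency_ms(hit_rate, dram_ms=0.01, storage_ms=0.5):
    return hit_rate * dram_ms + (1.0 - hit_rate) * storage_ms

for hit_rate in (0.0, 0.3, 0.6, 0.9):
    print(f"cache hit rate {hit_rate:.0%}: ~{avg_read_latency_ms(hit_rate):.3f} ms per read")
```

The more read traffic the cache can absorb, the closer your average read latency gets to DRAM speed, and every one of those hits is an I/O the back-end storage never has to service.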

Additional Note: For machines running SQL Server or Microsoft Exchange, you will likely need to cap the amount of memory that those applications can use (if you haven’t done so already), to prevent them from ‘stealing’ any additional memory that you add to those machines.

It should be noted that the IntelliMemory read cache is dynamic and self-learning.  This means you do not need to pre-allocate a fixed amount of memory to the cache or run some pre-assessment tool or discovery utility to determine what should be loaded into cache.  IntelliMemory will only use memory that is otherwise free, available, or unused for its cache and will always leave plenty of memory untouched (1.5GB – 4GB depending on the total system memory) and available for Windows and other applications to use.  As demand for memory rises, IntelliMemory will release memory from its cache and give it back to Windows so there will not be a memory shortage.  There is further intelligence in the IntelliMemory caching technology to know in real time precisely what data should be in cache at any moment and the relative importance of the entries already in the cache.  The goal is to ensure that the data maintained in the cache results in the maximum benefit possible to reduce Read I/O traffic.
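To make the "only use otherwise-free memory, and give it back under pressure" idea concrete, here is a toy sketch of that general pattern. It is emphatically not IntelliMemory's implementation; the floor value and the oldest-first eviction policy are placeholders for illustration only:

```python
# Toy illustration of an opportunistic cache: it only grows while the system still
# has free memory above a floor, and shrinks itself whenever free memory dips below it.
from collections import OrderedDict

class OpportunisticCache:
    def __init__(self, floor_bytes, free_memory_fn):
        self.floor = floor_bytes            # memory always left for Windows and applications
        self.free_memory = free_memory_fn   # callable returning current free bytes
        self.entries = OrderedDict()        # key -> data, oldest entries first

    def put(self, key, data):
        # Cache only if doing so still leaves the floor untouched.
        if self.free_memory() - len(data) > self.floor:
            self.entries[key] = data
        self._release_under_pressure()

    def get(self, key):
        return self.entries.get(key)        # None on a cache miss

    def _release_under_pressure(self):
        # Hand memory back as soon as the system runs short.
        while self.entries and self.free_memory() < self.floor:
            self.entries.popitem(last=False)
```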

So, there you have it.  I hope this deeper dive explanation provides better clarity to the benefit and internal workings of Diskeeper, V-locity and SSDkeeper as it relates to I/O performance and memory management.

You can download a free 30-day, fully functioning trial of our software and see the new dashboard here: www.condusiv.com/try

Which Processes are Using All of My System Resources?

by Gary Quan 17. July 2018 05:50

Over time as more files and applications are added to your system, you notice that performance has degraded, and you want to find out what is causing it. A good starting point is to see how the system resources are being used and which processes and/or files are using them.

Both Diskeeper® and SSDkeeper® contain a lesser-known feature to assist you with this. It is called the System Monitoring Report, and it can show you how the CPU and I/O resources are being utilized and then, digging down a bit deeper, which processes or files are using them.

Under Reports on the Main Menu, the System Monitoring Report provides you with data on the system’s CPU usage and I/O Activity.

 

The CPU Usage report takes the average CPU usage from the past 7 days, then provides a graph of the hourly usage on an average day. You can then see at which times the CPU resources are being hit the most and by how much.
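For the curious, the "average day" view boils down to folding a week of timestamped samples into 24 hourly buckets. Here is a minimal sketch of that idea (illustrative only, not the report's own code):

```python
# Fold seven days of timestamped CPU samples into an average-day hourly profile.
from collections import defaultdict
from datetime import datetime

def hourly_profile(samples):
    """samples: iterable of (datetime, cpu_percent) pairs covering the last 7 days."""
    buckets = defaultdict(list)
    for ts, cpu in samples:
        buckets[ts.hour].append(cpu)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

profile = hourly_profile([(datetime(2018, 7, 10, 9, 15), 42.0),
                          (datetime(2018, 7, 11, 9, 45), 58.0),
                          (datetime(2018, 7, 11, 14, 0), 21.0)])
print(profile)   # {9: 50.0, 14: 21.0}
```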

Digging down some more, you can then see which processes utilized the most CPU resources.

 

The Disk I/O Activity report takes the average disk I/O activity from the past 7 days, then provides a graph of the hourly activity on an average day. You can then determine at which times the I/O activity is the highest.

Digging down some more, you can then see which processes utilized the I/O resources the most, plus what processes are causing the most split (extra) I/Os.

 

You can also see which file types have the highest I/O utilization as well as those causing the most split (extra) I/Os.  This can help indicate what files and related processes are causing this type of extra I/O activity.

 

So, if you are trying to see how your system is being used, maybe for performance issues, this report gives you a quick and easy look at how the CPU and Disk I/O resources are being used on your system and which processes and file types are using them. This, along with other Microsoft utilities like Task Manager and Performance Monitor, can help you tune your system for optimum performance.
