Condusiv Technologies Blog

Blogging @Condusiv

The Condusiv blog shares insight into the issues surrounding system and application performance—and how I/O optimization software is breaking new ground in solving those issues.

Condusiv Smashes the I/O Performance Gap with New V-locity 7.0, Diskeeper 18, and SSDkeeper 2.0

by Brian Morin 6. April 2018 08:37

Condusiv is pleased to announce the release of V-locity® 7.0, Diskeeper® 18, and SSDkeeper® 2.0, which smash the I/O Performance Gap on Windows servers and PCs as growing volumes of data continue to outpace the ability of underlying server and storage hardware to meet performance SLAs on mission-critical workloads like MS-SQL.

The new 2018 editions of V-locity, Diskeeper, and SSDkeeper come with “no reboot” capabilities and enhanced reporting that offers a single-pane view of all systems, showing the exact benefit of I/O reduction software to each system in terms of the number of noisy I/Os eliminated, the percentage of read and write traffic offloaded from storage, and, most importantly, how much time is saved on each system as a result. It is also now easier than ever to quickly identify systems that are underperforming from a caching standpoint and could use more memory.

When a minimum of 30-40% of the I/O traffic from any Windows server is completely unnecessary, nothing but noise chewing up IOPS and throughput, it needs to be easy to see the exact level of inefficiency on individual systems and what it means in terms of I/O reduction and “time saved” when Condusiv software is deployed to eliminate those inefficiencies. Since many customers choose to add a little more memory on key systems like MS-SQL to get even more from the software, it is now clearly evident what a 50% or greater reduction in I/O traffic actually means.

Our recent 4th annual I/O Performance Survey (no Condusiv customers included) found that MS-SQL performance problems are at their worst level in 4 years despite heavy investments in hardware infrastructure. 28% of mid-sized and large enterprises receive regular complaints from users regarding sluggish SQL-based applications. This is simply due to the growth of I/O outpacing the hardware stack’s ability to keep up. This is why it is more important than ever to consider I/O reduction software solutions that guarantee to solve performance issues instead of reactively throwing expensive new servers or storage at the problem.

Not only are the latest versions of V-locity, Diskeeper, and SSDkeeper easier to deploy and manage with “no reboot” capabilities, but reporting has been enhanced to enable administrators to quickly see the full value being provided to each system along with memory tuning recommendations for even more benefit.

A single-pane view lists all systems with associated workload data, memory data, and benefit data from I/O reduction software, and flags each system as red, yellow, or green according to caching effectiveness to help administrators quickly identify and prioritize systems that could use a little more memory to achieve a 50% or greater reduction in I/O to storage.
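The logic behind that red/yellow/green view boils down to bucketing systems by how much I/O the cache is offloading. The sketch below illustrates the idea in C; the thresholds, field names, and sample figures are hypothetical placeholders chosen around the 50% target discussed here, not the product's actual (unpublished) criteria.

```c
#include <stdio.h>

/* Hypothetical illustration of a red/yellow/green caching-effectiveness view.
   Thresholds and sample figures are placeholders, not product criteria. */
typedef enum { STATUS_GREEN, STATUS_YELLOW, STATUS_RED } cache_status;

static cache_status classify(double pct_io_offloaded)
{
    if (pct_io_offloaded >= 50.0) return STATUS_GREEN;   /* at or above the 50% target */
    if (pct_io_offloaded >= 30.0) return STATUS_YELLOW;  /* benefiting, could use more memory */
    return STATUS_RED;                                    /* cache starved, prioritize for RAM */
}

int main(void)
{
    const char  *labels[]  = { "green", "yellow", "red" };
    const double systems[] = { 62.0, 41.5, 12.0 };        /* sample % of I/O offloaded */

    for (int i = 0; i < 3; i++)
        printf("system %d: %.1f%% of I/O offloaded -> %s\n",
               i + 1, systems[i], labels[classify(systems[i])]);
    return 0;
}
```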

Regarding the new “no reboot” capabilities, this is something that the engineering team has been attempting to crack for some time.  Per Rick Cadruvi, SVP, Engineering, “All storage filter drivers require a reboot, which is problematic for admins who manage software across thousands of servers. However, due to our extensive knowledge of Windows Kernel internals dating back to Windows NT 3.51, we were able to find a way to properly synchronize and handle the load/unload sequences of our driver transparently to other drivers in the storage stack so as not to require a reboot when deploying or updating Condusiv software.” 

For more on Condusiv’s quest for “no reboot” capabilities, see this blog by Rick Cadruvi, SVP Engineering: The Inside Story of Condusiv's No Reboot Quest

This means that customers who are currently on V-locity 5.3 and higher or Diskeeper 15 and higher are able to upgrade to the latest version without a reboot. Customers on older versions will have to uninstall, reboot, then install the new version.

"As much as Condusiv I/O reduction software has been a real benefit to our applications running across 2,500+ Windows servers, we are happy to see a no reboot version of the software released so it is now truly "Set It & Forget It"®. My team is happy they no longer have to wrestle down a reboot window for hundreds of servers in order to update or deploy Condusiv software," said Blake W. Smith, MSME, System Director, Enterprise Infrastructure, CHRISTUS Health.

 

MS-SQL Performance Woes Worsen Despite Massive Hardware Investments, Survey Confirms

by Brian Morin 26. February 2018 06:36

We just completed our 4th annual I/O Performance Survey, drawing on responses from 870 IT Professionals. This is one of the industry’s largest studies of its kind and reveals the latest trends in the applications driving performance demands and how IT Professionals are responding.

The survey results consist of 20 detailed questions designed to identify the impact of I/O growth in the modern IT environment. In addition to multiple choice questions, the survey included optional open responses, allowing a respondent to provide commentary on why they selected a particular answer. All of the individual responses have been compiled across 68 pages to help readers dive deeply into any question.

The most interesting year-over-year takeaway when comparing the 2018 report against the 2017 report is that IT Professionals cite worsening performance from applications sitting on MS-SQL despite larger investments in multi-core servers and all-flash storage. More than 1 in 4 respondents cite staff and customer complaints regarding sluggish SQL queries and reports. No Condusiv customer was included in the survey.

Below is a word cloud representing hundreds of answers, visually showing the applications IT Pros are having the most trouble supporting from a performance standpoint.

To see what has either grown or shrunk, below is the same word cloud from 2017 responses.

 

The full report can be accessed by visiting http://learn.condusiv.com/IO-Performance-Study.html

As much as organizations continue to respond reactively to performance challenges by purchasing expensive new server and storage hardware, our V-locity® I/O reduction software offers a far more efficient path by guaranteeing to solve the toughest application performance challenges on I/O-intensive systems like MS-SQL. This means organizations are able to offload 50% of I/O traffic from storage that is nothing but noise chewing up IOPS and dampening performance. As soon as we open up 50% of that bandwidth to storage, sluggish performance disappears and far more storage IOPS are available for other work.

To learn more about how V-locity I/O reduction software eliminates the two big I/O inefficiencies in a virtual environment, see this 2-minute video: Condusiv I/O Reduction Software Overview

 

Condusiv Addresses Concerns about the Intel CPU Security Flaw

by Rick Cadruvi, Chief Architect 8. January 2018 11:08

Since the news broke on the Intel CPU security flaw, we have fielded customer concerns about the potential impact to our software and worries of increased contention for CPU cycles if less CPU power [1] is available after the patches issued by affected vendors.

Let us first state plainly that there is no overhead concern regarding Condusiv software contending for fewer CPU cycles post-patch. For any user concerned about CPU overhead from Condusiv software, it is important to understand how the software is architected to use CPU resources in the first place (explained below), which is why this is not an issue.

Google reported this issue to Intel, AMD, and ARM on June 1, 2017, and it became public on January 3, 2018. The flaw affects most Intel CPUs dating back to 1995. Most operating systems released a patch this week to mitigate the risk, and firmware updates are expected soon. A report about this flaw from the Google Project Zero team can be found at:

https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html

Before discussing the basic vulnerabilities and any impact the security flaw or patch has or doesn’t have on V-locity®/SSDkeeper®/Diskeeper® (or how our software actually proves beneficial), let me first address the performance hits anticipated due to patches in operating systems and/or firmware. While the details are being tightly held, probably to keep hackers from being able to exploit the vulnerabilities, the consensus seems to be that the fixes in the operating systems (Windows, Linux, etc.) have the potential of reducing performance by 5%-30% [1]. A lot of this depends on how the system is being used. For most dedicated systems/applications the consensus appears to be that the effect will be negligible, likely because most of those systems already have excess compute capability, so the user just won’t experience a noticeable slowdown, and they aren’t doing lots of things concurrently.

The real issue comes up for servers where multiple accessors/applications are hitting the server. They will experience the greatest degradation, as they will likely have the greatest number of overlapping data accesses. The following article indicates that the biggest performance hits will happen on reads from storage and on SSDs.

https://www.pcworld.com/article/3245606/security/intel-x86-cpu-kernel-bug-faq-how-it-affects-pc-mac.html

So what about V-locity/Diskeeper/SSDkeeper?

As previously mentioned, we can state that there is no increased CPU contention or negative overhead impact from Condusiv software. Condusiv background operations run at low priority, which means only otherwise idle and available CPU cycles are used. This means that whatever CPU horsepower is available (a lot or a little), Condusiv software is unobtrusive on server performance because its patented InvisiTasking® technology only uses free CPU cycles. If computing is ever completely bound by other tasks, Condusiv software sits idle, so there is NO negative intrusion or impact on server resources. The same can be said about our patented DRAM caching engine (IntelliMemory®), as it only uses memory that is otherwise idle and not being used: zero contention for resources.
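For readers curious how software can consume only otherwise-idle CPU cycles, the Windows scheduler itself provides the basic building block: a thread at idle priority runs only when nothing else wants the processor. The sketch below shows that general mechanism in C; it is only an illustration of the concept, not Condusiv’s patented InvisiTasking implementation.

```c
#include <windows.h>

/* Illustration of background work consuming only otherwise-idle CPU cycles:
   an idle-priority thread runs only when no other thread wants the processor,
   and background mode also lowers its I/O and memory priority. Generic sketch
   of the concept only. */
static DWORD WINAPI BackgroundWorker(LPVOID param)
{
    (void)param;

    /* Lowest scheduling priority: scheduled only when the CPU is otherwise idle. */
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_IDLE);

    /* Background processing mode (Vista and later) also deprioritizes I/O and memory. */
    SetThreadPriority(GetCurrentThread(), THREAD_MODE_BACKGROUND_BEGIN);

    /* ... perform the low-priority optimization work here ... */

    SetThreadPriority(GetCurrentThread(), THREAD_MODE_BACKGROUND_END);
    return 0;
}

int main(void)
{
    HANDLE worker = CreateThread(NULL, 0, BackgroundWorker, NULL, 0, NULL);
    if (worker) {
        WaitForSingleObject(worker, INFINITE);
        CloseHandle(worker);
    }
    return 0;
}
```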

However, if storage reads slow down due to the fix (per the PC World article), our software will certainly recover a significant amount of the lost performance, since eliminating I/O traffic in the first place is what our software is designed to do. Telemetry data across thousands of systems demonstrates our software eliminates 30-40% of noisy and completely unnecessary I/O traffic on typically configured systems [2]. In fact, those who add just a little more DRAM on important systems to better leverage the tier-0 caching strategy see a 50% or greater reduction, which pushes them into 2X-faster gains and higher.

Those organizations who are concerned about loss of read performance from their SSDs due to the chip fixes and patches need only do one thing to mitigate that concern: simply allocate more DRAM to important systems. Our software will pick up the additional available memory to enhance your tier-0 caching layer. For every 2GB of memory added, we typically see a conservative 25% of read traffic offloaded from storage. That figure is oftentimes 35-40% and even as high as 50%, depending on the workload. Since our behavioral analytics engine sits at the application layer, we are able to cache very effectively with very little cache churn. And since DRAM is 15X faster than SSD, only a small amount of capacity can drive very big gains. Simply monitor the in-product dashboard to watch your cache hit rates rise with additional capacity.
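To make that rule of thumb concrete, the short sketch below runs the arithmetic for a hypothetical workload, assuming the ~25% of reads offloaded per 2GB figure quoted above scales roughly linearly (an assumption for illustration only; actual offload varies by workload and is reported on the dashboard).

```c
#include <stdio.h>

/* Back-of-envelope sizing based on the ~25% of reads offloaded per 2GB rule of
   thumb. The linear scaling and the 4,000-IOPS workload are assumptions for
   illustration only. */
int main(void)
{
    double read_iops_today = 4000.0;   /* hypothetical read IOPS hitting storage now */
    double offload_per_2gb = 0.25;     /* conservative figure from the article */

    for (int added_gb = 2; added_gb <= 8; added_gb += 2) {
        double offload = offload_per_2gb * (added_gb / 2.0);
        if (offload > 0.90) offload = 0.90;   /* some reads will always miss the cache */
        printf("+%d GB DRAM -> ~%2.0f%% of reads cached, storage sees ~%4.0f read IOPS\n",
               added_gb, offload * 100.0, read_iops_today * (1.0 - offload));
    }
    return 0;
}
```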

Regarding the vulnerabilities themselves: for a very long time, memory in the CPU chip itself has been a LOT faster than system memory (DRAM). As a result, chip manufacturers have done several things to take advantage of CPU speed increases. For the purposes of this post, I will discuss the following two approaches that were used to improve performance:

1. Speculative execution
2. CPU memory cache

These mechanisms to improve performance opened up the security vulnerabilities being labeled “Spectre” and “Meltdown”.

Speculative execution is a technology whereby the CPU prefetches machine instructions and data from system memory (typically DRAM) for the purpose of predicting likely code execution paths.  The CPU can pre-execute various potential code paths.  This means that by the time the actual code execution path is determined, it has often already been executed.  Think of this like coming to a fork in the road as you are driving.  What if your car had already gone down both possible directions before you even made a decision as to which one you wanted to take?  By the time you decided which path to take, your car would have already been significantly further on the way regardless of which path you ultimately chose. 

Of course, in our world, we can’t be in two places at one time, so that can’t happen. However, a modern CPU chip has lots of unused execution cycles due to synchronization, fetching data from DRAM, etc. These “wait states” present an opportunity to do other things. One of those things is to figure out likely code paths and pre-execute them. Even if a given code path wasn’t ultimately taken, all that happened is that execution cycles that would otherwise have been wasted were spent trying paths that turned out not to be needed. And with modern chips, they can execute lots of these speculative code paths.

Worst case – no harm, no foul, right?  Not quite. Because the code, and more importantly the DRAM data needed for that code, got fetched, it is now in the CPU and potentially available to software. And the data from DRAM got fetched without checking whether it was legal for this program to read it. If the guess was correct, your system increased performance a LOT! BUT, since memory that the program may not have been allowed to access under memory protection was pre-fetched, a very clever program can take advantage of this. Google was able to create a proof of concept for this flaw. This is the “Spectre” case.
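For readers who want to see what such a “clever program” looks like, the widely published Spectre variant-1 pattern is simply a bounds check that the CPU can speculate past. A minimal sketch in C (illustration only, not a working exploit):

```c
#include <stddef.h>
#include <stdint.h>

/* The widely published Spectre variant-1 (bounds-check bypass) pattern. If the
 * branch predictor has been trained to guess "in bounds", the CPU may
 * speculatively read array1[x] even when x is far out of range, and the
 * dependent load from array2 leaves a measurable footprint in the CPU cache.
 * The architectural results are discarded once the misprediction is detected,
 * but the cache state survives and can be probed with a timing side channel.
 * A real attack also needs predictor training and cache-timing measurement. */
size_t  array1_size;
uint8_t array1[160];
uint8_t array2[256 * 4096];

void victim_function(size_t x)
{
    if (x < array1_size) {                              /* bounds check speculated past */
        uint8_t secret = array1[x];                     /* speculative out-of-bounds read */
        volatile uint8_t probe = array2[secret * 4096]; /* encodes the byte into cache state */
        (void)probe;
    }
}
```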

Before you panic about getting hacked, realize that to effectively find really useful information would require extreme knowledge of the CPU chip and the data in memory you would be interested in.  Google estimates that an attack on a system with 64GB of RAM would require a software setup cycle of 10-30 minutes.  Essentially, a hacker may be able to read around 1,500 bytes per second – a very small amount of memory.  The hacker would have to attack specific applications for this to be effective. 
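A quick back-of-envelope calculation shows why that leak rate is reassuring; the sketch below, using the 64GB and ~1,500 bytes-per-second figures quoted above, is purely illustrative.

```c
#include <stdio.h>

/* Rough arithmetic behind the "don't panic" point above, using the figures
   quoted from Google: ~1,500 bytes per second against a 64GB system. Reading
   out all of memory blindly is impractical; an attacker has to know which few
   kilobytes to target. */
int main(void)
{
    double ram_bytes = 64.0 * 1024 * 1024 * 1024;  /* 64 GB */
    double leak_rate = 1500.0;                     /* bytes per second */
    double seconds   = ram_bytes / leak_rate;

    printf("Reading all 64GB at 1,500 B/s: ~%.0f seconds (~%.1f years)\n",
           seconds, seconds / (365.0 * 24.0 * 3600.0));
    return 0;
}
```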

As the number of transistors in a chip grew dramatically, it became possible to create VERY large memory caches on the CPU itself.  Referencing memory from the CPU cache is MUCH faster than accessing the data from the main system memory (DRAM).  The “Meltdown” flaw proof of concept was able to access this data directly without requiring the software to have elevated privileges. 

Again, before getting too excited, it is important to think through what memory is in the CPU cache. To start with, current chips typically max out around 8MB of cache on chip. Depending on the type of cache, this is essentially actively used memory; it is NOT just large swaths of DRAM. Of course, the exploit fools the chip’s caching algorithms into thinking that the memory the attack wants to read is being actively used. According to Google, it takes more than 100 CPU cycles to cause un-cached data to become cached, and that happens in CPU-word-size chunks – typically 8 bytes.

So what about V-locity/Diskeeper/SSDkeeper?

Our software runs such that we are no more or less vulnerable than any other application/software component.  Data in the NTFS File System Cache and in SQL Server’s cache are just as vulnerable to being read as data in our IntelliMemory cache.  The same holds true for Oracle or any other software that caches data in DRAM.  And, your typical anti-virus has to analyze file data, so it too may have data in memory that could be read from various data files. However, as the chip flaws are fixed, our I/O reduction software provides the advantage of making up for lost performance, and more.


[1] https://www.theregister.co.uk/2018/01/02/intel_cpu_design_flaw/
[2] http://blog.condusiv.com/post/2017/10/25/New-Dashboard-Finally-Answers-the-Big-Question.aspx

I’m a MEDITECH Hospital with SSDs, Is FAL Growth Still an Issue that Risks Downtime?

by Brian Morin 4. December 2017 07:34

Now that many MEDITECH hospitals have gone all-flash for their backend storage, one of the most common questions we field is whether or not there is still downtime risk from the File Attribute List (FAL) growth issue if the data physically lives on solid-state drives (SSDs).

The main reason this question comes up is because MEDITECH requires “defragmentation,” which most admins assume is only a requirement for a spinning-disk backend. That misconception couldn’t be further from the truth, as the FAL issue has nothing to do with the backend media but rather with the file system itself. Clearly, defragmentation processes are damaging to solid-state media, which is why MEDITECH hospitals turn to Condusiv’s V-locity® I/O reduction software, which prevents fragmentation from occurring in the first place and has special engines designed for MEDITECH environments to keep the FAL from reaching its size limit and causing unscheduled downtime.

The File Attribute List is a Windows NTFS file metadata structure referred to as the FAL. The FAL structure can point to different types of file attributes, such as security attributes or standard information like creation and modification dates, and, most importantly, to the actual data contained within the file. The FAL keeps track of where all the data is for the file: it contains pointers to file records that indicate the location of the file data on the volume. If that data has to be stored at different logical allocations on the volume (i.e., fragmentation), more pointers are required, which in turn increases the size of the FAL. Herein lies the problem: the FAL size has an upper limit of 256KB, comprised of 8,192 attribute entries. When that limit is reached, no more pointers can be added, which means NO more data can be added to the file. And if it is a folder file, which keeps track of all the files that reside under that folder, no more files can be added under that folder. Once this occurs, the application crashes, leading to a best-case scenario of several hours of unscheduled downtime to resolve.
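To put the FAL ceiling in perspective, the small sketch below runs the arithmetic implied by the figures above (256KB and 8,192 entries); the exact on-disk accounting is more involved, so treat this only as an order-of-magnitude illustration.

```c
#include <stdio.h>

/* Order-of-magnitude illustration of the FAL ceiling: a 256KB FAL holding
   8,192 attribute entries works out to 32 bytes per entry, and each
   additional non-contiguous allocation of a file can consume another mapping
   entry. The real on-disk accounting is more involved; this only conveys the
   scale at which a heavily fragmented file approaches the limit. */
int main(void)
{
    const int fal_limit_bytes = 256 * 1024;
    const int max_entries     = 8192;

    printf("Bytes per FAL entry: %d\n", fal_limit_bytes / max_entries);   /* 32 */
    printf("Entries left after 8,000 fragment mappings: %d\n",
           max_entries - 8000);                                           /* 192 */
    return 0;
}
```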

Although this blog points out MEDITECH customers experiencing this issue, we have seen this FAL problem occur within non-MEDITECH environments like MS-Exchange and MS-SQL, with varying types of backend storage media from HDDs to all-flash arrays. So, what can be done about it?

The logical solution would seem to be: why not just defragment the volume? Wouldn’t that decrease the number of pointers and decrease the FAL size? The problem is that traditional defragmentation actually causes the FAL to grow in size! While it can decrease the number of pointers, it will not shrink the FAL; in fact, it can cause the FAL to grow even larger, making the problem worse even though you are attempting to remediate it.

The only solution to this problem is Condusiv’s V-locity® for virtual servers or Diskeeper® Server for physical servers, which include a proprietary technology called MediWrite® that helps suppress this issue from occurring in the first place and provides special handling if it has already occurred. MediWrite includes:

>Unique FAL handling: As indicated above, traditional methods of defragmentation will cause the FAL to grow even further in size. MediWrite will detect when files have FAL size issues and will use proprietary methods to prevent FAL growth. This is the only engine of its kind in the industry.

>Unique FAL-safe file movement: V-locity and Diskeeper’s free space consolidation engines automatically detect FAL size issues and automatically deploy the MediWrite feature to resolve them.

>Unique FAL growth prevention: Along with MediWrite, V-locity and Diskeeper contain another very important technology called IntelliWrite® which automatically prevents new fragmentation from occurring. By preventing new fragmentation from occurring, IntelliWrite minimizes any further FAL size growth issues.

>Unique Offline FAL Consolidation tool: Any MEDITECH hospital that already has an FAL issue can use the embedded offline tool to shrink the FAL-IN-USE size in a very short time (~5 min) as opposed to manual processes that take several hours.

>V-locity and Diskeeper have been endorsed by MEDITECH. Click Here to view.

 

 

How to Achieve 2X Faster MS-SQL Applications

by Brian Morin 8. November 2017 05:31

By following the best practices outlined here, we can virtually guarantee a 2X or faster boost in your MS-SQL performance with our I/O reduction software.

  1) Run our I/O reduction software not just on the SQL Server instances but also on the application servers that run on top of MS-SQL

- It’s not just SQL performance that needs improvement, but the associated application servers that communicate with SQL. Our software will eliminate a minimum of 30-40% of the I/O traffic from those systems.

  2) Run our I/O reduction software on all the non-SQL systems on the same host/hypervisor

- Sometimes a customer is only concerned with improving their SQL performance, so they only install our I/O reduction software on the SQL Server instances. Keep in mind, the other VMs on the same host/hypervisor are interfering with the performance of your SQL instances due to chatty I/O that is contending for the same storage resources. Our software eliminates a minimum of 30-40% of the I/O traffic from those systems that is completely unnecessary, so they don’t interfere with your SQL performance.

- Any customer that is on the core or host pricing model is able to deploy the software to an unlimited number of guest machines on the same host. If you are on per system pricing, consider migrating to a host model if your VM density is 7 or greater.

  3) Cap MS-SQL memory usage, leaving at least 8GB free

- Perhaps the largest SQL inefficiency is related to how it uses memory. SQL is a memory hog. It takes everything you give it then does very little with it to actually boost performance, which is why customers see such big gains with our software when memory has been tuned properly. If SQL is left uncapped, our software will not see any memory available to be used for cache, so only our write optimization engine will be in effect. Moreover, most DB admins cap SQL, leaving 4GB for the OS to use according to Microsoft’s own best practice.

- However, when using our software, it is best to begin by capping SQL a little more aggressively, leaving 8GB. That gives plenty to the OS, and whatever is left over as idle will be dynamically leveraged by our software for cache. If 4GB is available to be used as cache by our software, we commonly see customers achieve 50% cache hit rates. It doesn’t take much capacity for our software to drive big gains (see the sizing sketch after this list).

  4) Consider adding more memory to the SQL Server

- Some customers will add more memory and then limit SQL memory usage to what it was using originally, which leaves the extra RAM for our software to use as cache. This also alleviates concerns about capping SQL aggressively if you feel it may leave the application memory starved. The very best place to start is by adding 16GB of RAM and then monitoring the dashboard to see the impact. The software can use up to 128GB of DRAM. Customers who are generous in this approach on read-heavy applications see otherworldly gains far beyond 2X, with >90% of I/O served from DRAM. Remember, DRAM is 15X faster than SSD and sits next to the CPU.

  5) Monitor the dashboard for a 50% reduction in I/O traffic to storage

- When our dashboard shows a 50% reduction in I/O to storage, that’s when you know you have properly tuned your system to be in the range of 2X faster gains to the user, barring any network congestion issues or delivery issues.

- As much as capping SQL to leave 8GB may be a good place to start, it may not always get you to the desired 50% I/O reduction number. Monitor the dashboard to see how much I/O is being offloaded and simply tweak memory usage by capping SQL a little more aggressively. If you feel you may be memory constrained already, then add a little more memory so you can cap more aggressively. For every 1-2GB of memory added, another 10-25% of read traffic will be offloaded.
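As a rough illustration of the memory-tuning guidance in steps 3-5, the sketch below estimates leftover cache memory and read offload for a hypothetical server; the server size, SQL cap, OS reserve, and linear-scaling assumption are placeholders, and the 10-25%-per-1-2GB figure is simply the rule of thumb quoted above, not a guarantee.

```c
#include <stdio.h>

/* Hypothetical sizing walk-through for steps 3-5: cap SQL to leave headroom,
   estimate what the cache layer can use, and project read offload using the
   ~10-25% per 1-2GB rule of thumb (conservative end, assumed linear). All
   numbers are placeholders for illustration. */
int main(void)
{
    double total_ram_gb  = 32.0;                 /* hypothetical server */
    double sql_cap_gb    = total_ram_gb - 8.0;   /* step 3: leave at least 8GB */
    double os_reserve_gb = 4.0;                  /* per Microsoft's guidance for the OS */
    double cache_gb      = total_ram_gb - sql_cap_gb - os_reserve_gb;

    double read_offload  = cache_gb * 0.10;      /* conservative 10% per GB of cache */
    if (read_offload > 0.90) read_offload = 0.90;

    printf("Cap SQL at %.0f GB, reserve %.0f GB for the OS -> ~%.0f GB idle for cache\n",
           sql_cap_gb, os_reserve_gb, cache_gb);
    printf("Projected read offload: ~%.0f%% (monitor the dashboard; aim for 50%%+ total I/O reduction)\n",
           read_offload * 100.0);
    return 0;
}
```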

 

Not a customer yet? Download a free trial of Condusiv I/O reduction software and apply these best practice steps at www.condusiv.com/try

 
