Condusiv Technologies Blog

Blogging @Condusiv

The Condusiv blog shares insight into the issues surrounding system and application performance—and how I/O optimization software is breaking new ground in solving those issues.

How to Achieve 2X Faster MS-SQL Applications

by Brian Morin 8. November 2017 05:31

By following the best practices outlined here, we can virtually guarantee a 2X or faster boost in your MS-SQL performance with our I/O reduction software.

  1) Run our I/O reduction software not just on the SQL Server instances but also on the application servers that run on top of MS-SQL

- It’s not just SQL performance that needs improvement, but the associated application servers that communicate with SQL. Our software will eliminate a minimum of 30-40% of the I/O traffic from those systems.

  2) Run our I/O reduction software on all the non-SQL systems on the same host/hypervisor

- Sometimes a customer is only concerned with improving their SQL performance, so they only install our I/O reduction software on the SQL Server instances. Keep in mind, the other VMs on the same host/hypervisor are interfering with the performance of your SQL instances due to chatty I/O that is contending for the same storage resources. Our software eliminates a minimum of 30-40% of the I/O traffic from those systems that is completely unnecessary, so they don’t interfere with your SQL performance.

- Any customer that is on the core or host pricing model is able to deploy the software to an unlimited number of guest machines on the same host. If you are on per system pricing, consider migrating to a host model if your VM density is 7 or greater.

  3) Cap MS-SQL memory usage, leaving at least 8GB left over

- Perhaps the largest SQL inefficiency is related to how it uses memory. SQL is a memory hog. It takes everything you give it and then does very little with it to actually boost performance, which is why customers see such big gains with our software when memory has been tuned properly. If SQL is left uncapped, our software will not see any memory available to be used for cache, so only our write optimization engine will be in effect. Most DB admins already cap SQL, leaving 4GB for the OS to use according to Microsoft’s own best practice.

- However, when using our software, it is best to begin by capping SQL a little more aggressively by leaving 8GB. That will give plenty to the OS, and whatever is leftover as idle will be dynamically leveraged by our software for cache. If 4GB is available to be used as cache by our software, we commonly see customers achieve 50% cache hit rates. It doesn’t take much capacity for our software to drive big gains.

  4) Consider adding more memory to the SQL Server

- Some customers add more memory and then limit SQL memory usage to what it was using originally, which leaves the extra RAM for our software to use as cache. This also alleviates concerns that capping SQL aggressively may leave the application memory starved. Our software can use up to 128GB of DRAM. Customers who are generous in this approach on read-heavy applications see gains far beyond 2X, with >90% of I/O served from DRAM. Remember, DRAM is 15X faster than SSD and sits next to the CPU.

  5) Monitor the dashboard for a 50% reduction in I/O traffic to storage

- When our dashboard shows a 50% reduction in I/O to storage, that’s when you know you have properly tuned your system to be in the range of 2X faster gains to the user, barring any network congestion issues or delivery issues.

- As much as capping SQL at 8GB may be a good place to start, it may not always get you to the desired 50% I/O reduction number. Monitor the dashboard to see how much I/O is being offloaded and simply tweak memory usage by capping SQL a little more aggressively. If you feel you may be memory constrained already, then add a little more memory, so you can cap more aggressively. For every 1-2GB of memory added, another 10-25% of read traffic will be offloaded.
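As a rough illustration of steps 3 and 5, the sketch below derives a memory cap that leaves 8GB uncapped and checks a dashboard reading against the 50% target. The function names and sample numbers are hypothetical. The generated T-SQL uses SQL Server's `sp_configure` option `max server memory (MB)`; review it for your environment before running it.

```python
# Sketch only: helper names and sample figures are illustrative assumptions,
# not part of any Condusiv or Microsoft tooling.

def sql_memory_cap_mb(total_ram_gb: float, leave_free_gb: float = 8.0) -> int:
    """Return a 'max server memory (MB)' value that leaves `leave_free_gb` uncapped."""
    if total_ram_gb <= leave_free_gb:
        raise ValueError("server has no RAM to spare beyond the reserved amount")
    return int((total_ram_gb - leave_free_gb) * 1024)


def cap_statements(cap_mb: int) -> str:
    """Build the T-SQL that applies the cap through sp_configure."""
    return "\n".join([
        "EXEC sp_configure 'show advanced options', 1; RECONFIGURE;",
        f"EXEC sp_configure 'max server memory (MB)', {cap_mb}; RECONFIGURE;",
    ])


def meets_target(io_reduction_pct: float, target_pct: float = 50.0) -> bool:
    """True when the dashboard's I/O reduction has reached the target."""
    return io_reduction_pct >= target_pct


if __name__ == "__main__":
    cap = sql_memory_cap_mb(total_ram_gb=64)   # assumed 64GB server
    print(cap_statements(cap))
    if not meets_target(38.0):                 # assumed dashboard reading
        print("Below 50%: cap SQL more aggressively or add memory.")
```

If the reading stays below the target, the same helper can be rerun with a larger `leave_free_gb` value, which mirrors the advice to cap a little more aggressively.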

 

Not a customer yet? Download a free trial of Condusiv I/O reduction software and apply these best practice steps at www.condusiv.com/try

 

New Dashboard Finally Answers the Big Question

by Brian Morin 25. October 2017 04:38

After surveying thousands of IT professionals, we’ve found that the vast majority agree that Windows performance degrades over time; they just don’t agree on how much. What most don’t know is the actual cause: I/O degradation, as writes and reads become far smaller than they should be. This inefficiency is akin to moving a gallon of water across a room with Dixie cups instead of a single gallon jug. Even if you have all-flash storage and can move those Dixie cups quickly, you are still not processing data nearly as fast as you could.
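To put numbers on the Dixie cup analogy, here is a minimal sketch that counts how many I/O operations it takes to move the same payload at two different I/O sizes. The 1MB payload and the 4KB and 64KB sizes are assumptions chosen for illustration, not measurements.

```python
def io_operations(payload_bytes: int, io_size_bytes: int) -> int:
    """Number of I/O operations needed to move a payload at a given I/O size."""
    if payload_bytes < 0 or io_size_bytes <= 0:
        raise ValueError("payload must be non-negative and I/O size positive")
    # Integer ceiling division: a partial final chunk still costs one I/O.
    return -(-payload_bytes // io_size_bytes)


if __name__ == "__main__":
    payload = 1024 * 1024                       # assumed 1MB of data
    small = io_operations(payload, 4 * 1024)    # "Dixie cups"
    large = io_operations(payload, 64 * 1024)   # "gallon jug"
    print(f"4KB I/Os: {small}, 64KB I/Os: {large}")
```

The payload is identical in both cases; only the number of trips to storage changes.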

In the same surveys, we’ve also found that the vast majority of IT professionals are aware of the performance penalty of the “I/O blender” effect in a virtual environment, which is the mixing and randomizing of I/O streams from the disparate virtual machines on the same host. What they don’t agree on is how much. And, they are not aware of how the issue is compounded by Windows write inefficiencies.

The latest Condusiv in-product dashboard is now deployed across thousands of customer systems that have been upgraded to the latest version of our I/O reduction software. For the first time, customers have a granular view into what the software is doing for their systems: the exact percentage and number of read and write I/O operations eliminated from storage, and how much I/O time that saves any given system or group of systems. Ultimately, it’s a picture of the size of the problem: all the unnecessary I/O traffic that is mere noise and dampens system performance.

In our surveys, we found IT professionals all over the map on the size of the performance penalty from inefficiencies. Some are quite positive the performance penalty is no more than 10%. More put that range at 20%. Most put it at 30%. Then it dips back down with fewer believing a 40% penalty with the fewest throwing the dart at 50%.

As it turns out, our latest version has been able to drop a pin on that.

There are variances that move the extent of the penalty on any given workload such as system configuration and/or workload behavior. Some systems might be memory constrained, some workloads might be too light to matter, etc.

However, after thousands of installs over the last several months, we see a very consistent range on the vast majority of systems in which 30-40% of all I/O traffic is being offloaded from underlying storage with our software. Not only does that represent an immediate performance boost for users, but it also means 30-40% of I/O headroom is handed back to the storage subsystem that can now use those IOPS for other things.

The biggest factor to consider is that the 30-40% improvement number represents systems where memory has not been increased beyond the typical configuration that most administrators use. Customers who offload 50% or more of I/O traffic from storage are the ones with read-heavy workloads who beef up memory server-side to get more from the software. For every additional 1-2GB of memory added, another 10-25% of read traffic is offloaded. Some customers are more aggressive and leverage as much memory as possible server-side to offload 90% or more of I/O traffic on read-heavy applications.

As expensive as new all-flash systems are, how much sense does it make to pay for all those IOPS only to allow 30-40% of those IOPS to be chewed up by unnecessary, noisy I/O? By addressing the two biggest penalties that dampen performance (Windows write inefficiencies compounded by the “I/O blender” effect), Condusiv I/O reduction software ensures optimal performance and protects the CapEx investments made into servers and storage by extending their useful life.

Tags:

Disruption, Application Performance, IOPS | virtualization | V-Locity

Condusiv Launches SSDkeeper Software that Guarantees “Faster than New” Performance for PCs and Physical Servers and Extends Longevity of SSDs

by Brian Morin 17. January 2017 09:30

The company that sold over 100 Million Diskeeper® licenses for hard disk drive systems, now releases SSDkeeper™ to keep solid-state drive systems running longer while performing “faster than new.”

Every Windows PC or physical server fitted with a solid-state drive (SSD) suffers from very small, fractured writes and reads, which dampen SSD performance and ultimately erode SSD longevity through write amplification. SSDkeeper’s patented software ensures large, clean contiguous writes and reads for more payload with every I/O operation, reduces the Program/Erase (P/E) cycles that shorten SSD longevity, and boosts performance even further by caching hot reads within idle, available DRAM.

Solid-state drives can only handle a finite number of writes before failing. Every write kicks off P/E cycles that shorten SSD lifespan, a problem compounded by write amplification, in which the drive physically writes more data than the host requested. By reducing the number of writes required for any given file or workload, SSDkeeper significantly boosts write performance while also reducing the number of P/E cycles that would have otherwise been executed. This enables individuals and organizations to reclaim the write speed of their SSDs while ensuring the longest life possible.

Patented Write Optimization

SSDkeeper’s patented write optimization engine (IntelliWrite®) prevents excessively small, fragmented writes and reads that rob the performance and endurance of SSDs. SSDkeeper ensures large, clean contiguous writes from Windows, so maximum payload is carried with every I/O operation. By eliminating the “death by a thousand cuts” scenario of many, tiny writes and reads that slow system performance, the lifespan of an SSD is also extended due to reduction in write amplification issues that plague all SSD devices.

Patented Read Optimization

SSDkeeper electrifies Windows system performance further with an additional patented feature - dynamic memory caching (IntelliMemory®). By automatically using idle, available DRAM to serve hot reads, data is served from memory which is 12-15X faster than SSD and further reduces wear to the SSD device. The real genius in SSDkeeper’s DRAM caching engine is that nothing has to be allocated for cache. All caching occurs automatically. SSDkeeper dynamically uses only the memory that is available at any given moment and throttles according to the need of the application, so there is never an issue of resource contention or memory starvation. If a system is ever memory constrained at any point, SSDkeeper's caching engine will back off entirely. However, systems with just 4GB of available DRAM commonly serve 50% of read traffic. It doesn't take much available memory to have a big impact on performance.

Enhanced Reporting

If you ever wanted to know how much Windows inefficiencies were robbing system performance, SSDkeeper tracks time saved due to elimination of small, fragmented writes and time saved from every read request that is served from DRAM instead of being served from the underlying SSD. Users can leverage SSDkeeper’s built-in dashboard to see what percentage of all write requests are reduced by sequentializing otherwise small, fractured writes and what percentage of all read requests are cached from idle, available DRAM.

SSDkeeper is a lightweight file system driver that runs invisibly in the background with near-zero intrusion on system resources. All optimizations occur automatically in real-time.

While SSDkeeper provides the same core patented functionality and features as the latest Diskeeper® 16 for hard disk drives (minus defragmentation functions for hard disk drives only), the benefit to a solid-state drive is different than to a hard disk drive. Hard disk drives do not suffer from write amplification that reduces longevity. On an SSD, by eliminating excessively small writes, IntelliWrite not only improves write performance but extends endurance as well.

Available in Professional and Server Editions

- SSDkeeper Professional for Windows PCs with SSD drives greatly enhances the performance of corporate laptops and desktops.

- SSDkeeper Server speeds physical server system performance of the most I/O intensive applications such as MS-SQL Server by 2X to 10X depending on the amount of idle, unused memory.

- Options include Diskeeper Administrator management console to automate network deployment and management across hundreds or thousands of PCs or servers.

- A free 30-day software trial download is available at http://www.condusiv.com/evaluation-software/

- Now available for purchase on our online store: http://www.condusiv.com/purchase/SSDKeeper/

 

How Can I/O Reduction Software Guarantee to Solve the Toughest Performance Problems?

by Brian Morin 14. January 2017 01:00

The #1 request I’ve been getting from customers is a white board video that succinctly explains the two silent killers of VM performance and how our I/O reduction guarantees to solve performance problems, so applications run perfectly on every Windows server.

Expensive backend storage upgrades should ONLY take place when needing more capacity – not more performance. Anytime I tell someone our I/O reduction software guarantees to solve their toughest performance problems…the very first response is invariably the same…HOW? Not only have I answered this question hundreds of times, our own customers find themselves answering this question repeatedly to other team members or new hires.

To make this easier, I’ve answered it all here in this 10-min White Board Video, or you can continue reading.

Most of us have been upgrading hardware to get more performance ever since we can remember. It’s become so ingrained that it’s often the ONLY approach we think of when needing a performance upgrade.

For many organizations, they don’t necessarily need a performance boost on EVERY application, but they need it on one or two I/O intensive applications. To throw a new all-flash array or new hybrid array at a performance problem ends up being the most expensive and disruptive way to solve a performance problem when all you have to do is the same thing thousands of our customers have done: simply try our I/O reduction software on any Windows server and watch the application run at least 50% faster and in many cases 2X-10X faster.

Most IT professionals are unaware of the fact that as great as virtualization has been for server efficiency, the one downside is how it adds complexity to the data path. On top of that, Windows doesn’t play well in a virtual environment (or any environment where it is abstracted from the physical layer). This means I/O characteristics that are a lot smaller, more fractured and more random than they need to be – the perfect trifecta for bad storage performance.

This “death by a thousand cuts” scenario means systems are processing workloads about 50% slower than they should. Condusiv’s I/O reduction software solves this problem by displacing many tiny writes and reads with large, clean contiguous writes and reads. As huge as that patented engine is for our customers, it’s not the only thing we’re doing to make applications run smoothly. Performance is further electrified by establishing a tier-0 caching strategy: automatically using idle, available memory to serve hot reads. This is the same battle-tested technology that has been OEM’d by some of the largest vendors out there, including Dell, Lenovo, HP, SanDisk, and Western Digital, just to name a few.

Although we might be most known for our first patented engine that solves Windows write inefficiencies to HDDs or SSDs, more and more customers are discovering just how important our patented DRAM caching engine is. If any customer can maintain even just 4GB of available memory to be used for cache, they most often see cache hit rates in the range of 50%. That means serving data out of DRAM, which is 15X faster than SSD and opens up even more precious bandwidth to and from storage for everything else. Other customers who really need to crank up performance are simply provisioning more memory on those systems and seeing >90% cache hit rates.
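To see what those cache hit rates could mean for average read latency, here is a back-of-the-envelope sketch. It assumes a simple weighted-average model in which a cache hit costs 1/15 of an SSD read (using the 15X figure above) and a miss costs a full SSD read. The model and the function name are illustrative assumptions; this is not how the product measures its gains, and it ignores write optimization entirely.

```python
def effective_read_speedup(hit_rate: float, dram_speedup: float = 15.0) -> float:
    """Average read speedup when `hit_rate` of reads are served from DRAM.

    Latency is expressed relative to one SSD read (= 1.0).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    if dram_speedup <= 0:
        raise ValueError("dram_speedup must be positive")
    # Hits cost 1/dram_speedup of an SSD read; misses cost a full SSD read.
    relative_latency = hit_rate / dram_speedup + (1.0 - hit_rate)
    return 1.0 / relative_latency


if __name__ == "__main__":
    for rate in (0.5, 0.9):
        print(f"hit rate {rate:.0%}: {effective_read_speedup(rate):.3f}x faster reads")
```

Under these assumptions a 50% hit rate works out to roughly 1.9X faster average reads, and a 90% hit rate to 6.25X, which shows why the gains climb steeply as more reads are served from memory.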

See all this and more described in the latest Condusiv I/O Reduction White Board video that explains eeevvvveeerything you need to know about the problem, how we solve it, and the typical results that should be expected in the time it takes you to drink a cup of coffee. So go get a cup of coffee, sit back, relax, and see how we can solve your toughest performance problems – guaranteed.

 

Overview of How We Derive Storage I/O Time Saved

by Rick Cadruvi, Chief Architect 11. January 2017 01:00

The latest versions of V-locity® (for virtual servers) and Diskeeper® (for physical servers and PCs) both contain built-in dashboards that show the exact benefit of the product to any one system or group of systems by showing how much and what percentage of read/write traffic is offloaded from storage and how much “I/O Time” that saves.

To understand the computation on “I/O Time Saved,” in its simplest form, the formula is essentially:

       Storage I/O Time Saved = Total I/Os Eliminated * Average I/O Response Time

In essence, if you take Total I/Os Eliminated from the dashboard Benefits screen and multiply it times the average latency from the I/O Performance dashboard screen, you will generally end up in the ballpark of the “I/O Time Saved.”
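As a quick worked example of that formula, the sketch below multiplies an eliminated-I/O count by an average response time. The function name and the sample values (1,000,000 I/Os at 5 ms) are assumptions for illustration, not dashboard output.

```python
def io_time_saved_seconds(ios_eliminated: int, avg_response_time_ms: float) -> float:
    """Storage I/O Time Saved = Total I/Os Eliminated * Average I/O Response Time."""
    if ios_eliminated < 0 or avg_response_time_ms < 0:
        raise ValueError("inputs must be non-negative")
    return ios_eliminated * avg_response_time_ms / 1000.0  # ms -> seconds


if __name__ == "__main__":
    saved = io_time_saved_seconds(ios_eliminated=1_000_000, avg_response_time_ms=5.0)
    print(f"{saved:.0f} seconds, or about {saved / 60:.0f} minutes of storage I/O time")
```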

I/O counts and I/O times are accumulated on a per I/O basis. Every I/O that goes to storage is timed using Windows High Performance Counters for accuracy.  That timing is from when the I/O is sent down the stack until it comes back up. In essence we time I/O response time (IORT) or latency that the application sees, not the storage device.  We also track reads and writes separately as they impact the storage “I/O Time Saved” differently.

The data is accumulated and calculated during periods of time rather than across the entire reporting period. In the long term, that period of time ends up being hourly. Very active I/O periods will have longer IORTs and therefore the amount of I/O storage time saved per I/O eliminated will likely be greater than during relatively light periods. 
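The per-period accumulation can be sketched as follows. Each period carries its own read and write counts and its own average IORTs, and the total is the sum across periods. The `PeriodSample` structure and the sample figures are hypothetical; the product's internal data layout is not described here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PeriodSample:
    """One reporting period (for example, one hour). Field names are illustrative."""
    reads_eliminated: int
    writes_eliminated: int
    avg_read_iort_ms: float
    avg_write_iort_ms: float

    def time_saved_ms(self) -> float:
        # Reads and writes are tracked separately because their IORTs differ.
        return (self.reads_eliminated * self.avg_read_iort_ms
                + self.writes_eliminated * self.avg_write_iort_ms)


def total_time_saved_ms(samples: Iterable[PeriodSample]) -> float:
    """Sum the per-period savings instead of using one average for the whole span."""
    return sum(s.time_saved_ms() for s in samples)


if __name__ == "__main__":
    busy = PeriodSample(10_000, 4_000, avg_read_iort_ms=8.0, avg_write_iort_ms=12.0)
    quiet = PeriodSample(1_000, 500, avg_read_iort_ms=2.0, avg_write_iort_ms=3.0)
    print(f"busy period:  {busy.time_saved_ms():.0f} ms")
    print(f"quiet period: {quiet.time_saved_ms():.0f} ms")
    print(f"total:        {total_time_saved_ms([busy, quiet]):.0f} ms")
```

With these assumed numbers the busy period saves far more time per eliminated I/O than the quiet one, which is the effect the paragraph above describes.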

If there is a high queue depth, the IORT we time will be larger than the per I/O storage IORT.  We look at the effective IORT the application would see rather than the time the underlying storage takes to process any single I/O.  After all, the user only cares about how long the application took to process an I/O he/she requested, not how long an HDD or SSD took for any single I/O when it got around to processing it.

Let’s talk for a moment about storage “I/O Time Saved” versus clock time because they are not the same and our technologies can, in some cases, save far more storage I/O time than clock time.

If all storage I/O was sequential for the entire instance of the operating system, then the maximum amount of storage “I/O Time Saved” would be the amount of time since installation, and you would expect it to be considerably less as we are unlikely to eliminate ALL I/Os. And you might expect some idle time. Of course, applications do not do pure sequential I/O.  Modern applications are almost always multi-threaded and most computer systems are running multiple applications or instances of them at the same time.  Also, other operations are happening on the system outside of the primary application.  Think of Outlook running in the background while you do some other work on your system. Outlook is constantly receiving updated data.  Windows is also processing lots of I/Os in the background just for it to be able to continue operations.  These I/Os happen in parallel to any I/Os that users may be doing with an application.

In general, there are lots of I/Os that are being processed at the same time.  You would not want to work on a computer system where only a single I/O was being processed at any one point in time as it would be VERY slow.  If the average queue depth would have been 5 without us but 2 with us, that means every time 2 I/Os go through to storage, we would have eliminated 3 I/Os.  The end result would be a storage “I/O Time Saved” of somewhere between 1.5-3x clock time, depending on how the underlying storage processed the I/Os. 
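The queue-depth example above can be worked through in a few lines. The function names are hypothetical. Reading the 1.5x figure as eliminated I/Os per I/O that still reaches storage, and the 3x figure as the number of outstanding I/Os removed at any given moment, is an assumption of this sketch; the text gives the range but not its derivation.

```python
def eliminated_per_passed(qd_without: float, qd_with: float) -> float:
    """I/Os eliminated for every I/O that still goes through to storage."""
    if qd_with <= 0 or qd_without < qd_with:
        raise ValueError("expected qd_without >= qd_with > 0")
    return (qd_without - qd_with) / qd_with


def queue_depth_reduction(qd_without: float, qd_with: float) -> float:
    """How many fewer I/Os are outstanding at any one moment."""
    if qd_without < qd_with:
        raise ValueError("expected qd_without >= qd_with")
    return qd_without - qd_with


if __name__ == "__main__":
    ratio = eliminated_per_passed(qd_without=5, qd_with=2)
    removed = queue_depth_reduction(qd_without=5, qd_with=2)
    print(f"eliminated per passed I/O: {ratio}")
    print(f"outstanding I/Os removed:  {removed}")
```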

Another factor that contributes to the possibility of storage “I/O Time Saved” exceeding clock time is the reduction of split I/Os.  Let’s say that without our product all I/Os actually end up being split into 3 I/Os due to Windows writing files in an excessively small, fragmented manner.  After installing our product, by displacing small writes with large, contiguous writes, each of those I/Os that had to be split into 3 is now completed as a single I/O.  If that was the normal case, the storage “I/O Time Saved” for each I/O would be roughly 2x the actual storage I/O time due to prevention of fragmentation.
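The split I/O arithmetic is simple enough to show directly. This sketch assumes each fragment of a split I/O takes about as long as the single consolidated I/O that replaces it, which is a simplification; the function name is hypothetical.

```python
def split_io_savings_factor(fragments_per_io: int) -> int:
    """Storage I/O time saved per request, as a multiple of one I/O's time.

    A request split into N fragments costs N I/Os; consolidated, it costs 1,
    so N - 1 I/Os' worth of storage time is saved.
    """
    if fragments_per_io < 1:
        raise ValueError("a request needs at least one I/O")
    return fragments_per_io - 1


if __name__ == "__main__":
    factor = split_io_savings_factor(3)
    print(f"split into 3: saves {factor}x the time of the single remaining I/O")
```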
