Condusiv Technologies Blog

Blogging @Condusiv

The Condusiv blog shares insight into the issues surrounding system and application performance—and how I/O optimization software is breaking new ground in solving those issues.

It’s Diskeeper Groundhog Day

by Brian Morin 12. July 2017 06:08

Every week, I find myself sending the same email to at least one Diskeeper® customer. Almost every time, it’s a new manager/director/VP who joins the company and hears Diskeeper is running on their physical servers or clients. This is how it goes:

Hi Christopher,

I’m the SVP of WW Sales here and received news that XYZ company may not be renewing support on your Diskeeper licenses because of concerns about running it on MS-SQL servers attached to SAN storage with SSDs.

Since you own these licenses, I wanted to reach out for a quick 15-min tech conversation just to make sure you understand what you have in the latest version. Many people still hold legacy ideas of Diskeeper from when it was a “defrag” product, which don’t apply to the new world order of SSDs and modern SANs.

I guarantee the current version of our product will offload anywhere from 30–40% of your I/O traffic from your underlying storage, providing a nice performance boost and giving some precious I/O headroom back to that subsystem. Many customers offload >50% of I/O by simply adding more memory, enabling them to sweat their storage assets significantly and use those IOPS for other things. It’s a free upgrade while you are active on maintenance.

As a primer, you can read this case study we published last month with the University of Illinois in which we doubled performance of their SQL and Oracle applications sitting on all-flash arrays. And if you want the short, short summary of what the technology does now, here’s the 2-min video: https://www.youtube.com/watch?v=Ge-49YYPwBM. This is why Gartner named us Cool Vendor of the Year a couple years back.

The good news is that they have heard of Diskeeper. The bad news is that they still associate it with legacy versions that emphasized defragmentation applicable only to spinning disk. For those customers who virtualized and converted their licenses to V-locity®, we don’t run into this issue.

If you are a current Diskeeper customer and have difficulty educating new management, I suggest setting up a tech review between Condusiv and new team members. If you are running all-flash, an alternative approach others have taken is simply replacing their Diskeeper with SSDkeeper® so they don’t run into old “defrag” objections. The core features are identical and both auto-detect if the storage is HDD or SSD and apply the best optimization method.

Keep in mind, since Diskeeper now proactively eliminates excessively small writes and reads in the first place, “defragmentation” is largely a dead concept except in some extenuating circumstances – for example, a heavily fragmented volume on spinning disk that never had Diskeeper on it and could use a one-time cleanup. A good example is a new customer last week whose full backup was taking over a day to complete! After running Diskeeper on their physical servers, the backup time was cut in half.

Tags:

Defrag | Diskeeper | SAN | SSD, Solid State, Flash

University of Illinois Doubles SQL and Oracle Performance on All-Flash Arrays with V-locity® I/O Reduction Software

by Brian Morin 19. May 2017 11:00

The University of Illinois had already deployed an all-flash Dell Compellent storage array to support its hardest-hitting application, AssetWorks AiM, which runs on Oracle. After a year in service, performance began to erode due to growth in users and overall workload. Other hard-hitting MS-SQL applications supported by hybrid arrays were suffering performance degradation as well.

The one common denominator: all of these applications ran on Windows servers – specifically, Windows Server 2012 R2.

“As we learned through this exercise with Condusiv’s V-locity I/O reduction software, we were getting hit really hard by thousands of excessively small, tiny writes and reads that dampened performance significantly,” said Greg Landes, Manager of Systems Services. “Everything was just slower due to Windows Server write inefficiencies that break writes down to be much smaller than they need to be, and forces the all-flash SAN to process far more I/O operations than necessary for any given workload.”

Landes continued, “When you have a dump truck but are only filling it a shovelful at a time before sending it on, you’re not getting near the payload you should get with each trip. That’s the exact effect we were getting with a surplus of unnecessarily small, fractured writes and subsequent reads, and it was really hurting our storage performance, even though we had a really fast ‘dump truck.’ We had no idea how much this was hurting us until we tried V-locity to address the root-cause problem to get more payload with every write. When you no longer have to process three small, fractured writes for something that only needs one write and a single I/O operation, everything is just faster.”

When testing the before-and-after effect on their production system, Greg and his team first measured without V-locity. It took 4 hours and 3 minutes to process 1.57TB of data, requiring 13,910,568 I/O operations from storage. After V-locity, the same system processed 2.1TB of data in 1 hour and 6 minutes, while only needing to process 2,702,479 I/O operations from underlying storage. “We processed half a terabyte more in a quarter of the time,” said Landes.
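For readers who want to sanity-check those numbers, the back-of-the-envelope arithmetic below (using only the figures quoted above and assuming decimal terabytes) shows where the gain came from: the average payload per I/O operation grew roughly sevenfold, which translated into roughly five times the throughput.

    # Figures quoted above; TB converted at 10^12 bytes (an assumption)
    before_bytes, before_ops, before_minutes = 1.57e12, 13_910_568, 4*60 + 3
    after_bytes,  after_ops,  after_minutes  = 2.10e12,  2_702_479, 1*60 + 6

    # Average payload per I/O operation (KB)
    print(before_bytes / before_ops / 1024)      # ~110 KB per I/O before
    print(after_bytes / after_ops / 1024)        # ~759 KB per I/O after

    # Throughput (GB per minute)
    print(before_bytes / 1e9 / before_minutes)   # ~6.5 GB/min before
    print(after_bytes / 1e9 / after_minutes)     # ~31.8 GB/min after, ~5X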

“Not only did V-locity dramatically help our write-heavy MS-SQL and Oracle servers by increasing performance 50–100% on several workloads, we saw even bigger gains on our read-heavy applications that could take advantage of V-locity’s patented DRAM caching engine, which put our idle, unused memory to good use. Since we had provisioned adequate memory for these I/O-intensive systems, we were well positioned to get the most from V-locity.”

“We thought we were getting the most performance possible from our systems, but it wasn’t until we used V-locity that we realized how inefficient these systems really are if you’re not addressing the root cause performance issues related to the I/O profile from Windows servers. By solving the issue of small, fractured, random I/O, we’ve been able to increase the efficiency of our infrastructure and, ultimately, our people,” said Landes.

Tags:

General | SAN | Success Stories | virtualization | V-Locity | Windows Server 2012

How Can I/O Reduction Software Guarantee to Solve the Toughest Performance Problems?

by Brian Morin 14. January 2017 01:00

The #1 request I’ve been getting from customers is a whiteboard video that succinctly explains the two silent killers of VM performance and how our I/O reduction software is guaranteed to solve performance problems, so applications run perfectly on every Windows server.

Expensive backend storage upgrades should ONLY take place when you need more capacity – not more performance. Anytime I tell someone our I/O reduction software is guaranteed to solve their toughest performance problems, the very first response is invariably the same: HOW? Not only have I answered this question hundreds of times, our own customers find themselves answering it repeatedly for other team members or new hires.

To make this easier, I’ve answered it all in this 10-min whiteboard video, or you can continue reading.

Most of us have been upgrading hardware to get more performance for as long as we can remember. It’s become so ingrained that it’s often the ONLY approach we think of when we need a performance upgrade.

Many organizations don’t necessarily need a performance boost on EVERY application – they need it on one or two I/O-intensive applications. Throwing a new all-flash array or new hybrid array at a performance problem ends up being the most expensive and disruptive way to solve it, when all you have to do is the same thing thousands of our customers have done: simply try our I/O reduction software on any Windows server and watch the application run at least 50% faster, and in many cases 2X–10X faster.

Most IT professionals are unaware that, as great as virtualization has been for server efficiency, the one downside is how it adds complexity to the data path. On top of that, Windows doesn’t play well in a virtual environment (or any environment where it is abstracted from the physical layer). The result is I/O that is a lot smaller, more fractured, and more random than it needs to be – the perfect trifecta for bad storage performance.

This “death by a thousand cuts” scenario means systems process workloads about 50% slower than they should. Condusiv’s I/O reduction software solves this problem by displacing many small, fractured writes and reads with large, clean, contiguous writes and reads. As huge as that patented engine is for our customers, it’s not the only thing we’re doing to make applications run smoothly. Performance is further electrified by establishing a tier-0 caching strategy – automatically using idle, available memory to serve hot reads. This is the same battle-tested technology that has been OEM’d by some of the largest names out there – Dell, Lenovo, HP, SanDisk, Western Digital, just to name a few.

Although we might be best known for our first patented engine, which solves Windows write inefficiencies to HDDs or SSDs, more and more customers are discovering just how important our patented DRAM caching engine is. If a customer can maintain even just 4GB of available memory to be used for cache, they most often see cache hit rates in the range of 50%. That means serving data out of DRAM, which is 15X faster than SSD, and opening up even more precious bandwidth to and from storage for everything else. Other customers who really need to crank up performance are simply provisioning more memory on those systems and seeing >90% cache hit rates.
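To make the caching concept concrete, here is a minimal sketch of the general tier-0 idea – serve hot blocks from an in-memory LRU cache and fall back to storage on a miss – along with the effective-latency arithmetic behind those hit-rate figures. This is an illustration of the technique, not Condusiv’s actual engine, and the latency constants are assumptions chosen only to reflect the 15X DRAM-vs-SSD ratio mentioned above.

    from collections import OrderedDict

    class Tier0ReadCache:
        """Illustrative LRU read cache: serve hot blocks from DRAM, else storage."""
        def __init__(self, capacity_blocks, backing_store):
            self.cache = OrderedDict()
            self.capacity = capacity_blocks
            self.backing = backing_store      # stand-in for the storage volume
            self.hits = self.misses = 0

        def read(self, block_id):
            if block_id in self.cache:
                self.hits += 1
                self.cache.move_to_end(block_id)   # mark block as recently used
                return self.cache[block_id]
            self.misses += 1
            data = self.backing[block_id]          # slow path: go to storage
            self.cache[block_id] = data
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)     # evict least recently used
            return data

    # Effective read latency at a given hit rate. The absolute numbers are
    # placeholders; only the 15X DRAM-vs-SSD ratio comes from the article.
    def effective_latency_us(hit_rate, dram_us=0.1, ssd_us=1.5):
        return hit_rate * dram_us + (1 - hit_rate) * ssd_us

    print(effective_latency_us(0.50))   # 0.80 us -> ~47% faster than all-SSD
    print(effective_latency_us(0.90))   # 0.24 us -> ~84% faster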

See all this and more described in the latest Condusiv I/O Reduction White Board video that explains eeevvvveeerything you need to know about the problem, how we solve it, and the typical results that should be expected in the time it takes you to drink a cup of coffee. So go get a cup of coffee, sit back, relax, and see how we can solve your toughest performance problems – guaranteed.

 

Everything You Need to Know about SSDs and Fragmentation in 5 Minutes

by Howard Butler 17. November 2016 05:42

When reading articles, blogs, and forums posted by well-respected (or at least well-intentioned) people on the subject of fragmentation and SSDs, many make statements about how (1) SSDs don’t fragment, or (2) there are no moving parts, so there’s no problem, or (3) an SSD is so fast, why bother? We all know and agree SSDs shouldn’t be “defragmented” since that shortens lifespan, so is there a problem after all?

The truth of the matter is that applications running on Windows do not talk directly to the storage device. Data is referenced as an abstracted layer of logical clusters rather than physical tracks/sectors or specific NAND-flash memory cells. Before a storage unit (HDD or SSD) can be recognized by Windows, a file system must be prepared for the volume. This happens when the volume is formatted, in most cases with a 4KB cluster size. The cluster size is the smallest unit of space that can be allocated: too large a cluster size wastes space through over-allocation, while too small a cluster size causes many file extents, or fragments.

After formatting is complete and the volume is first written to, almost all of the free space sits in just one or two very large sections. Over time, as files of various sizes are written, modified, re-written, copied, and deleted, the individual sections of free space, as seen from the NTFS logical file system, become smaller and smaller. I have seen both HDD and SSD storage devices with over 3 million free space extents. Since Windows lacks file-size intelligence when writing a file, it never chooses the best allocation at the logical layer, only the next available – even if the next available is 4KB. That means 128KB worth of data can wind up in 32 extents, or fragments, each 4KB in size. SSDs therefore do fragment at the logical Windows NTFS file system level – not as a function of the storage media, but of the design of the file system.

Let’s examine how this impacts performance. Each extent of a file requires its own separate I/O request. In the example above, that means 32 I/O operations for a file that could have taken a single I/O if Windows were smarter about managing free space and choosing the best logical clusters instead of the next available. Since every I/O takes a measurable amount of time to complete, the issue with SSDs is really one of I/O overhead.
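A toy model shows how that 32-extent example arises. The sketch below is a deliberately simplified, hypothetical allocator – not NTFS itself – that writes a 128KB file into a free-space map that has degraded into scattered 4KB holes, using the “next available” policy described above, and counts the extents (and therefore the I/O requests) that result.

    # Toy free-space map: 4 KB clusters, every other cluster free - a volume
    # whose free space has degraded into scattered single-cluster holes.
    CLUSTER = 4 * 1024
    free = {i: (i % 2 == 0) for i in range(64)}

    def next_available_alloc(free_map, bytes_needed):
        """Allocate the way the article describes: take the next free cluster,
        however small, instead of seeking one contiguous run."""
        clusters_needed = -(-bytes_needed // CLUSTER)   # ceiling division
        got, extents, prev = 0, 0, None
        for i in sorted(free_map):
            if got == clusters_needed:
                break
            if free_map[i]:
                free_map[i] = False
                got += 1
                if prev is None or i != prev + 1:
                    extents += 1     # non-adjacent cluster starts a new extent
                prev = i
        return extents

    print(next_available_alloc(free, 128 * 1024))
    # -> 32 extents, i.e. 32 separate I/O requests for one 128 KB file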

Even with no moving parts and multi-channel I/O capability, the more I/O requests needed to complete a given workload, the longer it takes your SSD to access the data. This performance loss occurs on initial file creation and carries forward with each subsequent read of the same data.

But the performance loss doesn’t stop there. Once data is written to a memory cell on an SSD and the file space is later marked for deletion, the cell must first be erased before new data can be written to it. This is a rather time-consuming process, and memory cells cannot be erased individually; instead, a group of adjacent memory cells (referred to as a page) is processed together. Unfortunately, some of those memory cells may still contain valuable data, and this information must first be copied to a different set of memory cells before the page (group of memory cells) can be erased and made ready to accept new data. This is known as write amplification, and it is one of the reasons writes are so much slower than reads on an SSD.

Another problem unique to SSDs is that each memory cell can only be written a limited number of times before it is no longer usable; when too many memory cells are considered invalid, the whole unit becomes unusable. While TRIM, wear-leveling technologies, and garbage-collection routines have been developed to help with this behavior, they are not able to run in real time and are therefore only playing catch-up instead of providing the kind of preventative measures that are needed most. In fact, these advanced technologies offered by SSD manufacturers (and within Windows) do not prevent or reverse the effects of file and free space fragmentation at the NTFS file system level.
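Write amplification is easy to put numbers on. The sketch below uses a hypothetical geometry (64 pages per erase unit) purely for illustration: the more valid data an erase unit still holds when it is recycled, the more internal copying the drive must do for every page of new host data.

    # Illustrative write-amplification arithmetic (hypothetical geometry:
    # 64 pages per erase unit; the counts are for demonstration only).
    pages_per_unit = 64
    valid_at_reclaim = 48   # pages still holding live data at garbage collection

    host_pages = pages_per_unit - valid_at_reclaim       # room for new host data
    flash_pages_written = host_pages + valid_at_reclaim  # new data + relocations
    print(flash_pages_written / host_pages)   # 4.0 -> each host write costs
                                              # 4 physical flash writes
    # Fuller erase units at garbage-collection time make it worse:
    for valid in (16, 32, 48, 56):
        print(valid, round(pages_per_unit / (pages_per_unit - valid), 1))
        # -> 1.3, 2.0, 4.0, 8.0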

The only way to eliminate this surplus of small writes and reads that (1) chews up performance and (2) shortens lifespan from all the wear and tear is by taking a preventative approach that makes Windows “smarter” about how it writes files and manages free space, so more payload is delivered with every I/O operation. That’s exactly why so many users run Condusiv’s Diskeeper® (for physical servers and workstations) or V-locity® (for virtual servers) on systems with SSD storage. For anyone who questions how much value this approach adds to their systems, the easiest way to find out is to download a free 30-day trial and watch the “time saved” dashboard for yourself. Since the fastest I/O is the one you don’t have to write, Condusiv software understands exactly how much time is saved by replacing multiple, fractured writes with fewer, larger contiguous writes. It even has an additional feature to cache reads from idle, available DRAM (15X faster than SSD), which further offloads I/O bandwidth from SSD storage. Especially for businesses with many users accessing a multitude of applications across hundreds or thousands of servers, the time savings are enormous.

 

ATTO benchmark results with and without Diskeeper 16 running on a 120GB Samsung 840 Pro SSD. The read data caching shows a 10X improvement in read performance.

First-ever “Time Saved” Dashboard = Holy Grail for ROI

by Brian Morin 2. November 2016 10:03

If you’ve ever wondered about the exact business value that Condusiv® I/O reduction software provides to your systems, the latest “time saved” reporting does exactly that.

Prior to V-locity® v6.2 for virtual servers and Diskeeper® 16 for physical servers and endpoints, customers would conduct extensive before/after tests to measure the intrinsic performance value, but struggled to quantify the ongoing business benefit over time. This has been especially true during annual maintenance renewal cycles, when key stakeholders need to be “re-sold” to allocate budget for ongoing maintenance or to push new licenses to new servers.

The number one request from customers has been to better understand the ongoing business benefit of I/O reduction in terms that are easily relatable to senior management and make justifying the ROI painless. This “holy grail” search on the part of our engineering team has led to the industry’s first-ever “time saved” dashboard for an I/O optimization software platform.

When Condusiv software proactively eliminates the surplus of small, fractured writes and reads and ensures more “payload” with every I/O operation, the net effect is fewer write and read operations for any given workload, which saves time. When Condusiv software caches hot reads within idle, available DRAM, the net effect is fewer reads traversing the full stack down to storage and back, which saves time.

In terms of benefits, the new dashboard shows:

    1. How many write I/Os are eliminated by ensuring large, clean, contiguous writes from Windows

    2. How many read I/Os are served from cache in idle DRAM

    3. What percentage of write and read traffic is offloaded from underlying SSD or HDD storage

    4. Most importantly – the dashboard relates I/O reduction to the business benefit of … “time saved”

This reporting approach makes the software fully transparent on the type of benefit being delivered to any individual system or groups of systems. Since the software itself sits within the Windows operating system, it is aware of latency to storage and understands just how much time is saved by serving an I/O from DRAM instead of the underlying SSD or HDD. And, most importantly, since the fastest I/O is the one you don’t have to write, Condusiv software understands how much time is saved by eliminating multiple small, fractured writes with fewer, larger contiguous writes.  
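As a sketch of the arithmetic involved, the snippet below is a hypothetical estimator, not the product’s actual reporting code; the two latency constants are placeholder assumptions standing in for the per-system latencies the software measures.

    # Hypothetical "time saved" estimator; illustrative latencies in ms.
    STORAGE_IO_MS = 0.5     # assumed avg round trip to the underlying SSD/HDD
    DRAM_READ_MS = 0.03     # assumed avg read served from idle DRAM

    def time_saved_hours(writes_eliminated, reads_cached):
        saved_ms = (writes_eliminated * STORAGE_IO_MS                 # I/Os never issued
                    + reads_cached * (STORAGE_IO_MS - DRAM_READ_MS))  # faster read path
        return saved_ms / 1000 / 3600

    # e.g. 40M write I/Os consolidated away, 120M reads served from DRAM
    print(round(time_saved_hours(40_000_000, 120_000_000), 1))   # ~21.2 hours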

Have you ever wondered how much time V-locity will save a VDI deployment? Or an application supported by all-flash? Or a hyperconverged environment? Rather than wonder, just install a 30-day version of the software and monitor the “time saved” dashboard to find out. Benefits are fully transparent and easily quantified.

Have you ever needed to justify Diskeeper’s endpoint solution across a fleet of corporate laptops with SSDs? Now you can see the “time saved” on individual systems or all systems and quantify the cost of labor against the number of hours that Diskeeper saved in I/O time across any time period. The “no brainer” benefit will be immediately obvious.

Customers will be pleasantly surprised to find that the latest dashboard doesn’t just show granular benefits but also granular performance metrics and other important information to assist with memory tuning. See the average, minimum, and maximum idle memory used for cache over any time period (even by the hour) to make quick assessments of which systems could use more memory to take better advantage of the caching engine for greater application performance. Customers have found that if they can maintain at least 2GB used for cache, they begin to get into the sweet spot of what the product can do. If even more can be maintained to establish a tier-0 cache strategy, performance rises further. Systems with at least 4GB idle for cache will invariably serve 60% of reads or more.

 

 

Lou Goodreau, IT Manager, New England Fishery:

“32% of my write traffic has been eliminated and 64% of my read traffic has been cached within idle memory. This saved over 20 hours in I/O time after 24 days of testing!”

David Bruce, Managing Partner, David Bruce & Associates:

“Over 50% of my reads are now served from DRAM and over 30% of write traffic has been eliminated by ensuring large, contiguous writes. Now everything is more responsive!”

 
