Condusiv Technologies Blog

Condusiv Technologies Blog

Blogging @Condusiv

The Condusiv blog shares insight into the issues surrounding system and application performance—and how I/O optimization software is breaking new ground in solving those issues.

Everything You Need to Know about SSDs and Fragmentation in 5 Minutes

by Howard Butler 17. November 2016 05:42

When reading articles, blogs, and forums posted by well-respected (or at least well intentioned people) on the subject of fragmentation and SSDs, many make statements about how (1) SSDs don’t fragment, or (2) there’s no moving parts, so no problem, or (3) an SSD is so fast, why bother? We all know and agree SSDs shouldn’t be “defragmented” since that shortens lifespan, so is there a problem after all?

The truth of the matter is that applications running on Windows do not talk directly to the storage device.  Data is referenced as an abstracted layer of logical clusters rather than physical track/sectors or specific NAND-flash memory cells.  Before a storage unit (HDD or SSD) can be recognized by Windows, a file system must be prepared for the volume.  This takes place when the volume is formatted and in most cases is set with a 4KB cluster size.  The cluster size is the smallest unit of space that can be allocated.  Too large of a cluster size results in wasted space due to over allocation for the actual data needed.  Too small of a cluster size causes many file extents or fragments.  After formatting is complete and when a volume is first written to, most all of the free space is in just one or two very large sections.  Over the course of time as files of various sizes are written, modified, re-written, copied, and deleted, the size of individual sections of free space as seen from the NTFS logical file system point of view becomes smaller and smaller.  I have seen both HDD and SSD storage devices with over 3 million free space extents.  Since Windows lacks file size intelligence when writing a file, it never chooses the best allocation at the logical layer, only the next available – even if the next available is 4KB. That means 128K worth of data could wind up with 32 extents or fragments, each being 4KB in size. Therefore SSDs do fragment at the logical Windows NTFS file system level.  This happens not as a function of the storage media, but of the design of the file system.

Let’s examine how this impacts performance.  Each extent of a file requires its own separate I/O request. In the example above, that means 32 I/O operations for a file that could have taken a single I/O if Windows was smarter about managing free space and finding the best logical clusters instead of the next available. Since I/O takes a measurable amount of time to complete, the issue we’re talking about here related to SSDs has to do with an I/O overhead issue.

Even with no moving parts and multi-channel I/O capability, the more I/O requests needed to complete a given workload, the longer it is going to take your SSD to access the data.  This performance loss occurs on initial file creation and carries forward with each subsequent read of the same data.  But wait… the performance loss doesn’t stop there.  Once data is written to a memory cell on an SSD and later the file space is marked for deletion, it must first be erased before new data can be written to that memory cell.  This is a rather time consuming process and individual memory cells cannot be individually erased, but instead a group of adjacent memory cells (referred to as a page) are processed together.  Unfortunately, some of those memory cells may still contain valuable data and this information must first be copied to a different set of memory cells before the memory cell page (group of memory cells) can be erased and made ready to accept the new data.  This is known as Write Amplification.  This is one of the reasons why writes are so much slower than reads on an SSD.  Another unique problem associated with SSDs is that each memory cell has a limited number of times that a memory cell can be written to before that memory cell is no longer usable.  When too many memory cells are considered invalid the whole unit becomes unusable.  While TRIM, wear leveling technologies, and garbage collection routines have been developed to help with this behavior, they are not able to run in real-time and therefore are only playing catch-up instead of being focused on the kind of preventative measures that are needed the most.  In fact, these advanced technologies offered by SSD manufacturers (and within Windows) do not prevent or reverse the effects of file and free space fragmentation at the NTFS file system level.

The only way to eliminate this surplus of small, tiny writes and reads that (1) chew up performance and (2) shorten lifespan from all the wear and tear is by taking a preventative approach that makes Windows “smarter” about how it writes files and manages free space, so more payload is delivered with every I/O operation. That’s exactly why more users run Condusiv’s Diskeeper® (for physical servers and workstations) or V-locity® (for virtual servers) on systems with SSD storage. For anyone who questions how much value this approach adds to their systems, the easiest way to find out is by downloading a free 30-day trial and watch the “time saved” dashboard for yourself. Since the fastest I/O is the one you don’t have to write, Condusiv software understands exactly how much time is saved by eliminating multiple, fractured writes with fewer, larger contiguous writes. It even has an additional feature to cache reads from idle, available DRAM (15X faster than SSD), which further offloads I/O bandwidth to SSD storage. Especially for businesses with many users accessing a multitude of applications across hundreds or thousands of servers, the time savings are enormous.

 

ATTO Benchmark Results with and without Diskeeper 16 running on a 120GB Samsung SSD Pro 840. The read data caching shows a 10X improvement in read performance.

Setting the Record Straight - Windows 7 Fragmentation, SSDs, and You

by Howard Butler 21. January 2012 14:50

In today’s well connected world of electronics and instant communications I received a text from a friend asking if I had seen the recent PC World magazine (February, 2012).  He said it had some tidbit of information concerning one of my favorite subjects; system performance, defragmentation, and SSDs.  I located a copy here at the office and found the article. As I read the first line I realized the debate on the virtues of defragmentation especially on SSDs will be one that goes on indefinitely as no one really talks about the issue with supporting hard facts and numbers.  Most articles are rehashing ideas and opinions long since debunked.  They continue to surface because very few truly understand the intricacies of the Windows NTFS file system and that of the storage media, whether it is rotating magnetic hard disks or electronic solid state disks.

So let’s set the record straight… Fragmentation is exponentially more of a problem with today’s data explosion. Defragmenting once a week will still cause the user to experience slowdowns from the degradation effects and doesn’t address the issue when files are initially being written.  And yes, never do a traditional defrag on SSDs.

NTFS file and free space fragmentation happens far more frequently than you might guess.  It has the potential to happen as soon as you install the operating system.  It can happen when you install applications or system updates, access the internet, download and save photos, create e-mail, office documents, etc…  It is a normal occurrence and behavior of the computer system, but does have a negative effect on over all application and system performance.  As fragmentation happens the computer system and underlying storage is performing more work than necessary.  Each I/O request takes a measurable amount of time.  Even in SSD environments there is no such thing as an “instant” I/O request.  Any time an application requests to read or write data and that request is split into additional I/O requests it causes more work to be done.   This extra work causes a delay right at that very moment in time.  Whoever thought that defragmenting once a month or weekly was good enough, simply didn’t understand fragmentation.

Disk drives have gotten faster over the years, but so have CPUs.  In fact, the gap between the difference in speed between hard disks and CPU has actually widened.  This means that applications can get plenty of CPU cycles, but they are still starving to get the data from the storage.  What’s more, the amount of data that is being stored has increased dramatically.  Just think of all those digital photos taken and shared over the holidays.  Each photo use to be approximately 1MB in size, now they are exceeding 15MB per photo and some go way beyond that.  Video editing and rendering and storage of digital movies have also become quite popular and as a result applications are manipulating hundreds of Gigabytes of data.  With typical disk cluster sizes of 4k, a 15MB size file could potentially be fragmented into nearly 4,000 extents.  This means an extra 4,000 disk I/O requests are required to read or write the file.  No matter what type of storage, it will simply take longer to complete the operation.

Suppose I chose to do some editing of my family videos on Tuesday evening.  Even the built-in defragmentation tool in Windows 7 doesn’t do me much good because it isn’t schedule to run until Wednesday morning at 1:00am.  This also means that quite a bit of fragmentation has built up since the previous week when it last ran.  Maybe I’ll manually run it, but that can take quite a while and I’ve wasted time that I would have rather spent on my project.  Unfortunately, the Windows built-in defragmentation utility doesn’t prevent fragmentation so even after running it manually; I still will wind up with fragmentation and slow access speed of my newly created files. 

I’ve often thought about why Wednesday at 1:00am was chosen as the time to schedule defragmentation.  Why isn’t it scheduled all the time?   It is because there could be system resource conflicts that either interfere with getting the task done or the defragmentation process has difficulty throttling back under a variety of conditions.  Regardless, this wait a week to clean up fragmentation doesn’t really help me when I need it most.

As pointed out in the article, the built-in defragmenter does not have the technology advancement to properly deal with fragmentation and SSDs. The physical placement of data on an SSD doesn’t really matter like it does on regular magnetic HDDs.  With an SSD there is no rotational latency or seek time to contend with.  Many experts assume that fragmentation is no longer a problem, but the application data access speed isn’t just defined in those terms.  Each and every I/O request performed takes a measurable amount of time.  SSD’s are fast, but they are not instantaneous.  Windows NTFS file system does not behave any differently because the underlying storage is an SSD vs. HDD and therefore fragmentation still occurs.  Reducing the unnecessary I/O’s by preventing and eradicating the fragmentation reduces the number of I/O requests and as a result speeds up application data response time and improve the overall lifespan of the SSD.  In essence, this makes for more sequential I/O operations which is generally faster and outperforms random writes.

In addition, SSD’s require that old data be erased before new data is written over it, rather than just writing over the old information as with HDDs.  This doubles the wear and tear and can cause major issues with the speed performance and lifespan of the SSD.  Most SSD manufactures have very sophisticated wear-leveling technologies to help with this. The principle issue is write speed degradation due to free space fragmentation.  Small free spaces scattered across the SSD causes the NTFS file system to write a file in fragmented pieces to those small available free spaces.  This has the effect of causing more random I/O traffic that is slower than sequential operations.

I think I have clearly made my point….  The built-in defragmenter in Windows 7 is not a solution for neither the consumer/home user, nor the enterprise business user.  Data access speeds are far more critical in the business world where time is money.  In the enterprise environment there are generally many more files that are used by higher number of users that are accessing data across shared type of storage such as SAN.  Even virtual platforms benefit from the same points covered.  This opens the door and is the reason why robust solutions such as Diskeeper exist.  More data about Diskeeper and the superior technology it offers can be found at http://www.diskeeper.com.

Month List

Calendar

<<  November 2017  >>
MoTuWeThFrSaSu
303112345
6789101112
13141516171819
20212223242526
27282930123
45678910

View posts in large calendar