Condusiv Technologies Blog

Blogging @Condusiv

The Condusiv blog shares insight into the issues surrounding system and application performance—and how I/O optimization software is breaking new ground in solving those issues.

How NTFS Reads a File

by Michael 17. March 2011 11:38

When Windows NT 4.0 was released, Diskeeper 2.0 hit the market. NT 4.0 had limitations on the types of data that could be safely moved while the system was online. So a market-first innovation we introduced with Diskeeper 3.0 was what we called Boot Time Defragmentation, which handled these special data types during the computer boot process, when it was safe to do so. The files that Diskeeper optimized included metadata (data which "lives above" your data), directories (folders), and the paging file.

Metadata are special files that the NTFS file system driver uses to manage an NTFS volume. The best-known piece of metadata is the MFT (Master File Table), a special file typically consisting of 1024-byte records. Each file or directory on the volume is described by at least one of these MFT records. It may take several MFT records to fully describe a file, especially if it is badly fragmented or compressed; a 271MB compressed file can require over 450 MFT records!

Defragmenting the MFT, data files, and folders is vital for optimal performance. The example below, showing what occurs when NTFS reads the one-cluster file \Flintstone\Barney.txt, makes that case.
 
1. The volume's boot record is read to get the cluster address of the first cluster of the MFT.
2. The first cluster of the MFT is read, which is used to locate all of the pieces of the MFT.
3. MFT record 5 is read, as it is predefined to be the MFT record of the root directory.
4. Data cluster 0 of the root directory is read in and searched for "Flintstone".
5. If "Flintstone" is not found, at least one other data cluster of the root directory must be read to find it.
6. The MFT record for the "Flintstone" directory is read in.
7. Data cluster 0 of the "Flintstone" directory is read in and searched for "Barney.txt".
8. If "Barney.txt" is not found, at least one other data cluster of the "Flintstone" directory must be read to find it.
9. The MFT record for the "Barney.txt" file is read in.
10. Data cluster 0 of the "Barney.txt" file is read in.
 
This is a worst-case scenario. It presumes the volume is not yet mounted, so the NTFS cache is empty at step 1 and the MFT itself needs to be located. But it shows how many I/Os are required to reach a file that is only one level removed from the root directory: 10. Each of those 10 I/Os requires a head movement, and any fragmentation along that path only increases the number of disk I/Os required to access the data, slowing the whole process down.

And if you follow the step-by-step I/O sequence outlined above, you'll see that every new directory encountered in the path adds another two or three I/Os; the rough model sketched below illustrates how the count grows with path depth. For obvious performance reasons it is beneficial to keep the depth of your directory structure to a minimum. It also makes the importance of defragmenting these special file types quite obvious.
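
To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python (not NTFS code; the function name and parameters are hypothetical) that tallies the worst-case, cold-cache I/O count for a path lookup, following the step sequence above:

def worst_case_ios(path, extra_dir_cluster_reads=1):
    """Estimate worst-case disk I/Os to open a file by path on a cold,
    unmounted NTFS volume, per the 10-step sequence above (illustrative model)."""
    parts = [p for p in path.strip("\\").split("\\") if p]
    directories, file_name = parts[:-1], parts[-1]

    ios = 2                                   # boot record + first MFT cluster
    # Root directory plus each directory in the path: one MFT record read,
    # data cluster 0, and possibly extra cluster reads while searching.
    for _ in ["<root>"] + directories:
        ios += 1 + 1 + extra_dir_cluster_reads
    ios += 2                                  # the file's MFT record + its data cluster 0
    return ios

print(worst_case_ios("\\Flintstone\\Barney.txt"))   # -> 10, matching the walkthrough

Under this simple model, each additional directory level in the path costs another two to three I/Os, which is exactly why shallow directory trees and defragmented metadata pay off.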

As Windows progressed with newer iterations, many of the files that once required offline defragmentation became safe to defragment online (including the MFT and directories), so while Boot Time defragmentation still exists today, the need to run it has diminished. As a great deal of metadata is typically cached from boot to shutdown, perhaps the last remaining system file that is vital to defragment "offline" is the paging file. We've heard arguments over the years that, due to the random nature of the data in the paging file, defragmenting it was not valuable, but anyone who has cleaned up a badly shredded paging file will tell you otherwise.

Tags:

Defrag | Diskeeper

Diskeeper Receives U.S. Army Certificate of Networthiness

by Colleen Toumayan 9. March 2011 04:10

Diskeeper Corporation, Innovators in Performance and Reliability Technologies, announced that its Diskeeper 2010 performance software has received the Certificate of Networthiness (CoN) from the U.S. Army Network Enterprise Technology Command. The Certificate of Networthiness signifies that Diskeeper performance software is configured to the current Army Golden Master baseline and complies with all U.S. Army and Department of Defense (DoD) standards for security, compatibility, and sustainability. A CoN is required for all enterprise software products deployed in the Army Enterprise Infrastructure Network and used by the U.S. Army, the National Guard, the Army Reserve, and DoD organizations that use the Army Enterprise Infrastructure.

Tags:

Defrag | Diskeeper | SAN

Additional benefits from our products

by Gary Quan 3. March 2011 08:06

As the CTO, one of the biggest joys I get is hearing back from customers on the benefits that our products have provided them. I also find it very interesting how users find other benefits that we had not intentionally planned for the product.

For example, one IT Manager would periodically execute a CHKDSK command on all of the disks of his users' Windows systems. The CHKDSK command checks file system integrity and can fix some logical errors if found. The IT Manager did this as a preventive measure, to catch and fix any logical disk errors before they could cause serious problems. It is a tedious process and can require a reboot in some cases, especially if executed on the system disk.

This IT Manager found an easy way to get it accomplished with Diskeeper. One Diskeeper feature is Boot-Time Defragmentation, which defragments files that cannot be safely defragmented while Windows is running, so it runs during the boot-up process. With Diskeeper, a user can schedule when Boot-Time Defragmentation will occur and also add the option to perform a CHKDSK as part of the process. The IT Manager used this feature to schedule the CHKDSK command to run at a time that would not impact the users, without having to be there to perform it manually. At the same time, he also got the benefit of Boot-Time Defragmentation.

If you have found other uses for our products, please send feedback so I can hear about them. Use the ‘Diskeeper Feedback’ feature in the Diskeeper product, under the ‘Action’ menu.

Gary Quan

CTO – Diskeeper Corporation

Tags:

Diskeeper

IBM's Watson would get this one right too...

by Michael 18. February 2011 07:54

On April 10, 2002, Diskeeper enjoyed 15 seconds of game-show fame on Jeopardy! with the answer: "Diskeeper is software to do this, reorganize your computer’s files." The contestant won $2,000 by correctly providing the question, "What is defragment?"

 

Tags:

Defrag | Diskeeper | Diskeeper TV

Do you need to defragment a Mac?

by Michael 2. February 2011 05:54

The purpose of this blog post is to provide some data about fragmentation on the Mac that I've not seen researched or published elsewhere.

Mac OS X has a defragmenter built into the file system itself. Given that the relevant file system code is open source (as part of Darwin), we looked at the code.

During a file open, the file gets defragmented on the fly if all of the following conditions are met (a rough sketch of this check follows the list):

1. The file is less than 20MB in size.

2. There are more than 7 fragments.

3. The system has been up for more than 3 minutes.

4. It is a regular file.

5. The file system is journaled.

6. The file system is not read-only.
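
For illustration only, here is a minimal Python sketch of that decision; the types, field names, and constants (FileInfo, VolumeInfo, fragment_count, and so on) are assumptions standing in for the actual kernel data structures, not the real HFS+ source:

from dataclasses import dataclass

MAX_SIZE_BYTES  = 20 * 1024 * 1024   # condition 1: under 20MB
MIN_FRAGMENTS   = 8                  # condition 2: more than 7 fragments
MIN_UPTIME_SECS = 3 * 60             # condition 3: up for more than 3 minutes

@dataclass
class FileInfo:
    size_bytes: int
    fragment_count: int
    is_regular: bool

@dataclass
class VolumeInfo:
    is_journaled: bool
    is_read_only: bool

def defrag_on_open(f: FileInfo, vol: VolumeInfo, uptime_secs: float) -> bool:
    """Return True if the file would be defragmented during open,
    per the six conditions listed above (illustrative model only)."""
    return (f.size_bytes < MAX_SIZE_BYTES
            and f.fragment_count >= MIN_FRAGMENTS
            and uptime_secs > MIN_UPTIME_SECS
            and f.is_regular                   # condition 4: a regular file
            and vol.is_journaled               # condition 5: journaled file system
            and not vol.is_read_only)          # condition 6: not read-only

# Example: a 5MB regular file in 9 pieces on a journaled, writable volume
print(defrag_on_open(FileInfo(5 * 1024 * 1024, 9, True),
                     VolumeInfo(True, False), uptime_secs=600))   # -> True

Note how narrow the window is: only small, noticeably fragmented files on a journaled, writable volume are touched, and only once the system has been up for a few minutes.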

So what's Apple's take on the subject? An Apple technical article states this:

Do I need to optimize?

You probably won't need to optimize at all if you use Mac OS X. Here's why:

  • Hard disk capacity is generally much greater now than a few years ago. With more free space available, the file system doesn't need to fill up every "nook and cranny." Mac OS Extended formatting (HFS Plus) avoids reusing space from deleted files as much as possible, to avoid prematurely filling small areas of recently-freed space.
  • Mac OS X 10.2 and later includes delayed allocation for Mac OS X Extended-formatted volumes. This allows a number of small allocations to be combined into a single large allocation in one area of the disk.
  • Fragmentation was often caused by continually appending data to existing files, especially with resource forks. With faster hard drives and better caching, as well as the new application packaging format, many applications simply rewrite the entire file each time. Mac OS X 10.3 Panther can also automatically defragment such slow-growing files. This process is sometimes known as "Hot-File-Adaptive-Clustering."
  • Aggressive read-ahead and write-behind caching means that minor fragmentation has less effect on perceived system performance.

For these reasons, there is little benefit to defragmenting.

Note: Mac OS X systems use hundreds of thousands of small files, many of which are rarely accessed. Optimizing them can be a major effort for very little practical gain. There is also a chance that one of the files placed in the "hot band" for rapid reads during system startup might be moved during defragmentation, which would decrease performance.

If your disks are almost full, and you often modify or create large files (such as editing video, but see the Tip below if you use iMovie and Mac OS X 10.3), there's a chance the disks could be fragmented. In this case, you might benefit from defragmentation, which can be performed with some third-party disk utilities. 
 

Here is my take on that information:

While I have no problem with the lead-in, which says probably, the reasons given are theoretical. Expressing theory and then an opinion on that theory is fine, so long as you properly indicate it is an opinion. The problem I do have is with the last sentence before the note, "For these reasons, there is little benefit to defragmenting." More plainly: that is passing off theory as fact.

Theory, and therefore "reasons," needs to be substantiated by actual scientific processes that apply the theory and then either validate or invalidate it. Common examples we hear of theory-as-fact are statements like "SSDs don't have moving parts and don't need to be defragmented." Given that our primary business is large enterprise corporations, we hear a lot of theory about the need (or lack thereof) to defragment complex and expensive storage systems. In all those cases, testing proves that fragmentation (of files, free space, or both) slows computers down. The reasons sound logical, which dupes readers and listeners into believing the statements are true.

On that note, while the first three reasons are logical, the last "reason" is most likely wrong. Block-based read-ahead caching is predicated on files being sequentially located/interleaved on the same disk "tracks"; file-based read-ahead would still have to issue additional I/Os due to fragmentation. Fragmentation of data essentially breaks read-ahead efforts. Could the Mac be predicting file access and pre-loading files into memory well in advance of use? Sure. If that's the case I could agree with the last point (i.e., "perceived system performance"), but I find it unlikely (anyone reading this is welcome to comment).

They do also qualify the reason by stating "minor fragmentation", to which I would add that minor fragmentation on Windows may not have a "perceived" impact either.

I do agree with the final statement that "you might benefit from defragmentation" when using large files, although I think might is too indecisive.

Where my opinion comes from:

A few years ago (spring/summer of 2009) we did a research project to understand how much fragmentation existed on Apple Macs. We wrote and sent out a fragmentation/performance analysis tool to select customers who also had Macs at their homes or businesses. We collected data from 198 volumes on 82 Macs (OS X 10.4.x and 10.5.x). Thirty of those systems had been in use for one to two years.

While system specifics are confidential (testers provided us the data under non-disclosure agreements), we found that free space fragmentation was particularly bad in many cases (worse than on Windows). We also found an average of a little over 26,000 fragments per Mac, with an average expected performance gain from defragmentation of about 8%. Our research also found that the more severe cases of fragmentation, where we saw 70,000 to 100,000+ fragments, were on volumes low on available free space (substantiating that last paragraph in the Apple tech article).

This article also provides some fragmentation studies as well as performance tests. Its data also validates Apple's last paragraph and makes the "might benefit" statement a bit of an understatement.

Your Mileage May Vary (YMMV): 

So, in summary, I would recommend defragmenting your Mac. As with Windows, the benefit from defragmenting is proportionate to the amount of fragmentation. Defrag will help; the question is "does it help enough to be worth the time and money?" The good thing is that most Mac defragmenters, just like Diskeeper for Windows, have free demo versions you can trial to see if it's worth spending the money.

 

Here are some options: 

+ iDefrag (used by a former Diskeeper Corp employee who did graphic design on a Mac)

+ Drive Genius suite (a company we have spoken with in the past)

+ Stellar Drive Defrag (relatively new)

Perhaps this article begs the question (and rumor) "will there be a Diskeeper for Mac?", to which I would answer "unlikely, but not impossible". The reason is that we already have a very full development schedule, with opportunities in other areas that we plan to pursue.

We are keeping an "i" on it though ;-).
