Condusiv Technologies Blog

Blogging @Condusiv

The Condusiv blog shares insight into the issues surrounding system and application performance—and how I/O optimization software is breaking new ground in solving those issues.

Diskeeper video on HotWyred

by Michael 25. June 2007 19:15
Derek was recently featured on BET's popular tech/computer TV show, HotWyred. You can view the 90-second clip here. If you've ever visited the Diskeeper booth at a trade show, you'll recognize the clear-case computer. We have two of them, and they use Raptor drives with a see-through cover. As you can imagine, they make for a pretty cool visual side-by-side comparison, where one of the computers has a fragmented disk and the other a defragmented one.



The Impact of Fragmentation on Flash Drives (iPods, Jump Drives, etc)

by Michael 22. June 2007 12:50

One of the questions that comes up on occasion is "should I defrag my iPod, my SD card, or my USB drive?". To answer that, let's first take a step back and note that these drives (also known by other names such as thumb drives, jump drives, solid state disks, etc.) are flash-based storage devices (the largest I've seen is 32GB). They are used in digital cameras under names such as SD card, CompactFlash, and Memory Stick. The iPod and other MP3 players have either miniature hard disk drives (typically called microdrives) in the larger-capacity models, or flash-based storage in the smaller 2GB-4GB models, such as the iPod Nano.

The exact nomenclature of a flash storage device depends on its interface. If it uses a USB interface, it is typically called a jump drive; if it uses a SATA/SCSI interface and is intended to replace a hard disk drive, it is called a Solid State Drive (SSD). Other flash devices include the aforementioned digital media cards such as Memory Stick and CompactFlash.

In a nutshell, flash-based disks do not use a spinning platter and can access data randomly without any performance penalty. That may seem to remove the benefit of defragmentation, and to a good degree it certainly mitigates the need. Flash and SSD devices are good at reading data, but not as good at writing it. The reason for the poorer write performance is that these (NAND-based) devices must erase the space used for new file writes immediately prior to writing the new data. This is known as erase-on-write or erase/write. Improvements in this area are coming (phase-change memory).

However, flash devices running the FAT or NTFS file system still fragment the same way a HDD would. Non-Windows products, like digital cameras and camcorders, use the FAT file system (FAT16 or FAT32, depending on the size of the drive), and FAT file systems are more susceptible to fragmentation than NTFS.
The greatest drawback of flash devices, from the perspective of fragmentation, is that they are slow at random write I/O. Here's a quick test you can do yourself to show that severe free space fragmentation on flash drives does affect performance. I ran it myself as a trial.

I took a brand-new Kingston 1GB DataTraveler Hi-Speed USB drive rated at 24MB/sec read and 10MB/sec write performance (per the manufacturer).

First I formatted the disk as FAT16 (you'll need to use FAT32 for drives over 2GB). Then, using a development testing tool from Diskeeper Corp, I fragmented the free space. I used Diskeeper 2007 to confirm the fragmentation, as well as DiskView (a more granular tool available from Microsoft, formerly Sysinternals). I created about 45MB of small files (16KB to 48KB in size) spread all across the flash disk.

I then grabbed the VMware Player install file (145MB), made five more copies of it, and zipped (WinZip) them into a single 846MB zip file. This file was kept on a separate spindle (a SATA disk) from the OS and paging files, to minimize variables in my timing tests.

I used a simple stopwatch to time how long it took to copy this file from the SATA disk to the USB flash drive with fragmented free space. It took 2:37 from start to finish.

I then reformatted the USB drive to FAT16 and rebooted the PC (just to make sure the cache was clear). I copied the same 846MB zip file from the same location over to the USB drive. This time the copy operation took 1:14, less than half the time it took when the free space was fragmented.

Deleting a large, fragmented file also takes a long time.
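If you'd rather not rely on a hand-held stopwatch, the timing step above can be scripted in a few lines of Python. This is just a sketch; the paths in the example are placeholders, not the original test machine's.

```python
import shutil
import time

def timed_copy(src: str, dst: str) -> float:
    """Copy src to dst and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    shutil.copy(src, dst)           # same operation as a drag-and-drop copy
    return time.perf_counter() - start

# Hypothetical usage, copying the test file to the USB drive:
# seconds = timed_copy("846mb_test.zip", "E:/846mb_test.zip")
# print(f"copy took {seconds:.2f} s")
```

Running it once against the fragmented drive and once after a reformat gives you the same comparison without the reaction-time error a stopwatch introduces.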

From a "scientific" perspective, the test should be run a few more times and averaged, but given that the difference was so significant, I personally did not feel the need to redo it. You can reverse the test order, and even use a program to zero out the flash drive, to eliminate any minor remaining variables. Anyone else is certainly welcome to give this a go for themselves.

I did test one more case, where I fragmented the free space into 24 even chunks, and found no difference in copy time. While severe free space fragmentation is an issue, mild free space fragmentation is not; it's the same concept as on a physical disk. And yes, the 846MB file was fragmented into 19 pieces.

To create the free space fragmentation (without the development tool I'm privileged to have access to), you can copy a large number of small files to the flash drive and then delete every other one, or use some other deletion pattern (vary between deleting every third, fourth, fifth... nth file). If you have some programming skill, this can be scripted fairly simply. Just make sure there is enough room left on the USB drive after fragmenting the free space to copy the same test file. The more severe the free space fragmentation, the longer the copy operation will take.
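The write-then-delete pattern described above is easy to script. Here's one way to do it in Python; the function name, file sizes, and counts are my own choices for illustration, not anything from the original test tool.

```python
import os

def fragment_free_space(mount_point: str, file_size: int = 32 * 1024,
                        count: int = 1000, delete_every: int = 2) -> None:
    """Fill a drive with many small files, then delete every Nth one,
    leaving the free space chopped into many small, scattered gaps."""
    paths = []
    for i in range(count):
        path = os.path.join(mount_point, f"filler_{i:04d}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(file_size))
        paths.append(path)
    # Deleting every other file (delete_every=2) leaves holes behind;
    # raise delete_every to make the fragmentation milder.
    for i, path in enumerate(paths):
        if i % delete_every == 0:
            os.remove(path)

# Hypothetical usage, aimed at the USB drive's mount point:
# fragment_free_space("E:/")
```

Vary `delete_every` (or randomize which files are removed) to reproduce the milder fragmentation patterns mentioned above.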

That said, how much this translates into actual usage depends. A real-world equivalent might be a digital camera or camcorder where you mix various-sized MPEGs and JPEGs, and use the device to delete some of those files from the drive. Unless you wipe the disk, the free space fragmentation will build up.

The test case I made up may be extreme enough to be unrealistic. I don't know what's "real-world", as I don't personally use flash drives that often, and even then my actual usage isn't likely to match yours. How often you should consider free space consolidation depends; my best guess is once every six months or so. Given the limited extent to which I use USB drives, and the fact that my 2GB MP3 player only ever gets minor and infrequent file changes, I doubt I'll personally ever need to worry about free space fragmentation.

PS: We've been working with several of the technology leaders in the Flash/SSD industry for some time. They have been kind enough to send us pre-release devices for our R&D efforts. Expect future innovations from Diskeeper Corporation and those industry partners to improve performance and reliability on these storage devices.


General | SSD, Solid State, Flash

The Most Popular PC Utility Ever?

by Michael 21. June 2007 15:14
I've added this entry at the request of our Channel staff. The blog's title comes from the eponymous Tech IQ magazine article. You can read the article here.



Diskeeper and SANs - A blog for Server Analysts/Administrators

by Michael 19. June 2007 19:10
SANs typically employ a clustered/SAN file system to pool disk arrays into a virtualized storage volume; Wikipedia calls this a "shared disk file system". This is not NTFS, but rather proprietary software provided by a SAN hardware or software vendor such as EMC (Celerra), LSI (StoreAge SVM), or VMware (VMFS). VMFS, for example, uses storage blocks between 1MB and 8MB. This file system essentially "runs on top of NTFS"; it does not replace it. Keep in mind that every file system is a "virtual" disk. Stacking one virtual component over another (i.e., one file system on top of another) is very doable and increasingly common.

What the vendor of a SAN file system does at "that" file system is irrelevant to what Diskeeper does. Claims that "you do not need to defragment" may be misunderstood and incorrectly taken to mean NTFS. It is very possible that you do not need to defragment the SAN file system; the expert on that file system, and the source from which you should get setup tips, best practices, and SAN I/O optimization methodologies, is that manufacturer. NTFS, however, still fragments, and fragmentation causes the Windows OS to "split" I/O requests for files sent into the SAN, creating a performance penalty.

You can measure this using Windows' built-in PerfMon tool: watch the Split IO/Sec counter. You can also use the average disk queue length, provided you account for the number of physical spindles. Diskeeper partner Hyper I/O has a much more advanced tool called hIOmon, and Iometer (formerly from Intel, now open source under the GNU Public License) can be used to stress-test mock environments for fragmentation.

Given that SANs are ONLY ever block-level storage, they do NOT know which I/Os relate to which files. Therefore they cannot intelligently spread the fragments of a file across multiple disks.
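As a toy illustration of why fragmentation inflates the request count that a split-I/O counter measures: each fragment of a file needs at least one disk request, no matter how small the fragment is. This sketch is a simplified model, not the actual NTFS I/O path, and the 64KB maximum transfer size is an assumption chosen for illustration.

```python
def io_requests_for_read(fragment_sizes_kb, max_io_kb=64):
    """Count the disk requests needed to read a file laid out in the
    given fragments: a request can cover at most max_io_kb of ONE
    fragment, so every fragment costs at least one request."""
    requests = 0
    for frag in fragment_sizes_kb:
        requests += -(-frag // max_io_kb)   # ceiling division
    return requests

# A contiguous 1024KB file vs. the same data in 64 scattered 16KB pieces:
print(io_requests_for_read([1024]))     # contiguous
print(io_requests_for_read([16] * 64))  # fragmented
```

In this model the contiguous file needs 16 requests while the fragmented one needs 64, even though the same amount of data is read; that multiplied request load is what gets spread non-optimally across the SAN's spindles.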
A whole mass of separate I/O writes/reads for fragmented files (which will almost certainly be interspersed with other simultaneous data writes/reads) will be non-optimally spread across the disks in the SAN storage pool (i.e., more fragments of a given file written to one disk rather than the data being spread evenly across all the disks). SAN file system vendors may offer optimization strategies that, over time, move data around the disks as the system learns that typical data requests (such as those for a fragmented file laid out poorly on the disks) are not properly load-balanced across SAN spindles. Generally speaking, the above holds true for disk striping (RAID) as well.

I've been working with numerous high-profile SAN vendors who are also Diskeeper partners. One for one, when I speak with SAN designers or developers, they agree that NTFS fragmentation IS an issue and that advanced defragmentation is important ("basic" defragmenters can actually cause problems). Later this year we'll publish a very in-depth technical paper, about 15 pages, on modern disk storage technologies, based on joint research projects with some of these industry partners. I'm busy working on the new Diskeeper and Undelete releases, but I'll post a link here when it is published.



Home Disasters - An Ounce of Prevention and Other Tips

by Michael 7. June 2007 16:05
I came across an interesting story on ComputerWorld today. It discusses the recovery of lost files in one of those "emergency" situations. The author recommends defragmenting and discusses the use of various file recovery tools in a modern-day spin on the "damsel in distress" drama. Reading the article, you really feel the anguish they went through; not something I'd ever want to experience.

As noted in the article, the system on which the events took place was Windows XP. Windows Vista (Business, Enterprise, and Ultimate), to its credit, does include a type of file backup solution using technology called Shadow Copy. Just as you would use a camera to take a snapshot of a visual image, Vista takes a snapshot of your data. More specifically, it takes point-in-time copies of your data so you can revert to them. While this can definitely help in a disaster circumstance like the one described in the article, it isn't as effective as the Diskeeper Corporation product Undelete, because Undelete is "event-based". That means Undelete captures EVERY change, not just changes on an occasional time basis; snapshot-type methods expose you to data loss in the gaps between snapshots. With Undelete pre-installed, the entire trying effort to restore lost photos depicted in the story would almost certainly have been averted.

The real key to Undelete is that it is really "data protection" more so than "file recovery". In the disaster events described in the article, a data protection technology would never have exposed the digital photo files to the possibility of being overwritten. Yes, Undelete does have some emergency file recovery features as well. One feature that Undelete does not include (because we concentrate on data protection) is what the author noted in the article as "raw" reads. There are a few fairly good tools on the market that do raw reads at fairly affordable prices (one is mentioned in the article).
One other such tool I can recommend is File Rescue Plus from SoftwareShelf (a Diskeeper Corporation reseller and close partner). Taking a step back to address some of the other comments in that story: the "protection" technology of Undelete also means that defragmenting (which typically increases the chance of file recovery) would not have the potential negative effect of overwriting the space that a deleted file used to occupy, because that space is now being protected.

And when I say this (I'm sure I'm preaching to the choir), you should have an automatic backup solution in place to ensure you have a duplicate copy of important data. Use another hard drive, or DVDs/CDs, to store copies. Don't leave this up to a manual, every-once-in-a-while-when-you-remember-to-do-it solution. Even a simple batch script that copies data from one drive to another is a start. While I already think a manual approach is a bad idea when it comes to defragmentation, it is a REALLY bad idea when it comes to backing up your data. One product I like (and their US office is down the street from my house) is NovaBACKUP from NovaStor.

One other personal recommendation is to never store data on your C: drive. I know this is kind of tough to overcome, because most PCs come from the manufacturer with one hard drive formatted into one "volume": the C: drive. If you aren't already familiar with it, educate yourself on "partitioning". There are numerous tools on the market to help with this, even if you purchased your new PC with a single 300GB C: drive. Personally, I partition my PC, separating the operating system from non-critical applications and data. I do store important apps on the first partition of a physical disk (for performance reasons), using separate physical disks / RAID with parity when possible. Taking it a step further, I also put the paging file on a separate physical disk from the OS.
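As an example of the "simple script that copies data from one drive to another" idea, here's a minimal Python sketch. The function name and folder paths are hypothetical; a real setup would point it at your data partition and a second physical disk, and run it on a schedule.

```python
import os
import shutil
import time

def mirror_backup(src_dir: str, dst_root: str) -> str:
    """Copy src_dir into a timestamped folder under dst_root and
    return the path of the new backup copy."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = os.path.join(dst_root, f"backup-{stamp}")
    shutil.copytree(src_dir, dst)   # recursive copy of the whole tree
    return dst

# Hypothetical usage, scheduled nightly via Task Scheduler or cron:
# mirror_backup("D:/Documents", "F:/Backups")
```

It's crude compared to a real backup product (no compression, no incremental copies), but even this beats a backup that only happens when you remember to do it.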
IT professionals who manage business servers practice this religiously. Also, in conjunction with a partitioning strategy, there are ways for system administrators to "hide" the system drive from non-administrative users on desktops and laptops. I've not seen this implemented that frequently in practice, but I still recommend it in a business setting (especially for roaming laptops that store local data).

While the author already described the fact that a C: drive is very active with operating system activity behind the scenes, the key reason, in my opinion, not to store data on the same volume as the operating system is more basic. If you're like me, you are constantly tweaking your computer or installing and uninstalling programs. If your operating system ever runs into a really serious issue and you need to rebuild/reinstall it, you are likely to overwrite the data you have stored on that drive. Pulling the hard drive out of one computer and daisy-chaining it to another to extract data is a real pain.

Apart from that primary reason, there are a number of other good reasons. Due to hard drive physics, it is better (for performance) to use more smaller-capacity drives than fewer larger-capacity drives: four 250GB drives are better than two 500GB drives, which are better than one 1TB drive. It is also safer from the standpoint of RAID with parity, to account for physical disk failures.

The one possible argument against partitioning, for the reason I noted, is the prevalence of free virtualization software, where you can do all the tweaking and experimenting in VMs instead. You'll just need to license more software. But then, of course, the best practice for VMs is to place them on their own volumes / physical disk drives anyway. Another option, in lieu of separating data onto dedicated volumes, is to take regular and full system "images" (e.g., Acronis, Ghost, etc.).
That works, but for overall practicality I prefer the solutions described. Those imaging solutions still make for good data backup solutions, though. In the end: make use of volume partitioning or multiple hard drives (as appropriate), use a good automatic backup/imaging solution, and use Diskeeper and Undelete. That combo will give you an excellent foundation for a reliable computing experience.

I'll end this blog with one last reference. TweakGuides has an excellent manual for advanced and novice users alike. It's put together by Windows guru Koroush Ghazi, and I highly recommend it.



