Saturday, October 15, 2011
Everyone knows that Linux developers like and support, in every possible way, a diversity of versions that do various mutually incompatible things, whether those things are really useful or not.
It is enough to look at the md-raid superblock, which is known to have at least three "just a little" different versions. Most of them differ only in where the superblock is located relative to the array contents (versions 0.90 and 1.0 place it near the end of the device, 1.1 at the very start, 1.2 at a 4 KB offset from the start). Such a difference can have unexpected consequences: say, you want to recover a NAS manually following the instructions at www.nasrecovery.info, but Linux cannot assemble the NAS because it does not see a single superblock.
In my opinion, those who develop Linux are just a little obsessed with the location of the superblock. As a consequence, with enviable regularity yet another version with a new superblock location is released.
Wednesday, September 21, 2011
Want to build an array?
If you are going to build some big storage, be aware of the following concerns:
- needed array size
- redundancy
- performance
- required ability to boot from the RAID array
- money you are going to spend.
Should you decide to create your own RAID, have a look at the RAID Tips at www.raidtips.com.
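To put the first two items in concrete terms, here is a small sketch (my own illustration, assuming n identical disks; the numbers are examples) of how raw capacity translates into usable space for the common RAID levels:

```python
# A minimal sketch, assuming n identical disks of equal size (values are examples).
def usable_capacity_tb(level: str, n_disks: int, disk_tb: float) -> float:
    """Usable capacity in TB for an array of n identical disks."""
    if level == "RAID0":                 # striping: all space usable, no redundancy
        return n_disks * disk_tb
    if level == "RAID1":                 # mirroring: one disk's worth of usable space
        return disk_tb
    if level == "RAID5":                 # one disk's worth of space goes to parity
        return (n_disks - 1) * disk_tb
    if level == "RAID6":                 # two disks' worth of space go to parity
        return (n_disks - 2) * disk_tb
    raise ValueError(f"unknown level: {level}")

for level in ("RAID0", "RAID1", "RAID5", "RAID6"):
    usable = usable_capacity_tb(level, n_disks=4, disk_tb=2.0)
    print(f"{level}: {usable:.1f} TB usable out of 8.0 TB raw")
```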
Sunday, August 28, 2011
URE values - real or not?
When reading the disk specifications published by vendors, one notices that the Unrecoverable Read Error (URE) rates they quote are often not realistic. The URE value is widely used to substantiate naive claims like "RAID 5 is dead by 2009" and to estimate the chance of a double failure in RAID 5. Such calculations worry people who build their own RAIDs.
In fact, the vendor URE data seems to be very far off the mark. Take the technical documentation on Hitachi's official website: it quotes a rather interesting URE figure for a 3 TB hard disk - 10^-14 errors per bit read. From that value, the probability of reading the drive from start to end without encountering a URE works out to:
(1 - 10^-14)^(8 × 3 × 10^12) ≈ 0.79
therefore, the probability that the disk fails to read at least one sector along the way is about 20%.
In other words, if you have a disk filled to capacity, there is supposedly a non-negligible chance (about 20%) that you will not be able to get all the data back off it. This is easily proven wrong by simple testing.
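For the skeptical, the arithmetic above is easy to reproduce; a quick sketch using the same assumed figures:

```python
# A quick check of the numbers above (a sketch, using the post's assumed figures).
import math

ure_per_bit = 1e-14                # vendor-quoted unrecoverable error rate per bit read
drive_bytes = 3 * 10**12           # 3 TB drive, decimal terabytes
bits = 8 * drive_bytes

p_clean_read = (1 - ure_per_bit) ** bits           # probability of a full read with no URE
print(f"P(no URE over full read) = {p_clean_read:.3f}")      # ~0.787
print(f"P(at least one URE)      = {1 - p_clean_read:.3f}")  # ~0.213, i.e. about 20%

# Same result via the usual exponential approximation:
print(f"approximation: {math.exp(-ure_per_bit * bits):.3f}")
```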
Monday, August 15, 2011
Random access time in RAIDs
The basic characteristics of data storage speed are:
- access time, defined as the delay between the moment a request is issued to the storage device and the moment the requested data begins to arrive;
- throughput, the sustained average transfer rate.
Access time for a regular disk includes the time to position the read head over the right track (the so-called seek time) and the time the drive needs to bring the sector under the head (the so-called rotational latency). No matter how many member disks there are in a RAID 0, there always exists a sector which happens to be the furthest away from the head and is not in the cache; for that sector the access time is the same as for a single drive. The only way to significantly decrease access time is to switch to an SSD.
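As a back-of-the-envelope illustration (my own sketch, with typical assumed seek times rather than measured figures), the average access time of a rotating disk is roughly the average seek time plus half a revolution - and the number of RAID 0 members appears nowhere in that formula:

```python
# Average access time for a rotating disk: seek time plus half a revolution.
# Seek times below are typical assumed values, not measurements.
def avg_access_time_ms(rpm: int, avg_seek_ms: float) -> float:
    half_revolution_ms = (60_000 / rpm) / 2   # average rotational latency in ms
    return avg_seek_ms + half_revolution_ms

for rpm, seek in ((5400, 12.0), (7200, 9.0), (15000, 3.5)):
    print(f"{rpm} rpm drive: ~{avg_access_time_ms(rpm, seek):.1f} ms per random access")
# An SSD has no moving parts, so its access time is a small fraction of a millisecond.
```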
P.S. One can easily measure access time (and other performance characteristics) using the free benchmark tool BenchMe.
Friday, August 5, 2011
New free benchmark tool
I wanted to check my RAID from the performance point of view, so I downloaded three benchmark utilities:
- HD Tune Pro
- Crystal Disk Mark
- BenchMe
CrystalDiskMark doesn't have an option to benchmark a physical storage device; it is limited to partitions only, which in my view is a drawback. The tool gives read/write speeds for various request counts and sizes. You can also benchmark either sequential or random read/write speed.
All tests let you vary the amount of data to be read.
And finally, meet one more benchmark tool, BenchMe, which is really easy-to-use benchmark software. Unlike CrystalDiskMark, it benchmarks physical devices rather than logical volumes. With it one gets various performance parameters such as linear read speed, the distribution of access times, and a list of drive features. IOPS (I/O operations per second) values are measured for queue depths of 1 and 32.
Thus, the conclusions of this benchmark software review are:
- HD Tune Pro is paid software with a lot of benchmark parameters of little practical use.
- CrystalDiskMark is a free tool with an inconvenient interface and no ability to benchmark physical storage devices.
- BenchMe is a free program that reports only the benchmark parameters you really need, in a convenient form.
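For the curious, here is a rough sketch of how such an access-time benchmark of a physical device can be done by hand (my own illustration, not BenchMe's code; the device path is hypothetical, and it needs Linux and root). It reads single blocks at random offsets with O_DIRECT so the page cache does not skew the timings:

```python
import mmap
import os
import random
import statistics
import time

DEVICE = "/dev/sdX"        # hypothetical: replace with a real disk; requires root, Linux-only
BLOCK = 4096               # read one block per sample
SAMPLES = 200

# O_DIRECT bypasses the page cache so we time the disk itself, not RAM.
# O_DIRECT needs an aligned buffer; an anonymous mmap is page-aligned.
buf = mmap.mmap(-1, BLOCK)
fd = os.open(DEVICE, os.O_RDONLY | os.O_DIRECT)
size = os.lseek(fd, 0, os.SEEK_END)

samples_ms = []
for _ in range(SAMPLES):
    offset = random.randrange(size // BLOCK) * BLOCK   # random block-aligned offset
    os.lseek(fd, offset, os.SEEK_SET)
    start = time.perf_counter()
    os.readv(fd, [buf])                                # read one block into the aligned buffer
    samples_ms.append((time.perf_counter() - start) * 1000)

os.close(fd)
print(f"mean access time:   {statistics.mean(samples_ms):.2f} ms")
print(f"median access time: {statistics.median(samples_ms):.2f} ms")
print(f"rough IOPS (QD=1):  {1000 / statistics.mean(samples_ms):.0f}")
```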
Friday, July 29, 2011
There exists a time-tested Data Recovery Law - always verify recovered data.
Many data recovery tutorials advise users to verify the retrieved data, but most people don't do it. So the following scenario may play out:
- you extract the files and documents, and the folder tree seems reasonably correct;
- the original disk is formatted, or its contents are destroyed in some other way;
- then you realize that the retrieved data was not recovered correctly.
The moral of the above is that you should verify the recovered data first, and only after checking start filling the original hard disk with new files.
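A simple first-pass check can even be scripted; here is a minimal sketch (my own, with a hypothetical path for the recovered copy) that flags files which are empty, unreadable, or entirely zero-filled - a common symptom of a bad recovery:

```python
from pathlib import Path

RECOVERED = Path("/mnt/recovered")       # hypothetical location of the recovered copy

def looks_suspicious(path: Path, probe: int = 64 * 1024) -> bool:
    """Rough heuristic: empty, unreadable, or all-zero files often indicate a bad recovery."""
    try:
        with open(path, "rb") as f:
            chunk = f.read(probe)
    except OSError:
        return True                       # a file that cannot even be read is suspicious
    return len(chunk) == 0 or chunk.count(0) == len(chunk)

suspects = [p for p in RECOVERED.rglob("*") if p.is_file() and looks_suspicious(p)]
print(f"{len(suspects)} file(s) deserve a closer look before you touch the original disk:")
for p in suspects[:20]:
    print("  ", p)
```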
Sunday, April 10, 2011
"You need to format the disk in drive X: before you can use it."
It may happen that you disconnect a storage device (say, a flash drive) without using the "Safely remove hardware" option, and the next time you connect it the OS says "You need to format the disk in drive X: before you can use it.".
Typically such behavior means you have a RAW file system: the original filesystem on the drive was corrupted because data still sitting in the write buffer was lost when the drive was unplugged.
You can quickly solve the problem by simply formatting the drive, but be aware that, depending on the type of format you use, the data may be gone forever. If the data is important, retrieve it first and only then format the drive. Data recovery from a RAW filesystem is similar to an unformat and is easy to do - just download some data recovery software.
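To illustrate why "Safely remove hardware" matters, here is a minimal sketch (my own, with a hypothetical file path): an application's writes sit first in its own buffer and then in the OS write cache, and only an explicit flush and sync guarantees they reach the device before you pull it out.

```python
import os

path = "/media/usb/report.tmp"     # hypothetical file on the removable drive

with open(path, "wb") as f:
    f.write(b"important data")
    f.flush()                      # push the application buffer down to the OS
    os.fsync(f.fileno())           # ask the OS to push its write cache to the device
# "Safely remove hardware" performs this kind of flush for everything still
# cached by the OS, including filesystem metadata - skip it, and the metadata
# on the stick may end up half-written, which is how you get a RAW filesystem.
```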
Monday, March 28, 2011
Make a software RAID bootable?
Can one create a bootable software RAID 0, RAID 5, or JBOD containing a Windows installation?
The answer is no. One cannot run an operating system from a software RAID 0, RAID 5, or spanned volume.
A hardware RAID controller is required to make such a RAID bootable.
It is not possible to boot off a software RAID because the RAID is not readable until the operating system is fully loaded, and the OS itself is on the RAID.
The only exception is software RAID 1, and even that is not trivial: to boot from the "second" hard drive of a software RAID 1, you probably need to copy the MBR manually first.
Tuesday, March 8, 2011
Hard Drives with Full Encryption
Certain modern hard drives have built-in hardware-based 256-bit AES encryption.
Surprisingly, the content on these drives is encrypted even if no password has been set. If the encryption chip quits, the cipher key is lost and the data cannot be recovered, even though the storage itself is fine. Considering that in real life a failure of the encryption chip is more probable than the drive falling into enemy hands, the always-on encryption is likely not a very bright idea.
Why the heck did they do it this way? The rationale is the speed of a password change. With a "no password = no encryption" policy, setting or changing the password would require re-encrypting the full capacity of the disk, which takes hours - and that is before we even get into complications like repeated power failures in the middle of re-ciphering. The same consideration applies when the password is removed.
So the engineers implemented the faster option. The master encryption key, which is actually used to encrypt the data, is generated during production and flashed into the controller's NVRAM. All data on the disk is encrypted with this master key, whether or not the user sets a password. If the user sets a password, the master key itself is encrypted with that password. Since the contents of the drive are encrypted from the start, you cannot read the data without the master key, and the key is not accessible unless you have the correct password.
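To make the scheme concrete, here is a toy sketch of the key-wrapping idea (my own illustration, not the actual drive firmware; it uses the third-party Python cryptography package): the data is always ciphered with a fixed master key, and the password merely wraps that key, so changing the password rewrites a few dozen bytes instead of the whole disk.

```python
import hashlib
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

master_key = AESGCM.generate_key(bit_length=256)     # created once, "at the factory"

def encrypt_sector(data: bytes) -> bytes:
    """Every sector is ciphered with the master key, password or not."""
    nonce = os.urandom(12)
    return nonce + AESGCM(master_key).encrypt(nonce, data, None)

def wrap_master_key(password: str) -> bytes:
    """Setting or changing the password only re-encrypts the 32-byte master key."""
    salt = os.urandom(16)
    kek = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    nonce = os.urandom(12)
    return salt + nonce + AESGCM(kek).encrypt(nonce, master_key, None)

sector = encrypt_sector(b"user data" * 50)           # data is encrypted from day one
wrapped = wrap_master_key("correct horse battery staple")
print(f"{len(sector)} bytes of data stay untouched; only {len(wrapped)} bytes change on a password change")
```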
Now, if the encryption module burns out, the data is not accessible at all.
These drives are often used in external enclosures and laptops (anticipating a higher probability of actually losing the drive compared to an internal desktop drive), forming a special class of devices in addition to this list. Such external drives are fairly hard to recover.
Monday, February 14, 2011
Recovery of laptops supplied without a recovery disc
Laptop makers are divided into those who give users a recovery CD (which often gets lost) and those who do not. Vendors who don't provide a recovery disc usually place the original Windows installation in a special invisible partition, along with a program that can deploy it.
Such a partition is placed at the end of the drive and is made invisible to the operating system by means of the HPA (Host Protected Area).
The recovery process on a laptop with a Host Protected Area goes roughly like this:
- you press a certain key combination when the system starts up;
- the BIOS temporarily discards the HPA;
- the OS is loaded from the now-visible recovery partition;
- a factory reset utility runs from this partition; it reformats the hard disk and copies the original Windows installation to the drive;
- once the recovery process is finished, the HPA is set again.
Sunday, January 16, 2011
What is hard drive capacity clipping?
This post deals mostly with legacy hardware, something you regret not having tossed into the garbage years ago.
The maximum size restriction existed because the older LBA standard, LBA28, allowed only 28 bits to address a sector. Therefore, a hard disk was limited to 2^28 = 268,435,456 addressable sectors. Given a sector size of 512 bytes, that works out to precisely 128 binary gigabytes.
Note that 128 GB (137,438,953,472 bytes) is in fact a little more than 137 billion bytes. Drive makers, unlike programmers, use decimal units when speaking about hard disk size. Because of this, drive vendors can sell a 128 GB drive as having a size of 137 GB (see, e.g., this post).
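The arithmetic is easy to check with a couple of lines (a quick sketch):

```python
SECTOR_SIZE = 512                 # bytes per sector
max_sectors = 2 ** 28             # LBA28: 28-bit sector addresses

limit_bytes = max_sectors * SECTOR_SIZE
print(max_sectors)                # 268435456 addressable sectors
print(limit_bytes)                # 137438953472 bytes
print(limit_bytes / 2 ** 30)      # 128.0 "programmer" (binary) gigabytes
print(limit_bytes / 10 ** 9)      # ~137.4 "vendor" (decimal) gigabytes
```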
You can run into the 128 GB limit if the motherboard is old (something 2002-ish), or if the OS is not up to date (say, Windows 2000 or XP prior to Service Pack 2).
The corrective actions are:
- Flashing the latest available motherboard BIOS. This may or may not work, depending on the specific mainboard.
- Updating the OS. If the other components are up to speed, this solves the problem in most cases.
The HPA is a hard drive feature that allows part of the drive's capacity to be hidden from the operating system.
This provides some compatibility with legacy computers, because it lets you use a modern hard disk in an old PC that cannot handle large drives.
If turned on inadvertently, the HPA fools the operating system into believing that the hard drive is smaller than it really is. If this is not desired, you can reset the HPA, usually with vendor-supplied software.