thrash - hard disk stress tester with random seek speed statistics
thrash [-v] [-c count] [-b bsize] [-s sizeK] [-r seed] [-n] [-f] [-i] [-a] [-V] [-w] <device>
thrash is intended to exercise a hard disk's seek mechanism and to perform a random seek speed benchmark. It is largely targeted at new hard disks that should be exercised to encourage early failure (i.e. assuming a bathtub curve for failures). This is achieved by moving the drive head to some random location, reading (typically) 512 bytes and then moving to another random location. Caching is turned off by default, so each seek+read request should lead to the drive head physically moving. Just how many seeks are needed to move the drive to the bottom of the bathtub curve is an open question. Estimating an (optimistic?) HD life of 5 years, with a gentle 2 seeks/second, we would expect the drive to seek at least 315 million times in its lifetime. A heavily used system might expect a HD life of 5 years, but average 40 seeks per second, leading to 6300 million seeks. All this is obviously guesswork. Hard disk datasheets say we can expect 1 in 10^14 reads to fail. This is still about 16000 times greater than the 6300 million seeks figure, so it seems to be an extreme upper limit. Given all this, a few million seeks appears to be a good starting figure: 5 million, to pick a number, which would probably amount to running thrash overnight.
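The estimates above can be reproduced with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, a year is assumed to be 365 days, and the 130 seeks/second rate used for the burn-in time is the Raptor figure quoted elsewhere in this page.

```python
# Back-of-envelope lifetime seek counts, using the figures from the text.
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: ignore leap years


def lifetime_seeks(years, seeks_per_second):
    """Total seeks over a drive's life at a constant average seek rate."""
    return years * SECONDS_PER_YEAR * seeks_per_second


def burn_in_hours(seeks, seeks_per_second):
    """Hours needed to perform `seeks` seeks at a measured seek rate."""
    return seeks / seeks_per_second / 3600


gentle = lifetime_seeks(5, 2)   # 315,360,000, the "315 million" figure
heavy = lifetime_seeks(5, 40)   # 6,307,200,000, the "6300 million" figure
ratio = 10**14 / heavy          # how far the datasheet figure sits above that

print(gentle, heavy, round(ratio))
print(round(burn_in_hours(5_000_000, 130), 1), "hours for 5 million seeks")
```

At slower seek rates the same 5 million seeks take proportionally longer, so "overnight" is a rough guide rather than a fixed duration.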
When the given number of seeks has been completed, some timing statistics are shown. For performance, the seeks per second figure is the most important. An old 15 gig IDE drive would execute around 50 seek+512 byte reads per second. A more modern 80 gig IDE drive would read around 80 per second, more still if the effective size was reduced to 15 gig in the name of fairness. A 74 gig Raptor drive would read around 130 per second, again more if the effective size was reduced to 15 gig, the lowest common denominator in this discussion. Compact flash (which has a built in PIO based IDE interface) will return seek rates of around 530/second. A USB stick (which isn't limited to PIO but does have the USB protocol overhead) will return seek rates of around 1131/second.
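The seek+read loop and the seeks per second figure can be sketched as follows. This is not thrash's source code: seek_benchmark and its parameters are hypothetical names, and the sketch does not use O_DIRECT, so the page cache may satisfy some reads and inflate the result.

```python
import os
import random
import time


def seek_benchmark(path, count=500, bsize=512, size=None, seed=None):
    """Do `count` random seek+read operations; return seeks per second.

    Unlike thrash, this sketch reads through the page cache (no O_DIRECT).
    """
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        if size is None:
            # Seeking to the end reports the size of a file or block device.
            size = os.lseek(fd, 0, os.SEEK_END)
        blocks = size // bsize
        if blocks < 1:
            raise ValueError("target is smaller than one block")
        start = time.perf_counter()
        for _ in range(count):
            # Pick a random block-aligned offset, seek there, read one block.
            offset = rng.randrange(blocks) * bsize
            os.lseek(fd, offset, os.SEEK_SET)
            os.read(fd, bsize)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return count / elapsed if elapsed > 0 else float("inf")
```

Calling seek_benchmark on a block device normally requires root privileges. On a small regular file the result mostly measures system call overhead, not head movement.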
Note that seek rate is only one side of the performance issue, if probably the most important. Once the head has moved to the right location, the data needs to be transferred to the host. This continuous read rate can be measured with
hdparm -t. The old 15 gig IDE drive would transfer around 20 megabytes per second, a more modern 80 gig IDE drive would transfer around 55 megabytes per second. A 10k RPM Raptor would transfer 67 megabytes per second. Note this is at the start of the disk; transfer rates will fall as the head moves towards the center of the platter. Compact Flash will transfer around 4.5 megabytes per second and a USB2 stick will transfer around 5.6 megabytes per second. Also note that the transfer rate (and seek rate) is partially dependent on the chipset and overall speed of the host system, so your mileage will vary.
Use short seeks to increase the seek rate. This is achieved by reducing the device size to 10% of its original size. This is a heuristic; if the HD sounds as if it isn't seeking (because the seeks are too close together), use
-s instead of
Block read size, defaults to 512 bytes. Typical values would be 512, 1024 and 4096 bytes. Note that this will change the seek rate. Also note that 512 bytes is the standard hard drive block size, but probably not your standard filesystem block size, which will be at least 1024 bytes.
Number of seek+read iterations, defaults to 100000. Suggest 500 to give some idea of the seek time, which can then be used to do timed thrash runs using
-c $[ 24*3600*<seekrate> ] to run for 24 hours.
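The count for a timed run is simply the measured seek rate multiplied by the desired duration in seconds. A minimal sketch, where count_for_duration is a hypothetical helper and the 80 seeks/second rate is only an example value:

```python
def count_for_duration(seek_rate, hours=24):
    """Number of seek+read iterations needed to run for `hours` hours."""
    return int(hours * 3600 * seek_rate)


# A drive measured at 80 seeks/second needs this many iterations for 24 hours.
print(count_for_duration(80))  # 6912000
```

The printed value is what would be passed to -c for that drive.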
Call FLUSHBUFS when thrash is first run. This will work when O_DIRECT is unsupported, but has the disadvantage that it runs only once when thrash begins. Thus it can be used to obtain a reasonable seek rate with
-c 500, but as the number of iterations grows, so too will the cache hit rate, leading to falsely inflated seek rates.
If you want to do a burn in test and are unable to use O_DIRECT,
-f -n is suggested, along with running:
while [ 1 ]; do blockdev --flushbufs <your device> ; sleep 10 ; done in another shell.
Ignore read errors and continue thrashing even when a
read() call fails. Mainly aimed at hard disks with bad blocks, where a random seek could try to read from a bad block and so stop thrash. Nowadays, hard disks with bad blocks are probably broken.
Don't use the O_DIRECT flag; instead use the default caching access method. This will speed up seek times by removing the need to seek, as some read requests will have been cached from previous reads. This will generate higher than actual seek rates, assuming you are measuring the HD speed. 2.4.x systems do not support O_DIRECT, so this option provides a way of disabling O_DIRECT rather than having thrash disable it. See --flushbufs.
Set the random number seed to the given seed value, between 0 and 32767. The default is to not set the seed and to use the system default; on my system that produces the same random number sequence on each run.
Manually sets the device size instead of automatically finding the device size, specified in kilobytes. Useful if automatic detection fails (which is unlikely) or to speed up the seek rate and so burn in the drive more quickly. When doing a HD comparison, it is unfair to seek over a big drive and compare it to a small drive. In reality, the big drive would have to seek less to read the same amount of data as the small HD. On the other hand, you would tend to use all of the bigger HD, so maybe the seektime of the whole HD is more relevant. See also -a.
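Since -s takes its size in kilobytes, a reduced effective size has to be converted before it is passed on the command line. The sketch below makes two assumptions that the text does not state: a kilobyte is 1024 bytes, and a "gig" is 10^9 bytes. Both helper names are hypothetical.

```python
def size_kb_for_percent(total_bytes, percent=10):
    """Kilobytes covering `percent` of the device (integer arithmetic)."""
    return total_bytes * percent // 100 // 1024


def common_size_kb(*device_bytes):
    """Kilobytes of the smallest device, for a like-for-like comparison."""
    return min(device_bytes) // 1024


GIG = 10**9  # assumption: decimal gigabytes, as drive vendors count them

print(size_kb_for_percent(80 * GIG))               # 7812500
print(common_size_kb(15 * GIG, 80 * GIG, 74 * GIG))  # 14648437
```

The second value is what each drive in the 15/80/74 gig comparison would be given as its -s size, so that all three seek over the same span.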
Set the verbosity level, the more
-v, the more verbose. This is for debugging, by default you see all the relevant information.
Shows the version of
Write blocks instead of reading them. ***ERASES DISK!!!*** The data written to the disk is random, based on whatever was in the buffer at the time of allocation. Writing to disk is more involved than reading, so this option should stress test a disk more thoroughly.
<device> can be any
seek()able regular file or block device. Usage statistics on your drives can be obtained from
/proc/diskstats (2.6.x) and
/proc/partitions (2.4.x with Partition statistics enabled in the kernel build). The meaning of the columns varies, the 4th column in
/proc/diskstats (disk lines, not partitions) is the number of input operations and the 5th column is the number of 512 byte blocks read. For lines in
/proc/partitions, the 5th column is the number of input operations and the 7th column is the number of 512 byte blocks read. The numbers for raid devices don't make any sense; it is suggested that you use both thrash and hdparm to see what numbers change and by how much.
The latest thrash is available from http://abatis.org.uk/thrash, with the archive http://abatis.org.uk/thrash-1.2.tar.gz.
TCQ and NCQ could have an effect on the seek rates and the amount of seeking that actually takes place. Depending on your OS, direct I/O might require the block size to be at least the size of the filesystem block size (e.g. 512) or the page size (e.g. 4096 or 8192); normally the block size is a multiple of 512. Try --nodirect if you suspect this to be the problem.
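Before blaming the drive, it can help to check a chosen block size against the constraints mentioned above. This is a rough sketch only: direct_io_checks is a hypothetical helper, the real requirements depend on the OS and filesystem, and the page size is taken from the machine running the script.

```python
import mmap


def direct_io_checks(bsize, sector=512, page=mmap.PAGESIZE):
    """Report which common direct I/O block size constraints `bsize` meets."""
    return {
        "multiple_of_sector": bsize > 0 and bsize % sector == 0,
        "at_least_page": bsize >= page,
        "multiple_of_page": bsize > 0 and bsize % page == 0,
    }


print(direct_io_checks(512))
```

If a block size fails one of these checks and thrash reports read errors, a larger block size or --nodirect is worth trying.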
Written by Greg: greg at csc liv ac uk. Would welcome any comments.
This software is released under the terms of the GNU GPLv2.