Difference between revisions of "Benchmarks"

From Noah.org

Revision as of 19:14, 24 May 2010


Units of Measurement of Speed

The common CD-ROM (74 minute, 12 cm, ISO 9660) can store 681984000 bytes, which is approximately 650 MB.

(650 bytes * (1024 * 1024)) / (124 * (kbit / s)) = 11.9283154 hours

Note that Google uses powers of 2 for unit prefixes K, M, and G (Kilo, Mega, Giga).

Unit   Bytes                 Google query
1 KB   1024 bytes            "1 KB in bytes"
1 MB   1048576 bytes         "1 MB in bytes"
1 GB   1073741824 bytes      "1 GB in bytes"
1 TB   1099511627776 bytes   "1 TB in bytes"

Google Calculator search expressions for calculating bandwidth:
(650 bytes*(1024*1024))/(124*(kbit/s))
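
The same calculation can be reproduced in shell. This is only a sketch: `transfer_hours` is a hypothetical helper name, and it assumes 1 kbit = 1024 bits, which is what makes the result agree with the 11.9283154 hours shown above (with 1 kbit = 1000 bits the answer would be about 12.2 hours).

```shell
# transfer_hours BYTES KBIT_PER_S
# Hypothetical helper: hours needed to move BYTES over a link running at
# KBIT_PER_S, assuming 1 kbit = 1024 bits (the power-of-2 convention).
transfer_hours() {
    awk -v bytes="$1" -v kbps="$2" 'BEGIN {
        bits = bytes * 8                 # bytes to bits
        seconds = bits / (kbps * 1024)   # kbit/s to bit/s
        printf "%.4f\n", seconds / 3600  # seconds to hours
    }'
}

transfer_hours $((650 * 1024 * 1024)) 124    # prints 11.9283
```

Only the byte count carries a unit here; the 1024 factors are plain scalars.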


Remember when putting units in a formula that you must not label every scalar number. Only the quantity being measured carries the unit. For example:

correct
650 bytes * 1024 * 1024 = 650 MB
wrong
650 bytes * 1024 bytes * 1024 bytes = 681574400 bytes^3

Benchmarking Network Speed

Benchmarking Disk Speed

Trivial Disk Speed Testing

Often I want to run simple read and write tests of a disk for sanity testing. Performance geeks will object to simple tests as being meaningless, but these are often good enough for quick comparison testing.

The /dev/urandom device is based on SHA1, which has a modest computational expense, so I assume we can read from /dev/urandom without putting much load on the system. I also assume that the speed at which it generates those numbers is consistent (no strange pauses or latency), and that it can supply this data faster than most storage systems can write it... That last assumption may not be valid.
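
One way to check that last assumption is to read from /dev/urandom and throw the data away, so that no disk is involved. This is a minimal sketch (`urandom_rate` is a hypothetical helper name); the rate it prints is a ceiling for any write test on the same machine that is fed from /dev/urandom.

```shell
# urandom_rate
# Hypothetical helper: read 10 MB from /dev/urandom into /dev/null and
# print the summary line of dd, which includes the transfer rate.
urandom_rate() {
    dd if=/dev/urandom of=/dev/null bs=1048576 count=10 2>&1 | tail -n 1
}

urandom_rate
```

If this figure is close to the write speeds measured below, the tests are measuring the random number generator at least as much as the disk.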

Check the default block size of `dd` (even though it's always 512 bytes)

# dd if=/dev/urandom of=TESTDATA count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000458444 s, 1.1 MB/s

Create a 1MB test data file (random bytes)

Oh, look! Now we're almost testing speed, since `dd` reports its own statistics.

# dd if=/dev/urandom of=TESTDATA count=2048
2048+0 records in
2048+0 records out
1048576 bytes (1.0 MB) copied, 0.309145 s, 3.4 MB/s

This does the same thing, but will make the math easier in future tests: it sets the blocksize to 1 MB and the count to 1 block. It is good to see that `dd` doesn't show any odd behavior here. The speed is about the same.

# dd if=/dev/urandom of=TESTDATA bs=1048576 count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.296067 s, 3.5 MB/s

Now create a 10 MB file. Setting the blocksize makes it clearer how big a file we want.

# dd if=/dev/urandom of=TESTDATA bs=10485760 count=1
1+0 records in
1+0 records out
10485760 bytes (10 MB) copied, 2.39607 s, 4.4 MB/s
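
The rate that `dd` prints can be recomputed by hand. Judging by the figures above, the summary line uses decimal units (1 MB = 1000000 bytes) and not the 1048576-byte MB used elsewhere on this page: 10485760 bytes is shown as "10 MB", and 10485760 bytes in 2.39607 s only comes out to 4.4 MB/s with a decimal MB (it would be about 4.2 with a 1048576-byte MB). The sketch below assumes that reading; `rate_mb_s` is a hypothetical helper name.

```shell
# rate_mb_s BYTES SECONDS
# Hypothetical helper: transfer rate in decimal MB/s (1 MB = 1000000 bytes).
rate_mb_s() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", bytes / secs / 1000000 }'
}

rate_mb_s 10485760 2.39607    # the 10 MB run above; prints 4.4
rate_mb_s 1048576 0.309145    # the 1 MB run above; prints 3.4
```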

But the kernel will cache output data and keep writing after the process is done and has closed the file. How do we know the data is really all there? Add 'oflag=sync'. Note that this does slow down the total speed a little bit.

# dd if=/dev/urandom of=TESTDATA oflag=sync bs=10485760 count=1
1+0 records in
1+0 records out
10485760 bytes (10 MB) copied, 2.62692 s, 4.0 MB/s

You might expect sync not to make a big difference for large blobs of data. But for that to hold, the dataset would have to be larger than the page cache, which is smaller than physical RAM. Even a 100 MB file is plenty small enough to fit. Without sync, the kernel will cache the entire thing and then write it to disk at its leisure.

So we still see a drop in speed when using sync even for large blocks. With smaller block sizes and larger block counts we begin to see a penalty -- from 4.0 MB/s down to 1.2 MB/s.

# dd if=/dev/urandom of=TESTDATA oflag=sync bs=4096 count=2560
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 8.68235 s, 1.2 MB/s

What about the default blocksize of `dd`, 512 bytes? To get 10 MB we need 20480 blocks:

# dd if=/dev/urandom of=TESTDATA count=20480
20480+0 records in
20480+0 records out
10485760 bytes (10 MB) copied, 2.47843 s, 4.2 MB/s

That speed seems close to using one block with a size of 10485760 bytes, so blocksize does not seem to affect the speed very much. But with sync turned on that changes, and we can see that `dd` must be syncing the disk much more often. Ouch:

# dd if=/dev/urandom of=TESTDATA oflag=sync count=20480
20480+0 records in
20480+0 records out
10485760 bytes (10 MB) copied, 35.4203 s, 296 kB/s

The Linux page cache uses 4096 bytes per page, but there doesn't seem to be any special relationship here:

# dd if=/dev/urandom of=TESTDATA oflag=sync bs=4096 count=2560
2560+0 records in
2560+0 records out
10485760 bytes (10 MB) copied, 8.62401 s, 1.2 MB/s

It's debatable whether one should use the 'fsync' or the 'fdatasync' option. The 'fsync' option also makes sure filesystem metadata is written to disk. That makes more sense if you are testing the whole filesystem speed. The 'fdatasync' option only ensures that the file's contents are on the disk. That would be better if you care about testing raw disk performance... But this is all theoretical. In general you won't see a difference, and these are primitive tests that you wouldn't want to take to a performance testing debate.
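
In GNU `dd` these two choices are available as 'conv=fsync' and 'conv=fdatasync', which flush once before `dd` exits, unlike 'oflag=sync', which makes every write synchronous. The sketch below runs both on the same small write so the summary lines can be compared. `flush_compare` is a hypothetical helper name, and a temporary file from `mktemp` stands in for TESTDATA.

```shell
# flush_compare
# Hypothetical helper: write 1 MB twice, flushing once at the end of each run.
# conv=fsync flushes data and metadata; conv=fdatasync flushes data only.
flush_compare() {
    out=$(mktemp)
    for mode in fsync fdatasync; do
        printf '%s: ' "$mode"
        dd if=/dev/urandom of="$out" bs=1048576 count=1 conv="$mode" 2>&1 | tail -n 1
    done
    rm -f "$out"
}

flush_compare
```

On most systems the two lines will be nearly identical, which fits the remark above that the difference is mostly theoretical.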

Uh oh...

This shows why primitive testing can be bad. You have to know what you are doing. Look at these terrible results. It turns out that with a blocksize of 1, `dd` reads one byte then writes one byte, over and over until done.

# dd if=/dev/urandom of=TESTDATA bs=1 count=1048576
1048576+0 records in
1048576+0 records out
1048576 bytes (1.0 MB) copied, 6.32568 s, 166 kB/s

Things get even worse if you add the 'sync' option because now `dd` will read, then write, then sync the disk. Ouch. Super slow. It seems that disks are not designed to write one byte at a time and guarantee that the byte was actually written to the disk before going on to the next byte. This is only 1K of data! But it is nice to see that the sync and blocksize options actually do seem to do what they say they will -- tune performance.

# dd if=/dev/urandom of=TESTDATA oflag=sync bs=1 count=1024
1024+0 records in
1024+0 records out
1024 bytes (1.0 kB) copied, 1.42996 s, 0.7 kB/s

Some aliases for spot checks

These tests give the write speed for a 100 MB file using block sizes of 4 KB, 100 KB, 1 MB, and 10 MB.

command        block size human  block size bytes  block size MB   * block count = 100 MB  Note
test-write-tt  4 KB              4096 bytes        0.00390625 MB   * 25600 blocks          typical minimal sector size, typical Linux page size
test-write-sm  100 KB            102400 bytes      0.09765625 MB   * 1024 blocks           ~1/10th X
test-write-md  1 MB              1048576 bytes     1.00000000 MB   * 100 blocks            1 X
test-write-lg  10 MB             10485760 bytes    10.00000000 MB  * 10 blocks             10 X

alias test-write-tt='dd if=/dev/urandom of=random_data.bin oflag=dsync conv=fdatasync bs=4096 count=25600 2>&1 | grep --only-matching -E "[0-9]+\.?[0-9]+ [kKmMgGtT]B/s"'
alias test-write-sm='dd if=/dev/urandom of=random_data.bin oflag=dsync conv=fdatasync bs=102400 count=1024 2>&1 | grep --only-matching -E "[0-9]+\.?[0-9]+ [kKmMgGtT]B/s"'
alias test-write-md='dd if=/dev/urandom of=random_data.bin oflag=dsync conv=fdatasync bs=1048576 count=100 2>&1 | grep --only-matching -E "[0-9]+\.?[0-9]+ [kKmMgGtT]B/s"'
alias test-write-lg='dd if=/dev/urandom of=random_data.bin oflag=dsync conv=fdatasync bs=10485760 count=10 2>&1 | grep --only-matching -E "[0-9]+\.?[0-9]+ [kKmMgGtT]B/s"'
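
The four aliases differ only in block size and block count, and the count is just 100 MB divided by the block size. As a sketch, the same tests can be written as one function. The names `blocks_for` and `write_test` are hypothetical; the `dd` options are copied from the aliases above.

```shell
# blocks_for TOTAL_BYTES BLOCK_SIZE
# Hypothetical helper: blocks (rounded up) needed to cover TOTAL_BYTES.
blocks_for() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# write_test BLOCK_SIZE [OUTFILE]
# Hypothetical wrapper around the alias command line: writes 100 MB
# (104857600 bytes) in BLOCK_SIZE-byte blocks and prints dd's summary line.
write_test() {
    count=$(blocks_for 104857600 "$1")
    dd if=/dev/urandom of="${2:-random_data.bin}" oflag=dsync conv=fdatasync \
        bs="$1" count="$count" 2>&1 | tail -n 1
}

# Reproduce the block counts from the table without writing anything.
for bs in 4096 102400 1048576 10485760; do
    echo "bs=$bs count=$(blocks_for 104857600 "$bs")"
done
```

Running `write_test 4096` would then correspond to test-write-tt, and `write_test 10485760` to test-write-lg.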

Streaming big bursts or lots of little discrete chunks

So now we see that performance measurement depends on how we want to use the disk. Do we only care how fast we can write a giant blob of data? Or do we care how fast it can write lots of little blocks of data? If you are recording video then you probably care more about writing giant blobs of data. If you are writing small transactions in a log file (such as a server log) then you probably care more about small block performance.

If you really are just trying to test the raw disk write speed then you should just test a large burst. It really is the operating system's job to worry about how to manage lots of small requests and still get performance.

Read speed testing

For these tests I want a 10MB data file, so first I create a fresh one:

# dd if=/dev/urandom of=TESTDATA oflag=sync bs=10485760 count=1
1+0 records in
1+0 records out
10485760 bytes (10 MB) copied, 2.69033 s, 3.9 MB/s

When using `dd` for read testing don't forget the 'iflag=direct' option. All file reads in Linux normally pass through the page cache, so successive tests of read performance on a file will represent the speed of reading from cache, not disk. That is usually not what people want to see in speed tests, but it has its place. Note that the data still goes through a buffer, since the data goes through a DMA channel in most disk IO, but that buffer is in userspace, not the kernel, so the 'direct' option for reading disk can actually speed up IO in some applications since you get rid of the kernel overhead.

# dd if=TESTDATA iflag=direct of=/dev/null
20480+0 records in
20480+0 records out
10485760 bytes (10 MB) copied, 3.58198 s, 2.9 MB/s

That seems slower than writing... Oh, I forgot to set the blocksize. Silly me. Ha! Quite a bit faster now:

# dd if=TESTDATA iflag=direct of=/dev/null bs=10485760
1+0 records in
1+0 records out
10485760 bytes (10 MB) copied, 0.200038 s, 52.4 MB/s

Let's explore this a little bit. How bad is reading just 1K at a time? Still faster than writing, but not by much.

# dd if=TESTDATA iflag=direct of=/dev/null bs=1024
10240+0 records in
10240+0 records out
10485760 bytes (10 MB) copied, 1.65979 s, 6.3 MB/s

And then we can see a weird piece of machinery if we use the 'direct' flag incorrectly. It turns out that 'direct' requires alignment with the device block size (512 on my disk). See what happens if I set a blocksize that doesn't align with the disk block size:

# dd if=TESTDATA iflag=direct of=/dev/null bs=513
dd: reading `TESTDATA': Invalid argument
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.000634682 s, 0.0 kB/s
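
The alignment rule can be checked before running `dd`. This is a minimal sketch: `is_aligned` is a hypothetical helper name, and the 512-byte default is an assumption taken from the disk described above, not something that holds for every device.

```shell
# is_aligned BLOCK_SIZE [SECTOR_SIZE]
# Hypothetical helper: succeeds when BLOCK_SIZE is a whole multiple of
# SECTOR_SIZE (assumed to be 512 bytes unless given).
is_aligned() {
    [ $(( $1 % ${2:-512} )) -eq 0 ]
}

for bs in 512 513 1024 4096; do
    if is_aligned "$bs"; then
        echo "bs=$bs is a multiple of 512"
    else
        echo "bs=$bs is not a multiple of 512"
    fi
done
```

On a real system the sector size could be read with `blockdev --getss` on the underlying device and passed as the second argument.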

That makes the direct flag a big pain in the ass to use. In fact, it's not really that useful in the real world since you usually don't want to avoid the kernel cache in the first place. So what's the point? Well, it does help with writing benchmark tools :-) And here we can see something even more mysterious.

Run a few times without 'iflag=direct'. Notice that the first run is slow, but subsequent runs are much faster -- the data comes from cache:

# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.394607 s, 26.6 MB/s
root@home: /root 0
# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.0110787 s, 946 MB/s
root@home: /root 0
# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.00873244 s, 1.2 GB/s
root@home: /root 0
# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.00825178 s, 1.3 GB/s


We can flush the kernel cache so that subsequent tests are not affected by the previous test, but notice here that they do get faster. What's this? This is the disk cache! I don't know how to suppress that!

# echo 3 > /proc/sys/vm/drop_caches
root@home: /root 0
# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.918925 s, 11.4 MB/s
root@home: /root 0
# echo 3 > /proc/sys/vm/drop_caches
root@home: /root 0
# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.689502 s, 15.2 MB/s
root@home: /root 0
# echo 3 > /proc/sys/vm/drop_caches
root@home: /root 0
# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.425424 s, 24.6 MB/s
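
The flush-then-read sequence above can be wrapped in one function so that every run starts from the same kernel cache state. This is a sketch only: `cold_read` is a hypothetical helper name, it must run as root, and it adds a `sync` first because drop_caches only frees clean pages. It does nothing about the cache inside the disk itself.

```shell
# cold_read FILE
# Hypothetical helper: flush dirty pages, drop the kernel caches, then read
# FILE with dd and print the summary line. Needs root for drop_caches.
cold_read() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
    sync                                            # write out dirty pages first
    echo 3 > /proc/sys/vm/drop_caches || return 1   # fails unless run as root
    dd if="$1" of=/dev/null bs=1048576 2>&1 | tail -n 1
}
```

Calling `cold_read TESTDATA` several times in a row corresponds to the transcript above.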

Reading a different file should cause the disk cache to be flushed, but something weird is going on here:

# echo 3 > /proc/sys/vm/drop_caches
root@home: /root 0
# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.180207 s, 58.2 MB/s
root@home: /root 0
# dd if=TESTDATA2 of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.601255 s, 17.4 MB/s
root@home: /root 0
# echo 3 > /proc/sys/vm/drop_caches
root@home: /root 0
# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.222922 s, 47.0 MB/s
root@home: /root 0
# echo 3 > /proc/sys/vm/drop_caches
root@home: /root 0
# dd if=TESTDATA2 of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.672152 s, 15.6 MB/s
root@home: /root 0
# echo 3 > /proc/sys/vm/drop_caches
root@home: /root 0
# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.211442 s, 49.6 MB/s
root@home: /root 0
# dd if=/dev/urandom of=TESTDATA oflag=sync bs=10485760 count=1
1+0 records in
1+0 records out
10485760 bytes (10 MB) copied, 2.63246 s, 4.0 MB/s
root@home: /root 0
# dd if=TESTDATA of=/dev/null bs=1048576
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.00826575 s, 1.3 GB/s