I’ve been writing some simple low-level IO code to copy a few GB of data at a time, and since it’s all wrapped up in Ruby synchronization logic, I used my preferred idiom of a `sysread`/`syswrite` loop with a reasonable buffer size:
bsiz = 65536
open(from) do |inh|
  open(to, 'w') do |outh|
    begin
      loop do
        # sysread raises EOFError at end of file, which ends the loop
        outh.syswrite(inh.sysread(bsiz))
      end
    rescue EOFError; end
  end
end
However, I’ve always just sort of picked the above `bsiz` value more or less out of thin air, and realized that it might be far from optimal.
So, I dragged out my old friend `Benchmark.bm`, and ran something like the following:
require 'benchmark'

# Opens both files and yields them; the EOFError raised by sysread at
# end of file is what terminates the caller's copy loop.
def with_open_files(src_path, dst_path)
  open(src_path) do |src|
    open(dst_path, 'w') do |dst|
      begin
        yield src, dst
      rescue EOFError; end
    end
  end
end

def basic_syscopy(src_path, dst_path, bsiz)
  with_open_files(src_path, dst_path) do |src, dst|
    loop do
      dst.syswrite(src.sysread(bsiz))
    end
  end
end

src_path = '__data__.in'
dst_path = '__data__.out'

# Generate 64 MB of random test data on the first run only.
if !File.exist?(src_path)
  print "generating test data..."
  STDOUT.flush
  `dd if=/dev/urandom of=#{src_path} bs=1024 count=65536`
  puts "done."
end

# Time one full copy for each power-of-two buffer size from 1 KB to 4 MB.
Benchmark.bm(14) do |b|
  (10..22).each do |exp|
    bsiz = 2**exp
    b.report("bsiz=%8d: " % bsiz) { basic_syscopy(src_path, dst_path, bsiz) }
  end
end
The output was interesting, if not earth-shaking:
lennon@firefly:~$ ruby copy_bm.rb
                    user     system      total        real
bsiz=    1024:   0.150000   0.700000   0.850000 (  0.854416)
bsiz=    2048:   0.090000   0.580000   0.670000 (  0.662276)
bsiz=    4096:   0.070000   0.420000   0.490000 (  0.516608)
bsiz=    8192:   0.040000   0.390000   0.430000 (  0.433775)
bsiz=   16384:   0.030000   0.380000   0.410000 (  0.410382)
bsiz=   32768:   0.020000   0.370000   0.390000 (  0.390833)
bsiz=   65536:   0.010000   0.370000   0.380000 (  0.379887)
bsiz=  131072:   0.010000   0.360000   0.370000 (  0.374959)
bsiz=  262144:   0.010000   0.370000   0.380000 (  0.374990)
bsiz=  524288:   0.010000   0.380000   0.390000 (  0.586017)
bsiz= 1048576:   0.020000   0.360000   0.380000 (  0.390283)
bsiz= 2097152:   0.000000   0.380000   0.380000 (  0.384693)
bsiz= 4194304:   0.010000   0.370000   0.380000 (  0.380670)
Basically, this tells me that a) once the buffer reaches 8 KB or so, its size really doesn’t make a big difference (aside from the weird spike in wall-clock time at 512 KB) and b) that Ruby really can turn in respectable IO performance, since the baseline for using the `cp` command is only a few percentage points faster:
lennon@firefly:~$ time cp __data__.in __data__.out
real 0m0.366s
user 0m0.000s
sys 0m0.364s
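To put a number on "a few percentage points", the wall-clock times from the benchmark can be compared directly against the `cp` baseline. The sketch below is only arithmetic on figures already shown; the helper name `percent_slower` and the constant names are made up for illustration.

```ruby
# Wall-clock ("real") times in seconds, copied from the benchmark output.
REAL_TIMES = {
  1024    => 0.854416,
  65_536  => 0.379887,
  131_072 => 0.374959
}
CP_REAL = 0.366 # "real" time reported by `time cp`

# Hypothetical helper: how much slower (in percent) a run is than the baseline.
def percent_slower(seconds, baseline)
  (seconds - baseline) / baseline * 100.0
end

REAL_TIMES.each do |bsiz, secs|
  puts format("bsiz=%8d: %6.1f%% slower than cp", bsiz, percent_slower(secs, CP_REAL))
end
```

By this measure the best runs (64 KB and 128 KB buffers) land roughly 2 to 4 percent behind `cp`, while the 1 KB buffer takes more than twice as long.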
Of course, I don’t know that I’d put a lot of faith in Ruby keeping up with hand-tooled C over the long haul: as the size of the data (or the longevity of the process) goes up, I would expect memory allocation and garbage collection to start having an impact.
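One way to limit the allocation side of that concern is to hand `sysread` a buffer to fill, so the loop reuses a single String instead of creating a new one on every pass. This is a minimal sketch and has not been benchmarked here; the method name `buffered_syscopy` is hypothetical, and like the loop above it assumes `syswrite` writes the whole buffer, which is typical for regular files but not guaranteed.

```ruby
# Copy src_path to dst_path with sysread/syswrite, reusing one String as the
# read buffer so each iteration does not allocate a fresh bsiz-byte String.
def buffered_syscopy(src_path, dst_path, bsiz = 65_536)
  buf = String.new(capacity: bsiz) # binary (ASCII-8BIT) string, reused below
  File.open(src_path, 'rb') do |src|
    File.open(dst_path, 'wb') do |dst|
      begin
        loop do
          src.sysread(bsiz, buf) # fills buf in place; raises EOFError at end of file
          dst.syswrite(buf)
        end
      rescue EOFError
        # end of input: the copy is complete
      end
    end
  end
end
```

Calling `buffered_syscopy('__data__.in', '__data__.out', 131_072)` would copy the test file with a 128 KB buffer.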
That’s because the block size of your system is 4 KB. You wouldn’t even need to run a benchmark if you had read the bible, Advanced Programming in the UNIX Environment, written by W. Richard Stevens. Since the read buffer between the file system and the device driver is 4 KB, the system call is invoked no more often than needed ($file_size / $block_size + 1 times).
Anyway, nice try, and I’m interested in the way you ran the test.
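The call-count formula in the comment can be worked through for the benchmark's 64 MB test file. The sketch below counts the `sysread` calls made by the copy loop in the post, assuming every call returns a full buffer until the data runs out (usual for regular files, but an assumption); the name `sysread_calls` is made up for illustration.

```ruby
# Number of sysread calls the copy loop makes for a file of file_size bytes:
# one per full or partial buffer, plus a final call that raises EOFError.
def sysread_calls(file_size, bsiz)
  (file_size + bsiz - 1) / bsiz + 1
end

FILE_SIZE = 65_536 * 1024 # the test file written by dd (bs=1024 count=65536)

(10..22).each do |exp|
  bsiz = 2**exp
  puts format("bsiz=%8d: %6d sysread calls", bsiz, sysread_calls(FILE_SIZE, bsiz))
end
```

Going from a 1 KB to a 4 KB buffer removes about 49,000 calls, while going from 64 KB to 4 MB removes only about 1,000, which is consistent with the timings flattening out as the buffer grows.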