I would normally use C, and would definitely use GNU/Linux or some other system with a modern filesystem on a task like this - NTFS is so god-awful slow. But that wasn't my choice.
Just to make sure i/o would be as minimal a bottleneck as possible I wrote a very simple async i/o loop - I don't even know if makes a lot of difference because Linux does quite aggressive read-ahead anyway, but it certainly doesn't hurt it. Java 7 is supposed to have an async i/o interface that maps more directly to the OS one, but that isn't available so I just rolled my own.
I used the 'nio' interfaces - just for the buffers really to avoid excess copying. Oddly searching for 'async i/o' brings them up but they only have a non-blocking select type thing which has no use in this application.
It's pretty simple anyway. First a bit of setup in the processing thread.
final ByteBuffer sentinal = ByteBuffer.allocate(0);
final ArrayBlockingQueuequeue = new ArrayBlockingQueue (numBuffers+1);
final ArrayBlockingQueuedone = new ArrayBlockingQueue (numBuffers+1);
// must be one less than queue size - room for sentinal
for (int i = 0; i < numBuffers; i++) {
done.add(ByteBuffer.allocate(bufSize));
}
The main idea is using the blocking queues as a read-ahead mechanism. They seem to have a fair bit of overhead but it can be easily tuned by using different sized blocks.
The sentinal is used to mark end of file.
Then a thread is setup with a simple loop:
while (true) {
ByteBuffer bin = done.take();
bin.rewind();
int len = fc.read(bin);
if (len > 0) {
bin.limit(bin.position());
bin.position(0);
queue.add(bin);
} else if (len == -1) {
queue.add(sentinal);
break;
} else {
done.add(bin);
}
}
It took me a little while to realise how to use the ByteBuffer properly - to reset it after reading so the processing loop can read the same data, but it's pretty straightforward.
The processing loop is then simply:
while ((bb = queue.take()) != sentinal) {
... do stuff ...
bb.clear();
done.add(bb);
}
The processing loop can get a bit 'hairy' if you want to avoid copying data around too much, but this is the same in any language.
For comparison I whipped up a simple C version that did the same processing but with just the Linux read-ahead. The overall processing time (taking out java startup) is identical because - surprise surprise - the system is completely i/o bound. The Linux read-ahead is obviously quite aggressive too - which might not be so good if you're doing more random access.
The C version used 1/10th of the memory and about 1/4 of the CPU time ... but it had a better inner-loop, and the Java has a nice GUI as well. The C one only took about 20 minutes to write vs a few hours for Java - but I'm more familiar with C, and particularly writing this sort of stream processor to run efficiently in C.
Anyway it was an interesting diversion from my current work, and a good learning opportunity.
No comments:
Post a Comment