So I’ve been trying to get good performance with minimal “resident” resource usage for JoinExt. I started out using buffered, overlapped I/O and a mapped view of the target file, so I could read into the view and let the OS write it back out to disk. Performance was good, and it let me use a single thread to drive both the UI and the file I/O. But the system cached the files, the code was complicated to maintain, and there is no guarantee that a file system will honor a request for asynchronous I/O, so the UI could still hang.

Solving the potential for hanging the UI is easy, though: run the I/O in a new thread. So I focused on keeping the file data out of the system cache. This meant using unbuffered I/O, which seemed overly complicated with all its alignment requirements. I thought maybe a file mapping was smart enough to deal with the alignment issues and all I would need to do is memcpy from source to target. So I rewrote the code to do just that, and it worked, but what appears to happen behind the scenes is that a mapping caches the files just as if they were opened for buffered access, so you lose all the benefits of doing unbuffered I/O in the first place.

The only option then was to deal with all the complications of unbuffered I/O. I tried a few different approaches but settled on one that I think is pretty simple to maintain and performs well for the majority of cases JoinExt will be used for. I don’t know what the hell to call it, but it’s something like having two sections in a buffer: one for holding unaligned data, and another that always starts and ends on the alignment required by whatever file is being read. When the alignment changes, the entire buffer is compacted so that everything in the aligned section moves to the unaligned section, and padding is added after the unaligned section to put the aligned section back on its required boundary. When the buffer is full, it is emptied to the target file.
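To make the two-section idea a bit more concrete, here is a minimal sketch in portable C of just the bookkeeping, with no actual file I/O. It is not the JoinExt code: the `JoinBuffer` struct and the `align_up`, `join_buffer_compact` and `join_buffer_read_room` helpers are hypothetical names made up for illustration, and it assumes the buffer's base address already meets the strictest alignment any source file needs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: [unaligned data][padding][aligned section][free] */
typedef struct {
    unsigned char *base;   /* assumed to be suitably aligned already */
    size_t capacity;       /* total size of the buffer in bytes */
    size_t unaligned_len;  /* bytes held in the unaligned section */
    size_t aligned_start;  /* offset where the aligned section begins */
    size_t aligned_len;    /* bytes held in the aligned section */
} JoinBuffer;

/* Round n up to the next multiple of alignment. */
static size_t align_up(size_t n, size_t alignment)
{
    assert(alignment != 0);
    return (n + alignment - 1) / alignment * alignment;
}

/* Fold the aligned section into the unaligned one, then pad so the
 * (now empty) aligned section starts on the new file's boundary. */
static void join_buffer_compact(JoinBuffer *b, size_t new_alignment)
{
    /* The regions may overlap, so memmove rather than memcpy. */
    memmove(b->base + b->unaligned_len,
            b->base + b->aligned_start,
            b->aligned_len);
    b->unaligned_len += b->aligned_len;
    b->aligned_len = 0;
    b->aligned_start = align_up(b->unaligned_len, new_alignment);
}

/* How many bytes the next aligned read may request (a multiple of
 * alignment); 0 means the buffer should be emptied to the target. */
static size_t join_buffer_read_room(const JoinBuffer *b, size_t alignment)
{
    size_t used = b->aligned_start + b->aligned_len;
    assert(alignment != 0);
    if (used >= b->capacity)
        return 0;
    return (b->capacity - used) / alignment * alignment;
}
```

In this sketch a reader would call `join_buffer_compact` whenever it moves on to a source file with a different alignment requirement, read into `base + aligned_start + aligned_len` while `join_buffer_read_room` is non-zero, and write the buffer out to the target once it returns 0.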
When the files are large it works very well, because many aligned reads can be done to fill up the buffer before the contents have to be shifted to fix up a realignment. When the files are small there are a lot of fix-ups, but for this utility small files should generally be the exception. The plus to using a large buffer is that disk I/O stays local while the buffer is filled with reads, then local again as it is emptied back to disk with writes.

One other thing I ran into is that there are limits on how much data can be read or written in a single call. From a few Usenet posts it seems to be around 128 KB, but that works out well for providing responsive feedback on the I/O in progress, too.
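The per-call cap and the progress feedback fit together roughly like the sketch below. Again this is an illustration and not the JoinExt code: `MAX_TRANSFER` just encodes the approximate 128 KB figure from those posts, `transfer_in_chunks` and the callback types are hypothetical, and `log_transfer` / `log_progress` are stand-ins for the real read/write call and the real UI update.

```c
#include <stddef.h>

/* Assumed cap per call, taken from the rough 128 KB figure above. */
#define MAX_TRANSFER ((size_t)128 * 1024)

/* Hypothetical callbacks: one performs the I/O, one reports progress. */
typedef int  (*TransferFn)(size_t offset, size_t length, void *ctx);
typedef void (*ProgressFn)(size_t done, size_t total, void *ctx);

/* Move `total` bytes in pieces no larger than MAX_TRANSFER, reporting
 * progress after every piece. Returns 1 on success, 0 on failure.
 * The final piece may be short; any sector rounding that unbuffered
 * I/O needs is left to the transfer callback. */
static int transfer_in_chunks(size_t total, TransferFn transfer,
                              ProgressFn progress, void *ctx)
{
    size_t done = 0;
    while (done < total) {
        size_t chunk = total - done;
        if (chunk > MAX_TRANSFER)
            chunk = MAX_TRANSFER;
        if (!transfer(done, chunk, ctx))
            return 0;
        done += chunk;
        if (progress)
            progress(done, total, ctx);
    }
    return 1;
}

/* Stand-in callbacks that only record what they were asked to do. */
typedef struct {
    size_t calls;     /* number of transfer calls made */
    size_t largest;   /* largest single transfer requested */
    size_t reported;  /* last "done" value passed to progress */
} TransferLog;

static int log_transfer(size_t offset, size_t length, void *ctx)
{
    TransferLog *log = (TransferLog *)ctx;
    (void)offset;
    log->calls++;
    if (length > log->largest)
        log->largest = length;
    return 1;
}

static void log_progress(size_t done, size_t total, void *ctx)
{
    TransferLog *log = (TransferLog *)ctx;
    (void)total;
    log->reported = done;
}
```

Because progress is reported after every piece, a UI driven this way gets an update at least once per 128 KB moved, which is where the responsiveness comes from.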

