Experiment: compare ZODB file-storage iteration with Python and Rust

Contents

Introduction
Comparison
Working through data sharing
Non-nested iterator
Error handling
Error handling and iteration
Summary

Introduction

ZEO [1] storage servers, are slower than they should be. When benchmarking ZEO, storage servers are routinely CPU bound. The two biggest reasons for this appear to be:

the code runs slowly, and
Python's global interpreter lock makes it hard to leverage multiple threads.

I've been thinking of writing a ZEO server in a lower-level language for some time. Recently, I decided to try Rust.

Rust is interesting in a number of ways:

It has a reputation for being very fast.
It has a (seemingly to me) unique memory-management model. Rather than a garbage collector [2], it has an ownership model that lets much of memory-management take place at compile time.
Like Go, it (mostly) returns errors rather than raising exceptions, but unlike Go, it leverages pattern matching to make error handling far less insane.
Unlike Go, Rust has generics and a fairly powerful type system.

ZODB applications tend to be dominated by reads, so read performance is critical. ZEO servers spend a lot of time loading database records. I decided than a relatively quick and dirty way to compare Rust and Python for implementing a storage server would be to write scripts for iterating over a ZODB file storage.

The Python script leverages an existing file-storage iteration API:

import ZODB.FileStorage
import sys

def main():
    it = ZODB.FileStorage.FileIterator(sys.argv[1])
    transactions = records = 0
    for transaction in it:
        transactions += 1
        for record in transaction:
            records += 1

    print(transactions, records)

if __name__ == '__main__':
    main()

The Rust script, fs1.rs has to include the low-level iteration logic, which does mostly equivalent processing, reading transaction meta data and record data during iteration and handling file-storage back pointers:

I ran both scripts against a small (170M) file-storage file. This was big enough (4452 transactions, and 190689 data records) to be interesting, but small enough to fit into my disk cache, so as to avoid results being skewed by disk I/O. I ran the scripts on a reasonably fast Mac. Here are times for Rust and Python, as well as times for reading the data with the Unix sum and du commands:

	Real	User	System
	Real	User	System
Python	2.2	2.0	.18
Rust	.12	.05	.060
`sum`	.40	.35	.056
`du`	.004	.001	.002

The times in the table are in seconds. Rust was about 40 times faster in terms of user time. This is very encouraging.

One might expect system times to be similar, but the Python iterator does a lot of seeks it doesn't need to, due to the way the code was factored, although that doesn't seem to explain all of the difference in system times.

Working through data sharing

Getting this to work took a lot of effort. I was/am learning Rust, so some difficulty was to be expected. Rust's memory model is fairly draconian about not allowing multiple active mutable bindings or references to an object. This is a feature, but it takes some getting used to.

File storage files consist of nested records. Transaction records contain one or more data records. The most natural way to iterate over these is with nested iterators. A file iterator yields transactions, which can be iterated over to get data records. Both of these iterators need mutable [3] access to the database file. This is something the Rust compiler doesn't allow. The outer iterator has a mutable reference to the file while inner iterators need one. I fought this longer than I should have.

One of my attempts leveraged a quirk of the Rust File API that creates a file clone that shares internal state. This worked, but was fairly slow, although still much faster than the Python version. Rust files are unbuffered, and I was reading individual fields directly from the file, because of the serialization API I used.

Rust provides a BufReader class that adds buffering to files and other readable objects. BufReader didn't provide the cloning mechanism that File provides, so I was back to dealing with the compiler protection against multiple mutable references.

Rust provides mechanisms for moving mutability protections to run time. This involved 2 adapters RefCell and Rc. Both are wrapper objects. RefCell lets you temporarily get mutable references to it's contents. It generates run-time panics [4] if there are attempts to get a mutable references while there are outstanding references and if there are attempts to get non-mutable references while there is an outstanding mutable reference.

It should have been sufficient to use RefCell alone, as there could have been multiple immutable references to a RefCell cell wrapping a BufReader wrapping the open file. A number of factors got in the way of this. In Rust, when references are shared, the compiler uses lifetimes. to assure that references to a resource don't outlast the resource itself. For various reasons [5], this wasn't possible in this case.

An Rc object bypasses compile time reference checks by adding run-time reference counting. An object is wrapped with an Rc and then the Rc is cloned to create multiple reference objects pointing the the same object. This allowed me to to share a single BufReader without tripping over the compiler. I ended up with 3 levels of wrapping. I used a type alias to make this sane to use in type declarations:

type Reader = Rc<RefCell<BufReader<File>>>;

Non-nested iterator

I was curious about the overhead that might be caused by the RefCell and Rc wrappers, so I created an alternate implementation of the script without nested iterators, fs1b.rs:

Rather than iterating over the file and then over each transaction, this version uses a single iterator that yields either transaction records or data records. It does this using an enum:

enum Record {
    Transaction(TransactionRecord),
    Data(DataRecord),
}

When iterating, pattern matching is used to handle transactions and data records (and errors) separately:

for tr in Iterator::open(&args[1]).unwrap() {
    match tr {
        Ok(Record::Transaction(t)) => {
            stats.transactions += 1;
        },
        Ok(Record::Data(d)) => {
            stats.records += 1;
        },
        Err(error) => {
            print!("WTF? {}\n", error);
            break;
        },
    }
}

For now, ignore the Ok(...) wrappers in the above pattern match, I'll talk about them in a bit.

This version was no faster (or slower) than the nested-iterator version. Apparently, the RefCell and Rc overhead is quite low.

Error handling

Rust, like Go, mostly avoids using exceptions to communicate errors. Functions that can (routinely) error, return Result objects. Results are enums that can be either an Ok or an Err. Callers check results and handle errs, usually by propagating them upwards. An excellent discussion of error handling patterns may be found here:

http://blog.burntsushi.net/rust-error-handling/

Error handling is done through pattern matching on result objects, but there are APIs for making this simpler. In the previous section, we pattern-matched against Ok and Err objects to do error handling in our main function.

The iteration scripts did most of their error handling using the try! macro. Error handling also comes into play in the type signatures of the various functions in the iteration scripts.

Rust also provides [4] a course exception mechanism for handling uncommon error situations. Component writers are discouraged from using them because the make composition difficult.

A problem with Rust's error model, in my opinion, is that it makes it difficult to tell where errors actually happened, both in terms of where in the code and in where the call stack. If error management is done properly, panics, and thus stack traces will only start far above where the error occurred in the call stack.

Fortunately, there's a macro for getting a current file name, and another for getting a line number. One could write a macro to automate using this in generating error messages. Presumably, one could use a debugger to get the call stack, but I miss getting full stack traces in the first place.

Error handling and iteration

Iteration uses Option objects to control iteration. In Python, it's common for variables to hold a specific type of object, or None. A number of languages, including Rust, use options to implement this pattern in a type-safe way. In the iterator API, iteration continues as long as Some objects are returned by the iterator's next method and stops when None us returned.

This model makes error handling awkward. The file-storage iterators can certainly encounter errors, and shouldn't just panic when they get them. A common pattern, used here, is to create iterators of Result objects. The iterator next methods return Option<Result<_>> objects. The iterator framework isn't aware of this, so iteration doesn't stop when we encounter errors. Arguably, there could be applications where we wouldn't want to stop on errors. In our scripts, we need to use break to stop iteration when we see an error.

Summary

I'm encouraged by the speed of the Rust file-storage iterator. This makes me hopeful that I'll be able to create a much faster ZEO server.

It's been fun and frustrating learning Rust. I anticipate more ahead as I work on a new ZEO server.

Rust is a low-level language, but with a lot of interesting compile-time automation. It reminds me a lot of Scala, but it seems to be much much faster. I wouldn't use it for everything, but it looks to be very attractive where speed is critical.

[1]	ZEO is a client-server storage implementation for ZODB.

[2]	There are library facilities for run-time reference counting.

[3]	Mutable access is needed because file I/O operations change a file's position and thus mutate it.

[4]	(1, 2) Panics are a crude form of exception, meant for dealing with program bugs. The normal way of dealing with errors is with Result objects.

[5]

I'll note what the reasons were, but getting into the details would be a too much of a distraction here. In Rust, a struct (similar to a Python class) can only contain references if the struct is parameterized by a lifetime. Iterators use associated types to describe the things they yield. If an associated type is parameterized by a lifetime, then any trait (interface) implemented by a type with the associated type must be parameterized by a lifetime. The iteration trait isn't parameterized by lifetime, so iterators can't contain references.