In March 2015, we conducted a user study on the PhoneLab smartphone experimental platform. Over the course of one month, 11 participants used an instrumented Android smartphone as their primary device. All SQL processed by Android’s built-in SQLite library was anonymized, logged, and harvested once the phone was plugged in. These results serve as the baseline and initial motivation for our evaluation study and benchmark.
A key insight from this study was that the majority of queries recorded were effectively key-value store queries: either single-row equality-predicate lookups or full table scans. 24 of the 179 distinct apps encountered during the study used only queries of this type, and 80% of the median app's workload fit the same pattern. In short, mobile database workloads are dominated by key-value style queries. Accordingly, we decided to start with YCSB, an industry-standard key-value store benchmark.
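To make the two dominant query shapes concrete, the sketch below issues both against an in-memory SQLite database. The table and column names follow the YCSB-style schema used later in the benchmark; they are illustrative, not taken from the logged traces.

```python
import sqlite3

# Populate a tiny YCSB-style table (names here are illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usertable (ycsb_key TEXT PRIMARY KEY, field0 TEXT)")
conn.executemany("INSERT INTO usertable VALUES (?, ?)",
                 [(f"user{i}", f"value{i}") for i in range(5)])

# (1) Single-row, equality-predicate lookup: effectively a key-value GET.
row = conn.execute(
    "SELECT * FROM usertable WHERE ycsb_key = ?", ("user3",)).fetchone()

# (2) Full table scan: effectively iterating the entire key-value store.
all_rows = conn.execute("SELECT * FROM usertable").fetchall()
```

Together, these two shapes account for the entire workload of the 24 key-value-only apps observed in the study.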
A second insight from this study was that workloads are bursty, with short (1-2 second) periods of moderate work (hundreds of queries) followed by longer periods of inactivity or light background noise. Put differently, databases on mobile devices almost never operate at saturation. This is in direct contrast to YCSB, which attempts to measure a throughput/latency curve that our measured workloads do not approach. Hence, while we adapt and extend the six canonical key-value workloads (A-F) of YCSB, we develop our own mobile-centric experimental protocol for performance measurement. The PocketData benchmark focuses on more common mobile device bottlenecks: CPU, IO, and query latency. Query latency directly affects device responsiveness, while CPU and IO represent shared resources. Beyond contention between the app invoking the query and other apps, CPU and IO also factor into another major bottleneck on mobile devices: power consumption.
The YCSB benchmark provides six canonical workloads (A-F). These workloads each feature a different mix of operations, including write (upsert a new record), append (add a new record with a monotonically increasing key field), update (read one field of a record, modify it, and write it back), read (recover all fields of a record), and scan (read the 1000 records following a given key). In addition to the six YCSB workloads, we add two micro-benchmark workloads (G-H) that use the YCSB schema and generate workloads consisting of 10% update, 10% insert, and 40% scan operations; this distribution matches results presented in our user study. The remaining 40% of each workload follows one of two patterns. In Pocket-G, the remaining queries filter data based on a 1-dimensional range predicate, modeling a temporal query (e.g., Google Mail or the Facebook news feed). In Pocket-H, the remaining queries filter data based on a 2-dimensional range predicate, modeling a spatial query (e.g., Maps). A single thread dedicated to issuing queries produces too many queries to be representative. Accordingly, we artificially reduce query throughput by sleeping the query thread according to one of three inter-query timing rates: (1) zero delay, measuring performance at saturation; (2) a fixed 1ms delay between queries; and (3) a logarithmic delay, mirroring typical app behavior in our user study.
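The three timing rates above can be sketched as a delay schedule for the query thread. The parameters of the logarithmic distribution below are illustrative assumptions, not the benchmark's exact values.

```python
import math
import random

def inter_query_delay(mode: str, rng: random.Random) -> float:
    """Return the sleep (in seconds) to insert before the next query.

    Sketch of the three inter-query timing rates; the log-distribution
    bounds (1 ms to 1 s) are illustrative, not the benchmark's values.
    """
    if mode == "zero":     # performance at saturation
        return 0.0
    elif mode == "fixed":  # a fixed 1 ms delay between queries
        return 0.001
    elif mode == "log":    # log-uniform delay, mimicking bursty app behavior
        return 0.001 * math.exp(rng.uniform(0.0, math.log(1000.0)))
    raise ValueError(f"unknown timing mode: {mode}")
```

Under the "log" mode, most delays are short but occasional long pauses occur, matching the burst-then-idle pattern observed in the user study.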
The PocketData benchmarking toolkit
is the result of extensive trials, validation, and refinement.
Here, we provide a high-level overview of the benchmarking process.
The benchmark itself consists of a driver application and a set of boot configurations for the Android operating system and kernel.
The application part of the benchmark connects to an embedded database through a modular driver.
We presently have drivers for a range of embedded database engines, including Android's built-in SQLite library, as well as key-value backends such as BerkeleyDB and a naive in-memory TreeMap-based store.
The application operates in two phases: initialization and query.
On its first run, the application initializes the database, creating database files (if needed), creating tables, and pre-loading initial data.
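A minimal sketch of this initialization phase is shown below, assuming the standard YCSB schema (a key column plus ten value fields). The record count and value formats are illustrative, not the benchmark's configuration.

```python
import sqlite3

# YCSB's usertable schema: one key plus ten value fields.
FIELDS = [f"field{i}" for i in range(10)]

def initialize(path=":memory:", records=100):
    """Create the database file (if needed), create the table, and
    pre-load initial data. Record count here is illustrative."""
    conn = sqlite3.connect(path)  # creates the database file if needed
    cols = ", ".join(f"{f} TEXT" for f in FIELDS)
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS usertable "
        f"(ycsb_key TEXT PRIMARY KEY, {cols})")
    placeholders = ", ".join(["?"] * (len(FIELDS) + 1))
    conn.executemany(
        f"INSERT INTO usertable VALUES ({placeholders})",
        [(f"user{i}", *[f"v{i}_{f}" for f in FIELDS])
         for i in range(records)])
    conn.commit()
    return conn
```

In the real toolkit this phase runs once, and the app exits before any measurement takes place.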
After initializing the database, the app finishes and exits. When invoked a second time, the runner loads the workload — a sequence of SQL statements — into memory. The choice to use a pre-defined, pre-loaded trace was made for two reasons. First, it ensures that overheads from workload generation remain constant across experiments; no time is spent assembling SQL query strings during measurement.
Second, having the same exact sequence of queries allows for repeatable experiments across database engines and with different instrumentation configurations.
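The replay loop can be sketched as follows: the trace is fully materialized before timing begins, so only query execution falls inside the measured interval. The function and trace below are illustrative, not the toolkit's actual code.

```python
import sqlite3
import time

def replay(conn, trace):
    """Replay a pre-loaded trace of SQL statements, recording
    per-query latency (a sketch of the measurement loop)."""
    statements = list(trace)  # materialize the full trace before timing
    latencies = []
    for sql in statements:
        start = time.perf_counter()
        conn.execute(sql)
        latencies.append(time.perf_counter() - start)
    return latencies

# Example: replay a tiny hand-written trace against an in-memory database.
conn = sqlite3.connect(":memory:")
trace = ["CREATE TABLE t (k TEXT, v TEXT)",
         "INSERT INTO t VALUES ('a', '1')",
         "SELECT * FROM t WHERE k = 'a'"]
latencies = replay(conn, trace)
```

Because the same statement sequence is replayed verbatim, results are directly comparable across engines and instrumentation configurations.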
Optionally, query traces provided to the app may be annotated with an equivalent key-value operation (e.g., PUT, GET, SCAN). Annotated traces can be run on a key-value backend like BerkeleyDB’s native interface or our naive in-memory TreeMap-based store.
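The naive in-memory store's role can be sketched with a minimal ordered key-value map supporting the three annotated operations (PUT, GET, SCAN). This Python class stands in for the Java TreeMap-based backend and is illustrative, not the toolkit's implementation.

```python
import bisect

class NaiveKVStore:
    """A naive in-memory ordered key-value store: a stand-in sketch for
    the TreeMap-based backend used to run annotated traces."""

    def __init__(self):
        self._keys = []  # sorted key list (mimics TreeMap's ordering)
        self._data = {}

    def put(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def scan(self, start_key, count):
        """Return up to `count` (key, value) pairs from start_key onward."""
        i = bisect.bisect_left(self._keys, start_key)
        return [(k, self._data[k]) for k in self._keys[i:i + count]]
```

Running an annotated trace against such a backend isolates the cost of SQL processing itself from the underlying storage operations.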
Publications:
Carl Nuessle, Grant Wrazen, Lukasz Ziarek, Geoffrey Challen, Oliver Kennedy. "PocketData: Repeatable Benchmarking for Mobile Data". Preprint.
Carl Nuessle, Lukasz Ziarek, Oliver Kennedy. "Debugging Performance Issues in Mobile Data Management". Preprint.