
Database Operations

We consider measuring the performance of database operations in the context of a tracking application that stores track information in a database. Tracks are indexed using their location (spatial coordinates). The tracker operates in discrete time intervals called cycles. During each cycle, the tracker receives a set of target reports from a radar. It asks the database to search for all tracks that could be associated with each target report, based on location. The tracker may direct the database to insert new tracks based on target reports that are not associated with any tracks, and to delete specific tracks.

The database interface receives a stream of instructions from the tracker consisting of the search, insert, and delete operations to be performed. Its output is a set of record identifiers, which would ordinarily be used to look up the actual records in memory; because the actual database does not exist, these identifiers are essentially random 32-bit integers. Our goal is to measure the performance of the search, insert, and delete operations without ever touching the contents of any particular record. The major motivation is to avoid generating the large amount of data a real database would require: a typical record from a feature-aided tracking application is on the order of 650 bytes (see [1]), and test cases of interest may require up to 100,000 such records. Thus, the benchmark does not generate or maintain the contents of the database itself, only the indexing structures, and those structures store only the values we need: the x and y coordinates and a record identifier, a 32-bit integer that serves as an index of, or pointer to, a data record. For an application of this type, the three database operations needed are:

search
Look up and retrieve all items whose characteristics fall within a given range. Here, a search is done for all targets within a specified range of a particular (x, y) coordinate pair, where x and y are floating-point numbers. Given a set of sector bounds { xMin, xMax, yMin, yMax }, this search can be expressed in a fashion approximating the structured query language (SQL) as

    select * from TrackDatabase
    where (x > xMin AND x < xMax AND y > yMin AND y < yMax)

insert
Add a new item to the database. This can be expressed in an SQL-like fashion as

    insert into TrackDatabase values (id, xu, yu)

delete
Remove an item from the database, expressed in an SQL-like fashion as

    delete from TrackDatabase where (x = xu AND y = yu)
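
As a concrete illustration, the index entry and the three operations might be declared as in the following C sketch. The names (TrackEntry, db_search, db_insert, db_delete) are ours, and the benchmark's actual interface may differ; this only mirrors the minimal data listed above.

    #include <stdint.h>
    #include <stddef.h>

    /* One index entry: only the values the benchmark stores. */
    typedef struct {
        float    x;   /* x coordinate (km) */
        float    y;   /* y coordinate (km) */
        uint32_t id;  /* record identifier: index of, or pointer to, a record */
    } TrackEntry;

    /* search: write the identifiers of all entries with
     * xMin < x < xMax and yMin < y < yMax into out[] (capacity outCap);
     * return the number of matches found. */
    size_t db_search(float xMin, float xMax, float yMin, float yMax,
                     uint32_t *out, size_t outCap);

    /* insert: add an entry with identifier id at coordinates (xu, yu). */
    void db_insert(uint32_t id, float xu, float yu);

    /* delete: remove the entry at coordinates (xu, yu). */
    void db_delete(float xu, float yu);
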
Database workloads are provided for two scenarios: a kinematic tracking scenario similar to the parameters proposed for the integrated radar-tracker benchmark, and a multi-hypothesis tracking scenario in which the database is allowed to become much larger. For each scenario, the frequency of each operation (search, insert, delete) is specified. The parameters that define these two scenarios are given in Table 1.
Table 1: Tracking parameters.

Parameter                                       Set 1     Set 2
Cycles to run                                     100       100
Total existing targets (P + U)                    500   102,400
Number of placed targets (P)                      450    92,160
X total area, M (km)                                5        32
Y total area, N (km)                                5        32
X search area, Δx (km)                              2         2
Y search area, Δy (km)                              2         2
Overall target density, d (targets/km²)            20       100
Search operations per cycle, ns                   400       100
Matches found per search, k                        80       400
Insert operations per cycle, ni                    20       300
Delete operations per cycle, nd                    20       300
Workload per cycle (transactions)                 440       700

Test Data

Test data for the database is drawn from a tracker scenario with stationary targets that appear in and disappear from the database. The targets are distributed roughly evenly over a grid of size M × N km². They are divided into a set of placed targets, P, and a set of unplaced targets, U. Targets in set P have corresponding records in the database (they have previously "appeared"); targets in set U do not (they have "disappeared"). The size of set P is chosen so that a search over an area of size Δx × Δy finds k targets on average; that is, the target density is d = k/(ΔxΔy) targets/km². The size of set U is chosen to allow sufficient insertions and deletions while leaving enough additional targets that successive searches differ measurably. We have chosen to make P 90% of the total targets and U the remaining 10%.
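
These sizing relationships can be checked with a few lines of arithmetic. The following C fragment reproduces the Set 1 column of Table 1 (the variable names are ours):

    /* Set 1 parameters from Table 1. */
    double dx = 2.0, dy = 2.0;   /* search area, Δx × Δy (km) */
    double k  = 80.0;            /* expected matches per search */
    double M  = 5.0,  N = 5.0;   /* total area, M × N (km) */

    double d     = k / (dx * dy);   /* density: 80 / 4  = 20 targets/km^2 */
    double total = d * M * N;       /* P + U:   20 * 25 = 500 targets     */
    double P     = 0.9 * total;     /* placed targets:   450              */
    double U     = 0.1 * total;     /* unplaced targets:  50              */
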

The database generator creates a sequence of commands for the database. To do this, the generator must keep track of the sets P and U. For each cycle, it chooses ns uniformly random (x, y) pairs, nd targets from P, and ni targets from U. The ns pairs are passed to the database, which searches the records and returns those within Δx and Δy of the random pairs. The nd targets from P are passed to the database to have their records deleted (they "disappear"). The ni targets from U are passed to the database and corresponding records are inserted (they "appear"). The time reported for each cycle is the total time to execute the search, insert, and delete commands. Command generation occurs before any of the actual benchmarking and is therefore not timed.
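
A single generator cycle could be sketched in C as follows. This is a simplified sketch: uniform, pick_and_remove, add, and the emit_* functions are hypothetical helpers, and we assume the Δx × Δy search window is centered on each random pair.

    /* One cycle of command generation. placed[] holds the P targets
     * currently in the database; unplaced[] holds the U targets
     * currently absent from it. */
    for (int s = 0; s < ns; s++) {            /* ns search commands */
        float x = uniform(0.0f, M);           /* random (x, y) pair */
        float y = uniform(0.0f, N);
        emit_search(x - dx/2, x + dx/2, y - dy/2, y + dy/2);
    }
    for (int j = 0; j < nd; j++) {            /* nd delete commands */
        Target t = pick_and_remove(placed, &nPlaced);      /* P -> U */
        add(unplaced, &nUnplaced, t);
        emit_delete(t.x, t.y);                /* target "disappears" */
    }
    for (int j = 0; j < ni; j++) {            /* ni insert commands */
        Target t = pick_and_remove(unplaced, &nUnplaced);  /* U -> P */
        add(placed, &nPlaced, t);
        emit_insert(t.id, t.x, t.y);          /* target "appears" */
    }
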

Workload

To compute a workload value for each scenario, we count each transaction (search, insert, or delete) as one operation to be performed. This workload value, given in Table 1, can be used to compute throughput for the database kernel (in transactions per second) and to compare results across architectures. However, no efficiency is defined for the database kernel benchmark: a peak performance figure would have to be derived from the rate at which the chip can perform each database operation, which in turn depends on the memory hierarchy of the entire system.
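
Given the per-cycle workload from Table 1 and a measured cycle time, throughput follows directly; a minimal sketch (function and parameter names are ours):

    /* Throughput in transactions per second for one cycle.
     * workload is ns + ni + nd from Table 1 (e.g., 440 for Set 1);
     * cycleTimeSec is the measured time for that cycle's commands. */
    double throughput_tps(double workload, double cycleTimeSec)
    {
        return workload / cycleTimeSec;
    }
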

Reference

1. W. Coate and M. Arakawa. Preliminary design review: Feature-aided tracking for the PCA integrated radar-tracker application. Project Report PCA-IRT-5, MIT Lincoln Laboratory, Lexington, MA, October 2004.