Using an Object Oriented Database to Store BaBar's Terabytes

Tim Adye
Particle Physics Department
Rutherford Appleton Laboratory
CLRC

Abstract

The BaBar experiment at the PEP-II electron-positron collider at SLAC has recorded more than 30 Tb of data in an Object Oriented database, and is expected to reach a data rate of 300 Tb/year. This data must be accessible to 400 physicists in 10 countries. We report on the design and initial experience with this system.

BaBar is among the first in a new generation of particle physics experiments pushing the limits of data storage and access technologies. We make mention of plans for future experiments, notably on the LHC at CERN, with storage requirements in the multi-Petabyte region.

The BaBar Experiment

The BaBar experiment is based in California at the Stanford Linear Accelerator Center (SLAC), and was designed and built by more than 500 physicists from 10 countries, including from 9 UK Universities and the Rutherford Appleton Laboratory.

BaBar is looking for the subtle differences between the decay of subatomic particles, the B⁰ meson and its antiparticle (B⁰bar). If this effect (CP Violation) is large enough, it could explain the cosmological matter-antimatter asymmetry.

Since we are are looking for a subtle effect in a rare (and difficult to identify) decay, we need to record the results of a large numbers of events.

Since BaBar started operation in May 1999, we have recorded and analysed 260 million events, 48 million of which have been written to database, the remainder are rejected as part of the initial event processing (the raw data is archived in HPSS in flat-file format).

The experiment is scheduled for at least 4 more years' running with continually improving luminosity. We eventually expect to record data at around 100 Hz or 10⁹ events per year.

Since each event uses 100-300kb and we need to generate 1-10 times that number of simulated events, we will need to store around 300 Tb a year, or 1-2 Pb during the lifetime of the experiment. The database currently holds 34 Tb.

Choice of an Object Oriented Database

BaBar is the first large High Energy Physics (HEP) experiment to adopt C++ and Object Oriented (OO) techniques wholesale. This made an OO database management system (OODBMS) a natural choice. The main factors influencing our choice were

We require a distributed database accessible from many clients. A 200+ node farm at SLAC is used for event processing and analysis.
Multiple database servers allowing for scaling with the data volume. Currently we use 15 data and 20 lock and metadata servers.
Since the data structures will change over time, but we cannot afford to reprocess everything, we need to use some form of schema evolution.
A native C++ API (and Java).

Objectivity/DB, which supports all these features, was chosen. It is also the front runner for data storage on future experiments at CERN.

Data Organisation

Traditional HEP analyses read each event and select relevant events, for which additional processing is done. This model can be supported with sequential files.

In BaBar there is too much data to allow all the people to read all the data all of the time. Moreover, all the data cannot be stored on disk.

To allow for many different analyses, selecting different subsets of the events without reading the entire dataset, we organise data into different levels of detail and only read more detailed information when required. The tag, "microDST", "miniDST", full reconstruction, and raw data are streamed to separate files, with Objectivity keeping track of cross-references.

In addition data is also streamed according to common event selections, for which an inclusive signature can be determined in advance, such as hadronic events (which should include all the B⁰-B⁰bar events).

The tag and microDST data, as well as more detailed information for the most interesting events is stored on disk. Everything is archived in the mass store (HPSS at SLAC). Despite this we are already buying around a Terabyte of disk a month, so in future even this may not be enough. One possibility would be to keep just the tag on disk as well as a "dataset of the month", and require physics analyses to be performed in-step.

Performance

The main challenge has been getting this to scale to hundreds of processes/processors reading and writing at the same time. The vendor seems to believe we can do it.

"The Terabyte Wars are over.
While other vendors quarrel about who can store 1 Terabyte in a database, the BaBar physics experiment at the Stanford Linear Accelerator Center (SLAC) has demonstrated putting 1 Terabyte of data PER DAY into an Objectivity Database."
Top news item on Objectivity web site, May 1999

This was the result of a lot of effort to improve the speed of recording events.
Performance Plot

The figure shows how the total event throughput scales with the number of processors. The improvement from some of the many approaches attempted can be seen.

Work is ongoing to obtain similar improvements in data access.

Regional Centres

Even with all the measures to improve analysis efficiency at SLAC, it cannot support the entire collaboration. Moreover, the network connection from UK is slow, occasionally unreliable. Therefore we need to allow analysis outside SLAC.

The model is for three "Regional Centres" in the UK, France, and Italy. RAL is the UK Regional Centre. It has been a major challenge to transfer data from SLAC, and to reproduce databases and the analysis environment at RAL.

As part of a £800k JREI award, we have Sun analysis and data server machines with 5 Tb disk at RAL. UK Universities have smaller Sun machines and 0.5-1 Tb locally.

The microDST data is imported using DLT-IV tapes. With our data, these provide around 50 Gb per tape with compression (35 Gb native). We have so far exported 3 Tb.

The Objectivity federation has been interfaced to the Atlas Datastore (see paper by Tim Folkes in these proceedings). As well as acting as a local backup, the less-used parts of the federation can be archived and brought back to disk on demand (this procedure still requires more automation).

Other Experiments

BaBar's requirements are modest with respect to what is to come.

2001 Tevatron Run II

~1 Pb/year
CDF JIF (Joint Infrastructure Fund) award
Regional Centre at RAL

2005 Four LHC Experiments

many Pb/year
Prototype Tier 1 centre at RAL: 3100 PC99, 125 Tb disk, 0.3 Pb tape, 50 Mbps network to CERN
Tier 2 centres at Edinburgh and Liverpool
Require ~5 times more at startup

Future Software

The two major software choices that face these future experiments (the LHC in particular) are a storage manager (HSM) and database system (OODBMS).

A number of HSM systems, of varying degrees of complexity (and cost), are under consideration.

HPSS is expensive. Maybe we do not need all the bells and whistles, but is is already in use at SLAC, CERN, and some other HEP sites.
EuroStore (EU/CERN/DESY/...)
ENSTORE (Fermilab)
CASTOR (CERN)

BaBar's use of Objectivity is seen as a testbed for later experiments. Our experience will help determine whether it is well-suited for HEP. The is also being investigated by the MONARC Project, and can be compared with home-grown system such as Espresso from CERN.

Conclusions

Using Objectivity and locally-written database tools, we have successfully stored and analysed the BaBar data, both at SLAC and outside. Due to the ever-increasing data rate, data storage remains one of the major challenges of the BaBar experiment. BaBar's experience with Objectivity should inform decisions of future large HEP experiments.

http://hepwww.rl.ac.uk/Adye/talks/000225-dl-data-storage/000225-dl-data-storage.html last modified 13th March 2000 by

Tim Adye, <T.J.Adye@rl.ac.uk>