Following the 1994 White Book 4 (WB4) review of high energy physics computing, the central facilities provided by the then RAL Central Computing Department (now the CLRC/RAL Department of Computing and Information, DCI) were revolutionised, becoming much more focused on the actual needs of the community and requiring a significantly lower level of support. The Computing and Network Advisory Panel (CNAP) inherited the responsibility from WB4 for overseeing the execution of the new strategy. Given the rapid pace in the development of computing and information technology, it is now appropriate to undertake a thorough follow-up review of the central facilities for UK HEP.
The formal terms of reference of this review are to be found in annexe 1 to this report, along with the Review Panel membership (annexe 2).
After a brief review of the current status of central computing provision (section 2), the boundary conditions and strategic directions are given (section 3). Extensive surveys have been carried out to solicit the opinions of the user community (section 4) and to study the validity of computing models first adopted in WB4 (section 5). Section 6 summarises the interaction with DCI during the review. The findings of the panel are presented in section 7 and lead to the recommendations laid out in section 8. A brief overview of the WB4 review is given in annexe 3 and a summary of comparable computing facilities at major accelerator centres and some national laboratories is given in annexe 4. References and support documentation for this review are listed in annexe 5.
We thank the members of the HEP community who provided answers to the detailed questionnaire on present and future use of the central facilities. We thank those people who have worked hard, often at short notice and usually with good humour, to answer our many questions and especially those from DCI and PPD who have borne the brunt of this onslaught. Indeed, DCI have willingly and openly discussed the future provision of central services.
The
recommendations of WB4 as they affected the provision of
Central Computing resources are summarised in annexe 3.
The essence of the changes resulting from WB4 was to
provide three CPU platforms coupled to a large datastore
for the batch production of Monte Carlo simulated
datasets. One area of development beyond WB4 which
deserves comment is the provision of central disk storage
space. AFS (and equivalent systems) has become more
important to the community and the reduction in the price
of disk storage has meant that the central disk store
could be increased. This makes both more efficient use of
the robot datastore and permits the use of the central
disk store for the temporary storage of datasets.
The overall resources and principal
deliverables are defined within the framework of the
Service Level Agreement (SLA) between PPARC and CCLRC for
the Particle Physics Programme. The present DCI services
are those defined by the WB4 exercise, but a number have
been upgraded with full agreement from CNAP.
A document supplied by DCI and titled: Review
of RAL Particle Physics [1] (also 'Additional Input'
[2] ) has a large part describing the services now
offered. In this section, a summary is given of the most
important aspects of the resources employed, and services
offered, within the current Central Computing provision.
Table 1 shows the allocation of
resources and Table 2 the breakdown of capital
expenditure for the financial year 1996/97.
| | Effort (SY) | Expenditure (£k) |
| Direct Staff Effort | 9.5 | - |
| Direct Staff Costs | - | 35.00 |
| Recurrent | - | 44.00 |
| Capital | - | 108.06 |
| Indirect Staff Effort | 0.8 | - |
| Indirect Staff Costs | - | 9.00 |
| TOTAL | 10.3 | 196.06 |
Table 1 : Allocation of Resources for FY 1996/97
For comparison, in 1995/6 the
effort was 16 SY and the expenditure £200k. The
'Direct Staff Costs' are administrative costs
associated with the DCI staff concerned. The
total expenditure in providing central computing
services for Particle Physics is £657k in
1996/7. In earlier financial years the
expenditure on central computing was
approximately twice the current amount.
| | Amount (£k) |
| AFS/NFS Disk Space (36 Gbyte) | 6.59 |
| 115 Magstar 3590 tapes for IBM 3494 robot | 4.37 |
| Import/Export Server | 2.00 |
| Disk Farm - 100 Gbyte initial phase | 28.00 |
| CSF Upgrade (four HP9000-C110 machines) | 32.50 |
| CSF Disk | 3.10 |
| ISDN Bonding Software | 1.00 |
| Windows-NT PC Farm - first phase | 15.50 |
| Unallocated | 15.00 |
| TOTAL | 108.06 |
Table 2 : Budget Breakdown for Capital in FY 1996/97
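As a quick consistency check on Tables 1 and 2, the line items can be summed and compared with the stated totals. The sketch below is illustrative only: the figures are transcribed from the two tables, while the dictionary and function names are assumptions introduced here.

```python
from decimal import Decimal

# Expenditure lines from Table 1, in £k (names are illustrative).
TABLE1_EXPENDITURE = {
    "Direct Staff Costs": Decimal("35.00"),
    "Recurrent": Decimal("44.00"),
    "Capital": Decimal("108.06"),
    "Indirect Staff Costs": Decimal("9.00"),
}

# Effort lines from Table 1, in staff years (SY).
TABLE1_EFFORT = {
    "Direct Staff Effort": Decimal("9.5"),
    "Indirect Staff Effort": Decimal("0.8"),
}

# Capital items from Table 2, in £k.
TABLE2_CAPITAL = {
    "AFS/NFS Disk Space (36 Gbyte)": Decimal("6.59"),
    "115 Magstar 3590 tapes for IBM 3494 robot": Decimal("4.37"),
    "Import/Export Server": Decimal("2.00"),
    "Disk Farm - 100 Gbyte initial phase": Decimal("28.00"),
    "CSF Upgrade (four HP9000-C110 machines)": Decimal("32.50"),
    "CSF Disk": Decimal("3.10"),
    "ISDN Bonding Software": Decimal("1.00"),
    "Windows-NT PC Farm - first phase": Decimal("15.50"),
    "Unallocated": Decimal("15.00"),
}


def total(items):
    """Sum the values of a table, using exact decimal arithmetic."""
    return sum(items.values(), Decimal("0"))


expenditure_total = total(TABLE1_EXPENDITURE)
effort_total = total(TABLE1_EFFORT)
capital_total = total(TABLE2_CAPITAL)

print(f"Table 1 expenditure total: {expenditure_total} (£k)")
print(f"Table 1 effort total:      {effort_total} (SY)")
print(f"Table 2 capital total:     {capital_total} (£k)")
print("Capital line matches Table 2:",
      capital_total == TABLE1_EXPENDITURE["Capital"])
```

Under this transcription both tables are internally consistent, and the Capital line of Table 1 equals the total of Table 2.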
The £15k of unallocated funds is being spent at the time of writing. Some of the funds are being targeted on disk upgrades so that CSF will be able to handle 1 Gbyte simulation datasets (a capability that has already been identified by CNAP as a pressing requirement).
In Figures 1-3, the weekly CPU utilisations are shown through most of 1996 for the CSF (HP-UX), OSF (Digital UNIX) and AXPRL1 (VMS) central platforms.
The CSF farm has been heavily
used throughout the year for Monte Carlo
production by a number of experiments including
ALEPH, BaBar, H1, OPAL, NA48 and SNO.
Preparations are also underway for the ATLAS
collaboration to produce a Monte-Carlo dataset of
10^7 jets for calorimeter and trigger
studies. It is clear that CSF continues to
provide an extremely valuable service for
large-scale simulation work. With full agreement
from CNAP, the use of CSF has diversified through
the year and it has played an essential role in
data analysis for some experiments. During summer
1996, the ALEPH(UK) groups used this facility to
perform a rapid and competitive WW 4-jet mass
measurement from their LEP2 dataset. Due to
changes in HERA operation and increases in
luminosity, the H1 collaboration are unable to
continue reprocessing their data at DESY during
the accelerator shutdowns. A significant part of
the 1996 reprocessing is currently underway on
CSF.
Figure 1: Particle Physics Utilisation of CSF (UNIX) Platform.
Figure 2: Particle Physics Utilisation of OSF (UNIX) Platform.
Figure 3: Particle Physics Utilisation of AXPRL1 (VMS) Platform.
Use of VMS continued to be dominated by DELPHI Monte Carlo production, but the facility was also used by CPLEAR, ALEPH, Crystal Barrel, CMS and Soudan II. The future of the VMS service was discussed by the CNAP committee on 3/10/96 and it was decided to recommend closure at the end of September 1997. Users have been informed and measures are being put in place to cater for those most seriously affected. The decision to close the VMS service has been endorsed by the Review Panel.
OSF is a shared facility and the HEP fraction (50%) was only fully utilised in the second half of 1996. The use is a mix of Monte Carlo and analysis. The OSF system is the server for the ZEUS Funnel operation in the UK and it also acts as the collection point for Monte Carlo data from CPUs based at many institutes. The CPU capacity of the system is too small for large scale Monte Carlo production work. During the last year the largest users of OSF were WA89 (hyperon) analysis, BaBar and OPAL Monte Carlo in that order. It is the central code repository for UK BaBar and an important facility for UK groups who are developing code for BaBar.
During the first half of 1996, a trial disk farm (where datasets are maintained semi-permanently) was set up so that connectivity and software could be tested by the ALEPH collaboration. The successful trial culminated in the purchase of 25*4 Gbyte Fast SCSI disks, which were installed in July 1996. The Disk Farm provides a fast alternative to the Datastore for frequently accessed datasets. The 100 Gbyte of disk space is already shared between eight experiments and CNAP has approved the purchase of another 100 Gbyte in 1997/98. There is strong demand for this facility already.
The 'rfio' software from CERN has been installed and work is underway to modify the stager utility to interface to the RAL tape system.
During 1996, a number of upgrades were made to the Datastore.
Particle physics data has also been reorganised within the robot so that virtual files belonging to different owners are no longer mixed on the same physical volume. This will make it easier to monitor and manage Datastore use in the future.
It has been recognised for some time that the import/export facilities into the Datastore are inadequate for the bulk handling of large datasets that can simultaneously arrive from a number of experiments. The H1 reprocessing alone has been estimated to require the import of 5 Tbyte of data over a short time period. A proposal has been submitted, and agreed within CNAP and the Data Storage Subgroup, to provide an automated facility based on DLT7000 technology. It is intended that this open-shop system should be capable of importing and exporting data 24 hours/day with minimal operator intervention.
AFS has become a production service with a dedicated server for particle physics. A further server is envisaged to give the necessary resilience to the existing server once funds are available. The master copy of the ASIS repository also moved to AFS in 1996. The AFS client software has been installed on both the CSF and OSF systems, although work remains to provide a secure implementation for the NQS batch system. An additional 36 Gbyte of disk was purchased for both NFS and AFS (for the storage of software as opposed to data) and has been split according to the requirements of experiments.
At the end of July 1996 four HP C110 machines, each with 64 Mbyte of memory and 2 Gbyte disk, were added to the CSF platform. This expanded the farm to 29 batch nodes and benchmarking indicated a 30% increase in CSF CPU power to about 688 CERN Units. The PA7200 CPU chip in the C110 nodes is upgradable to the PA8000 processor, offering the possibility of further performance enhancements.
The smooth installation of the C110 nodes is reflected in Figure 1, where the weekly CPU utilisation is seen to increase to about 5,000 hours shortly after their arrival. This coincided well with a significant demand from the OPAL experiment which simulated and reconstructed over 3 million LEP1 events in 3 months.
It is anticipated that the CSF operating system will be upgraded to HP-UX 10 in early 1997.
Farms consisting of commodity PC machines offer a cost-effective solution to providing the increased CPU power needed for present and future high rate experiments. A development project was launched by DCI to study the requirements for a PC Farm. Initially, this consisted of two machines running Windows-NT (WNT) v3.51: a standard off-the-shelf PC (120MHz Pentium) and one assembled at RAL from proprietary components (166MHz Pentium). This provided a useful test-bed with which to investigate compilers, install WNT versions of the CERN library and run particle physics software. Benchmarking tests indicated the 166MHz machine to have a performance mid-way between an HP712 and an HP735 (machines used in CSF).
A proposal was made to CNAP in October to extend this work and commission a production PC Farm for the UK particle physics community. At the same time, a Windows-NT Subgroup of CNAP was launched to co-ordinate the work of systems managers and help to define the detailed requirements of the community. Full agreement was reached on the provision of a WNT production service at RAL starting with a first phase of five Pentium Pro (200MHz) nodes.
The White Book
4 review [9] led to a revolution in central computing for
particle physics, and resulted in a far more focused and
cost-effective central service. Furthermore the funding
of computing services via the SLA between PPARC and CCLRC
for the Particle Physics Programme has led to a greater
degree of financial control than had been enjoyed
previously.
Although the three platforms set up by
WB4 were designed primarily for non-interactive Monte
Carlo event generation it is clear that the coupling of a
large datastore and fast CPUs allows for other uses.
Small groups and non-accelerator experiments continue to
use the central facility as their primary datastore and
analysis resource (e.g. WA89, Soudan II and Crystal
Barrel). The changing pattern of running at HERA, and the
vagaries of international network connections, have also
led to requests for significant analysis or
reconstruction work. Indeed the
Review Panel would like to stress that the use of central
computing resources is, and will continue to be, affected
strongly by the performance of international network
connections.
This review of central computing was
required to evaluate the efficacy of the current
provision, and then to determine a strategy for the next
two years and beyond into the LHC era. It was required to
study the resources the particle physics community
invests in central computing, and to examine the balance
between Monte Carlo and analysis/reconstruction work.
This section summarises the strategic
directions and assumptions used by the Review Panel.
In the absence of firm financial boundary conditions, the Review Panel decided to adopt three possible funding levels for services provided by DCI. These are 100%, 85% and 70% of the funding resources provided in FY 96/7 (see section 2.1). It was agreed that the Panel would develop strategies for each of these.
The White Book 4 recommendations were
successful for two reasons in particular:
The Review Panel has sought to
build on this foundation and to take the process
a step further. The strategy adopted is as
follows:
As part of the
review, a user survey was carried out in the Autumn of
1996. A questionnaire [3] was sent to one representative
per experiment per institute. This asked questions on the
present and planned usage of the central computing and
datastore facilities.
The survey shows some changes in the
use of central RAL facilities since WB4. Although most
LEP and HERA groups still work to the model which emerged
in the 1994 survey (annexe 3) there is a significant
increase in the usage by some university groups. The
importance of the services to some small experiments
(e.g. Soudan II and CPLEAR), and the necessity of central
services for the next generation of experiments (BaBar
and LHC), also clearly emerges from the survey. Although
the central services are widely used, some users felt
that there was inadequate documentation on some aspects
of the services. The provision of useful and up-to-date
documentation on the web pages should be considered as an
important user requirement.
There is an important interplay in any computing centre between data storage and CPU capacity. The survey asked about present and future usage of both, and about awareness of the services provided by DCI. Apart from the provision of central computers and data storage there was almost no mention of the services classified as "Other Services" (see section 2.8). It would seem that the effectiveness of this work, at least in making its existence known, could be improved.
One very important feature of a
central service is the storage of data. At
present the central DCI facilities for data
storage comprise the Datastore, of which 10
Tbytes are available for particle physics, and
100 Gbyte of associated disk space. These
facilities are used extensively by members of
most collaborations in the UK. For the LEP and
HERA experiments the main usage has been
associated with the large Monte Carlo production
runs. These runs are the UK shares of
collaboration wide Monte Carlo event production.
In addition, there is some storage space used for
DSTs and ntuples, and the H1 HERA experiment is
planning to reprocess a significant fraction of
their data, which will necessitate a sizeable use
of the Datastore. The user survey confirms this
pattern of usage. The LEP community constitutes
about 32% of the total HEP community. The
corresponding figure for HERA is 22%. The pattern
of non Monte Carlo usage varies considerably
between the different experiments and even
institutes within the experiments, with some
groups making extensive use and others no use at
all. In the future the LEP needs will tail off as
the LEP-1 data analysis is completed, since the
LEP-2 datasets are smaller. Only small usage
beyond the year 2000 is foreseen. For HERA the
demands are expected to grow as the luminosity
delivered increases; however, the rise may well
be slower than that of the luminosity due to
additional data selection.
The next largest community of
users is the LHC with 25%. At present the storage
usage is relatively small. However, as discussed
below, the LHC experiments will need extensive
storage and CPU capacity. The BaBar experiment
involves about 7% of the HEP community and again
will have significant future demands. NA48, with
3% of the community, will start in 1997 to
produce large volumes of data. The other smaller
experiments (SNO 4%, UKDMC 3%, Soudan II/Minos 2%,
Crystal Barrel 1% and WA89 1%) which use the
central facilities, all find them very important
to their work.
The demands for data storage
are continually increasing and the Datastore
occupancy is already close to capacity. Estimates
from the experiments show an ever increasing
demand. However, obtaining reliable estimates of
future usage seems to be difficult. For example,
there was a large increase in usage from August
96 to January 97; well beyond that expected from
estimates made earlier in 1996.
In order to attempt to meet
these requests, the CNAP Data Storage Subgroup
has explored a number of options with DCI. One of
these is an effective quota assignment for each
experiment whereby a certain number of slots in
the robot would be allocated for each experiment
and the experiment would choose which data would
be active in the robot. This solution also
implies the collaborations would need extra
resources to organise offline data archives.
Details are being worked out and the system will
be introduced as soon as possible.
Another aspect of the Datastore
which is also currently problematic is the
capacity to import and export data. This is a
severe bottleneck, and is a priority item in the
improvements foreseen.
In the longer term there will
need to be a significant increase in the capacity
of both the Datastore and disk storage. The data
volume requested to be stored is likely to grow
in the next few years at somewhere between 4 and
8 Tbytes per year. At present the Data Storage
Subgroup are exploring, together with DCI,
possible solutions and costings for upgrades.
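To illustrate the scale of the growth quoted above, the following sketch projects the requested data volume forward for the 4 and 8 Tbytes per year cases. The 10 Tbyte starting point is an assumption (based on the statement that the Datastore occupancy is already close to capacity), as are the linear growth model and all names used.

```python
# Datastore space available to particle physics, in Tbytes (from the text).
CAPACITY_TB = 10.0
# Estimated range of annual growth in requested volume, in Tbytes per year.
GROWTH_RANGE_TB_PER_YEAR = (4.0, 8.0)


def projected_volume_tb(start_tb, growth_tb_per_year, years):
    """Linear projection of requested volume after a number of years."""
    return start_tb + growth_tb_per_year * years


def projection_table(start_tb=CAPACITY_TB, horizon_years=3):
    """Return (years, low, high) rows for the low and high growth cases.

    Assumes the store starts full, which is a simplification.
    """
    low_rate, high_rate = GROWTH_RANGE_TB_PER_YEAR
    rows = []
    for years in range(1, horizon_years + 1):
        low = projected_volume_tb(start_tb, low_rate, years)
        high = projected_volume_tb(start_tb, high_rate, years)
        rows.append((years, low, high))
    return rows


for years, low, high in projection_table():
    print(f"after {years} year(s): {low:.0f} to {high:.0f} Tbytes requested "
          f"(capacity today {CAPACITY_TB:.0f} Tbytes)")
```

Even the low-growth case exceeds the present capacity within the first year under these assumptions, which is why the upgrade options are being costed now.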
The largest CPU power is at present that
of the CSF platform. In 1996 the LEP experiments
used 47% of the CPU (ALEPH 36%, OPAL 9% and
DELPHI 2%), HERA (mostly H1) used 15%, SNO 14%,
other accelerator experiments used 16% (largest
are NA48 7%, CPLEAR 5% and BaBar 4%) and the LHC
experiments used 8% (CMS 6% and ATLAS 2%).
The estimated CPU usage for
production work from all experiments (mainly
Monte Carlo) is about 54%. This work is submitted
by a small number of specialists on behalf of
their collaborations. If likely production work
is removed from the above statistics then RAL/PPD
use accounts for 17% of remaining CPU capacity
and university group usage 83%. This represents a
significant change since WB4 when the largest
fraction of CPU usage was from the RAL/PPD group.
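The percentages quoted above can be cross-checked, and the non-production split re-expressed as fractions of the total, with a few lines of arithmetic. This is a sketch only: it assumes that the 54% production estimate and the 17%/83% split refer to the same CPU total as the experiment shares, and the names are introduced here for illustration.

```python
# 1996 CSF CPU shares by group, in percent (from the text).
CSF_SHARES_1996 = {
    "LEP": 47,
    "HERA": 15,
    "SNO": 14,
    "Other accelerator experiments": 16,
    "LHC": 8,
}
LEP_BREAKDOWN = {"ALEPH": 36, "OPAL": 9, "DELPHI": 2}

PRODUCTION_PERCENT = 54               # estimated production (mainly Monte Carlo)
RAL_PPD_PERCENT_OF_REMAINDER = 17     # share of the non-production remainder
UNIVERSITY_PERCENT_OF_REMAINDER = 83  # share of the non-production remainder


def share_of_total(percent_of_remainder, production_percent=PRODUCTION_PERCENT):
    """Convert a share of non-production use into a share of all CPU use."""
    remainder = 100 - production_percent
    return remainder * percent_of_remainder / 100


ral_ppd_total = share_of_total(RAL_PPD_PERCENT_OF_REMAINDER)
university_total = share_of_total(UNIVERSITY_PERCENT_OF_REMAINDER)

print("Group shares sum to:", sum(CSF_SHARES_1996.values()), "%")
print(f"RAL/PPD non-production use:    {ral_ppd_total:.1f}% of total CPU")
print(f"University non-production use: {university_total:.1f}% of total CPU")
```

On these assumptions, university groups account for well over a third of all CPU use outside production work, against under a tenth for RAL/PPD.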
For the LEP experiments the
production Monte Carlo work for ALEPH and OPAL
uses mainly CSF, whereas DELPHI uses VMS on
AXPRL1. In general, the CPU capacity available is
considered just about adequate.
For the HERA experiments again
CSF is most important and it is envisaged that
CPU requirements will grow significantly as the
luminosity increases (e.g. ZEUS estimate a factor
2 increase by the year 2000).
The Crystal Barrel experiment
is complete, but still has a significant need for
both processing power and use of the Datastore.
The SNO experiment will soon
start to produce data and foresees increased
usage by a factor 2 by the year 2000.
The new accelerator experiments, NA48, BaBar and the LHC experiments foresee a rather large increase in their CPU needs, both in the next few years and, particularly for the LHC experiments, beyond the year 2000. The simulation of LHC data is largely CPU limited so to some extent it is difficult to get a realistic picture of requirements as these depend on the CPU and storage capacity available throughout the collaboration. However, in order to contribute usefully to Monte Carlo production, increases in line with other computer centres will be needed. The Windows-NT possibilities are clearly of great relevance to these demands.
The present central
services appear to be reasonably well matched to
the current needs of the UK HEP community, and
are heavily used. There was a very
good response to the user survey, so that it can
be used as a reliable reflection of the user
views on central computing. No serious problems
came to light, although there are some worries on
the reliability of the Datastore in recent months
(which are currently being addressed).
The computing models for the
BaBar experiment and the LHC experiments are
still evolving. In the short term, there is an
increasing demand for both CPU power and storage
space. However, it is expected that significant
use of object oriented databases will be made and
the evolution of possible technologies to be used
in association with such concepts cannot be
accurately forecast.
For the longer term
future, the needs of the new experiments (BaBar
and the LHC experiments) are potentially very
large indeed, and the role of central facilities
will probably be much more important than for the
LEP and HERA experiments. It is
clearly important to embark on work in
preparation for this new era. There is also a
need for a systematic approach to the provision
of software licences. These are particularly
relevant in the area of OO database methods and
are already an issue for the BaBar Collaboration.
It is clearly in the interests
of the whole community if a common solution for
both BaBar and the LHC experiments can be found.
The situation with the LHC experiments should
become a little clearer when their computing
models have been considered by CNAP and the
implications are better understood. However, it
is also to be expected that the models will
evolve with time in response to new ideas and to
technological developments.
Satisfying the potential demands of the new experiments raises questions which extend somewhat beyond the scope of this report. It is most probable that the costs of setting up effective regional computing centres for these experiments would imply budgets well beyond a reasonable projection of the existing level of funding. Whether this funding, if approved, should be through a specific peer reviewed body, such as the PPESP, is a question that will need to be considered by both CNAP and PPC.
In the model of computing recommended in WB4,
data is collected, processed, filtered and stored
centrally at the experimental facilities. Data analysis
and code development are performed both locally at the
experimental facility and at the collaborating institutes
on compressed sub-samples of the data which are stored
(depending on their size) at the collaborating institute
home computers or centrally at RAL. For the small,
non-accelerator, experiments substantial fractions of the
data are held at RAL. Large, collaboration wide, Monte
Carlo simulation is organised centrally and the load
divided between the regional centres. RAL, acting as the
regional centre for the UK, has a farm of processors
tailored to this task.
In a questionnaire on computing models
[4] we asked one representative of each of the large
experimental groups how this model serves their
experiment. In addition to specific questions we invited
respondents to add general comments on their model.
In their replies to the questionnaire
the LEP and HERA representatives indicated that their
current computing models were broadly in line with those
laid down in WB4. It was mentioned by the OPAL and ALEPH
groups that this was partially due to the constraints of
WB4. Most collaborations have made extensive use of
central facilities for large scale Monte Carlo
simulations. Provision for interactive physics analysis
computing was added at a later stage and sanctioned
case-by-case by CNAP. The use of central facilities by
physics analysts differs from that of those carrying out Monte
Carlo production in that analysts are more mobile and will
use the route or system by which they can get the fastest
turn around.
Variations between different
collaborations and institutes do exist with some
collaborations/groups making more use of central
facilities than others. Patterns of central computing
usage have changed with time when, for example, computing
power at and/or networking access to accelerator sites is
compromised. Problems with network access
between home institutes and accelerator sites were
mentioned as the major reason why individuals and groups
change their model of computing. These
problems can be short term in nature, when network
performance becomes poor over a period of weeks or is
lost for a few hours, or long term with a gradually
declining performance due to an ever increasing load. It
is not foreseen that the model used to date by the LEP
and HERA experiments will change significantly over the
remaining lifetime of the experiments. The replies to the
questionnaire indicate that the approach outlined in WB4
has been largely successful.
NA48, BaBar and the LHC experiments are
not yet taking data. However, to date, they have pursued
computing models approximately in line with those of WB4.
The difference between the LEP and HERA experiments and
the "LHC generation" of experiments lies in
their plans for regional computing centres. This is
partly driven by significantly larger data volumes and
partially, in the case of BaBar, by the difficulties of
working effectively across a transatlantic network.
NA48, which will start to take data in
1997, and BaBar, which will start to take data in 1999, will
copy large volumes of data at CERN and SLAC and ship it
to RAL, placing severe requirements on the Datastore.
For the LHC experiments the computing
models are not yet fixed but those being considered
foresee extensive use of the regional centres such as
RAL. The function of the regional centre would include
re-processing of data and the simulation of the large
number of Monte Carlo events required. This will place
severe requirements on data storage and CPU.
As a consequence, the role of RAL can
be thought of as being of similar importance to the
accelerator datastores at the LEP or HERA experiments. RAL
will be pivotal to successful UK participation in the
future generation of experiments.
One new consideration which must be
taken into account is the development and increasing
acceptance of new software design tools and
methodologies. Programming concepts such as Object
Oriented techniques and the replacement of data summary
tapes (DSTs) by Object Oriented Databases are becoming
widespread within the BaBar and LHC Collaborations. These
require commercial products and their cost will have an
effect on the computing budgets of experiments.
The survey clearly indicates that the WB4 model of computing was largely successful within the limits imposed by international networking. The analysis computing models at LEP, HERA and the LHC are particularly susceptible to international networking changes. With the next generation of experiments we are seeing an evolution of these models. The need for central computing centres with the capacity to store large volumes of data and powerful farms for Monte Carlo production is clear. New programming methodologies are, for the first time in 30 years, changing the way many physicists do their analysis.
A crucial part
of the review of central computing facilities was the
exchange of information between the Panel and RAL/DCI.
DCI provided the following
documentation :-
After each presentation from DCI, there was detailed questioning and discussion.
The original document from DCI
([1] and [2]) consists of two distinct parts:
Their strategic directions are
summarised as:
These are consistent with the directions the committee defined in section 3.2.
By the time of the second
meeting with DCI, the Panel had received replies
to its other questionnaires, and had refined its
'baseline requirements' as:
DCI were requested to revise their proposals for funding at 100%, 85% and 70% of the 1996/7 funding level, in the light of the refined baseline requirements, and this resulted in their second submission to the Panel [7].
DCI can only fulfil the baseline requirements with funding at 100% of the 1996/7 level. Below this there would be serious shortcomings. With 100% funding, and once the recommended changes have been made, there would be 8.6 SY of effort (10.3 SY in 1996/7) and £238k of capital and recurrent funds (£152k in 1996/7).
The
Review Panel has looked at the consequences of the
decisions taken by the WB4 Review, and it concludes that
the recommended changes were correct and the resulting
central computing services have been successful.
As proposed by WB4, it is the close coupling of CPU farms
with a large datastore that makes the central facilities
such a valuable resource and one which makes the
community less susceptible to the problems of
international networking. The introduction of platform
managers has also been a significant factor in the
success of the facilities. The staff effort required to
run the central HEP computing facilities has dropped from
26 SY pre-WB4, through 16 SY in 95/96 to 10.3 SY in 96/97
which is roughly the steady state level proposed in WB4.
Capital and recurrent expenditure has run at about £200k
per annum over this period. Although it is very difficult
to make direct comparisons, the information received from
other national laboratories shows that the level of
resources devoted to running the central facilities by
DCI is either at, or on the low side of, what would be
used elsewhere.
The User Survey makes it very clear
that users have a continuing need for central computer
services in order to undertake Monte Carlo dataset
production, for other well specified production purposes
(e.g. the H1 data reprocessing), for analysis services
for the smaller experiments (foreseen for BaBar within a
regional centre), and also for a limited amount of code
development. The operational terms of reference
of the platform managers should be modified by CNAP to
recognise the somewhat modified pattern of usage.
However the Review Panel does not propose that the
facility become a general purpose one. Central resources
should continue to be tightly focused around the coupling
of a large datastore and a concentration of powerful
CPUs. The facilities should complement those available at
home institutes and should be of greatest benefit to the
majority of users.
By locating the HEP central facilities
at a large IT centre such as RAL/DCI the community gain
from the backup available in depth for both staff and
equipment. Maintenance costs are also lower because they
are set at the level of the contract for CLRC as a whole
with the equipment suppliers. However, CNAP should
continue to be vigilant that both these advantages are
realised in practice.
The Panel believes that the
trend established in WB4 should continue and there should
be a further shift in resources from staff and recurrent
costs into capital or savings. As a
consequence the future staff levels in DCI will reduce
from the current 10.3 SY to 8.6 SY with a consequent
increase in capital and recurrent funds from £152k to £238k
(assuming funding at the 100% level).
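The size of this shift can be expressed as relative changes. The sketch below uses only the figures quoted in this paragraph; the function and variable names are illustrative assumptions.

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# (current, proposed) values at the 100% funding level, from the text.
STAFF_SY = (10.3, 8.6)
CAPITAL_AND_RECURRENT_K = (152.0, 238.0)

staff_change = percent_change(*STAFF_SY)
funds_change = percent_change(*CAPITAL_AND_RECURRENT_K)

print(f"Staff effort:                {staff_change:+.1f}%")
print(f"Capital and recurrent funds: {funds_change:+.1f}%")
```

In other words, a reduction of roughly one sixth in staff effort would release enough to raise capital and recurrent funds by somewhat more than half.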
From the existing and projected pattern
of use (using information on the use of the Datastore
updated to January 1997) it is clear that the capacity of
the Datastore needs to be increased urgently. The
Panel recommends an increase of 2 Tbytes as soon as
possible and, if feasible, a doubling of the existing 10
Tbytes capacity over the next 2 years (97/99).
It will also be necessary to introduce a system of space
allocation for experimental groups. The details of the
latter are being worked out by the CNAP Data Storage
Group independently of this review. We note that the
Datastore will require a large capital upgrade in 2-4
years' time to move to the most cost effective (i.e. low
operator overhead) emerging technology and to meet the
projected demand as the LHC experiments prepare for data
taking. Although the choice of hardware is becoming clear,
there are some serious issues concerning software and the
changing pattern of data storage that need to be
addressed. The Panel welcomed the news that there is now
UK HEP and DCI involvement in the CERN RD45 [11]
development project which is also addressing these
issues, and plans are in place for initial OODBMS tests.
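The Datastore capacity steps recommended above can be laid out as a short calculation. This is a minimal sketch using only the figures in the text; it assumes that the urgent 2 Tbyte increase counts towards the later doubling, which the report does not state explicitly, and the variable names are illustrative.

```python
# Capacities in Tbytes, as quoted in the text.
existing_tb = 10         # existing Datastore capacity
urgent_increase_tb = 2   # increase recommended as soon as possible

# Capacity after the urgent increase, and the target after doubling the existing capacity.
after_urgent_tb = existing_tb + urgent_increase_tb
after_doubling_tb = 2 * existing_tb

# Assumption: the urgent increase is part of the doubling, so this is what remains to be added.
remaining_tb = after_doubling_tb - after_urgent_tb

print(f"After urgent increase: {after_urgent_tb} Tbytes")
print(f"Target after doubling: {after_doubling_tb} Tbytes")
print(f"Still to be added over 97/99: {remaining_tb} Tbytes")
```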
Over the last couple of years the power
and flexibility of PCs have changed dramatically. It is
now possible to run large networked systems using either
WNT or Linux to provide desktop machines for staff.
Recently, a number of groups both in the UK and in other
countries have demonstrated that PC farms could provide
very cost effective CPU power.
If UK HEP is to take advantage of these
changes within a constant or falling budget, then some of
the existing platforms have to close. The strategy that
the Panel followed has been derived from general
principles adopted by CNAP at its inception and the
information from the user survey. The closure of the VMS
service has already been discussed, agreed by CNAP and
announced. It is following a trend already far advanced
in the community and elsewhere. The decision to close one
of the two UNIX platforms was taken as the only way to
release resources for the development of a WNT production
farm. The choice was dictated by the much larger
fraction of the community using the CSF platform rather than by
any obvious technical merit of CSF over OSF.
To meet the needs of the existing users
of the VMS and OSF platforms and to allow for natural
development, the Panel recommends that the CSF
platform be upgraded by roughly 50% CPU capacity and with
a corresponding increase in diskspace and internal
network capacity. The CSF platform will be
used both for batch production and for the other more
varied uses already agreed. The Panel notes that the HP
operating system needs to move to version HP-UX10 before
the system can be used for BaBar code development. The
present WNT development system (approved by CNAP in 1996)
should be expanded rapidly to a batch production facility
and its use for code development and analysis be
investigated and if appropriate developed over the next
couple of years.
It is recognised that groups affected
by the phased closure of existing platforms must be
properly catered for, particularly those analysing data
at the end of an experiment for whom the effort of
changing to a new platform would be both difficult and
inefficient. Following discussions with the groups
concerned the Panel is satisfied that their legitimate
needs will be met.
Having decided the broad strategy to be
recommended for the next few years, the Review Panel drew
up more detailed 'baseline requirements' for the
central computing services (see section 6.2). The Panel
then negotiated with DCI to define the full set of
services they could offer against three funding
scenarios.
The conclusions of the Panel concerning
the three funding scenarios are :-
The proposed baseline requirements can
be achieved at the level of resources proposed
for 96/97 and for 97/98 (i.e. 8.6 SY and about
£238K capital and recurrent) with similar
figures expected for 98/99.
Some of the increase in the CSF farm would be delayed and the WNT development would be reduced in scope (only 10 nodes instead of 20 and no development beyond production). This keeps the essence of the baseline intact, but the Panel does not think that the rather small saving will compensate for the reduction in facilities.
It would not be possible to run both
platforms and it would be necessary to give up or
postpone the WNT development, leaving only the
CSF platform and the Datastore. The development
of the Datastore would also be attenuated. In the
view of the Panel this is not a viable option.
The computing models of BaBar and the
LHC experiments anticipate the extensive use of regional
computing centres. The Panel has considered the situations of
BaBar and the LHC experiments to be somewhat different.
The BaBar computing model is more advanced, its
requirements arise sooner, and the requests, in particular
with respect to licensing issues, are better defined.
BaBar is pioneering the concept of a UK regional centre
and this will offer operational experience in order that
plans can be made for the LHC experiments.
Some of the Panel's recommendations
already take into account BaBar's needs. It is foreseen
that most of the demands of the BaBar experiment, as they
have been formulated to the Review Panel, can be funded
from resources allocated to the central services with
possible specific items falling on the BaBar experimental
budget. This should be monitored by CNAP as the requests
become more precisely defined.
Neither the Review Panel nor CNAP has had time to study the consequences for UK facilities in any detail. However, it is clear that to provide a specialised regional centre for each experiment may require resources beyond those currently assigned or projected for the central facilities. The PPC together with the PPESP will have to consider this fundamental issue as part of the total resourcing of the LHC experiments.
The Review Panel NOTE the following:
For the future, the Panel recommends :
In more detail our main recommendation (2) above implies the following:
Comments:
R.Devenish (Oxford) (chairman)
A.Halley (Glasgow)
P.Jeffreys (RAL/PPD)
S.McMahon (Liverpool)
R.Middleton (RAL/PPD) (secretary)
P.Renton (Oxford)
P.Watkins (Birmingham)
The review made by the WB4 Committee [9] covered
the provision of data storage, computing power, software
and hardware provision and networking. The review
followed a period of transition in central computing in
the UK in that the mainframe IBM service was no longer
being used. Instead the facilities in place were more
tailored to Particle Physics usage.
An extensive user survey was conducted to find out the views in the community on the central services provided and of their future needs. From these replies a clear pattern emerged of the way in which the large majority of physicists, particularly those working on LEP and HERA experiments, carried out their analyses. The survey showed that the only reliable and up-to-date copy of the datasets used for analysis was kept at the accelerator centre (CERN or DESY). The model which emerged was that these datasets were initially accessed using computers at the accelerator centre and a reduced dataset (e.g. ntuple) was brought back to the local institute computers for detailed analysis. This transfer could either be on tape (e.g. exabyte) or by network. The 'shelf-life' of the dataset might be typically one or two months.
Almost all groups stated that reliable
and fast international networking was of the utmost
importance. Although this was not, in fact, a centrally
provided service in the terms of the review, the network
performance was in practice a determining factor in the
evolution of the computer models adopted. Many groups
gave, although not explicitly asked, impassioned views on
the desirability of their local computer facilities.
These views were reported in the WB4 report as input for
the PPC to decide the balance of expenditure between
central and local computing.
The model described above was not,
however, universal. In particular for non-accelerator
experiments such as Soudan II there is no central
laboratory, and the RAL Datastore has become the central
datastore for the experiment.
Another very important usage of central
facilities was the production of Monte Carlo events for
collaboration wide distribution. These events represented
the UK quota for the various experiments.
The WB4 committee identified two
important areas for the future provision of central
services:
These recommendations had consequences
on the nature and scope of the central services. The
system envisaged was to be dedicated essentially to these
limited goals and could be much simplified from what
existed. No general purpose computing was recommended on
the central machines (i.e. no e-mail, PAW etc.). Post WB4
some limited analysis work was permitted. Three platforms
were foreseen; two UNIX-based (CSF and a more limited OSF
service) and VMS. The capacity of the CSF service was to
be increased in line with reasonable demand. Up to 10
Tbytes of the 20 Tbyte datastore were foreseen for
eventual HEP use. The datasets stored were not to be
backed up, so that they were not to be the primary source
of the data.
The recommendations to provide a more restricted service meant that it could be run by fewer people, resulting in economies in the staff costs. Since no back-up service was proposed a reduction was made in the number of operators. Effort was proposed for areas of general importance for particle physics. These included end-to-end monitoring of international network response, ATM and video conferencing. Taking all these considerations into account, and after discussion with CCD (now DCI), the overall staff complement was to be reduced from 20 to 10 SY in a timeframe of about 2 years. It was envisaged that in the longer term further reductions may be possible.
Although the current review is concerned with
the future of central computing resources for UK HEP and
is largely a domestic matter, it is important to put it
into the context of developments at the accelerator
centres used by UK experimentalists. It is also useful
for the Panel and the PPC to be aware of what is
happening at other national laboratories in support of
their domestic HEP programmes. To this end a short note
[10] outlining the purpose of the review and requesting
some information was sent to accelerator centres and a
number of national labs. In addition to strategic
questions, a small number of questions were addressed to
specific issues concerned with this review. The
respondents are listed at the end of this appendix.
Of course each of the major accelerator
centres (CERN, DESY, FNAL and SLAC) and other national
labs work within the context of their history. Only CERN
was from the outset a facility to be shared by many
countries; the other centres have encouraged
collaboration from other institutions and other countries
for quite a number of years but within differing
frameworks. These differences are also to be found in
their approach to computing resources. On the other hand
the inexorable change to the most cost-effective way of
providing CPU and data storage is common to all and not
surprisingly most centres have followed the trend away
from large multi-user general purpose machines to
clusters of workstations running a variety of operating
systems - but with a predominance of UNIX rather than
manufacturer specific systems such as VM or VMS. More
recently most countries have at least one centre actively
pursuing networked PCs as the next platform for data
reduction and event generation, as well as being the box
that sits on most people's desks. What is also clear is
that nobody considers that they have 'solved' the problem
of data storage and access and this is seen to be
"The Problem" in computing in particle physics.
There is however strong convergence on the choice of
technology being pursued and the desire to move as
rapidly as possible towards '100% automation' of tape
movements.
First some comments on the view from
the accelerator labs. SLAC is rather different from the
other labs in that it is smaller and for a number of
years has really only had one large experiment at a time
operating, at present SLD and soon BaBar. The computer
centre considers itself as a partner with the
experimental collaboration in developing the offline
reconstruction and analysis facilities. The solution will
be a compute farm closely coupled to a large datastore,
but it will be owned and run by SLAC for the benefit of
all SLAC users. In designing the farm attention has also
to be given to the overall data management system and the
connection technology between the CPUs and datastore.
Some design numbers are useful (as the system is
essentially for BaBar): by 1999 the aim is to have 30K
MIPS CPU power; 16 StorageTek Redwood drives in 5 silos
with a capacity of order 1.5 Pbytes capable of
delivering data at an aggregate rate of 150 Mbytes/s; an
aggregate network capacity within the farm of 250
Mbytes/s and 5 Tbytes of diskspace. A total of 13 people
run the SLAC central computing facilities and that
includes four responsible for physics tools.
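The SLAC design numbers quoted above imply a few derived quantities that help to put the scale of the system in context. The sketch below is illustrative only: it assumes decimal units (1 Mbyte = 10^6 bytes and so on) and sustained operation at the full aggregate rates, neither of which is stated in the reply from SLAC, and the derived values are not figures supplied by SLAC.

```python
# Design figures quoted in the text for the SLAC farm (target date 1999).
drives = 16
store_capacity_bytes = 1.5e15     # capacity of order 1.5 petabytes (decimal units assumed)
tape_rate_bytes_per_s = 150e6     # aggregate tape data rate, 150 Mbytes/s
network_rate_bytes_per_s = 250e6  # aggregate network capacity within the farm
disk_bytes = 5e12                 # 5 Tbytes of diskspace

# Average rate each drive must sustain to reach the aggregate figure.
rate_per_drive_mb = tape_rate_bytes_per_s / drives / 1e6

# Time to read the whole store once at the full aggregate rate.
full_read_days = store_capacity_bytes / tape_rate_bytes_per_s / 86400

# Fraction of the store that fits on disk, and time to fill the disk from tape.
disk_fraction = disk_bytes / store_capacity_bytes
disk_fill_hours = disk_bytes / tape_rate_bytes_per_s / 3600

# Headroom of the internal network over the tape rate.
network_to_tape_ratio = network_rate_bytes_per_s / tape_rate_bytes_per_s

print(f"Rate per drive: {rate_per_drive_mb:.2f} Mbytes/s")
print(f"Full read of the store: {full_read_days:.0f} days")
print(f"Disk as fraction of store: {disk_fraction:.2%}")
print(f"Time to fill disk from tape: {disk_fill_hours:.1f} hours")
print(f"Network to tape rate ratio: {network_to_tape_ratio:.2f}")
```

On these assumptions a single pass over the full store would take several months, which is one way of seeing why the overall data management system receives so much attention in the design.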
At CERN the situation is much more
complex and it was with the large LEP experiments that
the SHIFT system - of large single experiment offline
reconstruction and analysis farms - developed. The
pattern has also been followed by the HERA experiments at
DESY. Although these farms are managed and part funded by
the collaborations, the problem of reliable day-to-day
operation has not been so easily solved. In fact in most
cases the collaborations are now effectively contracting
out the operation of their farms to the computer centre
professionals. For LEP experiments data storage is not a
problem but new systems are being put in place for
current and near future high rate experiments such as
NA48. This will give CERN some operational experience
before detailed decisions have to be made for the LHC
experiments. It is to be noted that full automation of
tape operations features strongly in the plans. The WB4
computing model has some similarity to those emerging
from the ATLAS and CMS groups; it is thought that
'regional data centres' will be needed. The present
platforms at DCI were modelled very closely on what was
available at CERN a few years ago. CERN expect to
continue running UNIX farms and are developing a WNT
farm. Concerning support, CERN regard it as very
important that those supporting a platform are a coherent
team that can provide mutual backup and give 24 hour
service. If one assumes that some tape handling is
necessary then a total group of about 12 people would be
needed to support three platforms, roughly: team leader, 3
platform managers, one network expert, one for the
datastore, 3 general systems including user support, 3
operators. It is also noted that as a platform becomes
more general purpose so the amount of support needed
grows rapidly.
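The breakdown of the CERN estimate can be tallied to confirm that the roles listed add up to the quoted total of about 12 people. The sketch below simply encodes the roles as given in the text; the dictionary keys and the per-platform average are illustrative and are not part of the CERN reply.

```python
# Support roles for three platforms, as listed in the CERN estimate.
support_team = {
    "team leader": 1,
    "platform managers": 3,
    "network expert": 1,
    "datastore": 1,
    "general systems (including user support)": 3,
    "operators": 3,
}
platforms = 3

# Total headcount and a simple average per platform (illustration only).
total_staff = sum(support_team.values())
staff_per_platform = total_staff / platforms

for role, count in support_team.items():
    print(f"{role}: {count}")
print(f"Total: {total_staff} people, {staff_per_platform:.1f} per platform")
```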
DESY has always had central logging of
data from experiments performed on the accelerators there
and they continue to provide the central datastore for
the HERA experiments. They have a lot of experience with
robot systems (ACS, Ampex, Grau) and continue to upgrade
to meet demand. Both the large collaborations H1 and ZEUS
have their own data reduction farms based on SGI
machines, but these are run by the computer centre.
HERMES has developed and is using a Linux PC farm and
ZEUS is developing a WNT farm. Germany is well served by
a national 34 Mbit/s network which is heavily used by
German University groups to access DESY. At the moment
quite a lot of manpower is consumed running the
datastores, quite a large fraction for software support.
It is expected that desktop machines will be largely PCs
within a few years.
Remarks from FNAL are similar regarding
farms and datastores. As network connections in the USA
are an order of magnitude better on the whole than in
Europe, the way in which it is possible to work locally
there is not a very good guide for the UK at the moment.
From the national computer centres the
reply from IN2P3 was the most extensive. The situation in
France has some similarity with that in the UK but the
facilities are larger. There are two computer centres,
one (CCIN2P3) for experimental HEP and nuclear physics
and IDRIS for high performance computing. CCIN2P3 has a
staff of 36 and they run a large IBM/SP2 for interactive
work and two farms, both based around RS6000 and HP
workstations, and the national IN2P3 network PhyNet. The
response points out the benefits that they feel they gain
from a powerful centre, apart from the technology,
particularly the support in depth and the 24 hour
operation. The total capacity of the two farms is roughly
twice that of the three platforms at DCI. The farms are
connected to a large datastore based on a StorageTek
system with Timberline drives (upgrade path to Redwoods).
One farm is for Monte Carlo simulation, but the other was
designed from the outset to be general purpose and has
900 Gbyte of diskspace. One large difference between the
mode of operation in the UK and France is that the French
do significant analysis at the CCIN2P3 centre and have
had a copy of the full DST data for ALEPH and DELPHI
there. They also perform a considerable part of data
reprocessing for H1. They are not sure yet whether their
mode of operation will continue to be valid in the LHC
era. They note that there is a difference between
geographically widespread collaborations such as BaBar
and those that have a large number of collaborating
institutes close to the accelerator. They are also not
sure that the LHC collaborations will be able to afford a
completely centralised computing infrastructure. They
therefore intend to retain the capacity for handling and
processing a large volume of data locally. The technical
solutions that have been adopted and are being
investigated are similar to those already described
elsewhere. One final point to note is that PhyNet, which
provides connections to IN2P3 institutions and CERN, is
based on private leased lines. It is expected that this
will be discontinued in the near future and the traffic
will be moved to RENATER, the French Academic Network.
A general comment that was made by most
respondents was the importance of exploiting fully global
file systems such as AFS and linking this to multi-level
hierarchical file stores based on automated technology.
This is most likely to be achieved in the near to medium term
within a single country where one can be confident of the
existence of a network with sufficient bandwidth. It is
another reason why it is important for UK HEP to continue
to press for better international network connections.
Replies were received from