Review of Central Computing Provision at RAL for Particle Physics in the UK

  1. Introduction

    Following the 1994 White Book 4 (WB4) review of high energy physics computing, the central facilities provided by the then RAL Central Computing Department {now CLRC/RAL Department of Computing and Information (DCI)} were revolutionised, becoming much more focused on the actual needs of the community and requiring a significantly lower level of support. The Computing and Network Advisory Panel (CNAP) inherited the responsibility from WB4 for overseeing the execution of the new strategy. Given the rapid pace in the development of computing and information technology, it is now appropriate to undertake a thorough follow-up review of the central facilities for UK HEP.

    The formal terms of reference of this review are to be found in annexe 1 to this report, along with the Review Panel membership (annexe 2).

    After a brief review of the current status of the central computing provision (section 2), the boundary conditions and strategic directions are given (section 3). Extensive surveys have been carried out to solicit the opinions of the user community (section 4) and to study the validity of computing models first adopted in WB4 (section 5). Section 6 summarises the interaction with DCI during the review. The findings of the panel are presented in section 7 and lead to the recommendations laid out in section 8. A brief overview of the WB4 review is given in annexe 3 and a summary of comparable computing facilities at major accelerator centres and some national laboratories is given in annexe 4. References and support documentation for this review are listed in annexe 5.

    We thank the members of the HEP community who provided answers to the detailed questionnaire on present and future use of the central facilities. We thank those people who have worked hard, often at short notice and usually with good humour, to answer our many questions and especially those from DCI and PPD who have borne the brunt of this onslaught. Indeed, DCI have willingly and openly discussed the future provision of central services.

  2. Status of Central Computing

    The recommendations of WB4 as they affected the provision of Central Computing resources are summarised in annexe 3. The essence of the changes resulting from WB4 was to provide three CPU platforms coupled to a large datastore for the batch production of Monte Carlo simulated datasets. One area of development beyond WB4 which deserves comment is the provision of central disk storage space. AFS (and equivalent systems) has become more important to the community and the reduction in the price of disk storage has meant that the central disk store could be increased. This both makes more efficient use of the robot datastore and permits the use of the central disk store for the temporary storage of datasets.

    The overall resources and principal deliverables are defined within the framework of the Service Level Agreement (SLA) between PPARC and CCLRC for the Particle Physics Programme. The present DCI services are those defined by the WB4 exercise, but a number have been upgraded with full agreement from CNAP.

    A document supplied by DCI and titled: Review of RAL Particle Physics [1] (also 'Additional Input' [2] ) has a large part describing the services now offered. In this section, a summary is given of the most important aspects of the resources employed, and services offered, within the current Central Computing provision.

    1. Resources for FY 1996/97

      Table 1 shows the allocation of resources and Table 2 the breakdown of capital expenditure for the financial year 1996/97.

                                  Effort (SY)    Expenditure (£k)

      Direct Staff Effort              9.5               -
      Direct Staff Costs                -              35.00
      Recurrent                         -              44.00
      Capital                           -             108.06
      Indirect Staff Effort            0.8               -
      Indirect Staff Costs              -               9.00

      TOTAL                           10.3            196.06

      Table 1 : Allocation of Resources for FY 1996/97

      For comparison, in 1995/6 the effort was 16 SY and the expenditure £200k. The 'Direct Staff Costs' are administrative costs associated with the DCI staff concerned. The total expenditure in providing central computing services for Particle Physics is £657k in 1996/7. In earlier financial years the expenditure on central computing was approximately twice the current amount.

                                                      Amount (£k)

      AFS/NFS Disk Space (36 Gbyte)                        6.59
      115 Magstar 3590 tapes for IBM 3494 robot            4.37
      Import/Export Server                                 2.00
      Disk Farm - 100 Gbyte initial phase                 28.00
      CSF Upgrade (four HP9000-C110 machines)             32.50
      CSF Disk                                             3.10
      ISDN Bonding Software                                1.00
      Windows-NT PC Farm - first phase                    15.50
      Unallocated                                         15.00

      TOTAL                                              108.06

      Table 2 : Budget Breakdown for Capital in FY 1996/97

      The £15k of unallocated funds is being spent at the time of writing. Some of the funds are being targeted on disk upgrades so that CSF will be able to handle 1 Gbyte simulation datasets (a capability that has already been identified by CNAP as a pressing requirement).
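      The totals in Tables 1 and 2 can be cross-checked with a short script. This is only an arithmetic sketch using the figures quoted above; the variable names are illustrative and not part of the report.

```python
from decimal import Decimal

# Effort column of Table 1 (staff years), as quoted in the report.
table1_effort_sy = {
    "Direct Staff Effort": Decimal("9.5"),
    "Indirect Staff Effort": Decimal("0.8"),
}

# Expenditure column of Table 1 (in £k), as quoted in the report.
table1_expenditure_k = {
    "Direct Staff Costs": Decimal("35.00"),
    "Recurrent": Decimal("44.00"),
    "Capital": Decimal("108.06"),
    "Indirect Staff Costs": Decimal("9.00"),
}

# Capital breakdown of Table 2 (in £k), as quoted in the report.
table2_capital_k = {
    "AFS/NFS Disk Space (36 Gbyte)": Decimal("6.59"),
    "115 Magstar 3590 tapes": Decimal("4.37"),
    "Import/Export Server": Decimal("2.00"),
    "Disk Farm - initial phase": Decimal("28.00"),
    "CSF Upgrade": Decimal("32.50"),
    "CSF Disk": Decimal("3.10"),
    "ISDN Bonding Software": Decimal("1.00"),
    "Windows-NT PC Farm - first phase": Decimal("15.50"),
    "Unallocated": Decimal("15.00"),
}

effort_total = sum(table1_effort_sy.values())
expenditure_total = sum(table1_expenditure_k.values())
capital_total = sum(table2_capital_k.values())

print(f"Table 1 effort total:      {effort_total} SY")
print(f"Table 1 expenditure total: £{expenditure_total}k")
print(f"Table 2 capital total:     £{capital_total}k")
# The Capital line of Table 1 should equal the Table 2 total.
print("Capital line matches Table 2:",
      capital_total == table1_expenditure_k["Capital"])
```

      Decimal arithmetic is used so that the pence-level figures add up exactly rather than being subject to binary floating-point rounding.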

    2. Usage of Central Facilities

      In Figures 1-3, the weekly CPU utilisations are shown through most of 1996 for the CSF (HP-UX), OSF (Digital UNIX) and AXPRL1 (VMS) central platforms.

      The CSF farm has been heavily used throughout the year for Monte Carlo production by a number of experiments including ALEPH, BABAR, H1, OPAL, NA48 and SNO. Preparations are also underway for the ATLAS collaboration to produce a Monte Carlo dataset of 10^7 jets for calorimeter and trigger studies. It is clear that CSF continues to provide an extremely valuable service for large-scale simulation work. With full agreement from CNAP, the use of CSF has diversified through the year and it has played an essential role in data analysis for some experiments. During summer 1996, the ALEPH(UK) groups used this facility to perform a rapid and competitive WW 4-jet mass measurement from their LEP2 dataset. Due to changes in HERA operation and increases in luminosity, the H1 collaboration are unable to continue reprocessing their data at DESY during the accelerator shutdowns. A significant part of the 1996 reprocessing is currently underway on CSF.

      Figure 1: Particle Physics Utilisation of CSF (UNIX) Platform.

      Figure 2: Particle Physics Utilisation of OSF (UNIX) Platform.

      Figure 3: Particle Physics Utilisation of AXPRL1(VMS) Platform.

      Use of VMS continued to be dominated by DELPHI Monte Carlo production, but the facility was also used by CPLEAR, ALEPH, Crystal Barrel, CMS and Soudan II. The future of the VMS service was discussed by the CNAP committee on 3/10/96 and it was decided to recommend closure at the end of September 1997. Users have been informed and measures are being put in place to cater for those most seriously affected. The decision to close the VMS service has been endorsed by the Review Panel.

      OSF is a shared facility and the HEP fraction (50%) was only fully utilised in the second half of 1996. The use is a mix of Monte Carlo and analysis. The OSF system is the server for the ZEUS Funnel operation in the UK and it also acts as the collection point for Monte Carlo data from CPUs based at many institutes. The CPU capacity of the system is too small for large scale Monte Carlo production work. During the last year the largest users of OSF were WA89 (hyperon) analysis, BaBar and OPAL Monte Carlo in that order. It is the central code repository for UK BaBar and an important facility for UK groups who are developing code for BaBar.

    3. Disk Farm

      During the first half of 1996, a trial disk farm (where datasets are maintained semi-permanently) was set up so that connectivity and software could be tested by the ALEPH collaboration. The successful trial culminated in the purchase of 25*4 Gbyte Fast SCSI disks, which were installed in July 1996. The Disk Farm provides a fast alternative to the Datastore for frequently accessed datasets. The 100 Gbyte of disk space is already shared between eight experiments and CNAP has approved the purchase of another 100 Gbyte in 1997/98. There is strong demand for this facility already.

      The 'rfio' software from CERN has been installed and work is underway to modify the stager utility to interface to the RAL tape system.

    4. Datastore

      During 1996, a number of upgrades were made to the Datastore.

      • A fifth Magstar tape drive was added to the IBM 3494 robot to increase throughput and reduce wait times.
      • An additional set of 115 Magstar 3590 tapes was purchased to fully populate the 10 Tbyte share of the robot slots agreed by WB4 for particle physics.
      • Another 54 Gbyte of disk was added to the VTP servers (this is transient staging space which is overwritten by subsequent jobs) to increase the lifetime of staged datasets and improve the handling of very large datasets (e.g. whole DLTs).
      • Tape import/export (DAT, DLT, Exabyte) was moved to a dedicated server to eliminate some bottlenecks previously experienced.

      Particle physics data has also been reorganised within the robot so that virtual files belonging to different owners are no longer mixed on the same physical volume. This will make it easier to monitor and manage Datastore use in the future.

      It has been recognised for some time that the import/export facilities into the Datastore are inadequate for the bulk handling of large datasets that can simultaneously arrive from a number of experiments. The H1 reprocessing alone has been estimated to require the import of 5 Tbyte of data over a short time period. A proposal has been submitted, and agreed within CNAP and the Data Storage Subgroup, to provide an automated facility based on DLT7000 technology. It is intended that this open-shop system should be capable of importing and exporting data 24 hours/day with minimal operator intervention.
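      To give a feel for the scale of the H1 import, the following back-of-envelope sketch estimates the number of cartridges and the drive time needed for 5 Tbyte. The DLT7000 figures used (35 Gbyte native capacity per cartridge, 5 Mbyte/s native sustained rate per drive) are assumptions taken from the commonly quoted drive specification, not from this report, and the estimate ignores compression, mount times and operator handling.

```python
import math

# Figure from the report: data volume to import for the H1 reprocessing.
import_volume_gbyte = 5000.0          # 5 Tbyte, using 1 Tbyte = 1000 Gbyte

# ASSUMPTIONS (not from the report): nominal DLT7000 native figures.
cartridge_capacity_gbyte = 35.0       # native, uncompressed
drive_rate_mbyte_per_s = 5.0          # native sustained transfer rate


def cartridges_needed(volume_gbyte, capacity_gbyte):
    """Smallest whole number of cartridges that can hold the volume."""
    return math.ceil(volume_gbyte / capacity_gbyte)


def import_days(volume_gbyte, rate_mbyte_per_s, drives=1):
    """Continuous reading time in days, with the load spread over `drives`."""
    seconds = volume_gbyte * 1000.0 / (rate_mbyte_per_s * drives)
    return seconds / 86400.0


n_cartridges = cartridges_needed(import_volume_gbyte, cartridge_capacity_gbyte)
days_one_drive = import_days(import_volume_gbyte, drive_rate_mbyte_per_s)
days_two_drives = import_days(import_volume_gbyte, drive_rate_mbyte_per_s, drives=2)

print(f"Cartridges needed:        {n_cartridges}")
print(f"Days with one drive:      {days_one_drive:.1f}")
print(f"Days with two drives:     {days_two_drives:.1f}")
```

      Under these assumptions a single drive would need well over a week of uninterrupted running, which is consistent with the emphasis on 24 hours/day operation with minimal operator intervention.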

    5. Distributed File Systems

      AFS has become a production service with a dedicated server for particle physics. A further server is envisaged to give the necessary resilience to the existing server once funds are available. The master copy of the ASIS repository also moved to AFS in 1996. The AFS client software has been installed on both the CSF and OSF systems, although work remains to provide a secure implementation for the NQS batch system. An additional 36 Gbyte of disk was purchased for both NFS and AFS (for the storage of software as opposed to data) and has been split according to the requirements of experiments.

    6. CSF Upgrade

      At the end of July 1996 four HP C110 machines, each with 64 Mbyte of memory and 2 Gbyte disk, were added to the CSF platform. This expanded the farm to 29 batch nodes and benchmarking indicated a 30% increase in CSF CPU power to about 688 CERN Units. The PA7200 CPU chip in the C110 nodes is upgradable to the PA8000 processor, offering the possibility of further performance enhancements.
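      The figures quoted for the upgrade imply an approximate pre-upgrade capacity and a rough per-node comparison, sketched below. This assumes the 30% increase is measured relative to the pre-upgrade power and that both the 30% and the 688 CERN Units are rounded values, so the results are indicative only.

```python
# Figures quoted in the text.
nodes_after = 29          # batch nodes after the upgrade
nodes_added = 4           # HP C110 machines added
power_after = 688.0       # CERN Units after the upgrade (approximate)
relative_increase = 0.30  # quoted increase in CPU power

# Derived quantities (ASSUMPTION: increase is relative to pre-upgrade power).
nodes_before = nodes_after - nodes_added
power_before = power_after / (1.0 + relative_increase)
power_added = power_after - power_before

per_new_node = power_added / nodes_added
per_old_node = power_before / nodes_before

print(f"Nodes before upgrade:        {nodes_before}")
print(f"Power before upgrade:        {power_before:.0f} CERN Units")
print(f"Average per original node:   {per_old_node:.1f} CERN Units")
print(f"Average per new C110 node:   {per_new_node:.1f} CERN Units")
```

      On these numbers each new node contributes roughly twice the average of the original nodes, although the original farm was a mix of machine types and the average hides that spread.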

      The smooth installation of the C110 nodes is reflected in Figure 1, where the weekly CPU utilisation is seen to increase to about 5,000 hours shortly after their arrival. This coincided well with a significant demand from the OPAL experiment which simulated and reconstructed over 3 million LEP1 events in 3 months.

      It is anticipated that the CSF operating system will be upgraded to HP-UX 10 in early 1997.

    7. Windows-NT PC Farm

      Farms consisting of commodity PC machines offer a cost-effective solution to providing the increased CPU power needed for present and future high rate experiments. A development project was launched by DCI to study the requirements for a PC Farm. Initially, this consisted of two machines running Windows-NT (WNT) v3.51: a standard off-the-shelf PC (120MHz Pentium) and one assembled at RAL from proprietary components (166MHz Pentium). This provided a useful test-bed with which to investigate compilers, install WNT versions of the CERN library and run particle physics software. Benchmarking tests indicated the 166MHz machine to have a performance mid-way between an HP712 and an HP735 (machines used in CSF).

      A proposal was made to CNAP in October to extend this work and commission a production PC Farm for the UK particle physics community. At the same time, a Windows-NT Subgroup of CNAP was launched to co-ordinate the work of systems managers and help to define the detailed requirements of the community. Full agreement was reached on the provision of a WNT production service at RAL starting with a first phase of five Pentium Pro (200MHz) nodes.

    8. Other Services

      Details are given in the submission from DCI referenced above for the following additional activities:

      • Network Monitoring: End-to-end monitoring of the performance of international connections
      • Development of ATM: related to coupling of Datastore and farms in particular
      • Development of Videoconferencing facilities for the Particle Physics community

  3. Review Strategy

    The White Book 4 review [9] led to a revolution in central computing for particle physics, and resulted in a far more focused and cost-effective central service. Furthermore the funding of computing services via the SLA between PPARC and CCLRC for the Particle Physics Programme has led to a greater degree of financial control than had been enjoyed previously.

    Although the three platforms set up by WB4 were designed primarily for non-interactive Monte Carlo event generation it is clear that the coupling of a large datastore and fast CPUs allows for other uses. Small groups and non-accelerator experiments continue to use the central facility as their primary datastore and analysis resource (e.g. WA89, Soudan II and Crystal Barrel). The changing pattern of running at HERA, and the vagaries of international network connections, have also led to requests for significant analysis or reconstruction work. Indeed the Review Panel would like to stress that the use of central computing resources is, and will continue to be, affected strongly by the performance of international network connections.

    This review of central computing was required to evaluate the efficacy of the current provision, and then to determine a strategy for the next two years and beyond into the LHC era. It was required to study the resources the particle physics community invests in central computing, and to examine the balance between Monte Carlo and analysis/reconstruction work.

    This section summarises the strategic directions and assumptions used by the Review Panel.

    1. Financial Boundary Conditions

      In the absence of firm financial boundary conditions, the Review Panel decided to adopt three possible funding levels for services provided by DCI. These are 100%, 85% and 70% of the funding resources provided in FY 96/7 (see section 2.1). It was agreed that the Panel would develop strategies for each of these.
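      The three scenarios can be expressed as absolute figures once a baseline is chosen. The text does not state which FY 96/7 figure the percentages apply to; the sketch below assumes, purely for illustration, the £657k total expenditure quoted in section 2.1.

```python
# ASSUMPTION: baseline is the £657k total 1996/7 expenditure from section 2.1.
# The report does not say explicitly which figure the percentages refer to.
baseline_k = 657

# Funding levels adopted by the Review Panel (percent of FY 96/7).
funding_levels_percent = (100, 85, 70)


def scenario_budget(baseline, percent):
    """Budget in £k for a given percentage of the baseline."""
    return baseline * percent / 100


scenarios = {p: scenario_budget(baseline_k, p) for p in funding_levels_percent}

for percent, budget in scenarios.items():
    reduction = baseline_k - budget
    print(f"{percent:3d}% level: £{budget:.2f}k (reduction of £{reduction:.2f}k)")
```

      Substituting a different baseline only requires changing `baseline_k`.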

    2. Strategic Directions

      The White Book 4 recommendations were successful for two reasons in particular:

      1. The central computing services were focused into well defined, identifiable platforms, operated by platform managers.
      2. Effort was reduced to a minimum as a result of the first point, and also by imposing requirements on the use of the central services (e.g. minimising interactive work).

      The Review Panel has sought to build on this foundation and to take the process a step further. The strategy adopted is as follows:

      • to establish from the community the key central services which have the highest priority, to review the latest experimental computing models, with particular reference to batch, analysis and reconstruction work, and to examine the international context;
      • on the basis of the above input, to focus resources on a restricted number of key services and to close the other services as quickly as reasonably possible;
      • to (further) upgrade the remaining services to ensure that there is adequate capacity to meet demand (CSF in particular) and test new services if they are likely to open new avenues and/or be more cost-effective (e.g. a WNT farm);
      • following the decision to minimise the number of central services, to reduce accordingly the DCI staff complement as far as possible;
      • to reduce recurrent expenditures as far as possible;
      • to use additional funds from reduced effort and reduced recurrent costs to give a higher capital budget for rapid investment in key (existing and new) central services.

  4. User Survey

    As part of the review, a user survey was carried out in the Autumn of 1996. A questionnaire [3] was sent to one representative per experiment per institute. This asked questions on the present and planned usage of the central computing and datastore facilities.

    The survey shows some changes in the use of central RAL facilities since WB4. Although most LEP and HERA groups still work to the model which emerged in the 1994 survey (annexe 3) there is a significant increase in the usage by some university groups. The importance of the services to some small experiments (e.g. Soudan II and CPLEAR), and the necessity of central services for the next generation of experiments (BaBar and LHC), also clearly emerges from the survey. Although the central services are widely used, some users felt that there was inadequate documentation on some aspects of the services. The provision of useful and up-to-date documentation on the web pages should be considered as an important user requirement.

    There is an important interplay in any computing centre between data storage and CPU power capacity. In the survey, questions were asked on the present and future usage of both of these, and on awareness of the services provided by DCI. Apart from the provision of central computers and data storage there was almost no mention of the services classified as "Other Services" (see section 2.8). It would seem that the effectiveness of this work, at least in making its existence known, could be improved.

    1. Data storage

      One very important feature of a central service is the storage of data. At present the central DCI facilities for data storage comprise the Datastore, of which 10 Tbytes are available for particle physics, and 100 Gbyte of associated disk space. These facilities are used extensively by members of most collaborations in the UK. For the LEP and HERA experiments the main usage has been associated with the large Monte Carlo production runs. These runs are the UK shares of collaboration-wide Monte Carlo event production. In addition, there is some storage space used for DSTs and ntuples, and the H1 HERA experiment is planning to reprocess a significant fraction of their data, which will necessitate a sizeable use of the Datastore. The user survey confirms this pattern of usage. The LEP community constitutes about 32% of the total HEP community. The corresponding figure for HERA is 22%. The pattern of non-Monte Carlo usage varies considerably between the different experiments and even institutes within the experiments, with some groups making extensive use and others no use at all. In the future the LEP needs will tail off as the LEP-1 data analysis is completed, since the LEP-2 datasets are smaller. Only small usage beyond the year 2000 is foreseen. For HERA the demands are expected to grow as the luminosity delivered increases; however, the rise may well be slower than that of the luminosity due to additional data selection.

      The next largest community of users is the LHC with 25%. At present the storage usage is relatively small. However, as discussed below, the LHC experiments will need extensive storage and CPU capacity. The BaBar experiment involves about 7% of the HEP community and again will have significant future demands. NA48, with 3% of the community, will start in 1997 to produce large volumes of data. The other smaller experiments (SNO 4%, UKDMC 3%, Soudan II/Minos 2%, Crystal Barrel 1% and WA89 1%) which use the central facilities, all find them very important to their work.

      The demands for data storage are continually increasing and the Datastore occupancy is already close to capacity. Estimates from the experiments show an ever increasing demand. However, obtaining reliable estimates of future usage seems to be difficult. For example, there was a large increase in usage from August 96 to January 97; well beyond that expected from estimates made earlier in 1996.

      In order to attempt to meet these requests, the CNAP Data Storage Subgroup has explored a number of options with DCI. One of these is an effective quota assignment for each experiment whereby a certain number of slots in the robot would be allocated for each experiment and the experiment would choose which data would be active in the robot. This solution also implies the collaborations would need extra resources to organise offline data archives. Details are being worked out and the system will be introduced as soon as possible.
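      The quota idea described above can be illustrated with a small model in which each experiment holds a fixed number of robot slots and must move a volume to its own offline archive before activating a new one once the quota is full. This is a hypothetical sketch only: the class, the slot count and the volume labels are invented for illustration, and the actual scheme was still being worked out at the time of writing.

```python
from dataclasses import dataclass, field


@dataclass
class SlotQuota:
    """Hypothetical per-experiment quota of robot slots."""
    experiment: str
    slots: int
    active: set = field(default_factory=set)   # volumes held in the robot
    offline: set = field(default_factory=set)  # volumes in the offline archive

    def activate(self, volume, evict=None):
        """Place `volume` in the robot, evicting `evict` if the quota is full."""
        if volume in self.active:
            return
        if len(self.active) >= self.slots:
            if evict is None or evict not in self.active:
                raise RuntimeError(
                    f"{self.experiment}: quota of {self.slots} slots is full; "
                    "choose an active volume to move offline"
                )
            self.active.remove(evict)
            self.offline.add(evict)
        self.offline.discard(volume)
        self.active.add(volume)


# Illustrative use with invented names: a two-slot quota.
quota = SlotQuota(experiment="EXPT_A", slots=2)
quota.activate("V001")
quota.activate("V002")
quota.activate("V003", evict="V001")   # the experiment chooses what goes offline

print("Active in robot:", sorted(quota.active))
print("Offline archive:", sorted(quota.offline))
```

      The point of the model is that the choice of which data remain active rests with the experiment, which is why the text notes that collaborations would need extra resources to manage their offline archives.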

      Another aspect of the Datastore which is also currently problematic is the capacity to import and export data. This is a severe bottleneck, and is a priority item in the improvements foreseen.

      In the longer term there will need to be a significant increase in the capacity of both the Datastore and disk storage. The data volume requested to be stored is likely to grow in the next few years at somewhere between 4 and 8 Tbytes per year. At present the Data Storage Subgroup are exploring, together with DCI, possible solutions and costings for upgrades.
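      The quoted growth range can be turned into a simple projection of the stored volume. The sketch assumes linear growth and takes the present 10 Tbyte particle physics share as the starting point (the text says occupancy is already close to capacity); both are simplifying assumptions, and the three-year horizon is arbitrary.

```python
# Figures from the text.
current_share_tbyte = 10      # Datastore share for particle physics
growth_low_tbyte = 4          # lower growth estimate, Tbyte per year
growth_high_tbyte = 8         # upper growth estimate, Tbyte per year


def projected_volume(start_tbyte, growth_tbyte_per_year, years):
    """Stored volume after `years`, ASSUMING constant linear growth."""
    return start_tbyte + growth_tbyte_per_year * years


projection = {
    year: (
        projected_volume(current_share_tbyte, growth_low_tbyte, year),
        projected_volume(current_share_tbyte, growth_high_tbyte, year),
    )
    for year in range(1, 4)
}

for year, (low, high) in projection.items():
    print(f"After {year} year(s): {low} to {high} Tbyte")
```

      Even the lower estimate more than doubles the required volume within three years under these assumptions.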

    2. CPU power

      The largest CPU power is at present that of the CSF platform. In 1996 the LEP experiments used 47% of the CPU (ALEPH 36%, OPAL 9% and DELPHI 2%), HERA (mostly H1) used 15%, SNO 14%, other accelerator experiments used 16% (largest are NA48 7%, CPLEAR 5% and BaBar 4%) and the LHC experiments used 8% (CMS 6% and ATLAS 2%).

      The estimated CPU usage for production work from all experiments (mainly Monte Carlo) is about 54%. This work is submitted by a small number of specialists on behalf of their collaborations. If likely production work is removed from the above statistics then RAL/PPD use accounts for 17% of remaining CPU capacity and university group usage 83%. This represents a significant change since WB4 when the largest fraction of CPU usage was from the RAL/PPD group.
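      The CSF shares quoted above can be checked for consistency, and the RAL/PPD and university percentages converted into fractions of the total CPU. The sketch uses only the figures in the text; it assumes the 17%/83% split applies to the 46% of CPU left after removing the estimated 54% of production work.

```python
# CSF CPU shares for 1996 by group, in percent, as quoted in the text.
cpu_share = {
    "LEP": 47,
    "HERA": 15,
    "SNO": 14,
    "Other accelerator": 16,
    "LHC": 8,
}

# Sub-breakdowns quoted in the text.
lep_breakdown = {"ALEPH": 36, "OPAL": 9, "DELPHI": 2}
other_breakdown = {"NA48": 7, "CPLEAR": 5, "BaBar": 4}
lhc_breakdown = {"CMS": 6, "ATLAS": 2}

total_share = sum(cpu_share.values())

# Production (mainly Monte Carlo) versus the remainder.
production_percent = 54
remainder_percent = 100 - production_percent

# Split of the remainder between RAL/PPD and university groups.
ral_ppd_of_total = remainder_percent * 17 / 100
university_of_total = remainder_percent * 83 / 100

print(f"Sum of group shares:            {total_share}%")
print(f"Non-production share:           {remainder_percent}%")
print(f"RAL/PPD share of total CPU:     {ral_ppd_of_total:.1f}%")
print(f"University share of total CPU:  {university_of_total:.1f}%")
```

      Expressed this way, non-production use by RAL/PPD amounts to roughly 8% of the total CSF CPU and that by university groups to roughly 38%.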

      For the LEP experiments the production Monte Carlo work for ALEPH and OPAL uses mainly CSF, whereas DELPHI uses VMS on AXPRL1. In general, the CPU capacity available is considered just about adequate.

      For the HERA experiments again CSF is most important and it is envisaged that CPU requirements will grow significantly as the luminosity increases (e.g. ZEUS estimate a factor 2 increase by the year 2000).

      The Crystal Barrel experiment is complete, but still has a significant need for both processing power and use of the Datastore.

      The SNO experiment will soon start to produce data and foresees increased usage by a factor 2 by the year 2000.

      The new accelerator experiments, NA48, BaBar and the LHC experiments foresee a rather large increase in their CPU needs, both in the next few years and, particularly for the LHC experiments, beyond the year 2000. The simulation of LHC data is largely CPU limited so to some extent it is difficult to get a realistic picture of requirements as these depend on the CPU and storage capacity available throughout the collaboration. However, in order to contribute usefully to Monte Carlo production, increases in line with other computer centres will be needed. The Windows-NT possibilities are clearly of great relevance to these demands.

    3. Summary and Conclusions

      The present central services appear to be reasonably well matched to the current needs of the UK HEP community, and are heavily used. There was a very good response to the user survey, so that it can be used as a reliable reflection of the user views on central computing. No serious problems came to light, although there are some worries on the reliability of the Datastore in recent months (which are currently being addressed).

      The computing models for the BaBar experiment and the LHC experiments are still evolving. In the short term, there is an increasing demand for both CPU power and storage space. However, it is expected that significant use of object oriented databases will be made and the evolution of possible technologies to be used in association with such concepts cannot be accurately forecast.

      For the longer term future, the needs of the new experiments (BaBar and the LHC experiments) are potentially very large indeed, and the role of central facilities will probably be much more important than for the LEP and HERA experiments. It is clearly important to embark on work in preparation for this new era. There is also a need for a systematic approach to the provision of software licences. These are particularly relevant in the area of OO database methods and are already an issue for the BaBar Collaboration.

      It is clearly in the interests of the whole community if a common solution for both BaBar and the LHC experiments can be found. The situation with the LHC experiments should become a little clearer when their computing models have been considered by CNAP and the implications are better understood. However, it is also to be expected that the models will evolve with time in response to new ideas and to technological developments.

      Satisfying the potential demands of the new experiments raises questions which extend somewhat beyond the scope of this report. It is most probable that the costs of setting up effective regional computing centres for these experiments would imply budgets well beyond a reasonable projection of the existing level of funding. Whether this funding, if approved, should be through a specific peer reviewed body, such as the PPESP, is a question that will need to be considered by both CNAP and PPC.

  5. Review of Experiments' Computing Models

    In the model of computing recommended in WB4, data is collected, processed, filtered and stored centrally at the experimental facilities. Data analysis and code development are performed both locally at the experimental facility and at the collaborating institutes on compressed sub-samples of the data which are stored (depending on their size) at the collaborating institute home computers or centrally at RAL. For the small, non-accelerator, experiments substantial fractions of the data are held at RAL. Large, collaboration wide, Monte Carlo simulation is organised centrally and the load divided between the regional centres. RAL, acting as the regional centre for the UK, has a farm of processors tailored to this task.

    In a questionnaire on computing models [4] we asked one representative of each of the large experimental groups how this model serves their experiment. In addition to specific questions, we invited respondents to make any general comments on their model.

    In their replies to the questionnaire the LEP and HERA representatives indicated that their current computing models were broadly in line with those laid down in WB4. It was mentioned by the OPAL and ALEPH groups that this was partially due to the constraints of WB4. Most collaborations have made extensive use of central facilities for large scale Monte Carlo simulations. Provision for interactive physics analysis computing was added at a later stage and sanctioned case-by-case by CNAP. The way physics analysts use central facilities differs from that of those carrying out Monte Carlo production in that analysts are more mobile and will use the route or system by which they can get the fastest turnaround.

    Variations between different collaborations and institutes do exist with some collaborations/groups making more use of central facilities than others. Patterns of central computing usage have changed with time when, for example, computing power at and/or networking access to accelerator sites is compromised. Problems with network access between home institutes and accelerator sites were mentioned as the major reason why individuals and groups change their model of computing. These problems can be short term in nature, when network performance becomes poor over a period of weeks or is lost for a few hours, or long term with a gradually declining performance due to an ever increasing load. It is not foreseen that the model used to date by the LEP and HERA experiments will change significantly over the remaining lifetime of the experiments. The replies to the questionnaire indicate that the approach outlined in WB4 has been largely successful.

    NA48, BaBar and the LHC experiments are not yet taking data. However, to date, they have pursued computing models approximately in line with those of WB4. The difference between the LEP and HERA experiments and the "LHC generation" of experiments lies in their plans for regional computing centres. This is partly driven by significantly larger data volumes and partly, in the case of BaBar, by the difficulties of working effectively across a transatlantic network.

    NA48, which will start to take data in 1997, and BaBar, which will start to take data in 1999, will copy large volumes of data at CERN and SLAC and ship it to RAL, placing severe requirements on the Datastore.

    For the LHC experiments the computing models are not yet fixed but those being considered foresee extensive use of the regional centres such as RAL. The function of the regional centre would include re-processing of data and the simulation of the large number of Monte Carlo events required. This will place severe requirements on data storage and CPU.

    As a consequence, the role of RAL can be thought of as being of similar importance to the accelerator datastores at the LEP or HERA experiments. RAL will be pivotal to successful UK participation in the future generation of experiments.

    One new consideration which must be taken into account is the development and increasing acceptance of new software design tools and methodologies. Programming concepts such as Object Oriented techniques and the replacement of data summary tapes (DSTs) by Object Oriented Databases are becoming widespread within the BaBar and LHC Collaborations. These require commercial products and their cost will have an effect on the computing budgets of experiments.

    The survey clearly indicates that the WB4 model of computing was largely successful within the limits imposed by international networking. The analysis computing models at LEP, HERA and the LHC are particularly susceptible to international networking changes. With the next generation of experiments we are seeing an evolution of these models. The need for central computing centres with the capacity to store large volumes of data and powerful farms for Monte Carlo production is clear. New programming methodologies are, for the first time in 30 years, changing the way many physicists do their analysis.

  6. Interaction with DCI

    A crucial part of the review of central computing facilities was the exchange of information between the Panel and RAL/DCI.

    DCI provided the following documentation :-

    After each presentation from DCI, there was detailed questioning and discussion.

    1. First iteration

      The original document from DCI ([1] and [2]) consists of two distinct parts:

      1. a survey of current facilities and their future prospects;
      2. a discussion of 'strategic directions'.

      Their strategic directions are summarised as:

      • Concentrate and focus on the most cost-effective processor platforms for which UK HEP has a continuing long term requirement. Expand them to meet demand. These are CSF and WNT;
      • Continue to develop the Datastore as a key feature of the RAL facilities;
      • Continue to keep track of, and wherever possible benefit from, relevant developments at CERN and elsewhere;
      • Continue to develop international network monitoring facilities;
      • Continue to undertake infrastructural developments (aspects of high performance networking and video transmission/connectivity);
      • Adjust activities so that resources can be shifted from staff and recurrent costs and made available as capital or savings.

      These are consistent with the directions the committee defined in section 3.2.

    2. Second iteration

      By the time of the second meeting with DCI, the Panel had received replies to its other questionnaires, and had refined its 'baseline requirements' as:

      1. VMS service to be closed at end of Sept. 1997. After that time the service may continue to run at risk for one or two users, but DCI effort/funds consumed should be essentially zero. Until September 1997 DCI effort/funds used must be reduced to an absolute minimum.
      2. OSF user service to be closed at end of Sept. 1997. After that time, the residual server role (for disk farm support) will continue, but DCI effort/funds must be minimised (cancel maintenance, use spares etc.). Until September 1997 DCI effort/funds should be low, as users will be moving to CSF and the system will be 'frozen'.
      3. The CSF is seen as the major platform until the next review of PP Central Computing. CSF must be upgraded rapidly to cater for increased use (existing users plus users moving from the VMS and OSF platforms). Advice from DCI is requested on precisely what is needed and on the most cost-effective way to provide the upgraded service. The request for FY 97/8 is for at least a 50% increase in overall CPU power with a commensurate increase in networking capability, an upgrade to HP-UX 10, a suitable increase in diskspace, and provision of the new software required, such as a C++ compiler.
      4. A rapid build up of WNT farm to 20 nodes, with adequate networking infrastructure, and commissioning for batch work. A development team should look into the possibilities of using the farm for a code development environment and for other applications.
      5. The Datastore should be upgraded in FY 97/8 according to the proposals of the data storage sub-group (DLT stacker, more diskspace, extra 2 Tbyte). A subsequent upgrade using REDWOOD drives should be planned for the Datastore itself for when this becomes viable.
      6. DCI should ensure that the UK HEP community will be able to conduct video conferences when SuperJANET III is implemented, as we currently do using ATM on SuperJANET II.

      DCI were requested to revise their proposals for funding at 100%, 85% and 70% of the 1996/7 funding level, in the light of the refined baseline requirements, and this resulted in their second submission to the Panel [7].

    3. Conclusions

      DCI can only fulfil the baseline requirements with funding at 100% of the 1996/7 level. Below this there would be serious shortcomings. With 100% funding, and once the recommended changes have been made, there would be 8.6 SY (10.3 SY) of effort and £238k (£152k) of capital and recurrent funds, where the 1996/7 amounts are given in brackets.
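
      The size of the shift implied by these figures can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the four numbers quoted above, the variable names are ours, and the final ratio is simply the increase in funds divided by the reduction in effort. It is not a staff cost quoted by DCI.

```python
# Figures quoted in the text: 1996/7 values, then values after the recommended changes.
staff_sy_before, staff_sy_after = 10.3, 8.6     # staff effort in staff years (SY)
capital_k_before, capital_k_after = 152, 238    # capital and recurrent funds in £k

# Rounded to one decimal place to remove floating-point noise in the subtraction.
staff_reduction_sy = round(staff_sy_before - staff_sy_after, 1)
capital_increase_k = capital_k_after - capital_k_before

# Illustrative ratio only (an assumption of this sketch, not a quoted staff cost):
# funds made available per staff year released.
k_per_sy_released = capital_increase_k / staff_reduction_sy

print(f"Staff effort falls by {staff_reduction_sy} SY")
print(f"Capital and recurrent funds rise by £{capital_increase_k}k")
print(f"Roughly £{k_per_sy_released:.0f}k made available per SY released")
```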

  7. Findings of Review Panel

    The Review Panel has looked at the consequences of the decisions taken by the WB4 Review, and it concludes that the recommended changes were correct and the resulting central computing services have been successful. As proposed by WB4, it is the close coupling of CPU farms with a large datastore that makes the central facilities such a valuable resource and one which makes the community less susceptible to the problems of international networking. The introduction of platform managers has also been a significant factor in the success of the facilities. The staff effort required to run the central HEP computing facilities has dropped from 26 SY pre-WB4, through 16 SY in 95/96, to 10.3 SY in 96/97, which is roughly the steady state level proposed in WB4. Capital and recurrent expenditure has run at about £200k per annum over this period. Although it is very difficult to make direct comparisons, the information received from other national laboratories shows that the level of resources devoted to running the central facilities by DCI is either at, or on the low side of, what would be used elsewhere.

    The User Survey makes it very clear that users have a continuing need for central computer services in order to undertake Monte Carlo dataset production, for other well specified production purposes (e.g. the H1 data reprocessing), for analysis services for the smaller experiments (foreseen for BaBar within a regional centre), and also for a limited amount of code development. The operational terms of reference of the platform managers should be modified by CNAP to recognise the somewhat modified pattern of usage. However the Review Panel does not propose that the facility become a general purpose one. Central resources should continue to be tightly focused around the coupling of a large datastore and a concentration of powerful CPUs. The facilities should complement those available at home institutes and should be of greatest benefit to the majority of users.

    By locating the HEP central facilities at a large IT centre such as RAL/DCI the community gain from the backup available in depth for both staff and equipment. Maintenance costs are also lower because they are set at the level of the contract for CLRC as a whole with the equipment suppliers. However, CNAP should continue to be vigilant that both these advantages are realised in practice.

    The Panel believes that the trend established in WB4 should continue and there should be a further shift in resources from staff and recurrent costs into capital or savings. As a consequence the future staff levels in DCI will reduce from the current 10.3 SY to 8.6 SY with a consequent increase in capital and recurrent funds from £152k to £238k (assuming funding at the 100% level).

    From the existing and projected pattern of use (using information on the use of the Datastore updated to January 1997) it is clear that the capacity of the Datastore needs to be increased urgently. The Panel recommends an increase of 2 Tbytes as soon as possible and if possible a doubling of the existing 10 Tbytes capacity over the next 2 years (97/99). It will also be necessary to introduce a system of space allocation for experimental groups. The details of the latter are being worked out by the CNAP Data Storage Group independently of this review. We note that the Datastore will require a large capital upgrade in 2-4 years time to move to the most cost effective (i.e. low operator overhead) emerging technology and to meet the projected demand as the LHC experiments prepare for data taking. Although the choice of hardware is becoming clear there are some serious issues concerning software and the changing pattern of data storage that need to be addressed. The Panel welcomed the news that there is now UK HEP and DCI involvement in the CERN RD45 [11] development project which is also addressing these issues, and plans are in place for initial OODBMS tests.
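
    The capacity steps recommended above can be laid out as simple arithmetic. This is a minimal sketch using only the figures in the text; it assumes that the 'doubling' refers to the existing 10 Tbytes, and the variable names are ours.

```python
# Datastore capacity figures quoted in the text (Tbytes).
existing_tbyte = 10            # existing capacity
immediate_increase_tbyte = 2   # increase recommended "as soon as possible"

# Capacity after the urgent increase.
after_immediate_tbyte = existing_tbyte + immediate_increase_tbyte

# Capacity after doubling the existing figure over the next two years (97/99).
after_doubling_tbyte = 2 * existing_tbyte

# What remains to be added between the urgent increase and the doubled capacity.
remaining_increase_tbyte = after_doubling_tbyte - after_immediate_tbyte

print(f"After urgent increase: {after_immediate_tbyte} Tbytes")
print(f"After doubling:        {after_doubling_tbyte} Tbytes")
print(f"Still to be added:     {remaining_increase_tbyte} Tbytes")
```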

    Over the last couple of years the power and flexibility of PCs has changed dramatically. It is now possible to run large networked systems using either WNT or Linux to provide desktop machines for staff. Recently, a number of groups both in the UK and in other countries have demonstrated that PC farms could provide very cost effective CPU power.

    If UK HEP is to take advantage of these changes within a constant or falling budget, then some of the existing platforms have to close. The strategy that the Panel followed has been derived from general principles adopted by CNAP at its inception and the information from the user survey. The closure of the VMS service has already been discussed, agreed by CNAP and announced. It is following a trend already far advanced in the community and elsewhere. The decision to close one of the two UNIX platforms was taken as the only way to release resources for the development of a WNT production farm. The choice was dictated largely by the much larger fraction of the community using the CSF platform than by any obvious technical merit of CSF over OSF.

    To meet the needs of the existing users of the VMS and OSF platforms and to allow for natural development, the Panel recommends that the CSF platform be upgraded by roughly 50% in CPU capacity, with a corresponding increase in diskspace and internal network capacity. The CSF platform will be used both for batch production and for the other more varied uses already agreed. The Panel note that the HP operating system needs to move to version HP-UX 10 before the system can be used for BaBar code development. The present WNT development system (approved by CNAP in 1996) should be expanded rapidly to a batch production facility, and its use for code development and analysis should be investigated and, if appropriate, developed over the next couple of years.

    It is recognised that groups affected by the phased closure of existing platforms must be properly catered for, particularly those analysing data at the end of an experiment for whom the effort of changing to a new platform would be both difficult and inefficient. Following discussions with the groups concerned the Panel is satisfied that their legitimate needs will be met.

    Having decided the broad strategy to be recommended for the next few years, the Review Panel drew up more detailed 'baseline requirements' for the central computing services (see section 6.2). The Panel then negotiated with DCI to define the full set of services they could offer against three funding scenarios.

    The conclusions of the Panel concerning the three funding scenarios are :-

    The computing models of BaBar and the LHC experiments anticipate the extensive use of regional computing centres. The Panel has considered the situations of BaBar and the LHC experiments to be somewhat different. The BaBar computing model is more advanced, the requirements are sooner, and the requests, in particular with respect to licensing issues, are better defined. BaBar is pioneering the concept of a UK regional centre and this will offer operational experience in order that plans can be made for the LHC experiments.

    Some of the Panel's recommendations already take into account BaBar's needs. It is foreseen that most of the demands of the BaBar experiment, as they have been formulated to the Review Panel, can be funded from resources allocated to the central services with possible specific items falling on the BaBar experimental budget. This should be monitored by CNAP as the requests become more precisely defined.

    Neither the Review Panel nor CNAP has had time to study the consequences for UK facilities in any detail. However, it is clear that to provide a specialised regional centre for each experiment may require resources beyond those currently assigned or projected for the central facilities. The PPC together with the PPESP will have to consider this fundamental issue as part of the total resourcing of the LHC experiments.

  8. Recommendations & Conclusions

    The Review Panel NOTE the following:

    For the future, the Panel recommends :


    1. That there is a continuing need for a central UK HEP computing facility coupled to a large datastore and that this need is best met by continuing to site the facility at the Department of Computing and Information of CLRC/RAL.
    2. That the baseline requirement (an enlarged CSF platform and a developing WNT platform together with increased capacity in the Datastore; essential network and video conferencing work) be approved with funding at the 100% level for the next three years.
    3. That a similar review of central computing facilities be conducted in 1999/2000 when decisions on major capital spending for the upgrade of the Datastore will have to be taken.
    4. That the BaBar central computing requirements be developed with close collaboration and liaison between the BaBar experiment, CNAP and DCI; and that this pioneering work be used to gain operational experience in order to plan for the requirements of the LHC experiments.
    5. That the PPESP with advice from CNAP consider the implications of resourcing specialised regional computing centres for the LHC experiments as part of the total resourcing of these experiments.

    In more detail our main recommendation (2) above implies the following:

    1. That the VMS and OSF services will close at the end of September 1997. The Panel has ensured through discussion with the users most concerned and the platform managers that adequate arrangements have been made for an orderly run down of these services and the transfer of users to the remaining platforms.
    2. That the CSF platform be upgraded by a 50% increase in capacity and the WNT platform be expanded from its present development size of 5 CPUs to 20. The interface to users will continue to be through identified platform managers. The CSF platform will offer both batch production and limited other services. The WNT platform will be for batch production in the first instance, but its extension to programme development and analysis work will be explored.
    3. That CNAP must redefine the terms of reference for the use of the platforms in line with the updated computing model defined in this report and the platform managers be informed accordingly.
    4. That the capacity of the Datastore should be increased as soon as possible by 2 Tbyte and doubled to 20 Tbyte by 1999. The DLT import/export stacker facility should be installed. A Datastore space allocation system should be put in place urgently, independently of any upgrade.
    5. That the CNAP Data Storage Subgroup, the PPD Central Computing Group and DCI identify a single person for the Datastore to fulfil a similar role in user interfacing to that provided by the platform managers.
    6. That the CNAP Data Storage Subgroup together with DCI keep track of developments from the RD45 project and plan the major upgrade to the Datastore foreseen as necessary in about 3 to 4 years' time.
    7. That 1.5 SY of DCI effort continue to be assigned to essential network monitoring and videoconferencing activities.

Annexes

  1. Terms of Reference of the Review
    1. To review the central computing provision for Particle Physics in the UK.
    2. To evaluate the efficacy and cost effectiveness of the current provision and compare with the recommendations laid down by White Book 4.
    3. To make recommendations on the future strategy, planning for the LHC era but paying special attention to the next 2 years.
    4. To examine an appropriate model for 'Central Computing' taking into account the resources allocated to Particle Physics computing throughout the UK.
    5. To base the above deliberations within the context of future developments expected at CERN, DESY and SLAC as well as in national computing laboratories and large experiments.
    6. To report to the PPC by March 1997.

    Comments:

  2. Membership of the Review Panel

    R.Devenish (Oxford) (chairman)
    A.Halley (Glasgow)
    P.Jeffreys (RAL/PPD)
    S.McMahon (Liverpool)
    R.Middleton (RAL/PPD) (secretary)
    P.Renton (Oxford)
    P.Watkins (Birmingham)

  3. Summary of White Book 4 Review

    The review made by the WB4 Committee [9] covered the provision of data storage, computing power, software and hardware provision and networking. The review followed a period of transition in central computing in the UK in that the mainframe IBM service was no longer being used. Instead the facilities in place were more tailored to Particle Physics usage.

    An extensive user survey was conducted to find out the views in the community on the central services provided and of their future needs. From these replies a clear pattern emerged of the way in which the large majority of physicists, particularly those working on LEP and HERA experiments, carried out their analyses. The survey showed that the only reliable and up-to-date copy of the datasets used for analysis was kept at the accelerator centre (CERN or DESY). The model which emerged was that these datasets were initially accessed using computers at the accelerator centre and a reduced dataset (e.g. ntuple) was brought back to the local institute computers for detailed analysis. This transfer could either be on tape (e.g. exabyte) or by network. The 'shelf-life' of the dataset might be typically one or two months.

    Almost all groups stated that reliable and fast international networking was of the utmost importance. Although this was not, in fact, a centrally provided service in the terms of the review, the network performance was in practice a determining factor in the evolution of the computer models adopted. Many groups gave, although not explicitly asked, impassioned views on the desirability of their local computer facilities. These views were reported in the WB4 report as input for the PPC to decide the balance of expenditure between central and local computing.

    The model described above was not, however, universal. In particular for non-accelerator experiments such as Soudan II there is no central laboratory, and the RAL Datastore has become the central datastore for the experiment.

    Another very important usage of central facilities was the production of Monte Carlo events for collaboration wide distribution. These events represented the UK quota for the various experiments.

    The WB4 committee identified two important areas for the future provision of central services:

    These recommendations had consequences on the nature and scope of the central services. The system envisaged was to be dedicated essentially to these limited goals and could be much simplified from what existed. No general purpose computing was recommended on the central machines (i.e. no e-mail, PAW etc.). Post WB4 some limited analysis work was permitted. Three platforms were foreseen; two UNIX-based (CSF and a more limited OSF service) and VMS. The capacity of the CSF service was to be increased in line with reasonable demand. Up to 10 Tbytes of the 20 Tbyte datastore were foreseen for eventual HEP use. The datasets stored were not to be backed up, so that they were not to be the primary source of the data.

    The recommendations to provide a more restricted service meant that it could be run by fewer people, resulting in economies in the staff costs. Since no back-up service was proposed a reduction was made in the number of operators. Effort was proposed for areas of general importance for particle physics. These included end-to-end monitoring of international network response, ATM and video conferencing. Taking all these considerations into account, and after discussion with CCD (now DCI), the overall staff complement was to be reduced from 20 to 10 SY in a timeframe of about 2 years. It was envisaged that in the longer term further reductions may be possible.

  4. International Context

    Although the current review is concerned with the future of central computing resources for UK HEP and is largely a domestic matter, it is important to put it into the context of developments at the accelerator centres used by UK experimentalists. It is also useful for the Panel and the PPC to be aware of what is happening at other national laboratories in support of their domestic HEP programmes. To this end a short note [10] outlining the purpose of the review and requesting some information was sent to accelerator centres and a number of national labs. In addition to strategic questions, a small number of questions were addressed to specific issues concerned with this review. The respondents are listed at the end of this appendix.

    Of course each of the major accelerator centres (CERN, DESY, FNAL and SLAC) and other national labs work within the context of their history. Only CERN was from the outset a facility to be shared by many countries; the other centres have encouraged collaboration from other institutions and other countries for quite a number of years but within differing frameworks. These differences are also to be found in their approach to computing resources. On the other hand the inexorable change to the most cost-effective way of providing CPU and data storage is common to all, and not surprisingly most centres have followed the trend away from large multi-user general purpose machines to clusters of workstations running a variety of operating systems - but with a predominance of UNIX rather than manufacturer specific systems such as VM or VMS. More recently most countries have at least one centre actively pursuing networked PCs as the next platform for data reduction and event generation, as well as being the box that sits on most people's desks. What is also clear is that nobody considers that they have 'solved' the problem of data storage and access, and this is seen to be "The Problem" in computing in particle physics. There is however strong convergence on the choice of technology being pursued and the desire to move as rapidly as possible towards '100% automation' of tape movements.

    First some comments on the view from the accelerator labs. SLAC is rather different from the other labs in that it is smaller and for a number of years has really only had one large experiment at a time operating, at present SLD and soon BaBar. The computer centre considers itself as a partner with the experimental collaboration in developing the offline reconstruction and analysis facilities. The solution will be a compute farm closely coupled to a large datastore, but it will be owned and run by SLAC for the benefit of all SLAC users. In designing the farm attention has also to be given to the overall data management system and the connection technology between the CPUs and datastore. Some design numbers are useful (as the system is essentially for BaBar): by 1999 the aim is to have 30K MIPS CPU power; 16 StorageTek Redwood drives in 5 silos with a capacity of order 1.5 Peta Bytes capable of delivering data at an aggregate rate of 150 Mbytes/s; an aggregate network capacity within the farm of 250 Mbytes/s and 5 Tbytes of diskspace. A total of 13 people run the SLAC central computing facilities and that includes four responsible for physics tools.
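
    To give a feel for the scale of these design numbers, the sketch below works out how long a single pass over the full store would take at the quoted aggregate rate, and the average rate per drive that this implies. It is illustrative only: it assumes decimal prefixes (10^15 bytes per Pbyte, 10^6 bytes per Mbyte) and that all 16 drives contribute equally, neither of which is stated in the SLAC reply.

```python
# Design figures for the SLAC farm as quoted in the text.
# Assumption: "Peta" and "M" are read as decimal prefixes (10**15 and 10**6).
capacity_bytes = 1_500_000_000_000_000      # of order 1.5 Pbytes of tape capacity
aggregate_rate_bytes_per_s = 150_000_000    # 150 Mbytes/s aggregate delivery rate
n_drives = 16                               # StorageTek Redwood drives

# Time to stream the whole store once at the full aggregate rate.
full_pass_seconds = capacity_bytes / aggregate_rate_bytes_per_s
full_pass_days = full_pass_seconds / 86_400

# Average rate each drive would need to sustain (assuming equal sharing).
per_drive_mbytes_per_s = aggregate_rate_bytes_per_s / n_drives / 1_000_000

print(f"One full pass: {full_pass_seconds:.0f} s (about {full_pass_days:.0f} days)")
print(f"Average per-drive rate: {per_drive_mbytes_per_s:.2f} Mbytes/s")
```

    On these assumptions one full pass takes 10^7 seconds, or roughly 116 days.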

    At CERN the situation is much more complex and it was with the large LEP experiments that the SHIFT system - of large single experiment offline reconstruction and analysis farms - developed. The pattern has also been followed by the HERA experiments at DESY. Although these farms are managed and part funded by the collaborations, the problem of reliable day-to-day operation has not been so easily solved. In fact in most cases the collaborations are now effectively contracting out the operation of their farms to the computer centre professionals. For LEP experiments data storage is not a problem but new systems are being put in place for current and near future high rate experiments such as NA48. This will give CERN some operational experience before detailed decisions have to be made for the LHC experiments. It is to be noted that full automation of tape operations features strongly in the plans. The WB4 computing model has some similarity to those emerging from the ATLAS and CMS groups: it is thought that 'regional data centres' will be needed. The present platforms at DCI were modelled very closely on what was available at CERN a few years ago. CERN expect to continue running UNIX farms and are developing a WNT farm. Concerning support, CERN regard it as very important that those supporting a platform are a coherent team that can provide mutual backup and give 24 hour service. If one assumes that some tape handling is necessary then a total group of about 12 people would be needed to support three platforms, roughly: a team leader, 3 platform managers, one network expert, one for the datastore, 3 general systems staff including user support, and 3 operators. It is also noted that as a platform becomes more general purpose so the amount of support needed grows rapidly.
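
    The breakdown of the support team suggested by CERN can be tallied directly. The sketch below simply sums the roles listed above to confirm that they add up to the quoted total of about 12; the role labels are ours.

```python
# Support roles suggested by CERN for three platforms, as listed in the text.
support_team = {
    "team leader": 1,
    "platform managers": 3,
    "network expert": 1,
    "datastore": 1,
    "general systems (including user support)": 3,
    "operators": 3,
}

# Total headcount implied by the breakdown.
total_people = sum(support_team.values())

for role, count in support_team.items():
    print(f"{count:2d}  {role}")
print(f"{total_people:2d}  total")
```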

    DESY has always had central logging of data from experiments performed on the accelerators there and they continue to provide the central datastore for the HERA experiments. They have a lot of experience with robot systems (ACS, Ampex, Grau) and continue to upgrade to meet demand. Both the large collaborations H1 and ZEUS have their own data reduction farms based on SGI machines, but these are run by the computer centre. HERMES has developed and is using a Linux PC farm and ZEUS is developing a WNT farm. Germany is well served by a national 34 Mbit network which is heavily used by German University groups to access DESY. At the moment quite a lot of manpower is consumed running the datastores, quite a large fraction for software support. It is expected that desktop machines will be largely PCs within a few years.

    Remarks from FNAL are similar regarding farms and datastores. As network connections in the USA are an order of magnitude better on the whole than in Europe the way in which it is possible to work locally there is not a very good guide for the UK at the moment.

    From the national computer centres the reply from IN2P3 was the most extensive. The situation in France has some similarity with that in the UK but the facilities are larger. There are two computer centres, one (CCIN2P3) for experimental HEP and nuclear physics and IDRIS for high performance computing. CCIN2P3 has a staff of 36 and they run a large IBM/SP2 for interactive work and two farms, both based around RS6000 and HP workstations, and the national IN2P3 network PhyNet. The response points out the benefits that they feel they gain from a powerful centre, apart from the technology, particularly the support in depth and the 24 hour operation. The total capacity of the two farms is roughly twice that of the three platforms at DCI. The farms are connected to a large datastore based on a StorageTek system with Timberline drives (upgrade path to Redwoods). One farm is for Monte Carlo simulation, but the other was designed from the outset to be general purpose and has 900 Gbyte of diskspace. One large difference between the mode of operation in the UK and France is that the French do significant analysis at the CCIN2P3 centre and have had a copy of the full DST data for ALEPH and DELPHI there. They also perform a considerable part of data reprocessing for H1. They are not sure yet whether their mode of operation will continue to be valid in the LHC era. They note that there is a difference between geographically widespread collaborations such as BaBar and those that have a large number of collaborating institutes close to the accelerator. They are also not sure that the LHC collaborations will be able to afford a completely centralised computing infrastructure. They therefore intend to retain the capacity for handling and processing a large volume of data locally. The technical solutions that have been adopted and are being investigated are similar to those already described elsewhere.

    One final point to note is that PhyNet, which provides connections to IN2P3 institutions and CERN, is based on private leased lines. It is expected that this will be discontinued in the near future and the traffic will be moved to RENATER, the French Academic Network.

    A general comment that was made by most respondents was the importance of exploiting fully global file systems such as AFS and linking this to multi-level hierarchical file stores based on automated technology. This is most likely to be achieved in the near to medium term within a single country where one can be confident of the existence of a network with sufficient bandwidth. It is another reason why it is important for UK HEP to continue to press for better international network connections.

    Replies were received from

    CERN: D. Jacobs (CN)
    DESY: D. Moenkemeyer
    IN2P3: E. Auge
    INFN: F. Ruggieri
    NIKHEF: W. Hoogland
    FNAL: S. Wolbers
    SLAC: C. Boeheim

  5. References

    1. 'Review of RAL Particle Physics Computing' (15th November 1996)
      http://hepwww.rl.ac.uk/CNAP/CCreview/DCIpaper.ps
    2. 'Additional Input' (25th November 1996)
      http://hepwww.rl.ac.uk/CNAP/CCreview/DCIaddinput.ps
    3. Questionnaire to Users
      http://hepwww.rl.ac.uk/CNAP/CCreview/USERsurvey.txt
    4. Questionnaire on Computing Models
      http://hepwww.rl.ac.uk/CNAP/CCreview/comp-model.txt
    5. Questionnaire to DCI
      http://hepwww.rl.ac.uk/CNAP/CCreview/DCIsurvey.txt
    6. Revised Questionnaire to DCI
      http://hepwww.rl.ac.uk/CNAP/CCreview/DCIrevised.txt
    7. 'Review of RAL Particle Physics Computing' (23rd January 1997)
      http://hepwww.rl.ac.uk/CNAP/CCreview/DCIupdate.ps
    8. Presentation made by DCI to Review Panel (28th January 1997)
      http://hepwww.rl.ac.uk/CNAP/CCreview/DCIpres.htm
    9. The White Book 4 Report
      http://hepwww.rl.ac.uk/CNAP/CCreview/WB4.ps
    10. Note and request for comment from accelerator and national laboratories
      http://hepwww.rl.ac.uk/CNAP/CCreview/international.txt
    11. CERN/RD45 - A Persistent Object Manager for HEP
      http://wwwcn1.cern.ch/asd/cernlib/rd45/index.html