CHIPP Computing Board Meeting
=============================

CERN: 15th September 2004

Present: Hanspeter Beck, Roland Bernet, Derek Feichtinger, Christoph Grab,
         Imma Riu, Gian Luca Volpato, Frederik Orellana,
         Marie-Christine Sawley
         via VRVS: Stephan Egli
         temporarily: David Stickland

All presentations are available on the Web.

1) Introduction and presentation of agenda. (C.Grab)
-----------------------------------------------------

2) Status reports Data challenges and production tests of experiments:
-----------------------------------------------------------------------

* ATLAS : F.Orellana
  ATLAS production is running exclusively on grids: LCG, NorduGrid and
  Grid3. One big challenge, both for ATLAS as a whole and for ATLAS in
  Switzerland, is user analysis.
  Jobs fail mainly due to misconfiguration; they die right at the
  beginning rather than at the end.
  An important short-term question to be answered: which type of events
  do Swiss ATLAS physicists want to access now - DC2 Monte Carlo,
  testbeam data?

* CMS : D.Stickland
  The main aim is to produce simulated data for the physics TDR due in
  2005. The target is a throughput of 10 million events per month.

* LHCb : R.Bernet
  Production is ongoing smoothly, the biggest challenges being monitoring
  and data transfer. The next challenge will be data analysis for users.

_______________________________________________________________________

3) Presentation of CSCS : MC Sawley.
------------------------------------

* MCS presents the organisation and plans of CSCS, its latest
  developments, new rules and organisation, as well as long-term plans
  to acquire more hardware.
  By the end of 2004, new hardware for high-energy physics will be
  bought, with specifications agreed with the community. Tendering is
  ongoing.
  MCS confirms the commitment to become a strong partner for CHIPP in
  terms of computing.
_______________________________________________________________________

4) Purchase of a new Phoenix cluster at CSCS
--------------------------------------------

* Status report from Gian Luca on the old cluster.

* Status report from Gian Luca on the present interim solution (the 5
  machines), which maintains the Swiss contribution to LCG at a minimal
  level.

* Presentation of requirements and offers for the Phoenix cluster.
  Extensive discussions led to an extension of the requirements, which
  Gian Luca will include in the next round of offers. The main points of
  the discussion are listed below.

Issues discussed for upgrading the CH-LCG cluster at CSCS:
----------------------------------------------------------

o A.Rubbia raised a few issues, which are listed in the slides by
  C.Grab. They are taken into account in the discussion.

o 32-bit versus 64-bit architecture:
  - LCG and experiment software presently operate reliably on 32-bit.
  - Advantages of 64 bits: very large file access;
                           very large jobs (memory).
  - CERN Linux should/will run on 64-bit machines.
  => We need reliability, so we all agree on 32-bit.

o Intel vs AMD Opterons:
  In industry, large installations of both architectures exist.
  In LCG there are only large Intel ones so far.
  Other centres (Prague, Karlsruhe ...) just bought large Intel-Xeon
  clusters. Karlsruhe also bought 1/3 Opterons for "learning" (see
  slides by C.Grab).
  => We agree on Intel.

o Setting priority on the file-server (FS):
  Highest priority should be given to the file-server.
  We need high reliability over high availability (SE), high capability,
  and redundancy for data storage (RAID). (Question of RAID-5 or
  RAID-6 ??)
  The purchase of the file-server can be separated from the purchase of
  the compute nodes.
  A higher level of service and maintenance agreement is required for
  the FS.
  For the FS, we need a backup possibility, to be investigated by CSCS.

o Compute nodes (CN):
  The level of service agreement for the CN can be "lower" than for the
  FS.
  System must be expandable, i.e.
  we want to be able to just add another set of nodes next year.

o Needs for remote administration:
  - Remote on/off; reboot nodes
  - Status of the node
  - Temperature alert
  - Console export

o Operating System: full compatibility with Linux RedHat 7.3

o Summary of the hardware requirements:
  --------------------------------------
  - One File Server: reliable server with RAID system; good service
    contract; CPU type irrelevant; with >= 5 TByte of disk storage
    [ Dual-CPU; 2 GB RAM; 1 floppy drive; 1 CD-ROM drive;
      2 Gbit network cards; 100 GB hard disk;
      Redundant power supply; Hot-swappable S-ATA disk drives ]

  - One Master Node: reliable
    [ Intel 32-bit architecture; Dual-CPU; 4 GB RAM;
      1 floppy drive; 1 CD-ROM drive; 1 Gbit network card;
      2 * 160 GB hard disks; Redundant power supply ]

  - 20 Compute Nodes: Intel Xeons
    [ Intel 32-bit architecture; Dual-CPU;
      2 GB RAM per CPU = 4 GB per node;
      no floppy drive; 1 CD-ROM drive;
      1 Gbit network card; 100 GB hard disk ]

  - Offers from several companies, i.e. Dalco, SUN, IBM, HP, DELL.
    Comments on companies:
    Egli: PSI, UNIZH, ETHZ have very good experience with DALCO.
    Rubbia: good experience with DELL, although in desktops.

Various points from the discussion:

o Infrastructure (connection, networks, daily operation ...) is provided
  by CSCS. Details of this will be written up in a
  "service+maintenance contract" document to be signed by CSCS and
  CHIPP.

o The cluster named "Phoenix" will be (for now) purely LCG. The
  situation will be re-discussed once it has been operating stably for
  a few months.

______________________________________________________________________

5) Funding Request (C.Grab)
---------------------------

o The request for funding (submitted to the Swiss National Science
  Foundation (SNF) on behalf of CHIPP in Feb. 2004, asking for a total
  of 128 kCHF) was judged positively. The official letter should arrive
  within the next few weeks.
  The money will not be available before October; it is not really
  needed earlier.
_______________________________________________________________________

6) Status of LCG development ARDA: (Feichtinger)
------------------------------------------------

D.Feichtinger described the status of the new gLite grid middleware,
which is based on the AliEn middleware used by the ALICE experiment.

Copies of transparencies are available on: www.chipp.ch

C.Grab; 24.9.2004_V2
________________________________________________________________________