Minutes: 5/11/2011

by John S. De Stefano Jr. last modified May 11, 2011 03:25 PM
Notes from the ATLAS Frontier meeting on May 11 2011.

frontier-minutes-20110511.txt — Plain Text, 5 kB (5162 bytes)

Participants: Dario Barberis, Catalin Condurache, John DeStefano, Alastair 
              Dewhurst, Dave Dykstra, David Front, Elizabeth Gallas, Shawn 
              McKee, Andreas Petzold, Sarah Williams, Andrew Wong

*** Site Status: ***

AGLT2 (US):
- No updates; working on CVMFS deployment

BNL:
- No updates

CERN:
- No updates

KIT:
- No updates; working on manpower for CVMFS nodes, hardware is ready

LYON:
- N/A

RAL:
- No problems; need to follow up on status of additional Frontier box
- CVMFS used for user analysis previously, rolled out to production as well
  * ~5k job slots on farm
- Tier 2 (RAL-PP) set up with 2 v2.7 Squids for CMS Frontier and CVMFS
  * Configured to fail over to Tier 1 Squids
  * Aliases set; host names to be provided for CMS, ATLAS monitoring
- RAL configured to handle IT cloud requests redirected from PIC
  * No production-level load seen yet: ATLAS configuration will be updated
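
The Tier 2 to Tier 1 Squid fail-over noted for RAL-PP above can be pictured
with a small sketch. This is only an illustration of ordered proxy fail-over,
not the Frontier client's or Squid's actual logic; the host names and the
fetch_with_failover helper are hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical proxy list: site (Tier 2) Squids first, Tier 1 Squids as backup.
PROXIES: List[str] = [
    "http://squid1.t2.example:3128",
    "http://squid2.t2.example:3128",
    "http://squid1.t1.example:3128",
]


def fetch_with_failover(
    proxies: List[str], fetch: Callable[[str], str]
) -> Tuple[str, str]:
    """Try each proxy in order; return (proxy_used, response_body).

    `fetch` is any callable that performs the request through the given
    proxy and raises OSError (or a subclass) on connection failure.
    """
    errors = []
    for proxy in proxies:
        try:
            return proxy, fetch(proxy)
        except OSError as exc:
            # Remember the failure and fall through to the next proxy.
            errors.append((proxy, exc))
    raise ConnectionError(f"all proxies failed: {errors}")
```

Listing the local Squids first keeps normal traffic on site; the Tier 1
entries are only reached when every earlier proxy refuses the connection.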

TRIUMF:
- No updates; upcoming Oracle outage today for kernel upgrades
  * No site Squid cache yet; jobs must be stopped during intervention


*** CVMFS Tests at MWT2: Sarah Williams ***

Slides: http://www.mwt2.org/~sarah/cvmfs-20110511.pdf

*** Project News: Dario Barberis ***

- David Front has been granted access (thanks to Serguei) to the server that
  Flavia used for building rpms. He should have complete information to
  continue this work in the future. David, let me know if my statement is
  incorrect!

- Roman Sorokoletov started earlier this month to support ATLAS databases,
  and the idea is that he will support the Frontier servers at CERN too. As he
  is new to this system, he will need information and guidance in the near
  future.

- Florentin Bujor will start on 1st June to work full time on Frontier/Squid
  monitoring for ATLAS. Here is his task description as was defined (much)
  earlier this year:

- Title: Deployment and operation for monitoring tools for Squids and Frontier
  servers

- Context: Several tools exist that are potentially useful to monitor Frontier
  servers and Squids. Deployment work is needed in order to create a robust
  system in the context of the ATLAS Monitoring activities and to be able to
  pass on the monitoring tasks to ADCoS shifters. In addition to Squids
  currently used for database access, during 2011 Squids will also be used for
  software releases, database releases and conditions data. More work is
  needed in the deployment and monitoring of Squids for these activities.

- Work description:
  * Frontier servers: integrate SLS monitoring with the Site Status Board
    (SSB). Deploy AWSTATS monitoring to all sites and integrate into SSB.
  * Database Squids: check MRTG stats collection and find some meaningful
    display. Chase failing sites as their setup must be incorrect. Make sure
    the SSB info is reliable. Update the SAM tests and display them in SSB.
    Provide a global view for shifters including Frontier servers and Squids
    health.
  * Software and conditions data Squids: provide entries in SSB for them.
    Develop and deploy appropriate tests and make sure the results display
    correctly. Create shifters' views.
  * General site monitoring: provide support for monitoring tools in
    collaboration with the other members of the ADC Monitoring group.

- Please let me know if this job description should be updated and what you
  think are the priorities for Florentin (i.e. where he should start).

- Flavia finally responded and offered to find a few hours during the last
  week of June to pass on her SLS monitoring scripts, cron tasks, or whatever
  is needed to Alessandro Di Girolamo and/or Simone Campana. Unfortunately,
  there will be no overlap between Flavia and Florentin at CERN, so this
  additional step is needed. It would be better still if John and/or Alastair
  could be involved in this discussion.

- Alessandro DG and Alexey A are waiting for feedback on the current AGIS API.

*** Deployment, Development, and Testing: ***

AGIS:
- Awaiting feedback on beta API

CMS ASGC issue:
- High load on database server
  * Taiwan site bandwidth bottlenecked by severed network cable
  * Frontier connections saturated
- Solution: have launchpad Squids quickly read entire response from
  Tomcat to free up database connections, then take as long as
  necessary to send to client
  * Risk: can cause over-allocation of Squid memory under highly unlikely 
    circumstances
  * Not ready to implement in ATLAS without proper monitoring of Squid memory 
    size
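
The trade-off in the ASGC solution above (free the database connection
quickly, at the cost of Squid memory) can be sketched with a simple fluid
model. This is an illustration under stated assumptions, not a measurement:
constant transfer rates, a single response, and a read-ahead limit that caps
how far Squid may buffer ahead of the client. Squid has a read_ahead_gap
directive for such a limit; whether that is the setting discussed here is an
assumption. The function name upstream_hold is hypothetical.

```python
from typing import Tuple


def upstream_hold(
    size: float, upstream_rate: float, client_rate: float, read_ahead: float
) -> Tuple[float, float]:
    """Return (seconds the upstream connection is held, peak buffered bytes).

    size          -- response size in bytes
    upstream_rate -- bytes/s from Tomcat to the launchpad Squid
    client_rate   -- bytes/s from the Squid to the (slow) client
    read_ahead    -- max bytes buffered ahead of the client (may be inf)
    """
    if upstream_rate <= client_rate:
        # Client keeps up: nothing accumulates in the buffer.
        return size / upstream_rate, 0.0
    fill_rate = upstream_rate - client_rate
    # Backlog that would build up if the upstream read were never throttled.
    max_backlog = size * fill_rate / upstream_rate
    if read_ahead >= max_backlog:
        # Whole response is read at full speed; connection is freed early.
        return size / upstream_rate, max_backlog
    # Buffer hits the limit; the rest is read at the client's pace.
    fill_time = read_ahead / fill_rate
    read_so_far = upstream_rate * fill_time
    hold = fill_time + (size - read_so_far) / client_rate
    return hold, float(read_ahead)


# Example figures (assumed): 100 MB response, 100 MB/s upstream, 1 MB/s client.
limited = upstream_hold(100e6, 100e6, 1e6, 16 * 1024)
unlimited = upstream_hold(100e6, 100e6, 1e6, float("inf"))
```

With these assumed figures, the limited case holds the upstream connection
for just under 100 s with 16 KB buffered, while the unlimited case frees it
after 1 s but buffers about 99 MB, which is why Squid memory monitoring
matters before enabling this.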

Meeting times:
- Will change to bi-weekly Thursday at 9:30 ET, alternating with the
  bi-weekly WLCG T1 Service Coordination meeting

LHCb:
- No feedback as of today

Packages:
- David working on all ATLAS Frontier and Squid packages
  * RPMs in SVN repository include suggested changes, prerequisite packages
  * Overwrite of customization script in relocated installations could not
    be reproduced
  * Possible problem with hourly cron scripts
  * More testing necessary

SLS:
- Monitoring still broken, being investigated
  * Alerts have been disabled

*** A.O.B.: ***

ReadyTalk:
- Bugs reported last week; fixes implemented in current interface
- Can't call non-US toll-free numbers from Skype