iSGTW - International Science Grid This Week
iSGTW, 23 June 2010 - Feature

Just what do you think you’re doing, Dave?

Anyone who has seen the movie 2001: A Space Odyssey remembers the phrase “Just what do you think you’re doing, Dave?” While it looks similarly futuristic, ATLAS (above) is the setting for something completely different — the ‘Dave’ dataset. Photo courtesy CERN

What is 45 days old, has visited 22 countries and already has close to 500 children but will outlive them all?

Dave the ATLAS dataset.

Born on 30 March 2010 to his proud parents, the ATLAS detector and the LHC, Dave weighed in at just over 3 terabytes. Very soon after that, he was sent to the Rutherford Appleton Laboratory (RAL) grid operations center, and the team there have been tracking his progress through the grid world ever since.

Dave is the result of accelerating two beams of protons in the Large Hadron Collider (LHC) and colliding them in the center of the ATLAS detector.

He is a selection of the collisions chosen by the triggers at ATLAS to be of interest to physicists. As soon as Dave was packaged up, he was winging his way to RAL via the dedicated 10 gigabit/s links between the UK and CERN.

Once he arrived at RAL, he was copied into the CASTOR storage system and registered in the ATLAS LFC (LCG File Catalog), ready to be processed by scientists from across the world.

It didn't take long: within 8 days Dave already had 51 descendants. These are the datasets that result from scientists using Dave to examine their physics models, and they are in turn available for further analysis. Keeping track of all of Dave’s children (and grandchildren) isn’t easy, but thankfully they currently break down into three types: Ursula, Dirk and Valerie.

• Ursula: Datasets created by individual ATLAS users.
• Dirk: Datasets produced as part of the standard ATLAS central production system.
• Valerie: Datasets produced and used by the central ATLAS production system to validate a site’s ability to run various data analysis configurations.

Brian Davies at RAL is the man behind the idea to track a single dataset. He says that he is happy with how it has gone so far: “One of our tasks is to follow data distribution and to see if it follows the virtual organization’s computer model. Dave was the first custodial raw dataset that RAL was responsible for from the 2010 7 TeV run of the LHC. Tracking Dave is not just for fun: one of the aims is to answer the question ‘From one run at the LHC, how many files and what volume of data will get produced?’

“The thing which has been the most enlightening has been how quickly the number of real ATLAS users who have analyzed Dave and his children has grown. Besides that, I am interested to see how quickly his children become obsolete (if at all), and once this happens, how quickly they are removed from the grid.”

You can follow Dave’s progress along with that of his kids at the GridPP storage blog.

Neasan O'Neill, GridPP

