iSGTW - International Science Grid This Week

iSGTW 21 October 2009

Feature - Here to help: embedded cyberinfrastructure experts


It isn’t easy designing software that can run on a cluster like Fermilab's Grid Computing Center. That’s why advanced technical support is so essential. Photo by Reidar Hahn, Fermilab Visual Media Services.

Although much of today’s scientific research relies on advanced computing, many researchers find it daunting to learn how to adapt and optimize applications to run on supercomputers, grids, clouds, or clusters.

To help newcomers, many cyberinfrastructure providers offer in-depth support tailored to fit each user’s needs. This is much more than the typical technical support that helps users write scripts to enable their jobs to run. Instead, cyberinfrastructure experts are embedded directly into a user team to provide longer-term assistance.

One example is TeraGrid User Support and Services, led by director Sergiu Sanielevici.

“The designation of a supercomputer is that it’s basically five to ten years ahead of the regular curve of technology,” said Sanielevici. “We’ve known from the start that it would be an absolute necessity for there to be experts at the various supercomputing sites to make a bridge between this advanced technology and the users.”

Each of the 11 TeraGrid sites is part of the User Services Working Group, whose members meet bi-weekly to discuss best practices and share tips. To access around-the-clock support, users can email the TeraGrid help desk or submit a request via the TeraGrid user portal. All requests are immediately sorted, and site-specific problems are handed off to support staff at the relevant site.

User teams can request a TeraGrid expert in their field to help solve a specific problem with their application. For example, applications originally created to run on only a few hundred processors must be adapted to run on thousands of processors. This can be a challenging task for an inexperienced grid or supercomputer user and could take several months to figure out without the help of an embedded TeraGrid expert.

About a month into each quarter, TeraGrid support staff at each site contact new users to see how their projects are going and whether they have had any problems accessing the resources. “Users are a little hesitant to ask for help sometimes, so this is a proactive way to try to see who’s having problems,” said Chris Hempel, associate director of user services at the Texas Advanced Computing Center at The University of Texas at Austin, a TeraGrid site.

Support staff at each site can also detect when an unusual amount of stress is being placed on the system and contact that particular user to help identify and correct the problem. “We work with them to fix their code so it becomes more efficient and places a lot less stress on the system,” Hempel said.

Open Science Grid offers a similar level of embedded support through its Engagement Program, which users can access by sending an email to the Engagement Team or submitting a ticket through the OSG portal.

“The Engagement Program is the front door for many users who come into OSG without a lot of existing knowledge of how to operate in a large-scale distributed environment,” said John McGee, OSG Engagement Program Coordinator.

Because OSG consists of many more sites than TeraGrid, the Engagement staff typically handles all support issues centrally instead of directing users to staff at individual sites.

The Engagement staff proactively monitors large job runs and offers support to users with problems or failures. A lot of time and effort is also spent behind the scenes to improve OSG infrastructure so users can more easily get up and running, and avoid failures in the first place.

To help new users get started, Engagement staff members add code to the user’s application to make it as efficient as possible and then run it on OSG to ensure it operates correctly. The staff teaches users how to modify the added code to best suit their needs and then continues to support them as questions arise.

It is important to have cyberinfrastructure experts from many scientific fields provide the advanced embedded support needed to help scientists learn how to take full advantage of high-performance computing resources, McGee said. “It’s really important to not only immerse scientists in the cyberinfrastructure, but also to immerse the cyberinfrastructure experts into what it is the scientists are trying to accomplish.”

Amelia Williamson, for iSGTW
