GridKa School 2010

Abstracts

 
 
 
Grid Computing and Cloud Computing, an Overview Tony Cass, CERN September 6, 14:30 - 15:30
Reading the headlines, or looking on googlefight, Clouds look to be eclipsing Grids as the answer to all computing problems. But is this the reality? In this presentation I will review the backgrounds and histories of both, comment on the relevance of the two techniques to different computing domains and perhaps be rash enough to give some thoughts on how HEP computing may develop.
 
 
The Tier-1 center GridKa at KIT Andreas Heiss, KIT September 6, 16:00 - 17:00
The German regional data and computing centre for high energy physics, GridKa, is one of 11 Tier-1 centres of the Worldwide LHC Computing Grid (WLCG).
GridKa is responsible for raw data storage and reprocessing for all four LHC experiments. It exchanges data with Tier-2 centres in several European countries and provides regional Grid services. Besides its role in WLCG, GridKa delivers compute power and storage capacity to several non-LHC experiments and many other VOs of the German D-Grid initiative.
 
 
Virtualization Ulrich Schwickerath, CERN September 7, 9:00 - 9:40
In 2009 CERN started to develop an Infrastructure as a Service (IaaS) setup, with the aim of running the batch farm as a (first) application on top of this infrastructure. Initial estimates indicated that one of the main challenges would be the required scale of the installation, which will have to be able to manage 10,000-20,000 simultaneous instances of virtual machines. A prototype of such a system has been implemented and tested on a large scale, and demonstrated to be able to cope with up to 15,000 VM instances simultaneously.
In this presentation the basic ideas and concepts for a virtualized batch system are presented. Specific aspects of designing a scalable infrastructure, like efficient image distribution mechanisms, are described in more detail. First preliminary results and performance measurements are shown, and experiences and lessons learned are summarized. The presentation is complemented by the talk given by Tony Cass.
 
 
Private Clouds with Open Source Christian Baun, KIT September 7, 9:40 - 10:20
The hype around cloud computing that started at the end of 2008 still exists in 2010. When talking about cloud computing, it must be kept in mind that different organizational types of cloud services exist.
  • Public Cloud: Service consumer and provider belong to different organizations.
  • Private Cloud: Service consumer and provider belong to the same organization.
  • Hybrid Cloud: Services from Public and Private Clouds are both used inside one infrastructure.
Also various technical organizational types of cloud services exist.
  • Infrastructure as a Service (IaaS) allows running virtual instances of servers without the need to directly access the bare metal.
  • Platform as a Service (PaaS) is a runtime and sometimes a development environment for a single or few programming languages.
  • Software as a Service (SaaS) provides applications to be consumed as a utility inside the browser without the need to install anything on client side.
  • Humans as a Service (HuaaS) follows the principle of crowdsourcing. Here human creativity can be used as a resource for a few pence or for free.
When building a Private Cloud, it is a good idea to choose a solution that is open source and API-compatible with the services of well-established Public Clouds. In this talk, open source Public Cloud software that meets these demands is discussed and compared to its Private Cloud brothers.
 
 
Large Scale Data Facility A. Garcia, KIT September 7, 10:50 - 11:30
The Large Scale Data Facility project was initiated in 2009 with the aim of providing added-value data services for high throughput scientific experiments, notably biological microscopy image processing and synchrotron radiation science in the initial phase.
The project involves several KIT Institutes and cooperates closely with Bioquant at the University of Heidelberg, with the aim of developing the services into a central storage resource for other research institutions in Germany. Data storage capacity in the multi-petabyte range is to be combined with high throughput data processing solutions like Hadoop, metadata management tools, and user-friendly access portals to deliver added value to the scientific end user.
 
 
The European Grid Landscape – from 2010 onwards Achim Streit, KIT September 7, 11:30 - 12:10
In 2010 the European Grid Landscape went through a significant change. The trilogy of successful EGEE projects is followed by the new European Grid Initiative (EGI) with its technology providers EMI (European Middleware Initiative) and IGE (Initiative for Globus in Europe).
The European Grid Initiative builds on top of the National Grid Initiatives (NGIs) in the European member countries, which operate and provide the computational resources as well as perform major parts of the operational duties. EGI itself has become a legal entity called EGI.eu with its headquarters in Amsterdam. Its objective is “to create and maintain a pan-European Grid Infrastructure in collaboration with National Grid Initiatives (NGIs) in order to guarantee the long-term availability of a generic e-infrastructure for all European research communities and their international collaborators.” In order to bootstrap this goal, the European Commission has granted a project called EGI-InSPIRE to facilitate the Europe-wide coordination of the various national activities and tasks.
In addition the European Commission has granted two projects to support, maintain and harmonize the four middleware technologies used in EGI, namely the technologies ARC, gLite and UNICORE in the EMI project and the Globus Toolkit in the IGE project.
This talk presents an overview of EGI and NGIs, EMI and IGE.
 
 
gLite Introduction Course Markus Stober, KIT September 7, 13:30 - 18:30
An introduction to the basic concepts of the grid and the gLite middleware is presented. After a short explanation of authentication and authorisation on the grid, the participants get acquainted with the workflow of grid jobs by submitting their first jobs to the grid. In addition, grid-based data storage and access is explained and practiced. Based on this, the participants learn advanced grid job submission techniques. The course concludes with a complex exercise that combines all tools and techniques presented before.
 
 
Unicore Workshop Rebecca Breu, Forschungszentrum Jülich September 7, 13:30 - 18:30
This session provides an overview of the grid middleware UNICORE. First, the system's overall architecture will be introduced, followed by a discussion of the features and some technical details of its main components. This includes a discussion of services for job execution, data storage, and user management, as well as service discovery and security issues.
In the practical part of the session, the participants will install the UNICORE server components. Afterwards the installation and usage of the UNICORE command line client and the Eclipse based graphical client will be covered. The participants will learn how to create, submit and monitor single jobs and workflows, as well as transfer files between sites.
 
 
Cloud Computing Tutorial Viktor Mauch, KIT September 7, 13:30 - 18:30
In the last two years cloud computing has achieved an important status in the IT scene. The hiring of computing power, storage and applications according to requirements is regarded as future business. In addition to commercial providers of IT resources a large number of open source solutions have found their way into the market.
This tutorial course gives an introduction to the basic concepts of cloud computing based on the common open source software framework OpenNebula.
 
 
xrootd Fabrizio Furano, CERN September 7, 13:30 - 18:30
This tutorial aims at introducing the basic knowledge that a system administrator should have about the xrootd platform and its deployment.
Starting from the basic concepts of its architecture, we will give the information needed to build a working xrootd cluster configured for file access. We will also cover the aspects related to the bundled setups, which encapsulate most of the technological aspects into a fully working solution. The tutorial will also enable students to find their way around the documentation so that they can understand more complex setups.
 
 
Grid Computing for LHC Johannes Elmsheuser, LMU München September 8, 9:00 - 9:40
At full operation intensity, the LHC at CERN will produce roughly 15 Petabytes of data annually, which thousands of scientists around the world will access and analyze in the Worldwide LHC Computing Grid. In this presentation I will describe the grid infrastructure set up by the LHC experiments to meet this challenge. Emphasis will be put on aspects of distributed analysis as it is carried out by many physicists nowadays to analyze the newest LHC data.
 
 
User Support for Distributed Computing Infrastructures – The EGI model Torsten Antoni, KIT September 8, 9:40 - 10:20
In 2010 the European Grid Landscape went through a significant change. The trilogy of successful EGEE projects is followed by the new European Grid Initiative (EGI) with the National Grid Initiatives as its building blocks and EMI (European Middleware Initiative) and IGE (Initiative for Globus in Europe) as technology providers.
Major changes like these also influence the way user support is provided and the tools on which the support workflows are built.
This presentation will give an overview of the support activities in EGI, covering application and user community support as well as support for middleware and infrastructure related issues. The relevant tools and workflow will be presented and the major new developments and changes will be introduced.
The EGI user support model is well suited to be extended to other distributed computing infrastructures.
 
 
gLite Administration Workshop Stefan Freitag, Florian Feldhaus, University of Dortmund September 8, 10:50 - 18:30
Stefan Freitag is working at the Robotics Research Institute in Dortmund.
For three years he has been administering Grid resources running different middleware flavours, and he is one of the site security officers at Dortmund University of Technology.
Florian Feldhaus is working at the IT & Medien Centrum in Dortmund. He has been a site administrator for the Grid resources in Dortmund since 2007. As a physicist he worked in the LHCb collaboration for 2 years and participated in the development of the DIRAC Grid Framework.
Abstract: This administration workshop deals with the handling of various gLite site services. In lectures and discussions the students will learn how to deal with gLite installation, configuration, updates and problems, monitoring, and the challenges of larger sites. The course offers a number of different services to install and manage, such as a CREAM Compute Element, a site BDII, a batch system and many more.
 
 
Globus Workshop J. Laitinen, F. Zrenner, LRZ September 8, 10:50 - 18:30
This workshop targets users and administrators of Grid services. In the lecture we will give a short introduction to Grid computing with Globus, introduce the brand new Globus 5 toolkit release (GT5), explain how its various tools can be used to build and use a Grid, and describe where to find support in Europe.
In the hands-on session we will focus on the following most important tools from GT5:
  • Interactive Login (gsissh+GSI-SSHTerm)
  • File Transfer (GridFTP)
  • Cloud Services for File Transfer (via globus.org)
  • Credential Service (MyProxy + Short Lived Credential Service)
  • Job Submission (GRAM5)
The participants will install these tools and build a small Grid which they will then learn how to use.
Please check that your laptop has Java 5 or 6 Webstart installed.
You can click this link to test if the Java Webstart based GSI-SSHTerm starts.
You can accept or close any security warnings that open. If you see them, your installation is fine. In case of problems please contact grid-admin does-not-exist.lrz de

 
 
Alien Workshop Steffen Schreiner, CERN September 8, 10:50 - 18:30
AliEn is a lightweight Open Source Grid Framework built around other Open Source components using the combination of a Web Service and Distributed Agent Model. As the central middleware of the physics experiments ALICE, CBM, and Panda, it constitutes their virtual organizations (VOs) and handles the processing and storage of the experiments' physics data, in particular simulation, reconstruction, and analysis (see http://alien2.cern.ch and http://alimonitor.cern.ch).
In alternating presentation and hands-on sessions, the tutorial will cover all basic aspects of deploying and using AliEn from a user perspective:
 - Client installation, setup, and access
 - Basic commands and functionality
 - Handling and editing files
 - Job submission, control, and the Job Description Language (JDL)
 - Working with the file catalog

The hands-on sessions will be conducted in the Grid VO of the ALICE experiment.

Steffen Schreiner is working as an AliEn developer at the ALICE experiment at CERN within a collaboration with the Center for Advanced Security Research Darmstadt (CASED).

 
 
High Energy Physics Session Hartmut Stadie, University of Hamburg September 8, 10:50 - 18:30

agenda

 
 
Distributed parametric optimization with the Geneva library Rüdiger Berlich, KIT September 9, 9:00 - 9:40

The Geneva ("Grid-enabled evolutionary algorithms") library is a C++-based Open Source solution for distributed parametric optimization studies (see http://launchpad.net/geneva for the code), developed with the support of Karlsruhe Institute of Technology and the Helmholtz Association of German Research Centres. It scales up to several hundred client nodes, running in a cluster, for solving large scale technical and scientific optimization problems in parallel. Apart from evolutionary algorithms, Geneva also implements swarm algorithms. The presentation introduces the topic of parametric optimization and discusses use cases and the challenges involved in creating applications for Grid and Cloud environments.

 
 
ROOT/PROOF tutorial Jan Fiete Grosse-Oetringhaus, CERN September 9, 10:50 - 18:30
The Parallel ROOT Facility, PROOF, enables the interactive analysis of distributed data sets in a transparent way. It exploits the inherent parallelism in data sets of uncorrelated events via an architecture that optimizes I/O and CPU utilization in heterogeneous clusters with distributed storage. Furthermore, it allows the full potential of multi-core machines to be exploited.
The first part of the tutorial starts with a short introduction to ROOT and some tools for fast data analysis and visualization on a desktop machine. Participants will learn how to lay out, compile and use a self-defined event class for analysis data stored as ROOT trees in ROOT files. These are used in general as a basis for analysis on a desktop or on a PROOF cluster. The ROOT selector framework will be discussed in detail.
The main part of the tutorial explains the advanced architecture of PROOF and its usage for tree/selector based analyses.
In practical exercises participants will be guided to run interactive analyses using an existing PROOF cluster and a distributed dataset. The emphasis is on the practical usage of PROOF rather than administrative aspects.
 
 
ARC Grid Middleware Admins Tutorial Ivan Degtyarenko, EMI September 9, 10:50 - 18:30
The tutorial lectures and hands-on exercises introduce concepts of grid computing with ARC middleware and give the audience the skills needed to install and maintain ARC grid resources. In addition, examples of running grid jobs and managing data on ARC-based grids will be presented. The course is particularly aimed at grid administrators and support team members, but is also open to others interested in ARC middleware.
 
 
dCache Workshop S. Kalinin, University of Wuppertal; P. Millar, DESY; C. Mitterer, LMU; X. Mol, KIT; O. Tsigenov, RWTH Aachen September 9, 10:50 - 18:30

agenda

 
 
GAT Tutorial A. Beck-Ratzka, AEI Potsdam September 9, 10:50 - 18:30
Overview
GAT is an easy-to-use Grid API, which enables uniform job and file management in a heterogeneous Grid environment via adaptors. A user application only needs to code against the GAT API, and then it is possible to access gLite, Globus and Unicore resources with the same code. Due to the availability of so-called local adaptors, it is possible to create a program without having access to the Grid. This leads to a remarkable reduction of the development time. Furthermore, it is much easier to code against the GAT API than directly against a middleware API such as that of Globus.
The tutorial gives a short introduction to GAT and offers sessions on file management and job management using GAT. The tutorial is restricted to the Globus middleware.
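The adaptor idea described above can be illustrated with a minimal Python sketch. This is not the GAT API: the class names, the ADAPTORS registry and the copy_file function are hypothetical, and the "globus" adaptor is a stand-in that only returns a description of what it would do instead of contacting any Grid service.

```python
from abc import ABC, abstractmethod


class FileAdaptor(ABC):
    """Hypothetical common interface that user code programs against."""

    @abstractmethod
    def copy(self, src: str, dst: str) -> str:
        ...


class LocalAdaptor(FileAdaptor):
    """Stand-in for a local adaptor: usable without any Grid access."""

    def copy(self, src: str, dst: str) -> str:
        return f"local copy {src} -> {dst}"


class GlobusAdaptor(FileAdaptor):
    """Stand-in for a middleware adaptor; it does not talk to a real server."""

    def copy(self, src: str, dst: str) -> str:
        return f"gridftp copy {src} -> {dst}"


# Hypothetical registry mapping a middleware name to its adaptor class.
ADAPTORS = {"local": LocalAdaptor, "globus": GlobusAdaptor}


def copy_file(src: str, dst: str, middleware: str = "local") -> str:
    """User-facing call: identical code, whichever adaptor is selected."""
    adaptor = ADAPTORS[middleware]()
    return adaptor.copy(src, dst)
```

In this sketch a program can be developed and tested with middleware="local" and later switched to "globus" without changing the calling code, which is the property the overview attributes to local adaptors.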
Requirements for participants: Own Laptop and Sun JDK version 6. An IDE would be helpful, but is not mandatory.
Agenda
10:50 – 11:40: Introduction and installation round
11:40 – 12:40: Lunch
12:40 – 14:40: File Management
14:40 – 15:10: Coffee Break
15:10 – 18:30: Job Management including further coffee break
 
 
Pilot frameworks J. Schultes, University of Wuppertal September 10, 9:00 - 9:40
All four LHC experiments have started to use pilot frameworks (AliEn for ALICE, PANDA for ATLAS, CMSGlideins for CMS and DIRAC for LHCb) as an alternative to a central broker. This development was driven by the fact that grid jobs failed after being queued for a long time because the site they were sent to did not fulfil their requirements. A pilot is therefore a small job that checks the system resources actually available and then asks for a job that fits those resources ("Late Binding"). The talk will go through the concept of job submission via pilot jobs and discuss its advantages and disadvantages.
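The late-binding idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the implementation of any of the frameworks named above: the classes Job, Resources and TaskQueue, the function run_pilot, and the choice of memory and disk as the only requirements are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Job:
    name: str
    min_memory_mb: int  # hypothetical requirement fields
    min_disk_gb: int


@dataclass
class Resources:
    memory_mb: int
    disk_gb: int


class TaskQueue:
    """Hypothetical central queue: jobs wait here until a pilot asks for one."""

    def __init__(self, jobs: List[Job]) -> None:
        self._jobs = list(jobs)

    def request_job(self, available: Resources) -> Optional[Job]:
        # Hand out the first waiting job whose requirements fit the resources
        # reported by the pilot; the job is bound to a node only at this point.
        for index, job in enumerate(self._jobs):
            if (job.min_memory_mb <= available.memory_mb
                    and job.min_disk_gb <= available.disk_gb):
                return self._jobs.pop(index)
        return None


def run_pilot(queue: TaskQueue, probe: Callable[[], Resources]) -> Optional[str]:
    """A pilot first checks what the node actually offers, then asks for work."""
    available = probe()
    job = queue.request_job(available)
    return job.name if job is not None else None
```

For example, with a queue holding Job("reco", 4000, 50) and Job("sim", 1000, 5), a pilot that finds 2000 MB of memory and 10 GB of disk would receive "sim", while "reco" stays queued for a pilot on a larger node.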
 
 
Grid Security John White, CERN September 10, 9:40 - 10:20
The security of Grid infrastructures is of utmost importance not only to the resource owners but also to the Grid users. The collection of services, executables and libraries that provide the security for the Grid infrastructure has been developed by a wide variety of projects to satisfy the user communities. As there have been multiple projects and there are diverse Grid user communities, the security models and services differ slightly, but effort has been made to use standards-compliant and common approaches. The European Middleware Initiative project is mandated to combine the middleware of the three major European Grid technologies (ARC, gLite, UNICORE) and, in the security area, provide a consistent security layer that makes Grids easier to use.
 
 
Grid and Clouds: A Look Ahead Fabrizio Gagliardi, Microsoft Research September 10, 10:50 - 11:50
The speaker will review the latest trends in distributed computing, focussing on the emerging cloud computing paradigms and using the recently launched EU FP7 VENUS-C project as an example.
 
 
Grid and Cloud Security Workshop CERT-Team, KIT September 10, 13:30 - 17:00
agenda:
Introduction (A. Lorenz, KIT)
Differences between Grid- and Cloud Security
Security Incidents (CERT-Team, KIT)
Example Incident (S. Freitag, Uni Dortmund)
Tea/Coffee break
Security Service Challenges (SSC) (U. Epting, KIT)
Example SSC4 - Forensics excursion (T. Dussa, KIT)
Tea/coffee break
Topics raised by attendees
Monitor your resources - optional
  • Pakiti (Software Vulnerability Detection System) (U. Epting) (15 Min)
  • Samhain (Host Intrusion Detection System) (U. Epting) (15 Min)
  • IDS/IPS systems (Network Intrusion Detection System) (CERT-Team) (15 Min)