TU Delft
 
Alexandru IOSUP
GrenchMark - third place award at ACM SRC/SuperComputing 2007
Parallel and Distributed Systems

GrenchMark: A Framework for Analyzing, Testing, and Comparing Grids

 

Introduction
why and how is this work relevant?
Today, production grids bring together thousands, or even tens of thousands, of resources. Infrastructures such as CERN’s LCG, NorduGrid, TeraGrid, and the Open Science Grid offer throughput similar to or better than that of large-scale parallel production environments [1]. However, the dynamicity, the heterogeneity, or simply the sheer scale of today’s grids exposes problems in grid performance, reliability, and functionality. Early testing in real grids reveals lower performance than expected from simulations [2], failure rates from 10% up to 45% [3], [4] (see footnote 1), and functionality problems in around 1 of every 3 tests for widely installed grid services [6]. Thus, an important research question arises: how can we gain insight into the performance and the reliability of large-scale distributed computing systems such as grids? In this work we attempt to address this question with the GRENCHMARK framework for analyzing, testing, and comparing grids.
Footnote 1: The failure rate in today’s grids is much higher than that of contemporary large-scale parallel production installations [5].



The GrenchMark Framework
short presentation
GRENCHMARK is a comprehensive framework for realistic, repeatable, and comparable testing of large-scale distributed computing systems. We describe below only a few important design aspects. To support realistic testing, we focus on mechanisms to generate and process grid workloads. The Workload Modeler and the Workload Generator (components 2 and 3 in Figure 1) are responsible for realistic workload generation. We make use of databases of workloads and workload models, and we provide mechanisms for (selective) workload truncation and scaling. We also make use of a database of real applications. To facilitate repeatable testing, we store test provenance data, and we are able to replay tests. The Test Manager and the Workload Submitter (components 1 and 4 in Figure 1) are responsible for coordinating the repeatable testing and for workload submission, respectively. Note that the Workload Submitter receives feedback from the tested environment, which allows tests in which the submission depends on dynamic information (e.g., testing with grid workflows, or testing with service-level agreements). To enable comparable testing results, we provide a framework for testing, including metrics that take into account the system size and other environment specifics. The Data Manager (component 5 in Figure 1) is responsible for storing and analyzing the testing data and other (e.g., provenance) data. Note that typical additional data are resource availability information (grids are dynamic but fully monitorable environments) and the logs of different grid middleware.
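The (selective) workload truncation and scaling mentioned above can be illustrated with a minimal Python sketch. This is not the GRENCHMARK API: the `Job` record, the `truncate` and `scale_arrivals` functions, and the choice to scale load by compressing inter-arrival times are all assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Job:
    """Hypothetical workload record; the field names are assumptions."""
    job_id: int
    submit_time: float  # seconds since the start of the trace
    runtime: float      # seconds
    cpus: int


def truncate(workload: List[Job], start: float, end: float,
             keep: Optional[Callable[[Job], bool]] = None) -> List[Job]:
    """Keep jobs submitted in [start, end) that also pass the optional
    `keep` predicate (the "selective" part), re-basing submit times to 0."""
    selected = [j for j in workload
                if start <= j.submit_time < end and (keep is None or keep(j))]
    return [replace(j, submit_time=j.submit_time - start) for j in selected]


def scale_arrivals(workload: List[Job], factor: float) -> List[Job]:
    """Multiply the arrival rate by `factor` by dividing every submit
    time (and hence every inter-arrival gap) by `factor`."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return [replace(j, submit_time=j.submit_time / factor) for j in workload]


# A tiny made-up trace, used only to exercise the functions.
trace = [
    Job(1, 0.0, 120.0, 1),
    Job(2, 60.0, 30.0, 2),
    Job(3, 600.0, 45.0, 1),
    Job(4, 900.0, 10.0, 4),
]

window = truncate(trace, 60.0, 900.0)                             # time window only
small = truncate(trace, 0.0, 1000.0, keep=lambda j: j.cpus <= 2)  # selective
doubled = scale_arrivals(window, 2.0)                             # twice the arrival rate
```

Because both functions return new lists and leave the input trace untouched, the same stored trace can be truncated or scaled differently for each test, which fits the repeatability goal described above.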



Results with the GrenchMark Reference Implementation
Selected results, to show what GrenchMark can do
The GRENCHMARK reference implementation has been used for various testing scenarios in grids [7], [3], [8], peer-to-peer file-sharing [9], and heterogeneous resource management (i.e., based on Condor [10]). Overall, we have run more than 250,000 test jobs in the last 18 months, in over 25 fully automated testing scenarios. Below we show and briefly comment on a sample of the results of the Condor tests, performed over 2 weeks on 600 processors of the Condor pool at U.Wisc.-Madison. Figure 2 depicts the throughput and the goodput of the system for 100 consecutive runs of 1000 jobs each. The user obtains a high rate of goodput even in a production environment: over 0.5 CPU-years of goodput in two days. Condor is fair with respect to resource consumption, and the throughput and goodput rates are halved after the user exceeds his quota (at the beginning of 31 Aug 2006). Figure 3 shows the job wait time properties of 24 consecutive runs of 1000 jobs each. With the exception of the mean and median job wait time, different runs exhibit different distribution properties. Since the range of the job wait time values for each run is large, some jobs have a much lower wait time than others. Figure 4 shows which jobs are thus favored by Condor. The jobs arriving first exhibit high wait time variability. Overall, there is a trend for jobs arriving later to wait more than the jobs arriving earlier. There appears to be no correlation between the average wait time and the run index, i.e., later runs do not have a higher average wait time.
Figure 1. The GrenchMark framework design.
Figure 2. The evolution of goodput and throughput over time. The sampling interval is 4 hours. After 31 Aug 2006, the throughput is halved, and the cumulative goodput increases at half the rate from before.
Figure 3. The job wait time distribution for 24 consecutive test runs. Each distribution is depicted as a box-and-whiskers set with additional points for the median and the mean. Different runs exhibit similar mean and median, but very different value ranges.
Figure 4. The job wait time for three test runs, selected for low (Run 15), average (Run 6), and high (Run 19) wait time. Overall, there is a trend for jobs arriving later to wait more than the jobs arriving earlier.
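The quantities plotted in Figures 2 and 3 can be computed from per-job records along the following lines. This is a sketch under stated assumptions, not the analysis code used for the figures: the `JobRecord` fields are hypothetical, throughput is taken here as successfully completed jobs per hour, goodput as the CPU time consumed by successfully completed jobs, and a year is taken as 365 days.

```python
import statistics
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class JobRecord:
    """Hypothetical per-job log entry; all times are in seconds."""
    submit: float
    start: float
    end: float
    cpus: int
    succeeded: bool


def throughput_jobs_per_hour(records: List[JobRecord], interval_s: float) -> float:
    """Assumed definition: successfully completed jobs per hour."""
    done = sum(1 for r in records if r.succeeded)
    return done / (interval_s / 3600.0)


def goodput_cpu_seconds(records: List[JobRecord]) -> float:
    """Assumed definition: CPU time used by successfully completed jobs."""
    return sum((r.end - r.start) * r.cpus for r in records if r.succeeded)


def wait_time_summary(records: List[JobRecord]) -> Dict[str, float]:
    """Box-and-whiskers style summary of wait times (start minus submit)."""
    waits = sorted(r.start - r.submit for r in records)
    q1, _, q3 = statistics.quantiles(waits, n=4)  # needs at least 2 jobs
    return {
        "min": waits[0], "q1": q1, "median": statistics.median(waits),
        "q3": q3, "max": waits[-1], "mean": statistics.fmean(waits),
    }


def average_busy_cpus(goodput_cpu_years: float, wall_days: float,
                      days_per_year: float = 365.0) -> float:
    """Average number of CPUs kept busy to reach a given goodput."""
    return goodput_cpu_years * days_per_year / wall_days


# Made-up records, used only to exercise the functions.
records = [
    JobRecord(0.0, 10.0, 110.0, 1, True),
    JobRecord(0.0, 20.0, 80.0, 2, True),
    JobRecord(5.0, 65.0, 165.0, 1, False),
    JobRecord(10.0, 40.0, 100.0, 1, True),
]

summary = wait_time_summary(records)
busy = average_busy_cpus(0.5, 2.0)
```

Under the 365-day assumption, 0.5 CPU-years of goodput in two days corresponds to an average of roughly 91 processors doing useful work continuously, out of the 600 used in the tests.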



Related Work
are there others?
Following results from the parallel systems community, several grid performance evaluation and benchmarking approaches focus on tests using micro-benchmarks, microkernels, and application benchmarks [11], [12]. For the few other grid performance evaluation tools, the main focus is either distributed deployment and testing [13], or executing ad-hoc functionality tests [14], [15]. GRENCHMARK focuses more on the testing process, with additional types of workload data sources, richer workload generation, and more detailed analysis.




Acknowledgements
there is no 'me' in research, only team work
This work was carried out in the context of the Virtual Laboratory for e-Science project (www.vl-e.nl), which is supported by a BSIK grant from the Dutch Ministry of Education, Culture and Science (OC&W), and which is part of the ICT innovation program of the Dutch Ministry of Economic Affairs (EZ). We further thank Miron Livny and the Condor team at U.Wisc.-Madison for providing the testing environment used for part of this work. We also want to thank the people who have contributed (directly or indirectly) to this work over the years: Dr. Dick Epema, Dr. Nicolae Tapus, Dr. Catalin Dumitrescu, Catalin Cirstoiu, Mugurel Andreica, and Corina Stratan.




Publications, conferences, talks
validating our work...
A. Iosup, D.H.J. Epema, GrenchMark: a Framework for Testing Large-Scale Distributed Computing Systems, (submitted).
info the journal presentation of GrenchMark: over 25 use cases, replaying traces from the Grid Workloads Archive and from the Parallel Workloads Archive, comprehensive extensions over the GrenchMark presented in the CCGrid 2006 publication.
 
A. Iosup, GrenchMark: a Framework for Testing Large-Scale Distributed Computing Systems, In the ACM/IEEE SuperComputing Conference on High Performance Networking and Computing (SC'07), Posters/ACM Student Research Competition. Third place in the ACM SRC/Graduate Student competition.
info poster presenting GrenchMark.
 
GrenchMark poster gets third place for SC'07 ACM Student Research Competition (Graduate Students)
M. Andreica, N. Tapus, A. Iosup, D.H.J. Epema, C. Dumitrescu, I. Raicu, I. Foster, M. Ripeanu, Towards ServMark, an Architecture for Testing Grids, CoreGRID Technical Report TR-0062, Nov 29, 2006.
info grid computing, performance evaluation, testing real environments
 
A. Iosup, D.H.J. Epema, GrenchMark: A Framework for Analyzing, Testing, and Comparing Grids, In the 6th IEEE/ACM Int'l Symposium on Cluster Computing and the Grid (CCGrid'06) (accepted, 25%). An extended version can be found as Technical Report TU Delft/PDS/2005-007, ISBN 1387-2109.
info Using GrenchMark: simple and composite Grid jobs, replaying traces from the Parallel Workloads Archive, 10 use cases for analyzing, testing, and comparing common grid settings.
 
Article, PDF [510KB] | GrenchMark presentation, CCGrid'06 [PPT, 3MB]



References
these studies have enabled us to work on this project
  1. A. Iosup, D. H. J. Epema, C. Franke, A. Papaspyrou, L. Schley, B. Song, and R. Yahyapour, “On grid performance evaluation using synthetic workloads,” in JSSPP, ser. LNCS, vol. 4376, 2006, pp. 232–255.
  2. A. Iosup, C. Dumitrescu, D. H. Epema, H. Li, and L. Wolters, “How are real grids used? The analysis of four grid traces and its implications,” in GRID. IEEE Computer Society, 2006, pp. 262–269.
  3. A. Iosup and D. H. J. Epema, “GrenchMark: A framework for analyzing, testing, and comparing grids,” in CCGRID. IEEE Computer Society, 2006, pp. 313–320.
  4. O. Khalili et al., “Measuring the performance and reliability of production computational grids,” in GRID. IEEE Computer Society, 2006.
  5. B. Schroeder and G. A. Gibson, “A large-scale study of failures in high-performance computing systems,” in DSN. IEEE Computer Society, 2006, pp. 249–258.
  6. A. Iosup, D. Epema, P. Couvares, A. Karp, and M. Livny, “Build-and-test workloads for grid middleware: Problem, analysis, and applications,” in CCGRID. IEEE Computer Society, 2007, pp. 205–213.
  7. H. H. Mohamed and D. H. J. Epema, “Experiences with the KOALA co-allocating scheduler in multiclusters,” in CCGRID. IEEE Computer Society, 2005, pp. 784–791.
  8. O. Sonmez, H. Mohamed, and D. Epema, “Communication-aware job placement policies for the KOALA grid scheduler,” in e-Science. IEEE Computer Society, 2006, pp. 79–86.
  9. J. Roozenburg, “Secure decentralized swarm discovery in Tribler,” Master’s thesis, Delft University of Technology, Delft, NL, Nov. 2006.
  10. D. Thain, T. Tannenbaum, and M. Livny, “Distributed computing in practice: the Condor experience,” Concurrency - Practice and Experience, vol. 17, no. 2-4, pp. 323–356, 2005.
  11. R. F. Van Der Wijngaart and M. Frumkin, “NAS grid benchmarks version 1.0,” NASA, Technical Report NAS-002-005, 2002. [Online]. Available: http://www.nas.nasa.gov/News/Techreports/2002/PDF/nas-02-005.pdf
  12. G. Tsouloupas and M. D. Dikaiakos, “GridBench: A workbench for grid benchmarking,” in EGC, ser. LNCS, vol. 3470, 2005, pp. 211–225.
  13. I. Raicu, C. Dumitrescu, M. Ripeanu, and I. T. Foster, “The design, performance, and use of DiPerF: An automated distributed performance evaluation framework,” J. Grid Comput., vol. 4, no. 3, pp. 287–309, 2006.
  14. G. Chun, H. Dail, H. Casanova, and A. Snavely, “Benchmark probes for grid assessment,” in IPDPS. IEEE Computer Society, 2004.
  15. S. Smallen, C. Olschanowsky, K. Ericson, P. Beckman, and J. M. Schopf, “The Inca test harness and reporting framework,” in SC. IEEE Computer Society, 2004, p. 55.



                                                                                                                                                                                                                                             
     


The newest version of this page can be found at: http://www.pds.ewi.tudelft.nl/~iosup/gmark-sc07.html
Copyright © 2007-2008 Alexandru Iosup. All Rights Reserved.