CSCI 590: Grid Computing Reliability

(Fall 2005)

 

•       Instructor:

Lecturer: Dr. Yuanshun Dai

•       Credits: 3

•       Brief Description:

This course introduces students to the field of Grid Computing Reliability. Students will study the state of the art in Grid Computing Reliability and the related software tools and techniques, identify a research focus, and carry out a small-scale Reliable Grid Computing project in an application domain.

•       Books:

 

M. Xie, Y.S. Dai, K.L. Poh, Computing Systems Reliability, Kluwer Academic Publishers: New York, NY, U.S.A., April 2004. ISBN: 0-306-48496-X.

•       Specific Requirements:

1.        Study existing Grid Computing Reliability techniques and applications through reading the literature.

2.        Identify a research focus in a specific area of Grid Computing combined with Reliability technology.

3.        Carry out an experimental Reliable Grid Computing project (to be determined).

•       Syllabus:

Week 1, Week 2: Principles of reliability (Chapter 1 and Chapter 2)

Week 3, Week 4: Infrastructure of the Grid system (based on the following papers, which will be distributed in class)

Foster, I., Kesselman, C. and Tuecke, S. (2001), The anatomy of the grid: Enabling scalable virtual organizations, International Journal of High Performance Computing Applications, vol. 15, pp. 200-222.

Foster, I., Kesselman, C., Nick, J.M., Tuecke, S. (2002), Grid services for distributed system integration, Computer, vol. 35, no. 6, pp. 37-46.

Week 5: Grid Reliability: Hardware Component (Chapter 3)

Week 6: Grid Reliability: Software Component (Chapter 4)

Week 8: Grid Reliability: Firmware Component (Chapter 5)

Week 9, Week 10: Distributed System Reliability (Chapter 6)

Week 11: Reliability of the Grid Resource Management System (RMS) (Chapter 7.1, 7.2)

Week 12: Reliability of the Grid Network (Chapter 7.3, 7.4)

Week 13: Reliability of Grid services (based on the following papers, which will be distributed in class)

Y.S. Dai, Y. Pan, X.K. Zou, “A hierarchical modelling and analysis for grid service reliability”, IEEE Transactions on Computers, In Press, 2006.

Y.S. Dai, G. Levitin, “Reliability and Performance of Tree-structured Grid Services”, IEEE Transactions on Reliability, In Press, 2005.

Week 14: Grid Reliability Enhancement and Optimization (Chapter 9.4, 9.5)

Week 15: Project Reports (each group has 15 minutes for its presentation, including Q&A)

Week 16: Review and Final Exam

•       Useful Link:

The ReGrid (Reliable Grid) Alliance: www.ReGrid.org