A practical introduction to OptimalGrid - Part I

(Posted 22/08/2006) OptimalGrid is distributed computing middleware designed to support parallel computation on any Grid, cluster, or intranet group of computers. This tutorial gives you a practical introduction to OptimalGrid and shows how you would design a Grid solution to your own development problems.

Section 1. Before you start

About this tutorial

The goal of this tutorial is to give you a practical introduction to OptimalGrid. Grid computing offers immense potential to industry, science, and individual users, but you need an easy way to harness that potential. OptimalGrid is an attempt to address this challenge.

OptimalGrid provides a Grid-enabled application framework for rapid and easy development of Grid applications. It is a working middleware prototype from IBM Almaden Research Center, used by applications that require large-scale computation on a Grid. OptimalGrid hides the complexity of the underlying Grid infrastructure from the application developer. It also provides essential autonomic features such as self-configuration, self-healing, and self-optimization, so you don't have to invent and implement them yourself. As a result, you can rapidly build and deploy applications on a Grid.

After completing the tutorial, you should:

Understand what types of problems OptimalGrid is designed to solve
Understand the basics of how OptimalGrid implements the solution to these types of problems
Have a working installation of the OptimalGrid software
Be able to write the Java™ code to implement the solution to a specific problem

Should I take this tutorial?

Are you an application developer with some Java programming experience? Do you have a problem that requires more computing power than a single processor can provide? If so, you should complete the initial chapters, which describe OptimalGrid and the problems it is well equipped to handle. If you determine that OptimalGrid and your problem are a good match, you should complete the tutorial and learn how to use OptimalGrid to quickly implement a solution that runs on a Grid network of computers.

Prerequisites

To complete this tutorial, you need a working knowledge of Java technology.

Minimum platform requirements for OptimalGrid

One or more computers (750-MHz processor)
Java runtime 1.3 or higher
TSpaces (included in distribution)
10-Mbit Ethernet to each processor
Storage requirements are application-dependent
Any operating system that supports Java code; tested platforms include Linux® and Windows®.

Recommended configuration

Linux
Cluster of more than one machine (1-GHz processor)
Java runtime 1.3 or higher with Java 3D extensions installed
TSpaces (included in distribution)
100-Mbit Ethernet to each processor
Storage requirements are application-dependent

Section 2. Introduction to OptimalGrid

What is OptimalGrid?

OptimalGrid, a research prototype from IBM Almaden Research Center, is middleware that aims to simplify creating and managing large-scale, connected, parallel Grid applications. The OptimalGrid system is pure Java code and runs on any operating system, or collection of operating systems, that supports Java technology. OptimalGrid is designed to support parallel applications that require ongoing communication between cluster processors or nodes.

The purpose of OptimalGrid is to optimize the performance of a distributed Grid system given whatever resources and infrastructure are actually available. It is in this sense that OptimalGrid is "optimal."

For a high-level overview of the OptimalGrid components and architecture, see "OptimalGrid -- Autonomic computing on the grid."

Section 3. Solving a problem using the OptimalGrid object model

What's in this section?

This section describes the OptimalGrid object model and how it is used to describe a typical cellular automaton, finite element problem, or other application where computational progress depends on sharing information between nodes.

The next section shows how you would implement the Java code to solve one of these typical problems.

OptimalGrid object model

All Finite Element Model (FEM) problems are solved numerically by partitioning space into small finite regions, or elements, where "small" is typically defined by the smallest natural scale in the problem. Figure 1 shows a continuous solid object being turned into a discrete set of nodes, each with specific properties.

Original Problem Cell (OPC)

To describe the OptimalGrid system's approach, it's easiest to consider a simple two-dimensional problem, as shown in Figure 2.

Here an element A is connected to its four closest neighbors, and each neighbor in turn is also connected to three elements plus A. Edge elements connect to either one, two, or three elements depending on their location. We call the smallest piece of a problem (in this case a single element) an Original Problem Cell or OPC. In the abstract, this is a node on the application graph that contains data, methods, and pointers to other OPCs.
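The connectivity described above can be sketched for a plain rectangular mesh. This is only an illustration: `SquareMesh` and `neighborCount` are hypothetical names, not part of OptimalGrid, and the mesh in Figure 2 may have a different shape. In a rectangular mesh an interior cell has four neighbors, a side cell three, a corner cell two, and the end cell of a one-cell-wide strip just one.

```java
// Hypothetical helper for illustration; not an OptimalGrid class.
final class SquareMesh {
    private SquareMesh() {}

    /** Counts the 4-connected neighbors of cell (row, col) in a rows x cols mesh. */
    static int neighborCount(int row, int col, int rows, int cols) {
        if (row < 0 || row >= rows || col < 0 || col >= cols) {
            throw new IllegalArgumentException("cell is outside the mesh");
        }
        int count = 0;
        if (row > 0) count++;          // neighbor above
        if (row < rows - 1) count++;   // neighbor below
        if (col > 0) count++;          // neighbor to the left
        if (col < cols - 1) count++;   // neighbor to the right
        return count;
    }

    public static void main(String[] args) {
        System.out.println(neighborCount(1, 1, 3, 3)); // interior cell: 4
        System.out.println(neighborCount(0, 1, 3, 3)); // side cell: 3
        System.out.println(neighborCount(0, 0, 3, 3)); // corner cell: 2
        System.out.println(neighborCount(0, 0, 1, 2)); // end of a 1 x 2 strip: 1
    }
}
```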

The OptimalGrid object model implements code to solve a problem using abstract OPCs. The user implements a small set of methods defined in an OPC Abstract Class that are unique to the particular problem being solved. These methods describe the connectivity of the cell with its neighbors, and they specify the calculations to be performed by the cell using the information communicated by its neighbors.

A single OPC object is very small, requiring very little memory to hold and little computational power to execute. An OPC can contain zero or more entities, which may flow from one OPC to neighboring OPCs, and zero or more Properties that describe the OPC or its entities.

Definition: An OPC is the smallest unit of work in the OptimalGrid system. Your problem solution is implemented by writing the Java code to implement the behavior of the OPC and its interaction with its neighbors.
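To make the idea concrete, here is a minimal sketch of what "a node with data, methods, and pointers to other OPCs" might look like. It does not use the real OptimalGrid classes: `SketchOpc`, `AveragingOpc`, and every method name below are assumptions made for illustration, and the actual OPC abstract class may look quite different. The sketch also uses generics, so it needs a newer Java runtime than the 1.3 minimum listed earlier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an OPC base class; not the OptimalGrid API.
abstract class SketchOpc {
    private final List<SketchOpc> neighbors = new ArrayList<>();
    protected double value;    // current state of this cell
    private double nextValue;  // state computed for the next step

    SketchOpc(double initialValue) {
        this.value = initialValue;
        this.nextValue = initialValue;
    }

    /** Links two cells so that each sees the other as a neighbor. */
    void connect(SketchOpc other) {
        neighbors.add(other);
        other.neighbors.add(this);
    }

    double getValue() {
        return value;
    }

    /** Problem-specific rule: derive the next state from neighbor data. */
    abstract double computeNext(List<SketchOpc> neighbors);

    /** Phase 1: compute from the neighbors' current state. */
    void compute() {
        nextValue = computeNext(neighbors);
    }

    /** Phase 2: publish the new state once every cell has computed. */
    void commit() {
        value = nextValue;
    }
}

// Example rule: each cell takes the mean of its neighbors' values.
final class AveragingOpc extends SketchOpc {
    AveragingOpc(double initialValue) {
        super(initialValue);
    }

    @Override
    double computeNext(List<SketchOpc> neighbors) {
        if (neighbors.isEmpty()) {
            return value;
        }
        double sum = 0.0;
        for (SketchOpc n : neighbors) {
            sum += n.value;
        }
        return sum / neighbors.size();
    }
}
```

The two-phase compute and commit split is a design choice of this sketch: it ensures that every cell reads its neighbors' state from the same time step, regardless of the order in which cells are visited.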

OPC collections

The OptimalGrid system aggregates sets of OPCs that are connected to one another to form an OPC collection. Once created, an OPC collection is fixed in size for the lifetime of the problem.

Definition: An OPCCollection is a set of OPC objects that are connected together. The set of OPCs contained in an OPCCollection is assigned at problem initialization and remains fixed.

Variable Problem Partition (VPP)

The problem piece object containing a set of OPC Collections is defined as a Variable Problem Partition (VPP). This is illustrated in Figure 3. A VPP is the unit of work that is distributed to a compute node. Load balancing will be accomplished by exchanging OPC Collections between VPPs.

Definition: A Variable Problem Partition or VPP is the set of OPC Collections assigned to a Grid compute node. The number of OPC Collections contained in a VPP is variable.
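The relationship between fixed-size collections and variable-size partitions can be sketched as follows. `SketchCollection`, `SketchVpp`, and `rebalance` are hypothetical names, not OptimalGrid classes, and the balancing rule here (equalize the number of collections) is a deliberately naive assumption; a real system would presumably balance on measured performance rather than on a simple count.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical types for illustration; not the OptimalGrid API.
final class SketchCollection {
    final int opcCount; // fixed for the lifetime of the problem

    SketchCollection(int opcCount) {
        this.opcCount = opcCount;
    }
}

final class SketchVpp {
    private final Deque<SketchCollection> collections = new ArrayDeque<>();

    void add(SketchCollection c) {
        collections.addLast(c);
    }

    /** Removes and returns one collection, or null if the VPP is empty. */
    SketchCollection removeOne() {
        return collections.pollLast();
    }

    int collectionCount() {
        return collections.size();
    }

    int opcCount() {
        int total = 0;
        for (SketchCollection c : collections) {
            total += c.opcCount;
        }
        return total;
    }

    /**
     * Moves whole collections from the larger VPP to the smaller one
     * until their collection counts differ by at most one.
     */
    static void rebalance(SketchVpp a, SketchVpp b) {
        SketchVpp from = a.collectionCount() >= b.collectionCount() ? a : b;
        SketchVpp to = (from == a) ? b : a;
        while (from.collectionCount() - to.collectionCount() > 1) {
            to.add(from.removeOne());
        }
    }
}
```

Note that only whole collections move between partitions: the size of a VPP changes, but each collection keeps the OPCs it was given at initialization.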

The OptimalGrid architecture

Figure 4 shows the OptimalGrid architecture.

The main coordinator component of the OptimalGrid system is the Autonomic Program Manager (APM). The APM does a number of things:

Manages the compute agents and the pieces of the problem
Can invoke a problem builder (the component that creates the initial problem) if the user doesn't do it manually from the console
Assigns the initial distribution of the problem given all of the information maintained on problems, compute agents, and the general computing environment
Invokes the various pluggable Rule Engines that track the events of the OptimalGrid system and store the lessons learned for optimizing the problem computation

Given an initial problem, such as the abstract FEM problem, OptimalGrid automatically partitions the problem into OPC collections based on the problem complexity. Those OPC collections are then grouped into variable problem partitions based on the available compute resources. Suppose that, in this example, the FEM problem is composed of 2 million two-dimensional elements (OPCs) on a simple square mesh. For a given complexity, we might turn that into 800 collections of 2,500 OPCs each, with an average VPP holding 16 collections, or about 40,000 OPCs. With an even distribution, 50 compute agents would each get a 40,000-OPC VPP (with the unit of change or modification being one 2,500-OPC collection). Faster networks (or shorter communications latency) might allow more compute agents with smaller VPPs, while larger compute agent memories and a slower network might require fewer, larger VPPs.
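The numbers in this example can be checked with a few lines of arithmetic. The class and method names below are hypothetical and exist only for this sketch; the collection size of 2,500 OPCs and the count of 50 compute agents are taken from the example above, not computed by any OptimalGrid rule.

```java
// Hypothetical helper for checking the example's arithmetic.
final class PartitionMath {
    private PartitionMath() {}

    /** Integer division rounded up, so leftover items still get a container. */
    static long ceilDiv(long items, long perContainer) {
        return (items + perContainer - 1) / perContainer;
    }

    public static void main(String[] args) {
        long totalOpcs = 2_000_000L;     // elements in the FEM problem
        long opcsPerCollection = 2_500L; // chosen from problem complexity
        long computeAgents = 50L;        // available compute resources

        long collections = ceilDiv(totalOpcs, opcsPerCollection);
        long collectionsPerVpp = ceilDiv(collections, computeAgents);
        long opcsPerVpp = collectionsPerVpp * opcsPerCollection;

        System.out.println("collections:         " + collections);       // 800
        System.out.println("collections per VPP: " + collectionsPerVpp); // 16
        System.out.println("OPCs per VPP:        " + opcsPerVpp);        // 40000
    }
}
```

Changing `computeAgents` or `opcsPerCollection` shows the trade-off described above: more agents give smaller VPPs, while larger collections make each load-balancing move coarser.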

(Copyright IBM Corporation)
