Coordinator
From XtremWebCH Wiki
Contents |
Description
XWCH is a three-tier architecture where the coordinator adds a middle tier between client and workers. There is no direct submission/result transfer between clients and workers. The coordinator accepts execution requests coming from clients, assigns the tasks to the workers according to a scheduling policy and the availability of data, transfers binary codes to workers (if necessary), supervises task execution on workers, detects worker crash/disconnection, re-launches tasks on any other available worker, and controls the transfer traffic on the network to ensure the balancing of bandwidth load.
The coordinator is composed of four services: the workers’ manager, the tasks’ manager, the scheduler and the communication manager.
The Workers’ Manager
The workers’ manager maintains a list of connected workers. It receives four types of common requests/signals from the workers: Register Request (RR), Work Request (WR), Work Alife signal (WA) and Work Result signal (WRS). The Register Request allows a worker to subscribe nearby the coordinator. When the Workers’ Manager receives a Work Request, it searches for the most appropriate task to be assigned to the concerned worker. During the execution of the task, the workers send WA signals to the coordinator to inform about their status. WA signals are considered, by the coordinator, as the “proof” that the workers are still “alive” (connected). When a worker finishes its execution, it sends a Work Result Signal to inform the coordinator about the location of the data it has produced.
The Tasks’ Manager
In XWCH, a parallel and distributed application is composed of a set of communicating tasks. A task is considered to be “ready” for execution if its input data are available (given by the user or produced by a previous task). A task is in “blocked” status if its input data are not yet available. Two lists are maintained by the Tasks’ Manager: blocked tasks and ready tasks. When receiving a Work Result Signal, the Tasks’ Manager checks whether the new available data correspond to input data awaited by one or several blocked tasks; it updates the lists of blocked and ready tasks accordingly.
The Scheduler
A Work Request transmits, as input parameter, the performance that can be delivered by the concerned worker. When receiving this request, the coordinator launches a scheduler module which selects the “most appropriate” ready task to allocate to that worker.
The Communication Manager
XWCH is supposed to be a Public Large Scale Distributed Platform. It is assumed to be deployed on a “public” network. In this context, the system should insure that the bandwidth provided by the network is not completely consumed by the traffic generated by XWCH: common requests, data transfers, etc. The data transmitted between two XWCH nodes (coordinators, workers, warehouses) are split into fixed size packets. A sleep time separates the transmission of two successive packets. This time depends on the load of the network as sensed by the coordinator: the higher the load the bigger the sleep time. Similarly, the number of competing WR and LS processed by the coordinator is fixed by the communication manager according to the workload of the network as sensed by the coordinator.