Tech

From XtremWebCH Wiki

Technical documentation for XWCH

API documentation

  • Coordinator (beans): [1]


A diagram of the coordinator software architecture is shown below. In [8] you will find an SVG image of the main EJB classes (this does not work with Internet Explorer).

Coordinator architecture

The diagram relates to the source code as follows:

  • Web GUI for the coordinator: XWCHWeb_Entreprise/src/java/xwchweb
  • Facades used by GUI elements: EJB/src/java/webapplication1/bean
  • Web services used by workers, warehouses: EJB/src/java/webapplication1/soap
  • Entity beans: EJB/src/java/webapplication1/entity

The README file [9] describes all the components and gives build, deploy and test examples.

SVN repository

The SVN repository is at svn://www.xtremwebch.net/home/regis/dev/subversion_repository/xwch

A daily snapshot of the SVN trunk is available here: [10] and the results of its automatic build here: [11]

Moreover, a list of changes (svn log -v) is available here: [12]

Please contact our mailing list if you need SVN access.

Developer VM

A virtual machine that can be used for demos is available at [13]. The root password is "xwchrocks". In the /root directory you will find a script that downloads the SVN trunk and all the required components, compiles everything, starts a coordinator, a worker and a warehouse, and runs a test.

After running the script, you can access the coordinator with your browser: http://YOUR-VM-IP:8080/XWCHWeb/Main.iface.

The usual disclaimer applies: since this is the SVN trunk, some functionality may be unavailable or buggy.

HOWTOs and FAQs

What is the difference between an application, a module and a task (a job)?

Let's start with a module: you prepare a module for something that you want to do quite frequently. Typically, you will want to prepare executables for some platforms (Linux, Windows), create a module M using the Web GUI, zip your executables and add the ZIP files (once again using the Web GUI).

In order to use your module, you will need an application. An application can use many modules and have many steps, but let's have a simple example: your application uses just one module M just once. This means that you prepare an input ZIP file I, use the API to create an application and then add (create) a job. When you add a job, you declare that you will use module M and your input I.

A job is simply an activation of your module, using your input. It will start when you "add" it through the API, and you can monitor how it runs with the API's GetJobStatus call.
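The relationship between the three terms can be sketched as a small data model. This is only an illustration: the class names, fields and the "WAITING" status value are hypothetical and do not come from the XWCH client API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Module:
    """Something you run often: one ZIP of executables per platform."""
    name: str
    binaries: Dict[str, str] = field(default_factory=dict)  # platform -> ZIP file name


@dataclass
class Job:
    """One activation of a module on one input ZIP."""
    module: Module
    input_zip: str
    status: str = "WAITING"  # hypothetical status value


@dataclass
class Application:
    """Container for jobs; it may use many modules over many steps."""
    name: str
    jobs: List[Job] = field(default_factory=list)

    def add_job(self, module: Module, input_zip: str) -> Job:
        # Adding a job declares which module and which input it uses.
        job = Job(module, input_zip)
        self.jobs.append(job)
        return job


# The simple example from the text: one module M, used once, with input I.
m = Module("M", {"linux": "m-linux.zip", "windows": "m-windows.zip"})
app = Application("demo")
job = app.add_job(m, "I.zip")
print(job.module.name, job.input_zip, job.status)
```

In the real system the module and its ZIP files are created with the Web GUI, and only the application and the job are created through the API.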


How many workers can report to one coordinator?

The workers register, get jobs and report their state to the coordinator using web services. So, this question is almost the same as asking "how many HTTP requests can my Java container handle?" In our simulations with modern hardware, 200 to 300 requests per second has been a reasonable figure. This translates to 12,000 to 18,000 workers (these figures imply that each worker contacts the coordinator about once per minute).
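The arithmetic behind that estimate can be written out as follows. The 60-second interval between requests from one worker is an assumption inferred from the figures above, not a documented XWCH setting, and the function name is made up for this sketch.

```python
def max_workers(requests_per_second: float, seconds_between_requests: float) -> int:
    """Workers that can be served if each one sends one request per interval."""
    return int(requests_per_second * seconds_between_requests)


# Assumption: each worker contacts the coordinator once every 60 seconds.
for rate in (200, 300):
    print(rate, "requests/s ->", max_workers(rate, 60), "workers")
```

If your workers contact the coordinator more often, the supported number of workers shrinks proportionally.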

What is the division of labour between client, coordinator, warehouse and worker?

In simplified form, it works as follows:

  • the client sends a "ping warehouses" request to the server. If no warehouses reply, the client should quit.
  • the client calls "create application" on the server
  • the client creates or attaches a module in the application. If a module is created, the binary files related to the module are sent to warehouses.
  • the client calls AddJob with applicationID, moduleID, a ZIP file containing input files, and instructions on how to run the job and what to recover.
  • AddJob sends the input files to warehouses and the instructions to the coordinator.
  • the coordinator assigns the job to a worker
  • the worker recovers the input files and executables
  • the worker runs the task and places the outputs in a warehouse
  • the coordinator probes the worker, learns that the job has finished, and informs the client
  • the client recovers output files from a warehouse.

For a scenario diagram of the communication, see: http://www.xtremwebch.net/clientcomm.svg
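The sequence listed above can also be followed in a toy, in-memory simulation. Everything in it is hypothetical (class names, method names, the ".out" naming, a Python function standing in for the module's executable); it only mirrors the order of the steps and is not the XWCH API. The "create application" step is left out for brevity.

```python
class Warehouse:
    """Stores files by name; stands in for a real warehouse."""
    def __init__(self):
        self.files = {}

    def ping(self):
        return True

    def put(self, name, data):
        self.files[name] = data

    def get(self, name):
        return self.files[name]


class Worker:
    def __init__(self, warehouse):
        self.warehouse = warehouse
        self.finished = set()

    def run(self, job_id, module_name, input_name):
        executable = self.warehouse.get(module_name)  # recover the executable
        data = self.warehouse.get(input_name)         # recover the input
        self.warehouse.put(job_id + ".out", executable(data))  # store the output
        self.finished.add(job_id)


class Coordinator:
    def __init__(self, workers):
        self.workers = workers
        self.assigned = {}  # job id -> worker

    def add_job(self, job_id, module_name, input_name):
        worker = self.workers[0]  # selection policy omitted in this sketch
        self.assigned[job_id] = worker
        worker.run(job_id, module_name, input_name)

    def job_finished(self, job_id):
        # "Probe" the worker that was given the job.
        return job_id in self.assigned[job_id].finished


def client_run():
    warehouse = Warehouse()
    coordinator = Coordinator([Worker(warehouse)])
    if not warehouse.ping():
        raise SystemExit("no warehouse replied")
    warehouse.put("M", lambda data: data.upper())  # module binaries go to the warehouse
    warehouse.put("I.zip", b"hello")               # AddJob: input goes to the warehouse
    coordinator.add_job("job-1", "M", "I.zip")     # AddJob: instructions go to the coordinator
    if not coordinator.job_finished("job-1"):
        raise RuntimeError("job did not finish")
    return warehouse.get("job-1.out")              # client recovers the output


print(client_run())
```

Note that the data never passes through the coordinator in this sketch: the coordinator only handles instructions and state, while files move between client, warehouse and worker.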

How does data replication work?

The client API constructor takes two replication-related parameters: the requested number of replicas and the depth of search when looking for replicas.

In practice, the warehouses distribute the replicas once the original data has been transferred by the client.

How does the scheduler in the coordinator select the worker(s) in which the jobs will run?

Currently, the potential workers are first selected based on the requirements stated by the client and the module (including the operating systems for which executables have been provided). As of now (Nov 2010), the worker is then selected randomly from this set. Other selection criteria will be implemented soon.
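The two-stage selection described above (filter by requirements, then pick at random) could look like the sketch below. It is not the coordinator's actual code: the function name and the worker records are hypothetical, and the operating system is the only requirement modelled.

```python
import random
from typing import Dict, List, Optional


def select_worker(workers: List[Dict[str, str]], required_os: str,
                  rng: Optional[random.Random] = None) -> Optional[Dict[str, str]]:
    """Return a random worker whose OS matches, or None if there is none."""
    rng = rng or random.Random()
    # Stage 1: keep only the workers that satisfy the requirement.
    candidates = [w for w in workers if w["os"] == required_os]
    # Stage 2: uniform random choice among the remaining candidates.
    return rng.choice(candidates) if candidates else None


workers = [
    {"name": "w1", "os": "linux"},
    {"name": "w2", "os": "windows"},
    {"name": "w3", "os": "linux"},
]
print(select_worker(workers, "linux"))
```

Passing a seeded random.Random instance as rng makes the choice reproducible, which can be handy in tests.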

Why doesn't my job run well with a Windows worker?

Does your job produce any output? If not, the worker may think it is inactive. Here's a simple script that can be used for debugging on Windows workers:


echo starting > foo

yourexecutable >> foo 2>foo.err

echo done >>foo


... where yourexecutable is the executable for Windows (see the first question about modules and executables). Naturally, your application will need to retrieve the files "foo" and "foo.err" as output.

For debugging, how can I make my job run in some specific worker?

Use extrafields="host;workername"

Can I run my client over HTTPS, for increased security?

Yes, you'll find an example of that in the README of clients/democlient-java. However, please bear in mind that the communication between the warehouse and the client is not encrypted, even if the communication between the coordinator and the client is.

The Derby database that comes with GlassFish uses all the CPU/memory. Can I use something else?

Yes. If you are using the coordinator on Linux, you can use an installation package that supports MySQL.

If you are using the coordinator on Windows or another non-Unix system, please follow these instructions, replacing username and password with your own values.

1. Prepare glassfish

  • Download the mysql-connector-java-x.x.x-bin.jar from http://dev.mysql.com/downloads/connector/j/3.1.html
  • Extract the contents of the zip file: unzip mysql-connector-java-x.x.x-bin.zip
  • Copy mysql-connector-java-x.x.x-bin.jar to the $GLASS_FISH_INSTALL_DIR/lib folder.
  • cd $GLASS_FISH_INSTALL_DIR
  • Start (or restart) your GlassFish application server: ./bin/asadmin start-domain domain1

2. Prepare mysql

  • start the mysql database, e.g. /etc/init.d/mysql start
  • create a database and a user for it (dbname, username and password are placeholders): mysql --host=localhost --port=3306 --user=root --password=##### -e"CREATE DATABASE dbname; CREATE USER username IDENTIFIED BY 'password'; GRANT ALL PRIVILEGES ON dbname.* TO username@'localhost' IDENTIFIED BY 'password' WITH GRANT OPTION;"

3. Prepare a connection pool

  • cd $GLASS_FISH_INSTALL_DIR
  • ./bin/asadmin create-jdbc-connection-pool --datasourceclassname com.mysql.jdbc.jdbc2.optional.MysqlDataSource --restype javax.sql.DataSource --property User=username:Password=password:URL=jdbc\\:mysql\\://127.0.0.1/dbname myxwchsqlpool


4. Test the connection pool

./bin/asadmin ping-connection-pool myxwchsqlpool

5. Create a database resource

./bin/asadmin create-jdbc-resource --connectionpoolid=myxwchsqlpool jdbc/xwchdb
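If you repeat steps 3 to 5 for several installations, it may help to generate the three asadmin commands from your own values. The helper below is a sketch: the function name is made up, it only builds and prints the commands shown above (it does not run them), and it does not escape colons that might appear in the user name or password.

```python
import shlex
from typing import List


def asadmin_commands(user: str, password: str, dbname: str,
                     pool: str = "myxwchsqlpool",
                     resource: str = "jdbc/xwchdb") -> List[List[str]]:
    """Build the asadmin argument lists for steps 3, 4 and 5."""
    # asadmin separates properties with ':', so colons in the URL are escaped.
    url = "jdbc:mysql://127.0.0.1/{}".format(dbname).replace(":", "\\:")
    props = "User={}:Password={}:URL={}".format(user, password, url)
    return [
        ["./bin/asadmin", "create-jdbc-connection-pool",
         "--datasourceclassname", "com.mysql.jdbc.jdbc2.optional.MysqlDataSource",
         "--restype", "javax.sql.DataSource",
         "--property", props, pool],
        ["./bin/asadmin", "ping-connection-pool", pool],
        ["./bin/asadmin", "create-jdbc-resource",
         "--connectionpoolid={}".format(pool), resource],
    ]


# shlex.join quotes each argument so the backslashes survive a copy-paste into a shell.
for cmd in asadmin_commands("username", "password", "dbname"):
    print(shlex.join(cmd))
```

Run the printed commands from $GLASS_FISH_INSTALL_DIR, in the order given.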
