Strategy for distributed computing

Once you have configured the common network path and installed SPRAY on all client PCs, everything is ready for distributed computing with SPRAY. The performance of the network depends on the speed of the individual computers and the size of the tasks. The following remarks can serve as a guideline for your own network experiments.

In addition to the numerical work of the ray-tracing itself, the processing of the tasks takes some extra time. Creating the tasks on the master PC, searching for tasks on the client PCs, and exchanging configuration files and results cause a delay of a few seconds for every task. If a local computation on the master PC takes 10 seconds and you split it into 50 tasks to be processed in the network, it can easily take several minutes until you get the final result. Distributed computing is faster than local computing only if you have large simulation jobs and not too many tasks.
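
The trade-off can be made concrete with a rough estimate. The Python sketch below is not part of SPRAY; the function name, the figure of 3 seconds per task, and the number of 5 clients are assumptions chosen for illustration. It also assumes, pessimistically, that the per-task delays add up one after another while the numerical work divides evenly among the clients.

```python
def estimate_distributed_seconds(local_seconds, n_tasks, n_clients, overhead_per_task):
    """Rough estimate of the total time for a distributed run.

    Assumptions (not taken from SPRAY itself): the numerical work divides
    evenly among the clients, and the per-task overhead is paid for one
    task after another without overlap.
    """
    compute = local_seconds / n_clients
    overhead = n_tasks * overhead_per_task
    return compute + overhead


# A 10-second job split into 50 tasks on 5 clients, 3 s overhead per task.
small_job = estimate_distributed_seconds(10.0, 50, 5, 3.0)

# A one-hour job with the same split.
large_job = estimate_distributed_seconds(3600.0, 50, 5, 3.0)

print(f"small job: {small_job:.0f} s distributed, 10 s local")
print(f"large job: {large_job:.0f} s distributed, 3600 s local")
```

Under these assumptions the small job is dominated by overhead and runs far slower than the local computation, while the one-hour job finishes in roughly a quarter of the local time.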

On the other hand, the number of tasks should not be too small. If you employ several computers that differ in speed, you should try to keep them all busy. If the individual tasks are too large, the slow PCs will probably never make a useful contribution: while a slow PC is working on an overly large task, the faster PCs may run out of work, start on the slow PC's pending task as well, and eventually overtake it.
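
The effect of task size on a mixed network can be explored with a small simulation. The Python sketch below is an illustration only: the function name, the speeds, and the amount of work are assumptions, each idle PC is assumed to pick up the next waiting task, per-task overhead is ignored, and the re-computation of pending tasks by faster PCs is not modeled.

```python
import heapq


def finish_time(total_work, n_tasks, speeds):
    """Time until all tasks are done when each idle PC pulls the next task.

    total_work is in arbitrary work units, speeds in work units per second.
    Assumption: tasks are equal in size and no task is computed twice.
    """
    task_work = total_work / n_tasks
    # Heap of (time at which the PC becomes idle, PC index).
    idle_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(idle_at)
    finish = 0.0
    for _ in range(n_tasks):
        t, i = heapq.heappop(idle_at)      # next PC to become idle
        done = t + task_work / speeds[i]   # when it finishes this task
        finish = max(finish, done)
        heapq.heappush(idle_at, (done, i))
    return finish


speeds = [4.0, 4.0, 1.0]  # two fast PCs and one slow PC (hypothetical)

coarse = finish_time(120.0, 3, speeds)   # one large task per PC
fine = finish_time(120.0, 12, speeds)    # twelve smaller tasks

print(f"3 tasks:  {coarse:.1f} s")
print(f"12 tasks: {fine:.1f} s")
```

With three large tasks, the whole run waits for the slow PC to finish its single task. With twelve smaller tasks, the fast PCs absorb most of the work and the run ends sooner, although the final task can still land on the slow PC.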

Current limitations in distributed computing:

NIGHTSHIFT cannot stop its SPRAY OLE server properly when the master PC finishes the distributed computing. A client PC may therefore still be working on a task of a previous simulation after the next simulation has already started. In this case the results are not passed back to the master PC (each result is checked to see whether it belongs to the current simulation), but the time the client PC needs to finish the 'old' task is lost.
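
The idea behind such a check can be sketched in a few lines of Python. This is not SPRAY's or NIGHTSHIFT's actual mechanism, which is not described here; the class names, the numeric simulation identifier, and the single numeric payload are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    simulation_id: int   # hypothetical tag identifying the simulation
    task_index: int
    payload: float       # stands in for the real ray-tracing results


class ResultCollector:
    """Adds up results, ignoring those from an earlier simulation."""

    def __init__(self, simulation_id):
        self.simulation_id = simulation_id
        self.total = 0.0
        self.discarded = 0

    def accept(self, result):
        # A result tagged with another simulation is stale and is dropped.
        if result.simulation_id != self.simulation_id:
            self.discarded += 1
            return False
        self.total += result.payload
        return True


collector = ResultCollector(simulation_id=2)
stale_ok = collector.accept(TaskResult(simulation_id=1, task_index=7, payload=5.0))
fresh_ok = collector.accept(TaskResult(simulation_id=2, task_index=0, payload=3.0))
print(stale_ok, fresh_ok, collector.total, collector.discarded)
```

A check of this kind keeps stale results out of the sum, but it cannot recover the computing time the client spent on the old task.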

The master PC is not very busy during distributed computing: it creates the tasks and then just adds up the results. It is therefore useful to start a NIGHTSHIFT program on the master PC as well, to make use of its computational power. However, this does not work properly if you start the distributed computing manually by pressing the 'Distributed computing' button. We are working on a solution to this problem. If you control the master SPRAY by OLE automation (e.g. Visual Basic routines in Excel), everything works fine.