multirun - running serial programs in parallel on the astro or theory HPC clusters

multirun is a fairly simple program written in C (ask me for the source if you want) for running lists of commands in parallel on multiple compute nodes. It is useful if you have lots of ordinary serial programs that you want to run at the same time with different parameters. It reads a file called "params.txt", which lists the commands it should run, one per line.
For example, if you fill the file "params.txt" with:
/users/me/myProgram 311232 1231231
/users/me/myProgram 232377 8127387
etc...
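If you have many parameter combinations, it is easier to generate "params.txt" with a short script than to type it by hand. The following is a minimal sketch in Python, not part of multirun itself. The function names build_lines and write_params are made up for illustration, and the program path and parameter values are simply the ones from the example above.

```python
from pathlib import Path


def build_lines(program, parameter_sets):
    """Return one command line per parameter set, e.g. 'prog 1 2'."""
    return [" ".join([program, *map(str, params)]) for params in parameter_sets]


def write_params(lines, path="params.txt"):
    """Write the command lines to a file, one per line."""
    Path(path).write_text("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Illustrative values only: replace with your own program and parameters.
    parameter_sets = [(311232, 1231231), (232377, 8127387)]
    lines = build_lines("/users/me/myProgram", parameter_sets)
    # Preview the lines; call write_params(lines) to create params.txt.
    print("\n".join(lines))
```

Calling write_params(lines) in the directory you submit the job from would create the file. Check the result by eye before submitting it.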
When multirun starts, it uses the MPI environment to find out how many CPU cores it has been given access to.
It then runs each line on a different CPU core. When a core has finished running a line, it runs the next one that needs doing, and the completed line is removed from the file.
So if you had 1000 lines that needed running, and you assigned 10 cores to it, it could run 10 at a time, and should be finished roughly 10 times quicker than running them all one by one, provided the lines take similar amounts of time.
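To get a feel for how the total run time depends on the number of cores, the scheduling described above can be modelled in a few lines of Python. This is only a rough sketch: it assumes every assigned core runs commands, that each free core takes the next line in order, and that you know roughly how long each line takes. The function name makespan is made up for illustration.

```python
import heapq


def makespan(durations, cores):
    """Wall time when each core, as it becomes free, takes the next line."""
    if cores < 1:
        raise ValueError("need at least one core")
    # Time at which each core next becomes free.
    free_at = [0.0] * cores
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)  # the earliest free core takes the line
        heapq.heappush(free_at, start + duration)
    return max(free_at)


if __name__ == "__main__":
    equal = [1.0] * 1000
    print(makespan(equal, 1))   # 1000.0: one by one
    print(makespan(equal, 10))  # 100.0: ten at a time
    # One long line limits the gain: the job cannot end before it does.
    print(makespan([10.0, 1.0, 1.0, 1.0], 2))  # 10.0
```

With equal run times the speed-up matches the number of cores. With very uneven run times the longest line sets a lower limit on the total time.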
The "output" from the programs you run - "what you would see on the screen if you ran them yourself" - all ends up in the same file, the job output file. It is possible to separate the output into different files with, for example, the parameters in the filename - email me if you want to discuss how to do that.
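One possible way to get a separate output file per line, which is not necessarily the method I would suggest by email, is to list a small wrapper script in "params.txt" instead of the program itself. The sketch below assumes a hypothetical wrapper saved as, say, /users/me/run_and_log.py, so that a line in "params.txt" would read "python3 /users/me/run_and_log.py /users/me/myProgram 311232 1231231". It uses a wrapper rather than shell redirection with ">" so that it does not depend on how multirun launches each line. The function names and the ".out" suffix are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def output_name(command):
    """Build a file name from the program name and its parameters."""
    program, *params = command
    return "_".join([Path(program).name, *params]) + ".out"


def run_and_log(command, directory="."):
    """Run the command, sending stdout and stderr to its own file."""
    log_path = Path(directory) / output_name(command)
    with open(log_path, "w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is the command to run.
    sys.exit(run_and_log(sys.argv[1:]))
```

With the example line above, the output would go to a file called myProgram_311232_1231231.out in the directory the job runs in. Parameters containing characters such as "/" would need cleaning up before being used in a filename.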
You would submit multirun to the queue like this (for example):
addqueue -n 10 -m 1 /usr/local/shared/bin/multirun
This would run it on 10 CPU cores, asking for 1 GB of RAM per core.
Please email me, jonathan.patterson@physics.ox.ac.uk, if you have any problems/questions about using it.
