Lionel,
It's not quite as easy as one would hope, and it has nothing to do
with Phenix or Phaser. We make extensive use of multiprocessing
across our cluster in the RAPD software we run at the beamline to
speed up results for the user.
I set this up a while ago, but from what I remember, you first
have to set up a 'parallel environment' (PE) in SGE. The options
may be different depending on your version of SGE; we use 6.2u4.
There are probably default PEs set up already, but they might not
have the right parameters. I created a new one called 'smp' with
the number of 'slots' set to the total number of cores in your
cluster, or to some smaller number as a cap. (If you set 'slots'
to 12 and you submit 5 jobs requiring 4 slots each, only three
will run until one finishes and frees its slots.) The
'allocation_rule' is set to '$pe_slots' so that a job can only use
cores on a single node. There are other rules that might be better
for what you want to do. In our case, I set up different queues
with different priorities that have access to specific PEs,
depending on the jobs being submitted at the beamline. Your setup
may not need this complexity. I would read through the (huge) SGE
manual for the details or do a search on Oracle's website.
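For what it's worth, a PE like the one I described (created with
'qconf -ap smp') would look something like this; the slot count is
just a placeholder and the exact field list may differ with your
SGE version:

    pe_name            smp
    slots              64
    user_lists         NONE
    xuser_lists        NONE
    start_proc_args    /bin/true
    stop_proc_args     /bin/true
    allocation_rule    $pe_slots
    control_slaves     FALSE
    job_is_first_task  TRUE
    urgency_slots      min

You also have to attach the PE to a queue that will run the jobs,
e.g. 'qconf -mq all.q' and add 'smp' to the 'pe_list' line.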
When you submit the job, make sure you add '-pe smp 1-4' to your
qsub command ('qsub ... -pe smp 1-4 ...'), which tells SGE that
your job needs 1-4 cores on a single node. You can also specify a
single integer (4 instead of 1-4) to request exactly 4 slots.
Obviously, you can adjust these numbers to your needs. After you
submit the job, run 'qstat' and look at the last column, labeled
'slots', to see how many slots are reserved for the job.
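As a concrete sketch (the script and file names are only
placeholders, and as far as I remember phenix.phaser reads Phaser
keywords from stdin just like the standalone phaser binary), a
submission script and the commands around it could look like:

    #!/bin/bash
    # run_phaser.sh
    #$ -N phaser_mr      # job name
    #$ -pe smp 4         # request 4 slots in the 'smp' PE, all on one node
    #$ -cwd              # run in the submission directory
    #$ -j y              # merge stdout and stderr into one log
    phenix.phaser < phaser_mr.inp   # keyword input; an example is below

    qsub run_phaser.sh
    qstat -u $USER       # the 'slots' column should show 4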
In your Phaser input, include 'JOBS 4' to match the number of
slots you requested. I am not sure how much this speeds up a
single Phaser job, because there isn't a whole lot of code to
parallelize in MR. Randy Read mentioned (either to me or on the
BB, I don't remember) that everything in Phaser that can be
parallelized already has been.
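For the keyword file I used as a placeholder above
(phaser_mr.inp), the contents would just be your usual MR keywords
plus JOBS; something like this, where the data file, model,
identity, and composition values are made up for illustration:

    MODE MR_AUTO
    HKLIN data.mtz
    LABIN F=F SIGF=SIGF
    ENSEMBLE model PDB search_model.pdb IDENTITY 90
    COMPOSITION PROTEIN MW 25000 NUM 2
    SEARCH ENSEMBLE model NUM 2
    JOBS 4
    ROOT phaser_mr

The point is just that the JOBS value matches the '-pe smp 4'
request, so Phaser never uses more cores than SGE has reserved for
it.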
On a side note, if you write code in Python and you start a new
multiprocessing.Process(), it runs as a separate process that the
OS will typically schedule on another core of the same node. You
have to account for this when you request a specific number of
slots at job submission, otherwise you could overload your cluster
pretty quickly. Many programs also have an optimal number of
cores: requesting more slots than that will not make them run any
faster, it will just tie up resources that other jobs on the
cluster could use. I assume Phaser is one of these programs.
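Just to illustrate that point (a toy example, not anything from
our code): each Process below is a separate OS process that can
land on its own core, so a script like this should be submitted
with at least 4 slots, plus effectively one for the parent:

    import multiprocessing

    def run_task(i):
        # placeholder for real work, e.g. one piece of a processing job
        print("worker %d running" % i)

    if __name__ == "__main__":
        procs = [multiprocessing.Process(target=run_task, args=(i,))
                 for i in range(4)]
        for p in procs:
            p.start()   # each start() creates another OS process,
                        # which the scheduler can put on its own core
        for p in procs:
            p.join()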
Jon
--
Jonathan P. Schuermann, Ph. D.
Beamline Scientist, NE-CAT
Argonne National Laboratory, 436E
9700 S. Cass Ave.
Argonne, IL 60439
Email: [email protected]
Tel: (630) 252-0682
On 07/10/2013 09:16 AM, L. Costenaro (IBB) wrote:
Hello,
I am trying to run Phaser on an SGE cluster (qsub) using multiple
processors (either phenix.phaser from the command line or
Phaser-MR from the GUI), but the jobs do not parallelize. When I
run Phaser-MR locally (same executable), it does parallelize
(multiple Python threads).
Any help or advice would be welcome.
Best regards,
Lionel