libtbx.easy_mp semantics
Hello everyone,
Could someone please explain the libtbx.easy_mp semantics to me?
Currently, if you run easy_mp with multiple processes, objects are 'reset' back to their original state, i.e. the state of the object in the main thread is retained and changes made in the processing threads are dropped.
However, if you run easy_mp with the number of processes set to 1, object states are modified, because the main thread is the executing thread.
How is this supposed to work?
For a more detailed test please run
$ libtbx.python libtbx/tst_easy_mp_state.py
The source code explains the test in more detail.
This behaviour currently causes dials.integrate to fail on bzipped input files.
Given that this code is used in many places and this particular feature is not obvious, it may explain other anecdotal emergent behaviour, too.
-Markus
--
This e-mail and any attachments may contain confidential, copyright and or privileged material, and are for the use of the intended addressee only. If you are not the intended addressee or an authorised recipient of the addressee please notify us of receipt by returning the e-mail and do not use, copy, retain, distribute or disclose the information in or attached to the e-mail. Any opinions expressed within this e-mail are those of the individual and not necessarily of Diamond Light Source Ltd. Diamond Light Source Ltd. cannot guarantee that this e-mail or any attachments are free from viruses and we cannot accept liability for any damage which you may sustain as a result of software viruses which may be transmitted in or with the message. Diamond Light Source Limited (company no. 4375679). Registered in England and Wales with its registered office at Diamond House, Harwell Science and Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom
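The behaviour described in this message can be reproduced without libtbx. What follows is only a minimal sketch: it uses the standard library multiprocessing module directly rather than easy_mp, the Counter class and the increment_in_child helper are hypothetical names made up for illustration, and it assumes the "fork" start method, so it runs on Unix-like systems only.

```python
import multiprocessing


class Counter:
    """Toy object with mutable state (hypothetical, for illustration)."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1


def _worker(counter, conn):
    # Runs in the child process and mutates the child's copy only.
    counter.increment()
    conn.send(counter.value)  # report what the child sees
    conn.close()


def increment_in_child(counter):
    """Increment the counter in a forked child; return the child's value."""
    ctx = multiprocessing.get_context("fork")  # assumption: Unix only
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=_worker, args=(counter, child_conn))
    proc.start()
    child_value = parent_conn.recv()
    proc.join()
    return child_value


if __name__ == "__main__":
    counter = Counter()
    counter.increment()                 # serial case: runs in this process
    print(counter.value)                # 1, the caller's object was modified
    print(increment_in_child(counter))  # 2, as seen inside the child
    print(counter.value)                # still 1, the child's change is dropped
```

The serial call changes the caller's object, while the change made in the child process never reaches the parent. This mirrors the difference reported between one process and several.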
Hi Markus,
Your observation of the facts is correct.
However, I would argue that such behaviour is not the fault of easy_mp as such, but of the code being executed. If the code is supposed to be executed in a parallel fashion, it should not change any of its arguments, because it will never end up in a predictable state. In Python, it is possible to take liberties because of the way the multiprocessing module works, but easy_mp never gave such guarantees (in fact, "threading" is an allowed option), and if there is only a single thread executing, it is most efficient to use the main thread.
To solve your current problem, you can set LIBTBX_FORCE_PARALLEL; this should force easy_mp.parallel_map (I am assuming this is the function in question) to spawn new processes for each calculation, which will effectively keep the current state of the arguments (I have not tested this, because your test is not present in my current copy, and it would not be convenient for me to update right now). Perhaps this can be turned into a command line argument to use when such code is executed, although I would advise against running such code in parallel. An alternative solution is to copy the object, although this is also not always an option.
BW, Gabor
(In reply to Markus Gerstel's message of 2016-02-23 10:31.)
_______________________________________________ cctbxbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/cctbxbb
-- ################################################## Dr Gabor Bunkoczi Cambridge Institute for Medical Research Wellcome Trust/MRC Building Addenbrooke's Hospital Hills Road Cambridge CB2 0XY ##################################################
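The advice above, that code meant to run in parallel should not change its arguments, and the suggested alternative of copying the object, could look roughly like the sketch below. The names process_item and run_all and the dictionary layout are hypothetical, and a plain serial loop stands in for a parallel map; this is not how easy_mp is implemented.

```python
import copy


def process_item(shared_state, item):
    """Return an updated copy instead of mutating the argument."""
    local_state = copy.deepcopy(shared_state)  # work on a private copy
    local_state["seen"].append(item)
    return local_state


def run_all(shared_state, items):
    # Serial stand-in for a parallel map. Because process_item never
    # mutates shared_state, the outcome does not depend on whether the
    # work runs in the calling process or in a worker.
    return [process_item(shared_state, item) for item in items]


if __name__ == "__main__":
    state = {"seen": []}
    results = run_all(state, [1, 2, 3])
    print(state)                          # {'seen': []}
    print([r["seen"] for r in results])   # [[1], [2], [3]]
```

As noted in the message, copying is not always an option: objects that hold open file handles, for example, generally cannot be deep-copied.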
Dear Gabor,
Many thanks for the pointers. Indeed, with 'threading' everything works as expected. dials.integrate (together with BZ2File and file handles) proves that Python multiprocessing does not always copy objects properly across processes. I looked into multiprocessing, and apparently the authors saw no point in allowing objects to detect when they are being copied. Even the memory address of the object, or rather the closest thing there is to it (id), stays the same. I will therefore now check the process ID to determine if the object has been copied. Possibly not perfect, but it fixes the original bug.
I will leave the tst_easy_mp_state.py test in, and have added some more comments to it, in case anybody comes across some unintended multiprocessing features in the future.
Best wishes
-Markus
(In reply to Gabor Bunkoczi's message of 2016-02-23 12:05.)
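The process ID check mentioned in this message might be sketched as follows. This is an illustration under stated assumptions, not the actual DIALS or dxtbx fix: PidAwareReader and demo are hypothetical names, and the demo fakes a process change by overwriting the stored PID instead of really forking.

```python
import bz2
import os
import tempfile


class PidAwareReader:
    """Hypothetical reader that reopens its file in a new process."""

    def __init__(self, path):
        self._path = path
        self._handle = None
        self._owner_pid = None

    def _get_handle(self):
        current_pid = os.getpid()
        # A forked child inherits this object unchanged (same id()), so
        # the only visible difference is the process ID.
        if self._handle is None or self._owner_pid != current_pid:
            self._handle = bz2.open(self._path, "rb")
            self._owner_pid = current_pid
        return self._handle

    def read_at(self, offset, size):
        handle = self._get_handle()
        handle.seek(offset)
        return handle.read(size)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.bz2")
        with bz2.open(path, "wb") as out:
            out.write(b"0123456789")
        reader = PidAwareReader(path)
        first = reader.read_at(2, 3)
        handle_before = reader._handle
        reader._owner_pid = -1  # pretend we are now in another process
        second = reader.read_at(5, 2)
        reopened = reader._handle is not handle_before
        reader._handle.close()
        handle_before.close()
        return first, second, reopened


if __name__ == "__main__":
    print(demo())  # (b'234', b'56', True)
```

The check is cheap, and it only relies on os.getpid(), which differs between parent and child regardless of how the child was started.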
Hi Markus,
Many thanks for the thorough investigation, and I am glad you found a simple solution to the problem! I am not sure that this affects you (it boils down to whether DIALS supports Windows or not), but the way multiprocessing creates new processes is OS-dependent. On Unix, processes are forked (i.e. no copying takes place), and objects are exactly identical (including their id()), whereas on Windows, a new process is started, and objects are transferred in pickled form via an inter-process channel (involving a potential id() change).
Best wishes, Gabor
(In reply to Markus Gerstel's message of 2016-02-23 13:39.)
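The difference between forking and pickling described above can be observed with a short experiment. This is a sketch under assumptions: it requires the "fork" start method (Unix-like systems), it relies on CPython, where id() is the object's memory address, the helper names are hypothetical, and the pickle round trip only imitates what a freshly started process would receive.

```python
import multiprocessing
import os
import pickle


def _report(obj, conn):
    # Runs in the child: send back its PID and the id() it sees.
    conn.send((os.getpid(), id(obj)))
    conn.close()


def identity_in_forked_child(obj):
    """Return (pid, id(obj)) as observed inside a forked child."""
    ctx = multiprocessing.get_context("fork")  # assumption: Unix only
    receiver, sender = ctx.Pipe(duplex=False)
    proc = ctx.Process(target=_report, args=(obj, sender))
    proc.start()
    result = receiver.recv()
    proc.join()
    return result


def pickled_copy(obj):
    """Imitate the transfer of an object to a freshly started process."""
    return pickle.loads(pickle.dumps(obj))


if __name__ == "__main__":
    data = {"frames": [1, 2, 3]}
    child_pid, child_id = identity_in_forked_child(data)
    print(child_id == id(data))      # True: same address after fork
    print(child_pid == os.getpid())  # False: different process
    clone = pickled_copy(data)
    print(clone == data, clone is data)  # True False: equal, but a new object
```

After a fork the object keeps its id() while the process ID changes, whereas a pickled transfer yields an equal but distinct object.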
participants (2)
- Gabor Bunkoczi
- markus.gerstel@diamond.ac.uk