Hi Graeme,

I recall we've been here before:
http://phenix-online.org/pipermail/cctbxbb/2017-December/001807.html

I believe the solution is to use easy_mp.multi_core_run() instead of easy_mp.parallel_map(). Unlike easy_mp.parallel_map(), easy_mp.multi_core_run() preserves the stack traces of the individual processes.

Regards,
Rob

On 03/04/2018 07:16, [email protected] wrote:
Folks,
Following up on user reports again of errors within easy_mp - all that gets logged is “something went wrong” i.e.
Using multiprocessing with 10 parallel job(s)
Traceback (most recent call last):
  File "/home/user/bin/dials-installer/build/../modules/dials/command_line/integrate.py", line 613, in <module>
    halraiser(e)
  File "/home/user/bin/dials-installer/build/../modules/dials/command_line/integrate.py", line 611, in <module>
    script.run()
  File "/home/user/bin/dials-installer/build/../modules/dials/command_line/integrate.py", line 341, in run
    reflections = integrator.integrate()
  File "/home/user/bin/dials-installer/modules/dials/algorithms/integration/integrator.py", line 1214, in integrate
    self.reflections, _, time_info = processor.process()
  File "/home/user/bin/dials-installer/modules/dials/algorithms/integration/processor.py", line 271, in process
    preserve_exception_message = True)
  File "/home/user/bin/dials-installer/modules/dials/util/mp.py", line 171, in multi_node_parallel_map
    preserve_exception_message = preserve_exception_message)
  File "/home/user/bin/dials-installer/modules/dials/util/mp.py", line 53, in parallel_map
    preserve_exception_message = preserve_exception_message)
  File "/home/user/bin/dials-installer/modules/cctbx_project/libtbx/easy_mp.py", line 627, in parallel_map
    result = res()
  File "/home/user/bin/dials-installer/modules/cctbx_project/libtbx/scheduling/result.py", line 119, in __call__
    self.traceback( exception = self.exception() )
  File "/home/user/bin/dials-installer/modules/cctbx_project/libtbx/scheduling/stacktrace.py", line 115, in __call__
    self.raise_handler( exception = exception )
  File "/home/user/bin/dials-installer/modules/cctbx_project/libtbx/scheduling/mainthread.py", line 100, in poll
    value = target( *args, **kwargs )
  File "/home/user/bin/dials-installer/modules/dials/util/mp.py", line 91, in __call__
    preserve_exception_message = self.preserve_exception_message)
  File "/home/user/bin/dials-installer/modules/cctbx_project/libtbx/easy_mp.py", line 627, in parallel_map
    result = res()
  File "/home/user/bin/dials-installer/modules/cctbx_project/libtbx/scheduling/result.py", line 119, in __call__
    self.traceback( exception = self.exception() )
  File "/home/user/bin/dials-installer/modules/cctbx_project/libtbx/scheduling/stacktrace.py", line 86, in __call__
    raise exception
RuntimeError: Please report this error to [email protected]: exit code = -9
I forget why it was decided that keeping the proper stack trace was a bad thing, but could this be revisited? It would greatly help to see the trace in the program output, especially when, as is the case here, I do not have the user data.
My email-fu is not strong enough to dig out the previous conversation.
Cheers,
Graeme
--
Robert Oeffner, Ph.D.
Research Associate, The Read Group
Department of Haematology, Cambridge Institute for Medical Research
University of Cambridge
Cambridge Biomedical Campus
Wellcome Trust/MRC Building
Hills Road
Cambridge CB2 0XY
www.cimr.cam.ac.uk/investigators/read/index.html
tel: +44(0)1223 763234
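
For reference, a minimal sketch of the general idea behind preserving per-process stack traces, as discussed in the reply above. It uses only the Python standard library and is not libtbx code: call_and_capture, run_all, reciprocal and demo are hypothetical names made up for illustration, and the (args, result, error_text) return shape is an assumption, not a statement about what easy_mp.multi_core_run() returns. The point is only that the traceback is formatted as text inside the worker, so it can be passed back to the parent intact.

```python
import multiprocessing
import traceback


def call_and_capture(job):
    """Run func(*args) and return (args, result, error_text).

    error_text is None on success. On failure it holds the traceback
    formatted inside the worker, so it survives the trip to the parent.
    """
    func, args = job
    try:
        return args, func(*args), None
    except Exception:
        return args, None, traceback.format_exc()


def run_all(func, argstuples, nproc):
    """Hypothetical helper: run func over argstuples in nproc processes."""
    jobs = [(func, args) for args in argstuples]
    with multiprocessing.Pool(processes=nproc) as pool:
        return pool.map(call_and_capture, jobs)


def reciprocal(x):
    """Toy job that fails for x == 0."""
    return 1.0 / x


def demo():
    """Print each result, or the worker-side traceback if the job failed."""
    for args, result, error_text in run_all(reciprocal, [(1,), (0,), (4,)], 2):
        if error_text is None:
            print(args, "->", result)
        else:
            print(args, "failed in worker:")
            print(error_text)
```

Calling demo() from a script (under an `if __name__ == "__main__":` guard) should print the worker-side traceback for the failing job next to the results of the others. Note that this only helps when the worker raises a Python exception. If the `exit code = -9` above is a multiprocessing exit code, it means the worker was killed by signal 9 (SIGKILL), in which case there is no Python traceback to capture at all.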