cctbx.python unnecessarily unfriendly
Hello CCTBX developers, I noticed in the cctbx.python script the following lines that call the python interpreter if [ -n "$LIBTBX__VALGRIND_FLAG__" ]; then exec $LIBTBX_VALGRIND "$LIBTBX_PYEXE" -Qnew "$@" elif [ $# -eq 0 ]; then exec "$LIBTBX_PYEXE" -Qnew else exec "$LIBTBX_PYEXE" -Qnew "$@" fi Here, the “-Qnew” flag forces the python interpreter to work with new division (where “/“ does not strictly mean integer division). This flag basically overrides the expected python behavior to which python programmers have been writing code for over a decade, wrecks some dependencies, and stifles new code that itself depends on cctbx. It is highly frustrating to work around this highly unnecessary flag!! Fortunately, python has a better way to handle this problem by just having “from __future__ import division” at the top of .py files. In fact, I have spot-checked several files in the cctbx source tree and noticed that the ALL have this line at the top. So why does the -Qnew flag still exist in the cctbx.python script? It is entirely unnecessary and requires any and all dependencies to be programmed to a behavior that is not standard for the language. James p.s. It is not reasonable to ask users who install your packages to patch the cctbx.python script or to patch dependencies that don’t conform to this non-standard behavior.
On Wed, Jan 8, 2014 at 12:51 PM, James Stroud
Here, the “-Qnew” flag forces the python interpreter to work with new division (where “/“ does not strictly mean integer division).
This flag basically overrides the expected python behavior to which python programmers have been writing code for over a decade, wrecks some dependencies, and stifles new code that itself depends on cctbx.
It is highly frustrating to work around this highly unnecessary flag!!
Fortunately, python has a better way to handle this problem by just having “from __future__ import division” at the top of .py files. In fact, I have spot-checked several files in the cctbx source tree and noticed that the ALL have this line at the top.
So why does the -Qnew flag still exist in the cctbx.python script? It is entirely unnecessary and requires any and all dependencies to be programmed to a behavior that is not standard for the language.
There are basically two reasons for this: 1) The old behavior is, in our opinion, fundamentally broken (and Python 3 completely discards it). (We didn't realize how many other packages were accidentally dependent on it until much later.) 2) Many of the development choices involving CCTBX are made based on the requirements of Phenix. Unfortunately, whereas Ralf was very militant about enforcing certain development rules for CCTBX code (as we continue to be), it has been nearly impossible to get the other Phenix developers to follow the same (or any!) rules. ("Cat herding" doesn't even begin to describe it.) Adding -Qnew was the easiest way to ensure that all of the code in Phenix had similar behavior with respect to division. I agree that this creates problems for third-party packages. It would certainly be safer and more intuitive if we could avoid it altogether; I will discuss this with the other developers. Since it is somewhat non-trivial at the moment to fix the rest of Phenix to have the desired behavior, and I'm trying to revise a paper among other things, I'm deferring this until some point when I have time and free brain cells (possibly until our next Phenix meeting in mid-March). For now, I've modified libtbx/env_config.py so you can do this: /usr/bin/python /path/to/libtbx/configure.py --old_division cctbx which will give you cleaner dispatchers. I'm pretty sure the rest of CCTBX should still behave sensibly, since we're pretty militant about including the __future__ statement in these files. Is that sufficiently usable for the short term? -Nat
On Jan 8, 2014, at 2:18 PM, Nathaniel Echols
1) The old behavior is, in our opinion, fundamentally broken (and Python 3 completely discards it). (We didn't realize how many other packages were accidentally dependent on it until much later.)
You won’t get any argument from me here.
2) Many of the development choices involving CCTBX are made based on the requirements of Phenix. Unfortunately, whereas Ralf was very militant about enforcing certain development rules for CCTBX code (as we continue to be), it has been nearly impossible to get the other Phenix developers to follow the same (or any!) rules. ("Cat herding" doesn't even begin to describe it.) Adding -Qnew was the easiest way to ensure that all of the code in Phenix had similar behavior with respect to division.
I agree that this creates problems for third-party packages. It would certainly be safer and more intuitive if we could avoid it altogether; I will discuss this with the other developers. Since it is somewhat non-trivial at the moment to fix the rest of Phenix to have the desired behavior, and I'm trying to revise a paper among other things, I'm deferring this until some point when I have time and free brain cells (possibly until our next Phenix meeting in mid-March). For now, I've modified libtbx/env_config.py so you can do this:
/usr/bin/python /path/to/libtbx/configure.py --old_division cctbx
which will give you cleaner dispatchers. I'm pretty sure the rest of CCTBX should still behave sensibly, since we're pretty militant about including the __future__ statement in these files. Is that sufficiently usable for the short term?
That works fine for a developer, but it doesn’t help much with users, who shouldn’t be expected to create a custom build of CCTBX. Attached is a tar file (fixit.tar) of three python scripts that should solve this problem easily: fix-division ============ For all .py files in the present directory, it will add the “from future” to any files that need it, recursively. The new line will go after any “#!” line or module level docstrings and before any other line. Variable whitespace is handled gracefully. This allows any document generators to recognize the module-level docstrings. undo ==== If testing after fix-division fails, just run the undo. I predict that this script will be unnecessary. cleanup ======= Once testing after fix-division succeeds, run cleanup to get rid of the backup files. James
Hi James,
fix-division ============ For all .py files in the present directory, it will add the “from future” to any files that need it, recursively. The new line will go after any “#!” line or module level docstrings and before any other line. Variable whitespace is handled gracefully. This allows any document generators to recognize the module-level docstrings.
libtbx.add_from_future_import_division path/to/some/directory does already do that. That's the script I wrote in order to mass insert "from __future__ import division" everywhere in the cctbx a few years back. Best wishes, Luc
On Jan 8, 2014, at 5:11 PM, Luc Bourhis
Hi James,
fix-division ============ For all .py files in the present directory, it will add the “from future” to any files that need it, recursively. The new line will go after any “#!” line or module level docstrings and before any other line. Variable whitespace is handled gracefully. This allows any document generators to recognize the module-level docstrings.
libtbx.add_from_future_import_division path/to/some/directory
does already do that. That's the script I wrote in order to mass insert "from __future__ import division" everywhere in the cctbx a few years back
So apply it!
On Wed, Jan 8, 2014 at 4:00 PM, James Stroud
/usr/bin/python /path/to/libtbx/configure.py --old_division cctbx
which will give you cleaner dispatchers. I'm pretty sure the rest of CCTBX should still behave sensibly, since we're pretty militant about including the __future__ statement in these files. Is that sufficiently usable for the short term?
That works fine for a developer, but it doesn’t help much with users, who shouldn’t be expected to create a custom build of CCTBX.
True - but our assumption (possibly incorrect) has been that there aren't many people downloading binary builds of CCTBX and then combining them with other Python modules. The majority of CCTBX "users", as far as I know, are either people actively developing code based on or within CCTBX, or the end users of Phenix, CCP4, etc. Attached is a tar file (fixit.tar) of three python scripts that should
solve this problem easily:
All of the code in CCTBX is already patched (I just fixed the one or two recent additions that weren't). The problem is all of the code in Phenix, which we would like to ensure has consistent behavior - this was the primary reason for adding "-Qnew". The maintenance overhead is already enormous and I don't have either the time or authority to micromanage my collaborators. (Nor does anyone else, for that matter.) In this case I think the current behavior is risky enough to justify a one-time intervention, but I'm not going to do anything without first discussing it with the Powers That Be. -Nat
On Jan 8, 2014, at 5:23 PM, Nathaniel Echols
True - but our assumption (possibly incorrect) has been that there aren't many people downloading binary builds of CCTBX and then combining them with other Python modules. The majority of CCTBX "users", as far as I know, are either people actively developing code based on or within CCTBX, or the end users of Phenix, CCP4, etc.
I hope that CCTBX will become more like other python packages and not so insulated by these arcane practices (that probably made sense when first implemented). The way to promote widespread use is to make CCTBX importable just like any other package, like matplotlib, scipy, numpy, etc. I think CCTBX is the only major package I know of that insulates itself like this and requires its own “dispatcher”. I already made the first (as far as I know) open source powder diffraction simulator with CCTBX (http://pythonhosted.org/radialx/), so it is a generally useful package. My simulator already has one user (not including me) even though I uploaded the first release less than 24 hours ago. But if you read the docs for my package, I recommend to have a separate interpreter for CCTBX specific code and am forced to splice out functionality into separate scripts because of the dependency problems. James
On 9 Jan 2014, at 02:00, James Stroud wrote:
The way to promote widespread use is to make CCTBX importable just like any other package, like matplotlib, scipy, numpy, etc. I think CCTBX is the only major package I know of that insulates itself like this and requires its own “dispatcher”.
The dispatcher's main job is to set PYTHONPATH, DYLD_LIBRARY_PATH on MacOS X, LD_LIBRARY_PATH on Linux and PATH on Windows, as well as LIBTBX_BUILD. Thus it can be argued that it is a by-product of the way the cctbx installs itself on a system. By that I mean that if we installed it with distutils like your typical Python package, then /usr/bin/python would be able to run any cctbx-based script because the cctbx shared libraries and Python modules would be where /usr/bin/python could find them, and we would just need to decide on where to put libtbx_env inside site_packages. For historical reasons, this is not the path that was taken. It can certainly be rectified but as Nat said, Phenix takes precedence here, since the cctbx effort ultimately relies on the funding of Phenix. As long as the Phenix people aren't comfortable with removing -Qnew or with making the installation rely on distutils, then so be it. Luc
On Wed, Jan 8, 2014 at 5:05 PM, Luc Bourhis
The dispatcher's main job is to set PYTHONPATH, DYLD_LIBRARY_PATH on MacOS X, LD_LIBRARY_PATH on Linux and PATH on Windows, as well as LIBTBX_BUILD. Thus it can be argued that it is a by-product of the way the cctbx installs itself on a system.
Actually, a major justification for the dispatchers was to avoid the "pollution" of the environment in ways that might interfere with other packages. (This is why the default behavior is also to completely wipe any existing values of LD_LIBRARY_PATH, etc.) Again, this mainly makes deployment of Phenix easier. By that I mean that if we installed it with distutils like your typical
Python package, then /usr/bin/python would be able to run any cctbx-based script because the cctbx shared libraries and Python modules would be where /usr/bin/python could find them, and we would just need to decide on where to put libtbx_env inside site_packages.
The reason why we don't do this is simply to speed development - there's no distinction between the working code (i.e. SVN checkout) and runtime code. Otherwise we might have to run "python setup.py install" every time we want to test even the most minor change. There are almost certainly other ways to avoid this requirement, but that was the logic at the time. For historical reasons, this is not the path that was taken. It can
certainly be rectified but as Nat said, Phenix takes precedence here, since the cctbx effort ultimately relies on the funding of Phenix. As long as the Phenix people aren't comfortable with removing -Qnew or with making the installation rely on distutils, then so be it.
It's not a matter of comfort: the limitation is what we have time for (or money, which is basically the same thing). Removing -Qnew is probably easy enough to be justified if it makes CCTBX as a whole more usable. I don't think anyone would argue against converting to use distutils on technical grounds; we're simply too busy. I can't speak for the other developers, but if someone else outside our group were willing to do this we would probably not complain and might even encourage it, as long as it didn't interrupt our day jobs. -Nat
Hi All,
On the distutils thing: I agree completely with Nat that in terms of developing cctbx it would be a monumental pain to have to be running python setup install all the while... however for someone who wants to *use* cctbx in the same way as they use matplotlib and so on (there are a lot of people like this here at Diamond - physical scientists mostly) it would be a great help to be able to install cctbx in the way that they are used to.
So what would be really cool would be to be able to use distutils methods to *install* cctbx in the usual pythonic way, but to keep up with the current mechanism for *working on* cctbx - though the Scons build mechanism may not help this... Could it be possible to run the two side-by-side? I have encountered people for whom this is a blocker for example they want to just add cctbx to their enthought python installation or something...
(agreed though about the Phenix caveats: this is not useful for Phenix and for the large part that is where the $$ come from for cctbx so ...)
I'll ask around here to see if anyone is interested in looking into this...
Thanks & cheerio, Graeme
From: [email protected] [mailto:[email protected]] On Behalf Of Nathaniel Echols
Sent: 09 January 2014 01:21
To: cctbx mailing list
Subject: Re: [cctbxbb] cctbx.python unnecessarily unfriendly
On Wed, Jan 8, 2014 at 5:05 PM, Luc Bourhis
Hi Graeme,
So what would be really cool would be to be able to use distutils methods to *install* cctbx in the usual pythonic way, but to keep up with the current mechanism for *working on* cctbx
Actually, thinking about it a bit more, it seems that installing the cctbx with distutils is orthogonal with the way we do it now. I mean, we could keep using the Python wrapper to make the build + source (non-)installation work but then have a script that would create a distutils-based distro. And you know what: the Debian people were working on this about one year ago, as part of creating a proper deb package for the cctbx. They proposed a series of patch to achieve that: if you browse the history of this mailing list, you will find the discussions we had. The Debian developer were Radostan Riedel, Baptiste Carvello and Frédéric-Emmanuel Picca. The last of the tree works at Soleil and his concerns were very similar to those you expressed. He even invited me to visit him, which I did. Unfortunately, neither I nor Nat have followed this project. Best wishes, Luc
Le 09/01/2014 09:51, Luc Bourhis a écrit :
[...] And you know what: the Debian people were working on this about one year ago, as part of creating a proper deb package for the cctbx. They proposed a series of patch to achieve that: if you browse the history of this mailing list, you will find the discussions we had. The Debian developer were Radostan Riedel, Baptiste Carvello and Frédéric-Emmanuel Picca. The last of the tree works at Soleil and his concerns were very similar to those you expressed. He even invited me to visit him, which I did. Unfortunately, neither I nor Nat have followed this project.
Hi Luc, hi all, the project got stuck due to lack of time. The last commits were made in February last year. Speaking for myself, sticking around on the mailing lists is as much time I can put in it these days (sorry about that). Cheers, Baptiste
In CCP4 we have the same problem that Linux distros have with cctbx.
We're using an ad-hoc shell script that copies files around as a
substitute for "make install" (or python setup.py install / scons
install) functionality.
Scons like any build system supports installation. From what I just
googled the functions for this are named Install() and InstallAs().
Maybe it would be easier to use them instead of distutils?
Marcin
On 9 January 2014 08:51, Luc Bourhis
Hi Graeme,
So what would be really cool would be to be able to use distutils methods to *install* cctbx in the usual pythonic way, but to keep up with the current mechanism for *working on* cctbx
Actually, thinking about it a bit more, it seems that installing the cctbx with distutils is orthogonal with the way we do it now. I mean, we could keep using the Python wrapper to make the build + source (non-)installation work but then have a script that would create a distutils-based distro. And you know what: the Debian people were working on this about one year ago, as part of creating a proper deb package for the cctbx. They proposed a series of patch to achieve that: if you browse the history of this mailing list, you will find the discussions we had. The Debian developer were Radostan Riedel, Baptiste Carvello and Frédéric-Emmanuel Picca. The last of the tree works at Soleil and his concerns were very similar to those you expressed. He even invited me to visit him, which I did. Unfortunately, neither I nor Nat have followed this project.
Best wishes,
Luc
_______________________________________________ cctbxbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/cctbxbb
Hi Marcin, from what I remember, distutils was not the difficult problem. The setup.py was mostly a thin wrapper around Scons. As you say, it's just another build system. The main reason we use it is, that the Debian packaging tools interface well with it. The difficulty was in making sure that cctbx would find its different bits (python modules, shared libraries, runtime and test data) in the different places where Debian traditionally install them (instead of the source + build directories). We tried to make this work even when using the system python. This involved some delicate surgery in the libtbx environment file. For this task, a python-written build system is a big plus, because it enables us to import the libtbx modules. Cheers, Baptiste Le 09/01/2014 20:04, Marcin Wojdyr a écrit :
In CCP4 we have the same problem that Linux distros have with cctbx. We're using an ad-hoc shell script that copies files around as a substitute for "make install" (or python setup.py install / scons install) functionality.
Scons like any build system supports installation. From what I just googled the functions for this are named Install() and InstallAs(). Maybe it would be easier to use them instead of distutils?
Marcin
On 9 January 2014 08:51, Luc Bourhis
wrote: Hi Graeme,
So what would be really cool would be to be able to use distutils methods to *install* cctbx in the usual pythonic way, but to keep up with the current mechanism for *working on* cctbx
Actually, thinking about it a bit more, it seems that installing the cctbx with distutils is orthogonal with the way we do it now. I mean, we could keep using the Python wrapper to make the build + source (non-)installation work but then have a script that would create a distutils-based distro. And you know what: the Debian people were working on this about one year ago, as part of creating a proper deb package for the cctbx. They proposed a series of patch to achieve that: if you browse the history of this mailing list, you will find the discussions we had. The Debian developer were Radostan Riedel, Baptiste Carvello and Frédéric-Emmanuel Picca. The last of the tree works at Soleil and his concerns were very similar to those you expressed. He even invited me to visit him, which I did. Unfortunately, neither I nor Nat have followed this project.
Best wishes,
Luc
_______________________________________________ cctbxbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/cctbxbb
_______________________________________________ cctbxbb mailing list [email protected] http://phenix-online.org/mailman/listinfo/cctbxbb
participants (6)
-
Baptiste Carvello
-
Graeme.Winter@diamond.ac.uk
-
James Stroud
-
Luc Bourhis
-
Marcin Wojdyr
-
Nathaniel Echols