Hello all, I am playing with the Patterson function as calculated by cctbx and I am confused as to how I can extract and manipulate the data. More specifically, miller.patterson_map(...).real_map() returns a one-dimensional array of values. How do I get the corresponding 3D coordinates? I suspect it has to do with the gridding and some kind of fixed order enumeration, but I haven't found an example (the peak_search().sites() seems to be relevant, but I would like the whole thing). On a related note - is there a way to debug the python code of cctbx and step into the c++ code? Debugging is my favorite way to figure things out, but with cctbx most interesting things are hidden. I am using Eclipse + PyDev now. Thank you! Kind regards, Dmytro.
On a related note – is there a way to debug the python code of cctbx and step into the c++ code? Debugging is my favorite way to figure things out, but with cctbx most interesting things are hidden. I am using Eclipse + PyDev now.
I have done this successfully using WingIDE+Visual Studio on Windows and WingIDE+Xcode on Mac. First you will need a debug build of the cctbx, which you can get by adding --build=debug to the libtbx.configure options and recompiling. Then you need to first stop at a breakpoint in your Python debugger, then in your C++ debugger you need to find the "Attach to Process" (or similar) option in the menu and find the current Python
On 15 October 2012 12:51, Dmytro Guzenko
On Mon, Oct 15, 2012 at 12:51 PM, Dmytro Guzenko
I am playing with the Patterson function as calculated by cctbx and I am confused as to how I can extract and manipulate the data.
More specifically, miller.patterson_map(…).real_map() returns a one-dimensional array of values. How do I get the corresponding 3D coordinates? I suspect it has to do with the gridding and some kind of fixed order enumeration, but I haven’t found an example (the peak_search().sites() seems to be relevant, but I would like the whole thing).
The map object does indeed have at its core a one-dimensional flex.double or flex.complex_double object, but with the addition of a grid accessor, which allows you to index the array using a tuple. One could generate a similar object using this python code: from scitbx.array_family import flex map_values = flex.random_double(1000) map_values.reshape(flex.grid(10,10,10)) print map[5,5,5] Here 5,5,5 is simply the index of the grid point of interest, which will presumably correspond to some XYZ value, but which one is somewhat arbitrary depending on the desired grid spacing and the preferences of the FFT algorithm. You can get the size of the grid from the cctbx.miller.fft_map class returned by array.patterson_map(...) as fft_map.n_real(). Thus: patt_fft_map = fobs.patterson_map() patt_map = patt_fft_map.real_map_unpadded() n_real = patt_fft_map.n_real() unit_cell = patt_fft_map.unit_cell() Given a grid coordinate u,v,w you can calculate the fractional coordinate by dividing u,v,w by n_real, and if necessary, the Cartesian coordinate by unit_cell.orthogonalize(site_frac=site_frac). In practice, much of what we do is starting from known 3D coordinates (fractional or Cartesian) and either pulling out map values using interpolation, or selecting a set of 1D grid indices around those coordinates, rather than looking at grid values individually. I hope this is at least a semi-coherent explanation - if not hopefully the other developers can correct/clarify. (I just use this code, almost none of it is mine!) -Nat PS. FYI, the computational overhead for a Python call like "map[5,5,5]" is relatively large, so if you're looping over 3D indices, you're probably going to want to use C++ eventually. (Python is fine for prototyping, of course.)
Hello Nathaniel, Thanks for answering!
Given a grid coordinate u,v,w you can calculate the fractional coordinate by dividing u,v,w by n_real, and if necessary, the Cartesian coordinate by unit_cell.orthogonalize(site_frac=site_frac).
I think I got it, just to confirm: ------------------------------------------- # extracting x,y,z from pmap into cart_coord and corresponding weights into p_weight rmap = pmap.real_map_unpadded() n_real = pmap.n_real() unit_cell = pmap.unit_cell() grid2cart = maptbx.grid2cart(n_real,unit_cell.orthogonalization_matrix()) n_points_total = n_real[0]*n_real[1]*n_real[2] cart_coord = flex.double(flex.grid(3,n_points_total)) p_weight = flex.double(n_points_total) for plain_ind,grid_point in enumerate(flex.nested_loop(n_real)): cart_coord[0,plain_ind],cart_coord[1,plain_ind],cart_coord[2,plain_ind] = grid2cart(grid_point) p_weight[plain_ind] = rmap[grid_point] ------------------------------------------- The syntax is probably not optimal - I am new to python (and to crystallography, for that matter). By the way, I noticed that many weights are negative, is this possible in a patterson map or am I doing something wrong? I prepare the map like this (alphaHelix.pdb is from http://www.lorieau.com/methods-and-reference/biophysics-and-biochemistry/31-...): ------------------------------------------- pdb_inp = pdb.input(file_name="alphaHelix.pdb") xray_structure = pdb_inp.xray_structure_simple() fcalc = xray_structure.structure_factors(d_min=2, algorithm="fft").f_calc() pmap = fcalc.patterson_map(d_min=2, symmetry_flags=maptbx.use_space_group_symmetry) pmap.statistics().show_summary() ------------------------------------------- max 6.48589e+07 min -507708 <--- mean -8.76639e-10 <--- sigma 1.05336e+06 ------------------------------------------- Thanks again! Dmytro.
On Tue, Oct 16, 2012 at 6:47 AM, Dmytro Guzenko
rmap = pmap.real_map_unpadded() n_real = pmap.n_real() unit_cell = pmap.unit_cell() grid2cart = maptbx.grid2cart(n_real,unit_cell.orthogonalization_matrix())
n_points_total = n_real[0]*n_real[1]*n_real[2] cart_coord = flex.double(flex.grid(3,n_points_total)) p_weight = flex.double(n_points_total)
for plain_ind,grid_point in enumerate(flex.nested_loop(n_real)): cart_coord[0,plain_ind],cart_coord[1,plain_ind],cart_coord[2,plain_ind] = grid2cart(grid_point) p_weight[plain_ind] = rmap[grid_point]
There are better arrays for storing the coordinates, e.g.: cart_coord = flex.vec3_double(flex.grid(n_real)) p_weight = flex.double(rmap.size()) for plain_ind,grid_point in enumerate(flex.nested_loop(n_real)): cart_coord[grid_point] = grid2cart(grid_point) p_weight[plain_ind] = rmap[grid_point]
By the way, I noticed that many weights are negative, is this possible in a patterson map or am I doing something wrong?
This is normal - I believe it's a Fourier series termination error, but I still don't fully understand the reason. An Fobs map will have the same problem; the mean will always be approximately zero. Technically, if the map were perfect with no missing data, it would be uniformly positive, but that's never actually the case. There are ways to "fix" the map so it's all-positive, such as maximum entropy cleanup, but in practice I think we just ignore the negative values (at least for Fobs/2mFo-DFc maps - not sure how Patterson maps are usually handled). -Nat
participants (3)
-
Dmytro Guzenko
-
Nathaniel Echols
-
Richard Gildea