Hi,

I have a problem where the use of a sparse matrix is appropriate, so I was pretty happy to grep cctbx for "sparse" and discover the code in scitbx/sparse. I'm just browsing through the code and tests to get a feel for how to use these types.

In my existing Python code I set the values for some columns using flex.double vectors, then ultimately fill a flex.double(flex.grid(n, m)) by these columns using matrix_paste_column_in_place. It seems to me I can use a very similar pattern by replacing the flex.doubles with sparse.matrix_columns and setting columns in the sparse.matrix using slice notation: mat[:, 0] = column. So far so good.

The issue is that I'm not sure how best to go about setting the values in the matrix_column in the first place. At the moment I'm making a lot of use of flex.size_t selections and the set_selected method of flex.double to efficiently (from Python) set elements of one flex.double from another. It appears that there is no similar functionality for sparse.matrix_column. Am I right? Possibly the best I can do is construct a Python dictionary consisting of the indices from my size_t selection and the values from my flex.double array and use these to construct the matrix_column, as done in tst_sparse, e.g. sparse.matrix_column(m, {1: 0.1, 2: 0.2}).

Without having tried this, I sense it might be a burden. I have a lot of values to set and don't really want to convert from flex.double via a dictionary. Are there any philosophical/technical/other reasons for not writing a set_selected method for sparse.vector? If not, I might try to do this.

Thanks
-- David
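[The dictionary route described above can be sketched in plain Python. The lists below stand in for a flex.size_t selection and a flex.double array, and only the dict-building step is shown, since constructing the actual sparse.matrix_column requires cctbx; the variable names are illustrative, not cctbx API.]

```python
# Plain-Python sketch of the dictionary-based construction mentioned above.
# "indices" stands in for a flex.size_t selection, "values" for the
# corresponding flex.double array.
indices = [1, 4, 7]
values = [0.1, 0.2, 0.3]

# Build the {index: value} dict that sparse.matrix_column(m, d) accepts
# in tst_sparse, e.g. sparse.matrix_column(m, {1: 0.1, 2: 0.2}).
column_dict = dict(zip(indices, values))
```

This is the conversion step David would rather avoid: every element passes through a Python dict instead of being copied in bulk.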
Hi David,
> Without having tried this, I sense it might be a burden. I have a lot of values to set and don't really want to convert from flex.double via a dictionary. Are there any philosophical/technical/other reasons for not writing a set_selected method for sparse.vector? If not, I might try to do this.
First, I am the author of that sparse matrix code. Could you tell me what application you have in mind? My sparse matrix code makes very specific memory and time trade-offs that may not be suitable for your problem.

I can already answer your question, though: going through an intermediate dictionary would be slow as well as cumbersome. I only implemented initialising a sparse vector with a dictionary to make testing easier; it is not designed to be used in heavy-duty production code. It would, however, be perfectly legitimate for the sparse columns to support assignment through selections. I could easily make the following work:

x = sparse.vector(10)  # vector with 10 elements
x.set_selected(index_selection, another_vector)
x.compact()  # part of the trade-offs I was talking about

Best wishes,
Luc
Hi Luc,
I suspect my application may be very similar to your original use
case. I am preparing a Jacobian matrix to pass into the add_equations
method of an lstbx non_linear_ls object. By the way, I am sure this is
your code as well - so I thank you for this. My gradient calculations
mostly involve vector calculus and are neatly done in a "vectorised"
fashion from the Python side using operations on flex.vec3_double,
mat3_double and so on. I naturally end up with some values in a
flex.double array for some subset of the problem that I'd like to put
into a sparse matrix column. The code you pasted looks ideal for this.
I don't intend to assign any particular index in the matrix_column
more than once. I'm not sure if that perhaps gives me the upper hand
wrt the trade-offs you mention? Additionally most of the problems I
deal with are not sparse, so I intend to keep the existing code for
most cases and only use this alternative when I know the problem will
be sparse.
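
[The workflow described above can be sketched in plain Python. A dict stands in for the sparse column, and "scatter", "selection" and "values" are illustrative names, not cctbx API; the real code would use scitbx sparse types and the proposed set_selected.]

```python
# Hypothetical sketch: scatter a dense subset of gradient values into a
# sparse column. A plain dict stands in for the sparse column.
def scatter(selection, values, column):
    # each index is assigned at most once, as in the use case above
    for i, v in zip(selection, values):
        column[i] = v
    return column

col = scatter([2, 5, 9], [0.1, -0.3, 4.2], {})
```

With a set_selected method on sparse.vector, the loop above would collapse to a single call.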
If this all sounds reasonable, and you don't mind doing so, then I
would indeed appreciate it if you could make that code work.
Cheers
-- David
On 6 June 2014 14:18, Luc Bourhis wrote:
On 6 Jun 2014, at 15:40, David Waterman wrote:
> I don't intend to assign any particular index in the matrix_column more than once. I'm not sure if that perhaps gives me the upper hand wrt the trade-offs you mention?
That's the other way around, actually: I designed the sparse columns so that operations like x(i) += 1.0 or x(i) = 2.0 (a) are basically O(1), at the price of wasting some memory and of having to call x.compact() once construction of the vector is finished. Reading x(i), on the contrary, is O(log n), where n is the number of non-zero elements in x. I chose that trade-off because operations (a) are repeated many times when building a Jacobian by pieces and by composition (the aim was constraints for small-molecule crystallography), whereas all sparse algorithms can be written so as to avoid indexed reads, by iterating through the known non-zero elements.

Actually, looking at my code, I realise that I designed it so that compact() is called as needed! You don't need to call it explicitly.

In any case, it does look like the right structure if you want to deal with a sparse Jacobian. OK, no problem, I'll add selections then.

Best wishes,
Luc
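[The trade-off Luc describes can be illustrated with a toy pure-Python model: writes append (index, value) pairs in O(1), compact() sorts and merges them, and reads use binary search. This is only a sketch of the design, not the scitbx implementation; the class and attribute names are invented for illustration.]

```python
import bisect

class ToySparseVector:
    """Toy model of the append-then-compact design: O(1) writes at the
    cost of temporary duplicates, O(log n) reads after compaction.
    Not the scitbx code."""

    def __init__(self, n):
        self.n = n
        self.pairs = []        # (index, value) pairs, duplicates allowed
        self.is_compact = True

    def __setitem__(self, i, v):
        # O(1): just record the write; duplicates linger until compact()
        self.pairs.append((i, v))
        self.is_compact = False

    def compact(self):
        # merge duplicates (last write wins, mimicking assignment),
        # then sort by index to enable binary search
        merged = {}
        for i, v in self.pairs:
            merged[i] = v
        self.pairs = sorted(merged.items())
        self.is_compact = True

    def __getitem__(self, i):
        # compact() is called as needed, as Luc notes above
        if not self.is_compact:
            self.compact()
        # O(log n) read via binary search over the sorted pairs
        k = bisect.bisect_left(self.pairs, (i,))
        if k < len(self.pairs) and self.pairs[k][0] == i:
            return self.pairs[k][1]
        return 0.0
```

Repeated assignment to the same index simply appends; only compaction pays the sorting cost, which is why building a Jacobian piece by piece stays cheap.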
Hi Luc,
That sounds great, thank you.
Cheers
-- David
On 6 June 2014 15:06, Luc Bourhis wrote:
_______________________________________________
cctbxbb mailing list
[email protected]
http://phenix-online.org/mailman/listinfo/cctbxbb
participants (2)
- David Waterman
- Luc Bourhis