cuSPARSE csrmm with dense matrix in row-major format - cuda

I want to use the cuSPARSE csrmm function to multiply two matrices. The A matrix is sparse and the B matrix is dense. The dense matrix is in row-major format. Is there some nice way (trick) to accomplish this without the need to explicitly transpose B? I am thinking of something similar to this for dense BLAS.
Thanks
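One commonly cited trick (assuming your cuSPARSE version provides csrmm2 with a transB operand) is to note that a row-major k x n buffer, reinterpreted as column-major, is exactly B^T, so asking the routine to transpose its B operand recovers the original product without moving any data. A NumPy sketch of the layout argument, with dense stand-ins for the GPU matrices:

```python
import numpy as np

# Hypothetical shapes: A is m x k (sparse in the real code), B is k x n row-major.
m, k, n = 4, 5, 3
rng = np.random.default_rng(0)
A = rng.random((m, k))          # stands in for the sparse CSR matrix
B = rng.random((k, n))          # row-major (NumPy's default, like C)

# Reinterpret B's row-major buffer as a column-major n x k matrix:
# the bytes don't move, but the matrix they describe is B^T.
B_as_colmajor = B.ravel(order="C").reshape((n, k), order="F")
assert np.array_equal(B_as_colmajor, B.T)

# csrmm2 with transB = CUSPARSE_OPERATION_TRANSPOSE would then compute
# A * (B_as_colmajor)^T, which is just A * B -- no explicit transpose needed.
C = A @ B_as_colmajor.T
assert np.allclose(C, A @ B)
```

On the GPU side this means passing B's existing row-major buffer with leading dimension n and requesting the transpose operation on that operand.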

Related

How to transpose a sparse matrix in cuSparse?

I am trying to compute A^T A using cuSPARSE. A is a large but sparse matrix. The proper function to use, based on the documentation, is cusparseDcsrgemm2. However, this is one of the few cuSPARSE operations that doesn't support an optional built-in transpose for the input matrix. There's a line in the documentation that says:
Only the NN version is supported. For other modes, the user has to
transpose A or B explicitly.
The problem is that I couldn't find a function in cuSPARSE that performs a transpose. I know I could transpose on the CPU and copy the result to the GPU, but that would slow down the application. Am I missing something? What is the right way to use cuSPARSE to compute A^T A?
For matrices that are in CSR (or CSC) format:
The CSR sparse representation of a matrix has exactly the same format and memory layout as the CSC sparse representation of its transpose.
Therefore, if we use the cuSPARSE-provided routine to convert a CSR-format matrix into CSC format, the resulting CSC-format matrix is identical to the CSR representation of the transpose of the original matrix. This CSR-to-CSC conversion routine can thus be used to find the transpose of a CSR-format sparse matrix. (It can likewise be used to find the transpose of a CSC-format sparse matrix.)
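The claim is easy to check numerically. The sketch below uses plain NumPy (no cuSPARSE; on the GPU the conversion routine is cusparseDcsr2csc, or cusparseCsr2cscEx2 in newer toolkits): it builds the CSC arrays of A and the CSR arrays of A^T directly and confirms they are identical.

```python
import numpy as np

def dense_to_csr(M):
    """Return (values, column indices, row pointers) of M in CSR form."""
    vals, cols, ptr = [], [], [0]
    for row in M:
        nz = np.flatnonzero(row)
        vals.extend(row[nz])
        cols.extend(nz)
        ptr.append(len(vals))
    return np.array(vals), np.array(cols), np.array(ptr)

def dense_to_csc(M):
    """Return (values, row indices, column pointers) of M in CSC form."""
    vals, rows, ptr = [], [], [0]
    for col in M.T:
        nz = np.flatnonzero(col)
        vals.extend(col[nz])
        rows.extend(nz)
        ptr.append(len(vals))
    return np.array(vals), np.array(rows), np.array(ptr)

A = np.array([[1., 0., 2.],
              [0., 3., 0.],
              [4., 0., 5.],
              [0., 6., 0.]])

# The three CSC arrays of A are element-for-element the CSR arrays of A^T.
for x, y in zip(dense_to_csc(A), dense_to_csr(A.T)):
    assert np.array_equal(x, y)
```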

How to calculate a combination of convolution and correlation by FFT?

I'm trying to achieve an algorithm to efficiently calculate a combination of convolution and correlation such as following :
c(x,y) = sum_i sum_j a(x-i, y+j) * b(i,j)
I have known that 1-D convolution or correlation can be solved by
a conv b = ifft(fft(a) .* fft(b))
a corr b = ifft(fft(a) .* conj(fft(b)))
But I have no idea how to combine them in 2-D or N-D problems. I think it is similar to 2-D convolution, but I don't know the specific derivation.
The correlation can be written in terms of the convolution by reversing one of the arguments:
corr(x(t),y(t)) = conv(x(t),y(-t))
Thus, if you want the x-axis to behave like a convolution and the y-axis to behave like a correlation, reverse the y-axis only and compute the convolution. It doesn’t matter if you use a spatial or frequency domain implementation.
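A quick numerical check of this in NumPy (with circular boundary conditions, so the FFT identity is exact): reverse b along the y-axis only, do an ordinary 2-D circular convolution via the FFT, and compare against the double sum from the question.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 4, 5
a = rng.random((M, N))
b = rng.random((M, N))

# Direct evaluation of c(x,y) = sum_i sum_j a(x-i, y+j) * b(i,j)
# with circular (periodic) indexing.
c_direct = np.zeros((M, N))
for x in range(M):
    for y in range(N):
        for i in range(M):
            for j in range(N):
                c_direct[x, y] += a[(x - i) % M, (y + j) % N] * b[i, j]

# FFT route: reverse b along the y-axis, i.e. b'(i,j) = b(i, (-j) mod N),
# then an ordinary 2-D circular convolution of a with b' equals c.
b_yrev = np.roll(np.flip(b, axis=1), 1, axis=1)
c_fft = np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b_yrev)))

assert np.allclose(c_direct, c_fft)
```

The x-axis keeps its convolution behavior (argument x-i), while the reversal on the y-axis turns y-j into y+j, giving the correlation behavior there.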

How can I solve the SVD of a row-major matrix using the cusolver gesvd function

I'm a beginner with CUDA. I want to solve the SVD of a row-major matrix using the cuSolver API, but I'm confused about the leading dimension of matrix A.
I have a row-major 100x10 matrix (i.e., 100 data points in a 10-dimensional space).
According to the CUDA documentation, the cusolverDnDgesvd function needs an lda parameter (the leading dimension of matrix A). My matrix is row-major, so I passed 10 to gesvd, but the function failed, indicating that my lda parameter was wrong.
OK, so I passed 100 instead. The function ran, but its results (U, S, Vt) seem to be wrong; I can't recover the matrix A from U S Vt.
To my knowledge, the cuSolver API assumes all matrices are column-major.
If I treat my matrix as column-major, m is smaller than n (10x100), but gesvd only works for m >= n.
So yes, I'm in trouble. How can I solve this problem?
Row-major, col-major and leading dimension are concepts related to the storage. A matrix can be stored in either scheme, while representing the same mathematical matrix.
To get the correct result, you could use cublasDgeam() to change your row-major 100x10 matrix into a column-major 100x10 matrix (this is equivalent to a matrix transpose while keeping the storage order) before calling cuSolver.
There are many sources talking about storage ordering,
https://en.wikipedia.org/wiki/Row-major_order
https://fgiesen.wordpress.com/2012/02/12/row-major-vs-column-major-row-vectors-vs-column-vectors/
https://eigen.tuxfamily.org/dox-devel/group__TopicStorageOrders.html
Confusion between C++ and OpenGL matrix order (row-major vs column-major)
as well as leading dimension
http://www.ibm.com/support/knowledgecenter/SSFHY8_5.3.0/com.ibm.cluster.essl.v5r3.essl100.doc/am5gr_leaddi.htm
You should google them.
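The reinterpretation at the heart of the confusion can be demonstrated in NumPy (a pure layout demonstration, no cuSolver involved): handing a row-major m x n buffer to a column-major library makes it see the n x m transpose, and physically reordering the buffer to column-major, which is what cublasDgeam accomplishes on the GPU, lets the library see the real A.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 100, 10
A = rng.random((m, n))                      # row-major, NumPy's default

# What a column-major library sees when given A's buffer with lda = n:
# not A, but the n x m transpose. Same bytes, different matrix.
seen = A.ravel(order="C").reshape((n, m), order="F")
assert np.array_equal(seen, A.T)

# The fix from the answer: physically reorder the buffer to column-major
# (the job cublasDgeam does on the GPU) so the library sees the real A.
A_colmajor = np.asfortranarray(A)
U, S, Vt = np.linalg.svd(A_colmajor, full_matrices=False)
assert np.allclose(U * S @ Vt, A)           # A = U S V^T is recovered
```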

CUSP function to generate a matrix with random values

I was wondering if the CUSP library provides a function that creates a matrix with a specified number of rows and columns, filled with random values?
I found the poisson5pt function, but it doesn't return a matrix with the dimensions I specify!
Thanks in advance
In the CUSP matrix gallery you will find random.h which almost does what you want:
template <class MatrixType>
void random(size_t num_rows, size_t num_cols, size_t num_samples, MatrixType& output)
This will produce a matrix of the dimensions you specify with the number of random locations you request filled with 1.
It would be trivial to modify that to use random values rather than unity, although I don't understand why you would ever want such a matrix: it is not guaranteed to have any of the properties you probably need in a test matrix if you plan to use it in linear algebra operations.
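For illustration, here is a NumPy sketch of that trivial modification (the function name random_coo and its COO-style return format are my own invention, not part of CUSP): pick num_samples distinct locations and assign each a random value instead of 1.

```python
import numpy as np

def random_coo(num_rows, num_cols, num_samples, rng):
    """Sketch of random.h with random values instead of ones: pick
    num_samples distinct locations and give each a random value.
    Returns COO-style (row, col, value) arrays."""
    flat = rng.choice(num_rows * num_cols, size=num_samples, replace=False)
    rows, cols = np.divmod(flat, num_cols)   # flat index -> (row, col)
    vals = rng.random(num_samples)           # was: all ones in random.h
    return rows, cols, vals

rng = np.random.default_rng(3)
rows, cols, vals = random_coo(6, 8, 10, rng)
assert len(rows) == len(cols) == len(vals) == 10
assert rows.max() < 6 and cols.max() < 8
```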

cuSPARSE dense times sparse

I need to calculate the following matrix math:
D * A
Where D is dense, and A is sparse, in CSC format.
cuSPARSE allows multiplying sparse * dense, where sparse matrix is in CSR format.
Following a related question, I can "convert" CSC to CSR simply by transposing A.
Also I can calculate (A^T * D^T)^T, as I can handle getting the result transposed.
In this method I can also avoid "transposing" A, because CSR^T is CSC.
The only problem is that cuSPARSE doesn't support transposing D in this operation, so I would have to transpose it beforehand, or convert it to CSR, which is a total waste given how dense it is.
Is there any workaround? Thanks.
I found a workaround.
I changed the memory accesses to D throughout my code.
If D is an m x n matrix that I used to access as D[j * m + i], I now access it as D[i * n + j]; that is, I made it row-major instead of column-major.
cuSPARSE expects matrices in column-major format, and because a row-major matrix reinterpreted as column-major is its transpose, I can pass D to cuSPARSE functions as a free transpose, without ever materializing it.
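The identity this workaround relies on, checked in NumPy (plain dense stand-ins for the CSC matrix and the cuSPARSE call):

```python
import numpy as np

rng = np.random.default_rng(4)
m, k, n = 3, 4, 5
D = rng.random((m, k))                      # dense, row-major in NumPy
A = rng.random((k, n))                      # stands in for the CSC sparse matrix

# A's CSC arrays are the CSR arrays of A^T, so cuSPARSE's sparse * dense
# routine can work on A^T directly; the layout trick supplies D^T for free:
D_T = D.ravel(order="C").reshape((k, m), order="F")   # same bytes as D
assert np.array_equal(D_T, D.T)

# (A^T * D^T)^T = D * A, and the final transpose is also free, because the
# column-major result buffer read back row-major is already the transpose.
assert np.allclose((A.T @ D_T).T, D @ A)
```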