sparse_matrix

We provide a paradigm for conducting pairwise row operations (e.g., Pearson correlation) on large sparse matrices, such as those arising in connectome data or social network analysis. NumPy's np.corrcoef() function does not support the csr_matrix data type, and converting a large sparse matrix back to a dense one can be slow or infeasible because of RAM limitations. Our approach avoids this problem by performing all operations directly on the sparse matrix. Users may also consider dask as a parallel computing add-on and/or alternative. We also provide an external data file, microns_allW.npz (90300×90300, density ~0.001), for testing purposes.
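As a rough illustration of the idea, the sketch below computes row-wise Pearson correlation without densifying the input. It is not the code in zz_sparse_corr.py: the function name sparse_row_corr is hypothetical, and the formula used (covariance as the mean of products minus the product of means) is one common way to do this with SciPy's sparse matrices. Note that the result is a dense rows×rows array, so this form is only practical when that output fits in memory.

```python
import numpy as np
from scipy import sparse


def sparse_row_corr(X):
    """Pearson correlation between the rows of a sparse matrix.

    Hypothetical helper for illustration; the input is never converted
    to a dense array, only the (rows x rows) result is dense.
    """
    X = sparse.csr_matrix(X, dtype=np.float64)
    n = X.shape[1]

    # Row means, flattened from the np.matrix that SciPy returns.
    mean = np.asarray(X.mean(axis=1)).ravel()

    # cov(i, j) = E[x_i * x_j] - E[x_i] * E[x_j]; the Gram matrix is a
    # sparse-sparse product, so zeros in X cost nothing.
    gram = (X @ X.T).toarray()
    cov = gram / n - np.outer(mean, mean)

    # Rows with zero variance produce NaN, as np.corrcoef does.
    with np.errstate(divide="ignore", invalid="ignore"):
        std = np.sqrt(np.diag(cov))
        corr = cov / np.outer(std, std)
    return corr


# Small demo on a random sparse matrix.
X_demo = sparse.random(50, 200, density=0.05, format="csr", random_state=0)
C_demo = sparse_row_corr(X_demo)
```

On a matrix small enough to densify, the result can be checked against np.corrcoef(X_demo.toarray()).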

In tests on randomly generated sparse matrices, our sparse correlation was consistently faster than the NumPy counterpart for larger matrices. Still, we recommend splitting the matrix into smaller row blocks (e.g., 100×N) and computing each block separately, either synchronously or asynchronously. We also recommend using np.memmap() to create a memory-mapped array for storing the results.
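The block-wise recommendation can be sketched as follows. This is a minimal example under stated assumptions rather than the contents of zz_big_array.py: the function name blockwise_row_corr, the float32 output type, and the temporary file path are all illustrative choices. Each block of rows is correlated against the full matrix and written straight into an np.memmap, so only one block of the result is held in RAM at a time.

```python
import os
import tempfile

import numpy as np
from scipy import sparse


def blockwise_row_corr(X, out_path, block_rows=100):
    """Row-wise Pearson correlation, computed block by block.

    Hypothetical helper: results go into a disk-backed np.memmap at
    out_path instead of an in-memory array.
    """
    X = sparse.csr_matrix(X, dtype=np.float64)
    m, n = X.shape

    # Per-row mean and standard deviation, computed on the sparse data.
    mean = np.asarray(X.mean(axis=1)).ravel()
    sq_mean = np.asarray(X.multiply(X).mean(axis=1)).ravel()
    std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))

    out = np.memmap(out_path, dtype=np.float32, mode="w+", shape=(m, m))
    Xt = X.T  # reused by every block

    for start in range(0, m, block_rows):
        stop = min(start + block_rows, m)
        # (block x N) @ (N x m): only this block's rows are densified.
        gram = (X[start:stop] @ Xt).toarray()
        cov = gram / n - np.outer(mean[start:stop], mean)
        with np.errstate(divide="ignore", invalid="ignore"):
            out[start:stop] = cov / np.outer(std[start:stop], std)

    out.flush()
    return out


# Small demo; the output file lives in a temporary directory.
X_blk = sparse.random(250, 400, density=0.05, format="csr", random_state=1)
path_blk = os.path.join(tempfile.mkdtemp(), "corr.dat")
C_blk = blockwise_row_corr(X_blk, path_blk, block_rows=100)
```

Because the blocks are independent, the loop body could also be dispatched to worker processes or to dask; the sequential loop here is only the simplest variant.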

StevenZhang0116/sparse_matrix
