This is an implementation of the DBSCAN clustering algorithm on top of Apache Spark. It is loosely based on the paper by G. Luo et al., "A Parallel DBSCAN Algorithm Based on Spark" [1].
This project is available under the Apache 2.0 license. See the LICENSE file for details.
- This project is maintained by Homayoun Heidarzadeh ([email protected]).
- Chris McCormick's Python implementation of DBSCAN, available here.
[1] Luo, Guangchun, et al. "A Parallel DBSCAN Algorithm Based on Spark." 2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom). IEEE, 2016.