
@kuz kuz commented Aug 18, 2016

Since the only possible distance between two neighboring cells is 1 and there is no such thing as an unreachable node, there is no need for Floyd-Warshall. We know the shape of the grid, so we can just calculate the position of each node and then count the number of steps needed to get to that node from every other node. This reduces the complexity from O(n^3) to O(n^2), where n is the number of nodes. In my case, a SOM with side 25 (625 nodes), that was an improvement from 4 minutes to 18 seconds (total algorithm time).
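A minimal sketch of the idea, not the code from the patch: it assumes a rectangular grid whose nodes are numbered in row-major order and connected only to their horizontal and vertical neighbours, so the step count between two nodes is the Manhattan distance between their positions. The function name `grid_distances` is hypothetical.

```python
def grid_distances(n_rows, n_cols):
    """Pairwise step counts between nodes of an n_rows x n_cols grid."""
    # Position (row, col) of each node, assuming row-major numbering.
    positions = [divmod(i, n_cols) for i in range(n_rows * n_cols)]
    # Assuming 4-connectivity, the number of steps between two nodes
    # is the Manhattan distance between their positions.
    return [[abs(r1 - r2) + abs(c1 - c2) for (r2, c2) in positions]
            for (r1, c1) in positions]


dist = grid_distances(25, 25)  # 625 x 625 table of step counts
```

Each of the n^2 entries is computed in constant time, which is where the O(n^2) bound comes from. A grid with diagonal or hexagonal neighbours would need a different distance formula.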

@naught101
Owner

the only possible distance between two neighboring cells is 1 and there is no such thing as unreachable node

This is not true in general. The algorithm is designed to allow arbitrary networks as the "grid" for the SOM (e.g. if the user provides the grid). This is especially useful for evolving-grid SOM variants, but also includes higher-dimensional rectilinear grids, etc.

Since the minimum-distances calculation doesn't happen very often (once, for non-evolving grids), this doesn't seem that important, but it might be worth converting the Floyd-Warshall algorithm to Cython eventually.
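For reference, a minimal Floyd-Warshall sketch over an arbitrary weighted graph, which is the general case described above. It is not the code in `som_.py`; the function name `floyd_warshall` and the adjacency-matrix input (with `math.inf` marking a missing edge) are assumptions made for illustration.

```python
import math

INF = math.inf


def floyd_warshall(weights):
    """All-pairs shortest distances from a square adjacency matrix."""
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not modified
    for k in range(n):          # allow node k as an intermediate stop
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


# Hypothetical 4-node network: edges 0-1 (length 1) and 1-2 (length 2);
# node 3 has no edges, so it is unreachable from the others.
weights = [
    [0,   1,   INF, INF],
    [1,   0,   2,   INF],
    [INF, 2,   0,   INF],
    [INF, INF, INF, 0],
]
shortest = floyd_warshall(weights)
```

The example network has a non-unit edge length and an unreachable node, the two cases a fixed-grid shortcut does not cover. The three nested loops give the O(n^3) cost.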

-- Commit Summary --

  • Replace Floyd-Warshall with simpler approach

-- File Changes --

M sklearn/cluster/som_.py (16)

-- Patch Links --

https://github.com/naught101/scikit-learn/pull/1.patch
https://github.com/naught101/scikit-learn/pull/1.diff


naught101 pushed a commit that referenced this pull request Jun 9, 2017
…scikit-learn#7838)

* initial commit for return_std

* initial commit for return_std

* adding tests, examples, ARD predict_std

* adding tests, examples, ARD predict_std

* a smidge more documentation

* a smidge more documentation

* Missed a few PEP8 issues

* Changing predict_std to return_std #1

* Changing predict_std to return_std scikit-learn#2

* Changing predict_std to return_std scikit-learn#3

* Changing predict_std to return_std final

* adding better plots via polynomial regression

* trying to fix flake error

* fix to ARD plotting issue

* fixing some flakes

* Two blank lines part 1

* Two blank lines part 2

* More newlines!

* Even more newlines

* adding info to the doc string for the two plot files

* Rephrasing "polynomial" for Bayesian Ridge Regression

* Updating "polynomia" for ARD

* Adding more formal references

* Another asked-for improvement to doc string.

* Fixing flake8 errors

* Cleaning up the tests a smidge.

* A few more flakes

* requested fixes from Andy

* Mini bug fix

* Final pep8 fix

* pep8 fix round 2

* Fix beta_ to alpha_ in the comments