Conversation

@jmitchell (Contributor)

A priority queue based on Python's heapq is significantly faster than FibHeap. The PriorityQueue abstract class establishes a standard interface used by both implementations, and minimal changes were required to astar.py and dijkstra.py to use it.
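The shared interface might look something like the following sketch. The method names (`insert`, `minimum`, `removeminimum`, `decreasekey`) are assumptions drawn from identifiers mentioned elsewhere in this thread, not necessarily the PR's exact signatures.

```python
from abc import ABC, abstractmethod

class PriorityQueue(ABC):
    """Hypothetical sketch of a common priority-queue interface
    usable by both FibPQ and HeapPQ. Method names are assumptions
    based on identifiers used in this discussion."""

    @abstractmethod
    def insert(self, priority, item):
        """Add an item with the given priority."""

    @abstractmethod
    def minimum(self):
        """Return (but do not remove) the lowest-priority entry."""

    @abstractmethod
    def removeminimum(self):
        """Remove and return the lowest-priority entry."""

    @abstractmethod
    def decreasekey(self, item, new_priority):
        """Lower an existing item's priority."""
```

With an abstract base class like this, astar.py and dijkstra.py only need to be handed a different concrete class to switch queue implementations.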

Performance analysis

FibPQ

[profiler output image: fibpq]

$ time python ./profile.py
Loading Image
Creating Maze
('Node Count:', 716516)
('Time elapsed:', 7.4103100299835205, '\n')
('Starting Solve:', 'A-star Search')
('Nodes explored: ', 462307)
('Path found, length', 8650)
('Time elapsed: ', 32.061322927474976, '\n')
Saving Image
Loading Image
Creating Maze
('Node Count:', 716516)
('Time elapsed:', 8.040678024291992, '\n')
('Starting Solve:', 'Breadth first search')
('Nodes explored: ', 512576)
('Path found, length', 8650)
('Time elapsed: ', 2.32486891746521, '\n')
Saving Image
Loading Image
Creating Maze
('Node Count:', 716516)
('Time elapsed:', 8.369711875915527, '\n')
('Starting Solve:', 'Depth first search')
('Nodes explored: ', 152711)
('Path found, length', 8650)
('Time elapsed: ', 0.6843249797821045, '\n')
Saving Image
Loading Image
Creating Maze
('Node Count:', 716516)
('Time elapsed:', 7.939746856689453, '\n')
('Starting Solve:', "Dijkstra's Algorithm")
('Nodes explored: ', 511299)
('Path found, length', 8650)
('Time elapsed: ', 39.57828378677368, '\n')
Saving Image
Loading Image
Creating Maze
('Node Count:', 716516)
('Time elapsed:', 9.189558982849121, '\n')
('Starting Solve:', 'Left turn only')
('Nodes explored: ', 582092)
('Path found, length', 582092)
('Time elapsed: ', 1.0989949703216553, '\n')
Saving Image
python profile.py  127.03s user 1.56s system 100% cpu 2:08.42 total

HeapPQ

[profiler output image: heappq]

$ time python ./profile.py
Loading Image
Creating Maze
('Node Count:', 716516)
('Time elapsed:', 7.687443017959595, '\n')
('Starting Solve:', 'A-star Search')
('Nodes explored: ', 462306)
('Path found, length', 8650)
('Time elapsed: ', 9.670147895812988, '\n')
Saving Image
Loading Image
Creating Maze
('Node Count:', 716516)
('Time elapsed:', 8.07496190071106, '\n')
('Starting Solve:', 'Breadth first search')
('Nodes explored: ', 512576)
('Path found, length', 8650)
('Time elapsed: ', 2.324921131134033, '\n')
Saving Image
Loading Image
Creating Maze
('Node Count:', 716516)
('Time elapsed:', 8.762192964553833, '\n')
('Starting Solve:', 'Depth first search')
('Nodes explored: ', 152711)
('Path found, length', 8650)
('Time elapsed: ', 0.719641923904419, '\n')
Saving Image
Loading Image
Creating Maze
('Node Count:', 716516)
('Time elapsed:', 8.0798499584198, '\n')
('Starting Solve:', "Dijkstra's Algorithm")
('Nodes explored: ', 511299)
('Path found, length', 8650)
('Time elapsed: ', 12.476689100265503, '\n')
Saving Image
Loading Image
Creating Maze
('Node Count:', 716516)
('Time elapsed:', 8.351932048797607, '\n')
('Starting Solve:', 'Left turn only')
('Nodes explored: ', 582092)
('Path found, length', 582092)
('Time elapsed: ', 0.9704020023345947, '\n')
Saving Image
python profile.py  76.46s user 1.72s system 100% cpu 1:17.91 total

Depends on Graphviz (http://www.graphviz.org/) and the BProfile
package (pip install bprofile).
This will make it easier to test different priority queue
implementations.
Based on runs of profile.py, the heapq implementation is significantly
faster.
@jmitchell (Contributor, author)

While testing more mazes I found a minor bug with the new priority queue. Please wait to merge until I've submitted a fix.

@mikepound (Owner)

This is interesting. I ignored the HeapPQ because I thought it didn't have a "decreasekey" function. It seems you've implemented one through a remove + add operation. It certainly looks faster. Wikipedia suggests that a fibheap is the optimal PQ structure, but I strongly suspect that depends on the way it's being used, and it also relies on my implementation being optimised, which it probably isn't!
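The "remove + add" emulation of decrease-key on a binary heap could look like the lazy-deletion pattern suggested in the heapq module docs: invalidate the old entry in place and push a fresh one. This is a sketch under that assumption, not the PR's actual code.

```python
import heapq
import itertools

class DecreaseKeyHeap:
    """Sketch: decrease-key on a binary heap via "remove + add".
    The old entry is marked removed (lazy deletion) and a new entry
    is pushed; removeminimum skips invalidated entries."""

    _REMOVED = "<removed>"  # sentinel marking an invalidated entry

    def __init__(self):
        self.heap = []
        self.entries = {}              # item -> its live heap entry
        self.counter = itertools.count()  # tie-breaker; items never compared

    def insert(self, priority, item):
        entry = [priority, next(self.counter), item]
        self.entries[item] = entry
        heapq.heappush(self.heap, entry)

    def decreasekey(self, item, new_priority):
        # "remove": invalidate the existing entry in place ...
        entry = self.entries.pop(item)
        entry[-1] = self._REMOVED
        # ... "add": push a replacement at the lower priority.
        self.insert(new_priority, item)

    def removeminimum(self):
        while self.heap:
            priority, _, item = heapq.heappop(self.heap)
            if item is not self._REMOVED:
                del self.entries[item]
                return priority, item
        raise KeyError("removeminimum from an empty queue")
```

The trade-off is that invalidated entries linger in the heap until popped, so the heap can grow beyond the number of live items; for graph search workloads that is usually cheaper than a true in-heap delete.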

Mazes with multiple solutions were failing when A-Star and Dijkstra
used the new HeapPQ implementation. Those issues are now
resolved.

Based on the current profile.py, FibPQ and HeapPQ are now neck and
neck.

    FibPQ:
    profile.py  228.22s user 2.53s system 100% cpu 3:50.53 total

    HeapPQ:
    profile.py  228.36s user 3.66s system 100% cpu 3:51.86 total

There are still other optimization opportunities, like refactoring so
client code doesn't call `unvisited.minimum` to get a copy of the
minimum entry and then immediately call `unvisited.removeminimum`. If
`removeminimum` were changed to return the removed entry, redundant
`minimum`, `remove`, and `insert` calls could be eliminated. Based on
the profiler output images, HeapPQ's bottleneck is by and large
`heappop` and `heappush`, so this change could be a major speed
improvement.
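The proposed refactor can be sketched in a few lines; the class and method names here are illustrative, not the PR's exact code.

```python
import heapq

class HeapPQ:
    """Minimal sketch of the refactor described above: removeminimum
    pops and returns the entry in one step, so client code no longer
    needs a preceding minimum() call."""

    def __init__(self):
        self.heap = []

    def insert(self, priority, item):
        heapq.heappush(self.heap, (priority, item))

    def removeminimum(self):
        # One heappop instead of a peek followed by a separate pop.
        return heapq.heappop(self.heap)

pq = HeapPQ()
pq.insert(2, "far")
pq.insert(1, "near")
entry = pq.removeminimum()  # replaces a minimum() + removeminimum() pair
```

Each extraction then touches the heap once instead of twice, which matters when `heappop`/`heappush` dominate the profile.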

The new priority queue, QueuePQ, is based on Python's
Queue.PriorityQueue. Underneath it's also implemented using heapq, but
adds synchronization primitives. This makes it slower than HeapPQ;
however, the synchronization features may be desirable in some
contexts.
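A QueuePQ along those lines might be a thin wrapper like this sketch (method names are assumptions); the locking inside `queue.PriorityQueue` is what makes it slower than bare heapq.

```python
import queue  # the Queue module in Python 2

class QueuePQ:
    """Sketch of a priority queue backed by queue.PriorityQueue.
    Internally it is also a heapq-managed list, but every put/get
    acquires a lock, so it is thread-safe at the cost of speed."""

    def __init__(self):
        self.pq = queue.PriorityQueue()

    def insert(self, priority, item):
        self.pq.put((priority, item))

    def removeminimum(self):
        return self.pq.get()
```

For a single-threaded maze solver the synchronization is pure overhead, but the same interface would be safe to share between producer and consumer threads.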
@jmitchell jmitchell changed the title Use a faster priority queue implementation WIP - Use a faster priority queue implementation Feb 26, 2017
@jmitchell (Contributor, author)

I fixed the bug and profiled more extensively on other inputs. The initial gains I saw on perfect2k.png apparently don't generalize. There's more context in the latest commit comment. I anticipate more refactoring will lead to performance improvements.

Fibonacci heaps, as I understand it, are theoretically ideal, but not necessarily in practice. Binary heaps, like heapq's implementation, can represent the tree in a contiguous array of memory, which is better for cache locality. In contrast, Fibonacci heaps are less consistently structured, so they necessarily involve more layers of indirection. I don't know which approach will prove fastest in this case after optimizations, but I'm really curious!
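The cache-locality point can be made concrete: a binary heap keeps the entire tree in one flat list, with parent/child relationships computed by index arithmetic (this is how CPython's heapq lays it out), while a Fibonacci heap links separately allocated nodes.

```python
# A binary heap lives in one contiguous list: for index i, the parent
# is at (i - 1) // 2 and the children at 2*i + 1 and 2*i + 2. No
# pointers, no per-node allocations.
heap = [1, 3, 2, 7, 4, 5]

def children(i):
    return 2 * i + 1, 2 * i + 2

# Verify the heap invariant: every parent is <= each of its children.
valid = all(
    heap[i] <= heap[c]
    for i in range(len(heap))
    for c in children(i)
    if c < len(heap)
)
```

Traversing down the heap therefore walks a single array with predictable strides, where a pointer-based heap chases references to scattered allocations.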

Rather than getting the minimum element and then calling
removeminimum, just have removeminimum return the removed
element. This gives a significant speed improvement to all priority
queue implementations. HeapPQ's relative gain in performance exceeds
FibPQ's, so it's now the default.

    FibPQ:
    profile.py  203.06s user 3.63s system 100% cpu 3:26.50 total

    HeapPQ:
    profile.py  130.79s user 2.84s system 100% cpu 2:13.50 total
@jmitchell (Contributor, author)

jmitchell commented Feb 26, 2017

The latest change is a nice speed improvement for all priority queues. It includes switching the default to the HeapPQ implementation. More info is in the commit message, and the latest profiler images are below.

FibPQ

profile.py 203.06s user 3.63s system 100% cpu 3:26.50 total

[profiler output image: fibpq]

HeapPQ

profile.py 130.79s user 2.84s system 100% cpu 2:13.50 total

[profiler output image: heappq]

@jmitchell jmitchell changed the title WIP - Use a faster priority queue implementation Use a faster priority queue implementation Feb 26, 2017
@mikepound mikepound merged commit 0e88d6d into mikepound:master Feb 26, 2017
@mikepound (Owner)

Also, having thought about it, even if the fib heap's decreasekey is very efficient, I'm not actually sure it's ever called on a perfect maze. There's never an alternative path to a node, so we're never in a situation where it's needed. It's pretty rare even in the braid mazes. A PQ that focuses on the speed of insert and removemin is likely to be faster.

@jmitchell jmitchell deleted the optimize-priority-queue branch February 27, 2017 00:55
