Priority queues
Transcription
Priority queues
Algorithms and Data Structures, Fall 2011 Priority queues Rasmus Pagh Based on slides by Kevin Wayne, Princeton Algorithms, 4th Edition Monday, October 24, 11 · Robert Sedgewick and Kevin Wayne · Copyright © 2002–2011 Priority queue Collections. Insert and delete items. Which item to delete? Stack. Remove the item most recently added. Queue. Remove the item least recently added. Symbol table. Remove any desired item. Priority queue. Remove the largest (or smallest) item. operation argument insert insert insert remove max insert insert insert remove max insert insert insert remove max return value size P Q E Q X A M X P L E P 1 2 3 2 3 4 5 4 5 6 7 6 contents (unordered) P P P P P P P P P P P E Q Q E E E E E E E E M E X X X M M M M A A A A A A A P M P P P L L L E E A sequence of operations on a priority queue Monday, October 24, 11 P P E E E A A A A A A A 2 Priority queue API Requirement. Generic items are Comparable. public class MaxPQ<Key extends Comparable<Key>> MaxPQ() create a priority queue MaxPQ(maxN) create a priority queue of initial capacity maxN void insert(Key v) insert a key into the priority queue Key max() return the largest key Key delMax() return and remove the largest key boolean isEmpty() int size() is the priority queue empty? number of entries in the priority queue API for a generic priority queue 3 Monday, October 24, 11 Priority queue applications • • • • • • • • • • Event-driven simulation. [customers in a line, colliding particles] Numerical computation. [reducing roundoff error] Data compression. [Huffman codes] Graph searching. [Dijkstra's algorithm, Prim's algorithm] Computational number theory. [sum of powers] Artificial intelligence. [A* search] Statistics. [maintain largest M values in a sequence] Operating systems. [load balancing, interrupt handling] Discrete optimization. [bin packing, scheduling] Spam filtering. [Bayesian spam filter] Generalizes: stack, queue (why?) 4 Monday, October 24, 11 Simulation example 1 % java CollisionSystem 100 5 Monday, October 24, 11 Simulation example 1 % java CollisionSystem 100 5 Monday, October 24, 11 Priority queue client example Challenge. Find the largest M items in a stream of N items (N huge, M large). • • Fraud detection: isolate $$ transactions. File maintenance: find biggest files or directories. Constraint. Not enough memory to store N items. sort key 6 Monday, October 24, 11 Priority queue client example Challenge. Find the largest M items in a stream of N items (N huge, M large). • • Fraud detection: isolate $$ transactions. File maintenance: find biggest files or directories. Constraint. Not enough memory to store N items. % more tinyBatch.txt Turing 6/17/1990 vonNeumann 3/26/2002 Dijkstra 8/22/2007 vonNeumann 1/11/1999 Dijkstra 11/18/1995 Hoare 5/10/1993 vonNeumann 2/12/1994 Hoare 8/18/1992 Turing 1/11/2002 Thompson 2/27/2000 Turing 2/11/1991 Hoare 8/12/2003 vonNeumann 10/13/1993 Dijkstra 9/10/2000 Turing 10/12/1993 Hoare 2/10/2005 644.08 4121.85 2678.40 4409.74 837.42 3229.27 4732.35 4381.21 66.10 4747.08 2156.86 1025.70 2520.97 708.95 3532.36 4050.20 sort key 6 Monday, October 24, 11 Priority queue client example Challenge. Find the largest M items in a stream of N items (N huge, M large). • • Fraud detection: isolate $$ transactions. File maintenance: find biggest files or directories. Constraint. Not enough memory to store N items. % more tinyBatch.txt Turing 6/17/1990 vonNeumann 3/26/2002 Dijkstra 8/22/2007 vonNeumann 1/11/1999 Dijkstra 11/18/1995 Hoare 5/10/1993 vonNeumann 2/12/1994 Hoare 8/18/1992 Turing 1/11/2002 Thompson 2/27/2000 Turing 2/11/1991 Hoare 8/12/2003 vonNeumann 10/13/1993 Dijkstra 9/10/2000 Turing 10/12/1993 Hoare 2/10/2005 644.08 4121.85 2678.40 4409.74 837.42 3229.27 4732.35 4381.21 66.10 4747.08 2156.86 1025.70 2520.97 708.95 3532.36 4050.20 % java TopM Thompson vonNeumann vonNeumann Hoare vonNeumann 5 < tinyBatch.txt 2/27/2000 4747.08 2/12/1994 4732.35 1/11/1999 4409.74 8/18/1992 4381.21 3/26/2002 4121.85 sort key 6 Monday, October 24, 11 Priority queue client example Challenge. Find the largest M items in a stream of N items (N huge, M large). Transaction data use a min-oriented pq type is Comparable pq contains largest M items Exercise: Theoretical analysis of the above, using properties of priority queue implementations. 7 Monday, October 24, 11 Priority queue client example Challenge. Find the largest M items in a stream of N items (N huge, M large). MinPQ<Transaction> pq = new MinPQ<Transaction>(); use a min-oriented pq while (StdIn.hasNextLine()) { String line = StdIn.readLine(); Transaction item = new Transaction(line); pq.insert(item); if (pq.size() > M) pq contains pq.delMin(); largest M items } Transaction data type is Comparable Exercise: Theoretical analysis of the above, using properties of priority queue implementations. 7 Monday, October 24, 11 How to implement a priority queue ADT? Answer 1: Special case of search tree API. Answer 2: Can do this simpler! Recursive definition: A priority queue (PQ) for n elements can be represented as: - The maximum element, and - two recursive PQs for remaining about (n-1)/2 elements each, if n>1. Problem session If one defines a PQ recursively as above, how can one implement: - max? - insert? - delMax? 8 Monday, October 24, 11 Complete binary tree For an elegant implementation of the previous idea, we need an alternative tree representation. Complete tree: Perfectly balanced, except for bottom level. complete tree with N = 16 nodes (height = 4) Property. Height of complete binary tree with N nodes is ⎣lg N⎦. Pf. Height only increases when N is a power of 2. 9 Monday, October 24, 11 Binary heap representations Binary heap. Array representation of a heap-ordered complete binary tree. Heap-ordered binary tree. • • Keys in nodes. No smaller than children’s keys. Array representation. • • i a[i] 0 - Take nodes in level order. 1 T 2 S 3 R S R 4 P 5 N 6 O 7 A P N O A 8 E 9 10 11 I H G E I T No explicit links needed! 1 8 E 9 I G T 3 R 2 S 4 P H 5 N 10 H 6 O 7 A 11 G Heap representations 10 Monday, October 24, 11 Binary heap properties Proposition. Largest key is a[1], which is root of binary tree. indices start at 1 Proposition. Can use array indices to move through tree. • • Parent of node at k is at k/2. Children of node at k are at 2k and 2k+1. i a[i] 0 - 1 T 2 S 3 R S R 4 P 5 N 6 O 7 A P N O A 8 E 9 10 11 I H G E I T 1 8 E 9 I G T 3 R 2 S 4 P H 5 N 10 H 6 O 7 A 11 G Heap representations 11 Monday, October 24, 11 Promotion in a heap Scenario. Node's key becomes larger key than its parent's key. To eliminate the violation: • • Exchange key in node with key in parent. Repeat until heap order restored. S private void swim(int k) { while (k > 1 && less(k/2, k)) { exch(k, k/2); k = k/2; } parent of node at k is at k/2 } R P 5 G E I H T O A violates heap order (larger key than parent) G 1 T 2 R S P 5 G E I H O A G Bottom-up heapify (swim) 12 Monday, October 24, 11 Insertion in a heap Insert. Add node at end, then swim it up. Cost. At most 1 + lg N compares. insert remo T R N P public void insert(Key x) { pq[++N] = x; swim(N); } E H I G O S A key to insert T R N P E H I G O add key to heap violates heap order S T swim up R S P E A N I G O sink down A H Heap operation 13 Monday, October 24, 11 Demotion in a heap Scenario. Node's key becomes smaller than one (or both) of its children's keys. To eliminate the violation: • • Exchange key in node with key in larger child. Repeat until heap order restored. private void sink(int k) { children of node while (2*k <= N) at k are 2k and 2k+1 { int j = 2*k; if (j < N && less(j, j+1)) j++; if (!less(k, j)) break; exch(k, j); k = j; } } violates heap order (smaller than a child) 2 R H 5 S P E T I N O A G T 2 5 P E R S I 10 H N O A G Top-down reheapify (sink) 14 Monday, October 24, 11 Delete the maximum in a heap Delete max. Exchange root with node at end, then sink it down. Cost. At most 2 lg N compares. insert remove the maximum T R N P public Key delMax() E I G { Key max = pq[1]; exch(1, N--); N sink(1); pq[N+1] = null; P return max; E I G } H O S A P E key to insert N I G R P add key to heap violates heap order S E N I G I R G O A remove node from heap T S S N violates heap order S prevent loitering O A A exchange keys with root H T P O H R swim up E R S T H key to remove T O R P sink down I A H E N H O A G Heap operations 15 Monday, October 24, 11 Priority queues with remove, key updates • • Some priority queue ADTs support the removal of a specific item - Need to specify a “handle” to identify the item Two ways to do this efficiently for a heap: - 1. Maintain a hash table that keeps track of the location of each item in the heap. - 2. Leave removed items in the heap. When deleting the maximum, check if the item has been deleted (using a hash table). • Some priority queue ADTs give special methods for changing (increasing/ decreasing) the priority of an item. - Changes that move items towards the front of the queue are fast! 16 Monday, October 24, 11 Priority queues implementation cost summary order-of-growth of number of comparisons for priority queue with N items implementation insert del max max unordered array 1 N N ordered array N 1 1 binary heap log N log N 1 d-ary heap logd N d logd N 1 Fibonacci 1 impossible 1 log N † 1 † amortized Gerth Brodal, Aarhus University 1 1 why impossible? 17 Monday, October 24, 11 Priority queues in Java • java.util.PriorityQueue – decreaseKey through remove + insert – From Java 1.5 API documentation: An unbounded priority queue based on a priority heap. … Implementation note: this implementation provides O(log(n)) time for the insertion methods …; linear time for the remove(Object) and contains(Object) methods; and constant time for the retrieval methods (peek, element, and size). • An implementation with better remove performance can be found e.g. in the JDSL library. 18 Monday, October 24, 11 Is number of comparisons the right measure of complexity? Hard drive Latency: 5ms (15 million cycles) SSD Latency: 50µs (150,000 cycles) 19 Monday, October 24, 11 Heaps and cache-efficiency Suppose - We have a heap of primitive data types (not objects), e.g. integers. - Our heap is stored in a memory with block size at least d (each memory access transfers a block of d words of memory). Memory transfers of binary heap - 1 transfer to access first log d levels - 1 transfer for each subsequent level - Total: 1+log(n)-log(d) Memory transfers of d-ary heap - 1 transfer to access each level - Total: log(n)/log(d) Problem: Different d needed to optimize for different block sizes. 20 Monday, October 24, 11 van Emde Boas (vEB) layout Presented in master’s thesis of Prokop (1999): Alternative way of laying out a complete binary tree in an array. Observation: A tree of depth d can be “cut up” into 2d/2+1 trees: - A top tree of the first d/2 levels - 2d “bottom” trees with root at level d/2+1 vEB layout: - First store top tree (recursively using vEB layout) - Then store each bottom tree (recursively using vEB layout) Standard heap layout (level order) van Emde Boas layout 21 Monday, October 24, 11 Properties of van Emde Boas layout 1. Can efficiently compute children/parent of node, given a suitable lookup table. 2. The number of memory transfers at any level of the memory hierarchy in a root-to-leaf traversal is within a factor 2 of optimal. “Cache-oblivious algorithms” make efficient use of caches without considering cache parameters. There exits heaps (such as the funnel heap) that are theoretically better than a binary heap with vEB layout. 22 Monday, October 24, 11 ‣ ‣ ‣ ‣ ‣ API elementary implementations binary heaps heapsort event-driven simulation 23 Monday, October 24, 11 11) exch(1, 10) sink(1, 9) S Heapsort R L E E 11) P A X L Basic plan for in-place Msort.E A A E O E E in arbitrary order 4 A 8 heap construction 11) exch(1, 8) sink(1, 7) S X L 2 8 E E 11) 4 A R 9 5 10 S E E (heap-ordered) Monday, October 24, 11 T A M RP 7 (in L6 Xplace) AE E 10 A L X E T X AE RMM LM L 4 8 M RM S exch(1,ex sink(1,si AAP XRO E E TE EX starting point (heap-ordered) SPO T LO 9 A T 3R EP R SE 5 P SE A ex exch(1,s sink(1, 1S O 2 P AP A SX 10 E TE L ML 6X RO 11E X X sink(3, 11) exch(1, 10) 7A AP A R result (sorted) sink(4, 11) 7 RSE LTP sorted R resultE L XO 11 sink(4, 11) exch(1, 11) sink(1, 10) (in place) RE E OT O S M SE 9 6 EM 3 EX L T SP RM starting point (arbitrary order) sortdown E R 11 A 5 LT build a max-heap E M RP X T SL starting point (arbitrary order) O R S A O exch(1, 2) 11) sink(5, sink(1, 1) P 3 exch(1, 7) sink(5, 11) sink(1, 6) X L O O M T 1 SE 1 2 X T X T exch(1, 3) sink(1, 2) start with array of keys S M P O heap construction E L M S R P A R E L X T • Create max-heap with all N keys. the maximum key. exch(1, 9) •S Repeatedly remove R sink(1, 8) L E R O X E exch(1, 4) sink(1, 3) S S S 24 e exch(1,s starting point (heap-ordered) starting point (arbitrary order) Heapsort: heap construction sink(5, 11) exch(1, 11) sink(1, 10) S O L First pass. Build heap using bottom-up method. for (int k = N/2; k >= 1; k--) sink(a, k, N); sink(4, 11) A X 4 8 5 T 9 S O 10 6 E 3 R X 7 A 11 E L P M starting point (arbitrary order) sink(5, 11) E R R T OO L E A S R X T L O A X SR P E L R T S O E L E SM R A E T O X P A A AA A E X T S P L E SM R E E exch(1, 7) sink(1, 6) exch(1, 4) M E sink(1, 3) XS OE ET EX result (heap-ordered) T E S E M A RE M O R exch(1, 8) sink(1, 7) exch(1, 5) O L sink(1, 4) R LL PO MM E A A X E E L SA R TP A X X S X R LE sink(1, 11) exch(1, 10) sink(1, 9) S L L L P R T P M M O M A X sink(4, 11) T P P E E P M A E X T E exch(1, 9) 8) exch(1,sink(1, 6) M sink(1, 5) P E E P E E O starting point (heap-ordered) sink(2, 11) S M M R L O T S X exch(1, 11)T sink(1, 10) S O T L sortdown sink(3, 11) 2 R O M heap construction 1 S P E E X E E A R exch(1, 10) sink(1, 9) S L P L R T S O M O M A X E E P M P R T T E P T O X P X Heapsort: constructing (left) and sorting down (right) a sink(3, 11) Monday, October 24, 11 O exch(1, 9) sink(1, 8) S X exch(1, 3) sink(1, 2) R P E 25 E A E Heapsort: sortdown Second pass. • • sortdown heap construction 1 S R Remove the maximum, oneT at a Etime. A X 2 3 O 4 5 6 9 10 11 sink(5, 11) L E P M while (N > 1) { sink(4, 11) exch(a, 1, N--); O L T sink(a, 1, N); E P M } E exch(1, 10) sink(1, 9) E exch(1, 9) sink(1, 8) S L E E sink(2, 11) T L P E E sink(1, 11) X L R A 2 T E X 4 X T P 8 R E E O result (heap-ordered) L 9 S A 3 E 5 10 P O 1 Heapsort: constructing (left) and sorting down (right) a heap Monday, October 24, 11 M S E L S E O A R A L R M S A E exch(1, 7) sink(1, 6) X T S P O E X T S M E L E exch(1, 2) sink(1, 1) P M R T M A R E L R O X A E exch(1, 8) sink(1, 7) S X T S P O A X T S M M E L E exch(1, 3) sink(1, 2) R O E L R P A R A E X T E X T P L X T A R P O exch(1, 4) sink(1, 3) S O M O O A X M S R P R E A X E L E A R P O X T S R S L E E exch(1, 5) sink(1, 4) T O E A E E O starting point (heap-ordered) M S A R P A X sink(3, 11) M M R T P L P M L S exch(1, 11) sink(1, 10) S O M T 7 E L P Leave in array, instead ofMstarting null ing out. point (arbitrary order) 8 exch(1, 6) sink(1, 5) X M 6 O E 7 P 11 X T result (sorted) 26 Heapsort: mathematical analysis Proposition. Heap construction uses fewer than 2 N compares and exchanges. Proposition. Heapsort uses at most 2 N lg N compares and exchanges. Significance. In-place sorting algorithm with N log N worst-case. • • • Mergesort: no, linear extra space. in-place merge possible, not practical Quicksort: no, quadratic time in worst case. N log N worst-case quicksort possible, Heapsort: yes! not practical Bottom line. Heapsort is optimal for both time and space, but: • • • Inner loop longer than quicksort’s. Makes poor use of cache memory. Not stable. 27 Monday, October 24, 11 Conclusion Heaps give a more space- and time-efficient implementation of the priority queue ADT. - Yields a very space efficient sorting algorithm. - Cache-efficiency can be achieved by changing the layout. Be careful when deleting items from a heap - can take linear time in some implementations! We will be using priority queues as building blocks in graph algorithms. 28 Monday, October 24, 11