HW #1: Brute Force Search Algorithms Sample Solution Prof. Nathan Sturtevant
University of Denver
Denver, Colorado
sturtevant@cs.du.edu

Abstract

This document describes the results for the first homework assignment in COMP-4704-1, Fall 2010. In particular, we perform an analysis of three brute-force search algorithms: BFS, DFS and DFID. We compare the performance of these algorithms on a grid-based domain, an n-ary tree, and the sliding-tile puzzle. Our results confirm the predicted performance from lecture.

Introduction

In this homework we study the performance of three brute-force algorithms: breadth-first search (BFS), depth-first search (DFS) and depth-first iterative deepening (DFID). We analyze these algorithms on three domains, which we describe in more detail in the next section. Our results confirm the analysis of the algorithms seen in class and give insight into which algorithms work best on different types of domains.

Background

This section provides background information about the algorithms and domains analyzed.

Algorithm 1 BFS(s_start)
 1: vector ← s_start
 2: while vector is not empty do
 3:   next ← pop front from vector
 4:   for each successor s_i of next do
 5:     if s_i is not in CLOSED then
 6:       Add s_i to CLOSED
 7:       Put s_i on the back of vector
 8:     end if
 9:   end for
10: end while

Algorithm 2 DFS(s_start)
1: vector ← s_start
2: while vector is not empty do
3:   next ← pop front from vector
4:   for each successor s_i of next do
5:     Put s_i on the front of vector
6:   end for
7: end while

Algorithms

The first algorithm analyzed is BFS. BFS expands nodes in order of distance (counted in edges, not edge cost) from the start state. BFS pseudo-code can be found in Algorithm 1. BFS uses a simple queue to store nodes waiting to be expanded and a closed list to perform duplicate detection. As a result, in an exponential domain with branching factor $b$ and depth $d$, BFS takes $O(b^d)$ time and space. The next algorithm analyzed is DFS, which always expands the most recently generated (deepest) state first.
As a result, DFS is not guaranteed to be complete in a domain with cycles and may run out of stack space in a large domain. Pseudo-code for DFS can be found in Algorithm 2. Note that, apart from duplicate detection, the only difference between DFS and BFS comes in line 5, where the successors are put on the front of the vector instead of the back. (Compare with BFS in Algorithm 1, line 7.) If it finds a solution at depth $d$, DFS will only use $O(d)$ space. As the algorithm is not guaranteed to be complete, the running time depends on the properties of the state space.

Copyright (c) 2010, Nathan Sturtevant. All rights reserved.

The final algorithm we analyze is DFID. DFID performs multiple depth-first searches, but bounds each search by a given depth. For a brute-force search algorithm, DFID is asymptotically both time and space optimal, taking $O(b^d)$ time and $O(d)$ space. Pseudo-code for DFID is in Algorithm 3. We show the depth-limited DFS using a recursive function, while DFS and BFS used a queue.

Algorithm 3 DFID(s_start, d)
1: for i = 1 . . . d do
2:   LIMITED DFS(s_start, i, 0)
3: end for

LIMITED DFS(s, d, d_curr)
1: if d_curr > d then
2:   return
3: end if
4: for each successor s_i of s do
5:   LIMITED DFS(s_i, d, d_curr + 1)
6: end for

Domains

We use three domains for analysis: a grid-based domain, an n-ary tree, and the sliding-tile puzzle. These are illustrated in Figure 1.

The first domain is a directed graph aligned to a grid with length and width $k$. The start location is in the lower-left corner. From each state you can travel right, up, or diagonally right and up. This domain grows polynomially in $k$; there are $k^2$ total states. This domain has many short cycles, in the sense that many distinct short paths lead to the same state.

The second domain is an n-ary tree. The example shown is a binary tree, but any branching factor and depth can be specified for the tree. This domain grows exponentially in the depth; there are $O(n^d)$ states for an n-ary tree with depth $d$. An n-ary tree has no cycles in it.

The third and final domain is the sliding tile puzzle.
In the sliding-tile puzzle the blank (labelled as 0) can swap with any adjacent tile. For a sliding-tile puzzle with $t$ tiles (counting the blank), there are $t!/2$ possible states. The sliding tile puzzle grows exponentially, and also contains cycles.

Experimental Results

In this section we report the results of running each algorithm on each domain. Additionally, for BFS we run with and without duplicate detection. We have chosen to present results by domain, as this best illustrates the differences between the algorithms.

Grid-based movement

size   BFS          DFS         DFID
   1     1            1            1
   2     4            6           14
   3     9           31          103
   4    16          160          623
   5    25          841        3,632
   6    36        4,494       21,074
   7    49       24,319      122,475
   8    64      132,864      713,471
   9    81      731,281    4,164,832
  10   100    4,048,726   24,351,542
  11   121   22,523,359  142,562,799
  12   144  125,797,984  835,435,855

Figure 2: Nodes expanded in the grid by BFS, DFS and DFID with varying grid size.

In Figure 2 we show the total number of nodes expanded by BFS, DFS and DFID as we increase the size of the grid from 1 to 12. The results with BFS are with duplicate detection. Without duplicate detection BFS and DFS expand the same number of nodes. We note that BFS expands exactly $k^2$ nodes for a given grid size. DFS and DFID, however, turn a polynomial domain into an exponential domain. For the case of $k = 12$, the maximum path length in the grid is 24. Solving $b^{24} = 125{,}797{,}984$ gives an estimate of $b$ as 2.18. For this domain, the presence of many small cycles means that duplicate detection is essential to efficiently explore the search space.

N-ary tree

Algorithm          n-ary(2, 10)  n-ary(3, 10)  n-ary(4, 10)
BFS                2047          88573         1398101
DFS                2047          88573         1398101
DFID               4082          132852        1864129
(b/(b-1))^2 b^d    4096          132860        1864135

Figure 3: Nodes expanded in the n-ary tree by various algorithms with varying branching factor.

We present results for each algorithm with an n-ary tree in Figure 3. Because there are no cycles in the tree, BFS and DFS expand exactly the same number of nodes, regardless of whether BFS performs duplicate detection. DFID, because it performs multiple iterations, expands more nodes than both of the other algorithms. For these parameters, the predicted number of nodes expanded by BFS and DFS matches exactly. The last row of the table is the theoretical number of nodes expanded by DFID, $\left(\frac{b}{b-1}\right)^2 b^d$. Although the actual number of nodes does not match exactly, the formula closely predicts the number of nodes expanded by DFID.

The n-ary tree, with no cycles, is well suited for algorithms that do not perform duplicate detection. If we are simply trying to traverse the whole domain, DFS is the best approach, as it will expand each node exactly once. If we do not know the depth of the solution, DFID will likely be the best algorithm, as BFS will run out of memory (due to both the open and closed list).

Sliding Tile Puzzle

In the sliding tile puzzle we were unable to run DFS. Even when we checked for duplicates on the path explored so far, our recursive implementation ran out of stack space. This is because a DFS will recursively explore almost all states in the state space. DFID, however, is more successful. We present detailed sliding-tile puzzle results here.

First, we look at the number of nodes expanded by BFS on the 3x3 and 3x4 sliding tile puzzles in Figures 4 and 5. In the left portion of Figure 4 we show the number of states at each level of the sliding-tile puzzle when we perform duplicate detection. BFS is easily able to explore the entire state space. Without duplicate detection we aborted the search after 15 ply, as it was obvious that BFS was going to quickly run out of memory. Note that without duplicate detection, at even depths the blank will always be on the side of the puzzle, producing a branching factor of 3.00. At odd depths, the blank will either be in the middle with a branching factor of 4.0, or in the corner with a branching factor of 2.0.
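The split between middle and corner positions at odd depths can be worked out from the measured ratio. As a short worked step (the symbol $p$ is introduced here only for illustration and denotes the fraction of odd-depth states with the blank in the middle), take the ratio 128/48 from the right half of Figure 4, which the table rounds to 2.67:

```latex
% p: assumed fraction of odd-depth states with the blank in the middle
\begin{align*}
4p + 2(1 - p) &= \frac{128}{48} = \frac{8}{3} \\
2 + 2p &= \frac{8}{3} \\
p &= \frac{1}{3}, \qquad 1 - p = \frac{2}{3}
\end{align*}
```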
This lets us compute that the blank will be in the middle 1/3 of the time and in the corner 2/3 of the time, even though the ratio of corner to middle states is higher.

Figure 5 has results for the 3x4 sliding tile puzzle. A memory-efficient implementation should be able to finish a breadth-first search in this domain (with only 239 million states); however, we ran out of memory at approximately depth 31 after expanding 35 million states. Without duplicate detection we again aborted the search at depth 15. These results are just for BFS.

[Figure 1: Illustration of domains. Panels: n-ary tree (for n=2), sliding-tile puzzle, grid-based domain.]

We now compare BFS and DFID in the 3x3 and 3x4 sliding tile puzzle in Figure 6. With these results we can see that DFID expands far more nodes than BFS because it does not perform duplicate detection. However, it is able to explore an extremely large number of nodes without running out of memory. After thirty minutes it was able to finish an iteration of depth 31 with 4 billion nodes. It took about an hour to complete the depth 32 iteration. In this case we were not searching for a particular goal state, so we just allowed the search to continue until 30 minutes ran out. If we were looking for a particular state at large depths, DFID would be able to solve larger problems than BFS.

Conclusions

In this homework we have analyzed three brute-force search algorithms. We have seen that in polynomial domains such as the grid, BFS is clearly the best algorithm to use, as its duplicate detection keeps the number of explored states from growing exponentially. DFS works better in exponential domains, but will have trouble finding optimal solutions unless we know the solution depth already. Additionally, DFS can run out of stack space when exploring large connected state spaces. In these instances DFID is the best algorithm to use, as it will not run out of memory and is guaranteed to find optimal solutions.
Appendix

Sample code for BFS is found in Figure 7. Other algorithms are implemented in a similar manner.

Depth  BFS (3x3)  DFID (3x3)  BFS (3x4)  DFID (3x4)
0      1          1           1          1
1      3          4           3          4
2      7          11          7          11
3      15         26          16         27
4      31         57          36         63
5      51         108         73         136
6      90         199         136        273
7      152        358         258        538
8      268        653         490        1058
9      420        1136        921        2069
10     706        1995        1702       4017
11     1102       3450        3094       7767
12     1850       6097        5588       15024
13     2874       10468       10030      29041
14     4767       18287       17884      56058
15     7279       31406       31783      108218
16     11764      55125       55998      208997
17     17402      94488       97800      403533
18     26931      165139      168967     778735
19     37809      283234      288855     1502836
20     54802      496217      487218     2901277
21     71912      850508      810424     5601010
22     95864      1487415     1326202    10810394
23     116088     2550294     2137202    20863933
24     140135     4465117     3385213    40272489
25     155713     7653760     5270492    77739799
26     170273     13390043    8052888    150053400
27     176547     22955978    12062610   289621017
28     180457     40181217    17683964   559024410
29     181217     68879028    25331836   1079056186
30     181438     120521983   35397636   2082809349
31     181440     206615422   (memory)   4020190533
32     -          -           -          7759743423

Figure 6: Nodes expanded at each level in the 3x3 and 3x4 sliding-tile puzzle by BFS and DFID.
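As a rough companion to the grid experiment in Figure 2, the following is a minimal Python sketch of BFS with a closed list and DFS without duplicate detection on the k x k grid. It is not the course's sample code; the function names (`successors`, `bfs_expansions`, `dfs_expansions`) and the choice to represent a state as an (x, y) tuple are assumptions made for illustration.

```python
from collections import deque


def successors(state, k):
    """Right, up, and diagonal moves that stay inside a k x k grid."""
    x, y = state
    moves = [(x + 1, y), (x, y + 1), (x + 1, y + 1)]
    return [(nx, ny) for nx, ny in moves if nx < k and ny < k]


def bfs_expansions(k):
    """BFS with a closed list; returns the number of nodes expanded."""
    start = (0, 0)
    queue = deque([start])
    closed = {start}  # assumption: the start state is also marked closed
    expanded = 0
    while queue:
        state = queue.popleft()
        expanded += 1
        for succ in successors(state, k):
            if succ not in closed:
                closed.add(succ)
                queue.append(succ)
    return expanded


def dfs_expansions(k):
    """DFS without duplicate detection; returns the number of nodes expanded."""
    stack = [(0, 0)]
    expanded = 0
    while stack:
        state = stack.pop()
        expanded += 1
        stack.extend(successors(state, k))
    return expanded


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, bfs_expansions(k), dfs_expansions(k))
```

For grid sizes 1 through 6 this sketch reproduces the BFS and DFS columns of Figure 2. Larger sizes work the same way, but the DFS count grows exponentially, so the run time grows with it.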
       With duplicate detection  Without duplicate detection
Depth  Sum       Level     b     Sum       Level     b
0      1         1         inf   1         1         inf
1      3         2         2.00  3         2         2.00
2      7         4         2.00  9         6         3.00
3      15        8         2.00  25        16        2.67
4      31        16        2.00  73        48        3.00
5      51        20        1.25  201       128       2.67
6      90        39        1.95  585       384       3.00
7      152       62        1.59  1609      1024      2.67
8      268       116       1.87  4681      3072      3.00
9      420       152       1.31  12873     8192      2.67
10     706       286       1.88  37449     24576     3.00
11     1102      396       1.38  102985    65536     2.67
12     1850      748       1.89  299593    196608    3.00
13     2874      1024      1.37  823881    524288    2.67
14     4767      1893      1.85  2396745   1572864   3.00
15     7279      2512      1.33  6591049   4194304   2.67
16     11764     4485      1.79
17     17402     5638      1.26
18     26931     9529      1.69
19     37809     10878     1.14
20     54802     16993     1.56
21     71912     17110     1.01
22     95864     23952     1.40
23     116088    20224     0.84
24     140135    24047     1.19
25     155713    15578     0.65
26     170273    14560     0.93
27     176547    6274      0.43
28     180457    3910      0.62
29     181217    760       0.19
30     181438    221       0.29
31     181440    2         0.01

Sum = cumulative nodes, Level = nodes at this depth, b = ratio of this level to the previous level.

Figure 4: Nodes expanded at each level in the 3x3 sliding-tile puzzle with BFS and duplicate detection (left) and without duplicate detection (right).
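The final cumulative count in the left half of Figure 4 can be cross-checked against the t!/2 state count given in the domain description. The following is a minimal Python check, assuming t counts every board position including the blank; the helper name `reachable_states` is hypothetical.

```python
from math import factorial


def reachable_states(cells):
    """Reachable sliding-tile states for a board with `cells` positions (t!/2)."""
    return factorial(cells) // 2


print(reachable_states(9))   # 3x3 puzzle: 181440
print(reachable_states(12))  # 3x4 puzzle: 239500800
```

The first value matches the last cumulative entry for the 3x3 puzzle, and the second is the "239 million states" quoted for the 3x4 puzzle.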
       With duplicate detection  Without duplicate detection
Depth  Sum       Level     b     Sum       Level     b
0      1         1         inf   1         1         inf
1      3         2         2.00  3         2         2.00
2      7         4         2.00  9         6         3.00
3      16        9         2.25  26        17        2.83
4      36        20        2.22  79        53        3.12
5      73        37        1.85  236       157       2.96
6      136       63        1.70  719       483       3.08
7      258       122       1.94  2169      1450      3.00
8      490       232       1.90  6595      4426      3.05
9      921       431       1.86  19956     13361     3.02
10     1702      781       1.81  60591     40635     3.04
11     3094      1392      1.78  183560    122969    3.03
12     5588      2494      1.79  556933    373373    3.04
13     10030     4442      1.78  1688075   1131142   3.03
14     17884     7854      1.77  5120045   3431970   3.03
15     31783     13899     1.77  15522426  10402381  3.03
16     55998     24215     1.74
17     97800     41802     1.73
18     168967    71167     1.70
19     288855    119888    1.68
20     487218    198363    1.65
21     810424    323206    1.63
22     1326202   515778    1.60
23     2137202   811000    1.57
24     3385213   1248011   1.54
25     5270492   1885279   1.51
26     8052888   2782396   1.48
27     12062610  4009722   1.44
28     17683964  5621354   1.40
29     25331836  7647872   1.36
30     35397636  10065800  1.32
31     (memory)  (memory)  -

Sum = cumulative nodes, Level = nodes at this depth, b = ratio of this level to the previous level.

Figure 5: Nodes expanded at each level in the 3x4 sliding-tile puzzle with BFS and duplicate detection (left) and without duplicate detection (right).
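Returning to the n-ary tree results in Figure 3, the DFID row can be approximated with a short Python sketch. It assumes an implicit tree (no stored nodes), iterations with depth limits 1 through d, and that a node sitting exactly at the depth limit still counts as expanded. These conventions and the function names (`limited_dfs`, `dfid_expansions`, `predicted`) are illustrative assumptions, not the original implementation. Under these assumptions the binary tree of depth 10 gives 4082, matching Figure 3; for other branching factors the count may differ slightly from the reported values, depending on how the original implementation counted expansions.

```python
def limited_dfs(b, tree_depth, limit, depth_curr=0):
    """Count nodes expanded by a depth-limited DFS on an implicit b-ary tree."""
    if depth_curr > limit:
        return 0
    expanded = 1  # this node is expanded
    if depth_curr < tree_depth:  # leaves of the tree have no successors
        for _ in range(b):
            expanded += limited_dfs(b, tree_depth, limit, depth_curr + 1)
    return expanded


def dfid_expansions(b, tree_depth):
    """Total expansions over all iterations with limits 1..tree_depth."""
    return sum(limited_dfs(b, tree_depth, limit)
               for limit in range(1, tree_depth + 1))


def predicted(b, d):
    """Theoretical estimate (b / (b - 1))^2 * b^d from Figure 3."""
    return (b / (b - 1)) ** 2 * b ** d


print(limited_dfs(2, 10, 10))  # full traversal of the binary tree: 2047
print(dfid_expansions(2, 10))  # 4082
print(predicted(2, 10))        # 4096.0
```

A single iteration whose limit equals the tree depth visits every node once, which is why the DFS and BFS rows of Figure 3 agree; the extra DFID work comes entirely from the shallower iterations.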