HW #1: Brute Force Search Algorithms Sample Solution Prof. Nathan Sturtevant

HW #1: Brute Force Search Algorithms
Sample Solution
Prof. Nathan Sturtevant
University of Denver
Denver, Colorado
sturtevant@cs.du.edu
Abstract
This document describes the results for the first homework
assignment in COMP-4704-1, Fall 2010. In particular, we
perform an analysis of three brute-force search algorithms,
BFS, DFS and DFID. We compare the performance of these
algorithms on a grid-based domain, an n-ary tree, and the
sliding-tile puzzle. Our results confirm the predicted performance from lecture.
Introduction
In this homework we study the performance of brute-force
algorithms. We study breadth-first search (BFS), depth-first
search (DFS) and depth-first iterative deepening (DFID). We
analyze these algorithms on three domains, which we describe in more detail in the next section. Our results confirm
the analysis of the algorithms seen in class and give insight
into which algorithms work best on different types of domains.
Background
This section provides background information about the algorithms and domains analyzed.
Algorithm 1 BFS(s_start)
1: vector ← s_start
2: while vector is not empty do
3:     next ← pop front from vector
4:     for each successor s_i of next do
5:         if s_i is not in CLOSED then
6:             Add s_i to CLOSED
7:             Put s_i on the back of vector
8:         end if
9:     end for
10: end while
Algorithm 2 DFS(s_start)
1: vector ← s_start
2: while vector is not empty do
3:     next ← pop front from vector
4:     for each successor s_i of next do
5:         Put s_i on the front of vector
6:     end for
7: end while
Algorithms
The first algorithm analyzed is BFS. A BFS expands nodes
in order of distance (counted by edges, not edge cost) from
the start state. BFS pseudo-code can be found in Algorithm 1. BFS has a simple queue for storing nodes waiting to
be expanded and has a closed list to perform duplicate detection. As a result, in an exponential domain with branching
factor b and depth d, BFS takes $O(b^d)$ time and space.
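As a rough illustration of Algorithm 1, the following Python sketch uses a FIFO queue and a closed set. The small `graph` dictionary is a hypothetical example, and marking the start state as closed before the loop is an assumption of this sketch rather than something Algorithm 1 states.

```python
from collections import deque

def bfs(start, successors):
    """Breadth-first search with duplicate detection.

    Returns the states in the order they were expanded.
    """
    queue = deque([start])      # FIFO queue of states waiting to be expanded
    closed = {start}            # assumption: the start state is closed up front
    order = []
    while queue:
        state = queue.popleft()             # pop from the front
        order.append(state)
        for succ in successors(state):
            if succ not in closed:          # skip states already seen
                closed.add(succ)
                queue.append(succ)          # put on the back
    return order

# Hypothetical four-state graph with a cycle (b -> a) and two paths to d.
graph = {"a": ["b", "c"], "b": ["d", "a"], "c": ["d"], "d": []}
print(bfs("a", graph.__getitem__))   # ['a', 'b', 'c', 'd']
```

Each state is expanded once even though `d` is reachable along two paths and `b` points back to `a`; the price is that `closed` grows with the number of states visited.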
The next algorithm analyzed is DFS, which expands states
in order of depth. As a result, DFS is not guaranteed to be
complete in a domain with cycles and may run out of
stack space in a large domain. Pseudo-code for DFS can be
found in Algorithm 2. Note that the only difference between
DFS and BFS comes in line 5 where the successors are put
on the front of the vector instead of the back. (Compare
with BFS in Algorithm 1 line 7.) If it finds a solution at
depth d, DFS will only use O(d) space. As the algorithm is
not guaranteed to be complete, the running time depends on
the properties of the state space.
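The space behaviour of DFS can be sketched in Python with an explicit stack. This is an illustration under stated assumptions: the end of a Python list plays the role of the "front" of the vector, and `tree_successors` is a hypothetical helper that generates a complete n-ary tree of bounded depth so that the search terminates.

```python
def tree_successors(node, n, max_depth):
    """Children of a node in a complete n-ary tree; a node is (depth, index)."""
    depth, index = node
    if depth == max_depth:
        return []
    return [(depth + 1, index * n + child) for child in range(n)]

def dfs(start, successors):
    """Depth-first search without duplicate detection.

    Returns (nodes expanded, largest size the stack reached).
    """
    stack = [start]
    expanded = 0
    largest = 1
    while stack:
        node = stack.pop()                  # take the most recently added state
        expanded += 1
        stack.extend(successors(node))      # successors go on the same end
        largest = max(largest, len(stack))
    return expanded, largest

expanded, largest = dfs((0, 0), lambda s: tree_successors(s, 2, 10))
print(expanded, largest)   # 2047 11
```

On a binary tree of depth 10 the sketch expands all 2047 nodes while the stack never holds more than 11 entries. Because every successor is pushed at once, this variant stores up to d(b - 1) + 1 states, which is still linear in d for a fixed branching factor.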
Copyright © 2010, Nathan Sturtevant. All rights reserved.
The final algorithm we analyze is DFID. DFID performs
multiple depth-first searches, but bounds each search by a
given depth. For a brute-force search algorithm, DFID is
both time and space optimal, taking $O(b^d)$ time and $O(d)$
space. Pseudo-code for DFID is in Algorithm 3. We show
the depth-limited DFS using a recursive function, while DFS
and BFS used a queue.
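A minimal recursive sketch of DFID in Python is given below. Two points are assumptions of this sketch: the depth test is written as `depth >= limit`, and cost is measured by counting every call to the depth-limited search. `binary_children` is a hypothetical successor function for an unbounded binary tree.

```python
def limited_dfs(state, limit, depth, successors, counter):
    """Depth-limited DFS; counter[0] counts every call (an assumed cost measure)."""
    counter[0] += 1
    if depth >= limit:          # assumed form of the depth test
        return
    for succ in successors(state):
        limited_dfs(succ, limit, depth + 1, successors, counter)

def dfid(start, max_depth, successors):
    """Run depth-limited searches with limits 1..max_depth; return total calls."""
    counter = [0]
    for limit in range(1, max_depth + 1):
        limited_dfs(start, limit, 0, successors, counter)
    return counter[0]

def binary_children(node):
    """Children in an unbounded binary tree, nodes numbered heap-style from 1."""
    return [2 * node, 2 * node + 1]

print(dfid(1, 10, binary_children))   # 4082
```

The recursion depth never exceeds the current limit, so memory stays linear in d, while the repeated shallow iterations roughly double the work of a single pass over a binary tree of depth 10 (2047 nodes).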
Domains
We use three domains for analysis: a grid-based domain, an n-ary tree, and the sliding-tile puzzle. These are illustrated
in Figure 1.
The first domain is a directed graph aligned to a grid with
length and width k. The start location is in the lower-left
hand corner. From each state you can either travel right, up,
or diagonally right and up. This domain grows polynomially
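in k; there are $k^2$ total states. Although its edges are directed, this domain has many short
cycles in the undirected sense: many distinct paths lead to the same state.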
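The effect of these repeated paths can be estimated without running a search. The Python sketch below assumes that a search without duplicate detection expands a state once for every distinct path that reaches it, and counts those paths with dynamic programming; the function name `dfs_expansions` is hypothetical.

```python
def dfs_expansions(k):
    """Expansions by a search without duplicate detection on the k x k grid.

    Assumes each state is expanded once per distinct path from the start.
    """
    paths = [[0] * k for _ in range(k)]
    paths[0][0] = 1                                  # start: lower-left corner
    for x in range(k):
        for y in range(k):
            if x > 0:
                paths[x][y] += paths[x - 1][y]       # arrived by moving right
            if y > 0:
                paths[x][y] += paths[x][y - 1]       # arrived by moving up
            if x > 0 and y > 0:
                paths[x][y] += paths[x - 1][y - 1]   # arrived diagonally
    return sum(sum(row) for row in paths)

print([dfs_expansions(k) for k in range(1, 7)])
# [1, 6, 31, 160, 841, 4494]
```

These first six values agree with the DFS column of Figure 2, whereas the number of distinct states is only 1, 4, 9, 16, 25 and 36.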
Algorithm 3 DFID(s_start, d)
1: for i = 1 ... d do
2:     LIMITED DFS(s_start, i, 0)
3: end for

LIMITED DFS(s, d, d_curr)
1: if d_curr ≥ d then
2:     return
3: end if
4: for each successor s_i of s do
5:     LIMITED DFS(s_i, d, d_curr + 1)
6: end for

The second domain is an n-ary tree. The example shown
is a binary tree, but any branching factor and depth can be
specified for the tree. This domain grows exponentially in
the depth; there are $O(n^d)$ states for an n-ary tree with depth
d. An n-ary tree has no cycles in it.
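The exact size of such a tree is a geometric sum, $(n^{d+1} - 1)/(n - 1)$. A short Python check, assuming the root is at depth 0 and the leaves at depth d (`tree_size` is a hypothetical helper name):

```python
def tree_size(n, d):
    """Nodes in a complete n-ary tree with the root at depth 0 and leaves at depth d."""
    return sum(n ** level for level in range(d + 1))

for n in (2, 3, 4):
    print(n, tree_size(n, 10))
# 2 2047
# 3 88573
# 4 1398101
```

These are the node counts reported for BFS and DFS in Figure 3; the value 1398101 corresponds to a branching factor of 4.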
The third and final domain is the sliding-tile puzzle. In the
sliding-tile puzzle the blank (labelled as 0) can swap with
any adjacent tile. For a sliding-tile puzzle with t tiles (counting the blank), there
are t!/2 possible states. The sliding-tile puzzle grows exponentially, and also contains cycles.
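The successor rule and the t!/2 count can be checked on small boards. The Python sketch below assumes a state is a tuple read row by row with 0 as the blank; the function names are hypothetical, and the reachable states are enumerated with a breadth-first sweep.

```python
from collections import deque
from math import factorial

def successors(state, rows, cols):
    """States reachable by swapping the blank (0) with an adjacent tile."""
    blank = state.index(0)
    r, c = divmod(blank, cols)
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            target = nr * cols + nc
            board = list(state)
            board[blank], board[target] = board[target], board[blank]
            result.append(tuple(board))
    return result

def reachable_states(rows, cols):
    """Count the states reachable from the sorted board."""
    start = tuple(range(rows * cols))
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for succ in successors(state, rows, cols):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return len(seen)

print(reachable_states(2, 3), factorial(6) // 2)    # 360 360
print(factorial(9) // 2, factorial(12) // 2)        # 181440 239500800
```

The last line gives the state-space sizes of the 3x3 and 3x4 puzzles studied in the experiments.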
size  BFS          DFS         DFID
   1    1            1            1
   2    4            6           14
   3    9           31          103
   4   16          160          623
   5   25          841        3,632
   6   36        4,494       21,074
   7   49       24,319      122,475
   8   64      132,864      713,471
   9   81      731,281    4,164,832
  10  100    4,048,726   24,351,542
  11  121   22,523,359  142,562,799
  12  144  125,797,984  835,435,855
Figure 2: Nodes expanded in the grid by BFS, DFS and
DFID with varying grid size.
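The effective branching factor quoted in the grid results for k = 12 can be recomputed from the DFS entry in Figure 2. The Python sketch below takes the path length of 24 from the text and simply solves $b^{24} = 125{,}797{,}984$ for b; the helper name is hypothetical.

```python
def effective_branching_factor(nodes, depth):
    """Solve b ** depth == nodes for b."""
    return nodes ** (1.0 / depth)

b = effective_branching_factor(125_797_984, 24)
print(f"{b:.2f}")   # 2.18
```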
Algorithm                n-ary(2, 10)  n-ary(3, 10)  n-ary(4, 10)
BFS                              2047         88573       1398101
DFS                              2047         88573       1398101
DFID                             4082        132852       1864129
$(\frac{b}{b-1})^2 b^d$          4096        132860       1864135
Figure 3: Nodes expanded in the n-ary tree by various algorithms with varying branching factor.
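The last row of Figure 3 follows from the standard asymptotic estimate for iterative deepening: summing the work of all iterations gives $b^d (1 + 2/b + 3/b^2 + \dots)$, which approaches $(\frac{b}{b-1})^2 b^d$. A short Python sketch, with `dfid_estimate` as a hypothetical helper name:

```python
def dfid_estimate(b, d):
    """Asymptotic estimate (b / (b - 1))**2 * b**d of DFID node expansions."""
    return (b / (b - 1)) ** 2 * b ** d

for b in (2, 3, 4):
    print(b, int(dfid_estimate(b, 10)))
# 2 4096
# 3 132860
# 4 1864135
```

The estimate slightly exceeds the measured DFID counts because it treats the series as infinite, while the experiments run only a finite number of iterations.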
Experimental Results
In this section we report the results of running each algorithm on each domain. Additionally, for BFS we run with
and without duplicate detection. We have chosen to present
results by domain, as this best illustrates the differences between the algorithms.
Grid-based movement
In Figure 2 we show the total number of nodes expanded
by BFS, DFS and DFID as we increase the size of the grid
from 1 to 12. The results with BFS are with duplicate detection. Without duplicate detection BFS and DFS expand the
same number of nodes. We note that BFS expands exactly
$k^2$ nodes for a given grid size k. DFS and DFID, however,
turn a polynomial domain into an exponential domain. For
the case of k = 12, the maximum path length in the grid
is 24. Solving $b^{24}$ = 125,797,984 gives an estimate of
b as 2.18. For this domain, the presence of many small cycles means that duplicate detection is essential to efficiently
explore the search space.
N-ary tree
We present results for each algorithm with an n-ary tree in
Figure 3. Because there are no cycles in the tree, BFS and
DFS expand exactly the same number of nodes, regardless of
whether BFS performs duplicate detection. DFID, because it
performs multiple iterations, expands more nodes than both
of the other algorithms. For these parameters, the predicted
number of nodes expanded by BFS and DFS matches the measured number exactly. The last row of the table is the theoretical number
of nodes expanded by DFID, $(\frac{b}{b-1})^2 b^d$. Although the actual number of nodes does not match exactly, the formula
closely predicts the number of nodes expanded by DFID.
The n-ary tree, with no cycles, is well suited for algorithms
that do not perform duplicate detection. If we are simply trying to traverse the whole domain, DFS is the best approach,
as it will expand each node exactly once. If we do not know
the depth of the solution, DFID will likely be the best algorithm, as BFS will run out of memory (due to both the open
and closed list).
Sliding Tile Puzzle
In the sliding tile puzzle we were unable to run DFS. Even
when we checked for duplicates on the path explored so far,
our recursive implementation ran out of stack space. This
is because a DFS will recursively explore almost all states
in the state space. DFID, however, is more successful. We
present detailed sliding-tile puzzle results here.
First, we look at the number of nodes expanded by BFS on
the 3x3 and 3x4 sliding tile puzzles in Figures 4 and 5. In the
left portion of Figure 4 we show the number of states at each
level of the sliding-tile puzzle when we perform duplicate
detection. BFS is easily able to explore the entire state space.
Without duplicate detection we aborted the search after 15 ply, as it was obvious that BFS was going to quickly run out
of memory. Note that without duplicate detection, at odd
depths the blank will always be on a side of the puzzle,
producing a branching factor of 3.00 into the following even depth. At even depths, the
blank will either be in the middle with a branching factor of
4.0, or in a corner with a branching factor of 2.0. The observed ratio of 2.67 lets
us compute that the blank will be in the middle 1/3 of the
time and in a corner 2/3 of the time, even though the ratio
of corner to middle states is higher.
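The 1/3 and 2/3 split follows from treating the observed ratio of 2.67 as a weighted average of the two branching factors: if p is the fraction of states with the blank in the middle, then 4p + 2(1 - p) = 8/3. A small Python check using exact fractions, with the level counts 48 and 128 taken from the right-hand side of Figure 4 (`middle_fraction` is a hypothetical helper name):

```python
from fractions import Fraction

def middle_fraction(avg_branching, b_middle=4, b_corner=2):
    """Solve p * b_middle + (1 - p) * b_corner = avg_branching for p."""
    return (avg_branching - b_corner) / Fraction(b_middle - b_corner)

ratio = Fraction(128, 48)       # nodes at depth 5 / nodes at depth 4
p_middle = middle_fraction(ratio)
print(p_middle, 1 - p_middle)   # 1/3 2/3
```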
Figure 1: Illustration of domains. [Panels: n-ary tree (for n=2); sliding-tile puzzle, shown as the 3x3 board 0 1 2 / 3 4 5 / 6 7 8; grid-based domain.]
Figure 5 has results for the 3x4 sliding tile puzzle. A
memory-efficient implementation should be able to finish a
breadth-first search in this domain (with only 239 million
states); however, we ran out of memory at approximately
depth 31 after expanding 35 million states. Without duplicate detection we again aborted the search at depth 15.
These results are just for BFS. We now compare BFS and
DFID in the 3x3 and 3x4 sliding tile puzzle in Figure 6. With
these results we can see that DFID expands far more nodes
than a BFS because it does not perform duplicate detection.
However, it is able to explore an extremely large number of
nodes without running out of memory. After thirty minutes
it was able to finish an iteration of depth 31 with 4 billion
nodes. It took about an hour to complete the depth 32 iteration. In this case we were not searching for a particular
goal state, so we just allowed the search to continue until 30
minutes ran out. If we were looking for a particular state at
large depths, DFID would be able to solve larger problems
than BFS.
Conclusions
In this homework we have analyzed three brute-force search
algorithms. We have seen that in polynomial domains such
as the grid, BFS is clearly the best algorithm to use, as its duplicate detection keeps the number of explored states from
growing exponentially. DFS works better in exponential domains, but will have trouble finding optimal solutions unless
we know the solution depth already. Additionally, DFS can
run out of stack space when exploring large connected state
spaces. In these instances DFID is the best algorithm to use,
as it will not run out of memory and is guaranteed to find
optimal solutions.
Appendix
Sample code for BFS is found in Figure 7. Other algorithms
are implemented in a similar manner.
Depth  BFS (3x3)  DFID (3x3)  BFS (3x4)  DFID (3x4)
    0          1           1          1           1
    1          3           4          3           4
    2          7          11          7          11
    3         15          26         16          27
    4         31          57         36          63
    5         51         108         73         136
    6         90         199        136         273
    7        152         358        258         538
    8        268         653        490        1058
    9        420        1136        921        2069
   10        706        1995       1702        4017
   11       1102        3450       3094        7767
   12       1850        6097       5588       15024
   13       2874       10468      10030       29041
   14       4767       18287      17884       56058
   15       7279       31406      31783      108218
   16      11764       55125      55998      208997
   17      17402       94488      97800      403533
   18      26931      165139     168967      778735
   19      37809      283234     288855     1502836
   20      54802      496217     487218     2901277
   21      71912      850508     810424     5601010
   22      95864     1487415    1326202    10810394
   23     116088     2550294    2137202    20863933
   24     140135     4465117    3385213    40272489
   25     155713     7653760    5270492    77739799
   26     170273    13390043    8052888   150053400
   27     176547    22955978   12062610   289621017
   28     180457    40181217   17683964   559024410
   29     181217    68879028   25331836  1079056186
   30     181438   120521983   35397636  2082809349
   31     181440   206615422   (memory)  4020190533
   32          -           -          -  7759743423
Figure 6: Nodes expanded at each level in the 3x3 and 3x4
sliding-tile puzzle by BFS and DFID.
       With duplicate detection           Without duplicate detection
Depth  Nodes (Sum)  Nodes (Level)      b  Nodes (Sum)  Nodes (Level)      b
    0            1              1    inf            1              1    inf
    1            3              2   2.00            3              2   2.00
    2            7              4   2.00            9              6   3.00
    3           15              8   2.00           25             16   2.67
    4           31             16   2.00           73             48   3.00
    5           51             20   1.25          201            128   2.67
    6           90             39   1.95          585            384   3.00
    7          152             62   1.59         1609           1024   2.67
    8          268            116   1.87         4681           3072   3.00
    9          420            152   1.31        12873           8192   2.67
   10          706            286   1.88        37449          24576   3.00
   11         1102            396   1.38       102985          65536   2.67
   12         1850            748   1.89       299593         196608   3.00
   13         2874           1024   1.37       823881         524288   2.67
   14         4767           1893   1.85      2396745        1572864   3.00
   15         7279           2512   1.33      6591049        4194304   2.67
   16        11764           4485   1.79
   17        17402           5638   1.26
   18        26931           9529   1.69
   19        37809          10878   1.14
   20        54802          16993   1.56
   21        71912          17110   1.01
   22        95864          23952   1.40
   23       116088          20224   0.84
   24       140135          24047   1.19
   25       155713          15578   0.65
   26       170273          14560   0.93
   27       176547           6274   0.43
   28       180457           3910   0.62
   29       181217            760   0.19
   30       181438            221   0.29
   31       181440              2   0.01
b is the ratio of nodes at a level to nodes at the previous level.
Figure 4: Nodes expanded at each level in the 3x3 sliding-tile puzzle with BFS and duplicate detection (left) and without
duplicate detection (right).
       With duplicate detection           Without duplicate detection
Depth  Nodes (Sum)  Nodes (Level)      b  Nodes (Sum)  Nodes (Level)      b
    0            1              1    inf            1              1    inf
    1            3              2   2.00            3              2   2.00
    2            7              4   2.00            9              6   3.00
    3           16              9   2.25           26             17   2.83
    4           36             20   2.22           79             53   3.12
    5           73             37   1.85          236            157   2.96
    6          136             63   1.70          719            483   3.08
    7          258            122   1.94         2169           1450   3.00
    8          490            232   1.90         6595           4426   3.05
    9          921            431   1.86        19956          13361   3.02
   10         1702            781   1.81        60591          40635   3.04
   11         3094           1392   1.78       183560         122969   3.03
   12         5588           2494   1.79       556933         373373   3.04
   13        10030           4442   1.78      1688075        1131142   3.03
   14        17884           7854   1.77      5120045        3431970   3.03
   15        31783          13899   1.77     15522426       10402381   3.03
   16        55998          24215   1.74
   17        97800          41802   1.73
   18       168967          71167   1.70
   19       288855         119888   1.68
   20       487218         198363   1.65
   21       810424         323206   1.63
   22      1326202         515778   1.60
   23      2137202         811000   1.57
   24      3385213        1248011   1.54
   25      5270492        1885279   1.51
   26      8052888        2782396   1.48
   27     12062610        4009722   1.44
   28     17683964        5621354   1.40
   29     25331836        7647872   1.36
   30     35397636       10065800   1.32
   31     (memory)       (memory)      -
b is the ratio of nodes at a level to nodes at the previous level.
Figure 5: Nodes expanded at each level in the 3x4 sliding-tile puzzle with BFS and duplicate detection (left) and without
duplicate detection (right).