Homework I, Advanced algorithms 2014
Before you start:

1. The deadlines in this course are strict. This homework set is due on October 17 at 13.00 and should be delivered on paper in the mail slot of Johan Håstad on level 4 at Lindstedtsvägen 3.
2. This homework is supposed to be done individually.
3. Note that in problems with subproblems, the first number given is the total number of points for the problem, and later there is information on how this total is distributed over the subproblems.
4. When asked to solve a computational problem by hand, please submit, in some form or other, the useful calculations that lead to the answer.
5. Unless explicitly instructed to do so, you are not supposed to search for the answer to a problem on the Internet. You are of course allowed to look for general information on the Internet. Let us give two clarifying examples. For the problem on the Chinese Remainder Theorem you can of course study information on the Chinese Remainder Theorem in general. We hope it is equally clear to you that on the problem of finding a lower bound on the number of comparisons needed to compute the median, you are not supposed to look for sources on the Internet proving exactly this fact. If you are in doubt about what you can do, contact Johan Håstad or one of the teaching assistants.

The problems are given in no particular order. If something seems wrong, then visit http://www.csc.kth.se/utbildning/kth/kurser/DD2440/avalg14/homework to see if any errata have been posted. If this does not help, then email johanh@kth.se. Don't forget to prefix your email subject with Avalg14.

1 (8p) As we have discussed in class, a generator in $Z_p$ is a number $g$ such that the powers of $g$ give all non-zero elements of $Z_p$. For instance, 3 is a generator in $Z_7$ as its powers are (in order) 3, 2, 6, 4, 5, 1. Find all generators in $Z_{31}$ and for each found generator $g$ find the discrete logarithm of 2. In other words, solve $g^y \equiv 2$ for each $g$. You are supposed to do this by hand.
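The definition of a generator in problem 1 can be sanity-checked on the $Z_7$ example given there. The following is only an illustrative sketch: the function names `powers` and `is_generator` are hypothetical, and it is applied to the small modulus 7 from the text, not to the modulus the problem asks you to handle by hand.

```python
def powers(g, p):
    """Return [g^1, g^2, ..., g^(p-1)], each reduced modulo p."""
    out, x = [], 1
    for _ in range(p - 1):
        x = (x * g) % p
        out.append(x)
    return out


def is_generator(g, p):
    """For prime p, g is a generator iff its powers hit all p-1 non-zero residues."""
    return len(set(powers(g, p))) == p - 1


print(powers(3, 7))        # [3, 2, 6, 4, 5, 1], the sequence quoted in problem 1
print(is_generator(3, 7))  # True
print(is_generator(2, 7))  # False: the powers of 2 are 2, 4, 1, 2, 4, 1
```

The position of a value in the list returned by `powers` (counting from 1) is its discrete logarithm with respect to `g`.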
Hint: If you want to save yourself computational work, finding one generator and all its powers in order should go a long way towards finding both all the generators and the discrete logs of 2.

2 (8p) We proved in class that given a set $S$ of $N$ pairwise distinct strings $(x_i)_{i=1}^{N}$ in $\{0,1\}^n$ and a random matrix $H$ of size $m \times n$, the number of pairs $(x_i, x_j)$ with $i \neq j$ such that $Hx_i = Hx_j$ (note that these are $m$-bit strings) is expected to be $N(N-1)2^{-1-m}$. In particular, if $m \geq \log N$ we are likely to have a reasonably good hash function. This is not always the case.

2a (2p) Construct one bad example in the form of a set $S$ and a non-zero matrix $H$ such that all strings in $S$ hash onto the same value.

2b (3p) Construct a bad set $S$ such that for many matrices $H$, all strings in $S$ map to the same string. Finding the maximally bad $S$ is needed for a full score, but you need not give a formal proof that it is optimal.

2c (3p) Do give a formal proof that the $S$ you constructed in the previous subproblem is the worst possible in this respect.

Advanced algorithms • Fall 2014 • Johan Håstad

3 (8p) Let $P_4$ be the first four digits of your personal number (i.e. YYMM, the year and month you were born), let $p$ be the smallest prime larger than $P_4$, and let $q$ be the smallest prime larger than $p$. You may find $p$ and $q$ by computer, but the following you should do by hand. Find a number $x$ such that $x \equiv 4711 \pmod{p}$ and $x \equiv 1 \pmod{q}$.

4 (8p) Your task is to study a double hash¹ table. In the file http://www.csc.kth.se/utbildning/kth/kurser/DD2440/avalg14/homework/input you find $2^{22}$ entries, each a bit-string of length 64 written in hex. In fact they are sorted.

4a (5p) Construct a double hash table for this set of data. In particular, construct a hash function $H$ given by a $22 \times 64$ matrix such that the total number of collisions under $H$ is bounded by $2^{22}$.
Then for each $i$ such that at least three elements map to $i$ under $H$, construct a hash function $H_i$ mapping 64 bits to the fewest number of bits possible to make all $x$ such that $Hx = i$ map to distinct values under $H_i$. Do not only give the resulting functions but also give a short account of your efforts. In particular, report statistics on the number of $H_i$ needed and how many there are of each number of output bits. Please make the description of the functions used available in a directory with public access (and specify its location in your solution). The first 22 lines should contain the rows of the matrix $H$ (each as 16 hex characters). Then for each $i$ with at least three preimages, give first a line with $i$ and the number $s_i$ of output bits, and then $s_i$ lines with the rows of the corresponding matrix.

Hint: To compute this set of hash functions it might be useful to use the machine operation that takes the bit-wise AND of two machine words.

4b (3p) Discuss whether this double hash table would be your favorite way to do search queries in this data set. What method would you use and why?

5 (8p) In class we proved that if we are in a comparison model then $n - 1$ comparisons are needed to find the median in a set of $n$ inputs. Your task is to, for odd $n$, improve this bound to $(3n - 3)/2$, assuming that the algorithm takes $(n - 1)/2$ disjoint pairs and starts by making the comparisons given by these pairs. In other words, you need to prove that the algorithm needs to make an additional $n - 1$ comparisons after this start.

Hint: Prove that if the median is the element not taking part in any of the initial comparisons then you might need $n - 1$ more comparisons to verify this fact.

¹ This refers to double hashing as described in Section 20.3 in the lecture notes by Håstad. The most common notion of double hashing is something else and the lecture notes should be updated.
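The hint in problem 4a about the bit-wise AND can be illustrated as follows. Over GF(2), each output bit of $Hx$ is the inner product of one row of $H$ with $x$, which is the parity of the AND of the two words. This is a minimal sketch under stated assumptions: the names `hash_gf2` and `count_collisions` are hypothetical, rows and keys are stored as Python integers, the first row is taken to produce the most significant output bit, and the 4-bit toy data is invented for illustration only.

```python
def hash_gf2(rows, x):
    """Compute Hx over GF(2); `rows` holds one integer per row of H (assumed encoding)."""
    h = 0
    for r in rows:
        bit = bin(r & x).count("1") & 1  # parity of the AND = inner product mod 2
        h = (h << 1) | bit               # assumption: first row is the top output bit
    return h


def count_collisions(rows, keys):
    """Count unordered pairs of distinct keys that hash to the same value."""
    buckets = {}
    for x in keys:
        v = hash_gf2(rows, x)
        buckets[v] = buckets.get(v, 0) + 1
    return sum(c * (c - 1) // 2 for c in buckets.values())


# Toy example: a 2 x 4 matrix and three 4-bit keys (illustrative values only).
toy_rows = [0b1010, 0b0110]
toy_keys = [0b0011, 0b1100, 0b0000]
print([hash_gf2(toy_rows, x) for x in toy_keys])  # [3, 3, 0]
print(count_collisions(toy_rows, toy_keys))       # 1
```

Counting unordered pairs matches the expectation $N(N-1)2^{-1-m}$ quoted in problem 2, which is the number of unordered pairs times $2^{-m}$.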