Assignment 1, 2nd ed.
Umeå University
Department of Computing Science
Fall 2014, 5DA001
Assignment 1: Non-linear least squares
v. 2, 2014-11-18

5DA001: Non-linear optimization

The deadline for this assignment can be found at: http://www8.cs.umu.se/kurser/5DA001/HT14/timetable.html (link Planning and Readings on the course homepage).

• The submission should consist of:
  – The complete report, including a front page with the following information:
    1. Your name (and that of your colleague if you work in pairs).
    2. The course name.
    3. The assignment number.
    4. Your username(s) at the Department of Computing Science.
    5. The version of the submission (in case of re-submissions).
  – An appendix with the source code.
• To simplify feedback, the report, except appendices, must have page numbers and each section should be numbered.
• If you submit a report with linked references (e.g. written in LaTeX), please verify that the references are resolved and do not read "figure ??".
• It should be possible to understand your report without knowing the specification in detail. Thus, it is recommended that you start your report with a short summary of the specification.
• Your report should be submitted as a PDF file uploaded via the https://www8.cs.umu.se/~labres/py/handin.cgi page, also available via the results link at the bottom left of the course home page.
• Furthermore, the source code should be available in a folder called edu/5da001/assN in your home folder, where N is the assignment number. You will probably have to create the folder yourself.
• The submitted code should be Matlab-compatible. If you develop your code in Octave, test your code in Matlab before submitting it!
• Auxiliary code and data needed for this assignment will be placed at http://www8.cs.umu.se/kurser/5DA001/HT14/assignment1/.

1 Introduction

This assignment consists of implementing two or more non-linear optimization algorithms and applying them to two or more application problems.
1.1 Algorithms

The optimization algorithms are

GN-A    Gauss-Newton (Section 3.1.1) with Armijo line search (Section 3.1.2).
GN-W    Gauss-Newton (Section 3.1.1) with Wolfe line search (Section 3.1.3).
LM-P    Levenberg-Marquardt with Powell dogleg (Section 3.3).
BFGS-W  BFGS (Section 3.2) with Wolfe line search (Section 3.1.3).

1.2 Application problems

The application problems are

ELLIPSE     Fit an ellipse to a number of measured points. Described in Section 5.1.
HOMOGRAPHY  Estimate the homography between two planar projections of an object. Described in Section 5.2.
RELORIENT   Relative orientation of two cameras. Described in Section 5.3. Code for the RELORIENT problem is given.

2 Task

You may work alone or in pairs. There is a baseline task if you work alone and additional work (one more algorithm or test problem) if you work in pairs. The baseline task contains an implementation part and an investigation part.

2.1 Implementation

• Implement the GN-A algorithm.
• Implement one optional algorithm.
• Implement either the ELLIPSE or the HOMOGRAPHY test problem. This includes:
  – Code for your model function.
  – Code to plot the model corresponding to a parameter vector x, i.e.
    ∗ for the ellipse problem, plot both the ellipse points and an illustration of the ellipse,
    ∗ for the homography problem, plot both the transformed points and an illustration of the homography.
    This plotting function is central as visual feedback to you (and me).
  – Code for the residual/Jacobian function.
  – Code to find initial values from observations only.

2.2 Basic questions

For each test problem, answer the following questions:

• How many parameters n does the problem have, as a function of the number of points k?
• How many parameters $n_0$ are global, i.e. do not depend on the number of points? That is, what is n for k = 0?
• How many observations (elements of the residual vector) does each point generate?
• What is the minimum number of points needed in order to obtain a unique solution? That is, for what k is the total number of observations m equal to the number of parameters n?
• The redundancy is defined as r = m − n. How many points are needed to have a redundancy $r \ge n_0$?

2.3 Investigation

2.3.1 Optimization

• Show that your code returns after a maximum of one iteration if $x_0$ is a minimizer for your test problem.
• Show the iteration trace for a nice test problem and a nice starting approximation.
• Show the iteration trace for a difficult test problem. In particular, show how the damping (line search or trust region) works.
• Construct an example where the solution gives problems and show how the damping struggles to solve the problem (and possibly fails). Hint: Construct a degenerate problem.
• Construct an example where the solution is OK but the starting approximation gives problems. Hint: Pick a starting approximation that corresponds to a degenerate problem.
• Show the iteration trace for the RELORIENT test problem with the supplied starting approximation.

2.3.2 Analysis

• For the given test data for your problem(s), answer the following questions:
  – What is the estimated coordinate measurement error, also known as the "standard deviation of unit weight" $\sigma_0$?
  – What are the estimated standard deviations of the global parameters?
  – Which two parameters have the highest (by absolute value) correlation value? How high is it?
  – Which point i has the highest redundancy number $r_i$?
  – Which point j has the lowest redundancy number $r_j$?
  – Plot the solutions for the following data sets:
    ∗ All points.
    ∗ All points except point i.
    ∗ All points except point j.
    Did the solution change more when point i or point j was removed? Is this consistent with the redundancy numbers?
• For the ellipse problem, pick one of the ellipses given in Appendix B as your given data.
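The quantities asked for in Section 2.3.2 are not defined in this text. A minimal sketch, assuming the standard least squares definitions with unit-weight observations ($\sigma_0^2 = r^T r/(m-n)$, parameter covariance $\sigma_0^2 (J^T J)^{-1}$, redundancy numbers as the diagonal of $I - J(J^T J)^{-1} J^T$), might look as follows. The straight-line data and all variable names are hypothetical; for the assignment, J and r would instead come from your residual function evaluated at the minimizer.

```matlab
% Hypothetical example: fit a straight line y = x1 + x2*t, so that the
% residual r(x) = J*x - y is linear and J is the exact Jacobian.
t = (0:4)';
y = [0.1; 0.9; 2.2; 2.8; 4.1];
J = [ones(size(t)), t];
x = J \ y;                 % least squares solution
r = J*x - y;               % residual at the solution
[m, n] = size(J);

% Standard deviation of unit weight (assumes unit-weight observations).
sigma0 = sqrt((r'*r) / (m - n));

% Covariance of the estimated parameters and their standard deviations.
Cxx = sigma0^2 * inv(J'*J);
sx = sqrt(diag(Cxx));

% Correlation matrix: unit diagonal, off-diagonal elements in [-1, 1].
corrM = Cxx ./ (sx*sx');

% Redundancy numbers: diagonal of I - J*(J'*J)^(-1)*J'. They sum to m - n.
rn = 1 - sum((J / (J'*J)) .* J, 2);
```

If each point contributes two residual elements, one plausible choice for the redundancy number of a point is the sum of its two elements; this should be checked against the definition used in the course.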
2.4 Code compliance

2.4.1 Test problems

Your residual/Jacobian function should have the following parameter list:

    function [r,J,JJ]=function_name(x,...)

where x is a column vector with the parameters, r is the residual vector, J is the Jacobian, and JJ is a numeric approximation of the Jacobian. The ellipsis (...) indicates that other parameters may be necessary. See the example code in Appendix A.

2.4.2 Optimization methods

Your optimization methods should all have the following parameter list:

    function [x,code,n,...]=method_name(fun,x0,convTol,maxIter,params,...)

where fun is a function handle to, or a string with the name of, your residual function, x0 is a column vector with the starting approximation, convTol is the convergence tolerance (scalar), and maxIter (scalar) is the maximum number of iterations allowed. The minimizer (if found) is returned in the column vector x. The status of the optimization is returned in code, where 0 indicates convergence and -1 indicates failure to converge in the allowed number of iterations. Other failure codes are allowed. The number of iterations needed is returned in the scalar n. The cell array params contains any extra parameters to send to the residual function. For the example in Appendix A, params={t,y}. The calling sequence in your optimization code is f=feval(fun,x,params{:}); [1]

It is furthermore strongly suggested that the iterates $x_i$ are returned as columns of a matrix X. This keeps the optimization code simple while allowing for later "playback" (plotting, printing) to analyze the iteration sequence.

2.4.3 Result scripts

Any result that you refer to in your report should have a corresponding script (i.e. a file with Matlab code that is executed without any parameters) that is referred to in the report. For instance, the test that your GN-A algorithm detects that the given x0 is a minimizer might be called gna_verify_x0_is_minimizer or test_4_1 (if the test is presented in Section 4.1 of your report).
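The parameter list of Section 2.4.2 can be illustrated with a small skeleton. This is only a sketch: it takes undamped (full) Gauss-Newton steps and is therefore not the GN-A algorithm, the name gn_sketch is hypothetical, and the convergence test assumes that the tolerance multiplies $(1 + \|r\|)$.

```matlab
function [x,code,n,X] = gn_sketch(fun,x0,convTol,maxIter,params)
%GN_SKETCH Undamped Gauss-Newton, shown only to illustrate the interface.
x = x0;
X = x0;            % iterates stored as columns for later playback
code = -1;         % assume failure until convergence is detected
n = 0;
while true
    % Call the residual function with the extra parameters expanded.
    [r,J] = feval(fun,x,params{:});
    p = -(J \ r);  % Gauss-Newton search direction
    if norm(J*p) <= convTol*(1 + norm(r))
        code = 0;  % converged
        break
    end
    if n >= maxIter
        break      % out of iterations, code stays -1
    end
    x = x + p;     % full step; a real GN-A would damp with a line search
    n = n + 1;
    X(:,end+1) = x;
end
end
```

With the example function of Appendix A, a call could look like [x,code,n,X]=gn_sketch(@antelope_r,x0,1e-8,50,{t,y}); where x0, t and y are supplied by the calling script. If x0 already is a minimizer, the sketch returns with n = 0.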
2.4.4 Repeatability

It is important that any test that uses random errors is repeatable. This can be achieved by resetting the random number generator via the command rng(n), where n is some integer, before calling randn to generate the random numbers. You should probably test that your code works with different values of n as well.

2.5 Hints

• Do not forget the 1/2 in the least squares line search algorithm!
• How do you expect the damping to behave, far from and near the solution, respectively, for a nice problem? For instance, do you expect a small ($\ll 1$) or large ($= 1$) step length near the solution?
• One way to generate a suitably difficult test case is to:
  1. Pick an $x^*$ that is close to a degenerate solution.
  2. Generate k points $p_i$ that satisfy your model exactly (points on the ellipse or generated from the homography).
  3. Generate simulated measurement points $q_i$ by adding random errors. The code
         e=sigma*randn(size(p)); q=p+e;
     will add independent, normally distributed errors with standard deviation sigma to each of your measurements. Furthermore, removing the projection of e onto the range space of $J(x^*)$ will maintain the same $x^*$ (for sanity checks).

Footnote 1 (Section 2.4.2): You may choose to "hide" the extra parameters via a function declaration like fun=@(x)ellipse(x)-d; before calling the optimization method, but your optimization code should still be able to handle functions that do take extra parameters.

3 Optimization algorithms

3.1 Gauss-Newton

3.1.1 The Gauss-Newton method

Implement the Gauss-Newton minimization method for unconstrained non-linear least squares problems. Use $\|Jp\| \le \varepsilon (1 + \|r\|)$ as the convergence criterion, where $\varepsilon$ is the convergence tolerance. See Section 4 for suggested constant values.

Suggested additional input parameters:
• Parameters (constants) needed by the line search.

Suggested additional output parameters:
• A matrix with all iterates $x_i$ as columns. Useful for analyzing the algorithm.
• A vector with all step lengths $\alpha_i$. Useful for analyzing the line search.
• The error code -2 could be used to indicate that the line search could not find a suitable step length.

3.1.2 Armijo line search with backtracking

Implement a line search algorithm from a point $x_k$ along a search direction $p_k$ that uses the Armijo condition with backtracking $\alpha = 1, \frac{1}{2}, \frac{1}{4}, \ldots$.

Suggested input parameters:
• The name of the function calculating the residual r.
• The current point $x_k$.
• The current search direction $p_k$.
• Shortest acceptable step length $\alpha_{\min}$. Necessary to guarantee termination of the step length algorithm.
• Parameter(s) necessary to test the Armijo condition.

Suggested output parameters:
• The accepted step length $\alpha_k$. If no such step length exists, $\alpha_k = 0$ should be returned.

3.1.3 The Bracket-Zoom Wolfe line search

Implement the line search algorithm described in the textbook, chapter 3.5, algorithms 3.5-3.6.

Suggested input parameters:
• The name of the function calculating the residual r and Jacobian J.
• The current point $x_k$.
• The current search direction $p_k$.
• Acceptable step length interval $\alpha_{\min}$, $\alpha_{\max}$. Necessary to guarantee termination of the step length algorithm.
• Parameter(s) necessary to test the Wolfe condition.

Suggested output parameters:
• The accepted step length $\alpha_k$. If no such step length exists, $\alpha_k = 0$ should be returned.

3.2 The BFGS method with Wolfe line search

Implement the BFGS method with Wolfe line search (textbook chapter 6.1, algorithm 6.1).

Suggested additional input parameters:
• Parameters (constants) needed by the line search.

Suggested additional output parameters:
• A vector with all step lengths $\alpha_i$. For analyzing the line search.

3.3 Levenberg-Marquardt with Powell dogleg

Implement the Levenberg-Marquardt method. Use the dogleg algorithm to solve the subproblem

$$\min_p \psi_k(p) = f(x_k) + \nabla f(x_k)^T p + \frac{1}{2} p^T \nabla^2 f(x_k) p \quad \text{s.t. } \|p\| \le \Delta_k,$$

where $\nabla^2 f(x_k)$ is approximated by $J(x_k)^T J(x_k)$. Use $\|J p_{GN}\| \le \varepsilon (1 + \|r\|)$ as the convergence criterion, where $p_{GN}$ is the Gauss-Newton search direction and $\varepsilon$ is the convergence tolerance.
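The dogleg step for the subproblem above can be sketched as follows. This is a sketch of the standard dogleg construction with $\nabla^2 f \approx J^T J$, not a complete Levenberg-Marquardt implementation: the update of $\Delta_k$ from the gain ratio $\rho_k$ is left out, J is assumed to have full column rank, and the function name dogleg_sketch is hypothetical.

```matlab
function [p,pGN] = dogleg_sketch(J,r,Delta)
%DOGLEG_SKETCH Dogleg step for min 0.5*||r + J*p||^2 s.t. ||p|| <= Delta.
g = J'*r;                          % gradient of f = 0.5*r'*r
pGN = -(J \ r);                    % Gauss-Newton step (model minimizer)
if norm(pGN) <= Delta
    p = pGN;                       % full step fits inside the region
    return
end
Jg = J*g;
pC = -((g'*g) / (Jg'*Jg)) * g;     % Cauchy point: model minimizer along -g
if norm(pC) >= Delta
    p = -(Delta / norm(g)) * g;    % truncated steepest descent step
    return
end
% Otherwise walk from pC towards pGN until the boundary is hit:
% solve ||pC + t*d||^2 = Delta^2 for the root t in (0,1).
d = pGN - pC;
a = d'*d;  b = 2*(pC'*d);  c = pC'*pC - Delta^2;
t = (-b + sqrt(b^2 - 4*a*c)) / (2*a);
p = pC + t*d;
end
```

Returning pGN as a second output lets the caller evaluate the convergence criterion without solving the linear least squares problem twice.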
See Section 4 for suggested constant values.

Suggested additional input parameters:
• The initial trust-region size $\Delta_0$.

Suggested additional output parameters:
• Vectors with search direction lengths ($\|p_k\|$), gain ratios ($\rho_k$), and trust-region sizes $\Delta_k$. Useful for analyzing the global strategy.

4 Common constants

Suggested constants: convergence tolerance $\varepsilon = 10^{-8}$, maxIter = 50, $c_1 = 10^{-4}$, $c_2 = 0.9$, $\alpha_{\min} = 10^{-3}$, $\alpha_{\max} = 10$, $\Delta_0 = 1$ or $\Delta_0 = 10^{-6}$, $\eta = 0.25$.

5 Application problems

5.1 Ellipse fitting

Problem: An oblique projection of a sphere becomes an ellipse. The problem of determining the position of the sphere in 3D may be formulated to contain the following subproblem:

Model function: A point $p = (x, y)^T$ on an ellipse with center $(c_x, c_y)$, semi-major axis length a, semi-minor axis length b and inclination $\delta$ has coordinates

$$\begin{pmatrix} x \\ y \end{pmatrix} = h(\theta) = \begin{pmatrix} c_x \\ c_y \end{pmatrix} + \begin{pmatrix} \cos\delta & -\sin\delta \\ \sin\delta & \cos\delta \end{pmatrix} \begin{pmatrix} a \cos\theta \\ b \sin\theta \end{pmatrix}$$

for some value of the "phase angle" $\theta$.

Problem: Given m measured points $\tilde{p} = (\tilde{x}, \tilde{y})^T$, find the parameters of the ellipse which is closest to all points, as measured by the Euclidean distance, i.e. solve the problem

$$\min_x \frac{1}{2} r(x)^T r(x), \quad \text{where } r(x) = \begin{pmatrix} r_1(x) \\ \vdots \\ r_m(x) \end{pmatrix} \text{ and } r_i(x) = h(\theta_i) - \tilde{p}_i$$

for the unknowns $x = \begin{pmatrix} c_x & c_y & a & b & \delta & \theta_1 & \ldots & \theta_m \end{pmatrix}^T$.

One application that uses ellipse fitting is Radiostereometry (RSA), where the spherical head of a hip joint prosthesis is projected in two X-ray images. See figures 1 and 2.

5.1.1 Example data

See Appendix B for real data for this problem.

Figure 1: The projection (right) of the spherical head of the hip joint (left) by two X-ray tubes generates two elliptical projections.

Figure 2: The parameters of the projected ellipse.

5.2 Planar projective transformation (homography)

If a plane is viewed from an oblique angle, the coordinates $p = (x, y)^T$ in the plane are transformed according to a homography:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = h(p) = \begin{pmatrix} \dfrac{a_{11} x + a_{12} y + a_{13}}{a_{31} x + a_{32} y + 1} \\[2ex] \dfrac{a_{21} x + a_{22} y + a_{23}}{a_{31} x + a_{32} y + 1} \end{pmatrix}.$$

Problem: Given a number of measured 2D coordinates $\tilde{p} = (\tilde{x}, \tilde{y})^T$ and corresponding known coordinates $p = (x, y)^T$, determine the parameters $a_{ij}$ of the homography, i.e. solve the problem

$$\min_x \frac{1}{2} r(x)^T r(x), \quad \text{where } r(x) = \begin{pmatrix} r_1(x) \\ \vdots \\ r_m(x) \end{pmatrix} \text{ and } r_i(x) = h(p_i) - \tilde{p}_i$$

for the unknowns $x = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{21} & a_{22} & a_{23} & a_{31} & a_{32} \end{pmatrix}^T$. The matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & 1 \end{pmatrix}$$

describes the homography between the two images.

Figure 3: The image coordinates $\tilde{p}_i$ are measured in the left image. The corresponding "true" coordinates $p_i$ are here assumed to be the corners of the unit square. By calculating the homography A between $p_i$ and $\tilde{p}_i$, we can "rectify" the image (right).

The following Matlab code generates the rectified image:

    T=maketform('projective',A');
    I2=imtransform(I1,T,'XData',[-1,2],'YData',[-1,2],'XYScale',0.01);

5.3 Relative orientation of two cameras

Code will be available for this problem. The text is intended to increase your understanding of the problem.

5.3.1 Point projection

Assume we have a camera with known internal orientation, i.e. the focal length f and the principal point $(x'_p, y'_p)^T$ (optical center of the image) are known. Ignoring the effects of lens distortion, the relationship between a 3D object point $p = (x, y, z)^T$ and its projected 2D image coordinates $q = (x', y')^T$ is described by the collinearity equations

$$x' - x'_p = -f \, \frac{m_{11}(x - x_c) + m_{12}(y - y_c) + m_{13}(z - z_c)}{m_{31}(x - x_c) + m_{32}(y - y_c) + m_{33}(z - z_c)},$$

$$y' - y'_p = -f \, \frac{m_{21}(x - x_c) + m_{22}(y - y_c) + m_{23}(z - z_c)}{m_{31}(x - x_c) + m_{32}(y - y_c) + m_{33}(z - z_c)},$$

where the optical center of the camera is placed at world coordinates $(x_c, y_c, z_c)^T$ and the rotation matrix

$$M = \begin{pmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{pmatrix}$$

describes the orientation of the camera with respect to the world coordinate system. The rotation matrix may be parameterized in many ways. For this assignment, assume M is parameterized by the x-y-z (roll-pitch-yaw) Euler angles, i.e.

$$M = M_\kappa M_\phi M_\omega, \quad M_\omega = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\omega & \sin\omega \\ 0 & -\sin\omega & \cos\omega \end{pmatrix},$$

$$M_\phi = \begin{pmatrix} \cos\phi & 0 & -\sin\phi \\ 0 & 1 & 0 \\ \sin\phi & 0 & \cos\phi \end{pmatrix}, \quad M_\kappa = \begin{pmatrix} \cos\kappa & \sin\kappa & 0 \\ -\sin\kappa & \cos\kappa & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

With the substitution

$$\begin{pmatrix} U \\ V \\ W \end{pmatrix} = M \left( \begin{pmatrix} x \\ y \\ z \end{pmatrix} - \begin{pmatrix} x_c \\ y_c \\ z_c \end{pmatrix} \right)$$

the collinearity equations become

$$q = \begin{pmatrix} x' \\ y' \end{pmatrix} = h(p) = \begin{pmatrix} x'_p - f \dfrac{U}{W} \\[2ex] y'_p - f \dfrac{V}{W} \end{pmatrix}.$$

5.3.2 Relative orientation

Assume we have two cameras with known internal orientation and we have measured corresponding points $\tilde{q}^1 = (x', y')^T$ and $\tilde{q}^2 = (x', y')^T$ in two images of the same object point $p = (x, y, z)^T$. The projection in each camera satisfies the collinearity condition, with different camera centers and orientations, i.e. we have twelve degrees of freedom. By locking seven degrees of freedom we can determine the relative orientation of the two cameras and the three-dimensional position of the object points (with respect to the cameras). One solution is to put camera one at the origin, aligned with the world coordinate system, i.e. $(x_{c1}, y_{c1}, z_{c1})^T = (0, 0, 0)^T$, $(\omega_1, \phi_1, \kappa_1)^T = (0, 0, 0)^T$, and camera two at a fixed distance along the x-axis, i.e. $x_{c2} = b_X$.

Problem: Given m pairs of points with unknown object coordinates $p_i = (x_i, y_i, z_i)^T$ and measured image coordinates $\tilde{q}_i^1 = (\tilde{x}'_i, \tilde{y}'_i)^T$ and $\tilde{q}_i^2 = (\tilde{x}'_i, \tilde{y}'_i)^T$, determine the relative orientation of the cameras and the object positions $p_i$, i.e. solve the problem

$$\min_x \frac{1}{2} r(x)^T r(x), \quad \text{where } r(x) = \begin{pmatrix} r_1^1(x) \\ \vdots \\ r_m^1(x) \\ r_1^2(x) \\ \vdots \\ r_m^2(x) \end{pmatrix}$$

and $r_i^1(x) = h(p_i) - \tilde{q}_i^1$ is the residual in the first image, and $r_i^2(x) = h(p_i) - \tilde{q}_i^2$ is the residual in the second image. The unknowns to solve for are

$$x = \begin{pmatrix} y_{c2} & z_{c2} & \omega_2 & \phi_2 & \kappa_2 & p_1 & \ldots & p_m \end{pmatrix}^T.$$

5.4 Application

This problem arises when you have a number of measurements in two or more images. See figures 4 and 5.

Figure 4: Overview of calculated camera positions and object points for the Zürich City Hall data set.

Figure 5: Result of relative orientation of cameras 5 and 14.
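The point projection of Section 5.3.1 can be sketched in a few lines. This is not the supplied RELORIENT code, only an illustration of the collinearity equations: the camera values, the object points and the names Mw, Mp, Mk, rotm, P and pp are all hypothetical.

```matlab
% Rotation matrices for the Euler angles omega, phi, kappa (radians).
Mw = @(w) [1 0 0; 0 cos(w) sin(w); 0 -sin(w) cos(w)];
Mp = @(p) [cos(p) 0 -sin(p); 0 1 0; sin(p) 0 cos(p)];
Mk = @(k) [cos(k) sin(k) 0; -sin(k) cos(k) 0; 0 0 1];
rotm = @(w,p,k) Mk(k)*Mp(p)*Mw(w);

% Hypothetical camera: at the origin, aligned with the world axes.
c  = [0; 0; 0];        % camera center
M  = rotm(0,0,0);      % identity rotation
f  = 1;                % focal length
pp = [0; 0];           % principal point

% Two hypothetical object points, stored as columns of P.
P = [0 1; 0 0; -10 -10];

% Substitution [U;V;W] = M*(p - c), then q = pp - f*[U/W; V/W].
UVW = M*bsxfun(@minus,P,c);
q = bsxfun(@minus,pp,f*bsxfun(@rdivide,UVW(1:2,:),UVW(3,:)));
```

Each column of q holds the image coordinates of the corresponding column of P. The point on the optical axis projects to the principal point, which is a quick sanity check of the sign conventions.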
A Example of a problem function

    function [r,J,JJ]=antelope_r(x,t,y)
    %ANTELOPE_R Residual/jacobian function for the antelope problem.
    %
    %   R=ANTELOPE_R(X,T,Y) returns the residual vector for the exponential
    %   antelope population model with parameters X=[K1;K2] and observations Y
    %   taken at time T. If Y and T are M-by-1 vectors, R will be returned as
    %   an M-by-1 vector.
    %
    %   The antelope population model and its residual are calculated as
    %
    %       M(X; T) = K1 * EXP( K2 * T ),
    %
    %       R(X) = M(X; T) - Y.
    %
    %   [R,J]=... also returns the analytical Jacobian J of R(X).
    %
    %   [R,J,JJ]=... returns JJ as the numerical approximation of J, as
    %   calculated by JACAPPROX. Useful for debugging the implementation of J.
    %
    %See also: JACAPPROX.

    % $Id: antelope_r.m 1226 2014-11-13 13:56:10Z niclas $

    % Compute the residual.
    r=x(1)*exp(x(2)*t)-y;

    if nargout>1
        % Compute the analytical Jacobian only if asked to. Avoid unnecessary
        % calculations when only the residual is wanted.
        J=[exp(x(2)*t), x(1)*t.*exp(x(2)*t)];
    end

    if nargout>2
        % Compute the numerical jacobian only if asked to. IMPORTANT to avoid
        % an infinite recursive loop via JACAPPROX.
        JJ=jacapprox(mfilename,x,1e-6,{t,y});
    end

Figure 6: Example images with point sets on three different ellipses; panels (a) Image 2 and (b) Image 1. The first point set (blue) contains points on the surface of the femoral head. The second point set (red) contains points on the surface of the hemispherical backshell of the cup. The third point set (green) contains points on the opening of the cup.

B Ellipse data

The supplied function http://www8.cs.umu.se/kurser/5DA001/HT14/assignments/assignment1/code/ellipse_data.m can be called to get real ellipse points, see help ellipse_data. The image and object numbers are illustrated in Figure 6. The image files are available at http://www8.cs.umu.se/kurser/5DA001/HT14/assignments/assignment1/images.