Smooth generic camera calibration for optical metrology – the concept
Alexey Pak
Fraunhofer Institute of Optronics, System Technologies and Image Processing (IOSB)
Vision and Fusion Lab (IES), Informatics Department, Karlsruhe Institute of Technology (KIT)
Alexey Pak, Fraunhofer IOSB / IES Chair, KIT, Karlsruhe, 18.09.2015

Digital camera as a metrological instrument
• Modern cameras are fast, inexpensive, and use non-coherent light
• Typical parameters: 3 colors @ 256 intensity levels (8-bit value resolution); O(10³) lines across a field of view of O(10) degrees (angular resolution of O(10⁻⁴) rad)
• Shape measurement: laser triangulation, deflectometry, shape-from-X, …
• Modern methods aim at high precision (down to O(10) nm), use complex optical schemes, and require non-trivial data processing
We need a metrological-quality camera model and adequate calibration tools!
[Figure: camera, illumination, and studied object]

Types of camera calibration (acc. to Hanning '11)
Photometric calibration
• Goal: find the relation between the intensity of light incident on a sensor element and the output (digital) pixel value
• EMVA 1288 standard; calibrated once by the manufacturer
• Not considered in this talk
[Figure: image formation (left) and mathematical camera model (right); source: EMVA 1288]
Extrinsic geometric calibration
• Goal: establish the camera position and orientation with respect
to the global system of coordinates (SoC)
• 6 parameters: the SoC origin and three rotation angles
Intrinsic geometric calibration
• Goal: characterize the imaging geometry in the camera's own SoC
• No "standard" camera model; the parameters are always model-dependent
Notes:
• Extrinsic and intrinsic calibration are not independent
• Both need to be performed after any camera/lens adjustment
Ideal geometric calibration: a universal, stable, non-degenerate, unbiased model; a simple calibration procedure; a clear characterization of the resulting uncertainty.

Generic camera model (assuming geometrical optics)
The imaging model in the camera's SoC may be specified in terms of two mappings:
• Direct mapping (3D point → sensor point): an observed object point in real 3D space maps to a point p = (u, v)ᵀ in the 2D sensor space (within the sensor limits)
• Inverse mapping (sensor point → set of 3D points): all objects on the ray {o, r} in real 3D space project to the same point π in the 2D sensor space

Generic camera model (further details)
An image is produced by sampling the sensor space at discrete points pᵢ, i = 1, …, N:
• The sensor-space parameterization, grid geometry, and region boundaries may be chosen arbitrarily (e.g., (u, v) ∈ [0, 1]²)
• It makes sense to exploit the natural (physical) sensor continuity and layout
Images are always blurred (i.e.
have finite sharpness):
• on the sensor: some distribution of intensity, with π understood as its central point
• in 3D: some spatial distribution of the light field (a beam), with the ray {o, r} as its central axis
[Figure: blurred spot ∂f(p)/∂p around p = (u, v)ᵀ in the sensor space and the corresponding beam in 3D]

Global and local ambiguities in camera models
The choice of the camera's SoC is arbitrary: a rotation by ΔR and a translation by Δt yield new intrinsic parameters
• A 6-dimensional global symmetry
The specification of a ray via its direction and origin is ambiguous: the origin may be shifted along the ray (o → o + α·r) and the direction rescaled (r → β·r) for any α, β ∈ ℝ
• The coefficients α, β may be arbitrary functions of u, v
• An ∞-dimensional local symmetry
Any calibration model needs a recipe to fix these freedoms!

What kind of data can be used for calibration? (e.g., Zhang '00)
OpenCV calibration: the de-facto standard in computer vision
• Produce images of a flat calibration pattern from several different (a priori unknown) camera poses
• The pattern contains a set of recognizable features at known (x, y, z) locations
• Easily performed with a printed pattern
• Extract features: (u, v) positions corresponding to the known (x, y, z) points
• Only a sparse set of features can be extracted
• The accuracy of feature extraction is typically unknown and algorithm-dependent
Is there anything better?

What kind of data can be used for calibration? (e.g., Sturm, Ramalingam '03)
Phase-shift coding, active patterns, …:
• From several different (a priori unknown) camera poses, produce images of a sequence of calibration patterns displayed on e.g.
an LCD screen
• The sequence of displayed values encodes the (x, y, z) position of each point
• Flat screens provide stable, accurate modulation: O(10⁶) points over O(1 m²), O(10³) intensity levels
• Performed with inexpensive flat screens and stable camera mounts
• Decoding: for each camera pixel (u, v), recover the corresponding (x, y, z)
• Dense field of decoded data; the uncertainty can be accurately quantified
• A proper choice of patterns makes the method robust against blur and distortions
• Already used in metrology (cf. deflectometry, pattern projection)

Pattern decoding and uncertainty quantification (1)
Cosine phase-shifted active patterns (i – pattern index, A – screen brightness, B – screen contrast, T – spatial period):
Displayed gray values at some screen position x:
  gᵢ(x) = A + B·cos[φ(x) + ψᵢ],  where φ(x) = 2πx/T and ψᵢ = 2πi/N,  i = 0, …, N−1
The camera, at some pixel y, observes
  vᵢ(y) = C + D·gᵢ(x),
i.e., we allow a constant linear transform with unknown parameters C and D, local to the pixel y.
Decoding: recover x from the observed gray values:
  a = Σᵢ vᵢ·sin ψᵢ,  b = Σᵢ vᵢ·cos ψᵢ,  tan[φ(x)] = −b/a,
  x = (T/2π)·tan⁻¹(−b/a) + m·T,  m ∈ ℤ
x is recovered modulo the spatial period T.

Pattern decoding and uncertainty quantification (2)
Disambiguation of the recovered values: use multiple spatial frequencies.
Typical approach: merge the data by finding the closest decoded positions corresponding to different frequencies (cf. the ambiguity in multi-wavelength interferometry).
Can we do better?
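The decoding formulas above, together with the Bayesian multi-frequency merging developed on the next slides, can be sketched in a few lines. This is a hedged illustration, not the talk's implementation: the function names and the brute-force grid merging are mine, and the phase-recovery sign convention follows the cosine model gᵢ(x) = A + B·cos(2πx/T + 2πi/N), which may differ from the slide's a/b bookkeeping by an indexing convention.

```python
import numpy as np

def decode_phase_shift(v, T, dv=1.0 / 256.0):
    """Recover the screen position x (modulo the period T) from the N gray
    values v_i = C + D*(A + B*cos(2*pi*x/T + psi_i)) observed at one camera
    pixel, with phase shifts psi_i = 2*pi*i/N (N >= 3).  Also returns the
    effective contrast B* and the uncertainty dx = (T/2pi)*sqrt(2/N)*dv/B*."""
    v = np.asarray(v, dtype=float)
    N = v.size
    psi = 2.0 * np.pi * np.arange(N) / N
    z = np.sum(v * np.exp(-1j * psi))      # lock-in sum: z = (N/2)*B**exp(i*phi)
    x = (T / (2.0 * np.pi)) * np.angle(z) % T
    B_eff = 2.0 * np.abs(z) / N            # effective contrast at this pixel
    dx = (T / (2.0 * np.pi)) * np.sqrt(2.0 / N) * dv / B_eff
    return x, B_eff, dx

def merge_two_frequencies(x1, dx1, T1, x2, dx2, T2, L, n=200_000):
    """Bayesian merging: model each decoding as a periodic Gaussian mixture
    over the screen [0, L), multiply the two posteriors, and return the
    position of the highest-weight peak (grid version, for clarity only)."""
    grid = np.linspace(0.0, L, n, endpoint=False)

    def mixture(x_mod, dx, T):
        pdf = np.zeros_like(grid)
        for m in range(int(np.ceil(L / T))):
            pdf += np.exp(-0.5 * ((grid - (x_mod + m * T)) / dx) ** 2)
        return pdf

    post = mixture(x1, dx1, T1) * mixture(x2, dx2, T2)
    return grid[np.argmax(post)]
```

For example, on a screen of length L = 100, decodings x₁ = 7.3 (period T₁ = 10) and x₂ = 2.3 (period T₂ = 7) merge to x ≈ 37.3, the only position in [0, L) consistent with both periods.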
Pattern decoding and uncertainty quantification (3)
Decoding recovers the position x from the gray values observed at camera pixel y (formulas as above). In addition, one may determine:
• The effective contrast at camera pixel y:  B* = (2/N)·√(a² + b²)
• The error in the recovered position x:  δx = (T/2π)·√(2/N)·(δv/B*)
If the contrast is small, expect poor decoding.
We assume that the camera pixel values are obtained with some a priori known uncertainty δv (cf. EMVA 1288, photometric calibration; at least 1/256).

Pattern decoding and uncertainty quantification (4)
• Model the probability distribution over x as a Gaussian mixture: a pattern sequence modulated by spatial frequency f₁ (period T₁) yields a periodic density dp₁/dx with peaks of width δx₁ at all candidate positions, and likewise dp₂/dx for frequency f₂ (period T₂)
• Merge the distributions by multiplying them (Bayes' posterior PDF): (dp₁/dx)·(dp₂/dx)
• Resulting x: the position of the highest-weight peak
• Uncertainty of x: the width of the highest peak

Pattern decoding and uncertainty quantification (5)
The result of Bayesian decoding is the posterior x and δx. What else can we find?
May determine valid (i.e.
encoded) points and remove masked or dirty pixels:
• If the final uncertainty δx is larger than some threshold, discard the point
May find the optimal coding uncertainty (= the highest useful frequency):
• Start with low-frequency patterns (large T) and gradually increase the pattern frequency
• Due to blurring, high-frequency patterns produce a lower effective contrast B*
• Once the posterior δx after merging starts growing, stop and report x
May estimate the anisotropic error (2D decoding):
• Use independent decoding for the x- and y-directions on the screen
• May also use many directions in the x–y plane and combine them into a covariance matrix
The coding maps the camera sensor space p = (u, v)ᵀ to the screen space m = (x, y)ᵀ; the uncertainty quantification may be further extended to 3D space.

Pattern decoding and uncertainty quantification (6)
The result of Bayesian decoding is the posterior x and δx. What else can we find?
May estimate the Gaussian blurring-kernel size (in the screen plane):
• Original pattern (no blurring):  g(x) = A + B·cos[2πx/T + ψ]
• Normalized blurring kernel:  b(x) = exp[−x²/(2W²)] / (√(2π)·W)
• Convolution (blurring) is equivalent to a change of contrast:
  (g ⊗ b)(x) = A + B·cos[2πx/T + ψ]·exp[−2π²W²/T²]
• Observe the contrast B* for several pattern periods T and fit the kernel size W
Blurring can be calibrated independently; it is not considered in this talk.

Sample registration data
[Figures: sample camera image; decoded x-coordinate on screen; decoded y-coordinate on screen; estimated blur kernel; estimated inverse x-uncertainty; estimated inverse y-uncertainty]

Which camera models already exist?
(1) Basic pinhole camera: the simplest model (Hanning '11; Beyerer, Puente León, Frese '13)
• Six parameters to calibrate: "magnification", "skewness", "central point"
• The imaging plane z = 1 serves as a proxy of the sensor space
• All estimated view rays pass through a common projection center; the mismatch between estimated and real view rays causes a systematic depth-dependent error at each calibration point
Calibration:
• May use several known points in 3D and a few camera poses
• Least-squares regression (bundle adjustment)
PCM summary:
+ Simple model, widely useful in theoretical studies
+ Closed-form, differentiable direct and inverse transforms
+ Fast rendering available
− Not flexible enough to describe realistic cameras
− Cannot describe wide-angle cameras/lenses, catadioptric devices, imaging systems with multiple projection centers, etc.

Which camera models already exist? (2) Pinhole model with polynomial corrections (Zhang '00)
• Distortions are limited to the 2D imaging-plane coordinates (the plane z = 1 as a proxy of the sensor space); extra 5–8 parameters
• The systematic depth-dependent error remains (common projection center)
Calibration data and procedure (OpenCV library):
• Static calibration pattern, sparse set of point-like features
• Efficient (semi-linear) regression to simultaneously find the intrinsic parameters and the extrinsic parameters for each pose
Notes:
• The uncertainty of feature-position extraction is assumed uniform and isotropic in the entire 3D volume
• Errors in the estimated distortion parameters lead to systematic errors in the measurements

Which camera models already exist?
(3) Wei, Ma ‘91 Hanning ‘11 Two-plane camera model May avoid depthdependent error Hanning’s calibration algorithm: • Uses static patterns with sparse point-like features • Semi-linear regression to estimate distortion parameters Advantages: + Flexible, multi-center projection possible + May use splines to parameterize distortions Calibration point Estimated view ray Plane z = 1 Drawbacks: - Relies on global PCM: implicit systematic error - Implicit regularization of ambiguities (e.g. frame choice) - Uncertainty of feature position assumed to be uniform and isotropic (Euclidean distance between points and view rays) - Blurring effects ignored - No available reference implementation for tests… Alexey Pak, Fraunhofer IOSB / IES Chair, KIT Karlsruhe, 18.09.2015 Reference central ray No common projection center for all rays! Plane z = -1 18 Which camera models already exist? (4) Sturm, Ramalingam ‘03 Generic camera calibration for metrological applications • Instead of camera mappings Pdirect and Pinverse, define one large table TGCCinverse • Per each pixel, identify 3D ray origin and direction (6 parameters) • Sensor space position π is completely ignored! 
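As a minimal sketch of the two-plane ray construction mentioned above: each view ray is pinned by its crossings of the planes z = +1 and z = −1. The function and mapping names below are hypothetical; the model itself only prescribes the two intersection points, which in practice would come from spline-parameterized distortion maps.

```python
import numpy as np

def two_plane_ray(u, v, map_top, map_bottom):
    """Two-plane camera model: the view ray for sensor coordinates (u, v)
    passes through a = (x1, y1, +1) on the plane z = +1 and through
    b = (x2, y2, -1) on the plane z = -1, where (x1, y1) = map_top(u, v)
    and (x2, y2) = map_bottom(u, v).  Rays built this way need not share
    a common projection center."""
    a = np.array([*map_top(u, v), 1.0])
    b = np.array([*map_bottom(u, v), -1.0])
    r = a - b
    return b, r / np.linalg.norm(r)   # ray origin and unit direction
```

For instance, map_top = lambda u, v: (u, v) with map_bottom = lambda u, v: (0.5*u, 0.5*v) reproduces a single-center camera (all rays pass through (0, 0, −3)), while non-affine maps generally yield a multi-center camera.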
Calibration:
• A dense set of 3D points is found for each pose from the displayed sequence of active coding patterns (only the point positions are used)
• Simultaneously find the intrinsic and extrinsic parameters
• Per-pixel solution in closed form + optimization
Summary:
+ Very general, non-parametric, arbitrary distortions
− The accuracy of the coding points is assumed uniform and isotropic
− No continuity of the sensor space; inter-pixel values are undefined
− No fast method to project 3D points back to the sensor (i.e., to render images)
− Infinite sharpness is assumed
− Very CPU-intensive (a regression problem is solved for each pixel)

Proposal: smooth generic camera calibration (1)
Generic camera calibration + smooth model parameterization:
• Imaging with modern cameras is locally very smooth; no global pinhole camera is assumed
• Fit the camera projection mapping as a combination of smooth kernels ϕ chosen to account for the local smoothness of the sensor (an FEM-style solution)
• Minimize the global ray-consistency metric defined with respect to the 3D registration errors
• Single-ray consistency metric: find the "nearest" point on the view ray to the decoded calibration point, weighted by the point's 3D uncertainty ellipsoid; an efficient closed-form solution!

Proposal: smooth generic camera calibration (2)
• The single-ray consistency metrics combine into a global consistency metric Δ
• All components of Δ are known analytically and can be efficiently differentiated
• Can minimize Δ using e.g.
the Levenberg-Marquardt algorithm
• sGCC calibration: C* = argmin_C Δ, a non-linear least-squares problem over the common vector C of all intrinsic and extrinsic parameters

Proposal: smooth generic camera calibration (3)
Regularization conditions: an explicit recipe to fix the calibration symmetries.
• Unconstrained symmetries manifest themselves as extremely large eigenvalues of the covariance matrix that results from the optimization step
• Different regularization conditions may be introduced depending on the problem
• For example, for nearly-pinhole cameras with a narrow view field, one may use
  o_z(u, v) = 0,  r_z(u, v) = 1,
  which fixes the local re-scaling freedom for r and the freedom of o to move along r, together with
  o_x(0, 0) = o_y(0, 0) = 0,  r_x(0, 0) = r_y(0, 0) = 0,  ∂r_y/∂u (0, 0) = ∂o_x/∂u (0, 0) = 0,
  six conditions on the six global degrees of freedom (the choice of the camera's SoC)
• These constraints may be efficiently implemented as linear conditions on the optimization parameters

(Preliminary) simulation results
• Mitsuba physically-accurate renderer, 7 camera poses, 40 patterns
• Finite-aperture perspective camera (generic camera rendering is not yet available)
• Complete toolchain with decoding and Levenberg-Marquardt optimization
• Finite elements: uniform cubic B-splines on a 10×10 mesh; 20 iterations, 20 min
[Figures: final sGCC error metric (pose 5); final absolute re-projection error (Euclidean distance); resulting functions r_x, r_y, r_z]
Work in progress, new results will follow…

(Preliminary) simulation results
Simulation based on synthetic data:
• Ground truth: a generic camera with non-trivial smooth distortion functions (deviation from a pinhole camera of order O(10⁻³) units)
• Registration data simulated directly with Gaussian noise (no rendering / decoding)
• 3 camera poses, 1024 × 1024 data points
• 40 iterations, about 20 minutes
[Figures: ground-truth functions and poses; resulting errors in the calibration functions; evolution of the error metric; eigenvalues of the covariance matrix]
• Error in o_x, o_y: O(10⁻⁴)
• Error in r_x, r_y: O(10⁻⁶)
• Perhaps the regularization is not yet precise enough?

Summary
• The camera calibration problem has a non-trivial mathematical structure
• Popular camera models and the respective calibration procedures have properties that are undesirable for metrological applications
We may better exploit the information from active coding patterns:
• Screen pixel positions, their uncertainties, and the blurring-kernel size
Smooth generic camera calibration:
• A universal, differentiable, unbiased camera model
• Explicit control over the degree of smoothness of the model
• The optimization covariance and fit quality have a physical interpretation!
• Explicit specification of the (problem-dependent) regularization recipe
• Efficient rendering of 3D scenes is possible (fast iterative projection)
• Blurring effects can be consistently quantified and modeled
Thank you for your attention!
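To make the sGCC recipe concrete, here is a deliberately tiny end-to-end sketch under strong simplifications that are mine, not the talk's: the view-ray directions are parameterized by just two smooth coefficients instead of a B-spline mesh, all ray origins are fixed at o = 0, and the per-point uncertainty is taken isotropic, so the ray-consistency residual reduces to the perpendicular deviation of each decoded point from its ray. The global metric is then minimized with SciPy's Levenberg-Marquardt solver.

```python
import numpy as np
from scipy.optimize import least_squares

def ray_dir(params, u, v):
    """Smoothly parameterized unit view-ray directions r(u, v)
    (toy model: r ~ (a*u, b*v, 1) with two coefficients a, b)."""
    a, b = params
    d = np.stack([a * u, b * v, np.ones_like(u)], axis=-1)
    return d / np.linalg.norm(d, axis=-1, keepdims=True)

def residuals(params, u, v, m):
    """Global ray-consistency metric: for each decoded 3D point m, the
    component of (m - o) perpendicular to its view ray (here o = 0)."""
    r = ray_dir(params, u, v)
    perp = m - np.sum(m * r, axis=-1, keepdims=True) * r
    return perp.ravel()

# Synthetic "decoded" calibration points placed on the ground-truth rays.
rng = np.random.default_rng(0)
u, v = np.meshgrid(np.linspace(-1, 1, 5), np.linspace(-1, 1, 5))
u, v = u.ravel(), v.ravel()
true_params = (0.5, 0.4)
m = rng.uniform(2.0, 5.0, (u.size, 1)) * ray_dir(true_params, u, v)

# Levenberg-Marquardt minimization of the stacked residuals.
fit = least_squares(residuals, x0=(0.3, 0.3), args=(u, v, m), method="lm")
print(fit.x)   # close to the ground truth (0.5, 0.4)
```

In the full method the parameter vector C additionally carries the B-spline coefficients of o(u, v) and r(u, v) plus the per-pose extrinsics, the residuals are whitened by the inverse 3D uncertainty ellipsoids, and the regularization conditions enter as linear constraints on C.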