How to Implement (ORAM in) MPC Marcel Keller Peter Scholl Nigel Smart
Transcription
How to Implement (ORAM in) MPC Marcel Keller Peter Scholl Nigel Smart
How to Implement (ORAM in) MPC Marcel Keller Peter Scholl University of Bristol 21 February 2014 Nigel Smart Overview 1. How to Implement MPC 2. ORAM in MPC Part I How to Implement MPC SPDZ: MPC with Preprocessing Offline Phase I Offline Phase I Correlated Randomness I I Secret Inputs I Online Phase Online Phase I I Outputs Independent of secret inputs Homomorphic encryption with distributed decryption Highly parallelizable No encryption Information-theoretic security in random oracle model How to Share a Secret with Authentication Shares MAC shares MAC key a1 γ(a)1 α1 a2 γ(a)2 α2 a3 γ(a)3 α3 = α·a P = i γ(a)i i ai = a a P = α P i αi How to Share a Secret with Authentication Shares MAC shares MAC key a1 + b1 γ(a)1 + γ(b)1 α1 a2 + b2 γ(a)2 + γ(b)2 α2 a3 + b3 γ(a)3 + γ(b)3 α3 a+b P = i ai + bi Pα · (a + b) = i γ(a)i + γ(b)i = a+b = α P i αi Multiplication with Random Triple (Beaver Randomization) x · y = (x + a − a) · (y + b − b) = (x + a) · (y + b) − (y + b) · a − (x + a) · b + a · b Multiplication with Random Triple (Beaver Randomization) x · y = (x + a − a) · (y + b − b) = (x + a) · (y + b) − (y + b) · a − (x + a) · b + a · b Masked and opened Random secret triple Toolchain Overview Python-like high-level code I Compiler I I Compiler I I I Bytecode I Virtual machine I I Virtual machine Python Easier development Circuit optimization Speed not an issue Memory overhead I I Online phase C++ Fast ∼ 150 instructions Core Technique: I/O Parallelization z =x ·y z =x ·y u =z ·w u =v ·w Core Technique: I/O Parallelization z =x ·y z =x ·y u =z ·w u =v ·w 1. Mask and open x and y 1. Mask and open x, y , v , and w 2. Compute z 2. Compute z and u 3. Mask and open z and w 4. Compute u Goal: Automatize I/O Parallelization I Manual parallelization is tedious: x10 = x2 · x3 x20 = x4 + x6 x30 = x19 · x1 x40 = x12 · x39 x50 = x28 + x16 x11 = x8 + x4 x12 = x10 · x1 x21 = x16 + x2 x22 = x0 + x12 x31 = x16 + x26 x32 = x0 · x10 x41 = x34 + x7 x42 = x32 + x5 x51 = x15 + x38 x52 = x50 · x46 x13 = x7 + x9 x23 = x22 + x14 x24 = x11 + x19 x33 = x26 + x32 x34 = x7 + x3 x43 = x12 + x26 x44 = x43 · x38 x53 = x19 + x2 x54 = x20 · x13 x25 = x4 · x19 x26 = x23 · x9 x35 = x9 · x29 x36 = x33 + x22 x45 = x38 + x14 x46 = x44 · x27 x55 = x21 + x22 x56 = x19 · x6 x18 = x11 · x15 x27 = x7 · x5 x28 = x13 + x21 x37 = x29 · x24 x38 = x16 + x23 x47 = x22 + x24 x48 = x39 · x38 x57 = x46 + x1 x58 = x38 · x55 x19 = x13 · x7 x29 = x14 + x27 x39 = x15 + x37 x49 = x21 · x3 x59 = x47 + x29 x14 = x7 · x1 x15 = x9 + x12 x16 = x13 · x14 x17 = x0 + x11 I SIMD not suitable for every application Circuit as Directed Acyclic Graph y x + + send send 1 1 recv recv triple I Nodes: Instructions I Edges: Output of instruction is input to another I Two instructions for open: sending and receiving Edge weight: I I I ∗ ∗ ∗ − − + One between sending and receiving Zero otherwise I Longest path with respect to weights determines communication round I Merge all communication per round ⇒ Optimal number of rounds Merge by Communication Round / Longest Path AD I Need to re-compute topological order (no edges pointing backwards) ⇒ Standard algorithm linear in number of edges I Heuristic to shorten lifetime of variables ⇒ Reduced memory usage BH CEF GI Part II Oblivious RAM in MPC Goal: Oblivious Data Structures I Generally I I I Oblivious array / dictionary I I I Secret pointers Secret type of access if needed Secret index / key Secret whether reading or writing Oblivious priority queue I I Secret priority and value Secret whether decreasing priority or inserting Oblivious RAM x0 x1 Client (CPU) x0 x2 x0 Server (Encrypted RAM) Oblivious RAM x$ x$ Client (CPU) x$ x$ x$ Server (Encrypted RAM) Oblivious RAM in MPC x$ x$ Client MPC circuit x$ Reveal x$ x$ Server Secret Sharing Simple Oblivious Array (Trivial ORAM) Inner product with index vector [0] [a0 ] .. .. . . [0] [ai−1 ] [1] · [ai ] = [ai ] [0] [ai+1 ] .. .. . . [0] [an−1 ] Index vector computation ? [i] = 0 .. . ? [i] 7→ [i] = i .. . ? [i] = n − 1 Simple Oblivious Array Index vector computation without equality test P i I Bit decomposition: [x] 7→ ([x0 ], . . . , [xn−1 ]) such that x = i xi 2 ? Demux: ([x0 ], . . . , [xn−1 ]) 7→ ([δ0 ], . . . , [δ2n −1 ]) such that δi = (i = x) Example: n = 4, i = 2 I [1] [0] [0] [0] [0] [1] [1] [0] [0] [2] [1] Tree-based ORAM I Entries stored in tree of trivial ORAMs of fixed size (buckets) I Access only buckets on path to specific leaf per ORAM access I Store path in smaller ORAM ⇒ recursion I Data-independent eviction to distribute entries over tree I Original idea by Shi et al. (Asiacrypt 2011) I Path ORAM: improved eviction by Stefanov et al. (CCS 2013) Oblivious Array Access Timings Access time (ms) Two Parties, Online Phase Original tree-based ORAM Path ORAM Simple oblivious array 102 101 100 100 101 102 103 104 Size 105 106 Oblivious Priority Queue 1 : 10 2 : 18 3 : 25 I Store value-priority pairs I Values unique Operations I I I I 0 : 20 4 : 23 I Two oblivious arrays I I [00, λ, 0, 1, 01] Remove value with minimal priority Insert new pair Update value to lower priority Binary heap by priority Index to find entries in heap by value Dijkstra’s Shortest Path Algorithm a 3 1 s b 2 1 c Dijkstra’s Shortest Path Algorithm a: 1 3 1 s b 2 1 c: 1 c: 1 a: 1 Dijkstra’s Shortest Path Algorithm a: 1 3 1 s b: 3 2 1 c: 1 b: 3 c: 1 Dijkstra’s Shortest Path Algorithm a: 1 3 1 s b: 2 2 1 c: 1 b: 2 Dijkstra’s Shortest Path Algorithm a: 1 3 1 s b: 2 2 1 c: 1 Dijkstra’s Algorithm in MPC for each vertex do outer loop body for each neighbor do inner loop body I Number of vertices and edges public I Graph structure in two oblivious arrays (vertices and edges) I Use oblivious priority queue Dijkstra’s algorithms uses two nested loops I I I I I One vertices, one of neighbors thereof MPC would reveal the number of neighbors for every vertex Replace by loop over all edges in same order Flag set when starting with a new vertex I Polylog overhead over classical algorithm I Previous work: polynomial overhead Dijkstra’s Algorithm in MPC for each edge do outer loop body (dummy if same vertex) inner loop body I Number of vertices and edges public I Graph structure in two oblivious arrays (vertices and edges) I Use oblivious priority queue Dijkstra’s algorithms uses two nested loops I I I I I One vertices, one of neighbors thereof MPC would reveal the number of neighbors for every vertex Replace by loop over all edges in same order Flag set when starting with a new vertex I Polylog overhead over classical algorithm I Previous work: polynomial overhead Dijkstra’s Algorithm in MPC Timings for Cycle Graphs No ORAM No ORAM (estimated) Simple array Simple array (estimated) Tree-based array Tree-based array (estimated) 108 Total time (s) 106 104 102 100 10−2 100 101 102 103 Size 104 105 Conclusion I Practical MPC requires a dedicated compiler. (ACM CCS 2013 / ePrint 2013:143) I Oblivious data structures in MPC are feasible and useful. (ePrint 2014:137)