Lecture 1 Intro.key
Practical Secure Two-Party Computation and Applications
Thomas Schneider
Estonian Winter School in Computer Science 2016

Overview
- Lecture 1: Introduction to Secure Two-Party Computation
- Lecture 2: Private Set Intersection
- Lecture 3: Tools and Applications
- Lecture 4: Hardware-assisted Cryptographic Protocols

The Engineering Cryptographic Protocols Group (ENCRYPTO)
Thomas Schneider, Daniel Demmler, Ágnes Kiss, Michael Zohner
Info: http://encrypto.de

Interested in Practical Secure Computation?
We have an open, fully funded position as Ph.D. Student / Research Assistant in Engineering Scalable Secure Computation.
Darmstadt
- 30 km south of FRA
- 150,000 inhabitants (5.8 million in the Frankfurt/Rhine-Main metro area)
- 40,000 students
TU Darmstadt
- Ranked #1 for IT security research in Germany (#5 in Europe)
- Among the top 5 universities for computer science in Germany
http://encrypto.de/jobs

Lecture 1: Introduction

The Web of Services
Our life moves into the web... and so does our data.

How were web services used yesterday?
http://www.google.de, query "heart disease": an attacker can eavesdrop on or modify the communication.

How should web services be used today?
https://www.google.de, query "heart disease": a secure channel protects the communication against external attackers.
HTTPS per default since 01/2010, 02/2011, 11/2012 (one date per service pictured on the slide).

Data breaches happen every day...
June 2, 2011: Google attacked from China. Computer hackers in China broke into the Gmail accounts of several hundred people, including senior US government officials, military personnel and political activists.
... from outsiders
November 29, 2010: New WikiLeaks Publication. WikiLeaks releases US State Department communiqués that offer an extraordinary look at the inner workings, and sharp elbows, of diplomacy.
...
or insiders
October 16, 2012: Espionage Malware MiniFlame. Kaspersky Labs discover that MiniFlame is most likely a targeted cyberweapon to conduct in-depth surveillance and cyber-espionage.
... or malware.

How could web services be used tomorrow?
httpp://www.google.de: the "heart disease" query is sent as an encrypted query, processed under encryption, and answered with an encrypted response. Sensitive data remains encrypted.
➪ Privacy-Preserving Web Services

Vision: Privacy-Preserving Web Services
Process sensitive data without any data leakage, e.g.:
- Privacy-Preserving Medical Diagnostics Services give health recommendations without direct access to the patient's data.
- Privacy-Preserving Face Recognition Services detect criminals without allowing honest citizens to be traced.
- Privacy-Preserving Cloud Computing Services allow storing and processing data at untrusted service providers.

Is this possible at all?
Andrew Chi-Chih Yao, 1986: Any efficiently computable function can be evaluated securely.
➪ Secure Computation

Secure Two-Party Computation
[Diagram: inputs x and y enter f; the output is f(x,y).]
All Lectures: Semi-Honest (Passive) Adversaries

Secure Two-Party Computation
- public function f(·,·)
- compute arbitrary function f on private data x, y
- without trusted third party
- reveal nothing but result z = f(x,y)
Example: Yao's Millionaires' Problem
- Parties: Server S and Client C
- private data x = $2 million, private data y = $1 million
- Question "Is C richer?", computed as x > y
- S2PC outputs z = f(x, y) = true

Secure Two-Party Computation
- Auctions [NaorPS99], ...
- Remote Diagnostics [BrickellPSW07], ...
- DNA Searching [Troncoso-PastorizaKC07], ...
- Biometric Identification [ErkinFGKLT09], ...
- Medical Diagnostics [BarniFKLSS09], ...

Oblivious Transfer (OT)
[Diagram: inputs (x0, x1) and r; output xr.]
1-out-of-2 OT is an essential building block for secure computation.

How to Measure Efficiency of a Protocol?
✓ Runtime (depends on implementation & scenario)
✓ Communication
  - # bits sent (important for networks with low bandwidth)
  - # rounds (important for networks with high latency)
? Computation
  Usually: count # crypto operations, e.g. (operations get faster down the list):
  - # modular exponentiations
  - # point multiplications
  - # hash function evaluations (SHA)
  - # block cipher evaluations (AES)
  - # One-Time Pad evaluations
  But non-cryptographic operations matter as well!

Overview of this lecture
Part 1: Yao vs. GMW
[Diagram: Special Purpose Protocols vs. Generic Protocols; generic protocols evaluate an Arithmetic Circuit (Homomorphic Encryption) or a Boolean Circuit (Yao, GMW; both use OT).]
Cost: Public Key Crypto >> Symmetric Crypto >> One-Time Pad
Part 2: Efficient OT Extensions

Part 1: Yao vs. GMW and Efficient Circuits
T. Schneider, M. Zohner: GMW vs. Yao? Efficient secure two-party computation with low depth circuits. In FC'13.

Yao's Garbled Circuits Protocol [Yao86]
- Public function f(·,·), e.g., x < y
- Client C: private data x = x1, .., xn
- Server S: private data y = y1, .., yn
- Circuit C: comparison gates (<) with inputs xi, yi and carry wires c1, c2, ..., producing the output z
- Garbled circuit C̃: the same circuit with garbled values on all wires
- Setup phase: the garbled circuit C̃ is transferred.
- Online phase: ỹ is sent; OT(x; (x̃^0, x̃^1)) yields (x̃; ⊥) (see Part 2: Efficient OT); the result is f(x, y) = C̃(x̃, ỹ).
- Garbled values: each wire has two garbled values, e.g., c̃1^0, c̃1^1 for wire c1.
- Garbled table for a gate g with input wires x1, y1 and output wire c1:
  E(x̃1^0, ỹ1^0; c̃1^g(0,0))
  E(x̃1^0, ỹ1^1; c̃1^g(0,1))
  E(x̃1^1, ỹ1^0; c̃1^g(1,0))
  E(x̃1^1, ỹ1^1; c̃1^g(1,1))

Garbled Circuits [Yao86]
[Diagram: conventional circuit with 0/1 wire values next to the garbled circuit.]
- Keys look random.
- Given the input keys, one can compute the output key only.
(Slide from Viet-Tung Hoang)

Garbled Gate [Yao86]
[Diagram: garbled gate with wire keys A, B, C, D, output keys X, Y, and table rows 0 to 3; the layout is not recoverable from the transcript.]
Given two input keys, one can compute only the output key.
(Slide from Viet-Tung Hoang)

Overview of Efficient Garbled Circuit Constructions
1990 Point-and-Permute [BeaverMicaliRogaway]
1999 3-row reduction [NaorPinkasSumner]
2008 Free-XOR [KolesnikovSchneider]
2009 2-row reduction [PinkasSchneiderSmartWilliams]
2012 Garbling via AES [KreuterShelatShen]
2013 Fixed-key AES [BellareHoangKeelveedhiRogaway]
2014 FleXor [KolesnikovMohasselRosulek]
2015 HalfGates [ZahurRosulekEvans]
(Slide from Payman Mohassel)

Summary of Garbled Circuit Constructions
(A single value in a row applies to both XOR and AND gates.)
             size (× t)     garble cost (AES)   eval cost (AES)
             XOR    AND     XOR    AND          XOR    AND
Classical    large          8                   5
P&P          4              4                   1
GRR3         3              4                   1
Free XOR     0      3       0      4            0      1
HalfGates    0      2       0      4            0      2
t: symmetric security parameter, e.g., t=128
(Slide from Mike Rosulek)

Summary: Yao - the Apple
How to eat an apple? Bite by bite.
+ Yao has constant #rounds
- Evaluating a garbled gate requires symmetric crypto in the online phase

The GMW Protocol [GMW87]
- Secret share inputs: a = a1 ⊕ a2, b = b1 ⊕ b2
- Non-interactive XOR gates: c1 = a1 ⊕ b1; c2 = a2 ⊕ b2
- Interactive AND gates: the AND protocol takes (c1, b1) and (c2, b2) and outputs the shares d1 and d2
- Recombine outputs: d = d1 ⊕ d2
[Diagram: c = a ⊕ b, d = c ∧ b.]

Evaluating ANDs via Multiplication Triples [Beaver91]
Setup phase: generate a multiplication triple (a1⊕a2) ∧ (b1⊕b2) = c1⊕c2 for each AND via 2 OTs (see Part 2: Efficient OTs):
1) P1: m0, m1 ∈R {0,1}; P2: a2 ∈R {0,1}
2) P1 and P2 run OT, where P1 inputs (m0, m1), P2 inputs a2 and gets u2 = m_a2
3) P1 sets b1 = m0 ⊕ m1; v1 = m0
4) P1 and P2 repeat steps 1-3 with reversed roles to obtain (a1, u1); (b2, v2)
5) Pi sets ci = (ai ∧ bi) ⊕ ui ⊕ vi
Online phase (AND gate on input shares (x1, y1), (x2, y2) with output shares z1, z2):
P1 → P2: d1 = x1⊕a1; e1 = y1⊕b1
P1 ← P2: d2 = x2⊕a2; e2 = y2⊕b2
P1, P2: d = d1⊕d2; e = e1⊕e2
P1: z1 = (d∧b1) ⊕ (e∧a1) ⊕ c1 ⊕ (d∧e)
P2: z2 = (d∧b2) ⊕ (e∧a2) ⊕ c2

Summary: GMW - the Orange
How to eat an orange?
1) Peel (almost all the effort). Setup phase:
   - precompute multiplication triples for each AND gate using 2 R-OTs and constant #rounds
   + no need to know function, only max. #ANDs
2) Eat (easy). Online phase:
   + evaluating circuit needs OTP operations only
   - 2x2 bit communication per layer of AND gates

Benchmarks of an optimized GMW implementation [SZ13]
Runtime in seconds for 512-bit multiplication circuit (800k AND gates, depth 38) over Gigabit LAN. [Plot not preserved in the transcript.]
Optimizations evaluated in this benchmark (one per slide):
- Interactive AND gates via Beaver's multiplication triples [D. Beaver: Efficient multiparty protocols using circuit randomization. In CRYPTO'91.]
  setup phase: 1-out-of-4 OT
  online phase: 2 independent 2-bit messages (sent in parallel) => 1x network latency per layer of AND gates
- Use AES-based PRF for OT extensions (instead of SHA-1).
- Load balancing: run half of the precomputed OTs in each direction (in parallel) and run base OTs twice (in parallel) => each party has exactly the same workload.
- Use GMP instead of NTL for base OTs.
- Process data in chunks of bytes (instead of bits).
- Use assembly implementation of OpenSSL for SHA-1 (instead of C implementation of PolarSSL).
- Single Instruction Multiple Data: evaluate multiple circuits in parallel (here 32; inspired by Sharemind).

Remaining Bottlenecks in LAN Setting
[Charts not preserved in the transcript; only unlabeled percentages survive, among them 0.1% for the base OTs.]

Yao vs. GMW
Free XOR (XOR gates need no crypto and no communication): Yao and GMW
symmetric crypto per AND
  Yao: S: 4, R: 2 (online)
  GMW: setup: S: 6, R: 6
communication [bit] per AND
  Yao: S→R: 2t
  GMW: setup: S→R: t || R→S: t; online: S→R: 2 || R→S: 2
rounds
  Yao: O(1)
  GMW: setup: O(1); online: O(ANDdepth(f))
memory per wire [bit]
  Yao: t
  GMW: 1
t: symmetric security parameter

Efficient Circuit Constructions for Secure Computation
Classical circuit design:
- few gates (→ small chip area)
- low depth (→ high clock frequency)
Circuits for secure computation:
- low ANDsize (#non-XORs → communication and symmetric crypto)
- low ANDdepth (→ #rounds in GMW's online phase)
Automatically generate optimized circuits from high-level descriptions:
- E. M. Songhori, S. U. Hussain, A.-R. Sadeghi, T. Schneider, F. Koushanfar: TinyGarble: Highly compressed and scalable sequential garbled circuits. In IEEE S&P'15.
- D. Demmler, G. Dessouky, F. Koushanfar, A.-R. Sadeghi, T. Schneider, S. Zeitouni: Automated Synthesis of Optimized Circuits for Secure Computation. In ACM CCS'15.

Example Circuit: Addition
Ripple-Carry-Adder [BoyarPeraltaPochuev00]
[Figure 3.3: Circuit: Addition (ADD). A chain of 1-bit adders (+) with inputs xi, yi and carries ci, starting from carry-in 0, produces the sum bits s1, ..., sℓ+1.]
si = xi ⊕ yi ⊕ ci
ci+1 = ((xi ⊕ yi) ∧ (xi ⊕ ci)) ⊕ xi
ANDsize = ℓ, ANDdepth = ℓ
An equivalent construction for computing ci+1 with the same number of non-XOR gates was given in [BPP00, BDP00].

Ladner-Fischer-Adder [LF80]
[Diagram: 4-bit parallel-prefix network over x4 y4, ..., x1 y1 with intermediate values pi,j and ci,j, producing s5, ..., s1.]
pi,0 = xi ⊕ yi, ci,0 = xi ∧ yi
pi,j = pi,j-1 ∧ pk,j-1
ci,j = (pi,j-1 ∧ ck,j-1) ∨ ci,j-1
ANDsize = ℓ + 1.25 ℓ log2(ℓ), ANDdepth = 1 + 2 log2(ℓ)

Subtraction (SUB) (text from section 3.3.1.2 visible behind the figure)
Subtraction in two's complement representation is defined as x − y = x + ¬y + 1. Hence, a subtraction circuit (SUB) can be constructed analogously to the addition circuit from 1-bit subtractors (−) as shown in Fig. 3.4. Each 1-bit subtractor computes the carry-out bit ci+1 = (xi ∧ ¬yi) ∨ (xi ∧ ci) ∨ (¬yi ∧ ci) = (xi, yi, ci)[01001101] and the difference bit di = xi ⊕ ¬yi ⊕ ci = (xi, yi, ci)[10010110]. The size of SUB is equal to that of ... (text cut off; the next heading is 3.3.1.3 Controlled Addition/Subtraction (ADDSUB)).

A Summary of Circuit Building Blocks
Example Circuits Summarized in [SchneiderZohner13]
Table 7. Size and Depth of Circuit Constructions (dH: Hamming weight)
Can trade off larger size for better depth.

Circuit | Size S | Depth D
Addition
Ripple-carry ADD/SUB^ℓ_RC | ℓ | ℓ
Ladner-Fischer ADD^ℓ_LF | 1.25ℓ⌈log2 ℓ⌉ + ℓ | 2⌈log2 ℓ⌉ + 1
LF subtraction SUB^ℓ_LF | 1.25ℓ⌈log2 ℓ⌉ + 2ℓ | 2⌈log2 ℓ⌉ + 2
Carry-save ADD^(ℓ,3)_CSA | ℓ + S(ADD^ℓ) | D(ADD^ℓ) + 1
RC network ADD^(ℓ,n)_RC | ℓn − ℓ + n − ⌈log2 n⌉ − 1 | ⌈log2 n − 1⌉ + ℓ
CSA network ADD^(ℓ,n)_CSA | ℓn − 2ℓ + n − ⌈log2 n⌉ + S(ADD^(ℓ+⌈log2 n⌉)_LF) | ⌈log2 n − 1⌉ + D(ADD^(ℓ+⌈log2 n⌉)_LF)
Multiplication
RCN school method MUL^ℓ_RC | 2ℓ^2 − ℓ | 2ℓ − 1
CSN school method MUL^ℓ_CSN | 2ℓ^2 + 1.25ℓ⌈log2 ℓ⌉ − ℓ + 2 | 3⌈log2 ℓ⌉ + 4
RC squaring SQR^ℓ_RC | ℓ^2 − ℓ | 2ℓ − 3
LF squaring SQR^ℓ_LF | ℓ^2 + 1.25ℓ⌈log2 ℓ⌉ − 1.5ℓ − 2 | 3⌈log2 ℓ⌉ + 3
Comparison
Equality EQ^ℓ | ℓ − 1 | ⌈log2 ℓ⌉
Sequential greater than GT^ℓ_S | ℓ | ℓ
D&C greater than GT^ℓ_DC | 3ℓ − ⌈log2 ℓ⌉ − 2 | ⌈log2 ℓ⌉ + 1
Selection
Multiplexer MUX^ℓ | ℓ | 1
Minimum MIN^(ℓ,n) | (n − 1)(S(GT^ℓ) + ℓ) | ⌈log2 n⌉(D(GT^ℓ) + 1)
Minimum index MINIDX^(ℓ,n) | (n − 1)(S(GT^ℓ) + ℓ + ⌈log2 n⌉) | ⌈log2 n⌉(D(GT^ℓ) + 1)

Part 2: Efficient OTs
http://encrypto.de/code/OTExtension
G. Asharov, Y. Lindell, T. Schneider, M. Zohner: More efficient oblivious transfer and extensions for faster secure computation. In ACM CCS'13.

Oblivious Transfer (OT)
[Diagram: inputs (x0, x1) and r; output xr.]
1-out-of-2 OT is an essential building block for secure computation.
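To illustrate how 1-out-of-2 OT serves as a building block, the following Python sketch generates one multiplication triple from two OTs and uses it to evaluate a single AND gate on XOR-shared bits, following the setup and online steps of the multiplication-triple slide in Part 1. It is only a sketch under stated assumptions: ideal_ot is a trusted stand-in for a real OT protocol, both parties run in one process, and all function names (ideal_ot, half_triple, gen_triple, share, and_gate) are illustrative and not taken from any library.

```python
import secrets


def ideal_ot(m0, m1, choice):
    """Ideal 1-out-of-2 OT (assumption: trusted stand-in for a real protocol)."""
    return m1 if choice else m0


def half_triple():
    """Steps 1-3: sender gets (b, v), receiver gets (a, u) with u ^ v == a & b."""
    m0, m1 = secrets.randbits(1), secrets.randbits(1)  # sender's random bits
    a = secrets.randbits(1)                            # receiver's random choice bit
    u = ideal_ot(m0, m1, a)                            # receiver learns m_a only
    return (m0 ^ m1, m0), (a, u)


def gen_triple():
    """Setup phase: shares of (a1^a2) & (b1^b2) == c1^c2 from two OTs."""
    (b1, v1), (a2, u2) = half_triple()  # P1 is sender, P2 is receiver
    (b2, v2), (a1, u1) = half_triple()  # step 4: reversed roles
    c1 = (a1 & b1) ^ u1 ^ v1            # step 5, party P1
    c2 = (a2 & b2) ^ u2 ^ v2            # step 5, party P2
    return (a1, b1, c1), (a2, b2, c2)


def share(bit):
    """XOR secret sharing of one bit."""
    r = secrets.randbits(1)
    return r, bit ^ r


def and_gate(x_shares, y_shares, t1, t2):
    """Online phase: each party sends two masked bits, then computes locally."""
    (x1, x2), (y1, y2) = x_shares, y_shares
    (a1, b1, c1), (a2, b2, c2) = t1, t2
    d = (x1 ^ a1) ^ (x2 ^ a2)  # d = d1 ^ d2, public after the exchange
    e = (y1 ^ b1) ^ (y2 ^ b2)  # e = e1 ^ e2, public after the exchange
    z1 = (d & b1) ^ (e & a1) ^ c1 ^ (d & e)
    z2 = (d & b2) ^ (e & a2) ^ c2
    return z1, z2
```

For example, and_gate(share(1), share(1), *gen_triple()) returns two shares whose XOR is 1. Only XOR and AND on single bits appear in and_gate, which reflects the point that the online phase needs no cryptographic operations once the triples exist.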
OT - Bad News
- [ImpagliazzoRudich89]: there's no black-box reduction from OT to OWFs
- Several OT protocols based on public-key cryptography
  - e.g., [NaorPinkas01] yields ~1,000 OTs per second
- Since public-key crypto is expensive, OT was believed to be inefficient

A Public-Key Based OT Protocol: [NaorPinkas01]
Common input: G = <g> of prime order q
Sender input: x0, x1. Receiver input: b.
1) Sender: t ∈R [0,q), C = g^t; sends C.
2) Receiver: k ∈R [0,q), PKb = g^k, PK1-b = C/PKb; sends PK0.
3) Sender: PK1 = C/PK0; r0, r1 ∈R [0,q); E0 = <g^r0, H((PK0)^r0) ⊕ x0>, E1 = <g^r1, H((PK1)^r1) ⊕ x1>; sends E0, E1.
4) Receiver: Eb = <L, R>; h = H(L^k) = H((PKb)^rb); xb = h ⊕ R.
Output: xb

OT - Good News
- [Beaver95]: OTs can be precomputed (only OTP in online phase)
- OT Extensions (similar to hybrid encryption): use symmetric crypto to stretch few "real" OTs into longer/many OTs
  - [Beaver96]: OT on long strings from short seeds
  - [IshaiKilianNissimPetrank03]: many OTs from few OTs
[Diagram: k "real" OTs on k-bit strings are stretched to l-bit strings via [Beaver96] and to m OTs via [IKNP03].]

OT Extension of [IKNP03]
(1) Alice inputs m pairs of ℓ-bit strings (xi,0, xi,1). Bob inputs an m-bit string r and obtains xi,ri in the i-th OT.
(2) Alice and Bob perform k "real" OTs on random seeds with reversed roles (k: security parameter).
(3) Bob generates a random m × k bit matrix T and masks his choices r. The matrix is masked with the stretched seeds of the "real" OTs. PRG: pseudo-random generator (instantiated with AES).
(4) Transpose matrices V and T. Alice masks her inputs and obliviously sends them to Bob. H: correlation robust function (instantiated with hash function).

Computation Complexity of OT Extension
Per OT:
                      Alice   Bob
# PRG evaluations     1       2
# H evaluations       2       1
Time distribution for 10 million OTs (in 21 s): [Chart not preserved. Values: 1 %, 10 %, 33 %, 42 %, 14 %. Legend: "real" OTs, H (SHA-1), PRG (AES), Transpose, Misc (Snd/Rcv/XOR). The assignment of values to legend entries is not recoverable from the transcript.]
Non-crypto part was the bottleneck!
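The four steps of the [IKNP03] OT extension can be sketched in Python as follows. This is a minimal, single-process sketch for semi-honest parties and rests on several assumptions that are not from the slides: the k "real" OTs are replaced by an ideal lookup of the chosen seed, SHAKE-256 from the standard library stands in for both the AES-based PRG and the hash function H, the transposition is the naive bit-by-bit one, and all names (prg, H, transpose, xor, ot_extension) are illustrative.

```python
import hashlib
import secrets

K = 128  # security parameter k: number of "real" (base) OTs


def prg(seed, nbits):
    """Stretch a seed to nbits (assumption: SHAKE-256 instead of AES)."""
    nbytes = (nbits + 7) // 8
    out = int.from_bytes(hashlib.shake_256(seed).digest(nbytes), "big")
    return out >> (8 * nbytes - nbits)


def H(i, row, nbytes):
    """Hash of (index, k-bit row), standing in for the correlation robust function."""
    data = i.to_bytes(8, "big") + row.to_bytes(K // 8, "big")
    return hashlib.shake_256(b"H" + data).digest(nbytes)


def transpose(cols, m):
    """Naive transposition: K columns of m bits -> m rows of K bits."""
    return [sum(((c >> i) & 1) << j for j, c in enumerate(cols)) for i in range(m)]


def xor(a, b):
    return bytes(p ^ q for p, q in zip(a, b))


def ot_extension(pairs, choices):
    """Alice holds m pairs of equal-length byte strings, Bob holds m choice bits."""
    m, nbytes = len(pairs), len(pairs[0][0])
    r = sum(bit << i for i, bit in enumerate(choices))

    # (2) k base OTs with reversed roles: Bob holds seed pairs, Alice picks with s.
    s_bits = [secrets.randbits(1) for _ in range(K)]
    seeds = [(secrets.token_bytes(16), secrets.token_bytes(16)) for _ in range(K)]
    alice_seeds = [seeds[j][s_bits[j]] for j in range(K)]  # ideal OT stand-in

    # (3) Bob: random matrix T (stored by columns), masked with stretched seeds.
    t_cols = [secrets.randbits(m) for _ in range(K)]
    u = [(prg(k0, m) ^ t, prg(k1, m) ^ t ^ r) for (k0, k1), t in zip(seeds, t_cols)]

    # Alice unmasks one message per column: v^j = t^j xor (s_j * r).
    v_cols = [prg(alice_seeds[j], m) ^ u[j][s_bits[j]] for j in range(K)]

    # (4) Transpose, so that row i satisfies v_i = t_i xor (r_i * s).
    s = sum(bit << j for j, bit in enumerate(s_bits))
    v_rows, t_rows = transpose(v_cols, m), transpose(t_cols, m)

    # Alice masks both inputs of every OT and sends them to Bob.
    y = [(xor(x0, H(i, v, nbytes)), xor(x1, H(i, v ^ s, nbytes)))
         for i, ((x0, x1), v) in enumerate(zip(pairs, v_rows))]

    # Bob can remove exactly the mask of his chosen string.
    return [xor(y[i][choices[i]], H(i, t_rows[i], nbytes)) for i in range(m)]
```

In this sketch Alice expands one seed per column and hashes twice per OT, while Bob expands two seeds per column and hashes once per OT, which matches the per-OT operation counts listed above. The naive transpose function touches every bit individually.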
Algorithmic Optimization: Efficient Matrix Transposition
- Naive matrix transposition performs mk load/process/store operations
- [Eklundh72]'s algorithm reduces number of operations to O(m log2 k) swaps
- Swap whole registers instead of bits
- Transposing 10 times faster

Algorithmic Optimization: Parallelization
- OT extension can easily be parallelized by splitting the T matrix into sub-matrices
- Since columns are independent, OT is highly parallelizable

Communication Complexity of OT Extension
Per OT, bits sent: Alice 2ℓ, Bob 2k
- Yao: ℓ = k = 128
- GMW: ℓ = 1, k = 128
[Bar charts comparing the bits sent by Alice and Bob are not preserved in the transcript.]

Protocol Optimization: General OT Extension
- Instead of generating a random T matrix, we derive it from sj,0 (similar to garbled 3-row reduction)
- Reduces data sent by Bob by factor 2

Specific OT Functionalities
- Secure computation protocols often require a specific OT functionality
  - Yao with free XORs requires strings x0, x1 to be XOR-correlated
  - GMW with multiplication triples can use random strings
- Correlated OT (e.g., for Yao): random x0 and x1 = x0 ⊕ x
- Random OT (e.g., for GMW): random x0 and x1

Specific OT Functionalities: Correlated OT (C-OT)
- Choose xi,0 as random output of H (modeled as RO here), similar to garbled 3-row reduction
- Compute xi,1 as xi,0 ⊕ xi to obliviously transfer XOR-correlated values
- Reduces data sent by Alice by factor 2

Specific OT Functionalities: Random OT (R-OT)
- Choose xi,0 and xi,1 as random outputs of H (modeled as RO here), similar to garbled 3-row reduction
- No data sent by Alice

Performance Evaluation
[Bar chart, repeated on each of the following slides: runtime in s over Gigabit LAN and WiFi 802.11g for the variants Orig, EMT, G-OT, C-OT, R-OT, 2T, 4T. Performance for 10 million OTs on 80-bit strings. The bar values are not reliably recoverable from the transcript.]

Performance Evaluation: Original Implementation
- C++ implementation of [SZ13] implementing OT extension of [IKNP03]
- Performance for 10 million OTs on 80-bit strings

Performance Evaluation: Efficient Matrix Transposition
- Efficient matrix transposition improves computation
- Only decreases runtime in LAN, where computation is the bottleneck

Performance Evaluation: General OT
- Generating the T matrix from seeds improves communication Bob → Alice
- Runtimes only slightly faster (bottleneck: communication Alice → Bob)

Performance Evaluation: Correlated/Random OT
- Correlated/Random OT improves communication Alice → Bob
- WiFi runtime faster by factor 2 (bottleneck: communication Bob → Alice)

Performance Evaluation: Parallelization
- Parallel OT extension with 2 and 4 threads improves computation
- LAN runtime decreases linearly in the # of threads
- WiFi runtime remains the same (bottleneck: communication)

Performance Evaluation: Summary
- OT is very efficient
- Communication is the bottleneck for OT (even without using AES-NI)

Summary
Part 1: Yao vs. GMW
- can trade off size for depth
- Yao has constant #rounds, good for high-latency networks (Internet)
- GMW can precompute all crypto, good for low-latency networks (LAN)
Part 2: OT extension
- send 1 ciphertext + |payload|
- communication is the bottleneck
Bottleneck of today's secure computation protocols is communication.

EXERCISE 1
Measure speed of crypto operations with the "openssl speed" command and order them according to throughput:
- aes-128-cbc (block cipher)
- dsa2048 (public-key crypto using modular exponentiation)
- ecdsap256 (public-key crypto using point multiplication on elliptic curve)
- rsa2048 (public-key crypto using modexp in RSA group)
- sha256 (hash function)

Literature
[ALSZ13] G. Asharov, Y. Lindell, T. Schneider, M. Zohner: More efficient oblivious transfer and extensions for faster secure computation. In ACM CCS'13.
[BarniFKLSS09] M. Barni, P. Failla, V. Kolesnikov, R. Lazzeretti, A.-R. Sadeghi, T. Schneider: Secure Evaluation of Private Linear Branching Programs with Medical Applications. In ESORICS'09.
[Beaver91] D. Beaver: Efficient multiparty protocols using circuit randomization. In CRYPTO'91.
[Beaver95] D. Beaver: Precomputing oblivious transfer. In CRYPTO'95.
[BrickellPSW07] J. Brickell, D. E. Porter, V. Shmatikov, E. Witchel: Privacy-preserving remote diagnostics. In ACM CCS'07.
[DDKSSZ15] D. Demmler, G. Dessouky, F. Koushanfar, A.-R. Sadeghi, T. Schneider, S. Zeitouni: Automated Synthesis of Optimized Circuits for Secure Computation. In ACM CCS'15.
[CHKMR12] S. G. Choi, K.-W. Hwang, J. Katz, T. Malkin, D. Rubinstein: Secure multi-party computation of Boolean circuits with applications to privacy in on-line marketplaces. In CT-RSA'12.
[Eklundh72] J. O. Eklundh: A fast computer method for matrix transposing. In IEEE Transactions on Computers, 1972.
[ErkinFGKLT09] Z. Erkin, M. Franz, J. Guajardo, S. Katzenbeisser, I. Lagendijk, T. Toft: Privacy-preserving face recognition. In PETS'09.
[GMW87] O. Goldreich, S. Micali, A. Wigderson: How to play any mental game or a completeness theorem for protocols with honest majority. In STOC'87.
[IKNP03] Y. Ishai, J. Kilian, K. Nissim, E. Petrank: Extending oblivious transfers efficiently. In CRYPTO'03.
[ImpagliazzoRudich89] R. Impagliazzo, S. Rudich: Limits on the provable consequences of one-way permutations. In STOC'89.
[NaorPinkas01] M. Naor, B. Pinkas: Efficient oblivious transfer protocols. In SODA'01.
[NaorPS99] M. Naor, B. Pinkas, R. Sumner: Privacy preserving auctions and mechanism design. In EC'99.
[SHSSK15] E. M. Songhori, S. U. Hussain, A.-R. Sadeghi, T. Schneider, F. Koushanfar: TinyGarble: Highly compressed and scalable sequential garbled circuits. In IEEE S&P'15.
[SZ13] T. Schneider, M. Zohner: GMW vs. Yao? Efficient secure two-party computation with low depth circuits. In FC'13.
[Troncoso-PastorizaKC07] J. R. Troncoso-Pastoriza, S. Katzenbeisser, M. U. Celik: Privacy preserving error resilient DNA searching through oblivious automata. In ACM CCS'07.
[Yao86] A. C. Yao: How to generate and exchange secrets. In FOCS'86.