Big but personal data - South Asia Institute

Big but personal data
How our behavior makes
us unique
Yves-Alexandre de Montjoye
MIT Media Lab
12 points
Is the way you move
as unique as
your fingerprint
We can use points to
identify a fingerprint
Scott
1 point for mobility data
From 10 to 11am
~ 1 km²
2 points
Around 11:30am
3 points
For lunch
Boston
How many points do I need
to uniquely identify a
mobility trace?
De-identification
Entire country of 1.5 million people
Our behavior is unique enough
4 points
Identify 95% of people
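The idea behind the "4 points" figure can be illustrated with a minimal sketch. It assumes each trace is a set of (antenna, hour) points and estimates unicity by drawing p random points from a random user's trace, then checking whether that user's trace is the only one containing all of them. The function name `unicity`, the data layout, and the toy data are illustrative assumptions, not the procedure or data used in the study.

```python
import random

def unicity(traces, p, trials=1000, seed=0):
    """Estimate the share of draws in which p known points single out one trace.

    traces: dict mapping a user id to a set of (antenna, hour) points.
    This layout is an assumption made for the example.
    """
    rng = random.Random(seed)
    # Only users with at least p points can be tested.
    eligible = [u for u, t in traces.items() if len(t) >= p]
    if not eligible:
        raise ValueError("no trace has at least p points")
    unique = 0
    for _ in range(trials):
        user = rng.choice(eligible)
        # The p points an attacker is assumed to know about this user.
        known = set(rng.sample(sorted(traces[user]), p))
        # Count every trace that is compatible with those points.
        matches = sum(1 for t in traces.values() if known <= t)
        if matches == 1:
            unique += 1
    return unique / trials

# Toy data: users "a" and "b" share two of their three points.
toy_traces = {
    "a": {("A1", 10), ("A2", 11), ("A3", 12)},
    "b": {("A1", 10), ("A2", 11), ("A4", 12)},
    "c": {("A5", 10), ("A6", 11), ("A7", 12)},
}
print(unicity(toy_traces, 1), unicity(toy_traces, 3))
```

On this toy data, three known points always single out one user, while a single point often does not, because "a" and "b" overlap.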
Resolution: 800 pixels
Resolution: 300 pixels
Resolution: 150 pixels
Resolution: 75 pixels
Resolution: 30 pixels
Where’s Thierry?
11am
noon
8am to 12pm
Estimating Privacy
More points: much easier to find people
Coarser spatial resolution: harder to find people
Coarser temporal resolution: harder to find people
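The effect of resolution can be sketched on toy data. The example below assumes points are (x_km, y_km, hour) tuples and snaps them to a coarser grid cell and time window before comparing traces. It uses the share of users whose whole coarsened trace is distinct as a simple proxy, which is not the unicity measure used in the paper. The names `coarsen` and `fraction_distinct` are hypothetical.

```python
def coarsen(trace, cell_km, window_h):
    """Snap (x_km, y_km, hour) points to a grid cell and a time window."""
    return frozenset(
        (int(x // cell_km), int(y // cell_km), int(h // window_h))
        for x, y, h in trace
    )

def fraction_distinct(traces, cell_km, window_h):
    """Share of users whose coarsened trace differs from every other user's."""
    coarse = [coarsen(t, cell_km, window_h) for t in traces.values()]
    return sum(1 for c in coarse if coarse.count(c) == 1) / len(coarse)

# Toy data: "a" and "b" visit the same 1 km cells at different hours.
toy_traces = {
    "a": [(0.2, 0.3, 8), (3.1, 0.4, 11)],
    "b": [(0.8, 0.9, 9), (3.9, 0.1, 10)],
    "c": [(7.5, 7.5, 8), (9.0, 2.0, 11)],
}
fine = fraction_distinct(toy_traces, cell_km=1, window_h=1)
coarse = fraction_distinct(toy_traces, cell_km=1, window_h=4)
print(fine, coarse)
```

With one-hour windows all three toy users are distinct. With four-hour windows "a" and "b" become indistinguishable, so only one user in three remains distinct.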
It’s hard to hide in the crowd
de Montjoye, Y. A., Hidalgo, C. A., Verleysen, M., & Blondel, V. D. (2013).
Unique in the Crowd: The privacy bounds of human mobility. Scientific Reports, 3, 1376.
BFI: Personality test
Behavioral indicators derived from
metadata using the Bandicoot toolbox
Predicting personality
using metadata
de Montjoye, Y. A., Quoidbach, J., Robic, F., & Pentland, A. S.
(2013). Predicting personality using novel mobile phone-based
metrics. In Social Computing, Behavioral-Cultural Modeling and
Prediction (pp. 48-55). Springer Berlin Heidelberg.
Unicity: quantifying the privacy-utility
trade-off
Simple anonymization does not work even when
the data is coarse
1. Informed anonymization
2. Online system
Informed anonymization: D4D
Challenge
e.g. 2-week mobility traces
of 27 x 300,000 individuals
+ Bandicoot indicators
de Montjoye, Y. A., Smoreda, Z., Trinquart, R., Ziemlicki, C., & Blondel, V.
D. (2014). D4D-Senegal: The Second Mobile Phone Data for
Development Challenge. arXiv preprint arXiv:1407.4885.
Online systems: from privacy to
security using openPDS
- Only shares
answers, not raw
data
- Individual
quantification of
the risks
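The "answers, not raw data" principle can be sketched as follows. This is a toy illustration under stated assumptions, not the openPDS API: the class `PersonalDataStore`, its methods, and the question `is_active_at_night` are all hypothetical. Raw records stay inside the store, and a caller can only obtain the result of a question that has been registered in advance.

```python
class PersonalDataStore:
    """Hypothetical store: raw records stay private, only answers leave."""

    def __init__(self, records):
        self._records = list(records)  # raw (hour, place) metadata
        self._questions = {}           # approved question name -> function

    def register(self, name, func):
        """Approve a question; func reads raw data and returns a small answer."""
        self._questions[name] = func

    def answer(self, name):
        """Run an approved question and return only its answer."""
        if name not in self._questions:
            raise PermissionError(f"question {name!r} is not approved")
        return self._questions[name](self._records)

def is_active_at_night(records):
    """Example question: a single boolean instead of the full trace."""
    return any(hour >= 22 or hour < 6 for hour, _place in records)

pds = PersonalDataStore([(9, "home"), (13, "office"), (23, "bar")])
pds.register("active_at_night", is_active_at_night)
print(pds.answer("active_at_night"))
```

The caller learns one low-dimensional answer, and a request for a question that was never approved raises an error rather than exposing the records.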
openPDS
de Montjoye, Y. A., Wang, S., & Pentland, A. (2012). On the Trusted Use of Large-Scale Personal Data. IEEE Data Engineering Bulletin, 35(4).
de Montjoye, Y. A., Shmueli, E., Wang, S. S., & Pentland, A. S. (2014).
openPDS: Protecting the Privacy of Metadata through SafeAnswers.
PLoS ONE, 9(7), e98790.
Yves-Alexandre de Montjoye
MIT Media Lab
yva@mit.edu
http://deMontjoye.com
In collaboration with Alex “Sandy” Pentland, César Hidalgo, Vincent Blondel,
Michel Verleysen, Erez Shmueli, Arek Stopczynski, Sune Lehmann