Document 6534887

Transcription

Document 6534887
Data Spillage in Hadoop Clouds: An Overview
Adviser: Dr. Brandeis Marshall
Student: Dri Torres
Problem Statement
Motivation
What are possible Data Spill solutions in Hadoop, and
can they integrate with current Data Spill procedure?
Economic, Data Spill recovery costs millions.
Address Security downfalls in Hadoop
Significance
To harness the full potential of Big Data, Data Spillage must be
addressed
What is Data
Spillage?
The transfer
of classified
data onto an
unclassified
information
systems
Once
classified data
has leaked it is
dirty
What is Big
Data?
32% of
Companie
s use
Hadoop
A Buzz Word
Less than
50% think it is
clearly
defined
Processes
terabytes of
data in
minutes
Hadoop Stores
and processes
Big Data
Volume,
Velocity,
Variety (The 3
Vs)
NIST Procedure for Data Spillage
The Problem with
Hadoop is “it’s a
moving target”
Spillage and Cloud
Computing
Cloud Environments
A Top-Down
Approach
Standardization
Information
Sharing
Governments
should establish
an advanced
Analytics Agency
Standardized
Software and
Hardware
Leads to
Information
Securing
Standard skilled
and professional
staff
Always report a
spill. Fully
document and
publish work
• Public
• Private
• Separate Cloud
Technology
• Map Reduce
• Yet Another Resource
YARN(MR
Negotiator
v2)
• Adopted 2013
• Compatible with more
than MR
Long Term Storage
Big Data Analytics
Knowledge Systems for Metadata
Inter-Agency Collaboration
Solutions
Establish Big Data
control centers
Private
Hadoop
Networks
Sample Cases for
Analyzing Spills
References
Gang-Hoon Kim, Silvana Trimi, and Ji-Hyong Chung. 2014. Big-data applications in the
government sector. Commun. ACM 57, 3 (March 2014), 78-85. DOI=10.1145/2500873
http://doi.acm.org/10.1145/2500873
Lindner, Felix FX, and Sandro Gaycken. "Back to Basics: Beyond Network Hygiene." Best
Practices in Computer Network Defense: Incident Detection and Response (2014): 54-64.
Recurity Labs. IOS Press, 10 Feb. 2014. Web. 17 July 2014. <http://recurity-labs.com>.
Michael Stonebraker and Judy Robertson. 2013. Big data is 'buzzword du jour;' CS
academics 'have the best job'. Commun. ACM 56, 9 (September 2013), 10-11.
DOI=10.1145/2500468.2500471
This material is based upon work supported by the National Science Foundation under grant #1062970. Any opinions, findings,
and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views
of the National Science Foundation.
INSuRE is training students in information security research. Sponsored by the National Science Foundation with problems
provided by the National Security Agency.