SHRP 2 Safety Database Access Phase 1
Stakeholder Request for Information
Purpose
The Transportation Research Board (TRB) of the National Academy of Sciences is
seeking input from interested stakeholders on key implementation issues to be
addressed during the initial five-year implementation phase (Phase 1) of the
Strategic Highway Research Program 2 (SHRP 2) Safety Database. This is a Request
for Information (RFI) only and not a solicitation for proposals.
Information and comments are being sought that address: approaches to providing
widespread access to the data for qualified researchers while ensuring the
confidentiality of personally identifiable information from volunteer research
participants; best practices for managing large data sets; research areas for
database utilization; building capacity for data use; provision of user support
services; analysis and visualization tools and techniques; and means to foster and
sustain user communities that will support program operations as described below
and aid in developing a business plan for long-term, sustainable operation of the
database. Responders are encouraged to consider both short-term (within five
years) and long-term (beyond five years) time horizons in thinking about the data.
Background
The SHRP 2 Safety Database, consisting of linked data from the Naturalistic Driving
Study (NDS) and Roadway Information Database (RID), provides transportation
safety practitioners with an unprecedented national resource. Traditionally, data
collected by police investigation after a crash has occurred have been used in
addressing highway safety issues. This information comes from an examination of
the scene, interviews with drivers and witnesses, and judgments about what the
driver was doing at the time of the crash. The SHRP 2 Safety Database takes this
state of knowledge to the next level by providing detailed data about what the
driver was doing before and during the crash sequence as well as in ‘near-crash’ and
‘normal’ driving situations. Such information will enable safety practitioners to
design better countermeasures to reduce the societal harm that results from the
33,561 fatalities and 2,362,000 injuries that occurred on the nation’s roadways in
2012 [1].
[1] http://www-nrd.nhtsa.dot.gov/Pubs/812016.pdf
The SHRP 2 Safety Database consists of an extensive collection of detailed
information describing the vehicle, driver, trip, and roadway. The study collected
data from over 3,000 volunteer drivers of all age and gender groups during a
three-year data collection period (most drivers participated for 1 to 2 years), amounting to
nearly 50 million vehicle miles, 5 million trip files, over 3,900 vehicle years, more
than 1 million hours of video, and 2 petabytes of NDS data. Data were collected
across six sites: Florida, Indiana, New York, North Carolina, Pennsylvania, and
Washington. Instrumented vans collected detailed roadway data on 25,000
centerline miles of roadway across the study sites. In addition, approximately
400,000 miles of less detailed data from the inventories of the study states and
supplemental data on traffic, weather, work zones, crashes, traffic laws, roadway
improvements, and other subjects were aggregated to create the RID, resulting in
50-60 GB of spatial data with an additional 8 TB video log.
Access to the SHRP 2 Safety Database is managed through a set of data sharing
policies and procedures to ensure compliance with the data privacy requirements
established by the responsible Institutional Review Boards (IRB) in accordance with
federal law governing human subjects testing. NDS data access is provided to
qualified researchers who have completed training from an accredited IRB; data
downloads of any sort require a Data Sharing Agreement; and access to Personally
Identifiable Information (PII) is allowed only in a secure data enclave. PII may be
used to generate non-identifying reduced data sets but may not be copied or
removed from the enclave. The RID does not contain any PII and may be accessed
openly.
The entire SHRP 2 Safety Database will be complete in early 2015, and plans are
currently being made to provide access to qualified researchers. At this time, access
to a partial summary dataset is provided via a web-based interface
(https://insight.shrp2nds.us/), while access to detailed data is available through the
Virginia Tech Transportation Institute (VTTI) for NDS and the Iowa State University
Center for Transportation Research and Education (CTRE) for RID. Personally
identifiable information is available through VTTI’s secure data enclave in
Blacksburg, VA. A further explanation of the database and the current means of accessing it
is available on the SHRP 2 Safety Database (InSight) website linked above.
Additional information concerning data collection and use, participant privacy
protection, and database structure is contained in:
Appendix A: Assurance of Subject Confidentiality
Appendix B: SHRP 2 NDS Data Access and Privacy Requirements
Appendix C: Requirements for an Operator of the SHRP 2 NDS Data
Appendix D: SHRP 2 NDS and RID Database Structure
Phase 1 Operation
A National Research Council (NRC)-appointed Committee on Long-Term
Stewardship of Safety Data (LTSC) for the Second Strategic Highway Research
Program recommended that “because of the uniqueness, scale, scope, and
complexity of the data, as well as uncertainty about user demand and willingness to
pay for access, devising a final plan for long-term stewardship would be premature;
therefore, future oversight and administration of the data should proceed in a
phased manner.” [2] Accordingly, a 5-year period commencing in early 2015 will serve
as “Phase 1” of use and oversight of SHRP 2 Safety Data.
Phase 1 is being administered by TRB under a Cooperative Agreement with the
Federal Highway Administration (FHWA) and in accordance with a Memorandum of
Understanding among TRB, FHWA, the American Association of State Highway and
Transportation Officials (AASHTO), and the National Highway Traffic Safety
Administration (NHTSA).
Consistent with the recommendations of the LTSC [2,3], TRB will operate the Phase 1
program focusing on two primary goals:
(1) Promoting conditions under which SHRP 2 Safety Data will be available
to qualified users during Phase 1, and
(2) Gaining experience and data to support decisions about implementation
and oversight of the data after Phase 1.
Phase 1 is overseen by the Safety Data Oversight Committee (SDOC), newly
appointed by the NRC. The SDOC, with supporting technical panels called Expert Task Groups
(ETGs), guides efforts to operate and maintain the database during Phase 1 and
provide access to data users. The SDOC is charged with establishing the program
structure and approving all policies and procedures, including developing and
approving data sharing/access policies and procedures for all operators and users
of the data.
During Phase 1, VTTI will continue to serve as the repository for the SHRP 2 NDS
database and CTRE will continue to house the RID. Modifications to data access
policies and procedures described above and in the attachments are being
considered by the SDOC in order to facilitate researcher access to more of the
database while maintaining participant confidentiality. Responses to this RFI will
inform the SDOC’s considerations. Note that changes to data sharing policies and
procedures are also subject to review by the Institutional Review Boards of the
National Academy of Sciences and the Virginia Polytechnic Institute and State
University.
During the Phase 1 period it will be important to focus on database usability by
making the data visible, accessible, and analyzable to a wide variety of qualified
researchers to optimize the value of this national resource and understand its
utility, while maintaining the privacy of study participants in accordance with the
informed consent agreement. Developing and deploying data visualization and
analysis tools to simplify access and improve the quality and speed of analysis will
be key to supporting this goal. It will also be critically important to understand the
costs and funding models associated with database utilization.
[2] http://www.trb.org/StrategicHighwayResearchProgram2SHRP2/Blurbs/168924.aspx
[3] http://www.trb.org/StrategicHighwayResearchProgram2SHRP2/Blurbs/169661.aspx
Request for Information
TRB is seeking input from stakeholders on how to best accomplish the stated goals
of Phase 1. Information and comments are being sought addressing: approaches to
providing widespread and cost-effective access to the data for qualified researchers
while ensuring the confidentiality of personally identifiable information; best
practices for managing large data sets; research areas for database utilization;
building capacity for data use; provision of user support services; analysis and
visualization tools and techniques; and means to foster and sustain user
communities that will support program operations and aid in developing a business
plan for long-term, sustainable operation of the database.
Recommendations contained in both of the Long Term Stewardship Committee’s
letter reports cited in the Background section of this document, as well as
requirements for protecting participant privacy described in Appendices A-C,
should be carefully considered when responding.
We welcome all stakeholders’ thoughts and concepts for optimizing Phase 1
operation. In particular, we are soliciting input in five major areas concerning
possible:
1) Data Access Models:
• Management of the full processed data set and any reduced data sets
• Types of data products various user communities would find useful
• Data quality and version control
• Disposition of derivative work products, such as analysis techniques and
reduced data sets, including proprietary rights vs. public access, quality
assurance, hosting, and availability of knowledgeable technical support
• Website access and user support
• Means to provide support to all interested users
• Protecting participant privacy, including implications of data use and
retention outside of the United States
• Centralized vs. distributed data access models
• Implications of a single prime operator maintaining the working copy of
the processed database vs. multiple operators maintaining working
copies of the processed database
2) Approaches to Managing Operations:
• Single versus Multiple Entity Control
• Public / Private Partnerships
• Consortium Approaches
• Measures of Performance for Evaluating Phase 1 Operations
3) Sustainable Financial Models:
• Business Models & Fee Structures
• Sources of Financial Support for Database Infrastructure
• Cost Sharing Approaches
• Funding for Analysis Projects
4) Potential User Communities:
• Safety / Mobility / Health Care / Vehicle Insurance / Other
• Public (Local / State / Federal) / Private / Academic
• “Non-Safety” (Planning / Operations / Environment and Energy /
Psychology / Machine learning / Data Mining and Analysis / Database
Operations)
• Domestic / Global
5) Database Technology:
• Effect of Current ‘Big Data’ Technologies on Phase 1 Approach
• Applicability of Relational Database Management Systems
• Other Database Management approaches that should be considered
• Data Redundancy and Archiving Needs
• Impact of Database Technology Evolution during Phase 1
Comments are welcome on any or all of the topic areas listed, as well as on items we
may have overlooked. Comments should provide enough specifics to demonstrate
that a described approach is feasible and under what conditions, including examples
of where it is being used. TRB will utilize the information received to support SDOC
oversight of Phase 1 operations.
Instructions for Submission
Responses should explicitly address how the proposed concepts support data
access for all interested qualified researchers, regardless of experience level or
financial means, while maintaining participant privacy.
Respondents should indicate if they have been involved in any similar large data
management projects with government agencies and if so, describe the business
model and lessons learned.
This is a request for information only and not a solicitation for proposals. No
funding is associated with this RFI.
All RFI responses become the property of the Transportation Research Board. Final
disposition will be made according to the policies thereof.
RFI Contact:
David J. Plazak, Senior Program Officer
Strategic Highway Research Program 2
Transportation Research Board
500 Fifth Street, NW
Washington, DC 20001
Phone: 202-334-1834
E-Mail: dplazak@nas.edu
RFI Due Date: Responses are due by the close of business on February 17, 2015. Late
submissions will be considered to the extent possible in compiling the results. The
preferred electronic format is PDF. Please submit responses to the e-mail address
shown above.
Any clarifications regarding this RFI will be posted on the SHRP 2 Web site
(www.TRB.org/SHRP2) before January 30, 2015. Announcements of such
clarifications will be posted on the front page and, when possible, will be noted in
the TRB e-newsletter. Responders are advised to check the Web site frequently.
Appendix A:
Assurance of Subject Confidentiality
(Excerpted from the SHRP 2 Naturalistic Driving Study Participant Consent Form)
To summarize, your level of confidentiality in this study is as follows:
1. There will be video of your face and upper body. There will be audio recorded, but only
for 30 seconds if you press the red incident button. The study also will collect health
and driving data about you. The video, audio, and other data that personally identifies
you, or could be used to personally identify you, will be held under a high level of
security at one or more data storage facilities. Your data will be identified with a code
rather than your name.
2. The faces of other drivers will be blurred, blacked out, or replaced by an animation once
it is determined they have not signed a consent form. This will be done prior to any
additional video review by the researchers. No identifying information will be
collected on passengers.
3. For the purposes of this project, only authorized project personnel, authorized
employees of the project sponsors, and qualified research partners will have access to
study data containing personally identifying information, or that could be used to
personally identify you. The data, including face video, which has been blurred, blacked
out, or replaced by animation, may be shown at research conferences and by the
research sponsors for the highway and road safety purposes identified above. Under no
circumstances will your name and other personally identifying information be
associated with the video clips.
4. The personally identifying data collected in this study may be analyzed in the future for
other research purposes by this project team or by other qualified researchers in a
secure environment. Such efforts will require those researchers to sign a data sharing
agreement, which will continue to protect your confidentiality, and will also require
additional IRB approval. The confidentiality protection provided to you by these data
sharing agreements will be as great as or greater than the level provided and described
in this document. Research partners will not be permitted to copy raw data that
identifies you, or that could be used to identify you, or to remove it from the secure
facility in which it is stored except with your consent.
5. A Certificate of Confidentiality has been obtained from the National Institutes of Health.
With this Certificate, the researchers and study sponsors cannot be forced to disclose
information that may identify you, even by a court subpoena, in any federal, state, or
local civil, criminal, administrative, legislative, or other proceedings. However, the
Certificate of Confidentiality does not prevent the researchers from disclosing
voluntarily matters such as child abuse, or a participant’s threatened violence to self or
others. In terms of a vehicle, this could also include items such as driving under the
influence of drugs or alcohol, allowing an unlicensed minor to drive the vehicle, or
habitually running red lights at high speed. Such behaviors may result in your removal
from the study and reporting of the behavior to the appropriate authorities.
Appendix B:
SHRP 2 NDS Data Access and Privacy Requirements
The SHRP 2 NDS data were collected from the vehicles of volunteer participant drivers.
This activity was carried out under federal regulations pertaining to human subjects
research, which require the review and approval of at least one Institutional Review Board
(IRB). The fundamental requirements for data access and privacy protection are drawn
from the consent form signed by the participants and the data collection protocol, both of
which were approved by several IRBs.
In general, access to NDS data:
• will only be granted for research purposes.
• will take place under a data sharing agreement that protects confidentiality as
described in the consent form.
• will require the consent of the researcher’s IRB.
In the case of personally identifying information (PII), such as face video, audio, and GPS,
the following additional conditions must be met:
• PII may only be used in a secure data enclave and cannot be removed from the
enclave.
• Face video may be shown in public only if the face is sufficiently obscured to prevent
identification of the participant.
The research protocol requires that data be destroyed on the following schedule:
• 30 years after collection for personally identifying data.
• 40 years after collection for de-identified continuous sensor data.
• No scheduled destruction (retained indefinitely) for de-identified summary data
resulting from analyses and reductions.
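As an illustration only, the schedule above could be encoded as a small lookup that returns the date by which a data item must be destroyed. This is a minimal sketch, not part of the research protocol: the category names, the function name, and the reading of "indefinitely" as "no scheduled destruction" are assumptions made for the example.

```python
from datetime import date
from typing import Optional

# Retention periods in years, taken from the schedule above.
# None is this sketch's way of saying "no scheduled destruction".
# The category keys are hypothetical labels, not official SHRP 2 terms.
RETENTION_YEARS = {
    "personally_identifying": 30,
    "deidentified_sensor": 40,
    "deidentified_summary": None,
}

def destruction_date(category: str, collected: date) -> Optional[date]:
    """Return the destruction deadline for a data item, or None if it is kept indefinitely."""
    years = RETENTION_YEARS[category]
    if years is None:
        return None
    try:
        return collected.replace(year=collected.year + years)
    except ValueError:
        # Collected on Feb 29 and the target year is not a leap year: use Feb 28.
        return collected.replace(year=collected.year + years, day=28)

pii_deadline = destruction_date("personally_identifying", date(2012, 6, 1))
summary_deadline = destruction_date("deidentified_summary", date(2012, 6, 1))
```

An operator would presumably track the collection date per trip file or per participant; the sketch leaves that choice open.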
The current data sharing agreement for SHRP 2 NDS data contains several requirements
derived from the basic requirements in the consent form and the data collection protocol.
These include the following general requirements for a researcher who receives a portion
of the data:
• The recipient and all personnel who will handle the data must be certified as having
received standard training in the ethics of human subjects research.
• The recipient must agree to hold the provided data securely.
• The recipient must agree not to share the data with others who have not signed a
data sharing agreement.
• The recipient must agree not to attempt to learn the identity of research
participants.
• If the recipient discovers identifying information or data in a dataset that was
intended to be non-identifying, he or she must agree to provide that information to
the data host so that it can be properly de-identified for future use.
• The recipient must agree not to use data for purposes other than the research
described in the data sharing agreement and for presentation, demonstration, and
project reporting; an additional data sharing agreement will be required for other
uses.
• The recipient must agree to properly acknowledge the source of the data in any
presentations.
• The recipient must agree not to release or share information leading to the
identification of participants or to release or share non-identifying raw data.
• The recipient must agree to return or destroy the data on a pre-determined
schedule.
Qualified researchers may access personally identifying data, such as videos of a driver’s
face, consistent with the approved needs of the research project. To do this they must
agree to:
• Obtain institutional approval in advance for their research.
• Certify that they have appropriate training in the use of personally identifying data.
• Access the data in a secure enclave.
• Not remove personally identifying data from the enclave.
• Not disclose any data that might in any way identify individual participants.
Appendix C:
Requirements for an Operator of the SHRP 2 NDS Data
Any SHRP 2 NDS operator must ensure that all of the requirements for data use in
Appendix B are met for each occurrence of data access or data sharing. These requirements
also apply to each instance of data use by the operator’s own personnel or agents.
In addition, the operator must possess the facilities and resources necessary for:
• Data encryption and security
• Electronic firewalls and locked storage facilities
• Password authentication of users
• Audit trails
• Disaster prevention and recovery plans
• Security measures for all data backup tapes, at the same level as working data
• Reduction of PII for other researchers
• Ethics training and non-disclosure agreements for data center staff
• Creating, managing, and storing required documentation, such as data sharing
agreements, certifications of ethics training, IRB approvals, etc.
Personally-identifying data must be accessed only in a “secure enclave” with computers
that cannot access the internet and with no CD or DVD recordable drives and no writeable
USB ports, so that personally-identifying data cannot be removed from the enclave. Persons
accessing the data must not bring cell phones or cameras into the enclave. In the enclave,
persons accessing the data “reduce” it by coding variables of interest, for example coding
eye glances from video data. The reduced data are not personally identifying and may be
removed from the enclave.
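To make the idea of "reducing" video concrete, the sketch below turns a hypothetical frame-by-frame eye-glance coding into a table of glance intervals that contains no imagery. It is an illustration under stated assumptions, not the procedure used in the enclave: the glance labels and the function name are invented for the example, and the 15 Hz frame rate is the video rate given in Appendix D.

```python
from itertools import groupby

VIDEO_HZ = 15  # video frame rate stated in Appendix D

def glance_intervals(frame_codes, hz=VIDEO_HZ):
    """Collapse per-frame glance codes into (code, start_seconds, duration_seconds) rows."""
    intervals = []
    frame = 0
    for code, run in groupby(frame_codes):
        n = len(list(run))          # number of consecutive frames with this code
        intervals.append((code, frame / hz, n / hz))
        frame += n
    return intervals

# Hypothetical coding of 4 seconds of face video by a data reductionist.
codes = ["forward"] * 30 + ["mirror"] * 15 + ["forward"] * 15
reduced = glance_intervals(codes)
```

Only the resulting rows (label, start time, duration) would leave the enclave in this sketch; the face video they were coded from stays inside.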
As the governing body for Phase 1, the SDOC may modify or add to the policies described in
this document as long as the level of confidentiality for the data and privacy protection for
the SHRP 2 NDS participants is never less than the protection described to the participants
in their consent forms.
Appendix D:
SHRP 2 Safety Database Content and Structure
The SHRP 2 NDS data, currently housed at VTTI, occupy about 2 PB of storage composed of
roughly 1.2 PB of video data and 600 TB of sensor data. Video data reside on an EMC Isilon
system with a total of 15 nodes. Sensor data are stored in an IBM Infosphere Data
Warehouse (DB2) database. The sensor data in this configuration include time series
parametric data collected from the data acquisition system in the participant vehicles, plus
metadata, indexing space, etc. Sensor data are stored with varying frequency (often
referred to as “asynchronous”), each sensor at its native sampling rate. Periodic sample
rates can vary from 1 Hz for GPS to a short buffered 640 Hz for acceleration that is only
stored when a collision is detected by the Data Acquisition System (DAS). Rate data from
the gyro are nominally sampled at 10 Hz (and buffered at 100 Hz for a collision), 3-axis
acceleration at 10 Hz (buffered at 640 Hz), forward radar at 12.5 Hz, and video channels at
15 Hz. All channels are referenced to a common system clock, which is disciplined by GPS;
GPS time is also recorded as part of the GPS sensor module.
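Because each channel is stored at its native rate, an analyst typically has to bring channels onto one timeline before comparing them. The sketch below shows one common approach, holding the most recent slower-rate sample for each faster-rate timestamp. It is a minimal illustration, not the method used by VTTI; the function name and sample values are invented, and timestamps are assumed to be integer milliseconds on the common system clock.

```python
from bisect import bisect_right

def align_hold_last(target_times, src_times, src_values):
    """For each target timestamp, return the latest source value at or before it.

    src_times must be sorted ascending. Returns None where no source
    sample has been recorded yet.
    """
    aligned = []
    for t in target_times:
        i = bisect_right(src_times, t)   # count of source samples with time <= t
        aligned.append(src_values[i - 1] if i > 0 else None)
    return aligned

# One second of a 10 Hz accelerometer timeline (milliseconds).
accel_times = list(range(0, 1000, 100))
# 1 Hz GPS speed samples (hypothetical values).
gps_times = [0, 1000]
gps_speed = [12.0, 12.5]

speed_at_accel_rate = align_hold_last(accel_times, gps_times, gps_speed)
```

Holding the last value is the simplest choice; interpolation between samples may be more appropriate for smoothly varying channels, and that decision is left to the analyst.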
NDS data are stored in segments of time series referred to together as trip files. Typically,
the DAS begins recording shortly after the vehicle is started and records continuously until
the vehicle is turned off. The period of time from key-on to key-off is referred to as a trip.
Note that under certain conditions a single key-on to key-off trip may consist of
multiple trip files that would need to be concatenated to reconstruct the trip. A typical trip
may have 1.5 to 2 million records of sensor data. Overall, the sensor data are estimated at
5-10 trillion records.
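The concatenation step described above can be sketched as follows. This is an illustrative example only: the `TripFile` structure, its field names, and the sample values are assumptions, since the RFI does not describe the trip file format. The last lines are a rough consistency check that multiplies the roughly 5 million trip files cited earlier in this RFI by the typical per-trip record counts, treating every file as a typical trip.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TripFile:
    trip_id: str                       # hypothetical key shared by files of one trip
    start_ms: int                      # start time on the common system clock
    records: List[Tuple[int, float]]   # (timestamp_ms, value) pairs

def reconstruct_trip(files: List[TripFile]) -> List[Tuple[int, float]]:
    """Concatenate the files of one key-on to key-off trip in time order."""
    ordered = sorted(files, key=lambda f: f.start_ms)
    merged: List[Tuple[int, float]] = []
    for f in ordered:
        merged.extend(f.records)
    return merged

# Two files belonging to the same trip, supplied out of order.
part_b = TripFile("trip-001", 2000, [(2000, 13.1), (3000, 13.4)])
part_a = TripFile("trip-001", 0, [(0, 0.0), (1000, 6.2)])
trip = reconstruct_trip([part_b, part_a])

# Rough scale check: 5 million trip files at 1.5 to 2 million records each.
TRIP_FILES = 5_000_000
low_estimate = TRIP_FILES * 1_500_000    # 7.5 trillion
high_estimate = TRIP_FILES * 2_000_000   # 10 trillion
```

Under that simplification the product falls in the upper part of the 5-10 trillion range quoted above, which suggests the two figures are broadly consistent.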
The time series data, as well as ancillary data, metadata, and other structured data, are
stored in a massively parallel processing (MPP) relational database system that makes
them accessible for data mining and analysis. These data are available for ad hoc use by
data reductionists annotating individual files as well as providing a platform for data
mining, trigger execution, and other complex analytics. In an MPP database, most data
typically are scattered among several “worker” servers, each with a segment of the large
data set, while queries are submitted to a “query coordinator” node that distributes the
queries to the workers and gathers the query results for the client. The current MPP
database configuration uses thirty-two database partitions among four worker nodes.
System enhancements may be needed to accommodate the increased workloads and
number of users anticipated in Phase 1.
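The coordinator and worker pattern described above can be illustrated with a toy in-memory model. This is a sketch of the general scatter-gather idea, not of the IBM product's internals: the class names, the choice of trip ID as the distribution key, and the CRC-based hash are all assumptions. Only the counts (thirty-two partitions, four worker nodes) come from the text.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 32   # database partitions, from the text
NUM_WORKERS = 4       # worker nodes, from the text

def partition_of(trip_id: str) -> int:
    """Map a row to a partition by hashing its (assumed) distribution key."""
    return zlib.crc32(trip_id.encode("utf-8")) % NUM_PARTITIONS

def worker_of(partition: int) -> int:
    """Assign partitions round-robin, giving each worker 8 partitions."""
    return partition % NUM_WORKERS

class Worker:
    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, partition, row):
        self.partitions[partition].append(row)

    def local_max_speed(self):
        # Each worker scans only its own segment of the data.
        speeds = [r["speed"] for rows in self.partitions.values() for r in rows]
        return max(speeds) if speeds else None

class Coordinator:
    def __init__(self):
        self.workers = [Worker() for _ in range(NUM_WORKERS)]

    def insert(self, row):
        p = partition_of(row["trip_id"])
        self.workers[worker_of(p)].insert(p, row)

    def max_speed(self):
        # Scatter the query to every worker, then gather and combine partial results.
        partials = [w.local_max_speed() for w in self.workers]
        partials = [p for p in partials if p is not None]
        return max(partials) if partials else None

db = Coordinator()
for i in range(100):
    db.insert({"trip_id": f"trip-{i}", "speed": float(i % 70)})
fastest = db.max_speed()
```

The point of the pattern is that each worker touches only its own share of the rows, so aggregate queries can run in parallel before the coordinator combines the small partial results.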
The tables below provide an overview of the detailed NDS data elements collected.
Vehicle Detail Variables
• Vehicle ID
• Vehicle Classification
• Advanced Technology Vehicle
• Model Year
• Vehicle Make
• Site Name
• Powertrain
• Left Front Tread Depth
• Left Rear Tread Depth
• Right Front Tread Depth
• Right Rear Tread Depth
• Left Front Pressure
• Left Rear Pressure
• Right Front Pressure
• Right Rear Pressure
• Battery Voltage
• Battery Amps
• Battery Condition
• Battery Date
• Data Collection Configuration
• Integrated Cell Phone
• Controls Location
• Phonebook Access
• Phonebook Display Location
• OnStar
• Music Control
• Speech Recognition
• Factory Navigation
• Navigation Display Location
• Accept Nomadics
• Nomadics Method
Driver Assessments
• Driver Demographic Questionnaire Data Dictionary
• Medical Conditions & Medications Data Dictionary
• Driving Knowledge Data Dictionary
• Risky Behavior Data Dictionary
• JAMAR Hand Strength Data Dictionary
• Driver Vision Testing Results Data Dictionary
• Medical Conditions and Medications - Exit Data Dictionary
• Exit Interview Data Dictionary
• Driving History Questionnaire Data Dictionary
• Barkley's Quick Screen Data Dictionary (ADHD screening)
• Perception of Risk Data Dictionary
• Sensation Seeking Data Dictionary
• Modified Manchester Driver Behavior Data Dictionary
• Sleep Questionnaire Data Dictionary
• Visual and Cognitive Testing Results Data Dictionary
Trip Summary Variables
• Trip ID
• Trip Start UTC Hour of Day
• Trip Start UTC Month
• Trip End UTC Hour of Day
• Trip Start Local Time Hour of Day
• Trip Start Month Local
• Trip End Local Time Hour of Day
• Trip Day of Week
• Trip Day Number in Study
• Trip Duration
• Trip Centroid Longitude
• Trip Distance Origin to Destination
• Max Speed
• Mean Speed
• Time Moving
• Time Not Moving
• Maximum Acceleration
• Maximum Deceleration
• Maximum Lateral Acceleration
• Minimum Lateral Acceleration
• Maximum Turn Rate
• Minimum Turn Rate
• Number of Longitudinal Accels > Threshold
• Number of Longitudinal Decels > Threshold
• Number of Lateral Accels > Threshold
• Brake Activations
• Lane Tracker Right Side High Quality Time
• Lane Tracker Left Side High Quality Time
• Face Tracker High Quality Time
• ABS Available
• ABS Activation
• Turn Signal Available
• Turn Signal Activations
• Traction Control Available
• Traction Control Activation
• Vehicle Network Supports Seatbelt
• Seatbelt Usage Percentage
• Vehicle Network Supports Wipers
• Time Wipers Used
• Vehicle Network Supports Cruise Control
• Time at 0-10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, and > 80 mph
(one variable per speed range)
• Distance at 0-10, 10-20, 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, and > 80 mph
(one variable per speed range)
• Time Where Radar Targets = 0, 1, 2, 3, 4, 5, and 6+ (one variable per count)
• Distance Where Radar Targets = 0, 1, 2, and 5 (one variable per count)
• Time Where Headway 0.0 - 0.5 s, 0.5 - 1.0 s, 1.0 - 1.5 s, 1.5 - 2.0 s, 2.0 - 2.5 s,
2.5 - 3.0 s, 3.0 - 3.5 s, and > 3.5 s (one variable per headway range)
• Distance with Lead Vehicle
• Distance Where Headway 0.0 - 0.5 s, 0.5 - 1.0 s, 1.0 - 1.5 s, 1.5 - 2.0 s, 2.0 - 2.5 s,
2.5 - 3.0 s, 3.0 - 3.5 s, and > 3.5 s (one variable per headway range)
• Minimum TTC
• Vehicle Model Year
Time Series Variables
• ABS Activation
• Acceleration, x-axis
• Acceleration, x-axis fast
• Acceleration, y-axis
• Acceleration, y-axis fast
• Acceleration, z-axis
• Acceleration, z-axis fast
• Airbag, Driver
• Alcohol
• Cabin Audio
• Cruise Control
• Day
• Dilution of Precision, Position
• Distance
• Driver Button Flag
• Electronic Stability Control
• Elevation, GPS
• Engine RPM
• Epoch State
• Head Confidence
• Head Position X
• Head Position X Baseline
• Head Position Y
• Head Position Y Baseline
• Head Position Z
• Head Position Z Baseline
• Head Rotation X
• Head Rotation X Baseline
• Head Rotation Y
• Head Rotation Y Baseline
• Head Rotation Z
• Head Rotation Z Baseline
• Heading, GPS
• Headlight Setting
• Illuminance, Ambient
• Lane Marking, Distance, Left
• Lane Marking, Distance, Right
• Lane Marking, Probability, Left
• Lane Marking, Probability, Right
• Lane Marking, Type, Left
• Lane Marking, Type, Right
• Lane Position Offset
• Lane Width
• Latitude
• Location
• Longitude
• Month
• Number of Satellites
• Pedal, Accelerator Position
• Pedal, Brake
• Pitch Rate, y-axis
• Pitch Rate, y-axis fast
• PRNDL
• Radar, Range Rate Forward X Track 0 through Track 7 (one variable per track)
• Radar, Range Rate Forward Y Track 0 through Track 7 (one variable per track)
• Radar, Range, Forward X Track 0 through Track 7 (one variable per track)
• Radar, Range, Forward Y Track 0 through Track 7 (one variable per track)
• Radar, Target Identification Track 0 through Track 7 (one variable per track)
• Roll Rate, x-axis
• Roll Rate, x-axis fast
• Seatbelt, Driver
• Speed, GPS
• Speed, Vehicle Network
• Steering Wheel Position
• Subject_ID
• Temperature, Interior
• Time
• Timestamp
• Traction Control
• Turn Signal
• vehicle_id
• Video Frame
• Video IP and Steering Wheel View
• Video, Driver and Left Side View
• Video, Forward Roadway
• Video, Occupancy Snapshot
• Video, Rear View
• Wiper Setting
• Yaw Rate, z-axis
• Yaw Rate, z-axis fast
• Year
The SHRP 2 RID database, currently housed at Iowa State University, is a 50-60 GB spatial
database that nominally covers the areas where the SHRP 2 NDS participants drove. The
RID combines information from the 4 sources listed below in a single database with a
common reference system and does not contain any PII. In addition, Item 2 below includes
a video log that adds about 8 TB to the file size.
1. A centerline base map with coverage of the continental US.
2. Roadway data collected by SHRP 2 following a standard protocol and quality control
(about 12,500 centerline miles total for the 6 sites).
3. Existing roadway data provided by the six states and other organizations (content
differs from each source, about 200,000 centerline miles).
4. Supplemental information (for example, crash history, traffic, weather, work zones,
and pertinent traffic safety laws).
The table below provides an overview of the detailed RID data elements collected.
Mobile Van Data
25,000 driven / 12,500 centerline miles across the six NDS sites
Horizontal curvature: radius, length, PC, PT, direction
Grade
Cross slope
Lanes: number, width, and type (turn, passing, acceleration, car pool, etc.)
Intersection location, number of approaches, and control (uncontrolled, all-way stop,
two-way stop, yield, signalized, roundabout); ramp termini are considered intersections
All MUTCD signs
Barriers
Median presence (Y/N), type (depressed, raised, flush, barrier)
Shoulder type/curb; paved width if exists
Rumble strip presence (Y/N), location (centerline, edgeline, shoulder)
Lighting presence (Y/N)

Acquired Roadway Data (note 4)
~200,000 centerline miles
Includes HPMS files for the six states plus
Roadway Variables
Functional classification
Signals
Intersections
Access control
Bridge location
Speed limit data
Vertical alignment
Interchanges
Rest areas
Pavement condition
Terrain
Tunnels

Acquired Supplemental Data
Existing data and information from State DOTs, public agencies, and private sources:
Crash history data
Traffic information – AADT
Traffic data, continuous counts (ATR)
Traffic data, short duration counts
Aerial imagery
Speed limit laws
Cell phone and text messaging laws
Automated enforcement laws
Alcohol-impaired and drugged driver laws
Graduated driver licensing (GDL) laws
State motorcycle helmet use laws
Seat belt use laws
Local climatological data (LCD)
NOAA cooperative weather observer/other sources
Winter road conditions (DOT)
Work zone
511 information
Changes to existing infrastructure condition
Roadway capacity improvements
RR crossings (FRA)

Note 4: Data items not consistent between states.
To make data access more efficient, reduced or aggregated files are available for
use via the SHRP 2 NDS Insight website.
The Trip Summary file provides categorical data on each trip, including roadway data such
as class and speed limit, time spent in various speed bins (0-10, 10-20, etc.), and the
number of accelerations higher than a specific threshold. The trip summary dataset has a
single record for each of the estimated 5 million trips, which researchers can query to
identify and tabulate the trips that satisfy the conditions relevant to a specific research
question. The query feature allows items such as the driver’s demographics and
assessments and the vehicle’s descriptive data to be linked to the trip summary. A trip ID
is included should the researcher need to go to the detailed time history data to extract
additional information. The file will be completed in early 2015.
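As a minimal sketch of the kind of query described above, the following Python example
filters trip summary records and tabulates the result. The field names (trip_id,
road_class, speed_limit_mph, hard_accel_count) and the sample records are hypothetical
and are not taken from the actual Trip Summary file.

```python
from collections import Counter

# Hypothetical trip summary records; field names and values are illustrative only.
TRIPS = [
    {"trip_id": 101, "road_class": "interstate", "speed_limit_mph": 65, "hard_accel_count": 0},
    {"trip_id": 102, "road_class": "arterial", "speed_limit_mph": 45, "hard_accel_count": 3},
    {"trip_id": 103, "road_class": "arterial", "speed_limit_mph": 35, "hard_accel_count": 1},
    {"trip_id": 104, "road_class": "local", "speed_limit_mph": 25, "hard_accel_count": 2},
]


def select_trips(trips, min_hard_accels):
    """Return the trips with at least min_hard_accels above-threshold accelerations."""
    return [t for t in trips if t["hard_accel_count"] >= min_hard_accels]


def tabulate_by_class(trips):
    """Count the given trips per roadway class."""
    return Counter(t["road_class"] for t in trips)


selected = select_trips(TRIPS, min_hard_accels=1)
# The trip IDs are what a researcher would carry forward to the time history data.
selected_ids = [t["trip_id"] for t in selected]
counts = tabulate_by_class(selected)
```

The list of trip IDs stands in for the link back to the detailed time history data.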
For safety-related analysis, collisions and near-collisions are expected to be the events
of greatest interest to researchers. SHRP 2 is locating crashes and near-crashes based on
participant reports (the incident button) and by processing all trip files with various
crash/near-crash triggers. Crash files (about 1,100 expected, of varying severity) and
near-crash files (about 7,000 expected) are being created, along with baseline files (about
30,000 expected) randomly selected across all vehicles to provide a denominator for risk
calculations. These will come in two forms: Epoch files containing 30 seconds of data
(20 seconds before and 10 seconds after the ‘precipitating event’) and Baseline files
containing 20 seconds of data to compare with the Epoch ‘before’ segments. The crash and
near-crash events include most sensor data plus forward video and results from manual eye
glance coding. Event files contain 6 seconds of categorical data similar to the trip
summary file, coded from the last 5 seconds of ‘before’ data and 1 second of ‘after’ data,
along with the data from manual eye glance reduction. These files will be released in
phases during 2014, with the entire dataset complete by early 2015. The crash, near-crash,
and baseline files will have a total of about 37,700 records that can be searched quickly
to find events with the desired characteristics for a specific analysis. An event ID will
link to the Epoch file with the time history data for more detailed analysis. The NDS Data
Access website includes an event viewer to view the synchronized video and sensor channels.
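To illustrate how baseline records can serve as a denominator, the sketch below computes
the bounds of a 30-second epoch around a precipitating event and an odds ratio comparing
events with baselines. The odds ratio is one common choice and is an assumption here, not
a method prescribed for the database; the function names and counts are hypothetical.

```python
def epoch_window(event_time_s, before_s=20.0, after_s=10.0):
    """Return (start, end) in seconds for an epoch around the precipitating event."""
    return (event_time_s - before_s, event_time_s + after_s)


def odds_ratio(event_exposed, event_unexposed, base_exposed, base_unexposed):
    """Odds of exposure among events divided by odds of exposure among baselines."""
    if event_unexposed == 0 or base_exposed == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (event_exposed * base_unexposed) / (event_unexposed * base_exposed)


# Hypothetical event occurring 125 seconds into a trip.
start, end = epoch_window(125.0)

# Hypothetical counts: a behavior is present in 30 of 100 events
# and in 100 of 1,000 baseline records.
ratio = odds_ratio(30, 70, 100, 900)
```

With these made-up counts the ratio is 27/7, or roughly 3.9; real analyses would also
need confidence intervals and attention to how the baselines were sampled.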
To combine roadway characteristics from the RID with the NDS data, the NDS and RID data
are being linked, enabling researchers to identify all trips passing over a given roadway
segment as well as all roadway segments over which a given trip travels. Linking will match
trip IDs and roadway segment IDs. This effort will be completed in early 2015.
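The two-way lookup described above can be sketched as a pair of indexes built from a
table of (trip ID, segment ID) matches. The link records and names below are hypothetical;
the actual linking tables may be organized differently.

```python
from collections import defaultdict

# Hypothetical link records: each pair says a trip traversed a roadway segment.
LINKS = [(1, "A"), (1, "B"), (2, "B"), (3, "C"), (2, "C")]


def build_indexes(links):
    """Build lookups in both directions from (trip_id, segment_id) pairs."""
    trips_by_segment = defaultdict(set)
    segments_by_trip = defaultdict(set)
    for trip_id, segment_id in links:
        trips_by_segment[segment_id].add(trip_id)
        segments_by_trip[trip_id].add(segment_id)
    return trips_by_segment, segments_by_trip


trips_by_segment, segments_by_trip = build_indexes(LINKS)

# All trips passing over segment "B", and all segments traveled by trip 2.
trips_on_b = trips_by_segment["B"]
segments_of_trip_2 = segments_by_trip[2]
```

Keeping both indexes trades some memory for constant-time lookups in either direction.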