SHRP 2 Safety Database Access Phase 1 Stakeholder Request for Information
Purpose
The Transportation Research Board (TRB) of the National Academy of Sciences is seeking input from interested stakeholders on key implementation issues to be addressed during the initial five-year implementation phase (Phase 1) of the Strategic Highway Research Program 2 (SHRP 2) Safety Database. This is a Request for Information (RFI) only and not a solicitation for proposals.
Information and comments are being sought that address: approaches to providing widespread access to the data for qualified researchers while ensuring the confidentiality of personally identifiable information from volunteer research participants; best practices for managing large data sets; research areas for database utilization; building capacity for data use; provision of user support services; analysis and visualization tools and techniques; and means to foster and sustain user communities that will support program operations as described below and aid in developing a business plan for long-term, sustainable operation of the database. Responders are encouraged to consider both short-term (within five years) and long-term (beyond five years) time horizons in thinking about the data.
Background
The SHRP 2 Safety Database, consisting of linked data from the Naturalistic Driving Study (NDS) and Roadway Information Database (RID), provides transportation safety practitioners with an unprecedented national resource. Traditionally, data collected by police investigation after a crash has occurred have been used in addressing highway safety issues. This information comes from an examination of the scene, interviews with drivers and witnesses, and judgments about what the driver was doing at the time of the crash.
The SHRP 2 Safety Database takes this state of knowledge to the next level by providing detailed data about what the driver was doing before and during the crash sequence as well as in ‘near-crash’ and ‘normal’ driving situations. Such information will enable safety practitioners to design better countermeasures to reduce the societal harm that results from the 33,561 fatalities and 2,362,000 injuries that occurred on the nation’s roadways in 2012 [1].
[1] http://www-nrd.nhtsa.dot.gov/Pubs/812016.pdf
The SHRP 2 Safety Database consists of an extensive collection of detailed information describing the vehicle, driver, trip and roadway. The study collected data from over 3,000 volunteer drivers of all age and gender groups during a three-year data collection period (most drivers participated 1 to 2 years), amounting to nearly 50 million vehicle miles, 5 million trip files, over 3,900 vehicle years, more than 1 million hours of video, and 2 petabytes of NDS data. Data were collected across six sites: Florida, Indiana, New York, North Carolina, Pennsylvania, and Washington. Instrumented vans collected detailed roadway data on 25,000 centerline miles of roadway across the study sites. In addition, approximately 400,000 miles of less detailed data from the inventories of the study states, and supplemental data on traffic, weather, work zones, crashes, traffic laws, roadway improvements, and other subjects, were aggregated to create the RID, resulting in 50-60 GB of spatial data with an additional 8 TB video log.
Access to the SHRP 2 Safety Database is managed through a set of data sharing policies and procedures to ensure compliance with the data privacy requirements established by the responsible Institutional Review Boards (IRB) in accordance with federal law governing human subjects testing.
NDS data access is provided to qualified researchers who have completed training from an accredited IRB; data downloads of any sort require a Data Sharing Agreement; and access to Personally Identifiable Information (PII) is only allowed in a secure data enclave. PII may be used to generate non-identifying reduced data sets but may not be copied or removed from the enclave. The RID does not contain any PII and may be accessed openly.
The entire SHRP 2 Safety Database will be complete in early 2015, and plans are being made to provide access to qualified researchers. At this time, access to a partial summary dataset is provided via a web-based interface (https://insight.shrp2nds.us/), while access to detailed data is available through the Virginia Tech Transportation Institute (VTTI) for NDS and the Iowa State University Center for Transportation Research and Education (CTRE) for RID. Personally identifiable information is available through VTTI’s secure data enclave in Blacksburg, VA. A further explanation of the database and current means to access it are available on the SHRP 2 Safety Database (InSight) website linked above.
Additional information concerning data collection and use, participant privacy protection, and database structure(s) is contained in:
Appendix A: Assurance of Subject Confidentiality
Appendix B: SHRP 2 NDS Data Access and Privacy Requirements
Appendix C: Requirements for an Operator of the SHRP 2 NDS Data
Appendix D: SHRP 2 NDS and RID Database Structure
Phase 1 Operation
A National Research Council (NRC)-appointed Committee on Long-Term Stewardship of Safety Data (LTSC) for the Second Strategic Highway Research Program recommended that “because of the uniqueness, scale, scope, and complexity of the data, as well as uncertainty about user demand and willingness to pay for access, devising a final plan for long-term stewardship would be premature; therefore, future oversight and administration of the data should proceed in a phased manner.” [2] Accordingly, a 5-year period commencing in early 2015 will serve as “Phase 1” of use and oversight of SHRP 2 Safety Data.
Phase 1 is being administered by TRB under a Cooperative Agreement with the Federal Highway Administration (FHWA) and in accordance with a Memorandum of Understanding among TRB, FHWA, the American Association of State Highway and Transportation Officials (AASHTO), and the National Highway Traffic Safety Administration (NHTSA). Consistent with the recommendations of the LTSC [2,3], TRB will operate the Phase 1 program focusing on two primary goals: (1) promoting conditions under which SHRP 2 Safety Data will be available to qualified users during Phase 1, and (2) gaining experience and data to support decisions about implementation and oversight of the data after Phase 1.
Phase 1 is overseen by the newly NRC-appointed Safety Data Oversight Committee (SDOC). The SDOC, with supporting technical panels called Expert Task Groups (ETGs), guides efforts to operate and maintain the database during Phase 1 and provide access to data users.
The SDOC is charged with establishing the program structure and approving all policies and procedures, including developing and approving data sharing/access policies and procedures for all operators and users of the data. During Phase 1, VTTI will continue to serve as the repository for the SHRP 2 NDS database and CTRE will continue to house the RID.
Modifications to data access policies and procedures described above and in the attachments are being considered by the SDOC in order to facilitate researcher access to more of the database while maintaining participant confidentiality. Responses to this RFI will inform the SDOC’s considerations. Note that changes to data sharing policies and procedures are also subject to review by the Institutional Review Boards of the National Academy of Sciences and the Virginia Polytechnic Institute and State University.
During the Phase 1 period it will be important to focus on database usability by making the data visible, accessible and analyzable to a wide variety of qualified researchers to optimize the value of this national resource and understand its utility, while maintaining the privacy of study participants in accordance with the informed consent agreement. Developing and deploying data visualization and analysis tools to simplify access and improve the quality and speed of analysis will be key to supporting this goal. It will also be critically important to understand the costs and funding models associated with database utilization.
[2] http://www.trb.org/StrategicHighwayResearchProgram2SHRP2/Blurbs/168924.aspx
[3] http://www.trb.org/StrategicHighwayResearchProgram2SHRP2/Blurbs/169661.aspx
Request for Information
TRB is seeking input from stakeholders on how to best accomplish the stated goals of Phase 1.
Information and comments are being sought addressing: approaches to providing widespread and cost-effective access to the data for qualified researchers while ensuring the confidentiality of personally identifiable information; best practices for managing large data sets; research areas for database utilization; building capacity for data use; provision of user support services; analysis and visualization tools and techniques; and means to foster and sustain user communities that will support program operations and aid in developing a business plan for long-term, sustainable operation of the database. Recommendations contained in both of the Long Term Stewardship Committee’s letter reports cited in the Background section of this document, as well as requirements for protecting participant privacy described in Appendices A-C, should be carefully considered when responding.
We welcome all stakeholders’ thoughts and concepts for optimizing Phase 1 operation. In particular, we are soliciting input in five major areas concerning possible:
1) Data Access Models:
• Management of the full processed data set and any reduced data sets
• Types of data products various user communities would find useful
• Data quality and version control
• Disposition of derivative work products, such as analysis techniques and reduced data sets, including proprietary rights vs. public access, quality assurance, hosting and availability of knowledgeable technical support
• Website access and user support
• Means to provide support to all interested users
• Protecting participant privacy, including implications of data use and retention outside of the United States
• Centralized vs. distributed data access models
• Implications of a single prime operator maintaining the working copy of the processed database vs. multiple operators maintaining working copies of the processed database
2) Approaches to Managing Operations:
• Single versus Multiple Entity Control
• Public / Private Partnerships
• Consortium Approaches
• Measures of Performance for Evaluating Phase 1 Operations
3) Sustainable Financial Models:
• Business Models & Fee Structures
• Sources of Financial Support for Database Infrastructure
• Cost Sharing Approaches
• Funding for Analysis Projects
4) Potential User Communities:
• Safety / Mobility / Health Care / Vehicle Insurance / Other
• Public (Local / State / Federal) / Private / Academic
• “Non-Safety” (Planning / Operations / Environment and Energy / Psychology / Machine learning / Data Mining and Analysis / Database Operations)
• Domestic / Global
5) Database Technology:
• Effect of Current ‘Big Data’ Technologies on Phase 1 Approach
• Applicability of Relational Database Management Systems
• Other Database Management approaches that should be considered
• Data Redundancy and Archiving Needs
• Impact of Database Technology Evolution during Phase 1
Comments are welcome on any or all of the topic areas listed, as well as on items we may have overlooked. Comments should provide enough specifics to demonstrate that a described approach is feasible and under what conditions, including examples of where it is being used. TRB will utilize the information received to support SDOC oversight of Phase 1 operations.
Instructions for Submission
Responses should explicitly address how the proposed concepts support data access for all interested qualified researchers regardless of experience level or financial means while maintaining participant privacy. Respondents should indicate if they have been involved in any similar large data management projects with government agencies and, if so, describe the business model and lessons learned.
This is a request for information only and not a solicitation for proposals. No funding is associated with this RFI. All RFI responses become the property of the Transportation Research Board. Final disposition will be made according to the policies thereof.
RFI Contact:
David J. Plazak, Senior Program Officer
Strategic Highway Research Program 2
Transportation Research Board
500 Fifth Street, NW
Washington, DC 20001
Phone: 202-334-1834
E-Mail: dplazak@nas.edu
RFI Due Date: Responses are due by the close of business February 17, 2015. Late submissions will be considered to the extent possible in compiling the results. The preferred electronic format is PDF. Please submit responses to the e-mail address shown above.
Any clarifications regarding this RFI will be posted on the SHRP 2 Web site (www.TRB.org/SHRP2) before January 30, 2015. Announcements of such clarifications will be posted on the front page and, when possible, will be noted in the TRB e-newsletter. Responders are advised to check the Web site frequently.
Appendix A: Assurance of Subject Confidentiality
(Excerpted from the SHRP 2 Naturalistic Driving Study Participant Consent Form)
To summarize, your level of confidentiality in this study is as follows:
1. There will be video of your face and upper body. There will be audio recorded, but only for 30 seconds if you press the red incident button. The study also will collect health and driving data about you. The video, audio, and other data that personally identifies you, or could be used to personally identify you, will be held under a high level of security at one or more data storage facilities. Your data will be identified with a code rather than your name.
2. The faces of other drivers will be blurred, blacked out, or replaced by an animation once it is determined they have not signed a consent form. This will be done prior to any additional video review by the researchers. No identifying information will be collected on passengers.
3. For the purposes of this project, only authorized project personnel, authorized employees of the project sponsors, and qualified research partners will have access to study data containing personally identifying information, or that could be used to personally identify you. The data, including face video, which has been blurred, blacked out, or replaced by animation, may be shown at research conferences and by the research sponsors for the highway and road safety purposes identified above. Under no circumstances will your name and other personally identifying information be associated with the video clips.
4. The personally identifying data collected in this study may be analyzed in the future for other research purposes by this project team or by other qualified researchers in a secure environment. Such efforts will require those researchers to sign a data sharing agreement, which will continue to protect your confidentiality, and will also require additional IRB approval. The confidentiality protection provided to you by these data sharing agreements will be as great as or greater than the level provided and described in this document. Research partners will not be permitted to copy raw data that identifies you, or that could be used to identify you, or to remove it from the secure facility in which it is stored except with your consent.
5. A Certificate of Confidentiality has been obtained from the National Institutes of Health. With this Certificate, the researchers and study sponsors cannot be forced to disclose information that may identify you, even by a court subpoena, in any federal, state, or local civil, criminal, administrative, legislative, or other proceedings. However, the Certificate of Confidentiality does not prevent the researchers from disclosing voluntarily matters such as child abuse, or a participant’s threatened violence to self or others.
In terms of a vehicle, this could also include items such as driving under the influence of drugs or alcohol, allowing an unlicensed minor to drive the vehicle, or habitually running red lights at high speed. Such behaviors may result in your removal from the study and reporting of the behavior to the appropriate authorities.
Appendix B: SHRP 2 NDS Data Access and Privacy Requirements
The SHRP 2 NDS data were collected from the vehicles of volunteer participant drivers. This activity was carried out under federal regulations pertaining to human subjects research, which require the review and approval of at least one Institutional Review Board (IRB). The fundamental requirements for data access and privacy protection are drawn from the consent form signed by the participants and the data collection protocol, both of which were approved by several IRBs.
In general, access to NDS data:
• will only be granted for research purposes.
• will take place under a data sharing agreement that protects confidentiality as described in the consent form.
• will require the consent of the researcher’s IRB.
In the case of personally identifying information (PII), such as face video, audio, and GPS, the following additional conditions must be met:
• PII may only be used in a secure data enclave and cannot be removed from the enclave.
• Face video may be shown in public only if the face is sufficiently obscured to prevent identification of the participant.
The research protocol requires that data be destroyed on the following schedule:
• 30 years after collection for personally identifying data.
• 40 years after collection for de-identified continuous sensor data.
• De-identified summary data resulting from analyses and reductions may be retained indefinitely.
The current data sharing agreement for SHRP 2 NDS data contains several requirements derived from the basic requirements in the consent form and the data collection protocol.
These include the following general requirements for a researcher who receives a portion of the data:
• The recipient and all personnel who will handle the data must be certified as having received standard training in the ethics of human subjects research.
• The recipient must agree to hold the provided data securely.
• The recipient must agree not to share the data with others who have not signed a data sharing agreement.
• The recipient must agree not to attempt to learn the identity of research participants.
• If the recipient discovers identifying information or data in a dataset that was intended to be non-identifying, he or she must agree to provide that information to the data host so that it can be properly de-identified for future use.
• The recipient must agree not to use data for purposes other than the research described in the data sharing agreement and for presentation, demonstration, and project reporting; an additional data sharing agreement will be required for other uses.
• The recipient must agree to properly acknowledge the source of the data in any presentations.
• The recipient must agree not to release or share information leading to the identification of participants or to release or share non-identifying raw data.
• The recipient must agree to return or destroy the data on a pre-determined schedule.
Qualified researchers may access personally identifying data, such as videos of a driver’s face, consistent with the approved needs of the research project. To do this they must agree:
• To obtain institutional approval in advance for their research.
• To certify that they have appropriate training in the use of personally identifying data.
• To access the data in a secure enclave.
• Not to remove personally identifying data from the enclave.
• Not to disclose any data that might in any way identify individual participants.
Appendix C: Requirements for an Operator of the SHRP 2 NDS Data
Any SHRP 2 NDS operator must ensure that all of the requirements for data use in Appendix B are met for each occurrence of data access or data sharing. These requirements also apply to each instance of data use by the operator’s own personnel or agents. In addition, the operator must possess the facilities and resources necessary for:
• Data encryption and security
• Electronic firewalls and locked storage facilities
• Password authentication of users
• Audit trails
• Disaster prevention and recovery plans
• Security measures for all data backup tapes, at the same level as working data
• Reduction of PII for other researchers
• Ethics training and non-disclosure agreements for data center staff
• Creating, managing, and storing required documentation, such as data sharing agreements, certifications of ethics training, IRB approvals, etc.
Personally-identifying data must be accessed only in a “secure enclave” with computers that cannot access the internet and with no CD or DVD recordable drives and no writeable USB ports, so that personally-identifying data cannot be removed from the enclave. Persons accessing the data must not bring cell phones or cameras into the enclave. In the enclave, persons accessing the data “reduce” it by coding variables of interest, for example coding eye glances from video data. The reduced data are not personally identifying and may be removed from the enclave.
As the governing body for Phase 1, the SDOC may modify or add to the policies described in this document as long as the level of confidentiality for the data and privacy protection for the SHRP 2 NDS participants is never less than the protection described to the participants in their consent forms.
Appendix D: SHRP 2 Safety Database Content and Structure
The SHRP 2 NDS data, currently housed at VTTI, occupy about 2 PB of storage composed of roughly 1.2 PB of video data and 600 TB of sensor data.
Video data reside on an EMC Isilon system with a total of 15 nodes. Sensor data are stored in an IBM InfoSphere Data Warehouse (DB2) database. The sensor data in this configuration include time series parametric data collected from the data acquisition system in the participant vehicles, plus metadata, indexing space, etc.
Sensor data are stored with varying frequency (often referred to as “asynchronous”), each sensor at its native sampling rate. Periodic sample rates vary from 1 Hz for GPS to a short buffered 640 Hz for acceleration that is only stored when a collision is detected by the Data Acquisition System (DAS). Rate data from the gyro are nominally sampled at 10 Hz (and buffered at 100 Hz for a collision), 3-axis acceleration at 10 Hz (buffered at 640 Hz), forward radar at 12.5 Hz and video channels at 15 Hz. All channels are referenced to a common system clock, which is disciplined by GPS; GPS time is also recorded as part of the GPS sensor module.
NDS data are stored in segments of time series referred to together as trip files. Typically, the DAS begins recording shortly after the vehicle is started and records continuously until the vehicle is turned off. The period of time from key-on to key-off is referred to as a trip. Note that under certain conditions a single key-on to key-off trip may consist of multiple trip files that would need to be concatenated to reconstruct the trip. A typical trip may have 1.5 to 2 million records of sensor data. Overall, the sensor data are estimated at 5-10 trillion records.
The time series data, as well as ancillary data, metadata, and other structured data, are stored in a massively parallel processing (MPP) relational database system that makes them accessible for data mining and analysis. These data are available for ad hoc use by data reductionists annotating individual files and also provide a platform for data mining, trigger execution, and other complex analytics.
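Because each channel is stored at its native sampling rate against a common clock, an analysis that combines channels generally has to align them first. The following is a minimal sketch of one common approach (carrying the most recent observation forward onto a chosen time grid), not a description of how VTTI processes the data. The function names, the millisecond timestamps, and the sample values are all hypothetical.

```python
from bisect import bisect_right

def latest_at(samples, t):
    """Return the most recent value recorded at or before time t, else None."""
    times = [ts for ts, _ in samples]          # samples are sorted by timestamp
    i = bisect_right(times, t)
    return samples[i - 1][1] if i else None

def align(channels, grid):
    """Build one row per grid time holding each channel's latest value."""
    return [
        {"t": t, **{name: latest_at(s, t) for name, s in channels.items()}}
        for t in grid
    ]

# Hypothetical data, timestamps in milliseconds on the common clock.
gps_speed = [(0, 12.0), (1000, 12.5)]             # 1 Hz channel
accel_x = [(100 * k, k) for k in range(11)]       # 10 Hz channel, raw counts 0..10

rows = align({"gps_speed": gps_speed, "accel_x": accel_x}, grid=[0, 500, 1000])
```

Carrying the last value forward keeps the slow GPS channel usable at the faster grid times without inventing intermediate values; interpolation would be an alternative where the analysis calls for it.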
In an MPP database, most data typically are scattered among several “worker” servers, each with a segment of the large data set, while queries are submitted to a “query coordinator” node that distributes the queries to the workers and gathers the query results for the client. The current MPP database configuration uses thirty-two database partitions among four worker nodes. System enhancements may be needed to accommodate the increased workloads and number of users anticipated in Phase 1. The tables below provide an overview of the detailed NDS data elements collected. Vehicle ID Model Year Powertrain Right Front Tread Depth Left Rear Pressure Battery Voltage Battery Date Vehicle Detail Variables Vehicle Classification Vehicle Make Left Front Tread Depth Right Rear Tread Depth Right Front Pressure Battery Amps Data Collection Configuration i Advanced Technology Vehicle Site Name Left Rear Tread Depth Left Front Pressure Right Rear Pressure Battery Condition Integrated Cell Phone Controls Location Phonebook Display Location Onstar Music Control Speech recognition Factory Navigation Accept Nomadics Driver Assessments Driver Demographic Questionnaire Data Dictionary Medical Conditions & Medications Data Dictionary Driving Knowledge Data Dictionary Risky Behavior Data Dictionary JAMAR Hand Strength Data Dictionary Driver Vision Testing Results Data Dictionary Medical Conditions and Medications - Exit Data Dictionary Exit Interview Data Dictionary Trip ID Trip Start UTC Hour of Day Trip Start UTC Month Trip End UTC Hour of Day Trip Start Local Time Hour of Day Trip Start Month Local Trip End Local Time Hour of Day Trip Day of Week Trip Day Number in Study Trip Duration Trip Centroid Longitude Max Speed Mean Speed Time Moving Time Not Moving Maximum Acceleration Maximum Deceleration Maximum Lateral Acceleration Minimum Lateral Acceleration Maximum Turn Rate Minimum Turn Rate Number of Longitudinal Accels > Threshold Number of Longitudinal Decels > Threshold Number of 
Lateral Accels > Threshold Brake Activations Lane Tracker Right Side High Quality Time Lane Tracker Left Side High Quality Time Face Tracker High Quality Time Trip Distance Origin to Destination ABS Activation Acceleration, y-axis Acceleration, z-axis fast Cabin Audio Dilution of Precision, Position Electronic Stability Control Epoch State Head Position X Baseline Head Position Z Phonebook Access Navigation Display Location Nomadics Method Driving History Questionnaire Data Dictionary Barkley's Quick Screen Data Dictionary (ADHD screening) Perception of Risk Data Dictionary Sensation Seeking Data Dictionary Modified Manchester Driver Behavior Data Dictionary Sleep Questionnaire Data Dictionary Visual and Cognitive Testing Results Data Dictionary Trip Summary Variables ABS Available ABS Activation Turn Signal Available Turn Signal Activations Traction Control Available Traction Control Activation Vehicle Network Supports Seatbelt Seatbelt Usage Percentage Vehicle Network Supports Wipers Time Wipers Used Vehicle Network Supports Cruise Control Time at 0-10 mph Time at 10-20 mph Time at 20-30 mph Time at 30-40 mph Time at 40-50 mph Time at 50-60 mph Time at 60-70 mph Time at 70-80 mph Time at > 80 mph Distance at 0-10 mph Distance at 10-20 mph Time Where Radar Targets = 0 Time Where Radar Targets = 1 Time Where Radar Targets = 2 Time Where Radar Targets = 3 Time Where Radar Targets = 4 Time Where Radar Targets = 5 Time Where Radar Targets = 6+ Distance Where Radar Targets = 0 Distance Where Radar Targets = 1 Distance Where Radar Targets = 2 Distance at 30-40 mph Distance at 40-50 mph Distance at 50-60 mph Distance Where Headway 1.0 - 1.5 s Distance Where Headway 1.5 - 2.0 s Distance Where Headway 2.0 - 2.5 s Distance at 20-30 mph Distance at 60-70 mph Distance at 70-80 mph Distance at > 80 mph Vehicle Model Year Time Series Variables Acceleration, x-axis Acceleration, y-axis fast Airbag, Driver Cruise Control Distance Elevation, GPS Head Confidence Head Position Y 
Head Position Z Baseline ii Distance Where Radar Targets = 5 Time Where Headway 0.0 - 0.5 s Time Where Headway 0.5 - 1.0 s Time Where Headway 1.0 - 1.5 s Time Where Headway 1.5 - 2.0 s Time Where Headway 2.0 - 2.5 s Time Where Headway 2.5 - 3.0 s Time Where Headway 3.0 - 3.5 s Time Where Headway > 3.5 s Distance with Lead Vehicle Distance Where Headway 0.0 - 0.5 s Distance Where Headway 0.5 - 1.0 s Distance Where Headway 2.5 - 3.0 s Distance Where Headway 3.0 - 3.5 s Distance Where Headway > 3.5 s Minimum TTC Acceleration, x-axis fast Acceleration, z-axis Alcohol Day Driver Button Flag Engine RPM Head Position X Head Position Y Baseline Head Rotation X Head Rotation X Baseline Head Rotation Z Headlight Setting Lane Marking, Distance, Right Lane Marking, Type, Right Lane Width Longitude Pedal, Accelerator Position Pitch Rate, y-axis fast Radar, Range Rate Forward X Track 1 Radar, Range Rate Forward X Track 4 Radar, Range Rate Forward X Track 7 Radar, Range Rate Forward Y Track 2 Radar, Range Rate Forward Y Track 5 Radar, Range, Forward X Track 0 Radar, Range, Forward X Track 3 Radar, Range, Forward X Track 6 Radar, Range, Forward Y Track 1 Radar, Range, Forward Y Track 4 Radar, Range, Forward Y Track 7 Radar, Target Identification Track 2 Radar, Target Identification Track 5 Roll Rate, x-axis Speed, GPS Subject_ID Timestamp vehicle_id Video, Driver and Left Side View Video, Rear View Yaw Rate, z-axis fast Head Rotation Y Head Rotation Z Baseline Illuminance, Ambient Lane Marking, Probability, Right Lane Markings, Probability, Left Latitude Month Pedal, Brake PRNDL Radar, Range Rate Forward X Track 2 Radar, Range Rate Forward X Track 5 Radar, Range Rate Forward Y Track 0 Radar, Range Rate Forward Y Track 3 Radar, Range Rate Forward Y Track 6 Radar, Range, Forward X Track 1 Radar, Range, Forward X Track 4 Radar, Range, Forward X Track 7 Radar, Range, Forward Y Track 2 Radar, Range, Forward Y Track 5 Radar, Target Identification Track 0 Radar, Target Identification 
Track 3 Radar, Target Identification Track 6 Roll Rate, x-axis fast Speed, Vehicle Network Temperature, Interior Traction Control Video IP and Steering Wheel View Video, Forward Roadway Wiper Setting Year Head Rotation Y Baseline Heading, GPS Lane Marking, Distance, Left Lane Marking, Type, Left Lane Position Offset Location Number of Satellites Pitch Rate, y-axis Radar, Range Rate Forward X Track 0 Radar, Range Rate Forward X Track 3 Radar, Range Rate Forward X Track 6 Radar, Range Rate Forward Y Track 1 Radar, Range Rate Forward Y Track 4 Radar, Range Rate Forward Y Track 7 Radar, Range, Forward X Track 2 Radar, Range, Forward X Track 5 Radar, Range, Forward Y Track 0 Radar, Range, Forward Y Track 3 Radar, Range, Forward Y Track 6 Radar, Target Identification Track 1 Radar, Target Identification Track 4 Radar, Target Identification Track 7 Seatbelt, Driver Steering Wheel Position Time Turn Signal Video Frame Video, Occupancy Snapshot Yaw Rate, z-axis
The SHRP 2 RID database, currently housed at Iowa State University, is a 50-60 GB spatial database that nominally covers the areas where the SHRP 2 NDS participants drove. The RID combines information from the 4 sources listed below in a single database with a common reference system and does not contain any PII. In addition, Item 2 below includes a video log that adds about 8 TB to the file size.
1. A centerline base map with coverage of the continental US.
2. Roadway data collected by SHRP 2 following a standard protocol and quality control (about 12,500 centerline miles total for the 6 sites).
3. Existing roadway data provided by the six states and other organizations (content differs from each source, about 200,000 centerline miles).
4. Supplemental information (for example, crash history, traffic, weather, work zones, and pertinent traffic safety laws).
The table below provides an overview of the detailed RID data elements collected.

Mobile Van Data: 25,000 driven / 12,500 centerline miles across the six NDS sites
Acquired Roadway Data 4: ~200,000 Centerline Miles. Includes HPMS files for the six states plus
Acquired Supplemental Data: Existing data and information from State DOTs, Public Agencies, and Private Sources

Horizontal Curvature: Radius, Length, PC, PT, Direction
Functional Classification
Grade
Signals
Cross Slope
Intersections
Lane in terms of the number, width, and type (turn, passing, acceleration, car pool, etc.)
Access Control
Intersection location, number of approaches, and control (uncontrolled, all-way stop, two-way stop, yield, signalized, roundabout). Ramp termini are considered intersections
All MUTCD signs
Barriers
Bridge Location
Speed limit data
Vertical Alignment
Interchanges
Median presence (Y/N), type (depressed, raised, flush, barrier)
Rest Areas
Speed limit laws
Cell phone and text messaging laws
Shoulder type/curb; paved width if exists
Rumble Strip presence (Y/N), location (centerline, edgeline, shoulder)
Lighting presence (Y/N)
4 Roadway Variables
Traffic information – AADT
Traffic Data continuous counts (ATR)
Traffic Data short duration counts
Pavement Condition
Aerial imagery
Automated enforcement laws
Terrain
Alcohol-impaired and drugged drivers laws
Graduated driver licensing (GDL) laws
Tunnels
Data items not consistent between states
Crash history data
State motorcycle helmet use laws
Seat belt use laws
Local climatological data (LCD)
NOAA Cooperative weather observer/other sources
Winter road conditions (DOT)
Work zone
511 information
Changes to existing infrastructure condition
Roadway capacity improvements
RR Crossings (FRA)

In order to make data access more efficient, reduced or aggregated files are available for use via the SHRP2 NDS Insight website.
The Trip Summary file provides categorical data on each trip, including roadway data (class and speed limit), time spent in various speed bins (0-10, 10-20, etc.), and the number of accelerations higher than a specific threshold. The trip summary dataset has a single record for each of the estimated 5 million trips, which researchers can query to identify and tabulate the trips that satisfy the conditions of interest for a specific research question. The query feature allows items such as the driver’s demographics and assessments and the vehicle’s descriptive data to be linked to the trip summary. A trip ID is included should the researcher need to go to the detailed time history data to extract additional information. The file will be completed in early 2015.

For safety-related analysis, collisions and near-collisions are expected to be the most interesting events for researchers. SHRP 2 is locating crashes and near-crashes based on participant reports (the incident button) and by processing all trip files with various crash/near-crash triggers. Crash files (~1,100 expected, of varying severity) and Near-crash files (~7,000 expected) are being created along with Baseline files (~30,000 expected) randomly selected across all vehicles to provide a denominator for risk calculations. These will come in two forms: Epoch files containing 30 seconds of data (20 seconds before and 10 seconds after the ‘precipitating event’) and Baseline files containing 20 seconds of data to compare to the Epoch ‘before’ segments. The crash and near-crash events include most sensor data plus forward video and results from manual eye glance coding. Event files contain 6 seconds of categorical data, similar to the trip summary file, coded from the last 5 seconds of ‘before’ data and 1 second of ‘after’ data, along with the data from manual eye glance reduction. These files will be released in phases during 2014, with the entire dataset complete by early 2015.
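
The Epoch and Event time windows described above can be illustrated with a minimal Python sketch. It is not the actual SHRP 2 file format or tooling: the names `Window`, `epoch_window`, `event_window`, and `slice_samples` are hypothetical, samples are assumed to be (time in seconds, value) pairs, and treating the window endpoints as inclusive is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Sample = Tuple[float, float]  # (time in seconds within the trip, sensor value); assumed layout


@dataclass(frozen=True)
class Window:
    start_s: float
    end_s: float


def epoch_window(event_time_s: float) -> Window:
    """30-second Epoch: 20 s before and 10 s after the precipitating event."""
    return Window(event_time_s - 20.0, event_time_s + 10.0)


def event_window(event_time_s: float) -> Window:
    """6-second Event span: last 5 s of 'before' data plus 1 s of 'after' data."""
    return Window(event_time_s - 5.0, event_time_s + 1.0)


def slice_samples(samples: List[Sample], window: Window) -> List[Sample]:
    """Keep samples whose time falls inside the window (endpoints inclusive, by assumption)."""
    return [(t, v) for (t, v) in samples if window.start_s <= t <= window.end_s]


# Hypothetical 1 Hz time history for one trip, with the precipitating event at t = 100 s.
samples = [(float(t), t * 0.5) for t in range(200)]
epoch = slice_samples(samples, epoch_window(100.0))
event = slice_samples(samples, event_window(100.0))
```

With the hypothetical 1 Hz series above, the Epoch slice spans t = 80 s to t = 110 s and the Event slice spans t = 95 s to t = 101 s.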
The crash, near-crash and baseline files will have a total of about 37,700 records that can be searched quickly to find events with the desired characteristics for a specific analysis. An event ID will link to the Epoch file with the time history data for more detailed analysis. The NDS Data Access website includes an event viewer to view the synchronized video and sensor channels.

To combine roadway characteristics from the RID with the NDS data, the NDS and RID data are being linked, enabling researchers to identify all trips passing over a given roadway segment as well as all roadway segments over which a given trip travels. Linking will match trip IDs and roadway segment IDs. This effort is scheduled for completion in early 2015.
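
The two-way lookup that the NDS-RID linkage is meant to support (trips for a segment, segments for a trip) can be sketched in Python as follows. This is only an illustration under stated assumptions: the linkage is assumed to be available as (trip ID, roadway segment ID) pairs, and the function name `build_link_indexes` and the sample IDs are hypothetical rather than part of the SHRP 2 data products.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_link_indexes(
    links: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, Set[str]], Dict[str, List[str]]]:
    """Build both lookup directions from (trip_id, segment_id) pairs.

    Returns:
      trips_by_segment: segment ID -> set of trip IDs passing over it
      segments_by_trip: trip ID -> segment IDs in first-seen order, without repeats
    """
    trips_by_segment: Dict[str, Set[str]] = defaultdict(set)
    segments_by_trip: Dict[str, List[str]] = defaultdict(list)
    for trip_id, segment_id in links:
        trips_by_segment[segment_id].add(trip_id)
        if segment_id not in segments_by_trip[trip_id]:
            segments_by_trip[trip_id].append(segment_id)
    return dict(trips_by_segment), dict(segments_by_trip)


# Hypothetical link records; real trip and segment IDs will look different.
links = [("T1", "S10"), ("T1", "S11"), ("T2", "S11"), ("T3", "S12"), ("T1", "S11")]
trips_by_segment, segments_by_trip = build_link_indexes(links)

trips_on_s11 = trips_by_segment["S11"]       # all trips passing over segment S11
segments_for_t1 = segments_by_trip["T1"]     # all segments traveled by trip T1
```

Keeping both indexes avoids rescanning the full link table for each question, which matters when the table covers millions of trips.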