Full Data Processing Workflow Description

Transcription

[Diagram: full data processing workflow. Process steps: Data Entry, Media Registration, Verify Files, Audit Reporting, Prepare Control/Setup Info, Load Control and Setup, Perform Validation, Generate Reports, Reformat Files for Web, Adhoc Report Requests. Databases: PSUR Database, Control Database. Other labels: Report Forms; Source Record & Image Files; Data Files (CU); FTP; Media Data, Files Data, Activity Data; Registered Files, Verified Files, Rejected Files; Audit Reports; Zip Codes, County Codes, Product, Certified Applicators, Regulated Businesses, Commercial Permittees; Application Record, Sales Record, Employed Applicator; Governors Report, Internal DEC, Advisory Board, Public; Pre-Web Report, Web Pages.]
Data Entry Workflow
• Forms mailed to DEC by February 1
• DEC logs ID number of reporting business into Report Tracking database
  – Business or applicator name and address returned
  – Data used to track which businesses have not filed
• Diskette submissions are batched and sent to Compaq
• Forms are boxed and boxes numbered
  – These numbers ultimately stored in PSUR database
• Reports sent by businesses to replace previous submissions batched separately
  – Project team will replace data in database
• Optical character recognition forms boxed separately
• Lason picks up forms
• Lason scans paper forms and marks form with image number
  – Image number stored in PSUR database
• Form images sent overseas for data entry
  – Machine-generated forms to China
  – Handwritten forms to India
  – OCR forms to Mexico
• Data key punched or scanned
• Quality control steps are performed
  – Questionable values checked against images
  – Computer program edits data based on PSUR data entry specifications
  – Identification numbers and EPA registration numbers looked up using setup data from the DEC pesticide management systems (provided by PSUR)
• CDs containing all scanned source reports sent to DEC
  – DEC does QA/QC on indecipherable/illegible reports?
• Data formatted into four file types based on the report forms (form numbers: 25, 26, 26A, 27)
• Files transmitted from overseas to Cornell using the Internet
• File transfer reports e-mailed to database manager
Media Registration
• Files received by PSUR checked against file names on transfer report
• Files moved from FTP (Internet) site to our Media Registration application
  – Windows application that connects to main database on the Unix server
  – Updates four database tables
    • File information
    • Media (FTP, diskette etc.) information
    • Administrative action that occurred
    • Who sent us the media
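The first check in this step, comparing the files received with the names on the transfer report, could be sketched as below. This is a minimal Python illustration and not the Media Registration application itself; the function name check_transfer and the sample file names are hypothetical.

```python
def check_transfer(received, reported):
    """Compare received file names with those listed on a transfer report.

    Returns (missing, unexpected): names on the report that did not arrive,
    and names that arrived without being listed. Both are sorted lists.
    """
    received_set = set(received)
    reported_set = set(reported)
    missing = sorted(reported_set - received_set)
    unexpected = sorted(received_set - reported_set)
    return missing, unexpected


# Hypothetical file names, for illustration only.
reported = ["app_001.dat", "app_002.dat", "sales_001.dat"]
received = ["app_001.dat", "sales_001.dat", "sales_002.dat"]

missing, unexpected = check_transfer(received, reported)
print("Missing:", missing)
print("Unexpected:", unexpected)
```

Either list being non-empty would be a reason to contact the sender before registering the media.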
File Verification
• Files transferred from media registration application to the main Unix server
• Compaq files reformatted into same format as Lason files
• Applications written in Informix 4GL verify whether files are readable by the computer
  – Required number of fields present
  – Required separators between fields present
  – Files named so the type of data can be identified (applications, sales etc.)
  – Fields formatted correctly
  – Fields contain required data types (numbers, characters etc.)
• If file is unreadable, file is rejected and an audit report is generated
  – Wrong number of fields in data record
• Invalid records are rejected and an audit report is generated
  – Field lengths
  – Date types
  – Record numbering
• If more than 100 records rejected from a file, the entire file is rejected
  – Audit report generated
• Only valid records are loaded into the database
  – Record loaded into tables in “as is” format
  – Data is not ready for reporting at this point
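The verification rules above can be sketched in Python as follows. The real programs are written in Informix 4GL and follow the PSUR Electronic File Specification, which is not reproduced here, so the "|" separator, the four-field layout, and the MM/DD/YYYY date format are all assumptions made for illustration. Only the 100-record threshold and the rule that a wrong field count rejects the file come from the slides.

```python
from datetime import datetime

MAX_REJECTED = 100  # from the slides: more than 100 rejects fails the whole file


def is_date(text):
    """Return True if text is a date in the assumed MM/DD/YYYY layout."""
    try:
        datetime.strptime(text, "%m/%d/%Y")
        return True
    except ValueError:
        return False


# Hypothetical record layout: (field name, check function).
FIELD_SPEC = [
    ("business_id", str.isdigit),
    ("epa_reg_no", lambda s: len(s) > 0),
    ("quantity", lambda s: s.replace(".", "", 1).isdigit()),
    ("date_applied", is_date),
]


def check_fields(fields):
    """Return a list of problems found in one record's fields (empty if valid)."""
    return [
        f"bad value in {name}: {value!r}"
        for (name, check), value in zip(FIELD_SPEC, fields)
        if not check(value)
    ]


def verify_file(lines, sep="|"):
    """Return (valid_records, rejected_records, file_rejected)."""
    valid, rejected = [], []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split(sep)
        if len(fields) != len(FIELD_SPEC):
            # Unreadable file: a record has the wrong number of fields.
            return [], [(number, ["wrong number of fields"])], True
        problems = check_fields(fields)
        if problems:
            rejected.append((number, problems))
        else:
            valid.append(line)
    if len(rejected) > MAX_REJECTED:
        return [], rejected, True  # nothing is loaded from a rejected file
    return valid, rejected, False


lines = ["1001|524-308|12.5|03/15/2000", "1002|524-308|abc|03/16/2000"]
valid, rejected, file_rejected = verify_file(lines)
```

In this sketch the rejected list plays the role of the audit report: it records the record number and the reason for each rejection.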
[Diagram: file verification flow. Labels: PSUR Electronic File Specification; Registered Files; Registered File; Verify File Name; Named Registered File; Verify Field Data Types; Accepted File; Load Records; Database; Rejected File; Rejected File Audit Report; Audit Report; Return to UNIX.]
Audit Reporting
• Audit reports sent out to Lason and Compaq
• Consulting provided to data entry vendors
  – Reports summarized
  – Procedures for retransmission of files clarified
  – Causes of audit report errors investigated further
• Data entry vendors retransmit corrected files
• Files received are checked against the audit reports to confirm that all files have been replaced
• Media registration and file verification steps repeated for corrected files
• Entire cycle repeated until all files pass file verification or a deadline is reached
Setup Data
[Diagram: sources of setup data feeding PSUR: DEC Product Registration Database, DEC Commercial Permit Database, DEC Applicator Certification Database, DEC Business Registration Database, and Commercial Zip Code Data.]
Setup Tables
• Tables containing data used to check the validity of other data
• Setup data from DEC departmental databases
  – 3 databases provide our business and applicator data
    • Names and addresses
    • Identification numbers
    • Status of information (current/expired)
• Product registration database provides our product data
  – EPA registration numbers
  – Product names
  – Status of information (current/expired)
• Data vendor feeds our zip code location data
  – Zip codes
  – Municipalities
  – Zip code/municipality relationships
  – Zip code/county relationships
  – Status of information (current/expired)
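One way to picture a setup table is as a lookup keyed by the value being checked. The Python sketch below models the zip code table with its county relationships and current/expired status. The class name ZipEntry, the sample rows, and the choice to treat expired entries as invalid are assumptions for illustration, not the PSUR schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZipEntry:
    zip_code: str
    counties: frozenset  # a zip code can relate to more than one county
    current: bool        # status of information: current (True) or expired


# Illustrative rows only; the real table is fed by a data vendor.
ZIP_TABLE = {
    entry.zip_code: entry
    for entry in [
        ZipEntry("14850", frozenset({"Tompkins"}), True),
        ZipEntry("99999", frozenset({"Example"}), False),
    ]
}


def zip_is_valid(zip_code, county=None):
    """True if the zip code is present and current, and matches county if given."""
    entry = ZIP_TABLE.get(zip_code)
    if entry is None or not entry.current:
        return False
    return county is None or county in entry.counties


print(zip_is_valid("14850", "Tompkins"))
print(zip_is_valid("99999"))
```

The business, applicator, and product tables could follow the same pattern, keyed by identification number or EPA registration number.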
Data Validation Workflow
• 4 application processes, 1 per record type
• Processes identify business or applicator that reported data
  – Each input record contains 1 or 2 DEC-issued identification numbers
  – Separate ID numbers for:
    • Pesticide applicators
    • Pesticide application businesses
    • Pesticide sales businesses
  – Record matched to setup table using ID number
  – If no match found, record stored as an “unknown” business
• Table that records which businesses have filed reports is updated if necessary
• Data edits and data vendor checks are performed
• 3 types of edits
  – Data validations
    • Fields checked against setup tables
      – Ex., Zip code valid?
    • Value checks
      – Ex., Date month and day within valid ranges?
    • Presence checks
      – Ex., Required fields reported?
  – Vendor audits
    • Data checked for conformity with data entry specifications
      – Ex., ID numbers checked to verify the vendor’s ID number validation
  – Vendor flags
    • Edits that check whether data codes, not actual values, were transmitted
      – Ex., Illegible fields are keyed as “??” and indecipherable fields as “$$”
• Records with date ranges
  – DEC allows reporting of a date range in special circumstances
    • Ex., Jan 1 - May 1
  – Application program divides quantity of pesticide applied by number of days
  – One record created for each day in the date range
    • Multiple records created from single input record
• Quantities converted from reported units of measure into gallons or pounds
  – Both measurements used because data to convert pesticide products into a single unit of measure is not available to us
    • Various government agencies collect this data but it is either confidential or incomplete
  – Initial reported values also retained
• After validations performed, reporting database tables are loaded
  – Tables are separate from those used to hold the input records
• Database now ready for reporting
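The date range handling described above (quantity divided by the number of days, one record per day) can be sketched in Python as follows. The function name expand_date_range, the record keys, and the decision to count both the start and end dates are assumptions; the slides do not say whether the range is inclusive.

```python
from datetime import date, timedelta


def expand_date_range(record, start, end):
    """Create one record per day, each carrying an equal share of the quantity.

    `record` is a dict with a "quantity" key. The range is treated as
    inclusive of both `start` and `end` (an assumption).
    """
    days = (end - start).days + 1
    if days < 1:
        raise ValueError("end date precedes start date")
    daily_quantity = record["quantity"] / days
    return [
        {**record, "date": start + timedelta(days=offset), "quantity": daily_quantity}
        for offset in range(days)
    ]


# Hypothetical input record reported for Jan 1 - Jan 4.
reported = {"epa_reg_no": "EXAMPLE-1", "quantity": 10.0, "unit": "gallons"}
daily = expand_date_range(reported, date(2000, 1, 1), date(2000, 1, 4))
print(len(daily), daily[0]["quantity"])  # 4 records of 2.5 gallons each
```

Because every daily record copies the other fields unchanged, the daily quantities add back up to the single reported total.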
Generate Reports
• Report generation programs are run
  – Output is a text file without any formatting
• Output files transferred to Windows for report formatting
  – FileMaker Pro database
    • More user-friendly labels are added
    • Records are grouped by county or zip code
• Subtotals and counts are calculated
• Adobe Acrobat files generated from FileMaker
• Acrobat files placed on PSUR FTP site for Reporting Section to download
• CDs made of all reports for storage
• 8 reports delivered in accordance with legislative mandate
  – Total pesticide applications by EPA registration number for county, zip code, and entire state
  – Total retail pesticide sales by EPA registration number for county, zip code, and entire state
  – Total wholesale pesticide sales by EPA registration number to commercial applicators for end use
  – Total wholesale pesticide sales by EPA registration number to resellers
• 5 database statistics reports are delivered for use in the DEC report narrative
  – Number of different products used statewide and by county
  – Top 10 products applied in each county
    • Separate listings for gallons and pounds
  – Top 10 products applied statewide
    • Separate listings for gallons and pounds
• Queries done against the database produce additional statistics
  – Number of commercial applicators that reported
  – Number of commercial permittees that reported
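The "top 10 products applied in each county" statistic, with separate listings for gallons and pounds, can be sketched in Python as below. The production reports come from database programs and FileMaker Pro, so this is only an illustration of the grouping logic; the function name top_products, the record keys, and the sample data are hypothetical.

```python
from collections import defaultdict


def top_products(records, n=10):
    """Return {(county, unit): [(epa_reg_no, total), ...]} with the top n totals."""
    totals = defaultdict(float)
    for rec in records:
        key = (rec["county"], rec["unit"], rec["epa_reg_no"])
        totals[key] += rec["quantity"]

    grouped = defaultdict(list)
    for (county, unit, reg_no), total in totals.items():
        grouped[(county, unit)].append((reg_no, total))

    # Rank each county/unit group by total quantity, largest first.
    return {
        key: sorted(items, key=lambda item: item[1], reverse=True)[:n]
        for key, items in grouped.items()
    }


# Hypothetical records, for illustration only.
records = [
    {"county": "Tompkins", "unit": "gallons", "epa_reg_no": "A-1", "quantity": 5.0},
    {"county": "Tompkins", "unit": "gallons", "epa_reg_no": "B-2", "quantity": 7.5},
    {"county": "Tompkins", "unit": "gallons", "epa_reg_no": "A-1", "quantity": 4.0},
    {"county": "Tompkins", "unit": "pounds", "epa_reg_no": "C-3", "quantity": 2.0},
]
ranking = top_products(records)
print(ranking[("Tompkins", "gallons")])
```

Keeping the unit in the grouping key is what yields the separate gallon and pound listings, since the two units are never summed together.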
Reformat Files for Web
• Scripts are run in the report generation database to add HTML tags to the report files
• Files broken up into sizes suitable for web browsing
• Files are exported into the web site management tool
• Index web pages, navigation bars, text, and standardized page formatting are applied
• Files are published on the Pesticide Management Education Program web site
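The first two steps, adding HTML tags and breaking files into browsable sizes, can be illustrated with a short Python sketch. The actual scripts run inside the report generation database, so this is a stand-in rather than the project's code; the function name to_html_pages, the table layout, and the 50-row page size are assumptions.

```python
from html import escape


def to_html_pages(title, header, rows, rows_per_page=50):
    """Split report rows into pages and wrap each page in a simple HTML table."""
    pages = []
    for start in range(0, len(rows), rows_per_page):
        chunk = rows[start:start + rows_per_page]
        head = "".join(f"<th>{escape(col)}</th>" for col in header)
        body = "\n".join(
            "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
            for row in chunk
        )
        page_no = start // rows_per_page + 1
        pages.append(
            f"<html><head><title>{escape(title)} (page {page_no})</title></head>\n"
            f"<body><table>\n<tr>{head}</tr>\n{body}\n</table></body></html>"
        )
    return pages


# Hypothetical report rows, for illustration only.
rows = [["County A & B", 1], ["County C", 2], ["County D", 3]]
pages = to_html_pages("Report", ["County", "Total"], rows, rows_per_page=2)
print(len(pages))
```

Escaping each cell keeps characters such as "&" in product or business names from breaking the generated markup.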