Full Data Processing Workflow Description
[Figure: overview diagram of the full workflow. Stages shown: Data Entry, Media Registration, Verify Files, Audit Reporting, Prepare Control/Setup Info, Load and Setup, Perform Validation, Generate Reports, and Reformat Files For Web. Report outputs shown: Governor's Report, Internal DEC, Advisory Board, Public web pages, and ad hoc report requests.]

Data Entry Workflow
• Forms mailed to DEC by February 1
• DEC logs ID number of reporting business into Report Tracking database
  – Business or applicator name and address returned
  – Data used to track which businesses have not filed
• Diskette submissions are batched and sent to Compaq
• Forms are boxed and boxes numbered
  – These numbers ultimately stored in PSUR database
• Reports sent by businesses to replace previous submissions batched separately
  – Project team will replace data in database
• Optical character recognition forms boxed separately
• Lason picks up forms
• Lason scans paper forms and marks form with image number
  – Image number stored in PSUR database
• Form images sent overseas for data entry
  – Machine generated forms to China
  – Hand written forms to India
  – OCR forms to Mexico
• Data key punched or scanned
• Quality control steps are performed
  – Questionable values checked against images
  – Computer program edits data based on PSUR data entry specifications
  – Identification numbers and EPA registration numbers looked up using set-up data from the DEC pesticide management systems (provided by PSUR)
• CDs containing all scanned source reports sent to DEC
  – DEC does QA/QC on indecipherable/illegible reports?
• Data formatted into four file types based on the report forms (form numbers: 25, 26, 26A, 27)
• Files transmitted from overseas to Cornell using the Internet
• File transfer reports e-mailed to database manager

Media Registration
• Files received by PSUR checked against file names on transfer report
• Files moved from FTP (Internet) site to our Media Registration application
  – Windows application that connects to main database on the Unix server
  – Updates four database tables
    • File information
    • Media (FTP, diskette, etc.) information
    • Administrative action that occurred
    • Who sent us the media

File Verification
• Files transferred from media registration application to the main Unix server
• Compaq files reformatted into same format as Lason files
• Applications written in Informix 4GL verify whether files are readable by the computer
  – Required number of fields present
  – Required separators between fields present
  – Files named so the type of data can be identified (applications, sales, etc.)
  – Fields formatted correctly
  – Fields contain required data types (numbers, characters, etc.)
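
The readability checks listed above (file name, field count, separators, data types) can be sketched roughly as follows. This is a minimal Python illustration, not the Informix 4GL applications the project used. The separator character, the file-name prefixes, and the field layouts are all hypothetical, since the PSUR electronic file specification is not reproduced here.

```python
import re
from typing import List, Optional

# Hypothetical layout: the separator, file-name prefixes, and field lists
# below are assumptions for illustration, not the PSUR file specification.
SEPARATOR = "|"


def is_integer(value: str) -> bool:
    """True when the field holds only digits."""
    return value.isdigit()


def is_date(value: str) -> bool:
    """Format check only (MM/DD/YYYY); range checks come later."""
    return re.fullmatch(r"\d{2}/\d{2}/\d{4}", value) is not None


def is_text(value: str) -> bool:
    """True when the field is not blank."""
    return value.strip() != ""


# File-name prefix -> ordered list of (field name, type check).
FILE_SPECS = {
    "app": [("business_id", is_integer), ("date_applied", is_date),
            ("epa_reg_no", is_text), ("quantity", is_integer)],
    "sale": [("business_id", is_integer), ("date_sold", is_date),
             ("epa_reg_no", is_text)],
}


def file_type(file_name: str) -> Optional[str]:
    """Identify the type of data from the file name, or None if unknown."""
    for prefix in FILE_SPECS:
        if file_name.lower().startswith(prefix):
            return prefix
    return None


def verify_record(line: str, spec) -> List[str]:
    """Return the problems found in one record (empty list if readable)."""
    fields = line.rstrip("\n").split(SEPARATOR)
    if len(fields) != len(spec):
        return [f"expected {len(spec)} fields, found {len(fields)}"]
    return [f"{name}: bad value {value!r}"
            for (name, check), value in zip(spec, fields)
            if not check(value)]
```

A caller would first pick the layout with `file_type(name)`, then run `verify_record` on each line and collect the returned messages into an audit report.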
File Verification
• If file is unreadable, file is rejected and an audit report is generated
  – Wrong number of fields in data record
• Invalid records are rejected and an audit report is generated
  – Field lengths
  – Date types
  – Record numbering
• If more than 100 records rejected from a file, the entire file is rejected
  – Audit report generated
• Only valid records are loaded into the database
  – Record loaded into tables in “as is” format
  – Data is not ready for reporting at this point

[Figure: file verification flow diagram. Labels: PSUR Electronic File Specification, Registered Files, Verify File Name, Verify Field Data Types, Load Records, Accepted File, Rejected File, Audit Report, Database.]

Audit Reporting
• Audit reports sent out to Lason and Compaq
• Consulting provided to data entry vendors
  – Reports summarized
  – Procedures for retransmission of files clarified
  – Causes of audit report errors investigated further
• Data entry vendors retransmit corrected files
• Files received are checked against the audit reports to confirm that all files have been replaced
• Media registration and file verification steps repeated for corrected files
• Entire cycle repeated until all files pass file verification or a deadline is reached

Setup Data
[Figure: setup data sources feeding PSUR. Sources shown: DEC Product Registration Database, DEC Commercial Permit Database, DEC Applicator Certification Database, DEC Business Registration Database, Commercial Zip Code Data.]

Setup Tables
• Tables containing data used to check the validity of other data
• Set up data from DEC departmental databases
  – 3 databases provide our business and applicator data
    • Names and addresses
    • Identification numbers
    • Status of information (current/expired)
• Product registration database provides our product data
  – EPA registration numbers
  – Product names
  – Status of information (current/expired)
• Data vendor feeds our zip code location data
  – Zip codes
  – Municipalities
  – Zip code/municipality relationships
  – Zip code/county relationships
  – Status of information (current/expired)
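
As a rough illustration of how a setup table can be used to check other data, the Python sketch below looks up a zip code and returns its county. The table contents, the `ZipCodeRow` and `county_for_zip` names, and the choice to treat expired rows as invalid are all assumptions for this example, not details taken from the PSUR database.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ZipCodeRow:
    zip_code: str
    county: str
    current: bool  # status of information: True = current, False = expired


# Hypothetical rows standing in for the vendor-supplied zip code data.
ZIP_SETUP: Dict[str, ZipCodeRow] = {
    row.zip_code: row
    for row in [
        ZipCodeRow("14850", "Tompkins", True),
        ZipCodeRow("12207", "Albany", True),
        ZipCodeRow("00001", "Placeholder", False),  # made-up expired entry
    ]
}


def county_for_zip(zip_code: str) -> Optional[str]:
    """Return the county for a known, current zip code, else None."""
    row = ZIP_SETUP.get(zip_code)
    # Assumption: an expired setup row is treated the same as no match.
    if row is None or not row.current:
        return None
    return row.county
```

The same pattern would apply to the business, applicator, and product setup tables, with the ID number or EPA registration number as the lookup key.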
Data Validation Workflow
• 4 application processes, 1 per record type
• Processes identify business or applicator that reported data
  – Each input record contains 1 or 2 DEC issued identification numbers
  – Separate ID numbers for:
    • Pesticide applicators
    • Pesticide application businesses
    • Pesticide sales businesses
  – Record matched to set up table using ID number
  – If no match found, record stored as an “unknown” business
• Table that records which businesses have filed reports is updated if necessary
• Data edits and data vendor checks are performed
• 3 types of edits
  – Data validations
    • Fields checked against set up tables
      – Ex., Zip code valid?
    • Value checks
      – Ex., Date month and day within valid ranges?
    • Presence checks
      – Ex., Required fields reported?
  – Vendor audits
    • Data checked for conformity with data entry specifications
      – Ex., ID numbers checked to verify the vendor’s ID number validation
  – Vendor flags
    • Edits that check whether data codes, rather than actual values, were transmitted
      – Ex., Illegible fields are keyed as “??” and indecipherable fields as “$$”
• Records with date ranges
  – DEC allows reporting of a date range in special circumstances
    • Ex., Jan 1 - May 1
  – Application program divides quantity of pesticide applied by number of days
  – One record created for each day in the date range
    • Multiple records created from single input record
• Quantities converted from reported units of measure into gallons or pounds
  – Both measurements used because data to convert pesticide products into a single unit of measure is not available to us
    • Various government agencies collect this data but it is either confidential or incomplete
  – Initial reported values also retained
• After validations performed, reporting database tables are loaded
  – Tables are separate from those used to hold the input records
• Database now ready for reporting

Generate Reports
• Report generation programs are run
  – Output is a text file without any formatting
• Output files transferred to Windows for report formatting
  – FileMaker Pro database
• More user friendly labels are added
• Records are grouped by county or zip code
• Subtotals and counts are calculated
• Adobe Acrobat files generated from FileMaker
• Acrobat files placed on PSUR FTP site for Reporting Section to download
• CDs made of all reports for storage
• 8 reports delivered in accordance with legislative mandate
  – Total pesticide applications by EPA registration number for county, zip code, and entire state
  – Total retail pesticide sales by EPA registration number for county, zip code, and entire state
  – Total wholesale pesticide sales by EPA registration number to commercial applicators for end use
  – Total wholesale pesticide sales by EPA registration number to resellers
• 5 database statistics reports are delivered for use in the DEC report narrative
  – Number of different products used statewide and by county
  – Top 10 products applied in each county
    • Separate listings for gallons and pounds
  – Top 10 products applied statewide
    • Separate listings for gallons and pounds
• Queries done against the database produce additional statistics
  – Number of commercial applicators that reported
  – Number of commercial permittees that reported

Reformat Files for Web
• Scripts are run in the report generation database to add HTML tags to the report files
• Files broken up into sizes suitable for web browsing
• Files are exported into the web site management tool
• Index web pages, navigation bars, text, and standardized page formatting are applied
• Files are published on the Pesticide Management Education Program web site
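
The two web-preparation steps above (adding HTML tags and breaking files into browsable sizes) might look something like the Python sketch below. The project ran its scripts inside the report generation database, so this is only a stand-in. The function name `to_html_pages`, the default of 50 rows per page, and the bare `<pre>` layout are assumptions made for illustration.

```python
import html
from typing import List


def to_html_pages(title: str, lines: List[str],
                  rows_per_page: int = 50) -> List[str]:
    """Split plain-text report lines into pages and wrap each in HTML."""
    if rows_per_page < 1:
        raise ValueError("rows_per_page must be at least 1")
    # Break the report into fixed-size chunks; an empty report still
    # yields one (empty) page so that a file is always produced.
    chunks = [lines[i:i + rows_per_page]
              for i in range(0, len(lines), rows_per_page)] or [[]]
    pages = []
    for number, chunk in enumerate(chunks, start=1):
        # Escape characters such as & and < so report text displays as-is.
        body = "\n".join(html.escape(line) for line in chunk)
        pages.append(
            "<html><head><title>"
            f"{html.escape(title)} (page {number} of {len(chunks)})"
            "</title></head>\n"
            f"<body><pre>\n{body}\n</pre></body></html>"
        )
    return pages
```

Each returned string would be written to its own file. Index pages, navigation bars, and standardized formatting are left to the web site management tool, as in the workflow described above.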