The Quality Assessment / Quality Control (QA/QC) script implements a checklist of conditions that input data must meet before PAD-US aggregation. The PAD-US development team runs this Python script to generate a QA/QC report covering each checklist item. Issues flagged in the report support additional review and formatting by data stewards or the PAD-US Team, so that all PAD-US input files share a common format to the extent possible. The script is run each time a source file is submitted, which supports dialog with data stewards to make corrections as needed. The checklist is summarized below:
QA/QC CHECKLIST FOR INPUTS TO PAD-US
- Check Projection and Geographic Coordinate System.
- Check format and structure, or check out a copy of the previous PAD-US version.
- Ensure all records are complete for required attributes.
- Ensure all coded domains are assigned.
- Check for unknown or blank Unit Names (for example, “unknown,” “Unk,” or “ ”).
- Check for records where Unit Name does not meet the following standards: Proper Case, with no extra spaces, special characters, tabs, or returns.
- Ensure reference fields are attributed and meet standards (for example, GIS Source Date = yyyy/mm/dd).
- Flag duplicate polygons when polygons overlap and attributes are the same.
- Flag records where Owner Name and Owner Type do not match the standard crosswalk.
- Flag records where Manager Name and Manager Type do not match the standard crosswalk.
- Flag duplicate Source_PAID records by Aggregator Source.
- Identify, report, and delete records with zero geometry.
- Apply categorical GAP Status Code, IUCN Category, and Public Access measures in the absence of local review (that is, where Code Source = blank, null, or GAP - Default).
- Flag locally reviewed GAP Codes that do not match categorical assignments for manual review.
- Check and repair geometry.
- Compare and / or clip input data to Census state jurisdictional boundaries, as needed, and attribute State Name.
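Several of the attribute checks in the list above can be illustrated with a short Python sketch. This is not the PAD-US team's script. The field names `Unit_Nm`, `Src_Date`, and `Agg_Src`, the set of allowed punctuation, and the simple all-upper / all-lower test for Proper Case are assumptions made only for this example; `Source_PAID` is taken from the checklist.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical values treated as "unknown"; the real list may be longer.
UNKNOWN_NAMES = {"", "unknown", "unk"}


def valid_date(value):
    """True if value is a real calendar date written as yyyy/mm/dd."""
    if not re.fullmatch(r"\d{4}/\d{2}/\d{2}", value or ""):
        return False
    try:
        datetime.strptime(value, "%Y/%m/%d")
    except ValueError:
        return False
    return True


def check_record(rec):
    """Return a list of issue labels for one attribute record (a dict)."""
    issues = []
    name = rec.get("Unit_Nm") or ""  # assumed field name
    if name.strip().lower() in UNKNOWN_NAMES:
        issues.append("unit name unknown or blank")
    else:
        if name != name.strip() or "  " in name:
            issues.append("unit name has extra spaces")
        if re.search(r"[\t\r\n]", name):
            issues.append("unit name has tabs or returns")
        # Assumed rule: letters, digits, whitespace and . , ' & ( ) / - are allowed.
        if re.search(r"[^\w\s.,'&()/-]", name):
            issues.append("unit name has special characters")
        # Rough stand-in for a Proper Case test.
        if name.isupper() or name.islower():
            issues.append("unit name not in proper case")
    if not valid_date(rec.get("Src_Date")):  # assumed field name
        issues.append("source date not yyyy/mm/dd")
    return issues


def duplicate_source_ids(records):
    """Return (aggregator source, Source_PAID) pairs that occur more than once."""
    counts = Counter((r.get("Agg_Src"), r.get("Source_PAID")) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)
```

Each function only reports issues and does not modify the record, in keeping with the flag-then-review workflow described above.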
Two methods are used to check and assign State Name attributes. The main method is to ‘clip’ all input data to the boundaries provided by the Census State Boundary File; this creates a new polygon for each piece of a source polygon that crosses a state line. The second method uses a selection by location to select and attribute records based on the location of their centroid; this does not alter the shape of the polygon.
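The difference between the two methods can be shown with a small, self-contained Python sketch. It is a toy illustration under strong assumptions, not the PAD-US workflow: the "states" are hypothetical axis-aligned boxes rather than Census boundaries, polygons are plain lists of (x, y) vertices with nonzero area, and clipping uses the Sutherland-Hodgman algorithm, which is swapped in here for whatever GIS clip tool is actually used.

```python
def signed_area(ring):
    """Signed area of a polygon ring given as [(x, y), ...] (shoelace formula)."""
    return 0.5 * sum(x0 * y1 - x1 * y0
                     for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]))


def centroid(ring):
    """Area centroid of a simple polygon; assumes nonzero area."""
    a = signed_area(ring)
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (6 * a), cy / (6 * a)


def clip_to_box(ring, box):
    """Sutherland-Hodgman clip of a ring to a box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    # Each clip edge: (coordinate index, bound, keep points >= bound?)
    edges = [(0, xmin, True), (0, xmax, False), (1, ymin, True), (1, ymax, False)]
    for axis, bound, keep_ge in edges:
        def inside(p):
            return p[axis] >= bound if keep_ge else p[axis] <= bound

        def crossing(p, q):
            t = (bound - p[axis]) / (q[axis] - p[axis])
            return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

        out = []
        for p, q in zip(ring, ring[1:] + ring[:1]):
            if inside(p):
                out.append(p)
                if not inside(q):
                    out.append(crossing(p, q))
            elif inside(q):
                out.append(crossing(p, q))
        ring = out
        if not ring:
            break
    return ring


def assign_by_centroid(ring, states):
    """Method 2: attribute the whole polygon to the state containing its centroid."""
    cx, cy = centroid(ring)
    for name, (xmin, ymin, xmax, ymax) in states.items():
        if xmin <= cx < xmax and ymin <= cy < ymax:
            return name
    return None


def split_by_clip(ring, states):
    """Method 1: return one clipped piece per state the polygon overlaps."""
    pieces = {}
    for name, box in states.items():
        piece = clip_to_box(ring, box)
        if abs(signed_area(piece)) > 0:  # drop empty or zero-area slivers
            pieces[name] = piece
    return pieces


# Hypothetical example: a parcel straddling the line x = 10 between two "states".
states = {"State A": (0, 0, 10, 10), "State B": (10, 0, 20, 10)}
parcel = [(8, 2), (14, 2), (14, 6), (8, 6)]
pieces = split_by_clip(parcel, states)         # two polygons, one per state
state_name = assign_by_centroid(parcel, states)  # one state, shape unchanged
```

In this example the clip method yields two pieces, one in each state, while the centroid method assigns the unaltered parcel to the single state that contains its centroid.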