Data Download Help
Data in the HRSA Data Warehouse (HDW) are available for download in a variety of formats, including text-only, CSV, Excel, PDF, DBF, SAS, SPSS, and ZIP. Some common issues with downloaded files are described below.
Some HDW reports contain text values that are composed solely of numeric characters (most commonly ZIP codes, FIPS codes, and some types of identification numbers). When Excel opens a CSV file, it makes a “best-guess” attempt to determine the type of data in each column. Excel does not recognize these special cases and interprets the values as numbers, which causes the leading zeros to be lost. For example, the ZIP code 01234 would be displayed and processed as 1234, which is not a valid ZIP code.
The data columns that typically experience this issue are:
- ZIP codes
- State FIPS codes
- County FIPS codes
- County Subdivision codes
- Census Tract numbers or FIPS codes
- Health Professional Shortage Area (HPSA) identification numbers
- Medically Underserved Area/Population (MUA/P) identification numbers
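To ensure that the data are handled properly, carefully review the data type for each column when importing the file. Set the column type to “Text” instead of “General” (the default) to tell Excel to treat the data in that column as text rather than as numbers. Make this change during import: once Excel has converted a value to a number, changing the column type afterward does not restore the lost leading zeros.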
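Users who process the CSV files outside Excel can avoid the problem by reading every column as text. The following is a minimal Python sketch, not an HDW-provided tool. The column names (`zip_code`, `state_fips`, `county_fips`), their widths, and the sample rows are assumptions made for illustration and should be adjusted to match the downloaded file.

```python
import csv
import io

# Hypothetical column names and fixed widths; adjust to the actual file.
FIXED_WIDTHS = {"zip_code": 5, "state_fips": 2, "county_fips": 3}

def read_rows_as_text(handle):
    """Read CSV rows. The csv module returns every field as a string,
    so leading zeros present in the file are kept."""
    return list(csv.DictReader(handle))

def restore_leading_zeros(row, widths=FIXED_WIDTHS):
    """Left-pad all-digit codes that lost their zeros in an earlier step."""
    fixed = dict(row)
    for column, width in widths.items():
        value = fixed.get(column)
        if value and value.isdigit():
            fixed[column] = value.zfill(width)
    return fixed

# Made-up sample: the second row has already lost its leading zeros.
sample = io.StringIO(
    "zip_code,state_fips,county_fips\n"
    "01234,09,001\n"
    "2108,25,25\n"
)
rows = [restore_leading_zeros(r) for r in read_rows_as_text(sample)]
```

Padding is only safe for columns known to have a fixed width. It cannot recover values that were altered in other ways.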
To access a .DBF file, first download it to your system. To open the file in Excel, start Excel and select File > Open. In the file type drop-down box, select “All Files”, navigate to the folder where the file was saved, and double-click the name of the downloaded .DBF file.
The ZIP9 to PCSA v3.1 crosswalk file is too large to be opened with Microsoft Excel. Users who are familiar with SAS can use the provided SAS script to open the file. The SAS script demonstrates the following three actions:
- Displays the first 100 records;
- Displays records filtered for selected 5-digit ZIP Codes (the provided SAS code uses ZIP Codes 22033 and 20852 as examples); and
- Exports records filtered for selected 5-digit ZIP Codes (again 22033 and 20852 in the provided SAS code) to the Excel file “PCSA_zip9_pcsav31.xlsx”.
To use the script, do the following:
- Download the DBF file “zip9_pcsav31.dbf” to the folder “C:\temp” – the script will look for the data file in that folder.
- Start SAS and run the SAS code “zip9_pcsav31.sas”. As is, this SAS code will perform the three actions described above.
- To select the ZIP Codes which you require, please change the 5-digit ZIP Codes for Data_for_Zip as necessary. For example, change ('22033','20852') to ('96789','96701','96786') to export the Mililani, Aiea, and Wahiawa ZIP Codes.
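For users without SAS, the same three actions can be approximated in Python. The sketch below is not the provided SAS script. It assumes the crosswalk records have already been loaded as dictionaries (for example, with a DBF reader of your choice), and the field names `ZIP9` and `PCSA` as well as the sample values are hypothetical.

```python
import itertools

def first_n(records, n=100):
    """Return the first n records, mirroring the 'first 100 records' step."""
    return list(itertools.islice(records, n))

def filter_by_zip5(records, zip5_codes, zip_field="ZIP9"):
    """Yield records whose 9-digit ZIP starts with one of the 5-digit ZIP Codes.

    zip_field is a hypothetical field name; check the actual DBF layout.
    Values are compared as text so that leading zeros are preserved.
    """
    wanted = set(zip5_codes)
    for record in records:
        zip9 = str(record[zip_field]).strip()
        if zip9[:5] in wanted:
            yield record

# Made-up sample records for illustration only.
sample = [
    {"ZIP9": "220331234", "PCSA": "A"},
    {"ZIP9": "208520001", "PCSA": "B"},
    {"ZIP9": "967890000", "PCSA": "C"},
]
selected = list(filter_by_zip5(sample, ("22033", "20852")))
preview = first_n(sample, 2)
```

To select other ZIP Codes, pass a different tuple, such as `("96789", "96701", "96786")`.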
By default, Microsoft Excel removes leading zeros when data are imported into an Excel file, which can result in invalid data. For example, the ZIP code 02108 would appear as 2108, which is not a valid ZIP code. To preserve the integrity of the data, the data warehouse adds quotation marks around the data in automatic Excel download files. To remove the quotation marks, use the find and replace function in Excel to find the quotation mark character (") and replace it with no value.
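When the data are processed in a script rather than in Excel, the protective quotation marks can be removed programmatically while keeping the values as text. This is a minimal Python sketch under the assumption that the marks are plain double quotes wrapped around each value. The helper name `strip_quotes` is hypothetical.

```python
def strip_quotes(value):
    """Remove surrounding whitespace and wrapping double quotes.

    The result stays a string, so leading zeros are preserved.
    """
    return value.strip().strip('"')

# Made-up sample values for illustration.
cleaned = [strip_quotes(v) for v in ['"02108"', '"01234"', "Boston"]]
```

Unlike a global find and replace, this removes quotation marks only at the start and end of a value, so any that appear inside the text are left alone.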
Download the KML files (or “shape files”) and use them with mapping applications such as Google Maps to work with spatial data. The files contain additional data fields about the shapes and points.
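KML is an XML format, so the names of the shapes and points can also be inspected with a script. The sketch below uses Python's standard library and assumes the file uses the standard KML 2.2 namespace and `Placemark` elements. The sample document is made up, and the structure of actual HDW files may differ.

```python
import xml.etree.ElementTree as ET

# Standard KML 2.2 namespace; assumed here, so verify against the file.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

def placemark_names(kml_text):
    """Return the <name> of each Placemark (None if a Placemark has no name)."""
    root = ET.fromstring(kml_text)
    names = []
    for placemark in root.iterfind(".//kml:Placemark", KML_NS):
        name = placemark.find("kml:name", KML_NS)
        names.append(name.text if name is not None else None)
    return names

# Made-up sample document for illustration only.
sample_kml = (
    '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
    "<Placemark><name>Example site</name>"
    "<Point><coordinates>-77.0,38.9,0</coordinates></Point></Placemark>"
    "</Document></kml>"
)
names = placemark_names(sample_kml)
```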
MS Access Files
To begin using downloaded data in Microsoft Access, double-click the Setup.exe file included in the download and follow the setup wizard's instructions to set up the Microsoft Access database version of the data.
Many downloadable data files in the data warehouse are compressed, or zipped, to reduce the file size and to allow multiple files to be packaged and downloaded as one unit.
To unzip a compressed file, double-click the downloaded .zip file to open it. Then drag and drop its contents to a folder elsewhere on your computer to begin working with the data.
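Compressed downloads can also be extracted with a script, which is convenient when handling many files. The following is a minimal sketch using Python's standard `zipfile` module. The function name and the paths in the usage comment are placeholders, not HDW file names.

```python
import zipfile
from pathlib import Path

def extract_download(zip_path, target_dir):
    """Extract every file in the archive into target_dir.

    Creates target_dir if needed and returns the list of archive members.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return archive.namelist()

# Usage with placeholder paths:
# extract_download("downloaded_file.zip", "extracted_data")
```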