Data Releases by the Numbers
October 1, 2016, marked the first day of the new fiscal year and the transition of the USGS public access guidance, “Review and Approval of Scientific Data for Release,” from an Instructional Memorandum (“IM”) to formal policy. For a little over a year, the ScienceBase team has been developing and refining a process for scientists to release these formal data products on the ScienceBase platform. During that time, new ScienceBase features have been built to support this effort, and instructional materials and training have been developed. The ScienceBase team appreciates the patience and input of scientists and data managers as the process has evolved and improved.
We’d like to look back at the ScienceBase data release effort in fiscal year 2016 (FY16) and provide some statistics on the system’s use as these new policy guidelines have evolved. In FY16, a total of 116 data releases were finalized in ScienceBase (additional data release products are in progress). The majority of the data releases came from the Water Mission Area (figure 1). The ScienceBase team saw a general increase in the number and frequency of data releases over the course of the fiscal year (figure 2).
Figure 1. Number of data releases in ScienceBase by Mission Area as of October 1, 2016
Figure 2. Number of data releases in ScienceBase by month in FY16
Readers should note that there is an established process for completing a formal USGS data release product in ScienceBase (https://www.sciencebase.gov/about/content/data-release). The workflow helps streamline the process for authors, ensures the final data products are labeled and cataloged consistently for system tracking purposes, and ensures that USGS policy guidelines are being met for these resources within ScienceBase. User guidance is also available from the ScienceBase team.
Calculating and Storing Checksums in ScienceBase
As of August 2016, ScienceBase calculates and stores MD5 checksums for data files. A checksum is an alphanumeric string calculated from the contents of an uploaded and stored data file; if the contents change, the checksum will almost certainly change as well. Checksums are used to detect whether errors have been introduced during transmission or storage. For users familiar with the JSON representation of a ScienceBase item, checksum information is stored in the “files” field of the JSON object. An example is shown below:
This feature will help with data management and preservation goals as ScienceBase works towards becoming a trusted digital repository. In the future, checksums may be used to help avoid duplication of content in the ScienceBase system.
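To illustrate how a stored checksum can be used to confirm that a downloaded file matches what the repository holds, here is a minimal Python sketch. It is not ScienceBase code. The layout of each “files” entry (a “name” plus a “checksum” object with “type” and “value”) is an assumption made for illustration, and the function names md5_of_file, find_mismatches, and demo are hypothetical.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(item, directory):
    """Return names of files whose local MD5 differs from the recorded one.

    `item` is a dict shaped like an item's JSON; the field layout used
    here is an assumption, not a documented ScienceBase schema.
    """
    mismatched = []
    for entry in item.get("files", []):
        recorded = entry["checksum"]["value"]
        local = md5_of_file(os.path.join(directory, entry["name"]))
        if local.lower() != recorded.lower():
            mismatched.append(entry["name"])
    return mismatched


def demo():
    """Check a file against its recorded checksum before and after a change."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "sample.csv")
        with open(path, "wb") as handle:
            handle.write(b"site,value\nA,1\n")
        # Stand-in for the item JSON, built from the file as first stored.
        item = {"files": [{"name": "sample.csv",
                           "checksum": {"type": "MD5",
                                        "value": md5_of_file(path)}}]}
        before = find_mismatches(item, directory)
        # Simulate corruption or an unrecorded edit.
        with open(path, "ab") as handle:
            handle.write(b"B,2\n")
        after = find_mismatches(item, directory)
    return before, after
```

Calling demo() returns a pair of lists: the first is empty because the untouched file matches its recorded checksum, and the second names the file once its contents have changed.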
Relating Items in ScienceBase
Resources stored in ScienceBase are often related to one another. ScienceBase lets users specify the nature of the relationship between two resources. For example, one item can be identified as a “product of” another item. These defined relationships are called Associations in ScienceBase. Becoming familiar with the relationship labels and applying them consistently across a collection can be especially useful for users who want to highlight connections between resources.
To create an association, go to the item and, in the right column, click “Associate an Item.” This action displays a dialog box where users can search for and specify the related ScienceBase item by item ID or title. The pull-down menu is used to indicate the type of relationship between the two items. A user can also reverse the relationship by checking the box on the right. In the example, the item “Corn” is associated with another item called “The Farm.” As you can see, “Corn” is a product of “The Farm.”
Once you add the association, “Corn” is shown as a product of “The Farm,” and you can click the “The Farm” link to view its content.
Adding an association in one item will add the inverse relationship to the other item so that both are linked to one another. Now “The Farm” item contains an association indicating that it produced “Corn.”
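The two-way behavior described above can be pictured with a small Python model. This is a conceptual sketch only and does not use the ScienceBase API: the Item class, the associate function, and the INVERSE_LABELS table are hypothetical, and the “produced” label for the inverse side is assumed from the wording of the example.

```python
# Hypothetical label pairs; the inverse wording is assumed from the example.
INVERSE_LABELS = {"product of": "produced", "produced": "product of"}


class Item:
    """A stand-in for a ScienceBase item, holding only a title and links."""

    def __init__(self, title):
        self.title = title
        self.associations = []  # list of (label, related item title)


def associate(item, related, label, reverse=False):
    """Link two items and record the inverse relationship on the other one."""
    if reverse:
        # Mirrors the "reverse the relationship" checkbox: swap the roles.
        item, related = related, item
    item.associations.append((label, related.title))
    related.associations.append((INVERSE_LABELS[label], item.title))


corn = Item("Corn")
farm = Item("The Farm")
associate(corn, farm, "product of")

print(corn.associations)  # [('product of', 'The Farm')]
print(farm.associations)  # [('produced', 'Corn')]
```

Storing the inverse label at the moment the link is created is one simple way to keep both items in sync; how ScienceBase implements this internally is not described in this article.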
Associations are powerful features for linking items across the ScienceBase system. Note: when creating associations in the ScienceBase system, please ensure that linked items have the same permission settings. If you associate a public item with a private item, the relationship will appear on both items. Users who click a link to an item they don’t have permission to view will receive a “page not found” message.