
ScienceBase Catalog > Community for Data Integration (CDI) > CDI Projects Fiscal Year 2016

13 results

Recent open data policies of the Office of Science and Technology Policy (OSTP) and the Office of Management and Budget (OMB), which became fully enforceable on October 1, 2016, require that federally funded information products (publications, etc.) be made freely available to the public and that the underlying data on which their conclusions are based be released. A key aspect of these policies is that data collected by USGS programs must be shared with the public and that these data are subject to the review requirements of Fundamental Science Practices (FSP). These new policies add a substantial burden to USGS scientists and science centers; however, the upside of working towards compliance with...
Land-use researchers need the ability to rapidly compare multiple land-use scenarios over a range of spatial and temporal scales and to visualize spatial and nonspatial data; however, land-use datasets are often distributed as large tabular and spatial files. These formats are not ideal for the way land-use researchers interact with and share the data, and the datasets can quickly balloon in size. For example, land-use simulations for the Pacific Northwest, at 1-kilometer resolution, across 20 Monte Carlo realizations, can produce over 17,000 tabular and spatial outputs. A more robust management strategy is to store scenario-based, land-use datasets within a generalized...
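One such consolidation strategy, sketched below purely as an illustration (the file names, columns, and database path are hypothetical, not the project's), is to load per-realization outputs into a single queryable SQLite database with the DBI and RSQLite packages:

```r
# Illustrative sketch: consolidate per-realization land-use outputs into
# one SQLite database instead of thousands of loose files.
# File names, columns, and the database path are hypothetical.
library(DBI)
library(RSQLite)

con <- dbConnect(SQLite(), "landuse_scenarios.sqlite")

# Suppose each Monte Carlo realization produced one CSV of cell-level land use.
files <- list.files("outputs", pattern = "\\.csv$", full.names = TRUE)
for (f in files) {
  run <- read.csv(f)
  run$realization <- basename(f)  # tag rows with their source run
  dbWriteTable(con, "landuse", run, append = TRUE)
}

# Queries then replace scanning thousands of files, e.g. cell counts by class:
dbGetQuery(con, "SELECT year, land_use, COUNT(*) AS cells
                 FROM landuse GROUP BY year, land_use")
dbDisconnect(con)
```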
Wetland soils are vital to the Nation because of their role in sustaining water resources, supporting critical ecosystems, and sequestering significant concentrations of biologically produced carbon. The United States has the world’s most detailed continent-scale digital datasets for soils and wetlands, yet scientists and land managers have long struggled with the challenge of integrating these datasets for applications in research and in resource assessment and management. The difficulties include spatial and temporal uncertainties, inconsistencies among data sources, and inherent structural complexities of the datasets. This project’s objective was to develop and document a set of methods to impute wetland...
Large amounts of data are being generated that require hours, days, or even weeks to analyze using traditional computing resources. Innovative solutions must be implemented to analyze these data in a reasonable timeframe. The program HTCondor (https://research.cs.wisc.edu/htcondor/) pools the processing capacity of individual desktop computers and dedicated computing resources into a single, unified resource. This unified pool allows HTCondor to process large amounts of data quickly by breaking the work into smaller tasks distributed across many computers. The project team implemented HTCondor at the USGS Upper Midwest Environmental Sciences Center (UMESC) to leverage existing computing...
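As an illustration of the task-splitting pattern described above (not UMESC's actual code), HTCondor typically queues many copies of a worker script and hands each copy a distinct task index through its $(Process) macro; a minimal R worker, with hypothetical paths and columns, might look like this:

```r
#!/usr/bin/env Rscript
# Hypothetical HTCondor worker: the submit file queues N copies of this
# script, passing each copy a distinct 0-based task index via $(Process),
# so each machine in the pool analyzes one slice of the data independently.
args    <- commandArgs(trailingOnly = TRUE)
task_id <- as.integer(args[1])

chunks <- list.files("chunks", full.names = TRUE)  # pre-split input data
infile <- chunks[task_id + 1]

dat <- read.csv(infile)
# Placeholder analysis; the real per-chunk computation goes here.
result <- data.frame(file = infile, mean_value = mean(dat$value))

write.csv(result, sprintf("results/task_%04d.csv", task_id), row.names = FALSE)
```

A matching submit description would point "executable" at a wrapper for this script, pass "arguments = $(Process)", and end with "queue N" to fan N tasks out across the pool.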
Web portals are one of the principal ways geospatial information is communicated to the public. A few prominent USGS examples are the Geo Data Portal (http://cida.usgs.gov/gdp/ [URL is accessible with Google Chrome]), EarthExplorer (http://earthexplorer.usgs.gov/), the former Derived Downscaled Climate Projection Portal, the Alaska Portal Map (http://alaska.usgs.gov/portal/), the Coastal Change Hazards Portal (http://marine.usgs.gov/coastalchangehazardsportal/), and The National Map (http://nationalmap.gov/). Currently, developing a web portal requires relatively high effort and cost, with web developers working with highly skilled data specialists on custom solutions that meet user needs. To address this issue,...
The goal of this project was to develop a novel methodology that combines USGS Gap Analysis Program (GAP) national land cover and species distribution data with disturbance data to describe and predict how disturbance affects biodiversity. Specifically, the project team presented a case study examining how energy development in the Williston Basin can affect grassland birds; however, the methods developed are scalable and transferable to other types of habitat conversion (anthropogenic or natural), regions, and taxa. This project had six key components:
- Develop a dataset delineating all oil well pads in the Williston Basin.
- Develop a habitat conversion tool to determine the amount and previous land cover from...
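The abstract does not include the tool itself; as an illustrative sketch of the underlying overlay (the file paths are hypothetical, and the terra package is an assumption rather than the project's toolkit), determining the previous land cover converted by each well pad amounts to extracting land-cover cells under the pad footprints:

```r
# Sketch of the habitat-conversion overlay: tabulate the pre-disturbance
# GAP land cover falling inside each oil-well-pad footprint.
# File paths are hypothetical; this is not the project's actual tool.
library(terra)

landcover <- rast("gap_landcover.tif")      # categorical GAP land cover
pads      <- vect("well_pads.shp")          # delineated well-pad polygons
pads      <- project(pads, crs(landcover))  # align coordinate systems

# For each pad, count the land-cover cells it converted, by class.
converted <- extract(landcover, pads)
table(pad = converted[[1]], class = converted[[2]])
```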
The purpose of this project was to document processes for USGS scientists to organize and share data using ScienceBase and to provide an example interactive mapping application for displaying those data. Data and maps from Chase and others (2016a, b) were used for the example interactive maps.

Principal Investigators: Katherine J Chase, Andy Bock, Thomas R Sando

Accomplishments: The project team developed an interactive mapping application in R that connects to data on ScienceBase, using Shiny, Leaflet (Cheng and Xie, 2016), and sbtools (Winslow and others, 2016) (fig. 10). USGS scientists can refer to the R code in the mapping application to build their...
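A minimal sketch of that pattern, assuming a public ScienceBase item (placeholder ID) holding a hypothetical sites.csv with lon/lat columns, is shown below; it is a simplified stand-in for the project's application, not the application itself:

```r
# Minimal Shiny app: pull a data file from a ScienceBase item with sbtools
# and map it with Leaflet. Item ID, file name, and columns are placeholders.
library(shiny)
library(leaflet)
library(sbtools)

item_id <- "PLACEHOLDER_SCIENCEBASE_ITEM_ID"

ui <- fluidPage(leafletOutput("map"))

server <- function(input, output, session) {
  sites <- reactive({
    f <- item_file_download(item_id, names = "sites.csv",
                            destinations = tempfile(fileext = ".csv"))
    read.csv(f)  # expects lon/lat columns
  })
  output$map <- renderLeaflet({
    leaflet(sites()) %>%
      addTiles() %>%
      addCircleMarkers(lng = ~lon, lat = ~lat)
  })
}

shinyApp(ui, server)
```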
Legacy data (n) - Information stored in an old or obsolete format or computer system that is, therefore, difficult to access or process. (Business Dictionary, 2016) For over 135 years, the U.S. Geological Survey has collected diverse information about the natural world and how it interacts with society. Much of this legacy information is one-of-a-kind and in danger of being lost forever through decay of materials, obsolete technology, or staff changes. Several laws and orders require federal agencies to preserve and provide the public access to federally collected scientific information. The information is to be archived in a manner that allows others to examine the materials for new information or interpretations....
The goal of this project is to improve the USGS National Earthquake Information Center’s (NEIC) earthquake detection capabilities through direct integration of crowd-sourced earthquake detections with traditional, instrument-based seismic processing. For the past 6 years, the NEIC has run a crowd-sourced system called Tweet Earthquake Dispatch (TED), which rapidly detects earthquakes worldwide using data mined solely from Twitter messages, known as “tweets.” The extensive spatial coverage and near-instantaneous distribution of tweets enable rapid detection of earthquakes in sparsely instrumented areas around the world, often before seismic data are available. Although impressive for its speed, the tweet-based...
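The abstract does not state TED's detection algorithm; purely as an illustration of how a burst in tweet rate can be flagged, a short-term/long-term rate ratio (analogous to the STA/LTA triggers used on seismic data) can be written in a few lines of R:

```r
# Illustrative only: this is not TED's actual algorithm.
# Flag a sudden burst by comparing a short-term tweet rate (last minute)
# against a long-term background rate (last hour), STA/LTA style.
detect_burst <- function(times, short = 60, long = 3600, threshold = 5) {
  now <- max(times)
  sta <- sum(times > now - short) / short  # tweets per second, short window
  lta <- sum(times > now - long)  / long   # tweets per second, long window
  ratio <- if (lta > 0) sta / lta else Inf
  ratio >= threshold                       # TRUE = candidate earthquake
}

# Example: a quiet hour of background chatter, then a surge in the last minute.
quiet <- sort(runif(120, 0, 3540))
burst <- sort(runif(200, 3540, 3600))
detect_burst(c(quiet, burst))  # TRUE
```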
The purpose of the Data Management Training (DMT) Clearinghouse project was twofold. First, the project aimed to increase the discoverability and accessibility of the wealth of learning resources that have been developed to inform and train scientists about data management in the Earth sciences. Second, the project team wanted to facilitate the use of these learning resources by providing descriptive information (metadata) that can help research scientists, students, or teachers assess whether a resource would be appropriate and useful for their needs. The project team established the following objectives: Create an online, searchable, and browsable clearinghouse of learning resources on data...
USGS research in the Western Geographic Science Center has produced several geospatial datasets estimating the time required to evacuate on foot from a Cascadia subduction zone earthquake-generated tsunami in the U.S. Pacific Northwest. These data, created as a result of research performed under the Risk and Vulnerability to Natural Hazards project, are useful for emergency managers and community planners but are not in the best format to serve their needs. This project explored options for formatting and publishing the data for consumption by external partner agencies and the general public. The project team chose ScienceBase as the publishing platform, both for its ability to convert spatial data into web services...
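As a hedged sketch of what the scripted publishing step can look like with the sbtools package (the parent ID, title, and file names below are placeholders, not the project's actual workflow):

```r
# Sketch of scripted publishing to ScienceBase with sbtools.
# Parent ID, title, and file names are placeholders.
library(sbtools)

authenticate_sb("user@usgs.gov")  # interactively prompts for a password

item <- item_create(parent_id = "PLACEHOLDER_PARENT_ID",
                    title = "Pedestrian tsunami evacuation times (example)")

# Attach the spatial data; ScienceBase can expose uploaded spatial files
# as web services, which is the capability the project relied on.
item_append_files(item, files = c("evac_times.shp", "evac_times.dbf",
                                  "evac_times.shx", "evac_times.prj"))
```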
Increasing attention is being paid to the importance of proper scientific data management and of implementing processes that ensure released products are properly documented. USGS policies have been established to properly document not only publications but also the related data and software. This relatively recent expansion of documentation requirements for data and software may present a daunting challenge for many USGS scientists, whose major focus is their physical science and who have less expertise in information science. As a proof of concept, this project created a software solution that facilitates this process through a user-friendly but comprehensive interface embedded in an existing...
As research and management of natural resources shift from local to regional and national scales, the need to summarize information about aquatic systems at multiple scales is becoming more apparent. Recently, four federally funded national stream assessment efforts (USGS Aquatic GAP, USGS National Water-Quality Assessment Program, U.S. Environmental Protection Agency [EPA] StreamCat, and the National Fish Habitat Partnership) identified and summarized landscape information at two hydrologically and ecologically significant scales, the local and network catchments of the National Hydrography Dataset Plus (NHDPlus). These efforts have revealed that a significant percentage of assessment funds is being directed to the...
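Conceptually, each such summary is a roll-up of cell- or pixel-level landscape attributes to the NHDPlus catchment identifier (COMID); a minimal sketch with a hypothetical input table:

```r
# Sketch: roll a cell-level landscape attribute up to NHDPlus local
# catchments by COMID. The input table is hypothetical (one row per cell).
library(dplyr)

cells <- data.frame(
  comid    = c(101, 101, 101, 202, 202),  # NHDPlus catchment IDs
  urban    = c(1, 0, 1, 0, 0),            # 1 = developed land-cover cell
  area_km2 = c(0.9, 0.9, 0.9, 0.9, 0.9)
)

cells %>%
  group_by(comid) %>%
  summarize(pct_urban = 100 * mean(urban),  # percent developed per catchment
            area_km2  = sum(area_km2))      # total catchment area sampled
```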