
Filters: Tags: Machine learning

170 results (92ms)   

This dataset consists of ground control points used for independent pixel-level model validation (ground_control_points.gpkg). It contains 295 points distributed across the 15 vegetation classes on the island of Lāna‘i. The points were randomly generated from the final species-specific land cover classification map and stratified by class to ensure representation across all classes. The dataset provides species-specific land cover labels for the points, with the spatial location corresponding to the pixel coordinate location on the 2m resolution land cover map. Comparing modeled class assignments to these expert-validated classes enables an independent accuracy assessment supplemental to the polygon-based...
Groundwater is a vital resource in the Mississippi embayment physiographic region (Mississippi embayment) of the central United States and can be limited in some areas by high concentrations of trace elements. The concentration of trace elements in groundwater is largely driven by oxidation-reduction (redox) processes. Redox processes are a group of biotically driven reactions in which energy is derived from the exchange of electrons. In groundwater, this commonly occurs through decomposition of organic matter (carbon) by microbes, which consumes dissolved oxygen (DO). Under low DO conditions, iron (Fe), manganese, and arsenic can dissolve from coatings on aquifer sediments and be released into groundwater. Therefore,...
This page contains 15 estimated quantiles for 9,203 level-12 Hydrologic Unit Codes in the Southeastern United States for the decades 1950-1959, 1960-1969, 1970-1979, 1980-1989, 1990-1999, and 2000-2009. A multi-output neural network was used to generate the estimated quantiles (Worland and others, 2019). The R scripts that generated the predictions are also included along with a README file. The 15 quantiles are associated with the following 15 non-exceedance probabilities (NEPs): 0.0003, 0.0050, 0.0500, 0.1000, 0.2000, 0.3000, 0.4000, 0.5000, 0.6000, 0.7000, 0.8000, 0.9000, 0.9500, 0.9950, and 0.9997. The quantiles were calculated using the Weibull plotting position (more details can be found in the accompanying...
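For readers unfamiliar with the Weibull plotting position named above, the following is a minimal sketch of how a quantile at a given NEP could be read off a sample. It is not the data release's R code: the function names are hypothetical, and the linear interpolation between ranked values is an assumption made for illustration.

```python
def weibull_neps(n):
    """Weibull plotting position: rank i of n gets NEP i / (n + 1)."""
    return [i / (n + 1) for i in range(1, n + 1)]


def quantile_at(values, nep):
    """Value at a target NEP, by linear interpolation (an assumption)."""
    ordered = sorted(values)
    neps = weibull_neps(len(ordered))
    if not neps or nep < neps[0] or nep > neps[-1]:
        raise ValueError("target NEP lies outside the range covered by the sample")
    for k in range(1, len(ordered)):
        if nep <= neps[k]:
            # Interpolate between the two ranked values that bracket the target.
            frac = (nep - neps[k - 1]) / (neps[k] - neps[k - 1])
            return ordered[k - 1] + frac * (ordered[k] - ordered[k - 1])
    return ordered[-1]  # single-value sample


# Hypothetical sample of four flows; the median NEP falls between ranks 2 and 3.
median_flow = quantile_at([40.0, 10.0, 30.0, 20.0], 0.5)
```

Because the extreme NEPs (0.0003 and 0.9997) sit very close to 0 and 1, a sample needs several thousand values before i / (n + 1) spans them, which is why this sketch raises an error rather than extrapolating.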
A barrier island habitat prediction model was used to forecast barrier island habitats (for example, beach, dune, intertidal marsh, and woody vegetation) for Dauphin Island, Alabama, based on potential island configurations associated with a variety of restoration measures and varying future conditions of storminess and sea-levels. In this study, we loosely coupled a habitat model framework with decadal hydrodynamic geomorphic model outputs to forecast habitats for 2 potential future conditions related to storminess (that is, "medium" storminess and "high" storminess based on storm climatology data) and 4 sea-level scenarios (that is, a "low" increase in sea level 0.3 m by around 2030 and 2050 and 1.0 m by around...
This dataset includes model inputs (specifically, meteorological inputs to the predictive models and flags for predicted ice-cover) and is part of a larger data release of lake temperature model inputs and outputs for 2,332 lakes in the U.S. states of North Dakota, South Dakota, Minnesota, Wisconsin, and Michigan (https://doi.org/10.5066/P9PPHJE2).
Bats play crucial ecological roles and provide valuable ecosystem services, yet many populations face serious threats from various ecological disturbances. The North American Bat Monitoring Program (NABat) aims to assess status and trends of bat populations while developing innovative and community-driven conservation solutions using its unique data and technology infrastructure. To support scalability and transparency in the NABat acoustic data pipeline, we developed a fully automated machine-learning algorithm. This dataset includes audio files of bat echolocation calls that were considered to develop V1.0 of the NABat machine-learning algorithm; however, the test set (i.e., holdout dataset) has been excluded from...
This data release replicates the methods detailed in the 2017 publication titled "Improving predictions of hydrological low-flow indices in ungaged basins using machine learning" for a different data set. The original data set and the associated readme file for the model archive can be viewed here: https://doi.org/10.5066/F7CR5S4T. The original data set contained streamflow data for sites located in South Carolina, Georgia, and Alabama. The data set used in this data release is for 6 states in the Southern Midwest U.S.A. The datafile contains the annual minimum seven-day mean streamflow with an annual exceedance probability of 90% (7Q10) for 173 basins in Arkansas (AR), Iowa (IA), Kansas (KS), Missouri (MO), Nebraska...
Ensemble-tree machine learning (ML) regression models can be prone to systematic bias: small values are overestimated and large values are underestimated. Additional bias can be introduced if the dependent variable is a transform of the original data. Six methods were evaluated for their ability to correct systematic and introduced bias: (1) empirical distribution matching (EDM); (2) regression of observed on estimated values (ROE); (3) linear transfer function (LTF); (4) linear equation based on Z-score transform (ZZ); (5) second machine learning model used to estimate residuals (ML2-RES); and (6) Duan smearing estimate applied after ROE is implemented (ROE-Duan). The performance of the methods was evaluated using...
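As an illustration of method (2), regression of observed on estimated values (ROE), the sketch below fits an ordinary least squares line with observed values as the response and model estimates as the predictor, then uses that line to adjust the estimates. This is an assumed, simplified reading of ROE for illustration only; the function names and the toy numbers are hypothetical and are not taken from the data release.

```python
def fit_roe(estimated, observed):
    """Least-squares fit of observed = intercept + slope * estimated."""
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_o = sum(observed) / n
    sxx = sum((e - mean_e) ** 2 for e in estimated)
    sxy = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimated, observed))
    slope = sxy / sxx
    intercept = mean_o - slope * mean_e
    return intercept, slope


def apply_roe(estimates, intercept, slope):
    """Adjust model estimates with the fitted line."""
    return [intercept + slope * e for e in estimates]


# Toy example: the estimates are compressed toward the mean
# (small values too high, large values too low).
estimated = [2.0, 3.0, 4.0]
observed = [1.0, 3.0, 5.0]
intercept, slope = fit_roe(estimated, observed)
corrected = apply_roe(estimated, intercept, slope)
```

A fitted slope greater than 1 stretches the estimates back out, which counteracts the overestimation of small values and underestimation of large values described above.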
This dataset provides a shapefile of the outlines of the 68 lakes where temperature was modeled as part of this study. The format is a shapefile for all lakes combined (.shp, .shx, .dbf, and .prj files). This dataset is part of a larger data release of lake temperature model inputs and outputs for 68 lakes in the U.S. states of Minnesota and Wisconsin (http://dx.doi.org/10.5066/P9AQPIVD).
This dataset includes compiled water temperature data from a variety of sources, including the Water Quality Portal (Read et al. 2017), the North Temperate Lakes Long-Term Ecological Research Program (https://lter.limnology.wisc.edu/), the Minnesota Department of Natural Resources, and the Global Lake Ecological Observatory Network (gleon.org). This dataset is part of a larger data release of lake temperature model inputs and outputs for 68 lakes in the U.S. states of Minnesota and Wisconsin (http://dx.doi.org/10.5066/P9AQPIVD).
This dataset includes "test data": compiled water temperature data from an instrumented buoy on Lake Mendota, WI, and discrete (manually sampled) water temperature records from the North Temperate Lakes Long-Term Ecological Research Program (NTL-LTER; https://lter.limnology.wisc.edu/). The buoy is supported by both the Global Lake Ecological Observatory Network (gleon.org) and the NTL-LTER. The dataset also includes Lake Mendota model performance, measured as root-mean-square error relative to temperature observations during the test period. This dataset is part of a larger data release of lake temperature model inputs and outputs for 68 lakes in the U.S. states of Minnesota and Wisconsin (http://dx.doi.org/10.5066/P9AQPIVD).
This dataset provides high-resolution, species-specific land cover maps for the Hawaiian island of Lāna‘i based on 2020 WorldView-2 satellite imagery. Machine learning models were trained on extensive ground control polygons and points. The land cover maps capture the distribution and diversity of vegetation with high accuracy to support conservation planning and monitoring. This data release consists of two child items: one containing the field- and expert-collected ground control data used to train our models, and another containing the resulting land cover maps for the island of Lāna‘i. The research effort that generated these input data and products is carefully described in the associated manuscript Berio Fortini...
This raster integrates the species-specific and community classifications using a hierarchical approach based on classification certainty. A 0.66 probability threshold was applied, with pixels assigned the finest species-specific class as long as the probability exceeded the threshold. Pixels below the threshold were assigned to the broader community class meeting the threshold. This approach displays the most detailed class possible given a minimum confidence, providing a map that balances specificity and certainty. Please note that to reduce the inherent 'salt and pepper' noise in the final land cover classification map, we applied a 3x3 pixel moving window majority filter to the final classification results.
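The hierarchical assignment described above can be sketched per pixel as follows. This is a minimal illustration rather than the authors' implementation: the function name, the class names in the example, the handling of a probability exactly equal to the threshold, and the unclassified fallback when neither level reaches 0.66 are all assumptions. The 3x3 majority filter is not included.

```python
THRESHOLD = 0.66      # probability threshold stated in the description
UNCLASSIFIED = None   # assumed fallback; the description does not cover this case


def assign_class(species_class, species_prob, community_class, community_prob,
                 threshold=THRESHOLD):
    """Pick the finest class whose probability clears the threshold."""
    if species_prob > threshold:
        # The finest (species-specific) class wins when it is confident enough.
        return species_class
    if community_prob >= threshold:
        # Otherwise fall back to the broader community class.
        return community_class
    return UNCLASSIFIED


# Hypothetical pixels: (species class, probability, community class, probability)
pixels = [
    ("kiawe", 0.81, "dry forest", 0.93),
    ("kiawe", 0.52, "dry forest", 0.88),
    ("kiawe", 0.40, "dry forest", 0.55),
]
labels = [assign_class(*p) for p in pixels]
```

The first pixel keeps its species label, the second falls back to the community label, and the third takes the assumed unclassified value.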
This section provides spatial data files that describe the river and reservoirs in the Delaware River Basin included in this release. One shapefile of polylines describes the 70 river reaches that define the modeling network, and another shapefile of polygons includes the two reservoirs (Pepacton, Cannonsville) for which data are included in this release.
Defining site potential for an area establishes its possible long-term vegetation growth productivity in a relatively undisturbed state, providing a realistic reference point for ecosystem performance. Modeling and mapping site potential helps to measure and identify naturally occurring variations on the landscape as opposed to variations caused by land management activities or disturbances (Rigge et al. 2020). We integrated remotely sensed data (250-m enhanced Moderate Resolution Imaging Spectroradiometer (eMODIS) Normalized Difference Vegetation Index (NDVI) (https://earthexplorer.usgs.gov/)) with land cover, biogeophysical (i.e., soils, topography) and climate data into regression-tree software (Cubist®). We...
Multiple modeling frameworks were used to predict daily temperatures at 0.5m depth intervals for a set of diverse lakes in the U.S. states of South Dakota, North Dakota, Minnesota, Wisconsin, and Michigan. Process-Based (PB) models were configured and calibrated with training data to reduce root-mean squared error. Uncalibrated models used default configurations (PB0; see Winslow et al. 2016 for details) and no parameters were adjusted according to model fit with observations. Process-Guided Deep Learning (PGDL) models were deep learning models with an added physical constraint for energy conservation as a loss term. These models were pre-trained with uncalibrated Process-Based model outputs (PB0) before training...
High-resolution elevation data provide a foundational layer needed to understand regional hydrology and ecology under contemporary and future-predicted conditions with accelerated sea-level rise. While the development of digital elevation models (DEMs) from light detection and ranging data has enhanced the ability to observe elevation in coastal zones, the elevation error can be substantial in densely vegetated coastal wetlands. In response, we developed a machine learning model to reduce vertical error in coastal wetlands for a 1-m DEM from 2018 that covered Nassau and Duval Counties, Florida. Error was reduced by using a random forest regression model with in situ observations and predictor variables from optical...
An extreme gradient boosting (XGB) machine learning model was developed to predict the distribution of nitrate in shallow groundwater across the conterminous United States (CONUS). Nitrate was predicted at a 1-square-kilometer (km²) resolution at a depth of 10 m below the water table. The model builds on a previous XGB machine learning model developed to predict nitrate at domestic and public supply groundwater zones (Ransom and others, 2022) by incorporating additional monitoring well samples and modifying and adding predictor variables. The shallow zone model included variables representing well characteristics, hydrologic conditions, soil type, geology, climate, oxidation/reduction, and nitrogen inputs. Predictor...
This item provides GeoTIFF grids of prospectivity models for clastic-dominated (CD) and Mississippi Valley-type (MVT) Pb-Zn mineralization for the United States and Canada (combined) and for Australia; the models used data provided in this report. The models are the result of a study by Lawley and others (2022) that used a data-driven machine learning approach called gradient boosting to predict the mineral prospectivity for clastic-dominated (CD) and carbonate-hosted (MVT) deposits across the United States, Canada, and Australia. The study was part of a tri-national collaboration between the U.S. Geological Survey, the Geological Survey of Canada, and Geoscience Australia called the Critical Minerals Mapping Initiative. The original...


Titles of the results shown on the map:
7Q10 Records and Basin Characteristics for 173 basins in Arkansas, Iowa, Kansas, Missouri, Nebraska, and Oklahoma (2017)
Estimated quantiles for the pour points of 9,203 level-12 hydrologic unit codes in the southeastern United States, 1950--2009
Process-guided deep learning water temperature predictions: 1 Spatial data (GIS polygons for 68 lakes)
Process-guided deep learning water temperature predictions: 4 Training data
Process-guided deep learning water temperature predictions: 6a Lake Mendota detailed evaluation data
Landscape position-based habitat modeling for the Alabama Barrier Island feasibility assessment at Dauphin Island
Dissolved oxygen probability rasters of groundwater in the Mississippi River Valley alluvial and Claiborne aquifers
Predicting Water Temperature Dynamics of Unmonitored Lakes with Meta Transfer Learning: 5 Model predictions
Data Release for Evaluation of Six Methods for Correcting Bias in Estimates from Ensemble Tree Machine Learning Regression Models
Using Targeted Training Data to Develop Site Potential for the Upper Colorado River Basin from 2000 - 2018
4 Model Code: Deep learning approaches for improving prediction of daily stream temperature in data-scarce, unmonitored, and dammed basins
[Prospectivity Models] Prospectivity models - clastic-dominated (CD) and Mississippi Valley-type (MVT) GeoTIFF grids for the United States, Canada, and Australia
Predictions and supporting data for network-wide 7-day ahead forecasts of water temperature in the Delaware River Basin: 1) Waterbody information for 70 river reaches and 2 reservoirs
Training dataset for NABat Machine Learning V1.0
Data for Machine Learning Predictions of Nitrate in Shallow Groundwater in the Conterminous United States
Corrected digital elevation model in coastal wetlands in Nassau and Duval Counties, Florida, 2018
High-Resolution Land Cover Maps of Lāna‘i, Hawai‘i, 2020
High-Resolution Land Cover Maps of Lāna‘i, Hawai‘i, 2020 - Ground Control Points
High-Resolution Land Cover Maps of Lāna‘i, Hawai‘i, 2020 - Mixed Class