USGS - science for a changing world

Codebook vectors and predicted rare earth potential from a trained emergent self-organizing map displaying multivariate topology of geochemical and reservoir temperature data from produced and geothermal waters of the United States

Dates

Publication Date
Start Date
2018-01-01
End Date
2018-12-31

Citation

Engle, M.A., 2019, Codebook vectors and predicted rare earth potential from a trained emergent self-organizing map displaying multivariate topology of geochemical and reservoir temperature data from produced and geothermal waters of the United States: U.S. Geological Survey data release, https://doi.org/10.5066/P9GCYKG0.

Summary

This data release consists of three products relating to an 82 x 50 neuron Emergent Self-Organizing Map (ESOM), which describes the multivariate topology of reservoir temperature and geochemical data for 190 samples of produced and geothermal waters from across the United States. Variables included in the ESOM are coordinates derived from reservoir temperature and concentrations of Sc, Nd, Pr, Tb, Lu, Gd, Tm, Ce, Yb, Sm, Ho, Er, Eu, Dy, F, alkalinity as bicarbonate, Si, B, Br, Li, Ba, Sr, sulfate, H (derived from pH), K, Mg, Ca, Cl, and Na converted to units of proportion. The concentration data were converted to isometric log-ratio coordinates (following Hron et al., 2010), where the first ratio is Sc serving as the denominator to the [...]
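As an illustration of the isometric log-ratio step, the following minimal Python sketch computes pivot-type coordinates for a small composition and inverts them. It is an assumption-laden example, not the release's own code: the function names (`closure`, `pivot_basis`, `ilr`, `ilr_inverse`) and the three-part toy composition are hypothetical, and the part ordering and sign convention actually used for this data release may differ, so the exact formulation should be taken from the attached metadata record.

```python
import numpy as np

def closure(x):
    """Rescale positive parts so each row sums to 1 (units of proportion)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def pivot_basis(D):
    """Orthonormal contrast matrix (D x D-1) for pivot-type ilr coordinates.

    Column i contrasts part i against the geometric mean of the parts
    that follow it. The ordering here is an illustrative assumption.
    """
    V = np.zeros((D, D - 1))
    for i in range(D - 1):
        k = D - i - 1  # number of parts after part i
        V[i, i] = np.sqrt(k / (k + 1))
        V[i + 1:, i] = -1.0 / np.sqrt(k * (k + 1))
    return V

def ilr(x):
    """Isometric log-ratio coordinates of a composition (last axis = parts)."""
    logx = np.log(closure(x))
    return logx @ pivot_basis(logx.shape[-1])

def ilr_inverse(z):
    """Map ilr coordinates back to a composition in units of proportion."""
    z = np.asarray(z, dtype=float)
    V = pivot_basis(z.shape[-1] + 1)
    return closure(np.exp(z @ V.T))

# Toy three-part composition in arbitrary concentration units (made up).
raw = np.array([50.0, 75.0, 125.0])
z = ilr(raw)                          # two ilr coordinates
proportions = ilr_inverse(z)          # sums to 1
restored = proportions * raw.sum()    # back to the original units
```

The last line mirrors the back-conversion rule for compositional data: the inverse transform returns proportions, and multiplying by the original row sum recovers the original units.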

Attached Files

Codebook vectors.csv 1.45 MB
Product_10.5066P9GCYKG0_METADATA.xml (Original FGDC Metadata) 45.37 KB
Raw var mean and sd.csv 888 Bytes
REE potential for US.csv 727.11 KB

Purpose

This data release is provided to: 1) allow users to map new sample sources to the ESOM using a minimum distance measurement (such as Euclidean distance) through an algorithm such as k-nearest neighbor and 2) provide predicted rare earth element potential output from the exercise for produced and geothermal waters of the United States. Any data sets used for mapping to the trained ESOM need to be isometrically log-ratio transformed and standardized (using means and standard deviations from the first table) using exactly the same formulation as was applied to the training dataset used to create this matrix. This can be useful both for data classification and for non-linear estimation. In the case of the latter, missing values (i.e., those in need of estimation) can be imputed from the codebook vector for the best match unit (i.e., the neuron with the smallest multivariate distance to the point being estimated). The imputed value can then be converted back into the original units through the inverse of data standardization and, for concentration data, the inverse of the isometric log-ratio transformation (Hron et al., 2010). Note that for concentration data, the results are in units of proportion and can be converted back into the original units by multiplying each row by the sum of the compositional data in the original dataset.
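The mapping and imputation workflow described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used to build the release: the function names are hypothetical, the codebook, means, and standard deviations are small made-up arrays rather than values read from the attached CSV files, and restricting the Euclidean distance to the observed variables is one common choice for partially observed samples.

```python
import numpy as np

def standardize(values, mean, sd):
    """Center and scale with the training means and standard deviations."""
    return (values - mean) / sd

def unstandardize(values, mean, sd):
    """Inverse of standardize: return to the pre-standardization scale."""
    return values * sd + mean

def best_matching_unit(codebook, sample):
    """Index of the codebook row (neuron) nearest to `sample`.

    Squared Euclidean distance is computed only over the variables that
    are observed (non-NaN) in `sample`.
    """
    observed = ~np.isnan(sample)
    if not observed.any():
        raise ValueError("sample has no observed values")
    diff = codebook[:, observed] - sample[observed]
    return int(np.argmin(np.sum(diff * diff, axis=1)))

def impute_from_bmu(codebook, sample):
    """Fill missing entries of `sample` from its best matching unit."""
    bmu = best_matching_unit(codebook, sample)
    filled = np.where(np.isnan(sample), codebook[bmu], sample)
    return bmu, filled

# Made-up stand-ins: 4 neurons x 3 standardized variables.
codebook = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [-1.0, 0.5, 2.0],
    [2.0, -1.0, 0.0],
])
mean = np.array([10.0, 5.0, 100.0])  # hypothetical training means
sd = np.array([2.0, 1.0, 20.0])      # hypothetical training standard deviations

# New sample with the third variable missing (to be estimated).
new_sample = np.array([12.2, 5.9, np.nan])
scaled = standardize(new_sample, mean, sd)
bmu, filled = impute_from_bmu(codebook, scaled)
estimate = unstandardize(filled, mean, sd)
```

In this toy case the second neuron is the best match, so the missing third variable is taken from its codebook vector and then un-standardized. For concentration variables, the un-standardized values would still be isometric log-ratio coordinates and would need the inverse transformation and rescaling described above.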

Communities

  • Energy Resources Program
  • USGS Data Release Products

Provenance

Please see attached metadata record for full dataset provenance.

Additional Information

Identifiers

Type: DOI
Scheme: https://www.sciencebase.gov/vocab/category/item/identifier
Key: doi:10.5066/P9GCYKG0