Codebook vectors from a trained emergent self-organizing map displaying multivariate topology of geochemical and reservoir temperature data from produced and geothermal waters of the United States

Dates

Publication Date
Time Period
2018-08-31

Citation

Engle, M.A., 2018, Codebook vectors from a trained emergent self-organizing map displaying multivariate topology of geochemical and reservoir temperature data from produced and geothermal waters of the United States: U.S. Geological Survey data release, https://doi.org/10.5066/P9376ALD.

Summary

This data matrix contains the codebook vectors for an 82 x 50 neuron emergent self-organizing map (ESOM) that describes the multivariate topology of reservoir temperature and geochemical data for 190 samples of produced and geothermal waters from across the United States. Variables included are coordinates derived from reservoir temperature and concentration of Sc, Nd, Pr, Tb, Lu, Gd, Tm, Ce, Yb, Sm, Ho, Er, Eu, Dy, F, alkalinity as bicarbonate, Si, B, Br, Li, Ba, Sr, sulfate, H (derived from pH), K, Mg, Ca, Cl, and Na converted to units of proportion. The concentration data were converted to isometric log-ratio coordinates (following Hron et al., 2010), where the first ratio is Sc serving as the denominator to the geometric mean of all of [...]
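
As a rough illustration of what an isometric log-ratio transformation does, the sketch below computes pivot coordinates in the form commonly associated with Hron et al. (2010) and inverts them. It is an assumption-laden example, not the formulation used for this data release: the function names and the three-part toy composition are hypothetical, and the part ordering and sign convention of the released coordinates should be taken from the attached metadata record.

```python
import numpy as np

def pivot_basis(n_parts):
    """Orthonormal contrast matrix (n_parts x n_parts-1) for pivot coordinates."""
    V = np.zeros((n_parts, n_parts - 1))
    for i in range(n_parts - 1):
        r = n_parts - i - 1  # number of parts that follow part i
        V[i, i] = np.sqrt(r / (r + 1))               # weight on the pivot part
        V[i + 1:, i] = -1.0 / np.sqrt(r * (r + 1))   # weights on the remaining parts
    return V

def ilr(x):
    """Map a composition (strictly positive parts) to ilr pivot coordinates."""
    x = np.asarray(x, dtype=float)
    # Each basis column sums to zero, so log(x) and clr(x) give the same result.
    return np.log(x) @ pivot_basis(x.shape[-1])

def ilr_inverse(z):
    """Map ilr pivot coordinates back to proportions that sum to 1."""
    z = np.asarray(z, dtype=float)
    V = pivot_basis(z.shape[-1] + 1)
    y = np.exp(z @ V.T)
    return y / y.sum(axis=-1, keepdims=True)

# Hypothetical three-part composition in units of proportion.
parts = np.array([0.1, 0.3, 0.6])
coords = ilr(parts)            # two coordinates for three parts
recovered = ilr_inverse(coords)
```

A composition of D parts yields D - 1 coordinates, and the inverse returns proportions rather than the original concentration units.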

Contacts

Point of Contact: Mark A Engle
Originator: Mark A Engle
Metadata Contact: Eric A Morrissey
Publisher: U.S. Geological Survey
Distributor: U.S. Geological Survey - ScienceBase
SDC Data Owner: Energy Resources Program
USGS Mission Area: Energy and Minerals

Attached Files

Product_10.5066P9376ALD_DATA.csv 1.45 MB

Purpose

This data matrix is provided so that users can map new samples to the ESOM using a minimum-distance measure (such as Euclidean distance) through an algorithm such as k-nearest neighbor. Any data set used in this way must be isometric log-ratio transformed and standardized using exactly the same formulation as the training dataset used to create this matrix. This can be useful both for data classification and for non-linear estimation. In the latter case, missing values (i.e., those in need of estimation) can be imputed from the codebook vector of the best-matching unit (i.e., the neuron with the smallest multivariate distance to the point being estimated). The imputed values can then be converted back into the original units through the inverse of the data standardization and, for concentration data, the inverse of the isometric log-ratio transformation (Hron et al., 2010). Note that for concentration data, the results are in units of proportion and can be converted back into the original units by multiplying each row by the sum of the compositional data in the original dataset.
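
The mapping and imputation steps described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the procedure used to build the data release: the function names, the toy three-neuron codebook, and the use of NaN to flag the value to be estimated are all assumptions. A real run would load the codebook vectors from the attached CSV file and would first transform and standardize the new sample in the same way as the training data.

```python
import numpy as np

def best_match_unit(codebook, sample, known):
    """Index of the neuron nearest to `sample`, using only the `known` columns."""
    diffs = codebook[:, known] - sample[known]
    distances = np.sqrt((diffs ** 2).sum(axis=1))  # Euclidean distance per neuron
    return int(np.argmin(distances))

def impute_from_bmu(codebook, sample):
    """Fill NaN entries of `sample` from the codebook vector of its best-matching unit."""
    known = ~np.isnan(sample)
    bmu = best_match_unit(codebook, sample, known)
    filled = sample.copy()
    filled[~known] = codebook[bmu, ~known]
    return bmu, filled

# Toy stand-in for the codebook: 3 neurons x 3 variables (hypothetical values).
codebook = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [2.0, 2.0, 5.0],
])

# New sample, already transformed and standardized; the third value is unknown.
sample = np.array([1.9, 2.2, np.nan])

bmu, filled = impute_from_bmu(codebook, sample)
```

The filled values are still in the standardized coordinate space. Converting them back to original units requires the standardization parameters and log-ratio formulation of the training data, which are not part of this sketch.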

Communities

  • Energy Resources Program
  • USGS Data Release Products

Provenance

Please see attached metadata record for full dataset provenance.

Additional Information

Identifiers

Type: DOI
Scheme: https://www.sciencebase.gov/vocab/category/item/identifier
Key: doi:10.5066/P9376ALD
