- Publication Date
- 2018-10-12
- Time Period
- 2018-08-31

Engle, M.A., 2018, Codebook vectors from a trained emergent self-organizing map displaying multivariate topology of geochemical and reservoir temperature data from produced and geothermal waters of the United States: U.S. Geological Survey data release, https://doi.org/10.5066/P9376ALD.

This data matrix contains the codebook vectors for an 82 x 50 neuron Emergent Self-Organizing Map (ESOM) that describes the multivariate topology of reservoir temperature and geochemical data for 190 samples of produced and geothermal waters from across the United States. Variables included are coordinates derived from reservoir temperature and concentration of Sc, Nd, Pr, Tb, Lu, Gd, Tm, Ce, Yb, Sm, Ho, Er, Eu, Dy, F, alkalinity as bicarbonate, Si, B, Br, Li, Ba, Sr, sulfate, H (derived from pH), K, Mg, Ca, Cl, and Na converted to units of proportion. The concentration data were converted to isometric log-ratio coordinates (following Hron et al., 2010), where the first ratio is Sc serving as the denominator to the geometric mean of all of [...]
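As an illustration of the kind of transformation described above, the following is a minimal sketch of sequential (pivot) isometric log-ratio coordinates, in which each part is compared with the geometric mean of the parts that follow it. The function names `pivot_ilr` and `pivot_ilr_inverse` are hypothetical, and the orientation of each ratio (which part is the numerator) and the part ordering are assumptions of this sketch. The formulation actually used for this data release is documented in the metadata record and may differ in sign or ordering.

```python
import numpy as np


def pivot_ilr(x):
    """Pivot ilr coordinates for compositions stored in the rows of x.

    Assumption: coordinate i contrasts part i (numerator) with the
    geometric mean of parts i+1..D. The data release may use the
    opposite orientation; check the metadata record.
    """
    x = np.asarray(x, dtype=float)
    logx = np.log(x)
    n_parts = x.shape[1]
    z = np.empty((x.shape[0], n_parts - 1))
    for i in range(n_parts - 1):
        k = n_parts - i - 1                     # parts remaining after part i
        log_gm = logx[:, i + 1:].mean(axis=1)   # log geometric mean of the rest
        z[:, i] = np.sqrt(k / (k + 1)) * (logx[:, i] - log_gm)
    return z


def pivot_ilr_inverse(z):
    """Invert pivot_ilr, returning compositions closed to a row sum of 1."""
    z = np.asarray(z, dtype=float)
    n_parts = z.shape[1] + 1
    y = np.zeros((z.shape[0], n_parts))         # log parts, last part fixed at 0
    for i in range(n_parts - 2, -1, -1):
        k = n_parts - i - 1
        y[:, i] = z[:, i] / np.sqrt(k / (k + 1)) + y[:, i + 1:].mean(axis=1)
    x = np.exp(y)
    return x / x.sum(axis=1, keepdims=True)
```

Because the coordinates depend only on ratios between parts, rescaling a row (for example, from mg/L to proportions) leaves them unchanged, and the inverse returns proportions that sum to 1.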

This data matrix is provided to allow users to map new sample sources to the ESOM using a minimum distance measurement (such as Euclidean distance) through an algorithm such as k-nearest neighbor. Any data sets used in this way need to be isometric log-ratio transformed and standardized using exactly the same formulation as the training dataset used to create this matrix. This can be useful both for data classification and for non-linear estimation. In the case of the latter, missing values (i.e., those in need of estimation) can be imputed from the codebook vector for the best match unit (i.e., the neuron with the smallest multivariate distance to the point being estimated). The imputed values can then be converted back into the original units through the inverse of the data standardization and, for concentration data, the inverse of the isometric log-ratio transformation (Hron et al., 2010). Note that for concentration data, the results are in units of proportion and can be converted back into the original units by multiplying each row by the sum of the compositional data in the original dataset.
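The mapping and imputation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to build the data release: it assumes the codebook has been loaded as an array with one row per neuron and one column per variable, that new samples are already transformed and standardized in the same way as the training data, that missing values are marked as NaN, and that standardization was a z-score (subtract the mean, divide by the standard deviation). The function names are hypothetical.

```python
import numpy as np


def best_matching_units(samples, codebook):
    """Row index of the nearest codebook vector for each sample.

    Uses squared Euclidean distance over the observed (non-NaN)
    variables of each sample only.
    """
    samples = np.atleast_2d(np.asarray(samples, dtype=float))
    codebook = np.asarray(codebook, dtype=float)
    observed = ~np.isnan(samples)
    # Shape (n_samples, n_neurons, n_variables); missing entries contribute 0.
    diff = np.where(observed[:, None, :],
                    samples[:, None, :] - codebook[None, :, :],
                    0.0)
    return (diff ** 2).sum(axis=2).argmin(axis=1)


def impute_from_bmu(samples, codebook):
    """Fill NaN entries with the values of each sample's best match unit."""
    samples = np.atleast_2d(np.asarray(samples, dtype=float))
    codebook = np.asarray(codebook, dtype=float)
    bmu = best_matching_units(samples, codebook)
    filled = np.where(np.isnan(samples), codebook[bmu], samples)
    return filled, bmu


def unstandardize(values, mean, std):
    """Undo a z-score standardization (assumed form) column by column."""
    return np.asarray(values, dtype=float) * np.asarray(std) + np.asarray(mean)


# Toy example with a 3-neuron, 3-variable codebook (made-up numbers).
toy_codebook = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
toy_samples = np.array([[0.9, 1.2, np.nan], [2.1, np.nan, 1.8]])
toy_filled, toy_bmu = impute_from_bmu(toy_samples, toy_codebook)
```

For imputed concentration coordinates, the unstandardized values would still need the inverse isometric log-ratio transformation and rescaling by the row sum of the original compositional data, as noted above.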

- Energy Resources Program
- USGS Data Release Products

Please see attached metadata record for full dataset provenance.