Data included in any metadata item that is based on ChAMP shall conform to the following guidelines for interoperability, interchange, and/or comparison.

Numeric Values

To make numeric metadata items as useful as possible, numeric values should report i) an estimate of the precision of the value and ii) a unit of measure.  Numeric values should be provided in scientific notation wherever possible.  Because reporting a numeric value can involve several pieces of information (value, error, unit), authors should group these logically when they are reported.

Reporting precision

If possible, authors of metadata should explicitly include a numeric value for the error in the reported value.  Ordered from worst to best, the options are:

  • 12.34 (no stated error - implied error of 0.01)
  • 12.34 ± 0.02 (reported rounded error)
  • 12.34(56) ± 0.0223 (reported unrounded error)
  • Separation of a numeric value into logical parts: mantissa (1.23456), exponent (+1), error (0.00223 - relative to the mantissa), and significant digits (4)
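The last option in the list above can be sketched in Python as follows. This is only an illustration: the `NumericValue` class, its field names, and the `split_value` helper are hypothetical and are not defined by ChAMP. The sketch uses `Decimal` rather than `float` so that the reported digits are preserved exactly.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class NumericValue:
    """Hypothetical container for the logical parts of a reported number."""
    mantissa: Decimal        # e.g. 1.23456
    exponent: int            # power of ten, e.g. +1
    error: Decimal           # error relative to the mantissa, e.g. 0.00223
    significant_digits: int  # e.g. 4

    def value(self) -> Decimal:
        """Recombine mantissa and exponent into the plain value."""
        return self.mantissa.scaleb(self.exponent)

    def absolute_error(self) -> Decimal:
        """Scale the mantissa-relative error back to the value's magnitude."""
        return self.error.scaleb(self.exponent)


def split_value(value: str, error: str, significant_digits: int) -> NumericValue:
    """Split a reported value and its error into logical parts."""
    v = Decimal(value)
    exponent = v.adjusted()  # power of ten of the leading digit
    return NumericValue(
        mantissa=v.scaleb(-exponent),
        exponent=exponent,
        error=Decimal(error).scaleb(-exponent),
        significant_digits=significant_digits,
    )


nv = split_value("12.3456", "0.0223", 4)
```

Here `nv` holds the mantissa 1.23456, the exponent +1, and the error 0.00223, matching the example values in the list; the number of significant digits is supplied by the author rather than inferred.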

Any integer values reported should be identified as such, e.g. by including a qualifier such as 'exact' or by giving the number of significant digits as 0 (as a stand-in for infinite).  In addition, including the error type (e.g. absolute, SD, CI) is strongly encouraged.

Reporting units

All units should use the International System of Units (SI) wherever possible.  Authors are strongly encouraged to reference standardized representations of units, such as UnitsML or SWEET, to potentially allow conversion of numeric values into other equivalent units.  Authors also need to specify units unambiguously so that values can be compared appropriately.  For example, reporting a value in ppm (parts per million) is ambiguous because it could mean mass/volume, mass/mass, or frequency/frequency.
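One way to remove the ppm ambiguity is to record the basis of the ratio alongside the unit label. The Python sketch below assumes a hypothetical `Quantity` record with a `basis` field and a `comparable` helper; neither is part of ChAMP, UnitsML, or SWEET.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """Hypothetical record that stores a value with an unambiguous unit."""
    value: float
    unit: str   # unit label, e.g. "ppm"
    basis: str  # what the ratio refers to, e.g. "mass/mass"


def comparable(a: Quantity, b: Quantity) -> bool:
    """Two values can be compared directly only if unit and basis both match."""
    return a.unit == b.unit and a.basis == b.basis


by_mass = Quantity(5.0, "ppm", "mass/mass")
by_volume = Quantity(5.0, "ppm", "mass/volume")
same_basis = Quantity(7.0, "ppm", "mass/mass")
```

With this layout, `comparable(by_mass, by_volume)` is false even though both values carry the label "ppm", while `comparable(by_mass, same_basis)` is true.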

Textual (String) Values

As most of the likely representation formats are text based, textual data should be encoded in UTF-8.  Although it is not encouraged, if data in any field needs to be Base64 encoded, the raw text must first be encoded as UTF-8 before the Base64 encoding is applied.  Best practices for representing metadata items in any data format should be derived from the specification of the format being used, i.e. for

Date-Time Values and Ranges