1 EML: Ecological Metadata Language

Cite as:

Matthew B. Jones, Margaret O’Brien, Bryce Mecum, Carl Boettiger, Mark Schildhauer, Mitchell Maier, Timothy Whiteaker, Stevan Earl, Steven Chong. 2019. Ecological Metadata Language version 2.2.0. KNB Data Repository. doi:10.5063/F11834T2

@article{EML_2019,
    title={Ecological Metadata Language version 2.2.0},
    url={https://eml.ecoinformatics.org},
    DOI={10.5063/f11834t2},
    publisher={KNB Data Repository},
    author={Jones, Matthew and O’Brien, Margaret and Mecum, Bryce and Boettiger, Carl and Schildhauer, Mark and Maier, Mitchell and Whiteaker, Timothy and Earl, Stevan and Chong, Steven},
    year={2019}
}

The Ecological Metadata Language (EML) defines a comprehensive vocabulary and a readable XML markup syntax for documenting research data. It is in widespread use in the earth and environmental sciences, and increasingly in other research disciplines as well. EML is a community-maintained specification, and evolves to meet the data documentation needs of researchers who want to openly document, preserve, and share data and outputs. EML includes modules for identifying and citing data packages, for describing the spatial, temporal, taxonomic, and thematic extent of data, for describing research methods and protocols, for describing the structure and content of data within sometimes complex packages of data, and for precisely annotating data with semantic vocabularies. EML includes metadata fields to fully detail data papers that are published in journals specializing in scientific data sharing and preservation.

1.1 Getting Started

An EML document can be composed in a simple text editor (e.g., Atom), via scripting languages like R and Python (e.g., the R eml package), in general-purpose XML authoring tools (e.g., Oxygen), and in custom web-based metadata editing tools (e.g., MetacatUI). While these tools expand and shift over time, the core metadata language has been consistent and backwards compatible, allowing for decades of seamless interoperability of data sets in many repositories.

An EML document can start simple and gain detail over time. At the simple end, an EML document that provides basic bibliographic information is sufficient for citing a data set and for simple discovery in catalogs:

<?xml version="1.0"?>
<eml:eml
    packageId="doi:10.xxxx/eml.1.1" system="https://doi.org"
    xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:stmml="http://www.xml-cml.org/schema/stmml-1.1"
    xsi:schemaLocation="https://eml.ecoinformatics.org/eml-2.2.0 xsd/eml.xsd">
    
    <dataset>
        <title>Primary production of algal species from Southeast Alaska, 1990-2002</title>
        <creator id="https://orcid.org/0000-0003-0077-4738">
            <individualName>
                <givenName>Matthew</givenName>
                <givenName>B.</givenName>
                <surName>Jones</surName>
            </individualName>
            <electronicMailAddress>jones@nceas.ucsb.edu</electronicMailAddress>
            <userId directory="https://orcid.org">https://orcid.org/0000-0003-0077-4738</userId>
        </creator>
        <keywordSet>
            <keyword>biomass</keyword>
            <keyword>productivity</keyword>
        </keywordSet>
        <contact>
            <references>https://orcid.org/0000-0003-0077-4738</references>
        </contact>
    </dataset>
</eml:eml>

This document can then be supplemented with additional metadata describing research projects and methods, structural information about the data, and much more.
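
As a minimal sketch of one such supplement, the fragment below adds temporal coverage to the dataset shown above. The 1990 to 2002 range is taken from the example title. The placement noted in the comment assumes the element ordering of the EML 2.2.0 dataset schema, so validate the result against the schema before relying on it.

```xml
<!-- Illustrative fragment: place inside <dataset>,
     after </keywordSet> and before <contact> (assumed schema order) -->
<coverage>
    <temporalCoverage>
        <rangeOfDates>
            <beginDate>
                <calendarDate>1990</calendarDate>
            </beginDate>
            <endDate>
                <calendarDate>2002</calendarDate>
            </endDate>
        </rangeOfDates>
    </temporalCoverage>
</coverage>
```

Geographic and taxonomic extent can be described in the same coverage element, alongside the temporal range.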

1.2 About the EML Project

The EML project is an open source, community-oriented project dedicated to providing a high-quality metadata specification for describing data relevant to diverse disciplines that involve observational research, such as ecology, earth science, and environmental science. The specification is maintained by voluntary project members who donate their time and experience in order to advance information management for ecology. Project decisions are made by consensus of the current maintainers on the project.

We welcome contributions to this work in any form. Individuals who invest substantial amounts of time and make valuable contributions to the development and maintenance of EML (in the opinion of current project maintainers) will be invited to become EML project maintainers. Contributions can take many forms, including the development of the EML schemas, writing documentation, and helping with maintenance, among others.

1.3 Contributing

Developers may be interested in browsing the source code repository that we use in developing EML. Starting with EML 2.1.1, the master branch reflects the current stable release of EML. Development occurs in development branches (e.g., BRANCH_EML_2_2), which allow experimental additions as they are proposed by the community. A development branch always contains the most recent development version of EML, and therefore may be in flux or broken, and is unlikely to contain the same files as the current release. Development branches are virtually guaranteed to change before they are released, so they should not be used in production environments. Use them for testing at your own risk.

Write access to this repository is reserved for current project maintainers. Non-project members can contribute feedback, revisions, fixes, code, documentation, or any other contribution by submitting pull requests at GitHub.

Discussion of issues occurs on the Slack channel or through the EML Issue Tracking system. The issue tracking system is the preferred way to report problems with EML or to request features.

1.4 History

EML was originally developed by Matthew Jones at NCEAS based on a report by the ESA Committee on the Future of Long-Term Ecological Data (FLED) and on a related paper on ecological metadata by Michener et al. (Michener, William K., et al. 1997. “Nongeospatial metadata for the ecological sciences.” Ecological Applications 7(1): 330-342). Version 1.0 was released at NCEAS in 1997 and used internally, with further internal releases of versions 1.2, 1.3, and 1.4, all of which followed the FLED recommendations closely in their content. Version 2 became a community-maintained, open specification. Substantial modifications for EML 2.x came from experience using the earlier specification at NCEAS and from feedback from the ecological community, particularly information managers from the Long Term Ecological Research Network. Versions 2.1 and 2.2 introduced significant new features such as internationalization, semantic annotations, and support for data papers.

1.5 Older versions (deprecated)

The following versions are still available for reference purposes, although they have been superseded by the current version (2.2.0). Please make every effort to use the current version.

1.7 Funding and Acknowledgements

EML was developed and is maintained with support from the National Center for Ecological Analysis and Synthesis (NCEAS), a Center funded by the University of California Santa Barbara and the state of California.

This material is based upon work supported by the US National Science Foundation under Grant Nos. DEB-9980154, DBI-9904777, 0225676, DEB-0072909, DBI-9983132, and DEB-9634135. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).

This product includes software developed by the Apache Software Foundation (http://www.apache.org/). See the LICENSE file in lib/apache for details.

The source code, object code, and documentation in the com.oreilly.servlet package is copyright and owned by Jason Hunter. See the cos-license.html file for details of the license. Licensor retains title to and ownership of the Software and all enhancements, modifications, and updates to the Software.

This product includes software developed by the JDOM Project (http://www.jdom.org/). See jdom-LICENSE.txt for details.