Benjamin Webb
Ben Webb is an open source developer working at the Center for Geospatial Solutions at the Lincoln Institute of Land Policy. He is committed to developing open source software, primarily in support of the Internet of Water Coalition.
Sessions
The Hydro Network-Linked Data Index (NLDI) is a system that can index data to a hydrographic network and offers a RESTful web service to discover indexed information upstream and downstream of arbitrary points along the stream network. This allows users to search for and retrieve geospatial representations of stream flowlines, catchments, and relevant water monitoring locations contributed by the water data community - without downloading the national dataset or establishing links themselves.
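As a rough illustration of what discovering indexed information upstream or downstream of a point looks like in practice, the sketch below only builds a navigation request URL; it does not call the service. The base URL, the path layout, and the navigation mode codes (UM, UT, DM, DD) are assumptions based on the publicly hosted USGS deployment and should be checked against the current NLDI API documentation. The function name and example identifiers are illustrative only.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of the USGS-hosted NLDI service; verify before use.
NLDI_BASE = "https://api.water.usgs.gov/nldi/linked-data"

# Assumed navigation mode codes: upstream main stem, upstream with
# tributaries, downstream main stem, downstream with diversions.
NAV_MODES = {"UM", "UT", "DM", "DD"}


def nldi_navigation_url(feature_source, feature_id, mode, data_source,
                        distance_km=None, base=NLDI_BASE):
    """Build (but do not send) an NLDI navigation request URL."""
    if mode not in NAV_MODES:
        raise ValueError(f"mode must be one of {sorted(NAV_MODES)}")
    # Percent-encode each path segment so unusual identifiers stay valid.
    segments = (feature_source, feature_id, "navigation", mode, data_source)
    path = "/".join(quote(str(s), safe="") for s in segments)
    url = f"{base}/{path}"
    if distance_km is not None:
        url += "?" + urlencode({"distance": distance_km})
    return url


# Example: flowlines up to 50 km upstream (with tributaries) of a stream gage.
print(nldi_navigation_url("nwissite", "USGS-08279500", "UT", "flowlines", 50))
```

The resulting URL could then be fetched with any HTTP client; the hosted service is expected to return GeoJSON, though that too should be confirmed against the live API.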
Data providers make this possible by publishing open information about where their data is located within the U.S. stream network. Data linked to the NLDI includes various federal, state, and local water infrastructure features, as well as water quantity and quality monitoring locations. The NLDI is developed as an open source project and welcomes contributions to both its code and its indexed data; the main implementation is currently maintained by the U.S. Geological Survey.
The community of practice surrounding the NLDI extends to R and Python developers working on clients that allow scientists to quickly retrieve data relevant for specific hydrologic analyses. As the NLDI community grows, a similar concept could be applied at a global scale, facilitating the development of downstream tools and applications.
While the NLDI is limited to the US, global work would be possible by leveraging global stream network datasets such as MERIT-Hydro. A proof-of-concept global River Runner, which allows discovery of the flowpath downstream of arbitrary points anywhere on Earth, has already been implemented using MERIT-Hydro and OGC API - Processes in pygeoapi. This session includes demonstrations of the NLDI and the global River Runner.
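For readers unfamiliar with OGC API - Processes, the sketch below shows the general shape of an execute request: a JSON body with an `inputs` object, sent by POST to a process's `execution` endpoint. The server URL, the process identifier `river-runner`, and the input names `lat` and `lng` are hypothetical placeholders, not confirmed details of the River Runner deployment.

```python
import json


def build_execute_request(base_url, process_id, lat, lng):
    """Return (url, headers, body) for an OGC API - Processes execute call.

    The input names "lat" and "lng" are assumed for illustration; a real
    process advertises its inputs in its process description document.
    """
    if not -90 <= lat <= 90:
        raise ValueError("lat must be between -90 and 90")
    if not -180 <= lng <= 180:
        raise ValueError("lng must be between -180 and 180")
    url = f"{base_url.rstrip('/')}/processes/{process_id}/execution"
    headers = {"Content-Type": "application/json"}
    body = json.dumps({"inputs": {"lat": lat, "lng": lng}})
    return url, headers, body


# Hypothetical server and process id; nothing is sent over the network here.
url, headers, body = build_execute_request(
    "https://example.org/pygeoapi/", "river-runner", 38.9, -77.0
)
print(url)
print(body)
```

An HTTP client would POST `body` to `url` with the given headers; what comes back depends on how the process is defined on the server.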
A growing number of web applications freely provide their information, and the Web has become a hub for open data publishing. For this to become a self-sustaining ecosystem, data must be findable, accessible, interoperable, and reusable by both humans and machines across the wider web. This session delves into Web Best Practices for publishing data using open source and standards-based solutions.
The geoconnex.us project provides technical infrastructure and guidance to create an open, community-contribution model for a knowledge graph linking hydrologic features in the United States, as an implementation of Internet of Water principles. This knowledge graph can be leveraged to create a wide array of information products that answer water-related questions.
Implementation has two parts: persistently identified real-world objects, and the organizational monitoring locations that collect data about them. Both must be published to the Web using persistent URIs and communicated with common linked data semantics in order for a knowledge graph to be constructed.
The Internet of Water Coalition supports the first part with a Permanent Identifier Service and reference hydrologic features (e.g. watersheds, monitoring locations, dams, and bridges) within the US.
In support of the second part, geoconnex.us uses pygeoapi and the OGC API - Features standard to publish structured metadata resources about individual hydrologic objects and the data about them. pygeoapi supports extending this standard by incorporating domain-specific structured data into the HTML format at the feature level and by allowing external HTTP URI identification. In addition, pygeoapi’s flexible plugin architecture allows for custom integrations and processes. This means that individual features from various sources can have structured, standardized metadata harvested by search engines and assembled into a useful knowledge graph.
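To make the idea of feature-level structured data concrete, the sketch below turns a GeoJSON point feature into a minimal JSON-LD document whose `@id` is the feature's persistent HTTP URI. It is an illustration only and not pygeoapi's actual output or templating mechanism: the function name, the `uri` property name, the example URI, and the choice of schema.org `Place` and `GeoCoordinates` terms are all assumptions.

```python
import json


def feature_to_jsonld(feature, uri_field="uri"):
    """Map a GeoJSON point feature to a minimal schema.org JSON-LD document.

    `uri_field` names the (assumed) property that holds the feature's
    persistent external identifier.
    """
    props = feature["properties"]
    # GeoJSON stores coordinates in longitude, latitude order.
    lon, lat = feature["geometry"]["coordinates"][:2]
    return {
        "@context": "https://schema.org/",
        "@id": props[uri_field],  # external HTTP URI, not a local row id
        "@type": "Place",
        "name": props.get("name"),
        "geo": {
            "@type": "GeoCoordinates",
            "latitude": lat,
            "longitude": lon,
        },
    }


# Hypothetical monitoring location with a made-up persistent URI.
gage = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-106.11, 36.07]},
    "properties": {"uri": "https://example.org/id/gage/123", "name": "Example gage"},
}
print(json.dumps(feature_to_jsonld(gage), indent=2))
```

Embedding a document like this in a feature's HTML page is one way a crawler could pick up the identifier and location without knowing anything about the publishing organization's internal systems.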
This spatial feature-based linked data architecture enables data interoperability between independent organizations that hold information about the same real-world thing without centralizing data infrastructure, answering important questions like, “Who is collecting water data about my local stream and its tributaries?” or “What data do we have about water upstream and downstream of East Palestine, Ohio?”