Data Making from Space

Curiosio
May 11, 2020

by Vas Mylko, Roman Bilusiak

EDIT: This article was originally written for DOU.ua and was unlisted, available only via the link there: https://dou.ua/lenta/articles/ml-from-space/ Now, three months later, we are making it public so that more geeks can find and read it.

We are Curiosio, a superguide for travel geeks. Curiosio consists of a cutting-edge optimizer and a knowledge graph of points and places. Recently we engineered features for a new ML module for Curiosio, and one of those features was elevation. The elevation of a geographic location is its height above or below a mathematical model of the Earth’s sea level. The task sounded as easy as writing a “Hello World” program in an unknown programming language.

Hello World

I prepared a list of ~70 manually sampled locations and started to pull their elevations from OSM, Wikidata, and Wikipedia. Almost instantly I noticed how much the sources differed in how and where elevation data was stored. It reminded me of how basic things (e.g. dog, rain, food) are named differently in different spoken languages because those languages evolved independently. Elevation was present under different names and in different forms in all of them, but in most cases it was missing altogether:

  • OSM has Key:ele
  • Wikidata has elevation above sea level (P2044)
  • Wikipedia has {{Infobox… elevation_m=… }} for meters and {{Infobox… elevation_ft=… }} for feet
  • Wikipedia has elevation in free form text on the page

I decided to link more data sets, so I looked in GeoNames. Elevation was supposed to be there, and it was, but not for all locations from the sampled list:

  • GeoNames has geoname table with elevation column
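To make the mismatch between the sources listed above concrete, here is a minimal sketch of normalizing their elevation fields to meters. It assumes each location has already been fetched into a flat dict; the record shape, the key names (`geonames_elevation` in particular), and the function `pick_elevation_m` are hypothetical, chosen for illustration and not taken from our pipeline.

```python
from typing import Optional

FEET_TO_METERS = 0.3048  # exact by definition of the international foot

# Hypothetical flat record: keys mirror the source fields listed above.
# Each entry is (key, factor that converts the value to meters).
SOURCE_FIELDS = [
    ("ele", 1.0),                       # OSM Key:ele, meters by convention
    ("P2044", 1.0),                     # Wikidata, assumed already in meters
    ("elevation_m", 1.0),               # Wikipedia infobox, meters
    ("elevation_ft", FEET_TO_METERS),   # Wikipedia infobox, feet
    ("geonames_elevation", 1.0),        # GeoNames elevation column, meters
]


def pick_elevation_m(record: dict) -> Optional[float]:
    """Return the first usable elevation in meters, or None if all are absent."""
    for field, factor in SOURCE_FIELDS:
        raw = record.get(field)
        if raw in (None, ""):
            continue  # field missing in this source
        try:
            return float(raw) * factor
        except (TypeError, ValueError):
            continue  # free-form text such as "about 300 m" is skipped
    return None


print(pick_elevation_m({"elevation_ft": "1000"}))
print(pick_elevation_m({}))
```

The priority order in `SOURCE_FIELDS` is arbitrary here; the point is only that most records fall through to `None`, which is what pushed us to look elsewhere.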

Here I felt that this mission was not even close to “Hello World” in complexity, and I notified Roman about potential surprises with the elevation data point… He found the data. What to do in such situations? Raise the abstraction level. How high? Until it starts working for our problem. So we raised the bar by ~240 kilometers (150 miles) above the Earth, to space.

Space Tech

The Shuttle Radar Topography Mission (SRTM) was an international project spearheaded by the National Imagery and Mapping Agency and NASA, with participation of the German Aerospace Center DLR. Its objective was to obtain the most complete high-resolution digital topographic database of the Earth. SRTM consisted of a specially modified radar system that flew onboard Endeavour during its 11-day mission in February 2000. This radar system gathered around 8 terabytes of data to produce high-quality 3D images of the Earth’s surface.

60-meter mast (by NASA)

To acquire topographic data, the SRTM payload was outfitted with two radar antennas. One antenna was located in the Shuttle’s payload bay, the other on the end of a 60-meter (200-foot) mast that extended from the payload bay once the Shuttle was in space. The SRTM mission covered approximately 80% of the Earth’s land surface (71% of the Earth’s surface is water-covered), with a global resolution of 90 meters and a resolution of 30 meters over the USA.

STS-99 Space Shuttle Mission Crew

Terra (EOS AM-1), a multi-national NASA scientific research satellite, was launched in 1999. It is the flagship of the Earth Observing System (EOS). It carries the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), a Japanese sensor that is one of five remote sensing instruments on board. ASTER has been collecting data since February 2000. Version 2 of the ASTER Global Digital Elevation Model (GDEM) was released in October 2011; it is of good quality and better than SRTM over rugged mountainous terrain.

In September 2014 NASA released an improved version of SRTM. Previously, SRTM data for regions outside the United States were sampled for public release at 3 arc-seconds, which is 1/1200th of a degree of latitude and longitude, or about 90 meters (295 feet). The new data has been released with a 1 arc-second, or about 30 meters (98 feet), sampling that reveals the full resolution of the original measurements.

We will use the new SRTM for elevation data as one of the data points for machine learning for one module in Curiosio.

Big thanks to the astronauts, JSC JPL mapping engineers, and NGA/NIMA! OK, back to Earth…

Elevation Data

The Earth elevation data has been released by NASA, which means the data is in the public domain. Works in the public domain are not covered by intellectual property rights, such as copyright, at all. You can use the data as is, including for commercial use.

SRTM data is organized into individual rasterized cells, or tiles, each covering one degree by one degree in latitude and longitude. Sample spacing for individual data points is either 1 arc-second, 3 arc-seconds, or 30 arc-seconds, referred to as SRTM1, SRTM3 and SRTM30, respectively. Since one arc-second at the equator corresponds to roughly 30 meters in horizontal extent, the SRTM1 and SRTM3 are sometimes referred to as “30 meter” or “90 meter” data.
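The arithmetic behind the “30 meter” and “90 meter” labels can be checked with a short sketch. It assumes a spherical Earth with the mean radius of 6,371 km, so the numbers are approximate; the function names are ours, for illustration only. The sample counts per tile side (3601 for SRTM1, 1201 for SRTM3) follow from a one-degree tile that includes both of its edges.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def arcsec_to_meters(arcsec: float, latitude_deg: float = 0.0) -> Tuple[float, float]:
    """Return (north-south, east-west) ground distance in meters for an angle."""
    angle = math.radians(arcsec / 3600.0)
    north_south = EARTH_RADIUS_M * angle
    # Meridians converge toward the poles, so east-west spacing shrinks.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


def samples_per_tile_side(arcsec: float) -> int:
    """Samples along one side of a 1x1 degree tile, counting both edges."""
    return round(3600 / arcsec) + 1


for name, spacing in (("SRTM1", 1), ("SRTM3", 3), ("SRTM30", 30)):
    ns, _ = arcsec_to_meters(spacing)
    print(f"{name}: about {ns:.0f} m between samples at the equator")

print(samples_per_tile_side(1), samples_per_tile_side(3))
```

Note that the east-west spacing shrinks with latitude: at 60 degrees it is half of the equatorial value, while the north-south spacing stays the same.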

Index map (no data for northernmost and southernmost latitudes)

The main entry point to SRTM v2 is https://dds.cr.usgs.gov/srtm/version2_1/

Index of /srtm/version2_1
Documentation/
NAVMac800QSFile
SRTM1/
SRTM3/
SRTM30/
SWBD/

Index of /srtm/version2_1/SRTM1
Parent Directory
Region_01/
Region_02/
Region_03/
Region_04/
Region_05/
Region_06/
Region_07/
Region_definition.jpg

Index of /srtm/version2_1/SRTM1/Region_01
N38W112.hgt.zip
N38W113.hgt.zip
N38W114.hgt.zip
N38W115.hgt.zip
N38W116.hgt.zip
N38W117.hgt.zip
N38W118.hgt.zip
...

File names refer to the latitude and longitude of the lower left corner of the tile: e.g. N38W112 has its lower left corner at 38 degrees north latitude and 112 degrees west longitude. Zipped HGT files are height files containing DEMs, aka Digital Elevation Models. The DEM is provided as 16-bit signed integer data in a simple binary raster. The 16-bit integers represent the height of each cell in meters, arranged from west to east and then north to south. There are no header or trailer bytes embedded in the file. The data is stored in row-major order (all the data for row 1, followed by all the data for row 2, etc.).
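The naming rule and the row-major layout translate into a few lines of code. This is a minimal sketch, not how SRTM.py does it: `tile_name` and `sample_index` are hypothetical helpers, the default `side=3601` assumes an SRTM1 tile, and picking the nearest sample by rounding is our simplification.

```python
import math


def tile_name(latitude: float, longitude: float) -> str:
    """Name of the 1x1 degree tile containing the point, e.g. 'N38W112'."""
    lat0 = math.floor(latitude)   # southern edge of the tile
    lon0 = math.floor(longitude)  # western edge of the tile
    ns = "N" if lat0 >= 0 else "S"
    ew = "E" if lon0 >= 0 else "W"
    return f"{ns}{abs(lat0):02d}{ew}{abs(lon0):03d}"


def sample_index(latitude: float, longitude: float, side: int = 3601) -> int:
    """Index of the nearest sample in the flat row-major array of one tile."""
    lat0, lon0 = math.floor(latitude), math.floor(longitude)
    row = round((lat0 + 1 - latitude) * (side - 1))  # row 0 is the northern edge
    col = round((longitude - lon0) * (side - 1))     # col 0 is the western edge
    return row * side + col


print(tile_name(38.5, -111.5))
print(sample_index(38.5, -111.5))
```

Since every sample is two bytes, the byte offset in the file is the index multiplied by 2.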

Byte order is big-endian, with the most significant byte first. Since the values are signed integers, elevations can range from -32767 to 32767 meters, encompassing the range of elevations found on the Earth. The data also contains occasional voids from a number of causes such as shadowing, phase unwrapping anomalies, or other radar-specific causes. Voids are flagged with the value -32768.
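The format described above is simple enough to decode with the Python standard library alone. The sketch below is an illustration under stated assumptions: `decode_hgt` is a hypothetical helper, it expects the already unzipped bytes of a square tile, and it maps voids to `None`. A tiny synthetic 2x2 raster stands in for a real tile so that the example runs without downloading anything.

```python
import math
import struct
from typing import List, Optional

VOID = -32768  # void marker used in SRTM height files


def decode_hgt(raw: bytes) -> List[List[Optional[int]]]:
    """Decode a headerless big-endian int16 raster into rows of meters."""
    count = len(raw) // 2
    side = math.isqrt(count)
    if side * side * 2 != len(raw):
        raise ValueError("not a square 16-bit raster")
    # ">" selects big-endian byte order, "h" is a signed 16-bit integer.
    values = struct.unpack(f">{count}h", raw)
    return [
        [None if v == VOID else v for v in values[r * side:(r + 1) * side]]
        for r in range(side)  # rows run north to south, columns west to east
    ]


# Synthetic 2x2 "tile": one void and one below-sea-level sample.
sample = struct.pack(">4h", 1200, 1210, VOID, -5)
grid = decode_hgt(sample)
print(grid)
```

For a real SRTM1 tile the same function would return 3601 rows of 3601 values; a NumPy array would be the more practical choice at that size.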

The most convenient way to access the data is by using the library SRTM.py and its golang port go-elevations. The license of both libs is Apache 2.0, allowing commercial use. Both libs were developed by tkrajina aka Tomo Krajina — software geek from Višnjan, Istra, Croatia. Kudos to Tomo!

Python example of getting elevation by geo coordinates:

import srtm

# Default setup: cache files with URLs of SRTM files are put in the HOME dir.
data = srtm.get_data()
print(data.get_elevation(50.8682, 7.1377))

# If you need another location, set the cache dir explicitly.
data = srtm.get_data(local_cache_dir="foo")
print(data.get_elevation(50.8682, 7.1377))

# Voids happen; get an interpolated value instead.
# IDW stands for Inverse Distance Weighted.
# The leading underscore marks _IDW as private by Python convention,
# so it may change between library versions.
print(data._IDW(50.8682, 7.1377))

Golang example of getting elevation by geo coordinates:

package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/tkrajina/go-elevations/geoelevations"
)

func main() {
	srtm, err := geoelevations.NewSrtm(http.DefaultClient)
	if err != nil {
		// retry, or reconfig and retry; here we simply stop,
		// because srtm is unusable after a failed setup
		log.Fatal(err)
	}
	ele, err := srtm.GetElevation(http.DefaultClient, 45.2775, 13.726111)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(ele)
}

What about the data beyond SRTM’s reach? First of all, the uncovered area is not as big as the Mercator projection suggests. Mercator stretches areas near the poles relative to those near the equator. Check out other projections to see that the SRTM coverage is very good. Second, other data sources (including Wikidata, Wikipedia, GeoNames, GDEM) will be needed for elevations in the snow-covered latitudes.
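The Mercator effect can be quantified with a short calculation. The sketch below assumes a spherical Earth and takes the SRTM coverage band as 56 degrees south to 60 degrees north, the limits published by NASA (they are not stated earlier in this article). The result is the share of the whole sphere, ocean included, that lies inside the band; the helper names are ours.

```python
import math


def band_fraction(south_deg: float, north_deg: float) -> float:
    """Fraction of a sphere's surface lying between two latitudes."""
    north = math.sin(math.radians(north_deg))
    south = math.sin(math.radians(south_deg))
    return (north - south) / 2


def mercator_stretch(latitude_deg: float) -> float:
    """Linear scale factor of the Mercator projection relative to the equator."""
    return 1 / math.cos(math.radians(latitude_deg))


covered = band_fraction(-56, 60)
print(f"Band between 56S and 60N: {covered:.1%} of the sphere")
print(f"Mercator linear stretch at 60N: {mercator_stretch(60):.2f}x")
print(f"Mercator linear stretch at 80N: {mercator_stretch(80):.2f}x")
```

Area on a sphere grows with the sine of latitude, so the caps beyond the band are small, while Mercator inflates exactly those caps the most.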

Endeavour

The STS-99 mission was successfully completed by the Space Shuttle Endeavour. Here is a link to the STS-99 launch. Here is a link to the STS-99 flight story, which contains footage and a description of how the mapping machinery worked in orbit (starts at 4m00s and again at 10m07s, with a glitch and tears). The mast was the biggest rigid structure in space at that time. It took 330 big tape cassettes to record the data. Endeavour flew 25 missions between May 1992 and June 2011. It was formally decommissioned and moved to the California Science Center in Los Angeles. Endeavour’s road to the science center was a mission in itself.

Endeavour moving through Los Angeles

Curiosio

Curiosio is a superguide for travel geeks. We envision independent travel when the world unlocks. We build fundamental technology for the new world of travel. Curiosio is a computational knowledge engine + search engine + answer engine. Curiosio is neat + clean + cool on the screen. Curiosio is advanced and cutting-edge behind the browser. We do ML for the knowledge graph mainly. We do Evolutionary AI for the artificial smartness. Be safe and curious, stay tuned!
