The GMAP Mounds identification science case aims to develop a generalised machine learning pipeline for localising and characterising specific geomorphological features (mounds) on the surface of Mars. Mounds are positive relief features that can be ascribed to a variety of phenomena (e.g., De Toffoli et al., 2019): monogenic edifices produced by spring or mud volcanism, rootless cones on top of lava flows, pingos and so on. This investigation focuses on the sedimentary/spring case, in which mounds form by extrusion of mud or of sulphate-oversaturated fluids. These objects are usually widespread regionally and/or contained in large complex craters (i.e., tens of km in diameter), often in populations of several hundred to several thousand. Automatic detection has previously been performed in some of these cases (Pozzobon et al., 2019) over limited areas using topographic data (i.e., Digital Terrain Models (DTMs), rasters whose cells represent height values), discriminating the objects by means of pre-trained morphometric parameters and mapping them. Because high-resolution DTMs are scarce and cover little area, the challenge for the ML WP is to detect such mound features from simple grayscale panchromatic images at mid-to-high resolution, without the need for topographic information.
The data is obtained from the Mars Reconnaissance Orbiter (MRO) mission. The MRO spacecraft is designed to study the geology and climate of Mars, provide reconnaissance of future landing sites, and relay data from surface missions back to Earth. The data was collected by the High Resolution Imaging Science Experiment (HiRISE), one of six instruments on board MRO and the most powerful camera ever sent to another planet. The data is provided in Digital Elevation Model (DEM) format. A detailed description of the data can be found on the University of Arizona HiRISE website.
The data set consists of two DTMs, one used for training and the other for testing. First, the training DTM is tiled into smaller fixed-size images, and label masks are created from the available ground-truth shapefiles. The images are then scaled to the range [-1, 1]. The training tiles are split further into train and validation sets with an 80/20 ratio. Finally, the train set is augmented with image manipulations such as flipping, rotation and rescaling to create a large training set for the segmentation task.
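The preprocessing steps described above can be illustrated with a minimal sketch. It is not the project's actual code: the function names (`tile_raster`, `scale_to_unit_range`, `split_train_val`, `augment`), the tile size, the use of a global minimum and maximum for scaling, and the toy 8x8 raster are all assumptions made for illustration. Plain Python lists are used to keep the example dependency-free; a real pipeline would more likely use an array library. Rescaling augmentation is omitted for brevity.

```python
import random

def tile_raster(raster, size):
    """Cut a 2D raster (list of rows) into non-overlapping size x size tiles.
    Partial tiles at the right and bottom edges are dropped."""
    rows, cols = len(raster), len(raster[0])
    tiles = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            tiles.append([row[c:c + size] for row in raster[r:r + size]])
    return tiles

def scale_to_unit_range(tile, lo, hi):
    """Linearly map values from [lo, hi] to [-1, 1]."""
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in tile]
    return [[2.0 * (v - lo) / span - 1.0 for v in row] for row in tile]

def split_train_val(items, val_fraction=0.2, seed=0):
    """Shuffle reproducibly, then split into (train, validation)."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_val = round(len(items) * val_fraction)
    return items[n_val:], items[:n_val]

def augment(image, mask):
    """Return the original pair plus flipped and rotated copies.
    The same transform is applied to the image and to its mask."""
    def hflip(t):
        return [row[::-1] for row in t]
    def vflip(t):
        return [list(row) for row in t[::-1]]
    def rot90(t):  # 90 degrees clockwise
        return [list(row) for row in zip(*t[::-1])]
    transforms = [lambda t: t, hflip, vflip, rot90]
    return [(f(image), f(mask)) for f in transforms]

# Toy data (hypothetical): an 8x8 "DTM" and a binary mound mask.
dtm = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
mask = [[1 if 2 <= r < 4 and 2 <= c < 4 else 0 for c in range(8)]
        for r in range(8)]

image_tiles = tile_raster(dtm, 4)
mask_tiles = tile_raster(mask, 4)

# Assumption: scale with the global min/max of the whole raster.
lo = min(min(row) for row in dtm)
hi = max(max(row) for row in dtm)
scaled = [scale_to_unit_range(t, lo, hi) for t in image_tiles]

train, val = split_train_val(list(zip(scaled, mask_tiles)), val_fraction=0.2)
augmented = [pair for img, m in train for pair in augment(img, m)]
```

Splitting before augmenting, as in the text, keeps transformed copies of a training tile out of the validation set.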
For the initial image segmentation task, a standard U-Net (Ronneberger et al., 2015) is trained on the training set, reaching a mean IoU (Intersection over Union) of about 60% on the validation set. This result is consistent with that of another, GAN-based model, which suggests that the information present in the training set is saturated.
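To make the reported metric concrete, the following sketch computes IoU for a pair of label masks. The text does not say how the mean is taken, so averaging over the two classes (background = 0, mound = 1) is an assumption here, as are the function names `iou` and `mean_iou`.

```python
def iou(pred, target, cls):
    """Intersection over Union for one class between two equally shaped
    label masks. Returns None if the class is absent from both masks."""
    inter = 0
    union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            in_pred = (p == cls)
            in_target = (t == cls)
            if in_pred and in_target:
                inter += 1
            if in_pred or in_target:
                union += 1
    return inter / union if union else None

def mean_iou(pred, target, classes=(0, 1)):
    """Average the per-class IoU over the classes that actually occur."""
    scores = [s for s in (iou(pred, target, c) for c in classes)
              if s is not None]
    if not scores:
        raise ValueError("none of the requested classes occur in the masks")
    return sum(scores) / len(scores)

# Toy masks (hypothetical): 1 marks mound pixels, 0 marks background.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
score = mean_iou(pred, target)
```

In this toy case each class has two pixels in the intersection and four in the union, so both per-class values and their mean are 0.5.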
Because only a limited number of samples is available for training, we train a generative model (Goodfellow et al., 2020) to approximate the true distribution of the landforms. We generate an augmented set with this model and train the segmentation network again, observing an improvement of about 10% in the IoU. This is an interesting result, as it indicates that the generative model can be used to simulate mound terrains. The approximated distribution should then be factorisable into a set of independent mechanisms that control the factors of variation.
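For reference, the adversarial objective of the cited work (Goodfellow et al.) is reproduced below. The text does not specify which variant of generative model is used, so treating it as this standard formulation is an assumption. Here $G$ maps a latent vector $z$ drawn from a prior $p_z$ to a synthetic sample, and $D$ estimates the probability that a sample $x$ comes from the real data distribution $p_{\text{data}}$ rather than from $G$.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{z \sim p_{z}(z)} \bigl[ \log \bigl( 1 - D(G(z)) \bigr) \bigr]
```

Under this reading, the augmented set consists of samples $G(z)$, and the latent variable $z$ is the space in which the factors of variation would be sought.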
Such a simulator can be used for controlled generation. A further advantage of learning a latent space is that the latent representation can benefit downstream tasks, for example compact storage and efficient searching. We have developed this simulator and plan to publish the method in the coming months.
Results of this science case were presented at EGU21. The ML pipeline is available on our GitHub repository.
References: