science_cases:lmsu_science_case
===== Description of the machine learning problem and our approach =====
Based on data from the mission, several global models of the magnetosphere were proposed (e.g., Winslow et al., 2013; Philpott et al., 2020). However, these models describe only the average shape of the bow shock and magnetopause crossings and can miss the statistical nuances in the data. Given large amounts of data, neural networks can be expected to approximate complex functions, often surpassing deterministic and rule-based methods in a variety of time series tasks such as classification (Fawaz et al., 2019), forecasting (Lim and Zohren, 2021), and rare event detection (Nguyen et al., 2018). We leverage these properties to **develop a predictor** that can be used in real time during orbit to predict the magnetic region at each step in a short window of observations.
The use of **statistical neural networks** allows us to explore another aspect: with the help of **active learning**, samples can be added to the training process incrementally. This lets us examine how the model's predictive capacity scales with increasing data, and thus study how variations such as changing solar wind and environmental conditions affect the manifestation of boundary signatures. Different orbits can be expected to share some similarity in magnetic field structure, yet show large variations in the same segments under different conditions. It is also interesting to study the minimum amount of data needed to generalise these phenomena for future missions such as BepiColombo.
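The incremental scheme described above can be sketched as a generic pool-based active-learning loop. This is a minimal sketch, not our actual training code: the placeholder model, the random uncertainty score, and the round/batch sizes are all illustrative assumptions.

```python
import random

def train(model_state, labelled):
    # Placeholder for fitting the network on the current labelled pool.
    return {"n_seen": len(labelled)}

def uncertainty(model_state, sample):
    # Placeholder acquisition score; a real predictor would use e.g.
    # prediction entropy or the margin between the top two class probabilities.
    return random.random()

def active_learning(pool, rounds=3, batch=2, seed=0):
    """Pool-based active learning: repeatedly pick the most uncertain
    samples, add them to the training set, and retrain."""
    random.seed(seed)
    labelled, model_state = [], None
    for _ in range(rounds):
        pool.sort(key=lambda s: uncertainty(model_state, s), reverse=True)
        labelled.extend(pool[:batch])   # "ask the oracle" to label these
        del pool[:batch]
        model_state = train(model_state, labelled)
    return model_state, labelled

state, labelled = active_learning(list(range(10)))
```

Tracking model quality after each round of such a loop is what lets one plot predictive capacity against training-set size.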
The data set was **manually labelled with the boundary crossings**. To identify bow shocks, we first subtracted planetary dipole magnetic field components from the magnetometer measurements,
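The dipole subtraction step can be sketched as follows. The centred-dipole formula, the sign convention, and the moment value of 190 nT·R_M^3 are assumed textbook approximations (Mercury's dipole is in fact offset northward), not the exact model used for labelling:

```python
import math

# Approximate dipole moment for Mercury in nT * R_M^3 (an assumed
# literature value; the real dipole is offset northward from the centre).
M_DIPOLE = 190.0

def dipole_field(r, theta):
    """Centred-dipole components (B_r, B_theta) in nT at radius r (in
    planetary radii) and magnetic colatitude theta (radians)."""
    br = -2.0 * M_DIPOLE * math.cos(theta) / r**3
    bt = -M_DIPOLE * math.sin(theta) / r**3
    return br, bt

def residual(measured_br, measured_bt, r, theta):
    """Subtract the model dipole from a magnetometer sample; the residual
    makes external signatures such as boundary crossings stand out."""
    br, bt = dipole_field(r, theta)
    return measured_br - br, measured_bt - bt
```

With the internal field removed, sharp jumps in the residual field magnitude are easier to associate with bow shock and magnetopause crossings.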
|4|Magnetosphere|14.1|
The boundaries of critical interest, bow shock and magnetopause, are minority classes with only 3.7% and 2.3% representation, respectively. The table highlights the **data imbalance issue**, which requires special techniques to ensure the predictor does not bias towards the overrepresented classes.
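Inverse-frequency class weighting in the loss is one common remedy for such imbalance; the sketch below is illustrative, not necessarily the technique we used. The "other" share is simply 100% minus the shares named in the text, lumped together for the example.

```python
def class_weights(shares):
    """Inverse-frequency weights, rescaled so the mean weight is 1."""
    inv = {c: 1.0 / p for c, p in shares.items()}
    scale = len(inv) / sum(inv.values())
    return {c: w * scale for c, w in inv.items()}

# Class shares in percent: the three listed in the text, plus the
# remainder (100 - 3.7 - 2.3 - 14.1) lumped together as "other".
shares = {"bow_shock": 3.7, "magnetopause": 2.3,
          "magnetosphere": 14.1, "other": 79.9}
weights = class_weights(shares)
```

Passing such weights to the loss penalises mistakes on the rare boundary classes more heavily; oversampling the minority classes is a common alternative.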
As a first step in pre-processing,
In the feature preparation stage, a sliding window of variable sizes (3 seconds to 3 minutes) with a hop size of 1 second was computed on the time series signal to obtain feature vectors. Finally, the features were normalised to a mean of 0 and a standard deviation of 1. No other pre-processing or feature engineering was applied, in order to allow the deep learning model to engineer features implicitly.
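The windowing and normalisation steps can be sketched in pure Python as follows. Normalisation is shown per window for brevity; in practice the statistics would typically be computed over the training set.

```python
import statistics

def sliding_windows(signal, window_size, hop=1):
    """Split a 1-D series into overlapping windows; window_size and hop
    are in samples, so at a 1 Hz sampling rate they equal seconds."""
    return [signal[i:i + window_size]
            for i in range(0, len(signal) - window_size + 1, hop)]

def standardise(windows):
    """Scale each feature vector to zero mean and unit standard deviation."""
    out = []
    for w in windows:
        mu = statistics.mean(w)
        sigma = statistics.pstdev(w) or 1.0  # guard against constant windows
        out.append([(x - mu) / sigma for x in w])
    return out

signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
windows = sliding_windows(signal, window_size=3, hop=1)
features = standardise(windows)
```

With a hop of one sample, consecutive windows overlap heavily, which is what allows a per-step prediction of the magnetic region.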
The windowed features are fed first into a block of 3 Convolutional
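A block of three stacked convolutional layers can be illustrated in plain Python as below. The kernel widths, the single input channel, and the ReLU activation are assumptions for the sketch; the actual network configuration is not fully recoverable from this page.

```python
def conv1d(x, kernel):
    """Valid-mode 1-D convolution (cross-correlation, as in deep learning)."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]

def relu(x):
    return [max(0.0, v) for v in x]

def conv_block(x, kernels):
    """Stack of convolution + ReLU layers, one layer per kernel."""
    for kern in kernels:
        x = relu(conv1d(x, kern))
    return x

# Three width-3 kernels: a 30-sample window shrinks to 30 - 3*2 = 24 samples.
window = [float(i % 5) for i in range(30)]
out = conv_block(window, [[0.25, 0.5, 0.25]] * 3)
```

Each layer sees a progressively wider receptive field of the input window, which is why short boundary signatures can be picked up without hand-crafted features.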
The window size used in these experiments is 30 seconds. Overall, the **predictor
Results of this science case were presented at the EGU21 as well as at EPSC2021.
**References:**
science_cases/lmsu_science_case.1645091422.txt.gz · Last modified: 2022/02/17 10:50 by admin