science_cases:iwf_science_case

Differences

This shows you the differences between two versions of the page.

science_cases:iwf_science_case [2022/05/17 14:16] admin
science_cases:iwf_science_case [2022/09/12 13:06] (current) admin
Line 13:
 ===== Main aim =====
  
-The main aim of this science case is to develop/implement an algorithm to automatically detect ICME and CIR signatures in in situ solar wind data.
+The main aim of this science case is to develop/implement an algorithm to automatically detect ICME signatures in in situ solar wind data.
  
 Interplanetary coronal mass ejections (ICMEs) are one of the main drivers for space weather disturbances. In the past, different machine learning approaches have been used to automatically detect events in existing time series resulting from solar wind in situ data (e.g., Nguyen et al., 2019). However, classification, early detection and ultimately forecasting still remain challenging when faced with the large amount of data from different instruments. While CNNs are often used to discover objects or patterns in images or data series, there are two main problems when facing our specific task: high duration variability and a rather ambiguous definition of start and end time.
Line 19:
 ===== Description of the machine learning problem and our approach =====
  
-The first step in this science case was the **reimplementation of a model proposed by Nguyen et al. (2019)**, which had previously been tested on WIND data and achieved a maximum recall and precision of around 84%.
+The first step in this science case was the **reimplementation of a model proposed by Nguyen et al. (2019)**, which had previously been tested on WIND data and achieved a maximum [[:glossary#recall|recall]] and [[:glossary#precision|precision]] of around 84%.
  
-After the reimplementation of this model, the model was tested on STEREO-A and STEREO-B data as well as on WIND data. All three contain fewer variables than the original data set used by Nguyen et al. At a similar recall as for the original set, the precision for all three datasets was only around 30% and the accuracy in delivering start and end times was limited.
+After the reimplementation of this model, the model was tested on STEREO-A and STEREO-B data as well as on WIND data. All three contain fewer variables than the original data set used by Nguyen et al. At a similar [[:glossary#recall|recall]] as for the original set, the [[:glossary#precision|precision]] for all three datasets was only around 30% and the [[:glossary#accuracy|accuracy]] in delivering start and end times was limited.
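
As a reminder of how the two metrics relate, the following is a minimal sketch that computes precision and recall from event counts. The function name and the counts are hypothetical and chosen only to reproduce the rough proportions quoted above (recall near 84%, precision near 30%); they are not taken from the study.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) computed from event counts."""
    detected = true_positives + false_positives   # everything the model flagged
    actual = true_positives + false_negatives     # everything in the catalogue
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 84 of 100 catalogued events are found,
# but 196 additional detections are false alarms.
precision, recall = precision_recall(84, 196, 16)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

With these illustrative counts the recall stays high while most detections are false alarms, which is the situation described for the STEREO-A, STEREO-B and WIND tests.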
  
 The next step was to align all three data sets in order to process more training data for a combined model. It was tested on held-out datasets for WIND, STEREO-A and STEREO-B. Surprisingly, this did not sufficiently improve performance and led us to explore other approaches.
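
One possible reading of "aligning" the data sets is restricting them to the variables they all share. The sketch below illustrates that idea only; the function and the column names are hypothetical and do not reflect the actual variable names or the alignment procedure used in this science case.

```python
def common_features(*feature_lists):
    """Return the features present in every list, in the order of the first list."""
    shared = set(feature_lists[0])
    for features in feature_lists[1:]:
        shared &= set(features)
    return [name for name in feature_lists[0] if name in shared]


# Hypothetical column names for illustration only.
wind = ["bx", "by", "bz", "bt", "vt", "np", "tp", "beta"]
stereo_a = ["bt", "bx", "by", "bz", "vt", "np", "tp"]
stereo_b = ["bx", "by", "bz", "bt", "np", "tp", "vt"]

shared = common_features(wind, stereo_a, stereo_b)
print(shared)
```

Keeping the order of the first list gives every data set the same column layout, so the samples can be stacked into one training set.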
  
-Starting from the reimplementation, a **post processing step based on YOLO v5** (ultralytics) was investigated, in order to improve performance. Even though first results seemed promising, the idea was later discarded due to unsatisfactory results and the laborious pipeline. Since the ultimate goal is an explicit and widely applicable pipeline, it was decided to abandon the general approach of using multiple basic neural networks and the similarity measure used by Nguyen et al. (2019) completely and **compose it as a segmentation problem** instead.
+Starting from the reimplementation, a **post processing step based on YOLO v5** (ultralytics) was investigated, in order to improve performance. Even though first results seemed promising, the idea was later discarded due to unsatisfactory results and the laborious pipeline. Since the ultimate goal is an explicit and widely applicable pipeline, it was decided to abandon the general approach of using multiple basic [[:glossary#neural_network|neural networks]] and the similarity measure used by Nguyen et al. (2019) completely and **compose it as a segmentation problem** instead.
  
-We proposed a pipeline using a **UNet** (Ronneberger et al., 2015) including residual blocks, squeeze and excitation blocks, Atrous Spatial Pyramid Pooling (ASPP) and attention blocks, similar to the ResUNet\+\+ (Jha et al., 2019), for the automatic detection of ICMEs. Comparing it to our first results, we find that our model outperforms the baseline regarding GPU usage, training time and robustness to missing features, thus making it more usable for other data sets, as well as the three aligned data sets. The relatively fast training allows straightforward tuning of hyperparameters. Our proposed pipeline can be used for any time series segmentation problem. The straightforward implementation allows a simple extension to a multiclass classification problem and paves the way to include corotating interaction regions into the range of detectable phenomena within our pipeline. Furthermore, we hope to apply our model to similar problems in the future.
+We proposed a pipeline using a **UNet** (Ronneberger et al., 2015) including residual blocks, squeeze and excitation blocks, Atrous Spatial Pyramid Pooling (ASPP) and attention blocks, similar to the **<nowiki>ResUNet++</nowiki>** (Jha et al., 2019), for the automatic detection of ICMEs. Comparing it to our first results, we find that our model outperforms the baseline regarding GPU usage, training time and robustness to missing [[:glossary#feature|features]], thus making it more usable for other data sets, as well as the three aligned data sets. The relatively fast training allows straightforward tuning of [[:glossary#hyperparameters|hyperparameters]]. Our proposed pipeline can be used for any time series segmentation problem. The straightforward implementation allows a simple extension to a [[:glossary#multi-class_classification|multi-class classification]] problem and paves the way to include corotating interaction regions into the range of detectable phenomena within our pipeline. Furthermore, we hope to apply our model to similar problems in the future.
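
Treating detection as time series segmentation means the network labels every time step, and events are then read off as contiguous runs of positive labels. The following is a minimal sketch of that read-out step, assuming the model outputs one probability per time step; the function name, the 0.5 threshold and the example values are assumptions for illustration and are not taken from the published pipeline.

```python
def mask_to_events(probabilities, threshold=0.5):
    """Convert per-time-step probabilities into (start_index, end_index) events."""
    events = []
    start = None
    for i, p in enumerate(probabilities):
        inside = p >= threshold
        if inside and start is None:
            start = i                       # a new event begins here
        elif not inside and start is not None:
            events.append((start, i - 1))   # the event ended at the previous step
            start = None
    if start is not None:                   # event still open at the end of the series
        events.append((start, len(probabilities) - 1))
    return events


# Hypothetical model output for nine consecutive time steps.
example = [0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.1, 0.6, 0.9]
events = mask_to_events(example)
print(events)
```

Because start and end indices fall directly out of the per-step labels, events of very different duration are handled in the same way, which is one reason a segmentation formulation suits the high duration variability noted above.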
  
-Results of this science case were presented at the EGU21, at EPSC2021, at ESWW 2021, and at AGU21. This ML pipeline was presented in a workshop at EPSC2021 and is, together with a tutorial, available on our GitHub repository. A publication was submitted to the journal "Space Weather".
+Results of this science case were presented at the {{:wiki:esws2020-iwf_presentation.pdf|ESWS 2020}}, at {{:wiki:egu2021-ruedisser_etal.pdf|EGU21}}, at {{:wiki:esww2021-ruedisser_presentation.pdf|ESWW 2021}}, at {{:wiki:agu21_icme_ruedissser.pdf|AGU21}}, and at {{:wiki:mlhelio22_ruedisser_etal.pdf|ML-Helio 2022}}. This ML pipeline was presented in a [[https://github.com/epn-ml/EPSC2021-ICME-workshop|workshop at EPSC2021]] and is, together with a [[:tutorials_icme|tutorial]], available on our [[https://github.com/epn-ml/|GitHub repository]]. **A publication was submitted to and accepted by the journal "Space Weather".**
  
 **References:**
  
 * Nguyen, G., et al. (2019), Automatic Detection of Interplanetary Coronal Mass Ejections from In Situ Data: A Deep Learning Approach, Astrophys. J. 874, 145, doi:10.3847/1538-4357/ab0d24\\
-* Jha, D., et al. (2019), Resunet\+\+: An advanced architecture for medical image segmentation, arXiv e-prints, arXiv:1911.07067\\
+* Jha, D., et al. (2019), Resunet++: An advanced architecture for medical image segmentation, arXiv e-prints, arXiv:1911.07067\\
 * Ronneberger, O., et al. (2015), U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab N., Hornegger J., Wells W., Frangi A. (eds) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. MICCAI 2015. Lecture Notes in Computer Science, vol 9351. Springer, Cham. [[https://doi.org/10.1007/978-3-319-24574-4_28|https://doi.org/10.1007/978-3-319-24574-4_28]]
  
  
science_cases/iwf_science_case.1652789771.txt.gz · Last modified: 2022/05/17 14:16 by admin