Knowledge Discovery in Big Data from Astronomy and Earth Observation: Astrogeoinformatics


Size: 73 MB (76026654 bytes) Extension: pdf
Author(s): Petr Skoda (editor), Fathalrahman Adam (editor)

Publisher: Elsevier Science Ltd, Year: 2020

ISBN: 0128191546, 9780128191545

Knowledge Discovery in Big Data from Astronomy and Earth Observation: Astrogeoinformatics bridges the gap between astronomy and geoscience in the context of applications, techniques and key principles of big data. Machine learning and parallel computing are becoming increasingly cross-disciplinary as big data becomes commonplace. This book provides insight into the common workflows and data science tools used for big data in astronomy and geoscience. After establishing the similarities in data gathering, pre-processing and handling, it illustrates the data science aspects in the context of both fields. The software, hardware and algorithms of big data are also addressed.

Finally, the book offers insight into the emerging science that combines data and expertise from both fields to study the effect of the cosmos on the Earth and its inhabitants.

  • Addresses both astronomy and geosciences in parallel, from a big data perspective
  • Includes introductory information, key principles, applications and the latest techniques
  • Is well supported by computing- and information science-oriented chapters that introduce the necessary knowledge in these fields
Table of contents:
List of Contributors
A Word from the BIG-SKY-EARTH Chair
What’s in This Book?
Motivation and Scope
Part I Data
1 Methodologies for Knowledge Discovery Processes in Context of AstroGeoInformatics
1.1 Introduction
1.2 Knowledge Discovery Processes
1.3 Methodologies for Knowledge Discovery Processes
1.3.1 First Attempt to Generalize Steps – Research-Based Methodology
1.3.2 Industry-Based Standard – the Success of CRISP-DM
1.3.3 Proprietary Methodologies – Usage of Specific Tools
1.3.4 Methodologies in Big Data Context
1.4 Methodologies in Action
1.4.1 Standardization and Automation of Processes – Process Models
1.4.2 Understanding Each Other – Semantic Models: Example – EXPO; Example – OntoDM
1.4.3 Knowledge Discovery Processes in Astro/Geo Context: Process Modeling Aspects; Ontology-Related Aspects
2 Historical Background of Big Data in Astro and Geo Context
2.1 History of Big Data and Astronomy
2.1.1 Big Data Before Printing and the Computer Age
2.1.2 The Printing and Technological Renaissance Revolution
2.2 Big Data and Meteorology: a Long History
2.2.1 Early Meteorology
2.2.2 Birth of International Synoptic Meteorology
2.2.3 Next Step: Extension of Data Collection to the Entire Globe
Part II Information
3 AstroGeoInformatics: From Data Acquisition to Further Application
3.1 Introduction
3.2 Background
3.3 Remote Sensing
3.3.1 Passive Sensing
3.3.2 Active Sensing
3.4 Big Data in Astro- and Geoinformatics
3.5 From Data Acquisition to Applications
3.6 Galileo Applications
3.7 Galileo and Smart Cities
3.8 Conclusion
4 Synergy in Astronomy and Geosciences
4.1 Introduction
4.1.1 Basic Data Operations
4.1.2 Coordinate Transformations
4.1.3 Distance Measurements
4.1.4 One-Dimensional Series
4.2 State of the Art: VESPA Initiative of Bringing Together IVOA, IPDA (PDS), and OGC
4.2.1 Standards and Software
4.2.2 VESPA – Virtual Observatory for Planetary Science
4.3 Case Studies: Interoperability of Virtual Observatory and Geographical Information Systems
4.3.1 Geographical Data and Virtual Observatory
4.3.2 Astronomical Data and Geographical Information Systems
4.4 Perspectives and Possibilities
4.5 Conclusions
5 Surveys, Catalogues, Databases, and Archives of Astronomical Data
5.1 Introduction
5.2 From the First Star Photographic Catalogues to the Modern Digital Sky Surveys. Optical and Near-Infrared Astronomy
5.2.1 First Important Visual Surveys and Catalogues
5.2.2 Photographic Observations. Stellar and Extragalactic Surveys
5.2.3 Spectral Photographic Surveys
5.2.4 CCD Surveys
5.3 New Life of Old Astronomical Data
5.3.1 Digitization of Photographic Sky Surveys
5.3.2 Scientific Objectives for Old Data Involving
5.4 Multiwavelength Ground-Based and Space-Born Surveys, Archives, and Databases
5.4.1 Gamma Ray Astronomy
5.4.2 X-Ray Astronomy
5.4.3 Ultraviolet Astronomy
5.4.4 Mid- and Far-Infrared Astronomy
5.4.5 Submillimeter/Millimeter Astronomy
5.4.6 Centimeter/Meter/Decameter Radio Astronomy
5.5 Multiwavelength Data Archives
6 Surveys, Catalogues, Databases/Archives, and State-of-the-Art Methods for Geoscience Data Processing
6.1 Geospatial Surveying
6.1.1 Collecting Geospatial Data Through in Situ, Aerial, and Satellite Surveying
6.1.2 CEOS LPV Group & NASA Aeronet
6.1.3 OGC
6.1.4 International Standardization Organization
6.2 Geospatial Archives, Catalogs, and Databases
6.2.1 International Archives, Catalogs, and Databases: UNOOSA; CEOS; GEO; INSPIRE Directive; Copernicus; Galileo and EGNOS; JRC; ESA
6.2.2 National Geospatial Catalogues and Databases: CNES; CSA; CSIRO; DLR; INPE; ISRO; JAXA; NASA, USGS, NOAA; RADI; Roscosmos
6.2.3 Proprietary Geospatial Databases/Catalogues/Archives
6.3 Geoinformatics
6.3.1 Definition and Subject of Geoinformatics
6.3.2 Big Data in Geoinformatics – State of the Art and Prospects
6.3.3 Challenges with 4V (Volume, Variety, Velocity, and Value) of the Geospatial and EO Big Data
6.3.4 HPC Computing in Geoinformatics. Simulations, Visualization, and Animations: NASA Earth Observations – NEO; COPERNICUS Data and Information Access Services – DIAS; ESA Thematic Exploitation Platforms – TEPs
6.4 State-of-the-Art Methods for Hyperspectral Image Processing
6.4.1 Spectral Imaging Types
6.4.2 Hyperspectral Image Transformation Methods: Fourier Transform (FT); Short-Time Fourier Transform (STFT); Principal Component Analysis (PCA) and Karhunen-Loève Transform (KLT); Wavelet Transform (WT)
6.4.3 Hyperspectral Image Classification Methods: Image Fusion; Statistics-Based Techniques; Three-Dimensional Spatial-Spectral Methods; Learning Methods
6.4.4 Hyperspectral Image Denoising Methods: Classical Approaches; Penalty Methods; Linear Transformation Techniques
6.4.5 Dimensionality Reduction Methods for Hyperspectral Images
6.5 Conclusive Remarks
6.6 List of Abbreviations
Internet Sources
7 High-Performance Techniques for Big Data Processing
7.1 Introduction
7.2 Compute Architectures
7.2.1 Cache-Based Systems
7.2.2 Multicore Systems
7.2.3 Manycore Systems
7.2.4 Other Architectures
7.3 Distributed Systems
7.3.1 Clusters
7.3.2 Cloud Computing
7.4 Storage and Data Management
7.4.1 High-Performance Storage and I/O
7.4.2 Big Data Storage
7.5 Assessing Performance
7.5.1 Compute Metrics: Moore’s Law, Floating Point Performance, and Bandwidth
7.5.2 Data Metrics: the Five Vs
7.5.3 Benchmarking
7.5.4 Performance Modeling
7.5.5 Performance Analysis
7.6 High-Performance Data Analytics
7.7 Concluding Remarks on HPC, Big Data, and Their Convergence
8 Query Processing and Access Methods for Big Astro and Geo Databases
8.1 Big Data Management
8.2 Query Processing Steps
8.3 Access Methods
8.4 Query Optimization
8.5 From Big Data Management to Big Data Analytics and Vice Versa
8.5.1 Open Issues in the Intersection of Big Data Management and Big Data Analytics of Geo and Astro Data
8.6 Discussion and Outlook
9 Real-Time Stream Processing in Astronomy
9.1 Introduction
9.2 Event Processing Concepts
9.2.1 Tools and Differences
9.2.2 Esper Versus Drools
9.3 Application of Event Processing to Astronomy
9.3.1 Esper EPL
9.4 Conclusion
Part III Knowledge
10 Time Series
10.1 Introduction
10.2 Basics of Time Series
10.3 Early Agnostic Time-Domain Studies
10.4 Longer Time Series With Uniform Passbands
10.5 Features
10.5.1 Dimensionality Reduction
10.6 Period Finding
10.7 Early Characterization/Classification
10.8 Classification Using CNNs
10.9 RNNs, LSTMs, etc.
10.10 Availability of Libraries
10.11 Real Time Aspects
10.12 Topics not Adequately Covered
11 Advanced Time Series Analysis of Generally Irregularly Spaced Signals: Beyond the Oversimplified Methods
11.1 Introduction
11.2 Statistical Properties of the Functions of (Correlated) Parameters of the LS Fits
11.2.1 Least-Squares Method: Test Functions
11.2.2 Linear Least Squares
11.2.3 Influence of Deviations of Coefficients
11.2.4 Linear Approximation
11.2.5 Linearization
11.2.6 Statistical Properties of Functions of Coefficients
11.2.7 Accuracy of the Derivative and Moments of Crossings
11.3 Statistically Optimal Number of Parameters
11.3.1 “Esthetic” (User-Defined)
11.3.2 Analysis of Variance (ANOVA)
11.3.3 “Best Accuracy” and Related Estimates
11.4 Nonlinear LS Method and Differential Corrections
11.5 Nonunique Minimum of the Test Function
11.5.1 Bootstrap Method
11.5.2 Determination of Times of Minima/Maxima (ToM)
11.6 Periodogram Analysis: Parametric Versus Nonparametric Methods
11.6.1 From Time to Phase
11.6.2 “Parametric” (“Point-Curve”) Methods
11.6.3 “Nonparametric” (“Point-Point”) Methods
11.7 What Is the “Orthodox” Fourier Transform for Discrete Data?
11.7.1 LS-Based DFT
11.8 Periodogram Analysis of Signals With Aperiodic or Periodic Trends: When Detrending and Prewhitening Lead to Generally Wrong Results
11.9 Analysis of Multiperiodic, Multiharmonic, and Multishift Signals
11.10 Running Approximations
11.10.1 General Expressions
11.10.2 Running Approximations and Scalegram Analysis for Irregularly Spaced Data
11.10.3 Running Sines
11.10.4 Wavelet Analysis
11.11 Moments of Characteristic Points (O-C Analysis)
11.11.1 Period Determination
11.11.2 Period Changes
11.12 Autocorrelation and Cross-Correlation Analysis
11.12.1 Continuous and Discrete Regular Signals
11.12.2 Bias of ACF due to Trend
11.12.3 Irregularly Spaced Signals
11.13 Principal Component Analysis and Related Methods
11.13.1 Principal Component Analysis: Multichannel Signals
11.13.2 Case of Noisy Channels of Simultaneous Signals
11.13.3 Singular Spectrum Analysis (SSA)
11.13.4 Effective Amplitudes of Low, Fast, and Noise Variability
11.14 Conclusions
12 Learning in Big Data: Introduction to Machine Learning
12.1 Brief History of Machine Learning
12.1.1 Introduction
12.1.2 The Modern History of Artificial Intelligence and ML
12.1.3 What Is Learning?
12.1.4 Why Use Machine Learning Instead of Traditional Statistics?
12.2 Types of Learning
12.2.1 Supervised Learning
12.2.2 Unsupervised Learning
12.2.3 Semisupervised Learning
12.2.4 Reinforcement Learning
12.2.5 Active Learning
12.3 Machine Learning Algorithms
12.3.1 Naive Bayes Classifier
12.3.2 k-Nearest Neighbors
12.3.3 Support Vector Machine
12.3.4 Random Forest
12.3.5 Artificial Neural Network
12.3.6 Multilayer Perceptron
12.3.7 Dimensionality Reduction
12.4 Machine Learning in Astronomy and Geosciences
12.4.1 Case Studies in Astronomy: Object Classification; Star Galaxy Classification; Galaxy Morphology; Photometric Redshift; Data Mining Software and Tools
12.4.2 Case Studies in Geoscience
12.4.3 Simple Case Study in Geology: Supervised Classification of Lithology
12.4.4 Common Properties
12.5 Scalable Machine Learning Algorithms
12.5.1 What Is a Scalable Machine Learning Algorithm?
12.5.2 Scalable Clustering: Hierarchical Methods; Density-Based Methods; Grid-Based Methods
12.5.3 Scalable Prediction: Classification and Regression
12.5.4 Scalable Pattern Mining
12.6 Scalable ML Frameworks
12.6.1 Apache Spark: Components of Apache Spark
12.6.2 Flink ML
12.7 Inference and Learning in Astronomy and Geosciences
12.8 Summary
13 Deep Learning – an Opportunity and a Challenge for Geo- and Astrophysics
13.1 Introduction
13.2 The Difference Between Shallow Learning and Deep Learning
13.3 Why Is Deep Learning a Good Fit for the Data Science Problems in Astro- and Geophysics
13.4 Deep Learning Models
13.4.1 Convolutional Neural Networks
13.4.2 Recurrent Neural Networks
13.4.3 Generative Models: Variational Autoencoders; Generative Adversarial Networks
14 Astro- and Geoinformatics – Visually Guided Classification of Time Series Data
14.1 Introduction
14.2 The MESSENGER Data
14.3 System Architecture
14.3.1 Scalability
14.3.2 Indexing Time Series With Apache Lucene
14.3.3 Related Pattern Search in Signal Data
14.4 Visual Interface
14.4.1 Large-Scale Signal Visualization
14.4.2 Finding Related Signal Patterns
14.4.3 Signal Annotation
14.5 Time Series Preprocessing
14.5.1 Normalization of Time Series
14.5.2 Stationarity of Time Series
14.5.3 Trends in Time Series
14.5.4 Periodicity and Seasonality in Time Series
14.6 Time Series Representation
14.6.1 PAA
14.6.2 SAX
14.6.3 Piecewise Linear Representation
14.6.4 Windowed Approach to Time Series
14.7 Time Series Similarity
14.8 Pattern Mining in Time Series
14.8.1 Outlier Detection in Time Series
14.8.2 Frequent Patterns
14.8.3 Surprising Patterns
14.9 Time Series Modeling and Classification
14.9.1 Time Series Forecasting
14.9.2 Classification and Clustering of Unsegmented Time Series
14.9.3 Classification on Segmented Time Series
14.10 Conclusions
15 When Evolutionary Computing Meets Astro- and Geoinformatics
15.1 Introduction
15.2 The Optimization Problem
15.2.1 Standard Formulation
15.2.2 Types of Optimization Problems: The Number of Decision Makers; The Type of the Decision Variables; The Number of Constraints; The Number of Objective Functions; The Linearity; The Uncertainty Tied to the Optimization Model
15.2.3 The Multiobjective Optimization Problem
15.3 Evolutionary Computation
15.3.1 Basic Structure of an Evolutionary Algorithm
15.3.2 Evolution Operators: Selection; Cross-Over; Mutation
15.4 Evolutionary Computing Metaheuristics
15.4.1 Genetic Algorithms
15.4.2 Evolutionary Strategy
15.4.3 Evolutionary Programming
15.4.4 Genetic Programming
15.4.5 Other Evolutionary Algorithms and Bio-Inspired Approaches: Differential Evolution; Coevolutionary Algorithms; Swarm Intelligence; Artificial Immune Systems
15.5 Parallel Evolutionary Computing Metaheuristics for Big Data
15.6 Practical Applications of Evolutionary Computing Metaheuristics in the Context of Astro- and Geoinformatics
15.7 Discussion and Conclusions
Part IV Wisdom
16 Multiwavelength Extragalactic Surveys: Examples of Data Mining
16.1 Introduction
16.2 The Automated Morphological Classification for the SDSS Galaxies
16.3 Zone of Avoidance of the Milky Way
16.4 Flux Variability of the Blazar 3C 454.3
17 Applications of Big Data in Astronomy and Geosciences: Algorithms for Photographic Images Processing and Error Elimination
17.1 Flatbed Scanners as Digitizers for Astronomic Photographic Material
17.2 Algorithm of Correction for Scanner Errors
17.3 Big Photographic Data and Their Errors
18 Big Astronomical Datasets and Discovery of New Celestial Bodies in the Solar System in Automated Mode by the CoLiTec Software
18.1 Introduction
18.2 Big Astronomical Data Processing
18.3 Summary
19 Big Data for the Magnetic Field Variations in Solar-Terrestrial Physics and Their Wavelet Analysis
19.1 Introduction to Big Magnetic Data in Solar-Terrestrial Physics
19.2 Mechanism of Generating Strong Geomagnetic Storms (Long-Period Geomagnetic Field Variations)
19.2.1 The Big Picture of Solar-Terrestrial Physics – Quiet and Disturbed Geomagnetic Phenomena
19.2.2 Geomagnetic Storms
19.2.3 Ground Geomagnetic Field, and Geomagnetic Activity Index During a Storm: The Dst Index During the 2003 Storm
19.2.4 Ionospheric Parameters From Ionospheric Sounding Stations: Data About the Parameters of the Ionospheric Plasma
19.2.5 Emergence of Higher-Frequency Modes in the Ionospheric Parameters and in the IMF Which Are Related to the Ground Geomagnetic Field Variations
19.2.6 The Strong Geomagnetic Storms in 2003 and 2017 to Be Analyzed: The Storm in 2003; The Storm on 7 and 8 September 2017
19.2.7 Acquired Data for Short-Period Variations of the Geomagnetic Field, the Ionospheric Parameters, and the IMF
19.2.8 Data About the Strongly Disturbed Geomagnetic Field in October and November 2003 and September 2017: The H Component of the Geomagnetic Data From the Panagyurishte (PAG) Observatory; The DS Index From the Surlary (SUA) Geomagnetic Data
19.3 Experiments With Wavelet Analysis and Conclusions
19.3.1 References on Applications of Wavelet Analysis to Geomagnetism
19.3.2 Experiments With Data on a Quiet Day, 28 July 2018: Visualization of the Wavelet Analysis; Experiments With Balchik Geomagnetic Data, 28 July 2018, 1 Second Data; Experiment With SUA Geomagnetic Data on 28 July 2018
19.3.3 Experiments With Data for the Geomagnetic Storm on 7 and 8 September 2017: The Dst for the Period 7-10 September 2017; Experiments With Geomagnetic Data From PAG, 7-10 September 2017, 1 min Data; Experiments With Ionospheric Data From Athens, 7-10 September 2017, 5 min Data; Experiments With IMF Data From ACE Satellite, 7-10 September 2017, 4 min Data
19.3.4 Experiments With Data for the 2003 Strong Geomagnetic Storm: Experiments With Geomagnetic Data From SUA, on 28 and 29 October 2003; Experiments With Spline Smoothing of Dst; Experiments With IMF Data, on 28 and 29 October 2003, 4 min Data
19.4 Conclusions
19.A Wavelet Analysis and Its Applications to Geomagnetic Data
19.A.1 Technical Stuff
19.A.2 CWT of Some Simple Functions
20 International Database of Neutron Monitor Measurements: Development and Applications
20.1 Introduction
20.2 The Neutron Monitor Database (NMDB)
20.2.1 The Need for NMDB
20.2.2 The NMDB Database
20.2.3 Data Contribution and Dissemination
20.3 Applications
20.3.1 Ground Level Enhancements (GLEs) – Detection and Characterization
20.3.2 Evaluation of the Radiation Effects on Electronics and Health
20.3.3 Space Weather Nowcast and Forecast
20.4 Summary and Outlook
21 Monitoring the Earth Ionosphere by Listening to GPS Satellites
21.1 Introduction
21.2 The Determination and Procedure Transformation of the Ionosphere Parameters With GNSS Observations
21.3 Recovery of the Spatial State of the Ionosphere
21.3.1 Restrictions and Assumptions for Use of the GNSS Measurements to Restore the Ionization Field
21.3.2 Description of the Method for Determining Ionization Using STEC
21.3.3 Results of the Experimental Restoration of the Changes in the Atmosphere Ionization
21.3.4 Algorithm of the Ionization Field Change Restoration Using the Approximation of the Change in Time of the Coefficients of the Polynomial From Numerous Arguments
21.4 Conclusion
22 Exploitation of Big Real-Time GNSS Databases for Weather Prediction
22.1 Introduction
22.2 Influence of Neutral Atmosphere on Results of Range Finding Observations of Artificial Satellites
22.3 Taking Atmospheric Delay Into Account Using Modern Satellite Technology for Coordinate Support in Real-Time
22.4 Use of Modern Satellite Technology in Meteorology
22.5 Conclusions
23 Application of Databases Collected in Ionospheric Observations by VLF/LF Radio Signals
23.1 Introduction
23.2 Experimental Setup and Observations
23.2.1 Global Experimental Setups
23.2.2 Example of VLF/LF Receiver and Collected Data
23.3 Application of Databases in Detections of Astrophysical and Geophysical Events
23.3.1 Sources of the Low Ionospheric Perturbations
23.3.2 Detections of the Low Ionospheric Perturbations: Time-Domain Analyses; Frequency-Domain Analyses
23.4 Application of Databases in Modeling Low Ionospheric Plasma Parameters
23.4.1 Modeling of Low Ionospheric Plasma Parameters
23.4.2 Example of VLF/LF Database Application in Modeling: Numerical Determination of Wait’s Parameters; Analytical Calculation of Electron Density
23.5 Practical Applications
23.5.1 Natural Disasters: Earthquakes; Cyclones
23.5.2 Telecommunication
23.6 Summary
24 Influence on Life Applications of a Federated Astro-Geo Database
24.1 Introduction
24.1.1 History of the Influence of Geophysical Parameters on Health
24.2 Meteorology and Climate (Temperature, Humidity, etc.) and Application to Disease Propagation
24.3 Influence of Extraterrestrial Sources: Solar Activity, Galactic Cosmic Rays, and Geomagnetism
24.4 Solar UV and Life
24.5 Applications to Agriculture
24.6 Conclusions: Future of Space Observations, Biodiversity, and Astrobiology
