
Supervised Hyperion Clay Minerals Mapping using Random Forest and Support Vector Machine

Hyperspectral clay study in Sidoarjo Mud Volcano

Universitas Gadjah Mada, Bachelor Thesis, 2024

Background

The Sidoarjo mud volcano has been closely studied since its 2006 eruption, which started a flow of hot mud that continues today. For more than 18 years, the disaster has significantly affected the surrounding areas, both environmentally and economically. The Sidoarjo mud is known for clay minerals rich in rare earth elements (REEs). The presence of REEs in these clay minerals has attracted scientific interest because of their potential applications in high-tech industries such as electronics, renewable energy, and advanced materials development.

In Indonesia, mineral mapping commonly relies on multispectral images such as ASTER and Landsat, using VNIR-SWIR color composites or band ratios. However, multispectral imaging is limited in the number of bands and the spectral range it can cover, especially in the SWIR range, where the distinctive absorption features of clay minerals are typically found. Hyperspectral imaging addresses these limitations with a significantly larger number of bands and narrower bandwidths.

Data and Method

Data

This project uses EO-1 Hyperion hyperspectral data acquired in 2015 and downloaded from USGS Earth Explorer. The data were downloaded in L1R format, which means Level 1 Radiometrically Corrected. Additionally, XRD data from the field were analyzed.

Method

Unlike multispectral data, hyperspectral data are typically processed using dimensionality reduction methods. Kruse (2003) proposed a workflow named “Hourglass” that chains several processing techniques, including Minimum Noise Fraction (MNF), Pixel Purity Index (PPI), and the n-D Visualizer. In this project, the resulting endmembers are then mapped spatially with the Random Forest algorithm.

The script was run in an Anaconda environment using Jupyter Notebook. The code can be found at the end of this storymap.

This method aims to identify the spectrally purest pixels (endmembers) by analyzing the shapes and absorption features of their spectra and comparing them, at the pixel level, with reference spectra from the USGS Spectral Library.

MNF

The analysis begins with the MNF (Minimum Noise Fraction) transformation in ENVI, using the atmospherically corrected image as input. In this step, the 163 Hyperion bands are transformed into MNF bands ordered by noise: the first bands have the least noise and the later bands have the most.

MNF results (the first 12 bands)

Spatially, the MNF transformation results show that the initial MNF bands (1-8) clearly depict the Sidoarjo mud puddle with minimal noise. In contrast, the later MNF bands (9-12) have high noise levels, making it difficult to recognize the mud puddle. This trend continues for bands beyond 12. Therefore, the MNF transformation effectively filters the 163 bands down to just 8 bands with the best image quality and least noise.
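ENVI performs the MNF transformation internally, so the following is only a rough NumPy sketch of the idea: estimate the noise, whiten it, then run PCA on the noise-whitened data so that the components come out ordered from least to most noisy. The function name `mnf`, the noise estimate from differences between horizontally adjacent pixels, and the synthetic cube are all assumptions made for illustration; this is not the ENVI implementation or the thesis code.

```python
import numpy as np

def mnf(cube, n_components=8):
    """Sketch of an MNF transform for a (rows, cols, bands) cube."""
    cube = np.asarray(cube, dtype=float)
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands)
    pixels = pixels - pixels.mean(axis=0)

    # Assumed noise estimate: difference between horizontally adjacent pixels.
    noise = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, bands)
    cov_noise = np.cov(noise, rowvar=False) / 2.0
    cov_data = np.cov(pixels, rowvar=False)

    # Step 1: whiten the noise so its covariance becomes the identity.
    noise_vals, noise_vecs = np.linalg.eigh(cov_noise)
    noise_vals = np.clip(noise_vals, 1e-12, None)
    whitener = noise_vecs / np.sqrt(noise_vals)

    # Step 2: PCA on the noise-whitened data, highest signal-to-noise first.
    cov_whitened = whitener.T @ cov_data @ whitener
    vals, vecs = np.linalg.eigh(cov_whitened)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    transform = whitener @ vecs[:, :n_components]
    return (pixels @ transform).reshape(rows, cols, n_components), vals

# Synthetic stand-in for an atmospherically corrected image (not real data).
rng = np.random.default_rng(0)
rows, cols, bands = 20, 20, 10
gradient = np.linspace(0.0, 1.0, cols)[None, :, None]
spectrum = rng.uniform(0.2, 0.8, size=(1, 1, bands))
cube = gradient * spectrum + rng.normal(scale=0.01, size=(rows, cols, bands))

mnf_bands, eigenvalues = mnf(cube, n_components=3)
print(mnf_bands.shape)
print(eigenvalues[:3])
```

The eigenvalues are returned in descending order, so a drop toward 1 marks the point where components carry mostly noise, which is the numerical counterpart of the visual inspection described above.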

PPI and n-D Visualizer

At this stage, pure pixels are extracted with the Pixel Purity Index (PPI) and visualized in n-dimensional space, where they form dense clusters. Pixels in the same cluster share similar spectral characteristics, so each cluster can be delineated and identified as an endmember. There are 8 endmembers in the study area.

Left: n-D Visualizer, illustrating the distribution of pixels with similar values (endmembers). Right: these pixels mapped spatially on the Lapindo mud volcano.
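The idea behind PPI can be sketched in a few lines: project every pixel onto many random directions ("skewers") and count how often each pixel lands at an extreme. Pure pixels sit at the corners of the data cloud, so they collect the counts. The function name `pixel_purity_index`, the number of skewers, and the synthetic three-endmember data below are assumptions for illustration, not the ENVI tool or its parameters.

```python
import numpy as np

def pixel_purity_index(pixels, n_skewers=1000, seed=0):
    """Count how often each pixel is an extreme along random directions."""
    rng = np.random.default_rng(seed)
    n_pixels, n_bands = pixels.shape
    skewers = rng.normal(size=(n_skewers, n_bands))
    skewers /= np.linalg.norm(skewers, axis=1, keepdims=True)

    projections = pixels @ skewers.T          # shape (n_pixels, n_skewers)
    counts = np.zeros(n_pixels, dtype=int)
    np.add.at(counts, projections.argmax(axis=0), 1)
    np.add.at(counts, projections.argmin(axis=0), 1)
    return counts

# Synthetic data: 3 hypothetical endmembers in an 8-band MNF-like space,
# followed by 200 mixed pixels that are convex combinations of them.
rng = np.random.default_rng(1)
endmembers = rng.normal(size=(3, 8))
weights = rng.dirichlet(np.ones(3), size=200)
pixels = np.vstack([endmembers, weights @ endmembers])

counts = pixel_purity_index(pixels, n_skewers=500)
print(counts[:3])        # counts for the three pure pixels
print(counts[3:].sum())  # total count collected by the mixed pixels
```

In this toy case the mixed pixels lie strictly inside the triangle spanned by the endmembers, so only the three pure pixels can be extreme. Real images are noisier, which is why the high-count pixels are inspected visually in n-dimensional space before endmembers are chosen.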

Spectra Matching (Training data)

Since Random Forest is a supervised classification method, the input data must be labeled. Therefore, the Spectral Analyst tool is used to identify the mineral types to be used as labels by matching the endmember spectra with reference spectra from the USGS Spectral Library. For example, endmember 2 has three water absorption regions and an Al-OH (metal hydroxyl) absorption at a wavelength of 2203 nm, which matches the spectral characteristics of montmorillonite.

Endmember 2 has a similar spectrum to montmorillonite
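To make the matching step concrete, here is a minimal sketch that scores an endmember against reference spectra using the spectral angle, one common similarity measure for this kind of comparison. The post does not state which scoring methods were used in Spectral Analyst, so treat the choice of measure as an assumption. The spectra are synthetic Gaussian absorption dips rather than USGS library data, and the names `reference_A_2203nm` and `reference_B_2300nm` are hypothetical.

```python
import numpy as np

def spectral_angle(a, b):
    """Angle in radians between two spectra; smaller means more similar."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cosine, -1.0, 1.0)))

def synthetic_spectrum(wavelengths, center_nm, depth=0.3, width_nm=20.0):
    """Flat spectrum with a single Gaussian absorption dip (illustrative)."""
    dip = np.exp(-0.5 * ((wavelengths - center_nm) / width_nm) ** 2)
    return 1.0 - depth * dip

wavelengths = np.arange(2000.0, 2400.0, 10.0)  # SWIR range in nm

# Hypothetical reference library (synthetic, not real mineral spectra).
library = {
    "reference_A_2203nm": synthetic_spectrum(wavelengths, 2203.0),
    "reference_B_2300nm": synthetic_spectrum(wavelengths, 2300.0),
}

# Synthetic endmember: darker overall, with a shallower dip at 2203 nm.
endmember = 0.8 * synthetic_spectrum(wavelengths, 2203.0, depth=0.25)

angles = {name: spectral_angle(endmember, ref) for name, ref in library.items()}
best_match = min(angles, key=angles.get)
print(best_match)
```

The spectral angle ignores overall brightness, so the match is driven by the position and shape of the absorption feature rather than by how bright the pixel is.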

The same process was applied to all 8 endmember spectra. The labeled pixels were then used to train the Random Forest model, which was used to map the minerals across the image.
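The training and mapping step might look roughly like the scikit-learn sketch below. The synthetic features, the three classes, the train/test split, and the hyperparameters are all assumptions for illustration; the actual thesis notebook, feature set, and settings may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for labeled pixels: 3 hypothetical mineral classes,
# each a noisy cluster in an 8-feature space.
rng = np.random.default_rng(0)
n_classes, n_features = 3, 8
class_centers = rng.normal(scale=5.0, size=(n_classes, n_features))
labels = rng.integers(0, n_classes, size=600)
features = class_centers[labels] + rng.normal(size=(600, n_features))

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, stratify=labels, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

# Classify every pixel of a (rows, cols, features) image to build the map.
image = class_centers[0] + rng.normal(size=(10, 12, n_features))
class_map = model.predict(image.reshape(-1, n_features)).reshape(10, 12)

# Relative contribution of each feature to the trained model.
importances = model.feature_importances_
print(test_accuracy, class_map.shape)
```

Flattening the image to one row per pixel and reshaping the predictions back is what turns a per-pixel classifier into a map.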

Results

The resulting maps appear to make geological sense, showing that the mineral distribution is spread in a circular pattern. The composition is predominantly a mixture of kaolinite and smectite, covering 46.37% of the study area.

Image comparison: before classification and after classification

The Random Forest algorithm classified the Hyperion hyperspectral image with an overall accuracy (OA) of 99.4% and a kappa score of 0.993, slightly higher than the SVM algorithm’s 98.9% OA and 0.987 kappa score. The errors came from the misclassification of montmorillonite. The H2O absorption regions and the Al-OH (metal hydroxyl) absorption played the most significant roles in building the model.

Classification results: Random Forest and SVM
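For readers unfamiliar with the two scores reported above, overall accuracy is the share of correctly classified samples, and Cohen's kappa corrects that share for agreement expected by chance. The sketch below computes both from a confusion matrix. The matrix is made up for illustration and is not the confusion matrix from this study; the function name is likewise hypothetical.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are reference classes and columns are predicted classes.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total

    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected if labels were assigned by chance.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Made-up 3-class confusion matrix (not the study's results).
confusion = [
    [48, 2, 0],
    [1, 49, 0],
    [0, 0, 50],
]

oa, kappa = overall_accuracy_and_kappa(confusion)
print(round(oa, 4), round(kappa, 4))  # 0.98 0.97
```

Here 147 of 150 samples are correct (OA = 0.98), chance agreement is 1/3, and kappa = (0.98 - 1/3) / (1 - 1/3) = 0.97.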
This post is licensed under CC BY 4.0 by the author.