Innovative Methodology for Geospatial Insights

Orbify is a SaaS platform that uses AI to automate the analysis of satellite imagery, streamlining the measurement of environmental impact and the planning and monitoring of climate-positive projects.

The methodology is carried out using a geospatial data platform (GDP) developed by Orbify, Inc. (USA). The Orbify platform uses three types of satellite data in its analysis: optical imagery, LiDAR, and SAR. Machine learning regression and classification models interpret the data, drawing on each of the sensors described below whenever it is a relevant input signal for the indicator being calculated.

Optical

Optical imagery comes from multiple providers, primarily the Copernicus Sentinel-2 mission, launched by the European Space Agency, and the joint NASA/USGS Landsat program. As needed, it is supplemented with data from commercial providers such as Axelspace or Planet to ensure a revisit time and spatial resolution appropriate for the specific project. All of these satellites provide imagery in at least four basic bands: visible red, green, and blue, plus near-infrared (NIR). The Sentinel-2 mission offers spatial resolution down to 10 m and 13 spectral bands, which include, besides visible and near-infrared, bands such as vegetation red edge and short-wave infrared.

Image source: Orbify

LiDAR

LiDAR (Light Detection and Ranging) is used to measure forest canopy height, canopy vertical structure, and surface elevation. The data comes from NASA’s GEDI (Global Ecosystem Dynamics Investigation) mission, whose instrument is attached to the International Space Station.

Image source: GEDI L1-L2 Data Resources - NASA

SAR

SAR (Synthetic Aperture Radar) imagery comes from multiple providers, primarily the European Space Agency’s Sentinel-1 mission, which operates in the C-band and provides 10 m spatial resolution and 12-day temporal resolution. It is supplemented by commercial imagery as needed.

Image source: Orbify

Biomass

Orbify is developing a proprietary algorithm for more accurate Above Ground Biomass (AGB) estimation. It utilizes data from the GEDI mission and Sentinel imagery to create a predictive model for forest monitoring.

AGB estimation is based on data acquired by the GEDI mission, which was operational from 2019 to 2023. GEDI's primary objective was to scan the Earth's surface using LiDAR. One of the products generated by the mission is AGB data, provided as point-based measurements in Mg/ha.

The points derived from the GEDI LiDAR measurements serve as the training dataset for our machine learning model. To ensure that the training data includes only points from forested areas, we used the UK government's woodland delimitation dataset for 2020 and acquired Sentinel-2 optical imagery and Sentinel-1 SAR imagery for the same year. We masked the Sentinel-2 and Sentinel-1 imagery with the woodland delimitation data so that only forested regions remained, then selected the GEDI points that fall within those areas. The result is a set of points located in forests, each paired with its corresponding Sentinel pixels, which is used to train the AGB prediction model.
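The masking and point-matching steps described above can be sketched in a few lines of plain Python. This is an illustration only, not Orbify's implementation: tiny in-memory grids stand in for real rasters, GEDI points are assumed to be already converted to pixel coordinates, and every name and value (mask_raster, build_training_set, s2_red, s1_vv) is hypothetical.

```python
# Illustrative sketch: small 2D lists stand in for co-registered rasters.
# All function names, band names, and values are assumptions.

def mask_raster(raster, woodland_mask, nodata=None):
    """Keep pixel values where the woodland mask is 1; set the rest to nodata."""
    return [
        [value if keep else nodata for value, keep in zip(row, mask_row)]
        for row, mask_row in zip(raster, woodland_mask)
    ]


def build_training_set(points, masked_bands):
    """Pair each GEDI AGB point with the band values of the pixel it falls in.

    points: list of (row, col, agb_mg_per_ha) in pixel coordinates.
    masked_bands: dict mapping band name to a masked 2D grid.
    Points that fall on masked (non-woodland) pixels are dropped.
    """
    samples = []
    for row, col, agb in points:
        features = {name: grid[row][col] for name, grid in masked_bands.items()}
        if any(value is None for value in features.values()):
            continue  # point lies outside the woodland mask
        samples.append((features, agb))
    return samples


woodland = [[1, 0], [1, 1]]                     # 1 = woodland, 0 = other
s2_red = [[0.05, 0.20], [0.04, 0.06]]           # made-up optical reflectance
s1_vv = [[-8.0, -12.0], [-7.5, -9.0]]           # made-up SAR backscatter (dB)

bands = {
    "s2_red": mask_raster(s2_red, woodland),
    "s1_vv": mask_raster(s1_vv, woodland),
}
gedi_points = [(0, 0, 120.0), (0, 1, 15.0), (1, 1, 95.0)]
training = build_training_set(gedi_points, bands)
```

In this toy example the second GEDI point falls on a non-woodland pixel and is excluded, so two training samples remain.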

A machine learning model is trained on these points. Once trained, it can predict AGB for any area for which Sentinel-2 and Sentinel-1 imagery is available, including for other years. This lets us estimate the AGB of forested regions over time from remote sensing data.

In summary, our evidence-based approach combines GEDI LiDAR point data, Sentinel-2 and Sentinel-1 imagery, and a random forest machine learning model to estimate AGB in forested areas. This integrated methodology holds promise for ecological and environmental studies, aiding forest monitoring, carbon accounting, and related applications.

Forest/Non-Forest

The Forest/Non-Forest model is an advanced tool for land cover classification and forest monitoring, combining data from Sentinel-1, Sentinel-2, and GEDI LiDAR for accurate change detection.

Feature Space Creation and Labelling

Satellite and GEDI data are temporally aligned to create a rich feature space, and labelling uses predefined categories like tree, urban, and water for accurate training data.

The feature space is created by aligning the satellite and GEDI data temporally, so that the satellite features correspond to the same time periods as the GEDI measurements. The result combines spectral and textural information from satellite imagery with structural information from GEDI data.
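One simple way to picture temporal alignment is to match each GEDI shot to the closest satellite acquisition and discard shots with no acquisition nearby. The sketch below assumes a nearest-date rule and a 30-day tolerance; neither is stated in the text, and the function name nearest_acquisition is hypothetical.

```python
from datetime import date


def nearest_acquisition(shot_date, acquisition_dates, max_gap_days=30):
    """Return the acquisition date closest to a GEDI shot, or None if too far.

    max_gap_days is an assumed tolerance, not a documented parameter.
    """
    if not acquisition_dates:
        return None
    best = min(acquisition_dates, key=lambda d: abs((d - shot_date).days))
    if abs((best - shot_date).days) > max_gap_days:
        return None
    return best


# Made-up dates for illustration.
acquisitions = [date(2020, 6, 1), date(2020, 6, 11), date(2020, 7, 21)]
shots = [date(2020, 6, 9), date(2020, 9, 15)]
aligned = {shot: nearest_acquisition(shot, acquisitions) for shot in shots}
```

Here the June shot is paired with the 11 June acquisition, while the September shot has no acquisition within the tolerance and is left unmatched.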

The labelling process uses a predefined set of categories: tree, urban, water, fieldland, and baresoil. Each category is defined by combining GEDI metrics with ancillary data, so that a label depends on more than one criterion. For example, the "tree" class is identified using both height information (rh98 ≥ 7 m) and tree cover percentage (≥ 40%). This multi-criteria approach produces more reliable and accurate training data for the model.
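The "tree" rule quoted above translates directly into a two-condition check. The thresholds come from the text; the function names are hypothetical, and the rules for the other classes are not given in the source, so this sketch leaves them out and returns None for anything that is not a tree.

```python
TREE_MIN_RH98_M = 7.0       # from the text: rh98 >= 7 m
TREE_MIN_COVER_PCT = 40.0   # from the text: tree cover >= 40%


def is_tree(rh98_m, tree_cover_pct):
    """True only when both the height and the tree cover criteria hold."""
    return rh98_m >= TREE_MIN_RH98_M and tree_cover_pct >= TREE_MIN_COVER_PCT


def label_sample(rh98_m, tree_cover_pct):
    """Return "tree" when both criteria hold, otherwise None.

    None means "not labelled here": the criteria for urban, water,
    fieldland, and baresoil are not specified in the source.
    """
    return "tree" if is_tree(rh98_m, tree_cover_pct) else None
```

For example, label_sample(12.0, 65.0) yields "tree", whereas a tall but sparse sample such as label_sample(12.0, 20.0) does not, because only one of the two criteria is met.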

Model Training and Evaluation

The model is built using a Random Forest classifier, with oversampling for class balance and dimensionality reduction techniques for performance validation.

At the heart of this approach is a Random Forest classifier, implemented with Google Earth Engine's smileRandomForest algorithm. To address potential class imbalance, particularly the likely underrepresentation of forested areas, the approach oversamples the tree class. This reduces the model's bias toward classes that occur more often in the training data.
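The text does not say which oversampling technique is used. As a minimal sketch, assuming plain random duplication of tree samples until the class matches the largest class, the idea looks like this in standalone Python (the actual pipeline runs in Google Earth Engine, and oversample_class is a hypothetical helper):

```python
import random
from collections import Counter


def oversample_class(samples, target_label, target_count, seed=0):
    """Duplicate random samples of one class until it reaches target_count.

    samples: list of (features, label) pairs.
    Returns a new list; the input is left unchanged.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    pool = [s for s in samples if s[1] == target_label]
    if not pool or len(pool) >= target_count:
        return list(samples)
    extra = [rng.choice(pool) for _ in range(target_count - len(pool))]
    return list(samples) + extra


# Made-up, imbalanced training set: 2 tree samples, 6 fieldland samples.
samples = [([0.1], "tree"), ([0.2], "tree")] + [([0.9], "fieldland")] * 6
largest = max(Counter(label for _, label in samples).values())
balanced = oversample_class(samples, "tree", largest)
counts = Counter(label for _, label in balanced)
```

After oversampling, both classes contain six samples. Duplicated samples add no new information, so this only rebalances the weight each class carries during training.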

The model is trained on a carefully selected set of input properties, including spectral bands, derived indices, and texture features. This comprehensive feature set allows the model to capture various aspects of land cover characteristics.
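The text does not list which derived indices are used. As an illustration of what a derived index is, the sketch below computes NDVI, a widely used vegetation index, from the red and near-infrared bands and appends it to a pixel's feature set. The choice of NDVI, the band names, and the add_indices helper are all assumptions.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red)."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: avoid division by zero on empty pixels
    return (nir - red) / total


def add_indices(pixel):
    """Return a copy of a pixel's band dict with NDVI added as a feature."""
    features = dict(pixel)
    features["ndvi"] = ndvi(pixel["nir"], pixel["red"])
    return features


pixel = {"red": 0.05, "nir": 0.45}   # made-up reflectance values
features = add_indices(pixel)
```

Dense, healthy vegetation reflects strongly in the near-infrared and weakly in red, so it produces NDVI values close to 1, as in this example.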

Evaluation of the model involves generating various metrics to assess its performance. Additionally, dimensionality reduction techniques such as Principal Component Analysis (PCA) and t-distributed Stochastic Neighbour Embedding (t-SNE) are implemented to validate the distinction between classes and understand how the feature space behaves. Feature importance is calculated from the trained Random Forest, providing insights into which features contribute most significantly to the classification process.
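The specific metrics are not named in the text. Overall accuracy and per-class precision and recall are common choices for a classifier of this kind, and a minimal sketch of how they could be computed from true and predicted labels follows. The metric selection and all function names are assumptions.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class, from (true, predicted) pair counts."""
    pairs = Counter(zip(y_true, y_pred))
    tp = pairs[(label, label)]
    fp = sum(n for (t, p), n in pairs.items() if p == label and t != label)
    fn = sum(n for (t, p), n in pairs.items() if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labels for illustration.
y_true = ["tree", "tree", "water", "urban", "tree", "water"]
y_pred = ["tree", "water", "water", "urban", "tree", "tree"]

overall = accuracy(y_true, y_pred)
tree_precision, tree_recall = precision_recall(y_true, y_pred, "tree")
```

Per-class figures matter here because overall accuracy can look good even when a rare class such as tree is frequently misclassified.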

Prediction and Change Detection

For generating predictions, the trained classifier is applied to new satellite imagery. This process produces binary forest cover maps, effectively distinguishing forested areas from non-forested areas based on the learned patterns from the training data.

Visualisation

To communicate results effectively, the approach includes a sophisticated visualization step. It creates color-coded maps showing areas of maintained forest, forest loss, and forest gain. These visualizations are prepared as named layers with associated legends, making them readily interpretable for end-users.
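The three map categories can be derived by comparing two binary forest cover maps pixel by pixel. The following sketch assumes two co-registered grids of 0/1 values for an earlier and a later date; the legend colors and all names are hypothetical and are not taken from the platform.

```python
MAINTAINED = "maintained forest"
LOSS = "forest loss"
GAIN = "forest gain"
NON_FOREST = "non-forest"

# Hypothetical legend colors, chosen only for illustration.
LEGEND = {
    MAINTAINED: "#1b7837",
    LOSS: "#d73027",
    GAIN: "#4575b4",
    NON_FOREST: "#f0f0f0",
}


def change_class(before, after):
    """Classify one pixel from its forest flag (1 or 0) at two dates."""
    if before and after:
        return MAINTAINED
    if before:
        return LOSS
    if after:
        return GAIN
    return NON_FOREST


def change_map(forest_before, forest_after):
    """Apply change_class to every pixel of two same-sized binary grids."""
    return [
        [change_class(b, a) for b, a in zip(row_before, row_after)]
        for row_before, row_after in zip(forest_before, forest_after)
    ]


before = [[1, 1], [0, 0]]
after = [[1, 0], [1, 0]]
result = change_map(before, after)
colors = [[LEGEND[c] for c in row] for row in result]
```

Each pixel ends up in exactly one category, so the resulting layer can be drawn with a fixed legend.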

Indicators

Land Use Analysis

Project & jurisdictional area mapping

Change analysis

Change patterns

Vegetation Condition

AGB for carbon stock assessment

Canopy Height

Forest Canopy Cover

Vegetation Health

Vegetation Stress index

Environmental Conditions

Air quality

Water quality

Freshwater resources mapping

Humidity

Temperature

Wind

Natural & Anthropogenic Hazards

Drought analysis

Air pollution analysis

Flood analysis

Landslide analysis

Thermal extremes

Wildfire analysis

Soil erosion

Biodiversity Classification

Distribution of biodiversity

Reliable, independent and unbelievably cost-effective

Insights for any area size available within minutes

API for seamless integration into other tools or web properties

Data that scales with your needs

Proprietary algorithms to ensure data accuracy