How ThinkingEarth Leverages Large-Scale Datasets

News Item
27 March 2025

At ThinkingEarth, we harness the power of Self-Supervised Learning (SSL) and Graph Neural Networks (GNNs) to construct task-agnostic Copernicus Foundation Models and a graph-based representation of the Earth. By integrating diverse Earth Observation (EO) datasets, we aim to improve environmental monitoring, climate action, and sustainable development.

Copernicus Foundation Models and Their Role

Copernicus Foundation Models serve as the backbone of ThinkingEarth, enabling us to extract meaningful insights from EO data, particularly from Copernicus Sentinel missions. These models facilitate large-scale geospatial analysis by learning representations that generalise across multiple tasks, sensors, and temporal scales.

Key Datasets Powering ThinkingEarth

To develop robust EO Foundation Models, we rely on several large-scale datasets that provide diverse spectral, temporal, and geospatial information. Each dataset contributes to different aspects of our research:

SoftCon (Soft Contrastive Learning for EO Data)

  • Purpose: SoftCon enables contrastive self-supervised learning for EO data by leveraging multi-label land cover annotations.
  • Usage in ThinkingEarth: It helps pre-train models on diverse land cover conditions, improving downstream classification and segmentation tasks.
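In practice, this kind of multi-label-guided contrastive pretraining can be summarised with a short sketch: pairs of images are pulled together in proportion to how much their land-cover label sets overlap, rather than being treated as strictly positive or negative. The PyTorch snippet below is a minimal illustration under that assumption; the function name, the Jaccard-based soft target, and the temperature value are our own simplifications, not the SoftCon reference implementation.

```python
import torch
import torch.nn.functional as F

def soft_contrastive_loss(embeddings, labels, temperature=0.1):
    """Minimal sketch of a soft contrastive objective (illustrative only).

    embeddings: (N, D) image embeddings from the encoder.
    labels:     (N, C) multi-hot land-cover labels for each image.
    The soft target for a pair is the Jaccard overlap of their label sets,
    so images sharing many land-cover classes are pulled closer together.
    """
    z = F.normalize(embeddings, dim=1)
    logits = z @ z.T / temperature                        # pairwise similarities

    lab = labels.float()
    inter = lab @ lab.T                                   # |A ∩ B| for all pairs
    union = lab.sum(1, keepdim=True) + lab.sum(1) - inter
    soft_target = inter / union.clamp(min=1e-6)           # Jaccard overlap in [0, 1]

    # Exclude self-pairs from both prediction and target.
    mask = ~torch.eye(len(z), dtype=torch.bool, device=z.device)
    log_p = F.log_softmax(logits.masked_fill(~mask, float('-inf')), dim=1)
    target = soft_target * mask
    target = target / target.sum(1, keepdim=True).clamp(min=1e-6)

    return -(target * log_p.masked_fill(~mask, 0.0)).sum(1).mean()
```

Compared with hard positives and negatives, soft targets of this kind let partially overlapping scenes contribute graded supervision instead of conflicting signals.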

DOFA (Domain-Oriented Feature Adaptation)

  • Purpose: DOFA is designed for weakly supervised domain adaptation in EO, ensuring models can generalise across regions with different distributions.
  • Usage in ThinkingEarth: We incorporate DOFA techniques to enhance model adaptability, reducing the need for extensive labelled data when applying models to new geographic areas.

MUDDAT (Multi-Source Domain Adaptation Dataset for EO)

  • Purpose: MUDDAT is a benchmark dataset for evaluating domain adaptation techniques in EO.
  • Usage in ThinkingEarth: It allows us to assess the robustness of our models when transferring knowledge between different EO datasets.

Kuro Siwo (Global Flood Mapping Dataset)

  • Purpose: Kuro Siwo provides manually annotated flood maps for global-scale flood detection.
  • Usage in ThinkingEarth: Our flood prediction models are trained using Kuro Siwo to improve the accuracy of flood risk assessments and early warning systems.
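To illustrate one way such annotations can be used downstream, the hedged sketch below trains a small segmentation head on (SAR patch, flood mask) pairs; the FloodSegmenter class, its channel sizes, and the encoder interface are illustrative assumptions rather than the project's actual training pipeline.

```python
import torch
import torch.nn as nn

class FloodSegmenter(nn.Module):
    """Hypothetical minimal flood-segmentation model: a pretrained encoder
    followed by a small convolutional head predicting a per-pixel flood logit."""

    def __init__(self, encoder, feat_channels=256):
        super().__init__()
        self.encoder = encoder                      # e.g. a pretrained EO backbone
        self.head = nn.Sequential(
            nn.Conv2d(feat_channels, 64, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 1),                    # 1 channel: flood / no-flood logit
        )

    def forward(self, x):
        feats = self.encoder(x)                     # assumed shape (B, feat_channels, h, w)
        logits = self.head(feats)
        # Upsample back to input resolution for a per-pixel loss.
        return nn.functional.interpolate(
            logits, size=x.shape[-2:], mode="bilinear", align_corners=False)

def training_step(model, sar_patch, flood_mask, optimizer):
    """One supervised step on an annotated pair.
    flood_mask: float tensor of 0/1 values with shape (B, 1, H, W)."""
    optimizer.zero_grad()
    logits = model(sar_patch)
    loss = nn.functional.binary_cross_entropy_with_logits(logits, flood_mask)
    loss.backward()
    optimizer.step()
    return loss.item()
```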

GAIA (Geo-Aware Image Annotations)

  • Purpose: GAIA offers high-quality semantic labels for EO images, aiding in land cover classification and change detection.
  • Usage in ThinkingEarth: GAIA enhances scene-level annotations in self-supervised learning pipelines, making our models more context-aware.

FoMo-Bench (Forest Monitoring Benchmark)

  • Purpose: FoMo-Bench is a large-scale benchmark for forest monitoring, integrating data from multiple sources to track deforestation and forest health.
  • Usage in ThinkingEarth: We utilise FoMo-Bench to fine-tune our models for vegetation monitoring and carbon stock estimation.

Expanding EO Foundation Models at Scale

By integrating these datasets, we are expanding the capabilities of EO Foundation Models through:

  • Unified Sensor Fusion: Developing models that process multispectral, SAR, and other sensor data seamlessly.
  • Climate Science Integration: Incorporating climate variables (e.g., atmospheric composition from Sentinel-5P) so that models learn climate-relevant signals alongside surface imagery.
  • Vision-Language Models for EO: Enhancing the interpretability of EO data using semantic alignment techniques such as GeoLangBind, which connects EO imagery with textual descriptions.
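To make the vision-language point concrete, a common formulation (not necessarily GeoLangBind's exact recipe) trains image and text encoders so that matching EO image–caption pairs score highest under a symmetric contrastive loss. The sketch below assumes pre-computed embeddings; the function name and temperature value are illustrative.

```python
import torch
import torch.nn.functional as F

def clip_style_alignment_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive alignment of EO image and text embeddings.

    image_emb, text_emb: (N, D) embeddings of N matching image-caption pairs.
    A minimal CLIP-style sketch; GeoLangBind's actual objective may differ.
    """
    img = F.normalize(image_emb, dim=1)
    txt = F.normalize(text_emb, dim=1)
    logits = img @ txt.T / temperature                       # (N, N) similarity matrix
    targets = torch.arange(len(img), device=img.device)      # i-th image matches i-th caption
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.T, targets)
    return (loss_i2t + loss_t2i) / 2
```

Because both directions (image-to-text and text-to-image) are optimised, the resulting embedding space supports retrieval and zero-shot labelling in either direction.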

Enhancing Model Adaptability

To ensure our models remain robust across diverse environments, we apply:

  • Transfer Learning: Transferring knowledge from pre-trained models to new tasks (a minimal sketch follows after this list).
  • Domain Adaptation: Refining models for application to new geographies.
  • Domain Generalisation: Ensuring robustness across different datasets.
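As a minimal sketch of the transfer-learning step referenced above (with assumed names and dimensions, not project code), a pretrained EO backbone can be frozen while only a lightweight task head is trained:

```python
import torch
import torch.nn as nn

def build_finetune_model(pretrained_backbone, embed_dim, num_classes):
    """Minimal transfer-learning sketch: reuse a pretrained EO backbone for a new task.

    The backbone is frozen and only a lightweight classification head is trained,
    which is one common way to adapt a foundation model with little labelled data.
    Assumes the backbone maps an input batch to (B, embed_dim) features.
    """
    for param in pretrained_backbone.parameters():
        param.requires_grad = False                  # keep pretrained weights fixed

    head = nn.Linear(embed_dim, num_classes)         # new task-specific head
    model = nn.Sequential(pretrained_backbone, head)

    # Only the head's parameters are passed to the optimiser.
    optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
    return model, optimizer
```

Unfreezing some backbone layers at a lower learning rate is a common variant when more labelled data is available for the target task.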

Advanced techniques such as SupCon and SoftCon play a critical role in our self-supervised pretraining strategies, while FoMo-Net and CLIP-based models enhance cross-modal learning.

Conclusion

ThinkingEarth, powered by AI-driven Copernicus Foundation Models, is revolutionising EO analytics for climate resilience and environmental sustainability. Through continuous dataset integration, model refinement, and interdisciplinary collaboration, we are building the next generation of scalable, flexible, and impactful EO models.
