How ThinkingEarth Leverages Large-Scale Datasets

News Item
27 March 2025

At ThinkingEarth, we harness the power of Self-Supervised Learning (SSL) and Graph Neural Networks (GNNs) to construct task-agnostic Copernicus Foundation Models and a graph-based representation of the Earth. By integrating diverse Earth Observation (EO) datasets, we aim to improve environmental monitoring, climate action, and sustainable development.

Copernicus Foundation Models and Their Role

Copernicus Foundation Models serve as the backbone of ThinkingEarth, enabling us to extract meaningful insights from EO data, particularly from Copernicus Sentinel missions. These models facilitate large-scale geospatial analysis by learning representations that generalise across multiple tasks, sensors, and temporal scales.

Key Datasets Powering ThinkingEarth

To develop robust EO Foundation Models, we rely on several large-scale datasets that provide diverse spectral, temporal, and geospatial information. Each dataset contributes to different aspects of our research:

SoftCon (Contrastive Learning for EO Data)

  • Purpose: SoftCon enables contrastive self-supervised learning for EO data by leveraging multi-label land cover annotations.
  • Usage in ThinkingEarth: It helps pre-train models on diverse land cover conditions, improving downstream classification and segmentation tasks.
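
As a rough illustration of how multi-label land cover annotations can guide contrastive learning, the sketch below uses the label overlap between two scenes as a soft target for how similar their embeddings should be. It is a simplified stand-in written for this article, not the published SoftCon loss: the function names, the binary cross-entropy formulation, and the temperature value are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def soft_contrastive_loss(embeddings, labels, temperature=0.5):
    """Hypothetical soft contrastive loss over all ordered pairs of scenes.

    embeddings: one feature vector per scene.
    labels: one multi-hot land cover vector per scene.
    The soft target for a pair is the cosine similarity of its label vectors,
    so scenes sharing more land cover classes are pulled closer together.
    """
    n = len(embeddings)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            target = cosine(labels[i], labels[j])  # in [0, 1] for multi-hot labels
            logit = cosine(embeddings[i], embeddings[j]) / temperature
            pred = 1.0 / (1.0 + math.exp(-logit))
            pred = min(max(pred, 1e-7), 1.0 - 1e-7)  # guard the logarithms
            total -= target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred)
            pairs += 1
    return total / pairs

# Toy data: scenes 0 and 1 share a land cover class, scene 2 does not.
toy_labels = [[1, 0], [1, 0], [0, 1]]
aligned = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]      # embeddings agree with the labels
misaligned = [[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0]]   # embeddings contradict the labels
loss_aligned = soft_contrastive_loss(aligned, toy_labels)
loss_misaligned = soft_contrastive_loss(misaligned, toy_labels)
```

On this toy data the loss is lower when the embedding geometry matches the label overlap, which is the behaviour a label-guided contrastive objective is meant to reward.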

DOFA (Domain-Oriented Feature Adaptation)

  • Purpose: DOFA is designed for weakly supervised domain adaptation in EO, ensuring models can generalise across regions with different distributions.
  • Usage in ThinkingEarth: We incorporate DOFA techniques to enhance model adaptability, reducing the need for extensive labelled data when applying models to new geographic areas.

MUDDAT (Multi-Source Domain Adaptation Dataset for EO)

  • Purpose: MUDDAT is a benchmark dataset for evaluating domain adaptation techniques in EO.
  • Usage in ThinkingEarth: It allows us to assess the robustness of our models when transferring knowledge between different EO datasets.

Kuro Siwo (Global Flood Mapping Dataset)

  • Purpose: Kuro Siwo provides manually annotated flood maps for global-scale flood detection.
  • Usage in ThinkingEarth: Our flood prediction models are trained using Kuro Siwo to improve the accuracy of flood risk assessments and early warning systems.

GAIA (Geo-Aware Image Annotations)

  • Purpose: GAIA offers high-quality semantic labels for EO images, aiding in land cover classification and change detection.
  • Usage in ThinkingEarth: GAIA enhances scene-level annotations in self-supervised learning pipelines, making our models more context-aware.

FoMo-Bench (Forest Monitoring Benchmark)

  • Purpose: FoMo-Bench is a large-scale benchmark for forest monitoring, integrating data from multiple sources to track deforestation and forest health.
  • Usage in ThinkingEarth: We utilise FoMo-Bench to fine-tune our models for vegetation monitoring and carbon stock estimation.

Expanding EO Foundation Models at Scale

By integrating these datasets, we are expanding the capabilities of EO Foundation Models through:

  • Unified Sensor Fusion: Developing models that process multispectral, SAR, and other sensor data seamlessly.
  • Climate Science Integration: Incorporating climate variables (e.g., atmospheric composition from Sentinel-5P) to improve model understanding.
  • Vision-Language Models for EO: Enhancing the interpretability of EO data using semantic alignment techniques such as GeoLangBind, which connects EO imagery with textual descriptions.
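
To make the idea of semantic alignment concrete, the sketch below assumes that an image encoder and a text encoder already map their inputs into one shared embedding space, and then picks the caption whose embedding lies closest to the image embedding. The vectors are invented toy values and `best_caption` is a hypothetical helper; this does not reproduce how GeoLangBind itself is built or trained.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def best_caption(image_embedding, caption_embeddings):
    """Return the caption whose embedding is most similar to the image embedding."""
    return max(
        caption_embeddings,
        key=lambda caption: cosine(image_embedding, caption_embeddings[caption]),
    )

# Toy shared-space embeddings (illustrative values only, not model outputs).
caption_embeddings = {
    "flooded cropland": [0.9, 0.1, 0.0],
    "dense forest": [0.1, 0.9, 0.1],
    "urban area": [0.0, 0.2, 0.9],
}
image_embedding = [0.2, 0.8, 0.1]

match = best_caption(image_embedding, caption_embeddings)
print(match)  # prints "dense forest"
```

Once imagery and text share an embedding space, the same nearest-neighbour lookup supports text-based retrieval of scenes and label-free classification.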

Enhancing Model Adaptability

To ensure our models remain robust across diverse environments, we apply:

  • Transfer Learning: Reusing knowledge from pre-trained models for new tasks.
  • Domain Adaptation: Refining models for application to new geographies.
  • Domain Generalisation: Ensuring robustness across different datasets.
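
The transfer learning item can be sketched as a linear probe: the pre-trained encoder stays frozen and only a small classification head is trained on a handful of labelled samples. In the toy example below, `frozen_encoder` is a hand-written stand-in for a real pre-trained backbone, and the reflectance values, labels, and hyperparameters are assumptions chosen purely for illustration.

```python
import math

def frozen_encoder(pixel):
    """Stand-in for a pre-trained backbone whose weights are never updated.

    Maps a (red, near-infrared) reflectance pair to two features:
    a normalised difference index and the raw near-infrared value.
    """
    red, nir = pixel
    index = (nir - red) / (nir + red + 1e-9)
    return [index, nir]

def train_linear_probe(samples, labels, epochs=500, lr=0.5):
    """Fit a logistic regression head on frozen features with gradient descent."""
    feats = [frozen_encoder(s) for s in samples]
    w = [0.0] * len(feats[0])
    b = 0.0
    n = len(feats)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(feats, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = 1.0 / (1.0 + math.exp(-score)) - y
            for k, xk in enumerate(x):
                grad_w[k] += err * xk
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(pixel, w, b):
    """Classify a pixel as vegetation (1) or non-vegetation (0)."""
    x = frozen_encoder(pixel)
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Six labelled pixels: 1 = vegetation, 0 = non-vegetation (toy values).
train_pixels = [(0.05, 0.50), (0.08, 0.60), (0.04, 0.45),
                (0.30, 0.32), (0.25, 0.20), (0.40, 0.35)]
train_labels = [1, 1, 1, 0, 0, 0]

w, b = train_linear_probe(train_pixels, train_labels)
```

Because only the head is trained, a few labelled samples are enough here. Domain adaptation and domain generalisation go further by also addressing the shift between the source and target data distributions.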

Advanced techniques such as SupCon and SoftCon play a critical role in our self-supervised pretraining strategies, while FoMo-Net and CLIP-based models enhance cross-modal learning.

Conclusion

ThinkingEarth, powered by AI-driven Copernicus Foundation Models, is revolutionising EO analytics for climate resilience and environmental sustainability. Through continuous dataset integration, model refinement, and interdisciplinary collaboration, we are building the next generation of scalable, flexible, and impactful EO models.
