ThinkingEarth and Copernicus Foundation Models: Advancing Earth Observation

News Item
28 February 2025

At ThinkingEarth, we are developing advanced Artificial Intelligence (AI) techniques to better understand our planet as a complex system. Using Deep Learning (DL), eXplainable AI (xAI), and physics-aware Machine Learning (ML), we enhance Earth observation (EO).

Our work uses Self-Supervised Learning (SSL) and Graph Neural Networks (GNNs) to build task-agnostic Copernicus Foundation Models and a graph representation of the Earth.

Copernicus Foundation Models and Their Role

Copernicus Foundation Models are central to the ThinkingEarth project, helping to extract useful information from EO data, particularly from Copernicus Sentinel missions.

Innovation in EO Foundation Model Research

To tackle key environmental challenges like biodiversity monitoring and climate action, we have curated large-scale datasets, including:

  • SSL4EO-S12-ML – A global multi-label dataset (~5 TB) combining multispectral and SAR imagery with open land-cover products.
  • SSL4EO-S – A ~15 TB dataset integrating Sentinel mission data and Copernicus DEM GLO-30, linking EO with climate science.
  • Kuro Siwo – A manually annotated dataset (~1.33 TB) for global flood mapping.
  • FoMo-Bench – A benchmark for forest monitoring, spanning diverse datasets (~10 TB).

Large-scale data is central to EO foundation models. Raw satellite imagery is valuable on its own, but additional products such as WorldCover (global land cover maps at 10 m resolution) and Dynamic World (near-real-time land-use and land-cover mapping) provide rich semantic information. Although these products are noisy at the pixel level, they can be aggregated into scene-level annotations that strengthen self-supervised learning.
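As a rough illustration of how pixel-level land cover could be turned into scene-level annotations, the sketch below keeps only the classes that cover a minimum share of a patch. It is a minimal sketch rather than the project's actual pipeline: the function name `scene_labels`, the `min_fraction` threshold, and the toy patch are all assumptions made for illustration.

```python
from collections import Counter


def scene_labels(patch, min_fraction=0.1):
    """Return the sorted class ids that cover at least `min_fraction` of a patch.

    `patch` is a 2-D list of per-pixel class ids. Classes below the threshold
    are treated as pixel-level noise and dropped (an illustrative rule only).
    """
    pixels = [class_id for row in patch for class_id in row]
    total = len(pixels)
    counts = Counter(pixels)
    return sorted(c for c, n in counts.items() if n / total >= min_fraction)


# Hypothetical 4x4 land cover patch; the class ids are illustrative
# (here 10 = tree cover, 40 = cropland, 80 = water).
patch = [
    [10, 10, 10, 40],
    [10, 10, 40, 40],
    [10, 10, 40, 40],
    [10, 80, 40, 40],
]

labels = scene_labels(patch, min_fraction=0.1)
print(labels)  # the single water pixel (1/16 of the patch) falls below the threshold
```

The threshold trades recall for robustness: a lower value keeps rare classes such as small water bodies, while a higher value discards more isolated, possibly mislabelled pixels.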

Expanding EO Foundation Models at Scale

Looking ahead, we aim to:

  • Develop unified models across different sensors and timeframes, handling various spectral and non-spectral data.
  • Bridge EO and climate science by incorporating gridded surface and atmospheric data (e.g., S5P products).
  • Enhance vision-language models to make EO models more interactive through semantic alignment.

A key initiative, GeoLangBind, unifies EO data through language-driven alignment, improving model reasoning and cross-modal learning.

Strengthening Model Adaptability

To improve model performance across different datasets, we focus on:

  • Transfer Learning for knowledge reuse.
  • Domain Adaptation to refine models for new data.
  • Domain Generalisation for robustness across diverse EO datasets.

Techniques like SupCon and SoftCon use multi-label land cover annotations to guide contrastive pretraining. FoMo-Net pre-trains a sensor-agnostic model on diverse EO datasets, while CLIP-based models align EO imagery with text for better cross-modal learning.
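One way multi-label annotations can inform contrastive pretraining is to convert label overlap into a graded similarity between scenes, instead of a hard positive/negative split. The sketch below computes such a similarity matrix as the cosine similarity between multi-hot label vectors. It is an illustration under stated assumptions, not the SupCon or SoftCon implementation; the function names and the three example scenes are hypothetical.

```python
import math


def label_similarity(a, b):
    """Cosine similarity between two multi-hot label vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def soft_targets(label_vectors):
    """Pairwise label-similarity matrix that could act as soft contrastive targets."""
    return [[label_similarity(a, b) for b in label_vectors] for a in label_vectors]


# Hypothetical scenes over three classes: [forest, cropland, water]
scene_label_vectors = [
    [1, 1, 0],  # forest + cropland
    [1, 0, 0],  # forest only
    [0, 0, 1],  # water only
]

targets = soft_targets(scene_label_vectors)
for row in targets:
    print([round(v, 3) for v in row])
```

In this toy case the first two scenes share one class and receive a partial similarity, while the water-only scene has zero similarity to both, so a loss built on these targets would pull partially overlapping scenes closer than unrelated ones.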

To ensure effectiveness, we will assess these methods using advanced performance metrics and statistical distance evaluations.
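The article does not say which statistical distances will be used. As one possible example, the sketch below computes the empirical 1-Wasserstein distance between two equal-sized one-dimensional samples, such as a single feature measured on a source and a target dataset. The function name and the sample values are assumptions made for illustration.

```python
def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-sized 1-D samples.

    With equal sample sizes and uniform weights, the distance is the mean
    absolute difference between the sorted values of the two samples.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Hypothetical values of one feature on a source and a target dataset
source_feature = [0.1, 0.2, 0.3, 0.4]
target_feature = [0.6, 0.3, 0.5, 0.4]

shift = wasserstein_1d(source_feature, target_feature)
print(round(shift, 3))
```

A value near zero would suggest the two datasets look alike along that feature, while a larger value points to a domain shift that adaptation or generalisation methods would need to handle.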

Conclusion

By integrating AI with EO data, ThinkingEarth and Copernicus Foundation Models drive innovation in climate action, ecosystem preservation, and sustainability. Through continued research and collaboration, we aim to create robust, scalable, and flexible EO models that can provide actionable insights for global environmental challenges.
