Wednesday, June 10, 2026

Next-Generation EO Archive Storage: Evaluating Zarr and Icechunk for Temporal Access Efficiency and Data Maintainability in Sen2Cube.at

 Suggested by: Luke McQuade, Martin Sudmanns


Short description

Assess Zarr + Icechunk as a future storage option for Sen2Cube.

Cloud Optimized GeoTIFF (COG) is now an established format for archived EO data. It reduces data transfer costs when user requirements allow, e.g., when only a small area of interest (spatial subsetting) or a coarser spatial resolution (overviews/pyramids) is needed. However, the format has downsides. COGs are typically stored as a collection of files, one per acquisition, so a time series analysis has to open many files and make many network requests, even for tiny AoIs.

Zarr is a more recent alternative that also supports chunking along the time dimension (and any other dimension). Could this be an improvement over COG for Sen2Cube?
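As a rough illustration of the access pattern described above, the following Python sketch counts the read requests needed for a time series over a tiny AoI under both layouts. It is a back-of-the-envelope model under stated assumptions, not a benchmark: the function names, the 500 acquisitions, and the time chunk length of 50 are hypothetical, header/metadata requests are ignored, the AoI is assumed to fall within a single tile or spatial chunk, and the time range is assumed to start on a chunk boundary.

```python
from math import ceil


def cog_requests(n_acquisitions, tiles_touched_per_file=1):
    """One file per acquisition: at least one read per touched tile in every file.

    Assumption: header and overview requests are ignored.
    """
    return n_acquisitions * tiles_touched_per_file


def zarr_requests(n_acquisitions, time_chunk, spatial_chunks_touched=1):
    """Chunks span `time_chunk` acquisitions, so one read covers several dates.

    Assumption: the requested time range starts on a chunk boundary.
    """
    return ceil(n_acquisitions / time_chunk) * spatial_chunks_touched


# Hypothetical numbers: 500 acquisitions, Zarr chunks covering 50 time steps each.
n = 500
print("COG reads: ", cog_requests(n))
print("Zarr reads:", zarr_requests(n, time_chunk=50))
```

The trade-off cuts the other way for single-date reads: a chunk that spans many time steps has to be fetched in full even when only one acquisition is needed, so the choice of chunk shape is part of what would have to be evaluated.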

Even though data are treated as historical as soon as they enter an archive, defects are occasionally encountered and have to be rectified, or enhancements (especially to metadata) have to be made. Icechunk offers a way to make such changes without having to reprocess large amounts of data. Could it be used, in conjunction with Zarr, to improve the updatability of Sen2Cube?
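The general idea behind updating an archive without rewriting it can be sketched with a toy copy-on-write store in Python. This is a conceptual model only and does not use Icechunk's API or reflect its on-disk format: the class `ToyVersionedStore`, its methods, and the chunk keys are all hypothetical. Each commit writes only the chunks that changed and records a new snapshot that still points to the unchanged chunks.

```python
class ToyVersionedStore:
    """Toy copy-on-write store (hypothetical; not Icechunk's API or format)."""

    def __init__(self):
        self.chunks = {}        # chunk_id -> bytes, written once and never modified
        self.snapshots = [{}]   # each snapshot maps chunk key -> chunk_id
        self._next_id = 0

    def commit(self, changes):
        """Create a new snapshot from the latest one, storing only changed chunks."""
        manifest = dict(self.snapshots[-1])  # copy references, not data
        for key, data in changes.items():
            chunk_id = self._next_id
            self._next_id += 1
            self.chunks[chunk_id] = data
            manifest[key] = chunk_id
        self.snapshots.append(manifest)
        return len(self.snapshots) - 1       # index of the new snapshot

    def read(self, snapshot, key):
        """Read a chunk as it existed in the given snapshot."""
        return self.chunks[self.snapshots[snapshot][key]]


store = ToyVersionedStore()
v1 = store.commit({"B04/0.0.0": b"good", "B04/1.0.0": b"defect"})
v2 = store.commit({"B04/1.0.0": b"fixed"})  # rectify one chunk only

print(store.read(v1, "B04/1.0.0"))  # the old version stays readable
print(store.read(v2, "B04/1.0.0"))  # the new version sees the fix
print(len(store.chunks))            # number of chunk objects stored in total
```

In this model the correction stores one additional chunk rather than a second copy of the dataset, and the earlier snapshot remains available for reproducibility.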

Start

As soon as possible

Prerequisites/qualification

  • Remote Sensing & GIS
  • Programming (e.g., Python)
  • Interest in the topic 
