#install.packages("rstac")
#install.packages("sf")
#install.packages("gdalcubes")
#install.packages("viridis")Creating a raster cube from STAC items of the EOPF Sentinel Zarr Samples Service using rstac and gdalcubes
Introduction
This notebook demonstrates how to create a raster data cube from STAC items of the EOPF Sentinel Zarr Samples Service using the rstac and gdalcubes packages in R. We will search for Sentinel-2 images roughly covering the “Münsterland” region in Germany for June 2025, and retrieve corresponding STAC items with rstac. We will then use gdalcubes to create a raster data cube and visualize an RGB plot aggregating the data to one image (using median), and compute the NDVI for that image.
The gdalcubes package facilitates the creation of regular raster cubes from irregular image collections with differing spatial and temporal resolutions, partial spatial overlap or differing projections etc. It first collects images in so-called image collections and creates data cubes from them, where the spatio-temporal geometry is defined via a data cube view. It also has a function to directly create image collections from STAC items, which we will use here. For details on the functionality and concepts, see the gdalcubes documentation and specifically Marius´ tutorial on the example of the Sentinel-2 COG catalog on Amazon Web Services (AWS).
What we will learn
- 🛰️ Retrieve STAC items for Sentinel-2 imagery in a specified AOI (roughly “Münsterland” region, Germany) and time range (June 2025) using
rstac - 🔎 Create an image collection from STAC items using
stac_image_collection()ingdalcubes - 🚀 Create a raster data cube with a user-defined spatio-temporal geometry using
cube_view()andraster_cube()ingdalcubesand compute the NDVI
As you’ll notice, we’ll need to make a small adjustment to the workflow at one stage, since gdalcubes isn’t fully optimized yet for the EOPF Sentinel Zarr Samples Service. However, the required functionality is available, and we can expect it to be even more seamless in future releases.
This now also works on Windows when using R >= 4.5.2, where an issue with reading blosc compressed files using gdal was fixed, see here
Prerequisites
Make sure to have R >=4.5.2. Install rstac, sf and gdalcubes. We will also use viridis for nice color maps, so we will install it as well.
Create image collection from STAC items
We can convert STAC items received through rstac to gdalcubes image collections using the function stac_image_collection(). From there on, we can create raster cubes using the usual gdalcubes workflow.
This notebook is only meant to showcase a simple example of the usage of gdalcubes with STAC items. It does not contain details on the concepts and workflow of creating and further processing raster cubes. For a more in-depth introduction, please refer to the gdalcubes documentation. This also contains a number of tutorials that explain the capabilities and potential applications of the package.
We start with a search for image data in the EOPF Sentinel Zarr Samples Service. To interact with the EOPF STAC API, we will use rstac:
library(rstac)
library(gdalcubes)
library(sf)Linking to GEOS 3.13.1, GDAL 3.11.4, PROJ 9.7.0; sf_use_s2() is TRUE
# Using a bbox roughly for "Münsterland" region in Germany
# and the month June 2025.
# For the stac request, we need the bbox in WGS84
# We limit the no of results to 5 for this example
bbox_wgs84 = st_bbox(
c(xmin = 6.55, xmax = 7.8, ymin = 51.8, ymax = 52.4),
crs = st_crs(4326)
)
stac_source = stac("https://stac.core.eopf.eodc.eu/")
items = stac_source |>
stac_search(
collections = "sentinel-2-l2a",
datetime = "2025-06-01T00:00:00Z/2025-06-30T23:59:59Z",
bbox = c(
bbox_wgs84["xmin"],
bbox_wgs84["ymin"],
bbox_wgs84["xmax"],
bbox_wgs84["ymax"]
),
limit = 5
) |>
post_request()
length(items$features)[1] 5
items###Items
- features (5 item(s)):
- S2C_MSIL2A_20250630T104041_N0511_R008_T32UMD_20250630T161206
- S2C_MSIL2A_20250630T104041_N0511_R008_T32UMC_20250630T161206
- S2C_MSIL2A_20250630T104041_N0511_R008_T32ULD_20250630T161206
- S2C_MSIL2A_20250630T104041_N0511_R008_T32ULC_20250630T161206
- S2C_MSIL2A_20250630T104041_N0511_R008_T31UGU_20250630T161206
- assets:
AOT_10m, B01_20m, B02_10m, B03_10m, B04_10m, B05_20m, B06_20m, B07_20m, B08_10m, B09_60m, B11_20m, B12_20m, B8A_20m, product, product_metadata, SCL_20m, SR_10m, SR_20m, SR_60m, TCI_10m, WVP_10m
- item's fields:
assets, bbox, collection, geometry, id, links, properties, stac_extensions, stac_version, type
###--> Helper function to adapt URLs (adding a prefix) to make them accessible for GDAL
f = function(href) {
return(paste0("ZARR:\"/vsicurl/", gsub(".zarr/", ".zarr\"/", href)))
}As you can see, we have received 5 STAC items matching our search criteria (we would have received more items, but we have set the limit to 5). We have then defined a helper function that adapts the asset URLs.
❓ Why adapt the URLs?
The items returned from our STAC request have asset hrefs like this:
items$features[[1]]$assets$B02_10m$href [1] "https://objects.eodc.eu:443/e05ab01a9d56408d82ac32d69a5aae2a:202506-s02msil2a/30/products/cpm_v256/S2C_MSIL2A_20250630T104041_N0511_R008_T32UMD_20250630T161206.zarr/measurements/reflectance/r10m/b02"
In order for GDAL to use the correct driver, we need the URL with a prefix for the correct Zarr driver (see this tutorial). Additionally, we want it to point directly to the .zarr file. The function above replaces the href by one that links directly there and prepends the required prefix. Since you can give this function as an argument to stac_image_collection(), it will be applied to all assets automatically.
Now we can create an image collection from the received STAC items:
# the helper function f (argument 'url_fun') is applied to all asset hrefs
s2_collection = stac_image_collection(
items$features,
asset_names = c("B02_10m", "B03_10m", "B04_10m", "B08_10m"),
url_fun = f
)
s2_collectionImage collection object, referencing 5 images with 4 bands
Images:
name left
1 S2C_MSIL2A_20250630T104041_N0511_R008_T32UMD_20250630T161206 7.500940
2 S2C_MSIL2A_20250630T104041_N0511_R008_T32UMC_20250630T161206 7.531529
3 S2C_MSIL2A_20250630T104041_N0511_R008_T32ULD_20250630T161206 6.004761
4 S2C_MSIL2A_20250630T104041_N0511_R008_T32ULC_20250630T161206 6.065798
5 S2C_MSIL2A_20250630T104041_N0511_R008_T31UGU_20250630T161206 5.927820
top bottom right datetime srs
1 53.24954 52.25345 9.146277 2025-06-30T10:40:41 EPSG:32632
2 52.35039 51.35443 9.143291 2025-06-30T10:40:41 EPSG:32632
3 53.24196 52.22621 7.678544 2025-06-30T10:40:41 EPSG:32632
4 52.34305 51.32805 7.704578 2025-06-30T10:40:41 EPSG:32632
5 53.21199 52.17549 7.634183 2025-06-30T10:40:41 EPSG:32631
Bands:
name offset scale unit nodata image_count
1 B02_10m 0 1 5
2 B03_10m 0 1 5
3 B04_10m 0 1 5
4 B08_10m 0 1 5
Note that the images originate from two different UTM zones (EPSG:32632 and EPSG:32631). They will be automatically detected and images -if necessary- reprojected during the creation of the raster data cube. This simplifies the workflow significantly and allows to easily combine images from different UTM zones in one cube.
Create a cube view
After collecting all available images in an image collection, we will now create a raster data cube. For this, we first need to define a cube view, which specifies the desired spatio-temporal geometry of the raster cube (spatial resolution, extent, temporal resolution, time extent, etc.). For our example, we will select a small subset (10 x 10 km) north of the city of Muenster, which includes the “Rieselfelder” bird sanctuary. We collect the spatio-temporal extent in a list, which we then pass to cube_view(). We also specify a spatial resolution of 10m and a temporal resolution of 1 day. While the Sentinel-2 satellites have a revisit time of ~5 days (and in addition we have limited the number of items received in our STAC request to 5 in this example), swath overlaps may in principal lead to a higher temporal resolution for certain pixels, so we define 1 day here to get as much data from the image collection as we can, leading to a rather sparse cube. However, days and locations without images are interpreted as no-data values and will simply be ignored in the following steps (see https://gdalcubes.github.io/source/concepts/execution.html#chunking for details).
Even for the same day, certain locations may in principal be recorded more than once. Thus, we always have to define an aggregation method. Here we use “first”: in the unlikely case that we have data from two images per day at a certain location, we simply choose the first.
#Collect spatio-temporal extent in a list
# We select a coordinate north of Muenster close to the "Rieselfelder" bird sanctuary,
# and define a 10km x 10km extent around it
rieselX = 404500.13
rieselY = 5765279
ext = list()
ext$left = rieselX - 5000
ext$right = rieselX + 5000
ext$bottom = rieselY - 5000
ext$top = rieselY + 5000
ext$t0 = "2025-06-01"
ext$t1 = "2025-06-30"
# Create cube view with 10m spatial and 1 day temporal resolution,
# and the previously defined spatio-temporal extent
v = cube_view(srs="EPSG:32632", dx=10, dy=10, dt="P1D",
aggregation="first",
extent=ext)
vA data cube view object
Dimensions:
low high count pixel_size
t 2025-06-01 2025-06-30 30 P1D
y 5760279 5770279 1000 10
x 399500.13 409500.13 1000 10
SRS: "EPSG:32632"
Temporal aggregation method: "first"
Spatial resampling method: "near"
As you can see, the cube view summarizes the spatio-temporal geometry of the raster cube we want to create. We can see (amongst other information) that the resulting cube will have 1000 x 1000 pixels in space (10 km x 10 km at 10m resolution) and 30 time steps (1 per day for June 2025), though many of these pixels and time steps will be no-data values as we do not have complete daily coverage for the whole area. We decide to use UTM-Zone 32N (define the srs as EPSG:32632) for our cube view, images in other spatial references systems will be reprojected on the fly.
Create cube and plot an RGB and an NDVI image
Now we´ll create a raster cube using the previously defined cube view. As an example application for such a raster cube, we´ll first plot an RGB image of the area. We´ll then select the red and near-infrared bands, and compute an NDVI. In both cases, we use our raster cube to aggregate the results for one month computing the median. This means, that for all locations where we received data on more than one day in that month, the median of all values of that pixel throughout the month is taken –> a very basic, but easy and straightforward way to select a value that is robust against outliers (e.g. clouds).
gdalcubes_options(parallel=4)
#create an RGB plot of the area, aggregated to one image for the whole month using median
raster_cube(s2_collection, v) |>
select_bands(c("B02_10m","B03_10m","B04_10m")) |>
reduce_time(c("median(B02_10m)", "median(B03_10m)", "median(B04_10m)")) |>
plot(rgb = 3:1, zlim = c(0,3000))#compute NDVI and aggregate to one image using median
ndvi <- raster_cube(s2_collection, v) |>
select_bands(c("B04_10m","B08_10m")) |>
apply_pixel("(B08_10m - B04_10m)/(B08_10m + B04_10m)", "NDVI") |>
reduce_time(c("median(NDVI)"))
plot(ndvi, zlim = c(-0.2,1),key.pos=1, col=viridis::viridis)💪 Now it is your turn
🔍 Task: Perform a simple NDVI change analysis between April and June 2025.
ℹ️
gdalcubescan also be used to compute more complex analyses, such as change analysis between two time periods. For a task like this however, it is probably more straightforward to create two separate cubes for both months, derive cloud-free composite NDVI images from them, and then use another package liketerrato compute the difference image. You can apply the same approach as above to create a composite NDVI image for the same area in April. Export both NDVI images (April and June) usingwrite_tif()and use the functionrast()of the packageterrato load the output. You can then simply subtract both objects to compute a simple difference image highlighting changes between the two months.
Conclusion
In this notebook, we have demonstrated how to create a raster data cube from STAC items of the EOPF Sentinel Zarr Samples Service using the rstac and gdalcubes packages in R. We retrieved Sentinel-2 images for a specified area and time range, created an image collection from these items, and then constructed a raster cube with a defined spatio-temporal geometry. We visualized an RGB composite and computed the NDVI for the area of interest. We have seen that currently some manual adjustments are necessary to work with the EOPF Sentinel Zarr Samples Service using gdalcubes, but this will be streamlined in future releases of the package.