FOSS4G 2022 academic track

Francesco Pirotti

Francesco Pirotti is currently an associate professor at the University of Padova, where he teaches courses on spatial data analysis, surveying and remote sensing. His research focuses on the application of remote sensing to the environmental sciences. He is the author of more than one hundred scientific papers in national and international journals, serves as associate editor for several international scientific journals and has guest-edited several special issues. He is active in the International Society for Photogrammetry and Remote Sensing (ISPRS), in the Association of European Remote Sensing Laboratories (EARSeL) and in national academic societies.


Sessions

08-24
14:15
30min
InforSAT: an online Sentinel-2 multi-temporal analysis toolset using R CRAN
Francesco Pirotti

Remote sensing from orbiting satellite sensors is now a common way to monitor many aspects of the Earth's surface and atmosphere. The volume of image data has grown tremendously in recent years, driven by the increase in space missions and in the public and private agencies involved. Much of this is open data, which academics and other stakeholders can freely download and use for any type of application. The bottleneck is often no longer data availability but the processing resources and tools needed to analyse the data. Multi-temporal analysis in particular requires stacks of images, and therefore storage space as well as tested and validated processing workflows. Processing image by image is often no longer viable. Several solutions have been created to support centralized and automated processing of multiple images, and software as a service (SaaS) is becoming more common among users. The most popular to date is probably Google Earth Engine (GEE), which puts petabytes of data at users' fingertips, together with processing resources and an interface offering a large number of data-processing tools through JavaScript or Python programming environments (Gorelick et al., 2017). What previously took days, if not months, can now be run in a few minutes or hours. GEE is currently free for academics, but this should not be taken for granted in the future. Other initiatives, such as the Copernicus RUS project, which closed at the end of 2021, also provided access to Copernicus data and computing resources in order to promote the uptake of Copernicus data through educational and research activities.

Moving towards SaaS solutions usually requires a provider that puts software on the cloud and a channel, usually a web portal, for accessing data and tools. The R environment and its CRAN packages have all the “ingredients” needed to create such a SaaS on a local machine or on a server. We propose and discuss a solution, called InforSAT, created ad hoc to centralize satellite imagery processing on a remote server with multiple processors, which also allows parallel processing. The R Shiny package connects online widgets for user interaction to R tools that process the imagery through other dedicated packages. To date only Sentinel-2 Level-2A data are considered, but the system can be extended to other sensors and processing levels. The tools currently available focus on multi-temporal analysis, to support the academic community working in particular on vegetation, whose phenology changes notably both between and within years. The tools are offered through a web portal to reach research teams that are less familiar with satellite image analysis, allowing simplified extraction of multi-temporal data from Sentinel-2 images. Figure 1 shows the interface and Figure 2 the result of extracting a boxplot of vegetation index values over a specific time window.

All image data are stored in a user-defined folder on the server. A script checks weekly (or at another user-defined interval) for new Sentinel-2 images, downloads them automatically and stores their metadata in an R list structure. The metadata hold image paths, bands and a histogram of values for each band, which is used to define color-stretching parameters when images are rendered in the browser.

Regarding visualization, users can render true-color and false-color composites by defining their own band combinations, and can also create a raster layer with the values of common vegetation indices or define their own index by entering an equation in the interface (see Figure 1). The images rendered in the user's browser, both index rasters and color composites, are processed on the fly from the original JPEG2000 files. Each index raster is recalculated every time the user actively redraws it, by sampling the original image at points that correspond to the screen pixels, reprojected from screen coordinates to image coordinates. Depending on the screen size and on the area, this amounts to about one million points, which are then converted to an image and rendered on screen either with a fixed scale based on the expected minimum and maximum values of the index (e.g. between -1 and 1 for the normalized difference vegetation index) or with a scale that automatically stretches between the 10th and 90th percentiles of the frequency distribution of the actual values. The color composites are drawn automatically at any scale using the overviews built into the JPEG2000 file of each Sentinel-2 band.

Regarding multi-temporal analysis, users can define one or more polygons over the area and, for each polygon, extract single pixel values (digital numbers, DN) and aggregated zonal statistics for each and all of the available images in a few seconds, with or without parallel processing.
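
The index rendering described above (a per-pixel index followed by either a fixed or a percentile-based stretch) can be sketched as follows. This is a minimal illustration in plain Python, not InforSAT's R code: the function names, the sample reflectance values and the 0-255 output range are assumptions made for the example.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index for one pixel (None if undefined)."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def percentile(values, q):
    """Percentile q (0-100) of values, using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def stretch(values, vmin, vmax):
    """Map values linearly to 0..255, clamping anything outside [vmin, vmax]."""
    if vmax <= vmin:
        raise ValueError("vmax must exceed vmin")
    out = []
    for v in values:
        t = (v - vmin) / (vmax - vmin)
        out.append(round(max(0.0, min(1.0, t)) * 255))
    return out


# Hypothetical red and near-infrared values sampled at five screen pixels.
red = [1000, 800, 600, 400, 200]
nir = [1000, 1200, 1400, 1600, 1800]
index = [ndvi(r, n) for r, n in zip(red, nir)]

# Fixed scale: the expected NDVI range of -1 to 1.
fixed = stretch(index, -1.0, 1.0)

# Automatic scale: stretch between the 10th and 90th percentiles.
p10, p90 = percentile(index, 10), percentile(index, 90)
auto = stretch(index, p10, p90)

print(fixed)
print(auto)
```

The fixed scale keeps colors comparable between images, while the percentile stretch uses the full color range for the values actually on screen, at the cost of clamping the tails.
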
Users can download the multi-temporal data, i.e. the DN values, as a table for further analysis. The table is in long format, with one column for the timestamp, one for the polygon ID and one column of values for each band. In both visualization and multi-temporal analysis, users can set a threshold for masking by cloud and snow probability, which are products of the sen2cor processing of Sentinel-2 data to Level-2A. In the near future this solution will be integrated into an R package, allowing users to easily download, install and replicate their own portal locally or on their own server. Code is available on GitHub at https://github.com/fpirotti/inforsat
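
As a rough illustration of the masking and table layout described above, the sketch below filters pixel samples by a cloud-probability threshold, builds the long-format table (timestamp, polygon ID, one column per band) and computes a per-polygon mean for each date. It is plain Python rather than InforSAT's R code, and the sample values, the 20% threshold and the function names are all hypothetical.

```python
import csv
import io
from statistics import mean

BANDS = ["B04", "B08"]  # Sentinel-2 red and near-infrared bands

# Hypothetical pixel samples: (timestamp, polygon_id, cloud_prob_percent, {band: DN})
samples = [
    ("2022-06-01", 1, 5, {"B04": 400, "B08": 3000}),
    ("2022-06-01", 1, 80, {"B04": 2500, "B08": 2600}),  # likely cloud
    ("2022-06-01", 2, 0, {"B04": 600, "B08": 2800}),
    ("2022-06-11", 1, 10, {"B04": 420, "B08": 3200}),
    ("2022-06-11", 1, 12, {"B04": 380, "B08": 3000}),
]


def mask_cloudy(rows, max_cloud_prob):
    """Keep only pixels whose cloud probability is at or below the threshold."""
    return [r for r in rows if r[2] <= max_cloud_prob]


def to_long_table(rows):
    """One row per pixel: timestamp, polygon ID, then one column per band."""
    return [[ts, pid] + [dn[b] for b in BANDS] for ts, pid, _, dn in rows]


def zonal_means(rows):
    """Mean DN per (timestamp, polygon ID) for each band."""
    groups = {}
    for ts, pid, _, dn in rows:
        groups.setdefault((ts, pid), []).append(dn)
    return {
        key: {b: mean(d[b] for d in pixels) for b in BANDS}
        for key, pixels in sorted(groups.items())
    }


clear = mask_cloudy(samples, max_cloud_prob=20)
table = to_long_table(clear)
stats = zonal_means(clear)

# Serialize the long-format table as CSV for download.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["timestamp", "polygon_id"] + BANDS)
writer.writerows(table)
csv_text = buffer.getvalue()
print(csv_text)
```

Keeping one row per pixel and date makes it straightforward to group by polygon or by time afterwards, for example to draw a boxplot of index values for each acquisition date.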

Room Modulo 3