Paola Salmona

Researcher in geomatics, with experience mainly in cartography and GIS


Sessions

07-16
16:45
5min
Python plugin for statistical analysis of landslides susceptibility over wide areas
Paola Salmona

Within the Extended Partnership “Multi-Risk sciEnce for resilienT commUnities undeR a changiNg climate” (RETURN), the research group of the Department of Civil, Chemical and Environmental Engineering of the University of Genova is developing a system for producing landslide susceptibility maps in a GIS environment.
The expected result is a tool that administrations and local authorities can use to assess and manage ground instability. High usability is therefore required, which implies free and easily available base data, easily interpretable results, and clearly explained limitations of use and reliability levels.
To accomplish this objective, the following requirements have been established:
• Only earth landslides are considered in the processing of susceptibility maps. These include slow flows, fast flows, slides, areas subject to diffuse shallow landslides, and those landslides classified as “undetermined” or “complex”, which generally contain at least some earth movement.
• The proposed tool should be scalable and transferable. To this end, eight independent predisposing factors were chosen that comply with these requirements and are described by open data available for the whole of Italy.
• To ensure the quality of input and output data, the procedure is optimized for certain “certified” datasets.
• The procedure has to be transparent and traceable. Logistic regression was chosen because it is widely used and well described in the literature. The resulting maps report probability values that are difficult to interpret directly, so, to facilitate usability, they are subsequently aggregated into qualitative susceptibility classes.
• Minimum reliability threshold: the procedure has been tuned, and additional tests are under way in some study areas, to ensure an overall reliability, measured by the AUC, of at least 75%. When AUC values are lower, the causes are investigated in order to identify possible corrective measures.
• To allow people with little GIS experience to use the model, and to ensure that the planned operations are carried out correctly, the entire procedure is being written in Python, at present within GRASS and later, possibly, as a QGIS plugin.
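The logistic-regression and class-aggregation steps named in the list above can be sketched as follows. This is only an illustration, not the project's code: the coefficient values, the factor names, and the class breakpoints (0.33 and 0.66) are assumptions chosen for the example, since the abstract does not state them.

```python
import math

# Hypothetical coefficients; a real model would estimate them from data.
INTERCEPT = -2.0
COEFFICIENTS = {"slope_class": 0.8, "lithology_class": 0.5}


def landslide_probability(pixel):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * pixel[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


def susceptibility_class(p, low_max=0.33, medium_max=0.66):
    """Aggregate a probability into a qualitative class (assumed breakpoints)."""
    if p < low_max:
        return "low"
    if p < medium_max:
        return "medium"
    return "high"


# One pixel described by the (reordered) class index of each factor.
pixel = {"slope_class": 3, "lithology_class": 2}
p = landslide_probability(pixel)
print(round(p, 3), susceptibility_class(p))
```

Keeping the breakpoints as function parameters makes it easy to report, next to each map, exactly how the qualitative classes were derived.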
The procedure was tested in GRASS over areas of about 1,000 km², using the pixel as the minimum spatial unit, with a nominal scale of 1:100,000 and a raster resolution of 20 m. The first phase consisted of preprocessing and discretizing the base data, to allow a general check of data quality and to reduce the possible combinations of factors to a manageable number. The eight factors considered, which include elevation, slope, aspect, water accumulation, land use/land cover, lithology and rainfall influence, were then converted to raster format at the set resolution and divided into qualitative (e.g., land use type) or ordinal (e.g., elevation intervals, from 0 to the maximum elevation) classes.
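As a rough illustration of the discretization step, a continuous factor such as elevation can be mapped to ordinal interval classes. The 200 m interval width and the function name are assumptions made for this sketch; the abstract does not specify the intervals actually used.

```python
def elevation_class(elevation_m, interval_m=200.0):
    """Return the 1-based ordinal class of an elevation value.

    Class 1 covers [0, interval_m), class 2 covers [interval_m, 2 * interval_m),
    and so on. Values below 0 are clamped to 0 (an assumption of this sketch).
    """
    clamped = max(elevation_m, 0.0)
    return int(clamped // interval_m) + 1


# Hypothetical pixel elevations in metres.
elevations = [15.0, 199.9, 200.0, 870.5]
classes = [elevation_class(e) for e in elevations]
print(classes)
```

Reducing each factor to a small number of classes is what keeps the number of factor combinations manageable in the later analysis.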
The resulting maps were compared in a bivariate analysis with the Inventory of Landslide Phenomena in Italy (IFFI), and the classes of each factor were reordered on the basis of conditional probability. The factors were then related to each other and to actual landslides in a multivariate analysis by logistic regression, defining for each pixel the probability of landslide occurrence. The resulting values were finally grouped into three qualitative classes indicating high, medium and low landslide susceptibility.
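The bivariate step can be sketched as below, assuming that "conditional probability" means the share of each class's pixels that fall inside mapped landslides. The function name, the tie-breaking rule, and the sample data are hypothetical.

```python
from collections import Counter


def reorder_classes_by_conditional_probability(factor_classes, landslide_mask):
    """Rank the classes of one factor by P(landslide | class).

    factor_classes: class label of each pixel.
    landslide_mask: 1 if the pixel lies in a mapped landslide, else 0.
    Returns the conditional probabilities and a new 1-based rank per class
    (rank 1 = lowest probability). Ties are broken by label, for determinism.
    """
    total = Counter(factor_classes)
    in_landslide = Counter(
        c for c, hit in zip(factor_classes, landslide_mask) if hit
    )
    cond_prob = {c: in_landslide[c] / total[c] for c in total}
    ranked = sorted(cond_prob, key=lambda c: (cond_prob[c], str(c)))
    new_rank = {c: i + 1 for i, c in enumerate(ranked)}
    return cond_prob, new_rank


# Hypothetical eight-pixel sample of a land use factor.
land_use = ["forest", "forest", "pasture", "pasture",
            "urban", "urban", "pasture", "forest"]
landslide = [0, 1, 1, 1, 0, 0, 0, 0]
cond_prob, new_rank = reorder_classes_by_conditional_probability(land_use, landslide)
print(new_rank)
```

After reordering, even a qualitative factor such as land use carries an ordinal meaning that a regression model can use.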
The procedure described above yielded a calibration AUC generally above the preset threshold of 75% and served as the basis for the actual tool, which is being developed through a series of refinements currently in progress.
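For reference, the AUC check can be illustrated with a small pure-Python function that uses the pairwise (Mann-Whitney) definition of the AUC. The sample scores and labels are invented, and the quadratic loop is only suitable for small samples, not for full rasters.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen landslide pixel scores
    higher than a randomly chosen stable pixel (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both landslide and stable pixels are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


AUC_THRESHOLD = 0.75  # minimum reliability stated in the text

# Hypothetical predicted probabilities and observed landslide labels.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
value = auc(scores, labels)
print(round(value, 3), value >= AUC_THRESHOLD)
```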
The tests so far have shown that each type of landslide is affected differently by the factors considered, and that the model is more reliable if the different types of movement are treated separately. Therefore, for each study area, the procedure has to be repeated for every landslide type, which is time-consuming and requires considerable disk space. In response, the model is being rewritten as a Python script, not only to run more efficiently but also to define a standardized procedure that makes it accessible to people outside the research team and produces comparable results.
As part of the automation of the procedure, efforts are also being made to define ancillary functions dedicated to solving problems that arose during the experimentation.
With regard to the definition of the statistical sample, i.e., the areas actually affected by landslides, the need has emerged to distinguish, for the phenomena reported in the IFFI repository, between the detachment area, i.e., the area from which the landslide actually developed, and the accumulation area, i.e., the part of the territory affected by the effects of the landslide. Therefore, to avoid introducing noise into the model through incorrectly delineated perimeters, a part of the code is being prepared to separate the two parts on a statistical basis.
Another issue being addressed is the identification, for each type of movement and for each territory considered, of the factors that actually determine the development of landslides, in order to reduce noise and lighten the computational load. Several methods are being tested, including “Frequency Ratio,” “Leave One Out,” and “Stepwise”. Once the most effective method is determined, this part will also be introduced into the model as a script.
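Of the methods mentioned, the frequency ratio is the simplest to illustrate. The sketch below uses its common definition: the share of landslide pixels falling in a class, divided by the share of all pixels falling in that class. How the project turns these ratios into a factor-selection rule is not stated, so the function name and the sample data are assumptions.

```python
from collections import Counter


def frequency_ratio(factor_classes, landslide_mask):
    """Frequency ratio per class of one factor.

    FR = (landslide pixels in class / all landslide pixels)
         / (pixels in class / all pixels)
    FR > 1 means the class is over-represented among landslide pixels.
    """
    total_pixels = len(factor_classes)
    total_landslide = sum(1 for hit in landslide_mask if hit)
    if total_landslide == 0:
        raise ValueError("no landslide pixels in the sample")
    class_pixels = Counter(factor_classes)
    class_landslide = Counter(
        c for c, hit in zip(factor_classes, landslide_mask) if hit
    )
    return {
        c: (class_landslide[c] / total_landslide) / (class_pixels[c] / total_pixels)
        for c in class_pixels
    }


# Hypothetical eight-pixel sample of a slope factor with three classes.
slope_classes = [1, 1, 1, 1, 2, 2, 3, 3]
landslide = [0, 0, 0, 1, 0, 1, 1, 1]
fr = frequency_ratio(slope_classes, landslide)
print(fr)
```

A factor whose classes all have ratios close to 1 discriminates poorly between landslide and stable pixels, which is one possible reason to drop it for a given landslide type and territory.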

Academic track
PA01