Module 3: Identifying indicators and data acquisition

If you have completed the qualitative stage of IVAVIA (again, see Figure 6), you can now start with the first module of the quantitative stage. The purpose of this module is to identify indicators for the previously defined elements of the IVAVIA impact chain diagrams, gather the required data, check the quality of the data, prepare it for the risk component aggregation, document it, and store it in a suitable database. If you did not prepare impact chain diagrams for each individual hazard from scratch, you could revisit results from earlier vulnerability assessments or customize diagrams developed by other cities.

Caveat

While the quantitative stage of IVAVIA is neither overly complicated nor overly labour intensive, compared to risk assessments in other domains (like industrial safety), it still requires some knowledge of statistics and experience in handling data. After all, assessing risk for an urban system is a complex task that cannot be arbitrarily scaled down without compromising the validity of the results. And although we have simplified several tasks of this IVAVIA stage by supporting software tools, there is still manual work to do. However, putting in this effort will make your assessment results more credible and thus they will be a better basis for policy level decisions on funding adaptation measures and increase your chances to get the required support.

It is not uncommon for a city's climate resilience officers to lack some of the expertise needed for this stage, or for a city, especially a small one, to lack the required capacities. If you still want or need to perform the quantitative stage, we recommend involving consultants or experts from local universities or research institutions. This is common practice.

What are indicators and what can I use them for?

Indicators are employed to quantify the intensifying or mitigating elements of an exposed system with regard to selected hazard(s), as well as the potential impacts hazards may have on the exposed system. Indicators provide information about states or conditions that are not directly measurable; they may relate directly or indirectly to the element they are intended to measure. Indicators are usually compared against critical thresholds or previous measurements.

An ‘indicator’ is a general concept in statistics, with specialisations in other disciplines. An indicator is the value of an observed variable. When used in vulnerability or risk analysis, the indicator is a variable that contributes to describing the properties of an exposed system.

Examples of indicators (from the indicator list of the Covenant of Mayors framework, see appendix B):

  • Population density, measured in people per square kilometre, is a socio-economic indicator
  • Change in average annual/monthly temperature, measured in percent, is a physical/environmental indicator
  • Public service interruptions (e.g. energy/water supply, health/civil protection/emergency services, waste), measured in number of days, is an infrastructure indicator

The values of the selected indicators will later be aggregated to risk components and provide the basis for the calculation of the composite risk score. Therefore, you will need to select at least one indicator for each risk component of your preliminary impact chain diagrams.

How to start identifying indicators?

There is no fixed sequence for indicator identification. However, we recommend starting with identifying indicators for the selected hazard and drivers, before identifying indicators for stressors, impacts, sensitivity, and coping capacity. Established indicator directories can be a great help with identifying suitable indicators. Such directories can be found for example in the annex of the Vulnerability Sourcebook (see BMZ 2014b, p. 14-17), the annex of the Covenant of Mayors for Climate and Energy Reporting Guidelines (see Neves et al. 2016, p. 61-67) or the indicator database of the EU FP7 project MOVE (Methods for the Improvement of Vulnerability Assessment in Europe, see [MOVE]).

However, because indicators are only useful if the relevant data is available for your local context, you should already start considering existing and necessary data when compiling a set of potential indicators. Even the best indicator is inoperable if there is no feasible way of acquiring the necessary data. The difficulty is that you need to find out about data availability to select a possible indicator, but you most likely only have enough resources to determine the availability for a limited number of potential indicators. The whole indicator selection and data acquisition activity is an iterative and potentially time-consuming process: identifying an indicator, checking its suitability, gathering data, reformulating the indicator if no suitable data can be found to substantiate it, checking data quality and finding alternative sources where necessary. The steps of this module should therefore not be considered as isolated parts of a sequential process, but rather as different views of an iterative process. Consequently, we advise you to read all step descriptions before starting the indicator identification and data acquisition process.

It is important to include local expertise, e.g. from technicians and other experts working within your city, who know which data is available, as well as researchers. In addition to using the already mentioned indicator directories, you may want to examine previous studies for your region or city in order to reduce the number of potential indicators for which to check data availability. Finally, it is important to find a consensus between the involved experts and stakeholders.

Practical guidance on identifying or creating indicators

There are several options to get at suitable indicators:

  1. If your city is participating in an adaptation framework, like the Covenant of Mayors or national frameworks, these frameworks may already have a standard list of indicators. It is recommended to start with that list. We have included the list from the Covenant of Mayors, and how its indicators might be used in IVAVIA, in appendix B of this guideline.
  2. You may find examples of indicators in the RESIN impact chain diagram repository (see appendix C of this guideline for examples), and some may be suitable for your case.
  3. The RESIN Climate Risk Typology includes various indicators (see the RESIN e-Guide for a link to the Typology).

However, while certainly many indicators are suitable for a wide range of cases, there are also cases that are so special—due to special risk, special exposure etc.—that there simply are no suitable indicators available. If none of the available examples were suitable to your special situation, you would need to create your own indicator. Luckily, there are a number of special guidance notes around that may help you. These include:

Document your indicators!

During and after indicator identification and selection you should thoroughly gather and document information for each potential indicator that allows you to retrace and justify your decision process later on. Amongst others, this information includes:

  • a brief description of the indicator with a justification for selecting (or rejecting) it, and how it relates to the corresponding risk component and its measurement;
  • the coverage area and resolution required for the indicator data;
  • the unit of measurement and resolution required;
  • the temporal coverage required;
  • potential data sources; and
  • documentation of the corresponding data: age of the data, privacy level, ownership and intellectual property rights level (copyright, license model, …), and row and column meanings in tabular data sources (for a detailed description of data requirements see below).

Any content-specific information should be defined and documented at this point in the vulnerability assessment.
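The documentation items listed above can be kept in a structured record per indicator, so that every candidate is described in the same way. The following is only a minimal sketch: the class name, the field names, and the example values (including the assignment of population density to the sensitivity component) are assumptions made for illustration and are not prescribed by IVAVIA.

```python
from dataclasses import asdict, dataclass, field
import json


@dataclass
class IndicatorRecord:
    """One entry in the list of potential indicators (hypothetical layout)."""
    name: str
    risk_component: str       # e.g. "hazard", "sensitivity", "coping capacity"
    description: str
    selected: bool            # False for rejected indicators, which stay in the list
    justification: str        # why the indicator was selected or rejected
    unit: str
    spatial_coverage: str
    spatial_resolution: str
    temporal_coverage: str
    data_sources: list = field(default_factory=list)


# Illustrative values only; they do not come from a real assessment.
record = IndicatorRecord(
    name="Population density",
    risk_component="sensitivity",
    description="Residents per unit area of each district",
    selected=True,
    justification="Data available per district from the statistical office",
    unit="people per square kilometre",
    spatial_coverage="whole city",
    spatial_resolution="district",
    temporal_coverage="most recent census year",
    data_sources=["municipal statistical office"],
)

# A plain-text dump that can be stored next to the data files.
record_json = json.dumps(asdict(record), indent=2)
print(record_json)
```

Keeping rejected indicators in the same structure, with `selected` set to `False` and a justification, preserves the decision trail for colleagues who were not involved.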

Where does the data you intend to use come from?

The potential data for the preliminary indicators will most likely come from different sources and subsequently be compiled using different methods, which may come with certain (dis)advantages. Some of the ways data may have been compiled and their (dis)advantages are:

  • Physical measurements. These are usually conducted using instruments like thermometers, hygrometers, or gauges to measure indicators like surface temperature, air humidity, or water run-off, but also encompass other methods like the analysis of satellite data to determine land cover. Data from these sources is usually highly accurate, but may require context-specific expertise for sample selection and statistical analysis to obtain robust results. Statistical techniques might be needed to create a map from these data.
  • Censuses and surveys. This methodology can provide data like age distribution of households, education levels, or availability of air-conditioning in buildings. However, conducting censuses and surveys is very time-consuming and, similarly to physical measurements, may require context-specific expertise, e.g. for drafting questionnaires, conducting surveys, selecting representative samples, or analysing statistical data.
  • Modelling/Simulation. This methodology employs calculation frameworks integrating a variety of information in order to represent complex functional relationships of several input parameters in a simplified way. Examples of this approach are hydrological models to convert rainfall volume into run-off volume or groundwater models to describe groundwater flow systems. Due to the complexity of most models, developing and gathering data this way is usually time- and resource-intensive, and requires the expertise of researchers from research centres, universities, and private companies. Additionally, the quality of the output data is highly dependent on the quality of the input data, which usually comes from physical measurements.
  • Expert judgement. If data is not available in the required quantity or quality, if there is not enough time to generate data specifically for the assessment using the above-mentioned methods, or if the study area is very much localised, you should draw on the knowledge of local experts to quantify certain indicators. This can be done using methods like participatory workshops or interviews. For example, you could ask local sewage system experts to draw a map of the areas that experience regular flooding due to heavy precipitation. However, expert judgement is based on the experience and perception of respondents and is therefore highly subjective.

Document your data!

As already mentioned, you should thoroughly gather and document information (for all data) that allows you to retrace and justify your decision process later on and that also allows colleagues (currently) not involved in the process to understand it. This is especially true if the vulnerability assessment is conducted by an external, maybe non-local, project partner. Abbreviations and non-descriptive names should be avoided, especially if data is provided in a language that differs from the native language of the person conducting the assessment. The documentation for every piece of data should at least comprise information about:

  • which object it relates to (e.g. road network, population, buildings),
  • where it came from, who owns it, and what the intellectual property rights (IPR) level is,
  • how it was compiled (e.g. physical measurement, survey, modelling, or expert judgement),
  • how old it is, which time span it covers, and how often it gets updated,
  • what the privacy level is, which security protection has to be guaranteed, and whether or not it has to be anonymised,
  • how it is structured in general (e.g. number of population in different age groups per gender),
  • how it is structured specifically, i.e. which row/column/field represents which information and how this information is formatted (e.g. percentage, absolute number, categorization). This also includes detailed information about applied categorizations, e.g. traffic intensities between 0 and 3 need further explanation about which number represents exactly which intensity, and
  • what the level of uncertainty is.

Additionally, the documentation should include information about how the data shall be used to measure the corresponding indicator. For example, the location of electronic road signs can be used to measure (potential) information flow in case of an emergency. However, the position itself does not provide enough information, i.e. shall the absolute number of signs per district be calculated or shall the number of signs be related to the length of the road network of the corresponding district?
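The road sign example can be made concrete with a small calculation. The sketch below uses invented district names and numbers (they are assumptions, not real data) to show how the two readings of the same data set differ: the absolute number of signs per district versus the number of signs per kilometre of road.

```python
# Hypothetical input data: number of electronic road signs and length of the
# road network (in km) per district. All names and values are illustrative.
signs_per_district = {"North": 12, "Centre": 30, "South": 6}
road_km_per_district = {"North": 40.0, "Centre": 60.0, "South": 10.0}


def signs_per_road_km(signs, road_km):
    """Relate the number of signs to the road length of each district."""
    result = {}
    for district, count in signs.items():
        length = road_km.get(district)
        # Without a (non-zero) road length the ratio cannot be computed.
        result[district] = count / length if length else None
    return result


density = signs_per_road_km(signs_per_district, road_km_per_district)

# Rank districts from best to worst covered under both definitions.
rank_absolute = sorted(signs_per_district, key=signs_per_district.get, reverse=True)
rank_relative = sorted(density, key=density.get, reverse=True)

print(rank_absolute)
print(rank_relative)
```

With these invented numbers the district with the fewest signs has the highest density, so the two definitions rank the districts differently. This is why the documentation should state which calculation is intended.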

As already mentioned, this module is a bit different from the other ones. Its steps should not be considered as isolated parts of a strictly sequential process, but rather as part of an iterative process. You might have to go through at least a few of them more than once until you are satisfied with the quality of the outcome.

Step 3.1: Select indicators

During this step you identify and select potential indicators for all elements of your IVAVIA impact chain diagrams. Similar to the processes of module 2, we recommend conducting this step as part of a participatory workshop in collaboration with the stakeholders from the city. Again, there are a number of directories of well tested indicators that might help as a starting point for this step (see the introduction of this module).

Indicators for hazards and drivers largely consist of directly measurable climate-related parameters, such as average temperature, amount and distribution of precipitation, or evapotranspiration, that allow you to measure their intensity and probability. When selecting indicators for hazards and drivers you should already bear in mind what potential methods you plan to employ for the assessment of potential consequences during later steps (see step 5.3). For example, if you plan to employ flood depth-damage functions, flood depth needs to be selected as a potential hazard indicator. Depending on the chosen hazard exposure combination, you may also need to specify the frequency of data values you require. For example, average monthly rainfall data may suffice as a measurement for water availability for crops. However, for assessing the reduction in traffic volume due to precipitation you will need daily—or even hourly—rainfall data as well as traffic count data.

Indicators for stressors consist mainly of measurable non-climatic trends that influence the vulnerability of the exposed object(s) to the selected hazard, e.g. projected changes in the built-up area or in demography. Be aware that the range of stressors can be very large. It may be advisable to focus on a subset of the most relevant, most common, or most influential stressors.

Indicators for impacts may consist of both directly and indirectly measurable parameters, e.g. number of fatalities or loss in GDP due to reduced working hours. Impact indicators may also cover second-order or even higher order impacts (i.e. impacts that result from cascading effects, including from failing critical infrastructure). As with the selection of hazard/driver indicators, you should bear in mind what potential methods you plan to employ for the assessment of potential consequences during later steps (see step 5.3). For example, if you plan to estimate heat wave related injuries, you need to have access to detailed historical medical records and choose an appropriate impact indicator.

Indicators for sensitivity are often directly measurable (bio)physical and/or socio-economic parameters, e.g. the share of elderly in the population or the percentage of sealed surfaces. Preferably, you should choose indicators for sensitivity that you can act upon, i.e. indicators that you can influence, be it in the short or long-term. The percentage of sealed surfaces in a city is a good example of a sensitivity indicator that can be changed via city planning procedures that raise the amount of permeable soil, e.g. through green and/or blue infrastructure.

Indicators for coping capacity are often less direct than indicators for hazard, drivers, or sensitivity and therefore not self-evident, e.g. awareness of the population due to the frequency with which they receive information about a hazard, or the response time of emergency facilities. Similar to sensitivity indicators, you should preferably choose indicators for coping capacity that you are able to influence. The capacity of grey infrastructure, e.g. storm tanks or sewer pipes, is a good example of a coping capacity indicator that can be influenced by city planners. The extent of personal social networks, e.g. neighbourhoods, friends, and family, on the other hand is not as easy to influence and is therefore a weaker indicator.

 

Potential pitfalls

Due to the iterative and necessarily collaborative process of selecting indicators and gathering data, it can be prone to some non-obvious pitfalls, such as:

  • Identical indicator influence on all regions of the study area. This often is a result of choosing an indicator with an inappropriate resolution. For example, measuring temperature increases of the whole city does not allow you to identify spots with high ambient temperature within districts. Consequently, if the resolution of the assessment were the different city districts, the citywide indicator could be left out of the assessment without loss of result accuracy.
  • Indicators/data not measuring the desired effect. This is usually the case if a discrepancy exists between the chosen indicator and the employed data source, i.e. the data does not represent the information described by the indicator. For example, using surface temperature maps captured with satellites will show hotspots in harbour areas with a lot of asphalt or in open sand areas, but will miss high ambient temperatures in street canyons.
  • Correlation between indicators and double counting. The most common occurrence of this problem is multiple indicators measuring the same effect, potentially in different ways. For example, green infrastructure often corresponds to areas with high soil permeability. If the assessment included both the area of green infrastructure as well as the area of highly permeable soil, the same effect would be counted twice.
  • Correlated indicators for sensitivity and coping capacity. Sometimes, an indicator could designate both an aspect of coping capacity and an aspect of sensitivity. For example, in case of pluvial flooding, the capacity of a sewer system could indicate a sensitivity (if it is too small) and a coping capacity (if it is large enough to cope with, say, a very high percentage of extreme rainfalls).

You should keep these challenges in mind while identifying indicators.
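Two of the pitfalls above, identical indicator values in all regions and strongly correlated indicators, can be screened for automatically once indicator values per district are available. The following sketch is an illustration only: the indicator names, the values, and the correlation threshold of 0.9 are assumptions, and a high correlation is merely a hint that two indicators may measure the same effect, not proof of double counting.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient; None if one series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def screen_indicators(table, threshold=0.9):
    """Return indicators that are constant and pairs that correlate strongly."""
    constant = [name for name, values in table.items() if len(set(values)) == 1]
    correlated = []
    for a, b in combinations(table, 2):
        r = pearson(table[a], table[b])
        if r is not None and abs(r) >= threshold:
            correlated.append((a, b, round(r, 2)))
    return constant, correlated


# Hypothetical indicator values for four districts (same order in every list).
indicators = {
    "green_area_share": [0.10, 0.25, 0.40, 0.55],
    "permeable_soil_share": [0.12, 0.27, 0.41, 0.58],
    "citywide_temp_increase": [1.5, 1.5, 1.5, 1.5],
    "share_elderly": [0.30, 0.12, 0.25, 0.18],
}

constant, correlated = screen_indicators(indicators)
print("No variation between districts:", constant)
print("Candidates for double counting:", correlated)
```

Flagged indicators should be discussed with the involved experts before anything is removed, since two correlated indicators can still describe different effects.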

Step 3.2: Check if the indicators are suitable

After you have identified potential indicators, you should assess whether they are suitable:

  • Are they valid and relevant, i.e. do they represent well the elements you want to assess?
  • Are they reliable and credible, and do they allow for data acquisition and measurement in the future?
  • Do they have a precise meaning, i.e. do stakeholders agree on what the indicators are measuring in the context of the vulnerability assessment?
  • Are they clear in their direction, i.e. is an increase in value unambiguously positive or negative with relation to the corresponding risk component?
  • Are they practical and affordable, i.e. do they come from accessible data sources, or can they be produced at reasonable cost?
  • Are they appropriate, i.e. is their temporal and geographical coverage right for the vulnerability assessment?

Especially relevant in this regard are the two dimensions of appropriateness:

  • Geographical coverage. The identified indicators should cover the full extent of the study area (e.g. the whole city) and have an appropriate resolution (e.g. population data on a district level).
  • Temporal coverage and time resolution. Depending on the different indicators you identified and whether or not you will be looking into the past/future, you may need historical records and/or future climate projections. Additionally, you should try to choose indicators covering the same temporal interval with the same resolution, e.g. daily precipitation and traffic volume for a whole year. The required temporal coverage also depends on the frequency and intervals at which you plan to repeat the vulnerability assessment, e.g. for monitoring purposes.
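When two data sets have different temporal resolutions, the finer one can often be aggregated to match the coarser one. The sketch below assumes hourly precipitation records and daily traffic counts (all values are invented for illustration), sums the rainfall per day, and keeps only the days covered by both data sets.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical hourly precipitation in mm: (ISO timestamp, value).
hourly_rain_mm = [
    ("2017-06-01T00:00", 0.0),
    ("2017-06-01T13:00", 4.2),
    ("2017-06-01T14:00", 1.3),
    ("2017-06-02T09:00", 0.5),
]

# Hypothetical daily traffic counts (vehicles per day).
daily_traffic = {"2017-06-01": 18200, "2017-06-02": 21500, "2017-06-03": 20900}


def to_daily_sum(hourly):
    """Aggregate hourly values to daily totals, keyed by ISO date."""
    daily = defaultdict(float)
    for timestamp, value in hourly:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        daily[day] += value
    return dict(daily)


daily_rain = to_daily_sum(hourly_rain_mm)

# Only days present in both data sets can be compared.
common_days = sorted(set(daily_rain) & set(daily_traffic))
paired = [(day, daily_rain[day], daily_traffic[day]) for day in common_days]
print(paired)
```

Summing is appropriate for precipitation amounts; for other variables, such as temperature, a daily mean or maximum may be the more meaningful aggregate.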

Indicators that are not suitable should be disregarded and their entry in the list of potential indicators amended with a corresponding comment to allow for the reconstruction of the process by non-participating colleagues later on.

Should the pruning of the potential indicator list result in elements of the preliminary impact chain diagrams without indicators, you may need to go back to the corresponding step in the process, try to find new indicators, and repeat the suitability check. If you cannot identify at least one suitable indicator per risk component, you may have to go even further back in the IVAVIA process and refine the impact chain diagrams.
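Whether the pruned list still covers every risk component can be checked with a few lines of code. This is a minimal sketch: the list of components follows the ones named in this module, while the candidate indicators, their suitability flags, and the comments are invented for illustration.

```python
# Risk components for which this module asks you to identify indicators.
RISK_COMPONENTS = [
    "hazard", "drivers", "stressors", "impacts", "sensitivity", "coping capacity",
]

# Hypothetical list of potential indicators after the suitability check.
# Unsuitable indicators stay in the list with a comment, as recommended above.
candidates = [
    {"name": "Daily precipitation", "component": "hazard", "suitable": True, "comment": ""},
    {"name": "Sea surface temperature", "component": "drivers", "suitable": True, "comment": ""},
    {"name": "Projected built-up area", "component": "stressors", "suitable": True, "comment": ""},
    {"name": "Share of sealed surfaces", "component": "sensitivity", "suitable": True, "comment": ""},
    {"name": "Extent of social networks", "component": "coping capacity",
     "suitable": False, "comment": "No feasible data source at district level"},
    {"name": "Loss in GDP", "component": "impacts",
     "suitable": False, "comment": "Only available at national level"},
]


def components_without_indicator(indicators, components):
    """List the risk components that have no suitable indicator left."""
    covered = {entry["component"] for entry in indicators if entry["suitable"]}
    return [component for component in components if component not in covered]


missing = components_without_indicator(candidates, RISK_COMPONENTS)
print("Components that need new indicators:", missing)
```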

Step 3.3: Gather data

The process of gathering data for your indicators can range from extremely simple to highly challenging. It may suffice to download available census data or GIS maps from open access websites. In some cases, time and resources permitting, you may wish to conduct your own surveys or process large, complex data sets like satellite images from NASA, which might require specialist skills.

What kind of data do you need to quantify your indicators?

There is no standard solution for all risk components. While indicators for hazards and drivers will most likely require physically measured or modelled (historical) data, indicators for coping capacity, sensitivity, and stressors will often require survey and census data or data from expert judgement. However, the most important decision criteria for what data you need are the study area of your assessment (e.g. your city as a whole or individual city districts), the resolution of your assessment (e.g. districts, neighbourhoods, smaller areas within neighbourhoods), and the output type (e.g. maps, diagrams, tables) and level of detail you want to produce (e.g. result validation via visual analysis of the employed data sets).

For example, if the assessment is to be conducted at district level, data should at least have a district level resolution (e.g. population data for complete districts or individual households, which then can be assigned to corresponding districts). Data at a lower resolution (e.g. population data for the whole city) usually cannot be disaggregated without loss of validity. As another example, if the results of the assessment should be represented visually as a map, you need corresponding geographic data (e.g. a shape file, see step 3.4) to which you can relate the calculated risk scores. Similarly, if detailed (visual) validation of results seems necessary or helpful (e.g. to validate that a certain road segment does indeed not cross a zone prone to flooding), the corresponding data needs to be provided in an appropriate format.
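Data with a finer resolution than the assessment can be aggregated upwards, whereas the reverse is usually not possible. The following sketch assumes household-level records that already carry a district label (the field names and values are invented for illustration) and derives district-level totals and the share of residents aged 65 or older.

```python
from collections import defaultdict

# Hypothetical household records, each already assigned to a district.
households = [
    {"district": "North", "residents": 3, "residents_65_plus": 1},
    {"district": "North", "residents": 2, "residents_65_plus": 2},
    {"district": "Centre", "residents": 4, "residents_65_plus": 0},
    {"district": "Centre", "residents": 1, "residents_65_plus": 1},
]


def aggregate_to_districts(rows):
    """Sum household records per district and derive the share of elderly."""
    totals = defaultdict(lambda: {"residents": 0, "residents_65_plus": 0})
    for row in rows:
        totals[row["district"]]["residents"] += row["residents"]
        totals[row["district"]]["residents_65_plus"] += row["residents_65_plus"]
    return {
        district: {
            **t,
            # Guard against districts without any recorded residents.
            "share_65_plus": t["residents_65_plus"] / t["residents"] if t["residents"] else None,
        }
        for district, t in totals.items()
    }


district_table = aggregate_to_districts(households)
print(district_table)
```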

Does the data already exist?

Based on the information you gathered in the previous module you should be able to identify a first set of relevant organizations, facilities, and experts on a local, regional, national, or international level that may be able to provide you with the necessary data. The huge number of institutions and experts you may need to contact can make this one of the most time-consuming steps of the assessment, especially as follow-up negotiation is often required.

Depending on your indicators, you may want to contact statistical offices, meteorological authorities, and different national and local government departments. ‘National Spatial Data Infrastructures’ (NSDI) are another key entry point for data acquisition. NSDIs have been established in many countries and will ideally offer standardised data, even where it is sourced from multiple institutions (cf. BMZ 2014a, p 96).

When gathering data and contacting institutions and experts, be aware of intellectual property rights (IPR) levels and sharing policies that may be in place; formal agreements with the respective rights holders could be required.

How many resources can you commit to generate data?

If there is no data available or it is of insufficient quality, you may need to collect data yourself (or choose a different indicator). In this case you need to carefully assess the costs and expertise required for the data collection. The Vulnerability Sourcebook gives some useful tips (cf. BMZ 2014a, p. 96-100):

  • “For meaningful results, observation of biophysical indicators such as precipitation, temperature and run-off must be made over long periods – often over […]. The time and money required for this means it is almost certainly unfeasible for [you to gather this data yourself]. Luckily, however, most countries can provide such data [, e.g. via climate service providers such as hydro-meteorological offices]. If you require highly localised data, expert judgement may be a worthwhile alternative.”
  • “Data for socio-economic indicators such as average household income, average size of household and livelihood strategies can be captured in […]. The time and money required depend largely on the sample size. A representative survey may cover a whole country, or just a few communities. At the sub-national level, surveys can be an effective means of gathering information not captured by national institutions, such as perceptions around climate and environmental change. [In large cities, socio-economic survey data often might have already been collected by a statistics or data department.] Be sure to involve a local expert who can help in drafting the survey, selecting a representative sample and analysing the resulting data.”
  • “Modelled data are both time- and resource-intensive and usually require measured data as […] For meaningful results, you will need to ensure that you can call on the required modelling skills.” Often, models for fluvial/pluvial flooding, temperature, and urban heat islands may already be available on a regional or national level.
  • “Where time and financial resources are limited, expert judgement can be a good, fast way of quantifying indicators that cannot otherwise be […]. This is most often the case at a very local level – [e.g. when gathering information on the municipal sewer system from city technicians] – which is rarely covered by detailed statistical data, and where the climatic and hydrological characteristics are too specific to be captured by modelling. This local knowledge – captured using participative methods as well as scoring and ranking – can be used to either complement or replace surveys.”

Given the extent of the data gathering process and its inherent iterative nature, it is perfectly acceptable to conduct the quantitative vulnerability assessment process in an iterative fashion starting from module 3. That is, instead of gathering all potentially necessary data before calculating vulnerability and risk scores, it may be more practicable to start with a minimum amount of data necessary to calculate first score estimates and iteratively expand the calculation with more data. While partial results may lack validity, they may help you when communicating with your stakeholders and to secure stronger stakeholder commitment.

Step 3.4: Check data quality

Data is vital for the vulnerability assessment, because the quality of the assessment result depends strongly on the quality of the input data. Consequently, you need to ensure that the data you gathered is of sufficient quality. The quality check may reveal major issues with the data quality, which may necessitate going back to step 3.1 and gathering new data. To avoid this, you should ideally consider the quality criteria below already during the data collection.

Is the data in the correct format?

Data can be provided in different formats. To enable easy data handling, data should be provided in a well-structured, preferably digital file format that is easy to interpret, does not need manual reformatting, and can potentially be handled by colleagues without the need for specific (software) tools. Appendix G provides a (non-exhaustive) list of digital file formats that comply with these requirements.

Does the data have the correct temporal and geographical coverage?

The geographical and temporal coverage of different data sources may vary. Thus, you need to determine whether different data sources can be combined and compared or not.

In general, the data you employ for the vulnerability assessment should have the same geographical and temporal coverage. For example, if you want to relate traffic intensity to amount of precipitation, both data sets need to cover the same study area, have the same resolution, and need to cover the same time interval. One way to achieve a consistent spatial resolution is to employ regular grids (e.g. a grid of 500x500m cells) when collecting data or to relate highly detailed data (e.g. household level age distribution) to such a grid. Usually, national standards for such grids already exist and can easily be obtained from relevant national institutions. Additionally, data should ideally be as recent as possible and have, or at least be convertible to, the same temporal resolution (e.g. hourly precipitation and traffic volume from June 2017).
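Relating detailed data to a regular grid amounts to computing, for every record, the cell its coordinates fall into. The sketch below assumes that coordinates are given in metres in a projected coordinate system and uses an arbitrary grid origin of (0, 0); both are assumptions for illustration, and in practice you would use the origin and cell definition of the relevant national grid.

```python
from collections import defaultdict
from math import floor

CELL_SIZE_M = 500.0  # edge length of one grid cell in metres


def grid_cell(x_m, y_m, origin_x_m=0.0, origin_y_m=0.0, cell_size_m=CELL_SIZE_M):
    """Return the (column, row) index of the cell containing the point."""
    column = floor((x_m - origin_x_m) / cell_size_m)
    row = floor((y_m - origin_y_m) / cell_size_m)
    return column, row


# Hypothetical household locations in projected coordinates (metres).
households = [
    {"x_m": 120.0, "y_m": 80.0, "residents": 2},
    {"x_m": 480.0, "y_m": 499.0, "residents": 4},
    {"x_m": 510.0, "y_m": 20.0, "residents": 1},
    {"x_m": 1250.0, "y_m": 730.0, "residents": 3},
]

# Sum the residents of all households that fall into the same cell.
residents_per_cell = defaultdict(int)
for household in households:
    cell = grid_cell(household["x_m"], household["y_m"])
    residents_per_cell[cell] += household["residents"]
residents_per_cell = dict(residents_per_cell)

print(residents_per_cell)
```

Once all data sets are expressed per cell, they share the same spatial resolution and can be combined cell by cell.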

However, some differences in data timeliness and temporal coverage may be tolerable. This is especially true if you employ data for the assessment that changes comparatively slowly, e.g. age distribution of citizens or other census data. Additionally, if indicators are independent of one another, using data sources with different temporal coverage and resolution may be acceptable. For example, the amount of green infrastructure in a city and the annual household income will most likely not be related, so a difference in temporal coverage may not be significant.

Another problem may arise if geographical data use different coordinate systems and projections. This is especially frequent when working on cross-border regions. In order to combine and compare different data sources, you need to make sure they employ a common geographic reference system, such as the Universal Transverse Mercator coordinate system.

In any case, having some data is preferable to having no data at all. If you have no other alternative at all, use the data you have available, but be aware that the results of the vulnerability assessment may be less reliable or may even be misleading.

Are there any missing values or ‘outliers’ in the data?

Missing values (e.g. regions omitted from geographical data) are problematic for quantitative assessments. Smaller gaps can be closed with interpolation, i.e. estimating the missing values from the existing data nearest to the gaps (in space or time), or by using the average value as a replacement, if no other data is available. The Handbook on Constructing Composite Indicators by the OECD (cf. OECD 2008, p. 24-25) goes into more detail about missing values and how to deal with them.
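As an illustration of both options for a time series, the sketch below fills gaps between two observed values by linear interpolation and falls back to the average of all observed values for gaps at the start or end of the series. Linear interpolation is only one possible choice and is an assumption made here; the example values are invented.

```python
from statistics import mean


def fill_gaps(values):
    """Fill None entries: interpolate inside the series, use the mean at the edges."""
    known = [(i, v) for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to fill the gaps from")
    fallback = mean(v for _, v in known)

    filled = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        before = [(j, w) for j, w in known if j < i]
        after = [(j, w) for j, w in known if j > i]
        if before and after:
            # Nearest observed neighbours on both sides of the gap.
            (j0, w0), (j1, w1) = before[-1], after[0]
            filled[i] = w0 + (w1 - w0) * (i - j0) / (j1 - j0)
        else:
            # No neighbour on one side: use the average of all observations.
            filled[i] = fallback
    return filled


# Hypothetical series with a gap at the start and a two-step gap in the middle.
series = [None, 10.0, None, None, 16.0, 18.0]
completed = fill_gaps(series)
print(completed)
```

Filled values should be marked as estimates in the data documentation, so that later users can tell them apart from measurements.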

‘Outliers’ on the other hand are values that are so far outside the expected range of your data that they may indicate an error in the capturing or calculation method.
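A simple way to flag candidate outliers is the interquartile range rule, which marks values lying more than 1.5 times the interquartile range below the first or above the third quartile. This rule is a common statistical convention and is used here as an assumption; it is not prescribed by this guideline, and the temperature values are invented.

```python
from statistics import quantiles


def flag_outliers(values, factor=1.5):
    """Return the values outside the interquartile range fences."""
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical daily mean temperatures in degrees Celsius; 45.0 looks suspicious.
temperatures = [19.8, 20.4, 18.5, 45.0, 19.2, 21.0, 20.1, 19.0]
flagged = flag_outliers(temperatures)
print(flagged)
```

A flagged value is not necessarily wrong. It should be checked against the original source before it is corrected or removed, since it may also reflect a genuine extreme event.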

Once enough indicators have been identified and have passed the suitability check, and their corresponding data has passed the quality check, the associated impact chain diagrams can be finalized by including the indicators in them.

Step 3.5: Data management

To avoid data loss and lower the risk of data redundancy, you should store the gathered data sets in a common data storage system. This may range from a simple collection of data files in a set of folders to a more complex database system (e.g. Excel spreadsheets or a distributed web-based database). Depending on the system you employ, you may need to transform the different data sets into a common data format, utilising export and transformation routines from multiple software products.
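As one possible middle ground between a folder of files and a full database server, the sketch below stores indicator values together with basic documentation in a single SQLite file, using only the Python standard library. The table layout, the column names, and the example values are assumptions for illustration; the unique constraint on the data set name is what prevents the same data set from being stored twice.

```python
import sqlite3

# ":memory:" keeps the example self-contained; use a file path in practice.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE dataset (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL UNIQUE,   -- prevents redundant copies
        source      TEXT NOT NULL,          -- where the data came from
        owner       TEXT,
        compiled_by TEXT,                   -- e.g. survey, modelling
        sensitive   INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE indicator_value (
        dataset_id INTEGER NOT NULL REFERENCES dataset(id),
        district   TEXT NOT NULL,
        value      REAL NOT NULL,
        unit       TEXT NOT NULL,
        PRIMARY KEY (dataset_id, district)
    );
""")

# Hypothetical data set and values.
conn.execute(
    "INSERT INTO dataset (name, source, owner, compiled_by, sensitive) "
    "VALUES (?, ?, ?, ?, ?)",
    ("population_density", "municipal statistical office", "city", "census", 0),
)
dataset_id = conn.execute(
    "SELECT id FROM dataset WHERE name = ?", ("population_density",)
).fetchone()[0]
conn.executemany(
    "INSERT INTO indicator_value VALUES (?, ?, ?, ?)",
    [(dataset_id, "North", 4200.0, "people/km2"),
     (dataset_id, "Centre", 6100.0, "people/km2")],
)

# A second data set with the same name is rejected by the database.
try:
    conn.execute(
        "INSERT INTO dataset (name, source) VALUES (?, ?)",
        ("population_density", "second copy"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute(
    "SELECT d.name, v.district, v.value, v.unit "
    "FROM indicator_value v JOIN dataset d ON d.id = v.dataset_id "
    "ORDER BY v.district"
).fetchall()
print(rows)
```

This sketch does not address access control for sensitive data; that has to be handled by the storage environment.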

If you work with multiple (external) partners and stakeholders you may need to ensure that they can all access the different data sets and work with the same format. Additionally, you may need to assign responsibilities for database management and maintenance or commission an external service provider. Finally, you need to ensure that sensitive data is stored in a secure way, only accessible by the appropriate users. What exactly is considered ‘sensitive data’ is dependent on the country or even region you live in. A good start would be to check the EU General Data Protection Regulation (see https://www.eugdpr.org/) and its national implementation, as well as additional national and regional legislation.

At this point all the data you gathered should be documented precisely and comprehensively, allowing all internal and external colleagues, partners, and stakeholders to understand the format and meaning of the data and work with it. Although documentation is a time-consuming exercise, it is extremely important, particularly when qualitative or quantitative questions regarding the data arise. Insufficient knowledge about the data can lead to unnecessary duplicate effort from colleagues, data loss, missing results, or a lack of transparency and credibility.

Proceed to Module 4