WP6: Sensor data integration & digital twinning

Objectives:

Identify & integrate data inputs from municipal, governmental, and online data sources. Develop proxies for missing input data based on statistical models via supervised machine learning. Enable near-real-time dynamic updating of the CDT model of the urban area. Develop V1 of the CDT.

Description of Work:

Task 6.1: Online data source integration (NTUA, RED) Hazard information for the pilot area will be ingested from national weather and seismographic networks, e.g., the National Observatory of Athens, as online data become available. API hooks will also be coded to read online data sources on arrivals (e.g., FlightRadar24) and local hotelier information, as well as economic indices published by the Chamber of Commerce & Industry, the National Bank of Greece, Eurostat, local hoteliers’ associations, etc., on a daily, monthly, or yearly basis. All pertinent information on hazards, population, business, and economy will thus be stored in a database for direct or indirect use in the CDT.
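
As an illustration of the common database foreseen in Task 6.1, the following minimal sketch stores heterogeneous readings in one table keyed by source, variable, and timestamp. The schema, the function names (open_store, ingest, latest), the source label, and the sample values are all hypothetical placeholders rather than project specifications; the actual API hooks would produce the records passed to ingest.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) the observation store. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS observations (
               source      TEXT NOT NULL,  -- hypothetical feed label
               category    TEXT NOT NULL,  -- hazard, population, business, economy
               variable    TEXT NOT NULL,
               observed_at TEXT NOT NULL,  -- ISO 8601 timestamp
               value       REAL NOT NULL,
               PRIMARY KEY (source, variable, observed_at)
           )"""
    )
    return conn

def ingest(conn, source, category, records):
    """Insert records; a repeated (source, variable, timestamp) overwrites the old value."""
    rows = [
        (source, category, r["variable"], r["observed_at"], float(r["value"]))
        for r in records
    ]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO observations VALUES (?, ?, ?, ?, ?)", rows
        )
    return len(rows)

def latest(conn, variable):
    """Return the most recent value of a variable, or None if absent."""
    row = conn.execute(
        "SELECT value FROM observations WHERE variable = ? "
        "ORDER BY observed_at DESC LIMIT 1",
        (variable,),
    ).fetchone()
    return None if row is None else row[0]

# Made-up sample readings, for illustration only.
conn = open_store()
ingest(conn, "example_weather_feed", "hazard", [
    {"variable": "wind_speed_ms", "observed_at": "2024-01-01T00:00:00Z", "value": 5.0},
    {"variable": "wind_speed_ms", "observed_at": "2024-01-02T00:00:00Z", "value": 7.5},
])
```

Keying on source, variable, and timestamp lets daily, monthly, and yearly feeds share one table, and makes repeated polling of the same feed idempotent.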

Task 6.2: Surrogate models for missing data (NTUA, RG) Surrogate models will be used to recover missing input data from related proxies, based on statistical models trained via supervised machine learning. Kriging surrogates, k-nearest-neighbor nonparametric models, and deep neural networks will be tested for relating historical proxy data to known parameter values, and the resulting models will also be used for projecting into the near future. To enable self-correction and improvement, an online supervised learning approach will be employed, allowing the staged improvement of the CDT after an initial calibration period.
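
To make the idea concrete, the sketch below shows one of the candidate model families, a k-nearest-neighbor regressor, estimating a missing parameter from proxy values and absorbing new observations as they arrive. It is a simplified illustration under stated assumptions: the class name OnlineKNNSurrogate, the inverse-distance weighting, and the toy numbers are hypothetical choices, and the Kriging and neural-network alternatives are not shown.

```python
import math

class OnlineKNNSurrogate:
    """k-nearest-neighbor regressor whose training set grows online."""

    def __init__(self, k=3):
        self.k = k
        self.samples = []  # list of (proxy_vector, known_parameter_value)

    def update(self, proxies, target):
        """Add one newly observed (proxies, target) pair."""
        self.samples.append((tuple(proxies), float(target)))

    def predict(self, proxies):
        """Estimate the missing parameter as an inverse-distance-weighted
        average over the k nearest stored samples."""
        if not self.samples:
            raise ValueError("no training samples yet")
        nearest = sorted(
            self.samples, key=lambda s: math.dist(s[0], proxies)
        )[: self.k]
        # Small offset avoids division by zero on an exact match.
        weights = [1.0 / (math.dist(p, proxies) + 1e-9) for p, _ in nearest]
        weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
        return weighted_sum / sum(weights)

# Toy data, for illustration only: one proxy, one target parameter.
model = OnlineKNNSurrogate(k=2)
model.update([0.0], 10.0)
model.update([1.0], 20.0)
model.update([10.0], 100.0)
estimate = model.predict([0.5])  # midway between the first two samples
```

Because the model simply stores samples, each confirmed observation improves later estimates without a separate re-training step, which is one straightforward way to realize staged improvement after the calibration period.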

Task 6.3: Digital twinning of urban area (NTUA) To close the loop, this task will enable the near-real-time dynamic updating of the CDT model of the urban area. A data-driven machine learning algorithm will be employed to link external inputs to pre-computed scenarios of community impact/functionality for each asset. To reflect inherent uncertainties, a range of best-matching scenario outcomes will be presented, each with an associated likelihood and combination weight, from which overall outcome distribution statistics will be derived. The system will assimilate success/failure information (based on user input flagging correct and incorrect scenario picks) to re-train the selection algorithm under the paradigm of online supervised learning. This will become the first version (V1) of the integrated CDT, to be subsequently calibrated to V2 in WP7.
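
The scenario-matching step described in Task 6.3 can be sketched as follows. This is a deliberately simple stand-in, not the algorithm to be developed: it assumes each pre-computed scenario has a numeric input signature and a single functionality outcome, scores scenarios with a Gaussian kernel on distance, and adjusts a per-scenario reliability factor from user feedback. The class name ScenarioSelector, its parameters, and the example scenarios are hypothetical.

```python
import math

class ScenarioSelector:
    """Rank pre-computed scenarios against current inputs and learn from feedback."""

    def __init__(self, scenarios, bandwidth=1.0, learning_rate=0.2):
        # scenarios: name -> (input_signature, functionality_outcome)
        self.scenarios = scenarios
        self.bandwidth = bandwidth
        self.learning_rate = learning_rate
        self.reliability = {name: 1.0 for name in scenarios}

    def match(self, inputs, top_n=3):
        """Return the top_n best-matching scenarios with normalized weights."""
        scores = {}
        for name, (signature, _) in self.scenarios.items():
            distance = math.dist(signature, inputs)
            kernel = math.exp(-((distance / self.bandwidth) ** 2))
            scores[name] = self.reliability[name] * kernel
        best = sorted(scores, key=scores.get, reverse=True)[:top_n]
        total = sum(scores[name] for name in best)
        if total == 0.0:  # every scenario is far away: fall back to equal weights
            return {name: 1.0 / len(best) for name in best}
        return {name: scores[name] / total for name in best}

    def expected_outcome(self, weights):
        """Weighted mean of the matched scenarios' outcomes."""
        return sum(w * self.scenarios[name][1] for name, w in weights.items())

    def feedback(self, name, correct):
        """Raise or lower a scenario's reliability after a user flags a pick."""
        factor = 1.0 + self.learning_rate if correct else 1.0 - self.learning_rate
        self.reliability[name] *= factor

# Made-up scenarios: signature = (hazard intensity, occupancy), outcome = functionality.
selector = ScenarioSelector({
    "minor": ((0.0, 0.0), 0.9),
    "severe": ((2.0, 0.0), 0.3),
})
weights = selector.match((1.0, 0.0), top_n=2)
mean_functionality = selector.expected_outcome(weights)
```

Here the normalized weights play the role of the likelihoods and combination weights, and repeated calls to feedback shift future matches toward scenarios that users have confirmed.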