Expert interview

Prof Dr Lutz Heuser
on the subject of data

We also talked to Prof Dr Lutz Heuser on the subject of data, its handling and security.

Data must be collected and processed for every application and evaluation.
How do you get a grip on this incredible amount of data?

Prof Heuser: Over the past few years, we have successfully completed an important development in IT, that of “big data” analysis. We are also making use of it. An enormous volume of data has always existed and been collected in companies, and this continues to be the case. Think about large corporations that send out their invoices on a monthly basis, for example.
Big data algorithms make it possible to process data in real time. In the past, data processing was triggered in “batches”. The calculation period lasted four weeks, and then the invoice was sent. With the new big data technology, which has been around for the last 5 to 6 years, the entire process can be compressed so that, depending on the problem, everything can be calculated in less than a second or, at most, in a few minutes. The underlying technology for evaluating the data is machine learning and artificial intelligence.
Nevertheless, it’s safe to say that in 10 or 15 years there will be a flood of data that today’s computer architecture will no longer be able to handle computationally. This means that at some point there will be a new computer architecture that is certain to look completely different from the one we are using today.
In the short term, this will not be an issue for us in the Smart City segment. At the moment, it is a topic of basic research that is currently transitioning to applied research. The first computers of this generation will be put into operation sometime around the beginning of the next decade. Then it will be possible to see how quickly these computers are developing.

How is this abundance of data reduced to its essentials?

Prof Heuser: The core of Smart City can be broken down into two fields. This won’t always be the case, but at the moment it is true: People want predictions. These are a combination of the current situation and historical data, meaning experiences from the past. That’s the classic “machine learning” principle. Forecasts for a certain time are derived from historical data and actual data. This method is used for parking, traffic, public transport transitions and energy distribution. A huge data set is reduced to a single statement, for example: “The traffic light will turn green in 15 seconds.”
The other category is referred to as event-driven. Something has happened, and because that has happened, another action is triggered. In most cases, this event is not a single event, but a combination of events, that is, an event chain, and the technical term for this is “complex event processing”. Many individual events together result in a complex event.
It must be possible to describe such complex events. This requires programming an if-then combination that triggers an action in the applicable situation. For example, when danger is detected, a luminaire raises its light intensity to 100 percent and simultaneously places an emergency call.
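
As an illustration of the if-then combination described in this answer, the following is a minimal Python sketch, not the platform's actual implementation. The event kinds ("motion", "loud_noise"), the 10-second window and the action names are all hypothetical assumptions chosen for the example; only the reaction itself (100 percent light intensity plus an emergency call) comes from the interview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str         # hypothetical event type, e.g. "motion" or "loud_noise"
    timestamp: float  # seconds


# Assumption for this sketch: "danger" is the complex event formed when
# all of these individual event kinds occur within one time window.
DANGER_KINDS = {"motion", "loud_noise"}
WINDOW_SECONDS = 10.0


def detect_danger(events):
    """Return True if every kind in DANGER_KINDS occurs within one window."""
    relevant = sorted(
        (e for e in events if e.kind in DANGER_KINDS),
        key=lambda e: e.timestamp,
    )
    for i, first in enumerate(relevant):
        seen = set()
        for e in relevant[i:]:
            if e.timestamp - first.timestamp > WINDOW_SECONDS:
                break
            seen.add(e.kind)
        if seen == DANGER_KINDS:
            return True
    return False


def react(events):
    """If-then rule: when danger is detected, return the actions to trigger."""
    if detect_danger(events):
        # Hypothetical action names; the luminaire goes to 100 percent
        # and an emergency call is placed at the same time.
        return [("set_light_intensity", 100), ("place_emergency_call", None)]
    return []
```

Calling `react` with a "motion" event and a "loud_noise" event four seconds apart would return both actions, while the same two events thirty seconds apart would return an empty list, because they no longer form one complex event.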

How does this data become “compatible” with existing data formats and systems of a city or municipality, or does a “parallel world” develop here?

Prof Heuser: That’s a very contemporary question. Efforts in this regard are being made through the DIN standard of the open urban data platform. That was the first step, so to speak. In a second step, we are currently in discussions with DIN about a supplement to DIN 91357 – the open urban data platform. This standard deals precisely with the compatibility of data formats in order to provide even more planning security, not only in the way the platform is structured but also in terms of what the data formats should look like.
We are currently comparing this with the “EDIFACT” standard from industry. This standard is a cross-industry international standard for the format of electronic data in business transactions. For example, EDIFACT tells you how an invoice, an address or a quotation has to look, making it possible to process invoices and quotations between computers online.
We want to discuss such an EDIFACT-like idea for municipal data. Certain things are the same in every city. We want to package these commonalities into a common data model. Proposals have already been made, but there is still no standard.

Does this mean additional effort and costs for the municipality?

Prof Heuser: No. Here we encounter the catchwords re-use and re-purpose, that is, re-using and re-purposing existing data. The fact that we merge the data together and use it doesn’t incur additional costs.
It is, of course, necessary to incur some costs: Digital transformation is an investment. It’s not free. A city naturally needs to fund additional expenditures to make the digital transformation happen, but something is received in return that justifies this extra expense.

There are certainly issues to be considered very critically when it comes to the matter of data, such as data security. How do you deal with this and what do you recommend to your customers?

Prof Heuser: In Germany, where we have very strict requirements, there is currently a trend towards having the respective “sensitive components”, such as cameras, processing the data in the device itself. IT experts call this “edge computing”, or “computing at the edge of the net”. Consequently, these devices do not transfer the image itself but only the evaluation of the image. This ensures that the privacy of the people who can be seen on the images is protected.
Many traffic lights are also camera-controlled in cities. The cameras do not pass on individual number plates or vehicle models, just the number and type of vehicles (i.e., car, truck or bus). If an image has to be passed on, it is done in pixelated form. The component does this, thus guaranteeing compliance with the General Data Protection Regulation. Consequently, we work with non-personal data.
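
The edge computing principle described here, where the device passes on only an evaluation and never the raw image or number plates, can be sketched as follows. This is a minimal illustration under stated assumptions: the detection records and their field names ("type", "number_plate") are hypothetical, and real devices would run an image classifier before this step.

```python
from collections import Counter

# Vehicle types named in the interview; only these are counted.
ALLOWED_TYPES = ("car", "truck", "bus")


def summarise_on_device(detections):
    """Reduce raw detections to per-type counts.

    Each detection is assumed to be a dict with a "type" field and
    possibly personal fields such as "number_plate". Only the counts
    are returned, so nothing personal leaves the device.
    """
    counts = Counter(
        d["type"] for d in detections if d["type"] in ALLOWED_TYPES
    )
    return {t: counts.get(t, 0) for t in ALLOWED_TYPES}
```

The design choice is that the filtering happens before anything is transmitted: the platform receives a dictionary of counts per vehicle type and has no way to reconstruct who or what exactly was seen.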
The situation is different when official tasks are involved. That’s why data infrastructures between police and third parties cannot be used at present. The police have the legal authority and the official task of identifying persons in pictures, at airports or train stations, for example. That has nothing to do with the General Data Protection Regulation, and this data may not be mixed with other municipal data in one data set.
If the official tasks are excluded, the data can be used in compliance with the General Data Protection Regulation. This is done on these urban data platforms. We do not process any personal data on our platforms, but only data that does not allow any conclusions to be drawn about individuals.
This even applies to WiFi data we get from routers. The routers filter out anything personal because this information is not permitted to leave the router. For this re-use according to the above-mentioned principle of “re-use and re-purpose”, we only learn that there was a person there and that the person was there during this time period. We don’t find out exactly who it was. This is always taken care of in the devices themselves. In this respect, and here I am happy to repeat myself, the data we work with is covered as far as data security is concerned.
As for protection against intervention in critical infrastructures, this must be handled by the computer architecture. We have also described this in the DIN standard. We, as a company, place what is known as a “gateway server” behind the firewall and use it to collect data. In this way, the actual system is not directly connected to the Internet, and our customers (e.g., municipal utilities and infrastructure operators) are fine with that.
This means that the data security of the management systems is controlled through gateways, and personal data security is controlled by processing through edge computing in the devices on site.
If the data platform is hacked, all services are affected. The usual data security criteria, which must always be guaranteed when operating a service like ours on the Internet, have to be met. To this end, we use Microsoft or SAP cloud services that make these mechanisms available within the cloud infrastructure and whose standards are very high.
This gives us the three most important components: secure access to data, anonymisation where necessary and use of the IT security infrastructure of large cloud providers.

Are there standards in this area, or are there plans to introduce them?

Prof Heuser: In addition to DIN 91357, already mentioned, there is also DIN 91347. The latter refers to imHLa, that is, street lighting in the sense of the digital hub. The former, DIN 91357, refers to open urban data platforms.
Standard DIN 91367 is currently being developed. It relates to mobile data, in other words, the mobility data a city would generally have to make available in order to enable these new real-time applications. DIN 91377 has been reserved for the common data standard that we have already discussed.
There are a whole series of standards that build on each other to address this topic. However, there is no single Smart City standard. That would be far too complex. The entire complex of topics is divided into smaller units in order to deal with them better.