Data protection & contract design - access to machine and personal data

Episode 71 at a glance (and click):

[06:38] Challenges, potentials and status quo – This is what the use case looks like in practice
[20:33] Solutions, offerings and services – A look at the technologies used
[33:56] Results, Business Models and Best Practices – How Success is Measured

Podcast episode summary

Companies are reluctant to share data within a supply chain or with commercial customers because of the complexities of clarifying rights and obligations. How can data exchange between companies work best? What are the requirements, and what should you look out for?

Boris Scharinger, Senior Innovation Manager at Siemens Digital Industries, is addressing precisely these questions and how cross-vendor collaboration can be made possible and standardized. In doing so, he sheds light on both legal and organizational perspectives and presents possible solutions, as many companies fear the loss of trade secrets – especially in the case of overlapping data pools in AI projects. He also shows how the initiation process of IoT projects should be accelerated and simplified.

CEO of MindSphere World e.V., Ulf Könekamp, aims to help shape the future of the IIoT. He realizes this with experts from a wide range of industries. With the help of a wide variety of working groups, complementary further improvements in performance can be achieved. Where the limits of a single company are reached, further progress can be made through collaboration with other companies.

In the 70th episode of the IoT Use Case Podcast, we learn how a multilateral relationship between companies can be legally regulated and how the Ecosystem Manager can provide a remedy for possible consequences.

Podcast interview

Boris, you are a Senior Innovation Manager at Siemens Digital Industries. Siemens is the technology and innovation leader for industrial automation and digitization. You are currently dealing with what is probably the most important topic when it comes to data exchange from a legal and organizational perspective, in order to make cross-vendor collaboration possible. To start, do you have any recent examples that illustrate the relevance of the topic of data sharing from a legal and organizational perspective?

Boris

Yes, of course. There are many legal and organizational issues to clarify. The question “What is a trade secret?” comes to mind. That’s not so trivial to answer. I would like to give a small example. I’m sure we all remember how Tesla ramped up Model 3 production. The question of how high the production output of the new Model 3 would be was something that affected the stock price on a daily basis. If I imagine I have a central machine in Tesla’s production, then the timestamp data alone is very sensitive, whereas in a completely different constellation at another company, it is insignificant from a trade secrets perspective. These and other issues are exactly what our work is about.

You are not alone here today and you have brought Ulf Könekamp with you, from MindSphere World. Ulf, you are CEO of the MindSphere World e.V. association. You have been on the road since the beginning of 2018 with around 80 member companies from industry and research, but also different users. You aim to shape the future of the Industrial Internet of Things (IIoT) together and share first-hand practical knowledge. You work with technological standards that belong to it, and look to give all members the opportunity to make this open operating system usable. Is that correct?

Ulf

As you said, MindSphere World is an independent association whose members are, above all, very multi-disciplinary in composition. These include machine builders, component manufacturers, system integrators, financial institutions, business developers, and also start-ups that specialize in AI, for example.

This multidisciplinarity enables everyone to benefit from the special expertise of others and to look beyond their own horizons. The members are all experts in their industry or domain. We are engaged in mutual consulting across all borders. In this still very new topic of digital transformation, that is very valuable.

To go deeper; you say there are different contributors. Can you give a few examples or even use cases that you guys are working on at the association right now?

Ulf

With pleasure. For example, we have an Interest Group “Finance & Insurance” in which the who’s who of the German-speaking finance and leasing industry meets regularly. These include Deutsche Bank, SüdLeasing and Swiss Three, to name a few. In the neutral environment of the association, these companies provide advice on how the financial sector can support entirely new business models.

How can a pay-per-use model be mapped in the regulatory framework? How can you reasonably absorb the risk of not using an asset, and what are the different financial mechanics for new business models? What are the different customer archetypes? Because no one has yet come up with the perfect solution, such an exchange at eye level is very important for everyone involved.

In the Edge working group, a wide range of companies are involved with this relatively new technology. What are the characteristics of an edge system? What is the architecture, what are the elements? It’s not just about technology, but also about the players in an edge system and the specific application scenarios. Who does what, and which business transformations follow, when a hierarchical system suddenly becomes a heterarchical one?

A third example is the Shared Data Pools group. Participants have an advanced ecosystem mindset and have created a valuable tool for themselves and others.

Very exciting what you are doing.

Challenges, potentials and status quo - This is what the use case looks like in practice [06:38]

Boris, you are one of the leaders of the Shared Data Pool group. Cross-vendor collaboration, where does that happen today and when do I need such data pools with multiple people?

Boris

We would like to see it take place even more than it does today. Many of today’s IoT projects are bilateral – between someone who provides data and someone who then builds a model with the data, for example. Wherever we want to see solutions developed that scale, that scale beyond a project, that have the potential to become a product, these overarching data pools are very valuable.

Maybe you train a neural network for a quality inspection, for an automated quality inspection at a customer in a specific environment. I will have great difficulty getting this neural network to work at the second customer; this will then require a very large further project effort. However, if I have now trained the neural network with data from five or six different customers, perhaps even with data from several plants per customer, then I can be very confident that the solution that was created there will also work for customer seven and eight.

Put another way: That it scales commercially. This is the big challenge today in the development of predictive models. That we have to manage to get out of the condemned project mode, to create solutions and products that scale. That’s difficult because today many parties – mechanical engineers, for example – sit on their data and say from a gut feeling, no, I don’t actually want to put my data together with other companies into a larger data pool now.

You just mentioned the aspect of bringing together data from plants and customers. This is not about any weather data coming from external sources. But rather a manufacturer – perhaps a sensor manufacturer – a manufacturing company and a component supplier, for example, would have a joint project. Would that be such a scenario, what you describe, or how do you have to imagine such projects?

Boris

Exactly. I think structurally there are two scenarios. One scenario is, there are different manufacturers of, for example, machines that are all used in a process. These manufacturers can generate additional customer benefits by coordinating their machines throughout the entire process. To be able to generate this reconciliation, one starts to integrate data on behalf of the customer along the value chain and the production steps.

The other scenario is simply that there is a very specific function in a production that several companies have. Perhaps inspecting circuit boards to see if they are properly populated. A lot of companies have that and say that’s a use case, I’d like to see that work a little bit better. In terms of use cases, I’m now joining forces with five or six other companies and we’re saying that we’re going to combine our circuit board data and measurement data and see if we can optimize our quality inspection function.

An example would be you have a manufacturer of glass talking to their suppliers and you find there are similar processes and also data that can be shared to exchange quality or survey data among others. Previously, the break here was that people said everyone was doing their own thing and building their own data pools. Now you collaborate and share that data.

Going further, why would I share this data? Do you have some examples from your practice here?

Boris

AI models scale or generalize better when I can work with data from different companies and different plants. It is also the case that our German productions in particular are so highly optimized that the availability of defect data is a very difficult issue. That’s another place where companies that get together can create data pools that get a little bit more to the critical mass of error data so that these training operations can even take place.

I would like to come back to one of the use cases that Ulf has already talked about – the topic of pay-per-use models. I have an insurer as a stakeholder in my ecosystem, and a lessor. I then have an end customer and perhaps a manufacturer of a machine. That, too, is already a multilateral data exchange relationship that has to be organized. Of course, the insurance company has a legitimate interest in knowing the maintenance status of the machine, and so does the lessor. The latter may also need to know the throughput.

There are very different technical issues that require multilateral data sharing, but also new approaches in business models where it is necessary.

Can you provide some insights into why companies are also hesitant to share their data in these scenarios?

Boris

Of course, the fact is that the multilateral contracts involved are quite an effort. In practice, we see that the contractual initiation of a joint IoT project can take six, nine or even twelve months. This also has something to do with the fact that the expertise I need to understand IoT, AI and the legal aspects is rare. There are no standards and no established mechanisms for trustworthy, multi-vendor industrial data exchange. That is different, by the way, from what other industries have already created. I am talking here about contract standards or methodological procedures that are multilaterally recognized, as an example.

Added to that is the question of what a trade secret actually is. A lot of companies say from their gut: no, I don’t know exactly what the other party might be getting out of my data. A structured approach to looking at a required data set always has to be use case specific. This also raises the question: How do I determine in a structured way what the trade secrets in there are, and can I maybe take them out and still develop the algorithm? Can I camouflage them? Can I tailor my records and say a partner company only ever gets four thousand records a day so you can’t read out the throughput? These are all topics that are not basic know-how in our industry. As a consequence, this initiation costs a lot of time and money; that doesn’t make it much better.
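The idea of tailoring records so that a partner never sees more than a fixed number per day can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a `timestamp` field plus a `value`), the function name `cap_daily_records` and the fixed random seed are hypothetical and not taken from the working group's tooling.

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta


def cap_daily_records(records, daily_cap=4000, seed=0):
    """Return at most `daily_cap` records per calendar day.

    Days with more records than the cap are randomly subsampled, so the
    number of shared records no longer mirrors the real daily output.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_day = defaultdict(list)
    for record in records:
        by_day[record["timestamp"].date()].append(record)

    shared = []
    for day in sorted(by_day):
        day_records = by_day[day]
        if len(day_records) > daily_cap:
            day_records = rng.sample(day_records, daily_cap)
        shared.extend(day_records)
    return shared


# Hypothetical data: 10 records on the first day, 3 on the second.
start = datetime(2021, 3, 4, 8, 0, 0)
records = [{"timestamp": start + timedelta(seconds=i), "value": i} for i in range(10)]
records += [{"timestamp": start + timedelta(days=1, seconds=i), "value": i} for i in range(3)]

shared = cap_daily_records(records, daily_cap=5)

# Count what the partner would actually receive per day.
per_day = defaultdict(int)
for record in shared:
    per_day[record["timestamp"].date()] += 1
```

Note that a day with fewer records than the cap still reveals its true count, so the cap would have to sit below the lowest realistic daily output to hide throughput completely.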

Exactly, so here we have the issue of trade secrets and the effort that goes along with a new skill set, which you also need as a lawyer within a company to be able to look through and understand this issue. There are bound to be some learning curves there as well.

There is also the issue of GDPR, or rather, who owns the data in the end? Especially with so many data pool donors.

Boris

Interestingly, the concept of ownership of data does not actually exist. The data that arise do not belong to anyone in the narrower sense. But the information that is in them belongs to someone. If I put those into a neural network … maybe I’m talking about IP, Intellectual Property, which I can then protect. In the process, by the way, I also have to reach an agreement – how do I share this with the partner parties?

The other thing is: trade secrets are protected. Trade secrets are mine. In fact, these are legal situations … we talk a lot about the data owner – who legally doesn’t even exist! These are such hurdles and learning curves that you have to go through. After that, you can also try to structure the process and possibly create a standard for other companies to follow.

Why is it a problem today that companies are not moving in the direction of bringing data together and also creating a critical mass of data somewhere, even in the future?

Boris

We also say in the working group that we will only be able to leverage the potential of artificial intelligence on the shop floor if we get out of the project environment in which we are currently stuck. This is going to be a big issue over the next few years. It is already the case that many people in companies are showing signs of fatigue. They say, well, we shimmy from proof of concept to proof of concept, from feasibility study to feasibility study. But how do we make money with AI solutions now? Or the other way around, how do we genuinely raise productivity on the ground through the use of AI? Today, this fails due to the lack of cross-company data pools.

I would like to make one more comment on the topic of International Data Spaces and GAIA-X, because that is a big European-driven topic. This is great, important, and provides an important technical infrastructure for how data can be shared securely in the future. But I still have to create a project organization beforehand, and I have to create a legal agreement between the project parties … How is it shared, what is shared, for what purpose is it shared? How is emerging IP shared? How do I deal with liability issues? What if one of the participants doesn’t have the data source lined up that the project needs to implement the solution? These are all issues that technical infrastructures, such as GAIA-X and IDS, do not address. That’s where we’re trying to start with our working group.

If we want to tackle the issue and implement this together: What are the requirements that you have identified for such a system, what it has to bring to make cross-data collaboration possible from a legal perspective?

Boris

Important question. That was the core of our working group. We have deliberately said we are not building yet another IoT platform now. We are not concerned with a secure platform for data exchange, but we want to simplify and accelerate the initiation process of these IoT projects, these shared data pool projects. This acceleration has something to do with matchmaking.

I have a use case that I want to work on. I also have an algorithm and now need to create a cross-company data pool for this. Or would like to contribute with data that I have. I’m tasked with doing a risk analysis and realizing that this specific data set is required for this use case and I look at that and do an analysis: what trade secrets could be in there for me as a company that I need to watch out for? Then you can deal with it. I just have to handle it contractually and say, there are trade secrets of mine in there; I want to know every other contractor now and the data can’t be shared. Or is it possible to deal with it technically? I convert and transform a particular data point into another data point that has less trade secret exposure to me.
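Dealing with a trade secret "technically" by converting one data point into a less revealing one could look like the following Python sketch. It replaces an exact timestamp with a coarse shift label, so that time-of-day effects stay usable for a model while the exact production cadence is no longer visible. The shift boundaries, the field names and the helper names `to_shift_label` and `mask_record` are assumptions made for this illustration only.

```python
from datetime import datetime


def to_shift_label(ts: datetime) -> str:
    """Map an exact timestamp to a coarse shift label (hypothetical boundaries)."""
    if 6 <= ts.hour < 14:
        return "early"
    if 14 <= ts.hour < 22:
        return "late"
    return "night"


def mask_record(record: dict) -> dict:
    """Return a copy of the record with the exact timestamp replaced by a shift label."""
    masked = {key: value for key, value in record.items() if key != "timestamp"}
    masked["shift"] = to_shift_label(record["timestamp"])
    return masked


# Hypothetical inspection record from a circuit board line.
record = {"timestamp": datetime(2021, 3, 4, 15, 27, 9), "board_id": "A17", "defect": False}
masked = mask_record(record)
```

Whether such a transformation still leaves enough information to train the model is exactly the kind of use case specific question the risk analysis has to answer.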

I also have issues related to the IP rights of the solution. A project initiation must also allow me to talk. If I give away tens of thousands or hundreds of thousands of data records, i.e. if I put in the effort to get the data quality-cleaned and secured into this data pool – what do I get? Do I get rights to use the model? Do I get marketing rights to the model? And so on. Then it’s a matter of a normal Statement of Work; a normal project contract that says, this and this are the roles, participants, and obligations of the appropriate parties involved in the project. Who manages the project? Who gives data, who makes a model out of it, what are the handover points?

Last, but not least, I need some control mechanisms on top. I need to think about how do I deal with competition law? Is this already a cartel that I might be forming, or am I running into the danger? What happens if one of the participants wants to leave the project and says, I can’t manage to deliver the data as required? Suddenly one of the hyperscalers can knock on the door and say, I would have liked to participate in this project, is that possible? For that, I need steering structures – that there are then voting mechanisms to deal with such project situations.

Solutions, offerings and services - A look at the technologies used [20:33]

Within the MindSphere World working group, you guys have made it happen and are working together on an IoT template generator to enable this collaboration across data. How does that work exactly, what you have developed there?

Boris

We have created the so-called MindSphere World Ecosystem Manager. This is a platform where I can announce use cases and also look at existing use cases in the marketplace, so that I can then decide: Do I want to apply to be involved in one of these use cases, for example by contributing a capability? In fact, besides this marketplace, the moment a project is … I’ll call it “configured ready” … the parties involved are fixed; there’s been a structural commercial agreement; there’s been a discussion on how to deal with IP, and there’s been an agreement. Then I can configure that. I have configuration options for all of them in my project setup and press the button. Then a framework contract is generated from the project setup for the entire project and all participants; in some cases, specific contracts for individual service packages are also generated.

For example, there is the trusted data processor, with whom a Data Processing Agreement is concluded between the parties involved and this party. It says exactly how the data pipelines are set up organizationally and what legal duties and obligations the trusted data processor must fulfill. Say, as an example, that all stakeholders in the project agreed that there is an option to audit. This audit option ensures that an independent external auditor checks whether the technical implementation of the shared data pool actually complies with the contractual agreements. Then the data processor must have audit clauses in their service contract. These say that, with so and so many days’ notice, we can announce an audit, and then you have to have an outside party look at the whole issue.

That’s an issue where we check off; okay, we need the auditor and the audit capability, and then there are additional passages put in there in the contracts and contract templates.
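The principle of generating contracts from a project setup, with extra passages appearing once the audit option is checked, can be sketched roughly as follows in Python. This is not the Ecosystem Manager's actual implementation: the class `ProjectSetup`, the section titles, the party names and the default notice period are all hypothetical and only serve to show how a configuration flag might pull additional clauses into the templates.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProjectSetup:
    """Hypothetical project configuration from which contract outlines are derived."""
    use_case: str
    parties: Dict[str, List[str]]  # party name -> roles in the project
    audit_option: bool = False
    audit_notice_days: int = 30  # assumed default, not from the source


def framework_contract_sections(setup: ProjectSetup) -> List[str]:
    """Outline of the framework contract for all participants."""
    roles = "; ".join(f"{party} ({', '.join(r)})" for party, r in setup.parties.items())
    sections = [
        f"Purpose: {setup.use_case}",
        f"Parties and roles: {roles}",
        "Handling of trade secrets",
        "Rights to resulting IP",
        "Liability",
        "Steering and voting mechanisms",
    ]
    if setup.audit_option:
        sections.append("Right to commission an independent audit")
    return sections


def data_processor_clauses(setup: ProjectSetup) -> List[str]:
    """Outline of the agreement with the trusted data processor."""
    clauses = [
        "Organizational setup of the data pipelines",
        "Duties and obligations of the trusted data processor",
    ]
    if setup.audit_option:
        clauses.append(f"Audit clause: audit may be announced with {setup.audit_notice_days} days' notice")
    return clauses


setup = ProjectSetup(
    use_case="Circuit board quality inspection",
    parties={"Company A": ["data donor", "orchestrator"], "Company B": ["data donor"]},
    audit_option=True,
    audit_notice_days=14,
)
framework = framework_contract_sections(setup)
processor = data_processor_clauses(setup)
```

With `audit_option=False`, both outlines simply omit the audit passages, which mirrors the "check off" behavior described in the interview.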

Very exciting. So one is matchmaking; parties that want to tackle data projects like this together, but then also all these features and functions that I need to create a contract template like this, which is probably a PDF file in the end, right?

Boris

We decided to make it a Word file in the end, because we say that lawyers have to be able to look over it and possibly add one or two half-sentences. The flexibility should be there. However, it is also important to me that these contract templates have been and will be reviewed by multiple parties in advance so that it can be assumed that this is a fair draft contract that does not disadvantage anyone in the project environment.

Do you have an example from the private environment where such contract templates also come into play? I’m sure there’s something there to make it easier to understand.

Boris

I can think of some right away. We all know the ADAC. Not only because it gives us roadside assistance when in doubt, but also because I can go to the ADAC site to sell my car. I can download there a contract template that regulates all crucial aspects of a used car sale, and without being somehow disadvantageous for the seller. This is a credible platform. When I come up with a contract template, usually the other side has no problem at all going along with it.

A similar example is renting out an apartment. Some of us may be doing that. The moment I have a new tenant, I don’t sit down at the computer and open a new blank Word file and start writing and thinking about what needs to go into the lease. But there is the landlord association, “Haus & Grund” – possibly I also have to deposit 5 euros at PayPal, but then I also have a wizard that guides me through a few questions and then generates a rental agreement absolutely accepted by all.

I end up saving myself time and money with it.

Boris

Right, that’s one thing. But I also have credibility because I didn’t sit down and write anything myself.

You said there are different roles. Whether it’s usage and marketing rights at the end or different roles I can take as a data provider or data taker. Can you give examples of which parties are taking a seat there? What roles are there also within the ecosystem manager?

Boris

Basically, there are different roles, and each party can have one or more of these roles. For example, we could have a situation where a Fraunhofer Institute is an idea generator. An idea generator for a use case, who may also bring the algorithm, is one of these roles. That’s where IP needs to be regulated.

There is also the role of the orchestrator. This is the party that manages the project – also for a fee, by the way. And who says across multiple parties: I’ll make sure we have our project plan here, that the data quality review happens, and that a service provider is subcontracted.

In addition, there is the role of data donors, who participate in the project by contributing data. The role of the model developer – potentially, by the way; because it could also be that all parties involved agree that the project stops the moment the shared data pool is established. Then each party with access to the shared data pool would subsequently build its own model.

Stakeholders may also agree to make a joint assignment for a model developer. Then the result of the joint project is the model that is distributed to all the parties who may have put data into it. Or you say, no, there is an application as an output of the project; that is, there is someone who builds another application around the model, a real application. This is the deliverable that is then distributed back into the project community.

These different constellations exist. I might need an IT service provider to build the applications for me. This may be a different service provider than the company that develops the model for me. Our platform brings the flexibility to decide that freely. The auditor as an example, who is commissioned and integrated into the project in rare cases. And the trusted data processor, who builds the pipelines; who then has to get the data transformation in this common shared data pool right.
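The statement that each party can hold one or more roles maps naturally onto a small data structure. The Python sketch below lists the roles named in the interview and assigns them to placeholder parties; the party names and the helper functions `parties_with` and `unfilled_roles` are hypothetical and only illustrate one possible representation.

```python
from enum import Enum


class Role(Enum):
    """Roles named in the interview."""
    IDEA_GENERATOR = "idea generator"
    ORCHESTRATOR = "orchestrator"
    DATA_DONOR = "data donor"
    MODEL_DEVELOPER = "model developer"
    IT_SERVICE_PROVIDER = "IT service provider"
    AUDITOR = "auditor"
    TRUSTED_DATA_PROCESSOR = "trusted data processor"


# Placeholder parties; each one may hold several roles at once.
assignments = {
    "Institute A": {Role.IDEA_GENERATOR, Role.MODEL_DEVELOPER},
    "Manufacturer B": {Role.DATA_DONOR, Role.ORCHESTRATOR},
    "Manufacturer C": {Role.DATA_DONOR},
    "Service Provider D": {Role.TRUSTED_DATA_PROCESSOR},
}


def parties_with(role, assignments):
    """Return the sorted names of all parties holding the given role."""
    return sorted(party for party, roles in assignments.items() if role in roles)


def unfilled_roles(assignments):
    """Return the roles that no party in the project currently holds."""
    filled = set().union(*assignments.values())
    return set(Role) - filled


donors = parties_with(Role.DATA_DONOR, assignments)
missing = unfilled_roles(assignments)
```

In this example the auditor and IT service provider roles stay unfilled, which matches the point that some roles are only brought in when the project calls for them.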

That means if I’m a glass manufacturer or – your example – from electronics manufacturing, I can look at having a machine learning or data science team in-house. Then I can also bring in this data. I can also solicit data from a supplier or a customer externally; then I can put that project out to bid and solicit input from the other stakeholders or share data. This also works quickly with a tool to be able to work together in a standardized way – is that right?

Boris

Exactly. It’s just always important to me that no one confuses our platform with the actual IoT platform through which the data exchange is then technologically enabled. It isn’t that. We are all about supporting the initiation process. Once the project contracts are generated, the job of our platform is done for now.

The IoT platform, that would then again be a role as a hyperscaler or if I already have a platform myself; independent of that – practically a building block that you can then create there?

Boris

That’s a building block that you want to put on there; and if someone chooses to use Google Drive for that, then that may be a legitimate implementation.

We have various IoT partners, manufacturing companies, machine and plant builders listening right now. If I find this exciting, how do I join you guys? Do you have the openness within the working group to say, hey, join us, or how do I join you?

Boris

Exactly. First of all, we are looking for companies that want to take the step from today’s very bilaterally structured IoT and model development projects into the multilateral area. You are welcome to contact us through iotusecase.com. That is one possibility. The other option is to join the MindSphere World community! We have quite diverse activities. The platforms we have developed for MindSphere World will remain permanently free of charge for members, so companies that are interested in the working group or in using the platform can contact Ulf Könekamp or me at MindSphere World.

The corresponding links can be found in the show notes.

You are also very upfront with the topic. You probably have some capacity going on in-house, and also the expertise that you bring from MindSphere World, that you’re working together on a topic like this. That’s probably profitable for many out there to join in there.

Results, Business Models and Best Practices - How Success is Measured [30:56]

Ulf, you said you have very different companies with you that are all about bringing these competencies together. Can you summarize why it’s an important issue for many of your members?

Ulf

Many companies have now realized that they can no longer achieve the right innovations or efficiency improvements on their own. But that they have to do it together with others. That they use the complementary competencies of others and thus create ecosystems in order to be able to offer things that no one else could offer alone. It is always this interaction of several. That’s a real mindset change, where companies don’t think I can do this alone or bilaterally, but work together in a larger group. These include, for example, Cross Supplier Solutions. These offer enormous potential because the companies do not simply meet a specification, but can think ahead together and thus achieve more.

All these new technologies that we’re seeing – whether it’s edge or whether it’s cloud – become particularly valuable when data is shared and put into a new context. So, for instance, adding another company’s data or weather data to it, and so on. Then completely new possibilities and insights arise, from which one can again draw a benefit.

Many companies fear losing company secrets and running into legal problems as a result of this data exchange and partial disclosure of their data. Of course, this also includes ensuring compliance. At this point, at the latest, the Ecosystem Manager comes into play. Contract templates tailored specifically to data use then protect companies and also enable them to move into data-driven business models, which would otherwise not be so easy.

Boris, to put this in perspective, where do you currently stand with Ecosystem Manager and what’s coming in the future? There are quite a few different parties, including banks and insurance companies, that come into play.

Boris

As a result, we intend to enable shared data pools in such a way that we overcome the hurdle of legal fears, the hurdle of data sharing. We want to help companies get over that and move them more toward data economy issues, toward new business models, but also more toward – very practically speaking – commercial scaling of industrial AI.

One more comment: we have a geopolitical issue in Europe with our fears about data sharing, and with our rights that we have and that we value. We have to make sure that we don’t fall behind geopolitically, because there are other cultures that handle data sharing differently and have developed different foundations, for example for AI applications. We can do that, too. We can catch up and keep up by dealing with the hurdles we’ve imposed on ourselves – for good reason – in a structured way, and then accelerating within the structure we’ve created for ourselves. That is what we will continue to work on, and where we believe we will be able to make a difference sooner or later.

This was a nice appeal to all the listeners. Let’s tackle this issue together and move forward. It’s a really important topic in the direction of innovation, and it enables collaboration between everyone that ends up lifting the value of new technologies and making the business cases fly. Ulf, I’ll give you the last word for today. Innovation is an important topic, also for you in the association, and will remain so in the future, right?

Ulf

Exactly. True innovation comes from bringing together many different competencies and entirely new forms of collaboration, across all levels of the value chain. The basis for this is data exchange. When everyone knows more, everyone can benefit and the overall outcome is better.

Modern industrial IoT technologies are the basis for this. Be it edge systems, which enable much simpler and more modular architectures with their heterarchical structures, or cloud systems, between which data, and thus knowledge, can be exchanged very easily, but always in a controlled and dedicated manner. The whole thing is always based on reasonable contracts and the knowledge of who deals with whom and how.
