The 48th episode of the Industrial IoT Use Case Podcast is about industrial cameras that not only take pictures but also make decisions on their own. Patrick Schick (Product Marketing Manager, IDS Imaging Development Systems GmbH) and Alexey Pavlov (Founder and CEO, urobots GmbH) talk about AI-based cameras with eyes and a brain, and the far-reaching added value they bring.
Podcast episode summary
IDS Imaging Development Systems presents IDS NXT, a platform for industrial applications ranging from classical to AI-based image processing. With its industrial cameras, IDS delivers not just images but directly usable results. Thanks to integrated image processing and a wide range of communication interfaces such as OPC UA, the cameras can visualize the results of a process and control machines and systems – without being connected to a PC.
What does a process in which the IDS NXT cameras are used look like? One example described in the podcast is the final inspection in the production of sealing rings. During AI training, a product expert assigns each sealing ring a label, e.g. good part or bad part, based on its condition. Images are taken of these labeled parts; 10 to 20 images are already enough to train the AI successfully. The expertise of the person at the final inspection is thus poured into the AI, so that the robot can carry out the final inspection independently and distinguish good parts from bad ones. The big advantage: no in-house knowledge of artificial intelligence is required to work with and on the camera. The only knowledge required is expertise in the relevant products.
But there is another way: In this podcast episode, IDS brought along its customer urobots, which has already developed its own image processing AI and uses the AI accelerator running on the IDS NXT camera. Knowledge is shared on an open platform. IDS thus serves two different types of customers – those with prior knowledge of AI and those without.
What are the benefits of the cameras? An accuracy of 99.9% in quality control and thus a reduction in complaint costs. By having robots take over monotonous tasks, employees can be deployed more effectively in other areas. Users also incur lower investment costs for their own hardware, partly because the IDS NXT does not require a PC, and they don't need their own image processing department, as the AI does the whole job. The cameras also work very quickly: urobots reports approximately 200 milliseconds for the software to capture the positions of all objects in a camera image – that is, five acquisitions per second. The detection accuracy is ±2 degrees and ±2 pixels. Faster computing time means decisions can be made sooner, which in turn increases the cycle rate of the machine.
Podcast interview
Hello Alexey, hello Patrick, welcome to the IIoT Use Case Podcast. I'm so glad you joined us today and took the time to share some insights on your topics with our listeners. I'd say we'll start with a quick round of introductions. I'm going to look your way, Patrick. Would you like to briefly introduce yourself and share one or two facts about your company, your core business and what exactly your focus is?
Patrick
Thank you, I'll be happy to do that. My name is Patrick Schick. I am a Product Marketing Manager at IDS, IDS Imaging Development Systems GmbH. We are the world's leading manufacturer of digital industrial cameras, located in southern Germany, the most beautiful part of Germany. We have been developing and producing products in the field of industrial image processing since 1997. What's really exciting about us: we have everything here at our site in southern Germany. That means we research, develop and produce everything directly on site, so we have extremely short distances and can transfer things directly from research into production. Our portfolio includes classic 2D industrial cameras, 3D active stereo vision cameras, and IDS NXT, our embedded vision platform. This is a platform for industrial applications ranging from classical image processing to AI-based image processing. With this platform we not only deliver images, which is the typical result of an industrial camera; through the integrated image processing and various communication interfaces such as OPC UA or the REST protocol known from the IoT field, we can deliver results directly, just like any other sensor, and thus also interact directly with a plant and control it.
You brought a customer with you today – urobots GmbH. Hello Alexey, once again in your direction: thank you so much for taking the time to be here today as well. I'd say we'll round out the introductions briefly and then get into the topic of image processing. Alexey, would you like to introduce yourself briefly – what you do, what your personal responsibilities are in the company, and who your customers are?
Alexey
Yes, Madeleine, I'd love to. I am Alexey Pavlov, the founder and CEO of urobots GmbH. urobots is a software development company based in Pforzheim, Germany. We specialize in image processing for quality and assembly control as well as in robot vision. Our customers come from the manufacturing sector – either manufacturers who want to retrofit their own lines or automation companies who build modern production facilities.
You have just mentioned your customers. How should we picture that? Who are you talking to – is it more the IT people or the technical people? Who are the customers you talk to the most, or who are your contacts at the customer?
Alexey
It varies quite a bit. Mainly process owners – for example, the people responsible for making the line run well, the maintenance team or the IT managers. The people who really care that production runs well and error-free and that the quality of production is good.
You guys at urobots are primarily concerned with software, as you just pointed out, but also with machine learning and computer vision algorithms. Now a question for you, Patrick. We’re talking about image processing today, but also in the context of your industrial cameras. Maybe just to get started: You come from the classic industrial camera business. What market changes do you see now in your customer segment and what is the relevance of digitalization in the broadest sense here?
Patrick
I would start with digitization, because that is answered quickly. We are a manufacturer of digital industrial cameras, so as far as digitizing images is concerned, that is complete – it has been for a long time. We are now working on integrating this more deeply into the customer's application, so that we are not just a pure supplier but a provider of results. What market changes do we see? Our customers and users are looking more and more for ready-made solutions and much less for individual components. We are actually a classic component manufacturer, so how do we come to offer the customer a solution? Customers want to contribute their expertise as quickly as possible and, ideally, not worry about image processing – let someone else do that for them. So one market change is that customers are looking for solutions. The other big change we're seeing is that with artificial intelligence now a mass product in image processing, completely different opportunities are emerging. First, existing applications can be significantly improved by replacing a classic, rule-based algorithm with AI. Second, entirely new applications become possible. If you look at the food or agriculture sector, you can suddenly solve applications that were not possible before, because we are working with organic systems that are difficult to describe in a rule-based way. For example: describe the difference between a crop and a weed growing in the field. If you tried to program that as an algorithm: both are green, but one leaf shape can be one way, the other another. A person, however, can simply say, "That's a weed, this is a useful plant." That is, they can rate it, label it. And it is precisely this knowledge that we can train into artificial intelligence and build entirely new solutions with.
This topic of labeling and AI – I'll come back to that in a moment. Alexey, you just mentioned quality control and the processes that take place at your customers' sites, which you ultimately support with your software stack. I would now be interested to know: what does it actually look like on site at a customer's? Can you paint a virtual picture of a typical customer process?
Alexey
As an example, let's imagine a production line with objects on a conveyor belt that are to be processed by an automated station. The positions of the objects on the belt are random and they are scattered all over the surface. Someone has to recognize these objects, pick them up and do the desired work, e.g. place them into packaging or feed them to a further processing or analysis station. Such tasks are also called pick and place. Often this is done by humans, and the work is simple but monotonous, so humans could well be replaced by robots at this point. Our goal is to enable this replacement or supplementation of the workforce. The system solution for this consists of the IDS NXT camera, our image processing software and any collaborative robot. The NXT camera can be thought of as the robot's eyes with a brain. We also want to offer our customers a way to modify and customize our solution for different use cases. This means that if something suddenly changes in production, e.g. the lighting or the appearance of the objects, or if the customer wants to process new object types with the same system, this should be very easy for the customer to implement.
Now I'd like to steer us a bit toward the industrial camera and the IDS side. You said it's about pick and place, and an IDS NXT platform or solution is being used here. How should we picture this in practice? What was the challenge, and how did you solve it with the camera you just described?
Alexey
In our company, we had actually already developed a PC-based solution for object detection and robot control. For this purpose, we developed our own artificial intelligence model that is able to recognize the position and orientation of objects in images. This time, our goal was to port this PC-based solution to the IDS NXT platform by developing a dedicated NXT app. The challenge consisted of two major tasks. First, we needed to run our own AI models on the camera hardware, i.e. on the IDS NXT. The PC solution uses the graphics card, the so-called GPU (graphics processing unit), to run the models, while the camera uses an FPGA (field programmable gate array), which is a different technology. Therefore, we adapted our artificial intelligence models and converted them into FPGA models using a special tool from IDS. Second, our software that communicates with the robot and converts the 2D coordinates of detected objects into the 3D coordinates of the robot is written in Python, while NXT apps are developed in C++, so we had to adapt our code. Since we already had experience developing software for the NXT camera, we were able to complete this successfully within a few weeks, and now you virtually don't need a PC anymore – you can solve the task with just a robot and an IDS NXT camera.
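For readers who want to picture what the 2D-to-3D conversion Alexey mentions can involve: when all parts lie on a flat conveyor belt, the mapping from pixel coordinates to robot coordinates can be expressed as a planar homography estimated from a few calibration points. The following Python/OpenCV sketch is purely illustrative and an assumption on our part – urobots' actual implementation is not shown in the episode, and their ported NXT app is written in C++.

```python
# Minimal sketch: map the 2D pixel position of a detected object to robot
# coordinates on the conveyor plane via a planar homography.
# Illustration only, not urobots' actual code.
import numpy as np
import cv2

# Calibration: four (or more) reference points seen by the camera (pixels)
# and the same points measured in the robot's base frame (millimetres).
pixel_pts = np.array([[102, 88], [1180, 95], [1175, 860], [110, 855]], dtype=np.float32)
robot_pts = np.array([[0, 0], [400, 0], [400, 300], [0, 300]], dtype=np.float32)

# Estimate the homography that maps pixel coordinates onto the belt plane.
H, _ = cv2.findHomography(pixel_pts, robot_pts)

def pixel_to_robot(u, v, belt_height_mm=0.0):
    """Convert a detected pixel position (u, v) into a 3D robot target.

    The z coordinate is simply the known height of the belt surface,
    since all parts lie on the same plane.
    """
    p = H @ np.array([u, v, 1.0])
    x, y = p[0] / p[2], p[1] / p[2]
    return x, y, belt_height_mm

# Example: a detection reported at pixel (640, 480).
x, y, z = pixel_to_robot(640, 480)
print(f"Robot pick target: x={x:.1f} mm, y={y:.1f} mm, z={z:.1f} mm")
```

In practice the calibration points would come from a target placed on the belt, and the detected object's rotation angle would additionally be passed through to the robot's tool orientation.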
Before I go into the data, a quick question for you, Patrick: did I understand that correctly? That means I have a conveyor belt somewhere with different objects, and the robot performs its work there. Now your industrial camera comes into play, and the AI algorithm that makes this image processing possible in the first place runs on your camera, so to speak. Did I understand that correctly?
Patrick
That's right, the IDS NXT has an integrated AI accelerator. This allows us, as Alexey just said, to do directly on the camera, on the edge device, what previously had to be done on a PC, on a GPU. Perhaps one addition: there are two ways. We were just talking about solution providers and the demand for solutions. First, a customer can train networks directly with IDS NXT ocean without having any AI knowledge at all. He can simply train them with us in the cloud and then upload the networks to the camera. Second, IDS NXT is an open platform, and that is what urobots is implementing here: Alexey can come with his own networks, with his own AI, and use our accelerator running on the IDS NXT camera.
Alexey, I would be interested to know: your customers may already be using different systems, and they need certain key figures to be able to evaluate this data at all. Which data and key figures are interesting for your customers in such a process?
Alexey
To start with, a bit about performance metrics and capabilities. Our software captures the positions of all objects in the camera image in about 200 milliseconds – that's five acquisitions per second. The detection accuracy is ±2 degrees and ±2 pixels. Objects of different types can be located simultaneously, and the software provides the position and type of all detected objects. Flexible objects or objects that differ slightly from each other are also supported. This means the objects do not have to look exactly the same – a deviation in shape is allowed and can be trained by the AI. As for the hardware, you need an IDS NXT camera and a robot for production. We support Universal Robots robots with an extension we developed that can be installed on the robot. This extension handles the communication with the camera software and enables the programming of the robot movements by so-called teaching. That means it is sufficient to move the robot once to the desired position relative to an object recognized by the camera software, so that it later assumes exactly the same position relative to other objects located elsewhere. The camera software can also support other robot platforms or devices, such as PLCs. We can implement virtually any Ethernet-based protocol, because IDS NXT is basically a camera with a computer – or a computer with a camera. For us, the development opportunities here are virtually limitless.
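As an illustration of what "virtually any Ethernet-based protocol" can mean in practice: a detection result could, for example, be pushed to a robot controller or PLC as a small JSON message over a plain TCP socket. The field names, IP address and port below are hypothetical – the episode does not describe urobots' actual message format.

```python
# Hypothetical sketch of pushing detection results to a robot controller
# over a plain TCP socket as JSON. All field names and the endpoint are
# illustrative assumptions, not the actual urobots/IDS NXT protocol.
import json
import socket

detections = [
    {"type": "object_1", "x_px": 640, "y_px": 480, "angle_deg": 35.0},
    {"type": "object_2", "x_px": 212, "y_px": 733, "angle_deg": -12.5},
]

message = json.dumps({"frame_id": 1024, "objects": detections}) + "\n"

# Placeholder address and port for the controller's listener.
with socket.create_connection(("192.168.0.50", 30002), timeout=1.0) as sock:
    sock.sendall(message.encode("utf-8"))
```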
Patrick, maybe one more question for you. We are now talking about different data, as Alexey has just elaborated. So how does this data get from your camera into the NXT platform? How does this data get “from the bottom up” to the cloud? How does this data processing work? I would like to understand that a little bit.
Patrick
In principle, the process starts in the same classic way as with any industrial camera: we first take pictures of what we want to evaluate later. These can be workpieces or similar, as in Alexey's case. Then a subject matter expert for these products labels the images – he says this is a good part and this is a bad part – and uploads them to the cloud, to our IDS NXT lighthouse tool. This is a cloud-based training tool for artificial intelligence. The images are uploaded, and basically that's it: you add the labels, start the training, and out comes a fully trained AI. That means our user doesn't need any expertise in artificial intelligence; it is all mapped in the cloud. He only needs to bring expertise in his own field. He uploads the images, trains the network, gets the network back, uploads it to the IDS NXT, and then he can use the same camera he used to take the pictures to do the evaluation and send results directly to his plant. That is one case – the IDS NXT ocean package. The other case is that the tasks are so complex that you need someone else as support to implement the image processing, perhaps an AI that we do not support by default. That's when solution providers like urobots come into play; IDS NXT is an open platform, so urobots can program its own app on it. But in the end, the data takes the same path: there are images, they are labeled according to their properties, an AI is trained, the AI is uploaded, and then it performs the image processing.
Now imagine I'm a production manager in charge of this line where these objects are removed manually, and now it's done with a robot and your industrial camera. Can you give an example of this labeling? I'm trying to picture the process: I see that an object is either not the right one or is defective. Then the robot would go and maybe straighten up that workpiece or sort it out directly. But for that I need the competence to recognize that, and that's what I label the data for, right?
Patrick
Exactly, I would take the final inspection of sealing rings as an example. Sealing rings are produced, and we are at the final inspection. Today, someone stands there, looks at these rings and says: ah, that's good, it fits – or oh, there's a burr on it, maybe it's not one hundred percent round or it has some other flaw. The person doing the inspection today can decide within milliseconds whether a part is good or bad. We humans can simply make such decisions very quickly, whereas a rule-based algorithm struggles with this. So I take pictures of the good and bad parts that the person has pre-labeled, for example by putting them in the good tray or the bad tray. That is, the expertise of the person at the final inspection is poured into the AI. That is the input for the artificial intelligence, for the AI training.
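In the IDS workflow this training happens without any code in the cloud-based IDS NXT lighthouse tool. Purely to make the idea concrete, here is a rough sketch of what the same good/bad training looks like in a generic framework (PyTorch is our assumption here; it is not the tool IDS or urobots use), starting from two folders of pre-labeled images:

```python
# Illustrative only: the IDS NXT lighthouse cloud tool does this without code.
# A generic good/bad classifier trained from two folders of labeled images.
import torch
from torch import nn
from torchvision import datasets, models, transforms

# Expected folder layout: seal_rings/good/*.png and seal_rings/bad/*.png
# (the folder names become the class labels).
tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
data = datasets.ImageFolder("seal_rings", transform=tfm)
loader = torch.utils.data.DataLoader(data, batch_size=8, shuffle=True)

# Small pretrained backbone with a new 2-class head (good / bad).
model = models.mobilenet_v3_small(weights="DEFAULT")
model.classifier[-1] = nn.Linear(model.classifier[-1].in_features, 2)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(5):                 # a few epochs suffice for a tiny dataset
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), "seal_ring_classifier.pt")
```

The point of the cloud tool is precisely that the product expert never has to write or understand code like this – the labeled images are the only input.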
Alexey
Madeleine, if we're talking about our use case, you're absolutely right. We have developed a special tool for the labeling process, and there are simply different types of labels. That means, for example, if you have two different objects in the image, then you have two different labels – object 1 and object 2. And if object 1 can also lie another way, on its side, then we add a third label and say this is object 1 on its side. With that, the artificial intelligence is trained, and afterwards it is able to recognize these two types of objects plus the one object on its side. The robot programmers can then create the program accordingly. Training works the same way with us: you only need camera images with objects. The customer takes them with his camera, as Patrick said, and then labels them with our tool. 10 to 20 images are actually enough to train the AI models, and the training of the AI model and its preparation for use on the IDS NXT camera are done by us – our model type differs from the one in lighthouse. But if the customer has his own PC with a GPU, he can do it himself. As with lighthouse, no expert knowledge is required for this. The customer only has to prepare the images, and the rest is done automatically by our software.
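To illustrate what such labels might contain, here is a hypothetical label structure for one training image with two object types, one of which also appears lying on its side. The format is invented for illustration; the actual output of the urobots labeling tool is not described in the episode.

```python
# Hypothetical label structure for a single training image; the real
# urobots labeling tool and its file format are not shown in the episode.
labels_for_image = {
    "image": "frame_0001.png",
    "objects": [
        {"label": "object_1",         "x_px": 310, "y_px": 220, "angle_deg": 15.0},
        {"label": "object_2",         "x_px": 780, "y_px": 510, "angle_deg": -90.0},
        {"label": "object_1_on_side", "x_px": 540, "y_px": 660, "angle_deg": 42.0},
    ],
}
```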
Thank you for the elaboration. Patrick, maybe back to you with the question about this competence structure: you both work in image processing. Who contributes which competencies here, in detail?
Patrick
So, first of all, the customer always contributes his expertise on the product he manufactures. That's where we as IDS come into play: our competence is the performant execution of neural networks (AI) on an embedded device. This is the competence that Alexey can reuse to build customized solutions. But it doesn't stop there – beyond that we cover the complete AI chain. In addition to the AI accelerator, we have also implemented cloud-based neural network training, so we take that competence off the customer's hands as well. There are the two paths: either I need a special AI, in which case urobots or other partners step in, or I want to train it myself. Then, when it comes to classification or object detection, I can start right away with IDS NXT lighthouse and train as a customer myself. At IDS, we try to cover the complete AI competence – from execution to training. These are our main competencies in this environment. I would perhaps add one more point: as IDS, we also bring all the rest of our industrial camera know-how, because it doesn't stop with the piece of hardware. I need matching lenses, I need cables, I need to know how to mount it, and so on. Of course, all of that still needs to be taken care of.
With projects like this, it's also always interesting to know what the business case is. I mean, at the end of the day, you want to cut costs, maybe even generate new revenue, or simply be better – perhaps also in the go-to-market – by optimizing processes. That's why I always ask: what's the business case behind it? How do I earn money with it as a customer? Alexey, the question for you: you've been involved in this image processing topic for quite some time now – what does the business case for a customer look like?
Alexey
So in our field, in vision, AI is superior to humans. AI is fast, works almost flawlessly and takes no breaks. It is therefore simply more efficient. We have already shown in many areas, such as quality control, that AI can reach an accuracy of 99.9 percent. This enables our customers to reduce complaint costs, for example. That's the whole idea. And in tandem with robotics, artificial intelligence can cut costs by taking over monotonous work. For example, instead of three workers per day, the customer needs only one robot, and the humans can then be deployed in many other areas in which they are superior to robots. The use of the IDS NXT camera is also a clear advantage, because in some cases, such as our pick and place case, the NXT completely replaces the PC, which reduces acquisition and maintenance costs.
Patrick, maybe the question to you again: The business case for you is that with the new camera that you’re releasing now, the PC is being replaced, isn’t it?
Patrick
So one thing is clear: in most cases I can do without a PC, because the camera delivers results and not just images that then have to be evaluated on a PC. On the other hand, our users can do without their own image processing department, because if an application can be solved with AI, we offer everything. With our training offering, they don't have to invest in their own hardware and they don't have to train experts on site for the topic – they can use our know-how at that point. So one aspect is saving the PC. Quite often, however, it is also important to make preliminary decisions on site so that subsequent decisions in the plant can be made faster. A classic example would be the OK / not-OK decision: the only feedback you need is whether the part is okay or not – do I feed it to another expensive step, or do I just throw it out? Until now, a classic industrial camera had to transmit a lot of image data over long cables, and all of that costs time. If I make the decision on the edge instead, I can usually assume that the cycle rate of my machine can increase, because I get to decisions much faster, much sooner. That is a typical business case for a user.
I think what you just said about resources is also an important point, because the expertise for these topics is usually not in-house – especially among SMEs. Here, of course, it is extremely valuable to have partners who bring these competencies along and deliver them holistically, also in the constellation you two are working in now. We have listeners from very different backgrounds, and of course there is not only this production use case but also others where a wide variety of image processing is used. Patrick, how transferable do you think this use case is? What other kinds of customers do you have? Maybe you can give us some insight into how this use case can be applied elsewhere.
Patrick
If we return to pick and place, it is widely transferable to the most diverse applications. In addition to the workpiece that Alexey has trained, I can also train other workpieces – the subject of AI is generally transferable. I can also classify other objects; for that, I need image data and I have to train. But there are many, many other ways to use AI and perhaps make existing solutions better and more efficient. Alexey said it: 99.9 percent in quality control. I can use my employees much more effectively, perhaps in much higher-value jobs, by replacing monotonous work with robots. Well, I'm someone who always likes to be inspired, and we have a new area on our website, an AI marketplace called visionbay, where we offer a wide variety of solutions. You can just go there and browse. You will find pick and place applications, but also logistics applications and recycling applications. We have a solution provider, for example, that detects foreign matter in compost and garbage – it can see when there's something in there that doesn't belong. Of course, this can be transferred to many use cases: wherever foreign materials need to be identified so they can be sorted out later. And if you are looking for a solution right now, I'm sure you can find one in visionbay.
But conversely, that also means that if I'm a software developer dealing with image processing, I have the opportunity to put my AI algorithm on your marketplace and, in turn, help other customers with their processes, right? Or for whom is this AI marketplace primarily interesting?
Patrick
As you said: for both sides. On the one hand, for the user to find a solution for themselves, but also for the developer to say, "Hey, I built a cool solution here with AI. It runs on an IDS NXT camera. I'll put it in the IDS marketplace." Then simply get in touch with us and we'll take care of everything else.
I think there are a lot of exciting applications, perhaps from manufacturing companies looking into such topics, or from mechanical engineers for whom this is very interesting. Maybe a last question for you, Alexey. Talking about these different applications and use cases at customers: do you have any additions – an exciting case you're working on with a customer, further ideas on how these use cases can be transferred, or anything else you're doing with customers?
Alexey
As Patrick said, the possibilities have no limits. What we basically do is detect the positions and orientations of objects. Those can be tools, but also boats in a harbor, for example, or the pepperoni on a pizza. You can detect all kinds of things with AI and do all kinds of work with the results afterwards.
Very nice, then you can just click into the AI marketplace and see what exciting solutions you have. Then I would thank you at this point for your time. Thanks for the exciting insights into these individual processes.
Patrick
I would like to add a brief remark. What I consider important in the AI environment is to really think out of the box – even in classical image processing. If you look at a problem from the perspective of artificial intelligence, you come up with completely new solutions. So we first had to learn to really look at things in a new way and also to put known knowledge aside for a while and to take AI as a basis for a solution. This opens up completely new avenues, and things that you didn’t think were possible before can be solved all at once.
Perfect. That was actually a nice closing word and also a nice appeal to one or the other out there who is working on these projects right now. Thank you for the session!