
Smart Picking with AI and 3D Vision

Today, industrial robots are used in a wide variety of ways to automate different tasks. In combination with 3D camera technology, they are already able to react and grasp adaptively instead of just executing pre-programmed sequences. But certain activities and situations, such as picking tasks, still require a high level of intelligence. For example, if the task is to move objects of different size, shape, material or even quality from unsorted boxes onto shelves, the objects must not only be gripped but identified and analysed beforehand. These challenges are difficult or impossible to solve with rule-based image processing systems. With AI-based inference, however, productive solutions for this task are already available on the market. In our talk "Smart Picking with AI and 3D Vision" we explain how the combination of two IDS products, intelligent IDS NXT cameras and powerful Ensenso 3D cameras, multiplies the capabilities of industrial robots.

Product website: https://en.ids-imaging.com/ids-nxt-ocean.html
Explanatory video about IDS NXT ocean: https://www.youtube.com/watch?v=Z9BYIbzc6rM
Pick and place use case with IDS NXT: https://youtu.be/amnmnJ2fPMc

IDS Imaging Development Systems GmbH


Smart picking with artificial intelligence and 3D vision. Today we want to look at a slightly different way to approach a bin picking application, a little away from what is usually done in industrial bin picking. First of all, don't be afraid: I won't introduce anything brand new here, but two major products come into play. On the one hand artificial intelligence on an edge device, which in this case is our IDS NXT camera, and on the other a 3D stereo vision system, which, as Miriam told us before, is the Ensenso camera. Today we want to combine these two technologies, these two products, and see what can happen. To give you a short idea of what we will do today: first I would like to motivate and show you an application, which we then solve and realize with this product combination. And last but not least we come to a conclusion and to some practical examples where the things I will show you are in use. So let's start.
Let's imagine you are a shop owner and you have to refill the shelves in your warehouse. If it is a small one, no problem: you go there, you fill them, and you're fine. But imagine you have a really big warehouse and you have to refill those shelves every evening, or even while people are inside. Wouldn't it be good to automate this process, to have a robot which drives through the aisles and puts the items back into the shelves? A robot alone would not be enough, because you normally get those items in unsorted boxes, and that makes such a process hard to automate: you have different items with different sizes, materials, shapes and different behavior when you pick them. So this is not an easy task, and it is a little different from typical bin picking applications. For this kind of application you first have to know what you are picking, and only then pick it. In this talk we will go deeper into this topic and see how we can realize that.
The first step, as I said before, is to know what is in the box. In a typical bin picking application you normally know the items in your box: you have a single type of item, maybe two, so there is no need to identify the object before you pick, because you already know what you are picking. When it comes to warehouse or food logistics applications, we have to handle goods of different sizes, different materials and even different behavior. Think about a shirt, or, what we all carry with us these days, a pack of masks: when you pick them they change shape, because of the gripper or simply due to their own weight, and they sag down. There is no stable geometry like we would have when picking metal or plastic parts. So your robot has to know what it is picking, because it may have to move differently, it has to be more flexible, and it has to take care not to hit other things, for example. So the first thing is: we have to know what we pick. These are masks, these, as we see, are noodles, and this is a bottle, which behaves totally differently from a bag of crisps. Your robot may even need to change its gripper, because picking a bottle may work totally differently from picking a bag of noodles. Due to this high variety of objects, and we haven't even talked about the environment and the lighting conditions, rule-based machine vision is not the first choice for this kind of application. Artificial intelligence is a much more appropriate solution. The same object may have different sizes; a banana does not always look the same, but you still have to restock the bananas. So here we go with artificial intelligence. With AI we do not have to define the surroundings with explicit rules. We can teach and train the system to recognize what is inside the basket, and not only to classify it but also to locate and detect where each item is, so that your robot's next step becomes much easier.
And to make this happen, and easy for you, IDS developed the IDS NXT system, and with IDS NXT ocean we deliver an all-in-one inference solution. That includes cameras which are able to run neural networks directly on the edge, so you really can just connect your camera and run an application on it. But a running neural network is only one part of the story; the other part is how to train it and how to label the data. For this we offer a cloud-based training service with which you can train the networks that later run on the IDS NXT cameras. But step by step; we start with the hardware. I would like to give you a short introduction to the camera families we have at the moment. There are two major camera families, the IDS NXT rio and the IDS NXT rome, and for those who know our IDS uEye families they may look a little familiar, because the IDS NXT rio goes more in the direction of the uEye SE camera, while the IDS NXT rome follows the approach of the uEye FA factory automation family. So from the hardware point of view you can expect similarity to those camera families, up to the point that the IDS NXT rome supports IP65/67 protection. But for sure that's not all. We included a web server on these cameras, which allows us to communicate with the camera. We enabled the development of vision apps, so you as a user can develop your own vision app for the camera. We implemented a REST interface, which comes into play when your camera has to communicate, for example, with robots or with your automation system. Similarly, with OPC UA we have an industrial Ethernet protocol which allows the camera to communicate directly with the machines behind it. So there is often no need for an additional PC anymore.
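To make the interface idea concrete, here is a minimal sketch of how an automation system might poll such a camera over REST. The camera address, the URL path and the JSON fields are illustrative assumptions, not the documented IDS NXT API.

```python
import requests

CAMERA = "http://192.168.1.10"  # hypothetical camera address

def fetch_detections():
    """Poll the camera's REST interface for the latest inference results.
    Endpoint path and response layout are assumptions for illustration."""
    resp = requests.get(f"{CAMERA}/vision-app/results", timeout=2.0)
    resp.raise_for_status()
    return resp.json()  # e.g. [{"class": "bottle", "confidence": 0.97, ...}]

if __name__ == "__main__":
    for det in fetch_detections():
        print(det["class"], det["confidence"])
```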
But the real highlight of these cameras is our so-called deep ocean core. Maybe you heard the session last week from Professor Dr. Rastislav Struharik; he is one of the major inventors of the deep ocean core. If you missed it, it doesn't matter: have a look in the media library of the IDS NXT vision channel, where the talk is available, and he explains in detail, yet understandably even for me from the marketing team, how this deep ocean core works. The sessions can always be watched on demand. In short, we have a hardware accelerator on an FPGA. We take a slightly different approach than other camera manufacturers: we run our CNNs on an FPGA, where we can parallelize CNN operations, we have efficient memory management, and everything is embedded, with on-camera optimization of the system. This is what makes the IDS NXT camera family so well suited to the kind of applications we are showing now. As I said, have a closer look at that session if you want to hear more about the deep ocean core. So that was the hardware part: IDS NXT rio and rome with the hardware accelerator included.
Let's have a short look at the software, because that's not all: with IDS NXT ocean 2.0 we also deliver software capabilities. On the one hand we have classification and multi-classification. We have prepared apps for these tasks, so you don't have to develop them on your own: you just train your networks, upload them to the IDS NXT camera, and that's it; classification and multi-classification will do the job. What is the difference between the two? In classification we have the complete image, or a single ROI, in which we classify. In multi-classification we have several areas of interest, as many as your application needs, and we classify inside each of these fixed-position areas of interest. If you want to dig deeper into this topic, next week we will also have sessions here where we go further into how to use the classification and the object detection of IDS NXT.
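The difference between the two app types can be illustrated in a few lines: classification runs one network pass over the whole frame (or a single ROI), while multi-ROI classification crops several fixed, pre-defined regions and classifies each one. A minimal sketch; the `classify` stub merely stands in for the trained CNN running on the camera.

```python
from PIL import Image

def classify(img: Image.Image) -> str:
    """Placeholder for the trained CNN; here just a dummy brightness rule."""
    gray = img.convert("L")
    mean = sum(gray.getdata()) / (gray.width * gray.height)
    return "bright" if mean > 128 else "dark"

def classify_image(frame: Image.Image) -> str:
    # Classification: one result for the whole frame (or one ROI).
    return classify(frame)

def classify_multi_roi(frame, rois):
    # Multi-ROI classification: fixed boxes, one result per box.
    return {roi: classify(frame.crop(roi)) for roi in rois}

frame = Image.new("RGB", (640, 480), "white")
print(classify_image(frame))
print(classify_multi_roi(frame, [(0, 0, 100, 100), (200, 200, 320, 320)]))
```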
Object detection is the second part, the second neural network application we have directly on the camera. Here you can detect objects at unknown positions; that is the difference from multi-ROI classification, where you have to define the positions beforehand. With object detection you detect the objects at varying positions, even if they move slightly; they can be on a conveyor belt, and you will always get the surrounding box and the class of the object. With this information you can, for example, sort these objects or count them. And if you are not looking into a bin or basket that extends into the depth, if you are at a conveyor belt for example, with this information you could just start picking, because you know the coordinates of the object, and thanks to the conveyor belt you know your height.
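Why is the known belt height enough? With a calibrated camera looking straight down, the distance from camera to object top is fixed, so a pinhole back-projection turns the bounding-box centre into metric pick coordinates. A small sketch with made-up calibration values:

```python
def pixel_to_pick_point(u, v, z, fx, fy, cx, cy):
    """Back-project pixel (u, v) at known depth z (metres) to camera
    coordinates with the pinhole model: X = (u - cx) * z / fx, etc."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z

# Illustrative numbers: camera 1.2 m above the belt, 10 cm tall object.
fx = fy = 1400.0        # focal length in pixels (assumed calibration)
cx, cy = 640.0, 480.0   # principal point (assumed)
z = 1.2 - 0.10          # camera-to-object distance from known heights
u, v = 812.0, 355.0     # bounding-box centre from object detection
print(pixel_to_pick_point(u, v, z, fx, fy, cx, cy))
```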
But here, as I said before, we have a somewhat different application. We have bins and baskets, where it is not enough to know the location via a surrounding box and to know what is inside the box; we really have to know the 3D coordinates as well to be able to pick. Object detection is still what we need in this case, because we do not know up front where the items are inside the basket; we have to identify the location and what kind of object it is. All these features are of course also supported by IDS NXT lighthouse, the training service I mentioned just before, our web-based AI training platform. All the specialist knowledge is built in, so as a user of the system you do not need to know how CNNs really work. You label your images, upload them to the IDS NXT lighthouse training platform, get your neural network back, upload it to your camera, and let it run. For sure there is a bit more to it: you get values telling you how good your network is, and you can check this online. But in general this is the process: label your images, train the network, then load it onto the camera and let the AI run.
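As a rough idea of the first step of that loop, here is a sketch that collects labelled images from a folder-per-class layout into a manifest before upload. Both the layout and the manifest format are common conventions assumed for illustration, not the actual lighthouse upload format.

```python
import json
from pathlib import Path

def build_manifest(dataset_root: str) -> list:
    """Collect (image, label) pairs from a folder-per-class layout:
    dataset_root/bottle/*.png, dataset_root/noodles/*.png, ..."""
    manifest = []
    for class_dir in sorted(Path(dataset_root).iterdir()):
        if not class_dir.is_dir():
            continue
        for img in sorted(class_dir.glob("*.png")):
            manifest.append({"file": str(img), "label": class_dir.name})
    return manifest

if __name__ == "__main__":
    entries = build_manifest("dataset")
    Path("manifest.json").write_text(json.dumps(entries, indent=2))
    print(f"{len(entries)} labelled images ready for upload")
```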
So, briefly back to our application: with all these methods we now have an edge device with a trained network, and we can detect the objects in our basket. But that alone is not enough to grip them, because the objects lie at some depth, so we have to localize them in 3D in the basket; it is not enough to know only a surrounding box. Therefore, as mentioned before, we use our Ensenso 3D active stereo vision camera. For those who are not familiar with this technology, I will give a very short idea of it; here, too, there is a session in the vision channel where Dr. Martin Hennemann, our product manager, describes the technology of 3D systems, including 3D active stereo vision, in depth. You can also find it in the media library. In the Ensenso system we have two cameras, that is the stereo vision part, and a projector which projects a pattern onto the area you want to capture. With this projected pattern we get a really good 3D image, really good 3D coordinates, even when there is low texture in the image, which would not be feasible with passive stereo vision, for example. But as I said, have a look in the media library; the session from Martin Hennemann goes much deeper into this topic. The Ensenso cameras can acquire images in one shot, and they can also be placed inline over conveyor belts.
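The underlying stereo principle is short enough to write down: after rectification, a scene point appears shifted by a disparity of d pixels between the left and right image, and depth follows from the focal length f and the camera baseline b as Z = f * b / d. The projected pattern only serves to create texture so that d can be found reliably even on plain surfaces. A tiny sketch with assumed numbers:

```python
def depth_from_disparity(d_px, f_px, baseline_m):
    """Z = f * b / d for a rectified stereo pair; larger disparity = closer."""
    if d_px <= 0:
        raise ValueError("disparity must be positive for a visible point")
    return f_px * baseline_m / d_px

# Assumed example values: 1200 px focal length, 10 cm baseline.
for d in (40.0, 80.0, 160.0):
    z = depth_from_disparity(d, 1200.0, 0.10)
    print(f"disparity {d:5.1f} px -> depth {z:.2f} m")
```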
There are two major series of the Ensenso camera. On the one hand we have the Ensenso N series, our small, compact and lightweight cameras, which can be mounted on a robot. Think about our application: we would like to build a robot which drives through the store and picks items and fills the shelves, so it is necessary to integrate a lightweight camera system. In this case that is the Ensenso N; this camera can even be placed directly on the head of the robot. So that is the N series. On the other hand we have the Ensenso X series, a much more modular system with a higher resolution. We can go to working distances of up to 5 meters, and with the XR models we even have onboard processing directly on the 3D camera. That is quite useful if you build a multi-camera setup, for example, because it reduces the workload of your PC. Taking the Ensenso 3D images and the object detection from the IDS NXT cameras, we are now able to clearly locate the items in the basket: we know the 3D coordinates and we know what kind of item it is, so we know how we have to handle this item when we pick it.
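A minimal sketch of that combination, assuming the 3D camera delivers a point map registered to the 2D image (one XYZ triple per pixel, NaN where matching failed): the 2D bounding box from the detector selects a patch of the point map, and a robust statistic over it yields a grasp point.

```python
import numpy as np

def grasp_point(point_map: np.ndarray, bbox) -> np.ndarray:
    """point_map: H x W x 3 array of XYZ in metres, NaN where invalid.
    bbox: (x0, y0, x1, y1) from the object detector, in pixels.
    Returns the median 3D point inside the box as a robust grasp target."""
    x0, y0, x1, y1 = bbox
    patch = point_map[y0:y1, x0:x1].reshape(-1, 3)
    valid = patch[~np.isnan(patch).any(axis=1)]
    if valid.size == 0:
        raise ValueError("no valid 3D data inside the bounding box")
    return np.median(valid, axis=0)

# Toy data: a flat surface 0.8 m away with missing pixels elsewhere.
pm = np.full((480, 640, 3), np.nan)
pm[100:200, 100:200] = (0.05, -0.02, 0.80)
print(grasp_point(pm, (120, 120, 180, 180)))  # -> [ 0.05 -0.02  0.8 ]
```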
So do we have all the necessary parts together to solve this application? One part is missing: we need a robot to pick. And here, for example, our Ensenso cameras are Techman Robot plug-and-play certified, so you can directly connect the camera to the robot's system and software and use the 3D data directly. Our IDS NXT cameras support OPC UA, for example, which also allows you to connect the camera directly to the robot controller. Or, depending on the controller, we see more and more applications where REST or XML-RPC is used directly out of the camera to talk to the robot, so that no additional PC is needed.
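For that last link in the chain, Python's standard library is enough to sketch the XML-RPC variant: the vision side calls a method exposed by the robot controller. The server URL and the method names are invented for illustration; real controllers define their own RPC interfaces.

```python
import xmlrpc.client

# Hypothetical robot controller exposing an XML-RPC endpoint.
robot = xmlrpc.client.ServerProxy("http://192.168.1.20:8080/RPC2")

def pick_at(x, y, z, label):
    """Send the grasp pose from the vision system to the robot.
    'move_to' and 'close_gripper' are assumed method names."""
    robot.move_to(x, y, z)
    robot.close_gripper(label)  # gripper strategy may depend on the class

pick_at(0.05, -0.02, 0.80, "bottle")
```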
So now we have everything together: we have our AI cameras, we have a 3D camera, and we can connect them to a robot. We can solve the application. Until now you have only seen, let's say, images and examples, so here is something from real life. We still rarely see these applications in real warehouses, because many of them are still under development, but a setup may look like this: you have a robot, there is a box inside, and if you take a closer look at the system there is a 2D camera, an IDS NXT system, and an Ensenso, which then picks the items either out of the shelf and puts them into the box, or the other way around, out of the box and into your shelf. So I hope that in the last minutes I could show you that when we combine two different technologies, in our case AI with the IDS NXT and 3D vision with the Ensenso, we can arrive at completely new solutions. And with this combination you are perfectly prepared for today's and tomorrow's picking tasks.
