Today, industrial robots are used in a wide variety of ways to automate different tasks. In combination with 3D camera technology, they are already able to react and grasp adaptively instead of just executing pre-programmed sequences. But for certain activities or situations, such as picking tasks, a high level of intelligence is still required. For example, if the task is to put objects of different size, shape, material or even quality from unsorted boxes into shelves, objects must not only be gripped, but identified and analysed beforehand. These challenges are difficult or impossible to solve with rule-based image processing systems. But with AI-based inference, productive solutions are already available on the market even for this task.
In our talk "Smart Picking with AI and 3D Vision" we will explain how the combination of two IDS products – intelligent IDS NXT cameras and powerful 3D Ensenso cameras – multiplies the capabilities of industrial robots.
Product website: https://en.ids-imaging.com/ids-nxt-ocean.html
Explanatory video about IDS NXT ocean: https://www.youtube.com/watch?v=Z9BYIbzc6rM
Pick and place use case with IDS NXT: https://youtu.be/amnmnJ2fPMc
Smart picking with artificial intelligence and 3D vision. Today we want to take a look at a slightly different approach to a bin picking application, a little bit away from what's usually done in industrial bin picking.
First of all, don't be afraid, I don't want to introduce too many new ideas here; in this case two major products come into play. On the one hand, artificial intelligence on an edge device, which here is our IDS NXT camera, and on the other hand a 3D stereo vision system, which, as Miriam told us before, is the Ensenso camera. Today we want to combine those two technologies, those two products, and see what can happen.

To give you a short idea of what we will do today: first I would like to motivate and show you an application, which we then solve and realize with this product combination. And last but not least we come to a conclusion and to some practical examples where the things I will show you are already in use.

So let's start. Let's imagine you're a shop owner and you have to refill the shelves in your warehouse. If it's a small one, no problem: you go there, you fill them, and you're fine. But imagine you have really big warehouses and you have to refill those shelves every evening, or even while people are inside. So wouldn't it be good to automate this process, to go ahead and say: okay, I have a robot which moves through my aisles and puts the items into the shelves?
A robot alone would not be enough, because you normally get those items in unsorted boxes like these, and so it is hard to automate such a process: you have different items, different sizes, different materials, different shapes, and different behavior when you pick them. So that's not an easy task, and it's a little bit different from typical bin picking applications. For such applications you first have to know what you pick, and then you pick it. So in this talk we go deeper into this topic and see how we can realize it.
So the first step, as I said before, is to know what is in the box. If you think about a typical bin picking application, normally you know the items in your box: you have items of a single type, at most two of them, so there is no need to identify the item before you pick it, because you already know the object. When it comes to warehouse or food logistics applications, we have to handle natural goods with different sizes, different materials and even different behavior.
As I said before, think about a shirt, or, something we all have with us today, a pack of masks: you pick them up and they change their shape, due to the gripper or due to their own weight, and they hang down. There are no stable conditions like we would have when picking metal or plastic parts. So your robot has to know what it is picking, because it has to move in different ways, it has to be more flexible, and it has to take care not to hit other things, for example; that may be an issue. So the first thing is: we have to know what we pick. There are masks, as we see, there are noodles, and this is a bottle, which behaves totally differently from a bag of crisps. Your robot may even need to change the gripper, because picking a bottle may work totally differently from picking a bag of noodles.
So due to this high variety of objects, and we haven't even talked about the environment and the lighting conditions, rule-based machine vision is not the first choice for such applications. For such applications, artificial intelligence is a much more appropriate solution. The same object may have different sizes: a banana doesn't always look the same, but you still have to restock the bananas. So here we are going with artificial intelligence. With artificial intelligence we do not have to define the surroundings with explicit, rule-based criteria; we can train the system to recognize what is inside the basket, and not only to classify, but also to detect and locate where each item is, so that in the next step your robot has a much easier task.

And to make this happen, and easy for you, IDS developed the IDS NXT system, and with IDS NXT ocean we deliver an all-in-one inference solution, including cameras which are able to run neural networks directly on the edge, so you can really connect your camera and let an application run there. But the running neural network is only one part of the story. The other part is how to train it and how to label the data. For this, too, we offer a cloud-based training service with which you can train the networks that can later run on the IDS NXT cameras. But step by step: we start with the hardware.

I would like to give you a short introduction to the camera families we have at the moment. We have two major camera families, the IDS NXT rio and the IDS NXT rome, and to those people who know our IDS uEye families they may look a little familiar, because the IDS NXT rio goes more in the direction of the IDS uEye SE camera, and the IDS NXT rome follows the approach of the uEye FA factory automation family. So from the hardware point of view you can expect similarities to those camera families, up to the point that the IDS NXT rome supports IP65/67 protection. But for sure that's not all.
We included a web server on this camera, which allows us to communicate with the camera. We enabled the development of vision apps, so you as a user can develop your own vision app for the camera. We implemented a REST interface, which comes into play when your camera has to communicate, for example, with your robots or with your automation system. Similarly with OPC UA: here we have an industrial Ethernet protocol which allows a camera to communicate directly with the machines behind it. So often there is no longer any need for an additional PC.

But the real highlight of these cameras is our so-called deep ocean core. Maybe you saw the session last week by Professor Dr. Rastislav Struharik; he is one of the major inventors of the deep ocean core. If you missed it, that doesn't matter: have a look in the media library of the IDS NXT vision channel, where this talk is available, and he explains in detail, yet understandably even for me from the marketing team, how the deep ocean core works. So join the session; you can always watch it on demand. Here we have a hardware accelerator for CNNs on an FPGA. We take a slightly different approach than other camera manufacturers: we run our CNNs on an FPGA. There we can parallelize CNN operations, we have efficient memory management, and it is embedded, with on-camera optimization for the system. This makes the IDS NXT camera family really well suited to the kind of applications we are showing now. As said, have a closer look at that session if you want to hear more about the deep ocean core.
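To make the REST idea from a moment ago concrete, here is a minimal sketch of how a robot controller might poll a camera for results. This is an illustration only: the endpoint path, the IP address, and the JSON layout are assumptions for this sketch, not the documented IDS NXT REST API.

```python
import json

# Hedged sketch: endpoint path and JSON layout below are assumptions,
# not the actual IDS NXT REST API.
CAMERA = "http://192.168.0.10"

def result_url(vision_app):
    # Build the URL a robot controller might poll for the latest result.
    return f"{CAMERA}/vapps/{vision_app}/result"

def parse_detections(payload):
    # Turn the (assumed) JSON reply into (class, bounding-box) pairs.
    data = json.loads(payload)
    return [(d["class"], tuple(d["box"])) for d in data["detections"]]

reply = '{"detections": [{"class": "bottle", "box": [120, 80, 60, 200]}]}'
print(result_url("detector"))
print(parse_detections(reply))  # → [('bottle', (120, 80, 60, 200))]
```

In a real setup the controller would issue an HTTP GET against such a URL; here the reply is a hard-coded string so the parsing step can be shown on its own.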
So this was the hardware part: IDS NXT rio and rome with the hardware accelerator included. Let's have a short look at the software, because that's not all we deliver: with IDS NXT ocean 2.0 there are also software options. On the one hand we have classification and multi-classification. We have prepared apps for these tasks, so you don't have to develop them on your own; you just train your networks, upload them to the IDS NXT camera, and that's it. Then classification and multi-classification will do their job. What is the difference between those? In classification we have the complete image, or only one ROI, in which we classify. In multi-classification we have several regions of interest, as many as you need for your application, and we classify inside these fixed-position regions of interest. If you want to dig deeper into this topic, next week we will also have sessions here where we go deeper into how to use the classification and the object detection of IDS NXT.
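The split between classification and multi-classification can be sketched in a few lines. Everything below is illustrative, not the real IDS NXT vision-app API: `classify` is a toy stand-in for the trained CNN, and the data layout is invented for the example.

```python
# Illustrative sketch: classification runs one classifier on the whole
# image (or a single ROI); multi-classification runs it on several
# fixed-position ROIs. classify() is a toy stand-in for the CNN.
def classify(pixels, roi):
    # Look up a label for the ROI's top-left corner (fake "inference").
    x, y, w, h = roi
    return pixels.get((x, y), "empty")

def multi_classify(pixels, rois):
    # One result per fixed-position region of interest.
    return {roi: classify(pixels, roi) for roi in rois}

scene = {(0, 0): "bottle"}                      # fake image content
fixed_rois = [(0, 0, 50, 50), (60, 0, 50, 50)]  # positions defined up front
print(multi_classify(scene, fixed_rois))
```

The key point the sketch captures is that in multi-classification the ROI positions are fixed in advance; the network only answers "what is inside each one".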
Because object detection is the second part, the second neural network application we have directly on the camera. Here you can detect objects at unknown positions. That's the difference from multi-ROI classification, where you have to define the positions. Here you detect the objects at varying positions, even if they move slightly. They can be on a conveyor belt, and you will always get the bounding box and the class of the object.
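The bounding box plus class just described is all a downstream step needs for sorting, counting, or picking from a conveyor of known height. A minimal sketch; the detection format here is an assumption for illustration:

```python
# Sketch under assumptions: each detection carries a class name and a
# bounding box (x, y, width, height) in pixels, as described in the talk.
def pick_point(detection, belt_z=0.0):
    # On a conveyor belt the height is known, so the 2D box centre plus
    # the fixed belt height already gives a usable pick coordinate.
    x, y, w, h = detection["box"]
    return (x + w / 2, y + h / 2, belt_z)

def count_by_class(detections):
    # Sorting/counting example: tally how many objects of each class.
    counts = {}
    for d in detections:
        counts[d["class"]] = counts.get(d["class"], 0) + 1
    return counts

dets = [{"class": "noodles", "box": [100, 50, 40, 60]},
        {"class": "noodles", "box": [10, 20, 40, 60]}]
print(pick_point(dets[0]))   # → (120.0, 80.0, 0.0)
print(count_by_class(dets))  # → {'noodles': 2}
```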
With this information you can then, for example, sort these objects or count them. And if you're not working with a bin or a basket that goes into the depth, if you're on a conveyor belt, for example, with this information you could just start picking, because you know the coordinates of the object, and thanks to the conveyor belt you know your height. But here we have a slightly different application, as mentioned before. We have bins or baskets where it's not enough to know the location from a bounding box and to know what is inside the box; we really also have to know the 3D coordinates to be able to pick. Still, object detection is what we need in this case, because we do not know up front where the items are inside this basket, or what exactly they are. They are inside, and we have to identify the location
and what kind of object each one is. All these features are of course also supported by IDS NXT lighthouse; that's what I called the training service just before, our web-based AI training platform. All the specialist knowledge is included there, so as a user of the system you do not need to know how CNNs really work. You label your images, you upload them to the IDS NXT lighthouse training platform, you get your neural network back, upload it to your camera, and let it run. For sure there's a bit more inside: you get metrics for how good your network is, and you can check this online. But in general this is the process: label your images, train the network, then load it onto the camera and let the AI run.

So, briefly back to our application: with all these methods we now have an edge device with a trained network, and we can detect the objects in our basket. But that's not enough to grip them, because they are located at a kind of 3D depth, so we have to localize them in 3D in the basket; it's not enough to know only a bounding box. And therefore, as mentioned before, we use our Ensenso 3D active stereo vision camera. For those who are not familiar with this technology, I will give you a pretty short idea of it. Here, too, we have a session in the vision channel, in which Dr. Martin Hennemann, our product manager, describes the technology of 3D systems, including 3D active stereo vision systems, in depth. You can also find it in the media library.
What happens with the Ensenso system: we have two cameras, that's the stereo vision part, and we have a projector which projects a pattern onto the area you want to capture. With this pattern projected we get a really good 3D image, really good 3D coordinates, even when there is little texture in the image. This wouldn't be feasible if you used passive stereo vision, for example.
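The projected pattern gives every pixel texture to match between the two cameras; once matching succeeds, depth follows from the standard stereo-triangulation relation z = f·b/d. A worked example, with made-up numbers:

```python
# Standard stereo relation: depth z = f * b / d, with focal length f in
# pixels, baseline b in metres and disparity d in pixels. The projector
# only helps find d reliably on low-texture surfaces; the geometry is
# unchanged. The numbers below are illustrative, not Ensenso specs.
def depth_from_disparity(f_px, baseline_m, disparity_px):
    return f_px * baseline_m / disparity_px

print(depth_from_disparity(1400.0, 0.10, 70.0))  # → 2.0 (metres)
```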
But as mentioned, have a look at the media library; there is a session from Martin Hennemann which goes much deeper into this topic. The Ensenso cameras can acquire images in one shot, and they can also be placed inline over conveyor belts. There are two major series of the Ensenso camera. On the one hand we have the Ensenso N series: small, compact and lightweight cameras, which can be mounted on a robot. Think about our application: we would like to build a robot which goes through our store, picks items and fills the shelves, so there it is necessary to integrate a lightweight camera, a lightweight system; in this case the Ensenso N. This camera can even be placed directly on the head of the robot. So this is the N series, and on the other hand we have the Ensenso X series. The X series is a much more modular system with a higher resolution; we can go up to working distances of 5 meters. With the XR variant we even have onboard processing directly on the 3D camera.
This is quite necessary if you build a multi-camera setup, for example, because you reduce the workload of your PC. So, taking the Ensenso 3D images and the object detection from the IDS NXT cameras, we are now able to clearly locate the items in the basket. We know the 3D coordinates, and we know what kind of item it is, so we know how we have to handle this item when we pick it. So we have all the necessary parts together to solve this application, right? Well, one part is missing: we need a robot to pick. And here, for example, the Ensenso camera is Techman Robot plug-and-play certified, so you can directly connect the camera to the robot, to the robot's system and software, and directly use the 3D data. And our IDS NXT cameras support OPC UA, for example, which also allows you to connect the camera directly to the robot controller. Or, depending on the controller, we also see more and more applications where REST or XML-RPC is used directly out of the camera to talk to the robot, so that no additional PC is needed.

So now we have everything together: we have our AI cameras, we have a 3D camera, and we can connect them to a robot. We can solve the application. Until now you have seen only images and examples, so here is something from real life. We still don't often see real applications in warehouses, because a lot of them are still under development. But a setup may look like this: you have a robot with a box next to it, and if you take a closer look at the system, there is a 2D camera, an IDS NXT system, and an Ensenso, which then pick the items out of the shelf and put them into the box. It could also go the other way: pick them out of the box and bring them to your shelf.

So I hope that in the last minutes I could show you that when we combine two different technologies, in our case AI with IDS NXT and 3D vision with the Ensenso, we can arrive at completely new solutions. And with this combination you are perfectly prepared for today's and tomorrow's picking tasks.
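As a closing illustration, the pipeline from this talk, 2D object detection plus a registered depth image, can be sketched as follows. The data layout (a flat row-major depth array aligned with the 2D image, with NaN marking invalid pixels) is an assumption for illustration, not the Ensenso data format:

```python
import math
from statistics import median

# Sketch under assumptions: the depth map is registered to the 2D image,
# stored as a flat row-major list of metres, with NaN for invalid pixels.
def grasp_point(box, depth, width):
    # Combine the detector's 2D bounding-box centre with a robust depth
    # estimate (median over the box) from the 3D camera.
    x, y, w, h = box
    zs = [depth[r * width + c]
          for r in range(y, y + h)
          for c in range(x, x + w)
          if not math.isnan(depth[r * width + c])]
    return (x + w / 2, y + h / 2, median(zs))

# 4x4 toy depth map; only the detected box (x=1, y=1, w=2, h=2) has data.
nan = float("nan")
depth_map = [nan] * 16
depth_map[5], depth_map[6], depth_map[9] = 0.48, 0.50, 0.52
print(grasp_point((1, 1, 2, 2), depth_map, 4))  # → (2.0, 2.0, 0.5)
```

Taking the median rather than a single centre pixel makes the depth estimate robust to missing or noisy 3D points inside the box, which matters for the low-texture goods discussed in the talk.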