
Modeling object arrangement patterns and picking arranged objects

Kazuyuki Nagata & Takao Nishi
Pages 981-994 | Received 25 Nov 2020, Accepted 07 Jun 2021, Published online: 07 Jul 2021

ABSTRACT

This study investigates object picking with a focus on object arrangement patterns. Objects stored in distribution warehouses or stores are arranged in regular patterns, and the grasping strategy for object picking is selected according to the object arrangement pattern. However, object arrangement patterns have not previously been modeled for object picking. In this study, we represent objects as polyhedral primitives, such as cuboids or hexagonal prisms, and model object arrangements by considering the occlusion patterns of the object model surfaces and whether the adjacent object occluding each surface is moveable. We define grasp patterns based on combinations of grasp surfaces and discuss the grasping strategy when the grasp surfaces are occluded by adjacent objects. We then introduce a newly developed gripper for picking arranged objects. The gripper comprises a suction gripper and a two-fingered gripper. The suction gripper has a telescopic arm and a swing suction cup. The two-fingered gripper mechanism combines a Scott Russell linkage and a parallel link. This mechanism is advantageous for reaching into narrow spaces and inserting a finger between objects. We demonstrate the picking up of arranged objects using the gripper.


1. Introduction

Object picking is a fundamental task in the field of robotics. The most fundamental robotic manipulation task is bin picking, which refers to the robotic task of picking up an object placed randomly among other objects. Bin picking in a factory can use the computer-aided design (CAD) models of the target object. Many previous studies have focused on vision-guided bin picking, in which the 3D scan data of the object are matched to the CAD model, and the object pose is estimated from the match [Citation1–6]. Kirkegaard adopted this approach and used the harmonic shape contexts feature for pose estimation [Citation2]. Buchholz used random sample matching (RANSAM) and the iterative closest point (ICP) algorithm for object localization [Citation5].

Until recently, object picking was realized only in factory settings [Citation1–9], but more recent studies have investigated object picking in homes, warehouses and stores [Citation10–18]. However, most object picking tasks cannot use the CAD models, and simplified approximation models such as shape primitives [Citation10], superquadrics [Citation11], and bounding boxes [Citation12] are used.

The Amazon Picking Challenge (APC) 2015 and 2016 and the Amazon Robotic Challenge (ARC) 2017 were organized to encourage the advancement of state-of-the-art object picking tasks in warehouses. In the APC and ARC, different items were placed randomly in a storage receptacle, and participating teams attempted to pick up and store the items with robots multiple times [Citation19–22]. The APC and ARC triggered the advancement of randomized picking tasks, and a significant amount of research on randomized picking has since been carried out [Citation23–29]. Several studies on randomized picking have used deep learning techniques [Citation19–22, Citation24–29].

In randomized picking, objects adjacent to the target are regarded as obstacles, and grasping strategies avoid or move the obstacles. Berenson studied grasp planning in a complex environment involving obstacles [Citation13]. Dogar proposed moving obstacles by pushing [Citation16]. Harada used learning algorithms to estimate whether the pick-up task would succeed when a finger touched an adjacent object [Citation9].

To achieve efficient storage and create attractive product displays, objects stored in distribution warehouses or stores are arranged in regular patterns, where similar items are placed, stacked, or lined up together. When objects are arranged in regular patterns, parts of their surfaces remain occluded. This limits the graspable areas on the objects, and it becomes necessary to manipulate the objects to reveal occluded surfaces. For instance, Figure 1(a) shows books lined up in a bookstand. In this case, the surfaces to be grasped by the fingers are the left and right sides of the book, which remain occluded by adjacent books. As seen in Figure 1(a), the book is tilted to reveal the hidden sides for grasping. Figure 1(b) shows a stack of books, where the top and bottom surfaces are used for grasping, and the bottom surface remains occluded. As shown in Figure 1(b), the book is slid forward to reveal the hidden surface for grasping. This illustrates how picking up objects arranged in regular patterns requires picking strategies corresponding to the arrangement patterns. Object arrangement patterns constrain the positions and postures of the objects, and an arrangement can be regarded as a set of virtual segments; objects in an arrangement are regarded as object groups placed in these virtual segments. Thus, object picking tasks for arranged objects must consider the arrangement patterns in addition to the shapes of the individual objects contained in the virtual segments. So far, object shapes have been modeled for object picking, but object arrangements have not. In this study, we accomplish object picking by focusing on the arrangement patterns of objects. Firstly, we represent objects as polyhedral primitives and model object arrangements using the surface occlusion patterns. Then, we define grasp patterns based on combinations of the grasp surfaces of the objects and discuss the grasping strategy when the grasp surfaces are occluded by adjacent objects. In addition, we demonstrate picking up arranged objects using a physical robot with a newly developed gripper. To the best of our knowledge, this is the first study on object picking focusing on object arrangement patterns.

Figure 1. Non-prehensile manipulation may be performed for object picking according to the object arrangement pattern. (a) Tilting an aligned book for picking. (b) Sliding the top book for picking.

The rest of the paper is organized as follows. In Section 2, we describe object arrangement models, and in Section 3, we outline the definitions of grasp patterns and explain the grasping strategies. In Section 4, we describe a picking experiment for arranged objects and conclude the paper in Section 5.

2. Modeling of an object arrangement pattern

The grasping strategy for object picking is selected according to whether the gripper can access the grasp surfaces of the target object. We model an object arrangement pattern by the occlusion pattern of the object surfaces. To describe the object arrangement pattern, we define the following surfaces.

  • Accessible surface: A surface on the target object that a gripper can access without interference from adjacent objects or the surroundings.

  • Occluded surface: A surface occluded by adjacent objects or the surroundings, which a gripper cannot access for grasping. The occluded surfaces are the surfaces of the object that are not accessible.

  • Grasp surface: A surface on the target object that a gripper can potentially use for grasping. Grasp surfaces are defined by the features of the gripper: for a two-fingered gripper, the grasp surfaces are two opposing surfaces of the object that fit within the finger stroke, and for a vacuum gripper, a grasp surface is a flat surface of the target object.

2.1. Object model for describing object arrangement patterns

There are a wide variety of objects in warehouses and stores, and CAD models of these objects usually cannot be obtained. We therefore model an object with a simplified shape primitive for describing the object arrangement pattern. Figure 2 depicts two typical patterns of objects lined up on a flat surface. Figure 2(a) shows objects lined up front to back and left to right; many objects are arranged in this pattern. In such cases, an object contacts surrounding objects at up to four places and can be modeled as a cuboid. Figure 2(b) shows circular objects lined up closely, with up to six points of contact with surrounding objects. In this case, we model the objects as hexagonal prisms.

Figure 2. Two arrangement patterns of circular objects on a plane. (a) Arrangement pattern 1, (b) Arrangement pattern 2.

Objects stored in warehouses or stores are placed on shelves, with similar objects grouped together. The surface of an object seen head-on on the shelf is regarded as the front of the object. The surfaces of the object model are labeled as in Figure 3, in the following order:

$$F_{\mathrm{cub}} = \{\mathrm{Bottom},\ \mathrm{Top},\ \mathrm{Front},\ \mathrm{Right},\ \mathrm{Back},\ \mathrm{Left}\} \tag{1}$$

$$F_{\mathrm{hex}} = \{\mathrm{Bottom},\ \mathrm{Top},\ \mathrm{Front},\ \mathrm{Right\_Front},\ \mathrm{Right\_Back},\ \mathrm{Back},\ \mathrm{Left\_Back},\ \mathrm{Left\_Front}\} \tag{2}$$

Figure 3. Labeling the faces of the object model. (a) The faces of the cuboid model and (b) the faces of the hexagonal prism model.

2.2. Describing an object arrangement pattern

The grasping strategy for object picking is selected depending on which grasp surfaces of the object are occluded by adjacent objects. Additionally, different grasping strategies can be selected depending on whether the adjacent object occluding a grasp surface can be moved. For instance, if the adjacent object can be moved, the occluded grasp surface can be revealed simply by moving the adjacent object. Moreover, if the adjacent object is a fixed structure such as a wall, the occluded grasp surface can be revealed by moving the target object while pressing it against the adjacent object, as shown in Figure 4. Therefore, we model the object arrangement pattern by considering the occlusion pattern and whether the adjacent objects occluding the surfaces are moveable. The object arrangement pattern is described as follows:

$$F = \{f_1, f_2, \ldots, f_i, \ldots, f_n\}, \quad n = 6 \text{ or } 8 \tag{3}$$

where $f_i$ ($i = 1, \ldots, n$) expresses whether surface $i$ is occluded by an adjacent object and whether that object is moveable: $f_i = 0$ if surface $i$ is an accessible surface; $f_i = 1$ if surface $i$ is an occluded surface and the adjacent object occluding it can be moved; and $f_i = 2$ if surface $i$ is an occluded surface and the adjacent object occluding it cannot be moved. There are therefore $3^n$ object arrangement patterns. The following ternary number is introduced to identify an object arrangement:

$$Id = \sum_{i=1}^{n} f_i \, 3^{\,i-1} \tag{4}$$
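To make the descriptor concrete, the following minimal Python sketch (ours, not code from the paper) encodes Equations (3) and (4) for the cuboid model. The example arrangement, an object on a shelf with a moveable neighbor on its right, reappears in Section 4.

```python
# A minimal sketch of the arrangement descriptor of Equations (3) and (4)
# for the cuboid model. Face order follows Equation (1):
# Bottom, Top, Front, Right, Back, Left.
ACCESSIBLE, OCCLUDED_MOVEABLE, OCCLUDED_FIXED = 0, 1, 2

def arrangement_id(f):
    """Ternary identifier Id = sum_i f_i * 3**(i-1) of Equation (4)."""
    assert all(fi in (0, 1, 2) for fi in f)
    return sum(fi * 3 ** i for i, fi in enumerate(f))

# Example: an object resting on a shelf (Bottom occluded by a fixed surface)
# with a moveable neighbor on its Right.
f = (OCCLUDED_FIXED, 0, 0, OCCLUDED_MOVEABLE, 0, 0)
print(arrangement_id(f))  # -> 29
```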

Figure 4. Sliding an object up while pressing it against the wall with the gripper.

If Id = 0, the object is floating in the air, and $Id_{\mathrm{cub}} = 222222_{(3)}$ or $Id_{\mathrm{hex}} = 22222222_{(3)}$ represents an object placed inside a box. Figure 5 shows representative object arrangement patterns of the cuboid model.

Figure 5. Representative object arrangement patterns of the cuboid model. The yellow cuboid is a target object, orange objects are moveable objects, and gray faces represent immovable objects.

In some cases, only part of a surface is occluded in an object arrangement, or there is a gap between the target and an adjacent object. This is regarded as an intermediate state between the surface being fully occluded and not occluded at all. For instance, Figure 6(a) depicts the case where part of the object protrudes from the desk. In this case, the object arrangement is regarded as an intermediate state between Id = 0 and Id = 2 in Figure 5. Figure 6(b) shows a target object that is in the middle and taller than the objects on its left and right, with the left object taller than the right object. In this case, the object arrangement is regarded as an intermediate state among the three states Id = 2, 245, and 272 in Figure 5. Whether a gripper can grasp partially occluded surfaces depends on the size of the exposed area or of the gap between the surface and the adjacent objects, and on mechanical features of the gripper such as its type, shape, grip force, and control accuracy.

Figure 6. Object arrangement patterns in which parts of the surfaces are occluded. (a) An object protruding from a desk and (b) the middle object is the target object and is taller than the left and right objects, and the left object is taller than the right object.

3. Picking arranged objects

In this section, we present a strategy for picking up arranged objects.

3.1. Grasp pattern

The grasp surfaces of an object are determined by the mechanical features of the gripper, such as its type, shape, and stroke. Moreover, there are several grasping methods even for the same gripper, depending on which surfaces of the object are grasped. The combination of grasp surfaces on the model when an object is grasped is called the grasp pattern and is described as follows:

$$G = \{g_1, g_2, \ldots, g_i, \ldots, g_n\}, \quad n = 6 \text{ or } 8 \tag{5}$$

where $n$ is the number of surfaces in the object model and $g_i$ ($i = 1, \ldots, n$) is a binary value that represents whether face $i$ is a grasp surface: $g_i = 1$ if the $i$-th face is a grasp surface, and $g_i = 0$ otherwise. The faces are ordered as in Equations (1) and (2). Figure 7 shows the grasp patterns of a cuboid model when a two-fingered gripper can grasp any face of the object that fits within the finger stroke. In this case, the following three grasp patterns can be defined:

$$G_1 = \{1, 1, 0, 0, 0, 0\}, \quad G_2 = \{0, 0, 0, 1, 0, 1\}, \quad G_3 = \{0, 0, 1, 0, 1, 0\} \tag{6}$$
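The grasp patterns of Equation (6) can be written down directly; the following short sketch (ours) encodes them as binary tuples and checks that each pattern pairs two opposing faces of the cuboid, as a two-fingered pinch requires.

```python
# Grasp patterns of Equation (6) as binary tuples.
# Face order: Bottom, Top, Front, Right, Back, Left (Equation (1)).
OPPOSING = {0: 1, 2: 4, 3: 5}  # Bottom-Top, Front-Back, Right-Left

G1 = (1, 1, 0, 0, 0, 0)  # pinch Bottom and Top
G2 = (0, 0, 0, 1, 0, 1)  # pinch Right and Left
G3 = (0, 0, 1, 0, 1, 0)  # pinch Front and Back

for g in (G1, G2, G3):
    a, b = [i for i, gi in enumerate(g) if gi == 1]
    assert OPPOSING.get(a) == b or OPPOSING.get(b) == a  # opposing-face pair
```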

Figure 7. Three grasp patterns of a cuboid model when the object is grasped by a two-fingered gripper.

3.2. Grasping strategy for picking up arranged objects

A grasping strategy is an action sequence for picking up an object from an arrangement. To pick up an object, the following conditions must be satisfied:

$$f_2 \neq 1 \tag{7}$$

$$\forall k \ \ \mathrm{s.t.}\ \ \mathbf{n}_k \cdot \mathbf{d} > 0, \quad f_k = 0 \tag{8}$$

$$\sum_{i=1}^{n} f_i g_i = 0 \tag{9}$$

where $\mathbf{n}_k$ is the outward normal vector of surface $k$, and $\mathbf{d}$ is the direction in which the object is picked up. Equation (7) expresses the condition that no moveable object is on top of the target object, Equation (8) the condition that no adjacent object is present in the direction in which the object is picked up, and Equation (9) the condition that all grasp surfaces are accessible. The gripper must be able to approach and pick up the object, and there must be no adjacent objects along the approach trajectory; if the approach direction is opposite to the pick-up direction, this requirement is covered by Equation (8). When Equations (7)–(9) are all satisfied, the graspable condition holds. If a grasp pattern that satisfies the graspable condition exists, the object can be directly grasped and picked up using that grasp pattern. Otherwise, a different grasping strategy is required to pick up the arranged object. A sketch of the graspable condition check is given below, followed by descriptions of typical grasping strategies.
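The following minimal Python sketch (ours, not the authors' implementation) evaluates Equations (7)–(9) for the cuboid model; the two example arrangements correspond to identifiers derived in Section 4.

```python
import numpy as np

# Graspable condition of Equations (7)-(9) for the cuboid model.
# Face order: Bottom, Top, Front, Right, Back, Left; outward unit normals
# in an object frame with x = Front, y = Right, z = Top.
NORMALS = np.array([
    [0, 0, -1],   # Bottom
    [0, 0,  1],   # Top
    [1, 0,  0],   # Front
    [0, 1,  0],   # Right
    [-1, 0, 0],   # Back
    [0, -1, 0],   # Left
], dtype=float)

def graspable(f, g, d):
    """True if grasp pattern g can pick up an object in arrangement f along d."""
    f = np.asarray(f)
    d = np.asarray(d, dtype=float)
    eq7 = f[1] != 1                                   # no moveable object on top
    eq8 = all(f[k] == 0 for k in range(len(f))
              if NORMALS[k] @ d > 0)                  # pick-up direction is clear
    eq9 = int(f @ np.asarray(g)) == 0                 # grasp surfaces accessible
    return eq7 and eq8 and eq9

# A single object on a shelf (Id = 2) can be pinched Right/Left and lifted,
# but the middle object of a row (Id = 272) cannot: its sides are occluded.
print(graspable((2, 0, 0, 0, 0, 0), (0, 0, 0, 1, 0, 1), (0, 0, 1)))  # True
print(graspable((2, 0, 0, 1, 0, 1), (0, 0, 0, 1, 0, 1), (0, 0, 1)))  # False
```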

S1:

Remove the adjacent objects. In this case, the adjacent objects must be moveable and graspable with the gripper. If Equation (7) or (8) is not satisfied, this strategy should be selected. If an adjacent object has no grasp pattern that satisfies the graspable condition, a grasping strategy for removing that adjacent object is selected according to its own arrangement state. If that strategy in turn requires removing an object next to the adjacent object, a grasping strategy for removing that object is selected, and so on. This is repeated until all adjacent objects that must be removed to pick up the original target have been removed.

S2:

Grasp within a gap. Insert a finger between the target object and the adjacent object to access the occluded grasp surface and grasp it. In this case, the object arrangement state can be regarded as an intermediate state. Whether a finger can be inserted into the gap depends on the size of the gap, the size of the finger, and the control accuracy of the robot. The gap must be detected by tactile or visual sensors, and the required control accuracy is determined by the gap size: the narrower the gap, the more difficult it is to insert a finger.

S3:

Non-prehensile manipulation. Apply non-prehensile manipulations to the accessible surfaces of the target object to expose the occluded grasp surfaces, then grasp the target. Non-prehensile manipulation translates or rotates an object without grasping it. Figure 8 shows representative non-prehensile manipulations in which a finger touches one surface of the object. Pushing and dragging are sliding manipulations executed by contact between the finger and the side or top face of the object, respectively. Pulling is a sliding manipulation executed by sucking the side face of the object with a vacuum gripper. Turning is a rotating manipulation around the vertical axis, executed by contact between the finger and the side face of the object. Tilting is a manipulation that tilts an object with the finger in contact with the top face of the object. Raising is a manipulation that raises an object with the finger pushing the object against a wall. The non-prehensile manipulation technique is selected according to the object arrangement pattern and the gripper type.

The object picking task for arranged objects can be expressed by a state transition diagram whose nodes are the object arrangement patterns described in Section 2.2. Actions are assigned to the state transitions between nodes. The grasping strategy can be generated by obtaining the action sequence that transitions the object state from the arranged state to a graspable condition, as sketched below. In generating the grasping strategy, the path of state transitions and the actions between nodes depend on the object type and on mechanical features of the gripper such as its type, shape, finger stroke, grip force, and suction force. Thus, the grasping strategy should be generated by considering both the objects and the gripper. When transitioning between two object arrangement patterns, the graspable condition may be satisfied in an intermediate state before the occluded grasp surfaces are completely exposed. Whether an intermediate state satisfies the graspable condition depends on the conditions described in Section 2.2. Therefore, it is necessary to evaluate in advance the size of the grasp surfaces required to grasp the object with a real gripper.
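One way to realize this search, under our own simplifying assumptions, is a breadth-first search over the state transition diagram. In the sketch below, `transitions(f)` and `is_graspable(f)` are placeholders that must be supplied for a concrete object and gripper pair; they are not defined in the paper.

```python
from collections import deque

def plan_strategy(f_start, is_graspable, transitions, max_depth=5):
    """Return the shortest action sequence reaching a graspable state, or None.

    f_start      -- arrangement descriptor of Section 2.2
    is_graspable -- tests the graspable condition of Section 3.2
    transitions  -- enumerates (action, next_f) pairs for one object/gripper
    """
    queue = deque([(tuple(f_start), [])])
    visited = {tuple(f_start)}
    while queue:
        f, actions = queue.popleft()
        if is_graspable(f):
            return actions
        if len(actions) >= max_depth:
            continue
        for action, f_next in transitions(f):
            key = tuple(f_next)
            if key not in visited:
                visited.add(key)
                queue.append((key, actions + [action]))
    return None
```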

Figure 8. Non-prehensile manipulations to expose occluded grasp surfaces.

4. Experiment on picking up arranged objects

Here, we demonstrate examples of picking up arranged objects with a physical robot. The object picking system comprises a manipulator and a 3D camera, as shown in Figure 9. The manipulator was a robotic arm (LBR iiwa 14 R820, KUKA Robotics) equipped with a newly developed gripper at its tip, and the 3D camera was an Astra S (Orbbec 3D Tech. Intl. Inc.).

Figure 9. Overview of the object picking system.

In the experiment, the object models were given to the system, and real objects lined up horizontally and vertically on a shelf, as shown in Figure 10, were picked up. Horizontal and vertical stacks are typical arrangement patterns.

Figure 10. Objects arranged on a shelf.

Picking up arranged objects, as when refilling shelves with products, may involve repeatedly picking the same type of object. Therefore, a grasping strategy that can be applied to many objects is given preference in strategy selection. Grasping strategy selection must consider the directions in which the target object can be moved to transition the object arrangement state. To describe the motion of the target object, we introduce a right-handed coordinate system in which Front is the +x direction and Right is the +y direction. Translation and rotation along each axis are expressed as Trans{+,−}{x,y,z} and Rot{+,−}{x,y,z}, respectively.

In Figure 10, the object arrangement patterns of the objects at the left end, middle, and right end of the left stack are denoted by Id = 29, Id = 272, and Id = 245, respectively. All objects in the left stack can move along Trans{+x}, Trans{+z}, and Rot{+,−}{y}. The left-end object can additionally move along Trans{−y} and Rot{+x}, and the right-end object can additionally move along Trans{+y} and Rot{−x}. If the robot picks up all objects from the left or right end of the stack, the same grasping strategy can be applied to every object in the stack.

In the right stack, only the top object satisfies the condition of Equation (7). The object arrangement pattern of the top object is Id = 1, and it can move along Trans{+x}, Trans{+,−}{y}, Trans{+z}, and Rot{+,−}{z}. In this case, the robot should pick up objects from top to bottom, and the same grasping strategy can be applied to all objects.
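The identifiers quoted above can be verified directly with Equation (4); the face assignments below are our reading of the arrangement in Figure 10 (the shelf is a fixed surface below each object, and row neighbors are moveable).

```python
# Checking the identifiers quoted above with Equation (4)
# (face order: Bottom, Top, Front, Right, Back, Left).
def arrangement_id(f):
    return sum(fi * 3 ** i for i, fi in enumerate(f))

left_end     = (2, 0, 0, 1, 0, 0)  # shelf below; moveable neighbor on Right
middle       = (2, 0, 0, 1, 0, 1)  # shelf below; moveable neighbors on both sides
right_end    = (2, 0, 0, 0, 0, 1)  # shelf below; moveable neighbor on Left
top_of_stack = (1, 0, 0, 0, 0, 0)  # moveable object below

print([arrangement_id(f) for f in (left_end, middle, right_end, top_of_stack)])
# -> [29, 272, 245, 1]
```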

A concrete grasping strategy for object picking depends on the gripper type. We now describe the gripper developed for object picking in this study.

4.1. Developed gripper

As shown in Figure 11, the developed gripper comprises a two-fingered gripper and a suction gripper. To reach into the narrow spaces of shelves and pick up a wide variety of objects, the gripper base should be compact and the finger stroke wide. Moreover, to allow a finger of the two-fingered gripper to be inserted between objects, parallel rectilinear motion of both fingers during opening and closing is desirable. To date, many two-fingered grippers have been developed using rack-and-pinion, feed-screw, or parallel-link mechanisms. Two-fingered grippers using a rack-and-pinion or feed-screw mechanism can open and close their fingers with parallel rectilinear motion, but the gripper base is wider than the finger stroke, which is disadvantageous for reaching into narrow spaces. A two-fingered gripper using a parallel-link mechanism allows a compact base but does not provide linear motion of the fingertips during opening and closing; the length from the base to the fingertip changes with the finger stroke width, which makes grasp planning difficult. In this study, the two-fingered gripper mechanism was developed by combining a Scott Russell linkage and a parallel link. This retains a compact gripper base while enabling parallel rectilinear motion of the fingers during opening and closing. The fingers of the two-fingered gripper are 115 mm from the base to the fingertip, the finger stroke is 0–150 mm, and the width of the gripper base is 64 mm. A suction gripper was also developed for the object picking system. The suction gripper has a telescopic arm, and as shown in Figure 11(b), the suction cup has a swing range of 0–90°. The telescopic stroke of the suction gripper is 0–148 mm, and when the arm is extended, the suction cup protrudes 33 mm beyond the tip of the two-fingered gripper. The diameter of the suction cup is 30 mm. Moreover, the suction gripper has a pressure sensor to detect contact between the suction cup and an object. Overall, the gripper is 411 mm long from the bottom of the base to the fully extended suction cup. Several grippers combining a two-fingered gripper and a suction gripper have previously been developed [Citation30, Citation31], but since their fingertip trajectories are nonlinear, strategies that insert a finger between arranged objects are difficult to adopt with them.

Figure 11. Overview of the developed gripper.

4.2. Design of the grasping strategy with the developed gripper

In the initial condition of the gripper, the fingers of the two-fingered gripper are open, the suction arm is contracted, and the suction cup is straight. The basic actions of the gripper are as follows:

(A):

pinch by the two-fingered gripper

(B):

object suction by the suction gripper

(C):

extension of the suction arm

(D):

contraction of the suction arm

(E):

rotating the suction cup to 90°

(F):

returning the suction cup to 0°

Figure 12 shows the basic actions of the suction gripper. By combining these basic actions, the following strategies can be adopted (as shown in Figure 13).

Figure 12. Basic actions of the suction gripper.

Figure 13. Grasping strategies of the gripper.

M1: A

M2: C+B

M3: C+E+B

M4: C+B+D+A

M5: C+E+B+F+D+A

M6: M2+M1

M7: M3+M1

M1 involves moving the gripper to the grasping point through the motion of the arm and pinching the object with the two-fingered gripper (Figure 13(a)). M2 involves extending the suction arm, bringing the suction cup into contact with the object through the motion of the arm while sucking air, and grasping the object by suction (Figure 13(b)); contact between the suction cup and the object is detected by the pressure sensor. M3 involves extending the suction arm, rotating the suction cup by 90°, bringing the suction cup into contact with the object through the motion of the arm while sucking air, and grasping the object by suction (Figure 13(c)); contact is again detected by the pressure sensor. This strategy is selected when the top of the object is exposed. M4 involves strategy M2, followed by pulling the object between the two fingers by contracting the suction arm and then pinching the object (Figure 13(d)). M5 involves strategy M3, followed by returning the suction cup to 0°, pulling the object between the two fingers by contracting the suction arm, and pinching the object (Figure 13(e)). M4 and M5 grasp the object with both the two-fingered gripper and the suction gripper to achieve a stable grasp. M6 and M7 begin with M2 and M3, respectively; the motion of the arm then reveals the occluded surface, the object is temporarily released, and it is then grasped with the two-fingered gripper (Figure 13(f,g)). These strategies therefore involve a regrasping action. A sketch of the strategies as action sequences is given below.
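The following sketch (ours) writes M1–M7 as sequences of the basic actions (A)–(F). The intermediate `"move"` and `"release"` labels in M6 and M7 are our reading of the regrasping description above; the executors bound to each label are hardware-specific and not shown.

```python
# Strategies M1-M7 as sequences of the basic actions (A)-(F).
BASIC_ACTIONS = {
    "A": "pinch with two-fingered gripper",
    "B": "suction on",
    "C": "extend suction arm",
    "D": "contract suction arm",
    "E": "rotate suction cup to 90 deg",
    "F": "return suction cup to 0 deg",
}

STRATEGIES = {
    "M1": ["A"],
    "M2": ["C", "B"],
    "M3": ["C", "E", "B"],
    "M4": ["C", "B", "D", "A"],
    "M5": ["C", "E", "B", "F", "D", "A"],
}
# M6 = M2 + M1 and M7 = M3 + M1, with an arm motion that exposes the
# occluded surface and a temporary release in between (regrasping).
STRATEGIES["M6"] = STRATEGIES["M2"] + ["move", "release"] + STRATEGIES["M1"]
STRATEGIES["M7"] = STRATEGIES["M3"] + ["move", "release"] + STRATEGIES["M1"]

def execute(strategy, dispatch):
    """Run a strategy by dispatching each action label to a robot handler."""
    for action in STRATEGIES[strategy]:
        dispatch(action)
```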

The grasping strategy should be designed according to the objects and the gripper. Using the developed gripper, grasping strategies were designed for picking up the arranged objects shown in Figure 10. Strategies that can be applied to many objects and that require the fewest motion steps were prioritized.

First, we designed the grasping strategies for picking up the horizontally stacked objects, shown on the left in Figure 10. The objects cannot be directly grasped by the two-fingered gripper, as the left face, the right face, or both are occluded by adjacent objects. In contrast, the front and top faces of all objects, and the left and right faces of the far-left and far-right objects, respectively, are exposed and thus graspable with the suction gripper. However, when grasping the left or right face with the suction gripper, the object may topple sideways when the suction cup is pushed onto the surface. Further, the top face of the object is difficult to sense visually with the camera. Thus, the front face is the ideal grasp surface for the suction gripper, and a strategy that moves the object along Trans{+x} for grasping was selected. The applicable strategies in this case are M2, M4, and M6. When grasping is performed with the suction gripper alone, as in strategy M2, the object tilts because the suction cup is soft, as shown in Figure 14. Moreover, as grasping strategy M6 involves regrasping, the number of motion steps increases. Thus, the object is picked up using strategy M4, which uses both the two-fingered gripper and the suction gripper and therefore achieves a stable grasp.

Figure 14. The object tilts when sucking the side of the object with the suction gripper.

Next, we designed the grasping strategies for picking up the vertically stacked objects, shown on the right in Figure 10. An object cannot be grasped directly by the two-fingered gripper because its bottom face is occluded by the object beneath it. Thus, the suction gripper is needed to pick up the objects. For the suction gripper, the grasp surfaces are the front, right, and left faces, as well as the top face of the top object. In this case, the front, right, and left faces of the objects are vertically small and therefore require highly precise positioning when placing the suction cup. Thus, the ideal grasp surface for the suction gripper is the top face, and a strategy that moves the object along Trans{+x} or Trans{+z} for grasping was selected. In this scenario, strategies M3 and M7 are applicable. As the target object was light, M3 is the ideal strategy owing to the fewer motion steps involved. For heavy objects that cannot be held by the suction gripper alone, grasping strategy M7 must be applied.

4.3. Experiments

The grasping points for each grasp pattern of the two-fingered gripper and the suction gripper are set in the object model in advance. In the experiment, the object arrangement states are first recognized. The robot selects the grasping strategy based on the recognition result, which determines the gripper and the grasp pattern used for picking. Next, the position and posture of each individual object in the arrangement are estimated, and the robot obtains the grasping points in the world frame using the object model. The robot then picks up the objects by applying the selected grasping strategy.

Recognition of the object arrangement was performed using the technique described in [Citation32], which segments regions sharing the same object arrangement state and identifies the arrangement pattern. Specifically, segmentation was performed by inputting two-dimensional RGB images of the scene, detecting individual objects as bounding boxes (BBs) through general object detection, defining a four-dimensional feature vector whose elements are the central position (x, y), width, and height of each BB, and clustering the feature vectors through density-based spatial clustering of applications with noise (DBSCAN) [Citation33]; a sketch of this clustering step is given below. Then, the object BBs in each cluster were converted into 1-of-K vectors using self-organizing maps (SOM) [Citation34], and their sum was used to create a bag-of-words [Citation35]. A support vector machine (SVM) [Citation36] was used to identify horizontal and vertical stacks in the segmentation. Figure 15(a) shows the general object detection results for boxed objects, and Figure 15(b) shows the segmentation results identifying the arranged object regions and arrangement patterns.
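As a minimal sketch of the clustering step, the following uses scikit-learn's DBSCAN on the 4-D BB features described above. The box coordinates, `eps`, and `min_samples` are illustrative values we chose, not those used in the paper.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Cluster per-object bounding boxes by the 4-D feature
# (center x, center y, width, height).
boxes = np.array([           # [cx, cy, w, h] in pixels, from a detector
    [100, 200, 60, 90], [165, 200, 60, 90], [230, 200, 60, 90],  # a row
    [500, 120, 80, 50], [500, 175, 80, 50],                      # a stack
], dtype=float)

labels = DBSCAN(eps=80.0, min_samples=2).fit_predict(boxes)
print(labels)  # e.g. [0 0 0 1 1]: same label = same arrangement region
```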

Figure 15. Recognition of the arranged objects and stacks. (a) Detection of the arranged objects and (b) recognition of the object arrangement.

Next, the position and posture of each individual object were estimated. The positions of arranged objects vary according to the arrangement pattern, but the objects are lined up in virtually the same posture. Using this fact, the 3D point cloud was compared with the object model, and the position and posture of each object were estimated; the 3D shape of the object was obtained from the object model. For horizontally stacked objects, the point cloud in the BB of the end object was matched with the model, and for vertically stacked objects, the point cloud in the BB of the top object was matched. The position and posture were obtained using the 3D features termed the Shape Index [Citation37, Citation38]. To estimate the positions and postures of the second and subsequent objects, the previously obtained object position was translated horizontally along the front of the object arrangement region for horizontally stacked objects, and vertically for vertically stacked objects; a sketch of this propagation is given below. The position and posture estimation were then performed in the same way, by matching the point cloud in the BB with the model. This reduces the computational cost of estimating the positions and postures of the second and subsequent objects in the same arrangement pattern.
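The propagation step can be sketched as follows (our formulation, not the paper's code). The `pitch` parameter, the spacing between neighboring objects, is a hypothetical quantity here, e.g. taken from the object model's width or height; the refinement by model matching inside the next BB is not shown.

```python
import numpy as np

def propagate_pose(prev_position, rotation, direction, pitch):
    """Initial pose guess for the next object in a row or stack.

    The previous estimate is translated along the arrangement direction;
    the posture (rotation) is assumed shared across the group.
    """
    direction = np.asarray(direction, dtype=float)
    direction /= np.linalg.norm(direction)
    next_position = np.asarray(prev_position, dtype=float) + pitch * direction
    return next_position, rotation
```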

By identifying the object arrangement pattern and estimating the positions and postures of the objects in the arrangement, the grasping strategy and its grasping points were obtained. Then, the robot started the object picking task by applying the grasping strategy. The procedure for object picking is as follows. First, the gripper moves to the starting point according to the selected grasping strategy; in our experiment, the starting point lies on the line that passes through the grasping point perpendicular to the grasp surface, at a distance from the grasping point set to a constant value in advance (see the sketch below). The robot then grasps the object by applying the selected grasping strategy. After grasping the object, the robot moves to an escape point, which is the same as the starting point, and then brings the object to the predetermined destination point.
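The starting and escape point can be computed as below (our sketch). The standoff distance of 0.15 m is a hypothetical value; the paper states only that a constant distance was set in advance.

```python
import numpy as np

def approach_point(grasp_point, surface_normal, standoff=0.15):
    """Starting/escape point on the line through the grasping point,
    along the outward normal of the grasp surface."""
    n = np.asarray(surface_normal, dtype=float)
    n /= np.linalg.norm(n)
    return np.asarray(grasp_point, dtype=float) + standoff * n
```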

Figures 16 and 17 depict object picking when objects are placed in horizontal and vertical stacks, respectively; the grasping strategies (M4 and M3, respectively) were selected based on the 3D camera recognition results.

Figure 16. Picking an object from a horizontal stack using grasping strategy M4.

Figure 17. Picking an object from a vertical stack using grasping strategy M3.

In the experiment on picking up a horizontally stacked object, the gripper moved to the starting point for picking the object at the far left (Figure 16(a)), and the suction cup made contact with the front of the object at the grasping point (Figure 16(b)). The suction arm then contracted to pull out the object, and the object was grasped with the two-fingered gripper (Figure 16(c)), moved to the escape point (Figure 16(d)), and then to the destination point.

In the experiment on picking up a vertically stacked object, the gripper moved to the starting point for picking the top object (Figure 17(a)), the suction gripper approached the grasping point on the top surface of the object with the suction cup facing downwards, the object was held by suction (Figure 17(b)), moved to the escape point (Figure 17(c)), and then moved to the destination (Figure 17(d)).

5. Conclusions

In this paper, we described object picking focusing on object arrangement patterns. First, we defined accessible surfaces, occluded surfaces, and grasp surfaces from the viewpoint of finger access to the object surfaces for grasping. We represented a variety of objects as polyhedral primitives, such as cuboids or hexagonal prisms, and modeled object arrangements by considering the combination of occluded surfaces of the object model and whether the adjacent object occluding each surface is moveable. Moreover, grasp patterns were defined as combinations of grasp surfaces, the graspable condition for arranged objects was derived, and the grasping strategy for picking up an arranged object was discussed.

In addition, we introduced a newly developed gripper for picking arranged objects. The gripper comprises a suction gripper and a two-fingered gripper. The suction gripper has a telescopic arm and a swing suction cup, and the two-fingered gripper mechanism combines a Scott Russell linkage and a parallel link. This mechanism is advantageous for reaching into narrow spaces and inserting a finger between objects. We described the design of grasping strategies for picking up horizontally and vertically arranged cuboidal objects using the newly developed gripper and conducted experiments on picking up the objects. The basic actions of the gripper can expose the occluded surfaces of objects placed in other arrangement states, so grasping strategies for such states may be generated by combining these basic actions. In the experiments, the grasping strategies were designed manually. Automatic generation of grasping strategies is a challenging problem that involves task planning and grasp planning, and we plan to investigate it in future work.

The object arrangement patterns in the experiment are relatively simple but typical. More complex arrangement patterns may be present among objects stored in warehouses or stores, where only parts of the object surfaces are occluded, for instance, when there are gaps between objects or when objects are stacked in a tilted pile. In the future, we also plan to investigate picking up various types of objects arranged in complicated patterns, with the aim of automating robotic tasks in distribution warehouses or stores.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This work was supported by JSPS KAKENHI [grant number JP17H01805].

Notes on contributors

Kazuyuki Nagata

Kazuyuki Nagata received his B.S. and Ph.D. in Engineering from Tohoku University, Japan, in 1986 and 1999, respectively. He joined the Tohoku National Industrial Research Institute (TNIRI), Ministry of International Trade and Industry, in 1986, was assigned to the Electrotechnical Laboratory (ETL) in 1991, and moved to the Planning Headquarters of the National Institute of Advanced Industrial Science and Technology (AIST) in 2001. He is currently a senior research scientist at AIST. His current research interests include the mechanics of robot hands, robotic manipulation, and grasping.

Takao Nishi

Takao Nishi received his Ph.D. in Agriculture from Okayama University in 1999. After working at the National Institute of Advanced Industrial Science and Technology (AIST), he has been a specially appointed associate professor at the Graduate School of Engineering Science, Osaka University since 2020. His research interests include computer vision, intelligent robotic systems, and agricultural machinery.

References

  • Berger M, Bachler G, Scherer S. Vision guided bin picking and mounting in a flexible assembly cell. In: Logananthara R, Palm G, Ali M editors. Intelligent Problem Solving. Methodologies and Approaches. IEA/AIE 2000. Lecture Notes in Computer Science. Vol. 1821, Berlin, Heidelberg: Springer; 2000. p. 109–117.
  • Kirkegaard J, Moeslund TB. Bin-picking based on harmonic shape contexts and graph-based matching. In: International Conference on Pattern Recognition (ICPR); 2006. p. 581–584.
  • Fuchs S, Haddadin S, Keller M, et al. Cooperative bin-picking with Time-of-Flight camera and impedance controlled DLR lightweight robot III. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); Taipei, Taiwan; 2010. p. 4862–4867.
  • Choi C, Taguchi Y, Tuzel O, et al. Voting-based pose estimation for robotic assembly using a 3D sensor. In: IEEE International Conference on Robotics and Automation (ICRA); Saint Paul, MN, USA; 2012. p. 1724–1731.
  • Buchholz D, Futterlieb M, Winkelbach S, et al. Efficient bin-picking and grasp planning based on depth data. In: IEEE International Conference on Robotics and Automation (ICRA); Karlsruhe, Germany; 2013. p. 3230–3235.
  • Harada K, Yoshimi T, Kita Y. Project on development of a robot system for random picking -grasp/manipulation planner for a dual-arm manipulator. In: IEEE/SICE International Symposium on System Integration (SII); Tokyo, Japan; 2014. p. 583–589.
  • Dupuis DC, Leonard S, Baumann MA, et al. Two-fingered grasp planning for randomized bin-picking. In: Robotics: Science and Systems (RSS) 2008 Manipulation Workshop; 2008.
  • Domae Y, Okuda H, Taguchi Y, et al. Fast graspability evaluation on single depth maps for bin picking with general grippers. In: IEEE International Conference on Robotics and Automation (ICRA); Hong Kong, China; 2014. p. 1997–2004.
  • Harada K, Wan W, Tsuji T, et al. Initial experiments on learning-based randomized bin-picking allowing finger contact with neighboring objects. In: IEEE International Conference on Automation Science and Engineering; Fort Worth (TX), USA; 2016. p. 1196–1202.
  • Miller AT, Knoop S, Christensen HI, et al. Automatic grasp planning using shape primitives. In: IEEE International Conference on Robotics and Automation (ICRA); Taipei, Taiwan; 2003. p. 1824–1829.
  • Goldfeder C, Allen PK, Lackner C, et al. Grasp planning via decomposition trees. In: IEEE International Conference on Robotics and Automation (ICRA); Roma, Italy; 2007. p. 4679–4684.
  • Huebner K, Ruthotto S, Kragic D. Minimum volume bounding box decomposition for shape approximation in robot grasping. In: IEEE International Conference on Robotics and Automation (ICRA); Pasadena, CA, USA; 2008. p. 1628–1633.
  • Berenson D, Diankov R, Nishiwaki K, et al. Grasp planning in complex scenes. In: IEEE-RAS International Conference on Humanoid Robots (Humanoids); 2008. p. 42–48.
  • Nagata K, Miyasaka T, Nenchev DN, et al. Picking up an indicated object in a complex environment. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); Taipei, Taiwan; 2010. p. 2109–2116.
  • Rao D, Le QV, Phoka T, et al. Grasping novel objects with depth segmentation. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); Taipei, Taiwan; 2010. p. 2578–2585.
  • Dogar MR, Hsiaoy K, Ciocarliey M, et al. Physics-based grasp planning through clutter. In: Robotics: Science and Systems (RSS); 2012.
  • Shiraki Y, Nagata K, Yamanobe N, et al. Modeling of everyday objects for semantic grasp. In: IEEE International Symposium on Robot and Human Interactive Communication (Ro-Man); Edinburgh, UK; 2014. p. 750–755.
  • Pas AT, Platt R. Using geometry to detect grasp poses in 3D point clouds. In: The International Symposium on Robotics Research (ISRR); Sestri Levante, Italy; 2015.
  • Jonschkowski R, Eppner C, Hofer S, et al. Probabilistic multi-class segmentation for the Picking Challenge. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); Daejeon, Korea; 2016. p. 1–7.
  • Zeng A, Yu KT, Song S, et al. Multi-view self-supervised deep learning for 6D pose estimation in the Amazon Picking Challenge. In: IEEE International Conference on Robotics and Automation (ICRA); Preprint 2017. Available from arXiv:1609.09475 [cs.CV].
  • Schwarz M, Lenz C, Garcia GM, et al. Fast object learning and dual-arm coordination for cluttered stowing, picking, and packing. In: IEEE International Conference on Robotics and Automation (ICRA); Brisbane, Australia; 2018.
  • Zeng A, Song S, Yu KT, et al. Robotic pick-and-place of novel objects in clutter with multi-affordance grasping and cross-domain image matching. In: IEEE International Conference on Robotics and Automation (ICRA); Brisbane, Australia; Preprint 2018. Available from arXiv:1710.01330 [cs.RO].
  • Mitash C, Boularias A, Bekris KE. Improving 6D pose estimation of objects in clutter via physics-aware monte carlo tree search. In: IEEE International Conference on Robotics and Automation (ICRA); Brisbane, Australia; Preprint 2018. Available from arXiv:1710.08577 [cs.RO].
  • Lenz I, Lee H, Saxena A. Deep learning for detecting robotic grasps. In: International Conference on Learning Representations (ICLR); Preprint 2013. Available from arXiv:1301.3592 [cs.LG].
  • Mahler J, Liang J, Niyaz S, et al. Dex-Net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics. In: Robotics: Science and Systems (RSS); Preprint 2017. Available from arXiv:1703.09312 [cs.RO].
  • Yan X, Hsu J, Khansari M, et al. Learning 6-DOF grasping interaction via deep 3D geometry-aware representations. In: IEEE International Conference on Robotics and Automation (ICRA); Brisbane, Australia; Preprint 2018. Available from arXiv:1708.07303 [cs.RO].
  • Mahler J, Matl M, Liu X, et al. Dex-Net 3.0: Computing robust vacuum suction grasp targets in point clouds using a new analytic model and deep learning. In: IEEE International Conference on Robotics and Automation (ICRA); Brisbane, Australia; Preprint 2018. Available from arXiv:1709.06670 [cs.RO].
  • Hatori J, Kikuchi Y, Kobayashi S, et al. Interactively picking real-world objects with unconstrained spoken language instructions. In: IEEE International Conference on Robotics and Automation (ICRA); Brisbane, Australia; Preprint 2018. Available from arXiv:1710.06280 [cs.RO].
  • Fang K, Bai Y, Hinterstoisser S, et al. Multi-task domain adaptation for deep learning of instance grasping from simulation. In: IEEE International Conference on Robotics and Automation (ICRA); Brisbane, Australia; 2018. Available from arXiv:1710.06422 [cs.LG].
  • Hasegawa S, Wada K, Niitani Y, et al. A three-fingered hand with a suction gripping system for picking various objects in cluttered narrow space. In: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); Vancouver, BC, Canada; 2017. p. 1164–1171.
  • Kang L, Seo JT, Kim SH, et al. Design and implementation of a multi-function gripper for grasping general objects. Appl Sci. 2019;9(24):5266. https://doi.org/10.3390/app9245266.
  • Asaoka T, Nagata K, Nishi T, et al. Detection of object arrangement patterns using images for robot picking. ROBOMECH J. 2018;5(1):1–23.
  • Ester M, Kriegel HP, Sander J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise. In: International Conference on Knowledge Discovery and Data Mining (ICKDDM); 1996. p. 226–231.
  • Kohonen T. The self-organizing map. Proc IEEE. 1990;78(9):1464–1480.
  • Liu D, Sun DM, Qiu ZD. Bag-of-words vector quantization based face identification. In: Proceedings of International Symposium on Electronic Commerce and Security (ISECS) Vol. 2; 2009. p. 29–33.
  • Suykens JAK, Vandewalle J. Least squares support vector machine classifiers. Neural Process Lett. 1999;9(3):293–300.
  • Koenderink JJ, Doorn JA. Surface shape and curvature scales. Image Vis Comput. 1992;10:557–564.
  • Feldmar J, Ayache N. Rigid, affine and locally affine registration of free-form surfaces. Int J Comput Vis. 1996;18:99–119.