Biomedical Paper

Tracking endoscopic instruments without a localizer: A shape-analysis-based approach

Pages 35-42 | Received 17 Apr 2006, Accepted 30 Jul 2006, Published online: 06 Jan 2010

Abstract

We present an approach to localizing endoscopic instruments with respect to the camera position, based purely on processing of the endoscope image. No localizers are needed; the only requirement is a colored strip at the distal part of the instrument shaft to facilitate image segmentation. The method exploits perspective image analysis applied to the cylindrical shape of the instrument shaft, allowing measurement of the instrument position and orientation with five degrees of freedom. We describe the method theoretically, and experimentally derive calibration curves for tuning the parameters of the algorithm. Results show that the method can be used for applications where accuracy is not critical, such as workspace measurement, gesture analysis, augmented-reality guidance, telementoring, etc. If this method is used in combination with an endoscope tracker or a robotic camera holder, full localization with respect to the patient reference frame can be achieved.

Introduction

Localization of objects in space is one of the key prerequisites for providing computer assistance in the operating room (OR) Citation[1]. By tracking the position and orientation of the surgical instruments and the patient in the surgical workspace, it is possible to measure distances, compute trajectories, analyze gestures, etc. Localizers are based on optical, electromagnetic, mechanical, or sonic technology Citation[2], each type having its own advantages and drawbacks. Hybrid systems that combine different approaches have also been developed Citation[3]. Optical localizers are the most commonly used in minimally invasive surgery (MIS) applications because of their good accuracy and reliability. Six different optical localizers have been extensively evaluated in references Citation[4–7]. However, the requirement for a line of sight between the localizer cameras and the tracked objects limits the set-up of the OR and hampers the movements of the surgeon, since, with the instrument tips being inserted into the patient's body, the tracking targets must be accommodated on the handles of the instruments. The positioning of localizers in the cluttered OR is itself a critical issue. Also, during laparoscopy the surgeon uses many different instruments; providing all of them with sensors for localization is troublesome and may slow down switching between instruments.

Mechanical localizers consist of a pointer that is rigidly linked to a 6-axis coding robot arm, the joints of which are sensorized by encoders Citation[8–10]. Mechanical localizers are highly accurate, but the need for rigid links between the reference system and the tracked objects can result in a very bulky set-up in the OR and cumbersome handling for the medical staff, especially if it is necessary to track more than one object. Electromagnetic localizers have traditionally been less suitable for image-guided surgery (IGS), since they require careful control of the environment in which they are used in order to avoid distortions in the electromagnetic field. Moreover, even nominally non-ferromagnetic materials such as stainless steel can still affect them Citation[11]. Even so, by careful selection of materials and methods it is possible to incorporate these systems into real products. Their accuracy, though acceptable for some IGS applications Citation[12–14], is typically inferior to that achievable with optical and mechanical systems. Other localization technologies, such as sonic-based systems, are poorly suited for use in the OR because of their sensitivity to perturbations and their “line-of-hearing” limitation, but their use has been widely investigated Citation[15–17].

In this paper we describe a novel method for localization of laparoscopic instruments with respect to the camera position, based purely on laparoscope image processing; no localizers are required. The method exploits perspective image analysis applied to the cylindrical shape of the instrument shaft, allowing measurement of the instrument position and orientation with five degrees of freedom (DOFs). The only requirement to enable tracking of conventional surgical instruments is a colored strip at the distal part of the instrument shaft to facilitate image segmentation. Another image-processing-based approach has recently been presented by Voros et al. Citation[18]: their method does not require a color marker and is quite rapid and robust to partial occlusion; however, it only allows detection of the instrument position (3 DOFs) and requires separate measurement of the trocar positions.

Methods

The proposed method operates in two steps: first, image analysis is performed to extract relevant geometrical features from the color-coded region on the endoscope image; then object coordinates are computed using these features.

Image analysis

Color segmentation

To perform shape analysis on the instrument shaft, it is necessary to distinguish the endoscopic instrument from the operative scene. For this purpose, we have implemented the HSV (hue-saturation-value) segmentation method described in reference Citation[19]. The distal part of the instrument shaft is color-coded with a marker -- a strip of azure adhesive tape that occupies a free region in the HSV space of the endoscope images. Figure 1(a) shows the color-coded instrument, and Figure 1(b) shows a typical color distribution in the H-S plane of laparoscopic images of a surgical procedure; the highlighted oval contains the distribution of the pixels corresponding to the azure strip. By selecting pixels with hue and saturation values in the highlighted area, the colored marker can be segmented automatically, since the colors in this area belong only to the surgical instrument and not to the anatomy.

Figure 1. (a) The color-coded instrument. (b) The averaged color distribution in the H-S plane for 3000 laparoscopic images, together with the H-S occupation of the colored band in the highlighted oval. (c) The segmented image with superimposed geometric features needed for localization. [Color version available online.]


However, since lighting conditions vary between different operations, laparoscopes and light sources, choosing a pre-determined region of the H-S plane to perform color segmentation, as in reference Citation[19], yields poor results. We therefore developed and implemented an algorithm to automatically determine the region of the H-S plane occupied by the colored strip, filtering out the background. This segmentation algorithm is schematically illustrated in Figure 2. The upper part of Figure 2 shows the pre-processing phase, which is necessary to determine the color region in the H-S plane occupied by the colored strip. Two images are required, one containing only the operative background, i.e., the anatomy that is being operated on, and one also containing the surgical instrument, placed in front of the same -- or a very similar -- background. The algorithm first computes the H-S plane histogram of the two images; this is done by first converting the endoscope images from RGB (red-green-blue) to HSV format, and then by counting, for every value of hue h and saturation s, the number of pixels in the HSV image with those h and s values. Simply stated, each pair of (h, s) values defines a certain color, and the histogram represents how many pixels of the endoscope image have that color. The histograms are then thresholded, in order to keep only the principal colors in the endoscope images. Computing the difference between the two thresholded histograms yields the color difference of the two images. Part of the background colors will differ, showing up as spot noise, i.e., isolated pixels, in the color-difference image; the main differences, however, will appear in the region corresponding to the colors of the instrument. To eliminate the spot noise, image erosion is performed until only one cluster is present in the image. In order to recover the full 9-connected shape of the eroded region, erosion is followed by seeded region growing, the seed being the centroid of the cluster. This region corresponds to the colors of the colored strip, and the coordinates of its bounding rectangle will be used to detect the instrument in all subsequent endoscope images.
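As an illustration, the following is a minimal sketch of this pre-processing phase in Python, using OpenCV and NumPy. The function names, bin counts and threshold are our own choices, and the spot-noise removal is simplified to keeping the largest connected cluster of the difference histogram instead of the erosion and seeded-region-growing steps described above.

```python
import cv2
import numpy as np

def hs_histogram(bgr_image, bins=(64, 64)):
    # 2D hue-saturation histogram (OpenCV convention: H in [0, 180), S in [0, 256)).
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    return cv2.calcHist([hsv], [0, 1], None, list(bins), [0, 180, 0, 256])

def strip_hs_bounding_box(background_bgr, instrument_bgr, rel_threshold=0.01):
    # Returns (h_min, h_max, s_min, s_max) of the H-S region occupied by the strip.
    h_bg = hs_histogram(background_bgr)
    h_in = hs_histogram(instrument_bgr)
    # Threshold each histogram to keep only the principal colors of the images.
    h_bg[h_bg < rel_threshold * h_bg.max()] = 0
    h_in[h_in < rel_threshold * h_in.max()] = 0
    # Colors present with the instrument in the scene but absent from the background.
    diff = np.clip(h_in - h_bg, 0, None)
    mask = (diff > 0).astype(np.uint8)
    # The paper removes spot noise by erosion followed by seeded region growing;
    # keeping the largest connected cluster is a simplified substitute.
    n_labels, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    if n_labels < 2:
        raise ValueError("no instrument colors found")
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    h_idx, s_idx = np.where(labels == largest)
    # Convert histogram bin indices back to hue and saturation values.
    h_scale, s_scale = 180.0 / mask.shape[0], 256.0 / mask.shape[1]
    return (h_idx.min() * h_scale, (h_idx.max() + 1) * h_scale,
            s_idx.min() * s_scale, (s_idx.max() + 1) * s_scale)
```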

Figure 2. Schematic representation of the image processing steps needed for color segmentation: the pre-processing phase is shown at the top, while the lower part shows the intra-operative segmentation process applied to two endoscope images. See text for a detailed description. [Color version available online.]


This pre-processing phase need be performed only once at the beginning of the operation. The lower part of Figure 2 shows the intra-operative color segmentation phase, which must be repeated on every endoscope image to detect the instrument. The H-S plane histogram of the current endoscope image is computed and then clipped to the previously defined bounding rectangle. In the HSV representation of the endoscope image, only pixels with values of h and s within this bounding rectangle are retained (hmin < h < hmax, smin < s < smax) and the rest are discarded. The RGB conversion of this pruned image contains only the colored strip.
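A corresponding sketch of the per-frame segmentation, assuming the (h, s) bounding rectangle produced by the hypothetical pre-processing helper above:

```python
import cv2
import numpy as np

def segment_strip(bgr_image, h_min, h_max, s_min, s_max):
    # Keep only pixels whose hue and saturation fall inside the pre-computed
    # bounding rectangle; brightness (V) is left unconstrained.
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    lower = np.array([h_min, s_min, 0], dtype=np.uint8)
    upper = np.array([min(h_max, 179), min(s_max, 255), 255], dtype=np.uint8)
    return cv2.inRange(hsv, lower, upper)   # binary mask of the color-coded strip
```

The resulting binary mask plays the role of the segmented region R used by the shape analysis described next.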

Shape analysis

With reference to Figure 1(c), we will now show how to compute the relevant geometric features needed for localization, starting from the segmented image. Point c is defined as the geometric centroid (center of gravity) of the segmented region R, highlighted in gray in Figure 1(c). Let [p ∈ R] be the characteristic function of the segmented region R, written using the so-called Iverson bracket:

[p ∈ R] = 1 if p ∈ R, and [p ∈ R] = 0 otherwise,

where J is the whole image and p ∈ J is an image pixel, which can be represented as a 2-vector of its coordinates (row, column). The centroid c can be computed as:

c = ( Σ_{p ∈ J} [p ∈ R] p ) / ( Σ_{p ∈ J} [p ∈ R] )
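In code, this reduces to averaging the coordinates of the mask pixels; a minimal sketch, assuming a binary mask such as the one returned by the segmentation sketch above:

```python
import numpy as np

def region_centroid(mask):
    # Mean pixel coordinate over the segmented region R: a direct transcription
    # of the centroid formula, with [p in R] realised as the binary mask.
    rows, cols = np.nonzero(mask)
    return np.array([rows.mean(), cols.mean()])   # c as (row, column)
```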

The line direction a, corresponding to the axis of the instrument shaft, is computed as the axis of minimum inertia of the segmented region R. The moment of inertia tensor of R is defined as:

I_ij = Σ_{p ∈ J} [p ∈ R] (p² δ_ij − p_i p_j),

where i, j ∈ {x, y}, J is the whole image, δ_ij is the Kronecker delta (δ_ij = 1 if i = j, and 0 otherwise), and p = ‖p‖. a is computed as the eigenvector corresponding to the smallest eigenvalue of the inertia tensor I_ij. b is the straight line through c and perpendicular to a: b ⊥ a, c ∈ b.
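A sketch of the axis computation follows; as an assumption on our part, the pixel coordinates are taken relative to the centroid, so that the computed axis passes through c:

```python
import numpy as np

def shaft_axis(mask):
    # Axis of minimum inertia of R: eigenvector of the 2x2 inertia tensor
    # associated with its smallest eigenvalue. Coordinates are taken relative
    # to the centroid (our assumption) so the axis passes through c.
    rows, cols = np.nonzero(mask)
    p = np.column_stack([rows, cols]).astype(float)
    p -= p.mean(axis=0)
    sq = (p ** 2).sum(axis=1)                      # ||p||^2 for every pixel
    ixy = -(p[:, 0] * p[:, 1]).sum()
    inertia = np.array([[(sq - p[:, 0] ** 2).sum(), ixy],
                        [ixy, (sq - p[:, 1] ** 2).sum()]])
    eigvals, eigvecs = np.linalg.eigh(inertia)     # eigenvalues in ascending order
    return eigvecs[:, 0]                           # direction a (row, column components)
```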

To find lines l1 and l2, a Hough transform is used. By applying edge detection to the segmented image, we isolate the contours of the segmented region R, obtaining R′, a 1-pixel-wide outline of the color-coded region. Collinear pixels transform to peaks in the Hough plane Citation[20]. In R′, the most collinear points are found in the two line segments that belong to lines l1 and l2. Therefore, the first two global maxima of the Hough transform of R′ correspond to the convergent lines l1 and l2 tangent to the instrument shaft (see Figure 1(c)).
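The sketch below uses OpenCV's Hough line transform as a stand-in for this step (the experiments reported later actually use the closely related Radon transform); the Canny and accumulator thresholds are illustrative values, not parameters from the paper:

```python
import cv2
import numpy as np

def tangent_lines(mask):
    # Outline R' of the segmented region, then the two strongest Hough peaks.
    edges = cv2.Canny(mask, 50, 150)
    lines = cv2.HoughLines(edges, 1, np.pi / 180, threshold=30)
    if lines is None or len(lines) < 2:
        raise ValueError("tangent lines not found")
    # OpenCV returns lines ordered by accumulator votes; a robust implementation
    # would also suppress near-duplicate peaks before taking the first two.
    (rho1, theta1), (rho2, theta2) = lines[0][0], lines[1][0]
    return (rho1, theta1), (rho2, theta2)   # each line: x*cos(t) + y*sin(t) = rho
```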

Localization

Neglecting the opening/closing movement of the end-effector, the kinematics of the endoscopic instrument is characterized by 4 DOFs: two rotation angles around the access point, the insertion depth, and the rotation of the instrument around its axis. However, in order to be able to univocally measure the position and orientation of the instrument with only four coordinates, the position of the trocar, which acts as a movement constraint, must be known. If the trocar position, which is subject to change over time as a consequence of movements of the patient's abdomen, is not being tracked during the operation, 6 coordinates are needed to measure the full instrument position and orientation. Referring to Figure 3(a), the measurement of X, Y, Z, θ, and φ univocally determines the position and orientation of the surgical instrument, except for the roll angle. However, because instruments have cylindrical symmetry, the roll angle can be neglected for several surgical assistance tasks. In the following, we will show how to measure X, Y, Z, θ, and φ from the previously extracted geometrical features.

Figure 3. (a) Angles defining local orientation of an object. (b) Pinhole camera model.


Measuring position: X, Y, Z

A simple geometric model for describing an endoscope camera is the pinhole camera model Citation[21], depicted in Figure 3(b): the image p(x, y) of a point P(X, Y, Z) in the 3D space is defined as the intersection of the image plane I with the ray from P through a center of projection C(0, 0, λ). This projection can be best described by means of perspective geometry involving homogeneous coordinates (for convenience, the axes of the reference frames are defined to be coincident):

k (xh, yh, zh, w)ᵀ = P (X, Y, Z, 1)ᵀ,  with  P = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, −1/λ, 1]],    (1)

where λ is the focal length of the camera, w is the fourth homogeneous coordinate, and k is an arbitrary non-zero constant; the image coordinates are recovered as x = xh/w and y = yh/w. Real cameras are not geometrically exact, since lenses introduce minor irregularities into images, typically radial distortions. Several camera calibration techniques are available for compensation of these distortions Citation[22], Citation[23].
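Under this model, with the center of projection at C(0, 0, λ) and the image plane at Z = 0, the projection reduces to a one-line function; a minimal sketch:

```python
import numpy as np

def project(P, lam):
    # Pinhole projection with center of projection C(0, 0, lam) and image plane Z = 0:
    # p = (lam*X/(lam - Z), lam*Y/(lam - Z)), i.e., the Cartesian form of Equation (2).
    X, Y, Z = P
    return np.array([lam * X / (lam - Z), lam * Y / (lam - Z)])
```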

For reconstructing the position of a point P(X, Y, Z) in 3D space, the coordinates p(x, y) of its projection on the image plane are not sufficient, since this only allows reconstruction of the optical ray through p and P. If the Z-coordinate is also known, P can be determined univocally. From Equation (1):

x = λX/(λ − Z),  y = λY/(λ − Z).    (2)

Perspective projection scales down objects with distance along the Z-axis. From Equation (2), the distance of the image points p1(x1, y1) and p2(x2, y2), corresponding to two points P1(X1, Y1, Z) and P2(X2, Y2, Z) lying on a plane parallel to the image plane, is reduced by the ratio

‖p1 − p2‖ / ‖P1 − P2‖ = λ/(λ − Z).

To determine the Z-coordinate of a point, we exploit the size reduction of the circular sections of the instrument shaft. Due to the perspective model, the shape of the color-coded region R changes with instrument orientation, making it difficult to estimate a circular section at the edges of the colored strip. We therefore preferred to estimate the diameter of the circular section through the centroid c. The diameter d (in pixels) of the circular section is estimated as the length of the line segment given by the intersection of b with the lines l1 and l2:

d = ‖(b ∩ l1) − (b ∩ l2)‖.

The relation Z(d) is determined experimentally in the Results section.
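The following sketch combines the hypothetical helpers defined earlier to estimate d and, from it, Z; the arrays d_calib and Z_calib stand for the experimentally measured calibration curve Z(d) and are not values from the paper:

```python
import numpy as np

def intersect(point, direction, hough_line):
    # Intersection of the line {point + t*direction} with a Hough line (rho, theta).
    rho, theta = hough_line
    n = np.array([np.cos(theta), np.sin(theta)])   # unit normal of the Hough line
    t = (rho - n @ point) / (n @ direction)
    return point + t * direction

def estimate_Z(mask, d_calib, Z_calib):
    # d_calib must be sorted in increasing order; Z_calib holds the matching distances.
    c_xy = region_centroid(mask)[::-1]             # (row, col) -> (x, y) for Hough lines
    a_xy = shaft_axis(mask)[::-1]
    b_dir = np.array([-a_xy[1], a_xy[0]])          # direction of b, perpendicular to a
    l1, l2 = tangent_lines(mask)
    p1 = intersect(c_xy, b_dir, l1)
    p2 = intersect(c_xy, b_dir, l2)
    d = np.linalg.norm(p1 - p2)                    # apparent shaft diameter in pixels
    return np.interp(d, d_calib, Z_calib)          # look up the calibration curve Z(d)
```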

Measuring orientation: θ and φ

θ is defined and can be measured as the angle formed by line a with the X-axis of the endoscope image. Perspective projection makes parallel lines converge. The shaft of an instrument, when placed at an angle with respect to the image plane, appears progressively more conical (see Figure 4(a) and 4(b)). By measuring this convergence, it is possible to compute φ, the rotation of the instrument around the Y-axis. Having previously computed l1 and l2, we define the convergence angle α as the angle between two unit vectors u1 and u2 lying on these two lines:

α = arccos(u1 · u2).

The angle φ, defining the rotation of the instrument around the Y-axis, can be computed by measuring α, since the relationship between α and φ can be derived from Equation (2). In the Results section, we will derive this relation experimentally.
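A sketch of both angle measurements, again reusing the hypothetical helpers defined above:

```python
import numpy as np

def measure_angles(mask):
    # theta: angle of the shaft axis a with the image X-axis (modulo 180 degrees,
    # since the axis has no preferred direction).
    a_xy = shaft_axis(mask)[::-1]                      # axis direction in (x, y)
    theta = np.degrees(np.arctan2(a_xy[1], a_xy[0])) % 180.0
    # alpha: angle between unit vectors along the tangent lines l1 and l2.
    (_, t1), (_, t2) = tangent_lines(mask)
    u1 = np.array([-np.sin(t1), np.cos(t1)])           # unit vector along l1
    u2 = np.array([-np.sin(t2), np.cos(t2)])           # unit vector along l2
    alpha = np.degrees(np.arccos(np.clip(abs(u1 @ u2), -1.0, 1.0)))
    return theta, alpha
```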

Figure 4. Line convergence due to perspective transform. (a) and (b) are images of the instrument placed at different angles with respect to the image plane. The highlighted lines in (a) and (b) correspond to the first two global maxima of the Hough transform of the images (c).


Experimental setup

For measuring the relation φ(α, d) experimentally, Storz 0° and 30° laparoscopes were used, connected to a Storz TRICAM 3-chip camera. A Y/C connection to a Matrox RT 3000 frame-grabber was used to capture the images at 640×480 pixels. This equipment is used routinely in the OR. Images were calibrated using the MATLAB Camera Calibration Toolbox, based on the work described in reference Citation[24], to compensate for distortions in the imaging system. After analyzing the HSV domain of images from recorded laparoscopic surgeries, we selected a color code well away from the usual HSV domain of these images. Azure adhesive tape (45 mm in width) in this color was attached cylindrically to the shaft of a laparoscopic instrument (10 mm in diameter), near the operative end but separated from the active part. Using sterile adhesive tape keeps the procedure simple, requiring no specific pre-operative preparation and causing little disturbance to the actual surgical setup.

Eight series of 24 images were acquired at various orientations and distances from the camera using a specially designed set-up, shown in Figure 5, to accurately measure the distances (accuracy ±1 mm) and angles (accuracy ±0.25°). φ was varied from 0° to 72° in 3° increments in the following set-ups:

  • 0° camera, at Z = {50, 70, 90, 100} mm and θ = 0°,

  • 30° camera, at θ = {22°, 45°, 67°, 90°} and Z = 75 mm,

where Z is the distance between the camera lens and the center of the color-coded region.

Figure 5. Experimental setup for measuring distances and angles.


The images obtained, after distortion correction, were binarized by passing them through a window filter for the coded color in the HSV space. A Sobel filter was used to detect edges in the segmented images. The Radon transform Citation[25], closely related to the Hough transform, was used to detect straight lines in the images.

Results

The measurements taken are represented graphically in Figures 6–8. From the relation α = α(φ, θ) (Figure 6) we deduce the independence of the measurements from θ, which is in line with the cylindrical symmetry of the optical system and the pinhole camera model. Also, the curves d = d(φ, θ) at fixed Z and varying θ display the same behavior, which is expected, since varying θ is equivalent to rotating the image around the optical axis.

Figure 6. Experimental relation φ = φ(α, θ).


Figure 7. Experimental relation φ = φ(α, Z).


Figure 8. Experimental relation d = d(φ, Z).


The relation α = α(φ, Z) (Figure 7) shows monotonic behavior over the full range and is well approximated by a second-order polynomial. At Z ≥ 70 mm, the curves overlap, allowing an estimation of φ from the measured α by inverting the plotted function, without the need to know Z. Laparoscopy is performed using two instruments for the intervention: to keep the color-coded strips of both instruments within the viewing angle of typical laparoscopic cameras, Z ≳ 100 mm is required.
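A sketch of how such a calibration curve could be fitted and inverted; phi_meas and alpha_meas stand for the measured calibration pairs and are not data from the paper:

```python
import numpy as np

def fit_alpha_of_phi(phi_meas, alpha_meas):
    # Second-order polynomial fit alpha(phi); returns coefficients, highest degree first.
    return np.polyfit(phi_meas, alpha_meas, deg=2)

def phi_from_alpha(coeffs, alpha, phi_range=(0.0, 72.0)):
    # Invert the fitted curve: solve alpha(phi) = alpha and keep the root that lies
    # in the calibrated range, where the relation is monotonic.
    a2, a1, a0 = coeffs
    roots = np.roots([a2, a1, a0 - alpha])
    real = roots[np.isreal(roots)].real
    valid = real[(real >= phi_range[0]) & (real <= phi_range[1])]
    if valid.size == 0:
        raise ValueError("alpha outside the calibrated range")
    return float(valid[0])
```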

The measurement of instrument size d = d(φ, Z) (Figure 8) is quite independent of φ, except at very close distances. It is therefore possible to estimate the Z-coordinate of the instrument independently of the instrument orientation given by θ and φ.

It is worth noting that the localization algorithm is robust with respect to over-saturated images: although in the bottom row of Figure 2 the white, over-saturated areas of the colored strip are not detected as part of the marker, the localization algorithm is still able to correctly compute the instrument position.

Conclusions

We have presented a method for localizing five degrees of freedom of an endoscopic instrument, based on shape analysis and requiring only a colored strip at the distal part of the instrument shaft. Optical localizers achieve higher accuracy, but the proposed method is simple, requires no specific pre-operative preparation, and does not disturb the actual surgical set-up. Position and orientation are measured in the reference frame of the endoscope camera. If the position of the camera is tracked, e.g., by means of a localizer or a robotic camera holder, full localization with respect to the world reference frame can be achieved. The method is particularly well suited to combination with a robotic camera holder, since the robot position is easily measured and no unwanted camera motion will occur. The method is suitable for applications where accuracy is not critical, such as workspace analysis, surgical navigation assistance tasks like proximity warnings Citation[26], and off-line analysis of video recordings for performance assessment Citation[27].

Acknowledgments

The authors are grateful to Prof. Andrea Pietrabissa (Division of General and Transplantation Surgery, University of Pisa, Italy) for supporting the research and for stimulating discussions. They also would like to thank Carlo Moretto and Andrea Peri for assistance during the experiments. This work was supported in part by the FIRB-2001 Project ApprEndo (No. RBNE013TYM) and by EndoCAS, the Center for Computer-Assisted Surgery (COFINLAB-2001 No. CLAB01PALK), both funded by MIUR, the Italian Ministry of Education, University and Research.

References

  • Troccaz J, Peshkin M, Davies BL. The use of localizers, robots and synergistic devices in CAS. Proceedings of the 1st Joint Conference on Computer Vision, Virtual Reality and Robotics in Medicine and Medical Robotics and Computer-Assisted Surgery (CVRMed-MRCAS '97), J Troccaz, E Grimson, R Mösges. Grenoble, France March 1997; 727–736, Lecture Notes in Computer Science 1205. Berlin: Springer; 1997
  • Roux C, Coatrieux JL. Contemporary Perspectives in Three-Dimensional Biomedical Imaging. Amsterdam, IOS Press 1997
  • Birkfellner W, Watzinger F, Wanschitz F, Enislidis G, Truppe M, Ewers R, Bergmann H (1998) Concepts and results in the development of a hybrid tracking system for computer aided surgery. Proceedings of the 1st International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI ’98), Cambridge, MA, October, 1998, WM Wells, A Colchester, S Delp. Springer, Berlin, 343–351, Lecture Notes in Computer Science 1496
  • Chassat F, Lavallée S (1998) Experimental protocol of accuracy evaluation of 6-D localizers for computer-integrated surgery: Application to four optical localizers. Proceedings of the 1st International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI ’98), Cambridge, MA, October, 1998, WM Wells, A Colchester, S Delp. Springer, Berlin, 277–284, Lecture Notes in Computer Science 1496
  • Li Q, Zamorano L, Jiang Z, Gong JX, Pandya A, Perez R, Diaz F. Effect of optical digitizer selection on the application accuracy of a surgical localization system -- a quantitative comparison between the OPTOTRAK and FlashPoint tracking systems. Comput Aided Surg 1999; 4(6):314–321
  • Khadem R, Yeh CC, Sadeghi-Tehrani M, Bax MR, Johnson JA, Welch JN, Wilkinson EP, Shahidi R. Comparative tracking error analysis of five different optical tracking systems. Comput Aided Surg 2000; 5(2):98–107
  • Schmerber S, Chassat F. Accuracy evaluation of a CAS system: Laboratory protocol and results with 6D localizers, and clinical experiences in otorhinolaryngology. Comput Aided Surg 2001; 6(1):1–13
  • Watanabe E, Watanabe T, Manaka S, Mayanagi Y, Takakura K. Three-dimensional digitizer (neuronavigator): New equipment for computed tomography-guided stereotaxic surgery. Surg Neurol 1987; 27: 543–547
  • Rohling R, Munger P, Hollerbach J, Peters T. Comparison of relative accuracy between a mechanical and an optical tracker for image-guided neurosurgery. J Image Guided Surg 1995; 1(1):30–34
  • Marmulla R, Hilbert M, Niederdellmann H. Immanent precision of mechanical infrared and laser guided navigation systems for CAS. Computer Assisted Radiology and Surgery. Proceedings of the 11th International Symposium and Exhibition (CAR '97), HU Lemke, MW Vannier, K Inamura. Berlin, June 1997; 863–865, Amsterdam: Elsevier 1997
  • Milne A, Chess D, Johnson J, King G. Accuracy of an electromagnetic tracking device: A study of the optimal range and metal interference. J Biomechanics 1996; 29(6):791–793
  • An K, Jacobsen M, Berglund L, Chao E. Application of a magnetic tracking device to kinesiologic studies. J Biomechanics 1988; 21: 613–620
  • Day J, Dumas G, Murdoch D. Evaluation of a long-range transmitter for use with a magnetic tracking device in motion analysis. J Biomechanics 1998; 31: 957–961
  • Meskers C, Fraterman H, van der Helm F, Rozing HVP. Calibration of the “Flock of Birds” electromagnetic tracking device and its application in shoulder motion studies. J Biomechanics 1999; 32: 629–633
  • Friets E, Strohbehn J, Hatch J, Roberts D. A frameless stereotaxic operating microscope for neurosurgery. IEEE Trans Biomed Eng 1989; 36: 608–617
  • Trobaugh J, Richard W, Smith K, Bucholz R. Frameless stereotactic ultrasonography: Method and applications. Computerized Medical Imaging and Graphics 1994; 18(5):235–246
  • Ryan M, Erickson R, Levin D, Pelizzari CA, Macdonald RL, Dohrmann GJ. Frameless stereotaxy with real-time tracking of patient head movement and retrospective patient-image registration. J Neurosurg 1996; 85: 287–292
  • Voros S, Orvain E, Long JA, Cinquin P. Automatic detection of instruments in laparoscopic images: A first step towards high level command of robotized endoscopic holders. In: Proceedings of the 1st IEEE/RAS-EMBS International Conference on Biomedical Robotics and Biomechatronics (BioRob). Pisa, Italy February 2006
  • Wei G, Arbter K, Hirzinger G. Real-time visual servoing for laparoscopic surgery. IEEE Engineering in Medicine and Biology 1997; 16(1):40–45
  • Pratt W. Digital Image Processing. Second edition. New York, John Wiley & Sons 1991
  • Fu K, Gonzales R, Lee C. Robotics. New York, McGraw-Hill 1987
  • Tsai R. A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE J Robotics Automation 1987; 3(4):323–344
  • Zhang Z. A flexible new technique for camera calibration. IEEE Trans Pattern Anal Machine Intell 2000; 22(11):1330–1334
  • Heikkila J, Silvén O. A four-step camera calibration procedure with implicit image correction. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '97). San Juan, Puerto Rico June 1997; 1106–1112
  • Toft P. The Radon transform -- theory and implementation. PhD thesis. Department of Mathematical Modelling, Technical University of Denmark, Lyngby, Denmark, 1996
  • D'Attanasio S, Tonet O, Megali G, Carrozza M, Dario P. A semi-automatic hand-held mechatronic endoscope with collision-avoidance capabilities. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). San Francisco, CA April, 2000; 2: 1586–1591
  • Megali G, Sinigaglia S, Tonet O, Dario P. Modelling and evaluation of surgical performance using hidden Markov models. IEEE Trans Biomed Eng 2006; 53(10):1911–1919
