Fusion and visualization of intraoperative cortical images with preoperative models for epilepsy surgical planning and guidance

Pages 149-160 | Received 21 Jan 2011, Accepted 18 Apr 2011, Published online: 13 Jun 2011

Abstract

Objective: During epilepsy surgery it is important for the surgeon to correlate the preoperative cortical morphology (from preoperative images) with the intraoperative environment. Augmented Reality (AR) provides a solution for combining the real environment with virtual models. However, AR usually requires the use of specialized displays, and its effectiveness in the surgery still needs to be evaluated. The objective of this research was to develop an alternative approach to provide enhanced visualization by fusing a direct (photographic) view of the surgical field with the 3D patient model during image guided epilepsy surgery.

Materials and Methods: We correlated the preoperative plan with the intraoperative surgical scene, first by a manual landmark-based registration and then by an intensity-based perspective 3D-2D registration for camera pose estimation. The 2D photographic image was then texture-mapped onto the 3D preoperative model using the solved camera pose. In the proposed method, we employ direct volume rendering to obtain a perspective view of the brain image using GPU-accelerated ray-casting. The algorithm was validated by a phantom study and also in the clinical environment with a neuronavigation system.

Results: In the phantom experiment, the 3D Mean Registration Error (MRE) was 2.43 ± 0.32 mm with a success rate of 100%. In the clinical experiment, the 3D MRE was 5.15 ± 0.49 mm with 2D in-plane error of 3.30 ± 1.41 mm. A clinical application of our fusion method for enhanced and augmented visualization for integrated image and functional guidance during neurosurgery is also presented.

Conclusions: This paper presents an alternative approach to a sophisticated AR environment for assisting in epilepsy surgery, whereby a real intraoperative scene is mapped onto the surface model of the brain. In contrast to the AR approach, this method needs no specialized display equipment. Moreover, it requires minimal changes to existing systems and workflow, and is therefore well suited to the OR environment. In the phantom and in vivo clinical experiments, we demonstrate that the fusion method can achieve a level of accuracy sufficient for the requirements of epilepsy surgery.

Introduction

In typical image guided neurosurgery implementations, a preoperative surgical plan is registered to the patient space in the operating room so that the surgeon can be guided to the surgical targets. Nevertheless, the lack of correspondence between the intraoperative context and the preoperative plan on the display, e.g., a computer screen or head-mounted display (HMD), poses a challenge to the surgeon's ability to mentally correlate the two spaces. For example, for epilepsy surgery or tumor resection in the left temporal lobe, electro-cortical stimulation mapping (ESM) is often performed to locate the important language and/or memory areas so that they can be spared during the surgery. However, as the surgeon stimulates the cortical surface and marks the location of these critical functional areas, the intraoperative context cannot be easily updated to enable it to be displayed with the preoperative plan. This limits the surgeon's ability to quantitatively correlate the intraoperative information with the preoperative plan, such as that provided by functional magnetic resonance imaging (fMRI) analysis.

Fusion of the intraoperative scene with the preoperative plan is one approach to solve this problem, providing complementary information from both sources and creating an Augmented Reality (AR) environment that may assist the surgeon in performing procedures more effectively. Fusion may be accomplished with a direct view of the real scene by combining the virtual information with the direct optical view, e.g., using half-silvered mirrors Citation[1], or by electronically fusing the virtual model with a video image of the operating field Citation[2], Citation[3]. The direct optical view offers several advantages compared to the video fusion approach because it can provide better depth perception. Both approaches can be implemented with HMDs or externally mounted monitors. Generally, these methods require tracking of the user's viewpoint and viewing direction so that the virtual image can be rendered in the correct position and orientation within the AR scene.

An alternative to providing a fused image containing the real scene along with the virtual models is to project the 2D image of the real scene onto the virtual models. This technique can be considered an example of Augmented Virtuality (AV), and offers several advantages over AR: Not only is it conveniently implemented in the operating room, but it does not require special equipment such as HMDs or externally mounted displays, and also offers better depth perception than video AR systems. Previous examples from our laboratory of related techniques using an endoscope include those reported by Dey et al. Citation[4], who described a method for mapping the endoscopic image onto the 3D shapes extracted from preoperative images via an optically tracked neuro-endoscope, and Szpala et al. Citation[5], who presented a method for overlaying a real-time video image on a cardiac model, also via the use of an optically tracked endoscope. Fusion can also be achieved by directly registering an intraoperatively acquired photographic image with the preoperative model, as demonstrated by Clarkson et al. Citation[6], who performed an intensity-based 3D-2D registration of optical images from multiple views to a preoperative 3D surface model, and the work of Miga et al. Citation[7], in which surfaces captured with a laser range scanner with texture acquired with a digital camera were registered to a preoperative cortical model.

The present work built upon that of Dey et al. [4] by using a direct view instead of an endoscopic view for neurosurgical guidance and eliminating tracking of the camera. To achieve fusion, we employed an image registration-based approach. First, a landmark-initialized intensity-based perspective 3D-2D registration is employed to estimate the position and orientation of the patient's 3D cortical model with respect to the camera coordinate system using a single optical image. Next, to increase the robustness and efficiency of the registration algorithm, an improved multi-stage optimization method is employed. Finally, the photographic image is back-projected onto the 3D cortical model using texture-mapping with the estimated orientation and position parameters. To evaluate the proposed fusion algorithm, we first conducted phantom experiments in which fiducial landmarks were used to assess accuracy. Next, we evaluated the fusion method using a neuronavigation system and intraoperative data. Finally, we present the clinical application of the proposed fusion method for anatomical and functional image guided neurosurgery.

The remainder of the paper is organized as follows: The next section describes the methods proposed to achieve fusion. This is followed by a description of the evaluation experiments using phantom data and in vivo data with neuronavigation. The subsequent section explains the clinical application of the fusion method for image and function guided neurosurgery, and this is followed by the discussion and conclusions.

Materials and methods

System overview

Our system comprises a neuronavigation component which employs an optical tracking system (NDI Polaris, Northern Digital, Inc., Waterloo, Ontario, Canada) and provides the common utility for image guided neurosurgery. In addition, a consumer-grade digital camera (Nikon D60) is used to acquire the intraoperative photographic images. These intraoperative images are transferred to the navigation system via a USB cable, post-processed, and overlaid on the preoperative models and image volumes using the proposed fusion method. In this study, the navigation system runs on an Intel Pentium 4 PC with an NVidia 7800GTS graphics card.

Pinhole camera model

Although a consumer-grade digital camera is employed in this work, its optical characteristics can be modeled by a simple pinhole camera model Citation[8], which describes the mathematical relationship between the coordinates of a 3D point and its projection onto the camera image plane. Figure 1 illustrates the relationship between points in 3D space and the corresponding points on the 2D image plane under this model. O′ is the aperture (focal point) of the camera, and a camera-centered coordinate system is established with O′ as its origin. The focal length f is defined as the distance from the image plane to the aperture.

Figure 1. 2D projection of a 3D model.


In the pinhole camera model, a point N, represented by a homogeneous vector n = (x, y, z, 1)^T in the 3D scene, is projected to intersect the image plane at a point M, represented by a homogeneous vector m = (u, v, 1)^T in the 2D image. n and m are related by

a\,\mathbf{m} = \mathbf{T}\,\mathbf{n} = \mathbf{P}\,\mathbf{C}\,\mathbf{n}, \qquad (1)

where

\mathbf{P} = \begin{pmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}

is a 3 × 4 projective transformation matrix,

\mathbf{C} = \begin{pmatrix} \mathbf{R} & \mathbf{t} \\ \mathbf{0}^{T} & 1 \end{pmatrix}

is a rigid 4 × 4 transformation matrix, and a is a scaling factor. The projective transformation matrix P is characterized by the intrinsic camera parameters (in this model, just the focal length f) and can be set to a predefined value, since the digital camera used in this study allows direct focal length adjustment. The rigid transformation matrix C comprises a 3 × 3 rotation matrix R and a translation vector t = (t_x, t_y, t_z)^T that together describe the orientation and position of the camera with respect to the 3D scene. R is parameterized by θ = (θ_x, θ_y, θ_z)^T, the angles of rotation around the three Cartesian axes. θ and t are also called the extrinsic camera parameters.

Using this pinhole camera model, the recorded 2D image is a perspective projection of the 3D scene. However, fusing a 2D image with the corresponding 3D scene is not a trivial task. First, the relationship represented by the 3 × 4 projection transformation matrix T needs to be recovered. Furthermore, projecting the 2D points back to their corresponding 3D locations is an inverse mapping problem that has no unique solution, since a point in the 2D image can be related to an infinite number of points in 3D space by the projection transformation T. To solve the first problem, we employ a landmark-initialized intensity-based perspective 3D-2D registration algorithm to recover the camera pose. For the second problem, segmentation is employed to generate a surface representation of the model so that a unique one-to-one mapping can be obtained.
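To make the projection concrete, the following sketch (assuming NumPy; the point coordinates and pose values are illustrative placeholders, not data from this study) applies Equation 1 to a single homogeneous point:

```python
import numpy as np

def pinhole_project(n_h, f, R, t):
    """Project a homogeneous 3D point onto the image plane (Equation 1).

    n_h : (4,) homogeneous scene point (x, y, z, 1)
    f   : focal length of the pinhole model
    R   : (3, 3) rotation matrix (extrinsic parameters)
    t   : (3,) translation vector (extrinsic parameters)
    """
    # Rigid 4x4 transformation C from scene to camera coordinates
    C = np.eye(4)
    C[:3, :3] = R
    C[:3, 3] = t

    # 3x4 projective matrix P characterized only by the focal length f
    P = np.array([[f, 0.0, 0.0, 0.0],
                  [0.0, f, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])

    am = P @ C @ np.asarray(n_h, dtype=float)   # a * (u, v, 1)
    return am[:2] / am[2]                        # divide out the scale factor a

# Illustrative point 200 mm in front of an untransformed 50 mm camera
uv = pinhole_project([10.0, 5.0, 200.0, 1.0], f=50.0, R=np.eye(3), t=np.zeros(3))
```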

Landmark-based initial alignment

Unless the registration has global convergence characteristics, an initial alignment is required to limit the search space within the capture range of the registration algorithm in order to successfully find the global optimum. To calculate the initial transformation, a landmark-based rigid body registration is employed, where homologous points of anatomical features such as the sulci on the 3D MR brain image and on the 2D photographic image are selected manually. In this initial alignment step, the landmarks in the photographic image are used as the targets, while their homologs in the 3D image are used as the source. The resulting rigid-body transformation Tlandmark can be viewed as a rough approximation of the true rigid transformation matrix C, where the perspective projection P degenerates to an orthographic projection (i.e., P becomes the identity matrix).
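A minimal sketch of this initialization, assuming VTK's stock vtkLandmarkTransform as the least-squares rigid fitter and assuming the 2D photographic landmarks are lifted onto the image plane at z = 0, consistent with the orthographic approximation (helper name and calling convention are illustrative, not the paper's actual module):

```python
import vtk

def landmark_initial_alignment(mri_landmarks, photo_landmarks_2d):
    """Rigid-body transform T_landmark from manually picked homologous points.

    mri_landmarks      : list of (x, y, z) points picked on the 3D MR image (source)
    photo_landmarks_2d : list of (u, v) points picked on the photograph (target),
                         lifted to z = 0 under the orthographic approximation
    """
    src, tgt = vtk.vtkPoints(), vtk.vtkPoints()
    for p in mri_landmarks:
        src.InsertNextPoint(p)
    for u, v in photo_landmarks_2d:
        tgt.InsertNextPoint(u, v, 0.0)

    t = vtk.vtkLandmarkTransform()
    t.SetSourceLandmarks(src)
    t.SetTargetLandmarks(tgt)
    t.SetModeToRigidBody()      # rotation + translation only, no scaling
    t.Update()
    return t                    # t.GetMatrix() approximates the rigid matrix C
```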

Intensity-based registration method

The general principle of 3D-2D registration in many applications is as follows: a 2D synthetic projection image is first generated from the 3D volume using an estimate of the initial pose, and this projection image is compared with the physically acquired 2D projection image using a similarity measure. The process is repeated iteratively under a numerical optimization procedure until the best match is found.

2D projection image of brain via volume rendering. To generate a 2D projection image corresponding to the 3D scene, we first employ a segmentation process Citation[9], commonly known as “skull stripping”, to create a rough representation of the brain cortex for visualization purposes. Additionally, this model is used in the final stage as the underlying surface model onto which the photographic image is texture-mapped.

Next, we employ a volume rendering technique based on a GPU-accelerated ray-casting algorithm Citation[10], which uses the OpenGL graphics library Citation[11] for conventional graphics processing, such as setting up the viewing environment, and the OpenGL shading language Citation[12] for GPU programming. Compared to conventional volume rendering approaches that employ texture-mapping, this technique is fast and generates more photo-realistic, artifact-free images. In addition, it avoids the need for accurate reconstruction of the cortical surface, which is required by conventional surface rendering techniques. In the GPU-accelerated volume rendering approach, a fixed illumination source position is employed and specular shading effects are neglected.
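The paper's renderer is a custom GLSL implementation; as a rough illustration of the same pipeline, the stock VTK GPU ray-casting mapper can be configured along these lines (the reader class, file name, and view angle are placeholders, not the paper's actual code):

```python
import vtk

# The paper uses a custom GLSL ray-casting mapper; the stock VTK GPU mapper
# below illustrates the same rendering pipeline.
reader = vtk.vtkMetaImageReader()
reader.SetFileName("brain_stripped.mha")       # hypothetical skull-stripped volume

mapper = vtk.vtkGPUVolumeRayCastMapper()
mapper.SetInputConnection(reader.GetOutputPort())

prop = vtk.vtkVolumeProperty()
prop.SetInterpolationTypeToLinear()
prop.ShadeOn()            # fixed light position, as in the paper
prop.SetSpecular(0.0)     # specular shading effects are neglected

volume = vtk.vtkVolume()
volume.SetMapper(mapper)
volume.SetProperty(prop)

renderer = vtk.vtkRenderer()
renderer.AddVolume(volume)
renderer.GetActiveCamera().SetViewAngle(30.0)  # chosen to mimic the camera optics
```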

To achieve an accurate registration between the 2D projection image generated using volume rendering and the 2D intraoperative photographic image, the anatomical features of the brain, represented by the sulci and gyri of the cortical folding pattern, must be easily recognizable in the 2D projection image. This is achieved through the design of a transfer function, which in volume rendering assigns different optical properties (generally color and opacity) to each voxel depending on its intensity value. In this work, we employ simple manual manipulation of the transfer function: a user interface widget allows the user to adjust the transfer function interactively based on the volume-rendered appearance of the brain. Finally, to generate the corresponding 2D projection image, a viewing perspective is established that mimics the acquisition geometry of the photographic image.
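For illustration, a minimal transfer function of the kind adjusted interactively in the paper might be assembled with VTK as follows (the intensity breakpoints and color values are hypothetical and would be tuned per dataset via the interactive widget):

```python
import vtk

def make_cortex_transfer_functions(low, mid, high):
    """Grey/opacity transfer functions emphasizing the sulci and gyri.

    low, mid, high : placeholder scalar intensities for background,
                     grey matter, and bright cortical surface.
    """
    color = vtk.vtkColorTransferFunction()
    color.AddRGBPoint(low,  0.0, 0.0, 0.0)    # background stays dark
    color.AddRGBPoint(mid,  0.7, 0.6, 0.5)    # grey-matter tones
    color.AddRGBPoint(high, 1.0, 0.95, 0.9)   # bright cortical surface

    opacity = vtk.vtkPiecewiseFunction()
    opacity.AddPoint(low,  0.0)               # suppress air / CSF
    opacity.AddPoint(mid,  0.2)
    opacity.AddPoint(high, 0.9)               # make cortical folds stand out

    prop = vtk.vtkVolumeProperty()
    prop.SetColor(color)
    prop.SetScalarOpacity(opacity)
    return prop
```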

3D-2D image registration. The intensity-based 3D-2D registration employs the Normalized Mutual Information (NMI) similarity metric Citation[13]. NMI is a normalized form of the standard mutual information (MI) metric and uses information theory to quantify how well one image is explained by another. Theoretically, NMI is maximized when the images are aligned or when they are maximally dependent. NMI was selected as the similarity measure since it is more suitable for multimodality image registration than other metrics such as Normalized Cross Correlation (NCC).
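As a minimal sketch of the NMI measure (Studholme et al.'s overlap-invariant form, computed here from a joint histogram with an assumed bin count):

```python
import numpy as np

def normalized_mutual_information(img_a, img_b, bins=64):
    """NMI = (H(A) + H(B)) / H(A, B), estimated from a joint histogram.

    img_a, img_b : 2D arrays of identical shape (photograph and rendered view).
    """
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1)          # marginal distribution of image A
    py = pxy.sum(axis=0)          # marginal distribution of image B

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    return (entropy(px) + entropy(py)) / entropy(pxy.ravel())
```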

An optimization procedure,

\mathbf{T}^{*} = \arg\max_{\mathbf{T}} \; \mathrm{NMI}\left(I,\, J(\mathbf{T})\right), \qquad (2)

is employed to find the transformation T* that maximizes the NMI, where I is the target 2D image (the physically acquired photographic image) and J is the moving 2D image (the synthetically generated 2D projection image). The Downhill Simplex optimization technique Citation[14] is employed to search for the optimal pose parameters based on the values of the similarity cost function. These pose parameters consist of three translations, t = (t_x, t_y, t_z)^T, and three rotations, θ = (θ_x, θ_y, θ_z)^T, and are used to construct the rigid-body transformation matrix C. The Downhill Simplex algorithm is a derivative-free method that is fast and accurate when the initial solution is close to the optimal solution in the search space.

Since the local Downhill Simplex algorithm has a limited capture range, we employ a multi-stage optimization approach similar to the method presented by Jenkinson et al. Citation[15], Citation[16]. First, the transformation parameter space is partitioned into subspaces, each of which is searched independently. Next, a multi-scale strategy is used to further increase the robustness of the registration algorithm within each subspace: the 2D projection image and the photographic image are first blurred and then registered to each other, and these steps are repeated at progressively finer scales. The scales are defined by the full width at half-maximum (FWHM) of the Gaussian blurring kernel, which decreases from 8 mm to 2 mm. For the Downhill Simplex optimization, the characteristic scale length (also called the scaling factor) of the simplex is set to 10 mm for translation parameters and 10° for rotation parameters. Next, the results from each subspace are compared, and the most successful solutions are retained for further searching in the following stage. Finally, a re-optimization is employed to decrease the possibility of the registration being trapped in a local minimum. For the final search and re-optimization, the translation and rotation scaling factors are set to 3 mm and 3°, respectively. During registration, a mask is also applied so that only the exposed brain cortex is used for registration. This mask, or region of interest (ROI), is manually defined in the photographic image using a closed free-form cardinal spline.
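A compact sketch of this coarse-to-fine stage, assuming SciPy's Nelder-Mead implementation stands in for the Downhill Simplex optimizer and a render_fn callable stands in for the GPU ray-casting renderer; it reuses the NMI helper sketched above, and the simplex spread follows the 10 mm / 10° scaling factors quoted in the text (3 mm / 3° would be used for the final re-optimization):

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.optimize import minimize

def register_multiscale(photo, render_fn, pose0, fwhms_mm=(8.0, 4.0, 2.0),
                        mm_per_pixel=1.0, simplex_step=(10.0, 10.0)):
    """Coarse-to-fine NMI-driven pose search (one subspace of the full scheme).

    photo        : 2D array, the acquired photographic image
    render_fn    : callable(pose) -> 2D array, the volume-rendered projection
    pose0        : initial 6-vector (tx, ty, tz, theta_x, theta_y, theta_z)
    simplex_step : initial simplex spread in (mm, degrees)
    """
    pose = np.asarray(pose0, dtype=float)
    step = np.array([simplex_step[0]] * 3 + [simplex_step[1]] * 3)
    for fwhm in fwhms_mm:
        sigma = fwhm / (2.355 * mm_per_pixel)       # FWHM -> Gaussian sigma
        target = gaussian_filter(photo, sigma)

        def cost(p):
            moving = gaussian_filter(render_fn(p), sigma)
            return -normalized_mutual_information(target, moving)

        simplex = np.vstack([pose, pose + np.diag(step)])
        pose = minimize(cost, pose, method="Nelder-Mead",
                        options={"initial_simplex": simplex}).x
    return pose
```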

Photographic overlay on volumetric image

To overlay a 2D photographic image on the volume-rendered 3D brain MR volume, a surface model of the brain is generated as described previously. Next, a perspective projection is employed using Equation 1 to generate the texture coordinates for each vertex on the brain surface mesh. Finally, the photographic image is texture-mapped onto this mesh. The ROI is implemented as a mask image, which is used to generate the appropriate transparency (alpha) value of the surface mesh so that only the ROI portion of the photograph is displayed.
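A hedged sketch of the texture-coordinate generation, assuming a vtkPolyData brain mesh, a vtkImageData photograph, and a principal point at the image centre; the [0, 1] mapping and the helper name are illustrative rather than the paper's actual class:

```python
import numpy as np
import vtk
from vtk.util.numpy_support import numpy_to_vtk

def apply_photo_texture(brain_mesh, photo_image, f, R, t, img_w, img_h):
    """Generate one (u, v) texture coordinate per mesh vertex by perspective
    projection with the solved camera pose, then wrap the photo as a texture.

    brain_mesh  : vtkPolyData surface model of the brain
    photo_image : vtkImageData holding the photographic image
    """
    C = np.eye(4)
    C[:3, :3] = R
    C[:3, 3] = t
    P = np.array([[f, 0, 0, 0], [0, f, 0, 0], [0, 0, 1, 0]], dtype=float)

    n_pts = brain_mesh.GetNumberOfPoints()
    tcoords = np.empty((n_pts, 2))
    for i in range(n_pts):
        x, y, z = brain_mesh.GetPoint(i)
        am = P @ C @ np.array([x, y, z, 1.0])
        u, v = am[:2] / am[2]
        # map image-plane coordinates into [0, 1] texture space,
        # assuming the principal point lies at the image centre
        tcoords[i] = [(u + img_w / 2.0) / img_w, (v + img_h / 2.0) / img_h]

    brain_mesh.GetPointData().SetTCoords(numpy_to_vtk(tcoords, deep=True))

    texture = vtk.vtkTexture()
    texture.SetInputData(photo_image)   # assign to the actor showing brain_mesh
    return texture
```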

In our implementation, we ignore the barrel distortion common in consumer-grade cameras, since it is relatively small (0.3%, or approximately 5 pixels [0.3 mm] in a 3000 × 2000 image). For reference, the surgical annotation tags are approximately 30 pixels wide.

Implementation

Our epilepsy surgical planning and guidance system is based on the AtamaiViewer visualization and navigation platform Citation[17] developed in-house for image guided procedures. AtamaiViewer is a comprehensive, platform-independent, modular, extensible software platform, which has all the essential features needed for common visualization and navigation tasks. It was developed using the Python programming language with the underlying visualization functionality provided by the Visualization Toolkit (VTK) software package Citation[18], and its functionality can be easily extended by user-specific modules using a plug-in mechanism.

In our system, several modules were developed to facilitate the proposed method. Specific to the display environment, the GPU-based volume visualization was implemented using the OpenGL shading language as a generic “volume mapper” class, derived from the open-source VTK software package. The photographic overlay was developed as a module and the perspective texture mapping facility was developed as a class similar to the “vtkTextureMapToPlane” class of the VTK library. Segmentation of the MR brain images and registration of preoperative multimodal images were also implemented as modules within this system.

Experiments and results

Phantom experiment

To evaluate the accuracy of the fusion method, we first conducted a phantom experiment. A 3D CT volume of a standard brain phantom (Kilgore International, Inc., Coldwater, MI) was acquired with a GE HiSpeed CT scanner. A number of fiducial markers were attached to the left side of the cortical surface of the brain phantom. The brain phantom CT image matrix was 512 × 512 × 320, with a voxel size of 0.44 × 0.44 × 0.64 mm. The image was re-sampled to a 256 × 256 × 256 matrix to facilitate rapid volume rendering (Figure 2a). Next, a surface model of the phantom was created using the Marching Cubes algorithm Citation[19] implemented in VTK (Figure 2b). Four photographs of this phantom in different poses were then acquired using the digital camera (image matrix size 3000 × 2000) and transferred to the workstation via USB. The focal length was set to 50 mm during image acquisition. The captured images were then blurred and re-sampled to 375 × 250.
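A minimal VTK sketch of this preprocessing; the file name, iso-value, and smoothing step are assumptions for illustration rather than the parameters actually used:

```python
import vtk

# Down-sample the phantom CT to roughly 256^3 for fast ray-casting, then
# extract a surface model with Marching Cubes.
reader = vtk.vtkMetaImageReader()
reader.SetFileName("phantom_ct.mha")              # hypothetical file name

resample = vtk.vtkImageResample()
resample.SetInputConnection(reader.GetOutputPort())
for axis, factor in enumerate((0.5, 0.5, 0.8)):   # 512 x 512 x 320 -> 256^3
    resample.SetAxisMagnificationFactor(axis, factor)

mc = vtk.vtkMarchingCubes()
mc.SetInputConnection(resample.GetOutputPort())
mc.SetValue(0, 300)                               # placeholder iso-value

smoother = vtk.vtkWindowedSincPolyDataFilter()    # optional mesh smoothing
smoother.SetInputConnection(mc.GetOutputPort())
smoother.Update()
surface = smoother.GetOutput()                    # vtkPolyData phantom model
```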

Figure 2. (a) Volume-rendered phantom brain CT image and (b) surface-rendered brain model.


The phantom brain CT image was then imported into the AtamaiViewer system, and four landmarks on the volume-rendered 3D CT image, along with the corresponding landmarks in the photographic images, were manually identified as shown in Figure 3. Landmark registration was then performed and the result was used as the initial registration estimate. Next, we generated ten transformation matrices with random values chosen within the range of each parameter (Δtranslation = 10 mm and Δrotation = 5°) to simulate initial misalignment. We then performed the intensity-based 3D-2D registration, with the volume rendering displayed within a 375 × 250 pixel window to match the re-sampled photographic image. Finally, the fusion was achieved by texture-mapping the photographic image back onto the surface model.
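For illustration, the random initial misalignments could be drawn as follows (a sketch assuming the perturbations are sampled uniformly within the stated per-parameter ranges; the exact sampling scheme is not specified in the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten random perturbations of the landmark-based initial pose, drawn within
# +/-10 mm translation and +/-5 degrees rotation per parameter.
perturbations = np.column_stack([
    rng.uniform(-10.0, 10.0, size=(10, 3)),   # tx, ty, tz in mm
    rng.uniform(-5.0, 5.0, size=(10, 3)),     # theta_x, theta_y, theta_z in deg
])
```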

Figure 3. Phantom experiment. (a) Landmarks on the CT image. (b) Landmarks on the photograph.


To evaluate the performance of the fusion algorithm, the 3D Mean Registration Error (MRE) between the homologous fiducial points in the 3D phantom CT image and on the fused surface model was calculated as follows:

\mathrm{MRE} = \frac{1}{N} \sum_{i=1}^{N} \left\lVert \mathbf{p}_{i} - \mathbf{q}_{i} \right\rVert, \qquad (3)

where N is the total number of pairs of homologous fiducial landmarks, p_i is the 3D coordinate of the gold standard point i, and q_i is the 3D coordinate of the registered point i. The MRE is the mean of the target registration error (TRE) values calculated over the homologous landmarks. In this study, the 3D coordinates of the homologous fiducial landmarks in the CT volume were identified manually as their centroids. After overlay of the photographic image on the 3D surface model, the registered 3D coordinates were determined manually as the centroids of the fiducials on the fused surface model. The fusion results are listed in Table I for each image, where the success rate is defined as the percentage of cases in which the MRE is less than 3 mm. Figure 4a shows the fusion of part of the photographic image onto the CT volume with 50% transparency, while Figure 4b shows a magnified area of Figure 4a at the edge of the overlaid photograph. It can be seen that the gyral curves and the fiducial landmarks match well between the photographic image and the CT volume.
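A direct NumPy implementation of this error measure:

```python
import numpy as np

def mean_registration_error(gold_pts, registered_pts):
    """3D MRE: mean Euclidean distance between homologous fiducial centroids
    identified in the CT volume and on the fused surface model.

    gold_pts, registered_pts : (N, 3) arrays of corresponding 3D coordinates.
    """
    gold = np.asarray(gold_pts, dtype=float)
    reg = np.asarray(registered_pts, dtype=float)
    return np.mean(np.linalg.norm(gold - reg, axis=1))
```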

Figure 4. Result of phantom experiment. (a) Photographic overlay on the CT image. (b) Magnification of the region of interest in (a).


Table I.  Image fusion results in the phantom experiment.

Clinical experiment using neuronavigation system

To evaluate the fusion method in a more realistic clinical environment, we conducted an in vivo clinical experiment. One of the most important limitations of such a clinical evaluation is the lack of a well-defined gold standard. Therefore, in this study, a standard neuronavigation system (StealthStation Treon, Medtronic Navigation Technologies, Inc., Louisville, CO, USA) was employed to provide this reference. Nevertheless, the fidelity of the gold standard is compromised during surgery, since the accuracy of the navigation system influences the final validation results, and brain deformation after craniotomy and opening of the dura can be as much as 1 cm Citation[20], which could further degrade the gold standard measurement. For this reason, we also investigated the effect of brain shift on the final results.

In the study, a patient with mesial right temporal lobe epilepsy (TLE) underwent an anterior temporal lobectomy (ATL). Prior to surgery, a preoperative anatomical MR scan was acquired with the fiducial markers affixed to the skin of the patient's head. Next, the preoperative MR volume was imported into AtamaiViewer, which was employed as the neuronavigation system in parallel with the regular StealthStation system used for the clinical procedure. Fiducial landmarks on the patient's head were then registered to the landmarks identified on the volume-rendered head image to establish the “image-to-patient” registration.

After craniotomy and opening of the dura mater, four anatomical landmark points were identified by the operating surgeon and marked with paper tags on the cortical surface of the patient's brain (Figure 5). These landmarks were chosen mainly at the bifurcations of blood vessels.

Figure 5. Clinical evaluation: paper tags on the cortical surface.


The landmarks were then localized using the pointer tool tracked by the navigation system, and their corresponding 3D coordinates as reported by the navigation system were recorded. After 20 minutes, sufficient time for brain shift to occur, these four landmarks were again measured using the navigation system. Next, an intraoperative photographic image was acquired to capture the surgical field of view, as shown in Figure 5, and this image was then imported into the AtamaiViewer system. Initial alignment was performed through homologous landmark registration between the image and the 3D volume-rendered brain model. Figure 6 shows the selection of the four homologous landmarks in the 3D MR volume and the 2D image, as well as the definition of the ROI. Next, the 3D-2D intensity-based registration was performed to register and fuse the photographic image with the brain model.

Figure 6. Clinical evaluation: landmarks used to align the two images initially. (a) Landmarks defined on the volume-rendered image. (b) Landmarks defined in the 2D photographic image.


Figure 7 shows the fusion of the intraoperative photographic image with the brain model. The landmarks in the fused photographic image were then manually identified, and their coordinates were compared to those recorded by the navigation system following brain shift (Table II). Registration errors were reported in the three orthogonal directions as a 3D MRE as well as a 2D in-plane error, defined as the mean 2D distance of all four landmarks in the AP-SI plane. The brain shift measured using the four landmarks is listed in Table III. In Figure 7, the yellow spheres represent the landmarks reconstructed immediately after the opening of the dura mater, the red spheres show the landmarks reconstructed 20 minutes after the dura was opened, and the green spheres are the landmarks reconstructed from the photographic fusion.

Figure 7. Clinical evaluation. (a) Fusion of the photographic image onto the MR brain model and reconstructed landmarks in the clinical study. (b) Magnification of the region of interest in (a).


Table II.  Image fusion results in the clinical experiment.

Table III.  Brain shift measurements in the clinical experiment.

Clinical applications

The clinical motivation for the fusion method is to facilitate image guidance using both anatomical and functional image data during neurosurgery. For epilepsy or tumor resection surgery in the left temporal lobe, one of the clinical goals is to minimize the surgical risk of resecting eloquent areas, especially those related to language function, which are generally located in the posterior superior temporal gyrus. ESM is a clinical standard for identifying these regions. More recently, preoperative language fMRI has also been shown in some studies to yield a good prediction of the language areas Citation[21], and it is felt that combining these two techniques could provide better localization of language areas. In addition, this approach may potentially provide enhanced or augmented visualization of the surgical field to the surgeon from the two complementary imaging sources. Here we demonstrate the use of integrated image and functional guidance for epilepsy surgery.

Preoperative imaging

A patient with left TLE was studied in this experiment. Informed consent was obtained from the patient and the study was approved by the Office of Research Ethics of the University of Western Ontario. Prior to the procedure, the patient underwent standard anatomical and functional MR imaging. fMRI data were acquired using two language stimulation paradigms (verb generation and sentence completion) to elicit functional active regions in the brain. In addition, intracranial electroencephalography (EEG) monitoring was employed for seizure localization, wherein subdural electrodes were placed on the surface of the brain to monitor cortical electrical activity. To facilitate the correlation of the position of the electrodes with the neuroanatomical context, a CT scan was performed after the metal EEG electrodes had been placed on the cortex.

Multimodality image fusion and visualization

Skull stripping was applied to the MR image to extract the brain tissue from the skull, and the extracted brain mesh was saved as a cortical surface model for photographic overlay display. To correlate the positions of the electrodes with their neuroanatomical context, the electrodes were segmented from the CT images acquired after the electrode implantation step. The fMRI data were processed with SPM2 Citation[22] using a general linear model to generate functional maps representing the areas activated by the language tasks. Next, the MR and fMRI image data were first registered to each other, and then in turn registered to the CT image using AtamaiWarp Citation[23], Citation[24]. The registered CT volume, MR volume and fMRI activation maps were then displayed using volume rendering, in which different color transfer functions were applied to each modality and the final fused image was rendered using a composite blending technique.
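As a rough sketch of such a multi-volume display with per-modality color transfer functions, assuming each registered volume has been rescaled to [0, 1]; note that adding several volumes to one renderer with the stock VTK mapper only approximates true intermixing of overlapping volumes:

```python
import vtk

def add_overlay_volume(renderer, image_data, rgb, opacity_scale=0.6):
    """Add one modality (e.g., the fMRI activation map) as an extra volume
    with its own single-hue color transfer function.

    image_data : vtkImageData with scalars rescaled to [0, 1] (assumption)
    rgb        : (r, g, b) hue used for this modality
    """
    color = vtk.vtkColorTransferFunction()
    color.AddRGBPoint(0.0, 0.0, 0.0, 0.0)
    color.AddRGBPoint(1.0, *rgb)              # single-hue ramp for this modality

    opacity = vtk.vtkPiecewiseFunction()
    opacity.AddPoint(0.0, 0.0)
    opacity.AddPoint(1.0, opacity_scale)

    prop = vtk.vtkVolumeProperty()
    prop.SetColor(color)
    prop.SetScalarOpacity(opacity)

    mapper = vtk.vtkGPUVolumeRayCastMapper()
    mapper.SetInputData(image_data)
    mapper.SetBlendModeToComposite()          # composite blending

    volume = vtk.vtkVolume()
    volume.SetMapper(mapper)
    volume.SetProperty(prop)
    renderer.AddVolume(volume)
    return volume
```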

Intraoperative navigation and fusion

When the studies were performed in the operating room, both AtamaiViewer and clinical StealthStation platforms were employed in parallel. Each of these systems had its own optical tracking camera, but they shared a reference tool and tracked pointer.

Prior to surgery, a preoperative MR scan was acquired with the fiducial markers fixed to the patient's scalp and the preoperative image data were loaded into both navigation systems. Fiducial landmarks on the patient's head were then registered to the landmarks identified on the preoperative MR volume to establish the “image-to-patient” registration. After craniotomy and opening of the dura mater, intraoperative ESM was performed on the patient's left temporal and frontal lobes to elicit the critical language areas. Next, a photographic image capturing the stimulated cortical surface was acquired using the digital camera. This image was transferred to the AtamaiViewer system, and was then overlaid on the preoperative brain model using the proposed fusion method.

Figure 8 illustrates the fusion of the photographic image with the volume-rendered anatomical MR volume. Figure 8a shows the volume rendering of the anatomical MR volume fused with the functional activation map, and Figure 8b shows the intraoperative photographic image of the cortical surface with the language site tags. Figure 8c shows the fusion of the photographic image with the anatomical MR volume and functional activation map, with the green areas representing the regions onto which language activity was mapped. This presentation allows enhanced visualization of the language areas elicited by both the intraoperative ESM and the fMRI data during neurosurgery.

Figure 8. Clinical implementation. (a) Volume rendering of the anatomical MR volume and functional activation map. (b) Intraoperative photograph of left temporal lobe. (c) Overlay of the photograph on the volume rendering of the anatomical MR volume and functional activation map.


Discussion

In this paper, we have demonstrated the fusion of an intraoperative scene with a preoperative plan. The proposed fusion method provides a means of rapidly capturing the intraoperative environment and overlaying it on the preoperative model, facilitating the correlation of the two spaces. In contrast to methods proposed in several previous studies Citation[4–7], our method requires the acquisition of only a single untracked optical image. In addition, it is based on image intensities generated using a direct volume rendering technique, and therefore does not require accurate landmark localization, feature extraction, or surface reconstruction. The approach is thus both cost-effective and user-friendly.

The proposed landmark-initialized intensity-based 3D-2D perspective registration algorithm achieved good registration accuracy in the phantom experiment and acceptable accuracy in the clinical experiment. In the phantom experiment, our method achieved accuracy (2.43 ± 0.32 mm) similar to that (2.5 ± 0.7 mm) obtained using the method proposed by Miga et al. Citation[7]. However, in the clinical experiment the accuracy achieved using our method (5.15 ± 0.49 mm) was inferior to that of Miga et al. (3.5 ± 1.7 mm). This may be due to differences in the accuracy of the navigation systems and in the amount of brain shift occurring during surgery in the two cases. With respect to the registration procedure, landmark initialization brings the initial pose close to the true solution and substantially reduces the search space. Furthermore, the multi-stage optimization strategy adds to the robustness of the registration, and an empirically selected color transfer function was developed to optimize the feature contrast.

One limitation of the proposed fusion method is the relatively large registration error in the actual clinical situation, with a 2D in-plane error of around 3.3 mm and a 3D MRE of around 5.2 mm. This increase in error is probably the result of several factors. First, the quality of clinically acquired MR images does not match the high quality of the phantom brain CT images, which degrades the ability to render the cortical features distinctly and in turn makes the registration less accurate. Secondly, the cortical surface model of the brain phantom was generated with high accuracy, since the brain CT image was acquired without the skull; this is much more difficult for the MR images, as accurate reconstruction of the cortical surface from MR image data is still an open problem that merits further investigation in its own right.

Although the registration error is relatively large in the clinical situation, it is still acceptable for the actual clinical procedures. First, the 3D MRE is not necessarily an appropriate measure for the clinical procedure: when the surgeon examines the photographic overlay, he or she will most likely be viewing the brain in a direction perpendicular to the cortical surface, making the 2D in-plane registration error a more representative measure for the actual procedure. Also, for epilepsy and tumor resection surgery, the clinical standard is to leave a margin of ∼10 mm with respect to the eloquent areas, so a 3 mm error remains acceptable for these procedures.

Another concern in the clinical environment is brain shift and its effect on the proposed fusion system. The brain shift in the clinical study presented here was not particularly large (less than 5 mm), so the coordinates reported by the neuronavigation system were assumed to be correct in this case. As shown by the brain-shift measurements, the shift occurred mainly along the lateral direction with respect to the patient, in line with the direction of gravity. These results agree with previous studies of brain shift Citation[20] and demonstrate that there is less in-plane variability than variability in the perpendicular direction. In an actual clinical procedure, the preoperative model does not accurately reflect the intraoperative state. In this case, the fusion algorithm can potentially map the deformed intraoperative cortical image back onto the un-deformed preoperative model, in effect rigidly compensating for the brain shift. However, this correction is not ideal, as the preoperative plan itself is not updated through this process.

Lastly, the accuracy of the fusion also depends upon the extent of salient features present in the exposed cortical surface. In both the phantom and clinical experiments, the exposed areas were relatively large, which made the registration more accurate and robust, whereas for many minimally invasive procedures that involve a smaller exposure of the cortical surface the registration accuracy will decrease. In this situation, there are several ways to improve the accuracy of the procedure. One approach is to incorporate cortical veins as salient features in the registration process, while another is to use some fiducial feature points that are visible to both the navigation system and the digital camera as registration landmarks. For example, the fiducial points on the reference tool for the navigation system could be used for this purpose.

Conclusions

In this paper, we presented a landmark-initialized intensity-based 3D-2D registration method to estimate the camera pose with respect to the 3D patient model and a fusion method to overlay the optical image on the preoperative volume. The result of the photographic overlay using the phantom image demonstrates good correspondence between the preoperative volumes and the intraoperative optical image. A preliminary clinical study showed that fusion could be achieved with 3D MRE of approximately 5 mm and a 2D in-plane registration error of approximately 3 mm. Finally, the proposed fusion method was employed in a clinical case to demonstrate its utility in providing enhanced and augmented visualization for integrated anatomical and functional guidance.

Acknowledgments

The authors thank Frank Bihari and Dr. Aaron So for providing the phantom and patient images, and Dr. David G. Gobbi for helpful discussions.

Declaration of interest: This work is supported by the Canada Foundation for Innovation (CFI) and the Canadian Institutes of Health Research (CIHR). Dr. Wang acknowledges the scholarship support from the University of Western Ontario, the Canadian Institutes for Health Research (Grant MOP 184807), Precarn Inc., and the Ontario Graduate Scholarship in Science and Technology.

References

  • Blackwell M, Nikou C, DiGioia AM, Kanade T. An image overlay system for medical data visualization. Med Image Anal 2000; 4: 67–72
  • Birkfellner W, Figl M, Matula C, Hummel J, Hanel R, Imhof H, Wanschitz F, Wagner A, Watzinger F, Bergmann H. Computer-enhanced stereoscopic vision in a head-mounted operating binocular. Phys Med Biol 2003; 48: N49–N57
  • Vogt S, Khamene A, Niemann H, Sauer F. An AR system with intuitive user interface for manipulation and visualization of 3D medical data. Stud Health Technol Inform 2004; 98: 397–403
  • Dey D, Gobbi DG, Slomka PJ, Surry KJM, Peters TM. Automatic fusion of freehand endoscopic brain images to three-dimensional surfaces: Creating stereoscopic panoramas. IEEE Trans Med Imag 2002; 21(1):23–30
  • Szpala S, Wierzbicki M, Guiraudon G, Peters TM. Real-time fusion of endoscopic views with dynamic 3-D cardiac images: A phantom study. IEEE Trans Med Imag 2005; 24: 1207–1215
  • Clarkson MJ, Rueckert D, Hill DLG, Hawkes DJ. Using photo-consistency to register 2D optical images of the human face to a 3D surface model. IEEE Trans Pattern Anal Machine Intell 2001; 23: 1266–1280
  • Miga MI, Sinha TK, Cash DM, Galloway RL, Weil RJ. Cortical surface registration for image-guided neurosurgery using laser-range scanning. IEEE Trans Med Imag 2003; 22(8):973–985
  • Shapiro LG, Stockman GC. Computer Vision. Prentice Hall, Upper Saddle River, NJ 2001
  • Smith SM. Fast robust automated brain extraction. Human Brain Mapping 2002; 17: 143–155
  • Wang A, Mirsattari S, Parrent A, Peters TM. Fusion of intraoperative cortical images with preoperative models for neurosurgical planning and guidance. In: Miga MI, Wong KH, editors. Proceedings of SPIE Medical Imaging 2009: Visualization, Image-Guided Procedures, and Modeling, Orlando, FL, February 2009. Proc SPIE 2009;7261: 72612O.
  • Woo M, Neider J, Davis T. OpenGL Programming Guide, 2nd edn. Addison-Wesley, Reading, MA 1996
  • Kessenich J, Baldwin D, Rost R. The OpenGL Shading Language. The Khronos Group, Inc.; 2008.
  • Studholme C, Hill DLG, Hawkes DJ. An overlap invariant entropy measure of 3D medical image alignment. Pattern Recognition 1999; 32: 71–86
  • Nelder J, Mead R. A simplex method for function minimization. Comput J 1965; 7: 308–313
  • Jenkinson M, Smith SM. A global optimisation method for robust affine registration of brain images. Med Image Anal 2001; 5: 143–156
  • Jenkinson M, Bannister PR, Brady JM, Smith SM. Improved optimisation for the robust and accurate linear registration and motion correction of brain images. NeuroImage 2002; 17: 825–841
  • Moore J, Guiraudon G, Jones D, Hill N, Wiles A, Bainbridge D, Wedlake C, Peters T. 2D ultrasound augmented by virtual tools for guidance of interventional procedures. Stud Health Technol Inform 2007; 125: 322–327
  • Schroeder W, Martin KW, Lorensen W. The Visualization Toolkit: An Object-Oriented Approach to 3D Graphics. Prentice Hall, New York 2000
  • Lorensen WE, Cline HE. Marching Cubes: A high resolution 3D surface construction algorithm. Computer Graphics 1987; 21: 163–169
  • Hill DL, Maurer CR, Jr, Maciunas RJ. Measurement of intraoperative brain surface deformation under a craniotomy. Neurosurgery 1998; 43: 514–526
  • Carpentier AC, Pugh KR, Westerveld M, Studholme C, Skinjar O, Thompson JL, Spencer DD, Constable RT. Functional MRI of language processing: Dependence on input modality and temporal lobe epilepsy. Epilepsia 2001; 42: 1241–1254
  • Friston KJ, Ashburner JT, Kiebel S, Nichols TE, Penny WD, editors. Statistical Parametric Mapping: The Analysis of Functional Brain Images. Academic Press; 2007.
  • Finnis KW, Starreveld YP, Parrent AG, Sadikot AF, Peters TM. Three-dimensional database of subcortical electrophysiology for image-guided stereotactic functional neurosurgery. IEEE Trans Med Imag 2003; 22: 93–104
  • Guo T, Finnis KW, Parrent AG, Peters TM. Visualization and navigation system development and application for stereotactic deep-brain neurosurgeries. Comput Aided Surg 2006; 11: 231–239
