2D-3D image registration - COMPLETE 6D ESTIMATION

88 views (last 30 days)
payam samadi on 18 Dec 2025 at 9:08
Commented: payam samadi on 22 Dec 2025 at 9:05
Hello everyone,
I am simulating a 2D–3D image registration pipeline for 6D motion estimation (3D translation + rotation) from two 2D X-ray projections, based on the following paper:
[1] Fu & Kuduvalli, “A fast, accurate, and automatic 2D–3D image registration for image-guided cranial radiosurgery,” Med. Phys., 2008.
The pipeline consists of:
  1. Generating a 3D CT volume with known fiducials
  2. Applying a known 6D rigid transformation to the CT
  3. Generating two 2D projections, A and B (a minimal sketch of steps 2–3 follows this list)
  4. Performing accurate 2D–2D registration between each X-ray and its corresponding DRR
  5. Estimating the final 6D transformation using geometric back-projection / analytical decomposition (as described in the paper)
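For context, here is a minimal sketch of what I mean by steps 2–3, under a parallel-beam assumption and a ZYX Euler convention; the toy volume, variable names, and conventions are illustrative only, and my actual code differs in the geometry details.
```
% Step 1 (toy version): a CT volume with a few bright fiducials
ct  = zeros(128, 128, 128, 'single');
fid = [40 60 70; 80 50 64; 64 90 58];     % fiducial voxel positions (row, col, slice)
for k = 1:size(fid, 1)
    ct(fid(k,1), fid(k,2), fid(k,3)) = 1;
end
ct = imgaussfilt3(ct, 1.5);               % give the fiducials some spatial extent

% Step 2: apply a known 6D rigid transform (assumed ZYX Euler order, voxel units)
ang = [2, -1, 3];                         % rotations about x, y, z in degrees
t   = [4, -3, 2];                         % translation in voxels
Rx  = [1 0 0; 0 cosd(ang(1)) -sind(ang(1)); 0 sind(ang(1)) cosd(ang(1))];
Ry  = [cosd(ang(2)) 0 sind(ang(2)); 0 1 0; -sind(ang(2)) 0 cosd(ang(2))];
Rz  = [cosd(ang(3)) -sind(ang(3)) 0; sind(ang(3)) cosd(ang(3)) 0; 0 0 1];
% rigidtform3d needs R2022b+; note the rotation here is about the world
% origin, not the imaging isocenter, which a real pipeline must account for.
tform   = rigidtform3d(Rz*Ry*Rx, t);
ctMoved = imwarp(ct, tform, 'OutputView', imref3d(size(ct)));

% Step 3: two parallel-beam projections along orthogonal axes
projA = squeeze(sum(ctMoved, 3));         % view A: integrate along the 3rd dimension
projB = squeeze(sum(ctMoved, 2));         % view B: integrate along the 2nd dimension
```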
Problem Description
  • The 2D image registrations are accurate and stable (x, y, in-plane rotation, and roll are recovered correctly for both projections).
  • However, the final 6D estimation shows significant error, especially in:
      • Out-of-plane translation
      • Out-of-plane rotations
Because the 2D registrations behave as expected, I suspect the issue is in the geometric back-projection / 2D-to-3D decomposition step, rather than in the similarity metric or optimization.
Specifically, I am unsure whether:
  • My implementation of the analytical back-projection equations is correct
  • The projection geometry (scaling, sign conventions, coordinate frames) is consistent between the two views
  • Additional geometric constraints or assumptions are required for observability
What I Am Looking For
I would greatly appreciate any insights or suggestions on:
  • Correct implementation of 2D-to-3D geometric back-projection for dual-view registration
  • Common pitfalls in coordinate systems, scaling factors, or sign conventions
  • Whether a cone-beam vs. parallel-beam assumption materially affects the 6D solution
  • Any recommended validation or sanity checks for this type of pipeline (one such check is sketched below)
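For the sanity-check point, the kind of consistency test I have in mind is sketched below, assuming two orthogonal parallel-beam views (view A projecting along z with image axes x-y, view B projecting along y with image axes x-z); the axis assignments, signs, and tolerance are assumptions that must be matched to the actual geometry.
```
function t3d = backprojectTranslation(dA, dB)
% Combine the in-plane translations recovered from two orthogonal
% parallel-beam views into a single 3D translation (illustrative geometry:
% view A projects along z, view B along y; adapt axes/signs to your setup).
%   dA = [du_A, dv_A]   in-plane shift from view A (maps to world x, y)
%   dB = [du_B, dv_B]   in-plane shift from view B (maps to world x, z)
tx = (dA(1) + dB(1)) / 2;     % the shared axis is seen by both views
ty = dA(2);                   % observable only in view A
tz = dB(2);                   % observable only in view B
if abs(dA(1) - dB(1)) > 0.5   % illustrative tolerance (same units as dA/dB)
    warning('Views disagree on the shared axis: check signs and axis conventions.');
end
t3d = [tx, ty, tz];
end
```
If this check already fails for pure translations (zero rotation), the problem is in the projection geometry or sign conventions rather than in the similarity metric or optimizer.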
I have attached the MATLAB code below for reference.
Thank you very much for your time and support.
Reference
[1] Fu, D., & Kuduvalli, G. (2008). A fast, accurate, and automatic 2D–3D image registration for image-guided cranial radiosurgery. Medical Physics, 35(5), 2180–2194.

Answers (1)

Matt J on 18 Dec 2025 at 15:46
Edited: Matt J on 19 Dec 2025 at 2:27
I haven't read the paper, but I don't see how the proposed method would ever work as you describe it. 6D motion simply does not propagate into 2D projections in a way that 2D-2D registration (step 4) can model, whether cone beam or parallel beam. Additionally, registering the DRRs to each projection independently ignores the coupled effects of 3D motion on the two projections.
Since you say the CT subject contains fiducials, it seems to me like the best approach would be to find the fiducials in the projections and then triangulate their 3D positions stereoscopically. (In fact, I vaguely wonder if that's the true purpose of the 2D-2D registration -- to help you search for the fiducial projections in 2D.)
To do the triangulation, the Computer Vision Toolbox has a command triangulate which may be helpful. Once you've triangulated, the problem reduces to a 3D-3D point registration, which you can do with a closed-form absolute orientation fit (e.g., a Kabsch/SVD solve).
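For example, here is a minimal closed-form Kabsch/SVD sketch of that 3D-3D step, assuming you already have N >= 3 non-collinear corresponding fiducial positions as N-by-3 arrays; the function and variable names are illustrative, and this is a generic fit, not the method from the paper.
```
function [R, t] = rigidFit3d(ptsRef, ptsMoved)
% Closed-form rigid fit (Kabsch/SVD): finds R, t such that
%   ptsMoved is approximately (R * ptsRef')' + t
% ptsRef, ptsMoved are N-by-3 arrays of corresponding 3D points.
cRef   = mean(ptsRef, 1);
cMoved = mean(ptsMoved, 1);
H = (ptsRef - cRef)' * (ptsMoved - cMoved);   % 3x3 cross-covariance
[U, ~, V] = svd(H);
D = diag([1, 1, sign(det(V*U'))]);            % guard against reflections
R = V * D * U';
t = cMoved - (R * cRef')';
end
```
The recovered R and t can then be compared directly against the known 6D transform that was applied to the CT.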
  10 comments
payam samadi about 20 hours ago
Dear Matt
According to the article, in the first phase the in-plane transformation parameters (x, y, theta) are initially estimated against the nominal DRR image; these three parameters are searched quickly via multiresolution matching, using the sum of squared differences (SSD) as the similarity measure.
Note that the desired pixel accuracy for the translations and half-degree accuracy for the in-plane rotation are achieved in this phase.
In the second phase, the roll is searched separately in one dimension, based on the approximate (x, y, theta) results obtained in the first phase. The optimized pattern intensity is used as the similarity measure to determine the reference DRR image corresponding to the roll angle.
Note that the search space in this phase is the full range of roll angles, sampled at 1° intervals for the initial estimate.
The third phase is an iterative process. The in-plane transformation and the roll are refined iteratively, using pattern intensity as the similarity measure for increased accuracy.
Note that the iteration has two steps: first, the previously estimated in-plane transformation is refined via steepest-descent minimization; then the roll is refined using one-dimensional interpolation. These two steps are repeated for a specified number of iterations. The final registration results give the in-plane transformation and the roll.
The calculatePatternIntensity function is used as the similarity measure.
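For reference, a common formulation of pattern intensity from the 2D-3D registration literature is sketched below; the radius r, sigma, and DRR scaling s used in calculatePatternIntensity may differ, so treat this only as a cross-check, not my actual implementation.
```
function P = patternIntensitySketch(Ixray, Idrr, s, r, sigma)
% Pattern intensity between an X-ray image and a DRR, computed on the
% difference image. Typical parameter choices in the literature are
% r = 3 pixels and sigma = 10; s scales the DRR to the X-ray intensities.
Idiff = double(Ixray) - s * double(Idrr);     % difference image
P = 0;
for dx = -r:r
    for dy = -r:r
        if (dx == 0 && dy == 0) || (dx^2 + dy^2 > r^2)
            continue;                         % only neighbours within radius r
        end
        % Compare each interior pixel with its (dy, dx) neighbour
        % (the r-pixel border is skipped for simplicity).
        inner = Idiff(1+r:end-r, 1+r:end-r);
        neigh = Idiff(1+r+dy:end-r+dy, 1+r+dx:end-r+dx);
        P = P + sum(sigma^2 ./ (sigma^2 + (inner - neigh).^2), 'all');
    end
end
end
```
A larger P means a better match, so the measure is maximized during the roll search and the iterative refinement.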
Search Method
Multiresolution matching is a fast search method designed to estimate the initial in-plane transformation parameters that will be refined in subsequent search phases.
First, three image resolutions are defined. The original x-ray images are used as the highest resolution images. The lower resolution images are generated by subsampling the original x-ray images by half in each dimension. Then the lowest resolution images are created by further subsampling by half in each dimension.
The basic idea of multiresolution matching is to match the images at each resolution level successively, starting with the lowest resolution. The results at the lower resolution serve as rough estimates of the in-plane transformation parameters x,y, theta. The output at the lower level is then passed on to the subsequent higher resolution level. The parameters x,y, theta are refined using the higher resolution images. In the final matching results, the accuracy of translations depends on the spatial resolution of the highest resolution images. The accuracy of in-plane rotation depends on the sampling intervals of the in-plane rotated DRR reference images.
Note that there is a risk associated with multiresolution matching: the estimate at the lower levels may converge to a local optimum far from the global optimum, in which case the subsequent matching at higher resolution levels will most likely not converge to the global optimum either. To mitigate this risk, multiple estimation candidates are used: a sorted list of the best-matching candidates (ranked by their SSD values) from the lower level is passed up to the higher resolution level. The more candidate displacements are used, the more reliable the estimates become.
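To make the search concrete, here is a minimal coarse-to-fine SSD sketch for a pure 2D translation; the in-plane rotation sampling and the multiple-candidate list described above are omitted for brevity, and all names are illustrative (my Ver23/Ver24 code differs in the details).
```
function shift = multiresSsdShift(Ifixed, Imoving, nLevels)
% Coarse-to-fine SSD search for a 2D translation over an image pyramid.
pyrF = {double(Ifixed)};  pyrM = {double(Imoving)};
for k = 2:nLevels
    pyrF{k} = imresize(pyrF{k-1}, 0.5);   % subsample by half at each level
    pyrM{k} = imresize(pyrM{k-1}, 0.5);
end
shift = [0, 0];                           % [rowShift, colShift]
for k = nLevels:-1:1                      % start at the lowest resolution
    base = 2 * shift;                     % scale the coarser estimate up to this level
    searchRad = 3;                        % local refinement window
    if k == nLevels, searchRad = 8; end   % wider search at the coarsest level
    best = inf;  shift = base;
    for dr = -searchRad:searchRad
        for dc = -searchRad:searchRad
            cand = base + [dr, dc];
            % circshift is used for brevity; a real implementation should
            % handle image borders explicitly instead of wrapping.
            ssd = sum((pyrF{k} - circshift(pyrM{k}, cand)).^2, 'all');
            if ssd < best
                best = ssd;  shift = cand;
            end
        end
    end
end
end
```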
Based on this information and the two code versions (Ver23 and Ver24), where do you think I made a mistake?
payam samadi about 18 hours ago
Dear Matt
I created another version (Ver26.m), and this time I tested the model's performance for in-plane motion only (x, y, theta), assuming no roll for simplicity (lines 559-581).
The test cases are as follows:
```
test_cases = [   % each row: [x, y, theta]
    -5,  0, -2;
     0,  3,  1;
     3, -2,  0;
    -2,  4,  1;
     4,  1, -1;
];
```
I believe the code is capable of estimating the true 2D-2D results (x, y, theta), but it struggles to accurately estimate the true roll. Do you agree with this assessment?
I apologize for posting so many pieces of code, but I need to complete this simulation, which is very important to me. I would greatly appreciate any help you can provide.

