Part A covers taking pictures, selecting corresponding points from the two images, computing a homography, warping im1 to im2, and blending the two images into a mosaic.
For part 1, I took 4 pairs of pictures, fixing the center of projection (COP) and rotating my camera while capturing photos. Then I selected corresponding points on both images.
The theory for recovering homographies is explained below:
The transformation is a homography: \( q = Hp \), with \( q \) and \( p \) in homogeneous coordinates:
\[
\begin{pmatrix} w q_1 \\ w q_2 \\ w \end{pmatrix}
=
\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{pmatrix}
\begin{pmatrix} p_1 \\ p_2 \\ 1 \end{pmatrix}
\]
This implies that \( w = g p_1 + h p_2 + 1 \). We can substitute to get the relations:
\[
q_1 = \frac{a p_1 + b p_2 + c}{g p_1 + h p_2 + 1}, \qquad
q_2 = \frac{d p_1 + e p_2 + f}{g p_1 + h p_2 + 1}
\]
Or equivalently:
\[
a p_1 + b p_2 + c - g p_1 q_1 - h p_2 q_1 = q_1, \qquad
d p_1 + e p_2 + f - g p_1 q_2 - h p_2 q_2 = q_2
\]
Finally, stacking this into matrix form, we get:
\[
\begin{pmatrix}
p_1 & p_2 & 1 & 0 & 0 & 0 & -p_1 q_1 & -p_2 q_1 \\
0 & 0 & 0 & p_1 & p_2 & 1 & -p_1 q_2 & -p_2 q_2
\end{pmatrix}
\begin{pmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{pmatrix}
=
\begin{pmatrix} q_1 \\ q_2 \end{pmatrix}
\]
with two such rows per correspondence; with more than four point pairs the system is solved by least squares.
I used H = computeH(im1_pts, im2_pts) to compute the homography matrix that warps im1 into im2.
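A minimal sketch of such a least-squares computeH, following the matrix setup above (this is not the exact project code; the point arrays are assumed to be (N, 2) in (x, y) order):

```python
import numpy as np

def computeH(im1_pts, im2_pts):
    """Least-squares homography mapping im1_pts to im2_pts (both (N, 2))."""
    A, b = [], []
    for (p1, p2), (q1, q2) in zip(im1_pts, im2_pts):
        # two rows of the stacked system per correspondence
        A.append([p1, p2, 1, 0, 0, 0, -p1 * q1, -p2 * q1])
        A.append([0, 0, 0, p1, p2, 1, -p1 * q2, -p2 * q2])
        b.extend([q1, q2])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)  # h_33 is fixed to 1
```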
To warp the images, I used the homography matrix H to map the canvas coordinates back to the original image space. First, I created a canvas with dimensions based on the bounding box, then I computed the corresponding points in the original image by applying the inverse of H. Using RegularGridInterpolator, I interpolated the pixel values for the red, green, and blue channels separately. I then combined these channels to produce the final warped image and created an alpha mask to mark valid pixels in the result.
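A sketch of this inverse-warping approach, assuming float images in [0, 1]; the bounding-box translation of the real canvas is omitted here for brevity:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def warpImage(im, H, out_shape):
    """Inverse-warp im with homography H onto an out_shape=(rows, cols) canvas."""
    rows, cols = out_shape
    ys, xs = np.mgrid[0:rows, 0:cols]
    canvas = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])  # homogeneous (x, y, 1)
    src = np.linalg.inv(H) @ canvas                # map canvas coords back to im's space
    src = src[:2] / src[2]                         # dehomogenize to (x, y)
    samples = np.stack([src[1], src[0]], axis=-1)  # interpolator expects (row, col)
    warped = np.zeros((rows, cols, 3))
    for c in range(3):                             # interpolate each channel separately
        interp = RegularGridInterpolator(
            (np.arange(im.shape[0]), np.arange(im.shape[1])),
            im[:, :, c], bounds_error=False, fill_value=0.0)
        warped[:, :, c] = interp(samples).reshape(rows, cols)
    # alpha marks canvas pixels that landed inside the original image
    alpha = ((src[0] >= 0) & (src[0] <= im.shape[1] - 1) &
             (src[1] >= 0) & (src[1] <= im.shape[0] - 1)).reshape(rows, cols)
    return warped, alpha
```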
For image rectification, I selected four corners of a rectangular object in the image (im1_pts) and defined the corresponding target points (im2_pts) as a perfect rectangle. Using the homography computed between these points (the same function I used above), I warped the image to correct the perspective, making the object appear rectangular. I tested this with three pairs of rectification images to ensure the homography and warping were working correctly.
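For illustration, rectification then reduces to one computeH call plus one warp; the corner coordinates below are hypothetical:

```python
import numpy as np

# Hypothetical clicked corners of a rectangular object, in (x, y) order
im1_pts = np.array([[120, 80], [430, 95], [445, 390], [110, 370]], dtype=float)
# Target: a perfect 300x200 rectangle
im2_pts = np.array([[0, 0], [300, 0], [300, 200], [0, 200]], dtype=float)

H = computeH(im1_pts, im2_pts)
rectified, _ = warpImage(im, H, out_shape=(200, 300))  # im: the source photo
```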
To blend the images into a mosaic, I warped the first image (im1) into the coordinate system of the second image (im2) using the homography matrix H, and computed the bounding box of the warped image to size a canvas that fits both images. I then created an alpha mask for each image: the warped image's mask marks its valid (mapped) pixels, and the second image's mask has reduced weights near the edges to avoid sharp seams. From these masks I computed feather weights by applying a distance transform, which assigns higher weights to pixels farther from the mask boundary, so each pixel's contribution falls off smoothly toward the image edges. Blending is then a per-pixel weighted average of the two images: an accumulated weight matrix (weight_mosaic) sums and normalizes the feathered weights from both images, ensuring smooth transitions in the overlapping areas. Finally, I displayed the feathered alpha masks and the resulting mosaic. (The source images are shown in Part 1.)
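A sketch of the feathered blend, under the assumption that both images have already been placed on the shared canvas with their alpha masks:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def feather_blend(im1_canvas, alpha1, im2_canvas, alpha2):
    """Distance-transform feathering: weights grow with distance from each
    mask's boundary, then a per-pixel weighted average blends the images."""
    w1 = distance_transform_edt(alpha1)
    w2 = distance_transform_edt(alpha2)
    weight_mosaic = w1 + w2                  # accumulated weights, for normalization
    weight_mosaic[weight_mosaic == 0] = 1.0  # avoid 0/0 outside both images
    return (im1_canvas * w1[..., None] +
            im2_canvas * w2[..., None]) / weight_mosaic[..., None]
```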
Part B is for automatically detecting features: using an algorithm to select corners (features), matching them between images, and then blending the two images into a mosaic.
For part 1, I used the provided get_harris_corners function to find corners in my original images, setting edge_discard=20.
For part 2, I used Adaptive Non-Maximal Suppression (ANMS) to reduce the number of corners. ANMS selects corners based on corner strength: to avoid keeping many corners within a small neighborhood, each corner \( \vec{x}_i \) is assigned a suppression radius \( r_i \), the distance to the nearest corner \( \vec{x}_j \) whose strength sufficiently exceeds that of \( \vec{x}_i \); only the corners with the largest radii are kept. The general rule is
\[
r_i = \min_{j} \| \vec{x}_i - \vec{x}_j \|, \quad \text{s.t.} \quad f(\vec{x}_i) < c_{\text{robust}} f(\vec{x}_j), \quad \vec{x}_j \in \mathcal{I}
\]
for all corners, where \(\mathcal{I}\) is the set of detected corners.
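A minimal O(N^2) sketch of ANMS following that rule (corners as an (N, 2) array with matching strengths; the n_keep and c_robust values are illustrative):

```python
import numpy as np

def anms(corners, strengths, n_keep=500, c_robust=0.9):
    """Keep the n_keep corners with the largest suppression radii r_i."""
    dists = np.linalg.norm(corners[:, None, :] - corners[None, :, :], axis=-1)
    # x_j can suppress x_i only when f(x_i) < c_robust * f(x_j)
    suppresses = strengths[:, None] < c_robust * strengths[None, :]
    dists[~suppresses] = np.inf              # ignore non-suppressing corners
    radii = dists.min(axis=1)                # r_i: distance to nearest suppressor
    keep = np.argsort(radii)[::-1][:n_keep]  # largest radii first
    return corners[keep]
```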
After that, I blurred the whole image [figure6_2], then sampled an axis-aligned 8x8 patch from the larger 40x40 window around each corner (subsampling by a factor of 5), so each feature descriptor comes from a nice big blurred region. I also bias/gain-normalized the descriptors.
Figure 6_3 shows the three-channel patches for the first corner after normalization.
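A sketch of this descriptor extraction (the blur sigma is an assumed value; corners are (row, col) and assumed far enough from the border):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def extract_descriptors(im, corners, window=40, patch=8):
    """Blurred, subsampled, bias/gain-normalized 8x8x3 patch descriptors."""
    blurred = gaussian_filter(im, sigma=(2, 2, 0))  # blur spatial dims only; sigma assumed
    half, step = window // 2, window // patch       # 40x40 window, every 5th pixel
    descs = []
    for r, c in corners:
        win = blurred[r - half:r + half, c - half:c + half]
        p = win[::step, ::step]                     # 8x8x3 subsampled patch
        p = (p - p.mean()) / (p.std() + 1e-8)       # bias/gain normalization
        descs.append(p.ravel())                     # flatten to 64*3 per corner
    return np.array(descs)
```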
To match the features between image01 and image02, I flattened each image's patch array to shape (num_corners, 64*3), then used dist2 to compute the distances between every image01 descriptor and every image02 descriptor. For each descriptor I selected the 1-NN and 2-NN by smallest distance, then applied Lowe's trick with threshold = 0.4 (keep a match only if the 1-NN distance is less than 0.4 times the 2-NN distance) to select the most convincing matches between the pair of images.
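A sketch of this matching step; dist2 from the starter code is replaced here by an explicit numpy squared-distance computation:

```python
import numpy as np

def match_features(desc1, desc2, thresh=0.4):
    """Nearest-neighbor matching with Lowe's ratio test (1-NN / 2-NN < thresh)."""
    # pairwise squared distances, shape (N1, N2)
    d = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(axis=-1)
    matches = []
    for i in range(d.shape[0]):
        nn1, nn2 = np.argsort(d[i])[:2]      # indices of 1-NN and 2-NN in image02
        if d[i, nn1] < thresh * d[i, nn2]:   # Lowe's trick on the distance ratio
            matches.append((i, nn1))
    return matches
```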
Since the matched pairs cannot be guaranteed to be fully accurate, in this part I used RANSAC to find the most accurate homography. Following the standard steps (in a loop: 1. select four feature pairs at random; 2. compute the exact homography H; 3. compute the inliers where dist(p_i', Hp_i) < ε; after the loop: 4. keep the largest set of inliers; 5. re-compute a least-squares H estimate on all of the inliers), I found the homography for each pair of images and then visualized the results.
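A sketch of that loop, reusing computeH from Part A (the n_iters and eps values are illustrative):

```python
import numpy as np

def ransac_homography(pts1, pts2, n_iters=1000, eps=2.0):
    """RANSAC around computeH; eps is the inlier threshold in pixels."""
    best_inliers = np.empty(0, dtype=int)
    for _ in range(n_iters):
        idx = np.random.choice(len(pts1), 4, replace=False)
        H = computeH(pts1[idx], pts2[idx])          # exact fit to 4 random pairs
        proj = H @ np.column_stack([pts1, np.ones(len(pts1))]).T
        proj = (proj[:2] / proj[2]).T               # dehomogenize projections
        inliers = np.flatnonzero(np.linalg.norm(proj - pts2, axis=1) < eps)
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # step 5: least-squares refit on all inliers of the best model
    return computeH(pts1[best_inliers], pts2[best_inliers])
```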
In this project, I learned how automatic feature detection works; I also learned how to read a paper and reproduce its algorithm.
For part 1 of Bells and Whistles (my own idea), I took pictures from the same point during the daytime and at night, then mosaicked the two images together.
For part 2 of Bells and Whistles (video mosaics), I made a mosaic video of a corridor in Wurster Hall. The first two videos are the original videos, and the 3rd video is the mosaic video.
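A minimal sketch of how such a video mosaic can be assembled frame by frame; the file names and the make_mosaic helper (which would wrap the Part B pipeline) are hypothetical:

```python
import cv2

cap1 = cv2.VideoCapture("corridor_left.mp4")   # placeholder file names
cap2 = cv2.VideoCapture("corridor_right.mp4")
writer = None
while True:
    ok1, f1 = cap1.read()
    ok2, f2 = cap2.read()
    if not (ok1 and ok2):
        break
    # make_mosaic: Harris + ANMS + matching + RANSAC + blend (assumed helper)
    mosaic = make_mosaic(f1, f2)
    if writer is None:
        h, w = mosaic.shape[:2]
        writer = cv2.VideoWriter("mosaic.mp4",
                                 cv2.VideoWriter_fourcc(*"mp4v"), 30, (w, h))
    writer.write(mosaic)  # expects uint8 BGR frames
writer.release()
```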