Texture Mapping using multiple quads with getPerspectiveTransform / warpPerspective

Hi everyone,

I’m looking for an efficient way to do ‘texture mapping’ in OpenCV. I’ve got about 60 quads, and for each one I call getPerspectiveTransform / warpPerspective on the full image. It works, but there must be a more efficient way, right? Any thoughts?

Best wishes,
Rick

import cv2
import numpy as np

warped = cv2.imread('data/vis_200.png')
unwarped = np.zeros_like(warped)

screen_quads = np.loadtxt("data/screen_quads.txt", dtype=np.float32).reshape(-1,4,2)
cam_quads = np.loadtxt("data/cam_quads.txt", dtype=np.float32).reshape(-1,4,2)

for screen_quad, cam_quad in zip(screen_quads, cam_quads):
    x0, y0 = np.min(screen_quad, axis=0).astype(int)
    x1, y1 = np.max(screen_quad, axis=0).astype(int)

    matrix = cv2.getPerspectiveTransform(cam_quad, screen_quad)
    dst_quad = cv2.warpPerspective(warped, matrix, (warped.shape[1], warped.shape[0]))
    
    unwarped[y0:y1, x0:x1] = dst_quad[y0:y1, x0:x1]

cv2.imshow("unwarped", unwarped)
cv2.imwrite("unwarped.png", unwarped)
cv2.waitKey(0)
cv2.destroyAllWindows()

assuming the relationship is constant, you can pre-calculate maps for cv::remap(). that’ll be fast then.
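
A rough sketch of what pre-calculating the maps could look like for the quad case. It assumes each screen quad is covered by its integer bounding box, as in the loop from the first post. The name `maps_from_homographies` and its parameters are made up for illustration and are not OpenCV API. Each 3x3 matrix has to map screen coordinates to camera coordinates, because remap() looks up a source position for every destination pixel.

```python
import numpy as np

def maps_from_homographies(homographies, boxes, width, height):
    """Build dense lookup maps for remap() from per-quad homographies.

    homographies: iterable of 3x3 arrays mapping screen (x, y, 1) to camera coords.
    boxes: iterable of (x0, y0, x1, y1) integer screen bounding boxes, one per quad.
    Pixels not covered by any box keep the value -1, i.e. outside the source image.
    (Hypothetical helper, not part of OpenCV.)
    """
    map_x = np.full((height, width), -1.0, dtype=np.float32)
    map_y = np.full((height, width), -1.0, dtype=np.float32)
    for H, (x0, y0, x1, y1) in zip(homographies, boxes):
        # homogeneous screen coordinates of every pixel inside the box
        xs, ys = np.meshgrid(np.arange(x0, x1), np.arange(y0, y1))
        pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1).astype(np.float64)
        # project to the camera image and divide by the homogeneous coordinate
        proj = pts @ np.asarray(H, dtype=np.float64).T
        map_x[y0:y1, x0:x1] = proj[..., 0] / proj[..., 2]
        map_y[y0:y1, x0:x1] = proj[..., 1] / proj[..., 2]
    return map_x, map_y
```

With the arrays from the first post, the matrices would come from `cv2.getPerspectiveTransform(screen_quad, cam_quad)` (note the argument order is swapped compared to the original loop), and the per-frame work would then be a single `cv2.remap(warped, map_x, map_y, cv2.INTER_LINEAR)`.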

but no, there’s no way to render textured triangles in 3D or 2D within OpenCV.

just curious, how do you gather that data ?
your image reminds me of thin plate splines

Thanks! The Thin Plate Splines Shape Transformer seems to work well for my purpose. It’s (probably) much faster than my previous approach.

import cv2
import numpy as np

img = cv2.imread('vis_200.png')

screen_points = np.loadtxt("screen_points.txt").reshape(1,-1,2)
cam_points = np.loadtxt("cam_points.txt").reshape(1,-1,2)

matches = [cv2.DMatch(i, i, 0) for i,_ in enumerate(screen_points[0])]
tps = cv2.createThinPlateSplineShapeTransformer()
tps.estimateTransformation(screen_points, cam_points, matches)
tps.applyTransformation(cam_points)
dst = tps.warpImage(img)

cv2.imshow("img", img)
cv2.imshow("dst", dst)
cv2.moveWindow("dst",640,0)
cv2.waitKey(0)

I’m afraid the warpImage of the TPS is too heavy for my 640x480 camera feed. I can do about 3 frames per second on quite a decent computer.

import cv2
import numpy as np

cam = cv2.VideoCapture(0)
cam.set(cv2.CAP_PROP_EXPOSURE,-6)

cam_points = np.loadtxt("data/cam_points.txt", dtype=int).reshape(1,-1,2)
screen_points = np.loadtxt("data/screen_points.txt", dtype=int).reshape(1,-1,2)
screen_points = (screen_points * (640/3200, 480/2400) + (-64,0)).astype(int)   # scale is (x, y); 3200 = 2400*640/480 to maintain aspect ratio, -64 to restore center

matches = [cv2.DMatch(i, i, 0) for i,_ in enumerate(screen_points[0])]
tps = cv2.createThinPlateSplineShapeTransformer()
tps.estimateTransformation(screen_points, cam_points, matches)
tps.applyTransformation(cam_points)

while True:
    ret, src = cam.read()
    if not ret:   # stop when the camera delivers no frame
        break
    dst = tps.warpImage(src)

    for cam_point, screen_point in zip(cam_points[0],screen_points[0]):
        cv2.circle(src, cam_point, 4, (0,0,255), thickness=-1)
        cv2.circle(dst, screen_point, 4, (0,255,0), thickness=-1)

    cv2.imshow("src",src)
    cv2.moveWindow("src",0,0)
    cv2.imshow("dst",dst)
    cv2.moveWindow("dst",640,0)

    key = cv2.waitKey(1)
    if key==27:
        break

cv2.destroyAllWindows()

So now I’m looking into cv::remap. I checked the tutorials, but I don’t yet see how to pre-calculate the maps for cv::remap() in my case… I will keep trying and reading. If you have any suggestions please let me know.

i’m afraid, you’re right.

calculating the mapping coords is expensive,
the actual remap() is cheap (nicely optimized), so if you can split the mapping into a one-off calculation, it should run fast.
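
To make that split concrete: the two maps simply store, for every destination pixel, the source coordinate to sample. Below is a NumPy-only sketch of that lookup with nearest-neighbour sampling. The helper name `remap_nearest` is made up for illustration; cv2.remap() does the same job with interpolation and optimized code.

```python
import numpy as np

def remap_nearest(src, map_x, map_y):
    """dst[y, x] = src[round(map_y[y, x]), round(map_x[y, x])], black if outside.

    Illustrative stand-in for cv2.remap(), nearest-neighbour only.
    """
    h, w = src.shape[:2]
    xi = np.rint(map_x).astype(int)
    yi = np.rint(map_y).astype(int)
    inside = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    dst = np.zeros(map_x.shape + src.shape[2:], dtype=src.dtype)
    dst[inside] = src[yi[inside], xi[inside]]
    return dst

# one-off calculation: maps that shift the image 2 pixels to the right
ys, xs = np.indices((4, 6), dtype=np.float32)
map_x, map_y = xs - 2, ys

# per frame: only the cheap lookup remains
frame = np.arange(24, dtype=np.uint8).reshape(4, 6)
shifted = remap_nearest(frame, map_x, map_y)
```

However expensive it is to fill `map_x` and `map_y`, that cost is paid once as long as the camera/screen relationship does not change.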

Hi berak,

When I look into the C++ source code I see that the mapping is recalculated every time the warpImage function is called: it calls _applyTransformation for every pixel. Do you know how I can calculate the mapping only once and then apply the remap on every frame? Is there a way to get the mapX and mapY arrays from the TPS in Python?

Rick

IDK if the thin spline thingy can do it on its own. it might. or not.

if not:

  • fill a source array with “meshgrid” like data.
  • then map that.
  • then use the result with remap()
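
The three steps above could be sketched roughly like this. `build_remap_maps` and its `transform` argument are hypothetical names; `transform` is any callable that takes a (1, N, 2) float32 array of destination points and returns the matching source points in the same layout. The `mirror` function is only a stand-in so the snippet runs without a camera or a TPS object.

```python
import numpy as np

def build_remap_maps(transform, width, height):
    # 1) meshgrid-like source array: one (x, y) entry per destination pixel
    xs, ys = np.meshgrid(np.arange(width, dtype=np.float32),
                         np.arange(height, dtype=np.float32))
    grid = np.stack([xs, ys], axis=-1).reshape(1, -1, 2)
    # 2) map all points in a single call
    mapped = np.asarray(transform(grid), dtype=np.float32).reshape(height, width, 2)
    # 3) split into the two float32 maps that remap() expects
    map_x = np.ascontiguousarray(mapped[..., 0])
    map_y = np.ascontiguousarray(mapped[..., 1])
    return map_x, map_y

# stand-in transform for demonstration only: mirror horizontally
def mirror(points, width=8):
    out = points.copy()
    out[..., 0] = (width - 1) - out[..., 0]
    return out

map_x, map_y = build_remap_maps(mirror, 8, 4)
```

For the TPS object, `transform` might be `lambda p: tps.applyTransformation(p)[1]`. That assumes applyTransformation accepts all points at once as a (1, N, 2) float32 array, which is worth checking on your OpenCV build before relying on it.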

let’s recap: the TPS class here was made in the context of 2d shape comparison (rotating contours, so any dissimilarity is from the point distribution, not from pose). it wasn’t made to transform video frames in realtime …

i’m still curious about your cam_points / screen_points, how do you get those ?
the 1st example looked like those already formed a ‘deformation / mesh grid’, and maybe you only need to (bilinear) interpolate that to get ‘dense’ xy mappings
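
A minimal sketch of that interpolation, under the assumption that the screen points lie on a regular, axis-aligned grid and the camera points are stored in the same row/column order. `dense_maps_from_grid` and its parameter names are made up for illustration.

```python
import numpy as np

def dense_maps_from_grid(grid_x, grid_y, cam_grid, width, height):
    """Bilinearly interpolate a sparse calibration grid to dense remap() maps.

    grid_x: (cols,) increasing screen x positions of the grid columns.
    grid_y: (rows,) increasing screen y positions of the grid rows.
    cam_grid: (rows, cols, 2) camera (x, y) measured at each grid node.
    Pixels outside the grid are clamped to the nearest grid edge.
    """
    cam_grid = np.asarray(cam_grid, dtype=np.float64)
    # fractional grid index of every pixel column and row
    fx = np.interp(np.arange(width), grid_x, np.arange(len(grid_x)))
    fy = np.interp(np.arange(height), grid_y, np.arange(len(grid_y)))
    ix = np.clip(np.floor(fx).astype(int), 0, len(grid_x) - 2)
    iy = np.clip(np.floor(fy).astype(int), 0, len(grid_y) - 2)
    tx = (fx - ix)[None, :, None]   # horizontal weight inside the cell
    ty = (fy - iy)[:, None, None]   # vertical weight inside the cell
    r, c = iy[:, None], ix[None, :]
    top = cam_grid[r, c] * (1 - tx) + cam_grid[r, c + 1] * tx
    bottom = cam_grid[r + 1, c] * (1 - tx) + cam_grid[r + 1, c + 1] * tx
    dense = top * (1 - ty) + bottom * ty
    return dense[..., 0].astype(np.float32), dense[..., 1].astype(np.float32)
```

The interpolation happens in screen space, where the cells are rectangles, so the quads on the camera side can have any shape. The result is not identical to a per-quad perspective warp, but for a reasonably fine grid the difference may be small.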

last, where does your distortion come from ?
(maybe there’s an elephant in this room, what it needs is undistortion maps from a chessboard camera calibration ?)

Hi Berak and crackwitz,

Thanks so much for thinking along. With your help I finally found the solution for my problem. I will show it below.

The Thin Plate Spline Shape Transformer turned out to be useful. I found out how to use applyTransformation to create the maps for remap. Now I only have to call cv2.remap in my frame loop, which is really fast compared to warpImage, which recomputes the mapping and applies it on every call.

To answer your questions about how I get my points: For my project Globe4D I am projecting with a fish-eye projector inside a 1 meter globe/dome. A fish-eye camera with Infrared pass-filter is also on the inside. It does not see visible light, therefore it can not be calibrated automatically using a chessboard pattern. With infrared LEDs the globe is illuminated from the inside. This makes hands on the globe reflect the IR-light which can be captured by the camera.
The fish-eye camera is out of center which leads to quite some distortion. To calibrate the touch I project a grid on the globe. By touching the dots on the grid (by hand or with an IR-flashlight) and knowing which dot is the active one I can map screen points to camera points. That is the input for the mapping function.

I think you’re right that I might just need bilinear interpolation. I’m used to working with the Processing environment and OpenGL, which is why at first I was talking about vertices and texture-coordinates / uv-mapping, all very linear and 2D. Here’s an example of my quads in Processing: Quads (github.com)
I wouldn’t know how to calculate the mapping arrays for the remap function myself using bilinear interpolation. I think the fact that the quads don’t have right angles (so they are not rectangles) makes it even more complex, at least for my level of expertise. I would love to see some code from someone :)

So here’s my final code that runs at full speed. Thanks again for your time!

import cv2
import numpy as np

w,h = 640,480
cam = cv2.VideoCapture(0)
cam.set(cv2.CAP_PROP_EXPOSURE,-6)
cv2.namedWindow("src")
cv2.namedWindow("dst")
cv2.moveWindow("dst",w,0)

# estimate TPS transformation
cam_points = np.loadtxt("data/cam_points.txt", dtype=int).reshape(1,-1,2)
screen_points = np.loadtxt("data/screen_points.txt", dtype=int).reshape(1,-1,2)
screen_points = (screen_points * (w/3200, h/2400) + (-64,0)).astype(int)   # scale is (x, y); 3200 = 2400*640/480 to maintain aspect ratio, -64 to restore center
matches = [cv2.DMatch(i, i, 0) for i in range(len(screen_points[0]))]
tps = cv2.createThinPlateSplineShapeTransformer()
tps.estimateTransformation(screen_points, cam_points, matches)

# apply transformation to remap map (this part can still be improved I think but for now it's fine since it only runs once)
map_x = np.zeros((h,w), dtype=np.float32)
map_y = np.zeros((h,w), dtype=np.float32)
for y in range(h):
    for x in range(w):
        p = np.array([x,y]).astype(np.float32).reshape(1,1,2)
        u,v = tps.applyTransformation(p)[1][0][0]
        map_x[y,x] = u
        map_y[y,x] = v

# draw loop that runs very fast since it only uses remap for transformation
while cv2.waitKey(1)!=27:
    ret, src = cam.read()
    if not ret:   # stop when the camera delivers no frame
        break
    dst = cv2.remap(src, map_x, map_y, cv2.INTER_LINEAR)
        
    for c,s in zip(cam_points[0], screen_points[0]):
        cv2.circle(src, c, 5, (0,0,255), -1)
        cv2.circle(dst, s, 5, (0,255,0), -1)

    cv2.imshow("src", src)
    cv2.imshow("dst", dst)   

cv2.destroyAllWindows()