Homography with only 2 ArUco markers as a reference

Hello everyone,

I’m just a newbie in computer vision, but I’m trying to improve myself :slight_smile:
I’m currently facing a problem and I don’t know how to deal with it.
I have a target picture that I want to project (homography) onto a paper board containing ArUco markers, just like what can be seen on this site: https://learnopencv.com/augmented-reality-using-aruco-markers-in-opencv-c-python
However, I have a strong constraint that only allows me to have 2 markers instead of 4.
One of the markers is at the top-left corner, and the other one is somewhere between the 2 right corners of the ABCD frame I want to project onto (see picture below).


What kind of operation should I use to get my 4 frame points in the camera preview from the markers’ reference points, given that I of course know the distances between the markers and the points A, B, C, D on my sheet of paper?
Hope my question is clear … :crazy_face:
Thanks in advance

The homography function needs at least 4 point correspondences.
The linked tutorial uses the first corner of every marker to get the 4 points. However, if you take all 4 corners of each marker, you obtain a total of 8 corners from the 2 markers. That should be enough to compute the homography. You just have to know the exact positions of the marker corners w.r.t. the ABCD points.
The downside of this approach is that the corners of a single marker are quite close together, so corner detection errors will have a larger effect on the result.
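Something like this (an untested sketch; the image filename is just a placeholder) would collect the 8 image points from the 2 detected markers:

import cv2
import numpy as np

frame = cv2.imread("board.jpg")   # placeholder: any image showing both markers

dictionary = cv2.aruco.Dictionary_get(cv2.aruco.DICT_4X4_250)
parameters = cv2.aruco.DetectorParameters_create()
markerCorners, markerIds, rejected = cv2.aruco.detectMarkers(frame, dictionary, parameters=parameters)

# each element of markerCorners has shape (1, 4, 2): the 4 corners of one marker, in pixels
imagePoints = np.concatenate(markerCorners).reshape(-1, 2)   # (8, 2) when both markers are found
# make sure the detection order of the markers matches the order of your known paper coordinates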

Thanks kbarni for your answer!
You’re right, I hadn’t considered using all 8 points of my 2 markers to get a better result.
However, my problem is more about how to get A, B, C, D (the destination points for my homography) from my 8 points when I move the sheet of paper in front of the camera.
I guess there should be a function that allows me to compute A, B, C, D by taking into account the axes of the detected markers.
Of course I know the distances between the markers and the A, B, C, D points on my sheet of paper, but I need the A, B, C, D coordinates w.r.t. the markers in the camera picture.
Hope you understand what I mean …

The function you are looking for is findHomography().

By detecting the ArUco markers and extracting the positions of their 8 corners in the image, you get imagePoints. Knowing the actual positions of the corners on the plane (world coordinates), you get objectPointsPlanar.

Mat H = findHomography(objectPointsPlanar, imagePoints);

Using the output matrix H, which contains the homography transform, in combination with the known positions of the ABCD corners (world coordinates) objectPointsABCD, you get the positions of the ABCD points in the image.

perspectiveTransform(objectPointsABCD, imagePointsABCD, H);

Also remember that this solution only applies to planar objects (Z == 0). If you want to place the ArUco markers or the ABCD corners at different Z values, you would need to use the solvePnP() function in combination with projectPoints(). In that case you will also need to calibrate the camera.
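In Python, a minimal sketch of the same idea could look like this (all coordinates here are just placeholders for illustration; the image points would normally come from detectMarkers):

import cv2
import numpy as np

# placeholder marker corner positions on the paper, in mm (Z == 0)
objectPointsPlanar = np.float32([[0, 0], [30, 0], [30, 30], [0, 30],
                                 [125, 40], [155, 40], [155, 70], [125, 70]]).reshape(-1, 1, 2)

# placeholder pixel positions of the same 8 corners as detected in the camera image
imagePoints = np.float32([[102, 98], [160, 95], [163, 152], [105, 155],
                          [340, 170], [398, 168], [400, 226], [342, 228]]).reshape(-1, 1, 2)

H, status = cv2.findHomography(objectPointsPlanar, imagePoints)

# ABCD corners on the paper, in the same units as objectPointsPlanar
objectPointsABCD = np.float32([[-2, 34], [158, 42], [158, 84], [-2, 92]]).reshape(-1, 1, 2)
imagePointsABCD = cv2.perspectiveTransform(objectPointsABCD, H)   # pixel positions of A, B, C, D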

Thanks FilipBaas!!
This indeed seems to be what I’m looking for.
I suppose I only have to provide 4 points (out of the 8 I have from the markers) to the findHomography function for objectPointsPlanar, right?
something like:
np.array([[x1,y1], [x2,y2], [x3,y3], [x4,y4]])
with points 1-2 linked to marker 1 and points 3-4 linked to marker 2 …
By the way, imagePoints are defined in pixels, but what unit should objectPointsPlanar be in?
Provided the markers are perfectly aligned with the [AB] segment, would it be too risky and inaccurate to use only a single marker?

findHomography() is able to calculate the transformation from any number of points (4 or more). The more points you use, the better the result will be. So do not worry about using both ArUco markers; they will make the detection more robust and accurate.

objectPointsPlanar can be defined in any units you prefer (mm, km, inches, miles), as long as objectPointsABCD uses the same units; the result of perspectiveTransform() will then be in image coordinates (pixels).

OK thanks!
I’m a bit ashamed to say it, but I was not able to give the input points to the findHomography function correctly.
I tried different things …

to define my planar points corresponding to the markers on the sheet of paper I tried this:

realWorldMarkerCorners = np.array([[0, 0], [0, 30], [30, 30], [30, 0]])

to get my image points I tried this:

markerCorners, markerIds, rejectedCandidates = cv2.aruco.detectMarkers(frame, dictionary, parameters=parameters)
index = np.squeeze(np.where(markerIds==85))
markerCorners = np.squeeze(markerCorners[index[0]])

but I got errors returned by the function:
" Bad argument (The input arrays should be 2D or 3D point sets) in findHomography"

I should definitely dig deeper into numpy arrays :thinking:

By the way, as my ABCD frame can be on the left side of the marker, can I define negative coordinates for the planar points of my rectangle?

try

realWorldMarkerCorners.shape = (-1, 1, 2)

in the C++ API, a cv::Mat of points commonly has to be a column vector of multi-channel elements. this reshape gives the numpy array a column-vector shape and 2-channel elements, so it translates into the cv::Mat that’s expected by the OpenCV function.

-1 rows for “fill this in”, 1 for one column, 2 “channels”.

OpenCV is a bit peculiar there.
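for example (with hypothetical coordinates; note also that the array should be float32, as in your later code, not the default integer type):

import numpy as np

realWorldMarkerCorners = np.float32([[0, 0], [30, 0], [30, 30], [0, 30],
                                     [125, 40], [155, 40], [155, 70], [125, 70]])

realWorldMarkerCorners.shape = (-1, 1, 2)    # now a column of 2-channel points
print(realWorldMarkerCorners.shape)          # (8, 1, 2)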

Hi crackwitz,

You were right, this solved my problem!! :clap:
I did not understand the whole thing, but it works.
I finally decided to use 2 markers instead of 1, and it is much more stable now.

So to sum up, in my case I use:

  1. findHomography to get matrix1 for “real world markers” to “image markers”
  2. perspectiveTransform to get the projection of “my real world ABCD” to “image world ABCD”
  3. findHomography again to get matrix2 for “my picture to project” to “image world ABCD”
  4. warpPerspective to apply my picture to the current frame based on matrix2

I did not think I would need 2 x findHomography but I don’t see how to do it differently …

However, I’m now encountering 2 new problems: :pensive:

  1. the frame rate is just horrible (probably due to a huge processing time for each frame)
  2. the picture that is being projected with warpPerspective (which is a PNG) is completely chopped, just as if the resolution were very low

Just a clarification regarding point 1:
I use a Raspberry Pi 3B in console mode (to lower CPU usage) and Python with the “picamera” package for preview and overlays, and OpenCV for image processing of course.

Hi there,

Any updates on that please?
I’m still stuck on my problems described in my previous post :confused:

Thanks :wink:

what updates do you expect? you have to give (e.g. runnable code) before you can receive. mere descriptions of code are insufficient.

Yes of course I understand, it will be easier with the code :grinning:
Here it is:

import cv2
import argparse
import numpy as np
import picamera
import picamera.array
import time


#------------------ GLOBALS---------------------------
    
# define image to be wrapped in real time
im_src = cv2.imread("image.png")

# load the aruco dictionary
dictionary = cv2.aruco.Dictionary_get(cv2.aruco.DICT_4X4_250)
            
# initialize detector params
parameters =  cv2.aruco.DetectorParameters_create()

# marker corners coordinates on paper (real world)
realWorldMarkerCorners = np.float32([   [0, 0], [30, 0], [30, 30], [0, 30],  \
                                        [125, 40], [155, 40], [155, 70], [125, 70]  ])

# frame corners coordinates on paper (real world) 
realWorldFrameBoardCorners = np.float32([[-2, 34], [158, 42], [158, 84], [-2, 92]])

    
#------------------ FUNCTIONS---------------------------

def process_frame(frame, img_to_process):
    """Detect markers and wrap"""
    
    global realWorldMarkerCorners
    global realWorldFrameBoardCorners

    
    try:

        # detect the markers in the image
        markerCorners, markerIds, rejectedCandidates = cv2.aruco.detectMarkers(frame, dictionary, parameters=parameters)

        imageMarkerCorners = np.asarray(markerCorners)

        # tips from crackwitz (ocv forum)
        realWorldMarkerCorners.shape = (-1, 1, 2)
        realWorldFrameBoardCorners.shape = (-1, 1, 2)

        # marker with Id 85
        index1 = np.squeeze(np.where(markerIds==85))

        # marker with Id 10
        index2 = np.squeeze(np.where(markerIds==10))

        Marker85_UpperLeft_Point    =   np.squeeze(markerCorners[index1[0]])[0]
        Marker85_UpperRight_Point   =   np.squeeze(markerCorners[index1[0]])[1]
        Marker85_LowerRight_Point   =   np.squeeze(markerCorners[index1[0]])[2]
        Marker85_LowerLeft_Point    =   np.squeeze(markerCorners[index1[0]])[3]

        Marker10_UpperLeft_Point    =   np.squeeze(markerCorners[index2[0]])[0]
        Marker10_UpperRight_Point   =   np.squeeze(markerCorners[index2[0]])[1]
        Marker10_LowerRight_Point   =   np.squeeze(markerCorners[index2[0]])[2]
        Marker10_LowerLeft_Point    =   np.squeeze(markerCorners[index2[0]])[3]

        # dest points
        pts_dst =   [Marker85_UpperLeft_Point] + [Marker85_UpperRight_Point] + [Marker85_LowerRight_Point] + [Marker85_LowerLeft_Point] \
                +   [Marker10_UpperLeft_Point] + [Marker10_UpperRight_Point] + [Marker10_LowerRight_Point] + [Marker10_LowerLeft_Point]
        pts_dst_m = np.asarray(pts_dst)
        
        # 1st homography : get matrix1 for “real world markers” to “image markers”
        matrix1, status = cv2.findHomography(realWorldMarkerCorners, pts_dst_m)

        # get the projection of “my real world frame" to “image world frame"
        imagePointsABCD = cv2.perspectiveTransform(realWorldFrameBoardCorners, matrix1)

        # source points
        pts_src = [[0,0], [img_to_process.shape[1], 0], [img_to_process.shape[1], img_to_process.shape[0]], [0, img_to_process.shape[0]]]         
        pts_src_m = np.asarray(pts_src)

        # 2nd homography : get matrix2 for “my picture to project” to “image world frame"
        matrix2, status = cv2.findHomography(pts_src_m, np.asarray(imagePointsABCD))

        # warp source image to destination based on homography
        warped_image = cv2.warpPerspective(img_to_process, matrix2, (frame.shape[1], frame.shape[0]))
       
        # return the processed image
        return warped_image
    
    except Exception as inst:
        print(inst)
        return None


		
def take_picture(cam):
    """ 
    Returns an openCV compatible colour image 
    """
    with picamera.array.PiRGBArray(cam) as stream:
        cam.capture(stream, format="rgb", use_video_port=True)
        return stream.array
       



def execute(flip=1):

    camera = picamera.PiCamera()
    camera.framerate = 30    

    if (flip == 1):
        camera.hflip = True

    # gives camera warm-up time
    time.sleep(2)
    camera.start_preview()
    
    frame = take_picture(camera)
    ovr = camera.add_overlay(frame, format="rgb", layer=3, alpha=64)
    
    if (flip == 1):
        ovr.hflip = True
    
    
    try:

        while True:
            frame = take_picture(camera)
            frame_out = process_frame(frame, im_src)

            if (frame_out is not None):
                ovr.update(frame_out)

    except Exception as e:
        print(e)
        pass
        


#------------------ MAIN---------------------------

def main():
    
    # ---- ARGS parsing -----------
    ap = argparse.ArgumentParser(description='AR frame projection test')
    ap.add_argument("--flip", "-f", default=1,
        help="Flip both preview and overlay")
    args = vars(ap.parse_args())

    if (
        isinstance(args["flip"], str)
        and args["flip"].isdigit()
    ):
        args["flip"] = int(args["flip"])

    #-------------------------
        
    execute(args["flip"])

      
        
if __name__ == "__main__":
    main()

The warpPerspective() function, from my experience, is very performance-demanding, so the RPi 3B may have a hard time with the processing. The downscaled look of the warped image might be solved by changing the interpolation used in the warp function from the default INTER_LINEAR to INTER_NEAREST.
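For example, reusing the variable names from your process_frame() above, the call could become something like this:

# same call as in process_frame, but with nearest-neighbour interpolation
warped_image = cv2.warpPerspective(im_src, matrix2, (frame.shape[1], frame.shape[0]),
                                   flags=cv2.INTER_NEAREST)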

Hi FilipBass,
I tried what you suggested but unfortunately it does not change anything in my case.
I’ll post a picture to show exactly what happens.
Concerning the processing issue, I could maybe try to run my “process_frame” function in a separate thread, but I’m not sure it would change much. The best thing would probably be to port my code to a more “low-level” language …

OK, I found the problem … it was simply due to the camera resolution, which was set to 640 x 480 by default :wink:
However, now that the resolution is higher, the processing time has increased and the frame rate is just not acceptable :disappointed_relieved:

a complete perspective warp costs some calculations. the GPU part of a computer would be the best processor to handle that.

if you do need this in realtime and are willing to invest some effort into the source code, that’s something to consider.

by the way, your code may have some, but not much, potential for optimization. you use a single perspective warp (potential for GPU… but effort) and two findHomography calls, which can involve RANSAC (so they can take a bit of time, depending on the amount of input data).

if one of your findHomography calls (the second one?) consists of exactly four point pairs, you can use getPerspectiveTransform, which is a direct solution.

you could also do that to the other call, if you pick a good set of four pairs from the eight pairs of corners you have.
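for example, the second call could become something like this (a sketch reusing the names from your process_frame; getPerspectiveTransform wants exactly four float32 points per array):

# direct 4-point solution replacing the second findHomography in process_frame
pts_src_m = np.float32(pts_src)                          # the 4 corners of the source PNG
pts_dst_m = np.float32(imagePointsABCD).reshape(4, 2)    # the 4 projected ABCD points in the image
matrix2 = cv2.getPerspectiveTransform(pts_src_m, pts_dst_m)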

check how much this call really costs (also the other calls). use time.perf_counter() to take timestamps, calculate their difference. you need to see what’s actually taking how much time in your code.
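for example, around one call (a sketch):

import time

t0 = time.perf_counter()
markerCorners, markerIds, rejectedCandidates = cv2.aruco.detectMarkers(frame, dictionary, parameters=parameters)
t1 = time.perf_counter()
print("detectMarkers:", t1 - t0, "s")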

I don’t know the state of OpenCL on a raspberry pi… OpenCV can use OpenCL for many things, if you use the UMat type. it’s a wrapper around Mat (or numpy array) that represents the data on a GPU. many OpenCV calls, specifically the perspective warp call, can take a UMat and return a UMat, which you can then unpack as a Mat/nparray (those live CPU-side). IF you have OpenCL available and OpenCV sees it and can use it (there are functions to check that), you could try this, but moving data to and from the GPU can cost you extra time. on a raspi it’s all the same memory so potentially it could cost you nothing at all.
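a rough sketch of what that could look like, reusing the names from your code (no guarantee it helps on the pi):

import cv2

print(cv2.ocl.haveOpenCL(), cv2.ocl.useOpenCL())   # is OpenCL present and enabled?

u_src = cv2.UMat(im_src)                            # wrap the source image as a UMat
u_warp = cv2.warpPerspective(u_src, matrix2, (frame.shape[1], frame.shape[0]))
warped_image = u_warp.get()                         # back to a plain numpy array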

Hi crackwitz,

I added timestamps as you suggested and here is the result:

------------------
Processing times for operations in process_frame
------------------
. detectMarkers => 0.11570285399966451
. findHomography#1 => 0.00202957300007256
. perspectiveTransform => 3.906299980371841e-05
. findHomography#2 => 0.00039276799998333445
. warpPerspective => 0.06075245000010909

------------------
Processing times for the infinite loop
------------------
. take_picture => 0.1219838689999051
. process_frame => 0.18347789700010253

It seems that the most time-consuming tasks are not the homography calls but “detectMarkers” and “take_picture”.
Maybe it could be interesting to run “take_picture” in another thread …
What do you think? Does it make sense?

that depends on whether the pi camera produces frames at a fixed frame rate or not.

if it does, the time you see is merely idle time until the next frame is actually produced.

to assess this, you’d determine the time for a complete loop iteration. take a timestamp at one place in the code, say between cam.capture and returning from take_picture, and check the difference between consecutive iterations. if that time is fairly stable and consistent with a fixed frame rate, say 1/30 of a second… you have your answer.

cam.capture itself may take time; you can only assess that if there’s buffering and you delay before the call, so that a full frame is already buffered and you don’t have to wait for the camera itself to finish exposing the frame first.

By taking a closer look at the Picamera API, I finally found a way to reduce the take_picture part!!
In fact, there’s a method called “capture_continuous” that produces frames continuously.
(as you said, I guess it continuously feeds a frame buffer)
So instead of calling “cam.capture”, I just have something like:

for frame in camera.capture_continuous(rawCapture, format="bgr", use_video_port=True):
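Put together, such a loop can look roughly like this (just a sketch with the names from my code above; the PiRGBArray buffer needs to be truncated between frames):

rawCapture = picamera.array.PiRGBArray(camera)

for f in camera.capture_continuous(rawCapture, format="bgr", use_video_port=True):
    frame = f.array
    frame_out = process_frame(frame, im_src)
    if frame_out is not None:
        ovr.update(frame_out)
    rawCapture.truncate(0)   # reset the buffer before the next frame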

The overall time of my infinite loop has now dropped from 0.30 to 0.17 seconds :smiley:
If I could now reduce the “detectMarkers” part, it would be perfect (I could maybe check the DetectorParameters attributes of the detectMarkers method for that, but it does not seem to be that easy).
Anyway, performance is much better now.