findHomography inaccurate as it moves to left side of image

OpenCV 4.5.3, C++, VS 2017, Win10

I have a findHomography situation that I cannot explain: the homography becomes increasingly inaccurate as points move toward the left side of the camera's view of the screen.

Previously I used a chessboard with 8 x 4 internal corners and findChessboardCorners. I am now using a 13 x 6 board with findChessboardCornersSB, and yet I am still seeing the same anomaly.

I am projecting the chessboard onto a 2nd monitor and the camera is facing that 2nd monitor. In this case the 2nd monitor is 1280 X 720, and the chessboard jpg is resized to 1280 X 720 from its original 1920 X 1080. A namedWindow is created and its properties set with the code below:

cv::namedWindow("GameBoard", cv::WINDOW_NORMAL | cv::WINDOW_KEEPRATIO);
cv::setWindowProperty("GameBoard", cv::WND_PROP_FULLSCREEN, cv::WINDOW_FULLSCREEN);
cv::resizeWindow("GameBoard", rSecondary.Width(), rSecondary.Height());
cv::moveWindow("GameBoard", rSecondary.left, rSecondary.top);

To find the chessboard corners in the camera view and then in the Mat of the chessboard jpg, I have:

bool patternfound = findChessboardCornersSB(mCameraImage, boardSize, vImage, CALIB_CB_EXHAUSTIVE | CALIB_CB_ACCURACY);
patternfound = findChessboardCornersSB(mToScreenChessBoard, boardSize, vObject, CALIB_CB_EXHAUSTIVE | CALIB_CB_ACCURACY);

I then create the Homograph with:

mHomoToScreen = findHomography(vImage, vObject, cv::RANSAC);

A point in the camera view is then translated with:

newImage_point.push_back(cv::Point2f(cvpKeypoint.x, cvpKeypoint.y));  // keypoint seen by camera and discovered in ShowBlob
cv::perspectiveTransform(newImage_point, newObject_point, mHomoToScreen);
cvpHomoKeypoint = cv::Point(newObject_point[0].x, newObject_point[0].y);

What I am seeing is that points in the right two-thirds of the camera view translate to points on the 2nd screen very accurately. However, as points move into the left third of the frame they become increasingly inaccurate: the translated point lands progressively further to the right of where it should be on the 2nd screen as the original point moves toward the left edge of the camera view. My guess is that the worst translated point is about 20 pixels to the right of where it should be, out of a total screen width of 1280 pixels.

Does anyone have an opinion of what would cause an inaccuracy but only to the left of the screen?

Ed

pictures/video please?

Ok, let me give this a try. I don’t see an “Insert picture” so I assume that I need to use the link insert.

Here is a screen capture of the 2nd monitor with the chessboard displayed. Monitor 2 is 1280 X 720 and the Mat mToScreenChessBoard has been resized to this same resolution and shown fullscreen.
Mat of chessboard shown on 2nd monitor

Here is what that 2nd monitor with the chessboard looks like in the cameraWindow: Camera view of chessboard

Once the homography is done, for testing purposes, I put a spreadsheet on the 2nd monitor with the cells sized at 1/4" X 1/4" and maximized. The dark cell is, give or take, the center of the spreadsheet. What the program will be doing is sending each frame from the camera stream to SimpleBlobDetector to look for a laser hit on the screen. When detected, those are the keypoints. The program will then take the keypoint in the cameraWindow and, using the homography created, determine where this same point is with respect to the Mat mToScreenChessBoard, and then simulate a mouse click at that point.
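For reference, the blob detection step is roughly along these lines (just a sketch, not the exact ShowBlob code; the parameter values are guesses for a bright laser dot on the dimmed camera image):

cv::SimpleBlobDetector::Params params;
params.filterByColor = true;
params.blobColor = 255;              // look for bright blobs (the laser dot)
params.filterByArea = true;
params.minArea = 5;                  // assumed size range of the dot, in pixels
params.maxArea = 500;
params.filterByCircularity = false;

cv::Ptr<cv::SimpleBlobDetector> detector = cv::SimpleBlobDetector::create(params);

std::vector<cv::KeyPoint> keypoints;
detector->detect(mCameraImage, keypoints);       // current camera frame

if (!keypoints.empty())
{
    cv::Point2f cvpKeypoint = keypoints[0].pt;   // laser hit in camera coordinates
    // ... this is the point that then goes through perspectiveTransform as in my first post
}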

At this stage of testing the Mat of the chessboard is on a monitor, so there is no complication of a projector screen being skewed. Since the Mat is 1280 X 720 and shown full screen on the 2nd monitor, which is also 1280 X 720, I would assume that what is on the 2nd monitor and the actual Mat should match up more or less 1 to 1. Here is the spreadsheet on the 2nd monitor: Spreadsheet shown maximized on 2nd monitor

Now I shoot a laser onto the 2nd monitor, more or less in the center. Here is the camera view of that hit. The actual laser hit is surrounded by a black circle drawn around the keypoint detected, so you can see that the keypoint is accurate. What you can also see in this camera view is the result of the simulated mouse click on the 2nd screen. The homographed keypoint will be the tip of the arrow cursor, and the spreadsheet cell under that point will activate with a black box around it. For this hit in the center of the screen you will see that the mouse arrow cursor point is right behind the laser hit, showing that the homography translated accurately to the 2nd monitor. Camera view of laser hit center of 2nd monitor (Camera has been dimmed to help detect laser hits.)

Here is how that homographed point on the 2nd monitor is used to simulate a mouse click. For testing purposes, the 2nd monitor has the same resolution as the primary monitor and, in this case, is an extension of the primary to the right. Sx and Sy are the homographed keypoint for the screen (2nd monitor).

INPUT Inputs[3] = { 0 };
Inputs[0].type = INPUT_MOUSE;
Inputs[0].mi.time = 0;
Inputs[0].mi.mouseData = 0;  // Mouse Wheel movement is 0
// The Sx and Sy have been moved from the Mat 0,0,1280,720 to the corresponding point on the 2nd monitor, which has a starting x of 1280
// Scale to the 0..65535 absolute coordinate range of the virtual desktop
Inputs[0].mi.dx = (LONG)((float)Sx * (65536.0f / GetSystemMetrics(SM_CXVIRTUALSCREEN))); // desired X coordinate on 2nd monitor
Inputs[0].mi.dy = (LONG)((float)Sy * (65536.0f / GetSystemMetrics(SM_CYVIRTUALSCREEN))); // desired Y coordinate on 2nd monitor
// Move the mouse to the desired point on the 2nd monitor
Inputs[0].mi.dwFlags = MOUSEEVENTF_ABSOLUTE | 
MOUSEEVENTF_VIRTUALDESK | MOUSEEVENTF_MOVE;
// Now left mouse down and then up to make a click
Inputs[1].type = INPUT_MOUSE;
Inputs[1].mi.dwFlags = MOUSEEVENTF_LEFTDOWN;

Inputs[2].type = INPUT_MOUSE;
Inputs[2].mi.dwFlags = MOUSEEVENTF_LEFTUP;
// Now send all 3 inputs
SendInput(3, Inputs, sizeof(INPUT));

Here now are camera views of the next shot, more to the left, and then another even more to the left. You will notice that as the shots go left, the tip of the arrow cursor is more and more to the right of the actual laser hit:
Laser hit left of center
Laser hit at left of screen

Now here is a hit on the right of the screen to show that the inaccuracy is not a matter of being off center but of being to the left:
Laser hit on screen right.

You will see that this particular camera has a slight fish eye effect. I have assumed that the homography will compensate for this, but as soon as I finish with this post I will run the same test with a different camera that does not have the slight fish eye effect. I will let you know how that turns out.

Any opinions or guesses of what I can modify to see if it fixes this would be appreciated.

Ed

Damn! It was the fish eye lens. The fish eye effect is relatively minor compared to a true fish eye lens. I didn’t think that it would matter. Is there anything in OpenCV to compensate for the fish eye effect when it comes to homography?

Ed

If you calibrate the camera you can use the resulting camera matrix + distortion coefficients to either undistort the image (and then use only undistorted images for computing the homography and using the homography), or you can undistort the points (faster) to compute the homography etc.

The key is that the homography is a linear transform and the lens distortion is non-linear, so the homography isn’t capable of modeling the lens distortion (just the projective distortion).
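Something along these lines (just a sketch, reusing the variable names from your snippets and assuming cameraMatrix / distCoeffs come from a prior cv::calibrateCamera(); passing the camera matrix as the P argument keeps the output in pixel coordinates instead of normalized coordinates):

// Undistort the detected chessboard corners, then compute the homography
// from the undistorted camera points to the screen points.
std::vector<cv::Point2f> vImageUndist;
cv::undistortPoints(vImage, vImageUndist, cameraMatrix, distCoeffs,
                    cv::noArray(), cameraMatrix);   // P = cameraMatrix -> pixel coords

cv::Mat mHomoToScreen = cv::findHomography(vImageUndist, vObject, cv::RANSAC);

// At runtime, undistort each detected laser keypoint the same way before
// applying the homography:
std::vector<cv::Point2f> raw{ cv::Point2f(cvpKeypoint.x, cvpKeypoint.y) }, undist, onScreen;
cv::undistortPoints(raw, undist, cameraMatrix, distCoeffs, cv::noArray(), cameraMatrix);
cv::perspectiveTransform(undist, onScreen, mHomoToScreen);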

Your distortion isn’t too significant, so you can probably get away with the 5 parameter model. Getting measurements as far into the corners of your image as possible during calibration will be helpful. Otherwise you might want to try the rational (8 parameter) model. I have had really good luck with that on lenses with significantly more distortion.


Steve,

I’ll need time to translate this to English :grinning: I have found the OCV fisheye namespace but have not yet been able to figure out if that is what I am looking for. You mention three things: one is to “calibrate the camera”, another is to “undistort the image”, and the last is to “undistort the points”.

I am unfamiliar with these processes. What OCV functions would be used for doing any of these three things?

PS. I have found a “learnopencv” article about camera calibration that should tell me what I need to know…thanks.

Ed

A few more comments.

  1. For camera calibration I suggest using the ChAruco pattern (chessboard + Aruco markers) because you don’t need to see the entire pattern in each image in order to use the visible points. This is helpful in getting sample points (chessboard corners) deep into the corners of your image (which helps with calibration accuracy).
  2. You can use the monitor to generate your calibration pattern if the scale / focal distance works for your use case (which appears to be the case). Monitors are really flat and have accurate pixel spacing. My main suggestion would be to make sure the image is being displayed without scaling (1:1 correspondence between image pixels and screen pixels). The monitor will probably generate better results than a printed target.
  3. It looks like this is for a shooting simulator. If the intent is to display the images on monitors you are fine, but if you plan on using projectors for your visual display, note that some have uncorrected distortion, particularly the ultra-short-throw types. It will be worst at the edges, and usually not very bad, but you could see 5-10 pixels of error near the corners / edges. If you find this to be the case you can calibrate the intrinsics of the projector (the same math / techniques used for cameras apply to projectors - they are equivalent except for the direction of the light). And if you project onto a curved screen things get harder.

I haven’t used the fisheye functions in several years, but when I evaluated them I did not find them to be ready for my purposes. First, I had problems with the calibration algorithm converging to a good solution. When it did the results were good, but often it would be total garbage. Also the normal set of functions (cv::undistort, cv::undistortPoints, cv::projectPoints, etc) had fisheye versions that weren’t as fully functional or not a 1:1 match. I don’t remember specifically, and maybe this has all changed, but it wasn’t workable for me. Fortunately if you use the rational model and the standard calibration you will be totally fine with that lens.

Your first order of business is to calibrate your camera. If you want accuracy there is no way around this unless you can get a lens with very low distortion. In that case you could just use a homography, but it will limit your options down the road (only “perfect” lenses, only flat screens, etc). cv::calibrateCamera(), or preferably cv::calibrateCameraCharuco(), are the core functions. I think there are some sample apps that automate the collection of images and call the calibration process for you, but I haven’t used them (and I’m not sure if there is one for the ChAruco calibration pattern or not - I suspect there is?).

The calibration process will provide a score. Lower is better. The number represents the amount of error between where a sample point was actually observed in the camera and where the model predicts it. This is in camera pixels and is an average (RMS) value for all of the samples used. For your purposes you might be ok with any score < 1.0 (and maybe you can tolerate even higher error?). You might be able to get scores as low as 0.25 with your setup.
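For what it’s worth, cv::calibrateCamera() returns that RMS reprojection error directly — a minimal sketch, where objectPoints / imagePoints are the per-view lists of chessboard corners (3D board coordinates and detected image coordinates) and imageSize is the camera resolution:

cv::Mat cameraMatrix, distCoeffs;
std::vector<cv::Mat> rvecs, tvecs;
double rms = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                 cameraMatrix, distCoeffs, rvecs, tvecs);
std::cout << "RMS reprojection error: " << rms << " px" << std::endl;   // lower is better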

The result of the calibration process is a camera matrix and distortion coefficients. These parameters are fixed (assuming your lens is a fixed focal length lens and you aren’t adjusting it in any way). The camera matrix and distortion coefficients encode how 3D points (in the camera reference frame) project to 2D image points. These parameters (camera matrix and distortion coefficients) are often referred to as intrinsic parameters. You will use these parameters to undistort images or points.

cv::undistort() is what you want for undistorting an image. See the documentation for how to use it.
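For example (a sketch, assuming the calibrated cameraMatrix / distCoeffs from above):

cv::Mat undistorted;
cv::undistort(mCameraImage, undistorted, cameraMatrix, distCoeffs);
// ... then run findChessboardCornersSB / blob detection on 'undistorted' instead of the raw frame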

cv::undistortPoints() will undistort a list of points based on the camera matrix and distortion coefficients. Again, see the documentation for this function and the samples/tutorials. Note that this is an iterative process and not a direct computation, so there will be some residual error. The amount of error is usually pretty small, especially for low-distortion lenses. In current versions of OpenCV you can pass in parameters that specify how many iterations to try (as part of the cv::TermCriteria parameter, I believe) - I have found that I need a very high number of iterations in some cases to converge. Probably you will be fine with 20 iterations (or maybe even 5-10), but I would run some tests with different values to see how much of a change it makes. I think in some cases I have used 100 iterations to get what I want (but to be fair my lens is highly distorted and may be pushing the limits of the distortion model I’m using).
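In recent OpenCV 4.x there is an overload that takes the termination criteria as the last argument — a sketch, assuming that overload is present in your build (distortedPts / undistortedPts are just illustrative names):

cv::undistortPoints(distortedPts, undistortedPts, cameraMatrix, distCoeffs,
                    cv::noArray(), cameraMatrix,                  // P = cameraMatrix -> pixel coords
                    cv::TermCriteria(cv::TermCriteria::COUNT + cv::TermCriteria::EPS,
                                     20, 1e-6));                  // at most 20 iterations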

cv::distortPoints() - this takes undistorted point locations and computes distorted point locations based on the calibrated intrinsics. This is a direct computation and does not iterate to converge to a solution.

I’m sure this is a lot of information to digest, but I wanted to include it here for you (and others) to refer back to - some of the nuances aren’t obvious from reading the documentation. Work through some examples and read the documentation…it will take some effort but it will be worth it.

Fastest way to get up and running with your current setup:

  1. Calibrate the camera with the sample app. Use the ChAruco pattern / method if there is an app / option to do so, but don’t get hung up on that (use the standard Chessboard pattern if that’s all that is available). It’s fine to start with the 5 parameter model, but if you find that accuracy is lacking you might want to use the rational model at some point.

  2. Use your exact setup as it is, but instead of using the raw camera images call cv::undistort() and then use the undistorted images everywhere throughout your whole process (to compute the homography, for finding the blob etc.)

That will get you results the fastest, but you might (?) find that the cv::undistort is too slow for your purposes. If so, you’ll have to get comfortable with calling cv::undistortPoints and computing the homography with the undistorted values, etc. This will be much faster.

Good luck.


Steve,

Wow, this is great. I see, from what I have read in the last 45 minutes, that getting the camera matrix (am I getting the terminology right?) requires multiple camera shots from different angles. The LearnOpenCV article suggests:

When we have very little control over the imaging setup (e.g. we have a single image of the scene), it may still be possible to obtain calibration information of the camera using a Deep Learning based method.

but then never mentions how this is done. In my case, I will not be the one using the app, so the app needs to look at the video capture (camera stationary) and first decide whether the lens is distorted or not. If it is distorted, like mine was, it then apparently needs to be able to use that single image to un-distort the points whenever a keypoint is detected.

So, apparently, my first step is to find out how to calibrate the camera using only one image. Am I looking at the same library of functions but maybe different parameters or is the “Deep Learning” method using a different library of functions?

Ed

Why can’t you have the end-user move the camera around to take pictures of the monitor from different angles? If you do that you can have them calibrate the camera (a one-time thing) and use the standard OpenCV functions.

I don’t know anything about the deep learning method you mentioned. If all you really need to do is undistort the image (or points) I think you could probably do that with something less than a fully calibrated camera, but I’m not sure how you would get there. I think all you need is center of distortion and some distortion function, and maybe a single image can get you there. My gut tells me that an image of a number of lines that are known to be straight in the world might be enough to get the distortion function, but I’m not sure how I would go about it.

It reminds me of a paper on calibrating camera distortion with a “calibration harp” - a number of strings under high tension (to enforce linearity). If you are curious you can read about it here:

(https://arxiv.org/pdf/1212.5656.pdf)

I don’t think that’s going to really give you what you want, or at least not without a good amount of work and deeper dive on your part.


Steve,

For some of my users they may have the camera mounted on the ceiling pointing to a projector screen. So, keeping them in mind, I would like to figure out how I can do this with a single image.

This is all a moot point if the camera image is flat (undistorted) like a Logitech web cam’s, but I need to check regardless, since the IR camera that my customers use is the one I was using that had the slight fish eye distortion. I guess I could figure out the matrix for that particular camera and look for it, but then there is a second problem. The camera has a manual zoom so, from what I see, I need the focal length for some of these calculations, and with a zoom I’m not sure how that would be calculated.

I have written to the author of the LearnOpenCV article about the Deep Learning method so hopefully something will come of that.

This brings me to the point where I would like to give a shout out to Serendipity. Had I been using a Logitech camera instead of the one with the fish eye distortion for testing, my testing would have shown all points accurate and I would have proceeded none the wiser until some customers complained that there was some inaccuracy and I would be scratching my head as to why them and not me. Just call me lucky!

Anyway, I really appreciate all of the pointers and hopefully some will lead me to a solution. Since this thread was started under the findHomography heading and that turned out not to be the problem, as I proceed I may end up starting another thread on calibrateCamera. I see I have some additional learning to do.

Thanks again.

Another thing you could consider doing is just a mesh-based mapping. Get the camera->monitor correspondences for a grid of points and then when you detect the laser in a camera you can map it to screen space like this:

  1. Calibrate the system by getting a mapping from screen coordinates to image coordinates.
  2. Create a triangle mesh from the image points.
  3. Detect the laser in the camera image.
  4. Determine which triangle the laser point is contained in.
  5. Compute the distance in image pixels between the laser and each of the triangle vertices.
  6. Use these distances to compute weights and then multiply the corresponding screen-space locations by the weights to get an estimated screen location for the laser.

Read up on Barycentric interpolation to get the details on this triangle weighting scheme. I have used this technique very successfully in similar applications.
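A sketch of the weighting step, using the standard barycentric formulation (the mesh construction and the point-in-triangle search are assumed to exist elsewhere; mapThroughTriangle is just an illustrative helper name):

// camTri: triangle vertices in camera coordinates (must contain p)
// scrTri: the corresponding vertices in screen coordinates
cv::Point2f mapThroughTriangle(const cv::Point2f& p,
                               const cv::Point2f camTri[3],
                               const cv::Point2f scrTri[3])
{
    const cv::Point2f v0 = camTri[1] - camTri[0];
    const cv::Point2f v1 = camTri[2] - camTri[0];
    const cv::Point2f v2 = p - camTri[0];

    const float d00 = v0.dot(v0), d01 = v0.dot(v1), d11 = v1.dot(v1);
    const float d20 = v2.dot(v0), d21 = v2.dot(v1);
    const float denom = d00 * d11 - d01 * d01;

    const float w1 = (d11 * d20 - d01 * d21) / denom;   // weight of vertex 1
    const float w2 = (d00 * d21 - d01 * d20) / denom;   // weight of vertex 2
    const float w0 = 1.0f - w1 - w2;                    // weight of vertex 0

    // Apply the same weights to the screen-space vertices.
    return w0 * scrTri[0] + w1 * scrTri[1] + w2 * scrTri[2];
}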

A few comments:

  1. A point-based measurement like this will be prone to outlier-induced error. What I mean by this is that if your measurement of some of the points is incorrect (a smudge on the lens, a reflection/glare on the monitor, etc) the resulting camera->screen mapping will have ripples / high frequency artifacts. This is one of the tradeoffs between a map/lookup table approach vs a mathematical model (which inherently filters out high frequency errors by picking a best fit to all of the data)
  2. You can do some sort of neighborhood-based filtering to detect / correct outliers. The optical distortion you are seeing does vary from one side of the image to the other, but locally it is fairly constant. I have used a local neighborhood (3x3) homography to filter/correct points in the past, but that was with a low distortion lens. With the amount of distortion you have a higher order function might be needed - I have used a 5x5 neighborhood to fit a 2nd order function for a projector on a curved screen scenario. This might be more than you want to get into, and there might be some libraries that package this approach up nicely.
  3. If you aren’t able to get a dense enough mesh using a chessboard target there are some other methods - I can explain in more detail if that is an issue.

I guess what I’m trying to say is that you can get what you want, but it might be somewhat involved to get there. I have a lot of experience doing this kind of thing - I’ll help where I can.

-Steve

Steve,

Thanks for this and thanks for the offer of help. I’m hoping I can work my way through this problem as I usually learn best by banging my head against the wall :wink:

I saw your earlier references to the samples in OpenCV and am reading through that now. I haven’t run the camera calibration sample yet but I see that they refer to 14 images in the sample directory that show the chessboard at 14 different angles. If this is what I think it is, it might be that, instead of someone moving the camera and taking 14 photos of a static chessboard on the wall, I can project a chessboard at 14 different angles and use these to calibrate the camera. IF this is correct, it might solve my problem of my users not being able to move their cameras or projection screens. Instead of the program projecting just one static chessboard, it will project 14 different boards. At least that is what I am hoping will happen when I run the sample.

If not, then it is on to plan B.

Ed

Unfortunately I don’t think that will work, but I’m not sure I understand the fundamentals well enough to give a reason why. I think it’s because you would be composing two projections (the actual one your camera does and the simulated one used to render it at different angles). And maybe a third projection (that of the projector onto the screen, which won’t be perfect) This would, I think, result in a calibration that would correspond to some theoretical camera, but not the one you are trying to calibrate. Maybe if you used the (known) camera matrix (projection matrix in OpenGL speak) you could factor that out of the calibration result somehow? I really don’t know, but maybe there is something there. The projection screen not being flat would also introduce error…and it’s probably not flat.

Also I don’t think you really need a full camera matrix if all you want to do is unwarp the image. You need the image center and distortion coefficients, but the focal length is just a scale factor, as I understand it. We are way out on a limb into speculation territory here, though…If you do try it post the results.

The other consideration is that you will probably have a hard time getting points to cover the full image so your distortion model won’t be too accurate outside of the views you get. Not a huge deal since you only care about the parts of the image that contain the monitor/projector in the first place.

Interesting problem. Good luck.

Steve,

Well, for one, I could not get the sample camera_calibration.cpp to compile. I spent some time messing with it and decided it was taking too much time.

I see your point about the photos in the sample data directory being photos themselves… I’m wondering whether taking my chessboard, using Photoshop to modify its perspective several times, and then projecting those in turn would work. That way I also have control of the image size, so if it’s flat on, it covers the whole screen; if the perspective is top down, the top will cover the width of the full screen, etc. Then bottom up, left over, and right over, so 5 different shots.

For now I am using a monitor as the 2nd screen to avoid the further complexity of a projector and screen. Baby steps :wink:

Having given up on the sample cpp, I am reading through an example of calibration and undistortion from https://aishack.in/tutorials/calibrating-undistorting-opencv-oh-yeah/

I’ll see what that does. It is always good to have examples to make sure no steps are getting missed and then play around with it once you get something working.

I’ll let you know.

Ed

Steve,

It looks like my luck is holding out.

I used the sample code at:
http://www.aishack.in/tutorials/calibrating-undistorting-opencv-oh-yeah/
(my browser warned me that the web certificate was out of date on this web page so I sandboxed and continued. Looked OK. Well, the guy works for Microsoft so who knows :wink: )

In that code it has a loop to go through each “photo” and do a findChessboardCorners for each one. I had taken my 13 X 6 chessboard and made 4 other versions at different perspectives, such as this one: Chessboard perspective top to bottom, along with bottom to top, left to right, and right to left. With my flat face-on chessboard, that made 5 “photos”. With each iteration through the loop I changed out the Mat that was projecting on the 2nd monitor and let the camera capture it (at least in code).

In the sample code, at the end of the loops, it imshows a window of the original distorted camera capture (Original distorted chessboard camera capture) and a window of the undistorted one (Undistorted chessboard camera capture).

Well, that looked like it worked, but while I was running this test I never saw the Mat on the 2nd screen change from the original flat face-on chessboard.

This camera calibration is running in its own thread and all of the namedWindows were created and shown in the main thread. For some reason, while doing the camera calibration, the loop in the main thread had paused. I haven’t checked that out yet, but the fact is that, in effect, I ran 5 loops with the same “photo”. I had read somewhere that this would cause problems, and yet it worked.

The sample code didn’t have a section to get an error reading so I added code to use projectPoints to get an average error reading and I got an error of 0.08577. You mentioned above that probably anything < 1.0 might be OK and maybe I could get a 0.25 with my not-so-fisheye camera. So 0.085 is pretty good and the undistorted chessboard looks to the human eye pretty good as well.
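For reference, the usual projectPoints-based average error computation looks something like this (a sketch, not necessarily the exact code; object_points / image_points / rvecs / tvecs / cameraMatrix / distCoeffs are the arrays from the calibrateCamera call):

double totalErr = 0.0;
size_t totalPts = 0;
for (size_t i = 0; i < object_points.size(); ++i)
{
    std::vector<cv::Point2f> projected;
    cv::projectPoints(object_points[i], rvecs[i], tvecs[i],
                      cameraMatrix, distCoeffs, projected);
    double err = cv::norm(image_points[i], projected, cv::NORM_L2);
    totalErr += err * err;
    totalPts += object_points[i].size();
}
double avgErr = std::sqrt(totalErr / totalPts);   // RMS reprojection error in pixels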

I’m going to modify the code to disregard all of the perspective chessboards to double check that it was ignoring them to begin with and then my final test will be the original test to see if I can get the un-distorted point homographed over to the spreadsheet on the 2nd monitor.

All told it sure looks like I stumbled on a fix. I’ll let you know if there are any gotchas as I go forward. Thanks for your help and encouragement.

Ed

Steve,

Final report…TaDaaaa. It works. In effect I ran the findChessboardCorners on just the one chessboard and had set the “focal distance”? “focal length”? to 1, even though I have no idea what it is, especially with this manually zoomed lens. I got the cameraMatrix and distCoeffs and later, when the program detected a laser hit, I used those with undistortPoints before getting the homographed points for the 2nd monitor. Almost every laser hit was spot on accurate except for the extreme left side, and even that was just slightly off by no more than a few pixels. The laser hit and the inaccurately translated homographed point were both within the same small spreadsheet cell. For this program that is accurate enough for me not to go any further trying to get those left side hits closer. I’m a happy camper.

Thanks again.

Ed

Hey, that’s great. I’m not sure I understand how that worked, but it sounds like by providing a fake focal length you were able to get the calibration to work well enough for undistorting purposes.

Question: It sounds like you provided a focal length in the camera matrix parameter (for FX and FY). Did you use the CALIB_USE_INTRINSIC_GUESS, CALIB_FIX_FOCAL_LENGTH, CALIB_FIX_ASPECT_RATIO flags? And did you use or not use CALIB_FIX_PRINCIPAL_POINT?

I’m asking because for the distortion to be the most accurate you would want to know the actual principal point since the distortion is relative to that point. Nominally (for most cameras) it is expected to be close to the center of the image, but even small shifts can affect the accuracy, particularly on high distortion lenses for points far from the center.

I’m pointing this out because if you aren’t getting a calibrated value for the principal point (you are just using a guess / the numerical image center) you might have gotten lucky with this camera (if the principal point happens to be pretty close to the center of the image), but that might not hold in practice. I routinely have cameras where the principal point is 50 pixels away from the center of the image.

Ideally you would be able to get a calibrated value for the principal point and distortion coefficients from just one image - if that’s what you’ve done then great!

One word about using 1 as the focal length - this might be totally fine, but it’s not at all close to what the actual value is. In theory it’s just a scale factor and it shouldn’t matter, but a value of 1 might be too small / have some sort of numerical stability issues. (Or it might not matter at all.) If it were me I’d probably use a dummy value that’s a lot larger. The focal length is measured in units of pixels, and something like 1/2 the width of the image would be a reasonable number to use.
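A sketch of what that might look like, assuming for illustration a 1280 x 720 camera image, and noting that CALIB_USE_INTRINSIC_GUESS requires the cameraMatrix argument to already contain a plausible fx, fy, cx, cy (objectPoints / imagePoints are the same per-view lists you pass to calibrateCamera):

cv::Size imageSize(1280, 720);                        // assumed camera resolution
double f = imageSize.width / 2.0;                     // dummy focal length, in pixels
cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) <<
        f, 0, imageSize.width  / 2.0,
        0, f, imageSize.height / 2.0,
        0, 0, 1);
cv::Mat distCoeffs = cv::Mat::zeros(8, 1, CV_64F);    // 8 coefficients for the rational model
std::vector<cv::Mat> rvecs, tvecs;

int flags = cv::CALIB_USE_INTRINSIC_GUESS | cv::CALIB_RATIONAL_MODEL;
double rms = cv::calibrateCamera(objectPoints, imagePoints, imageSize,
                                 cameraMatrix, distCoeffs, rvecs, tvecs, flags);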


Steve,

Thanks for getting back to me about these flags. I had not gotten to the point of going through the various functions in more detail to see if there was something to tweak. The sample code I used left out the flags parameter in calibrateCamera and I hadn’t noticed that. Looking at all of the various possible flags, it is a little confusing as to what each one really does. Do you have a suggestion for which flag(s) I should start with in testing?

That whole “focal point” terminology used here is confusing. I’m used to a camera lens having a focal length of 40mm…that type of thing. The distance internally between the film and the focal point. This x,y focal length must be something else entirely.

Thanks again.

Ed

Steve,

Currently I am using

calibrateCamera(object_points, image_points, image.size(), cameraMatrix, distCoeffs, rvecs, tvecs);

If I add any of the flags, even just one such as CALIB_USE_INTRINSIC_GUESS or CALIB_FIX_ASPECT_RATIO, the program will hang.

Ed