Determine the length of the table (green line in the image)

Find midpoint of length computed in step 1 (red circle)

Determine distance of box’s center relative to midpoint calculated in step 2 (blue line)

How can I go about writing the OpenCV code for this problem? What topics should I look at to get started? Do I use contouring? How will it help to calculate the box’s distance relative to a reference point (the midpoint of the table length)?

When you say “determine the length of the table” do you mean by using the camera, or by using some standard measuring device (tape measure etc)?

Note that the midpoint of the line in image space does not coincide with the midpoint of the table. To learn more about why, you should look into perspective projection. Chapter 2 of Multiple View Geometry in Computer Vision (Hartley / Zisserman), particularly sections 2.4 (a hierarchy of transformations) and 2.5 (the projective geometry of 1D), would be instructive for your situation.

Similar to the comments in 2, be aware that the real-world distance of the box to the center will depend not just on how many pixels it is from the reference point, but also where it is (above, below) compared to the reference point. Again, learning about projective geometry will help you understand why.

I think if I were doing this I would start by adding some reference points to the table with known world / plane locations, and then use an image editing program (Gimp, etc) to manually select the image points of the world references. Then use the image locations and corresponding world / plane locations to compute a homography. From there you can learn how to apply the homography to other image points (like the center of the table, and the center of your object) to compute world coordinates for them, and therefore distance between them.

I wouldn’t try to automate the image processing until you have a basic framework for using a homography. But that might not be how you work best, so it might be better to start with whatever you are most motivated to do.

I should have stated this in my main post:
**IT IS FINE if I do not get the real-world length of the table.**

I am okay with just getting the length of the table from the image alone. I am okay with getting the table length in pixels or any other unit that is easy to use with OpenCV for measurement.

I am aware of that. The red dot is just my estimation of where the midpoint is. Again, I do not care about real world measurements for now. I just want the OpenCV program to be able to draw the green line and calculate the midpoint of the green line.

I want the OpenCV program to be able to draw the green line, calculate the midpoint of that line, and say where the box is (above, below) and how far it is from the point of reference, i.e. the midpoint of the line. It is perfectly fine if I can get this information in pixels alone.

So if I reduce this problem to just drawing the line along the table’s length, getting the midpoint of that line and then saying how far along the line the box’s center is without worrying about real-world coordinates, can it be solved easily using just contouring and getting the Euclidean distance between the line’s midpoint and the box’s center (as calculated from the image alone)?
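The reduced, pixel-only measurement described here comes down to a little vector arithmetic once the line endpoints and the box centre are known. A minimal sketch, assuming all three are already available as pixel coordinates (the function name `offset_from_midpoint` and the sample coordinates are hypothetical):

```python
import numpy as np


def offset_from_midpoint(line_start, line_end, point):
    """Return the line midpoint, the signed offset of `point` along the
    line (positive toward `line_end`), and the Euclidean distance from
    the midpoint to `point`. All values are in pixels."""
    a = np.asarray(line_start, dtype=float)
    b = np.asarray(line_end, dtype=float)
    p = np.asarray(point, dtype=float)

    midpoint = (a + b) / 2.0
    direction = (b - a) / np.linalg.norm(b - a)  # unit vector along the line

    rel = p - midpoint
    along = float(np.dot(rel, direction))   # projection onto the line
    euclidean = float(np.linalg.norm(rel))  # straight-line distance
    return midpoint, along, euclidean


# Hypothetical pixel coordinates: the line runs from the near edge of the
# table (bottom of the image) to the far edge (top of the image).
mid, along, dist = offset_from_midpoint((320, 400), (320, 100), (350, 210))
side = "toward the far end" if along > 0 else "toward the near end"
print(mid, along, dist, side)
```

The signed `along` value answers the "above or below" part of the question, while `dist` is the plain Euclidean distance. Both are image-space quantities and carry the perspective caveats raised earlier in the thread.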

Sure, OpenCV can do that. Look for some tutorials on the drawing functions to draw your line etc. For finding the box, I’d probably start by subtracting your current image from a background image (a picture taken without the box on the table), then apply some sort of thresholding and (most likely) morphology to isolate the box, and then use findContours.

That should get you started, but of course there are other / more sophisticated ways to go about it, and depending on what you want to be able to handle, you might have to investigate other options.

I would definitely consider some kind of background modeling, which you don’t appear to have done. I suggested starting with a simple background subtraction (using an image taken without the box in view) + thresholding and morphology. Have you given that a try?

If you can control the objects in the scene (your black box, the table color etc) you can get more contrast and have an easier time of it. If you can control it even further, say putting an Aruco marker on the box, you can make it even easier.

There are so many directions you could go in, and which one you choose will to some extent depend on your constraints and requirements, which I don’t know.

I used morphology in my code but I didn’t use background subtraction and thresholding. What if I had only this one image and needed to make it work with just that? How would I get the output I want with just this image at my disposal, without the ability to take pictures without the box and so on?

Moreover, there are stray wires and glare in the image in my SO question which the findContours function is detecting. I am not sure atm how I can get my program to ignore those.