I would like to know how OpenCV's implementation of the Lucas-Kanade algorithm returns the location of the tracked points in the next frame. If I'm not mistaken, the algorithm computes the flow vectors in the x and y directions at the points of interest. Are those flow vectors simply added to the input coordinates and returned?
Yes, that's how optical flow works. There is no deeper state, history, or "model" of these points. It's just points on one frame, the local neighborhood around each point in that frame, and their best matches on the next frame.
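To make this concrete, here is a minimal sketch using the Python bindings (the frame file names and feature-detection parameters are just placeholders): cv2.calcOpticalFlowPyrLK returns the new point positions in the second frame directly, so the flow vectors are simply the difference between the output and input points.

```python
import numpy as np
import cv2

# Hypothetical inputs: two consecutive grayscale frames and some points to track.
frame0 = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)
frame1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
prev_pts = cv2.goodFeaturesToTrack(frame0, maxCorners=100,
                                   qualityLevel=0.01, minDistance=7)

# calcOpticalFlowPyrLK returns the tracked positions in frame1,
# not the displacements: next_pts[i] is where prev_pts[i] ended up.
next_pts, status, err = cv2.calcOpticalFlowPyrLK(frame0, frame1, prev_pts, None)

# The flow vectors are therefore just the difference between the two point sets.
flow = next_pts - prev_pts

# Keep only the points that were successfully tracked (status == 1).
good_new = next_pts[status.ravel() == 1]
good_old = prev_pts[status.ravel() == 1]
```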
Rather than asking about the implementation first, you can generally assume that an implementation follows the original paper, and that any deviations would be documented.
Have a look at these computer vision courses:
- Image Alignment, 16-385 Computer Vision (Kris Kitani), Carnegie Mellon University
- Alignment and tracking, 16-385 Computer Vision, Spring 2019, Lecture 22, CMU
- Lecture 30: Video Tracking: Lucas-Kanade, Robert Collins, CSE486, Penn State
The reference papers:
- Lucas-Kanade 20 Years On: A Unifying Framework: Part 1, Simon Baker and Iain Matthews
- Lucas-Kanade 20 Years On: A Unifying Framework: Part 2, Simon Baker, Ralph Gross, Takahiro Ishikawa, and Iain Matthews
- Lucas-Kanade 20 Years On: A Unifying Framework: Part 3, Simon Baker, Ralph Gross, and Iain Matthews
And of course, the original paper:
- An Iterative Image Registration Technique with an Application to Stereo Vision, Bruce D. Lucas and Takeo Kanade, 1981
For the LK algorithm in OpenCV, I think this paper can help:
- Pyramidal Implementation of the Lucas Kanade Feature Tracker: Description of the Algorithm, Jean-Yves Bouguet
Thanks. I just have a follow-up question. The function allows points to be provided in floating-point coordinates. How does the algorithm handle cases where the coordinates do not fall on the image grid? Is any interpolation performed, or does it use the closest grid point?
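For what it's worth, the pyramidal LK description that OpenCV's documentation points to (Bouguet's report above) samples image values at sub-pixel locations by bilinearly interpolating the four surrounding pixels, rather than snapping to the nearest grid point. As an illustration of what that means, a minimal sketch in plain NumPy (the function name is just for illustration, not OpenCV's actual code):

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Sample a grayscale image at a non-integer (x, y) coordinate
    by bilinearly weighting the four surrounding grid pixels."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = x0 + 1, y0 + 1
    ax, ay = x - x0, y - y0  # fractional parts, in [0, 1)

    # Weighted average of the four neighbors; the weights sum to 1.
    return ((1 - ax) * (1 - ay) * img[y0, x0] +
            ax       * (1 - ay) * img[y0, x1] +
            (1 - ax) * ay       * img[y1, x0] +
            ax       * ay       * img[y1, x1])

# Example: sample at (4.3, 5.7) instead of rounding to the nearest pixel.
img = np.arange(100, dtype=np.float32).reshape(10, 10)
print(bilinear_sample(img, 4.3, 5.7))  # 61.3 for this ramp image
```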