Defining the strategy to perform forgery detection

Hello community, this is my first post here. I am learning to use OpenCV to help me develop a way to identify video forgeries, such as inter-frame forgeries. For that, I am using the Gunnar Farneback method to compute optical flow and then following the procedures described in this article: Identifying Video Forgery Process Using Optical Flow
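For reference, the core of each frame-pair computation looks roughly like this (a simplified sketch: the parameters are the usual OpenCV tutorial values, not necessarily mine, and the random frames are stand-ins for real footage):

    import numpy as np
    import cv2 as cv

    # stand-in frames for illustration; in practice these come from cv.VideoCapture
    prev_gray = np.random.randint(0, 255, (480, 640), np.uint8)
    next_gray = np.random.randint(0, 255, (480, 640), np.uint8)

    # pyr_scale=0.5, levels=3, winsize=15, iterations=3, poly_n=5, poly_sigma=1.2, flags=0
    flow = cv.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                       0.5, 3, 15, 3, 5, 1.2, 0)

    # flow has shape (H, W, 2): a per-pixel (dx, dy) displacement field
    mag, ang = cv.cartToPolar(flow[..., 0], flow[..., 1])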

The issue is, even though I have already learned how to create a process pool to distribute the work and keep my CPU at 100%, it still takes over 30 seconds to process one second of footage. I could swap my current AMD GPU for an Nvidia one to leverage CUDA, but I would rather avoid the hassle for now if possible.
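The distribution itself is straightforward; simplified, it looks something like this (flow_for_pair and num_pairs are stand-ins for my actual per-frame-pair function and frame count):

    from multiprocessing import Pool

    def flow_for_pair(pair_index):
        # stand-in: decode the two frames at pair_index and run the
        # Farneback computation on them, returning the magnitudes
        return pair_index

    if __name__ == '__main__':
        num_pairs = 100  # illustration only
        with Pool() as pool:  # defaults to one worker per CPU core
            results = pool.map(flow_for_pair, range(num_pairs))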

So my idea now is to use the Lucas-Kanade method instead (as the article above also recommends), but from what I've seen it does not output the optical flow itself, only the positions of "Good Points to Track" between frames. Is it possible to calculate the optical flow from that, similar to the output of the Farneback implementation?

I'm sorry if this question has been asked before, but I was unable to find an answer to it by myself.

there is a difference between dense and sparse flow. sparse flow is calculated for a given set of points only, not all pixels. theoretically, I think LK can be formulated for dense flow. AFAIK OpenCV has LK for sparse flow only.
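a sparse LK sketch to make the difference concrete (random frames as stand-ins; the points come from goodFeaturesToTrack, and the "flow" is just new positions minus old):

    import numpy as np
    import cv2 as cv

    prev_gray = np.random.randint(0, 255, (480, 640), np.uint8)  # stand-ins
    next_gray = np.random.randint(0, 255, (480, 640), np.uint8)

    # pick a sparse set of corners worth tracking
    p0 = cv.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                qualityLevel=0.01, minDistance=7)

    # LK returns the new positions of those points plus a found/lost status per point
    p1, status, err = cv.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None)

    # sparse flow vectors: displacement of each successfully tracked point
    good = status.ravel() == 1
    flow_vectors = p1[good] - p0[good]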

besides, why stick to 20- and 40-year-old algorithms?

use “DIS” optical flow. it’s fast and has lots of other positive attributes.
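it's a drop-in replacement for a farneback call, something like this (random stand-in frames again):

    import numpy as np
    import cv2 as cv

    prev_gray = np.random.randint(0, 255, (480, 640), np.uint8)  # stand-ins
    next_gray = np.random.randint(0, 255, (480, 640), np.uint8)

    # presets trade accuracy for speed: ULTRAFAST, FAST, MEDIUM
    dis = cv.DISOpticalFlow_create(cv.DISOPTICAL_FLOW_PRESET_MEDIUM)

    # same dense (H, W, 2) output format as farneback
    flow = dis.calc(prev_gray, next_gray, None)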

you can swap out flow algorithms, if that’s not too much hassle. however, don’t speculate so much on what might be wrong with its performance until you’ve profiled the program. then you know where time is spent and where optimization is worth it.
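a quick first pass with the stdlib profiler is enough, assuming your entry point is some main() function (a placeholder here):

    import cProfile
    import pstats

    def main():
        ...  # placeholder: your actual processing loop goes here

    cProfile.run('main()', 'profile.out')

    # the 20 functions with the most cumulative time
    pstats.Stats('profile.out').sort_stats('cumulative').print_stats(20)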

CUDA is far from the only way to use a GPU. OpenCV uses OpenCL in many of its basic functions, if you give it UMat types instead of Mat types.
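roughly like this: the UMat wrapper is the only code change, but first check that an OpenCL runtime is visible and enabled at all:

    import numpy as np
    import cv2 as cv

    print(cv.ocl.haveOpenCL())   # is an OpenCL runtime visible?
    cv.ocl.setUseOpenCL(True)    # opt in (usually on by default)
    print(cv.ocl.useOpenCL())    # will OpenCV actually use it?

    frame = np.random.randint(0, 255, (480, 640, 3), np.uint8)  # stand-in
    u_frame = cv.UMat(frame)                          # lives device-side
    u_gray = cv.cvtColor(u_frame, cv.COLOR_BGR2GRAY)  # OpenCL path if available
    gray = u_gray.get()                               # copy back to numpy when needed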


Thank you for answering me. After a full day of trying to understand and implement your suggestions, I have a couple of questions:

First, I tried using UMat before posting here, in my Farneback implementation, but I did not notice much (if any) speed-up, and I also didn't see much GPU usage while it was executing. That's why I went the multiprocessing route in the first place. So maybe I am skipping a step? Is transforming a np.ndarray into cv.UMat really all I have to do, nothing else?

The other question is, when I try to implement the DISOpticalFlow method, I am getting an error. Here is my code:

    import cv2 as cv

    cap = cv.VideoCapture('video_file.dav')
    ret, frame1 = cap.read()

    prev_frame = cv.UMat(cv.cvtColor(frame1, cv.COLOR_BGR2GRAY))

    optical_flow_magnitudes_vect = []

    dis = cv.DISOpticalFlow.create(preset=1)

    while 1:
        ret, frame2 = cap.read()
        if not ret:
            print('End of video frames.')
            break

        next_frame = cv.UMat(cv.cvtColor(frame2, cv.COLOR_BGR2GRAY))
        optical_flow = cv.UMat.get(dis.calc(frame1, frame2, None))

Returns:

Traceback (most recent call last):
  File "C:\Users\user\PycharmProjects\project\test.py", line 33, in <module>
    optical_flow = cv.UMat.get(dis.calc(frame1, frame2, None))
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cv2.error: OpenCV(4.11.0) D:\a\opencv-python\opencv-python\opencv\modules\video\src\dis_flow.cpp:1434: error: (-215:Assertion failed) !I0.empty() && I0.depth() == CV_8U && I0.channels() == 1 in function 'cv::DISOpticalFlowImpl::calc'

And the funny thing is, when I implemented this method using my previous CPU multiprocessing strategy, it ran, albeit with some errors. Different errors, mind you, even though the optical flow code was mostly the same.

I would greatly appreciate it if you could help me once again. I spent a good portion of today trying to understand OpenCL and how to implement it because I was sure I was missing something, but was unable to find it by myself.

right. many/most/all the basic APIs support it, but more complex ones might not. there’s no guarantee (nor any documentation…) that any given API actually has an OpenCL path it can use, nor that use of a GPU will actually be faster than available CPU paths. general GPU programming advice applies (data transfer time, kernel compilation time, other warmup, pipelining, deficiencies in the kernel itself or how it uses the hardware, etc etc).
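if you want to know whether the UMat path actually pays off, time it yourself and throw away the first call, since kernel compilation and upload are front-loaded. a rough micro-benchmark sketch:

    import time
    import numpy as np
    import cv2 as cv

    a = np.random.randint(0, 255, (1080, 1920), np.uint8)
    b = np.random.randint(0, 255, (1080, 1920), np.uint8)

    for label, f0, f1 in (('Mat', a, b), ('UMat', cv.UMat(a), cv.UMat(b))):
        cv.calcOpticalFlowFarneback(f0, f1, None, 0.5, 3, 15, 3, 5, 1.2, 0)  # warm-up
        t = time.perf_counter()
        for _ in range(10):
            cv.calcOpticalFlowFarneback(f0, f1, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        print(label, (time.perf_counter() - t) / 10)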

the assertion says it wants (non-emptiness, 8-bit integers, and) single-channel inputs. you gave it color. convert to gray. if you believe there is information in the chroma channels (color) that didn’t also show up in the luma channel, addressing that would need separate thought.

don’t worry about OpenCL and UMats for now. first get it working.
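concretely, per pair of frames, something like this (generic names, random stand-in frames):

    import numpy as np
    import cv2 as cv

    frame1 = np.random.randint(0, 255, (480, 640, 3), np.uint8)  # stand-in BGR frames
    frame2 = np.random.randint(0, 255, (480, 640, 3), np.uint8)

    dis = cv.DISOpticalFlow_create(cv.DISOPTICAL_FLOW_PRESET_FAST)

    # the assertion demands 8-bit single-channel input, hence the conversion
    prev_gray = cv.cvtColor(frame1, cv.COLOR_BGR2GRAY)
    next_gray = cv.cvtColor(frame2, cv.COLOR_BGR2GRAY)
    flow = dis.calc(prev_gray, next_gray, None)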

I don’t know if OpenCV has anything like this, but optical flow is an ideal application to run on GPUs. nvidia’s SDKs probably have APIs to do that. AMD and Intel probably do too.

most video codecs are based on optical flow, except they don't call it that, they call it motion estimation. I've seen people take a video bitstream from a file (or hardware encoder) and take it apart to get at the motion vectors. that'll be somewhat coarse because entire blocks move as one. in any case, could be something to consider.
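with PyAV, for example, something along these lines. this is a sketch from memory, so check the PyAV docs for the exact side-data accessor, and the codec has to support exporting motion vectors (e.g. H.264):

    import av  # PyAV, pip install av

    container = av.open('video_file.mp4')  # hypothetical input
    stream = container.streams.video[0]
    # ask the decoder to attach motion vectors as frame side data
    stream.codec_context.options = {'flags2': '+export_mvs'}

    for frame in container.decode(stream):
        mvs = frame.side_data.get('MOTION_VECTORS')
        if mvs is not None:
            # structured array with per-block fields like src_x/src_y/dst_x/dst_y
            arr = mvs.to_ndarray()
            print(frame.pts, len(arr))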

Oh damn, sorry. I created so many functions to test things that I mixed up variable names and passed non-grayscale frames to the optical flow bit.

So now that the script is working, it is giving me the same errors as the other implementation when it reaches dis.calc(...):

-DDIS_BORDER_SIZE=16 -DDIS_PATCH_SIZE=8 -DDIS_PATCH_STRIDE=4 -DCV_USE_SUBGROUPS=1 -D AMD_DEVICE
C:\Users\user\AppData\Local\Temp\comgr-7326e0\input\CompileSource:186:12: error: use of undeclared identifier 'sub_group_reduce_add'
  186 | sum_diff = sub_group_reduce_add(sum_diff);
      |            ^
C:\Users\user\AppData\Local\Temp\comgr-7326e0\input\CompileSource:187:15: error: use of undeclared identifier 'sub_group_reduce_add'
  187 | sum_diff_sq = sub_group_reduce_add(sum_diff_sq);
      |               ^
C:\Users\user\AppData\Local\Temp\comgr-7326e0\input\CompileSource:226:11: error: use of undeclared identifier 'get_sub_group_local_id'
  226 | int sid = get_sub_group_local_id();
      |           ^
C:\Users\user\AppData\Local\Temp\comgr-7326e0\input\CompileSource:360:11: error: use of undeclared identifier 'get_sub_group_local_id'
  360 | int sid = get_sub_group_local_id();
      |           ^
4 errors generated.
Error: Failed to compile source (from CL or HIP source to LLVM IR).

It still runs to the end, but it seems not to use much, if any, GPU resources (my rig is a Ryzen 5 7600 with an RX 7800 XT, fairly new and moderately powerful hardware).

Just to shed more light on my situation: I am learning this because I am trying to help a relative who took a freelance job in which one of the steps is checking large amounts of video evidence for forgeries, so compute time unfortunately already matters to me.