My video freezes after one second (not responding)

Hi all, I am quite new here.
I need help. I have been trying to figure it out for more than a week, but got stuck.
I would appreciate it if someone could help.
Thanks so much.

I am having a problem with OpenCV (version 4.5.2).
I am trying to run “object detection” on a video. It runs, but after about one second the video window stops responding (picture is attached below).

FYI, the “object detection” works until the video freezes. So, I believe there is nothing wrong with the “object detection” code.

I also found out that, even without “object detection”, the video freezes at the same frame. In that case it starts running again after a while, but the video with “object detection” never recovers.

Please find my code below:
Code for running the video (without object detection)

cap = cv2.VideoCapture('Tensorflow/workspace/images/video.MP4')
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))


while cap.isOpened(): 
    ret, frame = cap.read()
    image_np = np.array(frame)
        
    if ret:
        cv2.imshow('object detection', cv2.resize(image_np, (960, 540)))
            
        if cv2.waitKey(240) & 0xFF == ord('q'):
            break
        
cap.release()
cv2.destroyAllWindows()

Code for running the video (with object detection)

cap = cv2.VideoCapture('Tensorflow/workspace/images/video.MP4')
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

while cap.isOpened(): 
    ret, frame = cap.read()
    image_np = np.array(frame)
    
    input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)
    detections = detect_fn(input_tensor)
    
    num_detections = int(detections.pop('num_detections'))
    detections = {key: value[0, :num_detections].numpy()
                  for key, value in detections.items()}
    detections['num_detections'] = num_detections

    # detection_classes should be ints.
    detections['detection_classes'] = detections['detection_classes'].astype(np.int64)

    label_id_offset = 1
    image_np_with_detections = image_np.copy()

    viz_utils.visualize_boxes_and_labels_on_image_array(
                image_np_with_detections,
                detections['detection_boxes'],
                detections['detection_classes']+label_id_offset,
                detections['detection_scores'],
                category_index,
                use_normalized_coordinates=True,
                max_boxes_to_draw=5,
                min_score_thresh=.4,
                agnostic_mode=False)

    if ret:    
        cv2.imshow('object detection', cv2.resize(image_np_with_detections, (960, 540)))
    
        if cv2.waitKey(240) & 0xFF == ord('q'):
            break
        
cap.release()
cv2.destroyAllWindows()

The error shown:

ValueError: 'images' must have either 3 or 4 dimensions.

video

you have multiple issues here.

  • you claim it’s starting and stopping repeatedly. I can’t imagine the code doing that, but you aren’t showing everything, and this doesn’t look quite like a minimal reproducible example.
  • you have a ValueError but you have kept the Traceback a secret.

pick one problem at a time.

Dear sir, I believe they are related to each other, as the “error” happens at exactly the same frame where the video stops for a while.
However, once the error is shown (the video is not responding), it does not start running again. The program crashes.

How do I show the Traceback?

try to run your object detection on a single image, measure how long that takes.

(i’m pretty sure, your detection is the bottleneck, not the video)
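A minimal sketch of that measurement, using only the standard library. `time_call` and `dummy_detector` are hypothetical names introduced here for illustration; in the real notebook the call would be something like `time_call(detect_fn, input_tensor)`.

```python
import time

def time_call(fn, *args, repeats=3):
    """Return the average wall-clock seconds of fn(*args) over `repeats` runs."""
    fn(*args)  # warm-up call: the first call of a tf.function includes tracing
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats

# Stand-in workload so the sketch runs on its own (an assumption, not the
# real detector); replace with: time_call(detect_fn, input_tensor)
def dummy_detector(n):
    return sum(i * i for i in range(n))

seconds = time_call(dummy_detector, 100_000)
print(f"{seconds * 1000:.2f} ms per call")
```

If the measured time per frame is much longer than the interval between frames, the display will lag no matter how the video is read.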

python shows that automatically… right above the error. this is basic python debugging.

Thank you for the video.

this is my error and traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
C:\Users\ALI~1.ARY\AppData\Local\Temp/ipykernel_6000/1925657203.py in <module>
      8 
      9     input_tensor = tf.convert_to_tensor(np.expand_dims(image_np, 0), dtype=tf.float32)
---> 10     detections = detect_fn(input_tensor)
     11 
     12     num_detections = int(detections.pop('num_detections'))

~\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\eager\def_function.py in __call__(self, *args, **kwds)
    887 
    888       with OptionalXlaContext(self._jit_compile):
--> 889         result = self._call(*args, **kwds)
    890 
    891       new_tracing_count = self.experimental_get_tracing_count()

~\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\eager\def_function.py in _call(self, *args, **kwds)
    915       # In this case we have created variables on the first call, so we run the
    916       # defunned version which is guaranteed to never create variables.
--> 917       return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
    918     elif self._stateful_fn is not None:
    919       # Release the lock early so that multiple threads can perform the call

~\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\eager\function.py in __call__(self, *args, **kwargs)
   3020     with self._lock:
   3021       (graph_function,
-> 3022        filtered_flat_args) = self._maybe_define_function(args, kwargs)
   3023     return graph_function._call_flat(
   3024         filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access

~\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\eager\function.py in _maybe_define_function(self, args, kwargs)
   3442 
   3443           self._function_cache.missed.add(call_context_key)
-> 3444           graph_function = self._create_graph_function(args, kwargs)
   3445           self._function_cache.primary[cache_key] = graph_function
   3446 

~\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\eager\function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
   3277     arg_names = base_arg_names + missing_arg_names
   3278     graph_function = ConcreteFunction(
-> 3279         func_graph_module.func_graph_from_py_func(
   3280             self._name,
   3281             self._python_function,

~\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\framework\func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
    997         _, original_func = tf_decorator.unwrap(python_func)
    998 
--> 999       func_outputs = python_func(*func_args, **func_kwargs)
   1000 
   1001       # invariant: `func_outputs` contains only Tensors, CompositeTensors,

~\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\eager\def_function.py in wrapped_fn(*args, **kwds)
    670         # the function a weak reference to itself to avoid a reference cycle.
    671         with OptionalXlaContext(compile_with_xla):
--> 672           out = weak_wrapped_fn().__wrapped__(*args, **kwds)
    673         return out
    674 

~\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\framework\func_graph.py in wrapper(*args, **kwargs)
    984           except Exception as e:  # pylint:disable=broad-except
    985             if hasattr(e, "ag_error_metadata"):
--> 986               raise e.ag_error_metadata.to_exception(e)
    987             else:
    988               raise

ValueError: in user code:

    C:\Users\ALI~1.ARY\AppData\Local\Temp/ipykernel_6000/1654050223.py:11 detect_fn  *
        image, shapes = detection_model.preprocess(image)
    C:\Users\Ali.Aryo\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.8.egg\object_detection\meta_architectures\ssd_meta_arch.py:484 preprocess  *
        normalized_inputs, self._image_resizer_fn)
    C:\Users\Ali.Aryo\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.8.egg\object_detection\utils\shape_utils.py:492 resize_images_and_return_shapes  *
        outputs = static_or_dynamic_map_fn(
    C:\Users\Ali.Aryo\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.8.egg\object_detection\utils\shape_utils.py:246 static_or_dynamic_map_fn  *
        outputs = [fn(arg) for arg in tf.unstack(elems)]
    C:\Users\Ali.Aryo\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\object_detection-0.1-py3.8.egg\object_detection\core\preprocessor.py:3327 resize_image  *
        new_image = tf.image.resize_images(
    C:\Users\Ali.Aryo\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\util\dispatch.py:206 wrapper  **
        return target(*args, **kwargs)
    C:\Users\Ali.Aryo\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\ops\image_ops_impl.py:1538 resize_images
        return _resize_images_common(
    C:\Users\Ali.Aryo\Documents\TFOD_500\TFODCourse\tfod\lib\site-packages\tensorflow\python\ops\image_ops_impl.py:1396 _resize_images_common
        raise ValueError('\'images\' must have either 3 or 4 dimensions.')

    ValueError: 'images' must have either 3 or 4 dimensions.
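
One possible reading of this traceback, offered as an assumption rather than a confirmed diagnosis: if `cap.read()` fails on some frame, it returns `ret = False` and `frame = None`, and the loop still passes `np.array(frame)` to the model. The NumPy-only sketch below shows the shapes involved; `batched_shape` is a hypothetical helper added for illustration.

```python
import numpy as np

def batched_shape(frame):
    """Shape that np.expand_dims(np.array(frame), 0) would hand to the model."""
    return np.expand_dims(np.array(frame), 0).shape

# A successfully decoded 1080p BGR frame: the model sees a 4-D batch.
good_frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
print(batched_shape(good_frame))  # (1, 1080, 1920, 3)

# A failed read gives frame = None: the "batch" is 1-D, so each image
# inside it has 0 dimensions instead of the 3 the resizer expects.
print(batched_shape(None))  # (1,)
```

Printing `ret` and `image_np.shape` just before the `detect_fn` call would confirm or rule this out.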

I just found out that this problem occurs only when I play the video at full 1080p resolution (1920 x 1080).

When I converted/compressed the video to 720p, all of the code worked (the video runs and the object detection runs).

However, this does not really solve my problem.
The original video (1080p) contains metadata (GPS location), and if I convert/compress the video, all of the metadata is lost.

So, I am hoping someone can tell me how to run the code on the 1080p video,
or alternatively, whether there is any way to compress the video from 1080p to 720p without losing the metadata.

I am using GoPro hero 8 for the video.

Thanks so much in advance

please show code.

Dear sir,

My code is already uploaded on the first post.
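
Looking at the loops in the first post, one thing stands out: `ret` is only checked after the frame has already been used, and when `ret` is `False` the loop never reaches `cv2.waitKey`, so the window cannot process events while reads keep failing. `cap.isOpened()` stays `True` after a failed read, so the loop does not exit on its own either. Below is a hedged sketch of a guarded read loop. `iter_frames` and `max_failures` are hypothetical names, and it is an assumption that the 1080p file produces failed reads at that frame.

```python
def iter_frames(cap, max_failures=30):
    """Yield frames from a cv2.VideoCapture-like object.

    Failed reads (ret is False or frame is None) are skipped. After
    `max_failures` consecutive failures the loop stops, which also
    covers the normal end of the video.
    """
    failures = 0
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret or frame is None:
            failures += 1
            if failures >= max_failures:
                break
            continue
        failures = 0  # reset the counter after any good frame
        yield frame

# Intended usage with the code from the first post (not executed here):
#
#   cap = cv2.VideoCapture('Tensorflow/workspace/images/video.MP4')
#   for frame in iter_frames(cap):
#       image_np = np.array(frame)
#       ...  # run detect_fn and draw the boxes as before
#       cv2.imshow('object detection', cv2.resize(image_np, (960, 540)))
#       if cv2.waitKey(1) & 0xFF == ord('q'):
#           break
#   cap.release()
#   cv2.destroyAllWindows()
```

With this structure the detector only ever sees real frames, and the original 1080p file stays untouched, so its metadata is not affected.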