Passing Mat to another thread causes invalid memory

My main application reads a video file, then sends each frame to be processed on a separate thread. The issue is that after a number of frames it eventually starts reporting that the Mat is null inside Detector. What would cause this? I have tried mat.clone(), and also copying the bytes from a BufferedImage into a byte[] and rebuilding the Mat in the thread, but the same issue happens. Everything works fine if I use a single thread.

So I’ve traced the issue to the need to call System.gc(); to prevent memory leaks… I have updated the code so that you can reproduce the issue. I assume this has to do with the use of finalizers.

BlockingQueue<Frame> queue = new LinkedBlockingQueue<>(100000);
VideoCapture camera = new VideoCapture("/path/video.mp4");
Mat frame = new Mat();
Mat mat;
boolean finished = false;
Rect rectangle = new Rect(145, 0, 575, 426);
Runnable detection1Task = new ObjectDetection(queue);
Thread detection1Thread = new Thread(detection1Task);

detection1Thread.setName("Detection 1 Thread");

while (!finished) {
    // Assumed condition: the test inside this if is cut off in the post; camera.read(frame) fits the declarations above.
    if (camera.read(frame)) {
        mat = frame.submat(rectangle);
        queue.add(new Frame(mat, 0, 0)); // I do many more than just this one mat, though

        // Calling this on a timer will eventually produce the same results.
        // I have a task that needs to run 24/6, 365 days a year.
        System.gc(); // Removing this line will lead to a CRAZY amount of RAM consumption
    } else {
        finished = true;
    }
}

My Thread:

public class ObjectDetection implements Runnable {
    private final BlockingQueue<Frame> queue;
    private Frame object;
    private Detector detector;

    public ObjectDetection(BlockingQueue<Frame> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        try {
            object = queue.take();
            Detector detect = new Detector("path/to/cfg", "path/to/weights", object.getMat(), 640, 128, .9f);
            object.clear(); // Calls release
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}



import java.util.ArrayList;
import java.util.List;
import org.opencv.core.*;
import org.opencv.dnn.Dnn;
import org.opencv.dnn.Net;
import org.opencv.utils.Converters;

public class Detector {
    private final String cfg;
    private final String model;
    private final Mat mat;
    private final int networkWidth;
    private final int networkHeight;
    private final float threshold;

    public Detector(String cfg, String model, Mat mat, int networkWidth, int networkHeight, float threshold) {
        this.cfg = cfg;
        this.model = model;
        this.mat = mat;
        this.networkWidth = networkWidth;
        this.networkHeight = networkHeight;
        this.threshold = threshold;
    }

    public String detect() {
        // Assumed guard: the condition in front of this early return is missing in the post.
        if (mat == null || mat.empty()) {
            return "";
        }
        Net net = Dnn.readNetFromDarknet(cfg, model);
        Size sz = new Size(networkWidth, networkHeight);
        Mat blob = Dnn.blobFromImage(mat, 0.00392, sz, new Scalar(0), true, false);
        net.setInput(blob); // assumed: forward() needs the blob set as input first

        List<Mat> result = new ArrayList<>();
        List<String> outBlobNames = getOutputNames(net);

        net.forward(result, outBlobNames);

        List<Integer> clsIds = new ArrayList<>();
        List<Float> confs = new ArrayList<>();
        List<Rect2d> rects = new ArrayList<>();

        for (int i = 0; i < result.size(); ++i) {
            // each row is a candidate detection, the 1st 4 numbers are
            // [center_x, center_y, width, height], followed by (N-4) class probabilities
            Mat level = result.get(i);
            for (int j = 0; j < level.rows(); ++j) {
                Mat row = level.row(j);
                Mat scores = row.colRange(5, level.cols());

                Core.MinMaxLocResult mm = Core.minMaxLoc(scores);

                float confidence = (float) mm.maxVal;
                Point classIdPoint = mm.maxLoc;
                if (confidence > threshold) {
                    int centerX = (int) (row.get(0, 0)[0] * mat.cols());
                    int centerY = (int) (row.get(0, 1)[0] * mat.rows());
                    int width   = (int) (row.get(0, 2)[0] * mat.cols());
                    int height  = (int) (row.get(0, 3)[0] * mat.rows());
                    int left    = centerX - width  / 2;
                    int top     = centerY - height / 2;

                    // Assumed: these two adds are implied by classIdPoint and by the confs check below.
                    clsIds.add((int) classIdPoint.x);
                    confs.add(confidence);
                    rects.add(new Rect2d(left, top, width, height));
                }
            }
        }

        // Apply non-maximum suppression procedure.
        float nmsThresh = .50f;
        if (confs.isEmpty()) { // If no results were found, return
            return "";
        }
        MatOfFloat confidences = new MatOfFloat(Converters.vector_float_to_Mat(confs));
        Rect2d[] boxesArray = rects.toArray(new Rect2d[0]);
        MatOfRect2d bbox = new MatOfRect2d(boxesArray);
        MatOfInt indices = new MatOfInt();
        Dnn.NMSBoxes(bbox, confidences, threshold, nmsThresh, indices);

        // Grab Results:
        int[] ind = indices.toArray();
        float confy = 0;
        int classID = -1;
        for (int i = 0; i < ind.length; ++i) {
            int idx = ind[i];

            classID = clsIds.get(idx);
            Float confidence = confs.get(idx);
            confy = confidence;
        }

        //Release the mats
        return classID + "," + confy;
    }

    private List<String> getOutputNames(Net net) {
        List<String> names = new ArrayList<>();

        List<Integer> outLayers = net.getUnconnectedOutLayers().toList();
        List<String> layersNames = net.getLayerNames();

        outLayers.forEach((item) -> names.add(layersNames.get(item - 1)));
        return names;
    }
}

After so many frames the program crashes and gives:

# A fatal error has been detected by the Java Runtime Environment:
#  SIGSEGV (0xb) at pc=0x00007fda20e7bac1, pid=34593, tid=0x00007fd9ecaed640
# JRE version: OpenJDK Runtime Environment (8.0_362-b09) (build 1.8.0_362-8u362-ga-0ubuntu1~22.04-b09)
# Java VM: OpenJDK 64-Bit Server VM (25.362-b09 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  []  cv::dnn::experimental_dnn_34_v21::Net::Impl::forwardLayer(cv::dnn::experimental_dnn_34_v21::LayerData&)+0xc61

crosspost: java - Passing Mat to another thread causes invalid memory - Stack Overflow

your mat is a ‘global var’!

apart from that, no one can reproduce your current ‘dummy’ code …

what can i say …

your (now deleted) graph simply looks like you’re just filling that queue up with images, until the movie is over …

i still suspect, there’s some dead simple thing going on here …

if you can, present a MRE. if that doesn’t fit in a post, perhaps host it somewhere (github/gist, …). make sure it’s as minimal as possible, no operations that are not vital to reproducing/demonstrating the issue. anything required to build and run the code is vital.

that also means you need to figure out if the MRE requires usage of OpenCV dnn code or not.

I’ll do that shortly.

again (please prove me wrong !):

  • main thread is filling the queue continuously from video (like, 30 fps)
  • detection thread removes only 1 last frame from the queue to process
    (yolov3 might take 3 seconds per inference)

the queue will grow & grow & grow, no ?
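
One way to check that suspicion, and to cap memory if it holds, is a small bounded queue whose producer drops frames when the detector falls behind. This is only a minimal sketch in plain Java, assuming that dropping frames is acceptable for the application. `BoundedFeed` and `FrameStub` are hypothetical names standing in for the main loop and the `Frame` class, and no OpenCV is involved:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class BoundedFeed {
    // Hypothetical stand-in for the Frame class in the question; holds only a sequence number.
    static final class FrameStub {
        final int seq;
        FrameStub(int seq) { this.seq = seq; }
    }

    // Offers `total` frames to a queue of size `capacity` while nothing consumes them,
    // and returns how many frames were dropped because the queue was full.
    static int feedWithoutConsumer(int capacity, int total) {
        BlockingQueue<FrameStub> queue = new ArrayBlockingQueue<>(capacity);
        int dropped = 0;
        for (int i = 0; i < total; i++) {
            // offer() returns false immediately when the queue is full,
            // unlike add(), which throws, or put(), which blocks.
            if (!queue.offer(new FrameStub(i))) {
                dropped++; // with real frames, this is where the Mat would be released
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        System.out.println("dropped = " + feedWithoutConsumer(4, 30));
    }
}
```

With a capacity of 4 and 30 frames offered to an idle consumer, `feedWithoutConsumer(4, 30)` reports 26 drops. If stalling the reader is preferable to dropping frames, `put()` would be the call to use instead of `offer()`.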

please don’t do this ! (it’s akin to another dreaded ‘global’, used with threading !)
make a new Mat (a local variable) inside the video loop!

currently, even with a new Mat() around a slice of the frame Mat – they (might, i cannot check, sorry) point to the very same pixel data (at c++ level)
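
The aliasing concern can be shown without OpenCV. The sketch below is only an analogy, using `java.nio.ByteBuffer` in place of `Mat`: `view` shares storage with the capture buffer, the way `frame.submat(rectangle)` is suspected to, while `copy` owns its bytes, the way a cloned submat would. `ViewVersusCopy` and its method names are made up for illustration.

```java
import java.nio.ByteBuffer;

final class ViewVersusCopy {
    // Returns a view that shares storage with `frame` (the analogue of a submat).
    static ByteBuffer view(ByteBuffer frame, int offset, int length) {
        ByteBuffer dup = frame.duplicate(); // independent position/limit, same bytes
        dup.position(offset);
        dup.limit(offset + length);
        return dup.slice();
    }

    // Returns an independent copy of the same region (the analogue of submat followed by clone).
    static ByteBuffer copy(ByteBuffer frame, int offset, int length) {
        ByteBuffer out = ByteBuffer.allocate(length);
        out.put(view(frame, offset, length));
        out.flip();
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer frame = ByteBuffer.allocate(8);  // the reused "capture buffer"
        frame.put(2, (byte) 10);                    // pixel value in the first "frame"
        ByteBuffer queuedView = view(frame, 2, 4);
        ByteBuffer queuedCopy = copy(frame, 2, 4);
        frame.put(2, (byte) 99);                    // the next read overwrites the buffer
        System.out.println("view sees " + queuedView.get(0) + ", copy sees " + queuedCopy.get(0));
    }
}
```

The view reports the overwritten value (99) while the copy still holds the original (10). In the question's loop, the corresponding change would be to queue `frame.submat(rectangle).clone()`, or to read into a fresh `new Mat()` on every iteration, so that nothing already in the queue is touched by the next read from the VideoCapture.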

why on earth do you want multi-threading, then ?

that’s what multi-class detectors, like yolo are for.

you need one, (re-trained on the classes you want to detect), not many …

if you want to speed it up – load the network once, don’t create a new one per image
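
A rough sketch of that load-once shape, again in plain Java so it runs without OpenCV: the worker builds its model one time, then loops over the queue until it sees a stop marker. `FakeNet` is a hypothetical stand-in for a `Net` returned by `Dnn.readNetFromDarknet(cfg, model)`, and the `STOP` sentinel is an assumption about how shutdown would be signalled.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class LoadOnceWorker implements Runnable {
    // Hypothetical stand-in for an expensive model; counts how often it is constructed.
    static final class FakeNet {
        static final AtomicInteger loads = new AtomicInteger();
        FakeNet() { loads.incrementAndGet(); }
        String infer(int frame) { return "frame " + frame; }
    }

    static final int STOP = -1; // sentinel value that ends the worker loop
    private final BlockingQueue<Integer> queue;
    final AtomicInteger processed = new AtomicInteger();

    LoadOnceWorker(BlockingQueue<Integer> queue) { this.queue = queue; }

    @Override
    public void run() {
        FakeNet net = new FakeNet(); // built once per worker, not once per frame
        try {
            for (int frame = queue.take(); frame != STOP; frame = queue.take()) {
                net.infer(frame);
                processed.incrementAndGet();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag and exit
        }
    }

    // Feeds `frames` items plus the sentinel through one worker thread and waits for it to finish.
    static LoadOnceWorker runOnce(int frames) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        LoadOnceWorker worker = new LoadOnceWorker(queue);
        Thread thread = new Thread(worker, "Detection 1 Thread");
        thread.start();
        try {
            for (int i = 0; i < frames; i++) {
                queue.put(i);
            }
            queue.put(STOP);
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while feeding frames", e);
        }
        return worker;
    }

    public static void main(String[] args) {
        int before = FakeNet.loads.get();
        LoadOnceWorker worker = runOnce(5);
        System.out.println(worker.processed.get() + " frames, "
                + (FakeNet.loads.get() - before) + " model load(s)");
    }
}
```

Five frames go through a single model instance here. In the question's code the `Net` is created inside `detect()`, so every frame pays the full cost of reading the cfg and weights again.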

somehow, 90% of all cases, where ppl throw threads at it, are just “overengineering” …

I would expect nothing but segfaults and issues with that many threads.

basically, your problem is too complex to ask for free help. there is no “little fix” here because your code isn’t little.

you need someone to take your code apart entirely and redo it. that is what “MRE” means.

it’s work, really, only you can perform because only you can reproduce the issue. anyone else would have to first understand your goal, and your code, and then strip it bare so it becomes debuggable.

you appear unwilling to deconstruct that code and start fresh, with very small steps, to build a MRE.

MRE doesn’t mean “the entire program” as you wish it to be used. MRE means the absolute bare bones to provoke the issue, with every single line of code, every single token being vital.

I’ve moved over to JavaCV and the issue is gone, and I’m now getting detections at 1 ms rather than 20… Thanks for your help