Detecting shapes in an image

I am working on detecting objects (mostly geometric shapes) in an image using OpenCV.
It works for many use cases, but detection still needs improvement for shapes with a gradient fill.

Input image:

Detected contours:

Notice how the 2nd internal rectangle is not detected as a single contour. Any ideas on what can be done here?

Are you using grayscale input for your shape detection? Maybe the middle region has the same intensity as the background?

If that’s the case, you could use the union of the individually processed channels (e.g. the sum of the edge images of each individual channel) as input for your shape detection.


If you later also want to identify partly occluded or scattered shapes, have a look at “Gestalt Theory” rules of how to group things.


Thanks @Micka for your response.

Yes, a thresholded grayscale image is passed to cv2.findContours(). And yes, you guessed right: the middle region has the same intensity as the background.

If that’s the case you should/could use the union of all individual processed channels (e.g. the sum of edge images of each individual channel) as input for your shapes.

By edge image are you referring to the image generated using Canny Edge? Would be helpful if you could add more information on how this can be done.

I don't know what kind of algorithm you use to detect the shapes. Just adapt it so that it also uses color information, to distinguish different colors with the same brightness. One way might be to run your current algorithm on each of the 3 color channels individually and fuse the 3 results into a common result, either at the end or at an intermediate step.

For example, if your algorithm is:

  1. use Sobel magnitude on grayscale image => img_sobel
  2. threshold img_sobel => img_thresh
  3. find contours in img_thresh

Then you could adapt it by:

  1. use Sobel magnitude on R channel => r_sobel
  2. use Sobel magnitude on G channel => g_sobel
  3. use Sobel magnitude on B channel => b_sobel
  4. combine them, e.g. full_sobel = r_sobel + g_sobel + b_sobel, or full_sobel = max_per_pixel(r_sobel, g_sobel, b_sobel)
  5. threshold full_sobel => img_thresh
  6. find contours in img_thresh

But this is just an example. How to adapt it for color images depends on the algorithm you actually use.



Had been thinking for a while as to how to make the computer detect the gradient boundaries that a human eye can detect. Splitting the image was a passing idea. Thanks for your response, it made me explore this option.

The results are better when splitting the channels and running the detection on each one.

The current approach used:

  1. Use Canny - for each channel detect edges
  2. Combine edges, dilate, and detect contours.

Looking for ways to solve this use-case, not that simple though:

Original Image:

I think the outer gradient isn't a clear “shape” at all (it appears to be “open” on the right), and a human asked to trace it might give the same result you get.

Do you have assumptions about the shapes? Can you assume they are always rectangles? Then you could try to find the best line that closes an unfinished shape that doesn't fit your assumptions. You could also try to use “Gestalt theory” rules for that.

By the way, as far as I remember you can feed color images directly to Canny edge detection, but I don't know how the different channels are combined then. Might be better or worse.

Yes, it isn’t that straightforward. However, intuitively, an individual would guess it to be a rectangle.

The shapes could be of any type, not necessarily rectangles. Checked Gestalt theory, looks powerful. Any pointers on how to apply the theory in OpenCV for contour detection? Yet to find anything useful.

I doubt that. It depends on the task description and the context.

To my knowledge there is no full gestalt theory algorithm implemented and probably it’s not (yet) even possible, because of the human visual cortex complexity.

But you can get inspiration from it. For example, the rule of closure would tell you that the outer part isn’t closed, so you might want to search for a way to close it. You could use the raw Sobel magnitude (maybe the second-order derivative instead of the commonly used first-order derivative) and observe that there is a very soft border that would close that contour.

But this is very context dependent and it might be impossible to implement a “simple” general purpose algorithm for that.


For this kind of image, the second-order Sobel gradient works “ok”.

#include <opencv2/opencv.hpp>
#include <string>
#include <vector>

int main()
{
	cv::Mat img = cv::imread("C:/data/StackOverflow/gradientShapes.png");
	std::vector<cv::Mat> channels;
	cv::split(img, channels);

	// Second-order Sobel derivative in x and y for each color channel.
	std::vector<cv::Mat> sobelX;
	std::vector<cv::Mat> sobelY;
	for (size_t i = 0; i < channels.size(); ++i)
	{
		cv::Mat dstX, dstY;
		cv::Sobel(channels[i], dstX, CV_32FC1, 2, 0, 3, 1.0, 0.0, cv::BORDER_REFLECT);
		cv::Sobel(channels[i], dstY, CV_32FC1, 0, 2, 3, 1.0, 0.0, cv::BORDER_REFLECT);
		sobelX.push_back(dstX);
		sobelY.push_back(dstY);
	}

	// Sum the per-channel magnitudes into one response image.
	cv::Mat combined;
	for (size_t i = 0; i < sobelX.size(); ++i)
	{
		cv::Mat mag;
		cv::Mat angle;
		cv::cartToPolar(sobelX[i], sobelY[i], mag, angle);

		if (combined.empty()) combined = mag.clone();
		else combined = combined + mag;
		// Resizing is only for display on a small screen.
		cv::Mat res1; cv::resize(mag, res1, cv::Size(), 0.5, 0.5);
		cv::Mat res2; cv::resize(combined, res2, cv::Size(), 0.5, 0.5);
		cv::imshow("mag" + std::to_string(i) , res1);
		cv::imshow("combined", res2/5.0f);
		//cv::imshow("combined", res2 > 100);
	}

	// Threshold the combined magnitude at 100, 90, ..., 10 and save each mask.
	for (int i = 100; i > 1; i = i - 10)
	{
		cv::Mat mask = combined > i;
		cv::imwrite("C:/data/StackOverflow/shape_" + std::to_string(i) + ".png", mask);
		cv::resize(mask, mask, cv::Size(), 0.5, 0.5);
		cv::imshow("shapes", mask);
		cv::waitKey(0); // wait for a key press so each mask is actually displayed
	}
	return 0;
}


With that code you will get a set of images. At the first threshold of 100 on the combined magnitude you will already see the internal rectangles as closed contours, but the outer shape isn’t closed.

Going down to threshold 10 you will see the shape closed, but also a lot of internal lines.

Now the challenge is, how to combine that information, how to select the “right” pixels for the shapes and to ignore the “wrong” ones. Maybe you can use confidence scores and hierarchies and score high magnitudes high and score closedness and completeness high.

But in the end it won’t generalize well over all tasks and contexts.


The same code with a 3rd-order Sobel is better:

cv::Mat dstX, dstY;
cv::Sobel(channels[i], dstX, CV_32FC1, 3, 0, 5, 1.0, 0.0, cv::BORDER_REFLECT);
cv::Sobel(channels[i], dstY, CV_32FC1, 0, 3, 5, 1.0, 0.0, cv::BORDER_REFLECT);

With threshold 80:


Thanks for the detailed code snippet @Micka. Since I am currently using Python, I translated the snippet you posted. After resolving some basic issues, I am now trying to implement your masking step. The error below is thrown.

cv2.error: OpenCV(4.5.4) :-1: error: (-5:Bad argument) in function 'resize'
> Overload resolution failed:
>  - src data type = 0 is not supported
>  - Expected Ptr<cv::UMat> for argument 'src'

After masking, all values in the combined array become false. It would be great if you could clarify how to implement the masking portion of your algorithm.

The mask is just a thresholding. You can remove the resizing; it is only there for display on my small notebook screen.


Ok, @Micka understood. Commenting the resizing code helped check the mask image.

Finding ways to get the masked image working for contour detection. Current error:

cv2.error: OpenCV(4.5.4) :-1: error: (-5:Bad argument) in function 'findContours'
> Overload resolution failed:
>  - image data type = 0 is not supported
>  - Expected Ptr<cv::UMat> for argument 'image'

Edit: Resolved. This was a data type issue.



Any ideas on how the same can be implemented? Yet to come across anything out of the box in openCV that provides this.

Regarding duplicates:
Using this approach has increased the accuracy of detection; however, duplicate contours have also increased. I attempted to detect duplicates using contour measurements such as contourArea (with orientation), coordinates, perimeter, and angle of orientation.
However, sometimes non-duplicates are also removed. Any suggestions on eliminating duplicates?

Been experimenting with Sobel and Canny for edge detection.
The script you posted here, @Micka, gives better results. However, additional contours and duplicates are detected.
In some cases the duplicates are easy to detect and clean up, but in others an incorrect hierarchy makes duplicate filtering challenging.

Contour detection on image generated using Canny:

5 Contours detected (more accurate in this case)

Contour detection on image generated using third-order Sobel:

In the violet-colored circle, 2 contours are detected with the parent for both being 2 (the larger contour), whereas for the 5th contour the parent should be 4. Due to this, it becomes tricky to eliminate the duplicates - unless each and every contour is compared with the other.

Any ideas on how to make the contour detection more accurate?

You could do something like:
For each pair of contours:
compute intersection over union: iou
If iou is very high (90%?), they're duplicates.
