Symbol recognition is very poor, not sure why

Hi! I’m new to OpenCV and I followed this short tutorial for KNN character recognition: OpenCV 3 KNN Character Recognition C++ - YouTube
I switched out the image used for training for one with custom-drawn symbols. However, the accuracy is not very good.


This is my image for generating the xml file. Every time it gets a symbol wrong, I paste that symbol into the image. The “C” is super accurate, so I only needed to have it in there twice. But right now it keeps confusing the horizontal line with the circle, even though they are very distinct shapes.
I could keep putting more lines into the training data, but it does not seem to be helping very much. Is this normal? I’m not sure what else I can do to improve the accuracy. I’ve tried resizing the input image to different sizes, but nothing is helping.

So we have to watch a YouTube video to understand what you are trying to do? No way.

Please show your actual code / data.

The function for generating the xml file from the png is here: link
The function for then recognizing the symbol from an input Mat is here: link

I have not modified the code aside from changing the resized image width/height values and changing the valid characters in the intValidChars vector in the GenData function.

That repo hasn’t been maintained in six years, and OpenCV 3 was released when?

Symbol recognition is very poor

The methods applied here are definitely not state of the art.

  • cropping with contours
    Did you visualize what you get, e.g. from a | or - symbol? In the worst case you get a solid black box… (see the first sketch below this list)

  • using raw, ‘flattened’ pixels as ‘features’
    Cheap, but inaccurate.
    E.g. draw a small and a large circle on paper, cut both up into slices and compare them (manually, as a human!) – you won’t find many matching parts, even though they should be the same ‘class’!
    Classifying raw pixels fails under translation, rotation, shearing, and whatever other variation you can imagine (and which will happen with hand-drawn symbols!).
    Making ‘features’ from the pixels, like HOG, Haar, or LBP, was a (slightly better) attempt to overcome this, but all of it has been superseded by CNNs, which automatically learn the most appropriate features from the input data.

  • KNearest as the classifier
    Again, cheap (as in: easy for humans to understand) but inaccurate. An SVM, ANN, or decision trees would be better choices (see the second sketch below this list).

  • small training data set
    More is better (a small set overfits). If you try to classify a 30x20 = 600-pixel window, try to have 600+ training images.
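
To make the first point concrete, here is a minimal sketch (the file name and the 20x30 size are placeholders, not taken from your code) that steps through the detected contours and shows each crop next to what the plain resize turns it into. For a ‘-’ or ‘|’ the resized crop is mostly a filled box:

```cpp
#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    // path is a placeholder for your training sheet
    cv::Mat gray = cv::imread("training_symbols.png", cv::IMREAD_GRAYSCALE);
    if (gray.empty()) return 1;

    // white symbols on black background (threshold choice is just an example)
    cv::Mat bin;
    cv::threshold(gray, bin, 0, 255, cv::THRESH_BINARY_INV | cv::THRESH_OTSU);

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(bin.clone(), contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    for (const auto& c : contours) {
        cv::Rect box = cv::boundingRect(c);
        cv::Mat crop = bin(box);

        // this is roughly what the classifier sees after the plain resize:
        cv::Mat squished;
        cv::resize(crop, squished, cv::Size(20, 30));

        cv::imshow("crop", crop);
        cv::imshow("resized 20x30", squished);
        cv::waitKey(0);   // press a key to step to the next symbol
    }
    return 0;
}
```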
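
And to illustrate the second and third points, a rough sketch of how you could stay within OpenCV but swap the raw flattened pixels for HOG descriptors and KNearest for an SVM. All window, cell, and block sizes below are guesses you would have to tune:

```cpp
#include <opencv2/opencv.hpp>
#include <opencv2/ml.hpp>
#include <vector>

int main()
{
    // assumed to be filled with your 20x30 grayscale training crops and class ids
    std::vector<cv::Mat> samples;
    std::vector<int>     labels;
    if (samples.empty()) return 0;

    // HOG configured for a 20x30 window
    cv::HOGDescriptor hog(cv::Size(20, 30), cv::Size(10, 10),
                          cv::Size(5, 5), cv::Size(5, 5), 9);

    cv::Mat trainFeatures;
    for (const cv::Mat& s : samples) {
        std::vector<float> desc;
        hog.compute(s, desc);                                        // HOG instead of raw pixels
        trainFeatures.push_back(cv::Mat(desc, true).reshape(1, 1));  // one row per sample
    }

    // SVM instead of KNearest; trainAuto cross-validates C and gamma
    // (needs a reasonable number of samples per class)
    cv::Ptr<cv::ml::SVM> svm = cv::ml::SVM::create();
    svm->setType(cv::ml::SVM::C_SVC);
    svm->setKernel(cv::ml::SVM::RBF);
    svm->trainAuto(cv::ml::TrainData::create(trainFeatures,
                                             cv::ml::ROW_SAMPLE,
                                             cv::Mat(labels, true)));

    // at test time: compute HOG on the query crop the same way,
    // then call svm->predict() on that single feature row
    return 0;
}
```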


tl;dr:
Read books, take some ML courses, learn about CNNs
(and avoid (!!!) YouTube videos when looking for information!)

Alright, thanks for the pointers! Cropping isn’t an issue, since the code always resizes the image to a set size. But the horizontal lines do get squished into more of a box shape in the end, which could be a problem. Another potential issue with drawing symbols of varying sizes is that the pencil stroke width will be wider or narrower relative to the overall size.

I’ll try a larger training set to improve the recognition a bit for this project without having to replace the method entirely, since it’s on a strict deadline, but I will make sure to read up on the other classifiers and CNNs thoroughly and then do an ML course when I have time! :slight_smile:


That resize-to-a-set-size step is exactly what should be changed fundamentally.

The resize() turns everything into a 20x30 image. If it’s a vertical line, it gets turned into a solid rectangle. If it’s a horizontal line, it also gets turned into a solid rectangle (black, or white after the threshold with the INV flag).

You’ll need to resize while maintaining the aspect ratio, and pad the remainder using the CONSTANT border mode with a background value chosen appropriately.
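
Something along these lines (the 20x30 target and the 0 background value are assumptions taken from the discussion above, adjust to your pipeline):

```cpp
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cmath>

// resize a cropped symbol into a fixed window without distorting it:
// scale so it fits, then pad the rest with a constant background
cv::Mat resizeKeepAspect(const cv::Mat& roi, int targetW = 20, int targetH = 30)
{
    double scale = std::min((double)targetW / roi.cols, (double)targetH / roi.rows);
    int newW = std::max(1, (int)std::lround(roi.cols * scale));
    int newH = std::max(1, (int)std::lround(roi.rows * scale));

    cv::Mat scaled;
    cv::resize(roi, scaled, cv::Size(newW, newH), 0, 0, cv::INTER_AREA);

    int padX = targetW - newW;
    int padY = targetH - newH;

    cv::Mat out;
    cv::copyMakeBorder(scaled, out,
                       padY / 2, padY - padY / 2,
                       padX / 2, padX - padX / 2,
                       cv::BORDER_CONSTANT,
                       cv::Scalar(0));   // 0 = background after THRESH_BINARY_INV
    return out;
}
```

With that, a ‘-’ stays a thin, wide stroke centred in the 20x30 window instead of being stretched into a filled box, so it remains distinguishable from the circle.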