Hello everybody!
I want to build myself a program to solve a HASHTAG riddle. It is like Wordle and Mastermind wrapped into one, and you solve four of them together in a grid.
This is what it looks like:
Solving the puzzle is done already. If I enter the 16 letters and colors, my code solves the puzzle in a few seconds, using a dictionary to find legal words. The riddle shown is in German, by the way, in case you want to try solving it yourself!
Of course I do not want to enter that information manually. I want to hold my phone in front of my laptop camera and see the magic. This is where OpenCV enters the stage.
The picture above was taken by said laptop camera with these lines of code (this is C#):
private VideoCapture _capture;
_capture = new VideoCapture(0);
_capture.Set(VideoCaptureProperties.FrameWidth, 2048);
_capture.Set(VideoCaptureProperties.FrameHeight, 1536);
and
public Mat frame = new Mat();
_capture.Read(frame);
I then process the Mat object using
Cv2.CvtColor(input, gray, ColorConversionCodes.BGR2GRAY);
Cv2.GaussianBlur(gray, gray, new Size(3, 3), 0);
I build thresholds using
Cv2.AdaptiveThreshold(gray, thresh, 255, AdaptiveThresholdTypes.GaussianC, ThresholdTypes.BinaryInv, 11, 2);
I can provide pictures of the gray blurred image and the thresholded one, but as a new user I am allowed only one media item per post. I might reply to myself or to your answers with those other pictures.
Finally, I find contours, simplify them with ApproxPolyDP, and keep only four-corner shapes with an area of more than 500 pixels and a nearly square aspect ratio.
var contours = new Point[][] { };
HierarchyIndex[] hierarchy;
// External returns only the outermost contours
Cv2.FindContours(thresh, out contours, out hierarchy, RetrievalModes.External, ContourApproximationModes.ApproxSimple);
var candidates = new List<Rect>();
foreach (var contour in contours) {
    // Simplify the contour; epsilon is 2% of its perimeter
    var approx = Cv2.ApproxPolyDP(contour, 0.02 * Cv2.ArcLength(contour, true), true);
    if (approx.Length == 4 && Cv2.ContourArea(approx) > 500) {
        var rect = Cv2.BoundingRect(approx);
        float ratio = (float)rect.Width / rect.Height;
        // Keep only nearly square bounding boxes
        if (ratio > 0.75 && ratio < 1.25)
            candidates.Add(rect);
    }
}
This works somewhat. At best, 12 of the 16 boxes were recognized. With the example picture it caught only one of them; see the coloured picture with the green rectangle.
Don’t get me wrong - I am amazed at what OpenCV delivers here from just 10 commands. I am stunned by how powerful this is.
I should point out that the code above is 99.9% ChatGPT-generated. I have a good understanding of what it does, yet I am confident OpenCV offers better options that would give better results.
Can you help me find those? What I thought of so far:
- Increase resolution of webcam picture (if possible, need to check hardware)
- Lighting correction and colour amplification of the webcam picture
- Fiddling with the parameters of the Gaussian blur or the adaptive threshold
- Rotation or perspective transformation to get a flatter view of the smartphone screen
- Different methods of image processing between the steps I have so far
- Run the complete image processing workflow continuously instead of once on a single picture, then merge the results: for example the top 8 boxes on the first try, the left 5 on the second, nothing on the third and fourth, the right 8 on the fifth, and so on
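For the last idea, here is a minimal sketch of how the merging could work. It assumes the phone is held roughly still, so the same tile lands at about the same pixel position in every frame. Box and BoxAccumulator are hypothetical names of my own, not OpenCvSharp types; Box only stands in for Rect so the sketch runs without dependencies (it uses a record struct, so C# 10 or later). The tolerance and the minimum hit count are guesses that would need tuning.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for OpenCvSharp's Rect, so the sketch has no dependencies.
public readonly record struct Box(int X, int Y, int Width, int Height)
{
    public double CenterX => X + Width / 2.0;
    public double CenterY => Y + Height / 2.0;
}

// Hypothetical helper: collects candidate boxes over several frames.
public sealed class BoxAccumulator
{
    private readonly double _tolerance;

    // Running averages per distinct tile, plus the number of detections merged into it.
    private readonly List<(double Cx, double Cy, double W, double H, int Hits)> _tiles = new();

    public BoxAccumulator(double tolerance) => _tolerance = tolerance;

    // Feed the candidates of one processed frame.
    public void AddFrame(IEnumerable<Box> detections)
    {
        foreach (var box in detections)
        {
            // A detection belongs to a known tile if the centers are close enough.
            int i = _tiles.FindIndex(t =>
                Math.Abs(t.Cx - box.CenterX) <= _tolerance &&
                Math.Abs(t.Cy - box.CenterY) <= _tolerance);

            if (i < 0)
            {
                _tiles.Add((box.CenterX, box.CenterY, box.Width, box.Height, 1));
                continue;
            }

            // Update the running averages of center and size.
            var old = _tiles[i];
            int n = old.Hits + 1;
            _tiles[i] = ((old.Cx * old.Hits + box.CenterX) / n,
                         (old.Cy * old.Hits + box.CenterY) / n,
                         (old.W * old.Hits + box.Width) / n,
                         (old.H * old.Hits + box.Height) / n,
                         n);
        }
    }

    // Tiles seen at least minHits times, converted back to integer boxes.
    public List<Box> Result(int minHits) => _tiles
        .Where(t => t.Hits >= minHits)
        .Select(t => new Box(
            (int)Math.Round(t.Cx - t.W / 2),
            (int)Math.Round(t.Cy - t.H / 2),
            (int)Math.Round(t.W),
            (int)Math.Round(t.H)))
        .ToList();
}
```

The idea would be to call AddFrame with the candidates list after each processed frame (converting each Rect to a Box) and to stop once Result returns 16 boxes. Requiring more than one hit per tile should also filter out one-off false positives.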
The threshold picture looks so promising to me: it has clearly separated the 16 squares and the gaps between them. This is so close. I am confident this must be possible, right?
Thanks for any and all input - I love playing around with this toolkit!