Hello,
I’ve been working through some examples with OpenCV and feature matching and have hit a point where I’m frankly unsure of how to improve results.
Background:
My goal itself is pretty simple - given some game screenshots, I’d like to be able to extract meaningful information. There will be absolutely no rotation in the images, though there may be some scale variance if I try to scan for information using images at different resolutions.
This project is done in C# through EmguCV, but since my questions concern OpenCV itself - tuning the various detectors/extractors/matchers, or picking different components entirely - it should translate well enough.
These are the results I’ve been able to achieve so far:
imgur gallery - Flann
imgur gallery - Flann (blue matches)
imgur gallery - BFMatcher
I am working from these base images:
imgur gallery
This set consists of five images run against one model, each run with AGAST/FREAK, ORB, and STAR/BRIEF. Matches were detected via KnnMatch. In my initial testing, these combinations seemed to yield the highest number of keypoints landing in the correct ROI of the main image, relative to those detected in the model image. In general, STAR/BRIEF seems to be the most consistent, though the calculated homography in all of the AGAST/FREAK and ORB examples leads me to believe something is very wrong with the matching.
1: Given the goal - extracting information from images where rotation never needs to be considered - are there better options?
2: Given that these should always be same-plane, is there a way to constrain the results to a same-plane transform? To clarify - in some examples the calculated homography differs wildly from the expected same-plane result. A rough sketch of what I mean is below.
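To illustrate the constraint I have in mind: with no rotation and only mild scale variance, a similarity (partial affine) model should be sufficient, and OpenCV’s estimateAffinePartial2D seems to fit that description. This is only a hypothetical sketch - I haven’t verified that my EmguCV build exposes CvInvoke.EstimateAffinePartial2D with exactly this signature, and the point arrays stand in for the filtered match coordinates:

// Hypothetical sketch: estimate a 2x3 [s*R | t] transform instead of a full
// 3x3 homography, so the result cannot contain perspective terms.
// Assumes CvInvoke.EstimateAffinePartial2D exists in this EmguCV build.
private Mat EstimateSamePlaneTransform(PointF[] modelPoints, PointF[] observedPoints)
{
    using (var from = new VectorOfPointF(modelPoints))
    using (var to = new VectorOfPointF(observedPoints))
    using (var inliers = new Mat())
    {
        // RANSAC with a 3px reprojection threshold; the remaining values are the usual defaults
        return CvInvoke.EstimateAffinePartial2D(from, to, inliers,
            RobustEstimationAlgorithm.Ransac, 3, 2000, 0.99, 10);
    }
}

If that isn’t available, even estimating the full homography and rejecting results whose bottom row deviates noticeably from [0, 0, 1] would catch the obviously wrong cases.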
The parameters I’m currently using are derived from a mix of examples and trial and error - I’ve found very little documentation on what effect these parameters actually have, and have mostly inferred their behaviour from EmguCV’s API docs or from examples.
I am using these parameters:
Matcher:
DescriptorMatcher matcher = new FlannBasedMatcher(indexParams: new Emgu.CV.Flann.LshIndexParams(20, 10, 2), search: new Emgu.CV.Flann.SearchParams(checks: 50));
AGAST/FREAK:
Feature2D agastDetector = new AgastFeatureDetector(threshold: 15, nonmaxSuppression: true, type: AgastFeatureDetector.Type.AGAST_5_8);
Feature2D freakExtractor = new Freak();
ORB:
Feature2D orbDetector = new ORB(numberOfFeatures: 1500, scaleFactor: 1.6f, nLevels: 12, fastThreshold: 15, edgeThreshold: 0);
STAR/BRIEF:
Feature2D starDetector = new StarDetector(maxSize: 22, responseThreshold: 20, lineThresholdProjected: 15, lineThresholdBinarized: 8);
Feature2D briefExtractor = new BriefDescriptorExtractor();
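For reference, since FREAK, ORB, and BRIEF all produce binary descriptors, the brute-force alternative to the LSH-based FLANN matcher above would be a Hamming-distance BFMatcher - this is roughly what the BFMatcher galleries correspond to, though the exact construction shown here is an assumption on my part:

// Brute-force alternative for binary descriptors; Hamming distance is the
// appropriate metric for FREAK/ORB/BRIEF. Cross-checking is left off because
// it is generally only meaningful with k = 1, and KnnMatch is called with k = 2 here.
DescriptorMatcher bfMatcher = new BFMatcher(DistanceType.Hamming, false);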
The logic for feature matching is fairly straightforward and is just a cleaned-up adaptation of an EmguCV example:
/// <summary>
/// Match the given images using the given detector, extractor, and matcher, calculating and returning homography.
///
/// The given detector is used for detecting keypoints.
/// The given extractor is used for extracting descriptors.
/// The given matcher is used for computing matches.
///
/// Detection and matching will be done in two separate stages.
///
/// The Mat and Vector... properties of this result are unmanaged - it is assumed the caller will dispose results.
/// </summary>
/// <param name="featureDetector"></param>
/// <param name="featureExtractor"></param>
/// <param name="matcher"></param>
/// <param name="observedImage"></param>
/// <param name="modelImage"></param>
/// <returns></returns>
public MatchFeaturesResult MatchFeatures(Feature2D featureDetector, Feature2D featureExtractor, DescriptorMatcher matcher, Mat observedImage, Mat modelImage)
{
    using (UMat observedImageUmat = observedImage.GetUMat(AccessType.Read))
    using (UMat modelImageUmat = modelImage.GetUMat(AccessType.Read))
    {
        // Detect keypoints
        var observedImageKeypoints = featureDetector.Detect(observedImageUmat);
        var modelImageKeypoints = featureDetector.Detect(modelImageUmat);

        var observedDescriptors = new Mat();
        var modelDescriptors = new Mat();
        var observedKeypointVector = new VectorOfKeyPoint(observedImageKeypoints);
        var modelKeypointVector = new VectorOfKeyPoint(modelImageKeypoints);

        // Compute descriptors
        featureExtractor.Compute(observedImageUmat, observedKeypointVector, observedDescriptors);
        featureExtractor.Compute(modelImageUmat, modelKeypointVector, modelDescriptors);

        // Match descriptors
        matcher.Add(modelDescriptors);
        var matches = new VectorOfVectorOfDMatch();
        matcher.KnnMatch(observedDescriptors, matches, 2);

        // Filter matches based on ratio
        //matches = LowesFilter(matches);

        var mask = new Mat(matches.Size, 1, DepthType.Cv8U, 1);
        mask.SetTo(new MCvScalar(255));
        Features2DToolbox.VoteForUniqueness(matches, 0.8, mask);

        Mat homography = null;
        var nonZeroCount = CvInvoke.CountNonZero(mask);
        if (nonZeroCount >= 4)
        {
            nonZeroCount = Features2DToolbox.VoteForSizeAndOrientation(modelKeypointVector, observedKeypointVector, matches, mask, 1.5, 20);
            if (nonZeroCount >= 4)
            {
                homography = Features2DToolbox.GetHomographyMatrixFromMatchedFeatures(modelKeypointVector, observedKeypointVector, matches, mask, 2);
            }
        }

        var result = new MatchFeaturesResult(observedKeypointVector, observedDescriptors, modelKeypointVector, modelDescriptors, matches, mask, homography);
        return result;
    }
}
There is also a duplicate of this method which condenses detection and extraction into a single DetectAndExtract step for ORB.
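For completeness, a typical call site looks roughly like this - the variable names match the parameter snippets above, and the member names on MatchFeaturesResult are assumptions on my part:

// Rough usage sketch: wiring the STAR/BRIEF pair and the matcher above into
// MatchFeatures. observedImage/modelImage are Mats loaded elsewhere; the
// Homography property and Dispose call are assumed names for my result type.
var result = MatchFeatures(starDetector, briefExtractor, matcher, observedImage, modelImage);
try
{
    if (result.Homography != null)
    {
        // e.g. project the model corners into the observed image via
        // CvInvoke.PerspectiveTransform to get the matched ROI
    }
}
finally
{
    // The Mat/Vector members are unmanaged, so the caller disposes the result
    result.Dispose();
}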
I have tried applying Lowe’s ratio test, but it negatively impacted results - it’s entirely possible I misimplemented it.
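For reference, my understanding is that the ratio test over the k = 2 KnnMatch output should look roughly like the sketch below (the ToArrayOfArray/constructor round-trip is an assumption about the EmguCV vector API). It also occurs to me that, as far as I can tell, VoteForUniqueness already applies essentially this distance-ratio criterion with its 0.8 threshold, so running both filters may simply be double-filtering:

// Rough sketch of Lowe's ratio test over the k = 2 KnnMatch output. Keeps only
// the best match of each pair when it is clearly better than the runner-up;
// 0.7 is the commonly suggested starting threshold.
private static VectorOfVectorOfDMatch LowesFilter(VectorOfVectorOfDMatch matches, float ratio = 0.7f)
{
    MDMatch[][] pairs = matches.ToArrayOfArray();
    var kept = new List<MDMatch[]>();
    foreach (MDMatch[] pair in pairs)
    {
        // KnnMatch can return fewer than two candidates for a descriptor
        if (pair.Length >= 2 && pair[0].Distance < ratio * pair[1].Distance)
        {
            kept.Add(new[] { pair[0] });
        }
    }
    return new VectorOfVectorOfDMatch(kept.ToArray());
}

One thing I’m unsure about is how a filtered set with single-element inner vectors interacts with the later VoteForUniqueness call, since that step expects the two best matches per descriptor - that interaction alone might explain the degraded results.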
3: Is there anything obvious I could do with the above code especially regarding my usage of OpenCV (via EmguCV, I know) components to improve results?