How to identify near-duplicate features in an anchor image before doing feature matching?

When doing feature matching between an anchor image/model and a scene, some matches will be incorrect if the anchor image contains near-duplicate features.

Our current implementation uses the following approach, loosely inspired by the ratio test from the original SIFT paper by Lowe:

std::shared_ptr<ImageFeatures> PlanarDetectorTracker::filterNotUniqueFeatures(
    std::shared_ptr<ImageFeatures> features, float min_dist)
{
    cv::Ptr<cv::BFMatcher> matcher = cv::BFMatcher::create(cv::NORM_HAMMING, false);
    std::vector<std::vector<cv::DMatch>> matches;
    std::vector<cv::KeyPoint> key_points_filtered;
    cv::Mat descriptors_filtered;

    // Match the anchor descriptors against themselves. The 1st match is
    // always the descriptor itself (distance = 0), so the 2nd match is the
    // nearest *other* feature in the anchor image.
    matcher->knnMatch(features->descriptors(), features->descriptors(), matches, 2);

    if (!matches.empty()) {
        for (size_t i = 0; i < matches.size(); i++) {
            // if the 2nd best match is closer than min_dist the feature is discarded
            if (matches[i][1].distance > min_dist) {
                key_points_filtered.push_back(features->keypoints()[matches[i][1].trainIdx]);
                descriptors_filtered.push_back(features->descriptors().row(matches[i][1].trainIdx));
            }
        }
    }

    return std::make_shared<ImageFeatures>(key_points_filtered, descriptors_filtered);
}

By tuning the parameter (min_dist ≈ 110) we achieve improved matching performance, but I think there is room for improvement.

How can these near-duplicate features be removed in an effective manner?

You should add some images; it would help in understanding what is going on.

Why are you not directly using Lowe’s ratio test? It should help discard ambiguous matches. Also, in your case you are adding the second best match and not the closest match.
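For reference, a minimal OpenCV-free sketch of the ratio test this refers to; `Desc`, `hamming`, and the 0.75 threshold are illustrative assumptions, not code from this thread:

```cpp
#include <bitset>
#include <cassert>
#include <climits>
#include <cstdint>
#include <vector>

using Desc = std::vector<uint8_t>;  // stand-in for one binary descriptor row

// Hamming distance between two equal-length binary descriptors.
int hamming(const Desc& a, const Desc& b) {
    int d = 0;
    for (size_t i = 0; i < a.size(); ++i)
        d += std::bitset<8>(a[i] ^ b[i]).count();
    return d;
}

// Lowe's ratio test: for each query, accept the closest train descriptor only
// if it is clearly better than the runner-up. Returns the accepted train
// index per query, or -1 when the match is ambiguous.
std::vector<int> ratioTestMatch(const std::vector<Desc>& query,
                                const std::vector<Desc>& train,
                                float ratio = 0.75f) {
    std::vector<int> result;
    for (const Desc& q : query) {
        int best = -1, bestD = INT_MAX, secondD = INT_MAX;
        for (size_t j = 0; j < train.size(); ++j) {
            int d = hamming(q, train[j]);
            if (d < bestD) {
                secondD = bestD;
                bestD = d;
                best = (int)j;
            } else if (d < secondD) {
                secondD = d;
            }
        }
        result.push_back(bestD < ratio * secondD ? best : -1);
    }
    return result;
}
```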


I am in fact using Lowe’s ratio test when matching against a scene, and it successfully removes bad matches in most cases. All good there.

Perhaps I should clarify that this question is motivated by FPS, not by accuracy.
We’re running a real-time application on slow hardware, so we need the extra performance. Profiling revealed, unsurprisingly, that knnMatch() is computationally heavy. Since brute-force KNN matching runs in O(n*m), identifying and discarding near-duplicate anchor-image features once, up front, would improve FPS on every subsequent frame.

When matching anchor-image features against scene features, matches are often incorrect because near-identical features exist within the anchor image. These bad matches are filtered out by findHomography() with method=RANSAC, but by that point we have already run the heavy knnMatch().

The code I posted shows how we attempt to identify and remove those features. When matching the anchor image’s features against themselves, the 1st best match is always the feature itself (distance = 0), so a ratio test does not make sense here.
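The uniqueness check described above can be sketched in plain C++ (illustrative `Desc`/`hamming` helpers; note that this version keeps the query feature `i` itself rather than its neighbour):

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <climits>
#include <cstdint>
#include <vector>

using Desc = std::vector<uint8_t>;  // stand-in for one binary descriptor row

// Hamming distance between two equal-length binary descriptors.
int hamming(const Desc& a, const Desc& b) {
    int d = 0;
    for (size_t i = 0; i < a.size(); ++i)
        d += std::bitset<8>(a[i] ^ b[i]).count();
    return d;
}

// Keep feature i only if its nearest *other* descriptor in the same set is
// farther away than min_dist, i.e. the feature is unique within the anchor.
// Returns the indices of the features to keep.
std::vector<int> filterUnique(const std::vector<Desc>& descs, int min_dist) {
    std::vector<int> keep;
    for (size_t i = 0; i < descs.size(); ++i) {
        int nearest = INT_MAX;
        for (size_t j = 0; j < descs.size(); ++j)
            if (j != i)
                nearest = std::min(nearest, hamming(descs[i], descs[j]));
        if (nearest > min_dist)
            keep.push_back((int)i);
    }
    return keep;
}
```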

you should use FLANN, not BFMatcher


@crackwitz thank you for the suggestion. Will try that tomorrow.

However, I’m still interested in removing features that cannot be matched well, since this filtering can be precomputed.

Any suggestions?

Sorry I did not get that you are matching against the same image to remove “ambiguous/similar” keypoints.

Maybe you can find some inspiration with this: Local Features: from Paper to Practice

Example: 1st geometrically inconsistent nearest neighbor ratio (FGINN) strategy

look into “clustering” of feature descriptors. k-means perhaps. since I have no idea what you are doing or why, it’s a shot in the dark.
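One cheap stand-in for that clustering idea, sketched without OpenCV (illustrative `Desc`/`hamming` helpers; plain k-means averages descriptors, which is awkward for binary data, so a one-pass greedy "leader" clustering is used instead):

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <vector>

using Desc = std::vector<uint8_t>;  // stand-in for one binary descriptor row

// Hamming distance between two equal-length binary descriptors.
int hamming(const Desc& a, const Desc& b) {
    int d = 0;
    for (size_t i = 0; i < a.size(); ++i)
        d += std::bitset<8>(a[i] ^ b[i]).count();
    return d;
}

// Greedy "leader" clustering: any descriptor within `radius` of an existing
// cluster leader is treated as a near-duplicate; only leader indices are kept.
std::vector<int> leaderCluster(const std::vector<Desc>& descs, int radius) {
    std::vector<int> leaders;
    for (size_t i = 0; i < descs.size(); ++i) {
        bool duplicate = false;
        for (int l : leaders) {
            if (hamming(descs[i], descs[(size_t)l]) <= radius) {
                duplicate = true;
                break;
            }
        }
        if (!duplicate)
            leaders.push_back((int)i);
    }
    return leaders;
}
```

This runs in a single pass over the anchor features, so it fits the "precompute once at startup" requirement, at the cost of being order-dependent (the first feature of each near-duplicate group wins).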