Detecting a smartphone screen

Hi,
I’m trying to detect a smartphone screen in an image. I have control over the phone’s screen, so I can put a very visible pattern on the screen before capturing, which I would assume makes this doable.

First, it looks like the checkerboard calibration stuff might be a good fit for this, but it doesn’t appear to be in the JavaScript distribution (opencv.js). I’m looking at what it would take to add it, but if someone already knows how and can do it easily, that would be great.

Another idea I had was to use a colored checkerboard as the phone’s screen content, which seemed very distinctive, and then use backprojection with the histogram of that checkerboard to get a backprojection image containing just the phone’s content (from there, recovering the actual shape of the region seemed very solvable). But when I actually try it, it doesn’t pick up the colored checkerboard, I’m guessing because of the color of the ambient light. I’m using a 3D histogram, but I’ve tried various channel subcombinations and also tweaked the bin counts.

My other idea was to use SIFT or something like it to get matches, but it seemed like I’d still need to do some math to recover the actual content region, and before I go down that road I’d love some validation that it’s a good approach.

Thanks!

Some code in case it’s useful (tweaked slightly). It’s in the context of a React app, so I removed some of that, but the OpenCV and canvas code is verbatim:

const HISTOGRAM_BINS = 30;
const HISTOGRAM_CHANNELS = [0, 1, 2];
const HISTOGRAM_RANGES = [0, 180, 0, 256, 0, 256];

function hsvHistogramMat(srcMat: cv.Mat, dstHistMat: cv.Mat, hsvVec: cv.MatVector) {
      const mask = new cv.Mat();
      const rgbSrcMat = new cv.Mat();
      const hsvSrcMat = new cv.Mat();

      // Canvas image data is RGBA, but RGB2HSV expects a 3-channel input,
      // so strip the alpha channel first.
      cv.cvtColor(srcMat, rgbSrcMat, cv.COLOR_RGBA2RGB);
      cv.cvtColor(rgbSrcMat, hsvSrcMat, cv.COLOR_RGB2HSV);
      rgbSrcMat.delete();
      hsvVec.push_back(hsvSrcMat);

      const bins = HISTOGRAM_CHANNELS.map(_ => HISTOGRAM_BINS);

      cv.calcHist(hsvVec, HISTOGRAM_CHANNELS, mask, dstHistMat, bins, HISTOGRAM_RANGES);
      cv.normalize(dstHistMat, dstHistMat, 0, 255, cv.NORM_MINMAX);
      mask.delete();
}

function draw() {
      const width = videoRef.current.videoWidth;
      const height = videoRef.current.videoHeight;
      if(!width || !height) {
        return;
      }
      const srcMat = new cv.Mat(height, width, cv.CV_8UC4);
      const patternMat = new cv.Mat(height, width, cv.CV_8UC4);
      const canvasContext = canvasRef.current.getContext("2d")!;
      const debugCanvasContext = debugCanvasRef.current.getContext("2d")!;

      debugCanvasRef.current.width = width;
      debugCanvasRef.current.height = height;
      canvasRef.current.width = width;
      canvasRef.current.height = height;

      renderCanvas(canvasRef.current, width, height);
      patternMat.data.set(canvasContext.getImageData(0, 0, width, height).data);

      canvasContext.drawImage(videoRef.current, 0, 0, width, height);
      srcMat.data.set(canvasContext.getImageData(0, 0, width, height).data);

      canvasContext.imageSmoothingEnabled = false;
      debugCanvasContext.imageSmoothingEnabled = false;

      const srcHistogram = new cv.Mat();
      const srcHsvVec = new cv.MatVector();
      hsvHistogramMat(srcMat, srcHistogram, srcHsvVec);

      const patternHistogram = new cv.Mat();
      const patternHsvVec = new cv.MatVector();
      hsvHistogramMat(patternMat, patternHistogram, patternHsvVec);

      const backProjectionMat = new cv.Mat();
      cv.calcBackProject(srcHsvVec, HISTOGRAM_CHANNELS, patternHistogram, backProjectionMat, HISTOGRAM_RANGES, 1);

      cv.imshow(canvasRef.current, backProjectionMat); 
      cv.imshow(debugCanvasRef.current, srcMat);
      // drawHistogram(debugCanvasRef.current, srcHistogram, HISTOGRAM_BINS, width, height);
      // drawHistogram(canvasRef.current, roiHist, bins, width, height);

      srcMat.delete();
      srcHistogram.delete();
      srcHsvVec.get(0).delete();
      srcHsvVec.delete();

      patternMat.delete();
      patternHistogram.delete();
      patternHsvVec.get(0).delete();
      patternHsvVec.delete();

      backProjectionMat.delete();
}
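A note on the histogram setup above: dropping the value (brightness) channel and histogramming only hue and saturation is a common way to make backprojection less sensitive to ambient light. This is an untested suggestion, but it would just be a change to the constants:

```typescript
// Hue + saturation only; skipping V makes the histogram less sensitive
// to overall brightness / ambient light (assumption, untested here).
const HISTOGRAM_BINS = 30;
const HISTOGRAM_CHANNELS = [0, 1];          // H and S, no V
const HISTOGRAM_RANGES = [0, 180, 0, 256];  // hue 0..180, sat 0..256
```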

I’m always a fan of AR markers like ArUco because they’re less fidgety than checkerboards. Some people even abuse QR codes for AR, but they aren’t made for precise localization.

You could either put a square AR marker in the center and encode the display’s aspect ratio, or stretch the marker to fill the screen, skip the 3D pose estimation, and use only its four corners. Remember to keep the quiet zone (the white border); that’s vital for finding the marker.
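Using just the four corners amounts to solving a plane-to-plane homography, which is fully determined by four point correspondences. I believe opencv.js exposes cv.findHomography and cv.perspectiveTransform for this, but the 4-point case is small enough to sketch in plain TypeScript (all names below are mine; a sketch, not production code):

```typescript
type Pt = { x: number; y: number };

// Solve the 8x8 linear system of the 4-point homography (DLT),
// normalized so h33 = 1, via Gauss-Jordan elimination.
function homographyFrom4Points(src: Pt[], dst: Pt[]): number[] {
  const A: number[][] = [];
  for (let i = 0; i < 4; i++) {
    const { x, y } = src[i];
    const { x: u, y: v } = dst[i];
    // u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), likewise for v.
    A.push([x, y, 1, 0, 0, 0, -u * x, -u * y, u]);
    A.push([0, 0, 0, x, y, 1, -v * x, -v * y, v]);
  }
  for (let col = 0; col < 8; col++) {
    // Partial pivoting for numerical stability.
    let pivot = col;
    for (let r = col + 1; r < 8; r++)
      if (Math.abs(A[r][col]) > Math.abs(A[pivot][col])) pivot = r;
    [A[col], A[pivot]] = [A[pivot], A[col]];
    for (let r = 0; r < 8; r++) {
      if (r === col) continue;
      const f = A[r][col] / A[col][col];
      for (let c = col; c < 9; c++) A[r][c] -= f * A[col][c];
    }
  }
  return A.map((row, i) => row[8] / row[i][i]); // h0..h7
}

// Map a point through the homography (perspective divide included).
function applyHomography(h: number[], p: Pt): Pt {
  const w = h[6] * p.x + h[7] * p.y + 1;
  return {
    x: (h[0] * p.x + h[1] * p.y + h[2]) / w,
    y: (h[3] * p.x + h[4] * p.y + h[5]) / w,
  };
}
```

With the marker corners as src (say, the unit square in marker coordinates) and the detected corner pixels as dst, applyHomography then maps any screen-relative coordinate into image pixels.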

I don’t know whether the aruco module has been tested/prepared for transpilation to opencv.js, though.

When in doubt, DIY: draw the square/rectangle yourself, find it (“document scanner” tutorials, approxPolyDP), and for precise results throw the approximated contour away and calculate the edges from the full contour.
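One way to do that last refinement step: assign each contour point to the nearest side of the rough quad, fit a line through each side’s points, and intersect adjacent lines to get sub-pixel corners. A sketch of the fit-and-intersect part in plain TypeScript (helper names are mine; the point-assignment step is omitted):

```typescript
type Pt = { x: number; y: number };
// Line in implicit form a*x + b*y = c, with (a, b) the unit normal.
type Line = { a: number; b: number; c: number };

// Total-least-squares line fit: the line direction is the principal
// axis of the points' covariance; the normal is its perpendicular.
function fitLine(pts: Pt[]): Line {
  const n = pts.length;
  const mx = pts.reduce((s, p) => s + p.x, 0) / n;
  const my = pts.reduce((s, p) => s + p.y, 0) / n;
  let sxx = 0, sxy = 0, syy = 0;
  for (const p of pts) {
    sxx += (p.x - mx) ** 2;
    sxy += (p.x - mx) * (p.y - my);
    syy += (p.y - my) ** 2;
  }
  // Angle of the principal axis; normal is rotated 90 degrees.
  const theta = 0.5 * Math.atan2(2 * sxy, sxx - syy);
  const a = -Math.sin(theta), b = Math.cos(theta);
  return { a, b, c: a * mx + b * my };
}

// Intersection of two implicit lines via Cramer's rule.
function intersect(l1: Line, l2: Line): Pt {
  const det = l1.a * l2.b - l2.a * l1.b;
  return {
    x: (l1.c * l2.b - l2.c * l1.b) / det,
    y: (l1.a * l2.c - l2.a * l1.c) / det,
  };
}
```

Fitting all four sides this way averages out per-pixel contour noise, which is exactly why it beats taking approxPolyDP’s corners directly.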

Thanks for the suggestions.

I was talking to someone about aruco (which is in the JS bindings), but (and I didn’t mention this in my original post) the underlying problem is detecting the outline of a cracked screen. I’ll experiment, but they thought ArUco might have trouble with that level of noise.