Cropping an Image based on an existing cue in the image

I have the following input image:

and I’d like to crop it so that the output no longer contains the label and other irrelevant objects. The final image should have a fixed resolution of 1000x775. I’ve implemented the following:

import cv2
import numpy as np


def crop_plants(path):
    def getCrop(mask, frame):
        # Erode the mask to suppress noise; the kernel must be an array, not a (3, 3) tuple
        mask_Erode = cv2.erode(mask, np.ones((3, 3), np.uint8), iterations=1)

        # Find the leftmost corner point in the eroded mask
        corners = cv2.goodFeaturesToTrack(mask_Erode, 50, 0.01, 10)
        corners = corners.astype(int)
        min_X = float("inf")
        for i in corners:
            x, y = i.ravel()
            # cv2.circle(frame, (int(x), int(y)), 3, (130, 70, 255), 7)  # debug: draw the corner
            if min_X > x:
                min_X = x

        # Move x left until it hits the QR label, or give up after 11 steps of 30 px (330 px)
        counter = 0
        distance = 0
        min_X -= 100
        while True:
            (b, g, r) = frame[40, int(min_X)]
            # r >= 140 means min_X has reached the label, which contains only red and white
            if r >= 140:
                min_X += 40
                break
            elif distance >= 500:
                break
            elif counter == 11:
                break
            else:
                min_X -= 30
                distance += 30
                counter += 1
        return 0, int(min_X), int(0 + 775), int(min_X + 1000)
    video = cv2.VideoCapture(path)
    bgs = cv2.createBackgroundSubtractorMOG2(detectShadows=False)  # Background Subtraction
    # Index of the frame to work with; earlier frames might not be properly background-subtracted
    s_Frame = 7
    for i in range(s_Frame + 1):
        ret, frame = video.read()
        if frame is None:
            return 0
        bgs_Mask = bgs.apply(frame)
        if i == s_Frame:
            return getCrop(bgs_Mask, frame)

My approach works for many input images, but for some, like the one in this post, it fails because it returns a negative value for the crop coordinates:

In [1]: getCrop(bgs_Mask, frame)
Out[1]: (0, -121, 775, 879)

Is there a cleverer and less hard-coded way of doing this? I’m starting to feel that I rely on a lot of fixed numbers here and there to make it work.

UPDATE: I’ve uploaded a sample video file. sciebo

video/image sequence please.

and why does that thing seemingly exhibit motion blur?

I’ve uploaded the problematic video file. There is indeed motion blur, and it’s because when the images were compressed (to save space), the .MOV compression introduced that blur. I don’t think there is a way to revert that, sadly…

don’t blame compression. there simply was too little light in the scene to allow the camera to use short exposure times. since you need some depth of field, you’ll need smallish apertures, and that requires even more light.

“cropping” means taking a rectangular, axis-aligned subregion of a picture. judging from your use of a bluescreen, I doubt you want to “crop”.

what are you trying to accomplish? I’m not asking to be explained the code. put that aside. and I’m not asking for your chosen approach. I’m asking for the goal. I won’t voice my speculations for now.

The original resolution of the images was 4K, and thanks to the person who conducted the data collection and turned them into .MOV files, now I am stuck with 2K images with motion blur involved. I can say a lot of things but complaining about the past won’t affect the future, so I leave that aside and try to make things work with what I have.

My goal is to have the plant visible in the image and not much else. Because in the next block, this cropped image is fed into a DCNN which segments the plant beautifully. If the network is given an image that is not cropped (e.g. foreign objects are in play such as that label, metal pieces, cables, etc.) then the network misbehaves. I could solve that by introducing more training samples but that’s a terrible sweat work which I’m not willing to involve myself in.

BTW my approach works on many videos, and I could also add some “cropped” images here so that you get what I am after. But unfortunately the forum only allows me to upload one image.

For the record, it’s not terribly important if some parts of the plant get cut off, because the plant rotates and those cut-off regions are introduced in the consecutive images anyway.

then just

  • take a frame from the video
  • open in a photo editor (mspaint)
  • draw a white-on-black mask for the things you want gone
  • save it
  • use that mask to fill those regions of every video frame with some generic flat blue
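
The last step of that list could look roughly like the sketch below. It is only an illustration under a few assumptions: the mask has the same height and width as the frame, white (above 127) marks what should disappear, and the fill colour is pure blue in OpenCV's BGR order. The helper name `fill_masked` is made up for this example.

```python
import numpy as np


def fill_masked(frame, mask, color=(255, 0, 0)):
    """Return a copy of `frame` with every masked pixel set to `color` (BGR)."""
    if mask.shape[:2] != frame.shape[:2]:
        raise ValueError("mask and frame must have the same height and width")
    if mask.ndim == 3:
        # Mask saved as a colour image: any bright channel counts as "white".
        mask = mask.max(axis=2)
    out = frame.copy()          # leave the original frame untouched
    out[mask > 127] = color     # boolean indexing overwrites only masked pixels
    return out


# Tiny synthetic example: a 4x6 grey frame with a 2x2 block masked out.
frame = np.full((4, 6, 3), 90, dtype=np.uint8)
mask = np.zeros((4, 6), dtype=np.uint8)
mask[0:2, 0:2] = 255
cleaned = fill_masked(frame, mask)
```

With real data the mask would be loaded once, for instance with `cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)` (the file name is a placeholder), and then applied to each frame that comes out of `video.read()`.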

if you really wanted to crop, which will take away a good portion of the picture… you still open a frame in a photo editor, but now you do that to pick the coordinates and write them down.

numpy slicing to crop: subregion = picture[y0:y1, x0:x1]
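
For a fixed 1000x775 output, a small wrapper around that slice can keep the window inside the picture. This is a sketch, not the asker's method: `crop_fixed` is a hypothetical helper, and the 2048x1080 frame size is only an assumed stand-in for the "2K" footage. Note that a negative start index counts from the right edge in NumPy, so on a frame this wide `picture[0:775, -121:879]` comes back empty instead of raising an error.

```python
import numpy as np


def crop_fixed(picture, x0, y0, width=1000, height=775):
    """Crop a width x height window, shifting it so it stays inside the picture."""
    img_h, img_w = picture.shape[:2]
    if width > img_w or height > img_h:
        raise ValueError("crop window is larger than the picture")
    # Clamp the top-left corner so that neither edge falls outside the picture.
    x0 = min(max(int(x0), 0), img_w - width)
    y0 = min(max(int(y0), 0), img_h - height)
    return picture[y0:y0 + height, x0:x0 + width]


# Assumed frame size; the x0 value is the negative one from the question.
picture = np.zeros((1080, 2048, 3), dtype=np.uint8)
subregion = crop_fixed(picture, x0=-121, y0=0)
```

Clamping shifts the window instead of shrinking it, so the result always has the requested size, at the cost of the window no longer starting exactly where the detected cue says it should.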

the motion blur definitely does not come from any postprocessing but from how the camera was set (exposure time/shutter speed).