How to find a generic straw segmentation method?

I am working on a project to find all straws in images. My approach is to use thresholding, morphology, and connected components to get the contours; see the following code and attached figures. It is easy to tune the parameters to work for a specific image, but I have thousands of images, and it is hard to find one universal set of parameters. I did try Histogram Equalization/CLAHE, but it is still hard to find a model that fits all. Are there any other techniques or trained models I can try? Any suggestions or comments are welcome! Thanks!

import cv2
import numpy as np


def segment(im_bgr, k=3, open_i=2, dilate_i=8):
    im_draw = im_bgr.copy()
    im_gray = cv2.cvtColor(im_bgr, cv2.COLOR_BGR2GRAY)

    # Remove noise and get the foreground
    ret, thresh = cv2.threshold(
        im_gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU
    )
    kernel = np.ones((k, k), np.uint8)
    opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=open_i)
    sure_bg = cv2.dilate(opening, kernel, iterations=dilate_i)
    msk_interest = ~sure_bg

    # Marker labelling
    ret, im_marker = cv2.connectedComponents(msk_interest)

    # Draw bounding boxes
    im_marker = np.uint8(im_marker)
    contours, h = cv2.findContours(im_marker, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    for contour in contours:
        rect = cv2.minAreaRect(contour)
        box = cv2.boxPoints(rect).astype(int)
        cv2.drawContours(im_draw, [box], -1, (0, 255, 0), 1)
    return im_draw

why do you need to find straws, what’s this leading up to? maybe we can find another solution, if we knew what the real goal was.

I expect deep learning to be the only viable solution for these (cluttered!) pictures and your stated goal so far. you’ll need labeled training data. you could try instance segmentation. or you could train a detector that predicts bounding boxes, or coordinates of the endpoint pairs.

Hi, Crackwitz,
Thank you for your suggestion! The lengths & endpoints of the straws are good predictors for our business logic. Yes, I have tried Mask R-CNN to generate the instance segmentation. I trained the model on 100 images, but the result is not good enough. I guess I need to label more images and tune the hyper-parameters. I was wondering whether this problem can be solved without deep learning? If yes, we can save the effort of labeling data and training the deep learning models. Thank you so much!

how much trouble are you willing to go to?

you could see about getting depth/RGBD data.

are you willing to try monochromatic light of various wavelengths? does straw react markedly to any particular ranges of light?

again… how about you explain what the goal is, the ultimate goal? we can’t guess here. and your chosen solution might be unsuitable. these pictures certainly are.

Another question: is it possible to clean up the scene? There are a lot of small particles that can cause over-segmentation.

Hi, Kbarni,
Thank you for your suggestion! It is easy to clean up the small chaff/particles in the 1st- and 2nd-row images, but it is challenging in the 3rd-row image, where many small particles are next to each other and on top of the big straws. Removing them may break the boundaries of the big straws, which leads to wrong estimates of the straw lengths. Are there other clean-up techniques I can try? Thanks!

ah, great. that’s what I was wondering.

they’re making you solve the problem with “magic”. if you want a real measurement, this won’t do.

if they insist on magic and are willing to tolerate very approximate “measurements”, you could train a network on a ton of data of known length straw, train it to predict the known length. then you can present it with straw within that range and it should tolerably predict the length value you trained it to predict.

it should be easy to collect this data. engage a bunch of farmers, have them set the chopper blades a particular way (in lots of increments), run the thing through the field and collect video continuously, then pick the bales apart, measure by hand what actually came out, and there you have your training data.

you should feel free to push back and tell them this needs physical changes if they want a real measurement. if they insist on a magical solution, ask them to double your salary. they expect you to be proficient in deep learning, and I am guessing that you aren’t. since they need you to be, they can pay you for the skillset they demand (and the necessary training).

a proper engineering solution (if you want a real measurement) would be separating this straw into individual pieces. industrially, one would throw this onto a fast-moving conveyor, which by its speed difference vs the oncoming material, physically separates individual pieces so they’re easier to grasp visually. now they’re also mostly in one plane (on the conveyor or flying through the air) so you can rather precisely measure each piece.

any “segmentation” would only be part of the “proper engineering” solution. the “magic” solution doesn’t use any of that, it directly predicts the straw’s length from the picture.

RGBD/depth means depth cameras such as a Kinect or a stereo pair. ignore that, that’s going down a path that’s not feasible.

I really appreciate your valuable suggestions. Thank you so much! Have a nice day.

you know, if you have a data set (labeled with “straw length” of course), you could host a competition on Kaggle Competitions

a possible DNN might not just predict a single value for a picture but a histogram of values. I see some of your pictures contain differently sized pieces. that’s probably useful information. and this gives the network (and the training) more subtle ways of judging performance, and you as the trainer better ways of finding mislabeled data and correcting it.
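Building such a histogram target from hand-measured lengths is trivial — a sketch, where the bin edges are entirely made up and would come from whatever length ranges matter to the business logic:

```python
import numpy as np


def length_histogram(lengths_mm, bins=(0, 10, 20, 40, 80, 160)):
    """Normalized histogram of measured piece lengths; a possible
    per-image training target instead of a single length value."""
    hist, _ = np.histogram(lengths_mm, bins=bins)
    total = hist.sum()
    return hist / total if total else hist.astype(float)
```

A network trained against such targets would be scored on the whole distribution (e.g. with a cross-entropy or earth-mover's loss), which also makes mislabeled images easier to spot as outliers.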