Using Raspberry Pi, OpenCV, and YOLO to Determine Weights of Irregular Objects

I am working on a system on a Raspberry Pi that uses a YOLO instance segmentation model to classify different foods and mask them with OpenCV. I then want to use the detected classes to find the weight of each food and add it to a running total. The camera will most likely face straight down, so I am curious what the best way to find the depth of the food is. Currently, my code uses only the 2D mask: it takes the top-down mask of each object and estimates the weight from the area of that mask. This isn’t accurate because it ignores depth, and I need the system to be as accurate as possible.
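The area-only estimate described above could be sketched roughly as follows. This is a minimal illustration, not the actual code: the function names, `CM2_PER_PIXEL`, and the `GRAMS_PER_CM2` table are hypothetical placeholders, and each mask is assumed to be a boolean NumPy array taken from the segmentation output.

```python
import numpy as np

# Hypothetical calibration values, for illustration only.
CM2_PER_PIXEL = 0.01  # assumed real-world area covered by one pixel at plate height
GRAMS_PER_CM2 = {     # assumed average weight per unit of top-down area, per class
    "scrambled_eggs": 1.2,
    "tater_tots": 0.9,
}


def weight_from_mask(mask: np.ndarray, class_name: str) -> float:
    """Estimate weight in grams from a top-down binary mask (area only, no depth)."""
    pixel_count = int(np.count_nonzero(mask))
    area_cm2 = pixel_count * CM2_PER_PIXEL
    return area_cm2 * GRAMS_PER_CM2[class_name]


def total_weight(detections) -> float:
    """Sum the estimates over (class_name, mask) pairs found in one frame."""
    return sum(weight_from_mask(mask, name) for name, mask in detections)
```

Two piles with the same footprint but different heights get the same estimate here, which is exactly the inaccuracy in question.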

What are some possible low-cost yet effective solutions I could use to find the volume of foods, not just the area? There could be multiple foods on one plate in the frame, and they would all have different shapes and sizes. They will most likely be breakfast foods: scrambled eggs, tater tots, French toast, etc.

you can’t tell weight or volume from appearance.

if you need to know the volume, just run the whole thing through an MRI. I hope your food doesn’t contain bones or metal.

or… weigh the plates. I’m sure that’s cheap and simple.

if each food item has a known weight, just count the items. if there’s a glob of mashed potatoes, or a steak hidden under that, good luck.
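The counting idea could look something like this. It is only a sketch under assumptions: `GRAMS_PER_ITEM` and its values are made up for illustration (real numbers would come from weighing sample items), and `detected_classes` is assumed to be one class name per detected instance.

```python
from collections import Counter

# Hypothetical per-item weights in grams, for illustration only.
GRAMS_PER_ITEM = {
    "tater_tot": 8.0,
    "french_toast_slice": 60.0,
}


def weight_by_count(detected_classes):
    """Return (total grams for countable items, classes that cannot be counted)."""
    counts = Counter(detected_classes)
    total = 0.0
    uncounted = []
    for name, n in counts.items():
        if name in GRAMS_PER_ITEM:
            total += n * GRAMS_PER_ITEM[name]
        else:
            # Amorphous foods (e.g. a pile of eggs) have no per-item weight.
            uncounted.append(name)
    return total, sorted(uncounted)
```

Anything that lands in the second return value, or anything hidden underneath another item, still needs a different method such as a scale.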