I am currently working on an interactive installation project that requires real-time measurements of human proportions. The idea is to capture visitors’ images with their arms extended in front of a high-resolution camera, in controlled lighting conditions, and immediately process these images to extract measurements such as arm span, leg length, torso size, etc.
I’m seeking guidance on the best practices or existing solutions within the OpenCV framework to achieve the following:
Rapid Image Processing: The system needs to capture an image and process it in real time (ideally within a few seconds) to keep the interactive experience fluid.
Accurate Measurement Extraction: We need to measure specific body proportions from the images. The accuracy of these measurements is critical as they will be compared to idealized proportions.
Streamlining for Speed: Any tips on optimizing the processing flow to ensure the shortest possible delay between image capture and displaying measurements would be greatly appreciated.
Does anyone have experience with or can recommend efficient methods for achieving this kind of real-time image analysis and measurement with OpenCV? Are there any particular algorithms, tools, or approaches within OpenCV that are well-suited for this task?
Any help or pointers you can provide would be invaluable, as I am aiming to ensure the installation is both engaging and educational, with minimal wait times for participants.
OpenCV can help with some of it, but you should also incorporate other libraries that are better at specific things.
It’s not all about software; you need to think about hardware as well.
First, you’ll need something that captures the geometry. That means stereo vision with two or more cameras that you calibrate yourself, a commercially available RGB-D/depth sensor such as a Kinect (or whatever is good these days), or a 3D lidar scanner.
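To make "captures the geometry" concrete, here is a minimal sketch of how one pixel of a depth image turns into a 3D point under the standard pinhole camera model. The intrinsics (FX, FY, CX, CY) are made-up values for illustration; a real sensor reports its own, and lens distortion is ignored here.

```python
from typing import Tuple

Point3 = Tuple[float, float, float]


def deproject(u: float, v: float, depth_m: float,
              fx: float, fy: float, cx: float, cy: float) -> Point3:
    """Back-project pixel (u, v) with depth in metres to a 3D point
    in the camera frame, using the pinhole model (no distortion)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Hypothetical intrinsics for a 640x480 depth image (assumed values).
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0

# A pixel at the principal point lies on the optical axis.
p = deproject(320.0, 240.0, 2.0, FX, FY, CX, CY)

# A pixel 300 px to the right, at the same depth of 2 m.
q = deproject(620.0, 240.0, 2.0, FX, FY, CX, CY)
```

Once every joint of interest is a 3D point in metres, measuring between joints is plain Euclidean distance, with no separate pixel-to-cm calibration step.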
Extracting measurements requires you to fit a model of a human body to the scan. That’s pose estimation, and MediaPipe seems popular these days for that. Relative to the fitted pose, you can then figure out where to measure parts of the captured geometry.
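As a rough sketch of the "measure relative to the fitted pose" step: assuming the pose estimator plus the depth data give you named joints as 3D points in metres, limb lengths are sums of distances along a chain of joints. The joint names, the `path_length` helper, and the coordinates below are all hypothetical and not taken from any particular library.

```python
import math
from typing import Dict, Sequence, Tuple

Point3 = Tuple[float, float, float]


def path_length(joints: Dict[str, Point3], names: Sequence[str]) -> float:
    """Sum of straight-line distances along a chain of named joints."""
    return sum(math.dist(joints[a], joints[b])
               for a, b in zip(names, names[1:]))


# Made-up joint positions in metres (camera frame), arms held out sideways.
joints: Dict[str, Point3] = {
    "left_wrist": (-0.80, 1.40, 2.0),
    "left_elbow": (-0.50, 1.40, 2.0),
    "left_shoulder": (-0.20, 1.40, 2.0),
    "right_shoulder": (0.20, 1.40, 2.0),
    "right_elbow": (0.50, 1.40, 2.0),
    "right_wrist": (0.80, 1.40, 2.0),
    "left_hip": (-0.15, 0.90, 2.0),
    "left_knee": (-0.15, 0.50, 2.0),
    "left_ankle": (-0.15, 0.10, 2.0),
}

ARM_CHAIN = ["left_wrist", "left_elbow", "left_shoulder",
             "right_shoulder", "right_elbow", "right_wrist"]
LEG_CHAIN = ["left_hip", "left_knee", "left_ankle"]

wrist_span_m = path_length(joints, ARM_CHAIN)  # wrist to wrist via shoulders
left_leg_m = path_length(joints, LEG_CHAIN)    # hip to ankle via knee
```

Note that this measures between joint centres, so wrist to wrist is shorter than a true fingertip-to-fingertip arm span; surface measurements such as torso size would need the captured geometry itself rather than the skeleton alone.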
If your time requirement is on the order of a second, don’t worry too much. You may want to execute various parts of this on a GPU, so an NVIDIA Jetson of some type, or anything with a suitably powerful GPU. The neural network (pose estimation) is supposed to be “light” enough for video anyway if executed on a GPU. The geometry processing itself isn’t that costly. Getting the geometry can be costly if you need to use vision for it; lidar has no such cost.
The augmented reality aspect is fascinating, and it’s clear that a combination of software and hardware considerations is crucial. Actually, I am not sure whether I need a depth camera. I have some Kinect hardware, but I don’t know how accurately I can get arm, leg, and body measurements from it. I was thinking about using a 2D image or video (from a regular camera) and measuring the limbs and body from that. Of course, players would stand at a specific location and would be told to hold their arms open.
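A minimal sketch of the 2D idea, under strong assumptions: everyone stands in the same plane, that plane is roughly parallel to the camera sensor, and a reference marker of known length sits in that plane to give a cm-per-pixel scale. The function names and pixel coordinates are hypothetical, and lens distortion is ignored.

```python
import math
from typing import Tuple

Pixel = Tuple[float, float]


def cm_per_pixel(ref_a: Pixel, ref_b: Pixel, ref_length_cm: float) -> float:
    """Scale factor from a reference object of known length that lies
    in the same plane as the person being measured."""
    ref_px = math.dist(ref_a, ref_b)
    if ref_px == 0:
        raise ValueError("reference endpoints must be distinct")
    return ref_length_cm / ref_px


def length_cm(a: Pixel, b: Pixel, scale: float) -> float:
    """Distance between two image points, converted to centimetres."""
    return math.dist(a, b) * scale


# Assumed example: a 100 cm marker on the floor line spans 500 px.
scale = cm_per_pixel((100.0, 900.0), (600.0, 900.0), 100.0)

# Assumed wrist positions in the same image, 850 px apart.
span_cm = length_cm((260.0, 400.0), (1110.0, 400.0), scale)
```

The scale only holds for points in the reference plane: a hand held 20 cm closer to the camera looks larger and gets over-measured, which is the main accuracy risk of the 2D route compared with a depth sensor.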
I’m interested in the feasibility of using 2D imaging for measurements by capturing a photograph and analyzing it post-capture. This could potentially simplify the setup compared to real-time video processing.
I have never used MediaPipe. Could I really get measurements in cm using it, or only a pose estimate?
As you can see, I am still completely unsure whether I should use a regular camera, an Azure Kinect, a lidar, etc. I am quite new to this, and I don’t know what would get my installation working easily and in near real time for the people using it.
Also, regarding the processing power needed, I’m glad to hear that GPU-intensive operations may not be overly demanding for this application. I’m evaluating whether a dedicated GPU setup, like NVIDIA Jetson, would be beneficial or if simpler hardware could achieve the desired results effectively.
Any further suggestions or alternative approaches for 2D-based measurements would be greatly appreciated, especially any considerations I might be overlooking in an interactive installation context.