Help understanding a sample code - samples/dnn/openpose.cpp

Hi all!

I have downloaded this sample code: samples/dnn/openpose.cpp
and I also managed to make it work and even added some functions of my own.

But I don’t really understand all of the lines written in that code, and I would be happy if someone could help me.

What do these lines mean?

"{ p proto          |           | (required) model configuration, e.g. hand/pose.prototxt }"
        "{ m model          |           | (required) model weights, e.g. hand/pose_iter_102000.caffemodel }"
        "{ i image          |           | (required) path to image file (containing a single person, or hand) }"
        "{ d dataset        |           | specify what kind of model was trained. It could be (COCO, MPI, HAND) depends on dataset. }"
        "{ width            |  368      | Preprocess input image by resizing to a specific width. }"
        "{ height           |  368      | Preprocess input image by resizing to a specific height. }"
        "{ t threshold      |  0.1      | threshold or confidence value for the heatmap }"
        "{ s scale          |  0.003922 | scale for blob }"

What is the proto and why do I need it?
What are model weights?
Why isn’t the dataset alone enough?
Why do I need to resize the image, and what are scale and blob?

Thank you for your time,
Kobi.

there are pretrained Caffe models for body, hand or face detection, and those come as a weights file (.caffemodel, the learned parameters) and a layer description file (.prototxt, the network structure).

in the end, you need to tell the program where those files are…

the models were also trained on different datasets (MPI, COCO), and internal sizes differ, so you need to specify that, too.
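to make the "internal sizes differ" part concrete, here is a small sketch of the kind of lookup the dataset flag drives. the function name is made up for illustration, and the per-dataset counts are what I recall from the sample (number of heatmaps it reads for each model), so please check them against your copy of openpose.cpp:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of OpenCV or the sample): returns how many
// keypoint heatmaps to read for a given dataset name, or -1 if unknown.
// ASSUMPTION: the counts below are recalled from samples/dnn/openpose.cpp
// and should be verified against the version you downloaded.
int partsForDataset(const std::string& dataset)
{
    assert(!dataset.empty());           // the sample expects a dataset name
    if (dataset == "COCO") return 18;   // body model trained on COCO
    if (dataset == "MPI")  return 16;   // body model trained on MPI
    if (dataset == "HAND") return 22;   // hand model
    return -1;                          // unknown dataset name
}
```

so the same network output has to be read differently depending on which model you loaded, and the program cannot guess that from the weights file alone.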

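you don't need to resize it manually; width and height are params for the dnn::blobFromImage() function, which does the resizing for you.
(you can even vary the size, as long as both W&H are a multiple of 16)
it also normalizes the image (subtracts a mean, then multiplies by a scale factor), that’s what the scale param is for. the default 0.003922 is roughly 1/255, which maps 8-bit pixel values into the 0..1 range. the "blob" is simply the resulting 4-D array that gets fed to the network.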
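a minimal sketch of the arithmetic involved, in plain C++ without OpenCV. both helper names are invented for illustration, they only mimic the per-pixel step (value - mean) * scale and the multiple-of-16 size rule mentioned in this thread, and are not how OpenCV implements blobFromImage internally:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: applies (pixel - mean) * scale to every value,
// which is the per-pixel normalization described above.
std::vector<float> normalizePixels(const std::vector<unsigned char>& pixels,
                                   float mean, float scale)
{
    std::vector<float> out;
    out.reserve(pixels.size());
    for (unsigned char p : pixels)
        out.push_back((static_cast<float>(p) - mean) * scale);
    return out;
}

// Hypothetical helper: rounds a requested width or height up to the
// next multiple of 16 (values that already are multiples stay unchanged).
int roundUpToMultipleOf16(int v)
{
    assert(v > 0);                 // sizes must be positive
    return ((v + 15) / 16) * 16;
}
```

with mean 0 and scale 1/255, a pixel value of 0 stays 0 and 255 becomes about 1, and the default 368 is already a multiple of 16, so roundUpToMultipleOf16(368) leaves it alone.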