Clarification on DNN automatic input transposition

I’m currently working on a configurable input/output pipelining system for neural networks so we can run inference on arbitrary models with OpenCV and handle any pre/post processing without hardcoding in C++.

Reading around, I’ve run into some conflicting information about NCHW and NHWC tensor formats. I know OpenCV uses NCHW and TensorFlow uses NHWC.

From what I understand OpenCV deals with this conflict by automatically inserting a transposition at the head of the graph (or possibly before each convolution?). What I don’t understand is how the decision to transpose is made.

I know that dnn::readNet() dispatches to dnn::readNetFromTensorflow() if the file has a “.pb” or “.pbtxt” suffix. Is the NHWC → NCHW transposition done based on this simple filename check, or is there some more complex logic?
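
For reference, this is the sort of dispatch I mean (the file name is just a placeholder):

```cpp
#include <opencv2/dnn.hpp>

int main()
{
    // My understanding: readNet() selects the TensorFlow importer from the
    // ".pb"/".pbtxt" suffix, so these two calls should end up in the same place.
    cv::dnn::Net a = cv::dnn::readNet("frozen_graph.pb");
    cv::dnn::Net b = cv::dnn::readNetFromTensorflow("frozen_graph.pb");
    return 0;
}
```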

From my experiments it appears that a TensorFlow “Frozen Graph” converted to ONNX will still be in NHWC format. Is OpenCV able to detect the tensor shape of an ONNX model and insert the appropriate transpositions?

I’d appreciate it if anyone could clarify where the transpositions are inserted into the graph (i.e. just at the front, or before every convolution?) and what the condition for automatic transposition is (i.e. file extension, graph inspection, etc.).

Thanks


a nice “pipe dream” (pun intended !)

there is NO SUCH THING !
you’re chasing a chimera…

IF your network requires NCHW input, you can use the convenient blobFromImage() function to feed your data into Net::setInput()
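
e.g. a minimal sketch of that path (model name, input size and scaling are placeholders, look up the right values for your net):

```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>

int main()
{
    cv::dnn::Net net = cv::dnn::readNet("some_model.onnx");   // placeholder model
    cv::Mat img = cv::imread("input.jpg");                    // placeholder image

    // blobFromImage() hands you a 1x3xHxW (NCHW) float blob:
    // scaling, resize, mean subtraction and BGR->RGB swap in one call
    cv::Mat blob = cv::dnn::blobFromImage(img, 1.0 / 255.0, cv::Size(224, 224),
                                          cv::Scalar(), /*swapRB=*/true, /*crop=*/false);

    net.setInput(blob);
    cv::Mat out = net.forward();
    return 0;
}
```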

if it is not so, you’re on your own, simply put.
no magic or automation happening under the hood here.
(there are indeed a lot of networks with a different order, more or fewer than 4 dimensions (action rec., text), or even more than one input)

again, NO, that is your job, as a human,
you’ll have to read the code that generated it to find out the proper dimensions, whether they’re “fixed” or not, the order, etc.

last, take a word from an old man here:
don’t try to build a one-size-fits-all framework, it will break at the next bend…

Sorry, maybe I’m misunderstanding the situation here. I’ve seen discussions of this elsewhere, and in my testing I’ve noticed that NHWC TensorFlow graphs appear to be converted transparently.

I’m not familiar with the OpenCV codebase, so I might be misunderstanding or looking at dead code, but “dnn/src/tensorflow/tf_importer.cpp” appears to check the graph for layout metadata somewhere around here.

I think I partially answered my own question here: OpenCV inspects the graph (and then later assumes NHWC). I’ll look into the source for the other graph importers and see what I can find.

FWIW, I understand that a catch-all framework is a fool’s errand, but we need to run a few different kinds of models, and being able to set up a pipeline of matrix transforms etc. simplifies deploying them.


oook, so I’m partly proven wrong (at least for importing tf (also caffe?) graphs)
still, no such magic in the network code, afaik.

Yeah, I assume the actual layer implementations are the same regardless of the graph source. I’m just trying to get a grasp on the invisible witchcraft happening during import.

Manipulating input to a spec is easy enough; the problem is figuring out how that spec does or doesn’t change after OpenCV’s fiddling. e.g. the TensorFlow → ONNX conversion turns an NHWC graph into an NCHW one, but inserts a single transposition layer at the front of the network so that it still takes data in NHWC format. OpenCV appears to read that as NCHW, and I’ll need to account for that transpose myself.
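
To make that concrete, here’s a minimal sketch of what “accounting for it myself” looks like, assuming the converted model really does keep a 1×H×W×3 (NHWC) input; the model name, input size and normalization are placeholders:

```cpp
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>

int main()
{
    // Hypothetical model converted from a TF frozen graph, assumed to still
    // expect NHWC input (the inserted Transpose handles the NCHW body).
    cv::dnn::Net net = cv::dnn::readNetFromONNX("converted_model.onnx");

    cv::Mat img = cv::imread("input.jpg");
    cv::resize(img, img, cv::Size(224, 224));      // placeholder input size
    img.convertTo(img, CV_32F, 1.0 / 255.0);       // placeholder normalization

    // An interleaved HxWx3 Mat is already NHWC in memory, so reshaping to
    // {1, H, W, 3} gives the NHWC blob directly; blobFromImage() would
    // produce NCHW instead, which is exactly the transpose I have to avoid.
    cv::Mat blob = img.reshape(1, {1, img.rows, img.cols, 3});

    net.setInput(blob);
    cv::Mat out = net.forward();
    return 0;
}
```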