Poor Performance on Apple Silicon

I’m seeing poor performance from some OpenCV functions when compiled for Apple Silicon.
The same version of OpenCV compiled for x86_64 runs faster, regardless of whether the host is an Intel machine or Apple Silicon running it under Rosetta.

Specifically, cv::resize is slow, especially with certain interpolation modes. INTER_AREA is slow and INTER_LINEAR is really slow (about 3.5x the expected processing time).
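
For anyone who wants to reproduce the comparison on both architectures, here is a minimal timing sketch. It is only one way such a measurement could be taken, not the test behind the numbers above: median_ms is a hypothetical helper (not part of OpenCV), and the iteration and warm-up counts are arbitrary assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenCV API): calls `fn` `warmup` times untimed,
// then `iters` times timed, and returns the median wall-clock time per call
// in milliseconds. The median is less sensitive to one-off stalls than the mean.
template <typename Fn>
double median_ms(Fn&& fn, int iters = 50, int warmup = 5) {
    assert(iters > 0 && "need at least one timed run");

    // Warm-up runs let caches and any lazy initialisation settle.
    for (int i = 0; i < warmup; ++i) fn();

    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(iters));
    for (int i = 0; i < iters; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        fn();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(t1 - t0).count());
    }

    // Partially sort so the middle element is the (upper) median.
    auto mid = samples.begin() + samples.size() / 2;
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}

// Example use with OpenCV (assumes <opencv2/imgproc.hpp> is included and
// `src` is an already-loaded cv::Mat; the 0.5 scale factor is arbitrary):
//   cv::Mat dst;
//   double ms = median_ms([&] {
//       cv::resize(src, dst, cv::Size(), 0.5, 0.5, cv::INTER_LINEAR);
//   });
```

Running the same call with cv::INTER_AREA and cv::INTER_NEAREST, on the same input image and in a release build on each machine, should show whether the slowdown is specific to one interpolation mode.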

cvtColor() seems significantly slower as well, though I haven’t dug into this one yet.

So is there any known issue or cause that might explain this?

OpenCV 4.5.5, compiled from scratch separately on x86_64 and arm64 machines (no fat binaries).
Testing on a last-gen Intel MBP as well as an M1 Max MBP.