Orin assertion deviceSupports(SHARED_ATOMICS) in cudaimgproc

I recompiled OpenCV and my app using -DCUDA_ARCH_BIN=87 (PTX was auto-detected as 86). It works fine, but the app size is unchanged at ~18 MB. By comparison, on TX2 it is just 1.8 MB. Could you tell me why it is ~10 times bigger on Orin?
If I use gcc-11 instead of gcc-10 on the Orin board, the app size is ~13 MB.

I am not linking statically, although I'd like to (so that the application works on any board without OpenCV and CUDA installed). Is that possible?
I tried -DBUILD_SHARED_LIBS=OFF once, but it led to an app size of 130 MB (which is OK) and a new set of error messages at run time (which is not OK). Should I create a separate topic about static linking (if running the application without any CUDA or OpenCV installation is even possible), or is it OK to discuss it here? Perhaps you have a link to a detailed explanation with examples?