OpenCV DNN + YOLOv8 with CUDA doesn't work

Hey, guys. I am stuck on a little problem. I need to run YOLOv8 using OpenCV and CUDA. First, I converted yolov8n.pt to an ONNX model using different opsets (from 9 to 18) and tried to run this code:

import cv2
import numpy as np
from PIL import Image

INPUT_WIDTH = 640
INPUT_HEIGHT = 640

net = cv2.dnn.readNet('yolov8n-opset18.onnx')
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

image = cv2.imread('Chandler.jpg')
blob = cv2.dnn.blobFromImage(image, 1/255.0, (INPUT_WIDTH, INPUT_HEIGHT), swapRB=True, crop=True)

net.setInput(blob)
preds = net.forward(net.getUnconnectedOutLayersNames())

print(preds[0][0][:, :4])

It runs without any error but predicts nothing: instead of XYWH values I see just zeros. The output looks like:
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       ...,
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]], dtype=float32)

But if I turn off the CUDA backend and target and run this code on the CPU, I get a correct prediction:

array([[ 3.0534186, 7.0815964, 6.4741874, 14.3838 ],
       [ 19.781336 , 3.41468 , 39.110847 , 6.830521 ],
       [ 26.047003 , 3.303193 , 51.6015 , 6.6188235],
       ...,
       [548.43353 , 579.7528 , 181.32599 , 119.66675 ],
       [549.1049 , 578.23346 , 179.46942 , 122.4859 ],
       [550.46356 , 576.0064 , 178.82916 , 128.20746 ]],
      dtype=float32)
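
To make the CPU vs. CUDA difference easier to reproduce, here is a minimal sketch of a side-by-side comparison. The helper names run_backend and summarize_difference are made up for illustration, and the sketch assumes the same blob is fed to both backends and that the first output tensor is the one of interest.

```python
import numpy as np


def run_backend(model_path, blob, use_cuda):
    """Run one forward pass on either the CUDA backend or the default CPU backend."""
    import cv2  # imported here so the NumPy-only helper below works without OpenCV

    net = cv2.dnn.readNet(model_path)
    if use_cuda:
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)
    else:
        net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
        net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    net.setInput(blob)
    # Return only the first output tensor (assumption: the model has one output).
    return net.forward(net.getUnconnectedOutLayersNames())[0]


def summarize_difference(cpu_out, cuda_out):
    """Summarize how far the CUDA output is from the CPU output."""
    cpu_out = np.asarray(cpu_out, dtype=np.float32)
    cuda_out = np.asarray(cuda_out, dtype=np.float32)
    same_shape = cpu_out.shape == cuda_out.shape
    return {
        "same_shape": same_shape,
        # True when every element of the CUDA output is exactly zero.
        "cuda_all_zero": not np.any(cuda_out),
        # Largest element-wise gap; only defined when the shapes match.
        "max_abs_diff": float(np.max(np.abs(cpu_out - cuda_out))) if same_shape else None,
    }
```

Usage would be along the lines of summarize_difference(run_backend(path, blob, False), run_backend(path, blob, True)). Checking the whole tensor rather than only the first four columns shows whether just the box coordinates are zeroed or the class scores as well.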

Help please, what could it be?

OpenCV 4.8.1
CUDA 12.1
cuDNN 8.8
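
In case it helps anyone confirm the same setup, the versions above can be cross-checked against what the installed OpenCV build itself reports. This is only a sketch: cuda_build_lines and report_cuda_support are hypothetical helper names, and the filter simply keeps any line of the build report that mentions CUDA or cuDNN.

```python
def cuda_build_lines(build_info):
    """Return the lines of an OpenCV build report that mention CUDA or cuDNN."""
    keywords = ("cuda", "cudnn")
    return [
        line.strip()
        for line in build_info.splitlines()
        if any(keyword in line.lower() for keyword in keywords)
    ]


def report_cuda_support():
    """Print the CUDA-related build lines and the number of visible CUDA devices."""
    import cv2  # imported here so cuda_build_lines stays usable without OpenCV

    for line in cuda_build_lines(cv2.getBuildInformation()):
        print(line)
    print("CUDA devices visible to OpenCV:", cv2.cuda.getCudaEnabledDeviceCount())
```

Calling report_cuda_support() in the same Python environment as the failing script shows whether OpenCV was built with CUDA and cuDNN and whether it sees at least one device.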

Lots of other networks, like YuNet, SFace, and YOLOv5, work correctly with the same versions of OpenCV DNN + CUDA.

The same goes for OpenCV 4.9.