Hello @AlexeyAB, thank you for the amazing repo.
My system configuration: I'm working on a YOLOv3 model with a GeForce RTX 2080 Ti GPU (11 GB of GPU memory), an Intel(R) Core™ i9-9900KF CPU with 6 cores, and 64 GB of RAM.
When I run inference on images, my average throughput is about 35 FPS and GPU memory usage is 1128 MiB (~1 GB out of 11 GB). I use OpenCV and load the YOLO model on the GPU.
I can increase inference throughput by running the same setup as 6 separate instances, which reaches 210 FPS in total with GPU memory usage of 6760 MiB (~6.6 GB out of 11 GB). However, this method requires me to separate the images and feed non-duplicate entries to each of the 6 instances.
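To make the splitting step concrete, here is a minimal sketch of one way to divide an image list into 6 non-overlapping shards, one per instance. It assumes a simple round-robin assignment; the function name `shard_round_robin` and the file names are hypothetical placeholders, not the actual setup.

```python
def shard_round_robin(items, num_shards):
    """Split items into num_shards disjoint lists using round-robin assignment."""
    shards = [[] for _ in range(num_shards)]
    for index, item in enumerate(items):
        # Item k goes to shard k mod num_shards, so no item lands in two shards.
        shards[index % num_shards].append(item)
    return shards


# Hypothetical file names standing in for the real image list.
image_paths = [f"img_{i:04d}.jpg" for i in range(20)]

# One shard per inference instance (6 instances, as in the setup above).
shards = shard_round_robin(image_paths, num_shards=6)
for instance_id, shard in enumerate(shards):
    print(f"instance {instance_id}: {len(shard)} images")
```

Each instance would then process only its own shard, which is the bookkeeping a single shared instance would make unnecessary.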
How do I run a single instance that utilizes the full GPU, so I can get the best FPS out of all 11 GB of GPU memory?
This would let me feed images from multiple sources to a single instance, removing the need to split them up and assign them to different instances.
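As an illustration of the goal, here is a rough sketch of several sources feeding one shared queue while a single consumer pulls images off it in batches. Everything here is an assumption for illustration: `producer`, `collect_batch`, and `run_batch` are hypothetical names, the "images" are placeholder tuples, and `run_batch` is a stub where a batched forward pass would go. Whether batching actually improves FPS in this setup is exactly the open question.

```python
import queue
import threading


def producer(source_id, num_images, out_queue):
    """Simulate one image source pushing items onto the shared queue."""
    for i in range(num_images):
        # Placeholder item; a real source would put a decoded image here.
        out_queue.put((source_id, i))


def collect_batch(in_queue, batch_size, timeout=0.05):
    """Pull up to batch_size items; return early if the queue runs dry."""
    batch = []
    while len(batch) < batch_size:
        try:
            batch.append(in_queue.get(timeout=timeout))
        except queue.Empty:
            break
    return batch


def run_batch(batch):
    """Stub for a batched forward pass (hypothetical, no model involved)."""
    return [("detections", item) for item in batch]


def process_all(num_sources=3, images_per_source=10, batch_size=8):
    shared_queue = queue.Queue()
    threads = [
        threading.Thread(target=producer, args=(s, images_per_source, shared_queue))
        for s in range(num_sources)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # For simplicity, let all sources finish before consuming.

    results = []
    while True:
        batch = collect_batch(shared_queue, batch_size)
        if not batch:
            break
        results.extend(run_batch(batch))
    return results
```

With this layout the sources never need to know about each other, and no image can be processed twice because each one is taken off the queue exactly once.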