Which camera features should be considered when choosing a camera for object detection and image stitching?

For YOLO detection you don’t need a high-resolution image; the image is scaled down to roughly 400x400 or 500x500 pixels (about 0.16 to 0.25 Mpix) before detection.
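To make the numbers concrete, here is a small sketch of the arithmetic. The 416-pixel target and the letterbox-style resize (scale to fit, pad the rest) are assumptions for illustration; the exact input size and resize strategy depend on the YOLO variant and framework you use. The helper names are made up for this example.

```python
def megapixels(width, height):
    """Pixel count of an image in megapixels."""
    return width * height / 1_000_000


def letterbox_size(src_w, src_h, target=416):
    """Scale (src_w, src_h) to fit inside a target x target square.

    Keeps the aspect ratio and reports how much padding is left over.
    target=416 is an assumed network input size, not a fixed rule.
    """
    scale = min(target / src_w, target / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    return new_w, new_h, target - new_w, target - new_h, scale


if __name__ == "__main__":
    for w, h in [(400, 400), (500, 500), (1920, 1080), (4000, 3000)]:
        new_w, new_h, pad_w, pad_h, scale = letterbox_size(w, h)
        print(f"{w}x{h} ({megapixels(w, h):.2f} Mpix) -> "
              f"{new_w}x{new_h}, padding {pad_w}x{pad_h}, scale {scale:.3f}")
```

Whatever the sensor resolution, the detector ends up seeing only a few hundred pixels per side, so extra megapixels mainly help the stitching step, not the detection.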

Every lens has some distortion. The best approach is to correct it using a chessboard calibration pattern (see the OpenCV tutorial "Camera calibration With OpenCV"), so you won’t need a new camera.

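For uniform luminosity, it is best to lock the camera’s exposure time. Otherwise auto-exposure changes the brightness from frame to frame, which shows up as visible seams in the stitched image.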

By the way, have you considered buying a wide-angle (or fisheye) camera, so you don’t need to move the camera and stitch the images?