How to Follow the Action on a Basketball Court

Hi All

Is it possible?

I want to track players (and the basketball, if possible) on a basketball court and decide where the action is.

Tracking input video comes from a stationary wide-angle camera that covers the entire court.

Output from the tracking process should be pan instructions to a PTZ camera that follows the game. Initially only panning. No tilt or zoom.

All ideas are appreciated!


Thomas S


  1. Try to detect the basketball. You will get a bounding box and can calculate the center of this box. Then calculate the distance between the center of your image and the center of the bounding box. If it exceeds a defined threshold, move the camera toward the ball's position. But:

    • This may not work well since a basketball bounces; with football it’s easier because the ball mostly rolls along two axes
    • The ball may not be detected because of its small size in the frame
  2. Try to detect all the players, then check whether enough of the 10 players are clustered in one area. If 7 or 8 players are in the same region, the game is probably taking place there. Then, with all the player boxes you should be able to calculate a global center of their positions, and move the camera to that point. But:

    • Again, the players may not always be detected, since they sometimes overlap
    • If you choose this approach, I advise training the detection model only on basketball players (the outfit); otherwise the public may be detected too.
    • The camera will need a wide view, or at least should not be zoomed in on the players.
      But your camera is only panning, so it should be easier, and the position calculation can be done on the horizontal axis only.
      Also, you can combine the two ideas, but that may require a lot of computing power.
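Idea 2 can be sketched in a few lines, assuming you already have player bounding boxes from some detector. The box coordinates, frame width, and dead-zone value below are made up for illustration:

```python
import numpy as np

# Hypothetical player detections: (x, y, w, h) boxes from any person detector.
boxes = np.array([
    [520, 210, 40, 90],
    [560, 230, 38, 85],
    [600, 200, 42, 95],
    [300, 220, 40, 90],
])

# Horizontal center of each box, then the global center of the action.
centers_x = boxes[:, 0] + boxes[:, 2] / 2.0
action_x = centers_x.mean()

frame_width = 1280
dead_zone = 100  # pixels; only pan when the action leaves this central band

offset = action_x - frame_width / 2.0
if abs(offset) > dead_zone:
    direction = "right" if offset > 0 else "left"
    print(f"pan {direction} (offset {offset:.0f} px)")
else:
    print("hold position")
```

Since the PTZ camera only pans, the whole decision reduces to this one horizontal offset; the dead zone keeps the camera from twitching on every small movement.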

I also know that some datasets provide action models, like “hitting a ball” or “playing something”, so that may be an alternative to player detection.

Here are some links that should be useful:

I recently found this video that actually shows what I explained

Hi PommePomme,

Thanks a lot.

I did try Adrian Rosebrock’s follow-the-ball script. It turns out the basketball in my video is way too small!

Do you know more about the code behind the YouTube video above?

Right now I am trying to use

flow = cv2.calcOpticalFlowFarneback(prev, next, None, 0.5, 3, 15, 3, 5, 1.2, 0)

and

mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1], angleInDegrees=True)

to get displacement vectors from one frame to the next.

Currently I am trying to understand why I get non-zero displacement vectors when nothing is changing from one frame to the next.


Thomas S

please be precise in your description.

you always get vectors. one for every pixel, unless the algorithm is one of those that can mark a pixel’s vector as invalid for some reason.

when nothing moves, these vectors should simply be 0 or very close.

video usually contains noise. anything derived from noisy data (such as calculated optical flow) will also contain noise.

Hi crackwitz


I get the vectors from a call to

flow = cv2.calcOpticalFlowFarneback(prev, next, None, 0.5, 3, 15, 3, 5, 1.2, 0)

flow is now a 3D array.

Now I am confused!

Does flow represent pixel displacement from frame ‘prev’ to frame ‘next’, where the x displacement is in flow[:,:,0] and the y displacement is in flow[:,:,1]?


Or do I have to use flow in this call:

mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1], angleInDegrees=True)

so that I then have the pixel displacement distance in mag and the displacement angle in ang?


Thomas S

both. kinda. as you can imagine.

I am puzzled as to why you posed this as an either-or question.

it’s not a choice between two options.

the first is a fact.

the second is something you can do, if you need it.

the cartesian and polar representations of a vector represent the same vector. you decide what representation you need, and then you convert or you don’t.