How would you synchronize two webcams to be acquiring their images at the same times?



I’m not talking about concurrently retrieving images from the webcams, but about ensuring that, at a given fps set for both cameras, they are also synchronized in time, taking their images as simultaneously as possible. Rather than each of them initializing at its own time, so that they run at the same fps but are most likely offset from each other in time.

Granted, their internal timing may slightly drift over time even after they have been synchronized once.

There isn’t even an indication in the API of the camera’s image acquisition time, or is there?

Thanks for your comments!

P.S. Yes I know, that’s what costly machine vision cameras with hardware synchronization features are made for …

USB UVC does not have the capability to decide exactly when an exposure is started. not for video capture modes anyway. for still image capture modes, I don’t know. USB probably doesn’t allow it (there is no “broadcast”/multicast to several devices to trigger anything). the device might still run at some frame rate even in still image capture modes.

if you need real control over this, get industrial ones with hardware trigger (digital input) or genlock.

if you don’t care about ~1 frame of difference, run threads, capture continuously, follow one camera’s timing (or a made-up one), then take the latest frame of the other camera.
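A minimal sketch of that threaded latest-frame pattern. The `FakeCamera` class here is a stand-in I made up so the sketch runs without hardware; in real use you’d wrap `cv2.VideoCapture` and do your own timestamping on `read()`:

```python
import threading
import time

class LatestFrameGrabber:
    """Grabs frames continuously on a background thread and keeps only
    the most recent one, so a reader never receives a stale frame."""

    def __init__(self, source):
        self._source = source      # anything with read() -> (ok, frame, ts)
        self._lock = threading.Lock()
        self._latest = None        # (frame, timestamp)
        self._running = True
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while self._running:
            ok, frame, ts = self._source.read()
            if ok:
                with self._lock:
                    self._latest = (frame, ts)

    def latest(self):
        with self._lock:
            return self._latest

    def stop(self):
        self._running = False
        self._thread.join()

class FakeCamera:
    """Stand-in for a real capture object; emits numbered frames at ~100 fps."""
    def __init__(self):
        self._n = 0
    def read(self):
        time.sleep(0.01)
        self._n += 1
        return True, f"frame-{self._n}", time.monotonic()

if __name__ == "__main__":
    cam_a = LatestFrameGrabber(FakeCamera())
    cam_b = LatestFrameGrabber(FakeCamera())
    time.sleep(0.2)  # let both grabbers fill their latest-frame slots
    # follow camera A's timing; pair each of its frames with B's latest
    _, ts_a = cam_a.latest()
    _, ts_b = cam_b.latest()
    print(abs(ts_a - ts_b) < 0.05)  # within ~1 frame of each other
    cam_a.stop(); cam_b.stop()
```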

as for your questions from the other two posts (below) that I moved here from other threads:

a webcam will give you the next frame. it will make you wait. if you want to do something with that time, either use multithreading and synchronization primitives, or cv::VideoCapture::waitAny() to select/poll on a set of capture objects.

So does this mean that each grab obtains the latest image taken by the camera as it endlessly loops at its initially set frame rate, and hence we wait a little in the common (and almost certain) case that it is still in the middle of acquiring its next image at the precise moment we grab?

I mean, unless we timed our grabs so precisely aligned to the camera’s loop that we never had to wait, which is obviously impractical.

Hi Filip,

Does this imply that, with typical modern webcams, we have no way to make the camera wait for our grab request, as if it should acquire an image only once it receives that request? That they can only run at their fps, politely acknowledging our grab requests but ultimately sending back their latest already-acquired image?

I mean with the OpenCV API, and otherwise if we are using some other toolkit that generalizes over different webcam models and drivers on commodity operating systems.


Thanks crackwitz, this is what I thought most plausible.

I think there are many variations on the particular flow you suggest.

E.g. running a thread per camera on a continuous grab/retrieve loop faster than the cameras’ frame rate, and waiting on those threads outside the specific OpenCV API. I think this also lets you derive quite precisely (if the threads aren’t significantly slowed down by processor contention) the actual time each image was optically acquired, under only mild assumptions about the camera’s swiftness in processing the grab request and about the exposure time, plus perhaps some assumptions about USB bus contention affecting the grab time, which may be the weak spot here.
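The back-dating arithmetic I have in mind could be sketched like this. The transfer delay and exposure values are hypothetical placeholders, not measured figures, and the whole estimate rests on the prompt-delivery assumption above:

```python
def estimate_acquisition_time(grab_return_time, exposure_s, transfer_s=0.0):
    """Back-date a grab's return time to the middle of the exposure interval.

    Assumes the frame was delivered promptly after readout: subtract the
    (assumed) transfer/processing delay, then half the exposure time, to
    land at the midpoint of the interval during which the sensor gathered
    light.
    """
    return grab_return_time - transfer_s - exposure_s / 2.0

# e.g. a grab that returned at t=10.000 s, with a 10 ms exposure and an
# assumed 2 ms USB transfer delay:
t = estimate_acquisition_time(10.000, exposure_s=0.010, transfer_s=0.002)
print(round(t, 4))  # 9.993
```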

If it’s okay, I’ll take the liberty of putting together just one follow-up set of questions arising in relation to this objective.

Is there any advice on trying to force when the cameras actually start their frame acquisition loops? Does sending them a request to set a property like frame rate or resolution force them to restart their loop, or is that wishful thinking for the vast majority of webcam hardware out there?

I assume that UVC does not define anything about resetting camera streams or about delivering timing information along with (or parallel to) the image data transfer, but is there an authoritative source for the UVC standard?

Can we ensure, from the host computer, that no buffers are used for storing more than a single yet-ungrabbed frame, so that we reduce the effect of getting back stale frames on grabs when catching up on a stream?


If your camera supports hardware triggering, that’s what you are looking for.

Yes, well, that’s with super-expensive machine-vision cameras, isn’t it? Whereas I’m working to approximate some synchronism guarantees more weakly with consumer-grade webcams (or get their image acquisition timestamps relayed through the drivers).

A lot of webcam-class cameras support some sort of single shot mode, and that might be a path worth exploring. I’m thinking of triggering a single shot simultaneously on both cameras. You’d have to be ok with (presumably) a lower frame rate, jitter (you might have trouble getting frames exposed at a consistent interval) and some amount of skew between the pseudo-synchronized “streams.” To be clear, I don’t know that this will work at all, but I think it has a good chance, at least for some cameras.

If I were in your shoes, I’d start by doing something like:
Set up single shot mode on your camera, point the camera at a high resolution stopwatch (the stopwatch on my iphone has been good enough for me in the past), and take successive snapshots with the camera with random delays in between.

  1. Note the system time just prior to triggering a single shot.
  2. Note the system time just after receiving the frame.
  3. Note the timestamp field in the buffer that was returned.
  4. Save the image along with trigger time, return time and buffer timestamp (I usually encode these in the image filename, but you can get as fancy as you want.)

For extra credit, analyze the captured image and extract the current time on the stopwatch.
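Step 4 above (encoding the three times in the filename) might be sketched like this; the naming scheme is just one I made up for illustration:

```python
def snapshot_filename(trigger_ns, return_ns, buffer_ts_ns, ext="png"):
    """Encode the three times from the experiment in the image filename,
    so each saved frame carries its own timing record."""
    return f"shot_trig{trigger_ns}_ret{return_ns}_buf{buffer_ts_ns}.{ext}"

def parse_snapshot_filename(name):
    """Recover the three timestamps (in ns) from a filename built above."""
    stem = name.rsplit(".", 1)[0]
    _, trig, ret, buf = stem.split("_")
    return int(trig[4:]), int(ret[3:]), int(buf[3:])

name = snapshot_filename(1000, 1040, 1010)
print(name)                           # shot_trig1000_ret1040_buf1010.png
print(parse_snapshot_filename(name))  # (1000, 1040, 1010)
```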

I’d be looking for how consistent the relationship is between the stopwatch time (in the image) and when you triggered the single shot. If it’s consistent, then maybe you have a path forward with this. Repeat with 2 cameras, triggering them simultaneously. Or, I guess you could just start with this and compare the images they captured - just be careful to position the cameras the same, so you don’t get any discrepancies due to the rolling shutter (which is presumably what your webcam uses).

Another option is that you can run two streams at once, and inspect the timestamp values. Restart one of the streams until you get timestamp values that are close enough to each other for your needs. Just be ready for drift between the two streams. (Or instead of fully restarting the stream, you could just call VIDIOC_STREAMOFF and then VIDIOC_STREAMON at the right time.)
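The “close enough” check might look something like this sketch: fold the difference between the two streams’ timestamps into a phase offset within one frame period, and keep restarting while it exceeds your tolerance. The function is my own illustration, not any V4L2 API:

```python
def stream_phase_offset(ts_a, ts_b, frame_interval):
    """Given recent frame timestamps (in seconds) from two free-running
    streams with the same frame interval, return B's phase offset relative
    to A, folded into (-interval/2, interval/2]."""
    offset = (ts_b - ts_a) % frame_interval
    if offset > frame_interval / 2:
        offset -= frame_interval
    return offset

# 30 fps streams (~33.3 ms interval), B's frames arriving 30 ms after A's:
off = stream_phase_offset(0.000, 0.030, 1 / 30)
print(round(off * 1000, 1))  # -3.3  (B is effectively 3.3 ms early)
```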

One more idea is to run two streams and inspect the timestamps coming off of both streams, then intentionally delay the return of the buffer to one of the streams to try to force them into closer synchronization. To do this you will need to hold on to all of the buffers, and choose when to return the first buffer to the camera. This is similar to the “software genlock” trick that has been used for synchronizing video outputs - that is to say, a hack, but one that might just work.
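For the buffer-holding variant, the delay you’d need to apply could be computed like this. The premise (that the camera can only begin a new frame once a free buffer is returned) is firmware-dependent, as noted, and the helper is purely illustrative:

```python
def buffer_hold_delay(offset_s, frame_interval_s):
    """How long to hold a buffer back from the lagging stream so its next
    exposure starts in phase with the other stream. Assumes frame starts
    are gated by buffer availability, which is the premise of the
    software-genlock trick and will not hold for every camera."""
    # delay by however much of one frame period is needed to absorb the offset
    return (-offset_s) % frame_interval_s

d = buffer_hold_delay(-0.010, 1 / 30)
print(round(d * 1000, 1))  # 10.0
```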

Again, these are all just ideas, and they require that you have varying degrees of control over the cameras. Maybe there is enough support via OpenCV to do the single shot test, but the other ones will need lower level access, so you either need to control the camera directly (e.g. using the V4L2 interface) or be willing/able to edit and compile the OpenCV code to get at the bits you care about.

Good luck!

Thanks a lot Steve (!)

Indeed some of your ideas have been on my mind and some which go beyond make total creative sense.

Indeed I don’t care in the least about jitter in my use case, so thanks for not dismissing these thoughts because of it. I have grown somewhat familiar with how ioctls are used to communicate with the cameras on Linux, but what I’m not fully sure of is whether V4L2 is just a protocol, a thin envelope running very simple ioctl workflows against the cameras, or whether it is actually a driver layer that you’d want to call instead of making those ioctls yourself while passing around the structs and data types defined in the V4L2 headers.

From the tone I guess that VIDIOC_STREAMON/OFF would work impeccably if the V4L2 camera devices were 100% loyal to V4L2 compliance, but I can also assume that, this being somewhat marginal to the overall normal flow of webcam usage, mileage may vary across webcam firmwares and vendors’ behavior when that command is received.

The only thing I’m not sure of is whether some camera-stamped time is really sent along with each acquired image, whether that’s a thing with UVC, V4L2, or real-world webcams, and whether their internal time tracking is useful at all. With OpenCV’s API, a notion of a frame’s capture time is not relayed back to user code when fetching an image.

Obviously, an image capture isn’t a point in time but a (typically very short) interval of time during which the sensor cells accumulate light.

Well, in case you provide hourly contracting work on creative assignments I’d be happy to hear from you through any of the private channels here or elsewhere …


As far as using the ioctl calls directly or going through some additional layer, I’ve found that managing everything myself is the right choice for me. It is more tedious, but it gives me the level of control that I need for my purposes. It’s a pretty general framework and is supported by a wide range of cameras. I think the reasons for preferring a driver layer are either to hide the details from the programmer (for example the OpenCV VideoCapture interface: there are multiple backend implementations, but you access them through the same interface), or that a proprietary driver might be helpful when you are using special features of a specific camera. I don’t know enough about your requirements, but I’d be inclined to use the V4L2 ioctls directly.

As for whether VIDIOC_STREAMON / VIDIOC_STREAMOFF will work, I suspect it will be implementation dependent. I can imagine that VIDIOC_STREAMOFF could disable the stream but leave the video timing in place, so that when you call STREAMON it would resume with the same timing in effect (so this wouldn’t be an effective way of shifting the frame start times). I think you will just have to test and see, and if you need a method that works for “any” webcam, you might need to have multiple methods for achieving synchronization.

Maybe one of the stereo camera modules would be a better thing to try? You could potentially solve your timing issues just by using an inexpensive hardware solution. Is this an option?

On the timestamp question, I’m not sure if all cameras honor this part of the spec, but for the cameras I work with it appears they are providing good data. From the spec:

For capture streams this is the time when the first data byte was captured, as returned by the clock_gettime() function for the relevant clock id; see V4L2_BUF_FLAG_TIMESTAMP_* in [Buffer Flags]. For output streams the driver stores the time at which the last data byte was actually sent out in the timestamp field. This permits applications to monitor the drift between the video and system clock. For output streams that use V4L2_BUF_FLAG_TIMESTAMP_COPY the application has to fill in the timestamp, which will be copied by the driver to the capture stream.

I can confirm, as a case in point, that with multiple Logitech webcams, the timestamps (systematically verified using stopwatch phone applications being pictured by the webcams) are the times the images were taken by the sensor. I fetch them using OpenCV’s CAP_PROP_POS_MSEC.