Frame of reference of stereo depth image

I have a question about how depth images from stereo cameras are represented. It is not specific to OpenCV; it applies equally to a camera such as the ZED or to a depth map computed with the OpenCV SGBM matcher.
Is each depth value just the ‘z’ coordinate of the 3D point, i.e. its distance measured along the camera's ‘z’ axis? Or is it the absolute (Euclidean) distance between the center of the camera and the 3D point?
For instance, suppose a large flat surface is placed 1 m in front of the camera, facing it, with the center of the camera directly aligned with the center of the surface. Under the first interpretation, every pixel of the depth image should read 1 m. Under the second, the depth values of pixels towards the corners of the surface (but still on the surface) should be larger than those near the center. (E.g. a point on the surface 1 m above its center should show a depth value of \sqrt{2} m, about 1.41 m.)
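To make the two interpretations concrete, here is a minimal sketch of what each would report for the flat-surface example. It assumes an ideal pinhole camera with the origin at the camera center and the z axis pointing forward; the function names and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and only there for illustration, not taken from any particular SDK.

```python
import math

def planar_depth(point):
    """Interpretation 1: the z coordinate of the point in the camera frame."""
    _, _, z = point
    return z

def euclidean_range(point):
    """Interpretation 2: straight-line distance from the camera center."""
    x, y, z = point
    return math.sqrt(x * x + y * y + z * z)

def range_from_planar(z, u, v, fx, fy, cx, cy):
    """Convert a planar depth at pixel (u, v) into a Euclidean range.

    Assumes an undistorted pinhole model with focal lengths fx, fy and
    principal point (cx, cy), all in pixels (hypothetical values).
    """
    xn = (u - cx) / fx  # normalized image coordinate x / z
    yn = (v - cy) / fy  # normalized image coordinate y / z
    return z * math.sqrt(xn * xn + yn * yn + 1.0)

# Points on a wall 1 m in front of the camera (units: metres).
center = (0.0, 0.0, 1.0)
above = (0.0, -1.0, 1.0)  # 1 m above the center, assuming image y points down

for name, p in (("center", center), ("above", above)):
    print(f"{name}: planar = {planar_depth(p):.3f} m, "
          f"range = {euclidean_range(p):.3f} m")
```

If a depth map turns out to use one convention and the other is needed, `range_from_planar` shows the per-pixel conversion under the same pinhole assumption; dividing by the same factor goes the other way.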
So which of these definitions is the one used for a depth map?

PS: I do not have a ZED camera to experiment with, but I am interested in this aspect.

