Why does a binarized image saved as JPG contain more than two values?

  • OpenCV-python: 4.5.5
  • Operating System / Platform: Windows 64 Bit
  • Python version: 3.8.13

I binarized an image and saved it as JPG. When I read the image back, it contains more than two values. PNG images don’t have this problem.

Steps to reproduce
import cv2
import numpy as np

img_path = r"temp.tif"
png_path = r"temp.png"
jpg_path = r"temp.jpg"
# Load the source image as a single-channel grayscale array
gray = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
print("gray shape: {}, values: {}".format(gray.shape, np.unique(gray)))

output:

gray shape: (125, 125), values: [191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208
209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226
227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244
245 246 247 248 249 250 251 252 253 254 255]

# Binarize: pixels above 200 become 255, all others become 0
_, thresh = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY)
print("thresh shape: {}, values: {}".format(thresh.shape, np.unique(thresh)))
# Save the same binary image in both formats
cv2.imwrite(jpg_path, thresh)
cv2.imwrite(png_path, thresh)
# Read both files back without any conversion
jpg = cv2.imread(jpg_path, cv2.IMREAD_UNCHANGED)
png = cv2.imread(png_path, cv2.IMREAD_UNCHANGED)
print("jpg shape: {}, values: {}".format(jpg.shape, np.unique(jpg)))
print("png shape: {}, values: {}".format(png.shape, np.unique(png)))

output:

thresh shape: (125, 125), values: [ 0 255]
jpg shape: (125, 125), values: [ 0 1 2 3 4 5 250 251 252 253 254 255]
png shape: (125, 125), values: [ 0 255]

When the JPG image is read back, it contains values such as 1, 2, 253, and 254.
I thought it was a bug, so I opened a GitHub issue, but it does not seem to be one, and I was advised to ask here instead. I would like to know the reason for this behavior. I found a similar question on StackOverflow (python - Saving and Reading Binary Image), but it did not explain the real reason.

Read: JPEG - Wikipedia
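For readers who want the short version of that article: JPEG transforms each 8x8 block with a DCT and rounds the coefficients, and that rounding is where the extra values come from. The following is only a rough NumPy sketch of the idea, not what libjpeg actually does. The uniform step `STEP` is an assumption made for illustration (real JPEG uses a per-frequency quantization table derived from the quality setting), and the diagonal test block is made up.

```python
import numpy as np

N = 8  # JPEG processes the image in 8x8 blocks

# Orthonormal DCT-II basis matrix: row k holds the k-th cosine basis vector.
k = np.arange(N).reshape(-1, 1)
n = np.arange(N).reshape(1, -1)
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
C[0, :] = np.sqrt(1.0 / N)

# A made-up binary block with a diagonal edge: only the values 0 and 255.
rows, cols = np.indices((N, N))
block = np.where(rows + cols < N, 255.0, 0.0)

# Forward 2-D DCT of the level-shifted block, then lossy quantization.
STEP = 16.0  # assumed uniform step; real JPEG uses a quantization table
coeffs = C @ (block - 128.0) @ C.T
quantized = np.round(coeffs / STEP) * STEP

# Inverse DCT, undo the level shift, then round and clip back to uint8.
recon = C.T @ quantized @ C + 128.0
recon = np.clip(np.round(recon), 0, 255).astype(np.uint8)

print("original values:     ", np.unique(block.astype(np.uint8)))
print("reconstructed values:", np.unique(recon))

# Thresholding the reconstruction again gives back a strictly binary block.
rebinarized = np.where(recon > 127, 255, 0).astype(np.uint8)
```

Because the rounded coefficients no longer describe the sharp edge exactly, the reconstructed pixels generally land near 0 and 255 rather than exactly on them, which matches the 0..5 and 250..255 values in the question.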


Thanks, I see why now. I knew that JPG is a lossy format, but I never knew the details. When I ran into this problem, it did not occur to me that the JPG format itself could be the cause. Next time I will think it through more before asking.