I noticed that my script exceeds the maximum memory usage (16 GB) pretty fast, and after running it under memory-profiler
I saw that cv2.imread(filename)
takes up 6.7 GiB after just 20 iterations out of ~1400 overall.
Here is a snippet from the profiler output (the rest of the code is commented out anyway):
Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    90     59.8 MiB      0.0 MiB           1   img_array = []
    91     59.8 MiB      0.0 MiB           1   size = 0
    92   6827.2 MiB     -2.1 MiB          21   for idx, path in enumerate(path_list):
    93   6827.2 MiB      0.0 MiB          21       if idx == 20:
    94   6827.2 MiB      0.0 MiB           1           return
    95   5728.9 MiB      0.0 MiB          20       print('------------------------------------------------------------------------')
    96   5728.9 MiB      0.0 MiB          20       print(f'Iteration Number: {idx}, In Directory: {path}')
    97   5728.9 MiB      0.0 MiB          20       print('------------------------------------------------------------------------')
    98   6827.6 MiB      5.8 MiB       61620       for filename in sorted(glob.glob(path + '/*.png'), key=len):
    99   6827.7 MiB   6757.6 MiB       61600           img = cv2.imread(filename)
   100   6827.7 MiB     -0.1 MiB       61600           height, width, _ = img.shape
   101   6827.7 MiB     -0.1 MiB       61600           size = (width, height)
   102   6827.7 MiB      3.7 MiB       61600           img_array.append(img)
Each image is 224 x 171 pixels, and ~62k images were read in total during this test.
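For reference, here is a rough back-of-the-envelope check of the raw pixel data kept alive in img_array, assuming imread decodes each PNG to its default 8-bit 3-channel BGR array (the numbers are taken from the profiler output above):

```python
# Hypothetical sanity check, not part of the original script.
n_images = 61600           # imread calls reported by the profiler
height, width = 171, 224   # image size from the post (224 x 171)
bytes_per_image = height * width * 3  # uint8 BGR -> 3 bytes per pixel

total_gib = n_images * bytes_per_image / 2**30
print(f'{total_gib:.2f} GiB')  # -> 6.59 GiB
```

That is almost exactly the 6757.6 MiB (~6.6 GiB) increment the profiler attributes to the imread line, which suggests the memory is held by the decoded arrays accumulated in img_array rather than leaked by imread itself.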