I’m working on an image processing algorithm in Python that needs per-pixel operations (i.e. I can’t solve it with matrix operations)*.
The algorithm was extremely fast in C++, but it takes an eternity in Python.
I also ran a quick test comparing a simple operation ( image = image+2 ) done as a matrix operation against iterating through the image with two for loops: the loops are over 1000 times slower!!!
What are the possibilities to speed it up?
*It’s a kind of Hough transform, so I need to manipulate the pixel value, its neighborhood and the coordinates… so I don’t really see any solution other than iterating through the image with two for loops along the X and Y axes and accessing the pixels using image[x,y] .
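A quick test of the kind described above could look like the sketch below. It is not the original test: the image is a small synthetic array (an assumption, chosen so the loop version finishes quickly), the function name add_two_loops is made up for illustration, and pixel values are kept below 200 so that adding 2 never overflows uint8. Note that NumPy arrays are indexed as [row, column], i.e. [y, x].

```python
import time
import numpy as np

def add_two_loops(image):
    """Add 2 to every pixel using two nested Python loops."""
    result = np.empty_like(image)
    height, width = image.shape
    for y in range(height):          # rows
        for x in range(width):       # columns
            result[y, x] = image[y, x] + 2
    return result

# Small synthetic grayscale image (assumption); values < 200 avoid uint8 overflow
rng = np.random.default_rng(0)
image = rng.integers(0, 200, size=(120, 160), dtype=np.uint8)

start = time.perf_counter()
vectorized = image + 2               # one call into compiled NumPy code
t_vectorized = time.perf_counter() - start

start = time.perf_counter()
looped = add_two_loops(image)        # one interpreter round trip per pixel
t_looped = time.perf_counter() - start

print(f"vectorized: {t_vectorized:.6f} s, loops: {t_looped:.6f} s")
```

The exact ratio depends on the machine and the image size, so the numbers printed here are only indicative.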
[I’m also grateful to the new forum as it allows extended discussions…]
I’ll look into @volkmar.wieser’s suggestion, and I find interesting @crackwitz’s idea of Python/C++ interoperation, especially as I already have several algorithms implemented in C++ that I’d like to reuse in Python.
There seems to be a more direct way to create Python modules from C++ code using OpenCV’s bindings. I found this tutorial in the docs about expanding Python OpenCV with my own modules: https://docs.opencv.org/3.4/da/d49/tutorial_py_bindings_basics.html (second part). Unfortunately it’s a very cryptic and incomplete description (it’s really far from being a tutorial); I didn’t understand it even with 6+ years of OpenCV experience.
I found this article on the same subject, but I couldn’t make it work, it seems outdated - but it’s more or less the idea I would like to achieve.
you can take a look at https://github.com/sturkmen72/EDLib-test a sample to add a custom function in a new module to OpenCV. Or another way is you can add your functions in existing modules such as imgproc etc.
Second, you mark functions and classes with CV_EXPORTS_W and other macros and use InputArray, OutputArray and other types known by the wrappers as parameters. The tutorial describes these macros pretty well although supported types are not clearly documented.
Thanks @sturkmen and @mshabunin! I think we’re getting closer! (I didn’t know that you can add a list of folders as OPENCV_EXTRA_MODULES_PATH)
It’s clear that building my module as a part of the OpenCV build process is the most straightforward solution, but I still wonder if it’s possible to build my module separately (for simpler modifications and redistribution)?
map/reduce/filter will be worthless for image manipulation. not just because they’re the wrong APIs but because they still base their actions on python code.
the pythonic way is to use the library functions given by numpy and OpenCV, which do the job in compiled, optimized code and also parallelized when sensible.
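as an illustration of how far library functions can go even when pixel coordinates matter, here is a minimal sketch of Hough-style voting done with numpy only. it uses the textbook line Hough transform as a stand-in, not the algorithm from the question; the function name hough_lines_accumulator and the 1-degree angle step are assumptions made for this example.

```python
import numpy as np

def hough_lines_accumulator(edges, n_theta=180):
    """Vote for (rho, theta) line parameters without a Python loop over pixels."""
    height, width = edges.shape
    ys, xs = np.nonzero(edges)                      # coordinates of all edge pixels
    thetas = np.deg2rad(np.arange(n_theta))         # 0, 1, ..., n_theta-1 degrees
    max_rho = int(np.ceil(np.hypot(height, width)))

    # rho = x*cos(theta) + y*sin(theta) for every (pixel, theta) pair,
    # computed by broadcasting; shape is (n_pixels, n_theta)
    rhos = xs[:, None] * np.cos(thetas) + ys[:, None] * np.sin(thetas)
    rho_idx = np.rint(rhos).astype(np.intp) + max_rho   # shift so indices are >= 0

    accumulator = np.zeros((2 * max_rho + 1, n_theta), dtype=np.int64)
    theta_idx = np.broadcast_to(np.arange(n_theta), rho_idx.shape)
    # np.add.at accumulates correctly even when the same cell is hit many times
    np.add.at(accumulator, (rho_idx, theta_idx), 1)
    return accumulator, max_rho

# Synthetic test image (assumption): a single horizontal line at y = 5
edges = np.zeros((10, 20), dtype=np.uint8)
edges[5, :] = 1
accumulator, max_rho = hough_lines_accumulator(edges)
print(accumulator[max_rho + 5, 90])   # votes for the line rho = 5, theta = 90 degrees
```

whether this style fits depends on the algorithm: anything that needs data-dependent control flow per pixel is still easier to express as a compiled loop.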
I think it’s a bad idea to consider making an “OpenCV module” for your application-specific code. application code simply doesn’t belong there.
you can write a python module in C/C++ and use that from python, beside OpenCV. you can also use OpenCV’s C++ API in your module. since writing python extension modules is a little difficult to approach, cython was made.
Thanks for all the ideas, guys! @crackwitz’s suggestions were particularly helpful.
I tested most of the methods; here is a wrap-up:
Map/reduce/filter: not really applicable. Most of the time the images aren’t reduced, and often we need to manipulate array indexes, which is impossible with these methods (or with other matrix operators)
Python wrapper for C code: very interesting solution, but unfortunately you need to create an OpenCV module - and putting application-specific code in a library is a bad idea. However it would be great if there was a simple wrapper to create a Python header and a .so file from C++ code
Cython - this is the best solution. The time-critical Python code gets translated to C (and binary code if necessary), and imported.
As I didn’t find any simple example on Cython/OpenCV, I’m attaching my simple testing code below. It is mostly based on this tutorial. Note that this is my first experiment, so it can probably still be optimized/simplified, but I still hope this can help!
My results on a 10MP photo: OpenCV: 0.003s; Numpy: 0.023s; Python loops: 12.9s [!!!]; Cython loops: 0.01s
thresholding.py
import cv2
import numpy as np
import time
# import and compile cython code
import pyximport
pyximport.install()
import fastthreshold
def pythonthresh(gray):
    # per-pixel thresholding with two Python loops; uses the global th set below
    res = np.zeros(gray.shape, np.uint8)
    for y in range(gray.shape[0]):
        for x in range(gray.shape[1]):
            res[y, x] = 255 if gray[y, x] > th else 0
    return res
# Open file and convert to grayscale
filename = "IMG_02506.jpg" # change this
img = cv2.imread(filename)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
th = 128
# OpenCV thresholding
t1 = time.time()
# cv2.threshold returns a tuple (retval, dst); keep only the image
_, res1 = cv2.threshold(gray, th, 255, cv2.THRESH_BINARY)
t2 = time.time()
print(" ------ CV2 thresholding: %s seconds -------" % (t2-t1))
# Numpy thresholding
# probably can be optimized, the multiplication takes time
res3 = (gray > th) * 255
t3 = time.time()
print(" ----- Numpy thresholding: %s seconds ------" % (t3-t2))
# iterating through the array using for loops; function above
res2 = pythonthresh(gray)
t4 = time.time()
print(" --- Per pixel thresholding: %s seconds ---" % (t4-t3))
# fast iteration using cython
out = np.zeros(gray.shape, np.uint8)
fastthreshold.fastthreshold(th, gray, out)
t5 = time.time()
print(" ---- Cython thresholding: %s seconds ------" % (t5-t4))
fastthreshold.pyx
#cython: language_level=3
cimport cython
@cython.boundscheck(False)
@cython.wraparound(False)
cpdef fastthreshold(int th, unsigned char[:,:] gray, unsigned char[:,:] output):
    # local C variables are declared with cdef (cpdef is only for functions/methods)
    cdef int x, y
    for y in range(gray.shape[0]):
        for x in range(gray.shape[1]):
            output[y, x] = 255 if gray[y, x] > th else 0
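The comment in thresholding.py suspects that the multiplication in the Numpy variant costs time. One possible alternative, sketched below under the assumption that a uint8 result is wanted, is np.where with uint8 constants, which avoids the multiplication and keeps the output in the same dtype as the other variants. The helper names loop_threshold and numpy_threshold and the small random test image are made up for this example; I have not timed it against the original on a real photo.

```python
import numpy as np

def loop_threshold(gray, th):
    """Reference implementation: the same two nested loops as pythonthresh."""
    res = np.zeros(gray.shape, np.uint8)
    for y in range(gray.shape[0]):
        for x in range(gray.shape[1]):
            res[y, x] = 255 if gray[y, x] > th else 0
    return res

def numpy_threshold(gray, th):
    """Binary threshold that stays in uint8: 255 where gray > th, else 0."""
    return np.where(gray > th, np.uint8(255), np.uint8(0))

# Small random grayscale image (assumption) so the loop version stays fast
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(48, 64), dtype=np.uint8)
th = 128

reference = loop_threshold(gray, th)
vectorized = numpy_threshold(gray, th)
multiplied = (gray > th) * 255        # the variant from thresholding.py

# Compare dtypes and check that all variants agree pixel by pixel
print(vectorized.dtype, multiplied.dtype)
print(np.array_equal(reference, vectorized), np.array_equal(reference, multiplied))
```

A check like this is also a cheap way to confirm that the Cython output matches the other variants before comparing timings.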