How to reduce image feature extraction time?

Hi everyone,
I have just started learning computer vision and am currently working on an image classification problem. I use HoG to extract features from the images, but it takes a very long time every time I run it. Is there any way to reduce the run time? My dataset is fairly large (about 15k images), and HoG turns each image into a feature vector of shape (1, 16740).
Here is the code I use to extract features:

def calc_hog(imgPath):
    img = cv2.imread(imgPath)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    resized_img = cv2.resize(img, (256, 128))
    features, hog_image = hog(resized_img, orientations=9, pixels_per_cell=(8, 8),
                              cells_per_block=(2, 2), visualize=True, channel_axis=-1)
    return features

Thank you for reading my question; I hope to receive an answer soon.
Regards.

Edited:
This is the code I use to load the images:

def loadDataset(imgPath, dataf, features_extraction):
    x = []
    y = []

    sto = dataf.reset_index(drop=True)

    for i in range(len(sto)):
        img_path = os.path.join(imgPath, sto['image'][i])
        x.append(features_extraction(img_path))
        y.append(sto['label'][i])  # read from the re-indexed frame, like the image lookup above

    x = np.array(x)
    y = np.array(y)

    return x, y

Please find out how much time each line in your code takes, and sum the individual per-line times across the entire dataset. Use time.perf_counter(), or use a proper profiler.
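
As a rough illustration of the per-line timing suggested here, the following is a minimal sketch using only time.perf_counter() from the standard library. The names timed, step_totals and report are hypothetical helpers made up for this example, and the workload is a stand-in so the snippet runs without OpenCV or scikit-image installed.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated seconds per named step, summed over every image processed.
step_totals = defaultdict(float)


@contextmanager
def timed(step_name):
    """Add the wall-clock time of the enclosed block to step_totals[step_name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        step_totals[step_name] += time.perf_counter() - start


def report(totals):
    """Return (step, seconds) pairs sorted from slowest to fastest."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Stand-in workload (an assumption, not the real pipeline).
# Inside calc_hog, the real calls would be wrapped instead, for example:
#   with timed("imread"): img = cv2.imread(imgPath)
#   with timed("resize"): resized_img = cv2.resize(img, (256, 128))
#   with timed("hog"):    features = hog(resized_img, ...)
for _ in range(3):
    with timed("fake_read"):
        time.sleep(0.01)
    with timed("fake_hog"):
        sum(i * i for i in range(10_000))

for step, seconds in report(step_totals):
    print(f"{step:>10}: {seconds:.4f} s")
```

Because the totals are summed across all images, the printed list shows directly which step dominates the run, which is the number that matters here rather than the grand total.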


It takes about 4683 s (1 hour 18 minutes) to build a (12271, 16740) matrix from 12271 images using HoG, and I still have about 3k more images to process.

I don’t mean the grand total.

I mean you need to find out how long each line in calc_hog takes.


This should be a one-time, offline task.

Gather your HoG features once and serialize them for later use.

Great!

So that looks like… twenty-nine seconds (29 s) merely to read the image, and another 0.3 s for the actual processing, the majority of which is the hog() call.

Now, I don’t know how hog is defined; you didn’t show that code. For all I know, it’s not even from OpenCV but from skimage.

Could you provide details on that?

Yes, you are right, that hog comes from skimage. I don’t know exactly how it works, but it is described in this documentation: Histogram of Oriented Gradients (skimage 0.23.2 documentation, scikit-image.org), and the source is scikit-image/skimage/feature/_hog.py on the main branch of scikit-image/scikit-image on GitHub.
By the way, as berak says, could you suggest a keyword or something so that I can find out how to do this only once? (I wonder whether loading the saved features is faster than recomputing them on every run.)

It’s just a NumPy array, so you can use any method for storing and loading NumPy arrays.
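
One possible way to do that, sketched here with np.savez_compressed and np.load (plain np.save would work just as well): compute the features once, write them to disk, and load them on later runs. The helper load_or_compute, the file name hog_features.npz and the stand-in fake_extraction function are all assumptions made for this example, not part of the code posted above.

```python
import os
import tempfile

import numpy as np


def load_or_compute(cache_path, compute_fn):
    """Return (x, y) from cache_path if it exists; otherwise compute and save them."""
    if os.path.exists(cache_path):
        with np.load(cache_path) as data:
            return data["x"], data["y"]
    x, y = compute_fn()
    np.savez_compressed(cache_path, x=x, y=y)
    return x, y


# Stand-in for the slow extraction step (hypothetical): counts its own calls
# so the effect of the cache is visible.
calls = {"n": 0}


def fake_extraction():
    calls["n"] += 1
    rng = np.random.default_rng(0)
    x = rng.random((4, 16740))  # same feature length as in the question
    y = np.array(["cat", "dog", "cat", "dog"])
    return x, y


with tempfile.TemporaryDirectory() as tmp:
    cache_file = os.path.join(tmp, "hog_features.npz")
    x1, y1 = load_or_compute(cache_file, fake_extraction)  # computes and saves
    x2, y2 = load_or_compute(cache_file, fake_extraction)  # loads from disk

print("extraction calls:", calls["n"])
print("loaded shape:", x2.shape)
```

With the functions from the question, the call would presumably look like load_or_compute("hog_features.npz", lambda: loadDataset(imgPath, dataf, calc_hog)), so the hour-long extraction only runs when the cache file is missing.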
