Pedestrian Detection using Histograms of Oriented Gradients (HoG) and a Random Forests Classifier

From the INRIA dataset, 1,178 images of pedestrians (positive samples), each 64×128 pixels, and 4,530 negative samples of the same size were extracted and randomly divided into two parts, giving a training set and a validation set. HoG features computed on the training images were used to train a Random Forests classifier with 100 trees, each limited to a maximum depth of 5.
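
The data split and the size of the feature vector can be sketched with the standard library alone. This is a minimal sketch under assumptions not stated above: that each class is shuffled and cut into equal halves, and that the HoG descriptor uses the common Dalal-Triggs layout (8×8-pixel cells, 2×2-cell blocks, a block stride of one cell, 9 orientation bins). The names `split_in_two` and `hog_length` are hypothetical.

```python
import random

def split_in_two(samples, seed=0):
    """Shuffle a copy of `samples` and cut it into two halves (assumed 50/50)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def hog_length(win_w=64, win_h=128, cell=8, block_cells=2, bins=9):
    """Descriptor length for one window, assuming a block stride of one cell."""
    cells_x, cells_y = win_w // cell, win_h // cell
    blocks_x = cells_x - block_cells + 1
    blocks_y = cells_y - block_cells + 1
    return blocks_x * blocks_y * block_cells * block_cells * bins

positives = list(range(1178))   # stand-ins for the pedestrian crops
negatives = list(range(4530))   # stand-ins for the negative crops
pos_train, pos_val = split_in_two(positives)
neg_train, neg_val = split_in_two(negatives)
print(len(pos_train), len(neg_train), hog_length())
```

Under these assumptions the forest would be trained on 589 positive and 2,265 negative samples, each described by a 3,780-element vector.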

For detection, the search was carried out at a single scale only (for testing purposes), on a separate set of images collected in the lab.
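
A single-scale search amounts to sliding the 64×128 detection window across the image and classifying each position. The sketch below only enumerates the window positions; the 8-pixel stride, the 640×480 image size, and the name `sliding_windows` are assumptions for illustration, not details of the original experiment.

```python
def sliding_windows(img_w, img_h, win_w=64, win_h=128, stride=8):
    """Yield (x, y, w, h) for every window that fits fully inside the image."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)

windows = list(sliding_windows(640, 480))
print(len(windows))  # 73 columns x 45 rows of window positions
```

Each window would then be turned into a HoG vector and passed to the trained classifier; a multi-scale detector would repeat this over a pyramid of resized images.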

Completely unoptimized code already runs at 20 fps on a Core 2 Duo laptop.

SURF Feature Descriptors on Python OpenCV

This is a test of SURF features (claimed to be faster than the original SIFT feature descriptors) in Python OpenCV.
Parameters used were (0, 300, 3, 4):
extended: 0 (basic, 64-element feature descriptors)
hessian threshold: 300
number of octaves: 3
number of layers within each octave: 4
Time taken for execution: 0.108652 seconds on a 249×306 image (MacBook Air, 2.13 GHz Core 2 Duo, 4 GB RAM)
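
The four-element parameter tuple is easier to read with names attached. The sketch below is a hypothetical helper, not part of OpenCV: `SurfParams` and `descriptor_length` are illustrative names, and the fields follow the parameter descriptions above.

```python
from collections import namedtuple

# Field order matches the tuple passed to the extractor: (0, 300, 3, 4)
SurfParams = namedtuple(
    "SurfParams", "extended hessian_threshold n_octaves n_octave_layers"
)

def descriptor_length(params):
    """64 elements for basic descriptors, 128 for extended ones."""
    return 128 if params.extended else 64

params = SurfParams(extended=0, hessian_threshold=300, n_octaves=3, n_octave_layers=4)
print(tuple(params), descriptor_length(params))
```

Because a namedtuple is still a tuple, `tuple(params)` gives back the plain (0, 300, 3, 4) form used in the listing.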

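# SURF Feature Descriptor
# Programmed by Jay Chakravarty

import sys
from time import time

import cv  # legacy OpenCV 2.x Python bindings

if __name__ == '__main__':
    # The original listing does not show how the images are loaded; reading
    # the two image paths from the command line is an assumption.
    cv_image1 = cv.LoadImage(sys.argv[1])
    cv_image2 = cv.LoadImage(sys.argv[2])

    # single-channel, 8-bit buffers for the greyscale copies
    cv_image1_grey = cv.CreateImage( (cv_image1.width, cv_image1.height), 8, 1 )
    cv_image2_grey = cv.CreateImage( (cv_image2.width, cv_image2.height), 8, 1 )

    cv.CvtColor(cv_image1, cv_image1_grey, cv.CV_BGR2GRAY)
    cv.CvtColor(cv_image2, cv_image2_grey, cv.CV_BGR2GRAY)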
    # SURF descriptors
    # param1: extended: 0 means basic descriptors (64 elements each), 1 means extended descriptors (128 elements each)
    # param2: hessianThreshold: only features with a hessian larger than this are extracted. A good default value is
    #   ~300-500 (it can depend on the average local contrast and sharpness of the image). The user can further
    #   filter out some features based on their hessian values and other characteristics.
    # param3: nOctaves: the number of octaves to be used for extraction. With each successive octave the feature
    #   size is doubled (3 by default)
    # param4: nOctaveLayers: the number of layers within each octave (4 by default)
    tt = float(time())    

    #SURF for image1
    (keypoints1, descriptors1) = cv.ExtractSURF(cv_image1_grey, None, cv.CreateMemStorage(), (0, 300, 3, 4))
    tt = float(time()) - tt
    print("SURF time image 1 = %g seconds\n" % (tt))
    # draw circles around keypoints in image1
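    # (keypoint coordinates are floats, so cast them for the drawing call;
    #  `direction` avoids shadowing the built-in dir)
    for ((x, y), laplacian, size, direction, hessian) in keypoints1:
        cv.Circle(cv_image1, (int(x), int(y)), int(size*1.2/9.*2), cv.Scalar(0,0,255), 1, 8, 0)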

    cv.ShowImage("SURF_mvg1", cv_image1)
    tt = float(time())    
    #SURF for image2
    (keypoints2, descriptors2) = cv.ExtractSURF(cv_image2_grey, None, cv.CreateMemStorage(), (0, 300, 3, 4))
    tt = float(time()) - tt
    print("SURF time image 2 = %g seconds\n" % (tt))

    # draw circles around keypoints in image2
    for ((x, y), laplacian, size, direction, hessian) in keypoints2:
        cv.Circle(cv_image2, (int(x), int(y)), int(size*1.2/9.*2), cv.Scalar(0,0,255), 1, 8, 0)
    cv.ShowImage("SURF_mvg2", cv_image2)

    # keep both windows open until a key is pressed
    cv.WaitKey(0)
