HOG
The histogram of oriented gradients (HOG) is a feature descriptor used in computer vision for object detection. The feature descriptors produced by this approach are widely used in pedestrian detection.
The technique counts the occurrences of gradient orientation in localized portions of an image. HOG is computed over a dense grid of uniformly spaced cells and normalized over overlapping blocks, for improved accuracy. The concept behind HOG is that the object appearance and shape within an image can be described by the distribution of intensity gradients or edge direction.
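The per-cell counting step described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name `cellHistogram`, the central-difference gradient, the zero padding outside the image, the unsigned 0 to 180 degree orientation range, and nearest-bin voting weighted by magnitude are all assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr int kCellSize = 8;  // CELL_HEIGHT = CELL_WIDTH = 8
constexpr int kNumBins = 9;   // NOB = 9

// Hypothetical helper: accumulates the gradient magnitudes of one 8x8 cell
// into 9 orientation bins of 20 degrees each (unsigned orientation, assumed).
// img is a row-major grayscale image; (cx, cy) is the cell's top-left pixel.
std::array<float, kNumBins> cellHistogram(const std::vector<std::uint8_t>& img,
                                          int width, int height, int cx, int cy) {
    assert(img.size() == static_cast<std::size_t>(width) * height);
    const float kPi = 3.14159265358979f;
    std::array<float, kNumBins> hist{};  // all bins start at zero

    // Pixel fetch with zero padding outside the image (assumed border mode).
    auto at = [&](int x, int y) -> float {
        if (x < 0 || x >= width || y < 0 || y >= height) return 0.0f;
        return static_cast<float>(img[static_cast<std::size_t>(y) * width + x]);
    };

    for (int y = cy; y < cy + kCellSize; ++y) {
        for (int x = cx; x < cx + kCellSize; ++x) {
            // Central differences in x and y.
            float gx = at(x + 1, y) - at(x - 1, y);
            float gy = at(x, y + 1) - at(x, y - 1);
            float mag = std::sqrt(gx * gx + gy * gy);

            // Fold the orientation into [0, 180) degrees.
            float angle = std::atan2(gy, gx) * 180.0f / kPi;
            if (angle < 0.0f) angle += 180.0f;
            if (angle >= 180.0f) angle -= 180.0f;

            // Nearest-bin vote, clamped to guard against rounding at 180.
            int bin = static_cast<int>(angle / (180.0f / kNumBins));
            if (bin >= kNumBins) bin = kNumBins - 1;
            hist[bin] += mag;
        }
    }
    return hist;
}
```

For an image containing a single vertical edge, all of the gradient energy in a cell that straddles the edge lands in the first bin, because the gradient there is purely horizontal.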
The function accepts both RGB and gray inputs. In RGB mode, gradients are computed for each plane separately, and the gradient with the highest magnitude is selected. With the provided configurations, the window dimensions are 64x128 and the block dimensions are 16x16.
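The RGB selection rule can be illustrated with a short sketch. The `Gradient` type and the `selectDominant` helper are hypothetical names introduced for this example, and the tie-breaking choice (keep the earliest plane) is an assumption, since the text does not specify it.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-plane gradient at one pixel position.
struct Gradient {
    int gx;
    int gy;
};

// Squared magnitude is enough for comparison and avoids a square root.
inline int magnitudeSquared(const Gradient& g) {
    return g.gx * g.gx + g.gy * g.gy;
}

// Returns the gradient of the plane with the largest magnitude.
// On a tie, the earliest plane is kept (assumption for this sketch).
inline Gradient selectDominant(const std::vector<Gradient>& planes) {
    assert(!planes.empty());
    Gradient best = planes[0];
    for (std::size_t i = 1; i < planes.size(); ++i) {
        if (magnitudeSquared(planes[i]) > magnitudeSquared(best)) {
            best = planes[i];
        }
    }
    return best;
}
```

Comparing squared magnitudes gives the same ordering as comparing magnitudes, which is why the sketch never takes a square root during selection.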
API Syntax
template void xFHOGDescriptor(xF::Mat &_in_mat, xF::Mat &_desc_mat);
Parameter Descriptions
The following table describes the template parameters.
PARAMETERS | DESCRIPTION |
---|---|
WIN_HEIGHT | The number of pixel rows in the window. It is fixed at 128. |
WIN_WIDTH | The number of pixel columns in the window. It is fixed at 64. |
WIN_STRIDE | The pixel stride between two adjacent windows. It is fixed at 8. |
BLOCK_HEIGHT | Height of the block. It is fixed at 16. |
BLOCK_WIDTH | Width of the block. It is fixed at 16. |
CELL_HEIGHT | Number of rows in a cell. It is fixed at 8. |
CELL_WIDTH | Number of columns in a cell. It is fixed at 8. |
NOB | Number of histogram bins for a cell. It is fixed at 9. |
ROWS | Number of rows in the image being processed. (Should be a multiple of 8) |
COLS | Number of columns in the image being processed. (Should be a multiple of 8) |
SRC_T | Input pixel type. Must be either XF_8UC1 or XF_8UC4, for gray and color respectively. |
DST_T | Output descriptor type. Must be XF_32UC1. |
DESC_SIZE | The size of the output descriptor. |
NPC | Number of pixels to be processed per cycle. This function supports only XF_NPPC1 (1 pixel per cycle) operation. |
IMG_COLOR | The type of the image, set as either XF_GRAY or XF_RGB |
OUTPUT_VARIENT | Must be either XF_HOG_RB or XF_HOG_NRB |
The following table describes the function parameters.
PARAMETERS | DESCRIPTION |
---|---|
_in_mat | Input image, of xF::Mat type |
_desc_mat | Output descriptors, of xF::Mat type |
- NO is normal operation (single-pixel processing).
- RB is repetitive blocks (descriptor data are written window-wise).
- NRB is non-repetitive blocks (descriptor data are written block-wise, to reduce the number of writes).
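To see why block-wise output reduces the number of writes, the sketch below counts descriptor values for both variants under the standard Dalal-Triggs layout with the fixed configuration from the tables above. The `HogConfig` struct and the helper names are hypothetical, a block stride of one cell is an assumption, and the counts are not claimed to be the exact value expected for DESC_SIZE.

```cpp
#include <cassert>

// Fixed configuration taken from the parameter table.
struct HogConfig {
    int winW = 64, winH = 128;     // WIN_WIDTH, WIN_HEIGHT
    int winStride = 8;             // pixel stride between windows
    int blockW = 16, blockH = 16;  // BLOCK_WIDTH, BLOCK_HEIGHT
    int cellW = 8, cellH = 8;      // CELL_WIDTH, CELL_HEIGHT
    int nob = 9;                   // NOB
};

// Histogram values in one block: cells per block times bins per cell.
inline int valuesPerBlock(const HogConfig& c) {
    return (c.blockW / c.cellW) * (c.blockH / c.cellH) * c.nob;
}

// Overlapping blocks in one window, assuming a block stride of one cell.
inline int blocksPerWindow(const HogConfig& c) {
    return ((c.winW - c.blockW) / c.cellW + 1) * ((c.winH - c.blockH) / c.cellH + 1);
}

inline int valuesPerWindow(const HogConfig& c) {
    return blocksPerWindow(c) * valuesPerBlock(c);
}

// Detection windows that fit in the image at the given window stride.
inline int windowsInImage(const HogConfig& c, int imgW, int imgH) {
    assert(imgW >= c.winW && imgH >= c.winH);
    return ((imgW - c.winW) / c.winStride + 1) * ((imgH - c.winH) / c.winStride + 1);
}

// Distinct blocks in the image, again assuming a block stride of one cell.
inline int blocksInImage(const HogConfig& c, int imgW, int imgH) {
    return (imgW / c.cellW - 1) * (imgH / c.cellH - 1);
}

// Window-wise output repeats every block shared by overlapping windows.
inline long long rbValues(const HogConfig& c, int imgW, int imgH) {
    return static_cast<long long>(windowsInImage(c, imgW, imgH)) * valuesPerWindow(c);
}

// Block-wise output writes each block exactly once.
inline long long nrbValues(const HogConfig& c, int imgW, int imgH) {
    return static_cast<long long>(blocksInImage(c, imgW, imgH)) * valuesPerBlock(c);
}
```

Under these assumptions each window holds 105 blocks and 3780 values. For a 1920x1080 image, the window-wise count is 105,688,800 values against 1,152,936 for the block-wise count, roughly a 90-fold difference.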
Resource Utilization
The following table shows the resource utilization of the xFHOGDescriptor function for normal operation (1 pixel) mode, as generated in the Vivado HLS 2017.1 version tool for the part Xilinx Xczu9eg-ffvb1156-1-i-es1 at 300 MHz, to process an image of 1920x1080 resolution.
Resource | NRB, Gray | NRB, RGB | RB, Gray | RB, RGB |
---|---|---|---|---|
BRAM_18K | 43 | 49 | 171 | 177 |
DSP48E | 34 | 46 | 36 | 48 |
FF | 15365 | 15823 | 15205 | 15663 |
LUT | 12868 | 13267 | 13443 | 13848 |
Performance Estimate
The following table shows the performance estimates of the xFHOGDescriptor() function for different configurations, as generated in the Vivado HLS 2017.1 version tool for the part Xilinx Xczu9eg-ffvb1156-1-i-es1, to process an image of 1920x1080 resolution.
Operating Mode | Operating Frequency (MHz) | Latency Estimate, Min (ms) | Latency Estimate, Max (ms) |
---|---|---|---|
NRB-Gray | 300 | 6.98 | 8.83 |
NRB-RGBA | 300 | 6.98 | 8.83 |
RB-Gray | 300 | 176.81 | 177 |
RB-RGBA | 300 | 176.81 | 177 |
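As a back-of-the-envelope check on these figures, the time needed just to stream one frame through a one-pixel-per-cycle pipeline can be computed directly. The helper `minFrameTimeMs` is a hypothetical name, and the calculation ignores pipeline fill, line buffering, and output writes, so it is only a lower bound and not a model of the actual design.

```cpp
#include <cassert>

// Lower bound on frame time: one clock cycle per group of pixelsPerCycle pixels.
// cycles / (MHz * 1e6) gives seconds, so cycles / (MHz * 1e3) gives milliseconds.
inline double minFrameTimeMs(long long width, long long height,
                             int pixelsPerCycle, double clockMHz) {
    assert(pixelsPerCycle > 0 && clockMHz > 0.0);
    double cycles = static_cast<double>(width * height) / pixelsPerCycle;
    return cycles / (clockMHz * 1e3);
}
```

For 1920x1080 at 300 MHz and 1 pixel per cycle this gives about 6.91 ms, which is close to the NRB minimum in the table. The much larger RB latency is consistent with the earlier note that window-wise output involves many more writes, although the table itself does not break the latency down.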
Deviations from OpenCV
- Border care
The border care that OpenCV takes in the gradient computation is BORDER_REFLECT_101, in which the border padding is a reflection of the neighboring pixels. The Xilinx implementation instead uses BORDER_CONSTANT (zero padding) for the border care.
- Gaussian weighting
In OpenCV, Gaussian weights are applied to the pixels over the block: a block has 256 pixels, and each position in the block is multiplied by its corresponding Gaussian weight. In the HLS implementation, Gaussian weighting was not performed.
- Cell-wise interpolation
The magnitude values of the pixels are distributed across different cells in the block, but into the corresponding bins.
Pixels in region 1 belong only to their corresponding cells, but pixels in regions 2 and 3 are interpolated to the adjacent 2 cells and 4 cells respectively. This operation was not performed in the HLS implementation.
- Output handling
The output of OpenCV is in column-major form. In the HLS implementation, the output is in row-major form. Also, the feature vector is in the fixed-point type Q0.16 in the HLS implementation, while in OpenCV it is in floating point.
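Two of the deviations above, border care and the fixed-point output type, can be illustrated with a short sketch that may help when comparing results against OpenCV. All helper names are hypothetical. The Q0.16 conversion assumes an unsigned value with 16 fractional bits held in a 16-bit integer; how the values are packed into the 32-bit output words is not covered here.

```cpp
#include <cassert>
#include <cstdint>

// BORDER_REFLECT_101 index mapping (gfedcb|abcdefgh|gfedcba): the edge pixel
// itself is not repeated. Requires n > 1.
inline int reflect101(int i, int n) {
    assert(n > 1);
    while (i < 0 || i >= n) {
        i = (i < 0) ? -i : 2 * (n - 1) - i;
    }
    return i;
}

// Pixel fetch with zero padding (BORDER_CONSTANT with value 0).
inline int pixelConstant(const std::uint8_t* row, int n, int i) {
    return (i < 0 || i >= n) ? 0 : row[i];
}

// Pixel fetch with reflected padding (BORDER_REFLECT_101).
inline int pixelReflect101(const std::uint8_t* row, int n, int i) {
    return row[reflect101(i, n)];
}

// Horizontal central difference at position x under each border mode.
inline int gradXConstant(const std::uint8_t* row, int n, int x) {
    return pixelConstant(row, n, x + 1) - pixelConstant(row, n, x - 1);
}
inline int gradXReflect101(const std::uint8_t* row, int n, int x) {
    return pixelReflect101(row, n, x + 1) - pixelReflect101(row, n, x - 1);
}

// Q0.16 to float: zero integer bits, 16 fractional bits, value = raw / 2^16.
inline float q016ToFloat(std::uint16_t raw) {
    return static_cast<float>(raw) / 65536.0f;
}
```

At the first pixel of a row, the two border modes give different gradients: reflection makes both neighbors the same pixel, so the difference is zero, while zero padding yields the value of the right-hand neighbor. Descriptor values near image borders can therefore differ from OpenCV even before the other deviations are considered.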
Limitations
- The configurations are limited to Dalal’s implementation.
- Image height and image width must be a multiple of cell height and cell width respectively.