Dense Pyramidal LK Optical Flow
Optical flow relies on the following assumptions:
- The pixel intensities of an object change very little between consecutive frames
- Neighboring pixels have similar motion
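Consider a pixel $I(x, y, t)$ in the first frame that moves by a distance $(dx, dy)$ in the next frame, taken after time $dt$. Because the intensity of that pixel is assumed not to change:

$$I(x, y, t) = I(x + dx, y + dy, t + dt)$$

Taking the Taylor series approximation of the right-hand side, removing common terms, and dividing by $dt$ gives the following equation:

$$f_x u + f_y v + f_t = 0$$

Where $f_x = \frac{\partial I}{\partial x}$, $f_y = \frac{\partial I}{\partial y}$, $u = \frac{dx}{dt}$, and $v = \frac{dy}{dt}$.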
The above equation is called the Optical Flow equation, where $f_x$ and $f_y$ are the image gradients and $f_t$ is the gradient along time. However, $(u, v)$ is unknown, and a single equation cannot be solved for two unknown variables. Several methods exist to solve this problem. One method is Lucas-Kanade.

Recall the assumption that all neighboring pixels have similar motion. The Lucas-Kanade method takes a patch around the point, whose size is defined through the ‘WINSIZE’ template parameter, and treats all the points in that patch as having the same motion. It is possible to find $(f_x, f_y, f_t)$ for these points. Thus, the problem becomes solving ‘WINSIZE * WINSIZE’ equations with two unknown variables, which is over-determined. A better solution is obtained with the least-squares fit method. Below is the final solution, which is a problem with two equations and two unknowns:

$$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \sum_i f_{x_i}^2 & \sum_i f_{x_i} f_{y_i} \\ \sum_i f_{x_i} f_{y_i} & \sum_i f_{y_i}^2 \end{bmatrix}^{-1} \begin{bmatrix} -\sum_i f_{x_i} f_{t_i} \\ -\sum_i f_{y_i} f_{t_i} \end{bmatrix}$$
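The least-squares step can be illustrated with a small host-side sketch. It is not the kernel's implementation: the `Gradient` struct, the `solveLkWindow` name, and the `minDet` singularity threshold are hypothetical and chosen only for illustration. The function accumulates the five sums that make up the 2x2 normal equations over one window and solves them for (u, v).

```cpp
#include <cmath>
#include <vector>

// One gradient sample (f_x, f_y, f_t) for a pixel inside the window.
// Hypothetical helper type, not part of the library.
struct Gradient {
    double fx;
    double fy;
    double ft;
};

// Solves the 2x2 least-squares system for a single window.
// Returns false when the system is (near-)singular, for example on a flat
// patch where every gradient is zero. minDet is an assumed threshold.
bool solveLkWindow(const std::vector<Gradient>& window,
                   double& u, double& v, double minDet = 1e-9) {
    double sxx = 0.0, sxy = 0.0, syy = 0.0, sxt = 0.0, syt = 0.0;
    for (const Gradient& g : window) {
        sxx += g.fx * g.fx;
        sxy += g.fx * g.fy;
        syy += g.fy * g.fy;
        sxt += g.fx * g.ft;
        syt += g.fy * g.ft;
    }

    // Determinant of [[sxx, sxy], [sxy, syy]].
    const double det = sxx * syy - sxy * sxy;
    if (std::fabs(det) < minDet) {
        return false;
    }

    // Inverse of the 2x2 matrix applied to the vector (-sxt, -syt).
    u = (sxy * syt - syy * sxt) / det;
    v = (sxy * sxt - sxx * syt) / det;
    return true;
}
```

For a window whose samples all satisfy the Optical Flow equation with the same (u, v), the function recovers that (u, v) up to floating-point error. A window of all-zero gradients is rejected because the matrix cannot be inverted.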
API Syntax
template<int NUM_PYR_LEVELS, int NUM_LINES, int WINSIZE, int FLOW_WIDTH, int FLOW_INT, int TYPE, int ROWS, int COLS, int NPC>
void xFDensePyrOpticalFlow(
    xF::Mat & _current_img,
    xF::Mat & _next_image,
    xF::Mat & _streamFlowin,
    xF::Mat & _streamFlowout,
    const int level,
    const unsigned char scale_up_flag,
    float scale_in)
Parameter Descriptions
The following table describes the template and the function parameters.
Parameter | Description |
---|---|
NUM_PYR_LEVELS | Number of Image Pyramid levels used for the optical flow computation |
NUM_LINES | Number of lines to buffer for the remap algorithm – used to find the temporal gradient |
WINSIZE | Window Size over which Optical Flow is computed |
FLOW_WIDTH, FLOW_INT | Data width and number of integer bits that define the signed flow vector data type. The integer bits include the sign bit. The default type is a 16-bit signed word with 10 integer bits and 6 fractional bits. |
TYPE | Pixel type of the input image. XF_8UC1 is the only supported value. |
ROWS | Maximum Height or number of rows to build the hardware for this kernel |
COLS | Maximum Width or number of columns to build the hardware for this kernel |
NPC | Number of pixels the hardware kernel must process per clock cycle. Only XF_NPPC1, 1 pixel per cycle, is supported. |
_current_img | First input image stream |
_next_image | Second input image, for which the optical flow is computed with respect to the first image |
_streamFlowin | 32-bit Packed U and V flow vectors input for optical flow. The bits from 31-16 represent the flow vector U while the bits from 15-0 represent the flow vector V. |
_streamFlowout | 32-bit Packed U and V flow vectors output after optical flow computation. The bits from 31-16 represent the flow vector U while the bits from 15-0 represent the flow vector V. |
level | Image pyramid level at which the algorithm is currently computing the optical flow. |
scale_up_flag | Flag to enable the scaling-up of the flow vectors. This flag is set at the host when switching from one image pyramid level to the other. |
scale_in | Floating-point scale factor for scaling up the flow vectors. The value is (previous_rows-1)/(current_rows-1). It is not 1 when switching from one image pyramid level to another. |
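
The packed flow format and the scale_in formula from the table can be sketched on the host as follows. This is a minimal illustration, assuming the default flow type (16-bit signed word, 6 fractional bits). The helper names `toFixed`, `packFlow`, `unpackFlow`, and `scaleInFactor` are hypothetical, and the round-to-nearest and saturation behavior is an assumption, since the table does not state how the hardware rounds.

```cpp
#include <cmath>
#include <cstdint>

// Fractional bits of the default flow type (FLOW_WIDTH = 16, FLOW_INT = 10).
constexpr int kFracBits = 6;

// Float to signed 16-bit fixed point. Rounding to nearest and saturating
// are assumptions made for this sketch.
std::int16_t toFixed(float value) {
    float scaled = std::round(value * static_cast<float>(1 << kFracBits));
    if (scaled > 32767.0f) scaled = 32767.0f;
    if (scaled < -32768.0f) scaled = -32768.0f;
    return static_cast<std::int16_t>(scaled);
}

// Reinterprets 16 raw bits as a signed fixed-point value and converts it to float.
float fromFixedBits(std::uint16_t bits) {
    std::int32_t raw = bits;
    if (raw >= 0x8000) raw -= 0x10000;  // undo two's complement
    return static_cast<float>(raw) / static_cast<float>(1 << kFracBits);
}

// Packs U into bits 31-16 and V into bits 15-0.
std::uint32_t packFlow(float u, float v) {
    const std::uint16_t uBits = static_cast<std::uint16_t>(toFixed(u));
    const std::uint16_t vBits = static_cast<std::uint16_t>(toFixed(v));
    return (static_cast<std::uint32_t>(uBits) << 16) | vBits;
}

// Extracts U from bits 31-16 and V from bits 15-0.
void unpackFlow(std::uint32_t packed, float& u, float& v) {
    u = fromFixedBits(static_cast<std::uint16_t>(packed >> 16));
    v = fromFixedBits(static_cast<std::uint16_t>(packed & 0xFFFFu));
}

// scale_in as given in the table: (previous_rows - 1) / (current_rows - 1).
float scaleInFactor(int previousRows, int currentRows) {
    return static_cast<float>(previousRows - 1) /
           static_cast<float>(currentRows - 1);
}
```

With these helpers, a flow of (1.5, -0.25) packs to 0x0060FFF0, and `scaleInFactor` returns 1 whenever both levels have the same number of rows.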
Resource Utilization
The following table summarizes the resource utilization of xFDensePyrOpticalFlow for the 1 pixel per cycle implementation, with the optical flow computed for a window size of 11 over an image size of 1920x1080 pixels. The results are after implementation in Vivado HLS 2017.1 for the Xilinx xczu9eg-ffvb1156-2L-e FPGA at 300 MHz.
Operating Mode | Operating Frequency (MHz) | LUTs | FFs | DSPs | BRAMs |
---|---|---|---|---|---|
1 Pixel | 300 | 32231 | 16596 | 52 | 215 |
Performance Estimate
The following table summarizes performance figures on hardware for the xFDensePyrOpticalFlow function for 5 iterations over 5 pyramid levels scaled down by a factor of two at each level. This has been tested on the zcu102 evaluation board.
Operating Mode | Operating Frequency (MHz) | Image Size | Max Latency Estimate (ms) |
---|---|---|---|
1 pixel | 300 | 1920x1080 | 49.7 |
1 pixel | 300 | 1280x720 | 22.9 |
1 pixel | 300 | 1226x370 | 12.02 |