Dipping toes in GPU programming

After trying many different ways to speed up poselets, I could not get below 1 sec per frame. The best I could do was 6 sec per frame (using the pyramid levels given in the standard code, which is too many pyramid levels).
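To get a feel for why the number of pyramid levels matters, here is a rough sketch that counts the pixels a detector has to scan across a scale pyramid. It is not the poselets code: the function name `pyramid_pixel_counts`, the 640x480 frame, the `levels_per_octave` values and the `min_side` cutoff are all made-up numbers for illustration.

```python
def pyramid_pixel_counts(width, height, levels_per_octave, min_side):
    """Return the pixel count of each pyramid level, finest level first.

    Hypothetical helper: each level shrinks the image by 2**(-1/levels_per_octave)
    and the pyramid stops once the shorter side drops below min_side.
    """
    scale_step = 2.0 ** (-1.0 / levels_per_octave)  # shrink factor between levels
    counts = []
    scale = 1.0
    while True:
        w = int(round(width * scale))
        h = int(round(height * scale))
        if min(w, h) < min_side:
            break
        counts.append(w * h)
        scale *= scale_step
    return counts


# Assumed example values, not taken from the standard poselets code.
dense = pyramid_pixel_counts(640, 480, levels_per_octave=8, min_side=64)
sparse = pyramid_pixel_counts(640, 480, levels_per_octave=2, min_side=64)

print("dense pyramid: ", len(dense), "levels,", sum(dense), "pixels in total")
print("sparse pyramid:", len(sparse), "levels,", sum(sparse), "pixels in total")
```

Since the per-frame work grows roughly with the total pixel count, comparing the two printed totals gives a crude idea of how much a denser pyramid costs. The real cost also depends on the feature computation and the classifier, which this sketch ignores.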

Therefore the natural progression was to go to the GPU. I am going through the lectures of “Introduction to Parallel Programming” on Udacity. So far the lectures are going quickly. The tricky parts have not yet started, but the instructor’s teaching style keeps you interested and involved.

Hopefully I will be calculating HOG and evaluating convolutions next week. The interesting part will be how to make it work at different scales while keeping everything on the GPU. Keeping fingers crossed for getting below 1 sec per frame.