Posts Tagged ‘ CVPR ’

Vision and Deep Learning in 2012

This entry is an effort to collect important Deep Learning papers published in 2012, especially those related to computer vision.

There are general resources, but no good one that collects Deep Learning papers with respect to computer vision problems.

General Resources 

Interesting Papers 

Co-Segmentation in CVPR 2012

Here is a list of the Co-Segmentation papers in CVPR 2012. {If you find some other interesting papers regarding Co-Segmentation, please send a message or post a comment. Thanks!}

  • “Multi-Class Cosegmentation” Armand Joulin, Francis Bach, Jean Ponce
  • “On Multiple Foreground Cosegmentation” Gunhee KIM, Eric P. Xing
  • “Higher Level Segmentation: Detecting and Grouping of Invariant Repetitive Patterns” Yunliang Cai, George Baciu: not directly a co-segmentation paper, but it can be seen that way.
  • “Random Walks based Multi-Image Segmentation: Quasiconvexity Results and GPU-based Solutions” Maxwell D. Collins, Jia Xu, Leo Grady, Vikas Singh
  • “A Hierarchical Image Clustering Cosegmentation Framework” Edward Kim, Hongsheng Li, Xiaolei Huang
  • “Unsupervised Co-segmentation Through Region Matching” Jose C. Rubio, Joan Serrat, Antonio López, Nikos Paragios

Some interesting papers to look into

  • “Learning Image-Specific Parameters for Interactive Segmentation” Zhanghui Kuang, Dirk Schnieders, Hao Zhou, Kwan-Yee K. Wong, Yizhou Yu, Bo Peng
  • “Graph Cuts Optimization for Multi-Limb Human Segmentation in Depth Maps” Antonio Hernández-Vela, Nadezhda Zlateva, Alexander Marinov, Miguel Reyes, Petia Radeva, Dimo Dimov, Sergio Escalera
    • {Just want to read it to see how the depth data is being used}
  • “Active Learning for Semantic Segmentation with Expected Change”  Alexander Vezhnevets, Joachim M. Buhmann, Vittorio Ferrari
    • Basic objective is to learn about “Active Learning” and how it is used.
  • “Semantic Segmentation using Regions and Parts”  Pablo Arbeláez, Bharath Hariharan, Chunhui Gu, Saurabh Gupta, Lubomir Bourdev, Jitendra Malik
  • “Affinity Learning via Self-diffusion for Image Segmentation and Clustering” Bo Wang, Zhuowen Tu
  • “Bag of Textons for Image Segmentation via Soft Clustering and Convex Shift”  Zhiding Yu, Ang Li, Oscar C. Au, Chunjing Xu
  • “Multiple Clustered Instance Learning for Histopathology Cancer Image Classification, Segmentation and Clustering” Yan Xu, Jun-Yan Zhu, Eric Chang, Zhuowen Tu
  • “Maximum Weight Cliques with Mutex Constraints for Video Object Segmentation”  Tianyang Ma, Longin Jan Latecki

CVPR 2012, attending Sebastian Thrun’s talk

{being updated as the talk goes on}

Sebastian is wearing a version of Google Glass and showing the Google driverless car video. The project is interesting, but the story behind it is even more interesting: they show a video of one of the researchers, who is legally blind, driving it.

He is showing how change detection and segmentation will become important problems.

If you guys can find the video link, do share; the talk is good enough to listen to again.

Talking about California: motorbikes can pass very close to a car, or in between two cars to overtake them, so it becomes difficult to track them, especially when they come so near to the car that they appear to be one object. He mentions that the same kind of problem can be seen with the Kinect, and if someone working on tracking can solve this problem more efficiently, it could be life-saving.

He says the driverless car is a lot safer than a human in case of collision, but mentions that there are situations the computer cannot fully understand, and it then reacts improperly.

Showing one more application, used by motor-patrol personnel, so that they don’t have to give much attention to the car and can do their job.

Excellent talk, although we were discussing the case of such cars in cities like Lahore, Delhi, or even New York. In some crowded cities, situations arise quite commonly where such “safe” cars might end up in deadlock.

Computer Vision algorithms where are implementations?

Why can’t one find implementations of papers published in prestigious conferences and journals, even ones published two years ago?
It takes away an important tool for comparing current algorithms with previous work. If you want to compare, you have to implement the paper yourself and end up figuring out all the tuning parameters. This greatly hampers the speed of research and also introduces mistrust: was the author’s algorithm really working, or am I making some mistake? Was the tuning done by hand, or is there some other algorithm picking those values?

CVPR, ECCV, and other prestigious conferences should ask authors to make their implementations available, or at least available on request, after the conference is held. I know certain implementations cannot be released due to contractual obligations; however, if an algorithm has been published, it should come with an open-source implementation, or at least an executable for the popular platforms.

Questioning Sparsity

Went through Rigamonti’s CVPR 2011 paper “Are Sparse Representations Really Relevant for Image Classification?” {Rigamonti, Brown, Lepetit}

Recently there have been quite a lot of papers on sparsity, so the above is a very valid question. They report a lot of experiments and compare many different techniques. Their conclusion is that sparsity is important while learning the feature dictionary, but not helpful during classification. Although the only thing it convinced me of is that perhaps, in their setting, the convexity is not working.

Looking forward to seeing rebuttals, or papers questioning or answering the questions raised by Rigamonti, in the coming year. Overall this appears to be a paper that will be cited quite a lot.
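For context on what “sparse at encoding time” means, here is a toy sketch (mine, not the paper’s code) contrasting an L1-penalized encoder (ISTA) with a plain least-squares projection onto the same fixed dictionary. The dictionary `D`, the penalty `lam`, and all the sizes are made-up assumptions purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a fixed (pretend already-learned) dictionary D and one signal x.
n_atoms, dim = 50, 20
D = rng.normal(size=(dim, n_atoms))
D /= np.linalg.norm(D, axis=0)            # unit-norm atoms
x = D @ (rng.normal(size=n_atoms) * 0.1)  # signal living in the span of D

def ista(D, x, lam=0.05, n_iter=300):
    """L1-penalized encoding: min_a 0.5*||x - D a||^2 + lam*||a||_1."""
    L = np.linalg.norm(D, 2) ** 2         # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        g = D.T @ (D @ a - x)             # gradient of the smooth part
        z = a - g / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

a_sparse = ista(D, x)                                # sparse code
a_dense = np.linalg.lstsq(D, x, rcond=None)[0]       # plain projection, no sparsity

print("nonzeros (sparse):", np.count_nonzero(np.abs(a_sparse) > 1e-8))
print("nonzeros (dense): ", np.count_nonzero(np.abs(a_dense) > 1e-8))
```

Rigamonti et al.’s question, in these terms, is whether paying for the sparse encoder `a_sparse` at test time buys the classifier anything over the cheap dense projection.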

Anchors and Cluster Centers

Went through “A Probabilistic Representation for Efficient Large Scale Visual Recognition Tasks” (Bhattacharya, Sukthankar, Jin, Mubarak Shah), CVPR 2011. It has good results, but basically they are trying to find weights to fit a mixture of Gaussians (each component centered on a selected feature vector).

That’s what my understanding is ……

Instead of doing clustering to find the words of the dictionary, they randomly select features from the dataset and call them ‘Anchors’. These ‘Anchors’ then play the same role as words. Instead of matching each feature to only one word, they get a weight on each word: for a given image with K features, each feature says how important each ‘Anchor’ is, and that makes up the weight vector ‘w’. They find the weight vector through a maximum likelihood estimator. Once they have a weight vector for each image, they use an SVM for classification.
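A minimal sketch of how I read the idea, not the paper’s actual estimator: random features become ‘Anchors’, and each image’s features vote softly on every anchor. The Gaussian-kernel responsibility below is a stand-in for their maximum-likelihood fit of the mixture weights, and the sizes, the bandwidth `sigma`, and the kernel choice are all my assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: each "image" is a set of K local feature vectors (SIFT-like).
dim, n_anchors, K = 8, 10, 30
pool = rng.normal(size=(500, dim))        # features pooled from all images

# Step 1: instead of k-means words, pick Anchors at random from the pool.
anchors = pool[rng.choice(len(pool), size=n_anchors, replace=False)]

def image_weights(features, anchors, sigma=1.0):
    """Soft weight of each anchor for one image: every feature votes for all
    anchors via a Gaussian kernel (standing in for the paper's MLE of mixture
    weights), then the votes are averaged into one per-image vector."""
    d2 = ((features[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    votes = np.exp(-d2 / (2 * sigma ** 2))      # K x n_anchors responsibilities
    votes /= votes.sum(axis=1, keepdims=True)   # each feature's votes sum to 1
    return votes.mean(axis=0)                   # per-image weight vector w

img_features = rng.normal(size=(K, dim))        # one image's K features
w = image_weights(img_features, anchors)
print(w.shape, w.sum())                         # one weight per anchor; sums to 1
```

The resulting per-image vector `w` is what would then be fed to the SVM.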

Their results are good, but I still have questions about how they know how many Anchors to randomly pick; also, every time they run their experiment the results will be different, because the Anchors have changed.

But again, they have done extensive experiments. One should have a look at their experiments section.

CVPR 2011, interesting Papers

A few of the papers that look interesting: