Archive for the ‘ Links ’ Category

How does Google Photos find trees?


A friend posted on Facebook that if he types “alligator” into his Google Photos search page, it finds all the photos of alligators among his images, even the ones that are not tagged. I was intrigued …

If you want to learn how it works, go to the last section of this post, where I have listed links for both general and technical reading. Let me first dazzle you with some results.

So I searched for trees in my photos,

IMG_20141107_111758710_HDR IMG_20150103_122857248_HDR IMG_20141106_120345930_HDR

It was even able to find a few things that merely look like trees, for example the following from a Museum of Fine Arts exhibition:

DSCN2797

This made sense to me: one can look at the texture, shape, and colors and infer that an image has a tree in it. They might have added some context too, such as whether there is a horizon. (I already knew that they use a CNN, a convolutional neural network; this was me trying to make sense of the results as if I did not know that.)

What really surprised me were the results below. The one on the left has no color, none of the tree-like texture, and only a very thin tree-like structure. On the right is an image in which the trees have been made blue (it is an art installation where trees were colored blue to represent veins carrying oxygen), so the search cannot rely on color cues.

IMG_20150103_125706516_HDR

Gainesville Blue Trees art installation


That made me think they might be using a little more than just image cues: they might be using similarity among images to check whether any similar image has been labeled, tagged by someone, or given a caption or description. In this day and age that is a fair and smart thing to do.

I remembered a very old paper, I think from the UCF Computer Vision Lab (by one of Mubarak Shah's students, I think), that tried to distinguish between grass, shrubs, trees, …, and it was not an easy task. So I experimented with it. The next thing I searched for was “grass”, and yes, the results were quite different. They were not as accurate, but they did make sense.

IMG_20141106_120345930_HDR DSCN2711 IMAG0046 (1)R2 IMG_20141108_174249930_HDR

Terms like “food” give the worst results. More well-defined objects give much better results: for example, it was able to find a “cycle” that was not even the main part of the image, and the results for car and airplane were similarly good. For the search term “chair” it did an interesting thing: it found people in a sitting pose. Most of these were people sitting on a sofa or on the ground, but it had associated the human pose with the concept of “chair”.

How does the magic work? (aka how can Google do these amazing things?)

If you want Google’s view of how the ‘magic’ works, have a look at this blog post: http://googleresearch.blogspot.com/2013/06/improving-photo-search-step-across.html. If you want to learn something from it, look up Freebase, CNNs, and Dr. Hinton, and note that their final layer is just a linear classifier. If you are interested in the technical know-how, have a look at the paper “ImageNet Classification with Deep Convolutional Neural Networks” by Krizhevsky, Sutskever, and Hinton.

However, things have moved at quite a fast pace: about a year ago Google announced that they can now provide a “natural description of images” http://googleresearch.blogspot.com/2014/11/a-picture-is-worth-thousand-coherent.html. Technical things to look for: RNNs (recurrent neural networks) and how they are used for machine translation; have a look at their paper, etc.
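To make the “final layer is just a linear classifier” remark concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function name `classify`, the tiny feature vector, the weights, and the labels are made up and have nothing to do with Google's actual model. It only shows that, once the earlier CNN layers have produced a feature vector, the last step is a matrix multiply followed by a softmax.

```python
import numpy as np

def classify(features, weights, bias, labels, top_k=3):
    """Hypothetical final layer: linear scores followed by a softmax.

    features: 1-D feature vector (assumed to come from earlier CNN layers)
    weights:  (num_labels, num_features) matrix, one row per label
    bias:     (num_labels,) vector
    """
    scores = weights @ features + bias      # one linear score per label
    scores = scores - scores.max()          # shift for numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    order = np.argsort(probs)[::-1][:top_k]  # most probable labels first
    return [(labels[i], float(probs[i])) for i in order]

# Toy numbers, purely illustrative.
features = np.array([1.0, 0.0])
weights = np.array([[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]])
bias = np.zeros(3)
print(classify(features, weights, bias, ["tree", "grass", "car"], top_k=2))
```

A search for “tree” would then amount to returning the photos whose “tree” score is high enough.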


Meet Edward H Adelson


If you have seen the following image, you have met Edward H. Adelson:

He is a faculty member at the M.I.T. Dept. of Brain and Cognitive Sciences. I was recently going over part of his paper “On Seeing Stuff: The Perception of Materials by Humans and Machines”, which is quite interesting. It discusses why recognizing materials is important and points out that machines are not able to do it. One example in the paper is ice cream, which, thanks to its texture, can be recognized even by a child, whereas a machine cannot do that. The paper was published in 2001, and I feel machines still cannot recognize ice cream.

There are many illusions on his webpage; have a look: http://persci.mit.edu/people/adelson

SimpleCV: another open source computer vision library


SimpleCV (http://www.simplecv.org/) is a Python-based library. I have not tried it, but it looks interesting. The question is how it differs from OpenCV, or whether it is just the same thing in a different language.

Car Datasets


I am looking for car detection datasets, especially rear and front views.
So far I have found the following:

  1. http://lear.inrialpes.fr/data
  2. http://www.vision.ee.ethz.ch/~bleibe/data/datasets.html#cars-rear
    1. They also have side-view and multi-view car datasets
  3. http://www.vision.caltech.edu/html-files/archive.html
  4. http://vasc.ri.cmu.edu/idb/html/car/

 

The ones in 2 and 3 are better. I am trying to get more datasets; if you have some, please send me a link.

 

Boundary For Object Detection


Reading “A Boundary Fragment Model for Object Detection” by Opelt et al. (the third author is Zisserman).

This paper uses a distance transform based on the chamfer distance.

An explanation can be found here: http://www.tc18.org/subfields/distance_skeletons/DistanceTransform.pdf

But a much better and simpler explanation is in the paper “Hierarchical Chamfer Matching: A Parametric Edge Matching Algorithm” by Gunilla Borgefors.

Another nice explanation is here: http://www.mis.informatik.tu-darmstadt.de/Education/Courses/cv_ss08/ex4/exercise04.pdf/download

Code can be found here
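For a quick feel of what a chamfer distance transform does, here is a minimal sketch of the classic two-pass 3-4 chamfer transform in Python. It is my own illustration, not the code from the Opelt et al. paper or from any of the links above; the function names `chamfer_distance_transform` and `chamfer_score` are made up, and the matching score is a plain mean, which is only one of several possible choices. Straight steps cost 3 and diagonal steps cost 4, so the result approximates 3 times the Euclidean distance to the nearest edge pixel.

```python
import numpy as np

def chamfer_distance_transform(edges, a=3, b=4):
    """Two-pass chamfer distance transform (illustrative sketch).

    edges: 2-D boolean array, True at edge pixels.
    a, b:  cost of a straight and a diagonal step (3-4 chamfer by default).
    Returns an integer array of distances to the nearest edge pixel.
    """
    edges = np.asarray(edges, dtype=bool)
    rows, cols = edges.shape
    big = a * (rows + cols) + 1          # larger than any reachable distance
    dist = np.where(edges, 0, big).astype(np.int64)

    # Forward pass: top-left to bottom-right, using already visited neighbours.
    for r in range(rows):
        for c in range(cols):
            d = dist[r, c]
            if c > 0:
                d = min(d, dist[r, c - 1] + a)
            if r > 0:
                d = min(d, dist[r - 1, c] + a)
                if c > 0:
                    d = min(d, dist[r - 1, c - 1] + b)
                if c < cols - 1:
                    d = min(d, dist[r - 1, c + 1] + b)
            dist[r, c] = d

    # Backward pass: bottom-right to top-left, using the mirrored neighbours.
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            d = dist[r, c]
            if c < cols - 1:
                d = min(d, dist[r, c + 1] + a)
            if r < rows - 1:
                d = min(d, dist[r + 1, c] + a)
                if c < cols - 1:
                    d = min(d, dist[r + 1, c + 1] + b)
                if c > 0:
                    d = min(d, dist[r + 1, c - 1] + b)
            dist[r, c] = d
    return dist

def chamfer_score(dist, template_points, a=3):
    """Mean distance (in pixel units) under the template's (row, col) points."""
    pts = np.asarray(template_points)
    return float(dist[pts[:, 0], pts[:, 1]].mean()) / a
```

To match a template, one would compute the transform of the image's edge map once and then evaluate `chamfer_score` for the template's edge points at each candidate position; a low score means the template edges lie close to image edges.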

Perspective Nonrigid Shape and Motion Recovery


R. Hartley and F. Schaffalitzky, “Reconstruction from projections using Grassmann tensors,” ECCV,

 http://www.robots.ox.ac.uk/~vgg/publications/papers/hartley04.pdf

ECCV 2008, “Perspective Nonrigid Shape and Motion Recovery”, Richard Hartley and René Vidal http://cis.jhu.edu/~rvidal/publications/eccv08-nonrigid.pdf

The trifocal tensor can be read about here

Structure from Motion


Studying “Nonrigid Structure from Motion in Trajectory Space” by Ijaz Akhter et al.

I have to give a presentation on it to the study group here.

To understand it properly I recommend reading “A Closed-Form Solution to Non-Rigid Shape and Motion Recovery” by Xiao, Chai, and Kanade, IJCV 2006. They explain the mathematics properly.

The starting point, I think, was the work of Tomasi and Kanade (Tomasi-Kanade factorization): “Shape and motion from image streams under orthography: a factorization method”, 1992.
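The core idea of the factorization method can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not the full algorithm from the paper: the function name `factorize` is made up, the input is assumed to be a complete 2F x P matrix of tracked image coordinates (two rows per frame, one column per point), and the metric upgrade step that removes the remaining affine ambiguity is left out.

```python
import numpy as np

def factorize(W):
    """Rank-3 factorization of a measurement matrix (illustrative sketch).

    W: (2F, P) array of image coordinates, two rows per frame, one column
       per tracked point, with no missing entries.
    Returns (M_hat, S_hat, t) such that M_hat @ S_hat + t approximates W.
    M_hat and S_hat are only defined up to an invertible 3x3 matrix.
    """
    W = np.asarray(W, dtype=float)
    t = W.mean(axis=1, keepdims=True)     # per-row centroid (translation)
    W_reg = W - t                         # registered measurement matrix
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    root = np.sqrt(s[:3])                 # keep the three largest singular values
    M_hat = U[:, :3] * root               # motion, shape (2F, 3)
    S_hat = root[:, None] * Vt[:3]        # structure, shape (3, P)
    return M_hat, S_hat, t

# Synthetic check: rigid points seen by random orthographic cameras.
rng = np.random.default_rng(0)
S = rng.normal(size=(3, 20))
frames = []
for _ in range(5):
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthonormal matrix
    frames.append(Q[:2] @ S + rng.normal(size=(2, 1)))
W = np.vstack(frames)
M_hat, S_hat, t = factorize(W)
print(np.allclose(M_hat @ S_hat + t, W))
```

With noise-free rigid data the registered matrix has rank 3, so the truncated SVD reproduces it exactly; with real tracks the discarded singular values absorb the noise.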