Posts Tagged ‘ Research ’

How does Google Photos find trees?


A friend posted on Facebook that when he types “alligator” into his Google Photos search page, it finds all the photos of alligators among his images, even the ones that are not tagged. I was intrigued …

If you want to learn how it works, skip to the last section of this post, where I have collected links for both general and technical reading. Let me first dazzle you with some results.

So I searched for trees in my photos,

IMG_20141107_111758710_HDR IMG_20150103_122857248_HDR IMG_20141106_120345930_HDR

It even found a few things that merely look like trees, for example the following from a Museum of Fine Arts exhibition:

DSCN2797

This made sense to me: one can look at texture, shape, and color, and infer from them that an image contains a tree. They might have added some context too, such as whether there is a horizon. (I already knew that they use a CNN, a convolutional neural network; this was me trying to make sense of the results as if I did not know that.)

What really surprised me were the results below. The one on the left has no color, none of the tree-like texture, and only a very thin tree-like structure. On the right is an image in which the trees have been painted blue (it is an art installation where trees were colored blue to represent veins carrying oxygen), so it cannot be relying on color cues.

IMG_20150103_125706516_HDR

Gainesville Blue Trees art installation

That made me think they may be using a little more than just image cues. They might be using similarity among images: checking whether any similar image has been labeled, tagged by someone, or given a caption or description. In this day and age, that is a fair and smart thing to do.
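To make that guess concrete, here is a minimal sketch of borrowing tags from visually similar images. It is only an illustration of my speculation, not Google’s method: the feature vectors, the tags, the function names, and the `k` and `min_similarity` parameters are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def borrow_tags(query, tagged_images, k=2, min_similarity=0.8):
    """Collect tags from the k most similar tagged images.

    tagged_images is a list of (feature_vector, set_of_tags) pairs.
    Both k and min_similarity are hypothetical tuning knobs.
    """
    scored = sorted(
        ((cosine_similarity(query, feats), tags) for feats, tags in tagged_images),
        key=lambda pair: pair[0],
        reverse=True,
    )
    borrowed = set()
    for similarity, tags in scored[:k]:
        if similarity >= min_similarity:
            borrowed |= tags
    return borrowed


# Toy data: three already-tagged photos with invented 3-d features.
tagged = [
    ([0.9, 0.1, 0.0], {"tree"}),
    ([0.8, 0.2, 0.1], {"tree", "park"}),
    ([0.0, 0.1, 0.9], {"car"}),
]

# An untagged photo whose features resemble the first two.
untagged_photo = [0.85, 0.15, 0.05]
print(borrow_tags(untagged_photo, tagged))
```

In a real system the features would come from something like a CNN rather than being typed in by hand, and the tags would come from captions or user labels.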

I remembered a very old paper, I think from the UCF Computer Vision Lab (by one of Mubarak Shah’s students, I believe), in which they were trying to distinguish between grass, shrubs, trees, … and it was not an easy task. So I experimented with that. The next thing I searched for was “grass”, and yes, the results were quite different. They were not as accurate, but they did make sense.

IMG_20141106_120345930_HDR DSCN2711 IMAG0046 (1)R2 IMG_20141108_174249930_HDR

Broad terms like “food” give the worst results. More well-defined objects give much better results: for example, it was able to find a “cycle” that was not even the main subject of the image, and the results for car and airplane were similarly good. For the search term “chair” it did an interesting thing: it found people in a sitting pose. Most of them were sitting on a sofa or on the ground, but it had associated the human pose with the concept of “chair”.

How does the magic work? (a.k.a. how can Google do these amazing things?)

If you want Google’s own view of how the ‘magic’ works, have a look at this blog post: http://googleresearch.blogspot.com/2013/06/improving-photo-search-step-across.html. If you want to learn something from it, look out for Freebase, CNNs, Dr. Hinton, … and the fact that their final layer is just a linear classifier. If you are interested in the technical know-how, have a look at the paper “ImageNet Classification with Deep Convolutional Neural Networks” by Krizhevsky, Sutskever, and Hinton.
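To show what “the final layer is just a linear classifier” means, here is a minimal sketch: the network turns a photo into a feature vector, and the last step is a weighted sum per label followed by a softmax. The weights, biases, labels, and two-dimensional features below are invented for illustration and have nothing to do with Google’s actual model.

```python
import math


def linear_scores(features, weights, biases):
    """One score per label: dot(weights_row, features) + bias."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases, labels):
    """Return the most probable label and its probability."""
    probs = softmax(linear_scores(features, weights, biases))
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


# Hypothetical labels and hand-picked weights (one row per label).
labels = ["tree", "grass", "car"]
weights = [[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]

# Pretend this came out of the earlier layers of a CNN.
features = [1.0, 0.2]
label, prob = classify(features, weights, biases, labels)
print(label, round(prob, 3))
```

All the hard work is in producing good features; once you have them, adding a new searchable label is roughly a matter of learning one more row of weights.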

However, we know things have moved at quite a fast pace. About a year ago Google announced that they can now provide “natural descriptions of images”: http://googleresearch.blogspot.com/2014/11/a-picture-is-worth-thousand-coherent.html. Technical things to look for: RNNs (recurrent neural networks), how they are used for machine translation, their paper, and so on.


Computer vision algorithms: where are the implementations?


Why can’t one find implementations of papers published in prestigious conferences and journals, even ones that were published two years ago?
This takes away an important tool for comparing current algorithms with previous work. If you want to compare, you have to implement the paper yourself and end up figuring out all the tuning parameters. This both slows down research and introduces mistrust: does the author’s algorithm really work, or am I making a mistake? Was the tuning done by hand, or is some other algorithm picking those values?

CVPR, ECCV, and other prestigious conferences should ask authors to make their implementations available, or available on request, at least after the conference has been held. I know that some implementations cannot be released because of contractual obligations; however, if the algorithm has been published, it should come with an open-source implementation, or at least an executable for the popular platforms.

CVPR 2011, interesting Papers


A few of the papers that look interesting

Mendeley and Zotero


I have started using both 🙂

Both have their pros and cons.

I like Zotero because it is right in my browser: click it and you can add tags and search through them. Want to add web pages? No problem. You can link documents with each other and add crazy tags, notes, and other things. Amazing, isn’t it?

What I don’t like is precisely that it lives in the browser. I am not a huge fan of Firefox, especially since Chrome came along. Somehow Firefox and my computer’s memory management do not go hand in hand. And Zotero does not work with Chrome 😦

Mendeley, on the other hand, lets you view and add notes inside the PDF. Frankly, with Apple’s tablet on the way, I hope they add a feature to write on the PDF with a pen instead of having that yellow Post-it appear; but until then this option looks nice to me.

There is one big problem with Mendeley: it does not let you save a website. For example, I was searching for image datasets and landed on INRIA’s page. I wanted to store it and tag it as “dataset” so that I could find it again later. Yes, I can bookmark it, but what is the point if I have to save and access it through some other interface? Mendeley does not allow it; I don’t know why. Zotero does.

Mendeley does have an option to sync with Zotero, but it only syncs the PDFs and nothing else. So my plan is to use Zotero while working online and Mendeley for offline work.