
Image classification for location identification/scene description

Objective:-
Build an app that lets blind people take a picture and get information about it: if the image contains text, the app should process the text and read it out; if any other picture is taken, the app should describe it.
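
The decision described in the objective can be sketched as a small function. This is only an illustration, assuming the app already holds the text recognised in the image and a caption from whichever vision API it uses. The names `choose_spoken_output` and `min_text_chars` are hypothetical and are not taken from the project code.

```python
def choose_spoken_output(ocr_text, caption, min_text_chars=3):
    """Return the string to hand to the screen reader or text-to-speech.

    ocr_text: text recognised in the image (may be empty or None).
    caption: scene description for the image (may be empty or None).
    min_text_chars: hypothetical threshold used to ignore stray OCR noise.
    """
    # Collapse newlines and repeated spaces so the text reads naturally.
    text = " ".join((ocr_text or "").split())
    if len(text) >= min_text_chars:
        return "Text in image: " + text

    # No usable text was found, so fall back to the scene description.
    description = (caption or "").strip()
    if description:
        return "Image description: " + description

    return "Sorry, the image could not be recognised."
```

For example, `choose_spoken_output("", "a dog sitting on grass")` returns the description branch, while an image of a signboard with recognised text returns the text branch.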

Current status:- 
Below are screenshots of the Android application that has been built, along with the results we are getting for the images taken. Our app works with TalkBack.


[Screenshots of the Android application and its results]

 



Approach:-
1) Performed a comparative analysis of the standard image description APIs from Google and Microsoft on a dataset of images taken in real outdoor scenarios.
Link for the analysis:-
https://drive.google.com/drive/folders/16HspJ5lntmMwDxWE7oPqXJf0n4dNT6Mn
2) Building an app for text and image recognition using the API chosen from the above analysis. The application was built using the Microsoft vision API, as the analysis suggested that it gives better results.
3) Testing the app with blind people.
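
One way such a comparison could be scored is sketched below. This is not the method used in the linked analysis, which is not described here. It assumes each test image has a hand-written list of expected keywords and measures what fraction of them appear in each API's caption. The function names and the data layout are hypothetical.

```python
import re


def keyword_recall(caption, expected_keywords):
    """Fraction of the expected keywords that appear as words in the caption."""
    if not expected_keywords:
        return 0.0
    # Lower-case and split on non-alphanumerics so punctuation is ignored.
    words = set(re.findall(r"[a-z0-9]+", caption.lower()))
    hits = sum(1 for keyword in expected_keywords if keyword.lower() in words)
    return hits / len(expected_keywords)


def compare_apis(results, references):
    """Average keyword recall per API.

    results: {api_name: {image_id: caption}} (assumed layout).
    references: {image_id: [expected keywords]} (hand-labelled, assumed).
    """
    scores = {}
    for api_name, captions in results.items():
        per_image = [
            keyword_recall(captions.get(image_id, ""), keywords)
            for image_id, keywords in references.items()
        ]
        scores[api_name] = sum(per_image) / len(per_image) if per_image else 0.0
    return scores
```

Keyword overlap is a crude measure: it ignores synonyms and word order, so manual inspection of the captions is still needed alongside any such score.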

Challenges:-

For some images we get results that are not relevant to the image captured; this is due to the API used. The challenge is how to choose the API (among Google, Microsoft, etc.) that gives the result closest to the image, setting aside the cases where all the APIs give wrong results. In our app we used the Microsoft vision API, which gives good results for most images but is not always accurate.
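
A possible way to approach this choice at run time is sketched below, assuming each API returns a caption together with a confidence value between 0 and 1. The function name `pick_description` and the threshold of 0.5 are hypothetical. Confidence values from different providers are not calibrated against each other, so comparing them directly is itself an assumption that would need checking on real images.

```python
def pick_description(candidates, min_confidence=0.5):
    """Pick the most confident caption, or None if none is trustworthy.

    candidates: list of (api_name, caption, confidence) tuples, where
    confidence is assumed to lie in [0, 1].
    min_confidence: hypothetical cut-off below which a caption is ignored.
    """
    # Keep only non-empty captions that clear the confidence threshold.
    usable = [c for c in candidates if c[1] and c[2] >= min_confidence]
    if not usable:
        # Every API was unsure: better to say so than read out a wrong caption.
        return None
    return max(usable, key=lambda c: c[2])
```

When the function returns `None`, the app could tell the user that the picture was not recognised and suggest retaking it, instead of reading out an irrelevant description.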

GitHub link:- https://github.com/ManoTeja/COP315

   



