How to begin your career in Computer Vision and Machine Learning Field?

This post is for computer vision enthusiasts. We highlight key resources and computer vision tutorials to help CV aficionados get started in this emerging field. If you are a passionate beginner in computer vision, we hope you find this useful.
Some prior knowledge of linear algebra, calculus, probability and statistics is definitely a plus, but it is not always required. The most important thing is to get started; you can learn other essentials on the fly.
  • Computer Vision – Mubarak Shah (UCF) : All the materials related to the course are available online and what is more interesting is that even the video lectures are available.
  • Computer Vision – Subhransu Maji (UMass Amherst) : Provides access to all the lecture materials and assignments but there are no video lectures.
  • Visual Recognition – Kristen Grauman (UT Austin) : Provides links to some of the interesting and fundamental papers in computer vision.
  • Language and Vision – Tamara Berg (UNC Chapel Hill) : This course is basically aimed towards exploring topics straddling the boundary between Natural Language Processing and Computer Vision.
  • Convolutional Neural Networks for Visual Recognition – Fei-Fei Li and Andrej Karpathy (Stanford University) : This course is a deep dive into deep learning architectures, with a focus on learning end-to-end models for computer vision tasks, particularly image classification.
Some additional resources:

Books:

Computer Vision
In addition to this, it is very useful to be aware of most of the basic image processing techniques presented in the book Digital Image Processing – Gonzalez 2007.
OpenCV Programming
Software Packages:
You can find an exhaustive list of links presenting code which implements some of the standard vision algorithms at
Major Conferences: 
Below are some of the major conferences listed in their ranking order.
Most of the papers published in the above mentioned conferences can be accessed at
A nice way to keep track of conference deadlines is via
Now that you have acquired some knowledge of computer vision and Deep Learning (from the previous post), feel free to compete in Kaggle competitions (the best way to put your learning into practice).
If you would like guidance or support in the CV domain, or have additional resources to share, we would love to hear from you, with no judgment. Drop your comments below.



Mobile based Face Attendance & Monitoring System for Skill Development Institutions

This post is a case study of a vocational training institution that has adopted our face-based solution to manage its processes effectively. We felt that putting this post out would help us articulate and communicate the value that similar institutions can find in adopting robust systems to help them as they scale. Such organizations are working to bring operational improvements and a structured approach to scaling up.

The opportunities and challenges related to skill development in India are enormous. Millions of aspiring and ambitious people are ready to work hard and push their limits. At that scale, handling such variability becomes close to impossible.

Skill development programs are becoming imperative for our ambitious youth, all the more so in a country that is emerging as an economic powerhouse. Today, skill development organizations take care of the end-to-end process: identifying and sourcing young employable people, identifying suitable courses, providing state-of-the-art training, finding jobs and opportunities for the trainees, and mobilizing them.

Improving training methodologies and making the complete training process more effective is a continuous endeavor for these organizations. As a step in this direction, in April 2015 Aindra Systems enabled one of the fastest growing skill development and training institutions to deploy an innovative biometric technology to address three persistent problems:

  • Controlling the drop out rates of the enrolled trainees
  • Ensuring presence of expected trainees at the training sessions
  • Automation of the complete monitoring process

Some of the challenges that the implementation faced were:

  • Working with an unorganized sector, the solution has to be extremely simple to use
  • The solution should be able to work with little or no consistent power
  • It should work in remote areas without consistent Internet connectivity
  • It should not be dependent on expensive hardware and infrastructure maintenance
  • It has to be scalable in nature, to meet the demands of a growing organization
  • Change management within the users of the organization

Following are a few snapshots taken on location at one of the training centres in Hosur, Tamil Nadu.

Face based attendance- monitoring in action
Smart Phone Application Auto Captures the faces for biometric attendance


The mobile based face attendance solution is an application installed on the trainers’ mobile phones. Based on a patent-pending product that uses Artificial Intelligence, the application automatically captures the trainees’ faces and later updates their attendance on any cloud application, be it the NSDC, MIS or any other ERP system.

Deep Learning Online Courses, Reading Materials and Software Packages

Why do you want to learn Deep Learning?

Deep learning has been termed one of the leading scientific breakthroughs in recent years; you can read an interesting article on it in MIT Technology Review. It has resulted in state-of-the-art performance in a variety of areas, including computer vision, natural language processing, reinforcement learning, and speech recognition. It has attracted significant industrial investment, with large groups at Google (which hired Geoff Hinton), Facebook (which hired Yann LeCun to head the Facebook AI Lab), Baidu (which hired Andrew Ng), IBM, Microsoft and others working on applications of deep learning. Most interestingly, we at Aindra are also making use of this cutting-edge technology to develop our products.


Do you want to learn about Deep Learning?

The best way is to take an online course. There are many online courses on Machine Learning but very few on Deep Learning. Here’s a list:

Deep Learning and Neural Networks: by Kevin Duh. It consists of only four lectures but provides an excellent foundational understanding at a level sufficient for anyone to start reading research papers in this exciting and growing area.

Neural Networks for Machine Learning: You can listen to the lectures here from one of the legends of the field, Geoffrey Hinton of the University of Toronto. This course emphasizes both the basic algorithms and the practical tricks needed to get them to work well in applications like speech and object recognition, image segmentation, modeling language and human motion, etc.

Material for the Deep Learning course: Here are all the course-related materials by the legendary Yann LeCun, NYU. It also has pointers to some of the tutorials conducted at various conferences.

Convolutional Neural Networks for Visual Recognition: at Stanford University. Though video lectures are not available online, you can still get access to excellent course notes and assignments.

Deep Learning seminar course: by Lorenzo Torresani at Dartmouth College. This might not be an extensive list of papers related to deep learning, but it definitely mentions some of the important deep learning papers and it is well categorised.

Neural networks class – Université de Sherbrooke: by Hugo Larochelle. Provides in-depth lecture videos along with slides on Deep Learning topics, starting from the very basics of neural networks.


Apart from these online courses, you can also find some interesting tutorial talks at various conferences and workshops. Here is a partial list:

Deep Learning for Natural Language Processing: This was a tutorial at NAACL HLT 2013 presented by Richard Socher, Yoshua Bengio, and Christopher Manning.

Deep Learning of Representations-Google Techtalk: Hear it from Yoshua Bengio, University of Montreal, who has made significant contributions in the deep learning field.

Deep Learning methods for vision: This was one of the CVPR 2012 tutorials.

Software packages which support Deep Learning include

  • Caffe: Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license.
  • Torch7: an extension of the LuaJIT language which includes an object-oriented package for deep learning and computer vision. It is easy to use and efficient, built on the LuaJIT scripting language with an underlying C/CUDA implementation.
  • Theano + Pylearn2: Pylearn2 is a Python wrapper that lets you use Theano’s symbolic differentiation and other capabilities with minimal overhead.
  • cuda-convnet: High-performance C++/CUDA implementation of convolutional neural networks, written by Alex Krizhevsky and based on Yann LeCun’s work.


You can find further materials related to Deep Learning at

Please leave your comments or suggestions to include more sources related to deep learning (that we might have missed in this article).

Why should a startup use its own products

Once a product has been built, what is the best way to understand your customers’ concerns and experience how the product is actually used? I’m sure all product managers talk to their customers regularly to get feedback. They may have customers respond to product satisfaction surveys or hold focus group discussions.

But this might still not be enough to unearth all the latent issues that your customers are facing. What, then, is a good way to put yourself in your customer’s shoes?

We believe it is by actually using your own product. As a proper customer.

While it may be a lot easier for a B2C product startup to have its people use its products, it is not entirely impossible for a B2B startup to do so.

At Aindra Systems, our employees and stakeholders use our SmartAttendance SME product to mark their daily attendance as they come into the office and start work. We haven’t installed any dedicated device, scanner or RFID booth for daily attendance; all of us use our smartphones to mark our biometric attendance through mobile-based face recognition. This has manifold outcomes. The direct benefit is that using the product on a regular basis brings out corner-case defects that would otherwise have gone unnoticed, even after all the testing that is done on the product before it goes live.

The next benefit is that the whole team becomes empathetic to the concerns of the customer. Every fix and every feature that is discussed and deliberated goes through the ‘Customer filter’ before getting implemented. With this approach we should, hopefully, be able to address the “Curse of Knowledge in product design” as well, the effects of which have been wonderfully articulated by Ben Yoskovitz in his blog post.

I am sure most Indian B2C product startups are doing this regularly. But I am curious to know whether B2B startups are doing this, and what their experiences have been.

This is still an experiment in progress for us and we will share more of our insights as we go along.


Does your Start-Up have the right culture?

What makes up the culture of a start-up? Is it the presence of a youthful team? Is it how the team interacts? Is it in working long hours? Is it the collective actions of the people in the start-up in reaching out to a common and shared vision? Is it about displaying extreme passion and pride in what we do?

We at Aindra believe these are only some of the aspects of what should form the culture at our start-up. Have we got it nailed down, done and dusted? I guess not. But we are working towards building it. And how can we dream of achieving a culture that aligns with both our value system and our vision of where we want to go? It is, of course, by having the right kind of people within the start-up: people who will be a part of the journey of building the culture at the start-up.

We have had instances where really motivated guys joined us with pay cuts. We have had instances where early team members who left the start-up and moved to the US for higher studies continued to work with us remotely as interns alongside their coursework. We believe this happens when there is perfect alignment between the underlying values of the people and those of the start-up.

While we have been lucky to have had an awesome set of people working with us, we have also had unfortunate instances of misfits: extremely sharp people who, unfortunately, did not have the cultural fit that we wanted in our start-up. When there is extreme pressure to hire within a short period, it becomes difficult to correctly gauge the cultural fit of the candidate. We have learnt the hard way that this could later turn out to be an extremely costly mistake. When the values of a new hire do not align with our values as a start-up, it is bound to create dissonance. This dissonance becomes more and more acute, until we reach a point where we realize that it is better to let the person move on than to spend mental and physical resources dealing with it. We were in this unfortunate situation once and realized the enormous cost of dealing with it.

This alignment of personal values with the culture at the start-up determines the environment at the start-up and becomes its driving force. While this could be true even for a large company, for a start-up it could mean the difference between succeeding and struggling to make it big. Highs and lows are a regular occurrence in a start-up’s early years. It is the collective culture that sees the start-up through these phases. And what is the best way to create the culture that you want at your start-up? The whole team has to live those values at work, day in and day out. Be ready to let go of critical people when this doesn’t happen, even at the cost of missing a milestone.

And most importantly, the core team has to lead by example.


What are we aiming for as the culture in our start-up?

  • Transparency in our work life
  • Honesty and Integrity while dealing with Customers
  • Personal attention to Customer issues
  • Camaraderie within the team
  • Freedom with boundaries and Flexibility with constraints


We would be happy to hear your thoughts on this and hear how you have built the culture at your start-ups.




Which is the best off-the-shelf classifier?

If you are looking for an off-the-shelf classifier to solve a classification problem on your own data, your best option would be to try a Random Forest or a Support Vector Machine (SVM) with a Gaussian kernel. In a recent study, these two algorithms proved to be the most effective among nearly 200 algorithms tested on more than 100 publicly available data sets.

In this blog post we highlight some important points from the paper “Do we Need Hundreds of Classifiers to Solve Real World Classification Problems?” that can help in choosing the right algorithm for our own machine learning problems.

The authors evaluated 179 classifiers from 17 families on 121 datasets (total number of experiments is 241,637). This is definitely an exhaustive study of classifier performance, and a significant contribution to our community. The datasets range from 10 to 130,064 patterns, from 3 to 262 inputs and from 2 to 100 classes. Here are the classifier families in ranked (ascending) order:

  • Random Forests with 8 variants.
  • SVM with 10 variants.
  • Neural networks with 21 variants.
  • Decision trees with 14 variants.
  • Bagging with 24 variants.
  • Boosting with 20 variants.
  • Other Methods with 10 variants.
  • Discriminant analysis with 20 variants.
  • Nearest neighbor methods with 5 variants.
  • Other ensembles with 11 variants.
  • Logistic and multinomial regression with 3 variants.
  • Multivariate adaptive regression splines with 2 variants.
  • Generalized Linear Models with 5 variants.
  • Partial least squares and principal component regression with 6 variants.
  • Rule-based methods with 12 variants.
  • Bayesian approaches with 6 variants.
  • Stacking with 2 variants.

The average accuracies of the 25 best-performing classifiers are shown in the graph below. Among them, the best performers are random forest and SVM with Gaussian kernel, with average accuracies of 82.0% (±16.3) and 81.8% (±16.2) respectively. Most of the experiments involved fine-tuning the parameters. The reported average accuracies are computed using 4-fold cross validation.

Average accuracies for the 25 best classifiers
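
To make the evaluation protocol concrete, here is a minimal sketch of 4-fold cross validation in plain Python. It is not the paper's code: the nearest-centroid classifier, the synthetic data and all function names are our own assumptions, chosen only to keep the example self-contained. The ± printed here is the spread across folds, which may well differ from how the paper computes its ± figures.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_centroid_fit(X, y):
    """Toy classifier (an assumption for illustration): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )


def cross_validated_accuracy(X, y, k=4, seed=0):
    """Return (mean, standard deviation) of accuracy over k held-out folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        model = nearest_centroid_fit(train_X, train_y)
        correct = sum(nearest_centroid_predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return statistics.mean(scores), statistics.pstdev(scores)


# Synthetic two-class data: two well-separated Gaussian blobs.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(40)]
X += [[rng.gauss(6, 1), rng.gauss(6, 1)] for _ in range(40)]
y = [0] * 40 + [1] * 40

mean_acc, std_acc = cross_validated_accuracy(X, y, k=4)
print(f"accuracy: {100 * mean_acc:.1f}% (±{100 * std_acc:.1f})")
```

Each sample is used for testing exactly once and for training in the other three folds, so the reported accuracy never comes from data the model has already seen.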


Random Forest classifier has helped people win some Kaggle competitions. Here is what people have to say about usage of Random Forests:

“Since they have very few parameters to tune and can be used quite efficiently with default parameter settings (i.e. they are effectively non-parametric). This ease of use also makes it an ideal tool for people without a background in statistics, allowing lay people to produce fairly strong predictions free from many common mistakes, with only a small amount of research and programming.”




Final thoughts:

  • Try random forests or a Gaussian SVM as a baseline, and later move on to newer or more advanced methods.
  • Always remember to standardize your data (scaling, transforms); this has been shown to significantly affect performance.
  • Fine-tune the classifier parameters to get the best performance; you can do this by cross validation.
  • If possible, try to extract better features from the data (this may need domain knowledge), which could help the classifier achieve better performance.
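
As an illustration of the standardization point, here is a small sketch of z-score scaling in plain Python. The feature values and function names are hypothetical, and z-scoring is only one of several possible transforms. The detail worth noticing is that the mean and standard deviation are computed on the training data only and then reused for the test data.

```python
import statistics


def fit_standardizer(rows):
    """Compute the per-feature mean and standard deviation from training rows."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # Guard against constant features: fall back to 1.0 so division is safe.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply z-score scaling, (value - mean) / std, feature by feature."""
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical features on very different scales: [age in years, yearly income].
train = [[25, 30000.0], [35, 50000.0], [45, 70000.0]]
test = [[35, 90000.0]]

means, stds = fit_standardizer(train)
train_scaled = standardize(train, means, stds)
test_scaled = standardize(test, means, stds)  # reuse the training statistics
```

After scaling, every training feature has zero mean and unit standard deviation, so a distance-based method such as a Gaussian SVM is no longer dominated by the feature with the largest raw range.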


Drop a comment and let us know which classifiers you are using in your products and what your experience has been so far.

Debunking Myths About Computer Vision – Face Recognition via Placemeter

We loved this blog post from Placemeter. The link to the original post is here.


Technology is often over-estimated or under-estimated based on what people see in TV shows and movies. Thanks to Mission Impossible, Jack Bauer, and almost any modern police drama out there, computer vision has recently been over-estimated.



As a CV veteran, I always get amused, and sometimes annoyed, by the things that happen “on TV.” I also realize that the impression people get from TV and movies can affect how they perceive technologies like these. So in a new series, I’m going to give readers an idea of what the current misconceptions about computer vision are.

Myth Number One:

Instant, Universal Face Recognition

Face recognition is a very complex task. The human eye is extremely well trained at recognizing people—faces are the first things children recognize. But physically, two faces are always very, very close. In short, it is a lot harder to recognize someone than to tell a cow from a car in a picture. Today, face recognition works well in two cases:

  1. if the subject is willing to be recognized
  2. if the “dictionary” of faces is relatively limited

The first is the case when an ATM is using your face to identify you and unlock your card. The algorithm in that case can use several images of your face, use longer exposure time, and usually a large amount of pixels to recognize you.

The second is the case where Facebook or Apple is able to recognize your friends or family in your photos. In that case, the total number of people you are trying to recognize is limited, usually < 100.


In any case, you need a good amount of pixels to recognize a face, at least 60×60 in general. And you need good pictures. And you need a limited recognition test.

Face detection algorithms need at least a 60x60 pixel sized face to work.
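
In practice, the size requirement can be applied as a simple filter before any recognition step. The sketch below assumes a hypothetical detector that returns bounding boxes in pixels; `FaceBox`, `is_recognizable` and `usable_faces` are illustrative names of our own, not part of any real library.

```python
from typing import NamedTuple

MIN_FACE_SIDE = 60  # pixels, the rough lower bound quoted above


class FaceBox(NamedTuple):
    """Hypothetical detector output: top-left corner plus width and height in pixels."""
    x: int
    y: int
    width: int
    height: int


def is_recognizable(box, min_side=MIN_FACE_SIDE):
    """A face is worth passing to recognition only if both sides reach min_side."""
    return box.width >= min_side and box.height >= min_side


def usable_faces(boxes, min_side=MIN_FACE_SIDE):
    """Keep only the detections that are large enough to attempt recognition on."""
    return [box for box in boxes if is_recognizable(box, min_side)]


detections = [
    FaceBox(10, 20, 120, 130),  # close to the camera
    FaceBox(300, 40, 32, 36),   # far away, too few pixels
    FaceBox(500, 80, 60, 60),   # exactly at the threshold
]
kept = usable_faces(detections)
```

Here only the first and third detections survive; the distant face is dropped rather than being matched unreliably.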

Law enforcement does use scenarios like those, but it involves a lot of manual work: to recognize a terrorist in a stadium, a cutting-edge face recognition algorithm could probably suggest thousands of matches out of the tens of thousands of people present, and an officer would have to manually sift through those. These limits were on full display during the Boston marathon bombing, when expensive facial recognition systems failed to identify the Tsarnaev brothers.

Based on that, two things would not work: recognizing anyone no matter what on Facebook without the context of who is your friend, or instantly recognizing a terrorist in the crowd in a stadium using a drone.
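
A quick back-of-the-envelope calculation shows why the stadium scenario turns into manual work. All the numbers below are assumptions picked purely for illustration (crowd size, false match rate, review time); none of them is a measured figure from the original post or from any real system.

```python
def expected_false_matches(crowd_size, false_match_rate):
    """Expected number of innocent people flagged when every face is compared to one suspect."""
    return crowd_size * false_match_rate


crowd_size = 50_000            # assumed: "tens of thousands of people"
false_match_rate = 0.05        # assumed for illustration only
seconds_per_candidate = 30     # assumed time for an officer to check one match

candidates = expected_false_matches(crowd_size, false_match_rate)
review_hours = candidates * seconds_per_candidate / 3600

print(f"about {candidates:.0f} candidates, roughly {review_hours:.1f} hours of manual review")
```

Under these assumptions the system hands an officer about 2,500 candidates, which is on the order of twenty hours of checking for a single suspect. Even a much lower false match rate still leaves hundreds of faces to review.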