With an AI leg-up, machine eyes count vehicles one at a time — missing none

By Jayadevan PK, January 4, 2018

Pune architect Nikhil Mijar was in a bind in May 2016. The municipal corporation had subcontracted a survey of two roads ahead of plans to build a bus rapid transit system. Such a traffic survey is routine before any major road work is taken up. The catch: Mijar was short on time. He had about two months to deliver reports.

“We were in a hurry to submit the report,” recalls Mijar. In the normal course of things, he’d have three options. One, station a team, typically of students, by the roadside to count passing vehicles. Two, shoot videos of traffic on these roads and have four or five people count vehicles on screen. Three, lay pressure sensors on the road.

All three methods would take time and offer little control over accuracy. Roadside counting runs into human limitations, from poor classification of vehicles to flagging attention spans. Counting and classifying vehicles on a video screen can be just as tedious and error-prone. Pressure tubes tend to count vehicles more than once, and the sheer diversity of vehicles on Indian roads makes them harder to use.

For planners, getting accurate numbers is crucial. “Everything in planning is based on numbers. Plus or minus five-ten is okay. But if it multiplies into hundreds and thousands, then the numbers don’t give a realistic story,” says Mijar, a graduate of CEPT University in Ahmedabad. The data from such surveys help town planners project growth and build capacity: roads and junctions are designed around these numbers, and getting them wrong would mean bad roads.
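
To illustrate how a small counting error can balloon, here is a minimal sketch. It assumes, hypothetically, that a short sample count is scaled up to a longer period by a simple expansion factor. The function name and the figures are illustrative and are not taken from the Pune survey.

```python
def extrapolated_error(sample_error, sample_minutes, target_minutes):
    """Scale a counting error from a short sample to a longer period.

    Assumes (hypothetically) that the sample count is multiplied by a
    simple expansion factor, so the error grows by the same factor.
    """
    expansion_factor = target_minutes / sample_minutes
    return sample_error * expansion_factor


# Illustrative figures only: a miscount of 10 vehicles in one hour,
# scaled to a 16-hour day and then to 30 such days.
daily_error = extrapolated_error(10, 60, 16 * 60)
monthly_error = daily_error * 30
print(daily_error, monthly_error)
```

Under these assumed figures, an error that looks harmless in a one-hour sample runs into the hundreds per day and the thousands per month, which is the kind of drift Mijar describes.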

A segment from the Stomatobot image recognition feed at work in a Pune traffic survey

This is where a tiny computer vision company from Pune stepped into the picture. Stomatobot Technologies, a bootstrapped company building computer vision software, did the survey in record time using a fairly simple application of artificial intelligence. For the uninitiated, computer vision technologies help computers understand real-world images and suggest or make decisions.

With video footage of the two roads and an algorithm put together by its founder Anand Muglikar, the company classified vehicles as heavy, medium or light. Based on the reports, the municipal corporation has taken up construction on one of the roads.
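
The counting step can be pictured with a short sketch. This is not Stomatobot's algorithm, which the company has not published. It assumes an upstream detector and tracker already emit a track ID and a vehicle type for every frame in which a vehicle is seen. The type-to-class mapping, the function name and the sample data are all hypothetical.

```python
from collections import Counter

# Hypothetical mapping from detected vehicle type to survey size class.
SIZE_CLASS = {
    "motorbike": "light",
    "autorickshaw": "light",
    "car": "light",
    "van": "medium",
    "tractor": "medium",
    "bus": "heavy",
    "truck": "heavy",
}


def tally_by_class(detections):
    """Count each tracked vehicle once, grouped by size class.

    `detections` is an iterable of (track_id, vehicle_type) pairs, one per
    frame in which a vehicle was seen. Only the first label seen for each
    track ID is kept, so a vehicle visible in many frames is counted once.
    """
    first_label = {}
    for track_id, vehicle_type in detections:
        first_label.setdefault(track_id, vehicle_type)
    return Counter(
        SIZE_CLASS.get(vehicle_type, "unclassified")
        for vehicle_type in first_label.values()
    )


# Made-up per-frame detections: vehicles 1 and 3 appear in two frames each.
sample = [(1, "car"), (1, "car"), (2, "bus"),
          (3, "motorbike"), (3, "motorbike"), (4, "van")]
counts = tally_by_class(sample)
print(dict(counts))
```

Keying the tally on track IDs rather than on raw detections is one way to avoid the double counting that dogs pressure tubes. How well it works depends on how reliably the tracker follows each vehicle.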

To be sure, this is not leading-edge AI at play. But by most expert accounts, the solution could save town planners time and money and help them take quick calls on expansion projects.

“I’ve seen this in other countries but counting vehicles in India is a very tedious job unless we have better automation. Over a period of time, this will reduce cost and energy. But bringing it from US to India will need a lot of modifications,” says traffic expert M N Sreehari.

He points out a few hurdles facing such systems: lane discipline is rare in India, so vehicles may get counted wrongly. Also, unlike in the US, where the traffic is mostly cars and trucks, Indian roads are busy with more than a dozen types of vehicles: bicycles, scooters, motorbikes, autorickshaws, small transport vehicles, cars, tractors, vans and buses, among others.

Despite these drawbacks, the machines may still win.

“It’s better than manual counting any day. In manual counting, the fear is that lot of times they count for an hour and throw some number at us,” says Pawan Mulukutla, Head – Integrated Transport at WRI India, a think tank specialising in sustainable urban issues.

With nearly 100 GB of video footage, Stomatobot plans to refine its algorithms to better suit Indian conditions. “We can already distinguish between an SUV and a car and other types of vehicles,” says Muglikar. There are several players in this market already – ranging from companies such as Cisco and Toshiba that offer complex analytics solutions along with sensors and cameras to players such as Mindtree and Intellivision.

Muglikar says that unlike established players in the market, who bundle hardware and software together, his platform is hardware-agnostic. “They are trying to sell more of their stuff by saying specific hardware is required for specific features. We believe all the intelligence lies in the software… Ours is a pure software play like Android or Windows,” he says. He also claims that at Rs 2,500 to Rs 3,500 per month per camera, his solution is one of the cheapest in the market. Swedish engineering company Sandvik’s India arm is his sole client right now.

He has plans to sell his solution to traffic departments, road transport offices, law enforcement agencies, and toll booths. “We dream of a pan India security shield of public as well as private CCTVs.” Muglikar previously worked at Teleskin, a startup that uses machine vision to detect skin cancer, before founding Stomatobot in 2015.

Video analytics isn’t very new to traffic management, especially in enforcing traffic laws. It is, however, new to traffic surveys and urban planning.

Stomatobot’s main product is an automated security system that seeks to replace guards at apartment complexes and other places such as ATM kiosks. The product, called WatchMan, automates CCTV surveillance and alerts its users if it notices intrusions.

While we’ve seen applications of computer vision in sectors such as healthcare and retail, urban planning is a relatively new area. According to startup tracker Tracxn, over $5.9 billion was invested in AI companies in 2016-17 alone.

“Collecting data is always beneficial. For planners in India, it would be more helpful if there’s a whole recce of the entire stretch of the road. Today we can’t see traffic in isolation,” says Mulukutla.

Stretch the idea of using technologies like AI and computer vision a bit further, to add maps, geospatial data, dash cams and CCTV footage, and you’ll see cities that are run a lot more smartly.

Take, for instance, what scientists Nikhil Naik and Scott Duke Kominers achieved by studying millions of Google Street View images. The duo at the MIT Media Lab took an earlier study that identified signs of gentrification in Chicago and applied an AI layer to teach computers how to automatically score streetscapes.

“By having a computer do it, we were able to really scale up the analysis, so we examined images of about 1.6 million street blocks from five cities — Boston, New York, Washington, DC, Baltimore and Detroit,” Naik told Science Daily.


Disclosure: FactorDaily is owned by SourceCode Media, which counts Accel Partners, Blume Ventures and Vijay Shekhar Sharma among its investors. Accel Partners is an early investor in Flipkart. Vijay Shekhar Sharma is the founder of Paytm. None of FactorDaily’s investors have any influence on its reporting about India’s technology and startup ecosystem.