Automated decision-making is central to how many companies operate. In some cases, like product or movie recommendations, the stakes of a flawed decision are low: perhaps you sit through a comedy dull enough to make you doze off, or you end up buying a pair of jeans that’s not at all your style.
But in other cases, such as loan approval or autonomous driving, a bad decision has more significant consequences. Without transparency into how these automated decisions are made, regulators and consumers alike will find it difficult to hold these systems accountable for any adverse effects their algorithms might have.
The EU’s General Data Protection Regulation (GDPR) seeks to tackle this issue by promising an individual – or a “data subject” in regulatory parlance – “meaningful information about the logic involved, as well as the significance and envisaged consequences of such (automated decision-making)”. This provision has been broadly described as a ‘right to explainability’.
The effect of the GDPR extends far beyond the EU. India is a popular data processing destination for many European companies, and its IT services sector has had to beef up its privacy controls to protect its business interests. Several Indian software-as-a-service vendors with clients in Europe have also had to comply.
More than that, GDPR provides a welcome jolt to a conversation that is already ongoing in India regarding the privacy of a citizen’s data and what can conscionably be done with it.
Alok Prasanna, Senior Resident Fellow at Vidhi Center for Legal Policy, spoke to FactorDaily on his interpretation of the right to explainability. “In my view, there are two things: first is related to the idea of fairness, accountability and transparency. Consumers need to understand if they were treated unfairly. Being rejected for credit on the basis of caste or geography is discriminatory, whereas being rejected on the basis of credit history could be legitimate. The key is that consumers need the information to make this distinction,” Prasanna said.
“Secondly, (the right to explainability) helps regulators to understand the context of the decision. Take for example, self-driving cars. If there is a crash or other bad decision made by the system, regulators need the context to be able to understand what actually happened.”
What makes the ‘black box’ black?
With an unprecedented volume of user data available, many companies turn to machine learning methods to provide automated decision-making that is responsive and personalized. But the most sophisticated techniques of machine learning – those with the best predictive outcomes – are encased in an opaque, often inscrutable shell.
G S Madhusudan, senior adviser at IIT Madras’s Shakti project and a member of India’s AI task force, calls machine learning the “beginning of a new science”. “It is in its exploratory phase and we have some way to go before we reach the comprehension phase,” he says. “We have a collection of techniques, and we mine from this collection to find one that gives good predictive outcomes. But in most cases, we have no idea how it works. It is like alchemy.”
MIT Technology Review called this reliance on alchemy over understanding AI’s “dark secret”. Take, for example, the deep neural net – a machine learning method which produces some of the most uncannily accurate automated decisions. Labeled data is fed into a deep neural net and passed through layers and layers of computation before an output is reported. If the output doesn’t match the expected label, the neural net is instructed to work backwards and tweak its parameters. Eventually, the system will tune itself to arrive at the desired output – say, identifying a dog as a dog. It is an arcane combination of functions and weights but it is incredibly good at identifying other pictures of dogs.
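The training loop described above can be sketched in a few dozen lines of code. The snippet below is a toy illustration – an invented two-layer network learning the XOR pattern with NumPy, not the architecture of any production system – but it captures the cycle of forward pass, comparison with the expected label, and backward tweaking of weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "labeled data": four inputs, each tagged 0 or 1 (the XOR pattern).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Two layers of weights -- the "arcane combination of functions and weights".
W1 = rng.normal(size=(2, 8))
W2 = rng.normal(size=(8, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(10000):
    # Forward pass: input -> hidden layer -> output.
    hidden = sigmoid(X @ W1)
    output = sigmoid(hidden @ W2)

    # How far is the output from the expected label?
    error = output - y

    # Backward pass: work backwards through the layers and tweak the weights.
    grad_out = error * output * (1 - output)
    grad_hidden = (grad_out @ W2.T) * hidden * (1 - hidden)
    W2 -= 0.5 * (hidden.T @ grad_out)
    W1 -= 0.5 * (X.T @ grad_hidden)

# After thousands of tiny tweaks the outputs usually match the labels...
print(np.round(sigmoid(sigmoid(X @ W1) @ W2), 2))
```

Even in this tiny example, the “explanation” for a correct answer is nothing more than the final values of two weight matrices – numbers that carry no human-readable reason.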
When pressed to backtrack and explain their work, most deep learning technologies will give an answer that is entirely unsatisfactory to a human. “This leaves scientists with a very uneasy feeling,” says Madhusudan.
This is especially problematic when the outcomes of these models have high individual stakes – perhaps denying someone a job, or raising the premium on an insurance policy. “How do you fix liability?” asks Madhusudan. “Is the model incorrect, is there a bug in the implementation, or could it be that the dataset the model was trained on is itself flawed?”
Will bad apples spoil the bunch?
The world of consumer credit now has several players who use technology and automated systems to approve loans. MoneyTap is one such lender, providing an app-based line of credit. Co-founder Kunal Varma speaks of the balancing act between requesting personal information and guaranteeing utility. “At MoneyTap, we provide unsecured credit,” says Varma. “We have to make sure we have the information available to us that we need to provide a seamless experience. We know that users will ask ‘Does the information extracted about me justify the value?’”
At MoneyTap and the other alternative lenders examined by the FactorDaily team, users whose loan applications are rejected are left to guess why. When quizzed on this, Varma said that MoneyTap was simply following an industry standard.
“The reason why this is done is because of a best practice or directive coming from the regulatory bodies, since the reason for rejection can be used for gaming the system. We want to avoid people trying to reverse-engineer our system.”
These fears of reverse-engineering and ‘gaming the system’ have been echoed by the organizations at the forefront of the transparency debate. In a December 2017 hearing before the UK House of Commons’ Science and Technology Committee, Charles Butterworth, a Managing Director at the credit scoring giant Experian, said that exposing their credit-scoring algorithm would allow “gamification of the system and that would be to the detriment of the credit industry”.
At the same hearing, Martin Wattenberg, a senior staff research scientist at Google AI, suggested that what users want might not be transparency so much as ‘translucency.’ “(Full transparency) may conflict with issues around privacy and security. At the same time, you might want some view as to what is going on…I would promote the idea of algorithmic translucency. It is like frosted glass in the bathroom; it lets in light but not certain details.”
Varma of MoneyTap clarified that his organization follows a similar principle when dealing with user requests around rejected applications. “When users reach out to us, what we do is we send them the 5 or 6 reasons that might be relevant to their case.”
Filling in the blanks
To be sure, not all automated decision-making systems are as hard to interpret as deep neural nets. Less sophisticated rule-based models, like decision trees, can in fact yield legible explanations for their decisions, but whether companies will part with this information willingly is another story. Leaving aside concerns of manipulation and ‘gaming’, the heuristics baked into a company’s automated decision-making form a core part of its intellectual property.

Even more crucial are an AI’s inferences – the things you don’t know that it knows about you. Speaking to FactorDaily in May 2018, Shaadi.com CEO Gourav Rakshit spoke of how machine learning in the matchmaking algorithm “allows us to do missing field analysis, where we’re able to use the rest of what we know about a person to try and create a picture of the person.” Filling in these “missing fields” might go a long way to providing better outcomes, but it runs the risk of leaving users in the dark if things go wrong.
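The contrast with a neural net is easy to demonstrate. The sketch below – with invented feature names and a made-up handful of applicants, not any lender’s real model – fits a small decision tree with scikit-learn and prints the rules it learned, which can be read directly as an explanation:

```python
# A minimal sketch of why a rule-based model is easier to explain than a deep
# net. The features and data are invented for illustration only.
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical applicants: [months of credit history, missed payments].
X = [
    [36, 0],
    [48, 1],
    [6, 0],
    [24, 4],
    [60, 0],
    [12, 3],
]
y = [1, 1, 0, 0, 1, 0]  # 1 = approve, 0 = reject

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Unlike a neural net's weight matrices, the fitted tree can be printed as
# rules a loan officer -- or a rejected applicant -- could actually read.
print(export_text(tree, feature_names=["credit_history_months", "missed_payments"]))
```

Whether a company chooses to share such rules, of course, is exactly the commercial and regulatory question this debate turns on.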
Varma suggests companies should stay wary of collecting this kind of ‘alternate data’. “If you’re following any approach where you’re snooping data without informing the user, it is not just unethical, it is unsustainable,” says Varma.
“Companies think they can get away with it because of the notion of short-term arbitrage. They believe there is an advantage to be gained in the short-term while there is information asymmetry,” adds Varma. “Users will consent to give the data if they see value in your product. If your system needs the data, you are better off just asking for deliberate consent.”
Time to call in the cavalry
The privacy and ethical concerns around “explainability” have led many experts to warn against thinking of it as purely an engineering problem. “The road to disaster lies when you only take inputs from techies,” says IIT-Madras’s Madhusudan. He goes on to stress that to agree on “the intellectual framework needed to construct an explanation,” it is crucial to involve philosophy and ethics experts. “In India, the philosophical and ethical aspects of AI are given short shrift,” he says. “There is a societal bias against philosophy intruding into science. This is my biggest concern about the evolution of AI here.”
While companies dealing with the EU slowly get to grips with the GDPR, India’s own framework for regulating the use of personal data is undergoing a revamp, led by an expert committee chaired by B N Srikrishna, a retired Supreme Court judge. Speaking to FactorDaily in April 2018, Srikrishna had noted that “something like the GDPR in our country to work will be very hard … Our concept of privacy is very different from the European concept of privacy.”
Still, Vidhi Legal’s Prasanna believes that the way Indians think about data privacy is evolving for the better, citing the widespread media coverage and public engagement with the 2017 Aadhaar ruling in the Supreme Court, which guaranteed the right to privacy as a fundamental right. “Something has definitely changed,” says Prasanna. “One possible reason is the increase in connectivity – 3G, 4G and the like. People have more data online.”
It is uncertain if the legislation proposed by the Justice Srikrishna committee will be as stringent on the subject of automated decision-making as the GDPR is. “Europe is on Day 100 of thinking about data privacy,” says Prasanna. “We are still on Day 1.”
Disclosure: FactorDaily is owned by SourceCode Media, which counts Accel Partners, Blume Ventures and Vijay Shekhar Sharma among its investors. Accel Partners is an early investor in Flipkart. Vijay Shekhar Sharma is the founder of Paytm. None of FactorDaily’s investors have any influence on its reporting about India’s technology and startup ecosystem.