“By far, the greatest danger of artificial intelligence is that people conclude too early that they understand it.”
― Eliezer Yudkowsky
I second that. Artificial intelligence (AI) is pushing the boundaries of human imagination. Machines today are capable of things we could not have imagined 20 years ago. AI has changed the way we look at learning and inventing. From drug discovery to sports analysis to protecting the oceans, AI has marked its presence everywhere. But is it outperforming humans? In a growing number of fields, the answer is an emphatic yes. In this article, I will try to sum up where and how.
AI impersonating celebrities
A Canadian AI company, LyreBird, can generate realistic-sounding voice audio after listening to just a minute of someone's speech. By analysing the voices of Trump, Hillary Clinton, Barack Obama and others, the system was able to reproduce them with great accuracy. Its Trump impression beat that of Alec Baldwin!
LyreBird takes a sample of a voice and analyses its waveforms and cues. It then picks up how that voice deviates from the platonic ideal of an English voice, and instructs its voice synthesis component to make the same adjustments to its audio waveforms as the sound curves are generated. The process delivers not only a convincing accent and general sound, but also the speaker's minor quirks and jerks.
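LyreBird has not published its pipeline, but the core idea of measuring a speaker's deviation from an average voice can be sketched in a few lines. In this toy illustration, the function names and the precomputed "average voice" feature vector are my own assumptions, not LyreBird's method:

```python
# Toy sketch: measure how a speaker's spectral features deviate from an
# "average voice", then apply that offset to generic synthesiser features.
import numpy as np
import librosa  # assumed available: pip install librosa

def speaker_offset(sample_path, average_voice_mfcc):
    """Mean MFCC deviation of the speaker from a reference average voice."""
    y, sr = librosa.load(sample_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # (13, frames)
    return mfcc.mean(axis=1) - average_voice_mfcc       # per-coefficient shift

def personalise(synth_mfcc, offset):
    """Shift generic synthesiser features toward the target speaker."""
    return synth_mfcc + offset[:, None]                 # broadcast over frames
```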
Identifying images
At the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, Google unveiled a convolutional neural network approach that brought the error rate down to just 6.6%, almost half the previous year's rate of 11.7%. In a related sketch-recognition study, a deep network correctly identified 74.9% of the sketches it analysed, while the human participants managed only 73.1%.
This started a race between research groups across the world, producing ever-deeper neural network architectures that kept resetting the state of the art. Residual networks, dense networks and, very recently, DiracNets have used deeper architectures to increase the accuracy of visual recognition by machines.
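To see why residual networks allowed such depth, here is a minimal residual block in PyTorch. It is a generic sketch of the published idea, not any particular competition entry:

```python
# A minimal residual block: the skip connection lets gradients flow past
# the convolutions, which is what made very deep networks trainable.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # skip connection

x = torch.randn(1, 64, 32, 32)
print(ResidualBlock(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```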
These convolutional neural networks have even enabled machines to write suitable captions for images. There are still situations in which machines lag behind, but the continuous advances they are making look promising.
Google’s AI for detecting cancer
Alphabet, Google's parent company, has been diversifying its research work to have a broader impact on human life. In a white paper, Detecting Cancer Metastases on Gigapixel Pathology Images, Google disclosed its research on diagnosing breast cancer with its deep learning AI.
Also read: This AI pathologist could be a life-saver for India’s ailing diagnostics sector
To test the system, Google's experts used a dataset of images courtesy of the Radboud University Medical Center. After the model was customised and trained to examine each image at different magnifications, it exceeded the performance of human doctors: its prediction heatmaps improved, and its tumour localisation score reached 89%, against 73% for humans. It also completed the diagnosis in far less time than the humans took.
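The white paper describes scanning the gigapixel slide patch by patch and assembling the patch predictions into a heatmap. Below is a highly simplified sketch of that idea; `patch_model`, the patch size and the stride are hypothetical placeholders, not Google's actual configuration:

```python
# Simplified sketch: slide a window over a huge pathology image and let a
# (hypothetical) trained patch classifier assign each region P(tumour).
import numpy as np

def heatmap(slide, patch_model, patch=299, stride=128):
    h, w = slide.shape[:2]
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    probs = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            y, x = i * stride, j * stride
            probs[i, j] = patch_model(slide[y:y + patch, x:x + patch])
    return probs  # high values mark likely metastases
```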
AI is an expert at lipreading
Lipreading has long been considered a human art. However, the latest deep learning systems have outperformed even the best human lipreaders. One such model is LipNet.
LipNet is the first end-to-end sentence-level deep lipreading model; it simultaneously learns spatiotemporal visual features and a sequence model. On the GRID corpus, LipNet achieves 95.2% sentence-level accuracy in the overlapped-speaker split task, outperforming experienced human lipreaders and the previous word-level state of the art of 86.4%.
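A rough sketch of that shape in PyTorch is below. The layer sizes are illustrative assumptions; the published LipNet uses three spatiotemporal convolution stages and its own hyperparameters:

```python
# Sketch of LipNet's shape: 3D convolutions learn spatiotemporal visual
# features from video frames, a recurrent layer models the sequence, and
# CTC aligns the per-frame outputs with text.
import torch
import torch.nn as nn

class LipReader(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        self.stcnn = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
        )
        self.gru = nn.GRU(input_size=32 * 25 * 50, hidden_size=256,
                          bidirectional=True, batch_first=True)
        self.fc = nn.Linear(512, vocab_size + 1)  # +1 for the CTC blank

    def forward(self, video):            # video: (batch, 3, time, 50, 100)
        feats = self.stcnn(video)        # (batch, 32, time, 25, 50)
        b, c, t, h, w = feats.shape
        feats = feats.permute(0, 2, 1, 3, 4).reshape(b, t, c * h * w)
        out, _ = self.gru(feats)
        return self.fc(out)              # per-frame logits for CTC loss
```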
Your very own language translator
Following the success of neural machine translation systems, researchers at Google came up with "zero-shot translation". The idea: if the machine is taught to translate between English and Korean, and between English and Japanese, can it translate Korean to Japanese directly, without resorting to English as a bridge? The answer is yes! The translations are quite reasonable even though the two languages were never explicitly linked.
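The mechanism from the paper is disarmingly simple: one shared model is trained on sentence pairs whose source side is prefixed with a token naming the desired target language. The token names below are illustrative (the paper uses tokens of the form <2es>):

```python
# The multilingual trick: a single model, with the target language chosen
# by a prefix token. Trained only on the four directions below, the same
# model can then be asked for Korean -> Japanese, a pair it never saw.
train_pairs = [
    ("<2ko> How are you?", "어떻게 지내세요?"),   # English  -> Korean
    ("<2en> 어떻게 지내세요?", "How are you?"),   # Korean   -> English
    ("<2ja> How are you?", "お元気ですか?"),      # English  -> Japanese
    ("<2en> お元気ですか?", "How are you?"),      # Japanese -> English
]

# Zero-shot request at inference time (no Korean->Japanese training data):
zero_shot_input = "<2ja> 어떻게 지내세요?"
```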
There is another side to this achievement. If the computer can establish connections between words and concepts that were never explicitly linked, has it formed a shared representation of meaning for those words? Simply put, is it possible that the computer has developed its own internal language? Based on how various sentences relate to one another in the network's representation space, Google's language experts and AI researchers believe it is.
The paper describing this work on efficient multi-language translation is available on arXiv, and it also scratches the surface of the "interlingua" question above. A lot more research is needed before any conclusion can be drawn; until then, we can live with the idea of the possibility.
Speech transcription AI better than human professionals
A paper from Microsoft claims a transcription error rate lower than that of professional humans. To test how its algorithm stacked up, Microsoft hired a third-party service to transcribe audio for which it had a confirmed 100% accurate transcript. The professionals worked in two stages: one person typed up the audio, then a second person listened to it and corrected any errors. Measured against the correct transcripts of the standardised tests, the professionals had error rates of 5.9% and 11.3%. Trained on 2,000 hours of human speech, Microsoft's system tackled the same audio and scored 5.9% and 11.1%. The difference may be minute, but it is significant.
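The figures quoted are word error rates (WER): the word-level edit distance between the system's transcript and the reference, divided by the reference length. A standard way to compute it:

```python
# Word error rate: Levenshtein distance over words, normalised by the
# length of the reference transcript.
def wer(reference, hypothesis):
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                       # deletions
    for j in range(len(h) + 1):
        d[0][j] = j                       # insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

print(wer("the cat sat on the mat", "the cat sat on a mat"))  # ~0.167
```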
Also read: Biased bots: Artificial Intelligence will mirror human prejudices
AI players are better than humans
AlphaGo is the first computer programme to defeat a professional Go player. It beat three-time European champion Fan Hui 5-0, then went on to defeat the legendary Lee Sedol, who holds 18 international titles. Although Go's rules are simple, the sheer number of possible positions makes it far more complex than chess. AlphaGo uses deep neural networks combined with tree search to master the game.
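AlphaGo's search guides its exploration with a PUCT-style rule that blends value estimates of moves with the policy network's priors. This is a simplified sketch of that selection rule, not DeepMind's implementation:

```python
# PUCT-flavoured move selection: exploit moves with high estimated value,
# but give moves the policy network likes a bonus that shrinks as they
# accumulate visits.
import math

def select_move(moves, q, visits, prior, c_puct=1.5):
    """moves: legal moves; q/visits/prior: dicts keyed by move."""
    total = sum(visits[m] for m in moves)
    def score(m):
        u = c_puct * prior[m] * math.sqrt(total + 1) / (1 + visits[m])
        return q[m] + u                   # value estimate + exploration bonus
    return max(moves, key=score)
```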
Poker is an exemplary game of imperfect information and a long-standing challenge for AI. DeepStack, an algorithm for imperfect-information settings, combines recursive reasoning to handle information asymmetry, decomposition to focus computation on the relevant decision, and a form of intuition learned automatically from self-play using deep learning. DeepStack has defeated professional poker players by a sizeable margin at heads-up, no-limit Texas hold'em.
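DeepStack builds on counterfactual regret minimisation. Its core update, regret matching, is simple enough to sketch: play each action in proportion to the accumulated positive regret for not having played it earlier:

```python
# Regret matching, the heart of counterfactual regret minimisation:
# actions with more accumulated positive regret get played more often.
import numpy as np

def regret_matching(regrets):
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total == 0:
        return np.full(len(regrets), 1.0 / len(regrets))  # uniform fallback
    return positive / total

print(regret_matching(np.array([3.0, -1.0, 1.0])))  # [0.75, 0.0, 0.25]
```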
Detecting diabetic retinopathy
A specific type of neural network, a deep convolutional neural network optimised for image classification, was trained to automatically detect diabetic retinopathy and diabetic macular edema in retinal fundus photographs. The algorithm was validated in January and February 2016 on two different datasets and showed high sensitivity and specificity for detecting referable diabetic retinopathy. However, further research is required before it can be applied in clinical settings.
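The recipe is essentially transfer learning: start from a network pretrained on everyday photographs and retrain the final layer for the retinal task. A minimal sketch with torchvision, assuming an Inception-style backbone (the study's exact training setup differs):

```python
# Transfer-learning sketch: reuse ImageNet features and train only a new
# two-class head (referable vs non-referable retinopathy).
import torch.nn as nn
from torchvision import models

model = models.inception_v3(pretrained=True)   # ImageNet weights
for p in model.parameters():
    p.requires_grad = False                    # freeze the feature extractor
model.fc = nn.Linear(model.fc.in_features, 2)  # new trainable head
```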
Real-time adaptive image compression
This is a machine learning approach to image compression that outperforms existing codecs while running in real time. The algorithm produces files 2.5 times smaller than JPEG and JPEG 2000, two times smaller than WebP, and 1.7 times smaller than BPG on datasets of generic images, across all quality levels. The design is lightweight and deployable: it can encode or decode the Kodak dataset in around 10ms per image on a GPU. The full paper is available online.
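The paper's architecture is considerably more sophisticated, but the general shape of learned image compression can be sketched as an autoencoder with a quantised bottleneck:

```python
# Generic learned-compression sketch (not the paper's architecture): an
# encoder squeezes the image into a few coefficients, a quantiser makes
# them cheap to store, and a decoder reconstructs the image.
import torch
import torch.nn as nn

class CompressionNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 8, 4, stride=2, padding=1),   # 8-channel bottleneck
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(8, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),
        )

    def forward(self, x):
        # Crude quantisation; real training uses a differentiable surrogate.
        code = torch.round(self.encoder(x))
        return self.decoder(code)

print(CompressionNet()(torch.randn(1, 3, 64, 64)).shape)  # (1, 3, 64, 64)
```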
AI a better data scientist than humans?
The job of a data scientist is to extract and interpret meaning from data using statistics and machine learning algorithms. It turns out AI is now taking up parts of that job itself: it has outsmarted human data scientists at designing network architectures. Neural Architecture Search generated a new recurrent cell, the NASCell, that outperforms the previous human-designed ones, so much so that it is already available in TensorFlow.
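Neural Architecture Search trains a controller network with reinforcement learning to propose architectures. The toy loop below conveys only the outer search idea, substituting plain random search for the controller; `evaluate` is a hypothetical scoring function:

```python
# Toy architecture search: sample candidate architectures from a search
# space, score each one, keep the best. NAS replaces the random sampler
# with a learned controller.
import random

SEARCH_SPACE = {"layers": [1, 2, 3], "units": [32, 64, 128], "act": ["relu", "tanh"]}

def sample_architecture():
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def search(evaluate, trials=20):
    """`evaluate` is a hypothetical function: architecture -> validation score."""
    best, best_score = None, float("-inf")
    for _ in range(trials):
        arch = sample_architecture()
        score = evaluate(arch)
        if score > best_score:
            best, best_score = arch, score
    return best, best_score
```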
Machines playing doctors
A team of researchers at Stanford University, led by Andrew Ng, has shown that a machine learning model can identify heart arrhythmias from an electrocardiogram (ECG) better than a human expert. The approach could change everyday medical care by catching heartbeat irregularities that could otherwise prove fatal. In Ng's words, "I've been encouraged by how quickly people are accepting the idea that deep learning can diagnose with an accuracy superior to doctors in select verticals." At this pace of advancement, AI algorithms will transform healthcare.
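The Stanford model is a deep 1-D convolutional network over the raw ECG waveform; the published network is far deeper than this. The sketch below shows only the basic shape, with made-up layer sizes:

```python
# 1-D CNN sketch for rhythm classification: convolutions over the raw ECG
# samples, then a prediction over rhythm classes.
import torch
import torch.nn as nn

class ECGNet(nn.Module):
    def __init__(self, n_rhythms):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=16, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=16, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classify = nn.Linear(64, n_rhythms)

    def forward(self, ecg):                 # ecg: (batch, 1, samples)
        f = self.features(ecg).squeeze(-1)  # (batch, 64)
        return self.classify(f)

print(ECGNet(4)(torch.randn(2, 1, 3000)).shape)  # torch.Size([2, 4])
```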
Looking inside a machine’s brain
A Bristol-based startup, Graphcore, has created a series of 'AI brain scans' using its development chip and software: Petri dish-style images that reveal what happens as machine learning processes run. In simple terms, this is seeing what machines see as they learn new skills. Machine learning systems go through two phases, construction and execution. In the construction phase, graphs describing the required computations are created; in the execution phase, the machine runs through its training process using the computations laid out in the graph. In Graphcore's images, the movement of these phases and the connections between them are assigned different colours. Graphcore has published the full technical details of how the images are produced.
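The construction/execution split is easiest to see in TensorFlow 1.x, where the two phases are explicit in the API. This minimal example (not Graphcore's software) builds a graph first, then runs it:

```python
import tensorflow as tf  # TensorFlow 1.x API assumed

# Construction phase: describe the computations; nothing runs yet.
graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, shape=[None, 4])
    w = tf.Variable(tf.random_normal([4, 1]))
    y = tf.matmul(x, w)

# Execution phase: run the described computations on actual data.
with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={x: [[1.0, 2.0, 3.0, 4.0]]}))
```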
AI in neuroanatomy
AI has surpassed humans at making detailed 3D reconstructions of brain microstructure. In a recent report, a Google team and its collaborators showed how to reconstruct neurites in 3D from microscopy images of the brain.
Rational reasoning
DeepMind's new algorithm, a relation networks (RN)-augmented network, is able to take unstructured input, like an image, and implicitly reason about the relations between the objects in it. For instance, an RN is given a set of objects from an image and trained to figure out how they relate: say, whether the sphere in the image is bigger than the cube. The relation scores for all pairs of objects are summed to produce a final answer. The ability of deep neural networks to perform complicated relational reasoning over unstructured data is documented in two papers: A simple neural network module for relational reasoning and Visual Interaction Networks.
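The RN module itself is compact: a function g scores every ordered pair of objects, the scores are summed, and a function f maps the sum to an answer. A minimal PyTorch sketch with illustrative sizes:

```python
# Relation network: RN(O) = f( sum over pairs g(o_i, o_j) ).
import torch
import torch.nn as nn

class RelationNetwork(nn.Module):
    def __init__(self, obj_dim, hidden, n_answers):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(2 * obj_dim, hidden), nn.ReLU())
        self.f = nn.Linear(hidden, n_answers)

    def forward(self, objects):              # objects: (batch, n, obj_dim)
        b, n, d = objects.shape
        oi = objects.unsqueeze(2).expand(b, n, n, d)
        oj = objects.unsqueeze(1).expand(b, n, n, d)
        pairs = torch.cat([oi, oj], dim=-1).reshape(b, n * n, 2 * d)
        relations = self.g(pairs).sum(dim=1)  # aggregate over all pairs
        return self.f(relations)

print(RelationNetwork(8, 32, 10)(torch.randn(2, 5, 8)).shape)  # (2, 10)
```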
The technical advances in AI are coming fast, and so is the range of fields it is being deployed in. Human effort is being reduced drastically as the machines evolve. AI has outsmarted humans in a significant number of fields, and it is no longer bizarre to think that in the near future, many human jobs could be taken over by machines.
Also read: The AI secret sauce: Skill sets that will make you ready for the post-AI world
Angam Parashar is cofounder and CEO at ParallelDots, an artificial intelligence startup providing cutting-edge AI solutions to enterprises. When he is not working, he is either lost in the alleys of Reddit or he is off playing badminton. This article was first published on LinkedIn. Disclosure: FactorDaily is owned by SourceCode Media, which counts Accel Partners, Blume Ventures and Vijay Shekhar Sharma among its investors. Accel Partners is an early investor in Flipkart. Vijay Shekhar Sharma is the founder of Paytm. None of FactorDaily's investors have any influence on its reporting about India's technology and startup ecosystem.