You Look Like a Thing and I Love You: How Artificial Intelligence Works and Why It's Making the World a Weirder Place
“You Look Like a Thing and I Love You pulls back the curtain to expose the Wizard of AIs.”
Janelle Shane is the author of the blog AI Weirdness, where she collects and reports on AI’s unusual behaviors. From the contents of AI Weirdness comes the book You Look Like a Thing and I Love You. It is meant for popular science audiences and, entertaining as it is, should be mandatory reading for policy makers. You Look Like a Thing and I Love You is not a textbook, nor a review of the state of the art, but an overview of the state of the practice, with all its hopes and blemishes.
AI is everywhere, determining advertisements you see, suggesting books and videos, detecting social media bots, scanning resumes, approving bank loans, recognizing voice commands, driving cars, adding bunny ears to images, naming cats, creating new recipes, designing crushable car bumpers, modeling protein-folding for medicines, and even flirting, which led to the book’s title.
We are also discovering the downsides of AI. Modern AI is not rules based but trained on existing data; once trained, an AI is expected to work in new situations, and a significant source of problems, as well as insight, comes from training and testing. The mistakes an AI makes when tasked to draw a cat or write a joke are the same sort of mistakes an AI will make processing fingerprints or sorting medical images. Shane writes that “giving it a task and waiting for it to flail . . . is a great way to warn us about AI.”
Chapter 1 begins with a description of how AI works: rules are not decided ahead of time by programmers but “figured out” by software acting on data, through repeated trial and error. Shane explains, “[AI] can discover rules and correlations that the programmer didn’t even know existed.”
Shane herself uses an AI that runs on a standard laptop and models the brain as a bunch of connected neurons, though only about as many as a worm has. Though a trained human can easily understand the logic of rules-based software, it is very difficult, if not impossible, for a human to determine which AI neuron does what.
There are simpler and more understandable alternatives to neural nets, for example, Markov chains. A Markov chain looks at the recent past to predict what comes next, and is used, for example, in auto-complete for text entry.
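As a rough illustration of the auto-complete idea, here is a minimal Python sketch of a word-level Markov chain. It is not taken from the book; the function names `build_chain` and `suggest` and the tiny training text are invented for this example, and a real auto-complete would train on far more text.

```python
from collections import Counter, defaultdict


def build_chain(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    chain = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        chain[current][following] += 1
    return chain


def suggest(chain, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = chain.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Invented training text, for illustration only.
corpus = "knock knock who is there knock knock who is it"
chain = build_chain(corpus)

print(suggest(chain, "who"))    # the only word ever seen after "who"
print(suggest(chain, "zebra"))  # a word the chain has never seen
```

The chain only knows what followed each word in its training text, which is why its suggestions mirror past input rather than meaning.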
Shane discusses the large datasets needed for training and, for AIs that operate on text, provides a look at the intermediate steps from producing noise or nonsense to producing streams of recognizable words, using the example of an AI designed to create a human-recognizable knock-knock joke. The process from nonsense to sense can be understood by the analogy of applying an infinite number of monkeys with typewriters in an attempt to duplicate, if not exactly Shakespeare, then at least a five-year-old.
The implication is that it is not much of a step from producing a knock-knock joke to naming fancy perfumes, creating edible recipes, or creating anything where rules may be complicated or unknown. One might ask, for example, what are the rules for a cat? The process for an AI to learn to recognize a cat would be to look at 10,000 images of cats, and then figure out rules that can recognize a cat in an arbitrary image, most of the time.
At the time of publication, autonomous vehicles had travelled six million actual miles, after having previously travelled five billion simulated miles. Despite this, autonomous vehicles still cause auto accidents. What can go wrong, in particular, are mistaken assumptions in the criteria used for classification. For example, if you put a sheep on a leash, a classifier trained with dogs on leashes might identify the sheep as a dog; likewise, a goat in a tree might be classified as a bird, or any person in a kitchen as female. From this, it becomes easy to understand how a human hand of a darker hue might not be recognized as a hand, or a person in a crosswalk might be treated as something that can be driven through. The possibilities for misclassification are endless, and the consequences can be deadly.
AI also cannot distinguish causation from correlation. The example provided by Shane is of an AI trained to detect cancer cells that instead detected the marked rulers used to provide size and scale in images. If there was a ruler in an image, the AI claimed it detected cancer; in fact, it had learned the ruler as a proxy for cancer, because only images with cancer contained rulers.
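The ruler problem can be sketched in a few lines of Python. This is a toy illustration, not the system Shane describes: the feature names `has_ruler` and `irregular_border`, the six training examples, and the one-feature "learner" are all invented here to show how a spurious feature can win when it happens to match the labels perfectly.

```python
def best_single_feature(examples, labels):
    """Pick the feature whose value most often equals the label."""
    def accuracy(feature):
        matches = sum(ex[feature] == y for ex, y in zip(examples, labels))
        return matches / len(labels)
    return max(examples[0].keys(), key=accuracy)


# Invented training set: every cancer image happens to include a ruler.
train = [
    {"has_ruler": True,  "irregular_border": True},
    {"has_ruler": True,  "irregular_border": False},
    {"has_ruler": True,  "irregular_border": True},
    {"has_ruler": False, "irregular_border": False},
    {"has_ruler": False, "irregular_border": True},
    {"has_ruler": False, "irregular_border": False},
]
labels = [True, True, True, False, False, False]  # True means cancer

chosen = best_single_feature(train, labels)

# A hypothetical benign image that happens to contain a ruler.
benign_with_ruler = {"has_ruler": True, "irregular_border": False}
prediction = benign_with_ruler[chosen]

print(chosen, prediction)
```

Because the ruler matches the labels in every training example, the learner prefers it over the medically relevant feature, and then flags a harmless image as cancer.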
Yet another problem is that an AI has only a short-term memory, and long-term memory is needed for making predictions. As it learns, all of an AI’s “neurons” are “up for grabs,” and past learning is lost to “catastrophic forgetting” as memory is reused. Failures from memory constraints can show up early in testing, with the AI behaving similarly to a human with attention deficit disorder, where attention is drawn away from what is important to what is new, also known as the “Look, a squirrel!” effect. Researchers are currently working on protecting long-term memory from short-term reuse, and improvements can also be made by adding more memory, allowing memory searches across longer time periods, similar to a human’s long-term vs. short-term memory.
Shane points out types of real-world problems AI will likely fail at, for example, problems that are too hard or inappropriate, or that require critical thinking the AI is not capable of, though she makes no effort to define “too hard,” “inappropriate,” or “critical thinking.” But Shane certainly hits the nail on the head when she identifies what AI is not good for, in particular, the real-world need for the AI to not make a mistake when the stakes are high. This is true for humans as well: no one, neither a human nor an AI, can see beyond the next choice if the consequence of the choice is not understood.
Chapter 3 shows how an AI learns. Neural nets can have multiple layers of neurons with “hidden” layers that add complexity. Shane explains the layers, and how training works.
Neurons are trained on data, and training data sets are more important than algorithm design. Training data is so important, and so sensitive, that data corrupted or biased by as little as 3% can significantly alter results. Shane writes, “Usually our first clue that something has gone wrong is when the AI has done something wrong.”
To improve game play, an AI can be trained by playing against a copy of itself, which generates new data for training. Though OpenAI’s self-trained “Five” can beat humans in multiplayer games, “Five” is trained by playing 180 years’ worth of games against itself every day. A related approach, generative adversarial networks, pits two AIs against each other and can be used on photographs, to create images realistic enough to fool human observers into believing the image is a “real” photograph.
Success for an AI is initially set to a low bar, that of being correct at least half the time. All-or-nothing responses may be given at the beginning, but even after more training, an AI classifier may still not recognize rare occurrences. Missing rare occurrences happens when the AI fits too closely to the examples it was trained on, and is called “overfitting.” For an AI trained on mostly white swans, all swans are white.
Problems for images include bias and separating the background from the object. For example, there may be datasets that have more images of giraffes and sheep than of trees and dirt. After training on such a dataset, the AI might report it has seen a giraffe where there is none. One might presume that, to the AI, the tree sort of, perhaps maybe, had a few giraffe-like features.
Anomalies in AIs trained on text can also come from non-random training sets. Data sets containing Bible passages, credit card numbers, and other customer data might result in an AI detecting (and leaking) Bible verses, credit card numbers, and customer data where there were none. Similarly, irrelevant or inappropriate strings can show up in Google’s auto-complete. The odd results from auto-complete aren’t driven by appropriateness or “truth” but by the past inputs used as training. One area where training works well is Google Translate, which, even though it has problems creating what humans consider well-crafted language, allows a human to suss out intent.
Textual AI is, by virtue of its training set, a slavish mirror of human behavior; for answering yes/no questions, the AI will answer based on the percentage of past answers. If more people answered yes than no to “Is there an apple in this image,” the AI, too, will answer “yes,” whether or not there is an apple in the current image.
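The yes/no behavior described above amounts to a majority-answer shortcut, which can be sketched as follows. This is a deliberately simplified illustration; the counts of past answers, the function names `train_majority` and `answer`, and the image file name are all invented, and real question-answering systems are far more elaborate.

```python
from collections import Counter


def train_majority(past_answers):
    """Learn nothing except which answer was most common in the past."""
    return Counter(past_answers).most_common(1)[0][0]


# Invented history: 7 of 10 people answered "yes" to the apple question.
past_answers = ["yes"] * 7 + ["no"] * 3
learned_answer = train_majority(past_answers)


def answer(image):
    """Ignore the image entirely and repeat the most common past answer."""
    return learned_answer


print(answer("photo_without_apple.jpg"))
```

A model like this is right 70% of the time on data that resembles its training set, which is exactly why the shortcut can survive testing.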
In hindsight it is obvious that an AI trained on human-selected data will repeat the biases of humans. “The question they’re answering is not ‘What is the best solution?’ but ‘What have humans taught us to do?’” Since AI is prone to follow the bias of its developers, problems of bias are often not caught in testing.
Shane digresses to consider under what conditions this form of bias might be against the law, for example, when an AI selects whom to hire or to offer a loan. Ironically, while I was writing this review, an article by Katyanna Quach from The Register appeared in my news feed: “MIT apologizes, permanently pulls offline huge dataset that taught AI systems to use racist, misogynistic slurs.”
There is a chapter on what, other than bias, can go wrong and prevent an AI from making appropriate decisions. Factors include improperly selected rewards in training. Whatever reward the designer chooses, appropriate or not, the AI will act like a greedy or lazy human, picking the easiest and most rewarding path, just like a human who isn’t aware of (or doesn’t care about) the consequences.
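A toy sketch of a badly chosen reward, assuming a made-up table of actions: the action names, reward values, and the `intended` flag below are hypothetical and serve only to show that a reward-maximizing agent never consults the designer's intent.

```python
def greedy_choice(outcomes):
    """Return the action with the highest reward, ignoring everything else."""
    return max(outcomes, key=lambda action: outcomes[action]["reward"])


# Hypothetical reward table: the designer wants the first action,
# but the scoring accidentally pays more for the second.
outcomes = {
    "walk_to_finish":         {"reward": 10, "intended": True},
    "exploit_scoring_glitch": {"reward": 50, "intended": False},
}

choice = greedy_choice(outcomes)
print(choice, outcomes[choice]["intended"])
```

The `intended` field plays no part in the decision, which is the point: the agent sees only the number it was told to maximize.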
Poorly optimized game play can lead to quitting before losing, or acting destructively, the equivalent of tossing the game board. Another cheat-to-win is for the AI to discover the lack of a rule that should have been in place from the start, allowing a legal but unintended move. Shane writes, “Sometimes I think the surest sign that we’re not living in a simulation is that if we were, some organism would have learned to exploit its glitches.”
There is a chapter on how to recognize those pesky AIs masquerading as humans, and a chapter on AIs partnering with humans. In partnering with humans, when an AI fails, it is important for humans to intervene, for example, by checking the spam folder to catch the important email the AI hid, or by grabbing the steering wheel when the AI driving the car gives up and quits. Cheat-to-win works both ways: having an AI check for human errors, for example, by detecting human alertness while driving, can be gamed by the human.
There is a chapter on potential AI futures. There are notes and an index.
You Look Like a Thing and I Love You pulls back the curtain to expose the Wizard of AIs. Remember the words of Dorothy Gale, “If you were really Great and Powerful, you'd keep your promises!”