Watson. It's elementary.
Alphie the robot
When I was about eight years old, I had an Alphie. Alphie was a robot toy with a big red button on its head and a slot on its torso for different cards that you could insert to play games, listen to music, or answer questions. I enjoyed playing with it, and I think Alphie may have even become a member of the other casts of characters and action figures playing roles in various storylines, campaigns, adventures, and explorations–although Alphie was probably typecast into the role of "giant robot".
However, after a short time I was completely disaffected by Alphie's desire to take an active role in my learning process, and I decided that Alphie should be sacrificed so that I could understand how this blue plastic shell with red blinky lights and levers was trying to help me make sense of the world. I quickly found a thin Phillips screwdriver and set upon the deeply-set machine-tightened screws in the plastic case. Soon I had the cover undone, and I discovered my first circuit board. I immediately started pressing the buttons and fiddling with it to get a better handle on what was doing what. But it was difficult to see everything and so I started to pry apart different layers of the circuits. Before long, I had torn through the speaker membrane and my hands were covered in sticky-smooth Vaseline from the solder points.
And yet, I still didn't understand how Alphie was making these blinky lights and noises or why Alphie was using those techniques to help me learn. You can imagine how unsettling this must have been for me as an eight-year-old, especially when I didn't yet have the language or knowledge to interpret these frustrations in the context of learning or interaction design.
Or, you could just call it curiosity.
I think what I found was a tremendous sense of satisfaction in opening that case, looking at what lies hidden from view, and breaking it enough to find out when it starts not to work the way it was intended, but as something else altogether. I remember the experience much more vividly than others from that time period. I was admonished for "breaking" and disassembling my toys, but nonetheless a matchbox carwash soon followed when I became fascinated by the interworking of gears and sponges and the mechanization of movement.
Boxes like Alphie have many precedents in myth and cultural lore. The Greek myth of Pandora's Box comes to mind. Opening a Pandora's Box means creating evils that cannot be undone. It's one of the ways we normalize our behaviors around technology and knowledge, avoiding deep questions about technology, processes, and experiences–probably for the sake of simplicity and efficacy. If we start to look too deeply, we find inconsistencies, contradictions, alternative opportunities, weak points, and other ways to rewire how we understand the world and interact with each other. That can be very unsettling for a lot of people, but for me, Alphie was my Pandora's Box. It was the first of many black boxes that I would want to crack open, fiddle around with, and use to stimulate more complete and compelling questions for myself.
Big Blue.
On Wednesday I visited IBM Research for a symposium on their new Watson technology and its implications for higher education. Normally I'm a little reluctant to attend show-and-tell events like this one because organizations (especially big ones) often try too hard to communicate their "DNA", values, or vision, and it ends up a little too much on the narcissistic side for my taste. But I'm glad I went. It was a stimulating day filled with detailed descriptions of Watson's capabilities, examples of its limits, lucid discussions about its implications for higher education, and case studies for how IBM sees its opportunities.
Watson is a search technology. That's it. The technical description is automatic open-domain question answering or "deep QA" for short. Watson uses evidence to look for answers to questions about pretty much anything in the world that are phrased in natural language. In other words, when you ask a question, Watson scans a bunch of data, information, text, and other knowledge resources to provide the best answer. Watson's capabilities were demonstrated on the TV quiz game show Jeopardy after researchers were confident that its accuracy and range of knowledge were equivalent to or better than those of former Jeopardy champions.
IBM's overall objective with Watson is to make money for their shareholders by shifting the cognitive and physical effort required for decision making to expert-based resources and cognitive support tools. That has important implications for higher education, learning, and society, but it's important to remember that we've been doing this for thousands of years. Paper? The abacus? Cave paintings? Maps? For the Watson project, solving this challenge involves four distinct problems: 1) directing users to precise answers with 2) accurate confidences along with 3) consumable justifications and 4) fast response times.
So how does Watson do it? Watson is a series of servers linked together to process and analyze knowledge resources based on the question being asked. It does this by parsing sentences from the available data and then aggregating the results generally and statistically. It takes text strings and then looks at their syntactic frames (grammar) and then the semantic frames (meaning) of word relationships and co-occurrences. Among many analytical tasks, Watson describes probabilities for the presence of different word relationships, performs text synthesis and decomposition as pattern recognition, and identifies missing links and common bonds among words. All of these help orient the meaning and relevance of candidates as an answer to the question. You could ask, for example, what a telephone, shirt, and TV remote control have in common, and Watson would give you an answer: buttons.
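To make the "common bond" idea a little more concrete, here is a minimal sketch in Python. It is not how Watson works internally. The six-sentence corpus, the stopword list, and the function name `common_bonds` are all made up for illustration. For each candidate word, it computes the fraction of sentences mentioning each item in which the candidate also appears, and then ranks candidates by the weakest of those fractions.

```python
from collections import defaultdict

# Hypothetical toy corpus; a real system would read far more text.
CORPUS = [
    "the telephone has buttons for dialing",
    "an old telephone has a cord",
    "the shirt has buttons down the front",
    "a shirt has sleeves and a collar",
    "the remote has buttons for volume",
    "the remote needs batteries",
]

# Hypothetical stopword list, just enough for the toy corpus.
STOPWORDS = {"the", "a", "an", "has", "for", "and", "down", "needs", "old"}


def common_bonds(items, corpus=CORPUS):
    """Rank words by how consistently they co-occur with every item."""
    seen = defaultdict(int)  # sentences mentioning each item
    together = defaultdict(lambda: defaultdict(int))  # item -> word -> count
    for sentence in corpus:
        words = set(sentence.split())
        for item in items:
            if item in words:
                seen[item] += 1
                for word in words - STOPWORDS - set(items):
                    together[item][word] += 1

    candidates = set()
    for item in items:
        candidates |= set(together[item])

    scores = {}
    for word in candidates:
        # Co-occurrence fraction per item; the weakest link sets the score.
        fractions = [
            together[item][word] / seen[item] if seen[item] else 0.0
            for item in items
        ]
        scores[word] = min(fractions)
    # Highest score first, ties broken alphabetically.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


ranked = common_bonds(["telephone", "shirt", "remote"])
print(ranked[0])  # ('buttons', 0.5)
```

In this toy corpus "buttons" shows up in half of the sentences about each item, so it is the only word with a nonzero score. The 0.5 is the interesting part: it says the bond holds only some of the time, based on the available data.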
Border crossings.
Notice that I wrote "an answer". Watson is choosing relevant responses from a ranked list of associations for this question. Do iPhones have buttons? What about kurtas (an Indian pullover shirt)? What starts to become very noticeable is the extent to which Watson relies on categories. Maybe that's okay, or maybe it's not. One of the advantages of Watson is that you get a probability that shirts and buttons co-occur–based on the available data. So now instead of getting a single answer, you get information about its relevance. Most of the time, buttons are a likely response, but what about when they aren't? Sometimes we are interested in the rare occurrences. I think the effect of the way that Watson processes text ultimately reinforces existing categories in the responses provided, and that can be dangerous for how we interpret knowledge. If I'm interested in reframing what shirts, phones, and remote controls are, Watson presents an obstacle to that process of innovation.
As with many artificial intelligence and computer science problems, Watson separates aspects of the search and recognition problem, applies analytical methods and processes to solve each, and uses machine learning to weigh analytical decisions against each other. This results in arrays of evidence profiles and plots that weigh classes of evidence based on time, space, source, type, and so on. Once weighed, Watson provides an answer based on the ranked scores.
Watson's task is finding factual answers to questions, which is why one important application could be to help identify where facts are more uncertain, disputed, or not supported as facts. That is, after all, where the edges of knowledge and learning exist. An example from the morning's presentation demonstrated this well. Watson was asked a question about the location of a river in South America. Using evidence in categories like location, passage support, popularity, source reliability, and classification, Watson determined that the river was in Peru. However, Bolivia was a very close second answer. It turned out that Peru and Bolivia had/have a border dispute, and the results placing the river in Bolivia were somewhat driven by evidence in the popularity category. Imagine asking Watson where Palestine is and where its borders are, and I think you start to sense the implications of data-driven interpretation as a contested space–for political boundaries, causation in natural processes, or the beginning of life. The list is immense, but being able to see closely ranked results and why they are so close would be a boost to those of us trying to sort through areas of knowledge where we agree and disagree.
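Here is a rough sketch, in Python, of what "seeing closely ranked results and why they are so close" could look like. The evidence categories are the ones named in the presentation, but every weight and score below is invented for illustration, as are the third candidate (Chile), the function name `rank_answers`, and the `contested_margin` threshold. None of it is Watson's actual data or method. The sketch combines per-category evidence into a weighted score, normalizes the scores into confidences, flags the question as contested when the top two are close, and reports which category pulls hardest toward the runner-up.

```python
# Hypothetical weights per evidence category (assumed, not IBM's).
WEIGHTS = {
    "location": 0.30,
    "passage_support": 0.25,
    "popularity": 0.10,
    "source_reliability": 0.25,
    "classification": 0.10,
}

# Hypothetical evidence scores in [0, 1] for each candidate answer.
EVIDENCE = {
    "Peru": {"location": 0.8, "passage_support": 0.7, "popularity": 0.4,
             "source_reliability": 0.8, "classification": 0.9},
    "Bolivia": {"location": 0.7, "passage_support": 0.6, "popularity": 0.9,
                "source_reliability": 0.6, "classification": 0.9},
    "Chile": {"location": 0.2, "passage_support": 0.1, "popularity": 0.3,
              "source_reliability": 0.3, "classification": 0.9},
}


def rank_answers(evidence, weights, contested_margin=0.05):
    """Rank candidates by weighted evidence and flag close calls."""
    totals = {
        answer: sum(weights[cat] * score for cat, score in scores.items())
        for answer, scores in evidence.items()
    }
    norm = sum(totals.values())
    confidence = {answer: total / norm for answer, total in totals.items()}
    ranked = sorted(confidence, key=confidence.get, reverse=True)
    top, second = ranked[0], ranked[1]
    margin = confidence[top] - confidence[second]
    # The category whose weighted evidence most favors the runner-up.
    driver = max(
        weights,
        key=lambda cat: weights[cat] * (evidence[second][cat] - evidence[top][cat]),
    )
    return {
        "answer": top,
        "runner_up": second,
        "margin": margin,
        "contested": margin < contested_margin,
        "runner_up_driver": driver,
    }


result = rank_answers(EVIDENCE, WEIGHTS)
print(result)
```

With these made-up numbers Peru wins, Bolivia is close enough to be flagged as contested, and popularity is the category that favors Bolivia. That is the shape of the river example, not a reproduction of it.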
What this means is that Watson's answers might be more meaningful not for their certainty of correct answers, but the analytics around the uncertainty of correct answers. Regardless, the knowledge controversies that Watson starts to make visible are a good way to think about the limits and opportunities of the information as it's provided and how it gets used.
Rules are made to be broken, too.
So it turns out that the engineers designing Watson think that stringent rules for DeepQA are a bad idea. It seems that anytime you define a hard and fast rule around how to interpret the correct answer, it will usually fail, perhaps because language is so nuanced, or because translations and cross-cultural exchanges remix meaning. From Jeopardy alone there are over 2500 different types of answers and questions, and that means there is a really long tail of less frequent but important answer types. So you will likely never create a set of rules that satisfies them all. This was explained with an example question that asked in which month Muslims fast. If there is a rule that months are January, March, December and so on, Watson would have a difficult time determining that Ramadan is that holy month.
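The Ramadan example can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the candidate answers, their passage-support scores, the 0.8/0.2 weighting, and the function names `hard_rule_answer` and `soft_type_answer` are all hypothetical. The first function applies a hard rule (the answer must be a month from a fixed list), while the second treats "looks like a month from the list" as one piece of evidence among others.

```python
GREGORIAN_MONTHS = {
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
}

# Hypothetical candidates with invented passage-support scores in [0, 1]
# for the question "In which month do Muslims fast?"
CANDIDATES = {"Ramadan": 0.9, "December": 0.2, "Friday": 0.1}


def hard_rule_answer(candidates):
    """Discard anything that is not on the list of months, then pick the best."""
    allowed = {name: s for name, s in candidates.items() if name in GREGORIAN_MONTHS}
    return max(allowed, key=allowed.get) if allowed else None


def soft_type_answer(candidates, support_weight=0.8, type_weight=0.2):
    """Treat the type match as one weighted feature instead of a filter."""
    def score(name):
        type_match = 1.0 if name in GREGORIAN_MONTHS else 0.0
        return support_weight * candidates[name] + type_weight * type_match
    return max(candidates, key=score)


print(hard_rule_answer(CANDIDATES))  # December
print(soft_type_answer(CANDIDATES))  # Ramadan
```

The hard rule throws Ramadan away before the supporting evidence is ever considered. The soft version still gives December a small boost for matching the expected type, but strong passage support can outweigh it.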
But the critical issues for me come down to how we understand kinds, types, and categories, and this is why I think controversies about facts and interpretations make sense for understanding the grey areas of human-computer interactions and knowledge networking. We interpret things differently across cultures and social groups. We place emphasis on different forms of subject-verb-object relationships, and these lead to different experiences and attributions of causal relationships. For example, the outcomes for many relationships are framed in terms of direct causation, when they ought to be interpreted as an outcome of emergent causation. "The brain controls emotion" is a good example of a statement that attributes a direct causal relationship when the relationship between cause and effect is a more emergent interaction of protein expression, environmental stimuli, and neurological excitation. This presents a cognitive obstacle for learning and conceptual change when individuals aren't able to interpret behaviors across scales–from individual to group–as factors in the behavior of systems.
Another obstacle for conceptual change and learning is mistaking a process for a thing. Heat, electricity, and mutation are good examples of terms we commonly mistake as having essential properties, rather than correctly interpreting them as processes that result in expressions of warmth, excitation, or phenotypic variability.
This is important because Watson is tasked with identifying definable objects from text and feeding those objects back based on meaningful categories. Watson certainly isn't drawing new categories, conflating existing ones, or identifying new unmarked classes–as one might do using metaphor and analogy. Redefining meaning is something that we do on a continuous basis. It's something that's culturally and socially variable, and it has everything to do with how we grow, learn, and develop with others around us over time.
Ashwin Ram and my colleague, Mike Leibhold, both reminded me that learning is fundamentally built around a dialectic of questioning and answering–call and response. It's a process that iteratively increases the relevance of ideas over time through the answers and the questions that arise and how they allow people to explore the space around a concept, thing, or process, examine its interrelationships, and seek out additional ones. That relevance happens because processing a certain input at a certain time yields cognitive effects. Cognitive effects happen when previous beliefs are revised and/or new conclusions are reached by comparing new information with one's prior information.
So remember when I began by sharing IBM's goal that it shift the cognitive and physical effort from decision makers to experts? Now that goal seems to be less about learning and higher education than it is about shifting our attention away from learning about knowledge that already exists and towards education-free and error-reduced applications of that knowledge. That means that Watson may be more disruptive than facilitative in the context of higher education, and I guess this is why the usual fears of Watson's implications dwell on how technology displaces jobs and tasks that people already do. But then again, I suppose that same disruptive potential is probably why the VP framed the day's presentations around IBM's 'Smart World' as a place where "there will be clear winners and losers". Hmmm.
That technologies like Watson will replace jobs is a red herring. While it's true, there are plenty of examples from history that show how we shift to different tasks as a result–as in our transition from hunting and gathering to agriculture. There's more to this, and I'm not going to get into it here. I think the main thing is to ask what happens when cognitive effort in one or more domains is shifted to others. What does it mean for task processing and executive functioning in the brain? What does it mean for emotion and attention to salient information? How will that affect our behaviors and relationships? And how will the experience of using a technology infrastructure like Watson further reorient our brain and patterns of attention?
UI, meaning…You and I…not just You
Speaking of patterns and attention, I didn't hear much discussion at all about personal and group experiences for a technology like Watson. I assume it's because IBM believes they will only run the back end of these kinds of services, but it's surprising because what's important for services is how they link experience across back and front end processes. For me it goes without saying that the design of a technology experience–including how it's found, appropriated, normalized, shared, and fits in an environment–is critical to its use and successful intertwinement in the fabric of everyday life. I can't say that IBM has a great vision for how we'll interact with Watson. Jeopardy and other quiz performances are how they've demonstrated it so far, and that's in keeping with the grand public witnessing traditions of Western science. Keep in mind that IBM is the same company that advertises a 'Smarter Planet' using some pretty sophisticated iconography. It's also the company that helped Paul Rand make a valiant case for the role of graphic design as a critical capability of modern organizations. Plus, IBM has/had Many Eyes–a strangely cubist gem of information graphics. But it seems to have fallen by the wayside with no support provided or continued development (I'd love to be able to retrieve a print-quality pdf, for example).
It's meaningful that aesthetics and usability aren't emphasized in the Watson conversation because it seems to suggest that 1) IBM is thinking of Watson as a black box–something that hides its internal processes from its users, and 2) sharing and making reconfigurable the underlying analytical processes as a tangible interface for knowledge development and learning is not part of the intended product. If you watch the Jeopardy video I linked above, rather than show the analytical process, they just wave some layered equations and poof! A list pops up! It would be amazing if some sort of API or open framework were developed to make use of the internal processes for finding, sorting, and making sense of the data that feeds Watson's results. That could lead to super fun experiments where you could segment the available data and start to play with how the documents produced in different communities lead to different answers.
Differential Diagnostics
You know that TV show House–where Hugh Laurie and his team sit around a table and try to brainstorm a diagnosis for a mysterious set of symptoms? Later in the show, someone (usually Dr. House) has a miraculous insight during some unrelated event or melodramatic moment that allows them to come to the rescue by connecting the dots and making the correct diagnosis and treatment. Well, Watson aims to automate that process.
(Digression: In future seasons of the show, we'll see the team as a collection of boxes on a shelf with their lights blinking steadily like a wi-fi router–the occasional staccato to incorporate new data. Meanwhile, the main narrative arc of the show will shift to the lab where the team will be anxiously awaiting the appearance of a precipitate in solution or the faint appearance of an unknown protein band on a sequencing gel, while being caught up in the details of sampling, replicability, interpretation of results, and the statistical tests. In short, medical drama suspense will start to depend on the details of the process of science and its interpretation–not its vocabulary.)
Differential diagnosis involves the process of doing patient interviews, gathering medical histories and symptoms, and developing a ranked list of potential diagnoses along with tests and possible treatments. Watson, as a kind of DeepQA technology, can successfully aggregate knowledge about disease symptoms, causes, and treatments. Those patient interviews and histories that you fill out in the doctor's office or ER are about to get a lot more interactive.
A physician and neuroscientist friend of mine is looking forward to the time when patient interviews can be more automated, where she gets additional support for finding out symptom-disease relationships, and in helping her not miss anything important. Currently, she still has to transcribe, annotate, and document patient interviews, and that's just getting plain tedious as well as taking time from other forms of pattern recognition. Because diagnostic errors are 2-3x greater than other errors in medicine, differential diagnosis is seen as a particularly useful place for Watson-like technology to provide reminders of possible diagnoses and serve up best practices for treating and testing.
But don't expect nurses and doctors to vacate those jobs. We still need people who manage the interview process–like data nurses to rate the symptoms and choose additional tests to deduce the probabilities of different diagnoses. There's also the back and forth between the patient and doctor for better understanding of lifestyle and other environmental influences on health. And of course, people are still going to have to make sense of the results. Data nurses will soon be supporting the analytic capabilities of DeepQA robots, curating their resource databases, fine-tuning the diagnostic strategy for diseases like schizophrenia, managing the evidence classes, and applying further tests. And maybe without so much additional research, work, and error, health systems can shift their attention to improved information and operations processes for their medical and health care services to improve patient and provider experiences all around. With a massive database and supercomputer in our hands, we won't need to involve a doctor because we'll be able to self-interpret and apply our own treatments. Now about those pharmaceuticals.
- Gabriel Harp's blog
