Big data, big deal

The joke goes like this:

Sherlock Holmes and Dr. Watson decided to go on a camping trip. After dinner and a bottle of wine, they lay down for the night and went to sleep.

Some hours later, Holmes awoke and nudged his faithful friend.
"Watson, look up at the sky and tell me what you see."

Watson replied, "I see millions of stars."

"What does that tell you?"

Watson pondered for a minute.
"Astronomically, it tells me that there are millions of galaxies and potentially billions of planets."
"Astrologically, I observe that Saturn is in Leo."
"Horologically, I deduce that the time is approximately a quarter past three."
"Theologically, I can see that God is all powerful and that we are small and insignificant."
"Meteorologically, I suspect that we will have a beautiful day tomorrow."

"What does it tell you, Holmes?"

Holmes was silent for a minute, then spoke: "Watson, you idiot. Someone has stolen our tent!"

Last week, I found this eight-page insert in the New York Times, and I was left wondering if this IBM ad was inadvertently another form of the joke.

IBM, you see, is trying (again) to transform itself.  Once the industry leader in whatever it wanted to do, it has now spent years slowly decapitalizing as it tries to find a commercial niche.

Now, it is offering services based on Watson, noting that "Watson is designed to understand, reason and learn.  In a sense, to think."

In a sense?

Here's a quick summary from Wikipedia:

Watson is a question answering (QA) computing system that IBM built to apply advanced natural language processing, information retrieval, knowledge representation, automated reasoning, and machine learning technologies to the field of open domain question answering.

The key difference between QA technology and document search is that document search takes a keyword query and returns a list of documents, ranked in order of relevance to the query (often based on popularity and page ranking), while QA technology takes a question expressed in natural language, seeks to understand it in much greater detail, and returns a precise answer to the question.

According to IBM, "more than 100 different techniques are used to analyze natural language, identify sources, find and generate hypotheses, find and score evidence, and merge and rank hypotheses."

Is that thinking? Guest authors in Stanley Fish's column offered this view:

Far from being the paradigm of intelligence, therefore, mere matching with no sense of mattering or relevance is barely any kind of intelligence at all. As beings for whom the world already matters, our central human ability is to be able to see what matters when.

So, in short, IBM is offering an expensive tool that might help corporate executives troll through lots of data and try to divine commercially relevant strategies. It suggests that, "When your business thinks, you can outthink" the competition.

I guess the proof of the pudding is whether this approach can be applied to IBM itself.  What I see instead is a behemoth of a corporation, with tens of thousands of employees spread across the world, unfocused in purpose and execution and stagnant in the capital markets, with thirteen straight quarters of decline in revenues.  The company is an exemplar of brute force decision-making, being outflanked left and right by more nimble players in the marketplace.  The additional value offered by Watson is unlikely to be attractive to industry leaders in other fields.  The millions of dollars spent on the Times insert are, in my mind, just another example of ineffective corporate thinking.  What's the audience, and how is the ad persuasive?

Back in the 1960s, you could drop by the IBM building in New York City and pick up the iconic blue think desk sign seen above.  I still have mine.  I'm saving it for my daughters to take to Antiques Roadshow someday, where it might have some value as a piece of industrial archeology.


Sam said...

I hope somebody from IBM high enough up to make a difference reads your blog. They made a big mistake when they went the Watson route. They developed a product based on brute force. It reminds me of the story of Richard Lionheart and Saladin. At a meeting of the two before a battle, Richard took his sword and split an anvil in two to demonstrate his strength. Saladin tossed a silk scarf in the air and extended the blade of his saber. The scarf floated downward across the blade and fell to the floor in two pieces. IBM’s Watson is Richard. The future calls for Saladin. I don’t think they can turn on that particular dime.

JP said...

Although you'd have a hard time convincing me that Watson can think, one thing it does seem to do particularly well is stringent literature review and summary. So as an input into the decision-making process, one where we'd ideally like it to be data driven, it seems like Watson has a clear role in retrieving and organizing existing material, and in developing a baseline rationale better than a document search could. I doubt we can attribute IBM's business troubles to Watson, or blame it for not helping IBM out. The box can analyze data, but it doesn't (yet?) have a seat on the board.

Bruce Olsen said...

There's no doubt IBM is overselling; that's what IBM and most other tech vendors do. It's been their M.O. for at least 50 years. One has to be fairly naïve to uncritically buy any technology silver bullet (including Watson), but IBM has long since extracted their customers' naiveté along with large sums of money—often more than once. There have been few uncritical buyers for a long time. This doesn't apply solely to IBM, of course; most tech companies oversell.

But don't throw baby Watson out with the bathwater. It's a ridiculously fast, highly capable search engine that can draw very good (if limited) conclusions about a domain of knowledge and it will keep improving. That's all it does, but even that can be highly useful. Kelly and Dreyfus (the guest authors of the cited Stanley Fish column) seem to be saying "If it ain't human it's crap!" (apologies to Mike Myers) and that's just, well, crap.

Watson can have many uses without actually thinking/feeling/caring or whatever "uniquely human" characteristic one chooses to enable tautological victory over the machines.

However, medicine does pose an interesting challenge for Watson. It will surely become more knowledgeable than any human physician can be (even if it doesn't actually "know" anything), and to the extent that good diagnoses (for example) depend on factual knowledge and the ability to draw fairly simple conclusions it will become much better at presenting all the viable diagnostic options for a particular case.

Of course, people aren't merely a bundle of symptoms (though we probably all know a counterexample or two) and it will be a long time before Watson can make use of all the non-medical stimuli a human can. One of the themes in current AI thinking is that the body is important, but I'm not sure how many of us would hope for a chance encounter with Watson at, say, the grocery store simply to improve its diagnostic ability. But differences will fade as sensor technology improves and health records become more complete.

Further, we know very little about medicine or the human body, so Watson's ability to sift through orders of magnitude more data than a human actually confers less of an advantage than one might hope. I'd expect it to surface more alternatives for diagnosis, especially in less common conditions that a human physician can't keep up with. The glacially slow pace of knowledge diffusion is frustrating and there's value to offering a longer list of diagnostic possibilities that free a human to do some real thinking—and to watch out for the "screamer" mentioned in the article (where Watson identified Toronto as a US city).

So my advice is to find a wealthy, naïve friend and borrow their Watson when they aren't using it.

Paul Levy said...

Love the advice, Bruce!