Slow reading six years later. Digital technology has evolved, and so have I. There is a trade-off.

I was recently interviewed by The Wall Street Journal about slow reading. It has been a few years since I did one of these interviews. I wrote Slow Reading in 2008, six years ago. At the time, the Kindle had just been released and there was a surge of discussion about reading practices, to which I attribute the interest in my little book of research. The request for an interview suggests an ongoing interest in slow reading. So what do I have to say about the subject now?

I used to slow-read often. I would write book reviews, thinking myself progressive in a digital sense for blogging reviews in just four paragraphs. A shift began. My ongoing use of digital technology to read, write and think forced that shift along. I tried to write about that shift in a new online book project — I, Reader — but I failed. The shift was still in progress. I hit a wall at one point. I thought for a time I had reached the end of reading. In 2013, I stopped reading and writing. A year later I started again. I now have a good perspective on the shift, but I have no immediate plans to resume writing about it.

So what did I tell the interviewer about slow reading? I confessed that I slow-read print books less often. I re-asserted that “Slow reading is a form of resistance, challenging a hectic culture that requires speed reading of volumes of information fragments.” I admitted that my resistance is waning. Digital technology has evolved to allow for reading, not just for scanning of information fragments, but also for comprehension of complex and rich material. I was surprised and pleased to discover how digital technology has re-programmed my reading and writing skills to process information more quickly and deeply. I am smarter than I used to be.

I have resumed my writing of book reviews. I restored a selection of book reviews from the past, ones relevant to my current blogging purposes. I will be writing new reviews, probably less often, and I will be writing them differently. Currently I am reading Book Was There: Reading in Electronic Times by Andrew Piper. I no longer take notes on paper as I read. I have been tweeting notes instead. I like the way it is evolving. I use a hashtag for the title and author, and sometimes a reader joins in. When I am done, I will write a very short review, two paragraphs tops, and post it here.

That’s not all I said to the interviewer. I said there has been a trade-off because of digital technology. There is always a trade-off. We just have to decide whether the gains outweigh the losses. What have we lost? I lingered on this question because the loss is less than I anticipated. We still read. We still read rich and complex material. Students still prefer print books for serious reading, but I expect they are going through the same transition as I did. What is lost, I assert, is long-form writing. Books born print can be scanned and put online, but books born digital are getting shorter all the time. It is no coincidence that my book, Slow Reading, was short. I was already a reader in transition. Digital technology prefers shortness. It is one reason that many kinds of poetry will survive and thrive on the web. Things should be as short and simple as possible (but not simpler, per the quote attributed to Einstein). Long-form novels and textbooks will be lost in time. It is a loss. Is it worth it?

The four steps Watson uses to answer a question. An example from literature.

Check out this excellent video on the four steps Watson uses to answer a question. The Jeopardy-style question (i.e., an answer) comes from the topic of literature, so it is quite relevant here: “The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author.” This video is not sales material, but a good overview of the four (not so simple) steps: 1. Question Analysis, 2. Hypothesis Generation, 3. Hypothesis & Evidence Scoring, 4. Final Merging & Ranking. “Who is d’Artagnan?” I am so pleased that IBM is sharing its knowledge in this way. I had new insight watching it.
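
Here is my own rough sketch of how those four steps might be wired together in code. The entities, candidates, and scores are placeholders of mine, not anything from IBM:

```python
# A rough sketch of the four steps as a pipeline; the heuristics are placeholders.

def analyze_question(clue: str) -> dict:
    """Step 1: Question Analysis. Extract known entities and the focus."""
    return {"entities": ["The Man in the Iron Mask"],
            "focus": "hero of a previous book by the same author"}

def generate_hypotheses(analysis: dict) -> list:
    """Step 2: Hypothesis Generation. Pull candidate answers from data sources."""
    return ["d'Artagnan", "Athos", "Edmond Dantes"]

def score_hypotheses(hypotheses: list, analysis: dict) -> list:
    """Step 3: Hypothesis & Evidence Scoring. Attach a confidence to each candidate."""
    return [(h, 1.0 / (rank + 1)) for rank, h in enumerate(hypotheses)]  # stub scores

def merge_and_rank(scored: list) -> tuple:
    """Step 4: Final Merging & Ranking. Keep the highest-confidence candidate."""
    return max(scored, key=lambda pair: pair[1])

clue = ("The first person mentioned by name in 'The Man in the Iron Mask' "
        "is this hero of a previous book by the same author.")
analysis = analyze_question(clue)
answer, confidence = merge_and_rank(score_hypotheses(generate_hypotheses(analysis), analysis))
print(f"Who is {answer}? (confidence {confidence:.2f})")  # Who is d'Artagnan? (confidence 1.00)
```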

Physika, the next phase: Text analysis of the novel. Selected book reviews are back.

NovelTM is an international collaboration of academic and non-academic partners to produce the first large-scale quantitative history of the novel. It is a natural fit with my interests in cognitive technologies, text analytics, and literature. I am getting to know the players, and hope to contribute. Given that, I have reorganized things a bit here at my blog. The next “Wilson” iteration of my basement build of a cognitive system will focus on text analysis of the novel. Note too, I have brought back a number of book reviews related to text analysis of the novel. In particular, note my review of Orality and Literacy by Ong. In that review, back in 2012, I noted, “It blows my information technology mind to think how these properties might be applied to the task of structuring data in unstructured environments, e.g., crawling the open web. I have not stopped thinking about it. It may take years to unpack.” Two years later, I am slowly unpacking that insight at this blog. 

Genre detection. Natural language processing models ought to be trained on genre-specific content.

Having completed the second iteration of Whatson, I am going to kick about for a bit, exploring special topics, before I take on another iteration. One topic of interest is genre detection. To start with the obvious, genre is a category of art, most commonly in reference to literature, characterized by similarities in form, style, technique, tone, content, and sometimes length. Genres include fiction and non-fiction, tragedy and comedy, satire and allegory, and many more refined classifications.

Why is it interesting for a cognitive system to detect genre? Clearly, my focus is on literature, and genre is a major category. However, it goes deeper. I’m reading through an article by Stamatatos:

Kessler gives an excellent summarization of the potential applications of a text genre detector. In particular, part-of-speech tagging, parsing accuracy and word-sense disambiguation could be considerably enhanced by taking genre into account since certain grammatical constructions or word senses are closely related to specific genres. Moreover, in information retrieval the search results could be sorted according to the genre as well.

Natural language processing depends on the choice of models for entity recognition. One major choice is language, e.g., English or other. Perhaps the very next choice is genre. Models really ought to be trained on genre-specific content.
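
Here is a minimal sketch of what that choice might look like in code. The registry entries, cue words, and threshold are all illustrative assumptions of mine:

```python
# A sketch of model selection keyed by language and genre.
# The model file names and cue words are hypothetical, not real artifacts.
from typing import Optional

MODEL_REGISTRY = {
    ("en", "fiction"): "en-fiction-ner.bin",
    ("en", "news"):    "en-news-ner.bin",
    ("en", None):      "en-generic-ner.bin",  # fallback when genre is unknown
}

FICTION_CUES = {"said", "whispered", "she", "he", "chapter"}

def detect_genre(text: str) -> Optional[str]:
    """A naive keyword heuristic standing in for a trained genre classifier."""
    tokens = set(text.lower().split())
    return "fiction" if len(tokens & FICTION_CUES) >= 2 else None

def select_model(text: str, language: str = "en") -> str:
    """Choose language first, then genre, then fall back to a generic model."""
    return MODEL_REGISTRY.get((language, detect_genre(text)),
                              MODEL_REGISTRY[(language, None)])

print(select_model('"Run," she whispered, and he said nothing.'))  # en-fiction-ner.bin
```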

Knowing the content’s genre can help with extended analytics. Take the analysis of people names as an example. If we know in advance that we are analyzing fiction, and know that names are selected based on character traits, then we might be able to say something interesting about a person by the “color” of their name. In non-fiction, “color” is incidental, and we might consider other classifications like genealogy to be more informative.

What is unstructured data? Anything not in a DBMS? Text? Non-repetitive data?

What is unstructured data? Anything not in a DBMS? Text? “Many English teachers would contend that the English language is in fact highly structured.” In IBM Data Mag, Inmon suggests distinguishing structured data from unstructured data based on repetition of data occurrences.

“Data that occurs frequently, repetitive data, is data in a record that appears very similar to data in every other record. … Examples of repetitive data—and there are many—include metering data; click-stream data; telephone call records data, such as time of call, the caller’s telephone number, and the call’s length; analog data; and so on.”

“The converse of repetitive data, nonrepetitive data, is data in which each occurrence is unique in terms of content—that is, each nonrepetitive record is different from the others. … There are many different forms of nonrepetitive data, and examples include emails, call center conversations, corporate contracts, warranty claims, insurance claims, and so on.”

Or is the concept of “non-repetition” just another way of saying “meaning?” When events repeat they begin to blur. We look for discontinuities, change, or lack of repetition to indicate meaning. Especially when it comes to big data, we do not know in advance what cases of non-repetition are of value. That’s what we seek to find out.
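
Inmon’s distinction can be made operational in a rough way. Here is a toy sketch that flags a batch of records as repetitive when adjacent occurrences look nearly identical; the similarity threshold and sample records are my own assumptions:

```python
# A rough test of the repetitive/nonrepetitive distinction: records whose
# occurrences look nearly identical are repetitive; the rest invite analysis.
from difflib import SequenceMatcher

def looks_repetitive(records, threshold=0.7):
    """Average pairwise similarity of adjacent records against a threshold."""
    ratios = [SequenceMatcher(None, a, b).ratio()
              for a, b in zip(records, records[1:])]
    return sum(ratios) / len(ratios) >= threshold

call_records = ["2014-06-01,555-0101,120s", "2014-06-02,555-0102,95s"]
emails = ["Hi Bob, about the Q3 contract...",
          "Lunch Friday? Trying the new place."]
print(looks_repetitive(call_records))  # True: metering-style, highly similar
print(looks_repetitive(emails))        # False: each occurrence is unique
```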

What makes a system cognitive? Conclusion to the Whatson iteration.

In March I asked the question, Can I build Watson Jr in my basement? I performed two iterations of a basement build that I dubbed “What, son?” or “Whatson” for short. In a first iteration, I recreated the Question-Answer system outlined in Taming Text by Ingersoll, Morton and Farris. In a second iteration, I did a deep dive into a first build of my own, writing code samples for essential parts and charting out architectural decisions. Of course there is plenty more to be done, but I consider the second iteration complete. I have to put the next “Wilson” iteration on hold for a bit as my brain is required elsewhere. I would like to conclude this iteration with a final post that covers what I believe to be the most important question in this emerging field … What makes a system cognitive?

Here are some key features of a cognitive system:

Big Data. Cognitive systems can process large amounts of data from multiple sources in different formats. They are not limited to a well-defined domain of enterprise data but can also access data across domains and integrate it into analytics. One might call this feature “big open data” to reflect its oceanic size and readiness for adventure. You would expect this feature from an intelligent system, just as humans process large amounts of experience outside their comfort zone.

Unstructured Data. Structured data is handled nicely by relational database management systems. A cognitive system extracts patterns from unstructured data, just as human intelligence finds meaning in unstructured experience.

Natural Language Processing (NLP). A true artificial intelligence should be able to process raw sensory experience, and smart people are working on that. An entry-level cognitive system should at least be able to perform NLP on text. Language is a model of human intelligence, and the system should be able to understand Parts of Speech and grammar. The deeper the NLP processing, the smarter the system.

Pattern-Based Entity Recognition. Traditional database systems and even the modern linked data approach rely heavily on arbitrary unique identifiers, e.g., GUID, URI. A cognitive system strives to uniquely identify entities based on meaningful patterns, e.g., language features.

Analytic. Meaning is a two-step between context and focus, sometimes called figure and ground. Interpretation and analytics are cognitive acts, using contextual information to understand the meaning of the focus of attention.

Game Knowledge. Game knowledge is high order understanding of context. A cognitive system does not simply spit out results, but understands the user and the stakes surrounding the question.

Summative. A traditional search system spills out a list of results, leaving the user to sort through them for relevance. A cognitive system reduces the results to the lowest possible number that satisfies the question, and presents them in summary format.

Adaptive. A cognitive system needs to be able to learn. This is expressed in trained models, and also in the ability to accept feedback. A cognitive system uses rules, but these rules are learned “bottom-up” from data rather than “top-down” from hard-wired rules. This approach is probabilistic and associated with a margin of error. To err is human. It allows systems to learn from new experience.
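
To make the “bottom-up” point concrete, here is a toy sketch in which the rule is a probability estimated from feedback rather than hard-wired. The question types and smoothing choice are illustrative assumptions of mine:

```python
# A toy illustration of bottom-up learning: the "rule" is a success rate
# estimated from feedback, so it carries a margin of error by design.
from collections import defaultdict

class FeedbackLearner:
    def __init__(self):
        self.correct = defaultdict(int)
        self.total = defaultdict(int)

    def record(self, question_type: str, was_correct: bool):
        """Accept feedback on one answered question."""
        self.total[question_type] += 1
        self.correct[question_type] += int(was_correct)

    def confidence(self, question_type: str) -> float:
        """Laplace-smoothed success rate learned from the data."""
        return (self.correct[question_type] + 1) / (self.total[question_type] + 2)

learner = FeedbackLearner()
for outcome in (True, False, True):
    learner.record("author-question", outcome)
print(learner.confidence("author-question"))  # 0.6
```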

I believe the second Whatson iteration demonstrates these features.

QA Architecture III: Enrichment and Answer. Playing the game with confidence.

The Question and Answer Architecture of Whatson can be divided into three major processes. Previous posts covered I – Initialization and II – Natural Language Processing and Queries. This post describes the third and final process, III – Enrichment and Answer, as shown in the chart to the right.

  1. Confidence. At this point, candidate results have been obtained from data sources and analyzed for answers. The work has involved a number of Natural Language Processing (NLP) steps that are associated with probabilities. Probabilities at different steps are combined to calculate an aggregate confidence for a result. There will be one final confidence value for each result obtained from each data source. The system must decide if it has the confidence to risk an answer. The risk depends on Game Rules. In Jeopardy, IBM’s Watson was penalized for a wrong answer. A sketch of this decision loop appears after the list.
  2. Spell Correction. If the confidence is low, the system can check the original question text for probable spelling mistakes. A corrected query can be resubmitted to Process 2 to obtain new search results, hopefully with higher confidence. Depending on the Game being played, a system might suggest spell correction before the first search is submitted, i.e., Did You Mean … ?
  3. Synonyms. If the confidence is still low, the system can expand the original question text with synonyms. E.g., ‘writer’ = ‘author’. The query is submitted, with the intent of obtaining higher confidence in the results.
  4. Clue Enrichment Automatic. The system is built to understand unstructured text and respond with answers. This build can be used to enrich a question with additional clues. Suppose a person asked for the author of a particular quote. The quote might be cited by several blog authors, but the system could deduce that the question refers to the primary or original author.
  5. Clue Enrichment Dialog. If all else fails the system will admit it does not know the answer. Depending on the Game, the system could ask the user to restate the question with more clues.
  6. Answer. Once the confidence level is high enough, the system will present the Answer. In a Game like Jeopardy only one answer is allowed. Providing only one answer is also a design goal, i.e., the system should be smart enough to know the answer, and not return pages of search results. In some cases, a smart system should return more than one answer, e.g., if there are two different but equally probable answers. The format of the answer will depend on the Game. It makes sense to utilize templates to render the answer in natural language. Slapping on text-to-speech will be easy at this point.
  7. Evidence. Traditional search engines typically highlight keywords embedded in text snippets. The user can read the full document and try to evaluate why a particular result was selected. In a cognitive system, a single answer is returned based on a confidence. It can demonstrate why the answer was selected. A user might click on an “Evidence” link to see detailed information about the decision process and supporting documents.
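
Here is a minimal sketch of the answer-or-enrich loop described above. The search function, thresholds, and string substitutions are hypothetical stand-ins, not a definitive implementation:

```python
# A sketch of the decision loop: answer only when aggregate confidence
# clears the risk threshold; otherwise correct spelling, then expand synonyms.

def combined_confidence(step_probabilities):
    """NLP steps carry probabilities; aggregate them (naively, as a product)."""
    confidence = 1.0
    for p in step_probabilities:
        confidence *= p
    return confidence

def spell_correct(question):        # placeholder for a real spell checker
    return question.replace("athor", "author")

def expand_synonyms(question):      # placeholder, e.g., 'writer' = 'author'
    return question.replace("writer", "author")

def answer(question, search, risk_threshold=0.7):
    """Try the raw question, then spell correction, then synonym expansion."""
    for variant in (question, spell_correct(question), expand_synonyms(question)):
        result, step_probs = search(variant)
        if combined_confidence(step_probs) >= risk_threshold:
            return result
    return "I do not know. Could you restate the question with more clues?"

def fake_search(question):          # hypothetical: (result, step probabilities)
    if "author" in question:
        return ("Jack London", [0.95, 0.9])
    return ("?", [0.1])

print(answer("Who is the writer of The Call of the Wild?", fake_search))  # Jack London
```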

This post concludes the description of the three processes in The Question and Answer Architecture of Whatson.

QA Architecture II: Natural Language Processing and Queries. Context-Focus pairing of the question and results.

The Question and Answer Architecture of Whatson can be divided into three major processes: I – Initialization, II – Natural Language Processing and Queries, and III – Enrichment and Answer. This post describes the second process, as shown in the chart:

[Chart: Process II – Natural Language Processing and Queries]

  1. Context for the Question. There are two pairs of green Context and Focus boxes. The first pair is about Natural Language Processing (NLP) for the Question Text. Context refers to all the meaningful clues that can be extracted from the question text. The Initialization process determined that the domain of interest is English Literature. In this step, custom NLP models will be used to recognize domain entities: book titles, authors, characters, settings, quotes, and so on.
  2. Focus for the Question. The Context provides known facts from the question and helps determine what is not known, i.e., the focus. The Focus is classified as a type, e.g., a question about an author, a question about a setting.
  3. Data Source Identification. Once the question has been analyzed into entities, the appropriate data sources can be selected for queries. The Data Source Catalog associates sources with domain entities. More information about the Catalog can be found under the discussion of the Tankless architecture.
  4. Queries. Once the data sources have been identified, queries can be constructed using the Context and Focus entities as parameters. Results are obtained from each source. A sketch of query construction and result matching appears after the list.
  5. Parts of Speech. Basic parts of speech (POS) analysis is performed on the results, just like in the Initialization process.
  6. Context for the Results. The second pair of green Context and Focus boxes is for the Results text. Domain entities are extracted from the results. Now the question and answer can be lined up to find relevant results.
  7. Focus for the Results. The final step is to resolve the focus, asked by the question and hopefully answered by the result. The basic work is matching up entities in the Question and Results. Additional cognitive analysis may be applied here.
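
Here is a minimal sketch of steps 4 through 7: building a query from Context and Focus entities, then scoring results by entity overlap. The query syntax and entity sets are illustrative assumptions of mine:

```python
# A sketch of query construction and question-result entity matching.

def build_query(context_entities: dict, focus_type: str) -> str:
    """Step 4: Context entities become parameters; the Focus is what we ask for."""
    clauses = [f'{field}:"{value}"' for field, value in context_entities.items()]
    return " AND ".join(clauses) + f" RETURN {focus_type}"

def score_result(question_entities: set, result_entities: set) -> float:
    """Steps 6-7: overlap between question and result entities as relevance."""
    if not question_entities:
        return 0.0
    return len(question_entities & result_entities) / len(question_entities)

query = build_query({"title": "The Call of the Wild"}, "author")
print(query)  # title:"The Call of the Wild" RETURN author
print(score_result({"The Call of the Wild"},
                   {"The Call of the Wild", "Jack London"}))  # 1.0
```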

The third and final post will describe how the system evaluates results before offering an answer.

QA Architecture: Initialization. The solution is in the question.

The Question and Answer Architecture of Whatson can be divided into three major processes. The first process may be called Initialization, and is shown in the chart to the left. It involves the following steps:

  1. Accept the Question. A user asks a question, “Who is the author of The Call of the Wild?” Everything flows from the user question. One might say, the solution is in the question. It is assumed that the question is in text format, e.g., from an HTML form. A fancier system might use voice recognition. The user can enter any text. It is assumed at the beginning that the literal text is entered correctly, i.e., no typos, and that there are sufficient clues in the question to find the answer. If these conditions prove wrong, a later step will be used to correct and/or enrich the original question text.
  2. Language Detection. The question text is used to detect the user’s language. The cognitive work performed by the system is derived from its knowledge of a particular language. Dictionaries, grammar, and models are all configured for individual languages. The language to be used for analysis must be selected right at the start.
  3. Parts of Speech. Once we know the language of the question, the right language dictionary and models can be applied to obtain the Parts of Speech that will be used to do Natural Language Processing (NLP).
  4. Domain Detection. A typical NLP application will use English language models to perform tasks such as Named Entity Recognition, the identification of common entities such as People, Locations, Organizations, etc. This common level of analysis is fine for many types of questions, but there are limitations. How can a Person detector know the difference between an Author and a Character? I have shown how to build a custom model for Book Title identification. My intent is to build custom models for all elements of the subject domain. The current domain of interest is English literature, but a system should use the question text to identify other domains too. A sketch of language and domain detection appears after the list.
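
Here is a minimal sketch of steps 2 and 4. The stopword lists and domain cues are toy assumptions of mine, not a production detector:

```python
# A naive language and domain detector based on word overlap.
STOPWORDS = {
    "en": {"the", "of", "who", "is", "and"},
    "fr": {"le", "la", "de", "qui", "est"},
}
DOMAIN_CUES = {"english-literature": {"author", "novel", "character", "quote"}}

def detect_language(question: str) -> str:
    """Step 2: pick the language whose stopwords best match the question."""
    tokens = set(question.lower().replace("?", "").split())
    return max(STOPWORDS, key=lambda lang: len(tokens & STOPWORDS[lang]))

def detect_domain(question: str) -> str:
    """Step 4: match domain cue words in the question text."""
    tokens = set(question.lower().replace("?", "").split())
    for domain, cues in DOMAIN_CUES.items():
        if tokens & cues:
            return domain
    return "general"

q = "Who is the author of The Call of the Wild?"
print(detect_language(q), detect_domain(q))  # en english-literature
```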

The next post will describe how to use the inputs for NLP.

Hammering out the Question and Answer Architecture. The big picture.

I settled on the Tankless option for the overall architecture — see diagram and discussion. In that architecture, the Question and Answer piece was one major component. I need to hammer out the details of that component because it has the most complexity, naturally. The following is the complete picture of the Question and Answer Architecture. On the left is the flow from the original question text, to the natural language processing and querying steps in the middle, to the clue enrichment and final answer on the right. All of these pieces need explanation. I will be presenting and discussing the pieces in three posts. Stay tuned.

[Chart: Question and Answer Architecture]