Categories? Tags? Pffft. Words are the true pit of chaos. Or not.

Categories? Very eighteenth century. Tags? So Web 2.0. Pretty cryptic stuff. What will Lila do differently? Let’s take another step.

Tags are messier than categories; I called tags evil. But tags are easier to manage than the next level down, the words themselves. Tags are messy when left to humans, but tags can be managed with automation. Many services auto-suggest tags, controlling the vocabulary. Lila will generate its own tags, refreshing them on demand. Tags can be managed.

Words are the true pit of chaos. People conform to the rules of language when they write, or they don’t. People make up words on the fly. Down the rabbit hole. But is it so bad? It happens time and again that we think an information problem is too complex to be automated, only to analyze it and discover that we can do a good chunk of what we hoped following a relatively simple set of rules. One mature technology is keyword search. Keyword search is so effective we take it for granted. Words can be managed with the right technologies.

Another mature technology is Natural Language Processing (NLP). Its history dates back to the 1950s. The field is enjoying a resurgence of interest in the context of cognitive computing. Consider that a person can learn basic capability in a second language with only a couple thousand words and some syntax for combining them. Words and syntax. Data and rules. Build dictionaries with words and their variant forms. Assign parts-of-speech. Use pattern recognition to pick out words occurring together. Run through many examples to develop context sensitivity. Shakespeare it is not, but human meaning can be extracted from unstructured text in this way for many useful purposes.

Lila’s purpose is to make connections between passages of text (“slips”) and to suggest hierarchical views, e.g., a table of contents. I’ve talked a lot about how Lila can compute connections. Keywords and NLP can be used effectively to find common subjects across passages. Hierarchy is something different. How can the words in a passage say something about how it should be ordered relative to other external passages? We can go no deeper than the words. It’s all we have to work with. To compute hierarchy, Lila needs something different, something special. Stay tuned.

Tags are the evil sisters of Categories. Surprising views, sour fast. Lila offers a different approach.

I’m a classification nut, as I told you. In the last post I told you about the way I organize files and emails into folders. Scintillating stuff, I know. But let’s go a level deeper toward Lila by talking about tagging. Tags are the evil sisters of categories. Categories are top-down classification: someone on high has an idealized model of how everything fits into nice neat buckets. Tags are situational and bottom-up. In the heat of the moment, you decide that this file or that email is about some subject. Tags don’t conform to a model, you make them up on the fly. You add many tags, as many as you like. Mayhem! I’ve tried ‘em, I don’t like ‘em.

Tags do one thing very well, they let you create surprising views on your content. Categories suffer from the fact that they only provide one view, a hierarchical structured tree. Tags let you see the same content in many different ways. Oh! Look. There’s that short story I wrote tagged with “epic.” And there’s those awesome vacation pics tagged with the same. Hey, I could put those photos on that story and make it so much better. But the juice you get out of tags sours fast. The fact that they are situational and bottom-up causes their meaning to change. “Bad” and “sick” used to mean negative things. As soon as people get about a hundred tags they start refactoring them, merging and splitting them, using punctuation like underscores to give certain tags special meanings. Pretty soon they dump the whole lot of them and start over. Tags fail. What people really want is, yup, categories.

Lila is a new way to get the juice out of tags without going sour. Lila works collaboratively with the author to organize writing. Lila will let writers assign categories and tags, but treat them as mere suggestions. The human is smart, Lila knows, and needs his or her help, so it will use the author’s suggestions to come up with its own set of categories and tags. Lila’s technique will be based on natural language processing. Best part, the tags can also be regenerated at the click of a button, so that the tags never sour. You get the surprising views and the tags maintain their freshness. Sweet.
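As a rough sketch of what “regenerating” tags could look like, the Python below derives tags from a slip’s current text every time it is called, and gives a small boost to words the author suggested. The stopword list, the boost value, and the name `generate_tags` are all assumptions made for illustration; Lila’s actual technique is not specified here beyond being based on natural language processing.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a real NLP resource).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "is", "it",
             "that", "this", "with", "for", "i", "my", "was", "are"}

def generate_tags(text, suggested=(), top_n=3, boost=2):
    """Derive tags from the text itself; author suggestions only add weight."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    for tag in suggested:
        tag = tag.lower()
        if tag in counts:  # a suggestion counts only if the text supports it
            counts[tag] += boost
    # Sort by descending weight, then alphabetically so results are stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]

slip = "The epic story follows a sailor. The sailor keeps a journal of the voyage."
print(generate_tags(slip, suggested=["epic"]))
```

Because the tags are computed from the text rather than stored, calling the function again after the slip is edited yields a fresh set, which is the “click of a button” idea in miniature.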

I’ve been pretty down on tags in this post, so I will say there is one more thing that tags do quite well. They connect people, like hashtags on Twitter. They form loose groupings of content so that disparate folks can find each other. It doesn’t apply so much to a solitary writing process, but it might fit a social writing process. I will think on that.

I’m a bit of a classification nut. It comes from my Dutch heritage. How do you organize files and emails into folders?

Dutch Efficiency

I’m a bit of a classification nut. It comes from my Dutch heritage — those Dutchies are always trying to be efficient with their tiny bits of land. It’s why I’m drawn to library science too. I think a lot about the way I organize computer files and emails into folders. It provides insight into the way all classification works, and of course ties into my Lila project. I’d really like to hear about your own practices. Here’s mine:

  1. Start with a root folder. When an activity starts, I put a bunch of files into a root folder (e.g., a Windows directory or a Gmail label).
  2. Sort files by subject or date. As the files start to pile up in a folder, I find stuff by sorting files by subject or date using application sorting functions (e.g., Windows Explorer).
  3. Group files into folders by subject. When there are a lot of files in a folder, I group files into different folders. The subject classification is low level, e.g., Activity 1, Activity 2. Activities that have expired are usually grouped together into an ‘archive’ folder.
  4. Develop a model. Over time the folder and file structure can get complex, making it hard to find stuff. I often resort to search tools. What helps is developing a model that reflects my work, e.g., Client 1, Client 2. Different levels correspond to my workflow, e.g., 1. Discovery, 2. Scoping, 3. Estimation, etc. The model is really a taxonomy, an information architecture. I can use the same pattern for each new activity.
  5. Classification always requires tinkering. I’ve been slowly improving the way I organize files into folders for as long as I’ve been working. Some patterns get reused over time, others get improved. Tinkering never ends.

(I will discuss the use of tagging later. Frankly, I find manual tagging hopeless.)

Several methods to compute association between passages of text. Time to unpack Walter J. Ong.

What does it mean for two passages of text to be associated? How can association be computed? The answer always comes down to word matching. But not all words are of equal value. Some words say the same thing in different ways. And words mean different things depending on how they are combined. In the previous post I gave a rough cut at how Lila could compute association. A finer method would use more advanced techniques. Here is a list of several techniques. I briefly describe how each method can be used to select important words that limit a search, or expand meaning for a broader search. Once an adequate search query can be defined, a results ranking can be returned and a measure of association computed.

1. Parts of Speech

Nouns and noun phrases. Nouns and noun phrases are most important because they describe the subject of sentences: people, places, things. This is what the text is about, the focus. Nouns and noun phrases can be extracted using Natural Language Processing (NLP) technology, ready to use as keywords in a search algorithm.

Nouns with Verbs. Verbs indicate change, perhaps the essence of meaning. Compare “the fox and the chicken” with “the fox ate the chicken.” Text without verbs is a continuous stream with nothing happening. I think of verbs as heat, indicating an exchange of energy, a change of state. Verbs indicate something meaningful is going on with a noun, and are worth capturing for search.

Adjectives and Adverbs. Adjectives modify a noun, adverbs modify verbs. Staples of grammar. Extended parts-of-speech analysis can help decide which nouns and verbs are more important to the meaning of text.

2. Normal Forms, Lemmas, and Synonyms

A normal form is the standard expression of multiple surface forms; e.g., $200 = “two hundred dollars”, IBM = “International Business Machines”.  A lemma is the canonical form of a word; e.g., “run” for “runs”, “ran”, and “running.” Synonyms are words of comparable meaning; e.g., tortoise = turtle. These variations can be handled by lookup in existing NLP dictionaries. Matching can be expanded on any terms declared equivalent in a dictionary.
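Here is a minimal sketch of that dictionary look-up, using the examples from the paragraph above. The three tables are tiny hypothetical stand-ins for real NLP dictionaries, and the function names `normalize` and `expand` are my own labels.

```python
# Hypothetical look-up tables standing in for real NLP dictionaries.
NORMAL_FORMS = {"two hundred dollars": "$200",
                "international business machines": "IBM"}
LEMMAS = {"runs": "run", "ran": "run", "running": "run"}
SYNONYMS = {"tortoise": {"turtle"}, "turtle": {"tortoise"}}

def normalize(text):
    """Replace known surface forms with their normal form, then lemmatize words."""
    lowered = text.lower()
    for surface, normal in NORMAL_FORMS.items():
        lowered = lowered.replace(surface, normal)
    return [LEMMAS.get(word, word) for word in lowered.split()]

def expand(terms):
    """Broaden a set of search terms with any declared synonyms."""
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded

print(normalize("She ran to International Business Machines"))
print(expand({"tortoise"}))
```

Normalizing narrows many surface forms down to one term for matching; expanding does the opposite and widens the search to equivalent terms.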

3. Word Properties

Quantitative properties of words have been compiled by language researchers and scientists. These properties are available as lists for direct look-up. They can be used to decide which words are more important as keywords. The trick is to decide how the properties imply importance. Here are some possible applications.

Frequency. An infrequently used word is more important than a frequently used word. E.g., “tortoise” is less frequently used than “tower.” Infrequency implies more deliberate word selection.

Concreteness. Abstract words summarize many concrete examples. “Food” is more abstract than “apple.” An abstract word is virtually metadata, and can generally be considered more important. I studied Allan Paivio’s dual-coding theory as an undergrad psychology student. The theory seems to have receded, but his measures of word concreteness are begging to be tapped by Lila.

4. Phrase and Sentence Properties

Sentiment and Emotion. Positive or negative regard for a thing is referred to as sentiment. “I like apples” is positive sentiment, and “I hate bananas” is negative sentiment. Sentiment is also an indicator of emotion. Generally, positive sentiment can be assumed to indicate a higher degree of association, unless of course you are looking for contrary views.

Idea Density. The number of ideas in words can be computed as idea density. “The old gray mare has a big nose”: this sentence is short and choppy; it has low idea density. “The gray mare is very slightly older than …”: this sentence has complex interrelationships; it has higher idea density. Idea density can be computed as Number of Ideas / Number of Words. Generally, text with comparable idea density can be assumed to have a greater association. For example, an academic paper on a subject is more likely to be associated with another academic paper than, say, a blog post.

5. Orality

In Orality and Literacy: The Technologizing of the Word, Walter J. Ong identifies properties of oral communication in societies in which literacy is unfamiliar. I read this book in 2012. I wrote,

One might think that oral culture could not engineer complex works, yet the Iliad and the Odyssey were oral creations. Ong explains the properties of orality that make this possible. Oral memory is achieved through repetition and cliche, for example. Also, phrasing is aggregative, e.g., “brave soldier” rather than analytic, e.g., “soldier”. … Ong contrasts many more properties of oral memory. They define the lifeworld of thought prior to structuring through literacy. It is an architecture of implicit thought, of domain knowledge. It blows my information technology mind to think how these properties might be applied to the task of structuring data in unstructured environments, e.g., crawling the open web. I have not stopped thinking about it. It may take years to unpack.

The time has come to unpack these ideas. Oral phrasing uses sonorous qualities like repetition to increase emphasis, aid memory, and shape complexity. These oral patterns carry over to the way people write text today, and are one of the reasons computers have trouble analyzing unstructured text. But Ong catalogued these techniques and at least some of them can be used to select concepts of importance. A text algorithm can easily detect word repetition, for example.
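Detecting repetition really is easy. The sketch below counts repeated words in a passage with Python’s standard library; the sample phrase is made up for illustration, and a fuller version would presumably drop common function words first (an assumption on my part).

```python
import re
from collections import Counter

def repeated_words(text, min_count=2):
    """Return (word, count) pairs for words occurring at least min_count times."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    # most_common() yields words from most to least frequent.
    return [(w, n) for w, n in counts.most_common() if n >= min_count]

print(repeated_words("brave soldier, brave sailor, brave soldier"))
```

Words that surface in this list are candidates for extra weight when choosing the keywords of a slip.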

A first rough cut at Lila’s calculation of association between slips. Keep it simple to scale for large volumes of text.

How will Lila calculate the strength of association between slips? Here is a first rough cut. The key thing I want to illustrate is the use of simple steps to decide the subject of a slip and compute a quantitative measure of association with other slips. The method must be simple and quick to scale for large volumes of text. A chess program has to contend with a practically infinite number of combinations. It uses a simple evaluation of game rules and a sum of chess piece points. E.g., if a move leaves white with ten points and black with nine points, then that move is a good one for white. This simple calculation (applied for as many iterations as a game level will allow) is sufficient to beat most chess players on the planet. Lila uses a comparable approach.

The simple rules for text analysis are the following:

  1. Extract nouns. Basic grammar says that you find the subject of a sentence in the nouns: people, places and things.
  2. Use word properties to rank their relative meaningfulness. For example, if a word has a lower frequency of usage it can be considered more interesting and important. There are several such word properties that can be applied by a simple calculation. Just word frequency will be used here. Based on the properties, rank the nouns.
  3. Use synonyms and variant forms to match on meaning rather than just a single surface form. Variant forms and synonyms are a simple and powerful semantic matching technique.

Here is an example. Suppose Stephen Hawking used Lila when writing A Brief History of Time. Suppose this line was a slip in his writing project:

Most people would find the picture of our universe as an infinite tower of tortoises rather ridiculous, but why do we think we know better? What do we know about the universe, and how do we know it?

In this first rough cut, Lila analyzes the slip and generates the following table:

Noun      Frequency of usage (academic)  Rank  Synonym
tortoise  123                            1     turtle
tower     1563                           2     castle
universe  4868                           3     cosmos

Lila has applied the rules:

  1. Nouns were extracted.
  2. For each noun, frequency of usage was used to calculate rank. An arbitrary rule for this example limits the nouns of interest to the top three ranked nouns.
  3. Synonyms, e.g., tortoise = turtle, were generated by simple look-up from a list.

The nouns are used to find other related slips and compute their strength of association.
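The three rules can be sketched in a few lines of Python. The frequencies and synonyms are copied from the table above; the noun list is assumed to have been extracted already by a part-of-speech tagger, and the name `rank_nouns` is just a label for this sketch.

```python
# Frequencies and synonyms copied from the table above; a real system
# would look these up in full word lists.
FREQUENCY = {"tortoise": 123, "tower": 1563, "universe": 4868}
SYNONYM = {"tortoise": "turtle", "tower": "castle", "universe": "cosmos"}

def rank_nouns(nouns, top_n=3):
    """Rank nouns from rarest to most common and keep the top_n."""
    # Skip nouns with no frequency data (an assumption of this sketch).
    known = {n for n in nouns if n in FREQUENCY}
    ranked = sorted(known, key=lambda n: FREQUENCY[n])
    return [(rank, noun, FREQUENCY[noun], SYNONYM.get(noun))
            for rank, noun in enumerate(ranked[:top_n], start=1)]

# Nouns assumed to have been extracted from the Hawking slip.
nouns = ["universe", "tower", "tortoise", "universe"]
for row in rank_nouns(nouns):
    print(row)
```

Each printed row is (rank, noun, frequency, synonym), reproducing the table for this one slip.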

A first Google search was performed on [tortoise tower universe]. It would make sense to apply a boost factor to the keywords based on the ranking; in this case I trusted Google to use word order. Many results were nearly identical to the original slip. Nearly identical slips may be interesting to Hawking but will not add much insight.

A second search was performed on the synonyms [turtle castle cosmos]. Divergent results were found, such as a website about Turtle’s ice cream. A snippet was selected from the website for analysis by the Lila algorithm:

“Turtles Oreo
There’s more to see…
Sign up to discover and save different things to try in 2015.
About this map
Ice Cream

28 Pins •
[Image: Cosmic Castle”

Noun    Frequency of usage (academic)  Rank  Association
cream   419                            1     No Match
pins    724                            2     No Match
castle  788                            3     Match

This site is about an unrelated subject, ice cream. Limiting again to the top three ranked nouns, there is only one match — between Hawking’s term “tower” and its synonym “castle.” A measure of association of 1/3 or 0.33 is computed. This low value could be used to obscure or exclude the slip.

Another result matched better, a blog about turtles in cosmology. A snippet was analyzed using Lila’s algorithm:

The Cosmic Turtle Around the World

Japan:

In Japanese mythology, the tortoise supports the ‘Abode of the Immortals’ and the ‘Cosmic Mountain’, where the Cosmic Mountain relates to the axis mundi – the world axis.

Noun           Frequency of usage (academic)  Rank  Association
cosmos/cosmic  955                            1     Match
turtle         1116                           2     Match
axis           2046                           3     No Match

Perhaps Hawking would not be interested in such a blog. Perhaps he would. In this case there are two matches. A measure of association of 2/3 or 0.67 is computed. A more refined algorithm would weigh in the triple use of “cosmic.” This site is related subject matter.
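The measure used in both examples is simply the fraction of a candidate’s top three nouns that match one of Hawking’s nouns or its synonym. A minimal sketch, with the function name `association` chosen for illustration and “cosmic” assumed to have been normalized to “cosmos” beforehand:

```python
def association(focus_terms, candidate_nouns):
    """Fraction of the candidate's top nouns that match a focus term or synonym."""
    if not candidate_nouns:
        return 0.0
    matches = sum(1 for noun in candidate_nouns if noun in focus_terms)
    return matches / len(candidate_nouns)

# Hawking's top three nouns plus their synonyms, from the first table.
focus = {"tortoise", "turtle", "tower", "castle", "universe", "cosmos"}

ice_cream = ["cream", "pins", "castle"]
cosmic_blog = ["cosmos", "turtle", "axis"]  # "cosmic" treated as "cosmos"

print(round(association(focus, ice_cream), 2))
print(round(association(focus, cosmic_blog), 2))
```

A threshold on this value could then decide whether a slip is shown, obscured, or excluded.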

This is a first rough cut of how Lila will calculate the strength of association between slips. Certainly a more sophisticated algorithm is required, taking into account multiple word properties. The algorithm should weigh words as more important if they repeat within a slip, especially if they repeat in author-suggested categories and tags. But sophistication must always answer to the need for a simple algorithm. Simplicity is the only way to achieve reasonable performance when analyzing large quantities of text.

Lila’s four cognitive extensions to the writing process

Lila is cognitive writing technology. It uses natural language processing to extend the cognitive abilities of a writer engaged in a project. In the previous post I described the seven root categories used to organize a non-fiction writing project and to optimize Lila’s analytics. These categories are considered a natural fit with the writing process and can be visualized as folders that contain notes. In this post I present a diagram that maps Lila’s four cognitive extensions through the folders to the writing process.

[Figure: 4 extensions to writing]

A “slip” is the unit of text in Lila. A slip is equivalent to a “note,” usually one or a few sentences, but with no hard limit.

  1. The early stages of the writing process focus on thinking and research. An author sends slips to an Inbox and begins filing them in a Work folder. Documents and books that have not been read are sent to the TLDR folder. Lila processes the unread content, generating slips that are also filed in the Work folder.
  2. As the slips build up the author analyzes them. Using Lila, an author can visualize the connections between slips. The author can “pin” interesting connections and discard others.
  3. Connections are made between the author slips, and from author slips to unread content slips. Where the connections are made to unread content, a link is provided to the original document or book. Authors can read both slips and original material in the context of their own content. This is called “embedded reading”, allowing for swifter analysis of new material.
  4. Analysis leads to organizing and writing drafts. An author will organize content in a particular hierarchical view, a table of contents. The author can get new insight by viewing the content in alternate hierarchical views generated by Lila.

The writing process usually involves each of these activities (thinking, research, analysis, etc.) at every step. Lila can perform its cognitive extensions at any step, e.g., integrate a new unread document late in the process. As the writing process continues, slips will be edited and integrated into a longer work for publication. Lila maintains a sense of “slips” in the background even when the author is working on a long integrated unit of text.

Seven root categories for organizing non-fiction writing and optimizing Lila’s analytics

Lila technology collaborates with an author engaged in a writing project. A model of the writing process is assumed, one that is considered natural for writing non-fiction, at least, and compliant with existing writing software. In this model, an author writes notes and organizes them into categories. Seven root categories are assumed to be fundamental to a writing project, folders that contain the written material. The categories are presented here not so much as Lila system requirements, but as a best practice, structures that optimize the writing process and Lila’s analytics. If you do not use these categories to organize your non-fiction writing project, you might consider doing so, whether or not you intend to use Lila.

# | Step in the Writing Process | Structural Category/Folder | Category/Folder Description | Comparison with Pirsig’s categories
1 | The author begins a project. A root Project folder is created, a repository for everything else. | Project | A single root folder. Contains all other folders and slips. Root folder may contain high level instructions regarding project plans, to do lists, etc., but these are not content for Lila’s analysis. | Like Pirsig’s PROGRAM slips, the Project folder may contain “instructions for what to do with the rest of the slips” but this information will not operate as a “program.” All programming functions will be handled by Lila code.
2 | The author takes notes on ideas using various software programs on different devices. Many notes will require further thought before filing into the project. These notes get sent to an inbox, a temporary queue, a point for later conscious attention and classification. | Project > Inbox | The Inbox may be an email inbox or an Evernote notebook dedicated to an inbox function. There can be multiple inboxes. Notes in the inbox may be tentatively assigned categories and/or tags, but these will be reviewed. | Inbox corresponds to Pirsig’s UNASSIMILATED category, “new ideas that interrupted what he was doing.”
3 | Notes are filed into a main folder, a workspace for all the active content. | Project > Work | Notes in the Work folder are organized by categories and subject classified by tags. These notes are the target of Lila’s analytics. See upcoming post on subject classification for more information. | The Work folder contains all the topic categories Pirsig developed as he was working.
4 | Some ideas are considered worth noting, but either not sufficiently relevant or too disruptive to file into the main work. These notes should not be trashed, but parked for later evaluation. | Project > Park | Parked notes are excluded from Lila’s analytics, but can be brought back into play later. | Park corresponds to Pirsig’s CRIT and TOUGH categories. I see these two categories as the positive and negative versions of the same thing, i.e., disruptive ideas. Don’t let them take over but don’t ignore them either. Let them hang out in the Park for awhile.
5 | A primary function of Lila is to assist with the large volume of content that an author does not have time to read. On the web, the acronym TLDR is used, “Too Long; Didn’t Read.” | Project > TLDR | TLDR is not a flippant term. Content Management Systems typically have special handling for large files. Lila will generate notes (slips) from this unread content, and present it in context for embedded reading. | Pirsig provided no special classification for unread content. Likely it just went in a pile, perhaps left unread.
6 | Some notes, and chains of notes, seem important at one time but later are considered irrelevant or out of scope. This typically happens as the project matures and editing is undertaken. These notes are not trashed but archived for possible reuse later. | Project > Archive | Archived notes are excluded from Lila’s analytics. Perhaps a switch will allow them to be included. The archive could tie into version control for successive drafts. | Pirsig filed these notes in JUNK, “slips that seemed of high value when he wrote them down but which now seemed awful.”
7 | Other notes are just plain trash: duplicates, dead lines of thought. To avoid noise in the archive it’s best to trash them. | Project > Trash | Trashed notes are excluded from Lila’s analytics. These notes may be purged on occasion. | Pirsig filed these notes in JUNK, but maintained them indefinitely.
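For anyone who thinks better in code, the seven folders could be modeled roughly as below. This is only a sketch: the table names Work as the target of analytics and says Park, Archive and Trash are excluded, while treating Project, Inbox and TLDR as not directly analyzed is my assumption here. The function names are hypothetical.

```python
from enum import Enum

class Folder(Enum):
    PROJECT = "Project"
    INBOX = "Project > Inbox"
    WORK = "Project > Work"
    PARK = "Project > Park"
    TLDR = "Project > TLDR"
    ARCHIVE = "Project > Archive"
    TRASH = "Project > Trash"

def in_analytics(folder):
    """Only Work is named as the target of analytics (see assumption above)."""
    return folder is Folder.WORK

def slips_for_analysis(slips):
    """Keep the text of slips filed in a folder that is analyzed."""
    return [text for folder, text in slips if in_analytics(folder)]

def restore(slips, index):
    """Bring a parked or archived slip back into play by refiling it under Work."""
    folder, text = slips[index]
    if folder in {Folder.PARK, Folder.ARCHIVE}:
        slips[index] = (Folder.WORK, text)

filed = [(Folder.WORK, "turtles all the way down"),
         (Folder.PARK, "a disruptive idea"),
         (Folder.TRASH, "duplicate note")]
print(slips_for_analysis(filed))
restore(filed, 1)
print(slips_for_analysis(filed))
```

The point of the sketch is that Park and Archive are reversible states, while Trash is not brought back by the same route.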

“Actually, these last two piles, JUNK and TOUGH, were the piles that gave him the most concern.”

Phaedrus is the philosopher-protagonist in the well-known book, Zen and the Art of Motorcycle Maintenance by Robert Pirsig. Phaedrus is Robert Pirsig, the author, and his books represent a serious metaphysical inquiry. Lila is the lesser-known sequel in which Phaedrus refines and organizes his thought. It is the organizational elements that inspired my current software project. In the following quote, Phaedrus describes the information architecture of his project. It is elegant and complete, found in better organized folder systems, reflecting the natural development of thought.

In addition to the topic categories, five other categories had emerged. Phaedrus felt these were of great importance:

The first was UNASSIMILATED. This contained new ideas that interrupted what he was doing. They came in on the spur of the moment while he was organizing the other slips or sailing or working on the boat or doing something else that didn’t want to be disturbed. Normally your mind says to these ideas, ‘Go away, I’m busy,’ but that attitude is deadly to Quality. The UNASSIMILATED pile helped solve the problem. He just stuck the slips there on hold until he had the time and desire to get to them.

The next non-topical category was called PROGRAM. PROGRAM slips were instructions for what to do with the rest of the slips. They kept track of the forest while he was busy thinking about individual trees. With more than ten-thousand trees that kept wanting to expand to one-hundred thousand, the PROGRAM slips were absolutely necessary to keep from getting lost.

What made them so powerful was that they too were on slips, one slip for each instruction. This meant the PROGRAM slips were random access too and could be changed and resequenced as the need arose without any difficulty. He remembered reading that John Von Neumann, an inventor of the computer, had said the single thing that makes a computer so powerful is that the program is data and can be treated like any other data. That seemed a little obscure when Phaedrus had read it but now it was making sense.

The next slips were the CRIT slips. These were for days when he woke up in a foul mood and could find nothing but fault everywhere. He knew from experience that if he threw stuff away on these days he would regret it later, so instead he satisfied his anger by just describing all the stuff he wanted to destroy and the reasons for destroying it. The CRIT slips would then wait for days or sometimes months for a calmer period when he could make a more dispassionate judgment.

The next to the last group was the TOUGH category. This contained slips that seemed to say something of importance but didn’t fit into any topic he could think of. It prevented getting stuck on some slip whose place might become obvious later on.

The final category was JUNK. These were slips that seemed of high value when he wrote them down but which now seemed awful. Sometimes it included duplicates of slips he had forgotten he’d written. These duplicates were thrown away but nothing else was discarded. He’d found over and over again that the junk pile is a working category. Most slips died there but some reincarnated, and some of these reincarnated slips were the most important ones he had.

Actually, these last two piles, JUNK and TOUGH, were the piles that gave him the most concern. The whole thrust of the organizing effort was to have as few of these as possible. When they appeared he had to fight the tendency to slight them, shove them under the carpet, throw them out the window, belittle them, and forget them. These were the underdogs, the outsiders, the pariahs, the sinners of his system. But the reason he was so concerned about them was that he felt the quality and strength of his entire system of organization depended on how he treated them. If he treated the pariahs well he would have a good system. If he treated them badly he would have a weak one. They could not be allowed to destroy all efforts at organization but he couldn’t allow himself to forget them either. They just stood there, accusing, and he had to listen.

Pirsig, Robert M. (1991). Lila: An Inquiry into Morals. Pg. 25-26.

What is the difference between a Question and an Answer? Focus and Context Slips in Lila.

What is the difference between a Question and an Answer? Both are bunches of text. One could say that the Question is missing something that the Answer provides, but Questions and Answers are not often shaped as neatly as puzzles, with the missing part plugging easily into the whole. In natural language processing (NLP), the distinction between Question and Answer is understood in the cognitive terms of Focus and Context. Focus refers to attention, the text that is currently being analyzed. In Lila, each slip written by the author is analyzed as a Focus point, asking a Question of the large corpus of unread content. The Focus slip provides the particulars of the Question, but the point is always to find relevant Context that will shed new light on the Focus or expand it with new information. The Question queries the corpus and Lila responds with zero, one or many Context slips. Focus is joined to Context through the Question, as shown in this first figure.

[Figure: focus and context slips]

The Question is implemented as an NLP query. Search results are ranked by relevance. The rankings can be expressed as a correlation between Focus and Context slips. At this point the system has completed a portion of the cognitive work that used to be manual, i.e., reading and filtering from a large volume of material. That work is now completed automatically, and the author can instead focus on higher-order cognitive work, thinking about the Context and integrating it into the Focus.
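To make the idea concrete, here is a small sketch in which a Focus slip is turned into a query and a corpus returns zero, one or many ranked Context slips. Plain word overlap (Jaccard similarity) stands in for a real NLP query; that substitution, the sample corpus, and the names `terms` and `ask` are assumptions for illustration only.

```python
import re

def terms(text):
    """Lower-case word set; a stand-in for a real NLP query builder."""
    return set(re.findall(r"[a-z]+", text.lower()))

def ask(focus_slip, corpus):
    """Treat the Focus slip as a Question and rank Context slips by overlap."""
    focus_terms = terms(focus_slip)
    scored = []
    for context_slip in corpus:
        context_terms = terms(context_slip)
        union = focus_terms | context_terms
        # Jaccard overlap: shared terms divided by all terms in either slip.
        score = len(focus_terms & context_terms) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, context_slip))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored

corpus = ["the cosmic turtle in myth",
          "ice cream flavors",
          "turtles and cosmology in Japan"]
for score, context in ask("turtles in cosmology", corpus):
    print(round(score, 2), context)
```

Editing the Focus slip and calling `ask` again gives a new ranking, with no stored links to update.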

An author might choose to modify a Focus slip, or “pin” a Context slip to it, creating an association for later work. In a traditional programmed system, these associations would be maintained with unique identifiers, shown in the second figure as “slip1,” “slip42,” and so on.

[Figure: pin association]

In cognitive systems, there is a shift away from using unique identifiers. It requires a change in thinking about how information is organized. In a traditional Von Neumann architecture, data is stored in structured tables and related through unique identifiers. This kind of system is normally planned carefully in advance because changes in the information architecture require costly database work. Cognitive systems like Lila are being designed to be more fluid, allowing for quickly shifting views and analysis, essentially changing information architecture on the fly. How is this possible? Consider that it is really the Question and its query that make the association. The query is the embodied link between the full text of a Focus slip and the full text of the Context slip. One can imagine a powerful cognitive system in which an Author edits a Focus slip and the system responds dynamically with new queries and Context slips.

Consider that idea again. It is really the Question and its query that make the association. The query is the embodied link between the full text of a Focus slip and the full text of the Context slip. We think of metadata as being “shorter” than content, e.g., “slip1” is an identifier for a content record, a short item that stands in place of the longer content. This “shortness” is traditionally what makes metadata useful in organizing and finding content. Things change with cognitive systems. Unlike a unique identifier, a query is built from the full text of a Focus slip and maintains its meaning. The difference between metadata and content breaks down. (When surveillance agencies tell you they are only looking at metadata, this means they are also looking at content. Think about it.)

In Lila it will be practical to use unique identifiers to store temporary pinned associations between Focus and Context slips. As the work progresses the pins and their associations will disappear because the author will modify the Focus slips into a longer integrated stream of text for publication.

Lila is cognitive writing technology built on top of software like Evernote. Key differences.

Writers everywhere benefit from content management software like Evernote. Evernote can collect data from multiple devices and locations and organize it into a single writing repository. Evernote is beautiful software. For the last few years, I have been using Google Drive to collect notes. Recently I tried Evernote again, and I am impressed enough to switch. Notebooks, tags, collaboration, web clipping, related searches. All very nice.

Lila is cognitive writing technology built on top of software like Evernote. Here are some key differences between the products:

1. Evernote users read long-form content manually, decide if it is relevant, and then write notes to integrate it into their project. Lila will pre-read content for users and embed relevant notes (slips) in the context of the user’s writing. This will save the writer lots of reading and evaluation time.

2. Evernote users get “related searches” from a very limited number of web sources. Lila will perform open web searches for related content.

3. Evernote users can visualize a limited number of connections between notes. I have yet to get any utility out of this. Lila will use natural language search to generate a vast number of connections between notes, allowing a user to quickly understand complex relationships between notes.

4. Evernote users can use tags to construct a hierarchical organization of content. Notebooks can only have one sub-level of categorization, essentially chapters, but many writers need additional levels of classification. Tags can be ordered hierarchically and if you prefix them with a number they will sort in a linear order. You can use tags for hierarchical classification but it creates problems.

  • If you want both categories and tags, you will have to use a naming convention to split tags into two types.
  • Numbering tags causes them to lose type-ahead look-up functionality, i.e., you have to start by typing the number. It is a problem because numbers can be expected to change often.
  • If you decide to insert a category in the middle of two tags, you have to manually re-number all the tags below.
  • Tags are shared between Notebooks. Maybe that works for tags? Not for hierarchical sectioning of a single work.
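The renumbering chore is easy to see in a sketch. The snippet below works on a plain list of tag names with a two-digit prefix; that naming format is my assumption, and nothing here calls Evernote itself.

```python
def insert_category(tags, position, name):
    """Insert a category at a 1-based position and renumber every prefix."""
    # Strip the old number prefix, e.g. "02 Methods" -> "Methods".
    names = [tag.split(" ", 1)[1] for tag in tags]
    names.insert(position - 1, name)
    return [f"{i:02d} {n}" for i, n in enumerate(names, start=1)]

tags = ["01 Introduction", "02 Methods", "03 Results"]
print(insert_category(tags, 2, "Background"))
```

Every tag after the insertion point gets a new name, which is exactly the work a writer has to do by hand today.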

None of these problems are technically insurmountable. I hope Evernote comes out with enhancements soon. I would like to build Lila on top of Evernote. Lila has something to add. To be cognitive means an inherent ability to automate hierarchical classification. Lila will be able to suggest hierarchical views, different ways of understanding the data, different choices for what could be a table of contents.