
Actionable Data by Daniel Hopping
Last month I talked about the explosion of unstructured information in retail and the new technologies to tame its growth. This month I want to discuss how you can get actionable data from some of that content.
Unstructured data includes content from newer channels such as Blogs and Wikis. Many companies have embraced these forms of communication but have no effective way to turn the content into something actionable.
Blogs now number in the millions and, according to the latest issue of Forbes, are growing at 100,000 per day. Anyone can now create a free blog on Google’s Blogger.com. Blogs have both made and crippled companies, and many companies are starting their own blogs to counter those of dissatisfied customers. There are numerous Blog search engines, but the range of content is too great to comprehend even with the most sophisticated searches.
By the way, blogs are old hat now.
Wikis are the newest form of community interface. There are even blogs about Wikis. Wiki-wiki is Hawaiian for quick. A Wiki is like a blog on steroids: content can be cross-referenced and organized, and a thread is easy to follow. To best understand the Wiki, visit http://www.wikipedia.com and create an entry on any subject that interests you. Wikipedia is a free online encyclopedia in which anyone can edit or create topics. In fact, one blog has a Wiki joke.
Q. How many Wiki people does it take to change a light bulb?
A. One, but anyone can change it back.
Retailers are starting to use the Wiki technology for knowledge management. Buyers, merchants, store managers and vendors can use the Wiki to quickly input unstructured information such as a customer comment about a product or service. They can input information about a competitor, marketing promotions, new colors for fashion, what’s selling and what’s not. This information can be made instantly available to everyone in the company who needs it.
Still, who has time to read through the growing terabytes of data for relevant content? IBM Research has developed a tool for visualizing such evolving documents and the postings of multiple authors.
It is called History Flow Visualization.
History Flow Visualization Application is a “tool for visualizing dynamic, evolving documents and the interactions of multiple collaborating authors”. The application includes online help, as well as a plug-in for retrieving the history of a given page from any MoinMoin “wiki.” (MoinMoin is an advanced wiki engine.)
How does it work?
History Flow Visualization Application represents each version of a document as a vertical line whose length corresponds to the length of that version. The technology then applies a standard “diff” algorithm to successive versions of the document, using periods and angle brackets as token delimiters, so changes are tracked roughly sentence by sentence. This level of detail is effective for free-form prose. After matching passages of successive versions have been identified, each match is represented onscreen by a parallelogram connecting the corresponding sections of the two version lines. Both segments and parallelograms are colored to indicate authorship.
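To make the matching step concrete, here is a minimal sketch in Python. It is not IBM's code: it assumes sentence-like tokens split on periods and angle-bracket tags, and it uses the standard library's difflib as a stand-in for the “diff” algorithm. The function names tokenize and matching_segments and the two sample versions are made up for illustration.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Split text into sentence-like tokens, breaking after periods and around <tags>."""
    # Assumption: a token is either a whole <...> tag or a run of text ending at a period.
    parts = re.findall(r"<[^>]*>|[^.<]+\.?", text)
    return [p.strip() for p in parts if p.strip()]


def matching_segments(old_text, new_text):
    """Return (old_start, new_start, length) runs of tokens shared by both versions."""
    old_tokens, new_tokens = tokenize(old_text), tokenize(new_text)
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    # get_matching_blocks() ends with a zero-length sentinel, which is dropped here.
    return [(m.a, m.b, m.size) for m in matcher.get_matching_blocks() if m.size > 0]


# Hypothetical sample versions of a page.
v1 = "IBM makes computers. It was founded in 1911."
v2 = "IBM makes computers. It also sells services. It was founded in 1911."
for old_start, new_start, length in matching_segments(v1, v2):
    print(old_start, new_start, length)
```

Each tuple corresponds to one parallelogram: it links a run of sentences in the older version line to the same run in the newer one. The inserted sentence in the second version has no partner and would show up as a gap.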
History Flow Visualization Application has four main visualization modes that allow the user to comprehend the flow of changes and better understand the underlying pattern in the content.
Community view:
This is the default mode and it shows all contributions from different authors, color-coding the text to indicate the author of each sentence.
Individual author view:
This mode highlights the contributions of a single author and it depicts the persistence of these contributions over time.
Recent changes view:
This mode highlights the new content in each version of the Wiki page, independent of authorship. This view allows us to see what portions of the text have been edited the most over time. For instance, this view might point to an emerging trend.
Age view:
This mode has no colors representing authorship; instead, the focus is on the persistence of different contributions. A grayscale gradient goes from white (brand-new contribution) to dark gray (very old contribution).
The patterns revealed by History Flow Visualization Application show such information as spacing by date; occurrences of vandalism; authorship; growth; and persistence.
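The grayscale gradient of the age view can be sketched as a simple mapping from a contribution's age to a gray level. This is an illustration only. The linear ramp, the 8-bit endpoints (255 for white, 64 for dark gray) and the name age_to_gray are assumptions; the article does not say how the real tool computes its shades.

```python
def age_to_gray(age, max_age):
    """Map a contribution's age to an 8-bit gray level (255 = white, 64 = dark gray).

    Assumption: age is counted in versions survived and the ramp is linear.
    """
    white, dark_gray = 255, 64
    if max_age <= 0:
        return white
    # Clamp so ages outside [0, max_age] still give a valid shade.
    fraction = min(max(age / max_age, 0.0), 1.0)
    return round(white - fraction * (white - dark_gray))


# Brand-new text is white; text that has survived every version is dark gray.
print(age_to_gray(0, 10), age_to_gray(10, 10))
```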
The history flow application charts the evolution of a document as it is edited by many people using a very simple visualization technique.
Imagine a scenario where several people make contributions to a Wiki page at different points in time. Each person edits the page and then saves their changes, producing what becomes the latest version of that page. History flow connects text that has been kept the same between consecutive versions; in other words, it connects corresponding segments on the lines representing versions. Pieces of text that have no counterpart in the next (or previous) version are not connected, and the user sees a resulting “gap” in the visualization; this happens for deletions and insertions.
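The scenario above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the tool's actual implementation: each version is already split into sentences, difflib stands in for the diff step, and the author names and sample text are invented. Sentences carried over unchanged keep their original author; anything new is credited to whoever saved that version.

```python
from difflib import SequenceMatcher


def attribute_authors(revisions):
    """Given [(author, [sentence, ...]), ...] in chronological order, return for
    each version a list of (sentence, original_author) pairs."""
    history = []
    prev_sentences, prev_owners = [], []
    for author, sentences in revisions:
        # Every sentence starts out credited to the person saving this version...
        owners = [author] * len(sentences)
        matcher = SequenceMatcher(a=prev_sentences, b=sentences, autojunk=False)
        # ...then sentences matched to the previous version keep their earlier author.
        for block in matcher.get_matching_blocks():
            for offset in range(block.size):
                owners[block.b + offset] = prev_owners[block.a + offset]
        history.append(list(zip(sentences, owners)))
        prev_sentences, prev_owners = sentences, owners
    return history


# Hypothetical edit history: bob inserts a sentence, carol deletes one.
revisions = [
    ("alice", ["Wikis are quick.", "Anyone can edit."]),
    ("bob", ["Wikis are quick.", "They are used in retail.", "Anyone can edit."]),
    ("carol", ["They are used in retail.", "Anyone can edit."]),
]
for version in attribute_authors(revisions):
    print(version)
```

In the second version the inserted sentence belongs to bob, and in the third the deleted opening sentence simply has nothing to connect to. Those are the insertion and deletion gaps described above.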
A simple example
Here’s an example of a simple page with just a few edits: the first eight versions of the Wikipedia entry for IBM. The page has three named authors (listed at left), including a script which changed some formatting. Each author is given a unique color. Several anonymous authors also made contributions; their insertions are shown in shades of gray. The green regions show the contributions of the initial author, Peter Winnberg, many of which persist throughout the versions shown. Text that persists over time is darkened to indicate its age, so at the right side of the diagram Peter Winnberg’s contributions have changed from bright green to dark green.
Related work
There are many existing methods for visualizing document revisions. Several popular source control systems include the capability to color-code changed regions in files, and to show a side-by-side comparison of two files, graphically connecting matching sections. Other methods use a thumbnail view of a program, with line-by-line coloring to indicate authorship or age; see for example the work by Eick and others on software visualization. History flow diagrams have some visual similarity to ThemeRiver™ and to Inselberg’s parallel coordinates, but our method depicts a completely different type of data. As far as we know the timeline visualization introduced here is new, but please let us know if you’re familiar with other work we should cite.
History Flow is available to the public on IBM’s alphaWorks website.
History Flow was developed by Martin Wattenberg and Jonathan Feinberg for visualizing patterns in very complex and dynamic collaborative content.
Daniel Hopping is a global technology futurist, author, consultant and speaker. With four decades of hands-on experience, Dan’s area of expertise is forecasting the impact that technology will have on the retail industry and tomorrow’s consumer.
Copyright © Daniel Hopping