Text analytics, also referred to as text mining, is used in business intelligence and data analysis to identify structure and context in text data. It draws on a variety of linguistic, statistical, and machine learning techniques. Seth Grimes, an independent analyst covering the analytics market, estimates the size of the text analytics market to be on the order of $1 billion.
The text analytics market is rapidly consolidating, with numerous mergers and acquisitions over the past year. Oracle purchased Endeca for its technology to mash up unstructured data from many different data sources. HP acquired similar technology from Autonomy. IBM picked up Vivisimo, and Lexmark acquired ISYS Search Software, for their rich search features.
The popularity of Big Data is driving huge interest in text analytics, and that interest is causing the technology to evolve quickly. Text analytics techniques are becoming richer and more sophisticated in the information and insight they can provide.
The ‘Tag Cloud’, for example, is a simple text analytics technique that was popularized during the rise of Web 2.0. But at the recent Text Analytics Summit in Boston, David Williams, manager of marketing analytics and optimization for Orlando, Fla.-based Walt Disney World, announced that Tag Clouds are dead. With a Tag Cloud, Williams said, “You’re just looking at words and you don’t know the context of those words.” He said a better technique is ‘concept linking’, which is available in the Text Miner product from SAS. The tool draws links between related words and shows the strength of each relationship by drawing thicker lines for stronger links. This is just one of many text analytics tools and products that are becoming available.
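To make the contrast concrete, here is a minimal Python sketch. It is not how SAS Text Miner implements concept linking; it uses plain co-occurrence counts as a simplified stand-in. The sample comments, the stopword list, and the function names (tag_cloud_counts, link_weights, line_width) are all made up for illustration. The first function produces what a tag cloud sees (isolated word counts), while the second counts how often two words appear in the same comment, which is the kind of weight that could be mapped to line thickness.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical stopword list and sample comments, for illustration only.
STOPWORDS = {"the", "was", "but", "for", "a", "and"}
docs = [
    "The ride was fun but the wait was long",
    "Long wait for the ride",
    "The food was great",
    "Great food, long wait",
]

def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]

def tag_cloud_counts(documents):
    """Tag-cloud view: how often each word occurs, with no context."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts

def link_weights(documents):
    """Co-occurrence view: how many documents contain both words of a pair."""
    weights = Counter()
    for doc in documents:
        terms = sorted(set(tokenize(doc)))  # sorted so each pair has one canonical order
        weights.update(combinations(terms, 2))
    return weights

def line_width(weight, max_weight, min_px=1.0, max_px=8.0):
    """Scale a link weight linearly to a line thickness in pixels."""
    return min_px + (max_px - min_px) * weight / max_weight

counts = tag_cloud_counts(docs)
weights = link_weights(docs)
strongest = max(weights.values())

print("Most frequent words:", counts.most_common(3))
for (a, b), w in weights.most_common(4):
    print(f"{a} -- {b}: weight {w}, line width {line_width(w, strongest):.1f}px")
```

In this toy data the tag cloud only reports that "wait" and "long" are frequent, while the link weights show that they occur together, which is the context Williams says a tag cloud leaves out. A real tool would likely use a statistical association measure rather than raw counts.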
Increasingly, text analytics applies semantics to derive inter-relationships between data. The concept of the ‘semantic web’ has been talked about for some time but has not yet been fully realized. Part of the problem has been that there are no standards for the technology, but as interest in text analytics continues to grow, that is likely to change.