
May 18, 2007

WMatrix: text analysis + semantic analysis

Text analysis programs that can do word frequency, KWIC (keyword in context), concordancing, etc. are fairly well established (cf. Harald Klein's text analysis informational pages or U of Alberta's TAPoR site).
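For readers who have not met these staples, here is a rough Python sketch of a word-frequency count and a KWIC listing. The function names, the simple regex tokenizer, and the `width` parameter are my own illustrative choices, not taken from any of the tools mentioned above.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase "word" tokens are good enough for a demo.
    return re.findall(r"\w+", text.lower())


def word_frequencies(text):
    """Count how often each token occurs in the text."""
    return Counter(tokenize(text))


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for every hit,
    with up to `width` tokens on each side."""
    tokens = tokenize(text)
    target = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits
```

Calling `kwic(text, "cat", width=2)` on a short passage returns one tuple per occurrence of "cat", which is essentially a concordance line.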

WMatrix is a web-based tool that does the standard analysis but, like more recent knowledge mining applications, "extends the keywords method to key grammatical categories and key semantic fields." It also adds a log-likelihood tool to "perform a comparison of the frequency list for their corpus against another larger normative corpus such as the BNC sampler, or against another of their own texts." It does all of this on tagged (HTML, SGML, XML) texts that you upload to their site. The only downside: after the free subscription runs out, it costs £100/yr to subscribe.
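To give a feel for what a log-likelihood comparison of two frequency lists involves, here is a minimal Python sketch using the log-likelihood keyness statistic that is common in corpus linguistics. Whether WMatrix computes exactly this is an assumption on my part, and the function names and token-list inputs are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood score for one word.

    a = frequency in your corpus, b = frequency in the reference corpus,
    c = total tokens in your corpus, d = total tokens in the reference corpus.
    """
    # Expected frequencies if the word were equally common in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    if a > 0:  # a zero count contributes nothing to the sum
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keyness(study_tokens, reference_tokens):
    """Rank the words of the study corpus by log-likelihood, highest first."""
    study = Counter(study_tokens)
    ref = Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    scores = {w: log_likelihood(study[w], ref[w], c, d) for w in study}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A high score only says that a word's frequency differs markedly between the two corpora; you still have to compare the relative frequencies to see whether it is overused or underused in your own text.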


Related Links
1) GATE: General Architecture for Text Engineering, University of Sheffield
2) Nasukawa, T. and T. Nagano, "Text analysis and knowledge mining system"
3) Overview of natural language processing at Wikipedia

Posted by hag at May 18, 2007 8:53 AM
