
May 21, 2007

DH 2007 Abstracts

As has been the case for the past 18 years, this year's Digital Humanities Conference (formerly the ACH/ALLC Conference) has a wealth of interesting presentations. The conference has posted the abstracts online at:
http://www.digitalhumanities.org/dh2007/abstracts/

Posted by hag at 10:59 AM | Comments (0)

May 18, 2007

Armadillo: Historical Data Mining

"This project examines new ways of extracting ('mining') relevant information from unconnected electronic sources. It is an attempt to answer the question of how to locate and interpret information contained in distributive online research datasets effectively, using criteria acceptable to the Arts and Humanities community."

The project team will be presenting at the Digital Resources in the Humanities conference in September 2007.

Posted by hag at 9:46 AM | Comments (0)

WMatrix: text analysis + semantic analysis

Text analysis programs that can do word frequency counts, KWIC (keyword in context), concordancing, etc. are fairly well established (cf. Harald Klein's text analysis informational pages or U of Alberta's TAPoR site).
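For readers who have not seen these standard functions, here is a minimal sketch of word frequency and KWIC in Python. It is only an illustration: the tokenizer, the function names, and the sample sentence are my own assumptions, not how any of the tools mentioned here actually work.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (a crude, assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each token occurs."""
    return Counter(tokenize(text))


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for every hit of keyword."""
    tokens = tokenize(text)
    target = keyword.lower()
    hits = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Hypothetical sample text, for illustration only.
sample = "The whale surfaced. The crew watched the whale dive again."

if __name__ == "__main__":
    print(word_frequencies(sample).most_common(2))
    for left, key, right in kwic(sample, "whale", width=2):
        print(f"{left:>15} [{key}] {right}")
```

A real concordancer would also deal with punctuation, lemmatization, and markup, all of which this sketch ignores.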

WMatrix is a web-based tool that does the standard analysis but, like more recent knowledge mining applications, "extends the keywords method to key grammatical categories and key semantic fields." It also adds a log-likelihood tool to "perform a comparison of the frequency list for their corpus against another larger normative corpus such as the BNC sampler, or against another of their own texts." It does all of this on tagged (HTML, SGML, XML) texts that you upload to the site. The only downside: after the free subscription runs out, it costs £100/yr to subscribe.
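To give a sense of what a log-likelihood comparison of two frequency lists involves, here is a rough Python sketch. It uses the two-term log-likelihood formula commonly used in corpus linguistics; I am assuming, not asserting, that WMatrix computes something along these lines. The function names and the toy counts are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood for one word.

    a, b: the word's frequency in the study corpus and the reference corpus
    c, d: total token counts of the study corpus and the reference corpus
    """
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    total = 0.0
    if a > 0:
        total += a * math.log(a / expected_study)
    if b > 0:
        total += b * math.log(b / expected_ref)
    return 2 * total


def keywords(study_counts, ref_counts):
    """Rank the words of the study corpus by log-likelihood against the reference corpus."""
    c = sum(study_counts.values())
    d = sum(ref_counts.values())
    scores = {
        word: log_likelihood(freq, ref_counts.get(word, 0), c, d)
        for word, freq in study_counts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy frequency lists, invented for illustration only.
study = Counter({"whale": 8, "the": 12})
reference = Counter({"whale": 1, "the": 19})

if __name__ == "__main__":
    for word, score in keywords(study, reference):
        print(f"{word}: {score:.2f}")
```

Note that the score only measures how surprising the difference in frequency is; to tell overuse from underuse you would still compare the relative frequencies in the two corpora.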


Related Links
1) GATE: General Architecture for Text Engineering, University of Sheffield
2) Nasukawa, T. and T. Nagano, "Text analysis and knowledge mining system"
3) Overview of natural language processing at Wikipedia

Posted by hag at 8:53 AM | Comments (0)

May 4, 2007

O'Reilly Rough Cuts

In January 2006 O'Reilly debuted their "Rough Cuts" program. Here's how they describe it:

"When you buy a book on the Rough Cuts service, you get access to an evolving manuscript. You can read it online, download as a PDF, or print. Once you've purchased a Rough Cuts title, you have a chance to shape the final product - you can send suggestions, bug fixes, and comments directly to the author and editors."

"You have your choice in the Rough Cuts program of purchasing just online access, just the print book when it releases, or the best of both worlds - online access immediately and the print book later."

This is a great model for infotech books, which often suffer from the gap between the slow pace of print publishing and the fast pace of technological change.

Of particular interest is the option to comment on the book as it is being written, a sort of collaboration between author and reader that might seem distasteful to literary traditionalists but is eminently useful in the case of technology books. How would a similar model work in the humanities, I wonder?

Posted by hag at 9:36 AM | Comments (0)