Text mining (TM) has been one of the most frequently discussed methodologies in the humanities over the last year, and many tools can now help with both basic and not-so-basic TM methods. Although it may seem like overkill, learning to use the statistical software package R for TM is a great way to learn more about some fundamental processes and to get more control over your own TM explorations.
This introductory workshop will demonstrate how to:
install R
use the R console (like the command line)
create a set of text files to explore
explore the basic TM features
create a visualization of document similarity
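For a rough sense of what those steps can look like at the R console, here is a minimal sketch that uses only base R (no add-on packages) and assumes R is already installed. The three toy sentences, the file names, and the folder name tm_demo are made up for illustration, and cosine similarity with hierarchical clustering is just one of several reasonable ways to compare documents; the workshop itself may use different data and tools.

```r
# Hypothetical toy corpus: three tiny "documents" written to a temporary folder.
corpus_dir <- file.path(tempdir(), "tm_demo")
dir.create(corpus_dir, showWarnings = FALSE)
texts <- c(
  whale.txt  = "The whale swam in the sea. The sea was deep.",
  ship.txt   = "The ship sailed on the sea toward the whale.",
  garden.txt = "Roses and tulips grow in the garden."
)
for (name in names(texts)) {
  writeLines(texts[[name]], file.path(corpus_dir, name))
}

# Read each file back in as one string per document.
files <- list.files(corpus_dir, pattern = "\\.txt$", full.names = TRUE)
docs <- vapply(files, function(f) paste(readLines(f), collapse = " "), character(1))
names(docs) <- basename(files)

# Tokenize: lowercase, split on anything that is not a letter, drop empty strings.
tokenize <- function(text) {
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  words[nzchar(words)]
}
tokens <- lapply(docs, tokenize)

# Build a document-term matrix of raw word counts (rows = documents, columns = words).
vocab <- sort(unique(unlist(tokens)))
dtm <- t(vapply(
  tokens,
  function(w) as.numeric(table(factor(w, levels = vocab))),
  numeric(length(vocab))
))
colnames(dtm) <- vocab

# Basic exploration: the most frequent words across the whole toy corpus.
print(head(sort(colSums(dtm), decreasing = TRUE), 5))

# Document similarity: cosine similarity between the rows of the matrix.
norms <- sqrt(rowSums(dtm^2))
cos_sim <- (dtm %*% t(dtm)) / (norms %o% norms)
print(round(cos_sim, 2))

# Visualization: a dendrogram built from cosine distance (1 - similarity).
plot(hclust(as.dist(1 - cos_sim)), main = "Document similarity (toy example)")
```

To try this on your own texts, point corpus_dir at a folder of plain-text files and skip the block that writes the toy files. Documents that share more vocabulary should end up joined lower in the dendrogram.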
Fred: I’ll be there. I’ve been using R for charting data and for making maps. I’d like to learn how you do text mining with R, and perhaps we can talk about R more generally.
Hi Fred, this looks great. I’ve been meaning to jump into R for a while; maybe this will be the spark I need. Should we supply our own text files, or will you have a set to work with?
@Lincoln: great! do be warned that this is incredibly introductory, but i’m certainly looking forward to hearing more about how you use R for mapping. Hopefully the session can branch out from my simple walk-thru.
@Zach: great question. i’ll supply a few trivial sets of texts for us to play with, but it would be fun to have your own as well, even if just a handful, to get practice working on a new set.
I wish I could be at THATCamp Prime this year, but since I can’t, I’ll at least contribute a suggestion: use RStudio (www.rstudio.com/) as the integrated development environment. It’s the best I’ve used in my (limited) R experience.