AntConc, The Place To Start

by Dr. Ryan Nichols, Philosophy, Cal State Fullerton, Orange County CA

Its simplicity, functionality, and support suggest that humanists ought to begin their explorations in text analytics with AntConc.

To have a look at the tool and to download AntConc, head here. You’ll see that AntConc is one of a suite of tools created and written (in Perl) by Laurence Anthony, Professor in the Faculty of Science and Engineering at Waseda University, Japan. While we may write some posts about Dr. Anthony’s other tools, in this one we will tour AntConc alone. We are working with v3.4.3 for Mac, released in September 2014.

Many features of AntConc recommend it, especially for first-timers. First, it is multi-platform: AntConc runs on Linux, Mac, and PC. For other tools, even costly ones, I need to restart my Mac and boot up Windows from Boot Camp, which wastes time and slows the workflow. Being multi-platform also allows easy file-sharing and collaboration with others. Second, AntConc is supported not only by a developer dedicated to seeing these tools in the hands of people who will use them, but by an educator with a passion for teaching how to use his tools effectively. I know of no other corpus linguistics program that has as dedicated a developer and as passionate a community of users as AntConc. The YouTube support for AntConc is second to none, with patient walkthroughs by Dr. Anthony himself of every major feature of AntConc. (Laurence Anthony’s YouTube channel is found here.) Third, AntConc builds in a lot of functionality for a reliable, freeware program. In this post we will introduce its central instrument, its concordance tool, but AntConc offers seven functions, as pictured.

antconc1

Let’s introduce the concordance tool by posing a few research questions that humanists might ask of a corpus. First, suppose your research questions revolve around issues of religion and faith in the early novels of James Joyce. You might ask: What keywords occur most frequently in the context of the term ‘God’ in James Joyce’s earlier novels?

AntConc delivers the data for this answer in a flash with its concordance tool. Simply download the UTF-8 text files of the novels from Project Gutenberg, then load them into AntConc (File/Open File). For demo purposes, I downloaded three works for testing: A Portrait of the Artist as a Young Man, Dubliners, and Ulysses. Click on the ‘Concordance’ tab in the upper left and type ‘God’ in the search bar at the bottom. You can view the results, sorted by book, in the results window that pops up. As you see in the image below, Dubliners appears to contain 36 occurrences of ‘God’. On the right side of the display window, you’ll note that the 37th occurrence marks the transition from Dubliners to Portrait.

antconc2
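For readers curious about what a concordance search does under the hood, here is a minimal Python sketch of a keyword-in-context (KWIC) search. It is not AntConc's code: the function name `kwic`, the `width` parameter, and the sample sentence are my own illustrative assumptions, and unlike a GUI search this version is case-sensitive and measures context in characters.

```python
import re

def kwic(text, term, width=30):
    """Return (left, match, right) tuples for each whole-word hit of term.

    Hypothetical helper for illustration; context is measured in characters.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b")
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    hits = []
    for m in pattern.finditer(flat):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        hits.append((left, m.group(), right))
    return hits

# Toy sample text (made up for this sketch, not a passage from Joyce).
sample = "Thank God, said the man. He is a good holy pious and God-fearing Roman Catholic."

for left, hit, right in kwic(sample, "God", width=20):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>20} {hit} {right}")
```

To try it on a real file, you could read a Project Gutenberg text with `open(path, encoding="utf-8").read()` and pass the result to `kwic`.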

Once you have made the concordance, AntConc offers several ways to manipulate it. First, you might want to save the data from this search for further study. Go to File/Save Output to Text. The saved file can then be imported easily into Excel or Word. Second, you can crop the findings at an arbitrary number of characters from ‘God’. Perhaps your first concordance swept up too many stopwords or too much filler; in that case, widen the search parameters to gather up more words. Third, you can tally the rank of these words in the corpus as a whole using the n-gram function. This feature may be of special importance to humanists interested in learning more about the way ‘God’ relates to other words in these novels. In that case, your first stop will be the n-gram or clustering tool.
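If you would rather build a spreadsheet-ready file yourself, the sketch below writes concordance rows as tab-separated text, which Excel can import. The column names, the `write_kwic_tsv` helper, and the sample row are assumptions of mine; this does not reproduce the layout of AntConc's own saved output.

```python
import csv
import io

def write_kwic_tsv(rows, handle):
    """Write (left, keyword, right, filename) rows as tab-separated text.

    Hypothetical helper; the column layout is chosen for illustration only.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["hit", "left", "keyword", "right", "file"])
    for number, (left, keyword, right, filename) in enumerate(rows, start=1):
        writer.writerow([number, left, keyword, right, filename])

# One made-up row; in practice these would come from a concordance search.
rows = [("Thank", "God", "for that", "sample.txt")]

buffer = io.StringIO()  # stands in for a file opened with open(path, "w")
write_kwic_tsv(rows, buffer)
print(buffer.getvalue())
```

Swapping the `StringIO` buffer for `open("hits.tsv", "w", encoding="utf-8", newline="")` would write the same rows to disk.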

To use this, click on the ‘Clusters/N-Grams’ tab at the top. Once there, you are faced with some additional choices in the lower middle of AntConc’s GUI. The n-gram window is adjustable, which means you can tell AntConc to display only the range of words around the target word that interests you most. In this case, I set AntConc to collect clusters with ‘God’ on the right, with a minimum cluster size of 2 and a maximum of 5. The results are pictured below.

antconc3
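The same idea can be sketched in a few lines of Python: collect every run of 2 to 5 words that ends with the target word, then count the runs. This is a rough approximation under my own assumptions (the function name `clusters_ending_with`, a simple letters-and-apostrophes tokenizer, and clusters that may cross sentence boundaries), so its counts need not match AntConc's.

```python
import re
from collections import Counter

def clusters_ending_with(text, term, min_size=2, max_size=5):
    """Count word clusters of min_size..max_size words that end with term.

    Hypothetical helper; tokenization here is deliberately naive.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != term:
            continue
        for n in range(min_size, max_size + 1):
            start = i - n + 1
            if start < 0:  # not enough words to the left for this size
                break
            counts[" ".join(tokens[start:i + 1])] += 1
    return counts

# Toy sample text (made up for this sketch, not a passage from Joyce).
sample = "Thank God for that. Thank God he came. By God he did."

for cluster, count in clusters_ending_with(sample, "God").most_common(3):
    print(count, cluster)
```

Sorting by count, as `most_common` does, gives a ranked view of the clusters comparable in spirit to the list in the screenshot.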

These results might help answer certain research questions and raise others. I’ll leave to the Joyce expert the job of integrating these clusters into an interpretation of Joyce on God and faith. But no doubt for humanists interested in working with corpus linguistics tools, getting the counts of n-grams and clusters with a target word is insufficient to answer any serious questions. What they will want to do is combine such data with close readings. Here too AntConc has something to offer.

If you are following along, using AntConc with me, then return to the ‘Concordance’ tab to view our original results. Scroll down to hit #30, which looks interesting. In AntConc’s window we read the KWIC for this hit as “here a good holy pious and God-fearing Roman Catholic.” To see this hit in the context of its occurrence in Dubliners, click on the target term ‘God’, which should appear in blue. When you do, AntConc will swing you over to its file viewer, where you can read the entire sentence in the context in which it occurs. It turns out that this is a snippet of dialogue spoken by Mr. Power in the story “Grace,” as pictured.

antconc4
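Moving from a hit to its full sentence can also be sketched in Python. The version below splits text into sentences naively, at a period, question mark, or exclamation point followed by whitespace, so abbreviations such as "Mr." will break it. The function name `sentences_containing` and the toy text (wrapped around the KWIC snippet quoted above, not the actual passage from "Grace") are my own assumptions.

```python
import re

def sentences_containing(text, term):
    """Return the sentences in text that contain term as a whole word.

    Hypothetical helper; the sentence splitter is deliberately naive.
    """
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    sentences = re.split(r"(?<=[.!?])\s+", flat)
    pattern = re.compile(r"\b" + re.escape(term) + r"\b")
    return [s for s in sentences if pattern.search(s)]

# Toy text built around the quoted KWIC snippet; not Joyce's actual wording.
sample = (
    "He said nothing at first. We have here a good holy pious and "
    "God-fearing Roman Catholic. That was all."
)

for sentence in sentences_containing(sample, "God"):
    print(sentence)
```

For close reading you would likely want a few sentences on either side of the hit as well, which is what AntConc's file viewer gives you by showing the whole text.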

This might open up further avenues of research for the Joyce scholar. Does “Grace” reinforce Catholic social mores through Mr. Power’s work with Tom Kernan? Does the representation of God in Dubliners have a different moral valence than in the other two novels sampled here? We will continue this exploration in the next how-to post about AntConc.