Query a text corpus with Python

Some corpora come without a search interface. How do you search them? Perhaps you load them into a concordance program like AntConc, only to notice that the corpus uses an idiosyncratic format that breaks up the concordance lines. When that happens, AntConc quickly becomes pretty much unusable. So, what can you do? The simplest solution is to write a small Python script!
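
To give an idea of what such a script might look like, here is a minimal sketch of a keyword-in-context search. It is not the script from the full post: the function name `concordance`, the `width` parameter, and the sample text are all made up for illustration, and it assumes the corpus can be read as one plain-text string.

```python
import re


def concordance(text, pattern, width=30):
    """Return one keyword-in-context line per regex match in text."""
    # Collapse line breaks and runs of whitespace, so that odd line
    # wrapping in the corpus file does not cut the context apart.
    flat = re.sub(r"\s+", " ", text)
    hits = []
    for m in re.finditer(pattern, flat, flags=re.IGNORECASE):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        # Right-align the left context so the keywords line up.
        hits.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return hits


# Hypothetical sample text; a real corpus would be read from a file.
sample = "The cat sat on the mat.\nA second cat\nslept nearby."
for line in concordance(sample, r"\bcat\b", width=15):
    print(line)
```

For a real corpus you would replace `sample` with the contents of your file, for example via `open(path, encoding="utf-8").read()`, and adjust the pattern to whatever you are looking for.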

Save your datasets as csv files

Although it is a good idea to build your datasets in spreadsheet software, it is an even better idea to save them in CSV format (once you have finished annotating, of course).
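
One practical benefit is that a CSV file is easy to read and write from a script. The sketch below uses Python's standard `csv` module; the file name `dataset.csv` and the column names are invented for the example, not taken from a real dataset.

```python
import csv

# Hypothetical annotated rows; in practice these come from your spreadsheet.
rows = [
    {"id": 1, "token": "cat", "annotation": "noun"},
    {"id": 2, "token": "sat, briefly", "annotation": "verb"},
]

# newline="" lets the csv module control line endings itself.
with open("dataset.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "token", "annotation"])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back; every value is returned as a string.
with open("dataset.csv", newline="", encoding="utf-8") as f:
    loaded = list(csv.DictReader(f))

print(loaded[1]["token"])
```

Note that the comma inside the second token survives the round trip, because the `csv` module quotes such fields automatically.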
