Corpus linguistics is real(ly)? awesome

Now and then, you hear something and wonder why it was said the way it was said. For me, one such case is the word “real” used as a modifier of adjectives without the prescriptively required adverbial suffix “-ly”:

I just heard some real bad news (Kanye West)

That shirt is real fly! (Fresh Prince of Bel-Air)

Prescriptively, one would expect “really bad” and “really fly”. Things like this catch my attention, so I decided to do a small corpus-linguistic investigation to find out what is going on.


How to build a dataset for a corpus-linguistic study

Datasets are among the most important objects in a scientific study. It is best to stick to a widely used format for your dataset so that other people can understand what you have done. To find a good format for corpus-linguistic datasets, we first need to look at the nature of corpus-linguistic data.
