Big corpus

UK publishers produce over 180,000 books each year. (About one third are in digital formats.) That’s a lot of words, even before counting the output of other countries, all the other words generated online (self-published or unpublished), and journal, magazine and newspaper articles. These large text corpuses are more than…