This resource contains a multilingual, digitized collection of thousands of documents describing the natural languages of the world. The corpus is annotated with metadata-, word-, and text-level attributes, and is password-protected for copyright reasons. More details about the data and annotations can be found in the reference given below.
An openly available part of the corpus can also be found here.