README

The Swedish Treebank, version 1.1 (2010-03-29)

Documentation
-------------

To view the documentation, open doc/index.html in your web browser.

Corpus Data
-----------

The corpus data is divided into two subcorpora - one for Talbanken data and
one for SUC data. Each subcorpus is available in three forms:
1) document-by-document, 2) train-test split, and 3) all-in-one.
See "Document Structure and Encoding Formats" in the documentation for
details.

Folder Structure
----------------

doc/
  index.html
  ...
corpus/
  talbanken/
    talbanken.xml
    talbanken.conll
    talbanken-train.xml
    talbanken-train.conll
    talbanken-test.xml
    talbanken-test.conll
    parts/
      P101.xml
      P101.conll
      ...
      P418.xml
      P418.conll
  suc/
    suc.xml
    suc.conll
    suc-train.xml
    suc-train.conll
    suc-test.xml
    suc-test.conll
    parts/
      aa01.xml
      aa01.conll
      ...
      kr06.xml
      kr06.conll
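
The folder layout above maps each subcorpus, form, and file format to a
predictable path. The following Python sketch illustrates that mapping. It is
not part of the distribution: the function name corpus_files and the form
labels "all", "train", "test", and "parts" are hypothetical names chosen for
this example, and the sketch assumes the treebank was unpacked under a
directory passed in as root.

```python
from pathlib import Path

# Subcorpus and format names taken from the folder structure in this README.
SUBCORPORA = ("talbanken", "suc")
FORMATS = ("xml", "conll")


def corpus_files(root, subcorpus, form, fmt="conll"):
    """Return the paths for one subcorpus in one of its three forms.

    form is a hypothetical label, not an official name:
      "all"            all-in-one file
      "train", "test"  one half of the train-test split
      "parts"          document-by-document files under parts/
    """
    if subcorpus not in SUBCORPORA:
        raise ValueError(f"unknown subcorpus: {subcorpus!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")

    base = Path(root) / "corpus" / subcorpus
    if form == "all":
        return [base / f"{subcorpus}.{fmt}"]
    if form in ("train", "test"):
        return [base / f"{subcorpus}-{form}.{fmt}"]
    if form == "parts":
        # Lists whatever part files are present; returns [] if none exist.
        return sorted((base / "parts").glob(f"*.{fmt}"))
    raise ValueError(f"unknown form: {form!r}")
```

For example, corpus_files(".", "talbanken", "train") would point at the
Talbanken training file in CoNLL format, and corpus_files(".", "suc", "parts",
"xml") would list the per-document SUC files in XML format. Only the "parts"
form touches the file system; the other forms just build the path without
checking that the file exists.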