- Timestamp: Jan 17, 2017, 9:46:59 PM
- Author: hales
- Comment: --
| v37 | v38 | |
| 12 | 12 | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=amwac16&reload=1 amWaC16 corpus], 20 million tokens |
| 13 | 13 | |
| 14 | | Amharic Web corpus. Crawled by !SpiderLing in August 2013, October 2015 and January 2016. Cleaned, de-duplicated. Tagged by !TreeTagger trained on Amharic WiC. [AmharicCorpus Corpus deliverable/technical report] |
| | 14 | Amharic Web corpus. Crawled by !SpiderLing in August 2013, October 2015 and January 2016. Cleaned, de-duplicated. Tagged by !TreeTagger trained on Amharic WiC. [[BR]] [AmharicCorpus Corpus deliverable/technical report] |
| 15 | 15 | |
| 16 | 16 | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=or_spoken Oromo spoken corpus], 7,500 tokens. |
| … | … | |
| 20 | 20 | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=orwac16 orWaC16 corpus], 5.1 million tokens. |
| 21 | 21 | |
| 22 | | Oromo Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. [OromoCorpus Corpus deliverable/technical report] |
| | 22 | Oromo Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. [[BR]] [OromoCorpus Corpus deliverable/technical report] |
| 23 | 23 | |
| 24 | 24 | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=sowac16 soWaC16 corpus], 80 million tokens. |
| 25 | 25 | |
| 26 | | Somali Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. [SomaliCorpus Corpus deliverable/technical report] |
| | 26 | Somali Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. [[BR]] [SomaliCorpus Corpus deliverable/technical report] |
| 27 | 27 | |
| 28 | 28 | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=tiwac16 tiWaC16 corpus], 2.5 million tokens. |
| 29 | 29 | |
| 30 | | Tigrinya Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. [TigrinyaCorpus Corpus deliverable/technical report] |
| | 30 | Tigrinya Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. [[BR]] [TigrinyaCorpus Corpus deliverable/technical report] |
| 31 | 31 | |
| 32 | 32 | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=czech_norwegian_opus__norwegian Czech-Norwegian parallel corpus], 4 million aligned segments. |
| 33 | 33 | |
| 34 | | Czech-Norwegian parallel corpus from subtitles, OpenSubtitles2016 subcorpus of OPUS2, filtered for Czech and Norwegian. |
| | 34 | Czech-Norwegian parallel corpus from subtitles, OpenSubtitles2016 subcorpus of OPUS2, filtered for Czech and Norwegian. [[BR]] [ParallelCzechNorwegian Corpus deliverable/technical report] |
| 35 | 35 | |
| 36 | 36 | == Publications == |