- Timestamp: Jan 17, 2017, 12:57:49 PM (9 years ago)
- Author: xsuchom2
- Comment: Web corpora names
| v30 | v31 | |
| 10 | 10 | Amharic WIC corpus (News from Walta Information Center), manually tagged. |
| 11 | 11 | |
| 12 | | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=amwac16&reload=1 Amharic WaC corpus], 20 million tokens |
| | 12 | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=amwac16&reload=1 amWaC16 corpus], 20 million tokens |
| 13 | 13 | |
| 14 | 14 | Amharic Web corpus. Crawled by !SpiderLing in August 2013, October 2015 and January 2016. Encoded in UTF-8, cleaned, deduplicated. Automatically tagged by !TreeTagger trained on Amharic WiC |
| … | … | |
| 17 | 17 | |
| 18 | 18 | Oromo spoken corpus containing 1205 utterances. Built by Text Laboratory, University of Oslo. |
| 19 | | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=orwac16 Oromo WaC corpus], 5.1 million tokens. |
| | 19 | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=orwac16 orWaC16 corpus], 5.1 million tokens. |
| 20 | 20 | |
| 21 | 21 | Oromo Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. |
| 22 | 22 | |
| 23 | | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=sowac16 Somali WaC corpus], 80 million tokens. |
| | 23 | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=sowac16 soWaC16 corpus], 80 million tokens. |
| 24 | 24 | |
| 25 | 25 | Somali Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. |
| 26 | 26 | |
| 27 | | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=tiwac16 Tigrinya WaC corpus], 2.5 million tokens. |
| | 27 | * [http://corpora.fi.muni.cz/habit/run.cgi/first_form?corpname=tiwac16 tiWaC16 corpus], 2.5 million tokens. |
| 28 | 28 | |
| 29 | 29 | Tigrinya Web corpus crawled by !SpiderLing in January 2016. Cleaned, de-duplicated. |