Title: Yr Amliadur: Frequency Lists for Contemporary Welsh (Version 1.0.0)
Citation
Knight D, Morris S, Tovey-Walsh B, et al. (2020). Yr Amliadur: Frequency Lists for Contemporary Welsh (Version 1.0.0). Cardiff University. https://doi.org/10.17035/d.2020.0120164107
Access Rights: Creative Commons Attribution Share Alike 4.0 International
Access Method: To request this data, email opendata@caerdydd.ac.uk
Dataset Details
Publisher: Cardiff University
Date (year) the data became publicly available: 2020
Data format: .xls, .pdf
Estimated total storage size of the dataset: Less than 100 megabytes
Number of files in the dataset: 4
DOI: 10.17035/d.2020.0120164107
DOI URL: http://doi.org/10.17035/d.2020.0120164107
Related URL: https://www.corcencc.org
Yr Amliadur contains sample frequency lists of contemporary Welsh language usage. The lists are based on CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes, the National Corpus of Contemporary Welsh; Knight et al., 2020), which includes 14,338,149 tokens (circa 11.2 million words). The data in CorCenCC represents a wide range of contexts, genres and topics. It has been anonymised as far as possible, using a combination of manual and automated techniques, and is fully tagged for part-of-speech (POS) and semantic categories. The research on which this frequency list dataset is based was funded by the UK Economic and Social Research Council (ESRC) and the Arts and Humanities Research Council (AHRC) as part of the project "Corpws Cenedlaethol Cymraeg Cyfoes (The National Corpus of Contemporary Welsh): A community driven approach to linguistic corpus construction" (Grant Number ES/M011348/1). All outputs from the CorCenCC project are licensed under Creative Commons CC-BY-SA v4 and are therefore freely available for use by professional communities and individuals with an interest in language. Bespoke applications and instructions are provided for each tool. When reporting information derived from the CorCenCC corpus data and/or tools, CorCenCC should be appropriately acknowledged.
Description
Related Projects