The COC of the CUB: a corpus for the study of colloquial conversation
As with other languages, research on Catalan has in recent years shown great interest in patterns of authentic spoken language, especially spontaneous and informal speech. This article introduces the Corpus Oral de Català Col·loquial (COC), a corpus of colloquial spoken Catalan created to provide students with adequate material for studying colloquial conversations in Catalan. The corpus is part of a more extensive project, the Corpus de Català Contemporani de la Universitat de Barcelona (CUB), a large corpus of contemporary Catalan developed at the University of Barcelona, which includes other sub-corpora that capture the geographical, social and functional variety of contemporary Catalan. The COC consists of 50 colloquial conversations (with four native speakers of Eastern Catalan, the majority of whom are from the urban region around Barcelona), which were recorded, transcribed and orthographically unified. Ten of these conversations, which form the so-called basic block, have already been corrected and revised, and can be made available to interested researchers.
This work is licensed under the Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International license.