Creation and Analysis of the Yugoslav Rock Song Lyrics Corpus from 1967 to 2003
| Kreiranje i analiza korpusa tekstova jugoslovenskih rok pesama od 1967-2003. |
INFOtheca, Scientific paper | INFOteka, Naučni rad |
ID: 1.2019.1.1 Number: 1 Volume: 19 Year: 2019 UDC: 811.163.41’322 |
Ljudmila Petković Institution: University of Belgrade Mail: ljudmila.petkovic@gmail.com | Ljudmila Petković Institucija: Univerzitet u Beogradu E-pošta: ljudmila.petkovic@gmail.com |
Abstract The paper analyses the creation and processing of the Yugoslav rock song lyrics corpus from 1967 to 2003, from both theoretical and practical perspectives. The data were obtained and XML-annotated in the Python programming language with the lyricsmaster and yattag libraries. The corpus was preprocessed, and basic statistical data were generated by an XSL transformation. Diacritic restoration was carried out in the Slovo Majstor and LeXimir tools (the latter was also used for the frequency analysis). Socio-cultural topics were extracted with the Unitex software, and the prevailing topics were visualised with the TreeCloud software. | Apstrakt U radu se sa teorijskog i praktičnog aspekta analizira proces obrazovanja i obrade korpusa tekstova jugoslovenskih rok pesama od 1967-2003. Za preuzimanje građe i XML anotiranje korišćene su biblioteke lyricsmaster i yattag u jeziku Python. Korpus je prošao fazu preprocesiranja, a XSL transformacijom generisani su osnovni statistički podaci. U aplikacijama Slovo Majstor i LeXimir sprovedena je restauracija dijakritika (a u drugoj aplikaciji i frekvencijska analiza). Pronalaženje društveno-političkih tema vršeno je u softveru Unitex, dok su preovlađujuće teme vizualizovane u TreeCloud aplikaciji. |
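The annotation and frequency steps named in the abstract can be illustrated with a minimal sketch. This is not the author's code: it uses Python's standard `xml.etree.ElementTree` in place of yattag so that it runs without extra packages, and the element names (`song`, `title`, `lyrics`, `l`), the helper functions, and the sample text are all hypothetical placeholders rather than the corpus schema or data.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter


def annotate_song(artist, title, year, lyrics):
    """Wrap one song in a small XML structure (element names are assumptions)."""
    song = ET.Element("song", attrib={"artist": artist, "year": str(year)})
    ET.SubElement(song, "title").text = title
    body = ET.SubElement(song, "lyrics")
    # One <l> element per non-empty line of the lyrics.
    for line in lyrics.splitlines():
        if line.strip():
            ET.SubElement(body, "l").text = line.strip()
    return song


def token_frequencies(song):
    """Count lowercased word tokens across all <l> elements of a song."""
    counter = Counter()
    for line in song.iter("l"):
        # \w+ is Unicode-aware in Python 3, so letters with diacritics are kept.
        counter.update(re.findall(r"\w+", line.text.lower()))
    return counter


# Placeholder input, not taken from the corpus.
sample = annotate_song("Example Band", "Example Song", 1980, "la la la\n\nhey la")
xml_string = ET.tostring(sample, encoding="unicode")
freqs = token_frequencies(sample)
```

Here `xml_string` holds the serialised annotation and `freqs` maps each token to its count. In the workflow the abstract describes, the statistics come from an XSL transformation and from LeXimir, not from Python code like this.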
Keywords: corpus linguistics, Yugoslav rock and roll, web scraping, natural language processing, text mining. | Ključne reči: korpusna lingvistika, jugoslovenski rokenrol, grebanje veba, obrada prirodnih jezika, kopanje po tekstu. |
Pages: 5-29 | Strane: 5-29 |