FrameNet Lexical Database: Presenting a Few Frames Within the Risk Domain | Leksička baza Frejmnet: nekoliko primera okvira iz domena rizika |
INFOtheca, Scientific paper | INFOteka, Naučni rad |
ID: 1.2021.1.1 Number: 1 Volume: 21 Year: 2021 UDC: 81’322.2 |
Aleksandra Marković Institution: Institute for Serbian Language, SASA Mail: aleksandra.markovic@isj.sanu.ac.rs | Aleksandra Marković Institucija: Institut za srpski jezik SANU E-pošta: aleksandra.markovic@isj.sanu.ac.rs |
Ranka Stanković Institution: University of Belgrade, Faculty of Mining and Geology Mail: ranka.stankovic@rgf.bg.ac.rs | Ranka Stanković Institucija: Univerzitet u Beogradu, Rudarsko-geološki fakultet E-pošta: ranka.stankovic@rgf.bg.ac.rs |
Natalija Tomić Institution: University of Belgrade, Faculty of Mining and Geology Mail: ntomic@hotmail.com | Natalija Tomić Institucija: Univerzitet u Beogradu, Rudarsko-geološki fakultet E-pošta: ntomic@hotmail.com |
Olivera Kitanović Institution: University of Belgrade, Faculty of Mining and Geology Mail: olivera.kitanovic@rgf.bg.ac.rs | Olivera Kitanović Institucija: Univerzitet u Beogradu, Rudarsko-geološki fakultet E-pošta: olivera.kitanovic@rgf.bg.ac.rs |
Abstract This paper gives a short overview of frame semantics, the theory underlying the Berkeley FrameNet project. We present the basic concepts of this lexical database and the possibility of implementing it for Serbian. We also examine the lexical analysis used in developing FrameNet and point out how frame-based lexical analysis differs from its word-based counterpart. We then illustrate several related frames evoked by words from the risk domain. FrameNet data is also available through the Python API included in NLTK (Natural Language Toolkit), a platform that gives access to a range of language resources. The final section presents a corpus search for the noun risk in a mining-domain corpus: its most common collocates, its word sketch, concordances for individual patterns, a thesaurus entry of its synonyms and related words, collocation frequency graphs, and a word cloud. | Apstrakt U radu se daje kratak prikaz teorije semantike okvira (engl. Frame Semantics), na kojoj je zasnovana leksička baza Frejmnet (engl. FrameNet). Predstavljena je koncepcija ove mreže, kao i mogućnosti njene primene. Predstavljena je i leksička analiza koja se primenjuje u projektu izrade Frejmneta i ukazano na razlike između analize zasnovane na okviru u odnosu na analizu zasnovanu na reči. Zatim je prikazano nekoliko povezanih okvira koje prizivaju reči iz domena rizika. U radu je predstavljena i platforma NLTK (engl. Natural Language Toolkit), pomoću koje se mogu koristiti razni jezički resursi, među njima i Frejmnet. Završno poglavlje pruža analizu imenice rizik na korpusu rudarstva. Predstavljeni su najčešći kolokati ove imenice, skica njene upotrebe, konkordance za pojedine modele, pronalaženje sinonima i povezanih reči u vidu tezaurusa, grafički prikaz frekvencija pojedinih kolokacija, kao i oblaka reči. |
Keywords: Serbian language, frame semantics, FrameNet, risk scenario, mining corpus, natural language processing | Ključne reči: Srpski jezik, semantika okvira, Frejmnet, scenario rizika, korpus rudarstva, obrada prirodnih jezika |
Pages: 7-33 | Strane: 7-36 |