Verbario is our first attempt to extract lexical patterns using corpus statistics. A pattern is a structure that combines syntactic and semantic features and is linked to a conventional meaning of a word. This means, for example, that the verb to die does not have intrinsic meanings but potential meanings that are activated by the context: in ‘His mother died when he was five’, the meaning of the verb differs from that in ‘His mother is dying to meet you’, due to collocational restrictions and syntactic differences. By automatically analysing thousands of concordances per verb, we take a first step towards detecting these structures in corpora, a very time-consuming task for lexicographers. The average precision is around 50%. To increase precision, the next step is to add a dependency parser to the system and to adjust the automatic taxonomy we have created for semantic labeling.
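As a rough illustration of the idea, and not a description of Verbario's actual algorithm, the sketch below groups a handful of invented concordances of to die by the semantic label of the subject and the type of complement, then counts how often each resulting pattern occurs. The taxonomy, the concordance tuples, the label names and the function names are all assumptions made for this example. The precision helper uses the usual definition (the share of extracted patterns that a lexicographer would accept), which may differ in detail from how the figure above was computed.

```python
from collections import Counter

# Hypothetical toy taxonomy mapping nouns to semantic labels (an assumption,
# far smaller than any taxonomy a real system would use).
TAXONOMY = {
    "mother": "Human",
    "father": "Human",
    "soldier": "Human",
    "plant": "Plant",
    "tree": "Plant",
}

# Hypothetical, already-analysed concordances of "to die":
# (head noun of the subject, type of complement or None).
CONCORDANCES = [
    ("mother", None),
    ("father", None),
    ("plant", None),
    ("mother", "to-infinitive"),
    ("soldier", None),
    ("tree", None),
    ("father", "to-infinitive"),
]


def pattern_of(subject, complement):
    """Build a pattern string from the subject's semantic label and the complement."""
    label = TAXONOMY.get(subject, "Unknown")
    if complement is None:
        return f"[{label}] die"
    return f"[{label}] die {complement}"


def count_patterns(concordances):
    """Count how many concordances instantiate each pattern."""
    return Counter(pattern_of(subj, comp) for subj, comp in concordances)


def precision(extracted, validated):
    """Share of extracted patterns that also appear in a manually validated set."""
    extracted, validated = set(extracted), set(validated)
    if not extracted:
        return 0.0
    return len(extracted & validated) / len(extracted)


if __name__ == "__main__":
    counts = count_patterns(CONCORDANCES)
    for pattern, freq in counts.most_common():
        print(freq, pattern)
```

In this toy data the intransitive use with a human subject and the use with a to-infinitive complement surface as separate patterns, which mirrors the contrast between ‘died’ and ‘is dying to meet you’ in the example sentences.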
Web demo: http://www.verbario.com/
Funding: This research is supported by a grant from the Chilean Government: Conicyt-Fondecyt 11140704, “Detección automática del significado de los verbos del castellano por medio de patrones sintáctico-semánticos extraídos con estadística de corpus” (Automatic Extraction of patterns of use of Spanish verbs using corpus statistics). Lead researcher: Irene Renau.
+ Nazar, R.; Renau, I. (forthcoming). A Quantitative Analysis of the Semantics of Verb-Argument Structures. In S. Torner and E. Bernal (eds.) "Collocations and other lexical combinations in Spanish. Theoretical and Applied approaches", Routledge.
+ Nazar, R.; Renau, I. (forthcoming). Automatic extraction of lexico-semantic patterns from corpora. Proceedings of EURALEX 2016. Tbilisi, Georgia.
Related concepts: computational lexicography; lexical patterns; Spanish verbs; taxonomy
Contact: irene.renau at gmail.com