Termout.org logo/LING


Update: February 24, 2023 The new version of Termout.org is now online, so this web site is now obsolete and will soon be dismantled.

List of candidates under examination:
1) extraction (*)
(*) Terms present in our linguistics glossary

1) Candidate: extraction


Is in gold standard

1
paper corpusSignosTxtLongLines181 - : Turney, P. (1997). Extraction of keyphrases from text: Evaluation of four algorithms. Ottawa: National Research Council Canada. Technical Report ERB-1051.

2
paper corpusSignosTxtLongLines416 - : Open Information Extraction from real Internet texts in Spanish using constraints over part-of-speech sequences: Problems of the method, their causes, and ways for improvement

3
paper corpusSignosTxtLongLines416 - : sequences. We identify the causes of the errors of each type and suggest ways for preventing such errors with corresponding analysis of their cost and scale of impact. The analysis is performed for extractions from two Spanish-language text datasets: the FactSpaCIC dataset of grammatically correct and verified sentences and the RawWeb dataset of unedited text fragments from the Internet. Extraction is performed by the ExtrHech system.

4
paper corpusSignosTxtLongLines416 - : Resolution of the coordinating conjunctions results in extraction of three tuples:

5
paper corpusSignosTxtLongLines416 - : The relation phrase in this extraction is incorrect: it should be “llevó a cabo”, i.e., include the noun cabo of the idiom llevar a cabo, lit. “to bring to accomplishing”, which is not supported by the current algorithm. In addition, since a verb relation phrase is the first element of the relation tuple looked for by ExtrHech, incorrect detection of the relation phrase in most of the cases leads to incorrect detection of the arguments. In this example, the second argument should be numerosas manifestaciones públicas.

6
paper corpusSignosTxtLongLines416 - : and the corresponding extraction:

7
paper corpusSignosTxtLongLines416 - : However, the system generated the extraction:

8
paper corpusSignosTxtLongLines416 - : Therefore, the extraction:

9
paper corpusSignosTxtLongLines416 - : The sentence tells us that the first hominids ate meat only under some specific conditions, namely, when they encountered leftovers. Therefore, the extraction:

10
paper corpusSignosTxtLongLines416 - : The correct extraction would be:

11
paper corpusSignosTxtLongLines416 - : that looks similar to a passive construction, and the corresponding erroneous extraction:

12
paper corpusSignosTxtLongLines416 - : We have analyzed in detail the errors typical to the method of Open IE based on heuristic rules over POS-tags. No detailed description or accurate classification of the errors had been reported before, although some types of errors along with some issues were mentioned by Fader et al. (2011), but not distinguished. We have distinguished between errors and their sources. We have classified all information extraction errors into four types based on the component of an extracted fragment where an error occurred: incorrect relation phrase, incorrect arguments, incorrect argument order, and incorrect arguments with correct relation phrase. This classification is complete: it covers all possible errors.

13
paper corpusSignosTxtLongLines496 - : Extraction Method: PCA/ Rotation Method: Varimax with Kaiser Normalization/ Q: question

14
paper corpusSignosTxtLongLines533 - : Computational tools with a high level of precision are on the rise for the automatic processing of written texts. Venegas, Zamora and Galdames (2016), for example, used lexico-grammatical and lexico-semantic patterns on the WEKA platform (Waikato Environment for Knowledge Analysis) to automatically classify the macro-moves of final papers in the undergraduate thesis genre. The WEKA platform is a collection of algorithms for data analysis and predictive modeling. Cotos, O’Connor, Chapelle, Coetze, Brabanter, Gilbert and Mac Donald (2016) are developing the Automated Functional Language Extraction System (AFLEX), a computational tool capable of analyzing individual sections of research articles written by students, generating discipline-specific feedback based on the rhetorical conventions of this genre, and providing different guided prompts for writing. Although many of these

Evaluating the candidate extraction:


1) incorrect: 7
2) errors: 7 (*)
3) relation: 6
5) phrase: 4 (*)
8) corresponding: 3 (*)
9) correct: 3
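
The list above pairs each co-occurring word with a count. As a rough illustration only, the sketch below counts the words that share a context with the candidate, using three short contexts copied from the concordance. The function name `cooccurrences`, the tokenization, and the stop list are assumptions made for this example. The tool's actual counting procedure is not documented on this page, so the sketch is not expected to reproduce the figures listed.

```python
from collections import Counter
import re

# Three short contexts copied from the concordance above.
contexts = [
    "and the corresponding extraction:",
    "The correct extraction would be:",
    "that looks similar to a passive construction, "
    "and the corresponding erroneous extraction:",
]

# Hypothetical stop list; the tool's real filtering is not documented here.
STOPWORDS = {"a", "and", "be", "that", "the", "to", "would"}


def cooccurrences(candidate, texts):
    """Count the words that appear in the same context as `candidate`."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep alphabetic tokens only (an assumption).
        tokens = re.findall(r"[a-z]+", text.lower())
        if candidate not in tokens:
            continue
        counts.update(
            t for t in tokens if t != candidate and t not in STOPWORDS
        )
    return counts


counts = cooccurrences("extraction", contexts)
print(counts["corresponding"], counts["correct"])
```

On these three contexts, "corresponding" is counted twice and "correct" once. The full concordance would give larger counts.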

extraction
Language: eng
Freq: 92
Docs: 23
Proper noun: 5 / 92 = 5%
Co-occurrences with glossary: 3
Score: 3.790 = 3 + (1 + 4.95419631038688) / (1 + 6.53915881110803)
Candidate accepted
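
The arithmetic behind the reported score of 3.790 can be checked with a short sketch. The leading 3 appears to be the number of co-occurrences with the glossary, which is an assumption based on the matching figure above. The page does not say what the two long decimal values measure, so they are passed here as the hypothetical parameters `a` and `b`. The function name `score_candidate` is also illustrative.

```python
def score_candidate(glossary_cooccurrences, a, b):
    """Reproduce the arithmetic of the score line.

    `a` and `b` are the two unlabeled values from the report;
    their meaning is not stated in the source.
    """
    return glossary_cooccurrences + (1 + a) / (1 + b)


score = score_candidate(3, 4.95419631038688, 6.53915881110803)
print(f"{score:.3f}")  # 3.790

# Share of proper-noun occurrences, as reported: 5 out of 92.
proper_noun_share = 5 / 92
print(f"{proper_noun_share:.0%}")  # 5%
```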

Bibliographic references found for each term

(The existence of references devoted to a term is also an indication of termhood.)
extraction
: Akbik, A. & Loser, A. (2012). Kraken: N-ary Extractions in Open Information Extraction. Proceedings of the AKBC-WEKEX 2012, 52-56.
: Banko, M., Cafarella, M. J., Soderland, S., Broadhead, M. & Etzioni, O. (2007). Open Information Extraction from the Web. Proceedings of the IJCAI 2007, 2670-2676.
: Barbu, V. (2008). Hyponymy patterns: Semi-automatic extraction, evaluation and inter-lingual comparison. In P. Sojka, A. Horak, I. Kopecek & P. Karel (Eds.), Text, Speech and Dialogue (pp. 37-44). Berlin: Springer.
: Basili, R., Moschitti, A., Pazienza, M. T. & Zanzotto, F. M. (2001). A contrastive approach to term extraction. Paper presented at the 4th Terminological and Artificial Intelligence Conference. Nancy, France.
: Bast, H. & Haussmann, E. (2013). Open Information Extraction via contextual sentence decomposition. Proceedings of the ICSC 2013, 154-159.
: Bourigault, D. (1992). Surface grammatical analysis for the extraction of terminological noun phrases. Paper presented at the 14th International Conference on Computational Linguistics. Nantes, France.
: Castella Xavier, C., Souza, M. & Strube de Lima, V. (2013). Open Information Extraction based on lexical-syntactic patterns. Proceedings of the Brazilian Conference on Intelligent Systems, 189-194.
: Ciaramita, M. & Altun, Y. (2006). Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (pp. 594-602). Association for Computational Linguistics.
: Del Corro, L. & Gemulla, R. (2013). ClausIE: Clause-based Open Information Extraction. Proceedings of the World Wide Web Conference (WWW-2013), 355-366.
: Dennis, S. (2004). An unsupervised method for the extraction of propositional information from text. Proceedings of the National Academy of Sciences, 101, 5206-5213.
: Etzioni, O., Banko, M., Soderland, S. & Weld, D. S. (2008). Open Information Extraction from the Web. Commun. ACM, 51(12), 68-74.
: Fader, A., Soderland, S. & Etzioni, O. (2011). Identifying relations for Open Information Extraction. Proceedings of the EMNLP 2011, 1535-1545.
: Favre, B., Grishman, R., Hillard, D., Ji, H., Hakkani-Tür, D. & Ostendorf, M. (2008). Punctuating speech for information extraction. Spoken Language Technologies. [online]. Available at: http://www.cs.nyu.edu/hengji/ssie.pdf
: Gamallo, P. (2014). An overview of Open Information Extraction. Proceedings of the SLATE’14, 13-16.
: Gamallo, P., Garcia, M. & Fernández-Lanza, S. (2012). Dependency-based Open Information Extraction. Proceedings of the ROBUS-UNSUP 2012, 10-18.
: Jackson, P. & Moulinier, I. (2003). Natural language processing for online applications. Text retrieval, extraction and categorization. Philadelphia: Benjamins.
: Kadhim, A. I. (2019). Term weighting for feature extraction on Twitter: A comparison between BM25 and TF-IDF. In Proceedings of the International Conference on Advanced Science and Engineering (ICOASE) (pp. 124-128). Kurdistan: University of Zakho and Duhok Polytechnic University.
: Kim, J. & Moldovan, D. (1993). Acquisition of semantic patterns for Information Extraction from Corpora. Proceedings of the 9th IEEE Conference on AI for Applications, 171-176.
: Kolesnikova, O. (2011). Automatic extraction of lexical functions. Doctoral thesis, Instituto Politécnico Nacional, Mexico City, Mexico.
: López Rodríguez, C. I. (2007). Understanding scientific communication through the extraction of the conceptual and rhetorical information codified by verbs. Terminology, 13(1), 61-84.
: Mausam, Schmitz, M., Bart, R., Soderland, S. & Etzioni, O. (2012). Open language learning for Information Extraction. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL ’12), 523-534.
: Patry, A. & Langlais, P. (2005). Corpus-based terminology extraction. Paper presented at the 7th International Conference on Terminology and Knowledge Engineering, Copenhagen, Denmark.
: Pennacchiotti, M. & Pantel, P. (2009). Entity extraction via ensemble semantics. In Proceedings of Conference on Empirical Methods in Natural Language Processing. Singapore: ACL.
: Sierra, G., Alarcón, R., Aguilar, C. & Bach, C. (2008). Definitional verbal patterns for semantic relation extraction. Terminology, 14(1), 74-98.
: Soderland, S. (1999). Learning information extraction rules for semi-structured and free text. Machine Learning, 34(1-3), 233-272.
: Wu, F. & Weld, D. S. (2010). Open Information Extraction using Wikipedia. Proceedings of the ACL 2010, 118-127.
: You, W., Fontaine, D. & Barthes, J. P. (2013). An automatic keyphrase extraction system for scientific documents. Knowledge and Information Systems, 34(3), 691-724.
: Zhila, A. & Gelbukh, A. (2013). Comparison of Open Information Extraction for Spanish and English. Computational Linguistics and Intellectual Technologies 12(1), 794-802.
: Zhila, A. & Gelbukh, A. (2014). Open Information Extraction for Spanish language based on syntactic constraints. Proceedings of the ACL SRW 2014, 78-85.
: Zhila, A. (2014). Open Information Extraction using constraints over part-of-speech sequences. Unpublished doctoral dissertation, Instituto Politécnico Nacional, Mexico.