List of candidates under examination:
1) extraction (*)
(*) Terms present in our linguistics glossary

1) Candidate: extraction

Is in the gold standard

paper corpusSignosTxtLongLines181 - : Turney, P. (1997). Extraction of keyphrases from text: Evaluation of four algorithms. Ottawa: National Research Council Canada. Technical Report ERB-1051.

paper corpusSignosTxtLongLines416 - : Open Information Extraction from real Internet texts in Spanish using constraints over part-of-speech sequences: Problems of the method, their causes, and ways for improvement

paper corpusSignosTxtLongLines416 - : sequences. We identify the causes of the errors of each type and suggest ways for preventing such errors with corresponding analysis of their cost and scale of impact. The analysis is performed for extractions from two Spanish-language text datasets: the FactSpaCIC dataset of grammatically correct and verified sentences and the RawWeb dataset of unedited text fragments from the Internet. Extraction is performed by the ExtrHech system.

paper corpusSignosTxtLongLines416 - : Resolution of the coordinating conjunctions results in extraction of three tuples:

paper corpusSignosTxtLongLines416 - : The relation phrase in this extraction is incorrect: it should be “llevó a cabo”, i.e., include the noun cabo of the idiom llevar a cabo, lit. “to bring to accomplishing”, which is not supported by the current algorithm. In addition, since a verb relation phrase is the first element of the relation tuple looked for by ExtrHech, incorrect detection of the relation phrase in most of the cases leads to incorrect detection of the arguments. In this example, the second argument should be numerosas manifestaciones públicas.

paper corpusSignosTxtLongLines416 - : and the corresponding extraction:

paper corpusSignosTxtLongLines416 - : However, the system generated the extraction:

paper corpusSignosTxtLongLines416 - : Therefore, the extraction:

paper corpusSignosTxtLongLines416 - : The sentence tells us that the first hominids ate meat only under some specific conditions, namely, when they encountered leftovers. Therefore, the extraction:

paper corpusSignosTxtLongLines416 - : The correct extraction would be:

paper corpusSignosTxtLongLines416 - : that looks similar to a passive construction, and the corresponding erroneous extraction:

paper corpusSignosTxtLongLines416 - : We have analyzed in detail the errors typical to the method of Open IE based on heuristic rules over POS-tags. No detailed description or accurate classification of the errors had been reported before, although some types of errors along with some issues were mentioned by Fader et al. (2011), but not distinguished. We have distinguished between errors and their sources. We have classified all information extraction errors into four types based on the component of an extracted fragment where an error occurred: incorrect relation phrase, incorrect arguments, incorrect argument order, and incorrect arguments with correct relation phrase. This classification is complete: it covers all possible errors.

paper corpusSignosTxtLongLines496 - : Extraction Method: PCA/ Rotation Method: Varimax with Kaiser Normalization/ Q: question

paper corpusSignosTxtLongLines533 - : Highly accurate computational tools for the automatic processing of written texts are on the rise. Venegas, Zamora and Galdames (2016), for example, used lexico-grammatical and lexico-semantic patterns on the WEKA platform (Waikato Environment for Knowledge Analysis) to automatically classify the macro-moves of final-year projects in the undergraduate thesis genre. The WEKA platform is a collection of algorithms for data analysis and predictive modeling. Cotos, O’Connor, Chapelle, Coetze, Brabanter, Gilbert and Mac Donald (2016) are developing the Automated Functional Language Extraction System (AFLEX), a computational tool capable of analyzing individual sections of research articles written by students, generating discipline-specific feedback based on the rhetorical conventions of this genre, and providing various guided prompts for writing. Although many of these

Evaluating the candidate extraction:

1) incorrect: 7
2) errors: 7 (*)
3) relation: 6
5) phrase: 4 (*)
8) corresponding: 3 (*)
9) correct: 3

Language: eng
Freq: 92
Docs: 23
Proper noun: 5 / 92 = 5%
Co-occurrences with glossary: 3
Score: 3.790 = 3 + (1 + 4.95419631038688) / (1 + 6.53915881110803)
Candidate accepted
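
The arithmetic on the score line above can be reproduced with a short Python sketch. It assumes the leading 3 is the glossary co-occurrence count reported just before it. The report does not say what the two long decimal values measure, so the parameter names `a` and `b`, like both function names, are hypothetical placeholders and not the tool's own identifiers.

```python
def candidate_score(glossary_cooccurrences: int, a: float, b: float) -> float:
    """Co-occurrence count plus the add-one ratio (1 + a) / (1 + b).

    `a` and `b` stand for the two unexplained values printed in the report.
    """
    return glossary_cooccurrences + (1 + a) / (1 + b)


def proper_noun_share(proper_noun_hits: int, frequency: int) -> float:
    """Fraction of the candidate's occurrences counted as proper nouns."""
    return proper_noun_hits / frequency


# Values copied from the report for the candidate "extraction".
score = candidate_score(3, 4.95419631038688, 6.53915881110803)
share = proper_noun_share(5, 92)

print(f"score = {score:.3f}")             # score = 3.790
print(f"proper-noun share = {share:.0%}")  # proper-noun share = 5%
```

The report does not state the threshold that decides acceptance, so the sketch stops at the score itself.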

Bibliographic references found for each term

(The existence of references devoted to a term is also an indication of termhood.)
