Tecling » The World is automatic
Technologies for Linguistic Analysis
June 25, 2019:
A new version of Kind is online

We just added a new function that performs a full autocheck of the results for all hyponyms in each category, flagging those suspected of being errors. For the moment, it does not delete the suspected errors; it only flags them. We will monitor this over the next few days and then proceed with the definitive deletion of all suspected errors. Please email us if you find anything strange in the results.

[UPDATE June 29, 2019]: We just added a new algorithm (morfeo) to the filtering process, and the quality of the error detection has improved considerably. But beware: we are updating the data right now, so you may see some inconsistencies while the update is in progress.

June 11, 2019:
Antonio San Martín presented a talk on 'flexible' terminological definitions

Antonio San Martín, from Université du Québec à Trois-Rivières, presented a talk with the title "La definición terminológica flexible" (The flexible terminological definition).
A short abstract: The terminological definition, as a linguistic description of the most relevant content of a concept, is one of the most important elements in any terminological resource. However, it is common for users consulting a definition not to find the information they need. One of the main causes is that, although the knowledge conveyed by a term is not always the same and depends on the context of activation, the usual practice in terminology is for each term to receive a single definition (with the exception of cases of polysemy). As a result, definitions tend to be either too general, in an effort to cover all possible contexts, or too specific, so that they are not applicable to other relevant contexts. The talk presented the flexible terminological definition, a new approach based on the principles of cognitive linguistics that aims to remedy this situation and produce more useful definitions.
The slides of the talk are available here (in Spanish only).

May 7, 2019:
Estilector introduces new features!

After many weeks of work, we introduce a new version of Estilector. In this version we have implemented POL and EMaD as subroutines of the text analysis, and we now offer some statistical information (number of words, paragraphs and characters). You will see a section with all this information in the header of the text, to help you improve your writing. Also, in the Contact section, you will find the Estilector FAQs, based on the messages we have received over the years.
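As an illustration only (this is not Estilector's code), the kind of statistics mentioned above can be sketched in a few lines of Python. The function name `text_stats` and the rules used here (words are runs of letters or digits, paragraphs are separated by blank lines) are assumptions made for this example.

```python
import re


def text_stats(text: str) -> dict:
    """Count words, paragraphs and characters in a plain-text string."""
    # Assumption: paragraphs are blocks of text separated by blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Assumption: a word is a run of letters, digits or underscores
    # (\w is Unicode-aware in Python 3, so accented letters are included).
    words = re.findall(r"\w+", text)
    return {
        "words": len(words),
        "paragraphs": len(paragraphs),
        "characters": len(text),  # includes spaces and line breaks
    }


sample = "Primera frase del texto.\n\nSegundo párrafo, más corto."
stats = text_stats(sample)
print(stats)
```

A real writing assistant would probably need a finer definition of a word (hyphenated forms, numbers, abbreviations), so these counts are only approximate.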

April 8, 2019:
The slides of Francisco Mondaca's talk are available

For those who attended his presentation at the ILCL (or for those who couldn't make it but are still interested), the slides he used are available at this link. We learned a lot and we look forward to continuing our collaboration. Thanks, Francisco!

Tools & demos

We have implemented different types of applications and most of them can be tested online. Take a look.

+ Bifid: a parallel corpus aligner

+ Dsele: a model dictionary for ELE learners

+ EMaD: automatic categorization of discourse markers

+ Estilector: a tool for assisted writing

+ GeNom: a program to detect the gender of proper nouns

+ Jaguar: a tool for statistical corpus analysis

+ Kind: a taxonomy induction algorithm

+ Kwico: a concordancer for big corpora

+ Neven: a program to detect eventive nouns

+ Termout: a terminology extraction system

+ POL: named entity recognition and classification

+ Poppins: a supervised text classifier

+ Porcus: an interface for various taggers and parsers for Spanish

+ Sapo: a program to detect similarities between documents

+ Sicam: a Perl implementation of Ricardo Martínez's Excel routine to split a Spanish word into syllables (new!)

+ Verbario: corpus pattern analysis in Spanish


This is the view from where we are located, in the Sausalito lagoon, a quiet and lovely place in Viña del Mar, Chile. Sunny days. Birds can be seen in the center of the lagoon.

As researchers, we are currently affiliated with:
Pontificia Universidad Católica de Valparaíso
Instituto de Literatura y Ciencias del Lenguaje

Av. El Bosque 1290, Viña del Mar, Chile

Upcoming Events

November 7 and 8, 2019: Wopatec 2019 is on! It will take place in Valparaíso. The deadline for sending an abstract is March 4, 2019.

November 13-15, 2019: Rogelio Nazar will offer a course with the title Introducción a la Lingüística computacional (Introduction to Computational Linguistics) at JELING 2019 (II Jornadas Nacionales y I Internacionales de Estudios Lingüísticos). Facultad de Filosofía y Letras, Universidad Nacional de Cuyo, Mendoza, Argentina.

Latest ideas & research projects

We are developing new projects in computational linguistics and natural language processing:

+ Fondecyt Regular (2019-2021): "Polisemia regular de los sustantivos del español: análisis semiautomático de corpus, caracterización y tipología" (Regular polysemy of nouns in Spanish: semiautomatic corpus analysis, characterization and typology). Lead researcher: Irene Renau. Ref.: 1191204.

+ Fondecyt Regular (2019-2021): "Inducción automática de taxonomías de marcadores discursivos a partir de corpus multilingües" (Automatic induction of taxonomies of discourse markers from multilingual corpora). Lead researcher: Rogelio Nazar. Ref.: 1191481.

+ Ecos-Sud (International Project between Chile and France): "Inducción automática de taxonomías del español y el francés mediante técnicas cuantitativas y estadística de corpus" (Automatic induction of Spanish and French taxonomies using quantitative techniques and corpus statistics). Lead researcher: Irene Renau. Ref.: C16H02.

+ Fondecyt Regular: "Desarrollo de la competencia terminológica a lo largo de la inserción disciplinar" (Development of terminological competence over the course of disciplinary integration). Lead researcher: Sabela Fernández. Co-researcher: Rogelio Nazar. Ref.: 11121597.

+ See more.

Recent publications

+ Irene Renau; Rogelio Nazar; Valesca Lecaros. (Forthcoming). "La evolución de las marcas ortográficas y tipográficas en los procesos de lexicalización de neologismos: un estudio en el vocabulario de la crisis económica en prensa española". Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics.

+ Robledo, H.; Nazar, R. (2018). "Clasificación automatizada de marcadores discursivos", Procesamiento del Lenguaje Natural, n. 61, pp 109-116.

+ Nazar, R. (Forthcoming). "El análisis cuantitativo de la coocurrencia léxica en la lexicografía especializada". Actas del VIII Congreso Internacional de Lexicografía Hispánica. Valencia, España: 27-29 Junio 2018.

+ Nazar, R. (2009 [2018]). Invitación al estudio estadístico del lenguaje. ArXiv:1804.07349 [stat.AP] (PDF)

+ See more.

Solutions for text processing

It is critical for organizations to have the ability to process information automatically, and very often that information is contained in documents to be read by humans rather than machines. We have different methods for text processing depending on the goal.

We can teach people how to automate their text processing routines. We can batch-process thousands of documents to extract information from them or to derive different types of statistics. We can also modify these documents, or generate databases or email correspondence based on the information extracted from them. Anything that involves intelligent management of information can benefit from some degree of automation, which frees up time, effort and resources.
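To give a rough idea of what batch processing can look like, here is a minimal Python sketch that reads every text file in a folder and accumulates word frequencies across all of them. It is only an illustration under simple assumptions: the function name `batch_word_frequencies`, the `*.txt` pattern, the UTF-8 encoding and the two tiny sample files are all hypothetical, and real projects usually need format-specific readers (PDF, Word, HTML) before this step.

```python
import re
import tempfile
from collections import Counter
from pathlib import Path


def batch_word_frequencies(folder: Path, pattern: str = "*.txt") -> Counter:
    """Accumulate lowercase word frequencies over all matching files."""
    totals = Counter()
    for path in sorted(folder.glob(pattern)):
        # Assumption: the documents are plain text encoded as UTF-8.
        text = path.read_text(encoding="utf-8")
        totals.update(word.lower() for word in re.findall(r"\w+", text))
    return totals


# Demo on two hypothetical files created in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a.txt").write_text("El corpus es grande.", encoding="utf-8")
    (folder / "b.txt").write_text("El corpus crece.", encoding="utf-8")
    freqs = batch_word_frequencies(folder)

print(freqs.most_common(2))
```

The same loop scales to thousands of files; only the per-document step (here, counting words) changes with the goal.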

Tell us what your needs are and we will show you what we can do.