Tecling » The World is automatic
Technologies for Linguistic Analysis
September 21, 2020:
Very impressive user statistics of Estilector.com

In 2015 we launched Estilector.com, a program for the revision of Spanish academic texts. It was designed to help university students write their first academic texts, and it was also an effort to help ourselves, because we found that we were always correcting the same problems in our own students' texts. Now, however, we are surprised to see how much its user community has grown. At a pace of 250,000 texts per month, it is now reaching the incredible landmark of 5 million texts processed since its creation. It is by far the most successful program we have developed, at least in terms of users. It has, of course, many problems we have yet to tackle. But rest assured we will be paying more attention to it from now on.

September 7, 2020
Wopatec will be held on October 21-22, 2020

The 5th Workshop on Natural Language Processing will take place in Antioquia, Colombia, in conjunction with the 3rd International Congress of Computational and Corpus Linguistics. For the first time, the event will be held completely online. Registration is free for everyone.
For more information visit: http://www.wopatec.cl (in Spanish only)

August 24, 2020:
A web interface for the detection of deixis in Spanish

This is only very preliminary. In fact, the version that will replace the one we are showing now is almost ready... but yeah, we could not wait any longer. We wanted to share our new toy with the world:
It accepts a text in Spanish and will try to detect and classify instances of personal, temporal and spatial deixis.
Documentation is currently non-existent. But someone, at some point, will have to do something about that.
Update: (September 4, 2020) We added some documentation.

Tools & demos

We have implemented different types of applications and most of them can be tested online. Take a look.

+ Bifid: a parallel corpus aligner

+ Dsele: a model dictionary for ELE learners

+ Deixis: a tool for the identification of deixis in Spanish texts

+ EMaD: automatic categorization of Spanish discourse markers

+ Estilector: a tool for assisted writing

+ GeNom: a program to detect the gender of proper nouns

+ HAT: a project for the treatment of polysemy in lexical taxonomies

+ Jaguar: a tool for statistical corpus analysis

+ Kind: a taxonomy induction algorithm

+ Kwico: a concordancer for big corpora

+ Neven: a program to detect eventive nouns

+ Termout: a terminology extraction system

+ POL: named entity recognition and classification

+ Poppins: a supervised text classifier

+ Porcus: an interface for various taggers and parsers for Spanish

+ pullPOS: a project for the detection of plurals in Spanish

+ Sapo: a program to detect similarities between documents

+ Sicam: a Perl implementation of Ricardo Martínez's Excel routine to split a Spanish word into syllables (new!)

+ Verbario: corpus pattern analysis in Spanish


This is the view from where we are located, in the Sausalito lagoon, a quiet and lovely place in Viña del Mar, Chile. Sunny days. Birds can be seen in the center of the lagoon.

As researchers, we are currently affiliated with:
Pontificia Universidad Católica de Valparaíso
Instituto de Literatura y Ciencias del Lenguaje

Av. El Bosque 1290, Viña del Mar, Chile

Upcoming Events

October 21-22, 2020: Wopatec is back! The 5th Workshop on Natural Language Processing will take place in Antioquia, Colombia, in conjunction with the 3rd International Congress of Computational and Corpus Linguistics. For the first time, the event will be held completely online.

November 30, 2020: Rogelio Nazar will be guest editor of the special issue on Computational Linguistics of the journal Anales de lingüística, founded by the great Catalan linguist Joan Corominas in 1941, at Universidad Nacional de Cuyo, Argentina.
If you would like to have your paper published in this issue, send it by email to rogelio dot nazar at pucv dot cl before the deadline indicated in the heading.
Here is the call for papers (at the moment only available in Spanish).
And here are the instructions for authors.

Latest ideas & research projects

We are developing new projects in computational linguistics and natural language processing:

+ Fondecyt Regular (2019-2021): "Polisemia regular de los sustantivos del español: análisis semiautomático de corpus, caracterización y tipología" (Regular polysemy of nouns in Spanish: semi-automatic corpus analysis, characterization and typology). Lead researcher: Irene Renau. Ref.: 1191204.

+ Fondecyt Regular (2019-2021): "Inducción automática de taxonomías de marcadores discursivos a partir de corpus multilingües" (Automatic induction of taxonomies of discourse markers from multilingual corpora). Lead researcher: Rogelio Nazar. Ref.: 1191481.

+ Ecos-Sud (International Project between Chile and France): "Inducción automática de taxonomías del español y el francés mediante técnicas cuantitativas y estadística de corpus" (Automatic induction of Spanish and French taxonomies using quantitative techniques and corpus statistics). Lead researcher: Irene Renau. Ref.: C16H02.

+ Fondecyt Regular: "Desarrollo de la competencia terminológica a lo largo de la inserción disciplinar" (Development of terminological competence over the course of disciplinary integration). Lead researcher: Sabela Fernández. Co-researcher: Rogelio Nazar. Ref.: 11121597.

+ See more.

Recent publications

+ Nazar, R.; Balvet, A.; Ferraro, G.; Marín, R.; Renau, I. (2020). "Pruning and repopulating a lexical taxonomy: experiments in Spanish, English and French". Journal of Intelligent Systems, (forthcoming...).

+ Nazar, R.; Obreque, J.; Renau, I. (2020). "Tarántula –> araña –> animal : asignación de hiperónimos de segundo nivel basada en métodos de similitud distribucional". Procesamiento del Lenguaje Natural, núm 64, pp. 29-36.

+ Renau, I.; Nazar, R.; Lecaros, V. (2020). "La evolución de las marcas ortográficas y tipográficas en los procesos de lexicalización de neologismos: un estudio en el vocabulario de la crisis económica en prensa española". Revista Española de Lingüística Aplicada, vol. 33, núm. 1, pp. 227-277.

+ See more.

Solutions for text processing

It is critical for organizations to have the ability to process information automatically, and very often that information is contained in documents to be read by humans rather than machines. We have different methods for text processing depending on the goal.

We can help by teaching people how to automate their text processing routines. We can batch-process thousands of documents to extract information from them or to derive different types of statistics. We can also modify these documents, or generate databases or email correspondence based on information extracted from them. Anything that involves intelligent management of information can benefit from some degree of automation, which frees up time, effort and resources.
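As a rough illustration of what batch processing to derive statistics can look like, here is a minimal Python sketch that aggregates word frequencies over a folder of plain-text files. It is not the method used in any of our tools: the function names (`count_words`, `batch_word_counts`), the `*.txt` file pattern and the UTF-8 encoding are assumptions made only for this example.

```python
from collections import Counter
from pathlib import Path
import re

# Matches runs of letters, including the accented characters used in Spanish.
WORD_RE = re.compile(r"[^\W\d_]+")


def count_words(text):
    """Return a Counter of the lowercased word tokens found in text."""
    return Counter(m.group(0).lower() for m in WORD_RE.finditer(text))


def batch_word_counts(folder, pattern="*.txt"):
    """Aggregate word counts over every file in folder that matches pattern.

    Assumption for this sketch: all files are plain text encoded in UTF-8.
    """
    totals = Counter()
    for path in sorted(Path(folder).glob(pattern)):
        totals.update(count_words(path.read_text(encoding="utf-8")))
    return totals
```

For instance, `batch_word_counts("corpus").most_common(10)` would list the ten most frequent words in a hypothetical folder named `corpus`. Real documents (PDF, Word, HTML) would first need to be converted to plain text, a step this sketch leaves out.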

Tell us what your needs are and we will show you what we can do about them.