Tecling » The World is automatic
Technologies for Linguistic Analysis

January 23, 2021
Readeutsch: a reading pacer for parallel corpora

So, we need to learn German as quickly as possible. How do we do it? Well, one idea would be to read some English literary pieces alongside their German translations. Seems like a good idea. Why don't you try it and tell us if it is any good?

January 16, 2021
Linguini: our new language detector

Happy times! Linguini is here! We just created this program, which detects the main language of a text and then any fragments written in other languages. It's pretty cool.

January 8, 2021
Segismund: a text-segmentator

This is our first attempt at a text segmentation tool. You can copy and paste a text in Spanish or English and we will try to split it into sentences. Just don't be too harsh in your judgment! It has only existed for a few minutes. We are confident it will improve in the coming days. Please send us feedback if you feel like it.

December 18, 2020
Cryptoman: a script to generate cryptograms

Normally, we tend to deal with relatively serious things. But hey, it's Christmas! We are allowed to divert some effort into something ridiculous at this time of the year. So we made a script to generate cryptograms. If you like solving these during breakfast, be our guest. It's available in English and Spanish. As always, enjoy in moderation.
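For the curious, here is a minimal sketch of how a cryptogram can be generated. It is not Cryptoman's actual code, which this post does not show. It assumes the classic puzzle format, a random one-to-one letter substitution, and the function name `make_cryptogram` is made up for the example.

```python
import random
import string


def make_cryptogram(text, seed=None):
    """Encode text with a random monoalphabetic substitution cipher.

    Illustrative only: an assumed puzzle format, not Cryptoman's implementation.
    Returns the encoded text and the substitution key (plain -> cipher).
    """
    rng = random.Random(seed)
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    # Reshuffle until no letter maps to itself, so every letter is disguised.
    while any(a == b for a, b in zip(letters, shuffled)):
        rng.shuffle(shuffled)
    key = dict(zip(letters, shuffled))

    out = []
    for ch in text:
        lower = ch.lower()
        if lower in key:
            sub = key[lower]
            # Preserve the case of the original letter.
            out.append(sub.upper() if ch.isupper() else sub)
        else:
            # Spaces, punctuation and non-ASCII letters (ñ, á, ...) pass through.
            out.append(ch)
    return "".join(out), key


if __name__ == "__main__":
    puzzle, solution = make_cryptogram("The World is automatic", seed=2020)
    print(puzzle)
```

A Spanish version would need to decide what to do with ñ and accented vowels. This sketch simply leaves them unencoded.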

December 1, 2020
New paper published in the Journal of Intelligent Systems

Wow! Suddenly we are getting a lot of papers published. The latest one appeared in De Gruyter's Journal of Intelligent Systems and is about our beloved project Kind (available at tecling.com/kind).
The title of the paper is "Pruning and repopulating a lexical taxonomy: experiments in Spanish, English and French", by Rogelio Nazar, Antonio Balvet, Gabriela Ferraro, Rafael Marín and Irene Renau.

November 23, 2020
New paper published in Names: A Journal of Onomastics

We are proud to announce that we just got a new paper published in Names: A Journal of Onomastics about our dear project Genom (available at tecling.com/genom).
In the paper we describe a series of methods for automatically determining the gender of proper names, based on their co-occurrence with words and grammatical features in a large corpus. A method like this makes it possible to obtain real and up-to-date name-gender links, which can be applied to a variety of natural language processing tasks such as information extraction, machine translation, anaphora resolution or large-scale delivery of email correspondence, among others.
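To give a rough idea of what "co-occurrence with words" can mean here, the sketch below counts gendered cue words near each mention of a name and picks the majority. It is a toy illustration under our own assumptions, not the method from the paper: the cue lists, the window size and the function name `guess_gender` are all hypothetical, and the example sentences are invented.

```python
from collections import Counter

# Hypothetical cue words, chosen for the example only.
FEMININE_CUES = {"she", "her", "mrs", "ms", "madam"}
MASCULINE_CUES = {"he", "his", "him", "mr", "sir"}


def guess_gender(name, sentences, window=3):
    """Guess the gender of a name from gendered words found near its mentions.

    Returns "f", "m", or "unknown" when the evidence is absent or tied.
    """
    counts = Counter()
    target = name.lower()
    for sentence in sentences:
        # Crude tokenization: split on whitespace and strip punctuation.
        tokens = [t.strip(".,;:!?\"'").lower() for t in sentence.split()]
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            # Words within `window` positions on either side of the name.
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            for word in context:
                if word in FEMININE_CUES:
                    counts["f"] += 1
                elif word in MASCULINE_CUES:
                    counts["m"] += 1
    if counts["f"] == counts["m"]:
        return "unknown"
    return "f" if counts["f"] > counts["m"] else "m"


if __name__ == "__main__":
    corpus = [
        "Mrs. Maria said she would come.",
        "Mr. Pedro lost his keys.",
    ]
    print(guess_gender("Maria", corpus), guess_gender("Pedro", corpus))
```

A real system would rely on a large corpus and on grammatical features such as agreement, which this sketch ignores.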

Tools & demos

We have implemented different types of applications and most of them can be tested online. Take a look.

+ Bifid: a parallel corpus aligner

+ Cryptoman: a script to generate cryptograms

+ Dsele: a model dictionary for ELE learners

+ Deixis: a tool for the identification of deixis in Spanish texts

+ EMaD: automatic categorization of Spanish discourse markers

+ Estilector: computer assisted writing for Spanish

+ GeNom: a program to detect the gender of proper nouns

+ HAT: a project for the treatment of polysemy in lexical taxonomies

+ Jaguar: a tool for statistical corpus analysis

+ Kind: a taxonomy induction algorithm

+ Kwico: a concordancer for big corpora

+ Linguini: a language detector

+ Marzopo: a program to detect discourse markers in Spanish

+ Modal: a program to detect modality in Spanish

+ Neven: a program to detect eventive nouns

+ Termout: a terminology extraction system

+ POL: named entity recognition and classification

+ Poppins: a supervised text classifier

+ Porcus: an interface for various taggers and parsers for Spanish

+ pullPOS: a project for the detection of plurals in Spanish

+ Sapo: a program to detect similarities between documents

+ Segismund: a text-segmentator

+ Sicam: a program to split a Spanish word into syllables

+ Verbario: corpus pattern analysis in Spanish


This is the view from where we are located, in the Sausalito lagoon, a quiet and lovely place in Viña del Mar, Chile. Sunny days. Birds can be seen in the center of the lagoon.

As researchers, we are currently affiliated with:
Pontificia Universidad Católica de Valparaíso
Instituto de Literatura y Ciencias del Lenguaje

Av. El Bosque 1290, Viña del Mar, Chile

Upcoming Events

Around January 30, 2021: we are planning to launch a new version of Bifid. It will include the new functions we have been working on: text segmentation (Segismund) and language detection (Linguini).

Latest ideas & research projects

We are developing new projects in computational linguistics and natural language processing:

+ Fondecyt Regular (2019-2021): "Polisemia regular de los sustantivos del español: análisis semiautomático de corpus, caracterización y tipología" (Regular polysemy of nouns in Spanish: semiautomatic corpus analysis, characterization and typology). Lead researcher: Irene Renau. Ref.: 1191204.

+ Fondecyt Regular (2019-2021): "Inducción automática de taxonomías de marcadores discursivos a partir de corpus multilingües" (Automatic induction of taxonomies of discourse markers from multilingual corpora). Lead researcher: Rogelio Nazar. Ref.: 1191481.

+ Ecos-Sud (International Project between Chile and France): "Inducción automática de taxonomías del español y el francés mediante técnicas cuantitativas y estadística de corpus" (Automatic induction of Spanish and French taxonomies using quantitative techniques and corpus statistics). Lead researcher: Irene Renau. Ref.: C16H02.

+ Fondecyt Regular: "Desarrollo de la competencia terminológica a lo largo de la inserción disciplinar" (Development of terminological competence throughout disciplinary integration). Lead researcher: Sabela Fernández. Co-researcher: Rogelio Nazar. Ref.: 11121597.


Recent publications

+ Nazar, R.; Balvet, A.; Ferraro, G.; Marín, R.; Renau, I. (2020). "Pruning and repopulating a lexical taxonomy: experiments in Spanish, English and French". Journal of Intelligent Systems, vol. 30, num. 1, pp. 376-394.

+ Nazar, R.; Renau, I.; Acosta, N.; Robledo, H.; Soliman, H.; Zamora, S. (2020). "Corpus-Based Methods for Recognizing the Gender of Anthroponyms". Names: A Journal of Onomastics.

+ Asenjo, S.; Nazar, R. (2020). "Marcadores discursivos en niños de 7 años con trastorno específico del lenguaje: estudio descriptivo". RLA. Revista de lingüística teórica y aplicada, vol. 58, núm. 1, pp. 93-114.

+ Nazar, R.; Obreque, J.; Renau, I. (2020). "Tarántula -> araña -> animal: asignación de hiperónimos de segundo nivel basada en métodos de similitud distribucional". Procesamiento del Lenguaje Natural, núm. 64, pp. 29-36.

+ Renau, I.; Nazar, R.; Lecaros, V. (2020). "La evolución de las marcas ortográficas y tipográficas en los procesos de lexicalización de neologismos: un estudio en el vocabulario de la crisis económica en prensa española". Revista Española de Lingüística Aplicada, vol. 33, núm. 1, pp. 227-277.


Solutions for text processing

It is critical for organizations to have the ability to process information automatically, and very often that information is contained in documents to be read by humans rather than machines. We have different methods for text processing depending on the goal.

We can help by teaching people how to automate their text processing routines. We can batch-process thousands of documents to extract information from them or to derive different types of statistics. We can also modify these documents, or generate databases or email correspondence based on information extracted from them. Anything that involves intelligent management of information can benefit from some degree of automation, which frees up time, effort and resources.

Tell us what your needs are and we will show you what we can do about them.