Technologies for Linguistic Analysis
July 21, 2020:
CHUNGUNGO: A new web demo to classify nouns as abstract and concrete



Yes, apparently, we are on fire. Maybe it has something to do with the fact that we are now working from home. Whatever the truth is, we are managing to get the job done. Chungungo had been on our waiting list for ages, but now we have something close to a working web demo:
http://www.tecling.com/chungungo
Based on the thesis of our student Valentina Ravest, it takes a noun (or a list of nouns) and classifies each one as abstract or concrete. In the coming days we will be improving the interface. In the meantime, be indulgent with this new creature, as it came into existence only a few minutes ago. But do not hesitate to report any strange behavior.




June 25, 2020:
New version of Porcus is open-source



Apparently we have entered some kind of productivity rush. No one knows how long it will last, but we will try to make the most of it. There is a new version of Porcus here. Porcus is our parser aggregator. Up to now it was only a web interface for different Spanish syntactic parsers, but now you can download it and use it offline. Enjoy in moderation!




June 16, 2020:
A new web demo for pullPOS



We promised a web demo for our pullPOS project and here it is. It should be easier now to test the algorithm with any text in Spanish. Before, you had to download the script and run it in a terminal with Perl. Now it's just some good old cut n' paste. We also improved the algorithm a little bit.


Tools & demos

We have implemented different types of applications and most of them can be tested online. Take a look.

+ Bifid: a parallel corpus aligner

+ Dsele: a model dictionary for ELE learners

+ EMaD: automatic categorization of discourse markers

+ Estilector: a tool for assisted writing

+ GeNom: a program to detect the gender of proper nouns

+ HAT: a project for the treatment of polysemy in lexical taxonomies

+ Jaguar: a tool for statistical corpus analysis

+ Kind: a taxonomy induction algorithm

+ Kwico: a concordancer for big corpora

+ Neven: a program to detect eventive nouns

+ Termout: a terminology extraction system

+ POL: named entity recognition and classification

+ Poppins: a supervised text classifier

+ Porcus: an interface for various taggers and parsers for Spanish

+ pullPOS: a project for the detection of plurals in Spanish

+ Sapo: a program to detect similarities between documents

+ Sicam: a Perl implementation of Ricardo Martínez's Excel routine to split a Spanish word into syllables (new!)

+ Verbario: corpus pattern analysis in Spanish

Sausalito

This is the view from where we are located, in the Sausalito lagoon, a quiet and lovely place in Viña del Mar, Chile. Sunny days. Birds can be seen in the center of the lagoon.

As researchers, we are currently affiliated with:
Pontificia Universidad Católica de Valparaíso
Instituto de Literatura y Ciencias del Lenguaje

Av. El Bosque 1290, Viña del Mar, Chile

Upcoming Events

First semester 2020 (from March to June): Rogelio Nazar will offer a new course with the title Procesamiento de datos lingüísticos (Linguistic data processing) at the Postgraduate Program in Linguistics (PhD + Master programs) of the Pontifical Catholic University of Valparaíso. The idea is to spend the whole semester processing huge loads of text in different languages and have a lot of fun.

Latest ideas & research projects

We are developing new projects in computational linguistics and natural language processing:

+ Fondecyt Regular (2019-2021): "Polisemia regular de los sustantivos del español: análisis semiautomático de corpus, caracterización y tipología" (Regular polysemy of nouns in Spanish: semiautomatic corpus analysis, characterization and typology). Lead researcher: Irene Renau. Ref.: 1191204.

+ Fondecyt Regular (2019-2021): "Inducción automática de taxonomías de marcadores discursivos a partir de corpus multilingües" (Automatic induction of taxonomies of discourse markers from multilingual corpora). Lead researcher: Rogelio Nazar. Ref.: 1191481.

+ Ecos-Sud (International Project between Chile and France): "Inducción automática de taxonomías del español y el francés mediante técnicas cuantitativas y estadística de corpus" (Automatic induction of taxonomies of Spanish and French using quantitative techniques and corpus statistics). Lead researcher: Irene Renau. Ref.: C16H02.

+ Fondecyt Regular: "Desarrollo de la competencia terminológica a lo largo de la inserción disciplinar" (Development of terminological competence over the course of disciplinary integration). Lead researcher: Sabela Fernández. Co-researcher: Rogelio Nazar. Ref.: 11121597.


Recent publications

+ Irene Renau; Rogelio Nazar; Valesca Lecaros. (Forthcoming). "La evolución de las marcas ortográficas y tipográficas en los procesos de lexicalización de neologismos: un estudio en el vocabulario de la crisis económica en prensa española". Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics.

+ Robledo, H.; Nazar, R. (2018). "Clasificación automatizada de marcadores discursivos". Procesamiento del Lenguaje Natural, n. 61, pp. 109-116.

+ Nazar, R. (Forthcoming). "El análisis cuantitativo de la coocurrencia léxica en la lexicografía especializada". Actas del VIII Congreso Internacional de Lexicografía Hispánica. Valencia, España: 27-29 Junio 2018.

+ Nazar, R. (2009 [2018]). Invitación al estudio estadístico del lenguaje. ArXiv:1804.07349 [stat.AP] (PDF)


Solutions for text processing

It is critical for organizations to have the ability to process information automatically, and very often that information is contained in documents to be read by humans rather than machines. We have different methods for text processing depending on the goal.

We can help by teaching people how to automate their text processing routines. We can batch-process thousands of documents to extract information from them or to derive different types of statistics. We can also modify these documents, or generate databases or email correspondence based on information extracted from them. Anything that involves intelligent management of information can benefit from some degree of automation, and that frees up time, effort and resources.
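
To give a rough idea of what batch processing can look like, here is a minimal Python sketch that walks through a folder of plain-text files and derives simple statistics from them. It is only an illustration under stated assumptions, not a description of any of our tools: the function names (tokenize, corpus_statistics), the folder layout and the very simple tokenizer are all hypothetical.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: a "word" is any run of letters (accented ones included),
# compared in lowercase. Real projects usually need a better tokenizer.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Split a text into lowercase word tokens."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def corpus_statistics(folder, pattern="*.txt"):
    """Count documents, tokens and word frequencies for the files in a folder."""
    frequencies = Counter()
    documents = 0
    for path in sorted(Path(folder).glob(pattern)):
        # Assumption: every file is UTF-8 encoded plain text.
        frequencies.update(tokenize(path.read_text(encoding="utf-8")))
        documents += 1
    return {
        "documents": documents,
        "tokens": sum(frequencies.values()),
        "frequencies": frequencies,
    }
```

Calling corpus_statistics("my_corpus") on a hypothetical folder named my_corpus returns the number of files read, the total number of tokens and a Counter of word frequencies; stats["frequencies"].most_common(20) would then list the twenty most frequent words.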

Tell us what your needs are and we will show you what we can do about them.