Technologies for Linguistic Analysis

February 24, 2023
We have a new paper in IJCL


We have a new paper published online in the International Journal of Corpus Linguistics with the title “A proposal for the inductive categorisation of parenthetical discourse markers in Spanish using parallel corpora”, by Hernán Robledo and Rogelio Nazar. It is now available (behind a paywall) at http://doi.org/10.1075/ijcl.20017.rob



February 20, 2023
Termout is finally here!


We are very happy to announce the official opening of the new version of http://www.Termout.org, our term extraction system. With this program, users will be able to process specialized corpora and extract terms, semantic categories, definitions, equivalents, synonyms and more.
Enjoy in moderation!




February 16, 2023
Termout is almost there...


Our beta-testers Benjamín López and Ana Castro are doing their job here with the new version of Termout (http://www.tecling.com/termout2022), which will soon see the light of day.
With this program, users will be able to:

  • process a specialized corpus
  • extract terms
  • classify them in semantic categories
  • extract definitions from the corpus
  • extract equivalents in other languages
  • obtain synonyms (term variants)
  • export-import term databases (CSV, HTML, TBX)
  • and even more!
We are putting the finishing touches on it before the official opening, which we hope will be in a matter of days.


February 10, 2023
Irene Renau gives a talk in Santiago de Compostela


Today, Friday, February 10, 2023, Irene Renau gave a talk on hidden metaphor patterns at the Universidad de Santiago de Compostela. The title was 'La metáfora escondida: patrones metafóricos ocultos en los verbos del español' (The hidden metaphor: concealed metaphorical patterns in Spanish verbs), and she presented the results of her most recent research on the discovery of metaphors in corpora.


January 19, 2023
A new version of Termout is coming soon...


For some months we have been working on a new version of our term extraction system, Termout. It now has many functions for automatic and manual terminology processing. As an example of computer-assisted terminology, it lets you work with your specialized corpus to extract terms, evaluate the extracted candidates, classify them in semantic categories, extract definitions from the corpus, extract equivalents in other languages, obtain synonyms (term variants) and more. We plan to open it to the public in February 2023 (next month). Follow the link for more details:
http://www.tecling.com/termout2022


January 16, 2023
Irene Renau is awarded a Fondecyt project


Professor Irene Renau, co-founder of the Tecling.com Research Group, has just been awarded one of the most competitive grants in the country, the Fondecyt Regular 2023. The four-year project is titled 'Mapa de las metáforas conceptuales en sustantivos y verbos del español: un estudio de los patrones metafóricos basado en corpus' (Map of conceptual metaphors in Spanish nouns and verbs: a corpus-based study of metaphorical patterns). Rogelio Nazar participates as co-researcher.


January 12, 2023
We wrap up a week of thesis defenses


All of our graduate students in linguistics have now defended their work. Ignacio Lobos presented his doctoral thesis project on discourse markers on Tuesday, and Javier Obreque (pictured) presented his master's thesis on modalization this morning. They all did very well and presented excellent work. We are very happy to end the semester this way.


January 9, 2023
Our students defend their master's theses


Benjamín López, Enzo Soto and Ana Castro, whose theses were supervised by Professor Irene Renau, are defending their master's theses today. Ana (pictured) is defending hers right now. Good luck!


January 6, 2023
Two of our collaborators are awarded research grants


Hernán Robledo and Ricardo Martínez, both graduates of the PhD Program in Linguistics at PUCV.cl with years of research collaboration with the Tecling Group, have been awarded research grants by ANID.cl. Hernán, with a Fondecyt Posdoc, is doing research on discourse markers, and Ricardo, with a Fondecyt Iniciación, studies the linguistic properties of poetry. Even though their research interests differ, both have made extensive use of computational models for their objects of study. We are very proud to be your friends!

Tools & demos

We have implemented different types of applications and most of them can be tested online. Take a look.

+ Compare: a simple script to compare two lists of words

+ Cryptoman: a script to generate cryptograms

+ Dismark: a multilingual taxonomy of discourse markers

+ Dsele: a model dictionary for ELE learners

+ Estilector: computer-assisted writing for Spanish

+ GeNom: a program to detect the gender of proper nouns

+ HAT: a project for the treatment of polysemy in lexical taxonomies

+ Jaguar: a tool for statistical corpus analysis

+ Kind: a lexical taxonomy induction algorithm

+ Kwico: a concordancer for big corpora

+ Lealem: a reading pacer for parallel German-Spanish texts

+ Leafran: a reading pacer for parallel French-Spanish texts

+ Linguini: a language detector

+ Neven: a program to detect eventive nouns

+ POL: named entity recognition and classification

+ Poppins: a supervised text classifier

+ Porcus: an interface for various taggers and parsers for Spanish

+ pullPOS: a project for the detection of plurals in Spanish

+ Randall: a list randomizer

+ Readeutsch: a reading pacer for parallel German-English texts

+ Sapo: a program to detect similarities between documents

+ Sicam: a program to analyze Spanish poetry

+ Termout: a terminology extraction system (new version!)

+ TEXT·A·GRAM: a program to analyze Spanish texts

+ Verbario: corpus pattern analysis in Spanish

Sausalito

This is the view from where we are located, by the Sausalito lagoon, a quiet and lovely place in Viña del Mar, Chile. Sunny days. Birds can be seen in the center of the lagoon.

As researchers, we are currently affiliated with:
Pontificia Universidad Católica de Valparaíso
Instituto de Literatura y Ciencias del Lenguaje

Av. El Bosque 1290, Viña del Mar, Chile

Upcoming Events

31 March 2023: We will be presenting a new project together with the journal Perspectiva Educacional, published by the School of Pedagogy of the Pontificia Universidad Católica de Valparaíso. It is a very interesting project on terminology and information extraction using Termout.org and other tools developed specifically for this project. More details to come!

1–2 June 2023: Rogelio Nazar will be presenting a paper at the TOTH 2023 Conference, to be held at the University Savoie Mont-Blanc (France). The title of the paper is 'Termout: a new software for the semi-automatic creation of specialized dictionaries'.

27–29 June 2023: Rogelio Nazar, David Lindemann and Nicolás Acosta will be presenting a paper at the eLex 2023 conference in Brno, Czech Republic. The presentation is a software demonstration entitled 'Termout: Towards the automation of the glossary creation process'.

30–31 August and 1 September 2023: Irene Renau and Rogelio Nazar will be teaching a course/workshop titled 'Procesamiento de corpus para lexicografía y terminología' (Corpus processing for lexicography and terminology) at the Facultad de Filosofía y Letras of the Universidad Nacional de Cuyo (Mendoza, Argentina). It will take place as part of the Jornadas de Estudios Lingüísticos (JELing) 2023.

Latest ideas & research projects

We are developing new projects in computational linguistics and natural language processing:

+ Fondecyt Regular (2023-2027): "Mapa de las metáforas conceptuales en sustantivos y verbos del español: un estudio de los patrones metafóricos basado en corpus" (Map of conceptual metaphors in Spanish nouns and verbs: a corpus-based study of metaphorical patterns). Lead researcher: Irene Renau. Co-researcher: Rogelio Nazar.

+ Fondecyt Regular (2019-2021): "Polisemia regular de los sustantivos del español: análisis semiautomático de corpus, caracterización y tipología" (Regular polysemy of nouns in Spanish: semi-automatic corpus analysis, characterization and typology). Lead researcher: Irene Renau. Ref.: 1191204.

+ Fondecyt Regular (2019-2021): "Inducción automática de taxonomías de marcadores discursivos a partir de corpus multilingües" (Automatic induction of taxonomies of discourse markers from multilingual corpora). Lead researcher: Rogelio Nazar. Ref.: 1191481.

+ Ecos-Sud (International Project between Chile and France): "Inducción automática de taxonomías del español y el francés mediante técnicas cuantitativas y estadística de corpus" (Automatic induction of Spanish and French taxonomies using quantitative techniques and corpus statistics). Lead researcher: Irene Renau. Ref.: C16H02.

+ Fondecyt Regular: "Desarrollo de la competencia terminológica a lo largo de la inserción disciplinar" (Development of terminological competence over the course of disciplinary integration). Lead researcher: Sabela Fernández. Co-researcher: Rogelio Nazar. Ref.: 11121597.

+ See more.

Recent publications

+ Robledo, H.; Nazar, R. (2023). A proposal for the inductive categorisation of parenthetical discourse markers in Spanish using parallel corpora. International Journal of Corpus Linguistics. http://doi.org/10.1075/ijcl.20017.rob

+ Renau, I.; Nazar, R. (2022). Towards a multilingual dictionary of discourse markers: automatic extraction of units from parallel corpus. In: Klosa-Kückelhaus, A.; Engelberg, S.; Möhrs, C.; Storjohann, P. (eds.). Dictionaries and Society. Proceedings of the XX EURALEX International Congress, Mannheim: IDS-Verlag, pp. 262-272.

+ Nazar, R.; Lindemann, D. (2022). Terminology extraction using co-occurrence patterns as predictors of semantic relevance. Proceedings of the TERM21 Workshop. Language Resources and Evaluation Conference (LREC 2022), Marseille, 20-25 June 2022, pp. 26-29.

Solutions for text processing

It is critical for organizations to have the ability to process information automatically, and very often that information is contained in documents to be read by humans rather than machines. We have different methods for text processing depending on the goal.

We can teach people how to automate their text processing routines. We can batch-process thousands of documents to extract information from them or to derive different types of statistics. We can also modify these documents, or generate databases or email correspondence based on the information extracted from them. Anything that involves intelligent management of information can benefit from some degree of automation, which frees up time, effort and resources.

Tell us what your needs are and we will show you what we can do about them.