name {dot} surname {at}

Institute of Linguistics and Language Technology, Rm 111, University of Malta
Tal-Qroqq Msida MSD2080, Malta

(+356) 2340 2150

some useful resources

From time to time I put up some stuff I've done that might be useful to others.


  • Tools for Maltese NLP: Various tools for Maltese (including a POS tagger, tokeniser and phonetic transcriber), mostly written in Python. Hosted on the Maltese Language Resource Server.
  • SimpleNLG: a Java library for morphological generation and syntactic realisation. This used to be hosted on Google Code, but is now on GitHub.

language resources

  • The GenChal Repository: an online repository of datasets related to the Generation Challenges, a series of Shared Task challenges organised since 2007.
  • The Maltese Language Resource Server (MLRS): a server for language resources and tools in Maltese. It currently hosts a corpus of ca. 100 million tokens of Maltese text, which is continuously being updated.
  • The TUNA Corpus of Referring Expressions: a semantically transparent, annotated corpus of references to objects in visual domains. This corpus has been used in three Shared Task Evaluations since its development.
  • Experiment on temporal structure in narrative: I've recently run this experiment and have collected a large corpus of narratives, which I'll make available once annotation is complete. In the meantime, you can read a summary here.
  • Annotated bibliography on the generation of referring expressions (and related problems)
    A collection of publications on reference and its computational treatment in generation, compiled as part of the TUNA Project. Not up to date at all!