projects - Paul Micallef - University of Malta

Projects for 2012-2013

Below are some general ideas for consideration, followed by a list of recent past projects that I supervised. If you want to discuss any topic, please contact me.

Audio based projects

(a) Modelling-type projects continuing present work

(b) Music analysis or synthesis such as obtaining a musical score from audio or identifying

Speech based projects including:

(a) analysis of speech to build an annotated phoneme database for various applications

(b) pitch analysis of existing audio for synthesis applications

(c) recognition based on a particular type of parameterisation (e.g. wavelets, DCT, etc.) and a particular type of learning (HMM, various NN types, SVM)

(d) coding: building up various models based on CELP-type techniques

(e) enhancement: looking into noise reduction

(f) use of SVM (Support Vector Machine) parameters in speech applications

(g) speech to text for mobile applications

Signal processing based projects including:

(a) Parameterisation of arbitrary waveforms based on wavelets in terms of amplitude, positioning and scaling of continuous wavelets.

(b) Multirate filter techniques: analysing the efficiency of various polyphase structures
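As a rough illustration of what "efficiency" means in (b), the sketch below compares a direct decimator (filter at the full rate, then discard samples) with an equivalent polyphase decimator, where each sub-filter runs at the lower output rate. It is only a minimal sketch: the function names and the use of NumPy are assumptions made for the example, not part of any existing project code.

```python
import numpy as np

def decimate_direct(x, h, M):
    """Filter at the full input rate, then keep every M-th output sample."""
    return np.convolve(x, h)[::M]

def decimate_polyphase(x, h, M):
    """Same output, computed with M sub-filters running at the low rate."""
    full_len = len(x) + len(h) - 1
    n_out = (full_len + M - 1) // M           # ceil(full_len / M)
    y = np.zeros(n_out)
    for k in range(M):
        h_k = h[k::M]                         # k-th polyphase component of h
        if len(h_k) == 0:
            continue                          # filter shorter than M taps
        # Branch input x[n*M - k]: delay by k samples, then downsample by M
        x_k = np.concatenate((np.zeros(k), x))[::M]
        branch = np.convolve(x_k, h_k)
        n = min(n_out, len(branch))
        y[:n] += branch[:n]
    return y
```

Both functions return the same samples, but the direct form computes every full-rate output and throws most of them away, while the polyphase form needs roughly 1/M of the multiplications. A project in this area would compare such structures in more detail.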

Multimedia data mining

Analysis of image and audio data to look into parameterisation and storage attributes.

Software applications

(a) GUI application for HMM and/or NN

(b) Analysis and Synthesis of Fast Data retrieval algorithms

These are general headings, and any student interested in a particular area is welcome to discuss it further with me to arrive at a detailed project proposal.

Past Project List

‘Speech Annotation System’, Roberta Camilleri

This thesis continued previous work, this time using formal taped Maltese sentences. The work consisted of first manually annotating some of the text and then training the system. The student used HMMs with two mixtures per phoneme and more for silence. The work was very successful and is being used for further work towards speech recognition in Maltese.

‘Analysis and Synthesis using Continuous Wavelet Transform’, Christian Spiteri

This thesis dealt with using the CWT as a basis to code any waveform using its properties in time and frequency, using new statistical techniques that are more powerful than the traditional Fourier transform.

The idea can be applied to the analysis of any signal – speech, image, geophysical, biomedical etc.

‘Simulation of Spatial Audio Reception using Headphones’, Rudi Agius

This project used genetic algorithm techniques on digital filters to generate the spatio-temporal signals for a binaural simulation of the Head-Related Transfer Function (HRTF).

Further work on the topic can involve the use of the genetic algorithm for other digital filter audio applications.

‘Building a Hidden Markov Model using C++ for Speech Applications’, Rene’ Axisa

This project provides the building blocks to use the HMM as a tool in speech technology. It is based on a publicly available version of C++. The use of HMMs in speech technology encompasses recognition, synthesis and coding.

The use of speech as a direct PC input is a current research topic, especially in a multilingual (e.g. Maltese, Maltese-English, English) environment.

Further work on building the tools to set up annotated speech databases, using the HMM as the mathematical basis, is a very topical area in speech technology.

‘Web based Interactive Systems for Distributed Applications’, Ian Buhagiar

This project provides on-line access to a text-to-speech system that takes Maltese text as input and generates audio output. The application itself, while useful, is an illustration of the power of an on-line engineering application, as distinct from database text access.

There are many other applications that can involve imaging and speech in biomedical environments such as speech impairments.

‘Speech Annotation using Hidden Markov Models’, Anthony Psaila

This was an M.Sc. thesis based on the HTK HMM publicly available software tool for speech applications.

The work can be further enhanced by including temporal (duration) effects in the HMM and by setting up a proper GUI for the annotated database, allowing a simultaneous view of the base text and the waveform in time and frequency, including audio.