The exercise has three aims
Examples of the markup are given below; further details, if required, can be found on the MUC6 pages (see the IE Lectures pages).
Type | Example | Markup |
Numerical Expressions | 2.5 per cent $100 | <numex type=pc>2.5 per cent</numex> <numex type=money>$100</numex> |
Organisations | Morgan Stanley | <namex type=org>Morgan Stanley</namex> |
Persons | Thierry Lacraz | <namex type=per>Thierry Lacraz</namex> |
Time Expressions | 1645 GMT | <timex type=time>1645</timex> |
Locations | Frankfurt | <namex type=loc>Frankfurt</namex> |
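
To make the target output concrete, here is a minimal sketch of a tagger for the numerical and time rows of the table. It is written in Python with regular expressions rather than the Unix tools and xfst suggested for the exercise, and it is an illustration only, not a model answer. The patterns, the function name tag and the ELEMENT mapping are assumptions made up for this example, and they cover little more than the table's own examples (for instance, a time is only recognised when it is followed by "GMT").

```python
import re

# Hypothetical patterns covering only the examples in the table above.
# Each named group doubles as the value of the "type" attribute.
TOKEN = re.compile(
    r"(?P<pc>\b\d+(?:\.\d+)?\s?(?:per cent|%))"           # 2.5 per cent, 3%
    r"|(?P<money>\$\d+(?:\.\d+)?)"                        # $100, $2.50
    r"|(?P<time>\b(?:[01]\d|2[0-3])[0-5]\d(?= GMT\b))"    # 1645 (before GMT)
)

# Which element each type belongs to (assumed from the table).
ELEMENT = {"pc": "numex", "money": "numex", "time": "timex"}


def tag(text):
    """Wrap every match in a single left-to-right pass over the text."""
    def wrap(match):
        kind = match.lastgroup          # name of the group that matched
        element = ELEMENT[kind]
        return f"<{element} type={kind}>{match.group(0)}</{element}>"
    return TOKEN.sub(wrap, text)


if __name__ == "__main__":
    print(tag("Shares rose 2.5 per cent to $100 at 1645 GMT"))
```

Using one combined pattern means the text is scanned once, so a later rule can never match inside a tag that an earlier rule has already inserted. Names of organisations, persons and locations are deliberately left out: they generally need word lists or contextual rules rather than a single pattern.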
This is worth 7.5% of the marks for the double credit. Translated into hours, this means 7.5 hours of work. Please indicate the number of hours spent on each part.
I suggest (but do not insist) that you use a combination of Unix tools and xfst to handle this exercise. Whatever you use, I will need to understand the operation of the algorithms you describe.