CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS

ZAMIN, NORSHUHANI (2014) CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPUS. PhD. thesis, Universiti Teknologi PETRONAS.

2014 -COMPUTER & INFORMATION SCIENCES - CROSS-LINGUAL ANNOTATION PROJECTION FOR THE DEVELOPMENT OF MALAY CORPOS - NORSHUHANI ZAMIN.pdf
Restricted to Registered users only

Abstract

Cross-lingual annotation projection methods can draw on resource-rich
languages to improve the performance of Natural Language Processing (NLP) tasks in
less-resourced languages. In this research, Malay serves as the less-resourced
language and English as the resource-rich language. The
research is intended to ease the deadlock in Malay computational linguistics
research, caused by the shortage of Malay tools and annotated corpora, by exploiting
state-of-the-art English tools. The aim is to investigate a suitable cross-lingual
annotation projection method based on word alignment between two languages with
syntactic differences. A word alignment method known as MEWA (Malay-English
Word Aligner), which integrates a Dice coefficient and a bigram string similarity
measure with little supervision, is proposed.
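
The two measures named in the abstract can be illustrated with a minimal Python sketch. It is not the MEWA implementation: the abstract does not say how the measures are combined, so the weighted sum, the `weight` parameter, all function names, and the toy parallel sentences below are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import product


def char_bigrams(word):
    """Return the multiset of character bigrams of a lower-cased word."""
    w = word.lower()
    return Counter(w[i:i + 2] for i in range(len(w) - 1))


def bigram_similarity(a, b):
    """Dice-style overlap of character bigrams, in the range [0, 1]."""
    ba, bb = char_bigrams(a), char_bigrams(b)
    total = sum(ba.values()) + sum(bb.values())
    if total == 0:
        return 0.0
    return 2.0 * sum((ba & bb).values()) / total


def dice_table(sentence_pairs):
    """Dice coefficient for every word pair that co-occurs in an aligned sentence pair."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        src_set, tgt_set = set(src), set(tgt)
        src_count.update(src_set)   # sentence pairs containing each source word
        tgt_count.update(tgt_set)   # sentence pairs containing each target word
        joint.update(product(src_set, tgt_set))
    return {
        (s, t): 2.0 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in joint.items()
    }


def align(src, tgt, dice, weight=0.5):
    """Link each source word to its best-scoring target word.

    The score is an assumed weighted sum of the two measures;
    the thesis may combine them differently.
    """
    def score(s, t):
        return weight * dice.get((s, t), 0.0) + (1 - weight) * bigram_similarity(s, t)

    return [(s, max(tgt, key=lambda t: score(s, t))) for s in src]


# Toy Malay-English sentence pairs, invented for this example.
corpus = [
    (["saya", "makan", "nasi"], ["i", "eat", "rice"]),
    (["saya", "minum", "kopi"], ["i", "drink", "coffee"]),
    (["mereka", "makan", "roti"], ["they", "eat", "bread"]),
    (["mereka", "beli", "nasi"], ["they", "buy", "rice"]),
]
dice = dice_table(corpus)
links = align(["saya", "makan", "nasi"], ["i", "eat", "rice"], dice)
```

In this sketch the co-occurrence statistic carries the alignment for words seen in the parallel data, while the string measure can still link cognate pairs such as "doktor" and "doctor" that have no co-occurrence evidence.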

Item Type: Thesis (PhD.)
Subjects: Q Science > Q Science (General)
Departments / MOR / COE: Sciences and Information Technology > Computer and Information Sciences
Depositing User: Mr Ahmad Suhairi Mohamed Lazim
Date Deposited: 16 Sep 2021 22:05
Last Modified: 16 Sep 2021 22:05
URI: http://utpedia.utp.edu.my/id/eprint/21305