ZUL INDRA (2016) AN INTEGRATED GENERIC TEXT CLASSIFICATION ALGORITHM FOR INDONESIAN AND MALAY NEWS DOCUMENTS. Masters thesis, Universiti Teknologi PETRONAS.
2015 -IT - AN INTEGRATED GENERIC TEXT CLASSIFICATION ALGORITHM FOR INDONESIAN AND MALAY NEWS DOCUMENT - ZUL INDRA - MASTER.pdf
Abstract
Text classification (TC) provides a better way to organize information since it allows
better understanding and interpretation of the content. It deals with the assignment of
labels to groups of similar textual documents. However, TC research for Asian-language
documents is relatively limited compared to English documents, and even
more limited for news articles. Apart from that, TC research on classifying textual
documents in morphologically similar languages such as Indonesian and Malay is still scarce. Hence,
the aim of this study is to develop an integrated generic TC algorithm which is able to
identify the language and then classify the category of the identified news documents.
Furthermore, the top-ra feature selection method is utilised to improve TC performance
and to overcome the challenges of classifying online news corpora: the rapid data growth
of online news documents and the high computational time.
Item Type: | Thesis (Masters) |
---|---|
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Departments / MOR / COE: | Sciences and Information Technology > Computer and Information Sciences |
Depositing User: | Mr Ahmad Suhairi Mohamed Lazim |
Date Deposited: | 18 Sep 2021 21:14 |
Last Modified: | 24 Jul 2024 07:16 |
URI: | http://utpedia.utp.edu.my/id/eprint/21420 |