AN INTEGRATED GENERIC TEXT CLASSIFICATION ALGORITHM FOR INDONESIAN AND MALAY NEWS DOCUMENTS

ZUL INDRA (2015) AN INTEGRATED GENERIC TEXT CLASSIFICATION ALGORITHM FOR INDONESIAN AND MALAY NEWS DOCUMENTS. Masters thesis, Universiti Teknologi PETRONAS.

Abstract

Text classification (TC) provides a better way to organize information since it allows
better understanding and interpretation of the content. It deals with the assignment of
labels to groups of similar textual documents. However, TC research for Asian
language documents is relatively limited compared to English documents, and even
more so for news articles. Apart from that, TC research on classifying textual
documents in morphologically similar languages such as Indonesian and Malay is still
scarce. Hence, the aim of this study is to develop an integrated generic TC algorithm
which is able to identify the language and then classify the category of the identified
news documents. Furthermore, a top-ra feature selection method is utilised to improve
TC performance and to overcome the challenges of classifying online news corpora: the
rapid data growth of online news documents and the high computational time.

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Departments / MOR / COE: Sciences and Information Technology > Computer and Information Sciences
Depositing User: Mr Ahmad Suhairi Mohamed Lazim
Date Deposited: 18 Sep 2021 21:14
Last Modified: 18 Sep 2021 21:14
URI: http://utpedia.utp.edu.my/id/eprint/21420
