Text Summarization System with Bayesian Theorem on Oil & Gas Drilling Topic

Kurniawan, Iwan (2007) Text Summarization System with Bayesian Theorem on Oil & Gas Drilling Topic. Universiti Teknologi PETRONAS. (Unpublished)


Text summarization is the process of identifying the important sentences or words in an article, which are then combined to generate a summary. Numerous algorithms address the need for text summarization, including support vector machines, k-nearest neighbor classifiers, and decision trees. In this project, an algorithm based on Bayes' theorem is studied and tested by implementing a text summarizer. The algorithm extracts the important points from a lengthy document by assigning each word a probability of being included in the summary, with a corpus of expert-written summaries supplying the initial probabilities. As the application is used, it learns and keeps track of the probability of each keyword so that it can predict how likely that keyword is to be included in future summaries. The objectives of this project are to survey the current state of text summarization research, to study the statistical approach to automatic summary generation, and to build a simple text summarization tool that takes the existing research into account. Because the application domain is specific, namely oil and gas drilling, a ready-made corpus for it is not easy to find. The articles were collected from journals, news, and other information sources related to the topic. The application is evaluated against another automatic summarizer that is already on the market. Human-made summaries are used as the ideal, or reference, summaries in evaluating the performance of both the Text Summarization system and the Word Auto Summarizer.
Current results show that the Text Summarization system outperforms the Word Auto Summarizer at compression rates of 60% and 70% (2/3 of the articles' length) by 11.31% and 10.80%, respectively. The optimum value for overall performance is 85.82%.
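
The word-level scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names (`train`, `summary_probability`, `summarize`), the regex tokenizer and sentence splitter, the averaging of word probabilities into a sentence score, and the reading of "compression rate" as the fraction of sentences kept are all assumptions made for illustration. The sketch estimates P(in summary | word) with Bayes' theorem from pairs of articles and expert summaries, then keeps the highest-scoring sentences.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def train(pairs, seen=None, kept=None):
    """Count, per word, how many articles contain it (seen) and in how many
    of those the expert summary also contains it (kept).

    Passing existing counters back in lets the counts grow as more
    (article, summary) pairs become available.
    """
    seen = Counter() if seen is None else seen
    kept = Counter() if kept is None else kept
    for article, summary in pairs:
        summary_words = set(tokenize(summary))
        for word in set(tokenize(article)):
            seen[word] += 1
            if word in summary_words:
                kept[word] += 1
    return seen, kept


def summary_probability(word, seen, kept):
    """P(S | w) = P(w | S) * P(S) / P(w), estimated from the counts.

    Words never seen in training fall back to the prior P(S)
    (a hypothetical choice, not taken from the source).
    """
    total_seen = sum(seen.values())
    total_kept = sum(kept.values())
    if total_seen == 0 or total_kept == 0:
        return 0.0
    prior = total_kept / total_seen          # P(S)
    if seen[word] == 0:
        return prior
    likelihood = kept[word] / total_kept     # P(w | S)
    evidence = seen[word] / total_seen       # P(w)
    return likelihood * prior / evidence


def summarize(article, seen, kept, compression=0.6):
    """Keep the top-scoring fraction of sentences, in their original order."""
    sentences = split_sentences(article)
    if not sentences:
        return ""
    n_keep = max(1, int(len(sentences) * compression))
    scored = []
    for index, sentence in enumerate(sentences):
        words = tokenize(sentence)
        if words:
            score = sum(summary_probability(w, seen, kept) for w in words) / len(words)
        else:
            score = 0.0
        scored.append((score, index))
    # Highest score first; ties are broken by position in the article.
    best = sorted(scored, key=lambda item: (-item[0], item[1]))[:n_keep]
    return " ".join(sentences[i] for i in sorted(i for _, i in best))
```

Calling `train` on a list of (article, expert summary) pairs yields the two counters, which `summarize` then uses to score a new article. With these counts the Bayes expression reduces algebraically to `kept[word] / seen[word]`; it is written out term by term only to show the prior, likelihood, and evidence separately.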

Item Type: Final Year Project
Academic Subject: Academic Department - Information Communication Technology
Subject: Z Bibliography. Library Science. Information Resources > ZA Information resources
Divisions: Sciences and Information Technology > Computer and Information Sciences
Date Deposited: 24 Oct 2013 09:14
Last Modified: 25 Jan 2017 09:45
URI: http://utpedia.utp.edu.my/id/eprint/9553
