Preparatory Document Structuring Technique

Yan Puspitarani, Ulil Surtia Zulpratita

Abstract:

The need for mining structured data has increased in the past few years, since structured data serves as the input for data mining tasks. Text mining is the branch of data mining in which the data takes the form of unstructured text. It can handle unstructured or semi-structured data sets such as emails, HTML files, and full-text documents. Unstructured data usually refers to information that does not reside in a traditional row-column database; it is the opposite of structured data. Extracting information from text requires preprocessing steps. This paper discusses the theoretical basis of document preprocessing for text mining. Brief descriptions of some representative approaches, such as NLP tasks and information extraction, are also provided.

Keywords:

Text mining, Document structuring, Information extraction.

Paper Details
Month: 2
Year: 2020
Volume: 24
Issue: 2
Pages: 3293-3302