AUTHOR IDENTIFICATION OF HINDI STORIES

Dr. A. Pandian, Dr. M. Abdul Karim Sadiq, Paritosh Maurya, Nitin Jaiswal

Abstract:

Authorship attribution, also called authorship identification, estimates the probability that a work was produced by a given author by examining other works by that author. The process is used for characterizing an author's work, detecting plagiarism, cybercrime analysis, and similar tasks. In this paper, we apply it to a corpus of 70 Hindi stories from each of three different authors. Various lexical and structural features are extracted from these works, such as word count, average sentence length, frequency of words and characters, and function words. From these features we build a dataset and use it as input to the J48 decision tree algorithm to determine the features that best support authorship attribution. We then use these extracted features on different types of algorithm lik

Keywords:

Author Identification, Feature Selection, Hindi Stories, J48 Decision Tree, Machine Learning, Stylometry, Weka.

Paper Details
Month: 4
Year: 2020
Volume: 24
Issue: 6
Pages: 6514-6522