AUTHOR IDENTIFICATION OF HINDI STORIES
Dr. A. Pandian, Dr. M. Abdul Karim Sadiq, Paritosh Maurya, Nitin Jaiswal
Authorship attribution, also called authorship identification, estimates the probability that a work was produced by a given author by examining other works by that same author. The process is used in areas such as characterizing an author's work, detecting plagiarism, and cybercrime analysis. In this paper, we apply it to a corpus of 70 Hindi stories from each of three different authors. Various lexical and structural features are extracted from these works, such as word count, average sentence length, frequency of words and characters, and function words. From these features we build a dataset and use it as input to the J48 decision tree algorithm to determine the features that best support authorship attribution. We then use these extracted features on different types of algorithm lik
Author Identification, Feature Selection, Hindi Stories, J48 Decision Tree, Machine Learning, Stylometry, Weka.
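As a minimal sketch of the kind of lexical and structural features named in the abstract, the Python snippet below computes word count, sentence count, average sentence length, word frequencies, and a function-word ratio for one Hindi text. It is an illustration under stated assumptions, not the authors' pipeline: the function name `extract_features`, the short `FUNCTION_WORDS` list, and the tokenization rules are all hypothetical choices made for this example.

```python
import re
from collections import Counter

# Assumed, illustrative list of common Hindi function words
# (postpositions, conjunctions, auxiliaries); not the paper's list.
FUNCTION_WORDS = {"और", "का", "की", "के", "में", "है", "से", "को", "पर", "ने"}

# Sentences end at the danda (।), a question mark, or an exclamation mark.
SENTENCE_END = re.compile(r"[।?!]+")

# A token is any run of characters that is not whitespace or punctuation.
# \w is avoided on purpose: it does not match Devanagari vowel signs,
# so it would split Hindi words apart.
TOKEN = re.compile(r"[^\s।?!,;:\"'()\-]+")


def extract_features(text):
    """Return a dict of simple stylometric features for one text."""
    sentences = [s for s in SENTENCE_END.split(text) if s.strip()]
    words = [w for s in sentences for w in TOKEN.findall(s)]
    word_count = len(words)
    freq = Counter(words)
    # Counter returns 0 for words that never occur in the text.
    function_hits = sum(freq[w] for w in FUNCTION_WORDS)
    return {
        "word_count": word_count,
        "sentence_count": len(sentences),
        "avg_sentence_length": word_count / len(sentences) if sentences else 0.0,
        "function_word_ratio": function_hits / word_count if word_count else 0.0,
        "word_freq": freq,
    }


features = extract_features("राम घर में है। सीता बाहर से आई है।")
print(features["word_count"], features["avg_sentence_length"])
```

One such feature dictionary per story, labeled with its author, would form a row of the dataset. J48 is Weka's implementation of the C4.5 decision tree, so in practice these rows would be exported to a format Weka can read before training.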