Query-based Text Summarization using Averaged Query

Authors

  • Abhinandh Ajay, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu - 603203
  • Shravan V, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu - 603203
  • R. Srinivasan, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu - 603203

DOI:

https://doi.org/10.61841/m479ds71

Keywords:

text summarization, data, abstraction

Abstract

Automatic text summarization is one of the most common problems in natural language processing and machine learning. Text summarization usually works by shortening a given passage while conveying its general meaning. There are two approaches to this: extraction-based summarization and abstraction-based summarization. Extraction-based summarization, while easier to implement, usually produces grammatically incorrect summaries. Abstraction-based summarization overcomes this by framing its own sentences using grammatical knowledge of the language, and is therefore much harder to implement. The amount of unstructured data in the world is ever increasing, so situations arise where only a summary of part of the given data is needed. This is the case when a person wants to learn about a certain topic but has to skim through a lot of unrelated or unnecessary information. While this is not a problem for small amounts of data, it quickly turns into a burden as the size of the data increases, and can lead to a loss of focus or interest in the topic. In this work, the proposed technique allows the user to enter a keyword and get a summary related to that query, using word frequencies to find the most relevant words. We also use cosine similarity to remove redundant sentences from the summary. This query-based text summarization technique produces a unique summary for every unique keyword. Better readability can be achieved by using abstractive text summarization.
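The pipeline described in the abstract (keyword query, word-frequency relevance, cosine-similarity redundancy removal) can be illustrated with a minimal extractive sketch. This is not the authors' implementation: the function names, the tiny stopword list, the keyword-match filter, the mean-frequency scoring rule, and the 0.7 redundancy threshold are all assumptions chosen for illustration, and the query averaging named in the title is not modeled here.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "it", "on", "for", "by", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(text, keyword, max_sentences=2, redundancy_threshold=0.7):
    """Return up to max_sentences sentences about keyword, in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    bags = [Counter(tokenize(s)) for s in sentences]
    query = keyword.lower()

    # Assumed relevance filter: keep sentences that mention the keyword.
    relevant = [i for i, bag in enumerate(bags) if query in bag]

    # Word frequencies counted over the query-relevant sentences only.
    freq = Counter()
    for i in relevant:
        freq.update(bags[i])

    def score(i):
        # Mean frequency of the sentence's words (assumed scoring rule).
        total = sum(freq[w] * count for w, count in bags[i].items())
        return total / sum(bags[i].values())

    # Highest score first; ties broken by position in the document.
    ranked = sorted(relevant, key=lambda i: (-score(i), i))

    chosen = []
    for i in ranked:
        if len(chosen) == max_sentences:
            break
        # Skip a sentence that is too similar to one already selected.
        if all(cosine(bags[i], bags[j]) < redundancy_threshold for j in chosen):
            chosen.append(i)

    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(document, "cats")` on a document would return only sentences that mention "cats", with near-duplicates collapsed to the highest-scoring one. Raising `redundancy_threshold` toward 1.0 keeps more near-duplicate sentences, while lowering it makes the summary more diverse at the cost of dropping loosely related sentences.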

Published

30.06.2020

How to Cite

Ajay, A., V, S., & Srinivasan, R. (2020). Query-based Text Summarization using Averaged Query. International Journal of Psychosocial Rehabilitation, 24(6), 12033-12044. https://doi.org/10.61841/m479ds71