Kazakh Text Generation using Neural Bag-of-Words Model for Sentiment Analysis

Assel Nurlybayeva, Ali Abd Almisreb, Syamimi Mohd Norzeli, Musab A. M. Ali


Text generation plays an important role in business decision-making. Analyzing consumer feedback gives a complete picture of a problem and a clear direction for addressing it. However, sentiment analysis of reviews in the Kazakh language has not been widely studied. In this paper, we introduce Kazakh text generation using the Bag-of-Words (BoW) model for analyzing the opinions of consumers in social networks. Applying the proposed model to natural language processing consists of four stages: data collection, data cleaning, model building, and model evaluation. The proposed BoW model is implemented in Python and run on the Colab notebook platform. Experimental results show that the proposed method achieves higher efficiency than existing comparable methods.
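
The four stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four toy reviews, the tokenization rule, the single sigmoid unit used as the "neural" classifier, and every function name are assumptions made purely for illustration, and only the Python standard library is used.

```python
import math
import re
from collections import Counter

# Stage 1: data collection. Four hand-written toy reviews stand in for a real
# corpus (hypothetical examples; 1 = positive, 0 = negative).
TRAIN = [
    ("Қызмет өте жақсы", 1),  # "The service is very good"
    ("Тамаша тауар", 1),       # "Wonderful product"
    ("Қызмет өте жаман", 0),  # "The service is very bad"
    ("Нашар тауар", 0),        # "Poor product"
]

def clean(text):
    """Stage 2: lowercase and split into runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())

def build_vocab(texts):
    """Map each distinct token to a column index."""
    vocab = {}
    for text in texts:
        for token in clean(text):
            vocab.setdefault(token, len(vocab))
    return vocab

def vectorize(text, vocab):
    """Bag-of-words vector: token counts, with word order discarded."""
    vec = [0.0] * len(vocab)
    for token, count in Counter(clean(text)).items():
        if token in vocab:  # tokens unseen during training are ignored
            vec[vocab[token]] = float(count)
    return vec

def score(x, w, b):
    """Pre-activation of the single output unit."""
    return b + sum(wi * xi for wi, xi in zip(w, x))

def train(X, y, epochs=200, lr=0.5):
    """Stage 3: fit one sigmoid unit with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-score(x, w, b)))
            grad = p - target  # derivative of the log loss w.r.t. the pre-activation
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b

def predict(text, vocab, w, b):
    """Return 1 (positive) if the score is above zero, else 0 (negative)."""
    return 1 if score(vectorize(text, vocab), w, b) > 0 else 0

def accuracy(pairs, vocab, w, b):
    """Stage 4: fraction of examples whose predicted label matches the true one."""
    return sum(predict(t, vocab, w, b) == label for t, label in pairs) / len(pairs)

texts, labels = zip(*TRAIN)
vocab = build_vocab(texts)
w, b = train([vectorize(t, vocab) for t in texts], labels)
print("training accuracy:", accuracy(TRAIN, vocab, w, b))
print("label for an unseen review:", predict("жақсы тауар", vocab, w, b))
```

A real evaluation would measure accuracy on held-out reviews rather than on the training set, and a practical model would likely add hidden layers or learned embeddings on top of the count vector.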


Keywords: Kazakh text generation, deep learning, sentimental analyses, Bag-of-words


DOI: http://dx.doi.org/10.21533/scjournal.v11i2.234


Copyright (c) 2022 Assel Nurlybayeva, Ali Abd Almisreb, Syamimi Mohd Norzeli, Musab A. M. Ali

ISSN 2233-1859

Digital Object Identifier DOI: 10.21533/scjournal

This work is licensed under a Creative Commons Attribution 4.0 International License