<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">dsait</journal-id><journal-title-group><journal-title xml:lang="ru">Цифровые решения и технологии искусственного интеллекта</journal-title><trans-title-group xml:lang="en"><trans-title>Digital Solutions and Artificial Intelligence Technologies</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">3033-7097</issn><publisher><publisher-name>Финансовый университет при Правительстве Российской Федерации</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.26794/3033-7097-2025-1-4-16-25</article-id><article-id custom-type="elpub" pub-id-type="custom">dsait-27</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ИСКУССТВЕННЫЙ ИНТЕЛЛЕКТ И МАШИННОЕ ОБУЧЕНИЕ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING</subject></subj-group></article-categories><title-group><article-title>Анализ тональности пользовательского текста методами машинного обучения</article-title><trans-title-group xml:lang="en"><trans-title>Sentiment Analysis of User Texts with Machine Learning Methods</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0009-0009-5748-9031</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Горбунова</surname><given-names>Е. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Gorbunova</surname><given-names>E. 
A.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Екатерина Александровна Горбунова — старший разработчик программного обеспечения</p><p>Санкт-Петербург</p></bio><bio xml:lang="en"><p>Ekaterina A. Gorbunova — Senior Software Developer</p><p>Saint Petersburg</p></bio><email xlink:type="simple">kateswep@mail.ru</email><xref ref-type="aff" rid="aff-1"/></contrib><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0003-3186-3901</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Кочкаров</surname><given-names>Р. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Kochkarov</surname><given-names>R. A.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Расул Ахматович Кочкаров — кандидат экономических наук, доцент кафедры искусственного интеллекта факультета информационных технологий и анализа больших данных</p><p>Москва</p></bio><bio xml:lang="en"><p>Rasul A. Kochkarov — Cand. Sci. (Econ.), Assoc. Prof. of Artificial Intelligence Department, Faculty of Information Technology and Big Data Analysis</p><p>Moscow</p></bio><email xlink:type="simple">rkochkarov@fa.ru</email><xref ref-type="aff" rid="aff-2"/></contrib><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0009-0006-4385-4462</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Окунева</surname><given-names>Э. А.</given-names></name><name name-style="western" xml:lang="en"><surname>Okuneva</surname><given-names>E. A.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Эвелина Александровна Окунева — ассистент кафедры математики и анализа данных факультета информационных технологий и анализа больших данных</p><p>Москва</p></bio><bio xml:lang="en"><p>Evelina A. 
Okuneva — Assistant of the Department of Mathematics and Data Analysis, Faculty of Information Technology and Big Data Analysis</p><p>Moscow</p></bio><email xlink:type="simple">eaokuneva@fa.ru</email><xref ref-type="aff" rid="aff-2"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>ООО «Лаборатория систем автоматизации процессов»</institution></aff><aff xml:lang="en"><institution>Laboratory of Process Automation Systems Limited Liability Co.</institution></aff></aff-alternatives><aff-alternatives id="aff-2"><aff xml:lang="ru"><institution>Финансовый университет при Правительстве Российской Федерации</institution></aff><aff xml:lang="en"><institution>Financial University under the Government of the Russian Federation</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2025</year></pub-date><pub-date pub-type="epub"><day>23</day><month>01</month><year>2026</year></pub-date><volume>1</volume><issue>4</issue><fpage>16</fpage><lpage>25</lpage><permissions><copyright-statement>Copyright &#x00A9; Горбунова Е.А., Кочкаров Р.А., Окунева Э.А., 2026</copyright-statement><copyright-year>2026</copyright-year><copyright-holder xml:lang="ru">Горбунова Е.А., Кочкаров Р.А., Окунева Э.А.</copyright-holder><copyright-holder xml:lang="en">Gorbunova E.A., Kochkarov R.A., Okuneva E.A.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri 
xlink:href="https://www.digitarin.ru/jour/article/view/27">https://www.digitarin.ru/jour/article/view/27</self-uri><abstract><p>В статье рассматривается применение методов машинного обучения для анализа тональности текстов, опубликованных пользователями социальной сети ВКонтакте. Это дает возможность в режиме реального времени отслеживать и анализировать настроения миллионов пользователей, что способствует оперативному принятию решений и прогнозированию социальных процессов. В рамках исследования был реализован сбор текстовых данных с использованием VK API, включающих посты и комментарии пользователей. Проведена предобработка текстов: очистка, лемматизация, удаление стоп-слов и векторизация методом TF-IDF. Для классификации эмоциональной окраски были протестированы модели: логистическая регрессия, случайный лес, наивный байесовский классификатор, а также нейросетевые архитектуры LSTM и Transformers (RuBERT). Наивный байесовский классификатор показал наилучшие результаты по метрике полноты и сбалансированности по другим метрикам. Согласно результатам анализа, большинство текстов пользователей имеют нейтральную или положительную тональность, и лишь незначительная часть — негативную. Представлены визуализации и статистика распределения тональности. Работа демонстрирует эффективность применения классических методов машинного обучения для обработки и анализа текстов в русскоязычных социальных сетях.</p></abstract><trans-abstract xml:lang="en"><p>This paper explores the application of machine learning methods to sentiment analysis of user-generated texts in the Russian social network VKontakte. Such analysis makes it possible to monitor and analyze the sentiments of millions of users in real time, which facilitates prompt decision-making and forecasting of social processes. Textual data, including posts and comments, were collected via the VK API. The preprocessing pipeline involved text cleaning, lemmatization, stop-word removal, and TF-IDF vectorization. 
Several classification models were tested, including logistic regression, random forest, and naïve Bayes, as well as deep learning models such as LSTM and Transformers (RuBERT). The naïve Bayes classifier demonstrated the best performance in terms of recall and overall metric balance. Sentiment analysis results revealed that the majority of user texts were neutral or positive, with only a small portion being negative. The paper includes visualizations and statistical summaries of sentiment distribution. The study confirms the effectiveness of classical machine learning methods for processing and analyzing textual data in Russian social networks.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>анализ тональности</kwd><kwd>машинное обучение</kwd><kwd>социальные сети</kwd><kwd>Вконтакте</kwd><kwd>обработка естественного языка</kwd><kwd>TF-IDF</kwd><kwd>байесовский классификатор</kwd><kwd>сентимент-анализ</kwd><kwd>посты</kwd><kwd>комментарии</kwd></kwd-group><kwd-group xml:lang="en"><kwd>sentiment analysis</kwd><kwd>machine learning</kwd><kwd>social networks</kwd><kwd>VKontakte</kwd><kwd>natural language processing</kwd><kwd>TF-IDF</kwd><kwd>naive Bayes</kwd><kwd>user-generated content</kwd><kwd>posts</kwd><kwd>comments</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Rodríguez-Ibánez M., Casanez-Ventura F., Castejón-Mateos F., Cuenca-Jiménez P.-M. A review on sentiment analysis from social media platforms. Expert Systems with Applications. 2023;223:119862. DOI: 10.1016/j.eswa.2023.119862</mixed-citation><mixed-citation xml:lang="en">Rodríguez-Ibánez M., Casanez-Ventura F., Castejón-Mateos F., Cuenca-Jiménez P.-M. A review on sentiment analysis from social media platforms. Expert Systems with Applications. 2023;223:119862. 
DOI: 10.1016/j.eswa.2023.119862</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Wankhade M., Rao A.C.S., &amp; Kulkarni C. A survey on sentiment analysis methods, applications, and challenges. Artificial Intelligence Review. 2022;55(7):5731–5780. DOI: 10.1007/s10462-022-10144-1</mixed-citation><mixed-citation xml:lang="en">Wankhade M., Rao A.C.S., &amp; Kulkarni C. A survey on sentiment analysis methods, applications, and challenges. Artificial Intelligence Review. 2022;55(7):5731–5780. DOI: 10.1007/s10462-022-10144-1</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Cortis K., Davis B. Over a Decade of Social Opinion Mining: A Systematic Review. Artificial Intelligence Review. 2021;54(1):4873–4965. DOI: 10.1007/s10462-021-10030-2</mixed-citation><mixed-citation xml:lang="en">Cortis K., Davis B. Over a Decade of Social Opinion Mining: A Systematic Review. Artificial Intelligence Review. 2021;54(6):4873–4965. DOI: 10.1007/s10462-021-10030-2</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Mutanov G., Karyukin A., Mamykova G. Multi-Class Sentiment Analysis of Social Media Data with Machine Learning Algorithms. Computers, Materials &amp; Continua. 2021;69(1):913–930. DOI: 10.32604/cmc.2021.017827</mixed-citation><mixed-citation xml:lang="en">Mutanov G., Karyukin A., Mamykova G. Multi-Class Sentiment Analysis of Social Media Data with Machine Learning Algorithms. Computers, Materials &amp; Continua. 2021;69(1):913–930. DOI: 10.32604/cmc.2021.017827</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Salman I.K., Feizi Derakhshi M.R., Pashazadeh S., Asadpour M. A Comprehensive Review of Visual-Textual Sentiment Analysis from Social Media Networks. 
arXiv preprint. 2022;arXiv:2207.02160. DOI: 10.48550/arXiv.2207.02160</mixed-citation><mixed-citation xml:lang="en">Salman I.K., Feizi Derakhshi M.R., Pashazadeh S., Asadpour M. A Comprehensive Review of Visual-Textual Sentiment Analysis from Social Media Networks. arXiv preprint. 2022;arXiv:2207.02160. DOI: 10.48550/arXiv.2207.02160</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Zhou H. Research of text classification based on TF-IDF and CNN-LSTM. Journal of Physics: Conference Series. 2022;2171:012021. DOI: 10.1088/1742-6596/2171/1/012021</mixed-citation><mixed-citation xml:lang="en">Zhou H. Research of text classification based on TF-IDF and CNN-LSTM. Journal of Physics: Conference Series. 2022;2171:012021. DOI: 10.1088/1742-6596/2171/1/012021</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Oliveira D.F., Nogueira A., Brito M. Performance comparison of machine learning algorithms in classifying information technologies incident tickets. AI. 2022;3(3):601–622. DOI: 10.3390/ai3030035</mixed-citation><mixed-citation xml:lang="en">Oliveira D.F., Nogueira A., Brito M. Performance comparison of machine learning algorithms in classifying information technologies incident tickets. AI. 2022;3(3):601–622. DOI: 10.3390/ai3030035</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Smetanin S. The applications of sentiment analysis for Russian language texts: current challenges and future perspectives. IEEE Access. 2020;8:110693–110719. DOI: 10.1109/ACCESS.2020.3002215</mixed-citation><mixed-citation xml:lang="en">Smetanin S. The applications of sentiment analysis for Russian language texts: current challenges and future perspectives. IEEE Access. 2020;8:110693–110719. 
DOI: 10.1109/ACCESS.2020.3002215</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Braga M., Milanese G.C., Pasi G. Investigating large language models’ linguistic abilities for text preprocessing. arXiv preprint. 2025;arXiv:2510.11482. DOI: 10.48550/arXiv.2510.11482</mixed-citation><mixed-citation xml:lang="en">Braga M., Milanese G.C., Pasi G. Investigating large language models’ linguistic abilities for text preprocessing. arXiv preprint. 2025;arXiv:2510.11482. DOI: 10.48550/arXiv.2510.11482</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Feng J.H., Mohaghegh M. Hybrid model of data augmentation methods for text classification task. Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021). 2021:194–197. DOI: 10.5220/0010688500003064</mixed-citation><mixed-citation xml:lang="en">Feng J.H., Mohaghegh M. Hybrid model of data augmentation methods for text classification task. Proceedings of the 13th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2021). 2021:194–197. DOI: 10.5220/0010688500003064</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Гадасин Д.В., Пак Е.В., Коровушкина В.М., Мелькова Е.К. Предобработка текстовой информации на основе термов естественного языка. REDS: Телекоммуникационные устройства и системы. 2022;1:4–12. URL: https://www.elibrary.ru/pdgavp</mixed-citation><mixed-citation xml:lang="en">Gadasin D.V., Pak E.V., Korovushkina V.M., Melkova E.K. Natural Language Term-Based Text Information Preprocessing. REDS: Telecommunication Devices and Systems. 2022;1:4–12. 
URL: https://www.elibrary.ru/pdgavp (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit12"><label>12</label><citation-alternatives><mixed-citation xml:lang="ru">Liu Y., Ott M., Goyal N., Du J., Joshi M., Chen D., Levy O., Lewis M., Zettlemoyer L., Stoyanov V. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint. 2019;arXiv:1907.11692. DOI: 10.48550/arXiv.1907.11692</mixed-citation><mixed-citation xml:lang="en">Liu Y., Ott M., Goyal N., Du J., Joshi M., Chen D., Levy O., Lewis M., Zettlemoyer L., Stoyanov V. RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint. 2019;arXiv:1907.11692. DOI: 10.48550/arXiv.1907.11692</mixed-citation></citation-alternatives></ref><ref id="cit13"><label>13</label><citation-alternatives><mixed-citation xml:lang="ru">Щекотин Е.В., Гойко В.Л., Басина П.А., Бакулин В.В. Использование машинного обучения для изучения качества жизни населения: методологические аспекты. Цифровая социология. 2022;5(1):87–97. DOI: 10.26425/2658-347X-2022-5-1-87-97</mixed-citation><mixed-citation xml:lang="en">Shchekotin E.V., Goiko V.L., Basina P.A., Bakulin V.V. Using machine learning to study the population life quality: methodological aspects. Digital Sociology. 2022;5(1):87–97. (In Russ.). DOI: 10.26425/2658-347X-2022-5-1-87-97</mixed-citation></citation-alternatives></ref><ref id="cit14"><label>14</label><citation-alternatives><mixed-citation xml:lang="ru">Гальченко Ю.В., Нестеров С.А. Классификация текстов по тональности ML-методами. Системный анализ в проектировании и управлении. Сборник научных трудов XXVI Международной научно-практической конференции. В 3 ч. Ч. 3. Санкт-Петербург, 13–14 октября 2023 г. СПб.: ПОЛИТЕХ-ПРЕСС. 2023;26(3):369–378. DOI: 10.18720/SPBPU/2/id23-501</mixed-citation><mixed-citation xml:lang="en">Galchenko Yu.V., Nesterov S.A. Sentiment analysis with machine learning methods. Systems Analysis in Design and Management. Proc. of the XXVI International scientific conference, St. 
Petersburg, October 13–14, 2023. St. Petersburg: Politekh-Press; 2023;3:369–378. (In Russ.). DOI: 10.18720/SPBPU/2/id23-501</mixed-citation></citation-alternatives></ref><ref id="cit15"><label>15</label><citation-alternatives><mixed-citation xml:lang="ru">Мезенев К.А., Бадрызлова Ю.Г. Анализ эмоциональной тональности русскоязычных текстов с цифровыми методами. НИУ ВШЭ, магистерская диссертация. Москва, 2025. URL: https://www.hse.ru/edu/vkr/1055012487</mixed-citation><mixed-citation xml:lang="en">Mezenev K.A., Badryzlova Yu.G. Sentiment Analysis of Russian-Language Texts Using Digital Methods. Master’s Thesis. National Research University Higher School of Economics (HSE), Moscow. 2025. URL: https://www.hse.ru/edu/vkr/1055012487 (In Russ.).</mixed-citation></citation-alternatives></ref><ref id="cit16"><label>16</label><citation-alternatives><mixed-citation xml:lang="ru">Катермина Т.С., Тагиров К.М., Тагиров Т.М. Элементы ИИ в анализе текстов: LSTM-приложение к Вконтакте. Computational Nanotechnology. 2022;9(2):35–44. DOI: 10.33693/2313-223X-2022-9-2-35-44</mixed-citation><mixed-citation xml:lang="en">Katermina T.S., Tagirov K.M., Tagirov T.M. Elements of artificial intelligence in solving problems of text analysis. Computational Nanotechnology. 2022;9(2):35–44. (In Russ.). DOI: 10.33693/2313-223X-2022-9-2-35-44</mixed-citation></citation-alternatives></ref><ref id="cit17"><label>17</label><citation-alternatives><mixed-citation xml:lang="ru">Челышев Э.А., Оцоков Ш.А., Раскатова М.В., Щёголев П. Сравнение методов классификации русскоязычных новостных текстов с использованием алгоритмов машинного обучения. Вестник кибернетики. 2022;1(45):63–71. DOI: 10.34822/1999-7604-2022-1-63-71</mixed-citation><mixed-citation xml:lang="en">Chelyshev E.A., Otsokov Sh.A., Raskatova M.V., Shchegolev P. Comparing classification methods for news texts in Russian using machine learning algorithms. Proceedings of Cybernetics. 2022;1(45):63–71. (In Russ.). 
DOI: 10.34822/1999-7604-2022-1-63-71</mixed-citation></citation-alternatives></ref><ref id="cit18"><label>18</label><citation-alternatives><mixed-citation xml:lang="ru">Ивахин Д.Е., Андиева Е.Ю. Автоматический анализ текста для выявления профессиональных навыков: гибридный подход на основе TF-IDF и нейросетевых эмбеддингов. Вестник науки. 2025;4(85-2):685–692. URL: https://www.вестник-науки.рф/article/22263</mixed-citation><mixed-citation xml:lang="en">Ivakhin D.E., Andieva E. Yu. Automatic text analysis for identifying professional skills: a hybrid approach based on TF-IDF and neural network embeddings. Bulletin of Science. 2025;4(85):685–692. (In Russ.). URL: https://www.vestnik-nauki.com/article/22263</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The authors declare that there are no conflicts of interest present.</p></fn></fn-group></back></article>
