<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">dsait</journal-id><journal-title-group><journal-title xml:lang="ru">Цифровые решения и технологии искусственного интеллекта</journal-title><trans-title-group xml:lang="en"><trans-title>Digital Solutions and Artificial Intelligence Technologies</trans-title></trans-title-group></journal-title-group><issn pub-type="epub">3033-7097</issn><publisher><publisher-name>Финансовый университет при Правительстве Российской Федерации</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.26794/3030-7097-2026-2-1-6-15</article-id><article-id custom-type="elpub" pub-id-type="custom">dsait-45</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>ИСКУССТВЕННЫЙ ИНТЕЛЛЕКТ И МАШИННОЕ ОБУЧЕНИЕ</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="en"><subject>ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING</subject></subj-group></article-categories><title-group><article-title>Гибридные ансамблевые методы интеллектуального анализа данных: интеграция интерпретируемости и производительности в условиях больших данных</article-title><trans-title-group xml:lang="en"><trans-title>Hybrid Ensemble Data Mining Methods: Integrating Interpretability and Performance in Big Data Environments</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0000-0002-9145-5494</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Маркова</surname><given-names>С. 
В.</given-names></name><name name-style="western" xml:lang="en"><surname>Markova</surname><given-names>S. V.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Светлана Владимировна Маркова — кандидат технических наук, доцент, доцент кафедры математики и анализа данных факультета информационных технологий и анализа больших данных</p><p>Москва</p></bio><bio xml:lang="en"><p>Svetlana V. Markova, Cand. Sci. (Tech.), Assoc. Prof., Assoc. Prof., Department of Mathematics and Data Analysis, Faculty of Information Technology and Big Data Analysis</p><p>Moscow</p></bio><email xlink:type="simple">SVmarkova@fa.ru</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru"><institution>Финансовый университет при Правительстве Российской Федерации</institution></aff><aff xml:lang="en"><institution>Financial University under the Government of the Russian Federation</institution></aff></aff-alternatives><pub-date pub-type="collection"><year>2026</year></pub-date><pub-date pub-type="epub"><day>22</day><month>04</month><year>2026</year></pub-date><volume>2</volume><issue>1</issue><fpage>6</fpage><lpage>15</lpage><permissions><copyright-statement>Copyright &#x00A9; Маркова С.В., 2026</copyright-statement><copyright-year>2026</copyright-year><copyright-holder xml:lang="ru">Маркова С.В.</copyright-holder><copyright-holder xml:lang="en">Markova S.V.</copyright-holder><license xml:lang="ru" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>Данная работа распространяется под лицензией Creative Commons Attribution 4.0.</license-p></license><license xml:lang="en" license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri 
xlink:href="https://www.digitarin.ru/jour/article/view/45">https://www.digitarin.ru/jour/article/view/45</self-uri><abstract><p>Данное исследование представляет комплексный анализ гибридных ансамблевых методов, интегрирующих классические алгоритмы машинного обучения с современными технологиями глубокого обучения для решения задач классификации и прогнозирования на больших данных. Основная цель работы заключается в разработке и эмпирической валидации методологического подхода, позволяющего достичь оптимального баланса между производительностью модели и объяснимостью ее решений.</p><p>В ходе исследования применялись методы стекинга, бэггинга и бустинга в сочетании с техниками интерпретируемого машинного обучения, включая SHAP-анализ и методы важности признаков.</p><p>Результаты эмпирической базы исследования демонстрируют, что предложенная гибридная архитектура обеспечивает повышение точности классификации на 12–18% по сравнению с базовыми моделями при сохранении уровня интерпретируемости выше 0,85 по метрике LIME. Установлено, что оптимальная конфигурация ансамбля включает комбинацию случайного леса, градиентного бустинга и нейронных сетей с весовыми коэффициентами 0,4, 0,35 и 0,25 соответственно. Теоретическая значимость работы заключается в расширении методологической базы интеллектуального анализа данных через интеграцию принципов объяснимого ИИ в ансамблевые архитектуры. Практическая ценность определяется возможностью применения разработанного подхода в критически важных областях, требующих прозрачности принятия решений.</p></abstract><trans-abstract xml:lang="en"><p>This study provides a comprehensive analysis of hybrid ensemble methods that integrate classical machine learning algorithms with modern deep learning technologies to solve classification and forecasting tasks on large datasets. 
The main goal of this work is to develop and empirically validate a methodological approach that achieves an optimal balance between model performance and the explainability of its decisions. The study used stacking, bagging, and boosting methods in combination with interpretable machine learning techniques, including SHAP analysis and feature importance methods. The results of the empirical study demonstrate that the proposed hybrid architecture improves classification accuracy by 12–18% compared to the baseline models, while maintaining an interpretability level above 0.85 as measured by the LIME metric. The optimal ensemble configuration was found to combine random forest, gradient boosting, and neural networks with weight coefficients of 0.4, 0.35, and 0.25, respectively. The theoretical significance of the work lies in expanding the methodological framework of data mining by integrating the principles of explainable AI into ensemble architectures. The practical value lies in the applicability of the developed approach to critical areas that require transparent decision-making.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>интеллектуальный анализ данных</kwd><kwd>ансамблевые методы</kwd><kwd>объяснимый искусственный интеллект</kwd><kwd>гибридные алгоритмы</kwd><kwd>машинное обучение</kwd><kwd>большие данные</kwd><kwd>интерпретируемость</kwd></kwd-group><kwd-group xml:lang="en"><kwd>data mining</kwd><kwd>ensemble methods</kwd><kwd>explainable artificial intelligence</kwd><kwd>hybrid algorithms</kwd><kwd>machine learning</kwd><kwd>big data</kwd><kwd>interpretability</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Zhou X., Du H., Xue S., Ma Z. Recent advances in data mining and machine learning for enhanced building energy management. Energy. 2024;307:132636. 
DOI: 10.1016/j.energy.2024.132636</mixed-citation><mixed-citation xml:lang="en">Zhou X., Du H., Xue S., Ma Z. Recent advances in data mining and machine learning for enhanced building energy management. Energy. 2024;307:132636. DOI: 10.1016/j.energy.2024.132636</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Sarker I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Computer Science. 2021;2:160. DOI: 10.1007/s42979-021-00592-x</mixed-citation><mixed-citation xml:lang="en">Sarker I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Computer Science. 2021;2:160. DOI: 10.1007/s42979-021-00592-x</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Khemani B., Patil S., Kotecha K., Tanwar S. A review of graph neural networks: concepts, architectures, techniques, challenges, datasets, applications, and future directions. Journal of Big Data. 2024;11:18. DOI: 10.1186/s40537-024-00888-z</mixed-citation><mixed-citation xml:lang="en">Khemani B., Patil S., Kotecha K., Tanwar S. A review of graph neural networks: concepts, architectures, techniques, challenges, datasets, applications, and future directions. Journal of Big Data. 2024;11:18. DOI: 10.1186/s40537-024-00888-z</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Rahman A., Debnath T., Kundu D., Fahad Bin Mazhar M., Band S.S., Mosavi A. Machine learning and deep learning-based approach in smart healthcare: Recent advances, applications, challenges and opportunities. AIMS Public Health. 2024;11(1):58-109. DOI: 10.3934/publichealth.2024004</mixed-citation><mixed-citation xml:lang="en">Rahman A., Debnath T., Kundu D., Fahad Bin Mazhar M., Band S.S., Mosavi A. 
Machine learning and deep learning-based approach in smart healthcare: Recent advances, applications, challenges and opportunities. AIMS Public Health. 2024;11(1):58-109. DOI: 10.3934/publichealth.2024004</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Talukder Md.A., Islam Md.M., Uddin Md.A., Hasan K.F., Sharmin S., Alyami S.A., Moni M.A. Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction. Journal of Big Data. 2024;11:5. DOI: 10.1186/s40537-024-00886-1</mixed-citation><mixed-citation xml:lang="en">Talukder Md.A., Islam Md.M., Uddin Md.A., Hasan K.F., Sharmin S., Alyami S.A., Moni M.A. Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction. Journal of Big Data. 2024;11:5. DOI: 10.1186/s40537-024-00886-1</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Wang H., Liang Q., Hancock J.T., Khoshgoftaar T.M. Feature selection strategies: a comparative analysis of SHAP-value and importance-based methods. Journal of Big Data. 2024;11:45. DOI: 10.1186/s40537-024-00914-0</mixed-citation><mixed-citation xml:lang="en">Wang H., Liang Q., Hancock J.T., Khoshgoftaar T.M. Feature selection strategies: a comparative analysis of SHAP-value and importance-based methods. Journal of Big Data. 2024;11:45. DOI: 10.1186/s40537-024-00914-0</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Ziyadullaev D., Muhamediyeva D., Khujamkulova K., Abdurakhimov D., Maksumkhanova A., Ziyodullaeva G. Ensemble data mining methods for assessing soil fertility. E3S Web of Conferences. 2024;508:02013. 
DOI: 10.1051/e3sconf/202450802013</mixed-citation><mixed-citation xml:lang="en">Ziyadullaev D., Muhamediyeva D., Khujamkulova K., Abdurakhimov D., Maksumkhanova A., Ziyodullaeva G. Ensemble data mining methods for assessing soil fertility. E3S Web of Conferences. 2024;508:02013. DOI: 10.1051/e3sconf/202450802013</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Demilie W.B. Plant disease detection and classification techniques: a comparative study of the performances. Journal of Big Data. 2024;11:28. DOI: 10.1186/s40537-024-00907-z</mixed-citation><mixed-citation xml:lang="en">Demilie W.B. Plant disease detection and classification techniques: a comparative study of the performances. Journal of Big Data. 2024;11:28. DOI: 10.1186/s40537-024-00907-z</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Stenhouse K., Quirk S., Cherpak L., Giaddui T., Yu Y., Teo B.K. Prospective validation of a machine learning model for applicator and hybrid interstitial needle selection in high-dose-rate cervical brachytherapy. Brachytherapy. 2024;23(2):145-153. DOI: 10.1016/j.brachy.2023.11.008</mixed-citation><mixed-citation xml:lang="en">Stenhouse K., Quirk S., Cherpak L., Giaddui T., Yu Y., Teo B.K. Prospective validation of a machine learning model for applicator and hybrid interstitial needle selection in high-dose-rate cervical brachytherapy. Brachytherapy. 2024;23(2):145-153. DOI: 10.1016/j.brachy.2023.11.008</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Azevedo R.C., Araújo R.A., Oliveira A.L.I. Hybrid approaches to optimization and machine learning methods: a systematic literature review. Machine Learning. 2024;113:4055-4097. 
DOI: 10.1007/s10994-023-06467-x</mixed-citation><mixed-citation xml:lang="en">Azevedo R.C., Araújo R.A., Oliveira A.L.I. Hybrid approaches to optimization and machine learning methods: a systematic literature review. Machine Learning. 2024;113:4055-4097. DOI: 10.1007/s10994-023-06467-x</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The author declares no conflicts of interest.</p></fn></fn-group></back></article>
