Vol 1, No 3 (2025)

Machine Learning Methods for Predicting Life Expectancy

A. V. Kuznetsova, L. R. Borisova, G. A. Postovalova

PDF (Rus)

6-18 109

Abstract

Forecasting life expectancy involves not only major social and financial factors but also the state of public health, the economy, and the environment. Mathematical methods make it possible to identify the most informative indicators affecting life expectancy. The aim of the paper is to predict life expectancy from World Bank data using machine learning (ML) methods and to compare the effectiveness of different algorithms, including widely used methods such as support vector machines, decision trees, random forests, Fisher’s linear discriminant, neural networks, two variants of gradient boosting, logistic regression, and the statistically weighted syndrome (SWS) method. The database included data for 238 countries. Two standard non-parametric tests were applied: the chi-square (χ²) test and the Mann-Whitney U test. Eleven significant indicators were identified. The ML methods of the Data Master Azforus data analysis system were used. The SWS method achieved a ROC AUC of 0.986. One-dimensional and two-dimensional diagrams of the relationship between the studied socio-economic and medical indicators and life expectancy are presented. From these diagrams, one can derive how changes in individual indicators may improve the quality and length of life. Thus, the Data Master Azforus data analysis system will enable researchers to create recommendation systems for life expectancy prediction. In addition, the research will help to create a more advanced forecasting system based on machine learning models that can serve as a guide for policymakers in improving life expectancy forecasting.
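The abstract mentions both the Mann-Whitney U test and ROC AUC; the two are closely related, since ROC AUC equals the U statistic divided by the number of positive-negative pairs. The following is a minimal sketch of that relationship, not the authors' implementation: the function names and the scores are hypothetical and are not taken from the paper or from World Bank data.

```python
def mann_whitney_u(pos, neg):
    """U statistic: number of (pos, neg) pairs with pos > neg; ties count 0.5."""
    u = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u


def roc_auc(pos, neg):
    """ROC AUC as the share of correctly ordered (pos, neg) pairs."""
    return mann_whitney_u(pos, neg) / (len(pos) * len(neg))


# Hypothetical classifier scores (illustration only).
high = [0.9, 0.8, 0.7, 0.6]   # countries labelled "high life expectancy"
low = [0.65, 0.4, 0.3, 0.2]   # countries labelled "low life expectancy"

auc = roc_auc(high, low)
print(auc)  # 15 of 16 pairs are ordered correctly -> 0.9375
```

An AUC of 0.986, as reported for the SWS method, would mean that about 98.6% of such pairs are ranked in the correct order.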

Modern Tokenization Methods for Text Processing in the Financial Domain

E. F. Boltachev, M. P. Farhadov, A. I. Tyulyakov

PDF (Rus)

19-29 111

Abstract

The paper discusses tokenization as a key step in textual data processing, especially in the financial domain. Current tokenization techniques are analyzed with examples from recent research, along with their impact on the performance of NLP models. The study shows that subword tokenization algorithms (BPE, WordPiece, Unigram) have become the standard for language models due to their flexibility and text compression efficiency. We discuss the limitations of these methods (BPE and WordPiece tend to over-segment, Unigram requires complex training, and character-level tokenization creates excessively long sequences), the limits on input sequence length in language models, and methods to overcome them, including text partitioning, hierarchical processing, and extrapolation of pre-trained Transformer models to handle long inputs. For financial data, it is recommended to use domain-specific tokenizers or additional training on specialized data, which is confirmed by the successful experience of BloombergGPT. Special attention is paid to the problem of processing long texts, for which three approaches are proposed: text partitioning, hierarchical processing, and extrapolation of Transformer models. In conclusion, the importance of tokenization for financial analytics is emphasized, where the quality of text processing directly affects decision-making. The development of tokenization methods continues in parallel with the improvement of NLP models, which makes this stage of text processing a critical component of modern analytical systems.
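Of the three approaches to long texts listed in the abstract, text partitioning is the simplest to illustrate. The sketch below assumes a sliding window with overlap, which is one common way to partition; the paper does not specify its scheme. Whitespace splitting stands in for a real subword tokenizer, and the function name, window size, and sentence are hypothetical.

```python
def split_into_windows(tokens, max_len, overlap):
    """Split a token list into windows of at most max_len tokens,
    where consecutive windows share `overlap` tokens."""
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("require max_len > 0 and 0 <= overlap < max_len")
    step = max_len - overlap
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the last window already reaches the end of the text
    return windows


# Whitespace splitting is only a stand-in for a subword tokenizer.
tokens = "net income rose 12 % in the third quarter of 2024".split()
windows = split_into_windows(tokens, max_len=5, overlap=2)
for w in windows:
    print(w)
```

The overlap keeps some context across window boundaries, at the cost of processing the shared tokens twice.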

Dynamic Graphs and Some of Their Applications

R. A. Kochkarov, A. A. Kochkarov

PDF (Rus)

30-36 71

Abstract

The article considers modern approaches to modeling network systems and, more generally, networks of a dynamic nature. The paper presents a modern class of dynamic graphs and describes their practical implementation. Elementary operations, namely the deletion or addition of vertices and edges, are presented as the procedure for changing a dynamic graph. A special subclass of prefractal graphs with self-similarity properties is identified. For the class of dynamic graphs, the concept of a trajectory is defined as a sequence of classical graphs that change from one to another over time. The toolkit of dynamic graphs can serve as a basis for developing algorithms for command-information interaction of mobile subscribers in network systems, including network systems of continuous spatial monitoring. To describe optimization problems on multi-weighted graphs, a formal statement of a multi-criteria problem on a prefractal graph is proposed. Sets of feasible, Pareto-optimal, and complete solutions are described. Several lemmas of multi-criteria optimization are proposed for individual problems that have the property of completeness, as well as restrictions on the linear convolution of criteria for finding Pareto-optimal solutions. The hereditary properties that manifest themselves in the trajectories of a dynamic graph are investigated, namely the heredity of structural and functional characteristics and, as a result, the heredity of solutions during the transition from one graph to another in the trajectory. This work contributes to the development of network science and the theory of dynamic networks, offering both general approaches and particular solutions for general and special classes of graphs.
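The notion of a trajectory described above can be sketched as a list of graph snapshots, each obtained from the previous one by a single elementary operation. This is only an illustration under assumed conventions (undirected graphs stored as adjacency sets, operations encoded as tuples); the paper's formal definitions may differ, and all names here are hypothetical.

```python
def apply_operation(graph, op):
    """Return a new undirected graph (vertex -> set of neighbours)
    obtained from `graph` by one elementary operation."""
    new = {v: set(nbrs) for v, nbrs in graph.items()}  # copy, keep the old snapshot
    kind = op[0]
    if kind == "add_vertex":
        new.setdefault(op[1], set())
    elif kind == "remove_vertex":
        for nbr in new.pop(op[1], set()):
            if nbr in new:
                new[nbr].discard(op[1])
    elif kind == "add_edge":
        u, v = op[1], op[2]
        new.setdefault(u, set()).add(v)
        new.setdefault(v, set()).add(u)
    elif kind == "remove_edge":
        u, v = op[1], op[2]
        new.get(u, set()).discard(v)
        new.get(v, set()).discard(u)
    else:
        raise ValueError(f"unknown operation: {kind}")
    return new


def trajectory(initial, operations):
    """Sequence of classical graphs produced by applying operations in order."""
    graphs = [initial]
    for op in operations:
        graphs.append(apply_operation(graphs[-1], op))
    return graphs


def edge_count(graph):
    return sum(len(nbrs) for nbrs in graph.values()) // 2


g0 = {"a": {"b"}, "b": {"a"}}
ops = [("add_vertex", "c"), ("add_edge", "b", "c"),
       ("remove_edge", "a", "b"), ("remove_vertex", "a")]
traj = trajectory(g0, ops)
print([edge_count(g) for g in traj])  # [1, 1, 2, 1, 1]
```

Keeping every snapshot makes it easy to compare structural characteristics between consecutive graphs in the trajectory.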

Forecasting the Success of Projects Based on the Analysis of Structural Characteristics of Communication Networks

D. A. Pavlov

PDF (Rus)

37-43 63

Abstract

The article presents a methodology for predicting the success of project initiatives based on an analysis of the structural characteristics of project team communication networks. The study draws on data from task tracking systems (Jira, Trello) that reflect formal interactions between project participants. The research methodology includes a set of analytical tools: correlation analysis using Spearman’s and Pearson’s coefficients, regression modeling based on the Random Forest algorithm, and anomaly detection using Isolation Forest. The study revealed statistically significant correlations between key network metrics (betweenness centrality, network density, graph diameter) and project performance indicators (adherence to deadlines, budget, quality of results). In particular, excessive centralization correlates negatively with adherence to deadlines (ρ = –0.72), and network density correlates positively with quality of results (r = 0.68). The Random Forest model predicts project success with an accuracy of 84%. It was found that excessive centralization of communications reduces the probability of successful project implementation, while an optimal density of the communication network contributes to the achievement of project KPIs. The practical significance of the study lies in the possibility of early detection of project failure risks based on objective metrics of communication activity. The developed methodology, tested on IT company data, makes it possible not only to predict risks but also to formulate recommendations for optimizing team interactions. The results of the study are of interest to project managers, HR analysts, and data-driven management specialists.
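Two of the network metrics named in the abstract, density and diameter, can be computed directly from an adjacency structure. The sketch below also computes Freeman's degree centralization as a simple stand-in for "excessive centralization"; the paper itself refers to betweenness centrality, so this is a swapped-in, simpler measure. The team graph is hypothetical, and the code assumes a connected, undirected graph with at least three vertices.

```python
from collections import deque


def density(adj):
    """Share of possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def eccentricity(adj, source):
    """Largest shortest-path distance from `source`, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


def diameter(adj):
    """Longest shortest path; assumes the graph is connected."""
    return max(eccentricity(adj, v) for v in adj)


def degree_centralization(adj):
    """Freeman degree centralization: 1.0 for a star, 0.0 for a regular graph."""
    n = len(adj)
    degrees = [len(nbrs) for nbrs in adj.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical team where all communication goes through one lead.
star = {
    "lead": {"dev1", "dev2", "qa", "analyst"},
    "dev1": {"lead"}, "dev2": {"lead"}, "qa": {"lead"}, "analyst": {"lead"},
}
print(density(star), diameter(star), degree_centralization(star))
```

A star-shaped team is maximally centralized and sparse, which is the kind of structure the abstract associates with missed deadlines.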

Cybersecurity of the Digital Ruble in the System of Destructive Events of the Digital Space

A. V. Ivanov, A. V. Tsaregorodtsev

PDF (Rus)

44-54 77

Abstract

The article discusses the current problems of ensuring the cybersecurity of the digital ruble in the context of growing threats to the digital space. The relevance of the research is determined by the following factors: the need to develop a new cybersecurity concept for the digital ruble; an increase in the number of computer attacks on the credit and financial sector; an increase in the destructive information impact on financial organizations; the need to identify and prevent cyber threats; the importance of protection against hacker attacks, viruses and fraud; the need to optimize processes and improve the quality of financial payments. The purpose of the research is to develop a technical solution to ensure the cybersecurity of the digital ruble infrastructure. During the study, the following tasks were solved: the concepts of “trust infrastructure” and “cybersecurity” were clarified; cyberspace tools were analyzed in the context of protecting the digital ruble; proposals were developed to improve the cybersecurity of the trust infrastructure. The practical significance of the work lies in the possibility of applying the proposed solutions to strengthen the digital ruble protection system.

Undeclared File Architecture Features: Graphical Containers

A. A. Ryzhenko, S. I. Kozminykh

PDF (Rus)

55-61 65

Abstract

The early 2000s showed the true face of the digital industry vendors. Significant changes in protocols, file systems, data transfer systems, and so on introduced a fundamentally new direction into the general concept of information security: ensuring data confidentiality from the inside. This direction remains relevant today. The rapidly growing number of undeclared functions raises objective concerns about digital risks in all countries without exception, even in the one where the main offices of GAFAM are located. This article does not consider circumventing the ban by means of VPN channels (officially authorized by the FSB of Russia using licensed resources from popular providers); instead, it presents an algorithm describing one of the reasons that prompted the government to ban the use of global graphic information exchangers. There are many examples of such algorithms, and as a result, the decision taken by the RCN is justified from the point of view of protecting the digital sovereignty of our state. As evidence, a series of articles on the hidden functions of information systems and technologies is proposed. The initial stage is devoted to the file system of both stationary and mobile systems, starting with the simplest features that are overlooked even by modern data protection systems. The articles are intended only for educational and preventive purposes for specialists and do not include instructions for harming organizations.

Organizational Aspects of the Use of Artificial Intelligence in the Field of Defense

A. N. Kogteva

PDF (Rus)

62-68 64

Abstract

In the context of digital transformation, AI is becoming a key factor in increasing combat effectiveness and ensuring the sustainable development of the Armed Forces of the Russian Federation. The article analyzes the organizational aspects of the use of artificial intelligence in the field of defense of the Russian Federation, in the context of the need to find new ways to strengthen national security and defense capability.

The purpose of the article is to analyze the current problems of introducing artificial intelligence into the organizational processes of Russia’s defense activities and propose practical ways to solve them.

Methods. The research used methods of system analysis, comparative analysis, and a structural and functional approach, which made it possible to study the features of the integration of artificial intelligence into the organizational and managerial processes of the Russian Armed Forces. The choice of research methods is based on their complementary capabilities and allows for a comprehensive understanding of the processes of integrating artificial intelligence into the defense sector.

The problem. The paper identifies key problems in the implementation of artificial intelligence in the defense sector: technological (limitations of artificial intelligence, integration into existing systems); resource-related (financial and infrastructural inadequacies); human resources (shortage of specialists); institutional (regulatory, legal, and ethical issues); organizational (lack of interagency coordination).

Results. The author proposes specific organizational and operational measures aimed at overcoming these obstacles and effectively integrating new technologies. For the first time, the article comprehensively examines the organizational and management issues of integrating artificial intelligence in the context of the Armed Forces of the Russian Federation, taking into account modern challenges and national characteristics. It identifies current problems in the implementation of artificial intelligence in defense management processes, such as technological, personnel, and regulatory barriers.

Conclusions. For the effective implementation of artificial intelligence in the field of defense, it is necessary to solve the identified management problems by developing an appropriate regulatory framework, improving the personnel training system, strengthening interdepartmental cooperation, and adapting existing management practices to new technological realities. In addition, the need for the implementation of pilot projects and step-by-step adaptation of management practices has been identified, subject to constant monitoring and evaluation of the effectiveness of the measures applied.

Thematic Analysis of Publication Activity Among Academic and Teaching Staff: A Case Study of the Financial University

G. A. Ostapenko, G. G. Rozhkova, V. G. Feklin, R. A. Kochkarov

PDF (Rus)

69-76 99

Abstract

In the context of the rapid growth of scientific publications, the rise of interdisciplinary research, and increased competition in the academic environment, the tasks of analyzing and visualizing scientific activity are becoming especially relevant. Modern digital tools make it possible not only to track publication dynamics but also to identify key research areas and stable groups of authors that form scientific communities. One effective approach in this area is a combination of topic modeling and network analysis based on graph theory. Scientific organizations often lack up-to-date information about the internal structure of their research activities: which topics are developing most actively, what connections exist between authors and teams, and who acts as the “core” of each scientific community. This is especially true for large universities, where hundreds of researchers produce a significant number of scientific papers. In such a situation, manual analysis becomes impossible, and automated text processing and graph analytics come to the rescue. This article analyzes the publication activity of authors of the Financial University. The purpose of the study is to determine the topics of scientific papers and identify scientific communities in order to understand the development of research activities at higher education institutions, using the Financial University as an example. The study presents an approach to forming a data set of scientific publications by authors of the Financial University. Publication dynamics were visualized and keywords analyzed, which made it possible to identify common trends. The BERTopic model was used to cluster the texts and determine publication topics. Scientific communities were identified by constructing and analyzing a co-authorship graph, which makes it possible to identify groups of researchers actively collaborating within certain scientific areas.
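The co-authorship graph mentioned above can be sketched as follows: authors are vertices, and two authors are linked if they share at least one paper. Connected components are used here as the crudest possible notion of a "community"; the abstract does not say which community detection method the authors applied, so this is an assumption made for illustration. The author labels and function names are hypothetical.

```python
from itertools import combinations


def coauthorship_graph(papers):
    """Build an undirected graph: authors are vertices, shared papers are edges."""
    graph = {}
    for authors in papers:
        for a in authors:
            graph.setdefault(a, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def communities(graph):
    """Connected components, found by depth-first search."""
    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        group = set()
        while stack:
            v = stack.pop()
            group.add(v)
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        groups.append(group)
    return groups


# Hypothetical author lists, one per paper.
papers = [["A", "B"], ["B", "C"], ["D", "E"], ["F"]]
groups = communities(coauthorship_graph(papers))
print(groups)
```

On real data, a modularity-based method would usually split large components into finer groups than this sketch does.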

Cluster Analysis of Russian Federation Regions by Demand for Transport Services

D. Z. Kagan, A. A. Rylov

PDF (Rus)

77-88 97

Abstract

The article applies cluster analysis, one of the most common machine learning methods, to the study of regional transport development, the segmentation of regions by existing demand for transport services, and the identification of the most important components of the transport system for particular regions. The purpose of the study is to divide the regions of the country into clusters that are relatively homogeneous in the main aspects of demand for transport services. Each cluster combines regions with similar economic and geographical characteristics, which determines the similarity in the most demanded modes of transport and transport infrastructure facilities. The research methodology is based on machine learning algorithms and the application of mathematical metrics to sets of statistical data. The research includes the selection of significant factors, the analysis and normalization of statistical data, and the use of various clustering methods. During normalization, the data are converted to a single scale from 0 to 100 points, with outliers excluded. As a result of the cluster analysis, the regions are distributed into four main clusters. The different variants of cluster analysis can be implemented in spreadsheet editors and statistical packages. Based on the clustering results, each cluster is interpreted and the common characteristics of its regions are identified. The study is expected to be updated annually as new statistical data become available. The results can be used to analyze and develop the country’s transport system, identify priorities in the development of regional transport infrastructure, and assess the significance and necessity of regional transport projects.
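The normalization and clustering steps described above can be sketched as min-max scaling to a 0 to 100 scale followed by a basic k-means. This is a minimal illustration and not the authors' procedure: the abstract does not name the clustering algorithm, outlier exclusion is omitted here, and the single indicator, its values, and the starting centroids are all hypothetical.

```python
def scale_0_100(values):
    """Min-max scaling of raw indicator values to a 0-100 point scale."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [100.0 * (v - lo) / (hi - lo) for v in values]


def kmeans_1d(points, centroids, iterations=20):
    """Basic k-means on one indicator with given starting centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = sum(members) / len(members)
    return labels, centroids


# Hypothetical values of one transport-demand indicator for eight regions.
raw = [10, 12, 40, 43, 70, 75, 105, 110]
scaled = scale_0_100(raw)
labels, centroids = kmeans_1d(scaled, centroids=[0.0, 33.0, 66.0, 100.0])
print(scaled)
print(labels)
```

With several indicators, each would be scaled separately so that no single one dominates the distance computation.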


Digital Solutions and Artificial Intelligence Technologies

COVER STORY: Artificial intelligence and machine learning

MATHEMATICAL MODELING, NUMERICAL METHODS AND SOFTWARE PACKAGES

METHODS AND SYSTEMS OF INFORMATION PROTECTION, INFORMATION SECURITY

MATHEMATICAL, STATISTICAL AND INSTRUMENTAL METHODS IN ECONOMICS
