COVER STORY: Artificial intelligence and machine learning
Forecasting life expectancy involves not only major social and financial factors but also the state of public health, the economy, and the environment. Mathematical methods make it possible to identify the most informative indicators affecting life expectancy. The aim of the paper is to predict life expectancy from World Bank data using machine learning (ML) methods and to compare the effectiveness of different ML algorithms, including such widely used methods as support vector machines, decision trees, random forests, Fisher’s linear discriminant, neural networks, two variants of gradient boosting, logistic regression, and the statistically weighted syndrome (SWS) method. The database included data for 238 countries. The standard non-parametric chi-square (χ²) and Mann-Whitney (U-test) criteria were applied, and eleven significant indicators were identified. The ML methods of the Data Master Azforus data analysis system were used. The SWS method achieved a ROC AUC of 0.986. One-dimensional and two-dimensional diagrams of the relationship between the studied socio-economic and medical indicators and life expectancy are presented. From these charts, it is possible to infer which changes in individual indicators could improve the quality and length of life. Thus, the Data Master Azforus data analysis system will enable researchers to create recommendation systems for life expectancy prediction. In addition, the research will help to create a more advanced forecasting system based on machine learning models that can serve as a guide for policymakers in improving life expectancy forecasting.
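The two statistics named in this abstract are closely related: the ROC AUC of a scoring model equals the Mann-Whitney U statistic divided by the number of positive-negative pairs. The sketch below illustrates that relationship only; the function name and the toy scores are hypothetical and are not taken from the paper or from the Data Master Azforus system.

```python
def roc_auc(scores_pos, scores_neg):
    """ROC AUC computed as the normalized Mann-Whitney U statistic.

    Counts the pairs in which a positive example outscores a negative one
    (ties count as half a win) and divides by the total number of pairs.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    u_statistic = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                u_statistic += 1.0
            elif p == n:
                u_statistic += 0.5
    return u_statistic / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores for "high" and "low" life expectancy groups.
high_group = [0.9, 0.8, 0.4]
low_group = [0.7, 0.3, 0.2]
print(roc_auc(high_group, low_group))
```

An AUC close to 1, such as the 0.986 reported above, means that almost every positive example is ranked above almost every negative one.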
The paper discusses tokenization as a key step in textual data processing, especially in the financial domain. Current tokenization techniques are analyzed with examples from recent research, along with their impact on the performance of NLP models. The study shows that subword tokenization algorithms (BPE, WordPiece, Unigram) have become the standard for language models due to their flexibility and text compression efficiency. Their limitations are also discussed: BPE and WordPiece tend to over-segment text, Unigram requires complex training, and character-level tokenization creates excessively long sequences. For financial data, it is recommended to use domain-specific tokenizers or additional training on specialized corpora, which is confirmed by the successful experience of BloombergGPT. Special attention is paid to the problem of processing long texts under the input sequence length limits of language models. Three solution approaches are proposed: text partitioning, hierarchical processing, and extrapolation of pre-trained Transformer models to longer inputs. In conclusion, the importance of tokenization for financial analytics is emphasized, where the quality of text processing directly affects decision-making. The development of tokenization methods continues in parallel with the improvement of NLP models, which makes this stage of text processing a critical component of modern analytical systems.
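Of the three long-text approaches listed, text partitioning is the simplest to illustrate. The following is a minimal sketch, assuming the text has already been tokenized and that overlapping fixed-size windows are acceptable; the function name and parameters are hypothetical and do not come from the paper.

```python
def split_into_windows(tokens, max_len, overlap):
    """Split a token sequence into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens so that context at the
    window boundaries is not lost entirely.
    """
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("need max_len > 0 and 0 <= overlap < max_len")
    step = max_len - overlap
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # the last window already reaches the end of the text
    return windows


# Ten token ids, windows of four tokens, one shared token between windows.
print(split_into_windows(list(range(10)), max_len=4, overlap=1))
```

Each window can then be passed to the model separately; how the per-window outputs are combined is the subject of the hierarchical approach and is not covered by this sketch.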
MATHEMATICAL MODELING, NUMERICAL METHODS AND SOFTWARE PACKAGES
The article considers modern approaches to modeling network systems and, more generally, networks of a dynamic nature. The paper presents a modern class of dynamic graphs with a description of their practical implementation. Basic operations, including the deletion or addition of vertices and edges, are presented as a procedure for changing a dynamic graph. A special subclass of prefractal graphs with self-similarity properties is identified. For the class of dynamic graphs, the concept of a trajectory is defined as a sequence of classical graphs that change from one to another over time. The toolkit of dynamic graphs can become the basis for developing algorithms for command-information interaction of mobile subscribers in network systems, including network systems of continuous spatial monitoring. To describe optimization problems on multi-weighted graphs, a formal statement of a multi-criteria problem on a prefractal graph is proposed. The sets of feasible, Pareto-optimal, and complete solutions are described. Some lemmas of multi-criteria optimization are proposed for individual problems that have the property of completeness, as well as restrictions on the linear convolution of criteria for finding Pareto-optimal solutions. The hereditary properties that manifest themselves in the trajectories of a dynamic graph are investigated, namely the heredity of structural and functional characteristics and, as a result, the heredity of solutions during the transition from one graph to another in the trajectory. This work contributes to the development of network science and the theory of dynamic networks, offering both general approaches and particular solutions for general and special classes of graphs.
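The notion of a trajectory described above can be sketched in code as a list of immutable graph snapshots, each obtained from the previous one by a basic operation. This is an illustrative sketch only: the `Graph` class, the four operation functions, and `trajectory` are hypothetical names, and the representation (undirected, unweighted) is an assumption rather than the paper's formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Graph:
    vertices: frozenset
    edges: frozenset  # each edge is a frozenset of its two endpoints


def add_vertex(g, v):
    return Graph(g.vertices | {v}, g.edges)


def remove_vertex(g, v):
    # Removing a vertex also removes every edge incident to it.
    kept = frozenset(e for e in g.edges if v not in e)
    return Graph(g.vertices - {v}, kept)


def add_edge(g, u, v):
    if u not in g.vertices or v not in g.vertices:
        raise ValueError("both endpoints must already be vertices")
    return Graph(g.vertices, g.edges | {frozenset((u, v))})


def remove_edge(g, u, v):
    return Graph(g.vertices, g.edges - {frozenset((u, v))})


def trajectory(start, operations):
    """Apply operations in order and return every intermediate graph."""
    snapshots = [start]
    for op, *args in operations:
        snapshots.append(op(snapshots[-1], *args))
    return snapshots


g0 = Graph(frozenset({1, 2}), frozenset())
steps = trajectory(g0, [(add_edge, 1, 2), (add_vertex, 3), (remove_vertex, 1)])
print(len(steps), sorted(steps[-1].vertices))
```

Keeping every snapshot makes it straightforward to compare a property of one graph with the same property of its successor, which is the kind of comparison the heredity analysis relies on.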
The article presents a methodology for predicting the success of project initiatives based on an analysis of the structural characteristics of project team communication networks. The study is based on data from task tracking systems (Jira, Trello) that reflect formal interactions between project participants. The research methodology includes a set of analytical tools: correlation analysis using Spearman’s and Pearson’s coefficients, regression modeling based on the Random Forest algorithm, and anomaly detection using Isolation Forest. The study revealed statistically significant correlations between key network metrics (betweenness centrality, network density, graph diameter) and project performance indicators (adherence to deadlines, budget, quality of results). A statistically significant negative correlation was found between excessive centralization and adherence to deadlines (ρ = –0.72), as well as a positive correlation between network density and quality of results (r = 0.68). The developed Random Forest model predicts project success with an accuracy of 84%. It was found that excessive centralization of communications reduces the probability of successful project implementation, while an optimal density of the communication network contributes to the achievement of project KPIs. The practical significance of the study lies in the possibility of early detection of project failure risks based on objective metrics of communication activity. The developed methodology, tested on IT company data, makes it possible not only to predict risks but also to formulate recommendations for optimizing team interactions. The results of the study are of interest to project managers, HR analysts, and data-driven management specialists.
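Two of the network metrics mentioned above, density and diameter, can be computed directly from a list of interactions. The sketch below assumes an undirected, unweighted, connected communication graph; the function names and the toy team are hypothetical, and the paper's actual pipeline is not reproduced here.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (person, person) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(adj):
    """Share of possible links that are actually present."""
    n = len(adj)
    m = sum(len(neighbors) for neighbors in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def eccentricity(adj, source):
    """Longest shortest-path distance from source, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    if len(dist) != len(adj):
        raise ValueError("graph is not connected")
    return max(dist.values())


def diameter(adj):
    return max(eccentricity(adj, s) for s in adj)


# A highly centralized team: every message goes through the lead.
star = build_adjacency([("lead", "a"), ("lead", "b"), ("lead", "c")])
print(density(star), diameter(star))
```

In the star-shaped example every path passes through one person, which is the kind of excessive centralization the abstract associates with missed deadlines.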
METHODS AND SYSTEMS OF INFORMATION PROTECTION, INFORMATION SECURITY
The article discusses the current problems of ensuring the cybersecurity of the digital ruble in the context of growing threats to the digital space. The relevance of the research is determined by the following factors: the need to develop a new cybersecurity concept for the digital ruble; an increase in the number of computer attacks on the credit and financial sector; an increase in the destructive information impact on financial organizations; the need to identify and prevent cyber threats; the importance of protection against hacker attacks, viruses and fraud; the need to optimize processes and improve the quality of financial payments. The purpose of the research is to develop a technical solution to ensure the cybersecurity of the digital ruble infrastructure. During the study, the following tasks were solved: the concepts of “trust infrastructure” and “cybersecurity” were clarified; cyberspace tools were analyzed in the context of protecting the digital ruble; proposals were developed to improve the cybersecurity of the trust infrastructure. The practical significance of the work lies in the possibility of applying the proposed solutions to strengthen the digital ruble protection system.
The early 2000s showed the true face of digital industry vendors. Significant changes in protocols, file systems, data transfer systems, and so on introduced a fundamentally new direction into the general concept of information security: ensuring data confidentiality from the inside. This direction remains relevant today. The rapidly growing number of undeclared functions raises objective concerns about digital risks in all countries without exception, even in the one where the main offices of GAFAM are located. This article does not consider circumventing the ban using VPN channels (officially authorized by the FSB of Russia using licensed resources from popular providers); instead, it presents an algorithm describing one of the reasons that prompted the government to ban the use of global graphic information exchangers. There are many examples of such algorithms, and as a result, the decision taken by the RCN is justified from the point of view of protecting the digital sovereignty of our state. As proof, a series of articles on the hidden functions of information systems and technologies is proposed. The initial stage is devoted to file systems for both stationary and mobile systems, starting with the simplest features that are overlooked even by modern data protection systems. The articles are intended only for educational and preventive purposes for specialists and do not include instructions for harming organizations.
In the context of digital transformation, AI is becoming a key factor in increasing combat effectiveness and ensuring the sustainable development of the Armed Forces of the Russian Federation. The article is devoted to the analysis of organizational aspects of the use of artificial intelligence in the field of defense of the Russian Federation, in the context of the need to find new ways to strengthen national security and defense capability.
The purpose of the article is to analyze the current problems of introducing artificial intelligence into the organizational processes of Russia’s defense activities and propose practical ways to solve them.
Methods. The research used methods of system analysis, comparative analysis, and a structural and functional approach, which made it possible to study the features of the integration of artificial intelligence into the organizational and managerial processes of the Russian Armed Forces. The choice of research methods is based on their complementary capabilities and allows for a comprehensive understanding of the processes of integrating artificial intelligence into the defense sector.
The problem. The paper identifies key problems in the implementation of artificial intelligence in the defense sector: technological (limitations of artificial intelligence, integration into existing systems); resource-related (financial and infrastructural inadequacies); human resources (shortage of specialists); institutional (regulatory, legal, and ethical issues); organizational (lack of interagency coordination).
Results. The author proposes specific organizational and operational measures aimed at overcoming these obstacles and effectively integrating new technologies. For the first time, the article comprehensively examines the organizational and management issues of integrating artificial intelligence in the context of the Armed Forces of the Russian Federation, taking into account modern challenges and national characteristics. It identifies current problems in the implementation of artificial intelligence in defense management processes, such as technological, personnel, and regulatory barriers.
Conclusions. For the effective implementation of artificial intelligence in the field of defense, it is necessary to solve the identified management problems by developing an appropriate regulatory framework, improving the personnel training system, strengthening interdepartmental cooperation, and adapting existing management practices to new technological realities. In addition, the need for the implementation of pilot projects and step-by-step adaptation of management practices has been identified, subject to constant monitoring and evaluation of the effectiveness of the measures applied.
MATHEMATICAL, STATISTICAL AND INSTRUMENTAL METHODS IN ECONOMICS
In the context of the rapid growth of scientific publications, the rise of interdisciplinary research, and increased competition in the academic environment, the tasks of analyzing and visualizing scientific activity are becoming especially relevant. Modern digital tools make it possible not only to track publication dynamics but also to identify key research areas and stable groups of authors that form scientific communities. One of the effective approaches in this area is a combination of topic modeling methods and network analysis based on graph theory. Scientific organizations often lack operational information about the internal structure of their research activities: which topics are developing most actively, what connections exist between authors and teams, and who acts as the “cores” of scientific communities. This is especially true for large universities, where hundreds of researchers produce a significant number of scientific papers. In such a situation, manual analysis becomes impossible, and automated text processing and graph analytics methods come to the rescue. This article is devoted to the analysis of the publication activity of authors of the Financial University. The purpose of the study is to identify the topics of scientific papers and to detect scientific communities in order to understand the development of research activities at higher education institutions, using the Financial University as an example. The study presents an approach to forming a data set of scientific publications by authors of the Financial University. Visualization of publication dynamics and keyword analysis were carried out, making it possible to identify common trends. The BERTopic model was used to solve the problem of text clustering and determining publication topics.
Identification of scientific communities was implemented through the construction and analysis of a co-authorship graph, which makes it possible to identify groups of researchers actively collaborating within certain scientific areas.
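A co-authorship graph of the kind described can be built from nothing more than the author lists of the publications. The sketch below weights each pair of authors by the number of joint papers and then groups authors by connected components, which is used here only as a simple stand-in for community detection; the abstract does not state which community detection method the study applied, and all names and the toy data are hypothetical.

```python
from itertools import combinations


def coauthorship_edges(papers):
    """Map each unordered author pair to its number of joint papers."""
    weights = {}
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] = weights.get((a, b), 0) + 1
    return weights


def author_groups(papers):
    """Group authors into connected components of the co-authorship graph."""
    adj = {a: set() for authors in papers for a in authors}
    for a, b in coauthorship_edges(papers):
        adj[a].add(b)
        adj[b].add(a)
    seen, groups = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            u = stack.pop()
            if u in group:
                continue
            group.add(u)
            stack.extend(adj[u] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical author lists, one per publication.
papers = [["A", "B"], ["B", "C"], ["D", "E"], ["F"], ["A", "B"]]
print(coauthorship_edges(papers))
print(author_groups(papers))
```

On real data, connected components tend to produce one very large group, so a modularity-based method would usually be preferred; the edge weights computed above are the natural input for such a method.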
The article presents the application of cluster analysis, one of the most common machine learning methods, to the study of the level of regional transport development, the segmentation of regions by existing demand for transport services, and the determination of the most important components of the transport system for particular regions. The purpose of the study is to divide the regions of the country into clusters that are relatively homogeneous in the main aspects of demand for transport services. Each cluster combines regions with similar economic and geographical characteristics, which determines similarity in the most demanded modes of transport and transport infrastructure facilities. The research methodology is based on machine learning algorithms and the application of mathematical metrics to sets of statistical data. The research includes the selection of significant factors, the analysis and normalization of statistical data, and the application of various clustering methods. In normalization, the data are converted to a single scale from 0 to 100 points with outliers excluded. As a result of cluster analysis, the regions are distributed into four main clusters. Different variants of cluster analysis can be implemented in spreadsheet editors and statistical packages. Based on the clustering results, each cluster is interpreted and the common characteristics of the regions within it are identified. The prospects of the study include annual updating based on new statistical data. The results can be used to analyze and develop the country’s transport system, identify priorities in the development of regional transport infrastructure, and assess the significance and necessity of regional transport projects.
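The normalization step described above (a common 0 to 100 scale with outliers excluded) can be sketched as follows. The abstract does not say how outliers are detected, so this example assumes Tukey's interquartile-range rule with the conventional factor 1.5; the function name, the rule, and the sample values are all assumptions made for illustration.

```python
import statistics


def scale_0_100(values, k=1.5):
    """Min-max scale values to 0..100, marking outliers as None.

    Outliers are values outside [Q1 - k*IQR, Q3 + k*IQR] (an assumed rule).
    The scale is fitted only on the values that are kept.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    kept = [v for v in values if low <= v <= high]
    lo, hi = min(kept), max(kept)
    span = hi - lo
    scaled = []
    for v in values:
        if not low <= v <= high:
            scaled.append(None)  # excluded as an outlier
        elif span == 0:
            scaled.append(0.0)   # all kept values are identical
        else:
            scaled.append(100.0 * (v - lo) / span)
    return scaled


# Hypothetical indicator values for six regions; the last one is extreme.
print(scale_0_100([10, 12, 14, 16, 18, 100]))
```

Fitting the scale after excluding outliers prevents a single extreme region from compressing all other regions into a narrow band near zero, which would distort distance-based clustering.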