A novel neural network model for rolling linear guide pair optimization design

Song, Chenghao; Du, Weiqi; Li, Shuxin; Han, Junjun

doi:10.5194/ms-17-141-2026

Articles | Volume 17, issue 1

https://doi.org/10.5194/ms-17-141-2026

Research article

26 Feb 2026

Abstract

This paper addresses the inefficiency of conventional design methods for rolling linear guide pairs, which rely on finite-element analysis and require large amounts of data. This study pioneers a few-shot learning framework based on meta-learning, employing a model-agnostic meta-learning strategy to train an inverted-bottleneck residual fully connected network. The network achieves high prediction accuracy (R²>0.91) with only 126 samples. An integrated parametric platform reduces modelling time from 50 min to 2–3 min, significantly improving efficiency. Optimization via sequential least squares quadratic programming demonstrates a 57 % reduction in vertical deformation. However, the current work focuses on static performance optimization, leaving dynamic aspects for future research. This approach offers an efficient and data-efficient paradigm for precision mechanical component design.

Received: 06 Nov 2025 – Revised: 29 Dec 2025 – Accepted: 22 Jan 2026 – Published: 26 Feb 2026

1 Introduction

The rolling linear guide is a core transmission component in precision mechanical equipment. Its structural performance directly determines the positioning accuracy and service life of high-end equipment, such as computer numerical control (CNC) machine tools and industrial robots (Yang et al., 2020; Wang et al., 2018; Zha et al., 2024; Quan and Zhao, 2024). The traditional guide design method relies heavily on empirical trial and error (Phuyal et al., 2020; Sahoo and Lo, 2022; Liu et al., 2023; Staroszyk et al., 2024). Designers repeatedly adjust geometric parameters and rerun computer-aided engineering (CAE) analyses to improve mechanical performance, which is very time-consuming (Zou and Wang, 2015; Tong et al., 2020). Therefore, a rapid and effective design method should be developed for the structural optimization of rolling linear guides.

Several studies have investigated rolling linear guide pairs. For instance, Shimizu (1998) and Ohta and Hayashi (2000) developed rigid models for rolling linear guides based on Hertz contact theory. Cheng (2021) introduced a novel stiffness matrix for roller linear guides, investigating the relationship between non-uniform load distribution and contact states. Xu et al. (2023) developed an analytical dynamic model for flexible linear guides, investigating the vibration behaviour of carriages under external periodic excitations in multiple directions. Compared to numerical analysis methods, the finite-element method (FEM) provides higher computational efficiency and superior adaptability. For example, Sun et al. (2015) established a high-fidelity finite-element model accounting for ball-raceway contact. Wei et al. (2022) used FEM to characterize the impact of structural parameters on guide pair mechanical performance. Tong et al. (2019) employed FEM to develop a 5-degree-of-freedom model for calculating the fully occupied stiffness matrix of linear ball guides. To date, few studies have paid attention to the structural optimization of rolling linear guides. In addition, structural optimization based on FEM results is time-consuming and labour-intensive because FEM analysis does not directly give explicit relationships between structure and performance.

Traditional optimization algorithms such as particle swarm optimization (PSO) and the genetic algorithm (GA) are highly effective in handling complex problems. However, they require thousands of computations on high-precision models, so the optimization cycle is very time-consuming. For example, Tong et al. (2020) employed a particle swarm optimization algorithm, using 2000 particles and running for 150 iterations to optimize the design of the linear guide. Wang et al. (2016) combined finite-element analysis with a genetic algorithm, conducting 1580 calculations to optimize the structure of a wind turbine blade. In recent years, machine learning techniques (Ramu et al., 2022) have offered another option for designers. For instance, Kazem et al. (2025) proposed RAGN-R, a surrogate multi-subject ensemble machine learning method, for estimating the mechanical properties of advanced materials. This method can directly establish the connection between structure and performance, enabling highly accurate predictions. Many types of surrogate model have been developed to date. For instance, Fishwick (1989) used a multi-layer neural network model as a simulation model for a basic ballistic model. Jiang et al. (2025) proposed a multi-fidelity fully connected neural network (FCNN) model to optimize the fin structure of rocket projectiles. Roy et al. (2019) developed a support vector regression (SVR) model for the reliability estimation of engineering structures. Nagendra et al. (2005) developed a response surface model based on the optimal noise, vibration, and harshness (NVH) response sensitivity data for optimizing tailored rolled blank (TRB) components. Machine learning, particularly surrogate modelling, provides a novel solution for the optimization of rolling linear guide pairs. A surrogate model directly maps the complex nonlinear relationships between geometric parameters (contact angle and curvature ratio) and mechanical properties (including stiffness and stress). It reduces the time otherwise spent on FEM from hours to minutes, saving significant iterative computation costs in optimization design.

However, surrogate models typically demand a large number of samples to construct datasets, resulting in substantial computational and workload burdens. To enhance efficiency, it is necessary to reduce sample size requirements without losing accuracy. Few-shot learning methods (Huisman et al., 2021) have been developed for this purpose; they offer strong generalization capabilities, minimal data requirements, and high learning efficiency. Su et al. (2022) developed a novel DRHRML (data reconstruction hierarchical recurrent meta-learning) method for few-shot learning in intelligent bearing fault detection. Liu et al. (2016) proposed a support vector machine (SVM) gait controller trained with limited samples for stable bipedal robotic walking. In the field of few-shot learning, model-agnostic meta-learning (MAML) learns a highly task-sensitive model initialization, enabling rapid adaptation to new tasks with minimal examples (Fallah et al., 2020). To validate the algorithm selection rationale in this study, a comparative analysis was conducted between MAML and other algorithms. The Reptile algorithm (Nichol and Schulman, 2018) employs first-order approximations for improved computational efficiency, but its predictive accuracy may be inadequate for complex nonlinear regression problems. Meta-SGD (Li et al., 2017) introduces learnable learning rates, which increases the risk of overfitting when handling extremely small datasets. Transfer learning requires a large and relevant source task, which is often difficult to find in mechanical engineering. Therefore, this study selects MAML for application in the optimization design of rolling linear guide pairs. With only 126 sample sets, it achieves a prediction accuracy with R² exceeding 0.91.

Rolling linear guide pairs serve as core transmission components in precision mechanical equipment. The design and performance metrics must satisfy the stringent requirements of high-end application scenarios. The details are outlined below.

High positioning accuracy and motion smoothness. High stiffness and low deformation are core requirements for ensuring precision.
High load capacity and long service life. Controlling the maximum contact stress is the key to ensuring service life.
High reliability and stability. The design must ensure consistent and predictable performance under various working conditions, preventing performance degradation caused by improper parameter matching.

This paper proposes a meta-learning-based inverted-bottleneck residual fully connected network model. Surrogate models for guide pair stiffness and maximum contact stress are established, and the sequential least squares quadratic programming (SLSQP) algorithm is employed to solve the structural parameter optimization problem. In addition, this study develops a parametric design platform for guide pairs, which enhances the efficiency of dataset construction through parametric modelling and automated FEM.

This paper is organized as follows: Sect. 2 establishes the response surface model for predicting the mechanical performance of guide pairs, detailing the few-shot learning process. Section 3 presents the developed parametric modelling platform and couples it with automated FEM, establishing an integrated design–simulation workflow. Section 4 presents a comprehensive analysis of the results. Finally, Sect. 5 summarizes the study.

2 Meta-learning-based inverted-bottleneck residual fully connected network model

Few-shot learning is a machine learning paradigm that achieves efficient learning with limited training samples. This section integrates the inverted-bottleneck structure with a fully connected network and employs the model-agnostic meta-learning (MAML) algorithm to train this model, achieving rapid cross-task adaptation capability.

2.1 Enhanced Latin hypercube sampling for input sample set construction

To ensure the coverage of the parameter space, a sampling strategy is introduced in this section. The Latin hypercube sampling (LHS) method is employed to ensure a uniform and representative distribution of input samples within the parameter space. In addition, a Euclidean distance constraint is introduced to prevent samples from being excessively close to each other.

Before conducting LHS, the value ranges of all parameters must be normalized to the interval [0,1]. For a raw parameter x_j∈ [a_j,b_j], the normalized value $\overline{x_{j}}$ is calculated as follows:

\begin{matrix} (1) & \overline{x_{j}} = \frac{x_{j} - a_{j}}{b_{j} - a_{j}} . \end{matrix}

Specifically, the value range of each dimension is uniformly divided into N equal subintervals. For each dimension j, a random permutation π_j is generated, where π_j(i) denotes the subinterval number assigned to the i-th sample in the j-th dimension. Based on this, the specific value X_ij for each sample point can be obtained through the following calculation:

\begin{matrix} (2) & X_{i j} = \frac{π_{j} (i) - u_{i j}}{N}, \end{matrix}

where u_ij is a random number sampled from the uniform distribution U(0,1). Following LHS sampling, the parameters must undergo de-normalization processing, linearly mapping X_ij from [0,1] to the actual parameter range [a_j,b_j] for each variable:

\begin{matrix} (3) & S_{i j} = a_{j} + (b_{j} - a_{j}) \cdot X_{i j}, \end{matrix}

where S_ij denotes the actual parameter values output by the final Latin hypercube sampling. When adding new samples, a new sample point must not lie too close to any existing sample, to avoid spatial clustering and data redundancy. Therefore, a Euclidean distance constraint (Parhizkar, 2015) is introduced to ensure sample dispersion:

\begin{matrix} (4) & D_{g} (p, X) = \min_{k = 1, \dots, N} ∥ p - x_{k} ∥_{2} > δ_{g}, \end{matrix}

where p denotes the new candidate sample, X represents the set of existing samples, N indicates the sample count, $∥ \cdot ∥_{2}$ is the Euclidean norm, and δ_g=0.5 denotes the global distance threshold. The global threshold δ_g=0.5 was set based on empirical rules commonly used in experimental design. This setting aims to ensure a sufficient minimum Euclidean distance between sample points in the normalized parameter space. It effectively prevents sample clustering while guaranteeing space-filling properties. However, it should be noted that, for parameters with small value ranges (such as the ball-positioning angle θ and the interference fit δ in Sect. 4.2), a threshold of 0.5 is excessive. Thus, these parameters should be separately assigned a threshold of δ=0.05.
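The sampling steps of Eqs. (1)–(4) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the six-parameter bounds, and the rejection loop used to add new points are all hypothetical, and the separate threshold of 0.05 for small-range parameters is not modelled here.

```python
import numpy as np


def normalize(s, lower, upper):
    """Eq. (1): map raw values in [a_j, b_j] to [0, 1]."""
    return (s - lower) / (upper - lower)


def latin_hypercube(n_samples, n_dims, rng):
    """Eq. (2): X_ij = (pi_j(i) - u_ij) / N, one sample per subinterval."""
    x = np.empty((n_samples, n_dims))
    for j in range(n_dims):
        perm = rng.permutation(n_samples) + 1  # pi_j(i) takes values 1..N
        u = rng.random(n_samples)              # u_ij drawn from U(0, 1)
        x[:, j] = (perm - u) / n_samples
    return x


def denormalize(x, lower, upper):
    """Eq. (3): S_ij = a_j + (b_j - a_j) * X_ij."""
    return lower + (upper - lower) * x


def min_distance(p, existing):
    """Left-hand side of Eq. (4): distance from p to its nearest existing sample."""
    return np.min(np.linalg.norm(existing - p, axis=1))


def add_samples(existing, n_new, delta, rng, max_tries=10_000):
    """Append up to n_new normalized points that satisfy Eq. (4).

    Candidates are drawn uniformly in [0, 1]^d and rejected when they lie
    within delta of any stored sample (an assumption of this sketch).
    """
    samples = existing.copy()
    added = 0
    for _ in range(max_tries):
        if added == n_new:
            break
        p = rng.random(samples.shape[1])
        if min_distance(p, samples) > delta:
            samples = np.vstack([samples, p])
            added += 1
    return samples


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical bounds for six design parameters (placeholders only).
    lower = np.full(6, 1.0)
    upper = np.full(6, 3.0)
    x = latin_hypercube(10, 6, rng)          # initial normalized design
    x = add_samples(x, 5, delta=0.5, rng=rng)  # enrich with distance check
    print(denormalize(x, lower, upper).shape)
```

The distance check is applied in the normalized space, so a single threshold treats all parameters on the same scale; how the smaller per-parameter threshold is combined with the global one is left open here because the text does not specify it.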

2.2 Inverted-bottleneck residual fully connected network model

Deep neural networks (DNNs) are a significant branch of artificial neural networks (Sze et al., 2017). This section utilizes a fully connected deep neural network (FCNN). By integrating an inverted-bottleneck structure and residual connections, a novel deep residual multi-layer perceptron (MLP) architecture is developed. The cascade architecture – comprising an input layer, multiple residual blocks, and an output layer – effectively captures nonlinear mappings between rolling linear guide design parameters and mechanical performance.

The structural design of neural networks is a pivotal factor in determining their performance. In traditional bottleneck architectures (e.g. ResNet's Bottleneck), a “wide–narrow–wide” pattern is typically adopted. High-dimensional inputs are compressed into a low-dimensional space for feature extraction and then expanded back to the original high-dimensional space. In contrast, inverted-bottleneck structures (e.g. MobileNetV2) employ the opposite “narrow–wide–narrow” paradigm: low-dimensional inputs are expanded into a high-dimensional space for feature extraction and then projected back to a low-dimensional space (Sandler et al., 2018).

Figure 1 illustrates the dimensional transformation schematic of the bottleneck structure and the inverted-bottleneck structure. The inverted-bottleneck structure overcomes the limitations of feature extraction in low-dimensional spaces. This enables the model to possess stronger representational capabilities and capture higher-level features more readily.

https://ms.copernicus.org/articles/17/141/2026/ms-17-141-2026-f01

Figure 1. Comparison of dimension transformation with and without inverted-bottleneck structure.
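The “narrow–wide–narrow” pattern with a residual connection can be illustrated by a small NumPy forward pass. This is a sketch only: the ReLU activation, the expansion factor, the layer widths, the number of blocks, the weight initialization, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def init_block(width, expansion, rng):
    """Weights for one block: width -> width * expansion -> width."""
    hidden = width * expansion
    return {
        "W_up": rng.normal(0.0, np.sqrt(2.0 / width), (width, hidden)),
        "b_up": np.zeros(hidden),
        "W_down": rng.normal(0.0, np.sqrt(2.0 / hidden), (hidden, width)),
        "b_down": np.zeros(width),
    }


def block_forward(x, p):
    """Inverted-bottleneck residual block (fully connected)."""
    h = relu(x @ p["W_up"] + p["b_up"])   # narrow -> wide expansion
    y = h @ p["W_down"] + p["b_down"]     # wide -> narrow projection
    return x + y                          # residual (skip) connection


def init_network(n_in, width, n_out, n_blocks, expansion, rng):
    """Input layer, a cascade of residual blocks, and an output layer."""
    return {
        "W_in": rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, width)),
        "b_in": np.zeros(width),
        "blocks": [init_block(width, expansion, rng) for _ in range(n_blocks)],
        "W_out": rng.normal(0.0, np.sqrt(1.0 / width), (width, n_out)),
        "b_out": np.zeros(n_out),
    }


def network_forward(x, net):
    h = relu(x @ net["W_in"] + net["b_in"])
    for p in net["blocks"]:
        h = block_forward(h, p)
    return h @ net["W_out"] + net["b_out"]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical sizes: 6 design parameters in, 2 performance metrics out.
    net = init_network(n_in=6, width=16, n_out=2, n_blocks=3, expansion=4, rng=rng)
    batch = rng.random((5, 6))
    print(network_forward(batch, net).shape)
```

Because each block adds its output to its input, a block whose projection weights are zero reduces to the identity, which is one reason residual connections are commonly used to ease the training of deeper stacks.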
