Previous studies have reported many feature selection methods for identifying cancer signatures from RNA expression profiles. However, these methods often produce unreliable signatures, for four main reasons. First, classifiers rather than regression models are often inappropriately applied in prognostic survival analysis. Second, the unknown distribution of the samples can lead to an ineffective choice of regression model. Third, high-dimensional expression profiles with small sample sizes typically result in poor predictive performance of the selected regression model. Fourth, control for clinical variables is usually overlooked.
To address these problems, we propose a novel feature selection framework that uses ensemble regression to identify cancer prognostic signatures. Ensemble regression avoids a key limitation of classification models, which reduce survival time to categorical labels and thereby lose the original continuous information. The framework also incorporates up-sampling to increase the sample size and uses a bagging strategy that randomly selects samples and features, addressing the challenges posed by high-dimensional data and small sample sizes. In addition, it controls for clinical variables to ensure stable feature selection and reliable prediction results.
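The bagging step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the base regressors, the up-sampling technique, or the scoring rule, so every name and parameter here (`bagged_feature_scores`, `upsample_factor`, `feature_fraction`) is hypothetical. The sketch assumes a single clinical covariate, treats up-sampling as bootstrap resampling to a multiple of the original sample size, uses a one-feature linear regression as each base learner, and ignores censoring of survival times.

```python
import random


def ols_residuals(x, y):
    """Residuals of the least-squares line y ~ x (single covariate)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return [(b - mean_y) - slope * (a - mean_x) for a, b in zip(x, y)]


def correlation(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def bagged_feature_scores(features, clinical, survival, n_rounds=200,
                          upsample_factor=2, feature_fraction=0.5, seed=0):
    """Score each feature by its average adjusted association with survival.

    features: dict mapping feature name -> list of expression values
    clinical: list with one clinical covariate per sample (assumption)
    survival: list of continuous survival times (censoring ignored)
    """
    rng = random.Random(seed)
    names = sorted(features)
    n = len(survival)
    k = max(1, int(len(names) * feature_fraction))
    totals = {name: 0.0 for name in names}
    counts = {name: 0 for name in names}

    for _ in range(n_rounds):
        # Up-sampling (assumed form): bootstrap more rows than the original n.
        idx = [rng.randrange(n) for _ in range(n * upsample_factor)]
        clin = [clinical[i] for i in idx]
        surv = [survival[i] for i in idx]
        # Control for the clinical variable by regressing it out of survival.
        surv_res = ols_residuals(clin, surv)

        # Bagging over features: each round sees a random feature subset.
        for name in rng.sample(names, k):
            feat = [features[name][i] for i in idx]
            feat_res = ols_residuals(clin, feat)
            # Standardized slope of the one-feature regression on residuals.
            totals[name] += abs(correlation(feat_res, surv_res))
            counts[name] += 1

    return {name: totals[name] / counts[name] if counts[name] else 0.0
            for name in names}
```

Features with the highest average score across rounds would be the candidate signature. A fuller version would combine several regressor types and handle censored survival times, which this sketch does not attempt.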
Experimental results demonstrate that the method addresses these issues effectively and yields reliable prognostic signatures. The ensemble regression method significantly improves predictive performance and adapts robustly to unknown sample distributions.
The proposed ensemble regression model outperforms classification and single regressors in prognostic survival analysis by preserving continuous survival information, adapting to sample distribution, and benefiting from controlled variables. Using TCGA-GBM data, six prognostic miRNAs were validated as reliable biomarkers, whereas mRNA-based models showed limited robustness due to high dimensionality and small sample size.
The proposed feature selection framework offers a robust approach to improving the identification of cancer prognostic signatures, enhancing predictive accuracy in prognostic survival analysis.