Mycobacterium fortuitum is a rapidly growing human pathogenic bacterium that has been linked to a number of clinical conditions. Its ability to quickly develop intricate biofilms makes its treatment challenging. Development of drug resistance has been reported in cases of M. fortuitum, further reducing treatment options available against the pathogen.
In order to identify the proteins involved in biofilm development, this work attempted to analyze the real-time proteome data of M. fortuitum using machine learning strategies. The aim of the study was to provide novel drug targets that may be used to treat patients more quickly and efficiently.
The proteomic data were analyzed using Support Vector Machine (SVM), Artificial Neural Network (ANN), and k-Nearest Neighbors (kNN) techniques. Over-expressed and under-expressed proteins linked to biofilm formation made up the training set. The trained models were then evaluated on abundant proteins identified in the M. fortuitum proteome analysis. Pre-processing and optimization were applied to improve model performance.
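As an illustration of the kNN step only, the following minimal sketch classifies proteins by majority vote among their nearest labeled neighbors and reports accuracy on held-out samples. It is not the pipeline used in the study: the two features (a log2 fold change and a scaled abundance), the toy values, the labels, and the function names `knn_predict` and `accuracy` are all hypothetical and chosen for demonstration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k=3):
    """Fraction of test samples whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return correct / len(test_y)


# Hypothetical toy data: (log2 fold change, scaled abundance) per protein.
# Both strongly over- and under-expressed proteins are labeled "biofilm".
train_X = [(2.1, 0.9), (1.8, 1.2), (-2.0, 0.8),
           (0.1, 0.2), (-0.2, 0.1), (0.3, -0.1)]
train_y = ["biofilm", "biofilm", "biofilm", "other", "other", "other"]

# Hypothetical held-out proteins used to score the classifier
test_X = [(1.9, 1.0), (0.0, 0.0)]
test_y = ["biofilm", "other"]

score = accuracy(train_X, train_y, test_X, test_y, k=3)
```

In practice the value of k and the feature scaling would be tuned during the pre-processing and optimization stage, and the same accuracy measure would be computed for each competing model to compare them.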
The kNN algorithm achieved the highest accuracy at 82.98%, followed by SVM at 82.75% and ANN at 78%. The performance of these models was further verified using other machine learning methods, including Random Forest, Naive Bayes, and Logistic Regression. The results showed that kNN consistently produced the best accuracy for protein prediction.
The study shows that machine learning techniques, in particular kNN, can successfully analyze proteome data obtained from M. fortuitum to identify proteins associated with biofilm formation. This methodology may be used to predict drug targets from a proteome database. Identifying drug targets can help in designing better treatment strategies against the pathogen.