
Diabetes, a chronic medical condition, affects millions of people worldwide and requires consistent monitoring of blood glucose levels (BGLs). Traditional invasive methods for BGL monitoring can be challenging and painful for patients. This study introduces a non-invasive, deep learning (DL)-based approach to estimate BGL using photoplethysmography (PPG) signals. Specifically, a Deep Sparse Capsule Network (DSCNet) model is proposed to provide accurate and robust BGL monitoring. The proposed model's workflow includes data collection, preprocessing, feature extraction, and prediction. A hardware module was designed using a PPG sensor and Raspberry Pi to collect patient data. In preprocessing, a Savitzky-Golay filter and a moving average filter were applied to remove noise while preserving the pulse form and high-frequency components. The DSCNet model was then applied to predict the blood glucose level. Two models were developed for prediction: a baseline model, DSCNet, and an enhanced model, DSCNet with self-attention. DSCNet's performance was evaluated using Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Relative Difference (MARD), and coefficient of determination (R<sup>2</sup>), yielding values of 3.022, 0.05, 0.058, 0.062, 10.81, and 0.98, respectively.
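The evaluation metrics listed above have standard textbook definitions; a minimal pure-Python sketch is given below. Note that MAPE and MARD coincide under the most common definitions (both are mean absolute error relative to the reference value, in percent), so the differing values reported in the abstract suggest the study uses a different scaling convention for one of them.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute common regression error metrics for BGL estimation.

    MAPE and MARD are both reported in percent here; conventions for
    these two metrics vary between studies.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100.0 * sum(abs(e) / t for t, e in zip(y_true, errors)) / n
    mard = mape  # identical under the most common definitions
    mean_t = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAPE": mape, "MAE": mae, "MSE": mse,
            "RMSE": rmse, "MARD": mard, "R2": r2}
```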
A comparison of four protein model-building pipelines (ARP/wARP, Buccaneer, PHENIX AutoBuild and SHELXE) was performed using data sets from 202 experimentally phased cases, both with the data as observed and truncated to simulate lower resolutions. All pipelines were run using default parameters. Additionally, an ARP/wARP run was completed using models from Buccaneer. All pipelines achieved nearly complete protein structures and low R<sub>work</sub>/R<sub>free</sub> at resolutions between 1.2 and 1.9 Å, with PHENIX AutoBuild and ARP/wARP producing slightly lower R factors. At lower resolutions, Buccaneer leads to significantly more complete models.
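The R<sub>work</sub>/R<sub>free</sub> values used to compare the pipelines measure the agreement between observed and calculated structure-factor amplitudes. The standard formula can be sketched as follows; R<sub>work</sub> applies it to the reflections used in refinement, while R<sub>free</sub> applies the same formula to a withheld test set.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R factor over structure-factor amplitudes:
    sum |Fobs - Fcalc| / sum Fobs.  Lower values indicate better
    agreement between the model and the diffraction data."""
    numerator = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    denominator = sum(f_obs)
    return numerator / denominator
```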
Proteins are macromolecules that perform essential biological functions which depend on their three-dimensional structure. Determining this structure involves complex laboratory and computational work. For the computational work, multiple software pipelines have been developed to build models of the protein structure from crystallographic data. Each of these pipelines performs differently depending on the characteristics of the electron-density map received as input. Identifying the best pipeline to use for a protein structure is difficult, as the pipeline performance differs significantly from one protein structure to another. As such, researchers often select pipelines that do not produce the best possible protein models from the available data. Here, a software tool is introduced which predicts key quality measures of the protein structures that a range of pipelines would generate if supplied with a given crystallographic data set. These measures are crystallographic quality-of-fit indicators based on included and withheld observations, and structure completeness. Extensive experiments carried out using over 2500 data sets show that the tool yields accurate predictions for both experimental phasing data sets (at resolutions between 1.2 and 4.0 Å) and molecular-replacement data sets (at resolutions between 1.0 and 3.5 Å). The tool can therefore provide a recommendation to the user concerning the pipelines that should be run in order to proceed most efficiently to a depositable model.
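The recommendation step described above can be illustrated with a small sketch: given the predicted quality measures for each pipeline, rank the pipelines and return the most promising candidates. The function name, data layout, and ranking rule (completeness first, predicted R<sub>free</sub> as a tie-breaker) are illustrative assumptions, not the tool's actual interface.

```python
def recommend_pipelines(predictions, top_n=2):
    """Rank pipelines by predicted structure completeness (descending),
    breaking ties with predicted R_free (ascending).

    `predictions` maps pipeline name -> {"completeness": ..., "r_free": ...}.
    Hypothetical interface for illustration only.
    """
    ranked = sorted(predictions.items(),
                    key=lambda kv: (-kv[1]["completeness"], kv[1]["r_free"]))
    return [name for name, _ in ranked[:top_n]]
```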
For the last two decades, researchers have worked independently to automate protein model building, and four widely used software pipelines have been developed for this purpose: ARP/wARP, Buccaneer, Phenix AutoBuild and SHELXE. Here, the usefulness of combining these pipelines to improve the built protein structures by running them in pairwise combinations is examined. The results show that integrating these pipelines can lead to significant improvements in structure completeness and R<sub>free</sub>. In particular, running Phenix AutoBuild after Buccaneer improved structure completeness for 29% and 75% of the data sets that were examined at the original resolution and at a simulated lower resolution, respectively, compared with running Phenix AutoBuild on its own. In contrast, Phenix AutoBuild alone produced better structure completeness than the two pipelines combined for only 7% and 3% of these data sets.
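The percentages quoted above come from counting, over all data sets, how often the pairwise combination improves on the single pipeline and how often it does worse. A minimal sketch of that tally, assuming per-data-set completeness scores are available as parallel lists:

```python
def improvement_fraction(single, combined):
    """Return the fractions of data sets for which the combined run
    yields strictly higher completeness than the single pipeline,
    and for which the single pipeline does strictly better."""
    n = len(single)
    better = sum(1 for s, c in zip(single, combined) if c > s)
    worse = sum(1 for s, c in zip(single, combined) if s > c)
    return better / n, worse / n
```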
Tracing the backbone is a critical step in protein model building, as incorrect tracing leads to poor protein models. Here, a neural network trained to identify unfavourable fragments and remove them from the model-building process in order to improve backbone tracing is presented. Moreover, a decision tree was trained to select an optimal threshold to eliminate unfavourable fragments. The neural network was tested on experimental phasing data sets from the Joint Center for Structural Genomics (JCSG), recently deposited experimental phasing data sets (from 2015 to 2021) and molecular-replacement data sets. The experimental results show that using the neural network in the Buccaneer protein-model-building software can produce significantly more complete protein models than those built using Buccaneer alone. In particular, Buccaneer with the neural network built protein models with a completeness that was at least 5% higher for 25% and 50% of the original and truncated resolution JCSG experimental phasing data sets, respectively, for 28% of the recently collected experimental phasing data sets and for 43% of the molecular-replacement data sets.
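The filtering step described above amounts to scoring each candidate fragment with the neural network and discarding those below the threshold chosen by the decision tree. A minimal sketch, with a hypothetical data layout (fragments and their predicted scores as parallel lists):

```python
def filter_fragments(fragments, scores, threshold):
    """Keep only fragments whose predicted quality score meets the
    chosen threshold; the rest are excluded from backbone tracing.
    Illustrative sketch, not Buccaneer's actual interface."""
    return [frag for frag, score in zip(fragments, scores)
            if score >= threshold]
```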
Evaluating the quality of a publication can depend on several factors, such as the importance of its contribution, the analysis of its results, and its number of citations, which cannot be known before the article is published. In this paper, we fine-tuned BERT and Longformer models to classify scientific articles into citation groups based on their significance, using the article metadata, author affiliations, Web of Science journal categories, and the article full text. We collected metadata for 4234 open-access articles published in 2021 in Web of Science journals, and also retrieved the full text of each article from its publisher. We fine-tuned the models on these 4234 articles, with citation counts taken in 2023, when the articles had completed two years since publication. We evaluated the models using precision, recall, and the F score; Longformer slightly outperformed BERT, scoring 0.63 on all three metrics.
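For a multi-class task such as grouping articles by citation band, precision, recall, and F score are typically macro-averaged over the classes. A pure-Python sketch of that computation (equivalent to what library routines such as scikit-learn's provide):

```python
from collections import Counter

def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 for multi-class
    classification: compute each metric per class, then average."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but the true class was t
            fn[t] += 1  # missed an instance of the true class t
    precisions, recalls, f1s = [], [], []
    for lab in labels:
        prec = tp[lab] / (tp[lab] + fp[lab]) if tp[lab] + fp[lab] else 0.0
        rec = tp[lab] / (tp[lab] + fn[lab]) if tp[lab] + fn[lab] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```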