Optimization of LC-MS/MS Methods for High-Throughput Protein and Glycopeptide Identification in Complex Biological Samples: The Experience of the GlycoProteomics and Structural Mass Spectrometry Facility

Claudia Blanes Angeli1, Hellen Paula Valerio1, Giuseppe Palmisano1

1. ICB-USP, Instituto de Ciências Biomédicas - Universidade de São Paulo; Av. Prof. Lineu Prestes, 1374, ICB-II, São Paulo-SP, Brazil, CEP: 05508-000

Introduction:
High-throughput proteomics and glycoproteomics are essential for advancing our understanding of complex biological systems and disease mechanisms. Data-dependent acquisition (DDA) and data-independent acquisition (DIA) mass spectrometry enable robust protein identification and quantification. The choice of strategy depends on the biological sample and experimental design, requiring dedicated optimization of LC-MS/MS conditions. Additionally, the analysis of protein post-translational modifications, such as glycosylation, demands specialized sample preparation, mass spectrometry settings, and data analysis pipelines due to their structural diversity and low abundance in complex matrices. This study addresses both challenges by optimizing LC-MS/MS methods for broad proteome coverage and direct glycopeptide identification without enrichment.
Objectives:
1. Optimize peptide separation and DIA mass spectrometry settings to improve proteome coverage and quantification precision.
2. Develop LC-MS/MS methods for identifying glycopeptides in complex and moderately complex biological samples without enrichment.
Materials and Methods:
For DIA-based proteomics, SH-SY5Y neuroblastoma cell lysates were digested with trypsin and analyzed on a Vanquish Neo UHPLC coupled to an Orbitrap Ascend Tribrid mass spectrometer. DIA parameters were optimized using in silico and gas-phase fractionation (GPF) libraries, and performance was evaluated across four software tools (MSFragger-DIA, DIA-NN, MaxDIA, EncyclopeDIA).
For glycoproteomics, three sample types (neuroblastoma cell lysate, human serum, and IgG) were digested with trypsin and analyzed on the Orbitrap Ascend. Key parameters, including MS1/MS2 resolution, AGC targets, dynamic exclusion, and fragmentation methods (HCD, CID, EThcD), were systematically varied to optimize LC-MS/MS performance. Gas-phase fractionation was also applied. No glycopeptide enrichment was performed, so the method is directly applicable to biomarker discovery workflows. All experiments were conducted at the GlycoProteomics and Structural Mass Spectrometry Facility (GPS-MS).
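As an illustration of how such a parameter screen could be laid out, the sketch below enumerates candidate acquisition methods from a grid of settings. It is a minimal example under stated assumptions: every numeric value in the grid is a hypothetical placeholder, not a setting reported here, and the abstract does not state whether a full factorial or a one-factor-at-a-time design was used.

```python
from itertools import product

# Hypothetical placeholder values for illustration only; the settings
# actually tested at the facility are not listed in this abstract.
PARAMETER_GRID = {
    "ms1_resolution": [60000, 120000],
    "ms2_resolution": [15000, 30000],
    "agc_target_pct": [100, 200],
    "dynamic_exclusion_s": [15, 30],
    "fragmentation": ["HCD", "CID", "EThcD"],
}


def enumerate_methods(grid):
    """Return one dict per combination of settings (full factorial design)."""
    names = list(grid)
    combos = product(*(grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


methods = enumerate_methods(PARAMETER_GRID)
print(f"{len(methods)} candidate methods")
print(methods[0])
```

A full factorial grid grows quickly, so in practice a screen like this is often reduced by fixing most parameters and varying one or two at a time.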
Results:
For DIA-based proteomics, the optimized method identified over 6,500 proteins in single-shot injections, a substantial improvement over DDA (4,000 proteins). Quantification precision was high (mean CV: 8%), with only 1.5% missing values. DIA-NN achieved the highest number of protein identifications, while EncyclopeDIA minimized missing values.
For glycoproteomics, neuroblastoma samples yielded 20 high-confidence glycopeptides, serum yielded 222, and IgG yielded 41. HCD fragmentation at 36% collision energy (CE) was most effective, with stepped CE further improving IgG analysis.
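For readers who want to reproduce metrics of this kind, the sketch below shows one common way to compute a mean coefficient of variation (CV) and a missing-value percentage from a protein-by-replicate intensity table. It is an assumption about how such metrics are typically derived, not a description of the facility's pipeline; the function name `precision_summary` and the example intensities are hypothetical.

```python
import math
from statistics import mean, stdev


def precision_summary(matrix):
    """Return (mean CV in %, missing values in %) for a quantification table.

    matrix: list of rows, one per protein; each row holds the intensities
    measured in replicate injections, with None or NaN marking a missing value.
    """
    total = 0
    missing = 0
    cvs = []
    for row in matrix:
        observed = [v for v in row if v is not None and not math.isnan(v)]
        total += len(row)
        missing += len(row) - len(observed)
        # A CV needs at least two observed replicates.
        if len(observed) >= 2:
            cvs.append(100.0 * stdev(observed) / mean(observed))
    missing_pct = 100.0 * missing / total
    mean_cv = mean(cvs) if cvs else math.nan
    return mean_cv, missing_pct


# Hypothetical intensities: three proteins, three replicate injections.
example = [
    [100.0, 110.0, 90.0],
    [50.0, None, 50.0],
    [200.0, 200.0, 200.0],
]
mean_cv, missing_pct = precision_summary(example)
print(f"mean CV = {mean_cv:.2f}%, missing = {missing_pct:.2f}%")
```

This sketch uses the sample standard deviation and computes CV on untransformed intensities; CVs computed on log-transformed data or with a different replicate filter will differ.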
Conclusions:
The optimized DIA workflow substantially improves proteome coverage and quantification precision, while the glycoproteomics method enables robust identification of intact glycopeptides without enrichment, particularly in serum and IgG samples. These protocols provide powerful tools for biomarker discovery and clinical research, leveraging the capabilities of the Orbitrap Ascend platform. Together, they offer a comprehensive framework for advancing proteomic and glycoproteomic studies at the GPS-MS facility, ICB-USP.

Acknowledgments: FAPESP (grants 2018/15549-1, 2022/11334-6, 2024/20365-8); CNPq (grant 317353/2021-7)