Protocol for Data Independent Acquisition - Mass spectrometry analysis – a DIA-based Organelle Proteomics

Dario R Alessi, Raja Sekhar Nirujogi, Rotimi Fasimoye, Toan K Phung

Published: 2022-08-27 DOI: 10.17504/protocols.io.kxygxzrokv8j/v1

Abstract

Purification of intact organelles by previously described methods (dx.doi.org/10.17504/protocols.io.bybjpskn; dx.doi.org/10.17504/protocols.io.6qpvrdjrogmk/v1) allows the organelle proteome to be profiled by quantitative mass spectrometry. Here we provide a detailed protocol for the Data Independent Acquisition (DIA)-based mass spectrometry (MS) data acquisition method for proteomic profiling of the Golgi. This includes a description of how to construct the nano liquid chromatography (LC) and DIA MS methods, as well as a Data Dependent Acquisition (DDA) strategy to generate deep spectral libraries for searching the DIA data. In addition, we provide detailed database search parameters for both DDA and DIA data and describe the downstream MS data analysis.

Attachments

ivejbgrdf.docx

Steps

High-pH Reversed-phase Liquid Chromatography fractionation of pooled Golgi-tag IP peptides to generate Spectral library:

Take ~5 µg of peptide digest from each of the Golgi-tag IP and Control-IP samples and pool them.

Vacuum dry the pooled samples.

Dissolve the peptide digest by adding 120 µL of High-pH Solvent-A (10 mM ammonium formate, pH 10.0). Place the sample on a Thermomixer and agitate at 1800 rpm for 30 min.

Centrifuge the sample at high speed (17,000 x g) for 5 min at room temperature.

Take 0.5 µL of the sample to verify the pH, then transfer the sample into an LC vial.

Ensure the LC solvents are Solvent-A (10 mM ammonium formate, pH 10.0) and Solvent-B (90% ACN (v/v) in 10 mM ammonium formate, pH 10.0).

Note

Note: Adjust the pH with 30% Ammonium Hydroxide.

Prepare the LC method by following the below gradient:

Time (minutes)	Nano pump Flow rate (µl/min)	% Of Solvent-B
0.0	0.100	3.0
5.0	0.100	7.0
5.5	0.100	7.0
10.0	0.100	10.0
50.0	0.100	40.0
55.0	0.100	90.0
62.0	0.100	90.0
62.5	0.100	3.0
70.0	0.100	3.0
70.1	0.0100	3.0

Set the fraction collection start time to 5 min 5 s and the end time to 62 min.

Collect a total of 45 fractions, collecting each fraction for 1 min 15 s.
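The collection schedule can be checked with a few lines of arithmetic before programming the fraction collector. The following is a minimal sketch that assumes fractions are collected back to back with no pause between them; the function names `fraction_windows` and `fmt` are illustrative and not part of any instrument software.

```python
def fraction_windows(start_s, width_s, n_fractions):
    """Return (start, end) times in seconds for consecutive fractions."""
    return [
        (start_s + i * width_s, start_s + (i + 1) * width_s)
        for i in range(n_fractions)
    ]


def fmt(seconds):
    """Format a whole number of seconds as 'M min S s'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes} min {secs} s"


# Values from the protocol: start at 5 min 5 s, 45 fractions of 1 min 15 s each.
windows = fraction_windows(start_s=5 * 60 + 5, width_s=75, n_fractions=45)

for number, (start, end) in enumerate(windows, start=1):
    print(f"Fraction {number:2d}: {fmt(start)} to {fmt(end)}")
```

Under this assumption the 45th fraction closes at 61 min 20 s, which falls inside the 62 min end time set for the collection.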

10.

Transfer the fractions into pre-labelled 1.5 mL protein low-binding tubes.

11.

Vacuum dry the samples and store them in a -20 °C freezer until the LC-MS/MS analysis.

Single shot DIA acquisition on Orbitrap Exploris 480:

12.

Dissolve the vacuum-dried peptides in 60 µL of LC buffer (3% ACN in 0.1% formic acid), place the samples on a Thermomixer and mix at 1800 rpm at room temperature for about 30 min.

13.

Take 4 µg equivalent of peptide digest and spike in 1 µL of iRT peptide mix. Adjust the total sample volume to between 5 µL and 15 µL; do not exceed 15 µL. Transfer the sample into a glass insert and place it in an LC vial.

Note

Note: Strictly avoid using any autoclaved pipette tips. If possible, use dedicated Pipette set for MS analysis.
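The injection mix for this step can be worked out from the peptide concentration of the dissolved sample. The sketch below is only an illustration: it assumes the concentration has been measured separately, and it assumes that any top-up to the 5 µL minimum is made with LC buffer, which the protocol does not specify. The function name `injection_mix` and the example concentrations are hypothetical.

```python
def injection_mix(target_ug, conc_ug_per_ul, irt_ul=1.0, min_ul=5.0, max_ul=15.0):
    """Return (sample_ul, irt_ul, buffer_ul) for one injection.

    target_ug       peptide amount to load (4 µg for DIA, 1 µg for DDA)
    conc_ug_per_ul  measured peptide concentration (assumed to be known)
    buffer_ul       LC buffer added to reach the minimum volume (assumption)
    """
    sample_ul = target_ug / conc_ug_per_ul
    total_ul = sample_ul + irt_ul
    if total_ul > max_ul:
        raise ValueError(
            f"{total_ul:.1f} µL exceeds the {max_ul:.0f} µL limit; "
            "the sample is too dilute for this load"
        )
    buffer_ul = max(0.0, min_ul - total_ul)
    return sample_ul, irt_ul, buffer_ul


# Hypothetical concentrations, for illustration only.
print(injection_mix(target_ug=4, conc_ug_per_ul=0.5))  # 8 µL sample + 1 µL iRT
print(injection_mix(target_ug=4, conc_ug_per_ul=2.0))  # needs 2 µL of buffer
```

A sample too dilute to deliver the target amount within 15 µL raises an error rather than silently reducing the load.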

14.

Construct the LC and variable-window DIA (vDIA) MS methods as described below using the Xcalibur software integrated in the Thermo Orbitrap Exploris 480 MS acquisition software suite.

15.

Ensure that the 2 cm trap column (C18, 5 µm, 100 Å, 100 µm, 2 cm Nano-viper column # 164564, Thermo Scientific) and the 50 cm analytical column (C18, 5 µm, 50 cm, 100 Å Easy nano spray column # ES903, Thermo Scientific) are equilibrated, and verify the column performance by injecting 50 ng of HeLa or another standard digest.

16.

Nano LC gradient for the 145 min (2 h 25 min) DIA analysis:

Time (minutes)	Nano pump Flow rate (µl/min)	% Of Solvent-B
0.0	0.250	3.0
12.0	0.250	7.0
115.0	0.250	25.0
129.0	0.250	37.0
130.0	0.250	95.0
135.30	0.250	95.0
135.80	0.250	3.0
145.0	0.250	3.0
145.0	Stop Run
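To see what solvent composition the gradient table implies at any time point, the table can be interpolated. This is a minimal sketch that assumes the pump ramps linearly between consecutive table entries; the name `percent_b` is illustrative and the values are copied from the DIA gradient table.

```python
from bisect import bisect_right

# (time in min, % Solvent-B) from the 145 min DIA gradient table.
DIA_GRADIENT = [
    (0.0, 3.0), (12.0, 7.0), (115.0, 25.0), (129.0, 37.0),
    (130.0, 95.0), (135.3, 95.0), (135.8, 3.0), (145.0, 3.0),
]


def percent_b(t_min, gradient=DIA_GRADIENT):
    """Return % Solvent-B at time t_min, assuming linear ramps between points."""
    times = [t for t, _ in gradient]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError(f"time {t_min} min is outside the gradient")
    i = bisect_right(times, t_min)
    if i == len(times):  # exactly at the last time point
        return gradient[-1][1]
    (t0, b0), (t1, b1) = gradient[i - 1], gradient[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 12, 63.5, 115, 129, 145):
    print(f"{t:6.1f} min: {percent_b(t):5.1f} % B")
```

The same function can be reused for the other gradients in this protocol by passing a different list of (time, % B) pairs.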

17.

Mass spectrometer parameters: refer to the settings below to construct the variable DIA method:

Method duration	145 min
MS Global settings:
	Infusion mode:	Liquid Chromatography
	Expected LC peak width (s):	20
	Advanced Peak determination:	TRUE
	Default charge state:	3
	Internal mass calibration:	off	Note: If needed, enable a user-defined calibrant ion (polysiloxane, m/z 445.120025) or enable the Easy-IC option.

Full scan settings:
	Orbitrap resolution:	120000
	Scan range (m/z):	375-1500
	RF lens (%):	40
	AGC target:	Custom
	Normalized AGC target (%):	300
	Maximum injection Time mode:	Custom
	Maximum injection Time (ms):	30
	Micro scans:	1
	Data type:	Profile

tMS2 or DIA settings	Isolation offset:	Off
	Collision Energy Mode:	Stepped
	Collision Energy Type:	Normalized
	HCD Collision Energy (%):	25, 28, 32
	Orbitrap resolution:	30000
	Scan range mode:	Define m/z range
	Scan Range (m/z):	200 - 1200	Note: Most of the matched fragment ions (b-series and y-series) fall within this range; it can be modified if needed.
	RF Lens (%):	50
	AGC target:	Custom
	Normalized AGC target (%):	3000	Note: It is recommended to fill the trap with the maximum accumulation of ions (3000% = 3E6 ions) for each DIA window to increase sensitivity.
	Maximum injection Time mode:	Custom
	Maximum injection Time (ms):	70
	Micro scans:	1
	Data type:	Profile
	Polarity:	Positive
	Loop control:	N
	N (Number of Spectra):	24	Note: One full MS1 scan is included after every 24 DIA scans to fit in as many MS1 scans as possible.
	Dynamic RT:	Off
	Time Mode:	Unscheduled

Scheme of vDIA windows mass list table:
	m/z	z	Isolation Window (mz)
	383.375	3	66.8
	423	3	13.5
	435	3	11.5
	446.5	3	12.5
	458	3	11.5
	469	3	11.5
	480	3	11.5
	490.5	3	10.5
	501	3	11.5
	512	3	11.5
	523	3	11.5
	533.5	3	10.5
	544	3	11.5
	554.5	3	10.5
	565	3	11.5
	575.5	3	10.5
	586	3	11.5
	597.5	3	12.5
	609.5	3	12.5
	621.5	3	12.5
	633	3	11.5
	645	3	13.5
	657.5	3	12.5
	670.5	3	14.5
	684	3	13.5
	697	3	13.5
	710.5	3	14.5
	725.5	3	16.5
	741	3	15.5
	756.5	3	16.5
	773.5	3	18.5
	791	3	17.5
	808.5	3	18.5
	827	3	19.5
	846.5	3	20.5
	866.5	3	20.5
	887.5	3	22.5
	910.5	3	24.5
	935.5	3	26.5
	962.5	3	28.5
	992	3	31.5
	1025	3	35.5
	1063	3	41.5
	1108.5	3	50.5
	1391.625	3	516.8
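Before importing the mass list into the method editor, it can be useful to confirm that the windows tile the precursor range without gaps. The sketch below converts each centre m/z and isolation width from the table into lower and upper bounds and reports the overlap between neighbouring windows. The helper names are illustrative; only the numbers come from the table.

```python
# (centre m/z, isolation width in m/z) pairs copied from the vDIA mass list table.
VDIA_WINDOWS = [
    (383.375, 66.8), (423, 13.5), (435, 11.5), (446.5, 12.5), (458, 11.5),
    (469, 11.5), (480, 11.5), (490.5, 10.5), (501, 11.5), (512, 11.5),
    (523, 11.5), (533.5, 10.5), (544, 11.5), (554.5, 10.5), (565, 11.5),
    (575.5, 10.5), (586, 11.5), (597.5, 12.5), (609.5, 12.5), (621.5, 12.5),
    (633, 11.5), (645, 13.5), (657.5, 12.5), (670.5, 14.5), (684, 13.5),
    (697, 13.5), (710.5, 14.5), (725.5, 16.5), (741, 15.5), (756.5, 16.5),
    (773.5, 18.5), (791, 17.5), (808.5, 18.5), (827, 19.5), (846.5, 20.5),
    (866.5, 20.5), (887.5, 22.5), (910.5, 24.5), (935.5, 26.5), (962.5, 28.5),
    (992, 31.5), (1025, 35.5), (1063, 41.5), (1108.5, 50.5), (1391.625, 516.8),
]


def window_bounds(windows):
    """Convert (centre, width) pairs into (lower, upper) m/z bounds."""
    return [(centre - width / 2, centre + width / 2) for centre, width in windows]


def adjacent_overlaps(bounds):
    """Overlap in m/z between each window and the next; negative means a gap."""
    return [
        prev_upper - next_lower
        for (_, prev_upper), (next_lower, _) in zip(bounds, bounds[1:])
    ]


bounds = window_bounds(VDIA_WINDOWS)
overlaps = adjacent_overlaps(bounds)

print(f"{len(bounds)} windows, m/z {bounds[0][0]:.3f} to {bounds[-1][1]:.3f}")
print(f"smallest overlap between neighbours: {min(overlaps):.3f} m/z")
```

With the values in the table, every pair of neighbouring windows overlaps by about 0.5 m/z, so a negative number in `overlaps` would point to a typing error in the mass list.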

18.

Export the MS raw data for database searching with the Biognosys Spectronaut software suite, using either the library-free (direct-DIA) or the library-based approach as illustrated in the workflow.

Note

Optional: Biognosys Spectronaut is a commercial software suite; if you do not have access to it, you can use an open-source software suite such as DIA-NN.

Data Dependent Acquisition (DDA) MS analysis to generate Spectral library:

19.

Dissolve the vacuum-dried peptides of each fraction in 60 µL of LC buffer (3% ACN in 0.1% formic acid), place the samples on a Thermomixer and mix at 1800 rpm at room temperature for about 30 min.

20.

Take 1 µg equivalent of peptide digest and spike in 1 µL of iRT peptide mix. Adjust the total sample volume to between 5 µL and 15 µL; do not exceed 15 µL. Transfer the sample into a glass insert and place it in an LC vial.

Note

Note: Strictly avoid using any autoclaved pipette tips. If possible, use dedicated Pipette set for MS analysis.

21.

Ensure that the 2 cm trap column (C18, 5 µm, 100 Å, 100 µm, 2 cm Nano-viper column # 164564, Thermo Scientific) and the 50 cm analytical column (C18, 5 µm, 50 cm, 100 Å Easy nano spray column # ES903, Thermo Scientific) are equilibrated, and verify the column performance by injecting 50 ng of HeLa or another standard digest.

22.

Nano LC gradient for the 85 min (1 h 25 min) DDA analysis:

Time (minutes)	Nano pump Flow rate (µl/min)	% Of Solvent-B
0.0	0.300	3.0
7.0	0.300	7.0
60.0	0.300	22.0
70.0	0.300	35.0
71.0	0.300	95.0
78.0	0.300	95.0
79.0	0.300	3.0
85.0	0.300	3.0
85.0	Stop Run

23.

Mass spectrometer parameters: refer to the settings below to construct the DDA method:

Method duration	85 min
MS Global settings:
	Infusion mode:	Liquid Chromatography
	Expected LC peak width (s):	15
	Advanced Peak determination:	TRUE
	Default charge state:	2
	Internal mass calibration:	off

Full scan settings:
	Orbitrap resolution:	60000
	Scan range (m/z):	350-1200
	RF lens (%):	40
	AGC target:	Custom
	Normalized AGC target (%):	300
	Maximum injection Time mode:	Custom
	Maximum injection Time (ms):	28
	Micro scans:	1
	Data type:	Profile
	Polarity:	Positive
Filters:
MIPS	Monoisotopic peak determination:	Peptide
	Relax restrictions when too few precursors are found:	FALSE
Intensity	Filter Type:	Intensity Threshold
	Intensity Threshold:	1.00E+04
Charge State	Include charge state(s):	2 to 6
	Include undetermined charge states:	False
Dynamic Exclusion	Dynamic Exclusion Mode:	Custom
	Exclude after n times:	1
	Exclusion duration (s):	45
	Mass Tolerance:	ppm
	Low:	10
	High:	10
	Exclude isotopes:	TRUE
	Perform dependent scan on single charge state per precursor only:	FALSE
Data Dependent	Data Dependent Mode:	Cycle Time
	Time between Master Scans (sec):	3
ddMS2 settings	Isolation Window (m/z):	1.2
	Isolation Offset:	Off
	Collision Energy Mode:	Fixed
	Collision Energy Type:	Normalized
	HCD Collision Energy (%):	30
	Orbitrap resolution:	15000
	Scan range mode:	Auto
	Scan Range (m/z):	200 - 1200
	AGC target:	Custom
	Normalized AGC target (%):	100
	Maximum injection Time mode:	Custom
	Maximum injection Time (ms):	85
	Micro scans:	1
	Data type:	Centroid
	Polarity:	Positive

Database searches with MaxQuant for Data Dependent Acquisition (DDA) MS analysis to generate Spectral library:

24.

Export the raw MS data to a Windows server to perform database searches using MaxQuant. Refer to the search parameters below.

Note

Note: A computer with good computational capability is recommended for a faster and successful MaxQuant search. We used the following configuration: Intel® Xeon® Silver 421R CPU @ 2.40GHz and 2.39 GHz (2 processors), 384 GB RAM, 64-bit Windows OS with a 1 TB SSD drive.

Parameter	Value
Version	1.6.10.0
Include contaminants	TRUE
PSM FDR	0.01
PSM FDR Crosslink	0.01
Protein FDR	0.01
Site FDR	0.01
Use Normalized Ratios for Occupancy	TRUE
Min. peptide Length	7
Min. score for unmodified peptides	0
Min. score for modified peptides	40
Min. delta score for unmodified peptides	0
Min. delta score for modified peptides	6
Min. unique peptides	0
Min. razor peptides	1
Min. peptides	1
Use only unmodified peptides and	TRUE
Modifications included in protein quantification	Oxidation (M);Acetyl (Protein N-term)
Peptides used for protein quantification	Razor
Discard unmodified counterpart peptides	TRUE
Label min. ratio count	2
Use delta score	FALSE
iBAQ	TRUE
iBAQ log fit	TRUE
Match between runs	TRUE
Matching time window [min]	0.7
Match ion mobility window [indices]	0.05
Alignment time window [min]	20
Alignment ion mobility window [indices]	1
Find dependent peptides	FALSE
Fasta file	D:\Database\20200723-Human-Uniprot.fasta
Decoy mode	revert
Include contaminants	TRUE
Advanced ratios	TRUE
Fixed andromeda index folder
Temporary folder
Combined folder location
Second peptides	TRUE
Stabilize large LFQ ratios	FALSE
Separate LFQ in parameter group	FALSE
Require MS/MS for LFQ comparisons	FALSE
Calculate peak properties	FALSE
Main search max. combinations	200
Advanced site intensities	TRUE
Write msScans table	TRUE
Write msmsScans table	TRUE
Write ms3Scans table	FALSE
Write allPeptides table	TRUE
Write mzRange table	TRUE
Write pasefMsmsScans table	FALSE
Write accumulatedPasefMsmsScans table	FALSE
Max. peptide mass [Da]	4600
Min. peptide length for unspecific search	8
Max. peptide length for unspecific search	25
Razor protein FDR	TRUE
Disable MD5	FALSE
Max mods in site table	3
Match unidentified features	FALSE
Epsilon score for mutations
Evaluate variant peptides separately	TRUE
Variation mode	None
MS/MS tol. (FTMS)	20 ppm
Top MS/MS peaks per Da interval. (FTMS)	12
Da interval. (FTMS)	100
MS/MS deisotoping (FTMS)	TRUE
MS/MS deisotoping tolerance (FTMS)	7
MS/MS deisotoping tolerance unit (FTMS)	ppm
MS/MS higher charges (FTMS)	TRUE
MS/MS water loss (FTMS)	TRUE
MS/MS ammonia loss (FTMS)	TRUE
MS/MS dependent losses (FTMS)	TRUE
MS/MS recalibration (FTMS)	TRUE
MS/MS tol. (ITMS)	0.5 Da
Top MS/MS peaks per Da interval. (ITMS)	8
Da interval. (ITMS)	100
MS/MS deisotoping (ITMS)	FALSE
MS/MS deisotoping tolerance (ITMS)	0.15
MS/MS deisotoping tolerance unit (ITMS)	Da
MS/MS higher charges (ITMS)	TRUE
MS/MS water loss (ITMS)	TRUE
MS/MS ammonia loss (ITMS)	TRUE
MS/MS dependent losses (ITMS)	TRUE
MS/MS recalibration (ITMS)	FALSE
MS/MS tol. (TOF)	40 ppm
Top MS/MS peaks per Da interval. (TOF)	10
Da interval. (TOF)	100
MS/MS deisotoping (TOF)	TRUE
MS/MS deisotoping tolerance (TOF)	0.01
MS/MS deisotoping tolerance unit (TOF)	Da
MS/MS higher charges (TOF)	TRUE
MS/MS water loss (TOF)	TRUE
MS/MS ammonia loss (TOF)	TRUE
MS/MS dependent losses (TOF)	TRUE
MS/MS recalibration (TOF)	FALSE
MS/MS tol. (Unknown)	20 ppm
Top MS/MS peaks per Da interval. (Unknown)	12
Da interval. (Unknown)	100
MS/MS deisotoping (Unknown)	TRUE
MS/MS deisotoping tolerance (Unknown)	7
MS/MS deisotoping tolerance unit (Unknown)	ppm
MS/MS higher charges (Unknown)	TRUE
MS/MS water loss (Unknown)	TRUE
MS/MS ammonia loss (Unknown)	TRUE
MS/MS dependent losses (Unknown)	TRUE
MS/MS recalibration (Unknown)	FALSE
Site tables	Deamidation (NQ)Sites.txt;Oxidation (M)Sites.txt;Phospho (STY)Sites.txt
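After a search, the settings that were actually used can be compared against the table above. MaxQuant writes its settings to a tab-separated parameters.txt file in the output folder; the sketch below assumes a two-column layout (name, value) and checks a handful of entries. The dictionary `EXPECTED`, the helper names and the inline sample file are illustrative; the sample deliberately contains one deviating value.

```python
import csv
import io

# Subset of the settings listed above that is worth re-checking after a search.
EXPECTED = {
    "Version": "1.6.10.0",
    "PSM FDR": "0.01",
    "Protein FDR": "0.01",
    "Min. peptide Length": "7",
    "Match between runs": "True",
    "Decoy mode": "revert",
}


def read_parameters(handle):
    """Read a tab-separated 'name<TAB>value' file into a dict."""
    params = {}
    for row in csv.reader(handle, delimiter="\t"):
        if row:
            params[row[0].strip()] = row[1].strip() if len(row) > 1 else ""
    return params


def find_mismatches(params, expected=EXPECTED):
    """Return {name: (expected, found)} for settings that differ (case-insensitive)."""
    mismatches = {}
    for name, want in expected.items():
        found = params.get(name)
        if found is None or found.lower() != want.lower():
            mismatches[name] = (want, found)
    return mismatches


# Inline stand-in for a real parameters file (hypothetical content).
sample = io.StringIO(
    "Parameter\tValue\n"
    "Version\t1.6.10.0\n"
    "PSM FDR\t0.01\n"
    "Protein FDR\t0.05\n"
    "Min. peptide Length\t7\n"
    "Match between runs\tTrue\n"
    "Decoy mode\trevert\n"
)
mismatches = find_mismatches(read_parameters(sample))
print(mismatches)
```

To check a real search, replace the inline sample with a file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` points to the parameters.txt of that search.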

Database searches with Biognosys Spectronaut for Data Independent Acquisition (DIA) MS analysis (Library free and Library-based search):

25.

Import the msms.txt file from the MaxQuant search output files into Spectronaut to generate a spectral library.

Note

Make sure to provide the correct path to the DDA raw data.

26.

Alternatively, perform a Pulsar search of the DDA data to generate a library.

27.

As illustrated in the workflow, we recommend performing a direct-DIA (library-free) search using the Human UniProt FASTA file to construct a hybrid library. Enable the search archive option during the direct-DIA search.

28.

Merge the direct-DIA search archive and DDA library to construct a hybrid library and use this library to perform library-based search of the DIA data.

29.

Use the below settings for the library-based DIA search within Spectronaut.

Spectronaut 15.7.220308.50606
Computer Name: MRC-DRI-2
User Domain Name: LIFESCI-AD
User Name: rnirujogi
Analysis Mode: UI
Analysis Type: Peptide-Centric
Settings Used: RN_DIA_Default
Data Extraction
MS1 Mass Tolerance Strategy:	Dynamic
Correction Factor:	1
MS2 Mass Tolerance Strategy:	Dynamic
Correction Factor:	1
Intensity Extraction MS1:	Maximum Intensity
Intensity Extraction MS2:	Maximum Intensity
XIC Extraction
XIC IM Extraction Window:	Dynamic
Correction Factor:	1
XIC RT Extraction Window:	Dynamic
Correction Factor:	1
Calibration
Calibration Mode:	Automatic
MS1 Mass Tolerance Strategy:	System Default
MS2 Mass Tolerance Strategy:	System Default
Precision iRT:	TRUE
iRT <-> RT Regression Type:	Local (Non-Linear) Regression
Exclude Deamidated Peptides:	TRUE
MZ Extraction Strategy:	Maximum Intensity
Allow source specific iRT Calibration:	TRUE
Used Biognosys' iRT Kit:	TRUE
Calibration Carry-Over:	FALSE
Identification
Generate Decoys:	TRUE
Decoy Limit Strategy:	Dynamic
Library Size Fraction:	0.1
Decoy Method:	Mutated
Preferred Fragment Source:	NN Predicted Fragments
Machine Learning:	Per Run
Exclude Duplicate Assays:	TRUE
Precursor PEP Cutoff:	0.2
Protein Qvalue Cutoff (Experiment):	0.01
Protein Qvalue Cutoff (Run):	0.05
Exclude Single Hit Proteins:	TRUE
Pvalue Estimator:	Kernel Density Estimator
Precursor Qvalue Cutoff:	0.01
Single Hit Definition:	By Stripped Sequence
Quantification
Interference Correction:	TRUE
MS1 Min:	2
MS2 Min:	3
Exclude All Multi-Channel Interferences:	TRUE
Only Identified Peptides:	TRUE
Protein LFQ Method:	Automatic
Major (Protein) Grouping:	by Protein Group Id
Minor (Peptide) Grouping:	by Stripped Sequence
Minor Group Top N:	TRUE
Min:	1
Max:	3
Minor Group Quantity:	Mean precursor quantity
Major Group Top N:	TRUE
Min:	1
Max:	3
Major Group Quantity:	Mean peptide quantity
Quantity MS-Level:	MS2
Quantity Type:	Area
Proteotypicity Filter:	None
Data Filtering:	Qvalue
Cross Run Normalization:	TRUE
Row Selection:	Automatic
Normalization Strategy:	None
Normalization Filter Type:	None
PTM Workflow
PTM Localization:	TRUE
Probability Cutoff:	0.75
PTM Analysis:	TRUE
Multiplicity:	TRUE
Run Clustering:	FALSE
PTM Consolidation:	Sum
Flanking Region:	7
Workflow
In-Silico Library Optimization:	FALSE
Profiling Strategy:	iRT Profiling
Profiling Row Selection:	Minimum Qvalue Row Selection
Qvalue Threshold:	0.01
Profiling Target Selection:	Automatic Selection
Carry-over exact Peak Boundaries:	FALSE
Unify Peptide Peaks Strategy:	None
Multi-Channel Workflow Definition:	From Library Annotation
Fallback Option:	Labeled
Protein Inference
Protein Inference Workflow:	Automatic
Inference Algorithm:	IDPicker
Post Analysis
Calculate Sample Correlation Matrix:	TRUE
Calculate Explained TIC:	None
Gene Ontology:	geneOntology/Ontologies\bgs_default_go basic.obo
Differential Abundance Grouping:	Major Group (Quantification Settings)
Smallest Quantitative Unit:	Major Group (Quantification Settings)
Use All MS-Level Quantities:	FALSE
Differential Abundance Testing:	Un-Paired t-test
Assume Equal Variance:	FALSE
Group-Wise Testing Correction:	FALSE
Run Clustering:	TRUE
Distance Metric:	Manhattan Distance
Linkage Strategy:	Ward's Method
Z-score transformation:	FALSE
Order Runs by Clustering:	TRUE
Pipeline Mode
Post Analysis Reports:
Scoring Histograms:	TRUE
Data Completeness Bar Chart:	TRUE
Run Identifications Bar Chart:	TRUE
CV Density Line Chart:	TRUE
CVs Below X Bar Chart:	TRUE
Generate SNE File:	TRUE
Store Iontraces in SNE:	FALSE
Report Schema:	PTMSiteReport (Pivot), RN_PG_Pivot (Pivot), MSStats Report (v 3.7.3)(Normal), Protein Quant (Normal), Protein Quant (Pivot), BGS Factory Report(Normal)
Reporting Unit:	Across Experiment

Data analysis of DIA data and data visualization:

30.

Export Protein group tables from Spectronaut in PG Pivot format.

31.

For the Golgi-tag IP data, annotate the proteins using a compiled list of known Golgi proteins from a resource such as COMPARTMENTS (https://compartments.jensenlab.org/Search) and UniProt GO terms.

Note

Note: Annotate Golgi proteins using the VLOOKUP function in Excel against the compiled list of known Golgi proteins. Similarly, for Mito-IP data use the MitoCarta resource.
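For readers who prefer a script to a spreadsheet lookup, the same annotation can be done with a set membership test. This is a minimal sketch, not the authors' procedure: the column names `Genes` and `Golgi`, the function name `annotate` and the three example rows are assumptions, and the two-gene reference list is a placeholder for the full compiled list. Entries holding several gene names separated by semicolons are flagged if any of them is in the list.

```python
def annotate(rows, known_genes, gene_column="Genes", flag_column="Golgi"):
    """Return copies of rows with flag_column set to '+' for known genes."""
    known = {gene.upper() for gene in known_genes}
    annotated = []
    for row in rows:
        genes = [g.strip().upper() for g in row[gene_column].split(";") if g.strip()]
        flag = "+" if any(g in known for g in genes) else ""
        annotated.append({**row, flag_column: flag})
    return annotated


# Placeholder reference list and protein group rows, for illustration only.
known_golgi = ["GOLGA2", "TGOLN2"]
protein_groups = [
    {"Genes": "GOLGA2"},
    {"Genes": "ACTB"},
    {"Genes": "tgoln2;ACTG1"},
]

for row in annotate(protein_groups, known_golgi):
    print(row["Genes"], row["Golgi"] or "-")
```

The "+" flag matches the convention used by the annotation columns that the network script in this protocol filters on.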

32.

Prepare the data for differential analysis; this can be done using the Perseus software suite (https://maxquant.net/perseus/). The basic functionalities of the software and various workflows can be adopted from the published literature (PMID: 27348712), the online documentation (http://coxdocs.org/doku.php?id=perseus:start) and tutorials available on YouTube (https://www.youtube.com/c/MaxQuantChannel/featured).

33.

Follow the Perseus workflow illustrated in Figure 2.

Figure: 2

34.

The t-test results can be exported and analysed using other software such as the Curtain tool to visualize the volcano plot together with the associated raw protein intensities for all conditions, protein domain architecture, STRING interaction predictions and AlphaFold predictions.
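To make the exported columns easier to interpret, the sketch below shows how a "Difference" value and a t statistic can be derived for one protein from log2-transformed intensities of two groups. It is an illustration only: it uses Welch's unequal-variance form as an assumption, it does not compute p-values or the multiple-testing correction, and the exact test settings are those of the Perseus workflow in Figure 2, which is not reproduced here. The function name and the intensity values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (difference of means, Welch t statistic) for two groups.

    Both groups are lists of log2 intensities for a single protein,
    with at least two replicates each.
    """
    difference = mean(group_a) - mean(group_b)
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return difference, difference / standard_error


# Hypothetical log2 intensities for one protein (three replicates per group).
golgi_ip = [10.0, 11.0, 12.0]
control_ip = [8.0, 9.0, 10.0]

difference, t_stat = welch_t(golgi_ip, control_ip)
print(f"Difference (log2): {difference:.2f}, t = {t_stat:.2f}")
```

Because the values are on a log2 scale, a difference of 1 corresponds to a two-fold change, which is the cut-off (`Difference >= 1`) applied in the network script in this protocol.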

35.

Optional: In addition to Perseus, further data quality checks can be done using custom R or Python scripts (provided below) and other relevant packages.

Figure: 1

Scripts - R correlation plot

36.

library(corrplot)

# Tab-separated table of protein intensities, one column per sample.
filename <- "//mrc-smb.lifesci.dundee.ac.uk/mrc-group-folder/ALESSI/Toan/For Golgitag_Paper/For_Pearson_Corr_02.txt"

df <- read.table(filename, header = TRUE, sep = "\t")

# Keep the columns from the first one up to and including "HA.WCL_06".
df <- df[colnames(df)[1:which(colnames(df) == "HA.WCL_06")]]

# Pearson correlation between samples. With use = "everything", any pair of
# columns containing missing values gives NA.
cor_mat <- cor(as.matrix(df), use = "everything")

# Write the plot to a PDF next to the input file.
pdf(paste0(filename, ".pdf"))

corrplot(cor_mat, order = "hclust", type = "lower", method = "ellipse")

dev.off()

Scripts - Python Network Interaction with Cytoscape and Plotly Dash

37.

import dash
import dash_cytoscape as cyto
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html
import pandas as pd
cyto.load_extra_layouts()
app = dash.Dash(__name__)
server = app.server

def add_individual_protein(df, source, elements):
    # Add at most the first 15 proteins of df as nodes, each linked to the
    # group node `source`; opacity scales with the protein's "Difference".
    highest = df["Difference"].max()
    n = 0
    for i, r in df.iterrows():
        if n < 15:
            opacity = r["Difference"]/highest
            elements.append({'data': {'id': r["Gene.names"], 'label': r["Gene.names"], 'color': f"rgba(136, 86, 167,{opacity})", "opacity": opacity}, 'classes': 'protein'})
            elements.append(
                {'data': {'source': source, 'target': r["Gene.names"], 'color': f"rgba(136, 86, 167,{opacity})", "opacity": opacity}, 'classes': 'protein-edge'},)
        else:
            break
        n += 1

def add_groups_enriched(edf, elements):
    # Build the group nodes for proteins annotated as Golgi ("+") and attach
    # the top proteins of each category. `block` (node size per protein) is a
    # module-level value defined before this function is called.
    edf = edf.sort_values(by="Difference", ascending=False)
    golgi = edf[(edf["Golgi"] == "+")]
    #golgi = edf[(edf["C: Golgi"] == "+")]
    golgi_count = len(golgi.index)
    print(golgi_count)
    glyco = golgi[golgi["Glycosylation"] == "+"]
    #glyco = golgi[golgi["Glycosylation genes"] == "+"]
    glyco_count = len(glyco.index)
    print(glyco_count)
    phospha = golgi[golgi["Phosphatases"] == "+"]
    phospha_count = len(phospha.index)
    kinases = golgi[(golgi["Kinases"] == "+") | (golgi["Dark.kinase"] == "+")]
    #kinases = golgi[(golgi["Kinases"] == "+") | (golgi["Dark Kinases"] == "+")]
    kinases_count = len(kinases.index)
    ubi = golgi[golgi["Ub.Pathway"] == "+"]
    ubi_count = len(ubi.index)

    l = [
        {'data': {'id': 'enriched-golgi', 'label': f'Golgi: {golgi_count}', "size": golgi_count * block},
         'classes': 'golgi enriched'},
        #{'data': {'source': 'significant', 'target': 'enriched-golgi'}, 'classes': 'significant-edge'},
        {'data': {'id': 'enriched-glyco', 'label': f'Glycosylation genes: {glyco_count}',
                  "size": glyco_count * block}, 'classes': 'golgi enriched'},
        {'data': {'id': 'enriched-phospha', 'label': f'Phosphatases: {phospha_count}',
                  "size": phospha_count * block}, 'classes': 'golgi enriched'},
        {'data': {'id': 'enriched-kinase', 'label': f'Kinases: {kinases_count}', "size": kinases_count * block},
         'classes': 'golgi enriched'},
        {'data': {'id': 'ubi', 'label': f'Ubiquitin components: {ubi_count}', "size": ubi_count * block},
         'classes': 'golgi enriched'},
        {'data': {'source': 'enriched-golgi', 'target': 'enriched-glyco'}, 'classes': 'golgi-edge enriched'},
        {'data': {'source': 'enriched-golgi', 'target': 'enriched-phospha'},
         'classes': 'golgi-edge enriched'},
        {'data': {'source': 'enriched-golgi', 'target': 'enriched-kinase'},
         'classes': 'golgi-edge enriched'},
        {'data': {'source': 'enriched-golgi', 'target': 'ubi'},
         'classes': 'golgi-edge enriched'},
    ]
    for i in l:
        elements.append(i)
    add_individual_protein(glyco, "enriched-glyco", elements)
    add_individual_protein(phospha, "enriched-phospha", elements)
    add_individual_protein(kinases, "enriched-kinase", elements)
    add_individual_protein(ubi, "ubi", elements)

def add_groups_not_enriched(edf, elements):
    # Same as add_groups_enriched, but for proteins not annotated as Golgi.
    edf = edf.sort_values(by="Difference", ascending=False)
    golgi = edf[(edf["Golgi"] != "+")]
    #golgi = edf[(edf["C: Golgi"] != "+")]
    golgi_count = len(golgi.index)
    print(golgi_count)
    glyco = golgi[golgi["Glycosylation"] == "+"]
    #glyco = golgi[golgi["Glycosylation genes"] == "+"]
    glyco_count = len(glyco.index)
    print(glyco_count)
    phospha = golgi[golgi["Phosphatases"] == "+"]
    phospha_count = len(phospha.index)
    kinases = golgi[(golgi["Kinases"] == "+") | (golgi["Dark.kinase"] == "+")]
    #kinases = golgi[(golgi["Kinases"] == "+") | (golgi["Dark Kinases"] == "+")]
    kinases_count = len(kinases.index)
    ubi = golgi[golgi["Ub.Pathway"] == "+"]
    ubi_count = len(ubi.index)

    l = [
        {'data': {'id': 'non-enriched-golgi', 'label': f'Non-golgi: {golgi_count}', "size": golgi_count * block}, 'classes': 'not-golgi not-enriched'},
        #{'data': {'source': 'significant', 'target': 'non-enriched-golgi'}, 'classes': 'not-golgi significant-edge'},
        {'data': {'id': 'non-enriched-glyco', 'label': f'Glycosylation genes: {glyco_count}',
                  "size": glyco_count * block}, 'classes': 'not-golgi not-enriched'},
        {'data': {'id': 'non-enriched-phospha', 'label': f'Phosphatases: {phospha_count}',
                  "size": phospha_count * block}, 'classes': 'not-golgi not-enriched'},
        {'data': {'id': 'non-enriched-kinase', 'label': f'Kinases: {kinases_count}', "size": kinases_count * block},
         'classes': 'not-golgi not-enriched'},
        {'data': {'id': 'non-enriched-ubi', 'label': f'Ubiquitin components: {ubi_count}', "size": ubi_count * block},
         'classes': 'not-golgi not-enriched'},
        {'data': {'source': 'non-enriched-golgi', 'target': 'non-enriched-glyco'}, 'classes': 'not-golgi-edge not-enriched'},
        {'data': {'source': 'non-enriched-golgi', 'target': 'non-enriched-phospha'}, 'classes': 'not-golgi-edge not-enriched'},
        {'data': {'source': 'non-enriched-golgi', 'target': 'non-enriched-kinase'}, 'classes': 'not-golgi-edge not-enriched'},
        {'data': {'source': 'non-enriched-golgi', 'target': 'non-enriched-ubi'},
         'classes': 'not-golgi-edge not-enriched'},
    ]
    for i in l:
        elements.append(i)
    add_individual_protein(glyco, "non-enriched-glyco", elements)
    add_individual_protein(phospha, "non-enriched-phospha", elements)
    add_individual_protein(kinases, "non-enriched-kinase", elements)
    add_individual_protein(ubi, "non-enriched-ubi", elements)

block = 0.2

#df = pd.read_csv(r"C:\Users\toanp\Downloads\All enriched_For Network.txt", sep="\t")
#df = pd.read_csv(r"C:\Users\toanp\Downloads\GT-IP_Mock-IP_tTest.txt", sep="\t")
df = pd.read_csv(r"C:\Users\toanp\Downloads\GT-IP_WCL_tTest.txt", sep="\t")

df = df[(df["Significant"]=="+")&(df["Difference"] >= 1)]

elements = [
#{'data': {'id': 'significant', 'label': f'Significant: {len(df.index)}', "size": len(df.index) * block}, 'classes': 'significant'},
]

add_groups_enriched(df, elements)
add_groups_not_enriched(df, elements)

app.layout = html.Div([
    cyto.Cytoscape(
        id='cytoscape',
        elements=elements,
        layout={'name': 'cose', 'idealEdgeLength': 20},
        style={'width': '2000px', 'height': '2000px'},
        stylesheet=[
            {
                'selector': '.significant',
                'style': {
                    'shape': 'ellipse',
                    'background-color': 'rgb(173, 218, 226)',
                }
            },
            {
                'selector': '.not-golgi',
                'style': {
                    'shape': 'ellipse',
                    'background-color': 'rgb(255, 154, 162)',
                }
            },
            {
                'selector': '.not-golgi-edge',
                'style': {
                    'curve-style': 'straight-triangle',
                    "width": 5,
                    'line-color': 'rgb(255, 154, 162)',
                }
            },
            {
                'selector': '.golgi',
                'style': {
                    'shape': 'ellipse',
                    'background-color': 'rgb(255, 218, 193)',
                }
            },
            {
                'selector': '.golgi-edge',
                'style': {
                    'curve-style': 'straight-triangle',
                    "width": 5,
                    'line-color': 'rgb(255, 218, 193)',
                }
            },
            {
                'selector': '.protein',
                'style': {
                    'shape': 'ellipse',
                    'background-color': 'data(color)',
                    'background-opacity': 'data(opacity)',
                    'line-color': 'black'
                }
            },
            {
                'selector': '.protein-edge',
                'style': {
                    'line-color': 'data(color)',
                    'opacity': 'data(opacity)',
                }
            },
            {
                'selector': 'node',
                'style': {
                    "content": "data(label)",
                    "width": "data(size)",
                    "height": "data(size)",
                }
            },
            {
                'selector': '.enriched',
                'style': {
                    'shape': 'ellipse',
                    'background-color': 'rgb(255, 154, 162)',
                    'line-color': 'rgb(255, 154, 162)',
                }
            },
            {
                'selector': '.not-enriched',
                'style': {
                    'shape': 'ellipse',
                    'background-color': 'rgb(255, 218, 193)',
                    'line-color': 'rgb(255, 218, 193)',
                }
            }
        ]
    ),
    html.Div([html.Button("as svg", id="btn-get-svg")]),
    # Hidden target for the 'image-text' callback output; without a component
    # carrying this id, Dash reports a nonexistent callback output.
    html.Div(id='image-text', style={'display': 'none'})
])
print(elements)


@app.callback(
    Output('image-text', 'children'),
    Input('cytoscape', 'imageData'),
)
def put_image_string(data):
    return data


@app.callback(
    Output("cytoscape", "generateImage"),
    [
        Input("btn-get-svg", "n_clicks"),
    ])
def get_image(get_svg_clicks):
    # File type to output: 'svg', 'png', 'jpg', or 'jpeg' (alias of 'jpg')

    # 'store': Stores the image data in 'imageData' (only jpg/png are supported)
    # 'download': Downloads the image as a file with all data handling
    # 'both': Stores image data and downloads image as file.

    ctx = dash.callback_context
    if ctx.triggered:
        input_id = ctx.triggered[0]["prop_id"].split(".")[0]

        if input_id != "tabs":
            # A button click (here only "btn-get-svg") downloads the network as SVG.
            return {
                'type': 'svg',
                'action': 'download'
            }

    # Default (for example on page load): store a PNG in 'imageData'.
    return {
        'type': 'png',
        'action': 'store'
    }


if __name__ == "__main__":
    # Newer Dash releases also provide app.run(debug=True).
    app.run_server(debug=True)
