Multiplexed CRISPR-based target-enriched next-generation sequencing for detecting antibiotic resistance genes in environmental samples
Yuqing Mao, Thanh H Nguyen
CRISPR
antibiotic resistance
next-generation sequencing
metagenomic
library
multiplex
ARG
antibiotic-resistance gene
Illumina
target enrichment
sequencing
environmental
wastewater
sewage
Cas9
NGS
Disclaimer
DISCLAIMER – FOR INFORMATIONAL PURPOSES ONLY; USE AT YOUR OWN RISK
The protocol content here is for informational purposes only and does not constitute legal, medical, clinical, or safety advice, or otherwise; content added to protocols.io is not peer reviewed and may not have undergone a formal approval of any kind. Information presented in this protocol should not substitute for independent professional judgment, advice, diagnosis, or treatment. Any action you take or refrain from taking using or relying upon the information presented here is strictly at your own risk. You agree that neither the Company nor any of the authors, contributors, administrators, or anyone else associated with protocols.io, can be held responsible for your use of the information contained in or linked to this protocol or any of our Sites/Apps and Services.
Abstract
High-throughput detection of antibiotic resistance genes (ARGs) in complex environmental samples is challenging for two reasons: 1) ARGs account for less than 0.1% of total DNA in an environmental sample, and 2) it is difficult to detect thousands of ARGs in one reaction. Conventional methods, including metagenomic sequencing and quantitative polymerase chain reaction (qPCR), have their limitations with sensitivity and target range, respectively. Here, we propose a multiplexed CRISPR-Cas9-based target-enriched next-generation sequencing (NGS) method to detect thousands of ARGs in complex environmental samples, using sewage as a testbed. This protocol includes guide RNA design, guide RNA synthesis, DNA sample preparation, CRISPR-NGS library preparation, and data processing steps. With this protocol, ARGs in low abundances can be detected with increased read depth and higher sensitivity than regular metagenomic NGS methods. This protocol is also applicable for detecting other low-abundance genetic markers, for example, bacterial virulence factors, in environmental samples.
Before start
It is highly recommended to use DNA Away and RNase Away to clean all surfaces and equipment before wet lab experiments.
Steps
Multiplex crRNA design (using FLASHit as an example)
Create a Linux environment. It can be set up in MobaXterm (https://mobaxterm.mobatek.net/) or other preferred terminal software. Using MobaXterm as an example, the Linux environment can be created by “Sessions” -> “New session” -> WSL. Then, select “Ubuntu” for “Distribution”. Click on “OK”, and the session will be created and saved.
Install FLASHit (https://github.com/czbiohub-sf/flash) in the Linux environment according to the instructions in “Prerequisite” on its GitHub webpage.
Collect all target genes from databases. Here, as an example, all available sequences for antibiotic resistance genes (ARGs) from The Comprehensive Antibiotic Resistance Database (CARD) (https://card.mcmaster.ca/) were downloaded using the link: https://card.mcmaster.ca/latest/data.
For an ARG detection project, among all downloaded “.fasta” files, “nucleotide_fasta_protein_homolog_model.fasta” was used as the input, because the ARGs that have raised high concerns such as the CTX-M gene families and the mcr gene families are included in the protein homolog model.
Trim the “nucleotide_fasta_protein_homolog_model.fasta” file using the Python code below so that each header line keeps only the Antibiotic Resistance Ontology (ARO) accession number of the ARG, because FLASHit cannot process the special characters in the original headers.
import re

# Replace the two placeholder paths before running.
input_path = r"INPUT_FASTA_FILE_PATH_HERE"
output_path = r"OUTPUT_TRIMMED_FASTA_FILE_PATH_HERE"

with open(input_path) as fasta, open(output_path, "w") as trimmed:
    for line in fasta:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: keep only the digits that follow "ARO:".
            aro_number = re.findall(r"ARO:[0-9]+", line)[0]
            trimmed.write(">" + aro_number.split(":")[1] + "\n")
        else:
            # Sequence line: copy unchanged.
            trimmed.write(line + "\n")
In the Linux terminal, activate conda, then activate the conda environment created for running FLASHit.
[optional] By default, FLASHit excludes the off-target sites from human genomes and the E. coli BL21 genome. If an environmental sample is expected to include undesired genomes other than these two, for example, swine genomes, users can modify the files in /flash/generated_files/ accordingly.
Search the NCBI Genome database (https://www.ncbi.nlm.nih.gov/datasets/genome/) for the reference genome of each undesired off-target organism by typing the species name in the search box.
From the search results, click on the genome with the NCBI RefSeq label.
Download the genome sequence by choosing “RefSeq only” and “Genome sequences (FASTA)”.
Use each “.fasta” file as an input to FLASHit following the guidance in “Creating your own library” in the “Workflow” section on the GitHub page of FLASHit.
After “Will discard xxx targets in amibiguous_targets.txt affecting xxx not necessarily unique genes.” is shown on the screen, stop the current FLASHit run by pressing Ctrl+C.
Go to the directory /flash/generated_files/target_index/, copy “all_targets.txt” to a customized directory, and rename it after the input “.fasta” file.
After collecting and renaming all “all_targets.txt” files in the customized directory, split each file larger than 10 MB into separate 10 MB files in a new directory using the command below.
split -b 10m INPUT_FILE_NAME OUTPUT_FILE_PATH_AND_PREFIX
In the new directory containing all split files, add the “.txt” suffix to all files using the command below.
ls | while read i; do mv "${i}" "${i}.txt"; done
Copy the remaining files smaller than 10 MB to the directory containing all split files.
Run the following Python code to remove duplicated off-targets and collect all off-targets in one file.
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument('-i', dest='input', type=str, required=True, help='Define input txt folder')
parser.add_argument('-o', dest='output', type=str, required=True, help='Define output path for a combined txt')
args = parser.parse_args()

full_gRNA_list = []
dir_path = args.input.strip("'")
for file_name in os.listdir(dir_path):
    print('Processing ' + file_name + ' ......')
    with open(os.path.join(dir_path, file_name)) as file:
        raw_content = [line.strip('\t\n\r') for line in file]
    # Keep every third line of each file (lines 1, 4, 7, ...).
    full_gRNA_list.extend(raw_content[::3])

# Remove duplicated off-targets before writing.
full_gRNA_list = list(set(full_gRNA_list))
with open(args.output.strip("'"), 'w') as output_file:
    for gRNA in full_gRNA_list:
        output_file.write(gRNA + '\n')
If there are multiple off-target genomes, place all output “.txt” files generated by the above code into the same directory. The list of human genome off-targets is already provided by FLASHit at "/flash/generated_files/human_guides_38.txt". Combine all off-targets into the same “all_offtargets.txt” file using the following Python code.
import argparse

parser = argparse.ArgumentParser()
# The input is opened as a single txt file.
parser.add_argument('-i', dest='input', type=str, required=True, help='Define input txt file')
parser.add_argument('-o', dest='output', type=str, required=True, help='Define output path for a combined txt')
args = parser.parse_args()

with open(args.input.strip("'")) as file:
    gRNA_list = file.readlines()

# Remove duplicated lines and sort the remainder.
output_list = sorted(set(gRNA_list))
with open(args.output.strip("'"), 'w') as output_file:
    output_file.writelines(output_list)
Replace the “all_offtargets.txt” file in “/flash/generated_files/” with the new “.txt” file generated by the code above. Make sure to rename the newly generated “.txt” file to “all_offtargets.txt”.
Rename “human_guides_38.txt” and “ecoli_bl21_de3_offtargets.txt” in “/flash/generated_files/” to “human_guides_38.txt1” and “ecoli_bl21_de3_offtargets.txt1” so that FLASHit no longer loads these two files by default.
Follow the guidance in “Creating your own library” in “Workflow” section on the GitHub page of FLASHit to generate a list for the multiplexed 20-nt target regions for the template of crRNA.
Follow the guidance in “Creating a bed file of the guides” on the GitHub page of FLASHit to generate a file showing the cleavage sites of the crRNA on the target genes.
Assemble the full crRNA templates by replacing the XXXXXXXXXXXXXXXXXXXX in 5’-TAATACGACTCACTATAGXXXXXXXXXXXXXXXXXXXXGTTTTAGAGCTATGCTGTTTTG-3’ by the 20-nt sequences generated by FLASHit.
The assembled nucleotide sequences can be used for purchasing DNA oligo pools.
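The assembly step above amounts to a string substitution. A minimal Python sketch follows, assuming the 20-nt sequences from FLASHit have already been read into a list of strings; the function name `assemble_crrna_template` and the example guide are hypothetical and serve only as an illustration.

```python
# Flanking sequences copied from the template given in the assembly step.
PREFIX = "TAATACGACTCACTATAG"
SUFFIX = "GTTTTAGAGCTATGCTGTTTTG"


def assemble_crrna_template(guide):
    """Return the full crRNA template for one 20-nt target sequence."""
    guide = guide.strip().upper()
    if len(guide) != 20 or set(guide) - set("ACGT"):
        raise ValueError("expected a 20-nt A/C/G/T sequence, got: " + repr(guide))
    return PREFIX + guide + SUFFIX


# Hypothetical guide, not taken from any real FLASHit run.
example_guides = ["ACGTACGTACGTACGTACGT"]
templates = [assemble_crrna_template(g) for g in example_guides]
for template in templates:
    print(template)
```

Rejecting sequences that are not exactly 20 nt of A/C/G/T catches stray header lines or truncated entries before the oligo pool is ordered.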
Guide RNA preparation
Mix the DNA template for either crRNA or tracrRNA, forward and corresponding
reverse primers, Phusion High-Fidelity PCR Master Mix, and molecular biology
grade water in a nuclease-free PCR tube following the volumes listed in the
table below. Pipette up and down 10 times or until well mixed.
Reagent | Volume (μL) |
---|---|
DNA template | 4 |
Forward primer (10 μM) | 2.5 |
Reverse primer (10 μM) | 2.5 |
Phusion High-Fidelity PCR Master Mix | 25 |
Molecular biology grade water | 16 |
Total | 50 |
Amplify the DNA templates for crRNA and tracrRNA in a thermal cycler for PCR. The steps in the thermal cycle are listed below.
Step | Temperature (℃) | Time (s) | Cycles |
---|---|---|---|
Initial denaturation | 98 ℃ | 10 | 1x |
Denaturation | 98 ℃ | 5 | 12x |
Annealing | 55 ℃ | 15 | |
Final extension | 72 ℃ | 60 | 1x |
Hold | 4 ℃ | ∞ | 1x |
Mix the ATP, UTP, GTP, and CTP provided in the TranscriptAid T7 High Yield Transcription Kit in a 1:1:1:1 ratio in a nuclease-free microcentrifuge tube. Pipette up and down 10 times or until well mixed.
For the transcription reaction of crRNA and tracrRNA, mix the reagents
from TranscriptAid T7 High Yield Transcription Kit and the PCR-amplified DNA
templates in nuclease-free PCR tubes following the volumes provided in the
table below. Pipette up and down 10 times or until well mixed. A 50-μL
PCR-amplified DNA template can be divided into 4 transcription reactions in
this step.
Reagent | Volume (μL) |
---|---|
Mixed NTP | 16 |
PCR-amplified DNA template | 12 |
5X TranscriptAid Reaction Buffer | 8 |
TranscriptAid Enzyme Mix | 4 |
Total | 40 |
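When several transcription reactions are set up in parallel, the total volume of each reagent follows directly from the table above. The Python sketch below is only an arithmetic aid; the helper name `scale_volumes` is hypothetical, and the example of 4 reactions mirrors the statement that one 50-μL PCR product can be divided into 4 reactions.

```python
# Per-reaction volumes (uL) copied from the transcription table.
PER_REACTION_UL = {
    "Mixed NTP": 16,
    "PCR-amplified DNA template": 12,
    "5X TranscriptAid Reaction Buffer": 8,
    "TranscriptAid Enzyme Mix": 4,
}


def scale_volumes(per_reaction_ul, n_reactions):
    """Multiply every per-reaction volume by the number of reactions."""
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    return {reagent: volume * n_reactions
            for reagent, volume in per_reaction_ul.items()}


totals = scale_volumes(PER_REACTION_UL, 4)
for reagent, volume in totals.items():
    print(f"{reagent}: {volume} uL")
print("Total:", sum(totals.values()), "uL")
```

For 4 reactions the template requirement is 48 μL, which fits within one 50-μL PCR product.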
Incubate the RNA transcription samples at 37 ℃ for 4 hours.
Add 5 μL of DNase I and 5 μL of DNA Digestion Buffer provided in RNA Clean & Concentrator-5 (DNase Included) to each 40-μL RNA transcription reaction. Pipette up and down 10 times or until well mixed.
Incubate at room temperature for 15 min.
Follow the “Total RNA Clean-up” instructions in the user manual of RNA Clean & Concentrator-5. Use 15 μL of DNase/RNase-Free Water to elute the purified crRNA or tracrRNA product. Incubate for 5 min before the final centrifugation to ensure maximum yield.
Pipette each purified crRNA or tracrRNA product up and down 10 times or until well mixed. Take 1 μL from each purified RNA sample and dilute it 100-fold in 99 μL of molecular biology grade water. Pipette up and down 10 times or until well mixed. Quantify each 100-fold-diluted purified crRNA or tracrRNA using the Qubit™ RNA Broad Range (BR) Assay Kit by adding 10 μL of diluted sample to 190 μL of master mix.
Aliquot the crRNA and tracrRNA samples and store at -80 ℃ before use.
Right before making the CRISPR-NGS library, mix crRNA and tracrRNA in an equimolar ratio (see Equation 1), then add Nuclease-Free Duplex Buffer to reach a final guide RNA concentration of 1500 ng/μL (see Equation 2). Pipette up and down 10 times or until well mixed.


Incubate the mixture in a thermal cycler at 94 ℃ for 2 min, then slowly cool down to room temperature. The guide RNA is ready to use.
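The equimolar mixing and dilution arithmetic for the duplexing step can be sketched in Python as follows. This is an illustration under stated assumptions, not necessarily the authors' Equations 1 and 2: it assumes that the 1500 ng/μL target refers to the combined mass of crRNA and tracrRNA, and that the user supplies the molar mass of each RNA. The function name and all example numbers are hypothetical.

```python
def duplex_mix_volumes(crrna_ng_per_ul, tracr_ng_per_ul,
                       crrna_g_per_mol, tracr_g_per_mol,
                       crrna_ul, target_ng_per_ul=1500.0):
    """Volumes (uL) of tracrRNA and Duplex Buffer for a chosen crRNA volume."""
    # ng divided by g/mol gives nmol.
    crrna_nmol = crrna_ul * crrna_ng_per_ul / crrna_g_per_mol
    # Equimolar ratio: add the same number of nmol of tracrRNA.
    tracr_ul = crrna_nmol * tracr_g_per_mol / tracr_ng_per_ul
    # Assumption: the target concentration counts the mass of both RNAs.
    total_ng = crrna_ul * crrna_ng_per_ul + tracr_ul * tracr_ng_per_ul
    final_ul = total_ng / target_ng_per_ul
    buffer_ul = final_ul - crrna_ul - tracr_ul
    if buffer_ul < 0:
        raise ValueError("RNA stocks are too dilute to reach the target concentration")
    return {"tracrRNA": tracr_ul, "Duplex Buffer": buffer_ul, "Final volume": final_ul}


# Hypothetical stock concentrations and molar masses, for illustration only.
mix = duplex_mix_volumes(crrna_ng_per_ul=3000.0, tracr_ng_per_ul=4000.0,
                         crrna_g_per_mol=14000.0, tracr_g_per_mol=28000.0,
                         crrna_ul=10.0)
for name, volume in mix.items():
    print(f"{name}: {volume:.1f} uL")
```

The function raises an error instead of returning a negative buffer volume, which would indicate that the stocks cannot reach the target concentration without further concentration.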
DNA sample preparation (Sewage sample as an example)
Shake the sewage sample until well mixed, and filter 50 mL of the sewage sample through a 0.45 μm pore size membrane filter.
Store the membrane filter at -80 ℃ until DNA extraction.
Extract DNA from the membrane filter using the FastDNA™ SPIN Kit for Soil following the user’s manual. Elute the DNA samples using 100 μL of DES provided in the kit.
Purify the DNA samples using OneStep PCR Inhibitor Removal Kit following the user’s manual.
Determine the concentrations of the DNA samples using the Qubit™ 1X dsDNA High Sensitivity (HS) Assay Kit by adding 2 μL of diluted sample to 198 μL of master mix.
Aliquot and store the DNA samples at -20 ℃ or -80 ℃ until library preparation.
CRISPR-NGS library preparation
Right before library preparation, dilute the “NH8B” external standard 100-fold using
molecular biology grade water.
Determine the concentration of the diluted “NH8B” external standard using the Qubit™ 1X dsDNA High Sensitivity (HS) Assay Kit by adding 2 μL of diluted external standard to 198 μL of master mix.
Mix the diluted Cas9 protein, the duplexed guide RNA, and NEBuffer™ r3.1 in nuclease-free PCR tubes with the volumes listed in the table below. Pipette up and down 10 times or until well mixed.
Reagent | Volume (μL) |
---|---|
Cas9 | 2 |
Guide RNA | 10 |
NEBuffer r3.1 | 3 |
Molecular biology grade water | 4 |
Total | 19 |
Incubate the above mixture at room temperature for at least 15 min to bind guide RNA to Cas9.
Block the DNA samples by removing the 5’ phosphate groups using rAPid Alkaline Phosphatase with the amounts listed in the table below.
Reagent | Amount |
---|---|
rAPid Alkaline Phosphatase Buffer 10x concentrated | 2 μL |
rAPid Alkaline Phosphatase 1 U/μl | 1 μL |
DNA sample | ~200 ng |
Molecular biology grade water | Fill up the volume to 20 μL |
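The DNA and water volumes for the blocking reaction depend on the measured DNA concentration. A minimal Python sketch of that arithmetic is given below; the function name `blocking_reaction_volumes` and the example concentration of 40 ng/μL are hypothetical.

```python
def blocking_reaction_volumes(dna_ng_per_ul, target_dna_ng=200.0, total_ul=20.0):
    """Return the volumes (uL) for one 20-uL DNA-blocking reaction."""
    buffer_ul = 2.0        # rAPid Alkaline Phosphatase Buffer, 10x concentrated
    phosphatase_ul = 1.0   # rAPid Alkaline Phosphatase, 1 U/uL
    dna_ul = target_dna_ng / dna_ng_per_ul
    water_ul = total_ul - buffer_ul - phosphatase_ul - dna_ul
    if water_ul < 0:
        raise ValueError("DNA is too dilute to fit the target mass in the reaction")
    return {"Buffer": buffer_ul, "Phosphatase": phosphatase_ul,
            "DNA sample": dna_ul, "Water": water_ul}


# Hypothetical sample at 40 ng/uL.
volumes = blocking_reaction_volumes(40.0)
for reagent, volume in volumes.items():
    print(f"{reagent}: {volume:.1f} uL")
```

With 3 μL taken up by buffer and enzyme, at most 17 μL is available for DNA, so the sketch raises an error for samples too dilute to supply the target mass.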
Incubate the DNA-blocking reaction mixture in a thermal cycler with the following thermal conditions.
Step | Temperature (℃) | Time (min) |
---|---|---|
Incubation | 37 | 10 |
Phosphatase inactivation | 75 | 2 |
Hold | 4 | ∞ |
Mix the blocked DNA samples, the mixture of Cas9 and guide RNA, and the “NH8B” external standard with the volumes listed in the table below. Pipette up and down 10 times or until well mixed.
Reagent | Volume (μL) |
---|---|
Mixture of Cas9 and guide RNA | 19 |
Blocked DNA | 10 |
100-fold diluted “NH8B” external standard | 1 |
Total | 30 |
Incubate the above mixture at 37 ℃ for 16 hours.
Add 5 μL of RNase T1 to the mixture, pipette up and down 10 times, and incubate at 37 ℃ for 15 min to remove the guide RNA.
Prepare the master mix for dA-tailing, using the reagents and volumes listed in the table below. Pipette up and down 10 times or until well mixed.
Reagent | Volume (μL) |
---|---|
dATP (100 mM) | 2 |
Taq DNA Polymerase | 5 |
ThermoPol Reaction Buffer | 80 |
Molecular biology grade water | 13 |
Total | 100 |
Add 5 μL of the dA-tailing master mix to each 35-μL mixture after RNase T1 treatment to reach a 40 μL total volume. Pipette up and down 10 times or until well mixed.
Incubate the mixture at 72 ℃ for 20 min for dA-tailing and Cas9 inactivation.
Dilute the adapters using molecular biology grade water or xGen™ Adapter Buffer based on the total DNA input, according to the table below.
Total DNA input (ng) | Adapter dilution ratio |
---|---|
>100 | 100x |
<100 | 200x |
Ligate adapters to targeted DNA fragments using the reagents and volumes listed in the table below.
Reagent | Volume (μL) |
---|---|
dA-tailed DNA sample | 35 |
Diluted adapter | 2.5 |
NEBNext Ligation Enhancer | 1 |
NEBNext Ultra II Ligation Master Mix | 30 |
Total | 68.5 |
Incubate the ligation mixture at room temperature for 15 min.
Dilute 1x TE buffer 10-fold to make 0.1x TE buffer.
Purify the adapter-ligated DNA samples using AMPure XP SPRI beads with beads:DNA ratio of 0.8:1 (57 μL of SPRI beads for 68.5 μL of adapter-ligated DNA sample). The detailed SPRI beads cleaning steps are listed in the table below.
Step | On/Off the magnetic rack | Time (min) |
---|---|---|
Bind DNA sample to the beads | off | 5 |
Separate the beads from the liquid phase | on | 5 |
Discard the supernatant | on | / |
1st wash with 80% ethanol | on | 2 |
Discard the supernatant | on | / |
2nd wash with 80% ethanol | on | 2 |
Discard the supernatant | on | / |
Air dry the beads with the lid open | on | 3-5 |
Resuspend the beads with 17 μL of 0.1x TE buffer | off | / |
Release DNA from the beads to the liquid phase | off | 10 |
Separate the beads from the liquid phase | on | 5 |
Transfer 15 μL of the supernatant to a clean PCR tube | on | / |
Dilute the xGen™ Library Amplification Primer Mix 2-fold by adding an equal volume of molecular biology grade water.
Mix the beads-purified DNA sample, diluted primer mix, and NEBNext® Ultra™ II Q5® Master Mix in a nuclease-free PCR tube according to the table below. Pipette up and down 10 times or until well mixed.
Reagent | Volume (μL) |
---|---|
Beads-purified DNA sample | 15 |
Diluted primer mix | 10 |
NEBNext Ultra II Q5 Master Mix | 25 |
Total | 50 |
Incubate the mixture above in a thermal cycler using the thermal cycle listed in the table below.
Step | Temperature (℃) | Time | Cycles |
---|---|---|---|
Initial denaturation | 98 ℃ | 30 s | 1x |
Denaturation | 98 ℃ | 10 s | 22x for >100 ng DNA input; 30x for <100 ng DNA input |
Annealing | 65 ℃ | 75 s | |
Final extension | 65 ℃ | 5 min | 1x |
Hold | 4 ℃ | ∞ | 1x |
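The adapter dilution and the PCR cycle number both depend on the total DNA input, so the two lookups can be combined. The Python sketch below is a hypothetical helper; because the tables do not specify an input of exactly 100 ng, treating it as the high-input case is an assumption of this sketch.

```python
def library_settings(total_dna_input_ng):
    """Return (adapter dilution factor, PCR cycles) for a given DNA input in ng."""
    if total_dna_input_ng <= 0:
        raise ValueError("DNA input must be positive")
    # Assumption: exactly 100 ng is handled like the >100 ng case.
    if total_dna_input_ng >= 100:
        return 100, 22
    return 200, 30


for input_ng in (150, 50):
    dilution, cycles = library_settings(input_ng)
    print(f"{input_ng} ng input: dilute adapters {dilution}x, run {cycles} cycles")
```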
Purify the PCR product using AMPure XP SPRI beads with beads:DNA ratio
of 0.9:1 (45 μL of SPRI beads for 50 μL of PCR product). The detailed SPRI
beads cleaning steps are listed in the table below.
Step | On/Off the magnetic rack | Time (min) |
---|---|---|
Bind the PCR product to the beads | off | 5 |
Separate the beads from the liquid phase | on | 5 |
Discard the supernatant | on | / |
1st wash with 80% ethanol | on | 2 |
Discard the supernatant | on | / |
2nd wash with 80% ethanol | on | 2 |
Discard the supernatant | on | / |
Air dry the beads with the lid open | on | 3-5 |
Resuspend the beads with 33 μL of 0.1x TE buffer | off | / |
Release DNA from the beads to the liquid phase | off | 10 |
Separate the beads from the liquid phase | on | 5 |
Transfer 30 μL of the supernatant to a clean tube | on | / |
Determine the DNA concentration of the library using the Qubit™ 1X dsDNA High Sensitivity (HS) Assay Kit by adding 2 μL of the library to 198 μL of master mix.
According to the DNA concentration, take 1-2 μL of the library and dilute to ~1 ng/μL for the fragment analyzer.
Store the libraries at -20 ℃ or -80 ℃ until the sequencing run.
NGS read mapping
After sequencing, download all raw sequencing data files.
Unzip the files to get ".fastq" files for each library.
Download PRICE from https://derisilab.ucsf.edu/software/price/index.html to the local Linux environment. Install by navigating to the PRICE directory and typing "make" in the command line tool.
Clone the KMA repository from https://bitbucket.org/genomicepidemiology/kma/src/master/ using the command below to the local Linux environment. Install by navigating to the KMA directory and typing "make" in the command line tool.
git clone https://bitbucket.org/genomicepidemiology/kma.git
Filter out low-quality sequencing reads using PriceSeqFilter, requiring that at least 85% of the nucleotides in a read are high quality (minimum probability of a nucleotide being correct: 98%) and that at least 90% of the nucleotides in a read are called. An example of the command for paired-end reads is shown below.
PATH_TO_PRICE_FOLDER/Price/PriceSeqFilter -fp R1.fastq R2.fastq -rqf 85 0.98 -rnf 90 -op R1_filtered.fastq R2_filtered.fastq
Make a copy of the ".fasta" file used for generating guide RNA in Step 8. Add the sequence of the "NH8B" external standard to the end of the copied file. This ".fasta" file will be used as a reference gene list.
Index the ".fasta" reference gene list file with KMA using the command below.
kma index -i REFERENCE_GENE_LIST.fasta -o INDEX_FILE_PREFIX
Map the filtered ".fastq" files to the indexed reference gene list with KMA using the command below.
kma -ipe R1_filtered.fastq R2_filtered.fastq -a -t_db INDEX_FILE_PREFIX -o OUTPUT_FILE_PREFIX
The read mapping results for the target genes are available in the ".res" file generated by KMA.
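The ".res" file is a tab-separated table, so it can be inspected with a short script. The Python sketch below assumes the header line starts with "#Template" and includes "Template_Coverage" and "Depth" columns; verify these names against your own KMA output. The function names, the coverage threshold of 90, and the two example rows are hypothetical.

```python
import csv
import io


def read_kma_res(handle):
    """Parse a KMA '.res' table into a list of dicts keyed by column name."""
    reader = csv.reader(handle, delimiter="\t")
    # The first column name carries a leading '#', which is removed here.
    header = [name.strip().lstrip("#") for name in next(reader)]
    rows = []
    for fields in reader:
        if fields:
            rows.append(dict(zip(header, (f.strip() for f in fields))))
    return rows


def detected_templates(rows, min_coverage=0.0):
    """Return (template, depth) pairs with Template_Coverage >= min_coverage."""
    hits = [(row["Template"], float(row["Depth"]))
            for row in rows
            if float(row["Template_Coverage"]) >= min_coverage]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Synthetic two-row example; all values are made up for illustration.
example = io.StringIO(
    "#Template\tScore\tExpected\tTemplate_length\tTemplate_Identity\t"
    "Template_Coverage\tQuery_Identity\tQuery_Coverage\tDepth\tq_value\tp_value\n"
    "3000001\t 500\t 10\t 900\t 99.00\t 100.00\t 99.00\t 100.00\t 12.50\t 400.0\t 1.0e-26\n"
    "3000002\t 50\t 10\t 800\t 95.00\t 40.00\t 95.00\t 100.00\t 0.80\t 30.0\t 1.0e-05\n"
)
rows = read_kma_res(example)
hits = detected_templates(rows, min_coverage=90.0)
print(hits)
```

For a real run, pass a file handle opened with `open("OUTPUT_FILE_PREFIX.res", newline="")` in place of the synthetic example.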