LI Detector Analytical Pipeline

Saurin B Parikh

Published: 2023-02-24 DOI: 10.17504/protocols.io.3byl4kjd2vo5/v1

Abstract

The LI Detector framework consists of integrated experimental and analytical pipelines. A. The pin-copy-upscale experimental pipeline from frozen glycerol stocks (top) to imaging (bottom). Each box represents a pinning step, and the steps within the sky-blue highlighted portion can be repeated until the desired colony density is reached. The illustrations to the right of the flowchart are simplified representations of four experimental plates. A reference population (grey) is introduced on every plate during the first upscale step. The analytical pipeline uses this population for spatial bias correction and for relative fitness estimation of the mutant strains of interest (purple). B. Workflow of the analysis pipeline, where columns from left to right represent user inputs, analytical steps, and outputs. User inputs consist of raw colony size estimates and the strain layout of the plates. The analytical pipeline performs: i) local artifact correction, ii) source normalization, iii) reference-based background colony size estimation using two-dimensional linear interpolation, iv) spatial bias correction, which divides the local artifact corrected colony sizes by the background colony sizes to give a measure of relative fitness, and v) empirical p-value assignment using the reference strain relative fitness distribution. The outputs include local artifact corrected colony sizes, background colony sizes, spatially corrected relative fitness, and the mutant strains identified as having a mean colony size that is significantly larger or smaller than that of the reference strain.
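Steps iii) and iv) can be illustrated with a small Python sketch. It is not the LI Detector implementation (which runs in MATLAB); it assumes, purely for illustration, that reference colonies sit on a regular lattice at every second row and column and that the plate grid begins and ends on a reference line. The function names and the `step` parameter are hypothetical.

```python
def background_size(sizes, r, c, step=2):
    """Bilinear estimate of the background colony size at (r, c).

    Assumption (hypothetical layout): reference colonies occupy every
    `step`-th row and column starting at (0, 0), and the last row and
    column of the grid are reference lines as well.
    """
    # Nearest reference lines at or before (r, c), and the next ones after.
    r0, c0 = r - r % step, c - c % step
    r1 = min(r0 + step, len(sizes) - 1)
    c1 = min(c0 + step, len(sizes[0]) - 1)
    # Fractional position between the surrounding reference lines.
    tr, tc = (r - r0) / step, (c - c0) / step
    # Interpolate along columns first, then along rows.
    top = (1 - tc) * sizes[r0][c0] + tc * sizes[r0][c1]
    bottom = (1 - tc) * sizes[r1][c0] + tc * sizes[r1][c1]
    return (1 - tr) * top + tr * bottom


def relative_fitness(sizes, r, c, step=2):
    """Colony size divided by its interpolated background size."""
    return sizes[r][c] / background_size(sizes, r, c, step)


# Toy 3 x 3 grid: the four corners are reference colonies.
grid = [
    [100, 90, 200],
    [80, 150, 120],
    [300, 60, 400],
]
center_fitness = relative_fitness(grid, 1, 1)
```

In this toy grid the background at the center is the mean of the four corner references, so a colony that is smaller than its local reference neighborhood gets a relative fitness below 1.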

Before start

The LI Detector analytical pipeline can only be applied to experiments conducted in accordance with the LI Detector experimental pipeline. Please refer to the LI Detector manuscript for best practices on conducting the colony-based high-throughput experiment.

Steps

Files

Plate maps of the starting density plate

A .xlsx file with one plate per sheet
Cells contain strain-id
Example

Table specifying strain-id to orf-name relationship

A .xlsx file that assigns a unique strain_id to each orf_name
First column is strain_id
Second column is orf_name
Each strain_id from Step 1 should have an associated orf_name
Example

*The orf_name variable holds the names of the mutants in the experiment.
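Before initializing, it can help to confirm that every strain_id on the plate maps has an orf_name. The sketch below assumes the two spreadsheets have already been loaded into plain Python structures (reading the .xlsx files is not shown); the function name and the example identifiers are hypothetical.

```python
def missing_orf_names(plate_maps, strain_to_orf):
    """Return the strain ids used on any plate that have no orf_name.

    plate_maps: list of plates, each a list of rows of strain ids
                (None marks an empty cell).
    strain_to_orf: dict mapping strain_id -> orf_name.
    """
    on_plates = {
        strain_id
        for plate in plate_maps
        for row in plate
        for strain_id in row
        if strain_id is not None
    }
    return sorted(on_plates - set(strain_to_orf))


# Hypothetical example: one 2 x 2 plate, strain 3 has no orf_name yet.
plates = [[[1, 2], [3, None]]]
mapping = {1: "YAL001C", 2: "BF_control"}
unmapped = missing_orf_names(plates, mapping)
```

An empty result means the plate maps and the strain_id table are consistent with each other.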

Download LID and dependencies

Dependencies:

Install Database Toolbox from the APPS > Get More Apps option within MATLAB
Download and unzip the MySQL Connector JDBC driver from here.

Download LID and associated scripts from GitHub into your MATLAB folder.

~$ cd MATLAB

~/MATLAB$ git clone https://github.com/sauriiiin/Matlab-Colony-Analyzer-Toolkit.git
~/MATLAB$ git clone https://github.com/sauriiiin/bean-matlab-toolkit.git
~/MATLAB$ git clone https://github.com/sauriiiin/lidetector.git
~/MATLAB$ git clone https://github.com/sauriiiin/sau-matlab-toolkit.git

Make LID bash scripts executable.

~/MATLAB$ cd lidetector

~/MATLAB/lidetector$ chmod +x initialize.sh
~/MATLAB/lidetector$ chmod +x buildraw.sh
~/MATLAB/lidetector$ chmod +x lid.sh

Initialize

Information to have on hand before proceeding:

MySQL credentials - username, password, database name
Name of experiment - this will be used as a prefix for all the tables that will be generated
Upscale patterns from the experiment - i.e., in what combinations the lower density plates were condensed to form the higher density plates
Name (orf_name) of reference strain used
File path to plate map .xlsx file from Step 1
File path to the strain_id to orf_name .xlsx file from Step 2
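To make the idea of an upscale pattern concrete, the sketch below maps a colony on one of four lower density plates to its position on the combined higher density plate. The interleaving and the quadrant order used here are assumptions for illustration only; use the pattern actually followed in your experiment. All names are hypothetical and rows and columns are zero-based.

```python
# Assumed offsets (row, column) for source plates 1-4 within each 2 x 2 block
# of the higher density plate: top-left, top-right, bottom-left, bottom-right.
OFFSETS = {1: (0, 0), 2: (0, 1), 3: (1, 0), 4: (1, 1)}


def upscaled_position(source_plate, row, col):
    """Map (row, col) on a source plate to (row, col) on the upscaled plate."""
    d_row, d_col = OFFSETS[source_plate]
    return 2 * row + d_row, 2 * col + d_col


# Four 16 x 24 (384-density) plates fill one 32 x 48 (1536-density) plate.
targets = {
    upscaled_position(plate, r, c)
    for plate in OFFSETS
    for r in range(16)
    for c in range(24)
}
```

Every source colony lands on a distinct target position, which is the property the _pos2rep table relies on when linking lower density positions to their higher density replicates.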

Execute the initialize bash script from within the lidetector folder.

~/MATLAB/lidetector$ ./initialize.sh

A successful run will create the following tables:

_pos2coor = position ids and their corresponding plate coordinate (density, plate number, column number and row number).
_pos2orf_name = position ids and the corresponding orf-name
_pos2rep = position ids of lowest density plates to their replicates at higher density plates based on the upscale pattern
_pos2strain_id = position ids and their corresponding strain ids
_strainid2orf_name = same as table from Step 2

Example files can be found in Data.zip.

Colony Size Data

Organize colony size estimations from your favorite colony size estimator, like the MATLAB Colony Analyzer Toolkit (MCAT), in ascending order of hours, plate number, column number, row number.

Below is the structure of such a file. Here, image1, image2, and image3 are pixel counts from three different images of the same plate, and the average column holds their mean.

A	B	C	D	E
hours	image1	image2	image3	average
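The ordering and the average column can be sketched in Python as follows. The record keys `plate`, `col`, and `row` are hypothetical names for the plate number, column number, and row number; they are not column names required by the pipeline.

```python
from operator import itemgetter


def order_records(records):
    """Sort records by hours, plate number, column number, then row number."""
    return sorted(records, key=itemgetter("hours", "plate", "col", "row"))


def with_average(record):
    """Return a copy of the record with the mean pixel count of the three images."""
    images = [record["image1"], record["image2"], record["image3"]]
    return {**record, "average": sum(images) / len(images)}


# Hypothetical records in arbitrary order.
records = [
    {"hours": 24, "plate": 1, "col": 1, "row": 1},
    {"hours": 0, "plate": 2, "col": 1, "row": 1},
    {"hours": 0, "plate": 1, "col": 2, "row": 1},
    {"hours": 0, "plate": 1, "col": 1, "row": 2},
]
ordered = order_records(records)
```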

Example

Combine the above table with the position ids from the _pos2coor table using the command below.

~/MATLAB/lidetector$ ./buildraw.sh

Successful completion of this command will generate:

_RAW = raw colony size estimations per hour per position id of all the images
_smudgebox = position ids to be excluded from analysis that correspond to the user defined coordinates
_JPEG = clean version of the raw table with border colonies, colonies corresponding to the smudge box and those colonies with pixel count of less than 10 NULL'd
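The three cleaning rules behind the _JPEG table can be sketched as a single function. This is an illustration, not the pipeline's code: the one-colony border width is an assumption, and the function and parameter names are hypothetical. Rows and columns are zero-based here.

```python
def clean_size(row, col, pixels, n_rows, n_cols,
               smudged=frozenset(), min_pixels=10, border=1):
    """Return the pixel count, or None if the colony should be NULL'd.

    A colony is dropped when it lies on the plate border (assumed to be
    `border` colonies wide), falls inside the smudge box, or has a pixel
    count below `min_pixels`.
    """
    on_border = (
        row < border or col < border
        or row >= n_rows - border or col >= n_cols - border
    )
    if on_border or (row, col) in smudged:
        return None
    if pixels is None or pixels < min_pixels:
        return None
    return pixels


# Hypothetical 32 x 48 plate with one smudged position.
smudge_box = frozenset({(10, 10)})
kept = clean_size(5, 5, 500, 32, 48, smudge_box)
```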

Example files can be found in Data.zip.

10. Users who choose MCAT as their colony size estimation tool can skip steps 8 and 9 and use LI Detector's imageanalyzer function instead.

LID: imageanalyzer

Skip this step if you have successfully executed steps 8 and 9.

Spatial Bias Correction

11. Information to have on hand before proceeding:

Path to MATLAB directory
Path to lidetector directory
Path to where the JDBC driver was unzipped from Step 3

12. Execute the LI Detector.

~/MATLAB/lidetector$ ./lid.sh

A successful run will create the following tables:

_NORM = position ids and their corresponding relative fitness measurements along with the background pixel count measurement based on references
_FITNESS = similar to _NORM but with strain ids and orf-names included
_FITNESS_STAT = strain-id-wise mean, median and standard deviation of relative fitness
_PVALUE = strain-id-wise empirical p-values where stat = (strain mean fitness - reference mean fitness)/reference fitness standard deviation
```
      es = (strain mean fitness - reference mean fitness)/reference mean fitness
```
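The stat and es definitions, together with one possible empirical p-value, can be sketched in Python. Two points are assumptions rather than facts about the pipeline: the sample standard deviation is used for the reference, and the p-value is computed two-sided as the fraction of reference mean fitness values at least as far from their center as the strain mean. How LI Detector builds its reference distribution is not described here, so the sketch takes that list as given. All names are hypothetical.

```python
from statistics import mean, stdev


def effect_statistics(strain_fitness, reference_fitness):
    """Return (stat, es) for one strain relative to the reference."""
    ref_mean = mean(reference_fitness)
    diff = mean(strain_fitness) - ref_mean
    stat = diff / stdev(reference_fitness)  # assumption: sample standard deviation
    es = diff / ref_mean
    return stat, es


def empirical_p(strain_mean, reference_means):
    """Two-sided empirical p-value against a list of reference means (assumed form)."""
    center = mean(reference_means)
    distance = abs(strain_mean - center)
    extreme = sum(1 for m in reference_means if abs(m - center) >= distance)
    return extreme / len(reference_means)


# Hypothetical relative fitness values.
stat, es = effect_statistics([1.5, 1.5], [0.75, 1.0, 1.25])
p_value = empirical_p(1.5, [0.5, 0.75, 1.0, 1.25, 1.5])
```

With an empirical p-value of this form, the smallest attainable non-zero value is one over the number of reference means, so the size of the reference distribution limits the resolution of the test.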

Example files can be found in Data.zip.
