CODA (part 1): setting up environment and preparing sample dataset | HuBMAP | JHU-TMC
Kyu Sang Han, Pei-Hsun Wu, Sashank Reddy, Denis Wirtz, Joel Sunshine, Ashley Kiemen
Abstract
CODA workflow, part 1: setting up the environment and preparing the dataset.
Steps
Software requirements
MATLAB: MATLAB (mathworks.com)
Image Processing Toolbox: Image Processing Toolbox - MATLAB (mathworks.com)
Deep Learning Toolbox: Deep Learning Toolbox - MATLAB (mathworks.com)
MATLAB ResNet-50 model: MATLAB resnet50 (mathworks.com)
Aperio ImageScope: Aperio ImageScope (leicabiosystems.com)
FIJI (ImageJ): Fiji Downloads (imagej.net)
Download Source code
The code is available on GitHub: CODA GitHub Repository
Download Sample dataset
Here, we discuss application to a sample dataset, “lungs”, containing 150 serial histological images. Download the sample dataset (serial images and sample annotations) here: Lung Sample Dataset on Google Drive
Images are in .ndpi format, were scanned at 20x magnification (approximately 0.5 micron / pixel resolution), and are spaced 10 microns apart. Save the images in a folder on a local drive (e.g., \Users\Ashley\Documents\lungs).
Name each image file so that the tissue sections are read in consecutive order by MATLAB. To ensure this, zero-pad the numerical indices in the filenames.
CORRECT FILENAMES: lungs_001.ndpi, lungs_002.ndpi, ..., lungs_011.ndpi
INCORRECT FILENAMES (no zero padding): lungs_1.ndpi, lungs_2.ndpi, ..., lungs_11.ndpi
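The effect of zero-padding on file order can be illustrated with a minimal sketch. The section indices below are hypothetical, and the sketch assumes the files are processed in plain alphabetical order, which is how folder listings are typically returned:

```matlab
% Sketch: compare the sort order of zero-padded and unpadded filenames.
% The indices are hypothetical; the "lungs_" prefix follows the example above.
idx = [1 2 10 11];

padded   = compose("lungs_%03d.ndpi", idx(:));   % lungs_001.ndpi, lungs_002.ndpi, ...
unpadded = compose("lungs_%d.ndpi",   idx(:));   % lungs_1.ndpi, lungs_2.ndpi, ...

% Alphabetical sorting keeps the padded names in section order
sorted_padded = sort(padded);

% Alphabetical sorting reorders the unpadded names
sorted_unpadded = sort(unpadded);

disp(sorted_padded');
disp(sorted_unpadded');
```

With unpadded names, lungs_10.ndpi and lungs_11.ndpi sort before lungs_2.ndpi, so the sections would no longer be read consecutively.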
Create downsampled copies of high-resolution images
The function create_downsampled_tif_images creates downsampled copies of the .ndpi files by loading each high-resolution image in tiles and downsampling it to the desired pixel resolutions.
First, decide the resolutions of the images you want to create. Here, we create images at 1 micron / pixel, 2 microns / pixel, and 10 microns / pixel resolution:
ds=[1 2 10];
Next, decide on the names of the output folders for the downsampled images, listed in the same order as the resolutions in ds. Here, we will save the images downsampled to 1 micron / pixel in a folder named “10x”, the images downsampled to 2 microns / pixel in a folder named “5x”, and the images downsampled to 10 microns / pixel in a folder named “1x”.
subfolders=["10x" "5x" "1x"];
Finally, call the function, where pth is the path to the folder containing the .ndpi images: create_downsampled_tif_images(pth,ds,subfolders);
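The relationship between the target resolutions and the original scan can be worked out with a short sketch. It assumes the native resolution of approximately 0.5 micron / pixel stated above; the variable names native_um_per_px and factor are introduced here for illustration only and are not part of the CODA code:

```matlab
% Sketch: downsampling factor implied by each target resolution.
% Assumes a native resolution of ~0.5 micron / pixel (20x scan), as stated above.
native_um_per_px = 0.5;              % approximate value for the 20x scan
ds = [1 2 10];                       % target resolutions in microns / pixel
subfolders = ["10x" "5x" "1x"];      % one output folder per entry of ds

% Ratio of target pixel size to native pixel size
factor = ds ./ native_um_per_px;

for k = 1:numel(ds)
    fprintf('%s: %g micron / pixel, downsampled by a factor of %g\n', ...
        subfolders(k), ds(k), factor(k));
end
```

Under this assumption, the folder names correspond to the effective magnification: a 20x image downsampled by a factor of 2 is roughly equivalent to a 10x scan.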
Using this function, you will make three subfolders within the original folder containing the .ndpi images: one named “10x” containing the 20x images downsampled by a factor of 2, one named “5x” containing the 20x images downsampled by a factor of 4, and one named “1x” containing the 20x images downsampled by a factor of 20. Most calculations will be performed on these tif images. Note: here we use 10x, 5x, and 1x as examples, but other resolutions could be created as desired.
pth10x=[pth,'10x'];
pth1x=[pth,'1x'];
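Note that the concatenation above only yields a valid subfolder path if pth ends with a file separator. As a hedged alternative, MATLAB's built-in fullfile inserts the separator itself. The folder name 'lungs' and the variable names below are placeholders for illustration:

```matlab
% Sketch: building subfolder paths with and without a trailing separator.
% 'lungs' is a placeholder folder name, not a path from the protocol.
pth_no_sep = 'lungs';

% Plain concatenation silently produces the wrong name: 'lungs10x'
bad_pth10x = [pth_no_sep, '10x'];

% fullfile inserts the platform's file separator between the parts
good_pth10x = fullfile(pth_no_sep, '10x');
good_pth1x  = fullfile(pth_no_sep, '1x');

disp(bad_pth10x);
disp(good_pth10x);
disp(good_pth1x);
```

If later steps expect a trailing separator on these paths, one can be appended with filesep.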
Note: If this code fails due to memory constraints on your computer, try the OpenSlide Python library instead.