Default Config Settings
file_names
The file_names section specifies the files that will be used throughout the pipeline. Variables in this section can be changed at any point in the pipeline, and the notebook created using it can still be loaded in.
-
notebook_name: str.
Name of notebook file in output directory will be notebook_name.npz
Default:
notebook
-
input_dir: str.
Directory where the raw .nd2 files or .npy stacks are
Default:
MUST BE SPECIFIED
-
output_dir: str.
Directory where notebook is saved
Default:
MUST BE SPECIFIED
-
tile_dir: str.
Directory where tile .npy files saved
Default:
MUST BE SPECIFIED
-
round: maybe_list_str.
Names of .nd2 files for the imaging rounds. Leave empty if only using anchor.
Default:
None
-
anchor: maybe_str.
Name of the file for the anchor round. Leave empty if not using anchor.
Default:
None
-
raw_extension: str.
.nd2 or .npy indicating the data type of the raw data.
Default:
.nd2
-
raw_metadata: maybe_str.
If .npy raw_extension, this is the name of the .json file in input_dir which contains the metadata required extracted from the initial .nd2 files. I.e. it contains the output of coppafish/utils/nd2/save_metadata:
-
xy_pos
-List [n_tiles x 2]
. xy position of tiles in pixels. -
pixel_microns
-float
. xy pixel size in microns. -
pixel_microns_z
-float
. z pixel size in microns. -
sizes
- dict with fov (t
), channels (c
), y, x, z-planes (z
) dimensions.
Default:
None
-
-
dye_camera_laser: maybe_file.
csv file giving the approximate raw intensity for each dye with each camera/laser combination. If not set, the file coppafish/setup/dye_camera_laser_raw_intensity.csv file will be used.
Default:
None
-
code_book: str.
Text file which contains the codes indicating which dye to expect on each round for each gene.
Default:
MUST BE SPECIFIED
-
scale: str.
Text file saved in tile_dir containing
extract['scale']
andextract['scale_anchor']
values used to create the tile .npy files in the tile_dir. If the second value is 0, it meansextract['scale_anchor']
has not been calculated yet.If the extract step of the pipeline is re-run with
extract['scale']
orextract['scale_anchor']
different to values saved here, an error will be raised.Default:
scale
-
psf: str.
npy file in output directory indicating average spot shape. If deconvolution required and file does not exist, will be computed automatically in extract step. (this is psf before tapering and scaled to fill uint16 range).
Default:
psf
-
omp_spot_shape: str.
npy file in output_dir indicating average shape in omp coefficient image. It only indicates the sign of the coefficient i.e. only contains -1, 0, 1. If file does not exist, it is computed from the coefficient images of all genes of the central tile.
Default:
omp_spot_shape
-
omp_spot_info: str.
npy file in output_dir containing information about spots found in omp step. After each tile is completed, information will be saved to this file. If file does not exist, it will be saved after first tile of OMP step.
Default:
omp_spot_info
-
omp_spot_coef: str.
npz file in output_dir containing gene coefficients for all spots found in omp step. After each tile is completed, information will be saved to this file. If file does not exist, it will be saved after first tile of OMP step.
Default:
omp_spot_coef
-
big_dapi_image: maybe_str.
npz file in output_dir where stitched DAPI image is saved. If it does not exist, it will be saved if
basic_info['dapi_channel']
is notNone
. Leave blank to not save stitched anchorDefault:
dapi_image
-
big_anchor_image: maybe_str.
npz file in output_dir where stitched image of
ref_round
/ref_channel
is saved. If it does not exist, it will be saved. Leave blank to not save stitched anchorDefault:
anchor_image
-
pciseq: list_str.
csv files in output_dir where plotting information for pciSeq will be saved. First file is name where omp method output will be saved. Second file is name where ref_spots method output will be saved. If files don't exist, they will be created when the function coppafish/export_to_pciseq is run.
Default:
pciseq_omp, pciseq_anchor
basic_info
The basic_info section indicates information required throughout the pipeline.
-
is_3d: bool.
Whether to use the 3d pipeline.
Default:
MUST BE SPECIFIED
-
anchor_channel: maybe_int.
Channel in anchor round used as reference and to build coordinate system on. Usually channel with most spots. Leave blank if anchor not used.
Default:
None
-
dapi_channel: maybe_int.
Channel in anchor round that contains DAPI images. This does not have to be in
use_channels
as anchor round is dealt with separately. Leave blank if no DAPI.Default:
None
-
ref_round: maybe_int.
Round to align all imaging rounds to. Will be set to
anchor_round
ifanchor_channel
andfile_names['anchor']
specified.Default:
None
-
ref_channel: maybe_int.
Channel in
ref_round
used as reference and to build coordinate system on. Usually channel with most spots. Will be set toanchor_channel
ifanchor_channel
andfile_names['anchor']
specified.Default:
None
-
use_channels: maybe_list_int.
Channels in imaging rounds to use throughout pipeline. Leave blank to use all.
Default:
None
-
use_rounds: maybe_list_int.
Imaging rounds to use throughout pipeline. Leave blank to use all.
Default:
None
-
use_z: maybe_list_int.
z planes used to make tile .npy files. Leave blank to use all. If 2 values provided, all z-planes between and including the values given will be used.
Default:
None
-
use_tiles: maybe_list_int.
Tiles used throughout pipeline. Leave blank to use all. For an experiment where the tiles are arranged in a 4 x 3 (ny x nx) grid, tile indices are indicated as below:
| 2 | 1 | 0 |
| 5 | 4 | 3 |
| 8 | 7 | 6 |
| 11 | 10 | 9 |
Default:
None
-
ignore_tiles: maybe_list_int.
It is often easier to select tiles to remove than to use. All tiles listed here will be ignored. Leave blank to use all.
Default:
None
-
use_dyes: maybe_list_int.
Dyes to use when when assigning spots to genes. Leave blank to use all.
Default:
None
-
dye_names: maybe_list_str.
Name of dyes used in correct order. So for gene with code
360...
, gene appears withdye_names[3]
in round 0,dye_names[6]
in round 1,dye_names[0]
in round 2 etc. If left blank, then assumes each channel corresponds to a different dye i.e. code 0 in code_book = channel 0. For quad_cam data, this needs to be specified.Default:
None
-
channel_camera: maybe_list_int.
channel_camera[i]
is the wavelength in nm of the camera used for channeli
. Only need to be provided ifdye_names
provided to help estimate dye intensity in each channel.Default:
None
-
channel_laser: maybe_list_int.
channel_laser[i]
is the wavelengths in nm of the camera/laser used for channeli
. Only need to be provided ifdye_names
provided to help estimate dye intensity in each channel.Default:
None
-
tile_pixel_value_shift: int.
This is added onto every tile (except DAPI) when it is saved and removed from every tile when loaded. Required so we can have negative pixel values when save to .npy as uint16.
Default:
15000
-
ignore_first_z_plane: bool.
Previously had cases where first z plane in .nd2 file was in wrong place and caused focus stacking to be weird or identify lots of spots on first plane. Hence it is safest to not load first plane and this is done if
ignore_first_z_plane = True
.Default:
True
extract
The extract section contains parameters which specify how to filter the raw microscope images to produce the .npy files saved to file_names['tile_dir']
.
-
wait_time: int.
Time to wait in seconds for raw data to come in before crashing. Assumes first round is already in the
file_names['input_dir']
Want this to be large so can run pipeline while collecting data.Default:
21600
-
r1: maybe_int.
Filtering is done with a 2D difference of hanning filter with inner radius
r1
within which it is positive and outer radiusr2
so annulus betweenr1
andr2
is negative. Should be approx radius of spot. Typical = 3.For
r1 = 3
andr2 = 6
, a2048 x 2048 x 50
image took 4.1s. For2 <= r1 <= 5
andr2
double this, the time taken seemed to be constant.Leave blank to auto detect using
r1_auto_microns micron
.Default:
None
-
r2: maybe_int.
Filtering is done with a 2D difference of hanning filter with inner radius
r1
within which it is positive and outer radiusr2
so annulus betweenr1
andr2
is negative. Should be approx radius of spot. Typical = 6. Leave blank to set to twicer1
.Default:
None
-
r_dapi: maybe_int.
Filtering for DAPI images is a tophat with r_dapi radius. Should be approx radius of object of interest. Typical = 48. Leave blank to auto detect using
r_dapi_auto_microns
.Default:
None
-
r1_auto_microns: number.
If
r1
not specified, will convert to units of pixels from this micron value.Default:
0.5
-
r_dapi_auto_microns: maybe_number.
If
r_dapi
not specified. Will convert to units of pixels from this micron value. Typical = 8.0. If both this andr_dapi
left blank, DAPI image will not be filtered and no .npy file saved. Instead DAPI will be loaded directly from raw data and then stitched.Default:
None
-
scale: maybe_number.
Each filtered image is multiplied by scale. This is because the image is saved as uint16 so to gain information from the decimal points, should multiply image so max pixel number is in the 10,000s (less than 65,536). Leave empty to auto-detect using
scale_norm
.Default:
None
-
scale_norm: maybe_int.
If
scale
not given,scale = scale_norm/max(scale_image)
. Wherescale_image
is then_channels x n_y x n_x x n_z
image belonging to the central tile (saved asnb.extract_debug.scale_tile
) of round 0 after filtering and smoothing.Must be less than
np.iinfo(np.uint16).max - config['basic_info']['tile_pixel_value_shift']
which is typically \(65535 - 15000 = 50535\).Default:
35000
-
scale_anchor: maybe_number.
Analogous to
scale
but have different normalisation for anchor round/anchor channel as not used in final spot_colors. Leave empty to auto-detect usingscale_norm
.Default:
None
-
auto_thresh_multiplier: number.
nb.extract.auto_thresh[t,r,c]
is default threshold to find spots on tile t, round r, channel c. Value is set toauto_thresh_multiplier * median(abs(image))
whereimage
is the image produced for tile t, round r, channel c in the extract step of the pipeline and saved tofile_names['tile_dir']
.Default:
10
-
deconvolve: bool.
For 3D pipeline, whether to perform wiener deconvolution before hanning filtering.
Default:
False
-
psf_detect_radius_xy: int.
Need to detect spots to determine point spread function (psf) used in the wiener deconvolution. Only relevant if
deconvolve == True
. To detect spot, pixel needs to be above dilation with this radius in xy plane.Default:
2
-
psf_detect_radius_z: int.
Need to detect spots to determine point spread function (psf) used in the wiener deconvolution. Only relevant if
deconvolve == True
. To detect spot, pixel needs to be above dilation with this radius in z direction.Default:
2
-
psf_intensity_thresh: maybe_number.
Spots contribute to
psf
if they are above this intensity. If not given, will be computed the same asauto_thresh
i.e.median(image) + auto_thresh_multiplier*median(abs(image-median(image)))
. Note that for raw data,median(image)
is not zero hence the difference.Default:
None
-
psf_isolation_dist: number.
Spots contribute to
psf
if more thanpsf_isolation_dist
from nearest spot.Default:
20
-
psf_min_spots: int.
Need this many isolated spots to determine
psf
.Default:
300
-
psf_shape: list_int.
Diameter of psf in y, x, z direction (in units of [xy_pixels, xy_pixels, z_pixels]).
Default:
181, 181, 19
-
psf_annulus_width: number.
psf
is assumed to be radially symmetric within each z-plane so assume all values within annulus of this size (in xy_pixels) to be the same.Default:
1.4
-
wiener_constant: number.
Constant used to compute wiener filter from
psf
.Default:
50000
-
wiener_pad_shape: list_int.
When applying the wiener filter, we pad the raw image to median value linearly with this many pixels at end of each dimension.
Default:
20, 20, 3
-
r_smooth: maybe_list_int.
Radius of averaging filter to do smoothing of filtered image. Provide two numbers to do 2D smoothing and three numbers to do 3D smoothing. Typical 2D:
2, 2
. Typical 3D:1, 1, 2
. Recommended use is in 3D only as it incorporates information between z-planes which filtering with difference of hanning kernels does not.Size of
r_smooth
has big influence on time taken for smoothing. For a2048 x 2048 x 50
image:-
r_smooth = 1, 1, 2
: 2.8 seconds -
r_smooth = 2, 2, 2
: 8.5 seconds
Leave empty to do no smoothing.
Default:
None
-
-
n_clip_warn: int.
If the number of pixels that are clipped when saving as uint16 is more than
n_clip_warn
, a warning message will occur.Default:
1000
-
n_clip_error: maybe_int.
If the number of pixels that are clipped when saving as uint16 is more than
n_clip_error
forn_clip_error_images_thresh
images, the extract and filter step will be halted. If left blank, n_clip_error will be set to 1% of pixels of a single z-plane.Default:
None
-
n_clip_error_images_thresh: int.
If the number of pixels that are clipped when saving as uint16 is more than
n_clip_error
forn_clip_error_images_thresh
images, the extract and filter step will be halted.Default:
3
find_spots
The find_spots section contains parameters which specify how to convert the images produced in the extract section to point clouds.
-
radius_xy: int.
To be detected as a spot, a pixel needs to be above dilation with structuring element which is a square (
np.ones
) of width2*radius_xy-1
in the xy plane.Default:
2
-
radius_z: int.
To be detected as a spot, a pixel needs to be above dilation with structuring element which is cuboid (
np.ones
) with width2*radius_z-1
in z direction. Must be more than 1 to be 3D.Default:
2
-
max_spots_2d: int.
If number of spots detected on particular z-plane of an imaging round is greater than this, then will only select the
max_spots_2d
most intense spots on that z-plane. I.e. PCR works better if trying to fit fewer more intense spots. This only applies to imaging rounds and not ref_round/ref_channel as need lots of spots then. In 2D, allow more spots as only 1 z-planeDefault:
1500
-
max_spots_3d: int.
Same as
max_spots_2d
for the 3D pipeline. In 3D, need to allow less spots on a z-plane as have many z-planes.Default:
500
-
isolation_radius_inner: number.
To determine if spots are isolated, filter image with annulus between
isolation_radius_inner
andisolation_radius
.isolation_radius_inner
should be approx the radius where intensity of spot crosses from positive to negative. It is in units of xy-pixels. This filtering will only be applied to spots detected in the ref_round/ref_channel.Default:
4
-
isolation_radius_xy: number.
Outer radius of annulus filtering kernel in xy direction in units of xy-pixels.
Default:
14
-
isolation_radius_z: number.
Outer radius of annulus filtering kernel in z direction in units of z-pixels.
Default:
1
-
isolation_thresh: maybe_number.
Spot is isolated if value of annular filtered image at spot location is below the
isolation_thresh
value. Leave blank to automatically determine value usingauto_isolation_thresh_multiplier
. multiplied by the threshold used to detect the spots i.e. the extract_auto_thresh value.Default:
None
-
auto_isolation_thresh_multiplier: number.
If
isolation_thresh
left blank, it will be set toisolation_thresh = auto_isolation_thresh_multiplier * nb.extract.auto_thresh[:, r, c]
.Default:
-0.2
-
n_spots_warn_fraction: number.
Used in coppafish/find_spots/base/check_n_spots
A warning will be raised if for any tile, round, channel the number of spots detected is less than:
n_spots_warn = n_spots_warn_fraction * max_spots * nb.basic_info.nz
where
max_spots
ismax_spots_2d
if 2D andmax_spots_3d
if 3D.Default:
0.1
-
n_spots_error_fraction: number.
Used in coppafish/find_spots/base/check_n_spots. An error is raised if any of the following are satisfied:
-
For any given channel, the number of spots found was less than
n_spots_warn
for at least the fractionn_spots_error_fraction
of tiles/rounds. -
For any given tile, the number of spots found was less than
n_spots_warn
for at least the fractionn_spots_error_fraction
of rounds/channels. -
For any given round, the number of spots found was less than
n_spots_warn
for at least the fractionn_spots_error_fraction
of tiles/channels.
Default:
0.5
-
stitch
The stitch section contains parameters which specify how the overlaps between neighbouring tiles are found. Note that references to south in this section should really be north and west should be east.
-
expected_overlap: number.
Expected fractional overlap between tiles. Used to get initial shift search if not provided.
Default:
0.1
-
auto_n_shifts: list_int.
If
shift_south_min/max
and/orshift_west_min/max
not given, the initial shift search will haveauto_n_shifts
either side of the expected shift given theexpected_overlap
with step given byshift_step
. First value gives \(n_{shifts}\) in direction of overlap (y for south, x for west). Second value gives \(n_{shifts}\) in other direction (x for south, y for west). Third value gives \(n_{shifts}\) in z.Default:
20, 20, 1
-
shift_south_min: maybe_list_int.
Can manually specify initial shifts. Exhaustive search will include all shifts between min and max with step given by
shift_step
. Each entry should be a list of 3 values: [y, x, z]. Typical:-1900, -100, -2
Default:
None
-
shift_south_max: maybe_list_int.
Can manually specify initial shifts. Exhaustive search will include all shifts between min and max with step given by
shift_step
. Each entry should be a list of 3 values: [y, x, z]. Typical:-1700, 100, 2
Default:
None
-
shift_west_min: maybe_list_int.
Can manually specify initial shifts. Exhaustive search will include all shifts between min and max with step given by
shift_step
. Each entry should be a list of 3 values: [y, x, z]. Typical:-100, -1900, -2
Default:
None
-
shift_west_max: maybe_list_int.
Can manually specify initial shifts. Shift range will run between min to max with step given by
shift_step
. Each entry should be a list of 3 values: [y, x, z]. Typical:100, -1700, 2
Default:
None
-
shift_step: list_int.
Step size to use in y, x, z when finding shift between tiles.
Default:
5, 5, 3
-
shift_widen: list_int.
If shift in initial search range has score which does not exceed
shift_score_thresh
, then range will be extrapolated with same step byshift_widen
values in y, x, z direction.Default:
10, 10, 1
-
shift_max_range: list_int.
The range of shifts searched over will continue to be increased according to
shift_widen
until the shift range in the y, x, z direction reachesshift_max_range
. If a good shift is still not found, a warning will be printed.Default:
300, 300, 10
-
neighb_dist_thresh: number.
Basically the distance in yx pixels below which neighbours are a good match.
Default:
2
-
shift_score_thresh: maybe_number.
A shift between tiles must have a number of close neighbours exceeding this. If not given, it will be worked using the
shift_score_thresh
parameters below using the function coppafish/stitch/shift/get_score_thresh.Default:
None
-
shift_score_thresh_multiplier: number.
shift_score_thresh
is set toshift_score_thresh_multiplier
multiplied by the mean of scores of shifts a distance betweenshift_score_thresh_min_dist
andshift_score_thresh_max_dist
from the best shift.Default:
2
-
shift_score_thresh_min_dist: number.
shift_score_thresh
is set toshift_score_thresh_multiplier
multiplied by the mean of scores of shifts a distance betweenshift_score_thresh_min_dist
andshift_score_thresh_max_dist
from the best shift.Default:
11
-
shift_score_thresh_max_dist: number.
shift_score_thresh
is set toshift_score_thresh_multiplier
multiplied by the mean of scores of shifts a distance betweenshift_score_thresh_min_dist
andshift_score_thresh_max_dist
from the best shift.Default:
20
-
nz_collapse: int.
3D data is converted into
np.ceil(nz / nz_collapse)
2D slices for exhaustive shift search to quicken it up. I.e. this is the maximum number of z-planes to be collapsed to a 2D slice when searching for the best shift.Default:
30
-
n_shifts_error_fraction: number.
Used in coppafish/stitch/check_shifts/check_shifts_stitch If more than this fraction of
shifts
found between neighbouring tiles havescore < score_thresh
, an error will be raised.Default:
0.5
-
save_image_zero_thresh: int.
When saving stitched images, all pixels with absolute value less than or equal to
save_image_zero_thresh
will be set to 0. This helps reduce size of the .npz files and does not lose any important information.Default:
20
register_initial
The register_initial section contains parameters which specify how the shifts from the ref_round/ref_channel to each imaging round/channel are found. These are then used as the starting point for determining the affine transforms in the register section.
-
shift_channel: maybe_int.
Channel to use to find shifts between rounds to use as starting point for PCR. Leave blank to set to
basic_info['ref_channel']
.Default:
None
-
shift_min: list_int.
Exhaustive search range will include all shifts between min and max with step given by
shift_step
. Each entry should be a list of 3 values: [y, x, z]. Typical: [-100, -100, -1]Default:
-100, -100, -3
-
shift_max: list_int.
Exhaustive search range will include all shifts between min and max with step given by
shift_step
. Each entry should be a list of 3 values: [y, x, z]. Typical: [100, 100, 1]Default:
100, 100, 3
-
shift_step: list_int.
Step size to use in y, x, z when performing the exhaustive search to find the shift between tiles.
Default:
5, 5, 3
-
shift_widen: list_int.
If shift in initial search range has score which does not exceed
shift_score_thresh
, then the range will be extrapolated with same step by shift_widen values in y, x, z direction.Default:
10, 10, 1
-
shift_max_range: list_int.
The range of shifts searched over will continue to be increased according to
shift_widen
until the shift range in the y, x, z direction reachesshift_max_range
. If a good shift is still not found, a warning will be printed.Default:
500, 500, 10
-
neighb_dist_thresh: number.
Basically the distance in yx pixels below which neighbours are a good match.
Default:
2
-
shift_score_thresh: maybe_number.
A shift between tiles must have a number of close neighbours exceeding this. If not given, it will be worked using the
shift_score_thresh
parameters below using the function coppafish/stitch/shift/get_score_thresh.Default:
None
-
shift_score_thresh_multiplier: number.
shift_score_thresh
is set toshift_score_thresh_multiplier
multiplied by the mean of scores of shifts a distance betweenshift_score_thresh_min_dist
andshift_score_thresh_max_dist
from the best shift.Default:
1.5
-
shift_score_thresh_min_dist: number.
shift_score_thresh
is set toshift_score_thresh_multiplier
multiplied by the mean of scores of shifts a distance betweenshift_score_thresh_min_dist
andshift_score_thresh_max_dist
from the best shift.Default:
11
-
shift_score_thresh_max_dist: number.
shift_score_thresh
is set toshift_score_thresh_multiplier
multiplied by the mean of scores of shifts a distance betweenshift_score_thresh_min_dist
andshift_score_thresh_max_dist
from the best shift.Default:
20
-
nz_collapse: int.
3D data is converted into
np.ceil(nz / nz_collapse)
2D slices for exhaustive shift search to quicken it up. I.e. this is the maximum number of z-planes to be collapsed to a 2D slice when searching for the best shift.Default:
30
-
n_shifts_error_fraction: number.
Used in coppafish/stitch/check_shifts/check_shifts_register If more than this fraction of
shifts
between theref_round
/ref_channel
and each imaging round for each tile havescore < score_thresh
, an error will be raised.Default:
0.5
register
The register section contains parameters which specify how the affine transforms from the ref_round/ref_channel to each imaging round/channel are found from the shifts found in the register_initial section.
-
n_iter: int.
Maximum number iterations to run point cloud registration, PCR
Default:
100
-
neighb_dist_thresh_2d: number.
Basically the distance in yx pixels below which neighbours are a good match. PCR updates transforms by minimising distances between neighbours which are closer than this.
Default:
3
-
neighb_dist_thresh_3d: number.
The same as
neighb_dist_thresh_2d
but in 3D, we use a larger distance because the size of a z-pixel is greater than a xy pixel.Default:
5
-
matches_thresh_fract: number.
If PCR produces transforms with fewer neighbours (pairs with distance between them less than
neighb_dist_thresh
) thanmatches_thresh = np.clip(matches_thresh_fract * n_spots, matches_thresh_min, matches_thresh_max)
, the transform will be re-evaluated with regularization so it is near the average transform.Default:
0.25
-
matches_thresh_min: int.
If PCR produces transforms with fewer neighbours (pairs with distance between them less than
neighb_dist_thresh
) thanmatches_thresh = np.clip(matches_thresh_fract * n_spots, matches_thresh_min, matches_thresh_max)
, the transform will be re-evaluated with regularization so it is near the average transform.Default:
25
-
matches_thresh_max: int.
If PCR produces transforms with fewer neighbours (pairs with distance between them less than
neighb_dist_thresh
) thanmatches_thresh = np.clip(matches_thresh_fract * n_spots, matches_thresh_min, matches_thresh_max)
, the transform will be re-evaluated with regularization so it is near the average transform.Default:
300
-
scale_dev_thresh: list_number.
If a transform has a chromatic aberration scaling that has an absolute deviation of more than
scale_dev_thresh[i]
from the median for that colour channel in dimensioni
, it will be re-evaluated with regularization. There is a threshold for the y, x, z scaling.Default:
0.01, 0.01, 0.1
-
shift_dev_thresh: list_number.
If a transform has a
shift[i]
that has an absolute deviation of more thanshift_dev_thresh[i]
from the median for that tile and round in any dimensioni
, it will be re-evaluated with regularization. There is a threshold for the y, x, z shift.shift_dev_thresh[2]
is in z pixels.Default:
15, 15, 5
-
regularize_constant: int.
Constant used when doing regularized least squares. If the number of neighbours are above this, regularization will have little effect. If the number of neighbours is less than this, regularization will have significant effect, and final transform will be similar to transform being regularized towards.
Default:
500
-
regularize_factor: number.
The loss function for finding the transform through regularized least squares is:
\(\sum_s^{n_{neighb}}D_s^2 + 0.5\lambda (\mu D_{scale}^2 + D_{shift}^2)\)
Where:
-
\(D_s^2\) is the squared distance between the pair of neighbours indicated by \(s\). Only neighbours with distance between them on previous iteration les than
neighb_dist
are considered. -
\(\lambda\) is
regularize_constant
. -
\(\mu\) is
regularize_factor
such that when \(n_{neighb} = \lambda\) and \(D_s^2 = D_{shift}^2\) for all \(s\), the two contributions to the loss function are approximately equal i.e. \(\mu = D_{shift}^2/D_{scale}^2\). -
\(D_{scale}^2\) is the squared distance between
transform[:3, :]
andtransform_regularize[:3, :]
. I.e. the squared difference of the scaling/rotation part of the transform from the target. -
\(D_{shift}^2\) is the squared distance between
transform[3]
andtransform_regularize[3]
. I.e. the squared difference of the shift part of the transform from the target.
So if a typical value of \(D_{shift}\) (or \(D_s\)) is 2 and a typical value of \(D_{scale}\) is 0.009, \(\mu = 5\times10^4\).
Default:
5e4
-
-
n_transforms_error_fraction: number.
Used in coppafish/register/check_transforms/check_transforms An error is raised if any of the following are satisfied where a failed transform is one with
nb.register_debug.n_matches < nb.register_debug.n_matches_thresh
.-
For any given channel, the fraction of failed transforms was greater than
n_transforms_error_fraction
of tiles/rounds. -
For any given tile, the fraction of failed transforms was greater than
n_transforms_error_fraction
of rounds/channels. -
For any given round, the fraction of failed transforms was greater than
n_transforms_error_fraction
of tiles/channels.
Default:
0.5
-
call_spots
The call_spots section contains parameters which determine how the bleed_matrix
and gene_efficiency
are computed, as well as how a gene is assigned to each spot found on the ref_round/ref_channel.
-
bleed_matrix_method: str.
bleed_matrix_method
can only besingle
orseparate
.single
: a single bleed matrix is produced for all rounds.separate
: a different bleed matrix is made for each round.Default:
single
-
color_norm_intensities: list_number.
Parameter used to get color normalisation factor.
color_norm_intensities
should be ascending andcolor_norm_probs
should be descending and they should be the same size. The probability of normalised spot color being greater thancolor_norm_intensities[i]
must be less thancolor_norm_probs[i]
for alli
.Default:
0.5, 1, 5
-
color_norm_probs: list_number.
Parameter used to get color normalisation factor.
color_norm_intensities
should be ascending andcolor_norm_probs
should be descending and they should be the same size. The probability of normalised spot color being greater thancolor_norm_intensities[i]
must be less thancolor_norm_probs[i]
for alli
.Default:
0.01, 5e-4, 1e-5
-
bleed_matrix_score_thresh: number.
In
scaled_k_means
part ofbleed_matrix
calculation, a mean vector for each dye is computed from all spots with a dot product to that mean greater than this.Default:
0
-
bleed_matrix_min_cluster_size: int.
If less than this many vectors are assigned to a dye cluster in the
scaled_k_means
part ofbleed_matrix
calculation, the expected code for that dye will be set to 0 for all color channels i.e. bleed matrix computation will have failed.Default:
10
-
bleed_matrix_n_iter: int.
Maximum number of iterations allowed in the
scaled_k_means
part ofbleed_matrix
calculation.Default:
100
-
bleed_matrix_anneal: bool.
If
True
, thescaled_k_means
calculation will be performed twice. The second time starting with the output of the first and withscore_thresh
for clusteri
set to the median of the scores assigned to clusteri
in the first run.This limits the influence of bad spots to the bleed matrix.
Default:
True
-
background_weight_shift: maybe_number.
Shift to apply to weighting of each background vector to limit boost of weak spots. The weighting of round r for the fitting of the background vector for channel c is
1 / (spot_color[r, c] + background_weight_shift)
sobackground_weight_shift
ensures this does not go to infinity for smallspot_color[r, c]
. Typicalspot_color[r, c]
is 1 for intense spot sobackground_weight_shift
is small fraction of this. Leave blank to set to median absolute intensity of all pixels on the mid z-plane of the central tile.Default:
None
-
dp_norm_shift: maybe_number.
When calculating the
dot_product_score
, this is the small shift to apply when normalisingspot_colors
to ensure don't divide by zero. Value is for a single round and is multiplied bysqrt(n_rounds_used)
when computingdot_product_score
. Expected norm of a spot_color for a single round is 1 sodp_norm_shift
is a small fraction of this. Leave blank to set to median L2 norm for a single round of all pixels on the mid z-plane of the central tile.Default:
None
-
norm_shift_min: number.
Minimum possible value of
dp_norm_shift
andbackground_weight_shift
.Default:
0.001
-
norm_shift_max: number.
Maximum possible value of
dp_norm_shift
andbackground_weight_shift
.Default:
0.5
-
norm_shift_precision: number.
dp_norm_shift
andbackground_weight_shift
will be rounded to nearestnorm_shift_precision
.Default:
0.01
-
gene_efficiency_min_spots: int.
If number of spots assigned to a gene less than or equal to this,
gene_efficiency[g]=1
for all rounds.Default:
25
-
gene_efficiency_max: number.
Maximum allowed value of
gene_efficiency
i.e. any one round can be at most this times more important than the median round for every gene.Default:
6
-
gene_efficiency_min: number.
At most
ceil(gene_efficiency_min_factor * n_rounds_use)
rounds can havegene_efficiency
belowgene_efficiency_min
for any given gene.Default:
0.05
-
gene_efficiency_min_factor: number.
At most
ceil(gene_efficiency_min_factor * n_rounds_use)
rounds can havegene_efficiency
belowgene_efficiency_min
for any given gene.Default:
0.2
-
gene_efficiency_n_iter: int.
gene_efficiency
is computed from spots which pass a quality thresholding based on the bled_codes computed with the gene_efficiency of the previous iteration. This process will continue until thegene_effiency
converges orgene_efficiency_n_iter
iterations are reached. 0 meansgene_efficiency
will not be used.Default:
10
-
gene_efficiency_score_thresh: number.
Spots used to compute
gene_efficiency
must havedot_product_score
greater thangene_efficiency_score_thresh
, difference to second best score greater thangene_efficiency_score_diff_thresh
and intensity greater thangene_efficiency_intensity_thresh
.Default:
0.6
-
gene_efficiency_score_diff_thresh: number.
Spots used to compute
gene_efficiency
must havedot_product_score
greater thangene_efficiency_score_thresh
, difference to second best score greater thangene_efficiency_score_diff_thresh
and intensity greater thangene_efficiency_intensity_thresh
.Default:
0.2
-
gene_efficiency_intensity_thresh: maybe_number.
Spots used to compute
gene_efficiency
must havedot_product_score
greater thangene_efficiency_score_thresh
, difference to second best score greater thangene_efficiency_score_diff_thresh
and intensity greater thangene_efficiency_intensity_thresh
. Leave blank to determine fromgene_efficiency_intensity_thresh_percentile
.Default:
None
-
gene_efficiency_intensity_thresh_percentile: int.
gene_efficiency_intensity_thresh
will be set to this percentile of the intensity computed for all pixels on the mid z-plane of the most central tile if not specified.Default:
37
-
gene_efficiency_intensity_thresh_precision: number.
gene_efficiency_intensity_thresh
will be rounded to nearestgene_efficiency_intensity_thresh_precision
if not given.Default:
0.001
-
gene_efficiency_intensity_thresh_min: number.
Min allowed value of
gene_efficiency_intensity_thresh
.Default:
0.001
-
gene_efficiency_intensity_thresh_max: number.
Max allowed value of
gene_efficiency_intensity_thresh
.Default:
0.2
-
alpha: number.
When computing the dot product score, \(\Delta_{s0g}\) between spot \(s\) and gene \(g\), rounds/channels with background already fit contribute less. The larger \(\alpha\), the lower the contribution.
Set \(\alpha = 0\) to use the normal dot-product with no weighting.
Default:
120
-
beta: number.
Constant used in weighting factor when computing dot product score, \(\Delta_{s0g}\) between spot \(s\) and gene \(g\).
Default:
1
omp
The omp section contains parameters which are use to carry out orthogonal matching pursuit (omp) on every pixel, as well as how to convert the results of this to spot locations.
-
use_z: maybe_list_int.
Can specify z-planes to find spots on If 2 values provided, all z-planes between and including the values given will be used.
Default:
None
-
weight_coef_fit: bool.
If
False
, gene coefficients are found through omp with normal least squares fitting. IfTrue
, gene coefficients are found through omp with weighted least squares fitting with rounds/channels which already containing genes contributing less.Default:
False
-
initial_intensity_thresh: maybe_number.
To save time in
call_spots_omp
, coefficients only found for pixels with intensity of absolutespot_colors
greater thaninitial_intensity_thresh
. Leave blank to set to determine usinginitial_intensity_thresh_auto_param
It is also clamped between theinitial_intensity_thresh_min
andinitial_intensity_thresh_max
.Default:
None
-
initial_intensity_thresh_percentile: int.
If
initial_intensity_thresh
not given, it will be set to theinitial_intensity_thresh_percentile
percentile of the absolute intensity of all pixels on the mid z-plane of the central tile. It usesnb.call_spots.abs_intensity_percentile
Default:
25
-
initial_intensity_thresh_min: number.
Min allowed value of
initial_intensity_thresh
.Default:
0.001
-
initial_intensity_thresh_max: number.
Max allowed value of
initial_intensity_thresh
.Default:
0.2
-
initial_intensity_precision: number.
initial_intensity_thresh
will be rounded to nearestinitial_intensity_precision
if not given.Default:
0.001
-
max_genes: int.
The maximum number of genes that can be assigned to each pixel i.e. number of iterations of omp.
Default:
30
-
dp_thresh: number.
Pixels only have coefficient found for a gene if that gene has absolute
dot_product_score
greater than this i.e. this is the stopping criterion for the OMP.Default:
0.225
-
alpha: number.
When computing the dot product score, \(\Delta_{sig}\) between spot \(s\) and gene \(g\) on iteration \(i\) of OMP, rounds/channels with genes already fit to them, contribute less. The larger \(\alpha\), the lower the contribution.
Set \(\alpha = 0\) to use the normal dot-product with no weighting.
Default:
120
-
beta: number.
Constant used in weighting factor when computing dot product score, \(\Delta_{sig}\) between spot \(s\) and gene \(g\) on iteration \(i\) of OMP.
Default:
1
-
initial_pos_neighbour_thresh: maybe_int.
Only save spots with number of positive coefficient neighbours greater than
initial_pos_neighbour_thresh
. Leave blank to determine usinginitial_pos_neighbour_thresh_param
. It is also clipped betweeninitial_pos_neighbour_thresh_min
andinitial_pos_neighbour_thresh_max
.Default:
None
-
initial_pos_neighbour_thresh_param: number.
If
initial_pos_neighbour_thresh
not given, it is set toinitial_pos_neighbour_thresh_param
multiplied by number of positive values in nb.omp.spot_shape i.e. withinitial_pos_neighbour_thresh_param = 0.1
, it is set to 10% of the max value.Default:
0.1
-
initial_pos_neighbour_thresh_min: int.
Min allowed value of
initial_pos_neighbour_thresh
.Default:
4
-
initial_pos_neighbour_thresh_max: int.
Max allowed value of
initial_pos_neighbour_thresh
.Default:
40
-
radius_xy: int.
To detect spot in coefficient image of each gene, pixel needs to be above dilation with structuring element which is a square (
np.ones
) of width2*radius_xy-1
in the xy plane.Default:
3
-
radius_z: int.
To detect spot in coefficient image of each gene, pixel needs to be above dilation with structuring element which is cuboid (
np.ones
) with width2*radius_z-1
in z direction. Must be more than 1 to be 3D.Default:
2
-
shape_max_size: list_int.
spot_shape specifies the neighbourhood about each spot in which we count coefficients which contribute to score. It is either given through
file_names['omp_spot_shape']
or computed using the below parameters with shape prefix. Maximum Y, X, Z size of spot_shape. Will be cropped if there are zeros at the extremities.Default:
27, 27, 9
-
shape_pos_neighbour_thresh: int.
For spot to be used to find
spot_shape
, it must have this many pixels around it on the same z-plane that have a positive coefficient. If 3D, also, require 1 positive pixel on each neighbouring plane (i.e. 2 is added to this value).Default:
9
-
shape_isolation_dist: number.
Spots are isolated if nearest neighbour (across all genes) is further away than this. Only isolated spots are used to find
spot_shape
.Default:
10
-
shape_sign_thresh: number.
If the mean absolute coefficient sign is less than this in a region near a spot, we set the expected coefficient in
spot_shape
to be 0. Max mean absolute coefficient sign is 1 so must be less than this.Default:
0.15
thresholds
The thresholds section contains the thresholds used to determine which spots pass a quality thresholding process such that we consider their gene assignments legitimate.
-
intensity: maybe_number.
Final accepted reference and OMP spots both require
intensity > thresholds[intensity]
. If not given, will be set to same value asnb.call_spots.gene_efficiency_intensity_thresh
. intensity for a really intense spot is about 1 sointensity_thresh
should be less than this.Default:
None
-
score_ref: number.
Final accepted spots are those which pass quality_threshold which is
nb.ref_spots.score > thresholds[score_ref]
andnb.ref_spots.intensity > intensity_thresh
. quality_threshold requires score computed with coppafish/call_spots/dot_prodduct/dot_product_score to exceed this. Max score is 1 so must be below this.Default:
0.25
-
score_omp: number.
Final accepted OMP spots are those which pass quality_threshold which is:
score > thresholds[score_omp]
andintensity > thresholds[intensity]
.score
is given by:score = (score_omp_multiplier * n_neighbours_pos + n_neighbours_neg) / (score_omp_multiplier * n_neighbours_pos_max + n_neighbours_neg_max)
Max score is 1 soscore_thresh
should be less than this.0.15 if more concerned for missed spots than false positives.
Default:
0.263
-
score_omp_multiplier: number.
Final accepted OMP spots are those which pass quality_threshold which is:
score > thresholds[score_omp]
andintensity > thresholds[intensity]
.score
is given by:score = (score_omp_multiplier * n_neighbours_pos + n_neighbours_neg) / (score_omp_multiplier * n_neighbours_pos_max + n_neighbours_neg_max)
0.45 if more concerned for missed spots than false positives.
Default:
0.95