Default Config Settings

file_names

The file_names section specifies the files that will be used throughout the pipeline. Variables in this section can be changed at any point in the pipeline, and the notebook created using it can still be loaded in.

  • notebook_name: str.

    Name of notebook file in output directory will be notebook_name.npz

    Default: notebook

  • input_dir: str.

    Directory where the raw .nd2 files or .npy stacks are

    Default: MUST BE SPECIFIED

  • output_dir: str.

    Directory where notebook is saved

    Default: MUST BE SPECIFIED

  • tile_dir: str.

    Directory where tile .npy files saved

    Default: MUST BE SPECIFIED

  • round: maybe_list_str.

    Names of .nd2 files for the imaging rounds. Leave empty if only using anchor.

    Default: None

  • anchor: maybe_str.

    Name of the file for the anchor round. Leave empty if not using anchor.

    Default: None

  • raw_extension: str.

    .nd2 or .npy indicating the data type of the raw data.

    Default: .nd2

  • raw_metadata: maybe_str.

    If raw_extension is .npy, this is the name of the .json file in input_dir which contains the required metadata extracted from the initial .nd2 files, i.e. the output of coppafish/utils/nd2/save_metadata:

    • xy_pos - List [n_tiles x 2]. xy position of tiles in pixels.

    • pixel_microns - float. xy pixel size in microns.

    • pixel_microns_z - float. z pixel size in microns.

    • sizes - dict with fov (t), channels (c), y, x, z-planes (z) dimensions.

    Default: None

  • dye_camera_laser: maybe_file.

    csv file giving the approximate raw intensity for each dye with each camera/laser combination. If not set, the file coppafish/setup/dye_camera_laser_raw_intensity.csv will be used.

    Default: None

  • code_book: str.

    Text file which contains the codes indicating which dye to expect on each round for each gene.

    Default: MUST BE SPECIFIED

  • scale: str.

    Text file saved in tile_dir containing extract['scale'] and extract['scale_anchor'] values used to create the tile .npy files in the tile_dir. If the second value is 0, it means extract['scale_anchor'] has not been calculated yet.

    If the extract step of the pipeline is re-run with extract['scale'] or extract['scale_anchor'] different to values saved here, an error will be raised.

    Default: scale

  • psf: str.

    npy file in output directory indicating average spot shape. If deconvolution is required and the file does not exist, it will be computed automatically in the extract step. This is the psf before tapering, scaled to fill the uint16 range.

    Default: psf

  • omp_spot_shape: str.

    npy file in output_dir indicating average shape in omp coefficient image. It only indicates the sign of the coefficient i.e. only contains -1, 0, 1. If file does not exist, it is computed from the coefficient images of all genes of the central tile.

    Default: omp_spot_shape

  • omp_spot_info: str.

    npy file in output_dir containing information about spots found in omp step. After each tile is completed, information will be saved to this file. If file does not exist, it will be saved after first tile of OMP step.

    Default: omp_spot_info

  • omp_spot_coef: str.

    npz file in output_dir containing gene coefficients for all spots found in omp step. After each tile is completed, information will be saved to this file. If file does not exist, it will be saved after first tile of OMP step.

    Default: omp_spot_coef

  • big_dapi_image: maybe_str.

    npz file in output_dir where stitched DAPI image is saved. If it does not exist, it will be saved if basic_info['dapi_channel'] is not None. Leave blank to not save the stitched DAPI image.

    Default: dapi_image

  • big_anchor_image: maybe_str.

    npz file in output_dir where stitched image of ref_round/ref_channel is saved. If it does not exist, it will be saved. Leave blank to not save stitched anchor

    Default: anchor_image

  • pciseq: list_str.

    csv files in output_dir where plotting information for pciSeq will be saved. The first file name is where the omp method output will be saved. The second file name is where the ref_spots method output will be saved. If the files don't exist, they will be created when the function coppafish/export_to_pciseq is run.

    Default: pciseq_omp, pciseq_anchor
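As a rough illustration of how the file_names options fit together, the sketch below parses a small INI-style config with Python's standard configparser. It assumes the config file uses INI syntax with one section per heading and comma-separated lists, which matches how the defaults are written here but is not stated in this section. All paths and file names are hypothetical, and parse_list_str is an illustrative helper, not part of the package.

```python
import configparser

# Hypothetical example values; only the option names come from the section above.
CONFIG_TEXT = """
[file_names]
input_dir = /data/experiment/raw
output_dir = /data/experiment/output
tile_dir = /data/experiment/tiles
round = round0, round1, round2
anchor = anchor
code_book = /data/experiment/codebook.txt
"""


def parse_list_str(value):
    """Split a comma-separated option into a list of stripped strings."""
    return [item.strip() for item in value.split(",") if item.strip()]


parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)
file_names = parser["file_names"]

rounds = parse_list_str(file_names["round"])
# Options absent from the file fall back to the documented defaults.
notebook_name = file_names.get("notebook_name", "notebook")
raw_extension = file_names.get("raw_extension", ".nd2")
```

Only the options marked MUST BE SPECIFIED need to appear in such a file; everything else can be left out and picked up from the defaults.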

basic_info

The basic_info section indicates information required throughout the pipeline.

  • is_3d: bool.

    Whether to use the 3d pipeline.

    Default: MUST BE SPECIFIED

  • anchor_channel: maybe_int.

    Channel in anchor round used as reference and to build coordinate system on. Usually channel with most spots. Leave blank if anchor not used.

    Default: None

  • dapi_channel: maybe_int.

    Channel in anchor round that contains DAPI images. This does not have to be in use_channels as anchor round is dealt with separately. Leave blank if no DAPI.

    Default: None

  • ref_round: maybe_int.

    Round to align all imaging rounds to. Will be set to anchor_round if anchor_channel and file_names['anchor'] specified.

    Default: None

  • ref_channel: maybe_int.

    Channel in ref_round used as reference and to build coordinate system on. Usually channel with most spots. Will be set to anchor_channel if anchor_channel and file_names['anchor'] specified.

    Default: None

  • use_channels: maybe_list_int.

    Channels in imaging rounds to use throughout pipeline. Leave blank to use all.

    Default: None

  • use_rounds: maybe_list_int.

    Imaging rounds to use throughout pipeline. Leave blank to use all.

    Default: None

  • use_z: maybe_list_int.

    z planes used to make tile .npy files. Leave blank to use all. If 2 values provided, all z-planes between and including the values given will be used.

    Default: None
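The two-value form of use_z can be read as an inclusive range. The helper below is a hypothetical sketch of that expansion (expand_use_z is not a function from the package), assuming that a blank value is represented as None and that lists with any other length are used as given.

```python
def expand_use_z(use_z):
    """Hypothetical helper: expand a two-value use_z into an inclusive range."""
    if use_z is None:
        return None  # blank means "use all z-planes"
    if len(use_z) == 2:
        z_min, z_max = sorted(use_z)
        return list(range(z_min, z_max + 1))
    return list(use_z)


z_planes = expand_use_z([3, 7])  # z-planes 3, 4, 5, 6 and 7
```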

  • use_tiles: maybe_list_int.

    Tiles used throughout pipeline. Leave blank to use all. For an experiment where the tiles are arranged in a 4 x 3 (ny x nx) grid, tile indices are indicated as below:

    | 2 | 1 | 0 |

    | 5 | 4 | 3 |

    | 8 | 7 | 6 |

    | 11 | 10 | 9 |

    Default: None
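The grid above numbers tiles from right to left within a row and from top to bottom across rows. A small NumPy sketch that reproduces this layout is given below; tile_index_grid is a hypothetical helper, and it assumes the same ordering holds for other grid sizes.

```python
import numpy as np


def tile_index_grid(ny, nx):
    """Return an (ny, nx) array of tile indices laid out as in the grid above."""
    # Row-major numbering, then each row reversed so index 0 is top right.
    return np.arange(ny * nx).reshape(ny, nx)[:, ::-1]


grid = tile_index_grid(4, 3)
```

Printing grid gives a quick visual check of which indices to pass to use_tiles or ignore_tiles.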

  • ignore_tiles: maybe_list_int.

    It is often easier to select tiles to remove than to use. All tiles listed here will be ignored. Leave blank to use all.

    Default: None

  • use_dyes: maybe_list_int.

    Dyes to use when assigning spots to genes. Leave blank to use all.

    Default: None

  • dye_names: maybe_list_str.

    Name of dyes used in correct order. So for gene with code 360..., gene appears with dye_names[3] in round 0, dye_names[6] in round 1, dye_names[0] in round 2 etc. If left blank, then assumes each channel corresponds to a different dye i.e. code 0 in code_book = channel 0. For quad_cam data, this needs to be specified.

    Default: None
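The mapping from a gene code to dyes can be sketched as a simple lookup. The example below assumes each character of the code is a single-digit dye index, with one character per round; the dye names are placeholders and dyes_for_code is a hypothetical helper.

```python
def dyes_for_code(code, dye_names):
    """Return the dye expected in each round for a gene code such as '360'."""
    return [dye_names[int(digit)] for digit in code]


# Placeholder names; real experiments would list the actual dyes in order.
dye_names = [f"dye_{i}" for i in range(7)]
dyes = dyes_for_code("360", dye_names)  # round 0, round 1, round 2
```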

  • channel_camera: maybe_list_int.

    channel_camera[i] is the wavelength in nm of the camera used for channel i. Only needs to be provided if dye_names is provided, to help estimate dye intensity in each channel.

    Default: None

  • channel_laser: maybe_list_int.

    channel_laser[i] is the wavelength in nm of the laser used for channel i. Only needs to be provided if dye_names is provided, to help estimate dye intensity in each channel.

    Default: None

  • tile_pixel_value_shift: int.

    This is added onto every tile (except DAPI) when it is saved and removed from every tile when loaded. Required so that negative pixel values can be stored when saving to .npy as uint16.

    Default: 15000
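The effect of tile_pixel_value_shift can be illustrated with a save/load round trip. This is a minimal sketch, not the package's own I/O code: to_saved and from_saved are hypothetical names, and the rounding and clipping choices are assumptions.

```python
import numpy as np

TILE_PIXEL_VALUE_SHIFT = 15000  # documented default


def to_saved(image, shift=TILE_PIXEL_VALUE_SHIFT):
    """Add the shift and store as uint16, clipping anything out of range."""
    shifted = np.rint(image).astype(np.int64) + shift
    return np.clip(shifted, 0, np.iinfo(np.uint16).max).astype(np.uint16)


def from_saved(saved, shift=TILE_PIXEL_VALUE_SHIFT):
    """Remove the shift, restoring negative pixel values."""
    return saved.astype(np.int32) - shift


filtered = np.array([-200, 0, 1234])
saved = to_saved(filtered)
restored = from_saved(saved)
```

With the default shift, filtered values down to -15000 survive the round trip; anything lower would be clipped at zero on saving.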

  • ignore_first_z_plane: bool.

    There have been cases where the first z-plane in the .nd2 file was in the wrong place, which distorted focus stacking or caused many spots to be identified on the first plane. It is therefore safest not to load the first plane, which is what happens if ignore_first_z_plane = True.

    Default: True

extract

The extract section contains parameters which specify how to filter the raw microscope images to produce the .npy files saved to file_names['tile_dir'].

  • wait_time: int.

    Time to wait in seconds for raw data to come in before crashing. Assumes the first round is already in file_names['input_dir']. This should be large so the pipeline can be run while data is still being collected.

    Default: 21600

  • r1: maybe_int.

    Filtering is done with a 2D difference of hanning filter with inner radius r1 within which it is positive and outer radius r2 so annulus between r1 and r2 is negative. Should be approx radius of spot. Typical = 3.

    For r1 = 3 and r2 = 6, a 2048 x 2048 x 50 image took 4.1s. For 2 <= r1 <= 5 and r2 double this, the time taken seemed to be constant.

    Leave blank to auto-detect using r1_auto_microns.

    Default: None

  • r2: maybe_int.

    Filtering is done with a 2D difference of hanning filter with inner radius r1 within which it is positive and outer radius r2 so annulus between r1 and r2 is negative. Should be approx radius of spot. Typical = 6. Leave blank to set to twice r1.

    Default: None

  • r_dapi: maybe_int.

    Filtering for DAPI images is a tophat with r_dapi radius. Should be approx radius of object of interest. Typical = 48. Leave blank to auto detect using r_dapi_auto_microns.

    Default: None

  • r1_auto_microns: number.

    If r1 not specified, will convert to units of pixels from this micron value.

    Default: 0.5

  • r_dapi_auto_microns: maybe_number.

    If r_dapi is not specified, it will be converted to units of pixels from this micron value. Typical = 8.0. If both this and r_dapi are left blank, the DAPI image will not be filtered and no .npy file will be saved. Instead, DAPI will be loaded directly from the raw data and then stitched.

    Default: None

  • scale: maybe_number.

    Each filtered image is multiplied by scale. This is because the image is saved as uint16, so to retain information from the decimal places, the image should be multiplied so that the maximum pixel value is in the tens of thousands (but less than 65,536). Leave empty to auto-detect using scale_norm.

    Default: None

  • scale_norm: maybe_int.

    If scale is not given, scale = scale_norm/max(scale_image), where scale_image is the n_channels x n_y x n_x x n_z image belonging to the central tile (saved as nb.extract_debug.scale_tile) of round 0 after filtering and smoothing.

    Must be less than np.iinfo(np.uint16).max - config['basic_info']['tile_pixel_value_shift'] which is typically \(65535 - 15000 = 50535\).

    Default: 35000
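The auto-detected scale and the upper bound on scale_norm can be sketched as follows. auto_scale is a hypothetical helper; the small array stands in for the filtered central-tile image, and raising ValueError on an invalid scale_norm is an assumption about how the bound might be enforced.

```python
import numpy as np


def auto_scale(scale_image, scale_norm=35000, tile_pixel_value_shift=15000):
    """Return scale_norm / max(scale_image) after checking the uint16 bound."""
    max_norm = np.iinfo(np.uint16).max - tile_pixel_value_shift  # 50535 by default
    if scale_norm >= max_norm:
        raise ValueError(f"scale_norm must be less than {max_norm}")
    return scale_norm / scale_image.max()


# Stand-in for the filtered, smoothed image of the central tile.
scale_image = np.array([[1.0, 3.5], [0.2, 7.0]])
scale = auto_scale(scale_image)
```

After multiplying by scale, the brightest pixel of scale_image equals scale_norm, which leaves headroom below the uint16 limit once the pixel value shift is added.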

  • scale_anchor: maybe_number.

    Analogous to scale but have different normalisation for anchor round/anchor channel as not used in final spot_colors. Leave empty to auto-detect using scale_norm.

    Default: None

  • auto_thresh_multiplier: number.

    nb.extract.auto_thresh[t,r,c] is default threshold to find spots on tile t, round r, channel c. Value is set to auto_thresh_multiplier * median(abs(image)) where image is the image produced for tile t, round r, channel c in the extract step of the pipeline and saved to file_names['tile_dir'].

    Default: 10
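The threshold formula above translates directly into NumPy. The sketch below applies it to a tiny made-up image; auto_thresh here is a standalone illustrative function, not the notebook variable of the same name.

```python
import numpy as np


def auto_thresh(image, auto_thresh_multiplier=10):
    """Return auto_thresh_multiplier * median(abs(image))."""
    return auto_thresh_multiplier * np.median(np.abs(image))


image = np.array([-3, -1, 0, 2, 40])
thresh = auto_thresh(image)  # median of [3, 1, 0, 2, 40] is 2
```

Because the median is insensitive to a few bright spots, the threshold tracks the background level of the filtered image rather than its peak intensity.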

  • deconvolve: bool.

    For 3D pipeline, whether to perform wiener deconvolution before hanning filtering.

    Default: False

  • psf_detect_radius_xy: int.

    Need to detect spots to determine point spread function (psf) used in the wiener deconvolution. Only relevant if deconvolve == True. To detect spot, pixel needs to be above dilation with this radius in xy plane.

    Default: 2

  • psf_detect_radius_z: int.

    Need to detect spots to determine point spread function (psf) used in the wiener deconvolution. Only relevant if deconvolve == True. To detect spot, pixel needs to be above dilation with this radius in z direction.

    Default: 2

  • psf_intensity_thresh: maybe_number.

    Spots contribute to psf if they are above this intensity. If not given, will be computed the same as auto_thresh i.e. median(image) + auto_thresh_multiplier*median(abs(image-median(image))). Note that for raw data, median(image) is not zero hence the difference.

    Default: None

  • psf_isolation_dist: number.

    Spots contribute to psf if more than psf_isolation_dist from nearest spot.

    Default: 20

  • psf_min_spots: int.

    Need this many isolated spots to determine psf.

    Default: 300

  • psf_shape: list_int.

    Diameter of psf in y, x, z direction (in units of [xy_pixels, xy_pixels, z_pixels]).

    Default: 181, 181, 19

  • psf_annulus_width: number.

    psf is assumed to be radially symmetric within each z-plane so assume all values within annulus of this size (in xy_pixels) to be the same.

    Default: 1.4

  • wiener_constant: number.

    Constant used to compute wiener filter from psf.

    Default: 50000

  • wiener_pad_shape: list_int.

    When applying the wiener filter, the raw image is padded at the end of each dimension with this many pixels, ramping linearly to the median value.

    Default: 20, 20, 3

  • r_smooth: maybe_list_int.

    Radius of averaging filter to do smoothing of filtered image. Provide two numbers to do 2D smoothing and three numbers to do 3D smoothing. Typical 2D: 2, 2. Typical 3D: 1, 1, 2. Recommended use is in 3D only as it incorporates information between z-planes which filtering with difference of hanning kernels does not.

    Size of r_smooth has big influence on time taken for smoothing. For a 2048 x 2048 x 50 image:

    • r_smooth = 1, 1, 2: 2.8 seconds

    • r_smooth = 2, 2, 2: 8.5 seconds

    Leave empty to do no smoothing.

    Default: None

  • n_clip_warn: int.

    If the number of pixels that are clipped when saving as uint16 is more than n_clip_warn, a warning message will occur.

    Default: 1000

  • n_clip_error: maybe_int.

    If the number of pixels that are clipped when saving as uint16 is more than n_clip_error for n_clip_error_images_thresh images, the extract and filter step will be halted. If left blank, n_clip_error will be set to 1% of pixels of a single z-plane.

    Default: None

  • n_clip_error_images_thresh: int.

    If the number of pixels that are clipped when saving as uint16 is more than n_clip_error for n_clip_error_images_thresh images, the extract and filter step will be halted.

    Default: 3

find_spots

The find_spots section contains parameters which specify how to convert the images produced in the extract section to point clouds.

  • radius_xy: int.

    To be detected as a spot, a pixel needs to be above dilation with structuring element which is a square (np.ones) of width 2*radius_xy-1 in the xy plane.

    Default: 2

  • radius_z: int.

    To be detected as a spot, a pixel needs to be above dilation with structuring element which is cuboid (np.ones) with width 2*radius_z-1 in z direction. Must be more than 1 to be 3D.

    Default: 2
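A minimal 2D sketch of the dilation test described for radius_xy, using NumPy only. It assumes the structuring element includes its centre, so a pixel counts as a spot when it equals the maximum over the square window and also exceeds an intensity threshold. detect_spots_2d and intensity_thresh are hypothetical names, and the package's own routine may treat ties, image edges and the z direction differently.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def detect_spots_2d(image, radius_xy, intensity_thresh):
    """Return yx coordinates of local maxima within a square of width 2*radius_xy-1."""
    width = 2 * radius_xy - 1
    pad = width // 2
    # Pad with the image minimum so edge pixels can still be local maxima.
    padded = np.pad(image, pad, mode="constant", constant_values=image.min())
    windows = sliding_window_view(padded, (width, width))
    dilated = windows.max(axis=(2, 3))  # grey dilation with a flat square
    is_spot = (image >= dilated) & (image > intensity_thresh)
    return np.argwhere(is_spot)


image = np.zeros((7, 7))
image[3, 3] = 10  # isolated bright spot
image[3, 4] = 5   # dimmer neighbour, suppressed by the spot next to it
image[0, 6] = 8   # spot in a corner
spots = detect_spots_2d(image, radius_xy=2, intensity_thresh=1)
```

A larger radius_xy suppresses more neighbouring maxima, so fewer but better separated spots are returned.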

  • max_spots_2d: int.

    If number of spots detected on particular z-plane of an imaging round is greater than this, then will only select the max_spots_2d most intense spots on that z-plane, because PCR works better when fitting fewer, more intense spots. This only applies to imaging rounds and not ref_round/ref_channel, where many spots are needed. In 2D, more spots are allowed as there is only 1 z-plane.

    Default: 1500

  • max_spots_3d: int.

    Same as max_spots_2d for the 3D pipeline. In 3D, need to allow less spots on a z-plane as have many z-planes.

    Default: 500

  • isolation_radius_inner: number.

    To determine if spots are isolated, filter image with annulus between isolation_radius_inner and the outer radius (isolation_radius_xy in xy, isolation_radius_z in z). isolation_radius_inner should be approx the radius where intensity of spot crosses from positive to negative. It is in units of xy-pixels. This filtering will only be applied to spots detected in the ref_round/ref_channel.

    Default: 4

  • isolation_radius_xy: number.

    Outer radius of annulus filtering kernel in xy direction in units of xy-pixels.

    Default: 14

  • isolation_radius_z: number.

    Outer radius of annulus filtering kernel in z direction in units of z-pixels.

    Default: 1

  • isolation_thresh: maybe_number.

    Spot is isolated if value of annular filtered image at spot location is below the isolation_thresh value. Leave blank to set it automatically to auto_isolation_thresh_multiplier multiplied by the threshold used to detect the spots, i.e. the extract auto_thresh value.

    Default: None

  • auto_isolation_thresh_multiplier: number.

    If isolation_thresh left blank, it will be set to isolation_thresh = auto_isolation_thresh_multiplier * nb.extract.auto_thresh[:, r, c].

    Default: -0.2

  • n_spots_warn_fraction: number.

    Used in coppafish/find_spots/base/check_n_spots

    A warning will be raised if for any tile, round, channel the number of spots detected is less than:

    n_spots_warn = n_spots_warn_fraction * max_spots * nb.basic_info.nz

    where max_spots is max_spots_2d if 2D and max_spots_3d if 3D.

    Default: 0.1

  • n_spots_error_fraction: number.

    Used in coppafish/find_spots/base/check_n_spots. An error is raised if any of the following are satisfied:

    • For any given channel, the number of spots found was less than n_spots_warn for at least the fraction n_spots_error_fraction of tiles/rounds.

    • For any given tile, the number of spots found was less than n_spots_warn for at least the fraction n_spots_error_fraction of rounds/channels.

    • For any given round, the number of spots found was less than n_spots_warn for at least the fraction n_spots_error_fraction of tiles/channels.

    Default: 0.5
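The warning and error conditions for spot counts can be sketched with a boolean array over tiles, rounds and channels. This is an illustrative reading of the rules above, not the code in check_n_spots: check_n_spots_sketch is a hypothetical name, and n_spots is assumed to be an array of shape [n_tiles, n_rounds, n_channels].

```python
import numpy as np


def check_n_spots_sketch(n_spots, max_spots, nz,
                         warn_fraction=0.1, error_fraction=0.5):
    """Return (n_spots_warn, low-count mask, whether an error condition is met)."""
    n_spots_warn = warn_fraction * max_spots * nz
    low = n_spots < n_spots_warn  # True where a warning would be raised
    # Fraction of low-count images for each tile, round and channel.
    fractions = {
        "tile": low.mean(axis=(1, 2)),
        "round": low.mean(axis=(0, 2)),
        "channel": low.mean(axis=(0, 1)),
    }
    error = any((frac >= error_fraction).any() for frac in fractions.values())
    return n_spots_warn, low, error


n_spots = np.full((2, 2, 3), 100)
n_spots[:, :, 2] = 10  # one channel has very few spots everywhere
n_spots_warn, low, error = check_n_spots_sketch(n_spots, max_spots=500, nz=1)
```

In this example every tile and round also has one low channel out of three, but only the channel axis reaches the error fraction.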

stitch

The stitch section contains parameters which specify how the overlaps between neighbouring tiles are found. Note that references to south in this section should really be north and west should be east.

  • expected_overlap: number.

    Expected fractional overlap between tiles. Used to get initial shift search if not provided.

    Default: 0.1

  • auto_n_shifts: list_int.

    If shift_south_min/max and/or shift_west_min/max not given, the initial shift search will have auto_n_shifts either side of the expected shift given the expected_overlap with step given by shift_step. First value gives \(n_{shifts}\) in direction of overlap (y for south, x for west). Second value gives \(n_{shifts}\) in other direction (x for south, y for west). Third value gives \(n_{shifts}\) in z.

    Default: 20, 20, 1

  • shift_south_min: maybe_list_int.

    Can manually specify initial shifts. Exhaustive search will include all shifts between min and max with step given by shift_step. Each entry should be a list of 3 values: [y, x, z]. Typical: -1900, -100, -2

    Default: None

  • shift_south_max: maybe_list_int.

    Can manually specify initial shifts. Exhaustive search will include all shifts between min and max with step given by shift_step. Each entry should be a list of 3 values: [y, x, z]. Typical: -1700, 100, 2

    Default: None

  • shift_west_min: maybe_list_int.

    Can manually specify initial shifts. Exhaustive search will include all shifts between min and max with step given by shift_step. Each entry should be a list of 3 values: [y, x, z]. Typical: -100, -1900, -2

    Default: None

  • shift_west_max: maybe_list_int.

    Can manually specify initial shifts. Shift range will run between min to max with step given by shift_step. Each entry should be a list of 3 values: [y, x, z]. Typical: 100, -1700, 2

    Default: None

  • shift_step: list_int.

    Step size to use in y, x, z when finding shift between tiles.

    Default: 5, 5, 3
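To give a feel for how expected_overlap, auto_n_shifts and shift_step combine, the sketch below builds a 1D search range in the direction of overlap. It assumes the expected shift is minus the non-overlapping part of the tile, which is consistent with the typical manual values listed above for 2048-pixel tiles but is not spelled out in this section. auto_shift_range and tile_size are hypothetical names.

```python
import numpy as np


def auto_shift_range(tile_size, expected_overlap=0.1, n_shifts=20, step=5):
    """Return shifts spanning n_shifts steps either side of the expected shift."""
    # Assumed convention: neighbouring tile is offset by the non-overlapping part.
    expected = -int(round((1 - expected_overlap) * tile_size))
    return expected + step * np.arange(-n_shifts, n_shifts + 1)


shifts_y = auto_shift_range(2048)  # runs from -1943 to -1743 in steps of 5
```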

  • shift_widen: list_int.

    If shift in initial search range has score which does not exceed shift_score_thresh, then range will be extrapolated with same step by shift_widen values in y, x, z direction.

    Default: 10, 10, 1

  • shift_max_range: list_int.

    The range of shifts searched over will continue to be increased according to shift_widen until the shift range in the y, x, z direction reaches shift_max_range. If a good shift is still not found, a warning will be printed.

    Default: 300, 300, 10

  • neighb_dist_thresh: number.

    Basically the distance in yx pixels below which neighbours are a good match.

    Default: 2

  • shift_score_thresh: maybe_number.

    A shift between tiles must have a number of close neighbours exceeding this. If not given, it will be worked out from the shift_score_thresh parameters below using the function coppafish/stitch/shift/get_score_thresh.

    Default: None

  • shift_score_thresh_multiplier: number.

    shift_score_thresh is set to shift_score_thresh_multiplier multiplied by the mean of scores of shifts a distance between shift_score_thresh_min_dist and shift_score_thresh_max_dist from the best shift.

    Default: 2

  • shift_score_thresh_min_dist: number.

    shift_score_thresh is set to shift_score_thresh_multiplier multiplied by the mean of scores of shifts a distance between shift_score_thresh_min_dist and shift_score_thresh_max_dist from the best shift.

    Default: 11

  • shift_score_thresh_max_dist: number.

    shift_score_thresh is set to shift_score_thresh_multiplier multiplied by the mean of scores of shifts a distance between shift_score_thresh_min_dist and shift_score_thresh_max_dist from the best shift.

    Default: 20
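The automatic score threshold described by the three shift_score_thresh parameters can be sketched as below. This follows the wording above (mean of scores in an annulus around the best shift) and is not the body of get_score_thresh; get_score_thresh_sketch is a hypothetical name, and returning None when no shift falls in the annulus is an assumption.

```python
import numpy as np


def get_score_thresh_sketch(shifts, scores, multiplier=2, min_dist=11, max_dist=20):
    """Return multiplier * mean score of shifts between min_dist and max_dist from the best."""
    shifts = np.asarray(shifts, dtype=float)
    scores = np.asarray(scores, dtype=float)
    best_shift = shifts[np.argmax(scores)]
    dist = np.linalg.norm(shifts - best_shift, axis=1)
    in_annulus = (dist >= min_dist) & (dist <= max_dist)
    if not in_annulus.any():
        return None  # no shifts at a suitable distance to estimate the background score
    return multiplier * scores[in_annulus].mean()


shifts = [[0, 0], [15, 0], [0, 12], [30, 0], [5, 0]]
scores = [100, 10, 20, 50, 90]
thresh = get_score_thresh_sketch(shifts, scores)
```

Shifts close to the best one are excluded because their scores are still elevated by the true match, so only the annulus gives a fair estimate of the background score.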

  • nz_collapse: int.

    3D data is converted into np.ceil(nz / nz_collapse) 2D slices for exhaustive shift search to quicken it up. I.e. this is the maximum number of z-planes to be collapsed to a 2D slice when searching for the best shift.

    Default: 30

  • n_shifts_error_fraction: number.

    Used in coppafish/stitch/check_shifts/check_shifts_stitch. If more than this fraction of shifts found between neighbouring tiles have score < score_thresh, an error will be raised.

    Default: 0.5

  • save_image_zero_thresh: int.

    When saving stitched images, all pixels with absolute value less than or equal to save_image_zero_thresh will be set to 0. This helps reduce size of the .npz files and does not lose any important information.

    Default: 20

register_initial

The register_initial section contains parameters which specify how the shifts from the ref_round/ref_channel to each imaging round/channel are found. These are then used as the starting point for determining the affine transforms in the register section.

  • shift_channel: maybe_int.

    Channel to use to find shifts between rounds to use as starting point for PCR. Leave blank to set to basic_info['ref_channel'].

    Default: None

  • shift_min: list_int.

    Exhaustive search range will include all shifts between min and max with step given by shift_step. Each entry should be a list of 3 values: [y, x, z]. Typical: [-100, -100, -1]

    Default: -100, -100, -3

  • shift_max: list_int.

    Exhaustive search range will include all shifts between min and max with step given by shift_step. Each entry should be a list of 3 values: [y, x, z]. Typical: [100, 100, 1]

    Default: 100, 100, 3

  • shift_step: list_int.

    Step size to use in y, x, z when performing the exhaustive search to find the shift between tiles.

    Default: 5, 5, 3

  • shift_widen: list_int.

    If shift in initial search range has score which does not exceed shift_score_thresh, then the range will be extrapolated with same step by shift_widen values in y, x, z direction.

    Default: 10, 10, 1

  • shift_max_range: list_int.

    The range of shifts searched over will continue to be increased according to shift_widen until the shift range in the y, x, z direction reaches shift_max_range. If a good shift is still not found, a warning will be printed.

    Default: 500, 500, 10

  • neighb_dist_thresh: number.

    Basically the distance in yx pixels below which neighbours are a good match.

    Default: 2

  • shift_score_thresh: maybe_number.

    A shift between tiles must have a number of close neighbours exceeding this. If not given, it will be worked out from the shift_score_thresh parameters below using the function coppafish/stitch/shift/get_score_thresh.

    Default: None

  • shift_score_thresh_multiplier: number.

    shift_score_thresh is set to shift_score_thresh_multiplier multiplied by the mean of scores of shifts a distance between shift_score_thresh_min_dist and shift_score_thresh_max_dist from the best shift.

    Default: 1.5

  • shift_score_thresh_min_dist: number.

    shift_score_thresh is set to shift_score_thresh_multiplier multiplied by the mean of scores of shifts a distance between shift_score_thresh_min_dist and shift_score_thresh_max_dist from the best shift.

    Default: 11

  • shift_score_thresh_max_dist: number.

    shift_score_thresh is set to shift_score_thresh_multiplier multiplied by the mean of scores of shifts a distance between shift_score_thresh_min_dist and shift_score_thresh_max_dist from the best shift.

    Default: 20

  • nz_collapse: int.

    3D data is converted into np.ceil(nz / nz_collapse) 2D slices for exhaustive shift search to quicken it up. I.e. this is the maximum number of z-planes to be collapsed to a 2D slice when searching for the best shift.

    Default: 30

  • n_shifts_error_fraction: number.

    Used in coppafish/stitch/check_shifts/check_shifts_register. If more than this fraction of shifts between the ref_round/ref_channel and each imaging round for each tile have score < score_thresh, an error will be raised.

    Default: 0.5

register

The register section contains parameters which specify how the affine transforms from the ref_round/ref_channel to each imaging round/channel are found from the shifts found in the register_initial section.

  • n_iter: int.

    Maximum number of iterations to run point cloud registration (PCR).

    Default: 100

  • neighb_dist_thresh_2d: number.

    Basically the distance in yx pixels below which neighbours are a good match. PCR updates transforms by minimising distances between neighbours which are closer than this.

    Default: 3

  • neighb_dist_thresh_3d: number.

    The same as neighb_dist_thresh_2d but in 3D, we use a larger distance because the size of a z-pixel is greater than a xy pixel.

    Default: 5

  • matches_thresh_fract: number.

    If PCR produces transforms with fewer neighbours (pairs with distance between them less than neighb_dist_thresh) than matches_thresh = np.clip(matches_thresh_fract * n_spots, matches_thresh_min, matches_thresh_max), the transform will be re-evaluated with regularization so it is near the average transform.

    Default: 0.25

  • matches_thresh_min: int.

    If PCR produces transforms with fewer neighbours (pairs with distance between them less than neighb_dist_thresh) than matches_thresh = np.clip(matches_thresh_fract * n_spots, matches_thresh_min, matches_thresh_max), the transform will be re-evaluated with regularization so it is near the average transform.

    Default: 25

  • matches_thresh_max: int.

    If PCR produces transforms with fewer neighbours (pairs with distance between them less than neighb_dist_thresh) than matches_thresh = np.clip(matches_thresh_fract * n_spots, matches_thresh_min, matches_thresh_max), the transform will be re-evaluated with regularization so it is near the average transform.

    Default: 300
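The matches_thresh expression shared by the three parameters above can be evaluated directly with np.clip. The spot counts below are made up to show the lower bound, the proportional regime and the upper bound; matches_thresh is written as a standalone illustrative function.

```python
import numpy as np


def matches_thresh(n_spots, fract=0.25, thresh_min=25, thresh_max=300):
    """Return np.clip(matches_thresh_fract * n_spots, matches_thresh_min, matches_thresh_max)."""
    return np.clip(fract * np.asarray(n_spots), thresh_min, thresh_max)


thresholds = matches_thresh([40, 400, 4000])
```

With the defaults, the threshold is proportional to the number of spots only between 100 and 1200 spots; outside that range one of the bounds applies.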

  • scale_dev_thresh: list_number.

    If a transform has a chromatic aberration scaling that has an absolute deviation of more than scale_dev_thresh[i] from the median for that colour channel in dimension i, it will be re-evaluated with regularization. There is a threshold for the y, x, z scaling.

    Default: 0.01, 0.01, 0.1

  • shift_dev_thresh: list_number.

    If a transform has a shift[i] that has an absolute deviation of more than shift_dev_thresh[i] from the median for that tile and round in any dimension i, it will be re-evaluated with regularization. There is a threshold for the y, x, z shift. shift_dev_thresh[2] is in z pixels.

    Default: 15, 15, 5

  • regularize_constant: int.

    Constant used when doing regularized least squares. If the number of neighbours are above this, regularization will have little effect. If the number of neighbours is less than this, regularization will have significant effect, and final transform will be similar to transform being regularized towards.

    Default: 500

  • regularize_factor: number.

    The loss function for finding the transform through regularized least squares is:

    \(\sum_s^{n_{neighb}}D_s^2 + 0.5\lambda (\mu D_{scale}^2 + D_{shift}^2)\)

    Where:

    • \(D_s^2\) is the squared distance between the pair of neighbours indicated by \(s\). Only neighbours with distance between them on previous iteration less than neighb_dist are considered.

    • \(\lambda\) is regularize_constant.

    • \(\mu\) is regularize_factor such that when \(n_{neighb} = \lambda\) and \(D_s^2 = D_{shift}^2\) for all \(s\), the two contributions to the loss function are approximately equal i.e. \(\mu = D_{shift}^2/D_{scale}^2\).

    • \(D_{scale}^2\) is the squared distance between transform[:3, :] and transform_regularize[:3, :]. I.e. the squared difference of the scaling/rotation part of the transform from the target.

    • \(D_{shift}^2\) is the squared distance between transform[3] and transform_regularize[3]. I.e. the squared difference of the shift part of the transform from the target.

    So if a typical value of \(D_{shift}\) (or \(D_s\)) is 2 and a typical value of \(D_{scale}\) is 0.009, \(\mu = 2^2/0.009^2 \approx 5\times10^4\).

    Default: 5e4
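The loss function and the choice of regularize_factor can be checked numerically. The sketch below evaluates the expression above for 4 x 3 transforms, where rows 0 to 2 hold the scaling/rotation part and row 3 the shift. Both function names are hypothetical, and the code only evaluates the loss; it does not perform the least squares fit.

```python
import numpy as np


def regularize_factor(d_shift, d_scale):
    """Return mu = D_shift^2 / D_scale^2 for typical deviations."""
    return d_shift**2 / d_scale**2


def regularized_loss(neighb_dist_sq, transform, transform_regularize,
                     lam=500, mu=5e4):
    """Evaluate sum(D_s^2) + 0.5 * lam * (mu * D_scale^2 + D_shift^2)."""
    d_scale_sq = np.sum((transform[:3] - transform_regularize[:3]) ** 2)
    d_shift_sq = np.sum((transform[3] - transform_regularize[3]) ** 2)
    return np.sum(neighb_dist_sq) + 0.5 * lam * (mu * d_scale_sq + d_shift_sq)


mu = regularize_factor(2, 0.009)  # close to the default of 5e4

# Transform differing from the target only by a 1 pixel shift in y.
target = np.vstack([np.eye(3), [0, 0, 0]])
transform = np.vstack([np.eye(3), [1, 0, 0]])
loss = regularized_loss(np.array([1.0, 1.0]), transform, target)
```

With only two neighbours, the regularization term (250) dominates the data term (2), which is the behaviour described for regularize_constant when matches are scarce.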

  • n_transforms_error_fraction: number.

    Used in coppafish/register/check_transforms/check_transforms. An error is raised if any of the following are satisfied, where a failed transform is one with nb.register_debug.n_matches < nb.register_debug.n_matches_thresh.

    • For any given channel, the fraction of failed transforms was greater than n_transforms_error_fraction of tiles/rounds.

    • For any given tile, the fraction of failed transforms was greater than n_transforms_error_fraction of rounds/channels.

    • For any given round, the fraction of failed transforms was greater than n_transforms_error_fraction of tiles/channels.

    Default: 0.5

call_spots

The call_spots section contains parameters which determine how the bleed_matrix and gene_efficiency are computed, as well as how a gene is assigned to each spot found on the ref_round/ref_channel.

  • bleed_matrix_method: str.

    bleed_matrix_method can only be single or separate. single: a single bleed matrix is produced for all rounds. separate: a different bleed matrix is made for each round.

    Default: single

  • color_norm_intensities: list_number.

    Parameter used to get color normalisation factor. color_norm_intensities should be ascending and color_norm_probs should be descending and they should be the same size. The probability of normalised spot color being greater than color_norm_intensities[i] must be less than color_norm_probs[i] for all i.

    Default: 0.5, 1, 5

  • color_norm_probs: list_number.

    Parameter used to get color normalisation factor. color_norm_intensities should be ascending and color_norm_probs should be descending and they should be the same size. The probability of normalised spot color being greater than color_norm_intensities[i] must be less than color_norm_probs[i] for all i.

    Default: 0.01, 5e-4, 1e-5

  • bleed_matrix_score_thresh: number.

    In scaled_k_means part of bleed_matrix calculation, a mean vector for each dye is computed from all spots with a dot product to that mean greater than this.

    Default: 0

  • bleed_matrix_min_cluster_size: int.

    If fewer than this many vectors are assigned to a dye cluster in the scaled_k_means part of bleed_matrix calculation, the expected code for that dye will be set to 0 for all color channels i.e. bleed matrix computation will have failed.

    Default: 10

  • bleed_matrix_n_iter: int.

    Maximum number of iterations allowed in the scaled_k_means part of bleed_matrix calculation.

    Default: 100

  • bleed_matrix_anneal: bool.

    If True, the scaled_k_means calculation will be performed twice. The second run starts from the output of the first, with score_thresh for cluster i set to the median of the scores assigned to cluster i in the first run.

    This limits the influence of bad spots on the bleed matrix.

    Default: True

  • background_weight_shift: maybe_number.

    Shift to apply to the weighting of each background vector to limit the boost of weak spots. The weighting of round r for the fitting of the background vector for channel c is 1 / (spot_color[r, c] + background_weight_shift), so background_weight_shift ensures this does not go to infinity for small spot_color[r, c]. Typical spot_color[r, c] is 1 for an intense spot, so background_weight_shift is a small fraction of this. Leave blank to set it to the median absolute intensity of all pixels on the mid z-plane of the central tile.

    Default: None

  • dp_norm_shift: maybe_number.

    When calculating the dot_product_score, this is the small shift applied when normalising spot_colors to avoid dividing by zero. The value is for a single round and is multiplied by sqrt(n_rounds_used) when computing dot_product_score. The expected norm of a spot_color for a single round is 1, so dp_norm_shift is a small fraction of this. Leave blank to set it to the median L2 norm for a single round of all pixels on the mid z-plane of the central tile.

    Default: None

  • norm_shift_min: number.

    Minimum possible value of dp_norm_shift and background_weight_shift.

    Default: 0.001

  • norm_shift_max: number.

    Maximum possible value of dp_norm_shift and background_weight_shift.

    Default: 0.5

  • norm_shift_precision: number.

    dp_norm_shift and background_weight_shift will be rounded to nearest norm_shift_precision.

    Default: 0.01

  • gene_efficiency_min_spots: int.

    If the number of spots assigned to a gene is less than or equal to this, gene_efficiency[g]=1 for all rounds.

    Default: 25

  • gene_efficiency_max: number.

    Maximum allowed value of gene_efficiency i.e. any one round can be at most this times more important than the median round for every gene.

    Default: 6

  • gene_efficiency_min: number.

    At most ceil(gene_efficiency_min_factor * n_rounds_use) rounds can have gene_efficiency below gene_efficiency_min for any given gene.

    Default: 0.05

  • gene_efficiency_min_factor: number.

    At most ceil(gene_efficiency_min_factor * n_rounds_use) rounds can have gene_efficiency below gene_efficiency_min for any given gene.

    Default: 0.2

  • gene_efficiency_n_iter: int.

    gene_efficiency is computed from spots which pass a quality thresholding based on the bled_codes computed with the gene_efficiency of the previous iteration. This process will continue until the gene_efficiency converges or gene_efficiency_n_iter iterations are reached. 0 means gene_efficiency will not be used.

    Default: 10

  • gene_efficiency_score_thresh: number.

    Spots used to compute gene_efficiency must have dot_product_score greater than gene_efficiency_score_thresh, difference to second best score greater than gene_efficiency_score_diff_thresh and intensity greater than gene_efficiency_intensity_thresh.

    Default: 0.6

  • gene_efficiency_score_diff_thresh: number.

    Spots used to compute gene_efficiency must have dot_product_score greater than gene_efficiency_score_thresh, difference to second best score greater than gene_efficiency_score_diff_thresh and intensity greater than gene_efficiency_intensity_thresh.

    Default: 0.2

  • gene_efficiency_intensity_thresh: maybe_number.

    Spots used to compute gene_efficiency must have dot_product_score greater than gene_efficiency_score_thresh, difference to second best score greater than gene_efficiency_score_diff_thresh and intensity greater than gene_efficiency_intensity_thresh. Leave blank to determine from gene_efficiency_intensity_thresh_percentile.

    Default: None

  • gene_efficiency_intensity_thresh_percentile: int.

    gene_efficiency_intensity_thresh will be set to this percentile of the intensity computed for all pixels on the mid z-plane of the most central tile if not specified.

    Default: 37

  • gene_efficiency_intensity_thresh_precision: number.

    gene_efficiency_intensity_thresh will be rounded to nearest gene_efficiency_intensity_thresh_precision if not given.

    Default: 0.001

  • gene_efficiency_intensity_thresh_min: number.

    Min allowed value of gene_efficiency_intensity_thresh.

    Default: 0.001

  • gene_efficiency_intensity_thresh_max: number.

    Max allowed value of gene_efficiency_intensity_thresh.

    Default: 0.2

  • alpha: number.

    When computing the dot product score, \(\Delta_{s0g}\) between spot \(s\) and gene \(g\), rounds/channels with background already fit contribute less. The larger \(\alpha\), the lower the contribution.

    Set \(\alpha = 0\) to use the normal dot-product with no weighting.

    Default: 120

  • beta: number.

    Constant used in weighting factor when computing dot product score, \(\Delta_{s0g}\) between spot \(s\) and gene \(g\).

    Default: 1
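
As a rough illustration of how norm_shift_min, norm_shift_max and norm_shift_precision interact with an automatically determined background_weight_shift, the sketch below rounds a median absolute intensity to the nearest norm_shift_precision, clips it to the allowed range, and then evaluates the weighting 1 / (spot_color[r, c] + background_weight_shift). The function names, the round-then-clip order and the example intensities are assumptions made for illustration, not the coppafish implementation.

```python
import numpy as np


def auto_shift(pixel_intensity, shift_min=0.001, shift_max=0.5, precision=0.01):
    """Hypothetical helper: median absolute intensity, rounded then clipped."""
    median_abs = np.median(np.abs(pixel_intensity))
    # Round to the nearest multiple of `precision` (assumed order: round, then clip).
    rounded = np.round(median_abs / precision) * precision
    return float(np.clip(rounded, shift_min, shift_max))


def background_weight(spot_color, shift):
    """Weighting 1 / (spot_color[r, c] + shift); assumes non-negative spot_color."""
    return 1.0 / (np.asarray(spot_color, dtype=float) + shift)


# Made-up intensities standing in for the mid z-plane of the central tile.
pixels = np.array([0.02, -0.03, 0.04, 0.01, 0.05])
shift = auto_shift(pixels)

# Example spot_color with shape [n_rounds=2, n_channels=2]:
# one intense value (1) and several weak ones, including an exact zero.
weights = background_weight([[1.0, 0.0], [0.1, 0.01]], shift)
print(shift)
print(weights)
```

With a non-zero shift the weight stays finite even where spot_color[r, c] is exactly 0, which is the behaviour the background_weight_shift description asks for.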

omp

The omp section contains parameters which are used to carry out orthogonal matching pursuit (omp) on every pixel, as well as to convert the results of this to spot locations.

  • use_z: maybe_list_int.

    Can specify z-planes to find spots on. If 2 values are provided, all z-planes between and including the values given will be used.

    Default: None

  • weight_coef_fit: bool.

    If False, gene coefficients are found through omp with normal least squares fitting. If True, gene coefficients are found through omp with weighted least squares fitting, with rounds/channels which already contain genes contributing less.

    Default: False

  • initial_intensity_thresh: maybe_number.

    To save time in call_spots_omp, coefficients are only found for pixels with intensity of absolute spot_colors greater than initial_intensity_thresh. Leave blank to determine it using initial_intensity_thresh_percentile. It is also clamped between initial_intensity_thresh_min and initial_intensity_thresh_max.

    Default: None

  • initial_intensity_thresh_percentile: int.

    If initial_intensity_thresh is not given, it will be set to the initial_intensity_thresh_percentile percentile of the absolute intensity of all pixels on the mid z-plane of the central tile. It uses nb.call_spots.abs_intensity_percentile.

    Default: 25

  • initial_intensity_thresh_min: number.

    Min allowed value of initial_intensity_thresh.

    Default: 0.001

  • initial_intensity_thresh_max: number.

    Max allowed value of initial_intensity_thresh.

    Default: 0.2

  • initial_intensity_precision: number.

    initial_intensity_thresh will be rounded to nearest initial_intensity_precision if not given.

    Default: 0.001

  • max_genes: int.

    The maximum number of genes that can be assigned to each pixel i.e. number of iterations of omp.

    Default: 30

  • dp_thresh: number.

    Pixels only have coefficient found for a gene if that gene has absolute dot_product_score greater than this i.e. this is the stopping criterion for the OMP.

    Default: 0.225

  • alpha: number.

    When computing the dot product score, \(\Delta_{sig}\) between spot \(s\) and gene \(g\) on iteration \(i\) of OMP, rounds/channels with genes already fit to them, contribute less. The larger \(\alpha\), the lower the contribution.

    Set \(\alpha = 0\) to use the normal dot-product with no weighting.

    Default: 120

  • beta: number.

    Constant used in weighting factor when computing dot product score, \(\Delta_{sig}\) between spot \(s\) and gene \(g\) on iteration \(i\) of OMP.

    Default: 1

  • initial_pos_neighbour_thresh: maybe_int.

    Only save spots with number of positive coefficient neighbours greater than initial_pos_neighbour_thresh. Leave blank to determine using initial_pos_neighbour_thresh_param. It is also clipped between initial_pos_neighbour_thresh_min and initial_pos_neighbour_thresh_max.

    Default: None

  • initial_pos_neighbour_thresh_param: number.

    If initial_pos_neighbour_thresh not given, it is set to initial_pos_neighbour_thresh_param multiplied by number of positive values in nb.omp.spot_shape i.e. with initial_pos_neighbour_thresh_param = 0.1, it is set to 10% of the max value.

    Default: 0.1

  • initial_pos_neighbour_thresh_min: int.

    Min allowed value of initial_pos_neighbour_thresh.

    Default: 4

  • initial_pos_neighbour_thresh_max: int.

    Max allowed value of initial_pos_neighbour_thresh.

    Default: 40

  • radius_xy: int.

    To detect spot in coefficient image of each gene, pixel needs to be above dilation with structuring element which is a square (np.ones) of width 2*radius_xy-1 in the xy plane.

    Default: 3

  • radius_z: int.

    To detect spot in coefficient image of each gene, pixel needs to be above dilation with structuring element which is cuboid (np.ones) with width 2*radius_z-1 in z direction. Must be more than 1 to be 3D.

    Default: 2

  • shape_max_size: list_int.

    spot_shape specifies the neighbourhood about each spot in which we count coefficients which contribute to score. It is either given through file_names['omp_spot_shape'] or computed using the below parameters with shape prefix. Maximum Y, X, Z size of spot_shape. Will be cropped if there are zeros at the extremities.

    Default: 27, 27, 9

  • shape_pos_neighbour_thresh: int.

    For a spot to be used to find spot_shape, it must have this many pixels around it on the same z-plane that have a positive coefficient. In 3D, 1 positive pixel is also required on each neighbouring plane (i.e. 2 is added to this value).

    Default: 9

  • shape_isolation_dist: number.

    Spots are isolated if nearest neighbour (across all genes) is further away than this. Only isolated spots are used to find spot_shape.

    Default: 10

  • shape_sign_thresh: number.

    If the mean absolute coefficient sign is less than this in a region near a spot, we set the expected coefficient in spot_shape to be 0. Max mean absolute coefficient sign is 1 so must be less than this.

    Default: 0.15
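
Two of the derived quantities described above can be sketched as follows: the automatic initial_pos_neighbour_thresh (initial_pos_neighbour_thresh_param times the number of positive values in spot_shape, clipped to the min/max range) and the np.ones structuring element of width 2*radius_xy-1 in xy and 2*radius_z-1 in z. The function names, the rounding to the nearest integer, the Y, X, Z axis order and the toy spot_shape are assumptions for illustration only.

```python
import numpy as np


def pos_neighbour_thresh(spot_shape, param=0.1, thresh_min=4, thresh_max=40):
    """Hypothetical helper: param * (number of positive entries), then clipped."""
    n_pos = int(np.sum(spot_shape > 0))
    # Rounding to the nearest integer is an assumption; the source only
    # states the multiplication and the clipping.
    thresh = int(np.round(param * n_pos))
    return int(np.clip(thresh, thresh_min, thresh_max))


def detection_structuring_element(radius_xy=3, radius_z=2):
    """np.ones cuboid used for dilation, with assumed axis order (y, x, z)."""
    width_xy = 2 * radius_xy - 1
    width_z = 2 * radius_z - 1
    return np.ones((width_xy, width_xy, width_z), dtype=bool)


# Toy spot_shape: 100 positive entries inside an 11 x 11 x 5 array of zeros.
spot_shape = np.zeros((11, 11, 5), dtype=int)
spot_shape[3:8, 3:8, 0:4] = 1

thresh = pos_neighbour_thresh(spot_shape)
element = detection_structuring_element()
print(thresh, element.shape)
```

With the default radius_z of 2 the element spans 3 z-planes; radius_z = 1 would give a width of 1 in z, which is why the radius_z description says it must be more than 1 to be 3D.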

thresholds

The thresholds section contains the thresholds used to determine which spots pass a quality thresholding process such that we consider their gene assignments legitimate.

  • intensity: maybe_number.

    Final accepted reference and OMP spots both require intensity > thresholds[intensity]. If not given, will be set to same value as nb.call_spots.gene_efficiency_intensity_thresh. intensity for a really intense spot is about 1 so intensity_thresh should be less than this.

    Default: None

  • score_ref: number.

    Final accepted spots are those which pass quality_threshold, which is nb.ref_spots.score > thresholds[score_ref] and nb.ref_spots.intensity > intensity_thresh. quality_threshold requires the score computed with coppafish/call_spots/dot_product/dot_product_score to exceed this. Max score is 1 so must be below this.

    Default: 0.25

  • score_omp: number.

    Final accepted OMP spots are those which pass quality_threshold, which is: score > thresholds[score_omp] and intensity > thresholds[intensity]. score is given by: score = (score_omp_multiplier * n_neighbours_pos + n_neighbours_neg) / (score_omp_multiplier * n_neighbours_pos_max + n_neighbours_neg_max). Max score is 1 so score_thresh should be less than this.

    Use 0.15 if more concerned about missed spots than false positives.

    Default: 0.263

  • score_omp_multiplier: number.

    Final accepted OMP spots are those which pass quality_threshold, which is: score > thresholds[score_omp] and intensity > thresholds[intensity]. score is given by: score = (score_omp_multiplier * n_neighbours_pos + n_neighbours_neg) / (score_omp_multiplier * n_neighbours_pos_max + n_neighbours_neg_max).

    Use 0.45 if more concerned about missed spots than false positives.

    Default: 0.95
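
The OMP score formula and quality threshold quoted for score_omp and score_omp_multiplier can be written out as a short sketch. The function names, the neighbour counts and the intensity threshold of 0.01 are hypothetical values chosen for illustration; only the formula and the two default thresholds come from the descriptions above.

```python
import numpy as np


def omp_score(n_neighbours_pos, n_neighbours_neg,
              n_neighbours_pos_max, n_neighbours_neg_max,
              score_omp_multiplier=0.95):
    """Score formula as quoted in the score_omp description."""
    numerator = score_omp_multiplier * n_neighbours_pos + n_neighbours_neg
    denominator = (score_omp_multiplier * n_neighbours_pos_max
                   + n_neighbours_neg_max)
    return numerator / denominator


def passes_quality_threshold(score, intensity,
                             score_thresh=0.263, intensity_thresh=0.01):
    """score > thresholds[score_omp] and intensity > thresholds[intensity].

    intensity_thresh=0.01 is a made-up value; by default it is not fixed
    but taken from nb.call_spots.gene_efficiency_intensity_thresh.
    """
    return (score > score_thresh) & (intensity > intensity_thresh)


# Hypothetical spot: half of the possible positive and negative neighbours.
score = omp_score(n_neighbours_pos=50, n_neighbours_neg=10,
                  n_neighbours_pos_max=100, n_neighbours_neg_max=20)
accepted = passes_quality_threshold(np.array([score, 0.2]),
                                    np.array([0.05, 0.05]))
print(score, accepted)
```

A spot whose neighbour counts equal the maxima scores exactly 1, which is why score_omp should be set below 1.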