Find Spots

The find spots step of the pipeline loads in the filtered images for each tile, round, channel saved during the extract step and detects spots on them. We obtain a point cloud from the images because in the stitch and register sections of the pipeline, it is quicker to use point clouds than the full images.

The find_spots NotebookPage is added to the Notebook after this stage is finished.

Spot detection

The spots on tile \(t\), round \(r\), channel \(c\) are the local maxima in the filtered image (loaded in through load_tile(nb.file_names, nb.basic_info, t, r, c)) with an intensity greater than auto_thresh[t, r, c].

Local maxima means pixel with the largest intensity in a neighbourhood defined by config['find_spots']['radius_xy'] and config['find_spots']['radius_z']:

kernel = np.ones((2*radius_xy-1, 2*radius_xy-1, 2*radius_z-1))

The position of the local maxima is found to be where the dilation of the image with the kernel is equal to the image.

The spot detection process can be visualised with view_find_spots.

Optimised spot detection

The dilation method is quite slow, so if jax is installed, a different spot detection method is used.

In this method, we look at all pixels with intensity greater than auto_thresh[t, r, c]. For each of these, we say that the pixel is a spot if it has a greater intensity than all of its neighbouring pixels, where the neighbourhood is determined by the kernel.

The larger auto_thresh[t, r, c] and the smaller the kernel, the faster this method is, whereas the value of auto_thresh[t, r, c] makes no difference to the speed of the dilation method. In our case, auto_thresh[t, r, c] is pretty large as the whole point is that all background pixels (the vast majority) have intensity less than it.

For images of size \(2048 \times 2048 \times 50\), the optimised method is around 9 times faster.

Dealing with duplicates

If there are two neighbouring pixels which have the same intensity, which is the local maxima intensity, by default both pixels will be declared to be local maxima. However, if remove_duplicates == True in detect_spots, only one will be deemed a local maxima.

This is achieved by adding a random shift to the intensity of each pixel. The max possible shift is 0.2 so it will not change the integer version of the image but it will ensure each pixel has a different intensity to its neighbour.

Imaging spots

For non reference spots (all round/channel combinations apart from ref_round/ref_channel), we only use the spots for registration to the reference spots, so the quantity of spots is not important. In fact, registration tends to work better if there are fewer but more reliable spots, as this means there is a lesser chance of matching up spots by chance.

To exploit this, for each imaging tile, round, channel, the point cloud is made up of the max_spots most intense spots on each z-plane. In 2D, max_spots is config['find_spots']['max_spots_2d'] and in 3D, it is config['find_spots']['max_spots_3d']. If there are fewer than max_spots spots detected on a particular z-plane, all the spots will be kept.

Reference spots

We want to assign a gene to each reference spot (ref_round/ref_channel), as well as use it for registration, so it is beneficial to maximise the number of reference spots. As such, we do not do the max_spots thresholding for reference spots.

However, we want to know which reference spots are isolated because when it comes to the bleed_matrix calculation, we do not want to use overlapping spots.

Isolated spots

We deem a spot to be isolated if it has a prominent negative annulus, because if there was an overlapping spot, you would expect positive intensity in the annulus around the spot. We find the intensity of the annulus by computing the correlation of the image with an annulus kernel obtained from annulus(r0, r_xy, r_z) where:

r0 = config['find_spots']['isolation_radius_inner']
r_xy = config['find_spots']['isolation_radius_xy']
r_z = config['find_spots']['isolation_radius_z'].

If the value of this correlation at the location of a spot is less than config['find_spots']['isolation_thresh'], then we deem the spot to be isolated. If config['find_spots']['isolation_thresh'] is not given, it is set to:

config['find_spots']['auto_isolation_thresh_multiplier'] * auto_thresh[t, r, c]

The final isolation thresholds used for each tile are saved as nb.find_spots.isolation_thresh. The process of obtaining isolated spots can be visualised with view_find_spots.

Annulus kernel

The annulus kernel should be equal to 1 over the pixels in the neighbourhood of an isolated spot which are usually negative. The example images below show a typical spot (left) and the annulus kernel used for this data (right) with r0 = 4, r_xy = 14 and r_z = 1. The dimensions of each image is 29 x 29 pixels, red is positive and blue is negative.

z = -1z = 0z = 1

Error - too few spots

After the find_spots NotebookPage has been added to the Notebook, check_n_spots will be run.

This will produce a warning for any tile, round, channel for which fewer than

n_spots_warn = config['find_spots']['n_spots_warn_fraction'] * max_spots * nb.basic_info.nz

spots were detected, where max_spots is config['find_spots']['max_spots_2d'] if 2D and config['find_spots']['max_spots_3d'] if 3D.

An error will be raised if any of the following is satisfied:

For any given channel, the number of spots found was less than n_spots_warn for at least the fraction n_spots_error_fraction of tiles/rounds.

The faulty channels should then be removed from use_channels.
For any given tile, the number of spots found was less than n_spots_warn for at least the fraction n_spots_error_fraction of rounds/channels.

The faulty tiles should then be removed from use_tiles.
For any given round, the number of spots found was less than n_spots_warn for at least the fraction n_spots_error_fraction of tiles/channels.

The faulty rounds should then be removed from use_rounds.

Example

The following is the \(n_{tiles}\) (3) x \(n_{rounds}\) (5) array of number of spots found for a given channel:

spot_no = array([[1295, 1016,  869,  719,  829],
                 [1055,  888,  687,  556,  824],
                 [5901, 4208, 5160, 4069, 4006]])

The value of n_spots_warn for this experiment is 3500 so a warning will be raised for the 10 tiles/rounds for which spot_no[t, r] < n_spots_warn:

array([[0, 0],
       [0, 1],
       [0, 2],
       [0, 3],
       [0, 4],
       [1, 0],
       [1, 1],
       [1, 2],
       [1, 3],
       [1, 4]])

The value of n_spots_error_fraction for this experiment is 0.5 so the threshold number of failed tiles/rounds to give an error is \(0.5 \times n_{tiles} \times n_{rounds} = 7.5\). We have 10 failed tiles/rounds so an error would be raised in this case.

The use_tiles, use_rounds and use_channels parameters can be changed without having to re-run the find_spots section of the pipeline as explained here. If tiles/rounds/channels are added instead of removed though, it will need re-running, as will the extract step.

`n_spots_grid`

The n_spots_grid function is useful to visualise the number of spots detected on each tile, round and channel:

GoodBad

In the good example, you can see from the minimum on the colorbar that lots of spots have been detected on every image. You can also see that tile 13 seems to have significantly fewer spots than the other tiles and that round 0 often has fewer spots than the other rounds.

In the bad example, all tiles, rounds and channels where n_spots < n_spots_warn are highlighted by a red border. Clearly, channels 0 and 3 did not work for this experiment and channel 1 probably didn't. Also, tile 1 appears to have fewer spots than the other tiles.

Viewer

We can see how the various parameters affect which spots are detected using view_find_spots. This can be called as follows (in the Without Notebook case, the raw images will be loaded and then filtered according to parameters in config['extract']).

With NotebookWithout NotebookReference round/channel showing isolated spots

from coppafish import Notebook
from coppafish.plot import view_find_spots
nb_file = '/Users/user/coppafish/experiment/notebook.npz'
nb = Notebook(nb_file)
t = 1       # tile to view
r = 3       # round to view
c = 6       # channels to view
view_find_spots(nb, t, r, c)

from coppafish.plot import view_filter
ini_file = '/Users/user/coppafish/experiment/settings.ini'
t = 1       # tile to view
r = 3       # round to view
c = 6       # channel to view
view_find_spots(None, t, r, c, config_file=ini_file)

from coppafish import Notebook
from coppafish.plot import view_find_spots
nb_file = '/Users/user/coppafish/experiment/notebook.npz'
nb = Notebook(nb_file)
t = 1                               # tile to view
r = nb.basic_info.ref_round         # round to view
c = nb.basic_info.ref_channel       # channel to view
view_find_spots(nb, t, r, c, show_isolated=True)

This will open a napari viewer with up to 5 sliders in the bottom left:

Detection Radius YX: This is the value of config['find_spots']['r_xy'].
Detection Radius Z: This is the value of config['find_spots']['r_z'].
Intensity Threshold: This is the value of nb.extract.auto_thresh[t, r, c].
Isolation Threshold: This is the value of nb.find_spots.isolation_thresh[t]. It will only appear if show_isolated == True, r = nb.basic_info.ref_round and c = nb.basic_info.ref_channel.
Z Thickness: Spots detected on the current z-plane and this many z-planes either side of it will be shown. Initially, this will be set to 1 so spots from the current z-plane and 1 either side of it will be shown.

Whenever the first two are changed, the dilation will be redone using the new values of the radii and the time taken will be printed to the console. The correlation calculation required to determine which spots are isolated is slow hence, by default show_isolated == False.

Z thickness

The images below show the effect of changing the z-thickness. The size of the spots is related to the z-plane they were detected on. The closer to the current z-plane, the larger they appear:

012

The blue spot has a neighbouring pixel with negative intensity and is not kept in the final point cloud.

Detection radius

YX

The images below show the effect of using the slider to change config['find_spots']['r_xy'] with config['find_spots']['r_z'] fixed at 2 and z thickness = 2:

2310

Clearly, as the radius increases the spots have to be separated by a larger distance, resulting in less spots found.

Z

The images below show the effect of using the slider to change config['find_spots']['r_z'] with config['find_spots']['r_yx'] fixed at 2 and z thickness = 2:

2612

Again we see the number of spots reducing as this increases. However, it is less clear as to why, because the minimum separation in z is what is changing but, we are only seeing spots from 5 z-planes imposed on a single z-plane.

Intensity threshold

The images below show the effect of using the slider to change nb.extract.auto_thresh[t, r, c]. This is for a 2D experiment with config['find_spots']['r_yx'] = 2.

44722344757

This is useful to see what a suitable intensity threshold should be. For example, the 447 plot identifies spots which do not look real while the 4757 plot misses some obvious spots.

The value of nb.extract.auto_thresh[t, r, c] obtained for this data is approximately 2234.

The green spots are those which are identified as isolated.

Changing `auto_thresh`

If after playing with this slider, it is decided that the find_spots part of the pipeline should be run (or re-run) with the updated intensity threshold, new_thresh, for tile \(t\), round \(r\), channel \(c\), this can be achieved by setting nb.extract.auto_thresh[t, r, c] = new_thresh.

Note, this is an abuse of the rules of the Notebook as it is changing a variable after it has been added and thus should not be allowed, but it does work. To keep the new_thresh value, the Notebook will need saving after nb.extract.auto_thresh has been updated. It should be saved to a different location, so it does not overwrite the initial Notebook i.e. nb.save('/Users/user/experiment1/output/notebook_new_auto_thresh.npz').

Isolation threshold

The images below show the effect of using the slider to change nb.find_spots.isolation_thresh[t]. This is for a 2D experiment with:

config['find_spots']['r_yx'] = 2,
config['find_spots']['isolation_radius_inner'] = 2
config['find_spots']['isolation_radius_xy'] = 2.

There is no slider to change the isolation radii because the dilation calculation is quite slow and has to be re-done everytime the radii change.

-200-438-494

Here, we see that as the absolute threshold increases, the spots need a darker (more negative) annulus to be considered isolated (green).

The value of -438 is approximately the value of nb.find_spots.isolation_thresh[t] used for this data (i.e. config['find_spots']['auto_isolation_thresh_multiplier'] = -0.2 and auto_thresh[t, r, c] = 2234 gives -447 which is almost the same).

Pseudocode

This is the pseudocode outlining the basics of this step of the pipeline.

for r in use_rounds:
    for t in use_tiles:
        for c in use_channels:
            if r is anchor_round and c is not anchor_channel:
                Skip to next channel as no spots need detecting on this channel.
            im = load image from npy file in tile directory
            spots_trc = detect spots(im)
            Remove spots with negative neighbouring pixel from spots_trc
            if r is ref_round and c is ref_channel:
                Determine which spots are isolated.
            else:
                Keep only most intense spots on each z-plane.
            Set spot_no[t, r, c] to be the number of spots found.
            For spot s, record in spot_details[s]:
              - tile: tile found on
              - round: round found on
              - channel: channel found on
              - isolated: whether spot isolated (if not ref_round/ref_channel, 
                this will be False)
              - y: y coordinate of spot in tile
              - x: x coordinate of spot in tile
              - z: z coordinate of spot in tile
Add isolation_threshold, spot_no and spot_details to find_spots NotebookPage
Return find_spots NotebookPage