swnn.builder module

class swnn.builder.Builder(stage_order)

Bases: object

The wrapper object for conveniently building both the single-cell graph and the coarse-grained tree.

Parameters

stage_order (Sequence) – the order of stages

See also

stagewise_knn, adaptive_tree

property stage_lbs

The original stage labels

property group_lbs

The original group labels

property distmat: scipy.sparse.base.spmatrix

The single-cell distance graph

Return type

spmatrix

property connect: scipy.sparse.base.spmatrix

The single-cell graph (connectivities)

Return type

spmatrix

property connect_bin: scipy.sparse.base.spmatrix

The single-cell graph (connectivities) with binary edges

Return type

spmatrix

property edgedf: Union[None, pandas.core.frame.DataFrame]

The voting tree in a node-parent-proportion format

Return type

Optional[DataFrame]

property refined_group_lbs

The refined group labels

build_graph(X, stage_lbs, binary_edge=True, ks=10, n_pcs=50, pca_base_on='stacked', leaf_size=5, **kwargs)

Build multipartite KNN-graph stage-by-stage.

Parameters
  • X (np.ndarray or sparse matrix) – data matrix, of shape (n_samples, n_features)

  • stage_lbs (Sequence) – stage labels for each sample (nodes in build_graph)

  • binary_edge (bool (default=True)) – whether to use binarized edges. Setting this to True may lose some information but tends to give a more robust result.

  • ks (Union[Sequence[int], int]) – the number of nearest neighbors to be calculated.

  • n_pcs (Union[Sequence[int], int]) – The number of principal components after PCA reduction. If pca_base_on is None, this will be ignored.

  • pca_base_on (str {'x1', 'x2', 'stacked', None} (default='stacked')) – if None, skip PCA and perform KNN in the original data space.

  • leaf_size (int (default=5)) – leaf size passed to BallTree or KDTree, which adjusts the approximation level. Larger values are faster but less likely to find the exact nearest neighbors. Set to 1 for brute-force (exact) KNN.

  • kwargs – other parameters for stagewise_knn

Returns

  • distmat (sparse.csr_matrix) – the distance matrix, of shape (n_samples, n_samples)

  • connect (sparse.csr_matrix) – the connectivities matrix, of shape (n_samples, n_samples)

See also

stagewise_knn
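
As a rough illustration of what a stage-by-stage KNN graph looks like, the following is a minimal pure-Python sketch, not the library's implementation. It assumes that each sample searches for its neighbors only among the samples of the immediately preceding stage, and it uses a brute-force search instead of BallTree or KDTree. The function name stagewise_knn_sketch and the toy data are hypothetical.

```python
import math


def stagewise_knn_sketch(X, stage_lbs, stage_order, k=2):
    """Link every sample to its k nearest samples in the preceding stage.

    Returns a dict {(child_index, parent_index): distance}.
    Assumption: neighbors are searched only in the immediately preceding stage.
    """
    # Indices of the samples belonging to each stage
    by_stage = {s: [i for i, lb in enumerate(stage_lbs) if lb == s]
                for s in stage_order}
    edges = {}
    for prev, curr in zip(stage_order[:-1], stage_order[1:]):
        for i in by_stage[curr]:
            # Brute-force (exact) search over the preceding stage
            dists = sorted((math.dist(X[i], X[j]), j) for j in by_stage[prev])
            for d, j in dists[:k]:
                edges[(i, j)] = d
    return edges


# Toy data: five samples on a line, spread over three stages
X = [(0.0, 0.0), (1.0, 0.0), (0.1, 0.0), (0.9, 0.0), (0.2, 0.0)]
stage_lbs = ["E1", "E1", "E2", "E2", "E3"]
distmat = stagewise_knn_sketch(X, stage_lbs, ["E1", "E2", "E3"], k=1)

# Binarized connectivities: every retained edge gets weight 1
connect_bin = {edge: 1 for edge in distmat}
```

The dictionary of distances plays the role of distmat, and the binarized copy corresponds to what binary_edge=True is described as producing; the real method returns sparse matrices of shape (n_samples, n_samples).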

build_tree(group_lbs, stage_lbs=None, ignore_pa=[], ext_sep='_')

Adaptively build the developmental tree from the stagewise-KNN graph.

Parameters
  • group_lbs (Sequence) – group labels for each sample (nodes in build_graph)

  • stage_lbs (Sequence) – stage labels for each sample (nodes in build_graph)

  • ignore_pa (list or set) – parent nodes to be ignored; empty list by default.

  • ext_sep (str) – separator string used to automatically extract the stage labels from group_lbs

Returns

  • edgedf (pd.DataFrame) – a pd.DataFrame with columns {‘node’, ‘parent’, ‘prop’} and one row per stage-cluster. The column ‘prop’ is the proportion of nodes that vote for the current parent.

  • refined_group_lbs – refined group labels for each sample (e.g. each single cell)

See also

adaptive_tree
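
The node-parent-proportion format can be illustrated with a small voting sketch. This is not the adaptive_tree algorithm itself. It assumes that every graph edge from a sample to a sample of the previous stage counts as one vote for that neighbor's group, and that group labels have the form stage + ext_sep + cluster, so the stage is the prefix before the separator. The function name voting_tree_sketch and the toy labels are hypothetical.

```python
from collections import Counter, defaultdict


def voting_tree_sketch(edges, group_lbs, ignore_pa=()):
    """Pick a parent group for each group by majority vote over edges.

    edges: iterable of (child_index, parent_index) pairs.
    Returns a list of (node, parent, prop) rows.
    Assumption: each edge is one vote for the parent sample's group.
    """
    votes = defaultdict(Counter)  # child group -> votes per parent group
    for child, parent in edges:
        pa = group_lbs[parent]
        if pa in ignore_pa:
            continue  # skip parent nodes that should be ignored
        votes[group_lbs[child]][pa] += 1
    rows = []
    for node, counter in votes.items():
        parent, n = counter.most_common(1)[0]
        rows.append((node, parent, n / sum(counter.values())))
    return rows


group_lbs = ["E1_0", "E1_1", "E2_0", "E2_0", "E2_0", "E2_1"]
edges = [(2, 0), (3, 0), (4, 1), (5, 1)]

# Assumed label layout: the stage is the part before ext_sep ("_")
ext_sep = "_"
stage_lbs = [g.split(ext_sep)[0] for g in group_lbs]

rows = voting_tree_sketch(edges, group_lbs)
```

Here two of the three votes from group E2_0 go to E1_0, so its row carries a prop of 2/3. Groups in the first stage receive no votes and are omitted from this sketch, whereas edgedf is documented as having one row per stage-cluster.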