came.pipeline.gather_came_results

came.pipeline.gather_came_results(dpair: DataPair | AlignedDataPair, trainer: Trainer, classes: Sequence, keys: Sequence[str], keys_compare: Sequence[str] | None = None, resdir: str | Path = '.', checkpoint: int | str = 'best', batch_size: int | None = None, save_hidden_list: bool = True, save_dpair: bool = True)

A packaged pipeline function that performs the following steps:

  1. load the ‘best’ or the given checkpoint (model)

  2. get the predictions for cells, including probabilities (from logits)

  3. get the hidden states for both cells and genes

  4. make a predictor
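
Step 2 above turns per-cell logits into class probabilities. The sketch below illustrates the idea with a row-wise softmax in plain Python. It is an assumption that softmax is the mapping used; the library may apply a different one (for example, a sigmoid), and the class names and logit values here are made up for illustration.

```python
import math

def softmax(row):
    """Convert one cell's logits into probabilities that sum to 1."""
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical class space and logits (one row per cell, one column per class).
classes = ["B cell", "T cell", "unknown"]
logits = [
    [2.0, 0.5, -1.0],
    [0.1, 3.0, 0.2],
]

probs = [softmax(row) for row in logits]

# The predicted label of each cell is the class with the highest probability.
pred_labels = [classes[p.index(max(p))] for p in probs]
```

Because softmax preserves the ordering of the logits, the predicted label is the same whether it is taken from the logits or from the probabilities.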

Parameters:
  • dpair – the DataPair or AlignedDataPair object. Note that it may be modified in place by this function.

  • trainer – the model trainer

  • classes – the class (or cell-type) space

  • keys – a pair of names like [key_class1, key_class2], where key_class1 is the column name of the reference cell-type labels, and key_class2 for the query, which can be set as None if there are no labels for the query data. These labels will be extracted and stored in the column ‘REF’ of dpair.obs.

  • keys_compare – a pair of names like [key_class1, key_class2], just for comparison. These labels will be extracted and stored in the column ‘celltype’ of dpair.obs.

  • resdir – the result directory

  • checkpoint – specify which checkpoint to adopt. Candidates are ‘best’, ‘last’, or an integer.

  • batch_size – the batch size; set it when your GPU memory is limited

  • save_hidden_list – whether to save the hidden states into {resdir}/hidden_list.h5

  • save_dpair – whether to save the dpair elements into {resdir}/datapair_init.pickle
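
The parameters above can be put together as in the following sketch. The helper names `expected_outputs` and `run_gather`, the column name `"cell_type"`, and the directory `"_results"` are hypothetical and only serve the illustration; the keyword arguments and file names are taken from the signature and parameter descriptions on this page. The return value is passed through unchanged because its structure is not described here.

```python
from pathlib import Path

def expected_outputs(resdir=".", save_hidden_list=True, save_dpair=True):
    """Illustrative helper: files this page says are written under resdir."""
    resdir = Path(resdir)
    paths = {}
    if save_hidden_list:
        paths["hidden_list"] = resdir / "hidden_list.h5"
    if save_dpair:
        paths["dpair"] = resdir / "datapair_init.pickle"
    return paths

def run_gather(dpair, trainer, classes, resdir="_results", batch_size=None):
    """Call gather_came_results with the documented keyword arguments."""
    # Imported lazily so this sketch can be loaded without CAME installed.
    from came import pipeline

    return pipeline.gather_came_results(
        dpair,
        trainer,
        classes=classes,
        # "cell_type" is a hypothetical reference label column;
        # None means the query data has no labels.
        keys=["cell_type", None],
        keys_compare=None,
        resdir=resdir,
        checkpoint="best",      # or 'last', or an integer
        batch_size=batch_size,  # set a value when GPU memory is limited
        save_hidden_list=True,
        save_dpair=True,
    )
```

Calling `expected_outputs("_results")` lists the two files to look for after `run_gather` finishes with both save flags enabled.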