came.utils.preprocess.subset_matches

came.utils.preprocess.subset_matches(df_match: DataFrame, left: Sequence, right: Sequence, union: bool = False, cols: Sequence[str] | None = None, indicators=False)

Take a subset of token matches (e.g., gene homologies), restricted to the given left and right tokens.

Parameters:
  • df_match (pd.DataFrame) – a dataframe with at least 2 columns

  • left – list-like, for extracting elements in the first column.

  • right – list-like, for extracting elements in the second column.

  • union – if True, keep a match when either its left or its right gene is in the given lists (union); if False (the default), keep it only when both are.

  • cols – if specified, the 2 column names holding the left and right genes, respectively; otherwise the first two columns of df_match are used.

  • indicators – if True, return only the indicators of which rows are kept, instead of the subset dataframe.
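
The parameters above can be illustrated with a small pandas sketch. This is not the CAME source: `subset_matches_sketch` is a hypothetical stand-in written from the parameter descriptions, and the gene table is made-up toy data. It assumes that `union=False` means both sides must match and that the indicators are a boolean mask over the rows of `df_match`.

```python
import pandas as pd


def subset_matches_sketch(df_match, left, right, union=False,
                          cols=None, indicators=False):
    """Hypothetical re-implementation for illustration only (not CAME code)."""
    # Assumption: fall back to the first two columns when cols is not given.
    if cols is None:
        cols = list(df_match.columns[:2])
    c1, c2 = cols

    # Row-wise membership tests for each side of the match.
    in_left = df_match[c1].isin(left)
    in_right = df_match[c2].isin(right)

    # Assumption: union keeps rows matched on either side,
    # the default keeps rows matched on both sides.
    keep = (in_left | in_right) if union else (in_left & in_right)

    return keep if indicators else df_match[keep]


# Toy homology table (illustrative values only).
df_match = pd.DataFrame({
    "human": ["TP53", "GATA1", "SOX2", "MYC", "KLF4"],
    "mouse": ["Trp53", "Gata1", "Sox2", "Myc", "Klf4"],
})
left = ["TP53", "GATA1", "SOX2"]   # genes of interest on the left side
right = ["Trp53", "Myc"]           # genes of interest on the right side

both = subset_matches_sketch(df_match, left, right)
either = subset_matches_sketch(df_match, left, right, union=True)
mask = subset_matches_sketch(df_match, left, right, indicators=True)

print(both)
print(either)
print(mask)
```

With this toy table, the default call keeps only the TP53/Trp53 row, while `union=True` also keeps rows where just one side is listed. Passing `indicators=True` is convenient when the same mask should be applied to other objects aligned with `df_match`.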