came.AlignedDataPair

class came.AlignedDataPair(features: Sequence[spmatrix | ndarray], ov_adjs: Sequence[spmatrix | ndarray], oo_adjs: Sequence[spmatrix] | None = None, varnames_feat: Sequence[str] | None = None, varnames_node: Sequence[str] | None = None, obs_dfs: Sequence[DataFrame] | None = None, var_dfs: Sequence[DataFrame] | None = None, dataset_names: Sequence[str] = ('reference', 'query'), ntypes: Dict[str, str] | None = None, etypes: Dict[str, str] | None = None, make_graph: bool = True, **kwds)

Paired datasets with aligned features (e.g., cross-dataset or cross-omics).

Parameters:

features (list or tuple) – a list or tuple of 2 feature matrices holding the common (aligned) features, used as node features for the observations; of shape (n_obs1, n_features) and (n_obs2, n_features)
ov_adjs (list or tuple) – a list or tuple of 2 (sparse) feature matrices holding the unaligned features, used for making ov_adj; of shape (n_obs1, n_vnodes1) and (n_obs2, n_vnodes2)
varnames_feat (list or tuple) – names of variables that will be treated as node-features for observations
varnames_node (list or tuple) – names of variables that will be treated as nodes.
obs_dfs (list or tuple) – a list or tuple of 2 DataFrames
ntypes (dict) – A dict for specifying names of the node types
etypes (dict) – A dict for specifying names of the edge types
**kwds – other keyword arguments for the HeteroGraph construction

Examples

>>> dpair = AlignedDataPair(
...     [features1, features2],
...     [ov_adj1, ov_adj2],
...     varnames_feat = vars_feat,
...     varnames_node = vars_node,
...     obs_dfs = [obs1, obs2],
...     dataset_names=dataset_names,
...     )
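The example above assumes that `features1`, `ov_adj1`, `obs1` and the other inputs already exist. The following is a minimal sketch of how toy inputs with the documented shapes could be prepared. All sizes, gene names, and the `cell_type` column are made up for illustration, and the sketch uses the same number of variable nodes in both datasets as a simplifying choice. The `build_pair` helper is hypothetical; it only wraps the constructor call shown above and needs CAME installed to run.

```python
import numpy as np
import pandas as pd
from scipy import sparse

rng = np.random.default_rng(0)

# Toy sizes (illustrative only).
n_obs1, n_obs2 = 5, 4   # observations in the reference / query data
n_features = 3          # aligned features shared by both datasets
n_vnodes = 6            # variable nodes (same count in both datasets here)

# Aligned node features: both matrices must have n_features columns.
features1 = rng.random((n_obs1, n_features))
features2 = rng.random((n_obs2, n_features))

# Observation-by-variable matrices, one row per observation, stored sparse.
ov_adj1 = sparse.csr_matrix(rng.integers(0, 2, size=(n_obs1, n_vnodes)))
ov_adj2 = sparse.csr_matrix(rng.integers(0, 2, size=(n_obs2, n_vnodes)))

# Per-observation annotations; the column name is an assumption.
obs1 = pd.DataFrame({"cell_type": ["A", "B", "A", "B", "A"]})
obs2 = pd.DataFrame({"cell_type": ["A", "B", "A", "B"]})

vars_feat = [f"feat_gene_{i}" for i in range(n_features)]
vars_node = [f"node_gene_{i}" for i in range(n_vnodes)]
dataset_names = ("reference", "query")


def build_pair():
    """Hypothetical wrapper around the documented constructor call."""
    # Imported lazily so the input preparation above runs without CAME.
    from came import AlignedDataPair

    return AlignedDataPair(
        [features1, features2],
        [ov_adj1, ov_adj2],
        varnames_feat=vars_feat,
        varnames_node=vars_node,
        obs_dfs=[obs1, obs2],
        dataset_names=dataset_names,
    )
```

Checking the row and column counts before calling the constructor, as the comments indicate, is a cheap way to catch a transposed or misaligned matrix early.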

__init__(features: Sequence[spmatrix | ndarray], ov_adjs: Sequence[spmatrix | ndarray], oo_adjs: Sequence[spmatrix] | None = None, varnames_feat: Sequence[str] | None = None, varnames_node: Sequence[str] | None = None, obs_dfs: Sequence[DataFrame] | None = None, var_dfs: Sequence[DataFrame] | None = None, dataset_names: Sequence[str] = ('reference', 'query'), ntypes: Dict[str, str] | None = None, etypes: Dict[str, str] | None = None, make_graph: bool = True, **kwds)

Methods

`__init__`(features, ov_adjs[, oo_adjs, ...])
`get_feature_dict`([astensor, scale, unit_var])
`get_obs_anno`(keys[, which, concat])	get the annotations of samples (observations)
`get_obs_dataset`()
`get_obs_features`([astensor, scale, ...])
`get_obs_ids`([which, astensor])	get node indices for obs-nodes (samples), choices are:
`get_obs_labels`(keys[, astensor, train_use, ...])	make labels for model training
`get_whole_net`([rebuild])
`load`(fp)	load an object; fp is the file path to an `AlignedDataPair` object, e.g., 'datapair_init.pickle'
`make_ov_adj`()
`make_whole_net`([selfloop_o, selfloop_v])	make the whole hetero-graph (e.g.
`save_init`([path])	save object for reloading
`set_common_obs_annos`([df, ignore_index])	Shared and merged annotation labels for ALL of the observations in both datasets.
`set_dataset_names`(dataset_names)
`set_etypes`(etypes)
`set_features`(features[, varnames_feat])	setting feature matrices, where features are aligned across datasets.
`set_ntypes`(ntypes)
`set_obs_dfs`([obs1, obs2])
`set_oo_adj`([oo_adjs])
`set_ov_adj`(ov_adjs)	set observation-by-variable adjacency matrices
`set_varnames_node`([varnames_node, index])
`summary_graph`()

Attributes

`G`	The graph structure, of type `dgl.Heterograph`
`classes`	Unique classes (types) in the reference data, may contain "unknown" if there are any types in the query data but not in the reference, or if the query data is un-labeled.
`etypes`
`labels`	Labels for each observation, used as the supervised information for model training.
`n_feats`	Number of dimensions of the observation-node features
`n_obs`	Total number of the observations (e.g., cells)
`n_obs1`	Number of observations (e.g., cells) in the reference data
`n_obs2`	Number of observations (e.g., cells) in the query data
`n_vnodes`	Total number of the variables (e.g., genes)
`ntypes`
`obs_ids`	All of the observation (e.g., cell) indices
`obs_ids1`	Indices of the observation (e.g., cell) nodes in the reference data
`obs_ids2`	Indices of the observation (e.g., cell) nodes in the query data
`oo_adj`	observation-by-observation adjacency matrix (e.g.
`ov_adj`	merged adjacency matrix between observation and variable nodes (e.g.
`varnames_feat`	The observation feature names
`varnames_node`	Names of variable nodes
`vnode_ids`	All of the variable (e.g., gene) indices
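
The descriptions of `classes` and `labels` above can be illustrated with a small stand-alone sketch. It is not CAME's implementation: `encode_labels` is a hypothetical helper, the sorted class order is an assumption, and an unlabeled query observation is represented by `None`. It only shows how an "unknown" class can arise when the query data contains types that are absent from the reference or has no label at all.

```python
def encode_labels(ref_labels, query_labels, unknown="unknown"):
    """Hypothetical helper: derive classes and integer labels for both datasets."""
    # Classes come from the reference data only (sorted order is an assumption).
    classes = sorted(set(ref_labels))
    known = set(classes)

    # Add "unknown" only if some query label is missing or not in the reference.
    if any(lab not in known for lab in query_labels):
        classes.append(unknown)

    # Map every observation (reference first, then query) to a class index.
    index = {name: i for i, name in enumerate(classes)}
    merged = list(ref_labels) + [
        lab if lab in known else unknown for lab in query_labels
    ]
    codes = [index[lab] for lab in merged]
    return classes, codes


ref = ["B cell", "T cell", "T cell"]
query = ["T cell", "NK cell", None]  # one unseen type, one unlabeled cell
classes, codes = encode_labels(ref, query)
print(classes)
print(codes)
```

Here both the unseen "NK cell" and the unlabeled observation fall into the extra "unknown" class, while a query that only contains reference types would leave `classes` unchanged.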