FAQs ==== About the input format ---------------------- **Q**: I processed my data using Seurat, and transformed them into .h5ad files. But an error occurred when I passed them into CAME's default pipeline. **A**: The problem is caused by "the h5ad files converted from seurat-object by ``SeuratDisk``". CAME process the data from the raw-count matrices. So please use scanpy to construct the AnnData object from the raw-count matrices (e.g., read from the ``*.mtx`` and ``*.txt`` files by ``scanpy.read()``) You can also use the following R code to export the filtered scRNA-seq data into an ``.h5`` file, which takes less time and space. .. code-block:: R # R code library(rhdf5) library(Matrix) save_h5mat = function(mat, fp_h5, feature_type, genome=""){ # save sparse.mat ('dgCMatrix' format) into a h5 file # ======= Test code ====== # tmp = Seurat::Read10X_h5(fp_h5) # all(tmp@x == mat@x) # all(tmp@i == mat@i) # all(tmp@p == mat@p) message(fp_h5) h5createFile(fp_h5) root = "matrix" h5createGroup(fp_h5, root) h5write(dim(mat), fp_h5, paste(root, "shape", sep='/')) h5write(mat@x, fp_h5, paste(root, "data", sep='/')) h5write(mat@i, fp_h5, paste(root, "indices", sep='/')) # mat@i - 1 ? h5write(mat@p, fp_h5, paste(root, "indptr", sep='/')) h5write(colnames(mat), fp_h5, paste(root, "barcodes", sep='/')) feat_root = paste(root, "features", sep='/') h5createGroup(fp_h5, feat_root) h5write(rownames(mat), fp_h5, paste(feat_root, "id", sep='/')) h5write(rownames(mat), fp_h5, paste(feat_root, "name", sep='/')) h5write(rep(feature_type, dim(mat)[1]), fp_h5, paste(feat_root, "feature_type", sep='/')) h5write(rep("", dim(mat)[1]), fp_h5, paste(feat_root, "derivation", sep='/')) h5write(rep(genome, dim(mat)[1]), # "mm10" fp_h5, paste(feat_root, "genome", sep='/')) h5write(c("genome", "derivation"), fp_h5, paste(feat_root, "_all_tag_keys", sep='/')) h5closeAll() message("Done!") } # save_h5mat_peak = function(mat, fp_h5, genome=""){ # save_h5mat(mat, fp_h5, feature_type = "Peaks", genome = genome) # } save_h5mat_gex = function(mat, fp_h5, genome=""){ save_h5mat(mat, fp_h5, feature_type = "Gene Expression", genome = genome) } # save the raw-counts in a Seurat-object "seurat_obj" mat = seurat_obj[["RNA"]]@counts save_h5mat_gex(mat, "matrix.raw.h5", genome="") # save the meta-data into a csv file: meta_data = seurat_obj@meta.data write.csv(meta_data, "metadata.csv") And read the h5 file using Scanpy's build-in function: .. code-block:: Python # python-code import pandas as pd import scanpy as sc fp_mat = 'matrix.raw.h5' fp_meta = 'metadata.csv' adata_raw = sc.read_10x_h5(fp_mat) metadata = pd.read_csv(fp_meta, index_col=0) # add meta-data for c in metadata.columns: adata_raw.obs[c] = metadata[c]