stream plots

[1]:
import scanpy as sc
adata = sc.datasets.paul15()
WARNING: In Scanpy 0.*, this returned logarithmized data. Now it returns non-logarithmized data.
[2]:
adata
[2]:
AnnData object with n_obs × n_vars = 2730 × 3451
    obs: 'paul15_clusters'
    uns: 'iroot'
[3]:
import stream2 as st2
[4]:
workdir = 'result_stream_plots'
st2.settings.set_workdir(workdir)
Saving results in: result_stream_plots
[16]:
st2.settings.set_figure_params(dpi=80,
                              style='white',
                              fig_size=[5,5],
                              rc={'image.cmap': 'viridis'})
[6]:
st2.pp.filter_genes(adata,min_n_cells=3)
Before filtering:
2730 cells, 3451 genes
Filter genes based on min_n_cells
After filtering out low-expressed genes:
2730 cells, 3451 genes
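The counts above are identical before and after filtering (2730 cells, 3451 genes), so no gene fell below the threshold. As a rough illustration of what a `min_n_cells` filter does, here is a minimal pure-Python sketch. The helper name `filter_genes_by_min_cells` and the rule "detected means count > 0" are assumptions made for illustration, not stream2's actual implementation.

```python
# Hypothetical helper for illustration; not stream2's implementation.
def filter_genes_by_min_cells(counts, gene_names, min_n_cells=3):
    """Keep genes with a nonzero count in at least `min_n_cells` cells.

    `counts` is a list of rows (one per cell), each a list of per-gene counts.
    """
    n_genes = len(gene_names)
    # Number of cells in which each gene is detected (assumed: count > 0).
    n_cells_per_gene = [
        sum(1 for row in counts if row[j] > 0) for j in range(n_genes)
    ]
    keep = [j for j in range(n_genes) if n_cells_per_gene[j] >= min_n_cells]
    filtered = [[row[j] for j in keep] for row in counts]
    return filtered, [gene_names[j] for j in keep]


# Toy matrix: 4 cells x 3 genes (made-up numbers).
counts = [
    [5, 0, 1],
    [3, 0, 2],
    [0, 1, 4],
    [2, 0, 7],
]
filtered, kept = filter_genes_by_min_cells(
    counts, ["g1", "g2", "g3"], min_n_cells=3
)
print(kept)
```

In the toy matrix only `g2` is detected in fewer than three cells, so it is the only gene dropped.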
[7]:
st2.pp.cal_qc_rna(adata)
[8]:
st2.pl.violin(adata,list_obs=['n_counts','n_genes','pct_mt'], fig_size=(3,3))
_images/stream_plots_10_0.png
[9]:
st2.pp.normalize(adata,method='lib_size')
st2.pp.log_transform(adata)
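For readers unfamiliar with library-size normalization, the two calls above can be pictured with the sketch below. It is only an approximation of the idea: the scale factor of `1e4` and the use of the natural `log(1 + x)` are assumptions, and stream2 may use different constants or a different logarithm.

```python
import math


# Hypothetical helpers for illustration; not stream2's implementation.
def normalize_lib_size(counts, scale_factor=1e4):
    """Scale each cell (row) so that its counts sum to `scale_factor`."""
    normalized = []
    for row in counts:
        total = sum(row)
        if total == 0:
            # An empty cell stays all-zero instead of dividing by zero.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([x / total * scale_factor for x in row])
    return normalized


def log_transform(matrix):
    """Apply log(1 + x) element-wise (assumed form of the transform)."""
    return [[math.log1p(x) for x in row] for row in matrix]


# Two toy cells with the same composition but a 10x difference in depth.
counts = [[1, 3], [10, 30]]
norm = normalize_lib_size(counts)
logged = log_transform(norm)
print(norm)
```

After normalization the two toy cells become identical, which is the point of the step: differences in sequencing depth no longer dominate the comparison between cells.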
[10]:
st2.pp.select_variable_genes(adata)
st2.pl.variable_genes(adata,show_texts=True)
2000 variable genes are selected.
_images/stream_plots_12_1.png
[11]:
st2.pp.pca(adata,feature='highly_variable', n_components=50)
st2.pl.pca_variance_ratio(adata, log=True)
the number of selected PC is: 50
_images/stream_plots_14_1.png
[12]:
st2.tl.dimension_reduction(adata,obsm='X_pca',n_dim=40, n_jobs=4)
[13]:
st2.pl.dimension_reduction(adata,color=['paul15_clusters','n_genes'],
                           dict_drawing_order={'paul15_clusters':'random', 'n_genes':'sorted'},
                           fig_legend_ncol=2, fig_size=(5.5,5))
_images/stream_plots_16_0.png
[14]:
st2.tl.seed_graph(adata, obsm='X_dr', n_clusters=15)
Seeding initial graph...
Clustering...
K-Means clustering ...
Calculating minimum spanning tree...
[15]:
st2.tl.learn_graph(adata, obsm='X_dr', n_nodes=20, method='principal_tree',)
[17]:
st2.pl.graph(adata, color=['paul15_clusters', 'n_genes'],
             show_text=True, show_node=True,
             fig_legend_ncol=2,)
_images/stream_plots_19_0.png

In the new version of stream plots, nodes keep the same IDs as in the graph, so there is no longer any need to keep a separate “flat tree”. The cell state IDs used in STREAM1 (e.g., ‘S1’, ‘S2’) are removed.

start from a leaf node

for non-numerical variables, a color palette already stored in adata is detected and reused automatically, so the same categories keep the same colors across plots
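The palette reuse described above can be pictured as a small cache keyed by variable name. The sketch below uses a plain dict in place of `adata.uns`; the `'color'` entry does appear in `adata.uns` later in this notebook, but its internal layout, the helper name `get_palette`, and the category labels are assumptions made for illustration.

```python
# Hypothetical fallback colors; any list of color strings would do.
DEFAULT_COLORS = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]


def get_palette(uns, key, categories):
    """Return a category -> color mapping, reusing a stored one if present."""
    stored = uns.setdefault("color", {})
    palette = stored.get(key)
    # Build a new palette only if none is stored or a category is missing.
    if palette is None or not set(categories) <= set(palette):
        palette = {
            cat: DEFAULT_COLORS[i % len(DEFAULT_COLORS)]
            for i, cat in enumerate(sorted(categories))
        }
        stored[key] = palette
    return palette


uns = {}  # stand-in for adata.uns
# "A", "B", "C" are placeholder labels, not the real cluster names.
first = get_palette(uns, "paul15_clusters", ["A", "B", "C"])
second = get_palette(uns, "paul15_clusters", ["C", "A"])
print(first is second)
```

The second call returns the stored mapping rather than building a new one, which is what keeps colors consistent between the graph plot and the stream plots.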

[18]:
st2.pl.stream_sc(adata, source=6,
                 dist_scale=0.2,
                 color=['paul15_clusters', 'n_genes'],
                 fig_ncol=2,
                 fig_legend_ncol=2,
                 save_fig=False)
_images/stream_plots_24_0.png
[19]:
st2.pl.stream(adata, source=6,
              color=['paul15_clusters', 'n_genes'],
              fig_legend_ncol=1,
              save_fig=False)
_images/stream_plots_25_0.png
_images/stream_plots_25_1.png

if color is not specified, the default color is used and no title or legend is shown

[20]:
st2.pl.stream_sc(adata, source=6,
                 dist_scale=0.2,
                 save_fig=False)
_images/stream_plots_28_0.png
[21]:
st2.pl.stream(adata, source=6,
              save_fig=False)
_images/stream_plots_29_0.png

start from a branching node

[22]:
st2.pl.stream_sc(adata, source=7,
                 dist_scale=0.2,
                 color=['paul15_clusters', 'n_genes'],
                 fig_ncol=2,
                 fig_legend_ncol=2,
                 save_fig=False)
_images/stream_plots_32_0.png
[23]:
st2.pl.stream(adata, source=7,
              color=['paul15_clusters', 'n_genes'],
              fig_legend_ncol=1,
              save_fig=False)
_images/stream_plots_33_0.png
_images/stream_plots_33_1.png

start from any node (such as a connecting node)

[24]:
st2.pl.stream_sc(adata, source=9,
                 dist_scale=0.2,
                 color=['paul15_clusters', 'n_genes'],
                 fig_ncol=2,
                 fig_legend_ncol=2,
                 save_fig=False)
_images/stream_plots_36_0.png
[25]:
st2.pl.stream(adata, source=9,
              color=['paul15_clusters', 'n_genes'],
              fig_legend_ncol=1,
              save_fig=False)
_images/stream_plots_37_0.png
_images/stream_plots_37_1.png
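The three kinds of source node used above (leaf, branching, connecting) differ only in how many edges touch them. A minimal sketch of that classification follows; the edge list is a made-up toy tree, not the graph learned in this notebook, and `classify_nodes` is a hypothetical helper rather than part of stream2.

```python
from collections import defaultdict


def classify_nodes(edges):
    """Label each node of an undirected graph by its degree."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    kinds = {}
    for node, d in degree.items():
        if d == 1:
            kinds[node] = "leaf"        # end of a branch
        elif d == 2:
            kinds[node] = "connecting"  # interior node along a branch
        else:
            kinds[node] = "branching"   # three or more branches meet here
    return kinds


# Toy tree: 0 - 1 - 2, with 2 splitting into 3 and 4.
edges = [(0, 1), (1, 2), (2, 3), (2, 4)]
kinds = classify_nodes(edges)
print(kinds)
```

A helper like this can be handy for picking a value of `source` when the node labels in the graph plot are hard to read.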

the branches can still be rearranged by specifying the parameter ‘preference’
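The exact semantics of ‘preference’ are not spelled out here. The sketch below assumes one plausible reading: it is a ranked list of node IDs, and at every split the branch whose subtree contains a higher-ranked node is laid out first. The function `order_branches` and the toy tree are hypothetical and are not taken from stream2.

```python
def order_branches(edges, source, preference=()):
    """Depth-first node order from `source`, visiting preferred branches first."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    # Lower rank means higher priority; unlisted nodes share the worst rank.
    rank = {node: i for i, node in enumerate(preference)}
    worst = len(preference)

    def visit(node, parent):
        """Return (best rank found in this subtree, DFS order of the subtree)."""
        best = rank.get(node, worst)
        children = [visit(nxt, node) for nxt in adj.get(node, []) if nxt != parent]
        # Stable sort: ties keep their original adjacency order.
        children.sort(key=lambda item: item[0])
        order = [node]
        for child_best, child_order in children:
            best = min(best, child_best)
            order.extend(child_order)
        return best, order

    return visit(source, None)[1]


# Toy tree: 0 - 1, 1 splits into 2 and 3, 3 splits into 4 and 5.
edges = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]
default_order = order_branches(edges, source=0)
preferred_order = order_branches(edges, source=0, preference=[5])
print(default_order, preferred_order)
```

With `preference=[5]`, the branch leading to node 5 is moved ahead of its siblings at both splits, while the rest of the layout is unchanged.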

[26]:
st2.pl.stream_sc(adata, source=9,
                 dist_scale=0.2,
                 color=['paul15_clusters', 'n_genes'],
                 preference=[4, 5],
                 fig_ncol=2,
                 fig_legend_ncol=2,
                 save_fig=False)
_images/stream_plots_40_0.png
[27]:
st2.pl.stream(adata, source=9,
              color=['paul15_clusters', 'n_genes'],
              preference=[4, 5],
              fig_legend_ncol=1,
              save_fig=False)
_images/stream_plots_41_0.png
_images/stream_plots_41_1.png
[28]:
adata
[28]:
AnnData object with n_obs × n_vars = 2730 × 3451
    obs: 'paul15_clusters', 'n_counts', 'n_genes', 'pct_genes', 'pct_mt', 'kmeans', 'seed_epg_node_id', 'seed_epg_node_dist', 'seed_epg_edge_id', 'seed_epg_edge_loc', 'seed_epg_edge_dist', 'epg_node_id', 'epg_node_dist', 'epg_edge_id', 'epg_edge_loc', 'epg_edge_dist'
    var: 'n_counts', 'n_cells', 'pct_cells', 'variances_norm', 'variances', 'means', 'highly_variable'
    uns: 'iroot', 'pca', 'color', 'seed', 'seed_epg', 'epg', 'stream_tree'
    obsm: 'X_pca', 'X_dr'
    layers: 'raw'

an error is raised if the learned graph is not a tree (for example, a principal circle)

[29]:
st2.tl.learn_graph(adata, obsm='X_dr', n_nodes=20, method='principal_circle',)
WARNING: seed graph is ignored when using method principal_circle
[30]:
st2.pl.stream_sc(adata, source=9,
                 dist_scale=0.2,
                 color=['paul15_clusters', 'n_genes'],
                 preference=[4, 5],
                 fig_ncol=2,
                 fig_legend_ncol=2,
                 save_fig=False)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/var/folders/0f/sj3b57rj3nvfztwg4njsv6nc0000gp/T/ipykernel_67992/936429855.py in <module>
      5                  fig_ncol=2,
      6                  fig_legend_ncol=2,
----> 7                  save_fig=False)

~/anaconda3/envs/env_stream2_dev/lib/python3.7/site-packages/stream2/plotting/_plot.py in stream_sc(adata, source, key, color, dict_palette, dist_scale, dist_pctl, size, drawing_order, dict_drawing_order, preference, fig_size, fig_ncol, fig_legend_ncol, fig_legend_order, vmin, vmax, alpha, pad, w_pad, h_pad, cbar_pad, cbar_fraction, cbar_aspect, show_text, show_graph, save_fig, fig_path, fig_name, **kwargs)
   1385
   1386     assert _check_is_tree(adata, key=key), \
-> 1387         "`.pl.stream_sc()` only works for a tree structure"
   1388     if fig_size is None:
   1389         fig_size = mpl.rcParams["figure.figsize"]

AssertionError: `.pl.stream_sc()` only works for a tree structure
[31]:
st2.pl.stream(adata, source=9,
              color=['paul15_clusters', 'n_genes'],
              preference=[4, 5],
              fig_legend_ncol=1,
              save_fig=False)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/var/folders/0f/sj3b57rj3nvfztwg4njsv6nc0000gp/T/ipykernel_67992/3514217850.py in <module>
      3               preference=[4, 5],
      4               fig_legend_ncol=1,
----> 5               save_fig=False)

~/anaconda3/envs/env_stream2_dev/lib/python3.7/site-packages/stream2/plotting/_plot.py in stream(adata, source, key, color, dict_palette, preference, dist_scale, factor_num_win, factor_min_win, factor_width, factor_nrow, factor_ncol, log_scale, factor_zoomin, fig_size, fig_legend_order, fig_legend_ncol, fig_colorbar_aspect, vmin, vmax, pad, w_pad, h_pad, save_fig, fig_path, fig_format)
   1652
   1653     assert _check_is_tree(adata, key=key), \
-> 1654         "`.pl.stream()` only works for a tree structure"
   1655     if fig_size is None:
   1656         fig_size = mpl.rcParams["figure.figsize"]

AssertionError: `.pl.stream()` only works for a tree structure
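Both assertions above come from a tree check on the learned graph. The body of stream2's `_check_is_tree` is not shown in the traceback, so the following is only a sketch of the standard criterion: an undirected graph with n nodes is a tree when it has exactly n - 1 edges and is connected. The function `is_tree` and the toy edge lists are assumptions for illustration.

```python
def is_tree(n_nodes, edges):
    """True if the undirected graph on nodes 0..n_nodes-1 is a tree."""
    # A tree on n nodes has exactly n - 1 edges.
    if n_nodes == 0 or len(edges) != n_nodes - 1:
        return False
    adj = {i: [] for i in range(n_nodes)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # With n - 1 edges, being connected is enough to rule out cycles.
    seen = {0}
    stack = [0]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == n_nodes


path = [(0, 1), (1, 2), (2, 3)]   # a simple tree
circle = path + [(3, 0)]          # closing the loop makes it a cycle
print(is_tree(4, path), is_tree(4, circle))
```

A check of this kind could be run before calling the stream plotting functions to decide whether to fall back to a graph plot, instead of relying on the `AssertionError`.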