Title: | PHATE - Potential of Heat-Diffusion for Affinity-Based Transition Embedding |
---|---|
Description: | PHATE is a tool for visualizing high dimensional single-cell data with natural progressions or trajectories. PHATE uses a novel conceptual framework for learning and visualizing the manifold inherent to biological systems in which smooth transitions mark the progressions of cells from one state to another. To see how PHATE can be applied to single-cell RNA-seq datasets from hematopoietic stem cells, human embryonic stem cells, and bone marrow samples, check out our publication in Nature Biotechnology at <doi:10.1038/s41587-019-0336-3>. |
Authors: | Krishnan Srinivasan [aut], Scott Gigante [cre] |
Maintainer: | Scott Gigante <[email protected]> |
License: | GPL-2 | file LICENSE |
Version: | 1.0.7 |
Built: | 2025-03-09 05:11:39 UTC |
Source: | https://github.com/cran/phateR |
Returns the embedding matrix with column names PHATE1 and PHATE2
## S3 method for class 'phate'
as.data.frame(x, ...)
x |
A fitted PHATE object |
... |
Arguments for as.data.frame() |
Returns the embedding matrix. All components can be accessed using phate$embedding, phate$operator, etc.
## S3 method for class 'phate'
as.matrix(x, ...)
x |
A fitted PHATE object |
... |
Arguments for as.matrix() |
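The two coercion methods above can be illustrated with a toy object. This is a minimal sketch, not the package's source: the class name `toy_phate` and the object `toy` are hypothetical stand-ins for a fitted PHATE object, so the snippet runs without the Python backend.

```r
# Hypothetical stand-in for a fitted PHATE object (not produced by phate()).
toy <- structure(
  list(
    embedding = matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), ncol = 2,
                       dimnames = list(NULL, c("PHATE1", "PHATE2"))),
    params = list(knn = 5, decay = 40)
  ),
  class = "toy_phate"
)

# S3 methods dispatch on the class attribute and forward `...` onwards.
as.matrix.toy_phate <- function(x, ...) {
  as.matrix(x$embedding, ...)
}
as.data.frame.toy_phate <- function(x, ...) {
  as.data.frame(as.matrix(x), ...)
}

emb_mat <- as.matrix(toy)      # plain numeric matrix
emb_df <- as.data.frame(toy)   # data frame with columns PHATE1 and PHATE2
colnames(emb_df)
```

With a real fitted object, the same calls (`as.matrix(phate.tree)`, `as.data.frame(phate.tree)`) would be used directly.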
Check that the current PHATE version in Python is up to date.
check_pyphate_version()
KMeans on the PHATE potential
Clustering on the PHATE operator as introduced in Moon et al. This is similar to spectral clustering.
cluster_phate(phate, k = 8, seed = NULL)
phate |
A fitted PHATE object |
k |
Number of clusters (default: 8) |
seed |
Random seed for kmeans (default: NULL) |
clusters |
Integer vector of cluster assignments |
if (reticulate::py_module_available("phate")) {
  # Load data
  # data(tree.data)
  # We use a smaller tree to make examples run faster
  data(tree.data.small)

  # Run PHATE
  phate.tree <- phate(tree.data.small$data)

  # Clustering
  cluster_phate(phate.tree)
}
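To show what a seeded k-means step returns, here is a minimal sketch using base R's `stats::kmeans`. It is not the implementation behind `cluster_phate()`: the matrix `toy_potential` and the helper `cluster_sketch()` are hypothetical, standing in for the representation that would be clustered.

```r
# Hypothetical stand-in data: two well-separated groups of 20 points each.
set.seed(1)
toy_potential <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.1), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.1), ncol = 2)
)

# Illustrative helper mirroring the k and seed arguments described above.
cluster_sketch <- function(x, k = 8, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)   # make the random starts reproducible
  km <- stats::kmeans(x, centers = k, nstart = 10)
  as.integer(km$cluster)               # integer vector of cluster assignments
}

clusters <- cluster_sketch(toy_potential, k = 2, seed = 42)
table(clusters)
```

Fixing `seed` makes repeated runs give the same assignments, which helps when cluster labels are reused in downstream plots.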
Passes the embedding matrix to ggplot with column names PHATE1 and PHATE2
## S3 method for class 'phate'
ggplot(data, ...)
data |
A fitted PHATE object |
... |
Arguments for ggplot() |
if (reticulate::py_module_available("phate") && require(ggplot2)) {
  # data(tree.data)
  # We use a smaller tree to make examples run faster
  data(tree.data.small)
  phate.tree <- phate(tree.data.small$data)
  ggplot(phate.tree, aes(x=PHATE1, y=PHATE2, color=tree.data.small$branches)) +
    geom_point()
}
Install PHATE Python package into a virtualenv or conda env.
install.phate(
  envname = "r-reticulate",
  method = "auto",
  conda = "auto",
  pip = TRUE,
  ...
)
envname |
Name of environment to install packages into |
method |
Installation method. By default, "auto" automatically finds a method that will work in the local environment. Change the default to force a specific installation method. Note that the "virtualenv" method is not available on Windows. |
conda |
Path to conda executable (or "auto" to find conda using the PATH and other conventional install locations). |
pip |
Install from pip, if possible. |
... |
Additional arguments passed to conda_install() or virtualenv_install(). |
On Linux and OS X the "virtualenv" method will be used by default ("conda" will be used if virtualenv isn't available). On Windows, the "conda" method is always used.
Performs L1 normalization on the input data so that the expression values for each cell sum to 1, then rescales the normalized matrix by the median UMI count per cell, effectively scaling all cells as if they were sampled evenly.
library.size.normalize(data, verbose = FALSE)
data |
matrix (n_samples, n_dimensions) 2 dimensional input data array with n_samples cells and n_dimensions dimensions |
verbose |
boolean, default=FALSE. If true, print verbose output |
data_norm |
matrix (n_samples, n_dimensions) 2 dimensional array with normalized gene expression values |
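The normalization described above can be sketched in base R. This is an illustration under stated assumptions, not the code behind `library.size.normalize()`: the helper name `library_size_normalize_sketch()`, the zero-count guard, and the toy `counts` matrix are all hypothetical.

```r
# Illustrative helper: rows are cells, columns are genes.
library_size_normalize_sketch <- function(data) {
  data <- as.matrix(data)
  lib_size <- rowSums(data)               # total count per cell
  median_size <- stats::median(lib_size)  # median library size across cells
  lib_size[lib_size == 0] <- 1            # assumption: leave empty cells at zero
  # Dividing a matrix by a vector of length nrow() recycles down columns,
  # so each row i is divided by lib_size[i], then rescaled by the median.
  data / lib_size * median_size
}

# Toy counts: three cells, two genes; the third cell is sequenced 10x deeper.
counts <- matrix(c(1, 3,
                   2, 2,
                   10, 30), nrow = 3, byrow = TRUE)
norm <- library_size_normalize_sketch(counts)
rowSums(norm)
```

After normalization every cell has the same total, so the first and third cells, which differ only in sequencing depth, end up with identical profiles.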
PHATE is a data reduction method specifically designed for visualizing high dimensional data in low dimensional spaces.
phate(
  data,
  ndim = 2,
  knn = 5,
  decay = 40,
  n.landmark = 2000,
  gamma = 1,
  t = "auto",
  mds.solver = "sgd",
  knn.dist.method = "euclidean",
  knn.max = NULL,
  init = NULL,
  mds.method = "metric",
  mds.dist.method = "euclidean",
  t.max = 100,
  npca = 100,
  plot.optimal.t = FALSE,
  verbose = 1,
  n.jobs = 1,
  seed = NULL,
  potential.method = NULL,
  k = NULL,
  alpha = NULL,
  use.alpha = NULL,
  ...
)
data |
matrix (n_samples, n_dimensions)
2 dimensional input data array with
n_samples samples and n_dimensions dimensions.
If |
ndim |
int, optional, default: 2 number of dimensions in which the data will be embedded |
knn |
int, optional, default: 5 number of nearest neighbors on which to build kernel |
decay |
int, optional, default: 40 sets decay rate of kernel tails. If NULL, alpha decaying kernel is not used |
n.landmark |
int, optional, default: 2000 number of landmarks to use in fast PHATE |
gamma |
float, optional, default: 1
Informational distance constant between -1 and 1.
|
t |
int, optional, default: 'auto'. Power to which the diffusion operator is raised; this sets the level of diffusion |
mds.solver |
'sgd', 'smacof', optional, default: 'sgd' which solver to use for metric MDS. SGD is substantially faster, but produces slightly less optimal results. Note that SMACOF was used for all figures in the PHATE paper. |
knn.dist.method |
string, optional, default: 'euclidean'.
recommended values: 'euclidean', 'cosine', 'precomputed'
Any metric from |
knn.max |
int, optional, default: NULL
Maximum number of neighbors for which alpha decaying kernel
is computed for each point. For very large datasets, setting |
init |
phate object, optional object to use for initialization. Avoids recomputing intermediate steps if parameters are the same. |
mds.method |
string, optional, default: 'metric' choose from 'classic', 'metric', and 'nonmetric' which MDS algorithm is used for dimensionality reduction |
mds.dist.method |
string, optional, default: 'euclidean' recommended values: 'euclidean' and 'cosine' |
t.max |
int, optional, default: 100. Maximum value of t to test for automatic t selection. |
npca |
int, optional, default: 100. Number of principal components to use for calculating neighborhoods. For extremely large datasets, using npca < 20 allows neighborhoods to be calculated in log(n_samples) time. |
plot.optimal.t |
boolean, optional, default: FALSE If TRUE, produce a plot showing the Von Neumann Entropy curve for automatic t selection. |
verbose |
|
n.jobs |
|
seed |
int or |
potential.method |
Deprecated.
For log potential, use |
k |
Deprecated. Use |
alpha |
Deprecated. Use |
use.alpha |
Deprecated
To disable alpha decay, use |
... |
Additional arguments for |
"phate" object containing:
embedding: the PHATE embedding
operator: The PHATE operator (python phate.PHATE object)
params: Parameters passed to phate
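To make the roles of `knn`, `decay`, and `t` concrete, the following base R sketch builds an alpha-decaying kernel, row-normalizes it into a diffusion operator, and raises it to a power. It follows the kernel form described in the PHATE publication and is an assumption-laden illustration, not the code that `phate()` runs: the helper names and the toy data `x` are hypothetical, and landmarking, PCA, and the potential transform are omitted.

```r
set.seed(0)
x <- matrix(rnorm(60), ncol = 3)   # toy data: 20 samples, 3 dimensions

alpha_decay_kernel <- function(x, knn = 5, decay = 40) {
  d <- as.matrix(stats::dist(x))   # pairwise Euclidean distances
  # Bandwidth: distance from each point to its knn-th neighbor (self excluded).
  eps <- apply(d, 1, function(row) sort(row)[knn + 1])
  k_row <- exp(-(d / eps)^decay)   # row i uses the bandwidth of point i
  (k_row + t(k_row)) / 2           # symmetrize by averaging with the transpose
}

diffusion_operator <- function(kernel) {
  kernel / rowSums(kernel)         # each row becomes a probability vector
}

matrix_power <- function(m, power) {
  out <- diag(nrow(m))
  for (i in seq_len(power)) out <- out %*% m
  out
}

K <- alpha_decay_kernel(x, knn = 5, decay = 40)
P <- diffusion_operator(K)
P_t <- matrix_power(P, power = 10)   # 10 diffusion steps
```

A larger `decay` makes the kernel tails fall off more sharply beyond the `knn`-th neighbor, while a larger power spreads probability mass further along the data graph.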
if (reticulate::py_module_available("phate")) {
  # Load data
  # data(tree.data)
  # We use a smaller tree to make examples run faster
  data(tree.data.small)

  # Run PHATE
  phate.tree <- phate(tree.data.small$data)
  summary(phate.tree)
  ## PHATE embedding
  ## knn = 5, decay = 40, t = 58
  ## Data: (3000, 100)
  ## Embedding: (3000, 2)

  library(graphics)
  # Plot the result with base graphics
  plot(phate.tree, col=tree.data.small$branches)
  # Plot the result with ggplot2
  if (require(ggplot2)) {
    ggplot(phate.tree) +
      geom_point(aes(x=PHATE1, y=PHATE2, color=tree.data.small$branches))
  }

  # Run PHATE again with different parameters
  # We use the last run as initialization
  phate.tree2 <- phate(tree.data.small$data, t=150, init=phate.tree)
  # Extract the embedding matrix to use in downstream analysis
  embedding <- as.matrix(phate.tree2)
}
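The `t.max` and `plot.optimal.t` arguments refer to a Von Neumann Entropy curve over candidate values of t. The sketch below shows one way such a curve could be computed in base R. It is a hedged illustration only: the Gaussian toy kernel, the helper name `von_neumann_entropy()`, and the use of eigenvalue magnitudes are assumptions, and the rule that picks the final t from the curve is not shown.

```r
set.seed(0)
x <- matrix(rnorm(60), ncol = 3)          # toy data: 20 samples
K <- exp(-as.matrix(stats::dist(x))^2)    # assumed toy kernel (Gaussian)

von_neumann_entropy <- function(kernel, t_max = 100) {
  # D^{-1/2} K D^{-1/2} is symmetric and shares its eigenvalues with the
  # row-normalized diffusion operator D^{-1} K.
  d_inv_sqrt <- 1 / sqrt(rowSums(kernel))
  sym <- kernel * outer(d_inv_sqrt, d_inv_sqrt)
  lambda <- abs(eigen(sym, symmetric = TRUE, only.values = TRUE)$values)
  vapply(seq_len(t_max), function(power) {
    p <- lambda^power
    p <- p / sum(p)       # normalize the spectrum to a probability vector
    p <- p[p > 0]         # drop zeros so that log() stays finite
    -sum(p * log(p))      # Shannon entropy of the normalized spectrum
  }, numeric(1))
}

vne <- von_neumann_entropy(K, t_max = 100)
head(round(vne, 3))
```

The entropy decreases as the power grows, because repeated diffusion concentrates the spectrum on the leading eigenvalue; the curve flattening out is what an automatic choice of t would look for.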
Plot a PHATE object in base R
## S3 method for class 'phate'
plot(x, ...)
x |
A fitted PHATE object |
... |
Arguments for plot() |
if (reticulate::py_module_available("phate")) {
  library(graphics)
  # data(tree.data)
  # We use a smaller tree to make examples run faster
  data(tree.data.small)
  phate.tree <- phate(tree.data.small$data)
  plot(phate.tree, col=tree.data.small$branches)
}
This avoids spamming the user's console with a list of many large matrices
## S3 method for class 'phate'
print(x, ...)
x |
A fitted PHATE object |
... |
Arguments for print() |
if (reticulate::py_module_available("phate")) {
  # data(tree.data)
  # We use a smaller tree to make examples run faster
  data(tree.data.small)
  phate.tree <- phate(tree.data.small$data)
  print(phate.tree)
  ## PHATE embedding with elements
  ## $embedding : (3000, 2)
  ## $operator : Python PHATE operator
  ## $params : list with elements (data, knn, decay, t, n.landmark, ndim,
  ##           gamma, npca, mds.method,
  ##           knn.dist.method, mds.dist.method)
}
Summarize a PHATE object
## S3 method for class 'phate'
summary(object, ...)
object |
A fitted PHATE object |
... |
Arguments for summary() |
if (reticulate::py_module_available("phate")) {
  # data(tree.data)
  # We use a smaller tree to make examples run faster
  data(tree.data.small)
  phate.tree <- phate(tree.data.small$data)
  summary(phate.tree)
  ## PHATE embedding
  ## knn = 5, decay = 40, t = 58
  ## Data: (3000, 100)
  ## Embedding: (3000, 2)
}
A dataset containing high dimensional data that has 10 unique branches
tree.data
A list containing data, a matrix with 3000 rows and 100 variables, and branches, a factor containing 3000 elements.
The authors
A dataset containing high dimensional data that has 10 unique branches
tree.data.small
A list containing data, a matrix with 250 rows and 50 variables, and branches, a factor containing 250 elements.
The authors