Fast nonlinear dimensionality reduction using a quadratic-convergence t-SNE algorithm.
Version | 1.0 |
---|---|
Bundle | tools |
Categories | Multivariate Statistics |
Authors | Antti Hakkinen (antti.e.hakkinen@helsinki.fi) |
Requires | liblapack3 (DEB) ; installer (bash) |
Source files | component.xml qsne.bash |
Name | Type | Mandatory | Description |
---|---|---|---|
in | CSV | Mandatory | Input matrix. Rows represent samples and columns the input variables. All variables are used in the mapping. |
init | CSV | Optional | Initial guess for the projection. Rows represent samples and columns the projected variables. If no file is provided, the initial guess is derived from a truncated SVD of the input matrix (PCA). |
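
The default initial guess described for `init` can be illustrated with a short NumPy sketch. This is not the component's own code: the function name `pca_initial_guess` is hypothetical, and the sketch assumes the input is column-centered and that the unscaled principal-component scores are used, neither of which the description above confirms.

```python
import numpy as np

def pca_initial_guess(x, dims=2):
    """Project the rows of x onto their leading `dims` principal axes.

    Hypothetical helper for illustration; centering and scaling are assumptions.
    """
    x = np.asarray(x, dtype=float)
    # Center each column so that the SVD corresponds to PCA.
    centered = x - x.mean(axis=0)
    # Thin SVD: centered = u @ diag(s) @ vt, singular values in descending order.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Keep the leading `dims` components; rows stay aligned with the samples.
    return u[:, :dims] * s[:dims]

# Usage: 150 samples with 4 variables, projected to 2 dimensions.
rng = np.random.default_rng(0)
samples = rng.normal(size=(150, 4))
init = pca_initial_guess(samples, dims=2)
print(init.shape)  # (150, 2)
```

The sketch computes a full thin SVD and then truncates it, which is fine for small matrices; for large inputs an iterative solver that only computes the leading components would likely be preferable.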
Name | Type | Description |
---|---|---|
out | CSV | Output matrix. Rows represent samples and columns the projected variables. |
Name | Type | Default | Description |
---|---|---|---|
compat | boolean | false | t-SNE compatibility mode. Switches q-SNE to gradient descent and scales the variables to match those used in van der Maaten's t-SNE implementation. Note that convergence is only linear in this mode. |
cost_tol | float | 1.48e-9 | Objective tolerance for detecting a stall in optimization. This allows q-SNE to stop early, when the objective no longer decreases. |
dims | int | 2 | Output dimension. Values of 2 and 3 are useful for visualization, but any number can be used. |
max_iter | int | 100 | Maximum number of iterations to perform. q-SNE requires roughly sqrt(n) iterations where a regular t-SNE implementation requires n. |
num_threads | int | -1 | Number of threads executing in parallel. The default (-1) uses one thread per core. |
perplexity | float | 30 | Perplexity. Controls roughly the number of neighbor samples affecting each sample. |
perplexity_range | float | 0 | Perplexity range. An optimal perplexity is automatically sought in the range [p-r/2,p+r/2] where p is the specified perplexity and r is the range. |
rank | int | 10 | Rank of the approximate Hessian. A value of 0 implies plain gradient descent (the original t-SNE algorithm); larger values are required for quadratic convergence. Should be roughly the local intrinsic dimension of the data. |
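
To make the `perplexity` and `perplexity_range` parameters more concrete, the sketch below uses the standard t-SNE definition of perplexity (2 raised to the Shannon entropy of a sample's neighbor distribution) and the search interval [p-r/2, p+r/2] stated above. The function names are hypothetical, and how q-SNE chooses the optimal perplexity within the interval is not described here, so the sketch only shows the interval and the usual bisection that matches a given perplexity.

```python
import math

def perplexity_interval(p, r):
    """Search interval [p - r/2, p + r/2] for perplexity p and range r."""
    return (p - r / 2.0, p + r / 2.0)

def perplexity(sq_dists, beta):
    """Perplexity of the Gaussian neighbor distribution p_j ~ exp(-beta * d_j^2)."""
    # Shift by the smallest distance so the largest weight is exp(0) = 1.
    d0 = min(sq_dists)
    weights = [math.exp(-beta * (d - d0)) for d in sq_dists]
    total = sum(weights)
    entropy = 0.0
    for w in weights:
        p = w / total
        if p > 0.0:
            entropy -= p * math.log2(p)
    return 2.0 ** entropy

def find_beta(sq_dists, target, tol=1e-6, max_steps=200):
    """Bisect for the precision beta whose perplexity matches `target`.

    Perplexity falls monotonically as beta grows, so bisection applies.
    """
    lo, hi = 0.0, 1.0
    # Grow the bracket until the perplexity at `hi` drops below the target.
    while perplexity(sq_dists, hi) > target and hi < 1e12:
        lo, hi = hi, hi * 2.0
    for _ in range(max_steps):
        mid = (lo + hi) / 2.0
        if perplexity(sq_dists, mid) > target:
            lo = mid  # distribution too broad: increase beta
        else:
            hi = mid  # distribution too narrow: decrease beta
        if hi - lo < tol * max(1.0, hi):
            break
    return (lo + hi) / 2.0

print(perplexity_interval(30, 10))             # (25.0, 35.0)
print(perplexity([1.0, 1.0, 1.0, 1.0], 0.5))   # 4.0
```

Four equidistant neighbors give a perplexity of exactly 4, which is the sense in which perplexity roughly controls the number of neighbor samples affecting each sample. With the default `perplexity_range` of 0 the interval collapses to the single value p.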
Test case | Parameters | IN in | IN init | OUT out |
---|---|---|---|---|
iris | properties | in | (missing) | out |
dims=2, |