API Reference

This is the class and function reference of Mars learn.

Datasets

Samples generator

datasets.make_blobs([n_samples, n_features, …])
    Generate isotropic Gaussian blobs for clustering.
datasets.make_classification([n_samples, …])
    Generate a random n-class classification problem.
datasets.make_low_rank_matrix([n_samples, …])
    Generate a mostly low rank matrix with bell-shaped singular values.

Matrix Decomposition

decomposition.PCA
decomposition.TruncatedSVD

Metrics

Classification metrics

metrics.accuracy_score(y_true, y_pred[, …])
    Accuracy classification score.
metrics.auc(x, y[, session, run_kwargs])
    Compute Area Under the Curve (AUC) using the trapezoidal rule.
metrics.roc_curve(y_true, y_score[, …])
    Compute Receiver operating characteristic (ROC).
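As an illustration of the trapezoidal rule named in the metrics.auc entry, here is a minimal plain-Python sketch. It is not Mars learn's implementation: the helper name trapezoid_auc and the sample ROC points are hypothetical, and the sketch assumes the x values are already sorted.

```python
def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i]).

    Hypothetical helper for illustration only; assumes x is sorted.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("at least two points are needed")
    area = 0.0
    for i in range(1, len(x)):
        # Each trapezoid: width times the mean of its two heights.
        area += (x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2.0
    return area


# Made-up ROC points (false positive rate, true positive rate).
fpr = [0.0, 0.0, 0.5, 1.0]
tpr = [0.0, 0.5, 1.0, 1.0]
roc_auc = trapezoid_auc(fpr, tpr)
```

In practice the x and y arguments would typically be the false and true positive rates of a ROC curve, such as those metrics.roc_curve is described as computing.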

Pairwise metrics

metrics.pairwise.cosine_similarity(X[, Y, …])
    Compute cosine similarity between samples in X and Y.
metrics.pairwise.cosine_distances(X[, Y])
    Compute cosine distance between samples in X and Y.
metrics.pairwise.euclidean_distances(X[, Y, …])
    Considering the rows of X (and Y=X) as vectors, compute the distance matrix between each pair of vectors.
metrics.pairwise.haversine_distances(X[, Y])
    Compute the Haversine distance between samples in X and Y.
metrics.pairwise.manhattan_distances(X[, Y, …])
    Compute the L1 distances between the vectors in X and Y.
metrics.pairwise.rbf_kernel(X[, Y, gamma])
    Compute the rbf (gaussian) kernel between X and Y.
metrics.pairwise_distances(X[, Y, metric])
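The quantities named in the pairwise entries can be sketched for a single pair of vectors in plain Python. This is a rough illustration of the mathematics, not Mars learn code. All function names below are hypothetical. The Haversine sketch assumes the scikit-learn convention of [latitude, longitude] points in radians on a unit sphere, and the RBF sketch assumes the common form exp(-gamma * ||u - v||^2); neither convention is confirmed by this page.

```python
import math


def cosine_similarity(u, v):
    # Dot product divided by the product of the Euclidean norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    return 1.0 - cosine_similarity(u, v)


def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u, v):
    # L1 distance: sum of absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(u, v))


def rbf_kernel(u, v, gamma):
    # Assumed form: exp(-gamma * squared Euclidean distance).
    return math.exp(-gamma * euclidean_distance(u, v) ** 2)


def haversine_distance(p, q):
    # Assumed convention: p and q are (latitude, longitude) in radians.
    lat1, lon1 = p
    lat2, lon2 = q
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(math.sqrt(h))


def pairwise(metric, X, Y=None):
    # Matrix of metric(x, y) for every row x of X and row y of Y (Y defaults to X).
    Y = X if Y is None else Y
    return [[metric(x, y) for y in Y] for x in X]


X = [[1.0, 0.0], [0.0, 1.0]]
similarity_matrix = pairwise(cosine_similarity, X)
```

The pairwise helper mirrors the X[, Y] shape of the signatures above: when Y is omitted, every row of X is compared with every other row of X.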

Splitter Functions

model_selection.train_test_split(*arrays, …)
    Split arrays or matrices into random train and test subsets.
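The idea of a random train/test split can be sketched with the standard library alone. This is a hedged illustration, not the Mars learn function. The name train_test_split_sketch, the test_size and seed parameters, and the choice to round the test-set size up are all assumptions made for this example.

```python
import math
import random


def train_test_split_sketch(*arrays, test_size=0.25, seed=None):
    """Split each sequence into train and test parts using one shared shuffle.

    Hypothetical helper; returns [a_train, a_test, b_train, b_test, ...].
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same length")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n * test_size)  # assumption: round the test size up
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    result = []
    for a in arrays:
        # The same index lists are reused so rows stay aligned across arrays.
        result.append([a[i] for i in train_idx])
        result.append([a[i] for i in test_idx])
    return result


X = list(range(8))
y = [v * 10 for v in X]
X_train, X_test, y_train, y_test = train_test_split_sketch(
    X, y, test_size=0.25, seed=0
)
```

Sharing one shuffled index list across all inputs is what keeps each sample paired with its label after the split.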

Nearest Neighbors

neighbors.NearestNeighbors

Preprocessing and Normalization

preprocessing.normalize(X[, norm, axis, …])
    Scale input vectors individually to unit norm (vector length).
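Scaling each vector to unit norm can be shown with a small plain-Python sketch. It is only an illustration of the arithmetic, not Mars learn's implementation. The name normalize_rows is hypothetical, and the norm names "l1", "l2" and "max", together with leaving all-zero rows unchanged, follow the scikit-learn convention as an assumption.

```python
import math


def normalize_rows(X, norm="l2"):
    """Return a copy of X with each row divided by its own norm.

    Hypothetical helper; rows whose norm is zero are returned unchanged.
    """
    out = []
    for row in X:
        if norm == "l2":
            length = math.sqrt(sum(v * v for v in row))
        elif norm == "l1":
            length = sum(abs(v) for v in row)
        elif norm == "max":
            length = max(abs(v) for v in row)
        else:
            raise ValueError("unsupported norm: %r" % (norm,))
        out.append([v / length for v in row] if length else list(row))
    return out


rows = [[3.0, 4.0], [0.0, 0.0]]
unit_rows = normalize_rows(rows)             # first row becomes [0.6, 0.8]
l1_rows = normalize_rows(rows, norm="l1")    # first row sums to 1 in absolute value
```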

Semi-Supervised Learning

semi_supervised.LabelPropagation

Utilities

utils.assert_all_finite(X[, allow_nan, …])
utils.check_X_y
utils.check_array(array[, accept_sparse, …])
    Input validation on a tensor, list, sparse matrix or similar.
utils.check_consistent_length(*arrays[, …])
    Check that all arrays have consistent first dimensions.
utils.multiclass.type_of_target(y)
    Determine the type of data indicated by the target.
utils.multiclass.is_multilabel(y)
    Check if y is in a multilabel format.
utils.shuffle(*arrays, **options)
utils.validation.check_is_fitted
utils.validation.column_or_1d(y[, warn])
    Ravel column or 1d numpy array, else raises an error.

TensorFlow Integration

contrib.tensorflow.run_tensorflow_script(…)
    Run TensorFlow script in Mars cluster.

PyTorch Integration

contrib.pytorch.run_pytorch_script(script, …)
    Run PyTorch script in Mars cluster.
contrib.pytorch.MarsDataset
contrib.pytorch.MarsDistributedSampler
contrib.pytorch.MarsRandomSampler(data_source)

XGBoost Integration

contrib.xgboost.MarsDMatrix(data[, label, …])
contrib.xgboost.train(params, dtrain[, evals])
    Train XGBoost model in Mars manner.
contrib.xgboost.predict(model, data[, …])
contrib.xgboost.XGBClassifier
contrib.xgboost.XGBRegressor