Mars Learn

This is the class and function reference of Mars Learn.



Clustering

cluster.KMeans([n_clusters, init, n_init, …])

K-Means clustering.


cluster.k_means(X, n_clusters[, …])

K-means clustering algorithm.


Samples generator

datasets.make_blobs([n_samples, n_features, …])

Generate isotropic Gaussian blobs for clustering.

datasets.make_classification([n_samples, …])

Generate a random n-class classification problem.

datasets.make_low_rank_matrix([n_samples, …])

Generate a mostly low rank matrix with bell-shaped singular values

Matrix Decomposition

decomposition.PCA([n_components, copy, …])

Principal component analysis (PCA)

decomposition.TruncatedSVD([n_components, …])

Dimensionality reduction using truncated SVD (aka LSA).

Ensemble Methods


Blockwise training and ensemble voting classifier.


Blockwise training and ensemble voting regressor.

Linear Models

Classical linear regressors

linear_model.LinearRegression(*[, …])

Ordinary least squares Linear Regression.


Classification metrics

metrics.accuracy_score(y_true, y_pred[, …])

Accuracy classification score.

metrics.auc(x, y[, session, run_kwargs])

Compute Area Under the Curve (AUC) using the trapezoidal rule
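
As an illustration of the trapezoidal rule named above, here is a minimal plain-Python sketch. It is not the Mars implementation; the helper name `trapezoid_auc` and the sample points are hypothetical, and the sketch assumes `x` is already sorted in ascending order.

```python
def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    area = 0.0
    for i in range(1, len(x)):
        # Each segment contributes its width times its average height.
        area += (x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2.0
    return area


# Hypothetical ROC points: false positive rate on x, true positive rate on y.
fpr = [0.0, 0.0, 0.5, 1.0]
tpr = [0.0, 0.5, 1.0, 1.0]
area = trapezoid_auc(fpr, tpr)
print(area)
```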

metrics.log_loss(y_true, y_pred, *[, eps, …])

Log loss, aka logistic loss or cross-entropy loss.
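
For the binary case, log loss is the mean negative log-likelihood of the true labels under the predicted probabilities. The sketch below is plain Python, not the Mars implementation; `binary_log_loss` is a hypothetical helper, and the clipping by `eps` is an assumption about how probabilities of exactly 0 or 1 are kept out of the logarithm.

```python
import math


def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Mean cross-entropy for labels in {0, 1} and predicted P(label == 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip so that log(0) never occurs (assumed role of eps).
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


loss = binary_log_loss([1, 0, 1], [0.9, 0.1, 0.8])
print(loss)
```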

metrics.roc_curve(y_true, y_score[, …])

Compute Receiver operating characteristic (ROC)

Regression metrics

metrics.r2_score(y_true, y_pred, *[, …])

\(R^2\) (coefficient of determination) regression score function.
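
The coefficient of determination is commonly defined as \(R^2 = 1 - SS_{res} / SS_{tot}\). A minimal plain-Python sketch of that definition follows; it is not the Mars implementation, the helper name `r2` is hypothetical, and it assumes `y_true` is not constant (so that \(SS_{tot}\) is nonzero).

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true is constant; R^2 is undefined in this sketch")
    return 1.0 - ss_res / ss_tot


score = r2([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
print(score)
```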

Pairwise metrics

metrics.pairwise.cosine_similarity(X[, Y, …])

Compute cosine similarity between samples in X and Y.

metrics.pairwise.cosine_distances(X[, Y])

Compute cosine distance between samples in X and Y.
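
Cosine similarity is the dot product of two vectors divided by the product of their L2 norms. The plain-Python sketch below assumes cosine distance is taken as one minus that similarity, as in scikit-learn; the helper names are hypothetical, it works on a single pair of vectors rather than on matrices, and it assumes neither vector is all zeros.

```python
import math


def cosine_similarity_pair(u, v):
    """Cosine of the angle between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_distance_pair(u, v):
    """Assumed definition: one minus the cosine similarity."""
    return 1.0 - cosine_similarity_pair(u, v)


parallel = cosine_similarity_pair([1.0, 2.0], [2.0, 4.0])    # same direction
orthogonal = cosine_distance_pair([1.0, 0.0], [0.0, 1.0])    # right angle
print(parallel, orthogonal)
```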

metrics.pairwise.euclidean_distances(X[, Y, …])

Considering the rows of X (and Y=X) as vectors, compute the distance matrix between each pair of vectors.

metrics.pairwise.haversine_distances(X[, Y])

Compute the Haversine distance between samples in X and Y
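
The Haversine distance is the great-circle angular distance between two points on a sphere. The plain-Python sketch below is not the Mars implementation; `haversine_pair` is a hypothetical helper, and it assumes each point is given as (latitude, longitude) in radians, following the scikit-learn convention. The result is on the unit sphere, so multiply by the sphere's radius to get a length.

```python
import math


def haversine_pair(p, q):
    """Great-circle distance on the unit sphere between two (lat, lon) points in radians."""
    lat1, lon1 = p
    lat2, lon2 = q
    a = (
        math.sin((lat2 - lat1) / 2.0) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2
    )
    return 2.0 * math.asin(math.sqrt(a))


# Two points on the equator, a quarter turn apart in longitude.
quarter = haversine_pair((0.0, 0.0), (0.0, math.pi / 2.0))
print(quarter)
```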

metrics.pairwise.manhattan_distances(X[, Y, …])

Compute the L1 distances between the vectors in X and Y.

metrics.pairwise.rbf_kernel(X[, Y, gamma])

Compute the rbf (gaussian) kernel between X and Y.
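
The RBF kernel is usually written as \(k(x, y) = \exp(-\gamma \lVert x - y \rVert^2)\). A minimal plain-Python sketch for one pair of vectors follows; `rbf_kernel_pair` is a hypothetical helper, not the Mars function, and `gamma` must be passed explicitly here rather than defaulted.

```python
import math


def rbf_kernel_pair(x, y, gamma):
    """Gaussian kernel value exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


k_same = rbf_kernel_pair([1.0, 2.0], [1.0, 2.0], gamma=0.5)  # identical points
k_far = rbf_kernel_pair([0.0, 0.0], [1.0, 1.0], gamma=0.5)   # squared distance 2
print(k_same, k_far)
```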

metrics.pairwise_distances(X[, Y, metric])

Model Selection

Splitter Classes

model_selection.KFold([n_splits, shuffle, …])

K-Folds cross-validator
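
To make the idea of K-fold splitting concrete, the sketch below yields train and test index lists for consecutive folds without shuffling. It is plain Python, not the Mars class; `kfold_indices` is a hypothetical helper, and giving the first `n_samples % n_splits` folds one extra sample is an assumption borrowed from scikit-learn's behaviour.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) for consecutive, unshuffled folds."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # The first `extra` folds absorb the remainder, one sample each.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


folds = list(kfold_indices(5, 2))
for train, test in folds:
    print(train, test)
```

Every sample lands in exactly one test fold, which is the property that makes the per-fold scores comparable.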

Splitter Functions

model_selection.train_test_split(*arrays, …)

Split arrays or matrices into random train and test subsets

Nearest Neighbors

neighbors.NearestNeighbors([n_neighbors, …])

Preprocessing and Normalization

preprocessing.LabelBinarizer(*[, neg_label, …])

Binarize labels in a one-vs-all fashion.

preprocessing.MinMaxScaler([feature_range, …])

Transform features by scaling each feature to a given range.

preprocessing.minmax_scale(X[, …])

Transform features by scaling each feature to a given range.
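
Min-max scaling maps each feature linearly so that its minimum and maximum land on the ends of the target range. The plain-Python sketch below handles a single feature column; `minmax_scale_column` is a hypothetical helper, not the Mars function, and mapping a constant column to the lower bound is a choice made for this sketch.

```python
def minmax_scale_column(values, feature_range=(0.0, 1.0)):
    """Linearly rescale one feature column into feature_range."""
    lo, hi = feature_range
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Constant column: this sketch maps every value to the lower bound.
        return [float(lo) for _ in values]
    return [(v - v_min) / span * (hi - lo) + lo for v in values]


scaled = minmax_scale_column([1.0, 2.0, 3.0])
rescaled = minmax_scale_column([1.0, 2.0, 3.0], feature_range=(-1.0, 1.0))
print(scaled, rescaled)
```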

preprocessing.label_binarize(y, *, classes)

Binarize labels in a one-vs-all fashion.
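
One-vs-all binarization turns each label into a row with one column per class, marking only the column of the matching class. The sketch below is plain Python, not the Mars function; `one_vs_all` is a hypothetical helper and its `pos_label` parameter is an assumption. Real implementations may differ in details such as collapsing the two-class case to a single column, as scikit-learn does.

```python
def one_vs_all(y, classes, neg_label=0, pos_label=1):
    """Return one indicator row per label, with one column per class."""
    return [
        [pos_label if label == cls else neg_label for cls in classes]
        for label in y
    ]


encoded = one_vs_all(["cat", "dog", "bird"], classes=["bird", "cat", "dog"])
for row in encoded:
    print(row)
```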

preprocessing.normalize(X[, norm, axis, …])

Scale input vectors individually to unit norm (vector length).
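
Scaling to unit norm divides each vector by its own length. A minimal plain-Python sketch for the L2 norm follows; `l2_normalize` is a hypothetical helper, not the Mars function, and leaving an all-zero vector unchanged is a choice made for this sketch.

```python
import math


def l2_normalize(vector):
    """Divide a vector by its Euclidean length so the result has norm 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction; return it unchanged.
        return list(vector)
    return [x / norm for x in vector]


unit = l2_normalize([3.0, 4.0])
print(unit)
```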

Semi-Supervised Learning

semi_supervised.LabelPropagation([kernel, …])

Label Propagation classifier


Utilities

utils.assert_all_finite(X[, allow_nan, …])

utils.check_X_y(X, y[, accept_sparse, …])

Input validation for standard estimators.

utils.check_array(array[, accept_sparse, …])

Input validation on a tensor, list, sparse matrix or similar.

utils.check_consistent_length(*arrays[, …])

Check that all arrays have consistent first dimensions.


Determine the type of data indicated by the target.


Check if y is in a multilabel format.

utils.shuffle(*arrays, **options)


Perform is_fitted validation for estimator.

utils.validation.column_or_1d(y[, warn])

Ravel a column or 1-D NumPy array; otherwise raise an error.


wrappers.ParallelPostFit(estimator, scoring, …)

Meta-estimator for parallel predict and transform.

LightGBM Integration

contrib.lightgbm.LGBMClassifier(*args, **kwargs)

contrib.lightgbm.LGBMRegressor(*args, **kwargs)

contrib.lightgbm.LGBMRanker(*args, **kwargs)

TensorFlow Integration


Run TensorFlow script in Mars cluster.


Convert Mars data type to

XGBoost Integration

contrib.xgboost.MarsDMatrix(data[, label, …])

contrib.xgboost.train(params, dtrain[, evals])

Train XGBoost model in Mars manner.

contrib.xgboost.predict(model, data[, …])

contrib.xgboost.XGBClassifier([max_depth, …])

Implementation of the scikit-learn API for XGBoost classification.

contrib.xgboost.XGBRegressor([max_depth, …])

Implementation of the scikit-learn API for XGBoost regressor.