A tutorial on statistical learning for scientific data processing

- Statistical learning: the setting and the estimator object in scikit-learn
  - Datasets
  - Estimator objects
- Supervised learning: predicting an output variable from high-dimensional observations
  - Nearest neighbor and the curse of dimensionality
  - Linear model: from regression to sparsity
  - Support vector machines (SVMs)
- Model selection: choosing estimators and their parameters
  - Score, and cross-validated scores
  - Cross-validation generators
  - Grid-search and cross-validated estimators
- Unsupervised learning: seeking representations of the data
  - Clustering: grouping observations together
  - Decompositions: from a signal to components and loadings
- Putting it all together
  - Pipelining
  - Face recognition with eigenfaces
  - Open problem: Stock Market Structure