sklearn.datasets.load_digits

sklearn.datasets.load_digits(*, n_class=10, return_X_y=False, as_frame=False)

Load and return the digits dataset (classification).

Each datapoint is an 8x8 image of a digit.

Classes             10
Samples per class   ~180
Samples total       1797
Dimensionality      64
Features            integers 0-16
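
The summary figures listed above can be checked against the loaded arrays. The following is a minimal sketch using only `load_digits` and NumPy; the variable names are illustrative and not part of the scikit-learn API.

```python
import numpy as np
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)

# Total number of samples and dimensionality of each sample.
n_samples, n_features = X.shape

# Distinct class labels and the number of samples in each class.
classes, counts = np.unique(y, return_counts=True)

# Smallest and largest pixel intensity in the data matrix.
value_range = (X.min(), X.max())

print(n_samples, n_features)
print(dict(zip(classes.tolist(), counts.tolist())))
print(value_range)
```

The per-class counts are close to, but not exactly, 180, which is why the summary gives an approximate figure.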

This is a copy of the test set of the UCI ML hand-written digits dataset: https://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits

Read more in the User Guide.

Parameters:
n_class: int, default=10

The number of classes to return, between 0 and 10.

return_X_y: bool, default=False

If True, returns (data, target) instead of a Bunch object. See below for more information about the data and target object.

New in version 0.18.

as_frame: bool, default=False

If True, the data is a pandas DataFrame including columns with appropriate dtypes (numeric). The target is a pandas DataFrame or Series depending on the number of target columns. If return_X_y is True, then (data, target) will be pandas DataFrames or Series as described below.

New in version 0.23.
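
As a rough illustration of `n_class`, the sketch below loads only three classes. It assumes that the classes kept are the digits 0 through `n_class - 1`, which the parameter description above does not state explicitly; the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits

# Restrict the dataset to three classes and get plain (data, target) arrays.
X_small, y_small = load_digits(n_class=3, return_X_y=True)

# Distinct target values that remain after the restriction.
labels = np.unique(y_small)

print(labels)
print(X_small.shape)  # fewer rows than the full dataset, still 64 columns
```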

Returns:
data: Bunch

Dictionary-like object, with the following attributes.

data: {ndarray, dataframe} of shape (1797, 64)

The flattened data matrix. If as_frame=True, data will be a pandas DataFrame.

target: {ndarray, Series} of shape (1797,)

The classification target. If as_frame=True, target will be a pandas Series.

feature_names: list

The names of the dataset columns.

target_names: list

The names of target classes.

New in version 0.20.

frame: DataFrame of shape (1797, 65)

Only present when as_frame=True. DataFrame with data and target.

New in version 0.23.

images: {ndarray} of shape (1797, 8, 8)

The raw image data.

DESCR: str

The full description of the dataset.
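
The Bunch attributes described above are related: each row of `data` is the flattened form of the corresponding 8x8 entry in `images`. The following sketch checks this with NumPy; `flattened` and `same` are illustrative names.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# Flatten each 8x8 image into a 64-element row and compare with `data`.
flattened = digits.images.reshape(len(digits.images), -1)
same = np.array_equal(flattened, digits.data)

print(digits.images.shape, digits.data.shape)
print(same)
print(len(digits.feature_names))
print(digits.target_names)
```

Going the other way, `digits.data.reshape(-1, 8, 8)` recovers the image layout when only the flattened matrix is at hand.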

(data, target): tuple if return_X_y is True

A tuple of two ndarrays by default. The first is a 2D ndarray of shape (1797, 64), with each row representing one sample and each column representing one feature. The second is an ndarray of shape (1797,) containing the target values. If as_frame=True, both are pandas objects, i.e. X is a DataFrame and y is a Series.

New in version 0.18.
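
The return types described above can be inspected as follows. This is a minimal sketch that assumes pandas is installed; the names `X_df`, `y_ser` and `frame` are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_digits

# With both flags set, the tuple holds pandas objects instead of ndarrays.
X_df, y_ser = load_digits(return_X_y=True, as_frame=True)

# Without return_X_y, the Bunch also carries a combined `frame` attribute.
frame = load_digits(as_frame=True).frame

print(type(X_df).__name__, X_df.shape)
print(type(y_ser).__name__, y_ser.shape)
print(frame.shape)  # 64 feature columns plus one target column
```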

Examples

To load the data and visualize the images:

>>> from sklearn.datasets import load_digits
>>> digits = load_digits()
>>> print(digits.data.shape)
(1797, 64)
>>> import matplotlib.pyplot as plt
>>> plt.gray()
>>> plt.matshow(digits.images[0])
<...>
>>> plt.show()

Examples using sklearn.datasets.load_digits

Release Highlights for scikit-learn 1.3

Recognizing hand-written digits

A demo of K-Means clustering on the handwritten digits data

Feature agglomeration

Various Agglomerative Clustering on a 2D embedding of digits

The Digit Dataset

Recursive feature elimination

Comparing various online solvers

L1 Penalty and Sparsity in Logistic Regression

Manifold learning on handwritten digits: Locally Linear Embedding, Isomap…

Explicit feature map approximation for RBF kernels

The Johnson-Lindenstrauss bound for embedding with random projections

Balance model complexity and cross-validated score

Comparing randomized search and grid search for hyperparameter estimation

Custom refit strategy of a grid search with cross-validation

Plotting Learning Curves and Checking Models’ Scalability

Plotting Validation Curves

Caching nearest neighbors

Dimensionality Reduction with Neighborhood Components Analysis

Kernel Density Estimation

Compare Stochastic learning strategies for MLPClassifier

Restricted Boltzmann Machine features for digit classification

Pipelining: chaining a PCA and a logistic regression

Selecting dimensionality reduction with Pipeline and GridSearchCV

Label Propagation digits active learning

Label Propagation digits: Demonstrating performance

Digits Classification Exercise