File name:
scikit-learn user guide Release 0.20.3 API
Development tool:
File size: 46 MB
Downloads: 0
Upload date: 2019-03-24
Description: scikit-learn user guide Release 0.20.3, official documentation and API
CONTENTS
1 Welcome to scikit-learn
1.1 Installing scikit-learn
1.2 Frequently Asked Questions
1.3 Support
1.4 Related Projects
1.5 About us
1.6 Who is using scikit-learn?
1.7 Release history
1.8 Version 0.20.3
1.9 Version 0.20.2
1.10 Version 0.20.1
1.11 Version 0.20.0
1.12 Version 0.19.2
1.13 Version 0.19.1
1.14 Version 0.19
1.15 Previous releases
2 scikit-learn Tutorials
2.1 An introduction to machine learning with scikit-learn
2.2 A tutorial on statistical-learning for scientific data processing
2.3 Working With Text Data
2.4 Choosing the right estimator
2.5 External Resources, Videos and Talks
3 User guide
3.1 Supervised learning
3.2 Unsupervised learning
3.3 Model selection and evaluation
3.4 Dataset transformations
3.5 Dataset loading utilities
3.6 Computing with scikit-learn
4 Glossary of Common Terms and API Elements
4.1 General Concepts
4.2 Class APIs and Estimator Types
4.3 Target Types
4.4 Methods
4.5 Parameters
4.6 Attributes
4.7 Data and sample properties
5 Examples
5.1 Miscellaneous examples
5.2 Examples based on real world datasets
5.3 Biclustering
5.4 Calibration
5.5 Classification
5.6 Clustering
5.7 Pipelines and composite estimators
5.8 Covariance estimation
5.9 Cross decomposition
5.10 Dataset examples
5.11 Decomposition
5.12 Ensemble methods
5.13 Tutorial exercises
5.14 Feature Selection
5.15 Gaussian Process for Machine Learning
5.16 Generalized Linear Models
5.17 Manifold learning
5.18 Gaussian Mixture Models
5.19 Model Selection
5.20 Multioutput methods
5.21 Nearest Neighbors
5.22 Neural Networks
5.23 Preprocessing
5.24 Semi Supervised Classification
5.25 Support Vector Machines
5.26 Working with text documents
5.27 Decision Trees
6 API Reference
6.1 sklearn.base: Base classes and utility functions
6.2 sklearn.calibration: Probability Calibration
6.3 sklearn.cluster: Clustering
6.4 sklearn.cluster.bicluster: Biclustering
6.5 sklearn.compose: Composite Estimators
6.6 sklearn.covariance: Covariance Estimators
6.7 sklearn.cross_decomposition: Cross decomposition
6.8 sklearn.datasets: Datasets
6.9 sklearn.decomposition: Matrix Decomposition
6.10 sklearn.discriminant_analysis: Discriminant Analysis
6.11 sklearn.dummy: Dummy estimators
6.12 sklearn.ensemble: Ensemble Methods
6.13 sklearn.exceptions: Exceptions and warnings
6.14 sklearn.feature_extraction: Feature Extraction
6.15 sklearn.feature_selection: Feature Selection
6.16 sklearn.gaussian_process: Gaussian Processes
6.17 sklearn.isotonic: Isotonic regression
6.18 sklearn.impute: Impute
6.19 sklearn.kernel_approximation: Kernel Approximation
6.20 sklearn.kernel_ridge: Kernel Ridge Regression
6.21 sklearn.linear_model: Generalized Linear Models
6.22 sklearn.manifold: Manifold Learning
6.23 sklearn.metrics: Metrics
6.24 sklearn.mixture: Gaussian Mixture Models
6.25 sklearn.model_selection: Model Selection
6.26 sklearn.multiclass: Multiclass and multilabel classification
6.27 sklearn.multioutput: Multioutput regression and classification
6.28 sklearn.naive_bayes: Naive Bayes
6.29 sklearn.neighbors: Nearest Neighbors
6.30 sklearn.neural_network: Neural network models
6.31 sklearn.pipeline: Pipeline
6.32 sklearn.preprocessing: Preprocessing and Normalization
6.33 sklearn.random_projection: Random projection
6.34 sklearn.semi_supervised: Semi-Supervised Learning
6.35 sklearn.svm: Support Vector Machines
6.36 sklearn.tree: Decision Trees
6.37 sklearn.utils: Utilities
6.38 Recently deprecated
7 Developer's Guide
7.1 Contributing
7.2 Developers' Tips and Tricks
7.3 Utilities for Developers
7.4 How to optimize for speed
7.5 Advanced installation instructions
7.6 Maintainer / core-developer information
Bibliography
Index
CHAPTER
ONE
WELCOME TO SCIKIT-LEARN
1.1 Installing scikit-learn
Note: If you wish to contribute to the project, it's recommended you install the latest development version.
1.1.1 Installing the latest release
Scikit-learn requires:
Python (>= 2.7 or >= 3.4),
NumPy (>= 1.8.2),
SciPy (>= 0.13.3).
Warning: Scikit-learn 0.20 is the last version to support Python 2.7 and Python 3.4. Scikit-learn 0.21 will require Python 3.5 or newer.
If you already have a working installation of numpy and scipy, the easiest way to install scikit-learn is using pip:
pip install -U scikit-learn
or conda:
conda install scikit-learn
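After installing, it can be useful to confirm that the interpreter sees versions that satisfy the minimums listed above. The following is a minimal sketch, not part of the official guide; the helper `version_tuple` is a hypothetical name introduced here for illustration, and it only compares the leading numeric components of each version string.

```python
import re
import sys

import numpy
import scipy
import sklearn


def version_tuple(version):
    """Turn a version string such as '1.8.2' into a tuple of ints (1, 8, 2)."""
    # Hypothetical helper: keeps the first three numeric components only,
    # so suffixes such as 'rc1' or '.dev0' are ignored.
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


# Minimum versions quoted in the requirements list for the 0.20 series.
minimums = {"numpy": (1, 8, 2), "scipy": (0, 13, 3)}
installed = {
    "numpy": version_tuple(numpy.__version__),
    "scipy": version_tuple(scipy.__version__),
}

print("Python", sys.version_info[:3])
print("scikit-learn", sklearn.__version__)
for name, minimum in minimums.items():
    status = "OK" if installed[name] >= minimum else "older than required"
    print(name, installed[name], status)
```

If the imports themselves fail, the installation did not succeed in the environment that this interpreter belongs to.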
If you have not installed NumPy or SciPy yet, you can also install these using conda or pip. When using pip, please ensure that binary wheels are used, and NumPy and SciPy are not recompiled from source, which can happen when using particular configurations of operating system and hardware (such as Linux on a Raspberry Pi). Building numpy and scipy from source can be complex (especially on Windows) and requires careful configuration to ensure that they link against an optimized implementation of linear algebra routines. Instead, use a third-party distribution as described below.
If you must install scikit-learn and its dependencies with pip, you can install it as scikit-learn[alldeps]. The most common use case for this is in a requirements.txt file used as part of an automated build process for a PaaS application or a Docker image. This option is not intended for manual installation from the command line.
Note: For installing on PyPy, PyPy3-v5.10+, NumPy 1.14.0+, and SciPy 1.1.0+ are required.
For installation instructions for more distributions, see other distributions. For compiling the development version from source, or building the package if no distribution is available for your architecture, see the Advanced installation instructions.
1.1.2 Third-party Distributions
If you don't already have a Python installation with numpy and scipy, we recommend installing either via your package manager or via a Python bundle. These come with numpy, scipy, scikit-learn, matplotlib and many other helpful scientific and data processing libraries.
Available options are:
Canopy and Anaconda for all supported platforms
Canopy and Anaconda both ship a recent version of scikit-learn, in addition to a large set of scientific Python libraries for Windows, Mac OSX and Linux.
Anaconda offers scikit-learn as part of its free distribution.
Warning: To upgrade or uninstall scikit-learn installed with Anaconda or conda, you should not use the pip command. Instead:
To upgrade scikit-learn:
conda update scikit-learn
To uninstall scikit-learn:
conda remove scikit-learn
Upgrading with pip install -U scikit-learn or uninstalling with pip uninstall scikit-learn is likely to fail to properly remove files installed by the conda command.
pip upgrade and uninstall operations only work on packages installed via pip install.
WinPython for Windows
The WinPython project distributes scikit-learn as an additional plugin.
1.2 Frequently Asked Questions
Here we try to give some answers to questions that regularly pop up on the mailing list.
1.2.1 What is the project name (a lot of people get it wrong)?
scikit-learn, but not scikit or SciKit nor sci-kit learn. Also not scikits.learn or scikits-learn, which were previously used.
1.2.2 How do you pronounce the project name?
sy-kit learn. sci stands for science!
1.2.3 Why scikit?
There are multiple scikits, which are scientific toolboxes built around SciPy. You can find a list at https://scikits.appspot.com/scikits. Apart from scikit-learn, another popular one is scikit-image.
1.2.4 How can I contribute to scikit-learn?
See Contributing. Before wanting to add a new algorithm, which is usually a major and lengthy undertaking, it is recommended to start with known issues. Please do not contact the contributors of scikit-learn directly regarding contributing to scikit-learn.
1.2.5 What's the best way to get help on scikit-learn usage?
For general machine learning questions, please use Cross Validated with the [machine-learning] tag.
For scikit-learn usage questions, please use Stack Overflow with the [scikit-learn] and [python] tags. You can alternatively use the mailing list.
Please make sure to include a minimal reproduction code snippet (ideally shorter than 10 lines) that highlights your problem on a toy dataset (for instance from sklearn.datasets or randomly generated with functions of numpy.random with a fixed random seed). Please remove any line of code that is not necessary to reproduce your problem.
The problem should be reproducible by simply copy-pasting your code snippet in a Python shell with scikit-learn installed. Do not forget to include the import statements.
More guidance to write good reproduction code snippets can be found at:
https://stackoverflow.com/help/mcve
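As an illustration of the advice above, a reproduction snippet might look like the following sketch. The estimator and the data shape are arbitrary choices made here for the example; the point is the structure: explicit imports, a fixed random seed, a small synthetic dataset, and only the lines needed to trigger the behaviour in question.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)   # fixed seed so everyone gets the same data
X = rng.normal(size=(20, 3))     # 20 samples, 3 features
y = (X[:, 0] > 0).astype(int)    # toy binary target derived from feature 0

clf = LogisticRegression()
clf.fit(X, y)
print(clf.predict(X[:5]))        # the call whose behaviour is being asked about
```

Anyone can paste this into a Python shell and see the same result, which is what makes a question or bug report actionable.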
If your problem raises an exception that you do not understand (even after googling it), please make sure to include the full traceback that you obtain when running the reproduction script.
For bug reports or feature requests, please make use of the issue tracker on GitHub.
There is also a scikit-learn Gitter channel where some users and developers might be found.
Please do not email any authors directly to ask for assistance, report bugs, or for any other issue related to scikit-learn.
1.2.6 How should I save, export or deploy estimators for production?
See Model persistence
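For orientation only, one common approach is Python's built-in pickle module. The sketch below is an assumption about a typical workflow, not a summary of the Model persistence section: it serializes a fitted estimator to bytes and restores it in the same environment. Pickles are generally not guaranteed to load across different scikit-learn versions, and unpickling data from an untrusted source is unsafe.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

# Fit a small model; random_state makes the fitted tree reproducible.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Serialize to bytes (pickle.dump would write to an open binary file instead).
blob = pickle.dumps(clf)

# Restore the estimator and check it predicts the same labels.
restored = pickle.loads(blob)
same = (restored.predict(X) == clf.predict(X)).all()
print("restored model matches:", same)
```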
1.2.7 How can I create a bunch object?
Don't make a bunch object! They are not part of the scikit-learn API. Bunch objects are just a way to package some numpy arrays. As a scikit-learn user you only ever need numpy arrays to feed your model with data.
For instance, to train a classifier, all you need is a 2D array X for the input variables and a 1D array y for the target variables. The array X holds the features as columns and samples as rows. The array y contains integer values to encode the class membership of each sample in X.
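A minimal sketch of this layout, using made-up numbers and an arbitrarily chosen classifier:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# X: 2D array, one row per sample and one column per feature.
X = np.array([[0.0, 0.1],
              [0.2, 0.0],
              [1.0, 0.9],
              [0.9, 1.1]])
# y: 1D array of integer class labels, one per row of X.
y = np.array([0, 0, 1, 1])

clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)
print(clf.predict([[0.1, 0.1], [1.0, 1.0]]))  # → [0 1]
```

No Bunch or other container is involved; plain arrays are enough.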
1.2.8 How can I load my own datasets into a format usable by scikit-learn?
Generally, scikit-learn works on any numeric data stored as numpy arrays or scipy sparse matrices. Other types that are convertible to numeric arrays, such as a pandas DataFrame, are also acceptable.
For more information on loading your data files into these usable data structures, please refer to loading external datasets.
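The sketch below illustrates the point with invented data: the same estimator class is fitted once on a pandas DataFrame and once on a scipy sparse matrix built from the same values. The column names and the choice of Ridge are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy import sparse
from sklearn.linear_model import Ridge

# A small numeric table; each row is a sample, each column a feature.
df = pd.DataFrame({"height": [1.0, 2.0, 3.0, 4.0],
                   "width": [0.5, 0.1, 0.7, 0.2]})
target = np.array([1.5, 2.1, 3.7, 4.2])

# A DataFrame of numeric columns is converted to an array internally.
dense_model = Ridge(alpha=1.0).fit(df, target)

# The same values stored as a scipy sparse matrix are accepted as well.
X_sparse = sparse.csr_matrix(df.values)
sparse_model = Ridge(alpha=1.0).fit(X_sparse, target)

print(dense_model.coef_)
print(sparse_model.coef_)
```

The two fits may use different solvers internally, so the printed coefficients can differ slightly.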
1.2.9 What are the inclusion criteria for new algorithms?
We only consider well-established algorithms for inclusion. A rule of thumb is at least 3 years since publication, 200+ citations and wide use and usefulness. A technique that provides a clear-cut improvement (e.g. an enhanced data structure or a more efficient approximation technique) on a widely-used method will also be considered for inclusion.
From the algorithms or techniques that meet the above criteria, only those which fit well within the current API of scikit-learn, that is a fit, predict/transform interface and ordinarily having input/output that is a numpy array or sparse matrix, are accepted.
The contributor should support the importance of the proposed addition with research papers and/or implementations in other similar packages, demonstrate its usefulness via common use-cases/applications and corroborate performance improvements, if any, with benchmarks and/or plots. It is expected that the proposed algorithm should outperform the methods that are already implemented in scikit-learn at least in some areas.
Also note that your implementation need not be in scikit-learn to be used together with scikit-learn tools. You can implement your favorite algorithm in a scikit-learn compatible way, upload it to GitHub and let us know. We will be happy to list it under Related Projects. If you already have a package on GitHub following the scikit-learn API, you may also be interested to look at scikit-learn-contrib.
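To give a rough idea of what "scikit-learn compatible" can mean in practice, here is a minimal sketch of a third-party classifier exposing the fit/predict interface. The class name `NearestCentroidSketch` is hypothetical and the algorithm is deliberately trivial; a real contribution would also need input validation and tests.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score


class NearestCentroidSketch(ClassifierMixin, BaseEstimator):
    """Hypothetical toy classifier: predict the class with the closest mean."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        # Learned attributes end with an underscore by convention.
        self.classes_ = np.unique(y)
        self.centroids_ = np.array(
            [X[y == label].mean(axis=0) for label in self.classes_]
        )
        return self  # returning self allows chaining: est.fit(X, y).predict(X)

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Squared Euclidean distance from every sample to every centroid.
        dists = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[dists.argmin(axis=1)]


# Because it follows the interface, existing tools work with it unchanged.
X, y = load_iris(return_X_y=True)
scores = cross_val_score(NearestCentroidSketch(), X, y, cv=5)
print(scores.mean())
```

Inheriting from BaseEstimator provides get_params and set_params, and ClassifierMixin provides a default score method, which is what lets cross_val_score clone and evaluate the estimator.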
1.2.10 Why are you so selective on what algorithms you include in scikit-learn?
Code is maintenance cost, and we need to balance the amount of code we have with the size of the team (and add to this the fact that complexity scales non-linearly with the number of features). The package relies on core developers using their free time to fix bugs, maintain code and review contributions. Any algorithm that is added needs future attention by the developers, at which point the original author might long have lost interest. See also What are the inclusion criteria for new algorithms? For a great read about long-term maintenance issues in open-source software, look at the Executive Summary of Roads and Bridges.
1.2.11 Why did you remove HMMs from scikit-learn?
See Will you add graphical models or sequence prediction to scikit-learn?
1.2.12 Will you add graphical models or sequence prediction to scikit-learn?
Not in the foreseeable future. scikit-learn tries to provide a unified API for the basic tasks in machine learning, with pipelines and meta-algorithms like grid search to tie everything together. The concepts, APIs, algorithms and expertise required for structured learning are different from what scikit-learn has to offer. If we started doing arbitrary structured learning, we'd need to redesign the whole package and the project would likely collapse under its own weight.
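As a small illustration of how pipelines and grid search tie the unified API together, the following sketch chains a scaler and a classifier and tunes one hyperparameter. The dataset, the steps and the parameter grid are arbitrary choices made for this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# A pipeline is itself an estimator: fit runs every step in order.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("svc", SVC(gamma=0.1)),
])

# Parameters of a step are addressed as <step name>__<parameter name>.
search = GridSearchCV(pipe, param_grid={"svc__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)

print(search.best_params_)
print(search.best_score_)
```

Every object here exposes the same fit interface on arrays of samples, which is the assumption that structured prediction would break.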
There are two projects with an API similar to scikit-learn that do structured prediction:
pystruct handles general structured learning (focuses on SSVMs on arbitrary graph structures with approximate inference; defines the notion of sample as an instance of the graph structure)