Development tools:
File size: 756 KB
Downloads: 0
Uploaded: 2019-06-28
Description: The official PyTorch documentation; a detailed guide to using PyTorch.

Notes
1 Autograd mechanics
2 CUDA semantics
3 Extending PyTorch
4 Multiprocessing best practices
5 Serialization semantics
6 torch
7 torch.Tensor
8 torch.Storage
9 torch.nn
10 torch.nn.functional
11 torch.nn.init
12 torch.optim
13 Automatic differentiation package - torch.autograd
14 Multiprocessing package - torch.multiprocessing
15 Legacy package - torch.legacy
16 torch.cuda
17 torch.utils.ffi
18 torch.utils.data
19 torch.utils.model_zoo
20 torchvision
21 torchvision.datasets
22 torchvision.models
23 torchvision.transforms
24 torchvision.utils
25 Indices and tables
Python Module Index
PyTorch Documentation, 0.1.11_5
PyTorch is an optimized tensor library for deep learning using GPUs and CPUs.
Notes
CHAPTER 1
Autograd mechanics
This note will present an overview of how autograd works and records the operations. It's not strictly necessary to understand all of this, but we recommend getting familiar with it, as it will help you write more efficient, cleaner programs, and can aid you in debugging.
Excluding subgraphs from backward
Every Variable has two flags: requires_grad and volatile. They both allow for fine-grained exclusion of subgraphs from gradient computation and can increase efficiency.
requires_grad
If there's a single input to an operation that requires gradient, its output will also require gradient. Conversely, the output won't require gradient only if all inputs don't require it. Backward computation is never performed in the subgraphs where all Variables didn't require gradients.
>>> import torch
>>> from torch.autograd import Variable
>>> x = Variable(torch.randn(5, 5))
>>> y = Variable(torch.randn(5, 5))
>>> z = Variable(torch.randn(5, 5), requires_grad=True)
>>> a = x + y
>>> a.requires_grad
False
>>> b = a + z
>>> b.requires_grad
True
This is especially useful when you want to freeze part of your model, or you know in advance that you're not going to use gradients w.r.t. some parameters. For example, if you want to finetune a pretrained CNN, it's enough to switch the requires_grad flags in the frozen base, and no intermediate buffers will be saved, until the computation gets to the last layer, where the affine transform will use weights that require gradient, and the output of the network will also require them.
import torch.nn as nn
import torch.optim as optim
import torchvision

model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
# Replace the last fully-connected layer.
# Parameters of newly constructed modules have requires_grad=True by default.
model.fc = nn.Linear(512, 100)

# Optimize only the classifier.
optimizer = optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9)
volatile
Volatile is recommended for purely inference mode, when you're sure you won't even be calling .backward(). It's more efficient than any other autograd setting - it will use the absolute minimal amount of memory to evaluate the model. volatile also determines that requires_grad is False.

Volatile differs from requires_grad in how the flag propagates. If there's even a single volatile input to an operation, its output is also going to be volatile. Volatility spreads across the graph much more easily than non-requiring gradient - you only need a single volatile leaf to have a volatile output, while you need all leaves to not require gradient to have an output that doesn't require gradient. Using the volatile flag, you don't need to change any settings of your model parameters to use it for inference. It's enough to create a volatile input, and this will ensure that no intermediate states are saved.
>>> import torchvision
>>> regular_input = Variable(torch.randn(1, 3, 224, 224))   # resnet18 expects an image batch
>>> volatile_input = Variable(torch.randn(1, 3, 224, 224), volatile=True)
>>> model = torchvision.models.resnet18(pretrained=True)
>>> model(regular_input).requires_grad
True
>>> model(volatile_input).requires_grad
False
>>> model(volatile_input).volatile
True
>>> model(volatile_input).creator is None
True
How autograd encodes the history
Each Variable has a creator attribute that points to the function of which it is an output. This is an entry point to a directed acyclic graph (DAG) consisting of Function objects as nodes, with references between them being the edges. Every time an operation is performed, a new Function representing it is instantiated, its forward() method is called, and its output Variables' creators are set to it. Then, by following the path from any Variable to the leaves, it is possible to reconstruct the sequence of operations that has created the data, and automatically compute the gradients.
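As a quick illustration of the creator attribute described above, here is a minimal sketch for the 0.1.x Variable API. The operations are chosen arbitrarily, and it assumes (as the volatile example above also suggests) that a Variable which is not the output of any Function has creator set to None:

import torch
from torch.autograd import Variable

x = Variable(torch.randn(3), requires_grad=True)
y = x * 2          # instantiates a Function and sets y.creator to it
z = y.sum()        # another Function, with z.creator pointing to it

print(x.creator)   # None - a user-created leaf Variable is not the output of any Function
print(y.creator)   # the Function object that produced y
print(z.creator)   # the Function object that produced z; its references lead back towards the leaves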
An important thing to note is that the graph is recreated from scratch at every iteration, and this is exactly what allows for using arbitrary Python control flow statements, which can change the overall shape and size of the graph at every iteration. You don't have to encode all possible paths before you launch the training - what you run is what you differentiate.
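For instance, the forward pass below contains an ordinary Python loop whose length changes between calls, so each call records a graph of a different depth. This is only an illustrative sketch; the sizes, the random loop bound, and the forward function are made up for the example:

import random
import torch
from torch.autograd import Variable

W = Variable(torch.randn(10, 10), requires_grad=True)

def forward(x):
    h = x
    # The number of matrix multiplies differs from call to call, so each
    # iteration records a graph of a different shape and size.
    for _ in range(random.randint(1, 5)):
        h = h.mm(W)
    return h.sum()

loss = forward(Variable(torch.randn(1, 10)))
loss.backward()   # differentiates exactly the operations that were run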
In-place operations on Variables
Supporting in-place operations in autograd is a hard matter, and we discourage their use in most cases. Autograd's aggressive buffer freeing and reuse makes it very efficient, and there are very few occasions when in-place operations actually lower memory usage by any significant amount. Unless you're operating under heavy memory pressure, you might never need to use them.
There are two main reasons that limit the applicability of in-place operations:

1. Overwriting values required to compute gradients. This is why Variables don't support log_. Its gradient formula requires the original input, and while it is possible to recreate it by computing the inverse operation, it is numerically unstable, and requires additional work that often defeats the purpose of using these functions (see the sketch after this list).

2. Every in-place operation actually requires the implementation to rewrite the computational graph. Out-of-place versions simply allocate new objects and keep references to the old graph, while in-place operations require changing the creator of all inputs to the Function representing this operation. This can be tricky, especially if there are many Variables that reference the same storage (e.g. created by indexing or transposing), and in-place functions will actually raise an error if the storage of modified inputs is referenced by any other Variable.
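To make point 1 concrete, here is a small sketch (treat it as an illustration rather than a reproduction of the library's internals): log's backward formula is d/dx log(x) = 1/x, so it has to read the original input, which an in-place log_ would have destroyed:

import torch
from torch.autograd import Variable

x = Variable(torch.rand(5) + 0.1, requires_grad=True)
y = x.log()                # out-of-place log keeps x around for the backward formula
y.backward(torch.ones(5))  # d/dx log(x) = 1/x, so backward reads the original x
print(x.grad)              # matches 1 / x.data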
In-place correctness checks
Every Variable keeps a version counter that is incremented every time it is marked dirty in any operation. When a Function saves any tensors for backward, the version counter of their containing Variable is saved as well. Once you access self.saved_tensors, it is checked, and if it is greater than the saved value, an error is raised.
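As a concrete illustration of this check, here is a minimal sketch. It assumes that add_ is among the in-place operations supported on Variables in this release and that multiplication saves its inputs for backward (its gradient formula requires them); the exact error message may differ:

import torch
from torch.autograd import Variable

a = Variable(torch.randn(5), requires_grad=True)
b = a * 2
c = b * b          # the multiplication saves b for its backward formula
b.add_(1)          # in-place update marks b dirty and bumps its version counter
c.sum().backward() # the stale saved tensor is detected and an error is raised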