PyTorch Basics

Creating tensors

```python
import torch
import numpy as np

torch.manual_seed(1234)

def describe(x):
    print("Type: {}".format(x.type()))
    print("Shape/size: {}".format(x.shape))
    print("Values: \n{}".format(x))

describe(torch.rand(2, 3))    # uniform random in [0, 1)
describe(torch.randn(2, 3))   # standard normal
describe(torch.zeros(2, 3))

x = torch.ones(2, 3)
describe(x)
x.fill_(5)                    # in-place fill
describe(x)

x = torch.Tensor([[1, 2, 3],
                  [4, 5, 6]])
```
Converting to and from NumPy

```python
npy = np.random.rand(2, 3)
describe(torch.from_numpy(npy))   # NumPy arrays are float64 by default, hence DoubleTensor
-----
Type: torch.DoubleTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0.5332, 0.1183, 0.3183],
        [0.5398, 0.3398, 0.1708]], dtype=torch.float64)

x = torch.tensor([[1, 2, 3], [4, 5, 6]])
y = x.numpy()
```
Type & Size

The default torch.Tensor is a torch.FloatTensor.
There are two ways to initialize the type:
① create with torch.FloatTensor or torch.LongTensor;
② create with torch.tensor() and pass a dtype.
There are two ways to get the size, x.shape and x.size(); the two are essentially equivalent.
```python
x = torch.FloatTensor([[1, 2, 3],
                       [4, 5, 6]])
describe(x)
----------
Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[1., 2., 3.],
        [4., 5., 6.]])
===================================
x = x.long()
describe(x)
----------
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[1, 2, 3],
        [4, 5, 6]])
====================================
x = x.float()
====================================
x = torch.tensor([[1, 2, 3],
                  [4, 5, 6]], dtype=torch.int64)
describe(x)
print(x.size())
----------
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[1, 2, 3],
        [4, 5, 6]])
torch.Size([2, 3])
```
view

view changes the size/shape of a tensor. A value of -1 means that dimension is inferred from the other dimensions.
```python
x = torch.randn(4, 4)
print(x.shape)
print(x.view(-1, 2).shape)
---
torch.Size([4, 4])
torch.Size([8, 2])
```
Tensor operations

For a 2-D tensor, reducing over dimension 0 collapses the rows and gives one result per column, while dimension 1 collapses the columns and gives one result per row.
```python
x = torch.Tensor([[1, 2, 3],
                  [4, 5, 6]])
describe(torch.add(x, x))
----------
Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[ 2.,  4.,  6.],
        [ 8., 10., 12.]])
```
Dimension-based operations: view, sum
```python
x = torch.arange(6)
describe(x)
------
Type: torch.LongTensor
Shape/size: torch.Size([6])
Values:
tensor([0, 1, 2, 3, 4, 5])

x = x.view(2, 3)
describe(x)
-----
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
        [3, 4, 5]])
=================================
describe(torch.sum(x, dim=0))   # sum over rows: one value per column
-----
Type: torch.LongTensor
Shape/size: torch.Size([3])
Values:
tensor([3, 5, 7])

describe(torch.sum(x, dim=1))   # sum over columns: one value per row
-----
Type: torch.LongTensor
Shape/size: torch.Size([2])
Values:
tensor([ 3, 12])
```
Slicing, indexing, concat, expand

cat concatenates tensors; expand broadcasts a size-1 dimension out to a larger size.
```python
x = torch.arange(6).view(2, 3)
describe(x)
------
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
        [3, 4, 5]])

describe(x[:1, :2])
------
Type: torch.LongTensor
Shape/size: torch.Size([1, 2])
Values:
tensor([[0, 1]])
=======================
describe(x[0, 1])
------
Type: torch.LongTensor
Shape/size: torch.Size([])
Values:
1
==============================
x = torch.arange(6).view(2, 3)
indices = torch.LongTensor([0, 2])
describe(torch.index_select(x, dim=1, index=indices))
-----
Type: torch.LongTensor
Shape/size: torch.Size([2, 2])
Values:
tensor([[0, 2],
        [3, 5]])
================================
indices = torch.LongTensor([0, 0])
describe(torch.index_select(x, dim=0, index=indices))
-----
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
        [0, 1, 2]])
==================================
x = torch.arange(6).view(2, 3)
row_indices = torch.arange(2).long()
col_indices = torch.LongTensor([0, 1])
describe(x[row_indices, col_indices])   # selects (row 0, col 0) and (row 1, col 1)
------
Type: torch.LongTensor
Shape/size: torch.Size([2])
Values:
tensor([0, 4])
=====================================
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
rows = np.array([0, 0, 2, 2])
cols = np.array([0, 2, 0, 2])
b = a[rows, cols]
print(b)
----
[1 3 7 9]

rows = np.array([[0, 0], [2, 2]])
cols = np.array([[0, 2], [0, 2]])
b = a[rows, cols]
print(b)
------
[[1 3]
 [7 9]]
===== concatenation =====
x = torch.arange(6).view(2, 3)
describe(torch.cat([x, x], dim=0))
-----
Type: torch.LongTensor
Shape/size: torch.Size([4, 3])
Values:
tensor([[0, 1, 2],
        [3, 4, 5],
        [0, 1, 2],
        [3, 4, 5]])

describe(torch.cat([x, x], dim=1))
-----
Type: torch.LongTensor
Shape/size: torch.Size([2, 6])
Values:
tensor([[0, 1, 2, 0, 1, 2],
        [3, 4, 5, 3, 4, 5]])
====================================
a = torch.rand(3, 1)
describe(a)
describe(a.expand(3, 4))        # repeats the size-1 column 4 times (no copy)
===========================
describe(torch.stack([x, x]))   # stacks along a new leading dimension
-----
Type: torch.LongTensor
Shape/size: torch.Size([2, 2, 3])
Values:
tensor([[[0, 1, 2],
         [3, 4, 5]],

        [[0, 1, 2],
         [3, 4, 5]]])
```
```python
x1 = torch.arange(6).view(2, 3).float()
describe(x1)
x2 = torch.ones(3, 2)
x2[:, 1] += 1
describe(x2)
-----
Type: torch.FloatTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0., 1., 2.],
        [3., 4., 5.]])
Type: torch.FloatTensor
Shape/size: torch.Size([3, 2])
Values:
tensor([[1., 2.],
        [1., 2.],
        [1., 2.]])
=================
describe(torch.mm(x1, x2))   # matrix multiplication
-----
Type: torch.FloatTensor
Shape/size: torch.Size([2, 2])
Values:
tensor([[ 3.,  6.],
        [12., 24.]])
=======
# Note: matrix multiplication cannot be written with the * operator;
# * is the Hadamard (element-wise) product.
# When a matrix is multiplied by a vector with *, the vector is broadcast
# (copied) to the same shape as the matrix. For example:
#   tensor([[1, 2, 3],
#           [2, 3, 4]])
# * tensor([[10, 20, 5]]) or tensor([10, 20, 5])
# gives:
#   tensor([[10, 40, 15],
#           [20, 60, 20]])

x1 = torch.arange(6).view(2, 3)
describe(torch.transpose(x1, 0, 1))
-----
Type: torch.LongTensor
Shape/size: torch.Size([3, 2])
Values:
tensor([[0, 3],
        [1, 4],
        [2, 5]])
==========================
z.permute(1, 0, 2)   # reorders the dimensions of a 3-D tensor z
```
unsqueeze adds a dimension of size 1; squeeze removes dimensions of size 1.
```python
x1 = torch.arange(6).view(2, 3)
describe(x1)
-----
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
        [3, 4, 5]])
==============================
x1 = x1.unsqueeze(dim=1)
describe(x1)
-----
Type: torch.LongTensor
Shape/size: torch.Size([2, 1, 3])
Values:
tensor([[[0, 1, 2]],

        [[3, 4, 5]]])
=================================
x1 = x1.squeeze()
describe(x1)
-----
Type: torch.LongTensor
Shape/size: torch.Size([2, 3])
Values:
tensor([[0, 1, 2],
        [3, 4, 5]])
```
requires_grad = True enables gradient tracking.

```python
x = torch.ones(2, 2, requires_grad=True)
```
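To illustrate what gradient tracking enables, here is a minimal autograd sketch (my addition, using an arbitrary scalar function of x):

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = (x + 2).pow(2).sum()   # a scalar function of x
y.backward()               # computes dy/dx
print(x.grad)              # each entry is 2 * (x + 2) = 6
```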
CUDA tensors

```python
print(torch.cuda.is_available())
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)
-----
True
cuda
=============================
x = torch.rand(3, 3).to(device)
describe(x)
-----
Type: torch.cuda.FloatTensor
Shape/size: torch.Size([3, 3])
Values:
tensor([[0.0290, 0.4019, 0.2598],
        [0.3666, 0.0583, 0.7006],
        [0.0518, 0.4681, 0.6738]], device='cuda:0')
===============================
x = torch.rand(3, 3).to(device)
y = torch.rand(3, 3)
x + y
-----
RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.FloatTensor for argument
=========================
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.rand(3, 3).to(device)
y = torch.rand(3, 3)
cpu_device = torch.device("cpu")
y = y.to(cpu_device)
x = x.to(cpu_device)
print(x + y)
-----
tensor([[0.3604, 1.1856, 0.8229],
        [1.1415, 0.8791, 0.9799],
        [0.7335, 0.7518, 1.3305]])
```
If you have multiple CUDA-visible devices (i.e., multiple GPUs), the best practice is to set the CUDA_VISIBLE_DEVICES environment variable when running the program, as follows:
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py
```
Batched matrix product of two tensors (a = torch.rand(3, 4, 5), b = torch.rand(5, 4)): since b has no batch dimension, it is unsqueezed and expanded to a's batch size before calling torch.bmm.
```python
a = torch.rand(3, 4, 5)
b = torch.rand(5, 4)
torch.bmm(a, b.unsqueeze(0).expand(a.size(0), *b.size()))   # result shape: (3, 4, 4)
```
Other

torch.sort() returns the sorted elements together with their original indices.
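A minimal sketch of both return values (my addition):

```python
import torch

x = torch.tensor([3, 1, 2])
values, indices = torch.sort(x)
print(values)    # tensor([1, 2, 3])
print(indices)   # tensor([1, 2, 0]): positions of the sorted values in x
```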
torch.eq()

```python
y_pred = torch.Tensor([[1, 2, 3, 2],
                       [3, 4, 5, 6]])
print(y_pred)
_, y_pred_indices = y_pred.max(dim=1)
print(y_pred_indices)
y_target = torch.Tensor([0, 3])
n_correct = torch.eq(y_pred_indices.long(), y_target.long())
print(n_correct)
-----
tensor([[1., 2., 3., 2.],
        [3., 4., 5., 6.]])
tensor([2, 3])
tensor([0, 1], dtype=torch.uint8)
```
Converting a tensor to a Python list

```python
tensorA.numpy().tolist()
```
Neural networks

A neural network hand-written in NumPy, PyTorch autograd, PyTorch Sequential, and PyTorch nn.Module are all covered at
https://github.com/sbaban/Machine-Learning/blob/master/pytorch/pytorch%E6%A1%86%E6%9E%B6.ipynb
The training framework generally consists of five steps:
```python
import torch.optim as optim

optimizer = optim.Adam(net.parameters(), lr=0.01)

optimizer.zero_grad()                  # 1. clear the gradients
output = net(input)                    # 2. forward pass
loss = criterion(output, target)       # 3. compute the loss
loss.backward()                        # 4. backward pass
optimizer.step()                       # 5. update the parameters
```
nn.Linear()

A linear transformation:
```python
nn.Linear(in_features, out_features, bias=True)
# in_features: size of each input sample
# out_features: size of each output sample
```
$$ y = xw^T + b $$
```python
m = torch.nn.Linear(20, 30)
input = torch.randn(128, 20)
output = m(input)
print(output.size())
print(m.weight.shape)
-----
torch.Size([128, 30])
torch.Size([30, 20])
```
Why can the input have shape [batch_size, x]?

nn.Linear() applies the linear transformation only to the last dimension, so as long as the last dimension equals in_features, any leading dimensions are allowed.

Ref: http://digtime.cn/articles/159/pytorch-zhong-nn-linear-han-shu-jie-du
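For instance, a minimal sketch (my addition) showing that extra leading dimensions pass straight through:

```python
import torch
import torch.nn as nn

m = nn.Linear(20, 30)
x = torch.randn(128, 10, 20)   # only the last dimension has to match in_features
print(m(x).shape)              # torch.Size([128, 10, 30])
```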
nn.CrossEntropyLoss

CrossEntropyLoss applies log-softmax internally, so the input logits must not be passed through log-softmax first.

For batched data it returns the mean loss by default; this behavior is controlled by the reduction argument (reduce / size_average in older versions).

Although it is named cross entropy, nn.CrossEntropyLoss() is the loss function for multi-class classification. It combines nn.LogSoftmax() and nn.NLLLoss(), so it can directly replace those two operations in a network.
Assume the input has size (minibatch, C), where C is the number of classes. The loss is computed as:
$$loss(x, class) = -\log\left(\frac{\exp(x_{class})}{\sum_j \exp(x_j)}\right) = -x_{class} + \log\left(\sum_j \exp(x_j)\right)$$
In PyTorch, log means the natural logarithm (base e). (No wonder my hand calculation did not match at first.)
The loss function also accepts a per-class weight parameter. If weights are set, the formula becomes:

$$loss(x, class) = weight[class] \cdot \left(-\log\left(\frac{\exp(x_{class})}{\sum_j \exp(x_j)}\right)\right) = weight[class] \cdot \left(-x_{class} + \log\left(\sum_j \exp(x_j)\right)\right)$$
A small example: suppose there are three classes and the batch size is 1 (to keep the arithmetic simple), so the input has size (1, 3), with values torch.Tensor([[-0.7715, -0.6205, -0.2562]]). The target is target = torch.tensor([0]); the label 0 means the sample belongs to class 0. The loss is computed as follows:
```python
import torch
import torch.nn as nn
import math

entropy = nn.CrossEntropyLoss()
input = torch.Tensor([[-0.7715, -0.6205, -0.2562]])
target = torch.tensor([0])
output = entropy(input, target)
print(output)
====
tensor(1.3447)
```
Hand calculation:

$$-x[0] + \log(\exp(x[0]) + \exp(x[1]) + \exp(x[2])) = 0.7715 + \log(\exp(-0.7715) + \exp(-0.6205) + \exp(-0.2562)) = 1.3447$$
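To also see the weight factor from the weighted formula above, here is a small sketch I added (the weight values are arbitrary). It uses reduction='none' so the per-sample weighting stays visible; with the default mean reduction, the weighted sum is divided by the sum of the selected class weights, so for a single sample the weight cancels out:

```python
import torch
import torch.nn as nn

input = torch.Tensor([[-0.7715, -0.6205, -0.2562]])
target = torch.tensor([0])

weights = torch.tensor([2.0, 1.0, 1.0])            # illustrative per-class weights
weighted = nn.CrossEntropyLoss(weight=weights, reduction='none')
print(weighted(input, target))                     # tensor([2.6894]) = 2.0 * 1.3447
```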
Let's check that it really is equivalent to combining nn.LogSoftmax() and nn.NLLLoss():
```python
m = nn.LogSoftmax(dim=1)
loss = nn.NLLLoss()
mysoftmax = m(input)
output = loss(mysoftmax, target)
print(output)   # tensor(1.3447), same as above
```
nn.LogSoftmax() computes

$$\log\left(\frac{\exp(x_i)}{\sum_j \exp(x_j)}\right)$$

where x is the input vector; the result is also a vector.
nn.NLLLoss() takes two inputs: a tensor of shape [num_samples, num_classes] (the log-probabilities) and a target tensor containing the true label index of each sample. The label indices range over [0, num_classes - 1] and must be a LongTensor.
The negative log likelihood loss, preceded by a LogSoftmax layer, is equivalent to the cross-entropy loss:
$$loss_n = -w_{y_n}\, x_{n, y_n}$$
Here no weight is set, so w defaults to 1. $x_{n,y_n}$ is the input's value at the true label index, i.e. mysoftmax[0][0] = -1.3447 in this example, so loss = -1 * (-1.3447) = 1.3447.
Another example:
```python
>>> m = nn.LogSoftmax(dim=1)
>>> loss = nn.NLLLoss()
>>> # input is of size N x C = 3 x 5
>>> input = torch.randn(3, 5, requires_grad=True)
>>> # each target value must satisfy 0 <= value < C
>>> target = torch.tensor([1, 0, 4])
>>> output = loss(m(input), target)
>>> output.backward()
```
nn.Embedding()

This module is often used to store word embeddings and retrieve them by index: the input to the module is a list of indices, and the output is the corresponding word embeddings.

```python
torch.nn.Embedding(num_embeddings, embedding_dim, padding_idx=None, max_norm=None,
                   norm_type=2, scale_grad_by_freq=False, sparse=False)
# num_embeddings: size of the whole vocabulary
# embedding_dim: dimensionality of each embedding vector
# padding_idx (int, optional): the row at this index is kept as all zeros and is not
#   updated during training; it marks the index used for padding. Padding means that when
#   sentences of unequal length are grouped into a batch, the empty positions are filled
#   (with this index) so that a uniform matrix can be formed.
```
The input to nn.Embedding consists of integer indices; given index 0, the output is row 0 of the embedding matrix. If the input has shape [batch_size], the output has shape [batch_size, embedding_dim]. When indexing this table you must use a torch.LongTensor, because the indices are integers, not floats.
```python
word_to_ix = {"hello": 0, "world": 1}
embeds = nn.Embedding(2, 5)   # 2 words in the vocab, 5-dimensional embeddings
lookup_tensor = torch.tensor([word_to_ix["hello"]], dtype=torch.long)
hello_embed = embeds(lookup_tensor)
print(hello_embed)
```
```python
myinput = torch.randint(0, 10, (2, 3))   # indices must lie in [0, num_embeddings)
print(myinput.shape)
embeds = nn.Embedding(10, 5)
embedded = embeds(myinput)
print(embedded.shape)
---
torch.Size([2, 3])
torch.Size([2, 3, 5])
```
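A small sketch of padding_idx (my addition): the padding row stays all zeros and receives no gradient updates.

```python
import torch
import torch.nn as nn

emb = nn.Embedding(10, 3, padding_idx=0)
idx = torch.tensor([[0, 4, 2, 0]])   # the 0s are padding positions
out = emb(idx)
print(out[0, 0])                     # tensor([0., 0., 0.], ...): the padding embedding
```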
nn.LSTM()

See https://sbaban.com/pytorch2019.html for details.
nn.LSTM() parameters:

```python
nn.LSTM(input_size, hidden_size, num_layers=1, bias=True,
        batch_first=False, dropout=0., bidirectional=False)
```
About the parameters seq_length, timestep, hidden_size, batch_size:

timestep and seq_length are the same thing: the number of time steps, which can also be understood as the number of unrolled RNN/LSTM blocks, or equivalently the length of each sentence (the number of words in it).
Input: input (seq_len, batch, input_size)

seq_len: the length of each sentence (number of words); batch: how many sentences are fed at a time; input_size: the dimensionality of each word vector.
h_0 (num_layers * num_directions, batch, hidden_size); c_0 (num_layers * num_directions, batch, hidden_size)

$h_0$ is the initial hidden state and $c_0$ the initial cell state. num_directions is 1 or 2, depending on whether the LSTM is bidirectional.
Output: output (seq_len, batch, num_directions * hidden_size); h_n (num_layers * num_directions, batch, hidden_size); c_n (num_layers * num_directions, batch, hidden_size)
```python
lstm = nn.LSTM(input_size, hidden_size)   # input_size and hidden_size must be given
out, (hn, cn) = lstm(inputs, (h0, c0))    # (h0, c0) is optional and defaults to zeros
```
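A runnable shape check (my addition, with arbitrary sizes):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2)
x = torch.randn(5, 3, 10)          # (seq_len, batch, input_size)
h0 = torch.zeros(2 * 1, 3, 20)     # (num_layers * num_directions, batch, hidden_size)
c0 = torch.zeros(2 * 1, 3, 20)

out, (hn, cn) = lstm(x, (h0, c0))
print(out.shape)   # torch.Size([5, 3, 20]): (seq_len, batch, num_directions * hidden_size)
print(hn.shape)    # torch.Size([2, 3, 20])
print(cn.shape)    # torch.Size([2, 3, 20])
```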
nn.Conv1d()
Depending on the nature of the data, convolutions can be one-, two-, or three-dimensional, implemented by the Conv1d, Conv2d and Conv3d modules respectively.

In other words, for an input of shape (600, 300) where 600 is the sequence length, Conv1D(kernel_size=3) is effectively Conv2D(kernel_size=(3, 300)), provided the input is also reshaped to (600, 300, 1) so that the 2-D convolution slides over whole rows.

Most convolution operations in NLP are one-dimensional.
```python
torch.nn.Conv1d(in_channels, out_channels, kernel_size, stride=1,
                padding=0, dilation=1, groups=1, bias=True)
```
in_channels (int): number of input channels. For text classification this is the embedding dimension of the word vectors, or the vocabulary size if one-hot vectors are used.
out_channels (int): number of channels produced by the convolution; each output channel corresponds to one 1-D convolution filter.
kernel_size (int or tuple): size of the convolution kernel. The kernel has size (k,); its second dimension is determined by in_channels, so the effective kernel size is kernel_size * in_channels.
stride (int or tuple, optional): convolution stride, default 1.
padding (int or tuple, optional): number of zero-padding entries added to each side of the input.
dilation (int or tuple, optional): spacing between kernel elements.
groups (int, optional): number of blocked connections from input channels to output channels.
bias (bool, optional): if True, adds a learnable bias.
Note:
```python
batch_size = 2
one_hot_size = 8
sequence_length = 5
data = torch.randn(batch_size, one_hot_size, sequence_length)
conv1 = nn.Conv1d(in_channels=one_hot_size, out_channels=6, kernel_size=3)
intermediate1 = conv1(data)
print(data.size())
print(intermediate1.size())
-----
torch.Size([2, 8, 5])
torch.Size([2, 6, 3])
```
Note: RNNs want the input with the batch dimension second, whereas CNNs want the batch dimension first.
If your sentences are shorter than the largest filter used, you will have to pad them to the length of that filter.
F.avg_pool1d()

F.avg_pool1d() performs average pooling over one dimension; the input must be 3-D, e.g. [2, 2, 7]. With kernel_size=3 and stride=2, every window of three values is averaged and the window moves two positions at a time. For example, pooling [1, 3, 3, 4, 5, 6, 7] with a window of 3 and stride 2 gives [2.3333, 4, 6]; the kernel is one-dimensional and acts only on the last dimension. By the usual pooling formula, for an input of size [2, 2, 7] with kernel size 3 and stride 2, the output length is (7 - 3)/2 + 1 = 3, so the output shape is [2, 2, 3], which matches the result.
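The same numbers as a runnable sketch (my addition):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[[1., 3., 3., 4., 5., 6., 7.]]])   # shape (1, 1, 7)
print(F.avg_pool1d(x, kernel_size=3, stride=2))
# tensor([[[2.3333, 4.0000, 6.0000]]]), output length (7 - 3) // 2 + 1 = 3
```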
nn.Sequential

```python
class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.hidden = nn.Linear(n_feature, n_hidden)
        self.predict = nn.Linear(n_hidden, n_output)

    def forward(self, x):
        x = torch.relu(self.hidden(x))
        x = self.predict(x)
        return x

net1 = Net(1, 10, 1)
print(net1)
-----
Net(
  (hidden): Linear(in_features=1, out_features=10, bias=True)
  (predict): Linear(in_features=10, out_features=1, bias=True)
)
```
```python
class Net(torch.nn.Module):
    def __init__(self, n_feature, n_hidden, n_output):
        super(Net, self).__init__()
        self.net = nn.Sequential(
            nn.Linear(n_feature, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, n_output)
        )

    def forward(self, x):
        x = self.net(x)
        return x

net2 = Net(1, 10, 1)
print(net2)
-----
Net(
  (net): Sequential(
    (0): Linear(in_features=1, out_features=10, bias=True)
    (1): ReLU()
    (2): Linear(in_features=10, out_features=1, bias=True)
  )
)
```
Honestly, I don't find it any simpler...
PackedSequences

When using RNNs (including LSTM, GRU, etc.) for NLP tasks, the shorter sentences within a batch usually need to be padded to a common length; these padding tokens should generally not be used when computing the output, hidden state, loss, and so on.
Ref: https://wemedia.ifeng.com/52703165/wemedia.shtml
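A minimal sketch (my addition, with arbitrary sizes) of how this is usually handled with pack_padded_sequence and pad_packed_sequence:

```python
import torch
import torch.nn as nn

padded = torch.randn(2, 3, 4)    # (batch, max_seq_len, input_size); the 2nd sequence has 1 padded step
lengths = torch.tensor([3, 2])   # true lengths of the two sequences

lstm = nn.LSTM(input_size=4, hidden_size=5, batch_first=True)

packed = nn.utils.rnn.pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)
packed_out, (hn, cn) = lstm(packed)   # the LSTM skips the padded steps

out, out_lengths = nn.utils.rnn.pad_packed_sequence(packed_out, batch_first=True)
print(out.shape)      # torch.Size([2, 3, 5]); outputs at padded positions are zeros
print(out_lengths)    # tensor([3, 2])
```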
clip_grad_norm()

Gradient clipping, used to deal with exploding gradients. Only applied during training.
```python
# define your sequence model
model = ...
# define loss function
loss_function = ...

# training loop
for _ in ...:
    ...
    model.zero_grad()
    output, hidden = model(data, hidden)
    loss = loss_function(output, targets)
    loss.backward()
    # 0.25 is the clipping threshold (max norm);
    # the function is named clip_grad_norm_ in newer PyTorch versions
    torch.nn.utils.clip_grad_norm(model.parameters(), 0.25)
    ...
```
Ref: https://blog.csdn.net/faner1994/article/details/78293650
Functions

torch.multinomial(input, num_samples, replacement=False) samples from a multinomial distribution.

The input tensor acts as a weight tensor (not as input samples): each element is the weight of that position within its row, and positions with larger weights are more likely to be sampled. If an element is 0, it will not be drawn until all non-zero elements in the row have been drawn (when sampling without replacement). num_samples indices are drawn per row and returned; if input has m rows, m × num_samples indices are drawn in total. replacement=True means sampling with replacement.
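A small sketch (my addition); the drawn indices are random, so the values shown in the comment are just one possible outcome:

```python
import torch

weights = torch.tensor([[10., 0., 3., 1.],
                        [ 1., 1., 1., 1.]])
idx = torch.multinomial(weights, num_samples=2)   # without replacement: 2 distinct indices per row
print(idx.shape)   # torch.Size([2, 2])
print(idx)         # e.g. tensor([[0, 2], [3, 1]])
```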
Ref: https://blog.csdn.net/monchin/article/details/79787621
or my GitHub:
https://github.com/sbaban/Machine-Learning/blob/master/pytorch/bilibli/3-skip-gram-pytorch.ipynb
Libraries: torchtext

torchtext contains the following components:
Field: holds the preprocessing configuration, e.g. the tokenization method, whether to lowercase, start/end tokens, the padding token, the vocabulary, and so on.
Dataset: inherits from PyTorch's Dataset and is used to load data. TabularDataset lets you specify a path, format and Field information to load data conveniently. torchtext also ships Dataset objects for common datasets that can be loaded directly; the splits method loads the training, validation and test sets at the same time.
Iterator: the iterator that feeds data to the model, with support for customized batching.
The torchtext preprocessing workflow (see the sketch below):

Define the Fields: declare how the data should be processed.
Define the Dataset: build the dataset; at this point each sample is a word list that has gone through the preprocessing declared by the Fields.
Build the vocab: build the vocabulary and, optionally, the word embeddings.
Build the iterators: construct iterators that feed the model in batches.
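A hedged sketch of these four steps using the legacy torchtext API (torchtext.data, pre-0.9); the file train.tsv and its text/label columns are hypothetical:

```python
from torchtext import data   # legacy API (torchtext.legacy.data in 0.9/0.10)

# 1. Define the Fields
TEXT = data.Field(sequential=True, tokenize=str.split, lower=True)
LABEL = data.Field(sequential=False, use_vocab=False)

# 2. Define the Dataset (hypothetical TSV file with "text" and "label" columns)
train = data.TabularDataset(path="train.tsv", format="tsv",
                            fields=[("text", TEXT), ("label", LABEL)])

# 3. Build the vocab (optionally with pretrained vectors, e.g. vectors="glove.6B.100d")
TEXT.build_vocab(train)

# 4. Build the iterator
train_iter = data.BucketIterator(train, batch_size=32,
                                 sort_key=lambda ex: len(ex.text), shuffle=True)

for batch in train_iter:
    x = batch.text    # shape: (seq_len, batch_size)
    y = batch.label
```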