【LLM學(xué)習(xí)之路】9月16日 第六天
Loss Functions
L1Loss
The loss can be reduced by taking the mean or the sum, controlled by the reduction parameter ('mean' by default, or 'sum').
參數(shù)解析
input: shape (N, *), where N is the batch size and the asterisk means any number of extra dimensions. Note that (N, *) is not a constructor argument; it only describes the expected shape of the data.
target: must have the same shape as input.
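A quick sketch of the two reduction modes (the numbers are arbitrary, chosen just for illustration):

import torch
from torch import nn

inputs = torch.tensor([1.0, 2.0, 3.0])
targets = torch.tensor([1.0, 2.0, 5.0])

loss_sum = nn.L1Loss(reduction='sum')    # |1-1| + |2-2| + |3-5| = 2
loss_mean = nn.L1Loss(reduction='mean')  # 2 / 3
print(loss_sum(inputs, targets))         # tensor(2.)
print(loss_mean(inputs, targets))        # tensor(0.6667)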
MSELoss: mean squared error
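A minimal sketch with the same arbitrary numbers as above:

import torch
from torch import nn

inputs = torch.tensor([1.0, 2.0, 3.0])
targets = torch.tensor([1.0, 2.0, 5.0])

loss_mse = nn.MSELoss()           # mean of squared differences
print(loss_mse(inputs, targets))  # (0 + 0 + 4) / 3 = tensor(1.3333)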
CrossEntropyLoss: cross-entropy
The inputs must have shape (N, C), where N is the batch size and C is the number of classes. For a single sample, PyTorch computes loss(x, class) = -x[class] + log(Σ_j exp(x[j])).
import torch
from torch import nn

x = torch.tensor([0.1, 0.2, 0.3])  # 1D tensor of shape (3,): raw scores for 3 classes
y = torch.tensor([1])              # target class index
x = torch.reshape(x, (1, 3))       # inputs must have shape (N, C)
loss_cross = nn.CrossEntropyLoss()
result_cross = loss_cross(x, y)
print(result_cross)                # tensor(1.1019)
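As a sanity check, the same number falls out of the formula above by hand:

import math

# -x[class] + log(sum_j exp(x[j])) with x = [0.1, 0.2, 0.3], class = 1
manual = -0.2 + math.log(math.exp(0.1) + math.exp(0.2) + math.exp(0.3))
print(manual)  # ≈ 1.1019, matching result_cross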
Backpropagation
result_loss.backward()  # computes d(loss)/d(parameter) for every parameter and stores it in .grad
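A tiny self-contained sketch (nn.Linear stands in for the real model) showing that it is backward() that fills in .grad:

import torch
from torch import nn

model = nn.Linear(3, 2)            # tiny stand-in model
x = torch.randn(1, 3)
target = torch.tensor([1])
result_loss = nn.CrossEntropyLoss()(model(x), target)

print(model.weight.grad)           # None: backward() has not run yet
result_loss.backward()             # fills in .grad for every parameter
print(model.weight.grad.shape)     # torch.Size([2, 3])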
優(yōu)化器
The standard pattern looks like this:
optim = torch.optim.SGD(tudui.parameters(), lr=0.01)
optim.zero_grad()        # zero out the gradients left over from the previous step
result_loss.backward()   # backpropagate to compute fresh gradients
optim.step()             # update the model parameters
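Putting the three steps together in a toy loop (nn.Linear and random data are stand-ins, not the CIFAR-10 model below):

import torch
from torch import nn

model = nn.Linear(3, 2)                     # stand-in model
optim = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
x = torch.randn(8, 3)                       # random stand-in batch
target = torch.randint(0, 2, (8,))

for step in range(5):
    result_loss = loss_fn(model(x), target)
    optim.zero_grad()          # 1. clear old gradients
    result_loss.backward()     # 2. compute new gradients
    optim.step()               # 3. apply the update
    print(result_loss.item())  # the loss should trend downward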
Below I added, on my own, how to run the training on the GPU:
import torch
import torchvision.datasets
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.data import DataLoader
# 檢查是否有 GPU 可用
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")dataset = torchvision.datasets.CIFAR10("./data",train = False,download=True,transform=torchvision.transforms.ToTensor())
dataloader = DataLoader(dataset,batch_size=1)
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        # the whole network as one Sequential; forward() just calls it
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10),
        )

    def forward(self, x):
        x = self.model1(x)
        return x
loss = nn.CrossEntropyLoss()
tudui = Tudui().to(device)
optim = torch.optim.SGD(tudui.parameters(), lr=0.01)
for epoch in range(20):
    running_loss = 0.0
    for data in dataloader:
        imgs, targets = data
        imgs, targets = imgs.to(device), targets.to(device)  # move the batch to the same device as the model
        outputs = tudui(imgs)
        result_loss = loss(outputs, targets)
        optim.zero_grad()
        result_loss.backward()
        optim.step()
        running_loss = running_loss + result_loss.item()     # .item() detaches the scalar so the graph can be freed
    print(running_loss)
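The note stops before saving, but the validation routine below assumes a trained model exists on disk; one common way to save it (the filename tudui.pth is my own choice):

torch.save(tudui.state_dict(), "tudui.pth")  # save the learned weights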
完整的模型驗(yàn)證套路
利用已經(jīng)訓(xùn)練好的模型,然后給它提供輸入