Pitfalls
Selecting GPUs
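First run nvidia-smi to check the load on each GPU, then, when launching the torch job, restrict which GPU IDs are visible to it.
Instead of running python blablabla, prefix the command so it becomes CUDA_VISIBLE_DEVICES=0,1,2 python blablabla.
This makes only GPUs 0, 1, and 2 visible to the process, and torch uses them in that order (inside the process they show up as cuda:0, cuda:1, cuda:2).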
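The two steps above (check the load, then pick the visible IDs) can also be scripted. Here is a minimal sketch, not something from the original setup: the helper names pick_idle_gpus, query_gpus, and launch are made up for illustration, and it assumes nvidia-smi is on the PATH and reports plain integers for every GPU.

```python
import os
import subprocess
import sys


def pick_idle_gpus(csv_text, count):
    """Return the `count` least-loaded GPU indices as a CUDA_VISIBLE_DEVICES string.

    `csv_text` is expected to hold one "index, memory.used, utilization.gpu"
    line per GPU (hypothetical helper, assumes integer fields).
    """
    rows = []
    for line in csv_text.strip().splitlines():
        index, mem_used, util = (field.strip() for field in line.split(","))
        rows.append((int(mem_used), int(util), int(index)))
    rows.sort()  # least memory used first, ties broken by utilization
    chosen = sorted(index for _, _, index in rows[:count])
    return ",".join(str(i) for i in chosen)


def query_gpus():
    """Ask nvidia-smi for per-GPU load in a machine-readable form."""
    output = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=index,memory.used,utilization.gpu",
        "--format=csv,noheader,nounits",
    ])
    return output.decode()


def launch(script_args, visible):
    """Run a Python script with only the chosen GPUs visible to it."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=visible)
    return subprocess.call([sys.executable] + list(script_args), env=env)


# Hypothetical usage on a machine with GPUs:
#   visible = pick_idle_gpus(query_gpus(), count=2)
#   launch(["blablabla.py"], visible)
```

Setting the variable in the child's environment has the same effect as typing the prefix on the command line. It has to be in place before the process initializes CUDA, which is why the sketch launches a new process rather than changing the variable halfway through a running one.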
optimizer and CUDA bug
It shows up as an error roughly like this:
  File "xxx", line 128, in update
    self.optimizer.step()
  File "/home/xalanq/python/lib/python3.6/site-packages/torch/optim/adamax.py", line 75, in step
    exp_avg.mul_(beta1).add_(1 - beta1, grad)
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'other'
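The optimizer state restored from the checkpoint (here exp_avg) is still a CPU tensor, while the gradient is on the GPU. Adding code along these lines after loading the state fixes it: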
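self.optimizer.load_state_dict(state_dict['optimizer'])
if self.opt['cuda']:
    # Move every tensor in the restored optimizer state onto the GPU
    # so it matches the device of the gradients.
    for state in self.optimizer.state.values():
        for k, v in state.items():
            if torch.is_tensor(v):
                state[k] = v.cuda()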