I. Problem Description
Two problems are solved together here:
<1>
Traceback (most recent call last):
File "run_trainer_ernie_gen.py", line 120, in <module>
paddle.set_device(trainer_params.get("PADDLE_PLACE_TYPE", "cpu"))
File "/opt/conda/envs/ERNIE-GEN/lib/python3.7/site-packages/paddle/device/__init__.py", line 204, in set_device
place = _convert_to_place(device)
File "/opt/conda/envs/ERNIE-GEN/lib/python3.7/site-packages/paddle/device/__init__.py", line 127, in _convert_to_place
raise ValueError("The device should not be 'gpu', "
ValueError: The device should not be 'gpu', since PaddlePaddle is not compiled with CUDA
<2>
RuntimeError: (PreconditionNotMet) Cannot load cudnn shared library. Cannot invoke method cudnnGetVersion.
[Hint: cudnn_dso_handle should not be null.] (at /paddle/paddle/phi/backends/dynload/cudnn.cc:59)
Summary: paddlepaddle cannot use the GPU.
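Error <1> is raised by paddle.set_device when "gpu" is requested from a build without CUDA support. The following is a minimal sketch, not part of the original script, of how one might check the build before choosing a device. The helper names choose_device and paddle_has_cuda are hypothetical; paddle.is_compiled_with_cuda() is the only paddle call used.

```python
def choose_device(requested: str, compiled_with_cuda: bool) -> str:
    """Return the requested device, falling back to 'cpu' when a GPU
    is requested but the installed build has no CUDA support."""
    if requested.startswith("gpu") and not compiled_with_cuda:
        return "cpu"
    return requested


def paddle_has_cuda() -> bool:
    """Report whether the installed paddle build was compiled with CUDA.
    Returns False when paddle cannot be imported at all."""
    try:
        import paddle
    except ImportError:
        return False
    return bool(paddle.is_compiled_with_cuda())


if __name__ == "__main__":
    # Hypothetical usage: the value could then be passed to paddle.set_device().
    print("device to use:", choose_device("gpu", paddle_has_cuda()))
```

A silent fallback to the CPU hides the real problem, so this is mainly useful as a diagnostic: if it prints "cpu" on a machine with a GPU, the installed package is probably the CPU-only build.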
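II. Solution
Many blog posts say that installing cudatoolkit=10.2 fixes this, but it did not work for me (possibly because I am running inside a Docker container).
If you are installing directly on the host machine, you can install it with conda: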
conda install cudatoolkit=10.2
In Docker, I chose to download cuDNN directly from the official site instead:
https://developer.nvidia.com/rdp/cudnn-archive
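The file to download is the cuDNN v7.6.5 archive for CUDA 10.2 (cudnn-10.2-linux-x64-v7.6.5.32.tgz, the file extracted below).
Remember: be sure to choose exactly this one!
Once the download finishes, move the archive to /usr/local/ and unpack it there: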
tar -xvf cudnn-10.2-linux-x64-v7.6.5.32.tgz
sudo cp cuda/lib64/* /usr/local/cuda-10.2/lib
sudo cp cuda/lib64/* /usr/local/cuda-10.2/lib64/
Once cuDNN has been extracted and copied, configure the paths:
export PATH=/usr/local/cuda-10.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64:$LD_LIBRARY_PATH
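Error <2> means the dynamic loader could not find the cuDNN shared library, so it can be worth confirming that a libcudnn file is actually reachable through LD_LIBRARY_PATH. The sketch below is one possible check, assuming a Linux-style colon-separated path; prepend_path and find_cudnn are hypothetical helper names, not part of any library.

```python
import os
from pathlib import Path


def prepend_path(directory: str, current: str) -> str:
    """Prepend a directory to a colon-separated search path,
    inserting the ':' separator only when the path is non-empty."""
    return f"{directory}:{current}" if current else directory


def find_cudnn(search_path: str) -> list:
    """List libcudnn shared objects found in the existing directories
    of a colon-separated search path."""
    hits = []
    for entry in search_path.split(":"):
        if entry and Path(entry).is_dir():
            hits.extend(sorted(str(p) for p in Path(entry).glob("libcudnn.so*")))
    return hits


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    print("libcudnn candidates:", find_cudnn(path) or "none found")
```

If nothing is found, the usual suspects are a copy step that went to a different directory or a missing ":" separator in the export line, which silently merges two directory names into one nonexistent path.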
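At this point everything should have been working.
However, I still got an error.
After checking everything again, I found that paddlepaddle-gpu needs to be installed: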
pip install paddlepaddle-gpu
All done!
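To confirm that the installation works, PaddlePaddle ships a built-in self-test, paddle.utils.run_check(). The wrapper below is a small sketch around it; verify_paddle is a hypothetical name, and the import guard is there only so the script also behaves sensibly in an environment where paddle is missing.

```python
def verify_paddle() -> bool:
    """Run PaddlePaddle's built-in installation check if paddle is importable.
    Returns False when paddle is not installed, True once the check has run."""
    try:
        import paddle
    except ImportError:
        print("paddle is not installed in this environment")
        return False
    # run_check() executes a small test program and prints the result itself.
    paddle.utils.run_check()
    return True


if __name__ == "__main__":
    verify_paddle()
```

The check prints its own report, so the return value only indicates whether it was run, not whether the GPU was usable.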
What success looks like: