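PyTorch-YOLOv3 fails with an error? Here is the one thing to check
Introduction
Last time I increased the OS disk space. When I then tried to run PyTorch-YOLOv3, I got stuck on an error, so this post describes how I solved it.
What I did
Following this README, I ran test.py with the command below.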
python3 test.py --weights_path weights/yolov3.weights
That produced the following error.
$ python3 test.py --weights_path weights/yolov3.weights
Namespace(batch_size=8, class_path='data/coco.names', conf_thres=0.001, data_config='config/coco.data', img_size=416, iou_thres=0.5, model_def='config/yolov3.cfg', n_cpu=8, nms_thres=0.5, weights_path='weights/yolov3.weights')
Compute mAP...
Detecting objects: 0%| | 0/625 [00:00<?, ?it/s]Traceback (most recent call last):
File "test.py", line 98, in <module>
batch_size=8,
File "test.py", line 36, in evaluate
for batch_i, (_, imgs, targets) in enumerate(tqdm.tqdm(dataloader, desc="Detecting objects")):
File "/home/dluser/anaconda3/lib/python3.7/site-packages/tqdm/_tqdm.py", line 1022, in __iter__
for obj in iterable:
File "/home/dluser/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 336, in __next__
return self._process_next_batch(batch)
File "/home/dluser/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
TypeError: Traceback (most recent call last):
File "/home/dluser/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/dluser/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/dluser/PyTorch-YOLOv3/utils/datasets.py", line 96, in __getitem__
img, pad = pad_to_square(img, 0)
File "/home/dluser/PyTorch-YOLOv3/utils/datasets.py", line 23, in pad_to_square
img = F.pad(img, pad, "constant", value=pad_value)
File "/home/dluser/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2159, in pad
return ConstantPadNd.apply(input, pad, value)
File "/home/dluser/anaconda3/lib/python3.7/site-packages/torch/nn/_functions/padding.py", line 40, in forward
c_output = c_output.narrow(i, p[0], c_output.size(i) - p[0])
TypeError: narrow(): argument 'start' (position 2) must be int, not numpy.int64
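The last line of the traceback says that an integer argument arrived as numpy.int64 rather than a Python int. The sketch below shows how that can happen, assuming the padding amounts are computed with NumPy (an assumption on my part; the function name square_padding and the image size are hypothetical, not taken from the repository). Casting with int() is one common workaround for this kind of TypeError, and is not necessarily the fix proposed in the issue.

```python
import numpy as np


def square_padding(h, w):
    """Return (left, right, top, bottom) padding that makes an h x w image square.

    Hypothetical helper: it assumes the difference is taken with NumPy,
    which is what turns the padding amounts into NumPy scalars.
    """
    dim_diff = np.abs(h - w)  # NumPy integer scalar, not a Python int
    pad1, pad2 = dim_diff // 2, dim_diff - dim_diff // 2
    # Pad top/bottom for landscape images, left/right for portrait ones.
    return (0, 0, pad1, pad2) if h <= w else (pad1, pad2, 0, 0)


raw = square_padding(480, 640)
print([type(p).__name__ for p in raw])

# Workaround sketch: convert every entry to a built-in int before
# passing the tuple to an API that insists on Python ints.
as_int = tuple(int(p) for p in raw)
print([type(p).__name__ for p in as_int])
```

NumPy integer scalars are not instances of Python's int, which is why a strict type check on the receiving side can reject them even though the values look identical.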
Applying the fix suggested in this issue gave the following instead.
$ python3 test.py --weights_path weights/yolov3.weights
Namespace(batch_size=8, class_path='data/coco.names', conf_thres=0.001, data_config='config/coco.data', img_size=416, iou_thres=0.5, model_def='config/yolov3.cfg', n_cpu=8, nms_thres=0.5, weights_path='weights/yolov3.weights')
Compute mAP...
Detecting objects: 0%| | 0/625 [00:00<?, ?it/s]Traceback (most recent call last):
File "test.py", line 98, in <module>
batch_size=8,
File "test.py", line 36, in evaluate
for batch_i, (_, imgs, targets) in enumerate(tqdm.tqdm(dataloader, desc="Detecting objects")):
File "/home/dluser/anaconda3/lib/python3.7/site-packages/tqdm/_tqdm.py", line 1022, in __iter__
for obj in iterable:
File "/home/dluser/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 336, in __next__
return self._process_next_batch(batch)
File "/home/dluser/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
File "/home/dluser/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/dluser/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/dluser/PyTorch-YOLOv3/utils/datasets.py", line 115, in __getitem__
x1 += pad[0]
RuntimeError: Expected object of type torch.DoubleTensor but found type torch.LongTensor for argument #4 'other'
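I looked into and tried various other things without success, and eventually arrived at the suspicion that this might be a PyTorch version problem.
So I updated PyTorch to the latest version with the command below. (Note: when you do this in your own environment, check your Python and CUDA versions first and install the build that matches them.)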
$ conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
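Before and after an upgrade like this, it can help to print what is actually installed. The following is a minimal sketch, not part of PyTorch-YOLOv3; the function name describe_environment is my own. It reports the Python version and, if PyTorch can be imported, its version and the CUDA version it was built against.

```python
import platform


def describe_environment():
    """Collect the Python version and, when available, PyTorch details."""
    info = {"python": platform.python_version(), "torch": None, "cuda_build": None}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this interpreter; report Python only.
        return info
    info["torch"] = torch.__version__
    # CUDA version the PyTorch binary was built with (None for CPU-only builds).
    info["cuda_build"] = torch.version.cuda
    return info


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

Note that cuda_build describes the PyTorch binary, not the driver installed on the machine, so it is only one of the things worth checking.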
After reverting the change I had made based on the issue, the test ran successfully, as shown below!
$ python3 test.py --weights_path weights/yolov3.weights --n_cpu 6
Namespace(batch_size=8, class_path='data/coco.names', conf_thres=0.001, data_config='config/coco.data', img_size=416, iou_thres=0.5, model_def='config/yolov3.cfg', n_cpu=6, nms_thres=0.5, weights_path='weights/yolov3.weights')
Compute mAP...
Detecting objects: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 625/625 [06:55<00:00, 1.51it/s]
Computing AP: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 80/80 [00:01<00:00, 55.33it/s]
Average Precisions:
+ Class '0' (person) - AP: 0.6907157868140823
+ Class '1' (bicycle) - AP: 0.46869626369935824
+ Class '2' (car) - AP: 0.5847854090492266
+ Class '3' (motorbike) - AP: 0.6173425471546101
+ Class '4' (aeroplane) - AP: 0.7368216071089109
+ Class '5' (bus) - AP: 0.7521598809811552
+ Class '6' (train) - AP: 0.754366135549987
+ Class '7' (truck) - AP: 0.4188454158138422
+ Class '8' (boat) - AP: 0.4055367418456285
+ Class '9' (traffic light) - AP: 0.4443524890561703
+ Class '10' (fire hydrant) - AP: 0.7803236133317674
+ Class '11' (stop sign) - AP: 0.7203250980406222
+ Class '12' (parking meter) - AP: 0.5318708513711929
+ Class '13' (bench) - AP: 0.3334771229094851
・・・
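Incidentally, the PyTorch version that ended up installed in my environment was 1.1.0.
Conclusion
This time I assumed things would just work because the Requirements did not list any versions, and I paid for that carelessness, but in the end I got PyTorch-YOLOv3 running.
From now on, when something mysteriously does not work, I want to start by checking the package versions.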
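As a rough illustration of that habit, the sketch below checks installed package versions against minimums before running anything. It uses the standard library's importlib.metadata, which requires Python 3.8 or later (newer than the Python 3.7 in the logs above). The helper names and the minimum of 1.1.0 for torch are illustrative assumptions: 1.1.0 is simply the version that worked for me, not a requirement stated by the repository.

```python
import re
from importlib import metadata  # standard library since Python 3.8


def numeric_version(text):
    """Extract the leading numeric components, e.g. '1.1.0+cu90' -> (1, 1, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def is_older(installed, minimum):
    """Compare two version strings numerically, padding the shorter with zeros."""
    a, b = numeric_version(installed), numeric_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def find_version_problems(minimums):
    """Return human-readable problems for missing or too-old packages."""
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (need >= {minimum})")
            continue
        if is_older(installed, minimum):
            problems.append(f"{name}: found {installed}, need >= {minimum}")
    return problems


# Hypothetical minimum, chosen only because 1.1.0 worked in this post.
for problem in find_version_problems({"torch": "1.1.0"}):
    print(problem)
```

This deliberately ignores pre-release tags and other version subtleties; a dedicated version-parsing library would be a better choice for anything beyond a quick sanity check.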