Problem with pointnet2_ops #19

Open
matthiasjaeger95 opened this issue Apr 4, 2023 · 8 comments

Comments

@matthiasjaeger95

Hello,

I want to train your network on my own dataset and ran into the following error:

Loaded compiled 3D CUDA chamfer distance
  0%|          | 0/87 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 165, in <module>
    train(config)
  File "train.py", line 104, in train
    pcds_pred = model(partial)
  File "/home/matthias/anaconda3/envs/spd/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "../models/model_completion.py", line 137, in forward
    feat = self.feat_extractor(point_cloud)
  File "/home/matthias/anaconda3/envs/spd/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "../models/model_completion.py", line 32, in forward
    l1_xyz, l1_points, idx1 = self.sa_module_1(l0_xyz, l0_points)  # (B, 3, 512), (B, 128, 512)
  File "/home/matthias/anaconda3/envs/spd/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "../models/utils.py", line 375, in forward
    new_xyz, new_points, idx, grouped_xyz = sample_and_group_knn(xyz, points, self.npoint, self.nsample, self.use_xyz, idx=idx)
  File "../models/utils.py", line 316, in sample_and_group_knn
    new_xyz = gather_operation(xyz, furthest_point_sample(xyz_flipped, npoint)) # (B, 3, npoint)
  File "/home/matthias/anaconda3/envs/spd/lib/python3.7/site-packages/pointnet2_ops-3.0.0-py3.7-linux-x86_64.egg/pointnet2_ops/pointnet2_utils.py", line 54, in forward
    out = _ext.furthest_point_sampling(xyz, npoint)
RuntimeError: false INTERNAL ASSERT FAILED at "pointnet2_ops/_ext-src/src/sampling.cpp":83, please report a bug to PyTorch. CPU not supported

It seems like there is a problem with the pointnet2_ops extension. I created the environment following your repo. My CUDA version is 11.4. Could you help me with that?

Thank you in advance.

@AllenXiangX
Owner

Hi,
it seems that you compiled the pointnet2_ops extension against a CPU-only build of PyTorch:
RuntimeError: false INTERNAL ASSERT FAILED at "pointnet2_ops/_ext-src/src/sampling.cpp":83, please report a bug to PyTorch. CPU not supported
Please switch to a CUDA build of PyTorch and compile the extension again.
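One way to check which kind of PyTorch build is installed before recompiling is sketched below. This is only a minimal diagnostic, not part of the repository: the helper name `describe_torch_build` is made up for illustration, and it relies only on the standard `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.device_count()` calls.

```python
def describe_torch_build():
    """Return a small report on the installed PyTorch build.

    Hypothetical helper for illustration; not part of SnowflakeNet.
    """
    try:
        import torch
    except ImportError:
        # Nothing to inspect if torch is missing from this environment.
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None on CPU-only builds, a string such as "11.0" on CUDA builds.
        "built_with_cuda": torch.version.cuda,
        # False when the driver or the GPU cannot be reached at runtime.
        "cuda_available": torch.cuda.is_available(),
        # Number of GPUs this process is allowed to see.
        "visible_gpu_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is `None`, the installed PyTorch is a CPU-only build. If it is set but `visible_gpu_count` is 0, the build is fine and the problem is more likely in device selection.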

@huyanbi

huyanbi commented Apr 24, 2023

@AllenXiangX Hello, I have also encountered this issue, and I can confirm that I am using the CUDA build of torch. It happened while training on point cloud completion:

(spd) root@I1230eab0b100201c25:/hy-tmp/SnowflakeNet/completion# python train.py --config ./configs/pcn_cd1.yaml
Loaded compiled 3D CUDA chamfer distance
  0%|          | 0/906 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 166, in <module>
    train(config)
  File "train.py", line 105, in train
    pcds_pred = model(partial)
  File "/usr/local/miniconda3/envs/spd/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "../models/model_completion.py", line 137, in forward
    feat = self.feat_extractor(point_cloud)
  File "/usr/local/miniconda3/envs/spd/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "../models/model_completion.py", line 32, in forward
    l1_xyz, l1_points, idx1 = self.sa_module_1(l0_xyz, l0_points)  # (B, 3, 512), (B, 128, 512)
  File "/usr/local/miniconda3/envs/spd/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "../models/utils.py", line 375, in forward
    new_xyz, new_points, idx, grouped_xyz = sample_and_group_knn(xyz, points, self.npoint, self.nsample, self.use_xyz, idx=idx)
  File "../models/utils.py", line 316, in sample_and_group_knn
    new_xyz = gather_operation(xyz, furthest_point_sample(xyz_flipped, npoint)) # (B, 3, npoint)
  File "/usr/local/miniconda3/envs/spd/lib/python3.7/site-packages/pointnet2_ops-3.0.0-py3.7-linux-x86_64.egg/pointnet2_ops/pointnet2_utils.py", line 54, in forward
    out = _ext.furthest_point_sampling(xyz, npoint)
RuntimeError: false INTERNAL ASSERT FAILED at "pointnet2_ops/_ext-src/src/sampling.cpp":83, please report a bug to PyTorch. CPU not supported

@huyanbi

huyanbi commented Apr 24, 2023

@AllenXiangX I compiled the extension again, and the output is as follows:

(spd) root@I1230eab0b100201c25:/hy-tmp/SnowflakeNet/models/pointnet2_ops_lib# python setup.py install
running install
/usr/local/miniconda3/envs/spd/lib/python3.7/site-packages/setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
setuptools.SetuptoolsDeprecationWarning,
/usr/local/miniconda3/envs/spd/lib/python3.7/site-packages/setuptools/command/easy_install.py:147: EasyInstallDeprecationWarning: easy_install command is deprecated. Use build and pip and other standards-based tools.
EasyInstallDeprecationWarning,
running bdist_egg
running egg_info
writing pointnet2_ops.egg-info/PKG-INFO
writing dependency_links to pointnet2_ops.egg-info/dependency_links.txt
writing requirements to pointnet2_ops.egg-info/requires.txt
writing top-level names to pointnet2_ops.egg-info/top_level.txt
/usr/local/miniconda3/envs/spd/lib/python3.7/site-packages/torch/utils/cpp_extension.py:352: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
reading manifest file 'pointnet2_ops.egg-info/SOURCES.txt'
writing manifest file 'pointnet2_ops.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/pointnet2_ops
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/__init__.py -> build/bdist.linux-x86_64/egg/pointnet2_ops
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_version.py -> build/bdist.linux-x86_64/egg/pointnet2_ops
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/pointnet2_modules.py -> build/bdist.linux-x86_64/egg/pointnet2_ops
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/pointnet2_utils.py -> build/bdist.linux-x86_64/egg/pointnet2_ops
creating build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src
creating build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src/src
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_ext-src/src/ball_query.cpp -> build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src/src
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_ext-src/src/ball_query_gpu.cu -> build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src/src
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_ext-src/src/bindings.cpp -> build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src/src
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_ext-src/src/group_points.cpp -> build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src/src
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_ext-src/src/group_points_gpu.cu -> build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src/src
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_ext-src/src/interpolate.cpp -> build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src/src
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_ext-src/src/interpolate_gpu.cu -> build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src/src
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_ext-src/src/sampling.cpp -> build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src/src
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_ext-src/src/sampling_gpu.cu -> build/bdist.linux-x86_64/egg/pointnet2_ops/_ext-src/src
copying build/lib.linux-x86_64-cpython-37/pointnet2_ops/_ext.cpython-37m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/pointnet2_ops
byte-compiling build/bdist.linux-x86_64/egg/pointnet2_ops/__init__.py to __init__.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/pointnet2_ops/_version.py to _version.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/pointnet2_ops/pointnet2_modules.py to pointnet2_modules.cpython-37.pyc
byte-compiling build/bdist.linux-x86_64/egg/pointnet2_ops/pointnet2_utils.py to pointnet2_utils.cpython-37.pyc
creating stub loader for pointnet2_ops/_ext.cpython-37m-x86_64-linux-gnu.so
byte-compiling build/bdist.linux-x86_64/egg/pointnet2_ops/_ext.py to _ext.cpython-37.pyc
creating build/bdist.linux-x86_64/egg/EGG-INFO
copying pointnet2_ops.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO
copying pointnet2_ops.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying pointnet2_ops.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying pointnet2_ops.egg-info/requires.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
copying pointnet2_ops.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO
writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt
zip_safe flag not set; analyzing archive contents...
pointnet2_ops.__pycache__._ext.cpython-37: module references __file__
pointnet2_ops.__pycache__.pointnet2_utils.cpython-37: module references __file__
creating 'dist/pointnet2_ops-3.0.0-py3.7-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it
removing 'build/bdist.linux-x86_64/egg' (and everything under it)
Processing pointnet2_ops-3.0.0-py3.7-linux-x86_64.egg
removing '/usr/local/miniconda3/envs/spd/lib/python3.7/site-packages/pointnet2_ops-3.0.0-py3.7-linux-x86_64.egg' (and everything under it)
creating /usr/local/miniconda3/envs/spd/lib/python3.7/site-packages/pointnet2_ops-3.0.0-py3.7-linux-x86_64.egg
Extracting pointnet2_ops-3.0.0-py3.7-linux-x86_64.egg to /usr/local/miniconda3/envs/spd/lib/python3.7/site-packages
pointnet2-ops 3.0.0 is already the active version in easy-install.pth

Installed /usr/local/miniconda3/envs/spd/lib/python3.7/site-packages/pointnet2_ops-3.0.0-py3.7-linux-x86_64.egg
Processing dependencies for pointnet2-ops==3.0.0
Searching for torch==1.7.1+cu110
Best match: torch 1.7.1+cu110
Adding torch 1.7.1+cu110 to easy-install.pth file
Installing convert-caffe2-to-onnx script to /usr/local/miniconda3/envs/spd/bin
Installing convert-onnx-to-caffe2 script to /usr/local/miniconda3/envs/spd/bin

Using /usr/local/miniconda3/envs/spd/lib/python3.7/site-packages
Searching for numpy==1.21.6
Best match: numpy 1.21.6
Adding numpy 1.21.6 to easy-install.pth file
Installing f2py script to /usr/local/miniconda3/envs/spd/bin
Installing f2py3 script to /usr/local/miniconda3/envs/spd/bin
Installing f2py3.7 script to /usr/local/miniconda3/envs/spd/bin

Using /usr/local/miniconda3/envs/spd/lib/python3.7/site-packages
Searching for typing-extensions==4.5.0
Best match: typing-extensions 4.5.0
Adding typing-extensions 4.5.0 to easy-install.pth file

Using /usr/local/miniconda3/envs/spd/lib/python3.7/site-packages
Finished processing dependencies for pointnet2-ops==3.0.0

As the output above shows, the CUDA version compiled successfully.

@AllenXiangX
Owner

Perhaps some of the tensors or operations are on the CPU. Please try manually specifying the GPUs before running the training script, as follows:
export CUDA_VISIBLE_DEVICES='2'
python train.py --config ./configs/pcn_cd1.yaml
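To see what a given CUDA_VISIBLE_DEVICES value leaves visible, here is a simplified sketch of the selection rule. It is an illustration only: `visible_devices` is a hypothetical helper, it handles integer indices only (no GPU UUIDs), and it assumes the behaviour described in the CUDA documentation, where enumeration stops at the first invalid index.

```python
def visible_devices(env_value, physical_count):
    """Physical GPU indices left visible by a CUDA_VISIBLE_DEVICES value.

    Hypothetical helper, simplified to integer indices. The position in the
    returned list is the index the process uses (cuda:0, cuda:1, ...).
    """
    if env_value is None:
        # Variable unset: every physical GPU is visible.
        return list(range(physical_count))
    visible = []
    for token in env_value.split(","):
        token = token.strip()
        # Stop at the first entry that is not a valid physical index.
        if not token.isdigit() or int(token) >= physical_count:
            break
        visible.append(int(token))
    return visible


print(visible_devices("2", 4))  # four-GPU machine
print(visible_devices("2", 1))  # single-GPU machine
```

Under these assumptions, '2' on a machine with a single GPU leaves no device visible at all, and on a four-GPU machine the one surviving card is addressed as index 0 inside the process.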

@huyanbi

huyanbi commented May 2, 2023

@AllenXiangX Okay, I'll try

@huyanbi

huyanbi commented May 3, 2023

@AllenXiangX Hello, after trying that, I found that it didn't solve the problem.

@sjYoondeltar

I saw the same error message. In my case, the fix was in the config YAML file: changing train:gpus:[2] to train:gpus:[0] resolved the error.

@w1hao

w1hao commented Oct 24, 2024

I had this problem as well and it has been resolved.
Change the ‘gpu: [2]’ entry in the YAML file to the graphics card you want to use. My computer has only one GPU, so I changed it to ‘gpu: [0]’ and the problem was solved!
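A small guard along the following lines could catch this mismatch before training starts. It is a sketch under assumptions: `check_gpu_ids` is a hypothetical function, not part of the repository, and the exact name of the GPU list key in the config may differ from what is quoted in this thread.

```python
def check_gpu_ids(requested, available_count):
    """Fail early if the config asks for a GPU index that does not exist.

    Hypothetical helper: `requested` is the GPU list read from the YAML
    config, `available_count` is how many GPUs the process can see.
    """
    missing = [i for i in requested if not 0 <= i < available_count]
    if missing:
        raise ValueError(
            f"config requests GPU(s) {missing}, but only "
            f"{available_count} device(s) are visible "
            f"(valid indices: {list(range(available_count))})"
        )
    return list(requested)


# On a single-GPU machine, [0] passes and [2] is rejected.
print(check_gpu_ids([0], 1))
try:
    check_gpu_ids([2], 1)
except ValueError as err:
    print(err)
```

In a real run, `available_count` would typically come from `torch.cuda.device_count()`.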
