This repository has been archived by the owner on Jul 26, 2022. It is now read-only.

build cherrypi on windows : cmake command fail #2

Open
koalarun opened this issue Nov 27, 2018 · 16 comments

Comments

@koalarun

koalarun commented Nov 27, 2018

I ran the full command from the installation guide:

cmake .. -DMSVC=true -DZMQ_LIBRARY="../3rdparty/zmq.lib" -DZMQ_INCLUDE_DIR="../3rdparty/libzmq/include" -DGFLAGS_LIBRARY="../3rdparty/gflags_static.lib" -DGFLAGS_INCLUDE_DIR="../3rdparty/gflags/build/include" -DGLOG_ROOT_DIR="../3rdparty/glog" -DCMAKE_CXX_FLAGS_RELEASE="/MP /EHsc" -G "Visual Studio 15 2017 Win64"

and the error happened:

CMake Error in common/CMakeLists.txt:
  Imported target "Torch" includes non-existent path
    "D:/StarCraftAI/TorchCraft/TorchCraftAI/3rdparty/pytorch/torch/lib/tmp_install/include/THC"
  in its INTERFACE_INCLUDE_DIRECTORIES.


So the next command, "msbuild CherryPi.sln /property:Configuration=Release /m", also failed.

@ebetica
Contributor

ebetica commented Nov 27, 2018

This suggests to me that you somehow built PyTorch without CUDA support, while TorchCraftAI thinks you do have CUDA. I wonder if there's something strange with your CUDA installation? Alternatively, try to make sure PyTorch builds with CUDA support.
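
One way to test this hypothesis is to look at the include directory named in the CMake error. The sketch below is illustrative only: the helper name `missing_include_dirs` is hypothetical, and treating a missing `THC` directory as a sign of a CPU-only PyTorch build is an assumption that follows the reasoning above, not something the build system guarantees.

```python
from pathlib import Path


def missing_include_dirs(include_root, subdirs=("TH", "THC")):
    """Return the expected include subdirectories that do not exist.

    include_root is the PyTorch install include directory, for example the
    .../torch/lib/tmp_install/include path from the CMake error.
    The default subdirs are an assumption for illustration: TH is taken to
    hold the CPU headers and THC the CUDA headers.
    """
    root = Path(include_root)
    return [name for name in subdirs if not (root / name).is_dir()]
```

Calling `missing_include_dirs(...)` with the include path from the error message and getting `["THC"]` back would be consistent with a build that skipped CUDA.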

@koalarun
Author

koalarun commented Nov 28, 2018

@ebetica
I have downloaded and installed CUDA 9.2, and I used the command from the installation guide:

conda install numpy pyyaml mkl mkl-include setuptools cmake cffi typing

then changed into the pytorch directory and ran

python setup.py build

After a long time, it finished successfully without errors. (I installed CUDA 10.0 the first time, but PyTorch failed to build with it.)

Did I make a mistake somewhere?

@koalarun
Author

koalarun commented Nov 28, 2018

I installed Patch 1 of CUDA 9.2 and rebuilt PyTorch, and this time it failed.
I collected some of the errors from the output:

Library mkl_intel_lp64: not found

Library mkl_intel: not found

CMake Error at cmake/public/cuda.cmake:123 (file): file failed to open for reading (No such file or directory):  \=//cudnn.h

Phew, this is hard work. I hope a release package that is easy to install becomes available.

@ebetica
Contributor

ebetica commented Nov 28, 2018

If you can give me your full PyTorch and TorchCraftAI build logs, I might be able to help in more depth.

The second error you pasted suggests to me that you don't have Anaconda in your environment, since mkl was installed by the conda install ... mkl ... command above.
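
A rough way to check this from the same prompt that runs the build is sketched below. The helper names are hypothetical, and the header locations are assumptions: conda packages on Windows usually place headers under `Library/include`, while other platforms use `include`.

```python
import shutil
import sys
from pathlib import Path


def conda_on_path():
    """Return the path of the conda executable found on PATH, or None."""
    return shutil.which("conda")


def find_mkl_header(prefix=sys.prefix):
    """Return the first mkl.h found under the given environment prefix.

    The candidate locations are assumptions about where the mkl-include
    package puts its headers; adjust them if your layout differs.
    """
    candidates = [
        Path(prefix) / "Library" / "include" / "mkl.h",
        Path(prefix) / "include" / "mkl.h",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

If `conda_on_path()` returns `None`, or `find_mkl_header()` finds nothing under the active Python prefix, the build is probably not running inside the conda environment where mkl and mkl-include were installed.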

@koalarun
Author

koalarun commented Dec 2, 2018

@ebetica
I reinstalled Windows 10 and tried again with CUDA 9.2, and also installed Patch 1 of CUDA 9.2.
It failed again with the same error as above: mkl not found.
I think the problem is that not every CUDA version is suitable for building the PyTorch bundled with TorchCraftAI.
Could you tell me which CUDA version you installed to build PyTorch?

@bwangll

bwangll commented Dec 6, 2018

Have you managed to build this environment? I have run into a lot of problems that I cannot solve.

@dexterju27

Hey, for the Windows build we use Windows 10 with CUDA 9.2. Whether or not the patches are installed shouldn't matter.
Did you also install the Anaconda environment? That is where conda install and Python support come from.

@bwangll

bwangll commented Dec 6, 2018

@dexterju I did not install CUDA. Is this the main reason for my PyTorch error? The error it reports is 'tools\build_pytorch_libs.bat --use-fbgemm --use-nnpack --use-mkldnn --use-qnpack caffe2'

@dexterju27

Do you have Anaconda installed, and did you run the conda install command as described in the tutorial?

By the way, what you pasted here is not the error itself; it only tells you that this bat file failed to run. The build should print more information on why it failed. Do you have that output?
If not, you can go into the script, set all the environment variables it expects, and run this line manually to see what is triggering the failure.
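
The manual re-run described above can be sketched as follows. The helper name `run_and_tail` and its parameters are hypothetical; the idea is only to run one command with extra environment variables and keep the end of its combined output, where the real error usually appears.

```python
import os
import subprocess


def run_and_tail(cmd, extra_env=None, tail=20):
    """Run cmd and return (exit_code, last `tail` lines of combined output).

    extra_env is merged over the current environment so the command sees
    the variables the build script would normally set.
    """
    env = dict(os.environ)
    env.update(extra_env or {})
    result = subprocess.run(
        cmd,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        universal_newlines=True,
    )
    lines = result.stdout.splitlines()
    return result.returncode, (lines[-tail:] if tail > 0 else [])
```

Passing the failing command as a list of arguments, plus any variables the script expects in `extra_env`, returns a non-zero exit code together with the last lines of output to paste into the issue.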

@ebetica
Contributor

ebetica commented Dec 6, 2018

@koalarun MKL is the Intel library for fast CPU numerical evaluations. It should not be affected by which version of CUDA you run.

Actually, I think the easiest way to set things up is to compile TorchCraftAI on a Linux machine. You can either run OpenBW with a 4.20 bot, or use a Windows VM to run StarCraft. We do not really use Windows for development, so we don't know the kinks of its build process as well as we know the Linux setup process.

@koalarun
Author

koalarun commented Dec 8, 2018

@dexterju @ebetica thank you for your replies. Today I looked at the PyTorch build error carefully, and I found that the CMake error occurs while building caffe2:

 "CMake Error at cmake/public/cuda.cmake:123 (file):
  file failed to open for reading (No such file or directory):"

I opened the cuda.cmake file; lines 120-123 are:

if(CAFFE2_USE_CUDNN)
  # Get cuDNN version
  file(READ ${CUDNN_INCLUDE_DIR}/cudnn.h CUDNN_HEADER_CONTENTS)

I suspect the logic in cuda.cmake that decides whether to use cuDNN may be wrong:

if(NOT CUDNN_FOUND)
  message(WARNING
    "Caffe2: Cannot find cuDNN library. Turning the option off")
  set(CAFFE2_USE_CUDNN OFF)
else()
  set(CAFFE2_USE_CUDNN ON)
endif()

The variable CUDNN_FOUND is not defined, and it appears only here. So CAFFE2_USE_CUDNN is always ON.
I have installed CUDA, but I could not find cudnn.h anywhere on my computer. Maybe I need to install cuDNN manually.
So I used Anaconda to install cuDNN:
conda install -c anaconda cudnn
and copied the bin, include, and lib files into the CUDA install directory.
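
Before starting another long build, it may help to confirm that the header is now where the build will look for it. The sketch below mimics the header read in cuda.cmake, but returns `None` instead of failing when the file is missing. The helper name is hypothetical, and the macro names `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` are an assumption based on cuDNN headers of this era; other releases may keep them elsewhere.

```python
import re
from pathlib import Path


def read_cudnn_version(include_dir):
    """Return (major, minor, patch) parsed from include_dir/cudnn.h.

    Returns None when the header is missing or lacks the version macros,
    rather than failing the way an unguarded file read does.
    """
    header = Path(include_dir) / "cudnn.h"
    if not header.is_file():
        return None
    text = header.read_text(errors="ignore")
    version = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#\s*define\s+" + macro + r"\s+(\d+)"
        match = re.search(pattern, text, re.MULTILINE)
        if match is None:
            return None
        version.append(int(match.group(1)))
    return tuple(version)
```

Pointing `read_cudnn_version` at the include folder of the CUDA install directory should return a version tuple once the files have been copied; `None` means the build would likely hit the same read error again.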

Now I am running
python setup.py build
It's running without errors so far... I hope it succeeds.


Phew... The newest version of VS2017 is not compatible with CUDA 9.2, so I uninstalled CUDA 9.2, installed CUDA 10.0, and rebuilt again.


Oh, another new error....

  Error : Internal Compiler error (codegen): "there was an error in verifying the lgenfe output!" [D:\StarCraftAI\TorchCraft\TorchCraftAI\3rdparty\pytorch\build\caffe2\caffe2_gpu.vcxproj]

and there is no more information...

@bwangll

bwangll commented Dec 8, 2018

@koalarun I got the same error as you.

@ebetica
Contributor

ebetica commented Dec 10, 2018

@koalarun I feel that you have some leftover build files from the last run. Try python setup.py clean before you rebuild.
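
CMake keeps previously detected settings, such as toolkit paths, in `CMakeCache.txt` files, so a stale cache is one kind of leftover that can survive a switch between CUDA versions. The sketch below is a hypothetical helper for listing and deleting those caches under a source tree; whether `python setup.py clean` already removes all of them is not verified here.

```python
from pathlib import Path


def find_cmake_caches(root):
    """Return every CMakeCache.txt found under root, sorted by path."""
    return sorted(Path(root).rglob("CMakeCache.txt"))


def remove_cmake_caches(root):
    """Delete the cache files found under root and return how many were removed."""
    caches = find_cmake_caches(root)
    for cache in caches:
        cache.unlink()
    return len(caches)
```

Running `find_cmake_caches` on the pytorch checkout first shows what would be deleted before `remove_cmake_caches` is called.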

@koalarun
Author

I deleted the build directory and ran 'python setup.py clean', but it didn't help; the error is still "there was an error in verifying the lgenfe output!".

@ebetica
Contributor

ebetica commented Dec 12, 2018

Could this explain the issue? pytorch/pytorch#12117

I'm not exactly sure what the fix is, since we don't observe it on our machines... It sounds like 9.2 + older VS2017 is the way to go?

@koalarun
Author

Maybe I'll wait for the CUDA 10.1 release...
