deploy to DAS-5 #24

Open
KiaraGrouwstra opened this issue Feb 3, 2020 · 7 comments

Comments


KiaraGrouwstra commented Feb 3, 2020

DAS-5 is the machine made available to me for running research by VU.
the box seems to offer slurm modules (not ones I want, see module avail) + yum.
using the latter I managed to get Stack running there:

# install stack
mkdir -p ~/.local/bin
wget https://get.haskellstack.org/stable/linux-x86_64-static.tar.gz
tar xvf linux-x86_64-static.tar.gz
# the extracted directory name includes the stack version (2.1.3 at the time), so glob it
mv stack-*-linux-x86_64-static/stack ~/.local/bin/stack
rm -rf stack-*-linux-x86_64-static/
rm linux-x86_64-static.tar.gz

# run job
touch job.sh
vim job.sh
#######
#!/bin/bash
#SBATCH -t 01:00:00
#SBATCH -n 1
#SBATCH --job-name kiara
#SBATCH -o /home/tga340/kiara.log
cd /home/tga340/synthesis
stack exec -- synthesis
########

sbatch job.sh
squeue -u tga340
vim ~/kiara.log
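
The job script above is typed into vim by hand. A minimal sketch of writing it from the shell instead, with the same SBATCH settings. The helper name `write_job` and its arguments are hypothetical, and the `PATH` line is an assumption: it presumes stack sits in `~/.local/bin` as installed above and that the batch job may not pick that directory up on its own.

```shell
# write_job OUT_FILE WORK_DIR LOG_FILE
# Writes a Slurm batch script equivalent to the hand-edited job.sh.
# (hypothetical helper, not part of the repo)
write_job() {
  job_out=$1
  job_workdir=$2
  job_log=$3
  cat > "$job_out" <<EOF
#!/bin/bash
#SBATCH -t 01:00:00
#SBATCH -n 1
#SBATCH --job-name kiara
#SBATCH -o $job_log
# assumption: make the user-level stack install visible inside the job
export PATH="\$HOME/.local/bin:\$PATH"
cd $job_workdir || exit 1
stack exec -- synthesis
EOF
}

# usage (on the cluster):
#   write_job job.sh /home/tga340/synthesis /home/tga340/kiara.log
#   sbatch job.sh && squeue -u tga340
```

The `cd ... || exit 1` makes the job stop early if the checkout is missing, rather than running stack in the wrong directory.
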

KiaraGrouwstra commented Feb 3, 2020

I'm not getting result / log files from my Slurm job now. investigating.

edit: ran outta my 4 GB user space while compiling -- DAS-5 support now bumped me to 10 GB. :)
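
Since the job died on the quota without leaving a log, checking usage before submitting could save a round trip. A rough sketch: the helper names are made up here, the 4 GB figure is the limit mentioned above, and `du` only approximates what the cluster's quota system actually counts.

```shell
# usage_kb DIR: disk usage of DIR in KiB (du -sk is POSIX)
usage_kb() {
  du -sk "$1" | awk '{print $1}'
}

# over_quota USED_KB LIMIT_GB: succeeds (exit 0) when usage exceeds the limit
over_quota() {
  [ "$1" -gt $(( $2 * 1024 * 1024 )) ]
}

# usage:
#   if over_quota "$(usage_kb "$HOME")" 4; then
#     echo "home directory is over 4 GB; builds may fail" >&2
#   fi
#   usage_kb ~/.stack      # the stack cache is a likely place to look first
```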


KiaraGrouwstra commented Feb 13, 2020

tensorflow/haskell can be installed via Docker, Nix, or locally, tho I'm feeling hard-pressed to succeed in any of these without sudo rights on DAS...

the local install seems to have failed due to disk quota again tho!

the HaskTorch install instead seems to fail cuz it needs gcc 8: the ones on DAS are older (default 6, 7 available), while my local one is too new (9)!
I could maybe try gcc binaries.
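
A small sketch for checking which gcc a machine offers against the required major version before kicking off a long build. The required version 8 comes from the comment above; the helper names are hypothetical.

```shell
# major_of VERSION: leading component of a dotted version, e.g. 8.3.0 -> 8
major_of() {
  printf '%s\n' "${1%%.*}"
}

# gcc_matches REQUIRED_MAJOR [GCC_BINARY]
# Succeeds when the given gcc (default: gcc on PATH) has that major version.
gcc_matches() {
  gcc_ver=$("${2:-gcc}" -dumpversion 2>/dev/null) || return 1
  [ "$(major_of "$gcc_ver")" = "$1" ]
}

# usage:
#   gcc_matches 8 || echo "need gcc 8, found: $(gcc -dumpversion)"
#   gcc_matches 8 gcc-8     # try a versioned binary, if one is installed
```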


KiaraGrouwstra commented Feb 18, 2020

failing a proper compiler stack, I could move over:

  • compiler binaries
  • a compiled hasktorch/TF
  • a binary for my own application - just tried this for an executable as generated using my current Nix+Cabal setup:
scp ./dist-newstyle/build/x86_64-linux/ghc-8.6.5/synthesis-0.0.0/x/generator/build/generator/generator vu:/home/tga340/generator
ssh vu
scp ./generator das:/home/tga340/generator
ssh das
./generator
# -bash: ./generator: /nix/store/6yaj6n8l925xxfbcd65gzqx3dz7idrnn-glibc-2.27/lib/ld-linux-x86-64.so.2: bad ELF interpreter: No such file or directory

so... I guess by default the binaries use dynamic linking, and since this one was built with Nix, its ELF interpreter points into /nix/store, which doesn't exist on DAS. obviously that is not portable.
honestly I think this pretty much rules out university-provided compute resources for my software stack. I could raise a support request for adding e.g. Nix, but user-level installs were already tricky there and their module-based ecosystem seems super conservative, while I'm sort of on the edge, so that doesn't seem super likely.
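
The error above can be spotted before copying anything over by looking at the interpreter recorded in the binary. A sketch, assuming `readelf` from binutils is available on the build machine; the helper names are hypothetical.

```shell
# interp_of BINARY: print the ELF interpreter path recorded in the binary.
# readelf -l prints a line like:
#   [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
interp_of() {
  readelf -l "$1" | sed -n 's/.*Requesting program interpreter: \(.*\)\]$/\1/p'
}

# is_nix_path PATH: succeeds when PATH points into the Nix store
is_nix_path() {
  case $1 in
    /nix/store/*) return 0 ;;
    *) return 1 ;;
  esac
}

# usage:
#   if is_nix_path "$(interp_of ./generator)"; then
#     echo "binary expects a Nix store loader; it won't start on a host without /nix"
#   fi
```

Rewriting only the interpreter (e.g. with a tool like patchelf) would probably not be enough on its own, since the shared libraries the binary loads presumably live in the Nix store as well.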

I'll probably end up at Google Cloud somehow.
In which case, I might as well close this.

@KiaraGrouwstra

apparently the ld-options field in the cabal file can be used to get static linking (source), so there may be a way around this...
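
Whichever build option ends up being used, the result can be checked with `file` before copying the binary to DAS. A sketch that classifies a line of `file` output; the helper name is made up, and the matched phrases are the ones `file` prints for ELF executables as far as I know.

```shell
# linkage_of DESCRIPTION: classify one line of `file` output
linkage_of() {
  case $1 in
    *"statically linked"*|*"static-pie linked"*) echo static ;;
    *"dynamically linked"*) echo dynamic ;;
    *) echo unknown ;;
  esac
}

# usage:
#   linkage_of "$(file -b ./generator)"
```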

@KiaraGrouwstra

see NixOS/nixpkgs#43795 for options...


KiaraGrouwstra commented Apr 25, 2020

Google Cloud attempts:

  • colab using nix: setup is taking a bit, and I fear if it disconnects you're kinda stuck waiting for an hour again. this can be sorta alleviated while the browser is active fwiw:
// paste into the browser console; clicks the connect button every 60 seconds
function ClickConnect() {
  console.log("Working");
  document.querySelector("colab-toolbar-button#connect").click();
}
setInterval(ClickConnect, 60000);
  • nixops to deploy to GCE: just import the default.nix into your instance configuration and deploy it. simple deployment; for services that don’t need lots of sharing of the nixops statefile, it should be fairly straight-forward to deploy a local project and have it just work.
  • GKE by Dockerfile: I don't currently have a Dockerfile for my own repo, see reproducibility #16.
  • GCE from Docker Hub hub.docker.com/r/nixorg/nix: it didn't like this as an image address (html)
  • GCE from Nixery (image nixery.dev/shell/git, command bash): while tutorial command docker run -ti nixery.dev/shell/git bash worked, as a GCE image this just got stuck in restarting state.
  • nixos-generators: couldn't even log in to this VM
# build a NixOS image for Google Compute Engine
nix-env -f https://github.com/nix-community/nixos-generators/archive/master.tar.gz -i
nixos-generate -f gce
# upload the .tar.gz to a GCP Cloud Storage bucket
# create a Compute Engine image from the .tar.gz in your bucket
# create a Compute Engine VM instance from your image
  • build nix stuff on GCE from an Ubuntu VM
    • in my local region
    • with a GPU (the cheaper T4, available in some sub-regions tho not others)
    • with Ubuntu (preferably 18.04)
    • on a persistent disk big enough not to fail on build (10 -> 50 GB)
# start your VM and SSH into it using the `gcloud` command-line tool
git clone https://github.com/tycho01/synthesis.git
cd synthesis
# then follow readme instructions

given enough resources (50 GB disk, 8 GB RAM), that seemed to work.
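
The Ubuntu VM from the last bullet can also be described as a `gcloud` command instead of clicking through the console. A sketch that only prints the command so it can be reviewed first. The helper name, the default machine type, and the example zone are assumptions, and the Ubuntu 18.04 image family is the name as I know it from that time, which may no longer be offered.

```shell
# gce_create_cmd NAME ZONE [MACHINE_TYPE]
# Prints (does not run) a gcloud command for an Ubuntu 18.04 VM with one T4 GPU
# and a 50 GB boot disk. The default machine type is a placeholder assumption.
gce_create_cmd() {
  printf '%s\n' "gcloud compute instances create $1 \
--zone $2 \
--machine-type ${3:-n1-standard-2} \
--accelerator type=nvidia-tesla-t4,count=1 \
--maintenance-policy TERMINATE \
--image-family ubuntu-1804-lts --image-project ubuntu-os-cloud \
--boot-disk-size 50GB"
}

# usage (zone is only an example; pick one that actually offers T4s):
#   gce_create_cmd synthesis europe-west4-b
#   gce_create_cmd synthesis europe-west4-b | sh     # run it once it looks right
```

GPU instances cannot live-migrate, which is why the maintenance policy is set to TERMINATE.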


KiaraGrouwstra commented Apr 27, 2020

Junji:

I've realized we should use AWS for nixos.
https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/virtualisation/ec2-amis.nix
NixOS 20.03 is supported on AWS.
But GCE does not support the latest NixOS.
The latest NixOS version available for GCE is still 18.09.
https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/virtualisation/gce-images.nix
He uses Terraform to create an image on GCE.
The base image is 18.09
https://kylesferrazza.com/posts/nixops-gce/

I've now done AWS signup.
Apparently that may take a day to activate.

From Amsterdam, Frankfurt (eu-central-1) looks closest (+ not suffering Brexit).
So for NixOS 20.03 I'll need AMI ami-0a1a94722dcbff94c.

instance types:

  • compute optimized: C. I don't need extra letters (C6g/C5a/C5n) so C5 seems viable there, smallest being c5.large, priced at $0.097/h.
  • general purpose has A (arm), T (burstable), and M (actual general purpose?). I don't need extra letters (M6g/M5a/M5n) so M5 seems viable there, smallest being m5.large.
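
With region, AMI and instance type picked, the launch itself can be written down as an AWS CLI call. A sketch that only prints the command. The helper names are hypothetical, and the key pair name in the usage line is an assumption based on the kiara.pem file used for ssh. The instance's security group also has to allow inbound SSH for the login to work.

```shell
# ec2_launch_cmd INSTANCE_TYPE KEY_NAME
# Prints (does not run) an AWS CLI command that starts one NixOS 20.03
# instance in eu-central-1 from the AMI named above.
ec2_launch_cmd() {
  printf '%s\n' "aws ec2 run-instances \
--region eu-central-1 \
--image-id ami-0a1a94722dcbff94c \
--instance-type $1 \
--key-name $2 \
--count 1"
}

# run_cost PRICE_PER_HOUR HOURS: rough on-demand cost in USD
run_cost() {
  awk -v p="$1" -v h="$2" 'BEGIN { printf "%.2f\n", p * h }'
}

# usage:
#   ec2_launch_cmd c5.large kiara
#   run_cost 0.097 24      # a day of c5.large at the price quoted above
```
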
ssh -i ~/Downloads/kiara.pem root@ec2-35-158-37-55.eu-central-1.compute.amazonaws.com
nix-channel --update
nix-env -iA nixos.git nixos.vim nixos.cabal-install
cabal update
vim ~/.cabal/config
# set:  ghc-options: -fconstraint-solver-iterations=8
nix-env -iA cachix -f https://cachix.org/api/v1/install
cachix use tycho01
vim /etc/nixos/configuration.nix
# add ./cachix.nix to the imports list
sudo nixos-rebuild switch
git clone https://github.com/tycho01/synthesis.git
cd synthesis/
nix-shell
hpack --force
