This is a simple watcher that collects resource usage during a Singularity pull of several containers: ubuntu, busybox, centos, alpine, and nginx. I chose these fairly randomly. The goal is to create plots, taking a measurement each second, and to ask the following questions:
- Is running on a head node as bad an idea as we expect it to be?
- Is there varying performance based on the amount of memory available?
For the first point, it would be more accurate to look at a collection of head nodes at different times of day (for example, it's a Saturday morning, and unlikely to be busy now). For the second point, the extent to which the software takes advantage of available memory depends on the software itself. Singularity (I think) should do a fairly good job at this.
This is a fairly simple analysis: I could install watchme, write a few quick scripts, run them, and be done!
- pull.sh is the script I ran on the head node to pull the 5 containers (5 times each)
- pull-job.sh is the script I submitted to different nodes with varying memory (also 5 times each)
- export.sh is a small script to export the data from the .git repository.
Note that since the cluster runs were done in parallel, watchme saved files directly to the data folder. We can't use git as a temporal database here, because multiple jobs trying to write and commit at the same time would likely run into conflicts. I realize this is a drawback of the temporal database approach, but it's also why it's reasonable to run watchme on its own and just save each result to a file.
Specifically, to install watchme:
$ pip install watchme[all]
You can also clone and install from the master branch directly:
$ git clone https://www.github.com/vsoch/watchme
$ cd watchme
$ pip install .[all] --user
And then I created a watcher folder (this repo).
$ watchme create singularity-pull
The script I ran on the head node, pull.sh, looked like this:
# outdir is the folder where results are written (assumed here to be data)
outdir=${outdir:-data}

for iter in 1 2 3 4 5; do
    for name in ubuntu busybox centos alpine nginx; do
        echo "Running $name iteration $iter..."
        output=${outdir}/$name-iter-${iter}.json
        watchme monitor singularity pull --force docker://$name --seconds 1 > ${output}
    done
done
If I wanted to use watchme as a temporal database, I could have done this:
for iter in 1 2 3 4 5; do
    for name in ubuntu busybox centos alpine nginx; do
        echo "Running $name iteration $iter..."
        watchme monitor singularity-pull singularity pull --force docker://$name --name $name-$iter --seconds 1
    done
done
Notice how we are saving directly to the watcher, so we can easily export the results later with export.sh. The data from this export is in the export folder.
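The actual export.sh is included in the repository; the sketch below only approximates the idea. It assumes the watchme export subcommand takes the watcher name, a task name, and the file to export from the watcher's git history, that the task names mirror the --name values used above, and that the result filename result.json is a placeholder (check watchme export --help for the exact arguments and the real task names watchme assigns).

# Sketch of an export loop (assumptions noted above): write one exported
# file per container and iteration into the export folder
mkdir -p export
for iter in 1 2 3 4 5; do
    for name in ubuntu busybox centos alpine nginx; do
        watchme export singularity-pull ${name}-${iter} result.json > export/${name}-${iter}.json
    done
done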
We used the last section of the pull.sh script to launch a number of jobs on the Sherlock cluster, each performing a pull:
# Next, we can run this on nodes with different memory. Since git doesn't
# do well with running in parallel, we will just save these files to the host,
# named based on the run.
for iter in 1 2 3 4 5; do
    for name in ubuntu busybox centos alpine nginx; do
        for mem in 4 6 8 12 16 18 24 32 64 128; do
            output="${outdir}/${name}-iter${iter}-${mem}gb.json"
            echo "sbatch --mem=${mem}GB pull-job.sh ${mem} ${iter} ${name} ${output}"
            sbatch --mem=${mem}GB pull-job.sh "${mem}" "${iter}" "${name}" ${output}
        done
    done
done
The results were each written directly to files in data (not using git as a temporal database).
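The job script pull-job.sh itself is not shown above; it essentially wraps the same watchme monitor call used on the head node. A minimal sketch, assuming the four positional arguments arrive in the order they are passed on the sbatch line (memory in GB, iteration, container name, output file):

#!/bin/bash
# Minimal sketch of pull-job.sh (assumption: it simply wraps the same
# watchme monitor command used in pull.sh). Positional arguments match
# the sbatch line above: memory (GB), iteration, container name, output file.
mem=$1
iter=$2
name=$3
output=$4

echo "Pulling $name (iteration $iter) on a node requesting ${mem}GB..."
watchme monitor singularity pull --force docker://$name --seconds 1 > ${output}

Since each job writes its own output file, there is no contention over the watcher's git repository.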