diff --git a/docs/recipes/install/common/import_ww4_files_ib_centos.tex b/docs/recipes/install/common/import_ww4_files_ib_centos.tex
new file mode 100644
index 0000000000..7fe8f005a3
--- /dev/null
+++ b/docs/recipes/install/common/import_ww4_files_ib_centos.tex
@@ -0,0 +1,18 @@
+%\iftoggle{isCentOS}{\clearpage}
+
+\noindent Finally, to add {\em optional} support for controlling IPoIB
+interfaces (see \S\ref{sec:add_ofed}), \OHPC{} includes a template file for
+\Warewulf{} that can be imported and used later to provision \texttt{ib0}
+network settings.
+
+% begin_ohpc_run
+% ohpc_validation_newline
+% ohpc_command if [[ ${enable_ipoib} -eq 1 ]];then
+% ohpc_indent 5
+\begin{lstlisting}[language=bash,literate={-}{-}1,keywords={},upquote=true]
+[sms](*\#*) wwsh file import /opt/ohpc/pub/examples/network/centos/ifcfg-ib0.ww
+[sms](*\#*) wwsh -y file set ifcfg-ib0.ww --path=/etc/sysconfig/network-scripts/ifcfg-ib0
+\end{lstlisting}
+% ohpc_indent 0
+% ohpc_command fi
+% \end_ohpc_run
diff --git a/docs/recipes/install/common/warewulf4_slurm_test_job.tex b/docs/recipes/install/common/warewulf4_slurm_test_job.tex
new file mode 100644
index 0000000000..d2c16035d3
--- /dev/null
+++ b/docs/recipes/install/common/warewulf4_slurm_test_job.tex
@@ -0,0 +1,137 @@
+With the resource manager enabled for production usage, users should now be
+able to run jobs. To demonstrate this, we will add a ``test'' user on the
+{\em master} host that can be used to run an example job.
+
+% begin_ohpc_run
+\begin{lstlisting}[language=bash,keywords={}]
+[sms](*\#*) useradd -m test
+\end{lstlisting}
+% end_ohpc_run
+
+\Warewulf{} installs a utility on the compute nodes to automatically
+synchronize known files from the provisioning server at five-minute intervals.
+In this recipe, recall that we previously registered credential files with
+\Warewulf{} (e.g.~\texttt{passwd}, \texttt{group}, and \texttt{shadow}) so that
+these files would be propagated during compute node imaging. However, with the
+addition of the new ``test'' user above, these files are now out of date and we
+need to update the \Warewulf{} database to incorporate the change. This
+re-sync process can be accomplished as follows:
+
+% begin_ohpc_run
+\begin{lstlisting}[language=bash,keywords={}]
+[sms](*\#*) wwsh file resync passwd shadow group
+\end{lstlisting}
+% end_ohpc_run
+
+% begin_ohpc_run
+% ohpc_command sleep 2
+% end_ohpc_run
+
+\begin{center}
+\begin{tcolorbox}[]
+\small
+After re-syncing to notify \Warewulf{} of file modifications made on the {\em
+master} host, it should take approximately 5 minutes for the changes to
+propagate.
+However, you can also manually pull the changes on the compute nodes via the
+following:
+% begin_ohpc_run
+\begin{lstlisting}[language=bash,keywords={}]
+[sms](*\#*) pdsh -w ${compute_prefix}[1-${num_computes}] /warewulf/bin/wwgetfiles
+\end{lstlisting}
+% end_ohpc_run
+\end{tcolorbox}
+\end{center}
+
+\input{common/prun}
+
+%\iftoggle{isSLES_ww_slurm_x86}{\clearpage}
+%\iftoggle{isCentOS_ww_slurm_x86}{\clearpage}
+
+
+\subsection{Interactive execution}
+To use the newly created ``test'' account to compile and execute the
+application {\em interactively} through the resource manager, execute the
+following (note the use of \texttt{prun} for parallel job launch, which
+summarizes the underlying native job launch mechanism being used):
+
+\begin{lstlisting}[language=bash,keywords={}]
+# Switch to "test" user
+[sms](*\#*) su - test
+
+# Compile MPI "hello world" example
+[test@sms ~]$ mpicc -O3 /opt/ohpc/pub/examples/mpi/hello.c
+
+# Submit interactive job request and use prun to launch executable
+[test@sms ~]$ salloc -n 8 -N 2
+
+[test@c1 ~]$ prun ./a.out
+
+[prun] Master compute host = c1
+[prun] Resource manager = slurm
+[prun] Launch cmd = mpiexec.hydra -bootstrap slurm ./a.out
+
+ Hello, world (8 procs total)
+ --> Process # 0 of 8 is alive. -> c1
+ --> Process # 4 of 8 is alive. -> c2
+ --> Process # 1 of 8 is alive. -> c1
+ --> Process # 5 of 8 is alive. -> c2
+ --> Process # 2 of 8 is alive. -> c1
+ --> Process # 6 of 8 is alive. -> c2
+ --> Process # 3 of 8 is alive. -> c1
+ --> Process # 7 of 8 is alive. -> c2
+\end{lstlisting}
+
+\begin{center}
+\begin{tcolorbox}[]
+The following table provides approximate command equivalences between SLURM and
+OpenPBS:
+
+\vspace*{0.15cm}
+\input common/rms_equivalence_table
+\end{tcolorbox}
+\end{center}
+\nottoggle{isCentOS}{\clearpage}
+
+\iftoggle{isCentOS}{\clearpage}
+
+\subsection{Batch execution}
+
+For batch execution, \OHPC{} provides a simple job script for reference (also
+housed in the \path{/opt/ohpc/pub/examples} directory). This example script can
+be used as a starting point for submitting batch jobs to the resource manager,
+and the example below illustrates use of the script to submit a batch job for
+execution using the same executable referenced in the previous interactive
+example.
+
+\begin{lstlisting}[language=bash,keywords={}]
+# Copy example job script
+[test@sms ~]$ cp /opt/ohpc/pub/examples/slurm/job.mpi .
+
+# Examine contents (and edit to set desired job sizing characteristics)
+[test@sms ~]$ cat job.mpi
+#!/bin/bash
+
+#SBATCH -J test              # Job name
+#SBATCH -o job.%j.out        # Name of stdout output file (%j expands to jobId)
+#SBATCH -N 2                 # Total number of nodes requested
+#SBATCH -n 16                # Total number of mpi tasks requested
+#SBATCH -t 01:30:00          # Run time (hh:mm:ss) - 1.5 hours
+
+# Launch MPI-based executable
+
+prun ./a.out
+
+# Submit job for batch execution
+[test@sms ~]$ sbatch job.mpi
+Submitted batch job 339
+\end{lstlisting}
+
+\begin{center}
+\begin{tcolorbox}[]
+\small
+The use of the \texttt{\%j} option in the example batch job script shown is a
+convenient way to track application output on an individual job basis. The
+\texttt{\%j} token is replaced with the \SLURM{} job allocation number once
+assigned (job~\#339 in this example).
+\end{tcolorbox}
+\end{center}
+
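+% Note: the following status-check snippet is an illustrative addition; the
+% job id shown simply mirrors the sbatch output in the example above.
+Once the job has been submitted, standard \SLURM{} commands can be used to
+monitor its progress and, after completion, to inspect the output file named
+via the \texttt{\%j} token. A brief illustrative example follows (the job id
+shown matches the submission above):
+
+\begin{lstlisting}[language=bash,keywords={}]
+# Query the queue for jobs owned by the "test" user
+[test@sms ~]$ squeue -u test
+
+# After the job completes, examine the captured stdout (job #339 here)
+[test@sms ~]$ cat job.339.out
+\end{lstlisting}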