Quickstart
This page covers simple examples for getting started with Doodad. For more detailed documentation on the API, see Detailed Documentation.
We can very quickly run shell commands using the launch_api:
import doodad
doodad.run_command(
    command='echo helloworld',
)
This command will launch a Docker container and execute the command echo helloworld inside.
Next, we can run a simple Python program that replicates the above behavior. First, write the following hello world script:
print('helloworld')
Save this script and remember the filename. We can run this script with the run_python command:
import doodad
doodad.run_python(
    target='path/to/hello_world.py',
)
Likewise, this will start a Docker container and run the Python script inside.
The hello world programs do not specify any data dependencies, so there is no mechanism for sending or retrieving data from the running script. We can specify these dependencies using "mount" objects. In the next example, let's include a text file and read it.
Save the following data as 'foo/secret.txt':
apple
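If you would rather create this file from Python than by hand, a minimal sketch follows. The helper name write_secret and its root parameter are illustrative assumptions, not part of doodad.

```python
import os


def write_secret(root="."):
    """Create <root>/foo/secret.txt containing the single line 'apple'."""
    folder = os.path.join(root, "foo")
    os.makedirs(folder, exist_ok=True)  # no error if 'foo' already exists
    path = os.path.join(folder, "secret.txt")
    with open(path, "w") as f:
        f.write("apple\n")
    return path


if __name__ == "__main__":
    print(write_secret())
```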
Now, we need to tell doodad that the 'foo' folder is a dependency. We can do this by creating a MountLocal object:
import doodad
mnt = doodad.MountLocal(
    local_dir='foo',
    mount_point='./mymount',
    output=False
)
The mount_point argument specifies where this folder will be available to the running script.
Now, let us finish writing the script.
import doodad
doodad.run_command(
    command='cat ./mymount/secret.txt',
    mounts=[mnt]
)
When run, this should print out the contents of secret.txt!
Mount objects are used for both specifying outputs and code/data dependencies. See the documentation for available options.
Sometimes you will need to collect data or log files generated by your program. If you need to retrieve outputs from the container, use the output=True flag for MountLocal as follows:
import os
import doodad
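os.makedirs('testing_dir', exist_ok=True)  # do not fail if the folder already exists
mnt = doodad.MountLocal(
    local_dir='testing_dir',
    mount_point='/mymount',
    output=True  # files written to the mount point are retrieved into local_dir
)
doodad.run_command(
    command='echo hello123 > /mymount/secret.txt',
    mounts=[mnt]
)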
This script will write 'hello123' into the text file testing_dir/secret.txt.
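Once the job has finished, the result can be checked from Python. The following is a minimal sketch using only the standard library. The helper name read_output is hypothetical, and it assumes the output file has already been written to the local folder.

```python
import os


def read_output(local_dir, filename):
    """Return the stripped contents of an output file, or None if it is missing."""
    path = os.path.join(local_dir, filename)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return f.read().strip()


if __name__ == "__main__":
    # Prints 'hello123' if the job above ran and wrote its output.
    print(read_output("testing_dir", "secret.txt"))
```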
When running remotely, you will need to use either MountGCP or MountS3 instead of MountLocal in order to sync to cloud storage services. Again, see the documentation for more details.
Command-line arguments can be passed to the script by passing cli_args to the launch_api.run_command or launch_api.run_python functions. cli_args should be formatted as a string, for example: '--arg1 10 --arg2'. Some basic hyperparameter sweeping functionality can be performed on top of command-line arguments.
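As an illustration of sweeping on top of command-line arguments, the sketch below builds one cli_args string per combination of values in a grid. It is a hand-rolled helper, not doodad's own sweeping functionality. The name sweep_cli_args and the hyperparameter names are assumptions made for the example, and it only handles arguments that take a value (not bare flags such as --arg2).

```python
import itertools


def sweep_cli_args(grid):
    """Yield one cli_args string per combination of values in `grid`.

    `grid` maps an argument name to the list of values to try.
    """
    names = sorted(grid)  # fixed order so the output is reproducible
    for values in itertools.product(*(grid[name] for name in names)):
        yield " ".join(f"--{name} {value}" for name, value in zip(names, values))


if __name__ == "__main__":
    # Hypothetical hyperparameters, for illustration only.
    grid = {"arg1": [10, 20], "seed": [0, 1]}
    for cli_args in sweep_cli_args(grid):
        print(cli_args)
```

Each yielded string could then be passed as cli_args to run_command or run_python, one launch per combination.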
The argparse module in Python is recommended for retrieving arguments inside the script.
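A minimal sketch of the receiving side is shown below, matching the example string '--arg1 10 --arg2'. The argument types and defaults are assumptions chosen for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments passed to the script via cli_args."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--arg1", type=int, default=0)    # takes an integer value
    parser.add_argument("--arg2", action="store_true")    # boolean flag, no value
    return parser.parse_args(argv)  # argv=None reads from sys.argv


if __name__ == "__main__":
    args = parse_args()
    print(args.arg1, args.arg2)
```

Accepting an optional argv list makes the parser easy to check without launching a job, for example parse_args('--arg1 10 --arg2'.split()).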
In order to launch jobs remotely, we need to specify a launch mode. Our previous examples have been using the local run mode, mode.LocalMode(), by default.
Let's take our earlier MountLocal example. To run a job remotely, we simply need to pass in the appropriate launch mode:
import doodad
local = doodad.LocalMode()
gcp_mode = doodad.GCPMode(<fill in arguments>)
mnt = doodad.MountLocal(
    local_dir='foo',
    mount_point='./mymount',
    output=False
)
# This will run locally
doodad.run_command(
    command='cat ./mymount/secret.txt',
    mounts=[mnt],
    mode=local,
)
# This will run remotely
doodad.run_command(
    command='cat ./mymount/secret.txt',
    mounts=[mnt],
    mode=gcp_mode
)
A good workflow is to first test your code locally, before launching jobs on remote services (and potentially incurring charges):
- Launch your code with mode.LocalMode()
- Launch your code with the appropriate remote service.