Merge pull request #11 from grycap/showoutput
add showoutput to strace and includefolder to scripts
dealfonso authored Mar 2, 2018
2 parents c5dd428 + 00938f8 commit 0722883
Showing 17 changed files with 199 additions and 59 deletions.
8 changes: 3 additions & 5 deletions README.md
Expand Up @@ -2,12 +2,10 @@

When you run containers (e.g. in Docker), you usually run a system that has a whole Operating System, documentation, extra packages, etc. and your specific application. The result is that the footprint of the container is bigger than needed.

**minicon** aims at reducing the footprint of the Docker containers, by just including in the container those files that are needed. That means that the other files in the original container are removed.

**minicon** is a general tool to analyze applications and executions of these applications to obtain a filesystem that contains all the dependencies that have been detected. In particular, it can be used to reduce Docker containers. The **minicon** package includes **minidock**
which will help to reduce Docker containers by hiding the underlying complexity of running **minicon** inside a Docker container.

The purpose of **minicon** and **minidock** is better understood with the use cases explained in depth in the section "[Examples](#4-examples)": the size of a basic UI that contains bash, ip, wget, ssh, etc. commands is _reduced from 211MB to 10.9MB_; the size of a NodeJS application along with the server is _reduced from 686 MB (using the official node image) to 45.6MB_; the size of an Apache server is _reduced from 216MB to 50.4MB_, and the size of a Perl application in a Docker container is _reduced from 206MB to 43.3MB_.
The purpose of **minicon** and **minidock** is better understood with the use cases explained in depth in the section "[Examples](#4-examples)": the size of a basic UI that contains bash, ip, wget, ssh, etc. commands is _reduced from 211MB to 10.9MB_; the size of a NodeJS application along with the server is _reduced from 686 MB (using the official node image) to 45.6MB_; the size of an Apache server is _reduced from 216MB to 50.4MB_, and the size of a Perl application in a Docker container is _reduced from 206MB to 5.81MB_.

> [**minidock**](doc/minidock.md) is based on [**minicon**](doc/minicon.md), [**importcon**](doc/importcon.md) and [**mergecon**](doc/mergecon.md), and hides the complexity of creating a container, mapping minicon, guessing parameters such as the entrypoint or the default command, creating the proper commandline, etc.
Expand Down Expand Up @@ -399,8 +397,8 @@ We can check the differences in the sizes:
```bash
$ docker images minicon
REPOSITORY TAG IMAGE ID CREATED SIZE
minicon uc6 7c85b5a104f5 5 seconds ago 43.3MB
minicon uc6 7c85b5a104f5 5 seconds ago 5.81MB
minicon uc6fat 1c8179d3ba94 4 hours ago 206MB
```

In this case, the size has been reduced from 206MB to about 43.3MB.
In this case, the size has been reduced from 206MB to about 5.81MB.
File renamed without changes.
File renamed without changes.
26 changes: 21 additions & 5 deletions doc/man/minicon → doc/man/minicon.1
Expand Up @@ -100,7 +100,12 @@ to check the dependencies of an application or library. The resulting dependenci
.SS scripts
This plugin tries to guess whether a command is an interpreted script. If so, the interpreter will also be analyzed. It makes use of the command
.I file
and the analysis of the shebang line of text files.
and the analysis of the shebang line of text files. It accepts the following optional parameter:

.B includefolders
If it is set to true, the scripts plugin will include in the final filesystem the whole folders in which the interpreter will search for packages (i.e. using @INC or include). The default value is
.B false.

.SS strace
This plugin analyzes the execution of an application and detects which files have been used. It is tied to the
Expand All @@ -110,17 +115,28 @@ parameter from minicon. It accepts several parameters and the syntax is:
--plugin=strace:param=value:param=value...

.B seconds
the number of seconds that strace will be analyzing the execution
the number of seconds that strace will be analyzing the execution. The default value is
.B 3.

.B mode
decides which files will be included in the filesystem. The possible values are: skinny (includes only the opened, checked, etc. files and creates the opened, checked, etc. folders), slim (also includes the whole opened or created folders), regular (also includes the whole folder in which the opened files are stored; useful for included libraries) and loose (also includes the whole opened, checked, etc. folder).
decides which files will be included in the filesystem. The possible values are: skinny (includes only the opened, checked, etc. files and creates the opened, checked, etc. folders), slim (also includes the whole opened or created folders), regular (also includes the whole folder in which the opened files are stored; useful for included libraries) and loose (also includes the whole opened, checked, etc. folder). The default value is
.B skinny.

.B execfile
points to a file that includes commandline examples of different applications. These commandlines will be used for analyzing the executables. E.g. analyzing a plain
.I ping
command makes no sense, because it does nothing. But analyzing
.I ping www.google.es
makes use of libraries, name resolution, etc.
makes use of libraries, name resolution, etc. The default value is
.B none.

.B showoutput
If set to
.I true
, the output of the simulations is shown on stdout and stderr. Otherwise, the output is hidden. If the parameter appears without a value, it is interpreted as
.I true
(i.e. `--plugin=strace:showoutput` is the same as `--plugin=strace:showoutput=true`). The default value is
.B false.

.SH EXAMPLES

Expand All @@ -139,7 +155,7 @@ Then it is possible to import such filesystem into Docker with a command like
The same run of minicon, but running it inside a Docker ubuntu-based container:

.RS 3
.B docker run --privileged --rm -it -v /bin/minicon:/tmp/minicon -v $PWD:/tmp/work ubuntu:latest bash -c 'apt-get install -y strace && /tmp/minicon/minicon -t /tmp/work/minimal.tar --plugin=strace -E bash -E "ssh localhost" -E "ip addr" -E id -E cat -E ls -E mkdir -E "ping -c 1 www.google.es" -- wget -q -O- www.google.es'
.B docker run --cap-add SYS_PTRACE --rm -it -v /bin/minicon:/tmp/minicon -v $PWD:/tmp/work ubuntu:latest bash -c 'apt-get install -y strace && /tmp/minicon/minicon -t /tmp/work/minimal.tar --plugin=strace -E bash -E "ssh localhost" -E "ip addr" -E id -E cat -E ls -E mkdir -E "ping -c 1 www.google.es" -- wget -q -O- www.google.es'
.RE


Expand Down
File renamed without changes.
15 changes: 14 additions & 1 deletion doc/minicon.md
Expand Up @@ -155,6 +155,15 @@ $ ./minicon -t tarfile --plugin=strace:seconds=10 -E '/usr/games/cowsay hello'
# The next execution will try to execute the application bash for 3 seconds (the default value), but will exclude any file used by the application that is found either in /dev or /proc
$ ./minicon -t tarfile --plugin=strace:exclude=/dev:exclude=/proc bash
```
**Parameters**

* _seconds_: The number of seconds for which the simulated application will run. The execution may end earlier; if it does not, it is killed (with signal 9). The default value is **3**.

* _execfile_: A file that contains example command lines for applications. If an application is to be simulated without arguments, the strace plugin searches this file for a better example. The default value is **None**.

* _mode_: decides which files will be included in the filesystem. The possible values are: _skinny_ (includes only the opened, checked, etc. files and creates the opened, checked, etc. folders), _slim_ (also includes the whole opened or created folders), _regular_ (also includes the whole folder in which the opened files are stored; useful for included libraries) and _loose_ (also includes the whole opened, checked, etc. folder). The default value is **skinny**.

* _showoutput_: If set to _true_, the output of the simulations is shown on stdout and stderr. Otherwise, the output is hidden. If the parameter appears without a value, it is interpreted as _true_ (i.e. `--plugin=strace:showoutput` is the same as `--plugin=strace:showoutput=true`). The default value is **false**.
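
The rule for parameters given without a value can be illustrated with a small sketch. The function `get_plugin_param` below is a hypothetical helper written for this example, not the `plugin_parameter` function that minicon itself uses; it only assumes the documented `plugin:param=value:param=value...` syntax.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not minicon's own code): looks up PARAM in a plugin
# spec such as "strace:seconds=10:showoutput".
# Prints the value and returns 0 if the parameter is present, returns 1 if not.
# A parameter that appears without "=value" is reported as "true".
get_plugin_param() {
  local spec="$1" param="$2" field
  local -a fields
  IFS=':' read -r -a fields <<< "$spec"
  # fields[0] is the plugin name; the rest are "param" or "param=value" entries
  for field in "${fields[@]:1}"; do
    if [ "$field" = "$param" ]; then
      echo "true"               # bare flag, e.g. "showoutput"
      return 0
    elif [ "${field%%=*}" = "$param" ]; then
      echo "${field#*=}"        # explicit value, e.g. "showoutput=false"
      return 0
    fi
  done
  return 1                      # absent: the caller applies the default
}

get_plugin_param "strace:seconds=10:showoutput" showoutput
get_plugin_param "strace:showoutput=false" showoutput
get_plugin_param "strace:seconds=10" showoutput || echo "false (default)"
```

The sketch returns only the first match, so it would not cover parameters that may be repeated, such as `exclude`.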

#### scripts plug-in
Some of the executables that you want to include in the resulting filesystem can be scripts (e.g. bash, perl, python, etc.). As an example, **minicon** is a bash script. The problem is that these scripts need an interpreter (i.e. bash is needed for **minicon**), but inspecting the executable using _ldd_ will not find any dependency.
Expand All @@ -168,6 +177,10 @@ To activate the strace plugin you can use the option ```--plugin```. Some exampl
$ ./minicon -t tarfile --plugin=scripts ./minicon
```

**Parameters**

* _includefolders_: If it is set to true, the scripts plugin will include in the final filesystem the whole folders in which the interpreter will search for packages (i.e. using _@INC_ or _include_). The default value is **false**.
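
As a rough illustration of which folders _includefolders_ would add, the sketch below applies the same `grep` filter that the scripts plugin uses on the interpreter's search path. The function name `filter_include_paths` and the sample paths are assumptions made for this example; in minicon the list comes from the interpreter itself (e.g. Perl's `@INC`).

```bash
#!/usr/bin/env bash
# Keeps only the search paths worth copying: drops relative paths (starting
# with ".") and anything under /home, as the filter in the scripts plugin does.
filter_include_paths() {
  grep -v -e '^/home' -e '^\.'
}

# Sample search path, standing in for the output of
#   perl -e 'print qq(@INC)' | tr ' ' '\n'
printf '%s\n' /etc/perl /usr/share/perl5 /home/user/perl5 . | filter_include_paths
```

Assuming the parameter follows the same `plugin:param=value` syntax as the strace plugin, it would be enabled with `--plugin=scripts:includefolders=true`.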

> **DISCLAIMER**: take into account that the _scripts_ plugin is an automated tool and does its best. If an interpreter is detected, all the default include folders for that interpreter will be added to the final filesystem. If you know your app, you can reduce the number of folders to include.
## 4. Use Cases
Expand Down Expand Up @@ -334,7 +347,7 @@ minicon uc2fat 2a95d52068fd 2 minutes ago
But we can reduce the size, if we know which tools we want to provide to our users. From the folder in which **minicon** is installed, we can execute the following commands to minimize the container and to import it into docker:

```
$ docker run --privileged --rm -it -v /home/calfonso/Programacion/git/minicon:/tmp/minicon minicon:uc2fat bash -c 'apt-get install -y strace && /tmp/minicon/minicon -t /tmp/minicon/usecases/uc2/uc2.tar --plugin=strace:execfile=/tmp/minicon/usecases/uc2/execfile-cmd -E bash -E ssh -E ip -E id -E cat -E ls -E mkdir -E ping -E wget'
$ docker run --cap-add SYS_PTRACE --rm -it -v /home/calfonso/Programacion/git/minicon:/tmp/minicon minicon:uc2fat bash -c 'apt-get install -y strace && /tmp/minicon/minicon -t /tmp/minicon/usecases/uc2/uc2.tar --plugin=strace:execfile=/tmp/minicon/usecases/uc2/execfile-cmd -E bash -E ssh -E ip -E id -E cat -E ls -E mkdir -E ping -E wget'
$ docker import usecases/uc2/uc2.tar minicon:uc2
```

Expand Down
12 changes: 6 additions & 6 deletions doc/minidock.md
Expand Up @@ -4,7 +4,7 @@ When you run Docker containers, you usually run a system that has a whole Operat

**minidock** aims at reducing the footprint of the Docker containers, by just including in the container those files that are needed. That means that the other files in the original container are removed.

The purpose of **minidock** is better understood with the use cases explained in depth in the section "[Examples](#4-examples)": the size of an Apache server is reduced from 216MB. to 50.4MB., and the size of a Perl application in a Docker container is reduced from 206MB to 50.4MB.
The purpose of **minidock** is better understood with the use cases explained in depth in the section "[Examples](#4-examples)": the size of an Apache server is reduced from 216MB to 50.4MB, and the size of a Perl application in a Docker container is reduced from 206MB to 5.81MB.


> **minidock** is based on [**minicon**](minicon.md), [**importcon**](importcon.md) and [**mergecon**](mergecon.md), and hides the complexity of creating a container, mapping minicon, guessing parameters such as the entrypoint or the default command, creating the proper commandline, etc.
Expand Down Expand Up @@ -323,7 +323,7 @@ minicon uc5fat ff6f2573d73b 9 days ago

In order to reduce it, you just need to issue the next command:
```bash
$ ./minidock -i minicon:uc5fat -t minicon:uc5 --apt
$ minidock -i minicon:uc5fat -t minicon:uc5 --apt
...
```

Expand Down Expand Up @@ -378,7 +378,7 @@ $ docker run --rm -it minicon:uc6fat i am a cow in a fat container
In this case, the entrypoint needs some parameters to be run. If you try to analyze the container simply issuing a command like the next one:

```bash
$ ./minidock -i minicon:uc6fat -t minicon:uc6 --apt
$ minidock -i minicon:uc6fat -t minicon:uc6 --apt
...
$ docker run --rm -it minicon:uc6 i am a cow in a not properly minimized container
cowsay: Could not find default.cow cowfile!
Expand All @@ -389,7 +389,7 @@ It does not work properly, because the execution of the entrypoint has not been
In this case, you should run a **minidock** commandline that includes the command that we used to test it, and we will be able to run it:

```bash
$ ./minidock -i minicon:uc6fat -t minicon:uc6 --apt -- i am a cow in a fat container
$ minidock -i minicon:uc6fat -t minicon:uc6 --apt -- i am a cow in a fat container
...
$ docker run --rm -it minicon:uc6 i am a cow in a minimized container
_____________________________________
Expand All @@ -409,11 +409,11 @@ We can check the differences in the sizes:
```bash
$ docker images minicon
REPOSITORY TAG IMAGE ID CREATED SIZE
minicon uc6 7c85b5a104f5 5 seconds ago 43.3MB
minicon uc6 7c85b5a104f5 5 seconds ago 5.81MB
minicon uc6fat 1c8179d3ba94 4 hours ago 206MB
```

In this case, the size has been reduced from 206MB to about 43.3MB.
In this case, the size has been reduced from 206MB to about 5.81MB.

# 5. Flexible Manipulation of Container Filesystems

Expand Down
17 changes: 14 additions & 3 deletions importcon
Expand Up @@ -200,7 +200,7 @@ function trim() {
function build_cmdline() {
local SHCMDLINE=""
while [ $# -gt 0 ]; do
if [ "$1" == "&&" -o "$1" == ">" -o "$1" == ">>" -o "$1" == "2>" -o "$1" == "2>>" -o "$1" == "<" -o "$1" == "<<" ]; then
if [ "$1" == "|" -o "$1" == "&&" -o "$1" == ">" -o "$1" == ">>" -o "$1" == "2>" -o "$1" == "2>>" -o "$1" == "<" -o "$1" == "<<" ]; then
SHCMDLINE="${SHCMDLINE} $1"
else
SHCMDLINE="${SHCMDLINE} \"$1\""
Expand All @@ -219,10 +219,21 @@ function arrayze_cmd() {
local _CMD="$2"
local R n=0
while read R; do
read ${AN}[n] <<< "$R"
read ${AN}[$n] <<< "$R"
n=$((n+1))
done < <(printf "%s\n" "$_CMD" | xargs -n 1 printf "%s\n")
}

function lines_to_array() {
local AN="$1"
local LINES="$2"
local L
local n=0
while read L; do
read ${AN}[$n] <<< "$L"
n=$((n+1))
done <<< "$LINES"
}
function generate_dockerimagename() {
local NEWNAME
NEWNAME="$(cat /proc/sys/kernel/random/uuid)"
Expand Down Expand Up @@ -274,7 +285,7 @@ function get_config_field_raw() {
echo "$(trim "$C_RESULT")"
}

VERSION=1.2-1
VERSION=1.2-2

n=0
while [ $# -gt 0 ]; do
Expand Down
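
The `lines_to_array` helper added in this file fills an array, whose name is passed as a string, with one element per input line. The following is a lightly hardened variant for illustration, not the committed code: the name `lines_to_array_sketch`, the `read -r` flag, the added quoting and the sample paths are all assumptions of this sketch.

```bash
#!/usr/bin/env bash
# Variant of lines_to_array from the diff above: stores each line of $2 as
# one element of the (global) array whose name is given in $1.
lines_to_array_sketch() {
  local array_name="$1"
  local text="$2"
  local line
  local n=0
  while read -r line; do
    # "name[index]" is a valid target for read; quoting avoids glob expansion
    read -r "${array_name}[$n]" <<< "$line"
    n=$((n+1))
  done <<< "$text"
}

lines_to_array_sketch PATHS "/usr/lib/perl5
/usr/share/perl5
/etc/perl"
echo "${#PATHS[@]} entries, second is ${PATHS[1]}"
```

Because the inner `read` receives a single target, leading and trailing whitespace of each line is trimmed, which is usually what is wanted for a list of paths.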
2 changes: 1 addition & 1 deletion mergecon
Expand Up @@ -228,7 +228,7 @@ function get_config_field_raw() {
echo "$(trim "$C_RESULT")"
}

VERSION=1.2-1
VERSION=1.2-2

function verify_dependencies() {
if ! docker --version > /dev/null 2> /dev/null; then
Expand Down
64 changes: 53 additions & 11 deletions minicon
Expand Up @@ -268,7 +268,7 @@ function trim() {
function build_cmdline() {
local SHCMDLINE=""
while [ $# -gt 0 ]; do
if [ "$1" == "&&" -o "$1" == ">" -o "$1" == ">>" -o "$1" == "2>" -o "$1" == "2>>" -o "$1" == "<" -o "$1" == "<<" ]; then
if [ "$1" == "|" -o "$1" == "&&" -o "$1" == ">" -o "$1" == ">>" -o "$1" == "2>" -o "$1" == "2>>" -o "$1" == "<" -o "$1" == "<<" ]; then
SHCMDLINE="${SHCMDLINE} $1"
else
SHCMDLINE="${SHCMDLINE} \"$1\""
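
The effect of adding `|` to the list of pass-through tokens can be seen in a small sketch. The function below is a reconstruction for illustration only: the hunk does not show the end of the loop, so the `shift` and the final `echo` are assumptions, and `build_cmdline_sketch` is a made-up name.

```bash
#!/usr/bin/env bash
# Sketch based on the hunk above; the loop tail (shift/echo) is assumed,
# since the diff does not show it.
build_cmdline_sketch() {
  local cmdline=""
  while [ $# -gt 0 ]; do
    case "$1" in
      "|"|"&&"|">"|">>"|"2>"|"2>>"|"<"|"<<")
        cmdline="${cmdline} $1" ;;        # shell operator: keep it bare
      *)
        cmdline="${cmdline} \"$1\"" ;;    # ordinary word: wrap in double quotes
    esac
    shift
  done
  echo "${cmdline# }"                     # drop the leading space
}

build_cmdline_sketch echo "hello world" "|" wc -w
```

Without `|` in the list, the pipe would be wrapped in quotes and handed to the first command as a literal argument instead of connecting the two commands.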
Expand All @@ -287,10 +287,21 @@ function arrayze_cmd() {
local _CMD="$2"
local R n=0
while read R; do
read ${AN}[n] <<< "$R"
read ${AN}[$n] <<< "$R"
n=$((n+1))
done < <(printf "%s\n" "$_CMD" | xargs -n 1 printf "%s\n")
}

function lines_to_array() {
local AN="$1"
local LINES="$2"
local L
local n=0
while read L; do
read ${AN}[$n] <<< "$L"
n=$((n+1))
done <<< "$LINES"
}
function plugin_parameter() {
# Gets the value of a parameter passed to a plugin
# the format is: <plugin>:<param1>=<value1>:<param2>=<value2>...
Expand Down Expand Up @@ -578,6 +589,14 @@ function _strace_exec() {
SECONDSSIM=3
fi

local SHOWSTRACE
SHOWSTRACE=$(plugin_parameter "strace" "showoutput")
if [ $? -eq 0 ]; then
if [ "$SHOWSTRACE" == "" ]; then
SHOWSTRACE=true
fi
fi

local MODE
MODE="$(_strace_mode)"

Expand All @@ -588,9 +607,13 @@ function _strace_exec() {
p_info "analysing ${COMMAND[@]} using strace and $SECONDSSIM seconds ($MODE)"

local TMPFILE=$(tempfile)
{
timeout -s 9 $SECONDSSIM strace -qq -e file -fF -o "$TMPFILE" "${COMMAND[@]}" > /dev/null 2> /dev/null
} > /dev/null 2> /dev/null
if [ "$SHOWSTRACE" == "true" ]; then
timeout -s 9 $SECONDSSIM strace -qq -e file -fF -o "$TMPFILE" "${COMMAND[@]}"
else
{
timeout -s 9 $SECONDSSIM strace -qq -e file -fF -o "$TMPFILE" "${COMMAND[@]}" > /dev/null 2> /dev/null
} > /dev/null 2> /dev/null
fi

# Now we'll inspect the files that the execution has used
local EXEC_FUNCTIONS="exec.*"
Expand Down Expand Up @@ -767,7 +790,7 @@ function STRACE_command() {
fi
CMDLINE[0]="$COMMAND"
fi
COMMAND=( ${CMDLINE[@]} )
COMMAND=( "${CMDLINE[@]}" )
_strace_exec

PLUGINS_ACTIVATED="${_PLUGINS_ACTIVATED}"
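
The change from `( ${CMDLINE[@]} )` to `( "${CMDLINE[@]}" )` in the hunk above matters as soon as an argument contains whitespace. A minimal sketch with a made-up command line (the array contents are only an example):

```bash
#!/usr/bin/env bash
CMDLINE=( ping -c 1 "host name with spaces" )

UNQUOTED=( ${CMDLINE[@]} )      # old form: elements are re-split on whitespace
QUOTED=( "${CMDLINE[@]}" )      # new form: one element in, one element out

echo "unquoted: ${#UNQUOTED[@]} elements"
echo "quoted:   ${#QUOTED[@]} elements"
```

The unquoted copy is also subject to pathname expansion, so an argument such as `*` would be replaced by the files in the current directory.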
Expand All @@ -780,6 +803,17 @@ function PLUGIN_11_scripts() {
# If it is, adds the interpreter to the list of commands to add to the container
p_debug "trying to guess if $1 is a interpreted script"

local INCLUDEFOLDERS
INCLUDEFOLDERS=$(plugin_parameter "scripts" "includefolders")
if [ $? -eq 0 ]; then
if [ "$INCLUDEFOLDERS" == "" ]; then
INCLUDEFOLDERS=true
fi
else
# By default, the folders that the interpreter may search are not included
INCLUDEFOLDERS=false
fi

local S_PATH="$(which $1)"
local ADD_PATHS=

Expand Down Expand Up @@ -812,10 +846,8 @@ function PLUGIN_11_scripts() {
fi

case "$(basename "$INTERPRETER")" in
perl) ADD_PATHS="${ADD_PATHS}
$(perl -e "print qq(@INC)" | tr ' ' '\n' | grep -v -e '^/home' -e '^\.')";;
python) ADD_PATHS="${ADD_PATHS}
$(python -c 'import sys;print "\n".join(sys.path)' | grep -v -e '^/home' -e '^\.')";;
perl) ;;
python) ;;
bash) ;;
sh) ;;
env) ADD_PATHS="${ADD_PATHS}
Expand All @@ -824,6 +856,16 @@ ${ENV_APP}";;
return 0;;
esac

# If we want to include the 'include' folders of the scripts (to also include libraries), let's get them
if [ "$INCLUDEFOLDERS" == "true" ]; then
case "$(basename "$INTERPRETER")" in
perl) ADD_PATHS="${ADD_PATHS}
$(perl -e "print qq(@INC)" | tr ' ' '\n' | grep -v -e '^/home' -e '^\.')";;
python) ADD_PATHS="${ADD_PATHS}
$(python -c 'import sys;print "\n".join(sys.path)' | grep -v -e '^/home' -e '^\.')";;
esac
fi

if [ "$ADD_PATHS" != "" ]; then
p_debug "found that $S_PATH needs $ADD_PATHS"
local P
Expand All @@ -846,7 +888,7 @@ function plugin_list() {
done <<< "$(PLUGIN_funcs)"
echo
}
VERSION=1.2-1
VERSION=1.2-2

function is_protected() {
local SRC="$1"
Expand Down