Appmon Daemon

1. Goals and context

Appmon Daemon is a standalone executable that manages application life cycles on simple Linux systems.
Appmon Daemon can start, stop, and restart monitored applications, and report their status.
When an application is monitored by the Appmon Daemon, the daemon runs its command in a newly spawned process and, if the application stops with an error, restarts it.

INFO

Because Appmon Daemon was designed around simple mechanisms, it cannot guarantee hard constraints on the timing accuracy of application restarts or stops.

2. External Interface and usage

2.1. Start the daemon

It is up to the user or solution provider to find a way to start Appmon Daemon at device boot.
Some basic ideas: init/rc scripts, cron, ...

INFO

The daemon must be started with sufficient permissions for: opening a socket on the specified port, changing the uid/gid of processes, changing process priority, etc. As a result, the daemon will most likely need to run with root-like permissions.

usage:

$ ./appmon_daemon -p [tcp_port_number] -a [privileged_app] -w [privileged app working directory] -u [user id to use to start apps] -g [group id to use to start apps] \
  -n [process priority to run apps]

INFO

When an option is given, its value must be correctly set, otherwise the Appmon Daemon is very likely to exit synchronously with an error.

example:

$ ./appmon_daemon -p 45874 -a /tmp/myapp -w /tmp -u myuser -g 32500

INFO

As the term daemon suggests, the Appmon Daemon command returns immediately, even when initialization succeeded. The daemon then keeps running in the background, where it is accessible through the socket API.
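As an illustration of the options listed above, here is a minimal Python sketch that assembles the daemon's argument list from optional settings. The helper name `build_daemon_argv` and its parameter names are hypothetical; only the flags (`-p`, `-a`, `-w`, `-u`, `-g`, `-n`) come from the usage line. Options left unset are simply omitted, on the assumption that the daemon then falls back to its defaults.

```python
from typing import List, Optional


def build_daemon_argv(
    binary: str = "./appmon_daemon",
    port: Optional[int] = None,            # -p
    privileged_app: Optional[str] = None,  # -a
    working_dir: Optional[str] = None,     # -w
    user: Optional[str] = None,            # -u
    group: Optional[str] = None,           # -g
    priority: Optional[int] = None,        # -n
) -> List[str]:
    """Return the argv list for launching the daemon; unset options are skipped."""
    argv = [binary]
    options = [
        ("-p", port),
        ("-a", privileged_app),
        ("-w", working_dir),
        ("-u", user),
        ("-g", group),
        ("-n", priority),
    ]
    for flag, value in options:
        if value is not None:
            argv += [flag, str(value)]
    return argv


if __name__ == "__main__":
    argv = build_daemon_argv(
        port=45874,
        privileged_app="/tmp/myapp",
        working_dir="/tmp",
        user="myuser",
        group="32500",
    )
    # Prints the same command line as the example above.
    print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)`; since the daemon command returns right after initialization, a non-zero exit status is a reasonable signal that an option value was rejected.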

2.2. Socket Interface

The TCP port defaults to 4242 if none is specified in the command used to start the daemon.

Once a client is connected to the daemon, it can send several commands and receive their results over the same connection.

WARNING

Only one client connection is possible at a time, so a client must close its socket when it is done if other clients need to send requests.
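A minimal Python client sketch for this interface follows. The function names `exchange` and `send_command` are hypothetical, and the sketch assumes that every response ends with a single `\n`, as the response format section describes. `exchange` can be called several times on one connection, while `send_command` opens a connection, sends one command, and closes the socket so that other clients are not blocked.

```python
import socket


def exchange(sock: socket.socket, command: str) -> str:
    """Send one command line on a connected socket and return the response line."""
    sock.sendall(command.encode("ascii") + b"\n")
    buf = bytearray()
    # Read until the terminating newline (assumed to end every response).
    while not buf.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:  # the peer closed the connection
            break
        buf.extend(chunk)
    return buf.decode("ascii", errors="replace").rstrip("\n")


def send_command(
    command: str,
    host: str = "localhost",
    port: int = 4242,  # default port stated above
    timeout: float = 5.0,
) -> str:
    """Connect, send a single command, and close the socket on return."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return exchange(sock, command)
```

With a daemon listening on the default port, `send_command("list")` would return the raw response string; the `with` block guarantees that the single available connection slot is released even if an exception occurs.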

2.3. Commands

2.3.1. Generic command format
command_name arg1 arg2 ... argn\n

A command consists of the command name followed by its arguments.
Elements of the command line are separated by a space, and the command line ends with \n.

2.3.2. Generic command response format
result1\t result2\t ...resultn\n

A command can produce several results; these are separated by the \t character (see the List command for an example).
The command results are terminated by the \n character.
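The two formats above can be sketched in Python as a pair of helpers. The names `format_command` and `split_results` are hypothetical. Since the text describes no escaping mechanism, this sketch rejects elements that contain a space or a newline; that strictness is a choice made here, not documented daemon behavior. Spaces around each result are stripped because the format line shows a space after each `\t`.

```python
from typing import List


def format_command(name: str, *args: str) -> bytes:
    """Join the command name and arguments with spaces and terminate with a newline."""
    parts = (name, *args)
    for part in parts:
        # Assumption of this sketch: no quoting exists, so separators are refused.
        if not part or " " in part or "\n" in part:
            raise ValueError(f"invalid command element: {part!r}")
    return (" ".join(parts) + "\n").encode("ascii")


def split_results(raw: bytes) -> List[str]:
    """Split a raw response into its tab-separated results."""
    text = raw.decode("ascii", errors="replace")
    if text.endswith("\n"):
        text = text[:-1]  # drop the terminating newline
    return [result.strip(" ") for result in text.split("\t")]


if __name__ == "__main__":
    print(format_command("setup", "/tmp", "/tmp/app"))
    print(split_results(b"result1\t result2\t result3\n"))
```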

2.3.3. Setup command
setup working_directory application

working_directory is the absolute path used as working directory to execute the application.
application is the absolute path to the application executable. It can be any type of Linux executable file; read/write/exec access rights must be correctly set (before calling the setup command and throughout the application life cycle).

2.3.3.1. actions
2.3.3.2. setup response format
Application_id

Application_id is a string:

ex:

setup /tmp /tmp/app
1
setup /tmp /path/that/doesnt/exist
Cannot install app

2.3.4. Start command

start application_id

application_id is the id as returned by the setup command to identify the application.

2.3.4.1. actions
2.3.4.2. start response format
result

result is a string:

ex:

start /tmp /tmp/app
1
start /tmp /path/that/doesnt/exist
Cannot start app

2.3.5. Stop command

stop application_id

application_id is the id as returned by the setup command to identify the application.

2.3.5.1. actions

Note: All children of the app, if any, are killed along with the monitored application, unless they detach from it.

2.3.5.2. stop response format
result

result is a string:

ex:

stop 3
ok
stop 4344444
Unknown app
stop 1
Privileged App, cannot act on it through socket.

2.3.6. Remove command

remove application_id

application_id is the id as returned by the setup command to identify the application.

2.3.6.1. actions
2.3.6.2. Remove response format
result

result is a string:

ex:

remove 3
ok
remove 4344444
Unknown app

2.3.7. Status command

status application_id

application_id is the id as returned by the setup command to identify the application.

2.3.7.1. Status response format
result

result is a string:

ex:

status 2
AppID=[2] Privileged=[0] Prog=[/tmp/prog] Wd=[/tmp] Status=[STARTED] Pid=[16855] StartCount[2] LastExitType=[STOP_REGULAR] LastExitCode[140]
status 4344444
Unknown app
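A status line like the one above can be turned into a dictionary with a short regular expression. This is a sketch based only on the sample output shown here: the function name `parse_status` is hypothetical, and the pattern treats the `=` before the bracket as optional because the sample shows `StartCount[2]` and `LastExitCode[140]` without it. An error response such as `Unknown app` contains no bracketed field and yields an empty dictionary.

```python
import re
from typing import Dict

# A field is a name, an optional "=", and a value enclosed in square brackets.
_FIELD = re.compile(r"(\w+)=?\[([^\]]*)\]")


def parse_status(line: str) -> Dict[str, str]:
    """Map each field name of a status line to its (string) value."""
    return {name: value for name, value in _FIELD.findall(line)}


if __name__ == "__main__":
    sample = (
        "AppID=[2] Privileged=[0] Prog=[/tmp/prog] Wd=[/tmp] Status=[STARTED] "
        "Pid=[16855] StartCount[2] LastExitType=[STOP_REGULAR] LastExitCode[140]"
    )
    fields = parse_status(sample)
    print(fields["Status"], fields["Pid"], fields["StartCount"])
```

All values are kept as strings; converting `Pid` or `LastExitCode` to integers is left to the caller, since the text does not specify the value types.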

2.3.8. List command

list

No argument.

2.3.8.1. List response format
result

result is a string:

ex:

list
AppID=[1] Privileged=[1] Prog=[/tmp/prog] Wd=[/tmp] Status=[STARTED] Pid=[16855] StartCount[2] LastExitType=[STOP_REGULAR] LastExitCode[140]
AppID=[2] Privileged=[0] Prog=[/tmp/prog2] Wd=[/tmp] Status=[STARTED] Pid=[16856] StartCount[1] LastExitType=[STOP_REGULAR] LastExitCode[140]
AppID=[3] Privileged=[0] Prog=[/tmp/prog3] Wd=[/tmp] Status=[STARTED] Pid=[16858] StartCount[6] LastExitType=[STOP_REGULAR] LastExitCode[140]
AppID=[4] Privileged=[0] Prog=[/tmp/prog4] Wd=[/tmp] Status=[STARTED] Pid=[16859] StartCount[1] LastExitType=[STOP_REGULAR] LastExitCode[140]
AppID=[5] Privileged=[0] Prog=[/tmp/prog] Wd=[/tmp] Status=[STOPPED] Pid=[16862] StartCount[0] LastExitType=[App haven't died yet] LastExitCode[-1]

2.3.9. Setenv command

setenv variable_name=value

variable_name is the name of the variable to set in Appmon Daemon's address space, and value is the value assigned to it.

ex:

setenv LD_LIBRARY_PATH="/path/to/lib"
setenv LUA_CPATH="/path/to/lua/native/modules/?.so"
setenv LUA_PATH="/path/to/lua/modules/?.lua;/path/to/lua/modules/?/init.lua"
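For clients that generate these commands programmatically, the following Python sketch builds a setenv command line in the same shape as the examples above. The helper name `setenv_command` is hypothetical. The text does not say whether the double quotes are required or how the daemon interprets them, so the sketch simply mirrors the examples and refuses values it cannot represent safely.

```python
def setenv_command(name: str, value: str) -> str:
    """Build a 'setenv NAME="value"' command line (without the trailing newline)."""
    if not name or "=" in name or any(ch.isspace() for ch in name):
        raise ValueError(f"invalid variable name: {name!r}")
    # Assumption of this sketch: no escaping exists for quotes or newlines.
    if '"' in value or "\n" in value:
        raise ValueError(f"value cannot be represented: {value!r}")
    return f'setenv {name}="{value}"'


if __name__ == "__main__":
    print(setenv_command("LD_LIBRARY_PATH", "/path/to/lib"))
    print(setenv_command("LUA_CPATH", "/path/to/lua/native/modules/?.so"))
```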

3. Usage examples

Given the socket interface, one of the simplest ways to interact with the Appmon Daemon is the nc (netcat) tool, which most Linux distributions provide.

Let's start the Appmon Daemon:

$ ./appmon_daemon -p 7865

For a kind of interactive mode, run:

$ nc localhost 7865
setup /tmp /tmp/some_absent_file                           ## send this command by hitting enter
prog (/tmp/some_absent_file) cannot be stat!               ## result sent by Appmon Daemon
...                                                        ## you can type other commands

To send a single command as a one-shot client, for example from a shell script, you can do something like:

$ echo "setup /tmp /tmp/some_absent_file" | nc localhost 7865

4. Application life cycle

Once monitored by the Appmon Daemon, an application can be in one of several states.

Here is a schema of the application states, with some details on which events trigger state transitions.

Caption:
Black arrows: commands coming from client API.
Red arrows: internal events
Other remarks:

The Stopped status can have several causes; the next sections explain them more precisely.

INFO

Child process suspend/resume actions (using SIGSTOP/SIGCONT/... signals), which can cause the child to change its state, are neither detected nor managed; i.e., a suspended application/child will still be reported as "started".

4.1. Termination status

The termination status is shown in the "LastExitType" field of the status/list command result.
It combines the cause of the app exit with the state of the app at that precise moment.

When a SIGCHLD signalling child termination is received, the termination status is computed as follows:

4.2. Application automatic restart

The Appmon Daemon will restart a monitored application when it dies if:

When restarted, the application keeps its id, but is started in a new process with a new pid.
All actions/configurations on the application (working directory, process group creation, ...) performed on a regular start also apply to an automatic restart.

5. Known limitations