Differences between revisions 2 and 3
Revision 2 as of 2006-11-28 22:29:33
Size: 3943
Editor: scott
Comment: change state names
Revision 3 as of 2006-11-30 13:51:16
Size: 6377
Editor: scott
Comment: improve spec

Please check the status of this specification in Launchpad before editing it. If it is Approved, contact the Assignee or another knowledgeable person before making changes.

Summary

Upstart currently provides the option for two shell script tasks that may be run as part of the job's life cycle: one before it is started, to prepare, and the other after it is stopped, to clean up. This specification proposes new names for these, allows them to be specified as paths to executables instead of shell scripts, and describes two additional tasks: one run after the job has started, and another run before it is stopped.

Rationale

The current names of the existing tasks, "start" and "stop", are confusing to those coming to Upstart from the existing init script system. They expect that these tasks need to actually start or kill the daemon themselves, whereas in Upstart that is handled by the init daemon. New names should reduce this confusion and make it obvious what each task is for.

The primary process of the job has always been able to be specified as either a shell script or an executable; however, the "start" and "stop" tasks have only been able to be shell scripts. This is an artificial limitation with no real reason for existing, so it will be removed to improve consistency.

While the two current tasks allow for preparing and cleaning up, they're not sufficient for performing such things as post-start checks, waiting for a daemon to be listening, instructing a daemon to save its state, waiting for it to be safe to stop, or even cancelling the stop entirely. In order to do that, separate jobs needed to be written, which makes maintenance more difficult and is generally pointless; the proposed new tasks allow these things to be done inside the same job definition.

Use cases

  • Once a daemon is started, it would be useful to have a script that connected to it to verify whether it is, in fact, running or not.
  • A daemon may not be able to be stopped while there are connections to it, or it may wish to emit an event to inform others that it is going away.

Scope

The scope of this specification is limited to specifying the names and descriptions of the tasks, and how they are configured. The necessary underlying state changes are specified by JobStates.

Design

  • The start task will be renamed to pre-start; this better reflects that it happens before the job is actually started and is used for preparation.

    pre-start script
        mkdir /var/run/daemon
    end script
  • The stop task will be renamed to post-stop; this better reflects that it happens after the job has stopped and is used for cleanup.

    post-stop script
        rm -rf /var/run/daemon
    end script
  • In the configuration file, either script or exec may follow the name of the task. It is an error to specify both.

    pre-start exec /usr/lib/daemon/prepare.pl
  • A new post-start task will be introduced.

    • This may be specified as either a script or exec.

    • It is run alongside the primary process, once that process has been started and its pid located.
    • The job is not considered fully started until this task finishes, so it may be used to hold up other jobs.
    • The exit status of this task does not affect the goal of the job; if it terminates with an error, the job will still be running.
    • If the goal is changed to stop while this task is running (which may be done by the task itself), the task will be terminated and the goal changed.

    post-start script
        # wait for listen on port 80
        while ! nc -q0 localhost 80 </dev/null >/dev/null 2>&1; do
            sleep 1;
        done
    end script
  • A new pre-stop task will be introduced.

    • This may be specified as either a script or exec.

    • It is run alongside the primary process, before that process is terminated.
    • The job is not considered to be stopping until this task finishes, so it may be used to delay the termination and hold up other jobs.
    • The exit status of this task does not affect the goal of the job; if it terminates with an error, the job will still stop.
    • If the goal is changed to start while this task is running (which may be done by the task itself), the task will be terminated and the goal changed. The task should be careful to trap this and undo anything it has done that disables the job.

    pre-stop script
        # disable the queue, wait for it to become empty
        trap "fooctl enable" TERM
        fooctl disable
        while fooq >/dev/null; do
            sleep 1
        done
    end script
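
The rule in the Design list that either script or exec may follow a task name, but not both, could be enforced when the configuration is parsed. The following is a minimal sketch only: `task_set` is a hypothetical helper that does not appear in this specification, and it assumes each task is stored in a pointer slot that is NULL until a stanza has been seen.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct task {
    int   script;    /* non-zero: shell script body; zero: executable path */
    char *command;
} Task;

/* Hypothetical helper: record a `script` or `exec` stanza for one task slot.
 * Returns 0 on success, or -1 if the slot has already been set (both forms
 * were specified) or memory could not be allocated. */
int
task_set (Task **slot, int script, const char *command)
{
    Task  *task;
    size_t len;

    assert (slot != NULL);
    assert (command != NULL);

    /* A second stanza for the same task is an error. */
    if (*slot != NULL)
        return -1;

    task = malloc (sizeof *task);
    if (task == NULL)
        return -1;

    len = strlen (command) + 1;
    task->command = malloc (len);
    if (task->command == NULL) {
        free (task);
        return -1;
    }
    memcpy (task->command, command, len);

    task->script = script;
    *slot = task;

    return 0;
}
```

A parser that meets `pre-start exec ...` and later `pre-start script` for the same job would get -1 from the second call and could report a configuration error at that point.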

Implementation

Code

Changes largely occur within init/job.h and init/job.c. A new structure will be introduced that defines either a command to execute or a script; this will be used for both the tasks and the primary process itself.

    typedef struct task {
      int   script;
      char *command;
    } Task;

When script is FALSE, the command member is executed directly; otherwise it is executed using a shell.
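
As an illustration of the `script` flag, the sketch below builds the argument vector that would be passed to `execv`. It is not the Upstart implementation: `task_build_argv` is a hypothetical name, the use of `/bin/sh -c` is an assumption about how "using a shell" might be done, and for simplicity the direct case treats `command` as a bare path with no arguments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FALSE 0
#define TRUE  1

typedef struct task {
    int   script;
    char *command;
} Task;

/* Hypothetical helper: fill argv (room for 4 entries) for the given task
 * and return the number of arguments, not counting the NULL terminator. */
int
task_build_argv (const Task *task, const char *argv[4])
{
    assert (task != NULL);
    assert (task->command != NULL);

    if (task->script) {
        /* Script: hand the text to a shell (assumed here to be /bin/sh). */
        argv[0] = "/bin/sh";
        argv[1] = "-c";
        argv[2] = task->command;
        argv[3] = NULL;
        return 3;
    }

    /* Not a script: the command itself is what gets executed. */
    argv[0] = task->command;
    argv[1] = NULL;
    return 1;
}
```

The resulting vector could then be passed as `execv (argv[0], (char *const *) argv)` in the child process.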

The existing members of the Job structure will be replaced with new members.

    Task *process;
    Task *pre_start;
    Task *post_start;
    Task *pre_stop;
    Task *post_stop;

A new function, job_run_task, will be introduced. It is passed a task, calls either job_run_command or job_run_script with the task's details, and stores the resulting pid in the appropriate place. This function will be called by job_change_state where appropriate.
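
The dispatch described above might look roughly like the following sketch. The signatures are assumptions: the specification names `job_run_command` and `job_run_script` but does not give their prototypes, so the two functions here are stand-ins that only return a fake pid, and the `pid_out` parameter is a hypothetical way of saying where the pid should be stored.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

typedef struct task {
    int   script;
    char *command;
} Task;

typedef struct job {
    pid_t pid;        /* primary process */
    pid_t aux_pid;    /* post-start or pre-stop task */
} Job;

/* Stand-ins with assumed signatures; the real functions would fork and
 * exec.  They return fixed fake pids so the dispatch can be observed. */
pid_t
job_run_command (Job *job, const char *command)
{
    (void) job;
    (void) command;
    return 100;
}

pid_t
job_run_script (Job *job, const char *script)
{
    (void) job;
    (void) script;
    return 200;
}

/* Sketch of job_run_task: choose the runner from the task's script flag
 * and store the pid where the caller asks.  Returns 0 on success. */
int
job_run_task (Job *job, const Task *task, pid_t *pid_out)
{
    pid_t pid;

    assert (job != NULL);
    assert (task != NULL);
    assert (pid_out != NULL);

    if (task->script)
        pid = job_run_script (job, task->command);
    else
        pid = job_run_command (job, task->command);

    if (pid <= 0)
        return -1;

    *pid_out = pid;
    return 0;
}
```

Under these assumptions, job_change_state would pass `&job->pid` when starting the primary process and `&job->aux_pid` when starting a post-start or pre-stop task.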

Because the post-start and pre-stop tasks can be run alongside the process, an additional pid will need to be tracked for them.

    pid_t aux_pid;

This new member will need to be checked by job_find_by_pid. job_child_reaper will also need to be modified to check which process died and adjust its behaviour appropriately, and job_kill_process will need to ensure that this auxiliary process is also killed.
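
To make the lookup concrete, here is a minimal sketch of a pid search that also reports which of a job's processes matched, which is what the child reaper needs in order to decide how to react. The array-based job list, the `ProcessKind` enum and the name `job_find_by_pid_sketch` are all hypothetical and chosen only to keep the example self-contained; the real job list in init/job.c may be organised differently.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

typedef struct job {
    const char *name;
    pid_t       pid;        /* primary process, 0 if not running */
    pid_t       aux_pid;    /* post-start or pre-stop task, 0 if none */
} Job;

typedef enum {
    PROCESS_NONE,
    PROCESS_MAIN,
    PROCESS_AUX
} ProcessKind;

/* Hypothetical lookup: return the job owning pid, or NULL if there is
 * none, and report through kind whether it was the primary process or
 * the auxiliary task. */
Job *
job_find_by_pid_sketch (Job *jobs, size_t njobs, pid_t pid, ProcessKind *kind)
{
    size_t i;

    assert (kind != NULL);
    *kind = PROCESS_NONE;

    /* A pid of 0 means "no process", so it can never match. */
    if (pid <= 0)
        return NULL;

    for (i = 0; i < njobs; i++) {
        if (jobs[i].pid == pid) {
            *kind = PROCESS_MAIN;
            return &jobs[i];
        }
        if (jobs[i].aux_pid == pid) {
            *kind = PROCESS_AUX;
            return &jobs[i];
        }
    }

    return NULL;
}
```

A reaper built on this could treat `PROCESS_AUX` as "the task finished, move to the next state" and `PROCESS_MAIN` as "the daemon itself died".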

Data preservation and migration

These changes are not backwards compatible; many existing jobs will need changing. No migration plan is anticipated at this point in the development cycle.


CategorySpec

JobScripts (last edited 2006-11-30 13:51:16 by scott)