Differences between revisions 7 and 8
Revision 7 as of 2006-11-29 20:50:37
Size: 3762
Editor: scott
Comment: drafting
Revision 8 as of 2006-12-13 20:09:59
Size: 5015
Editor: scott
Comment: try and draft this damned spec
Line 21: Line 21:
 * We'd like to introduce additional scripts, as defined in JobScripts; these need a state to run in.
Line 27: Line 29:
The new states are as follows:
=== State Diagram ===
Line 29: Line 31:
|| '''State Name''' || '''Process Running''' || '''Aux Process Running''' || '''Next State if Start''' || '''Next State if Stop''' || '''Notes''' ||
|| '''waiting''' || none || none || '''pre-start''' || n/a || rest state ||
|| '''pre-start''' || `pre-start` script || none || '''starting''' || '''post-stop''' || ||
|| '''starting''' || binary or parent || none || '''post-start''' || '''stopping''' || spawn process, wait for pid ||
|| '''post-start''' || binary || `post-start` script || '''running''' || '''stopping''' || ||
|| '''running''' || binary || none || n/a || '''pre-stop''' || rest state ||
|| '''pre-stop''' || binary || `pre-stop` script || '''stopping''' || '''stopping''' || ||
|| '''stopping''' || binary terminating || none || '''post-stop''' || '''post-stop''' || terminate process ||
|| '''post-stop''' || `post-stop` script || none || '''pre-start''' || '''waiting''' || ||
The easiest way to describe the proposed state machine is with a diagram ([attachment:states.dot dot source]):
Line 39: Line 33:
Problems:
attachment:states.png
Line 41: Line 35:
 * In running, fails and needs to be respawned -- need to go to post-stop but leave the goal as start
Within this diagram, the following applies:
Line 43: Line 37:
 * When are events emitted?
   * starting when goal set to start
   * stopping when goal set to stop
 * Diamond nodes represent rest states.
Line 47: Line 39:
   * started when at running
   * stopped when at waiting
 * Oval nodes represent states during which tasks are to be performed.
Line 50: Line 41:
 * respawn would need to behave as:
   * emit failed because it died
   * change goal to stop, emit stopping
   * notice respawn, change goal back to start, emit starting
 * Rectangular nodes represent events to be emitted when entering the next state; they are not discrete states themselves.
Line 55: Line 43:
 * all works fine: starting, started, stopping, stopped
 * pre-start fails: starting, failed, stopping, stopped
 * post-start fails: starting, failed, stopping, stopped
 * binary fails: starting, failed, stopping, stopped
 * binary respawns: starting, started, failed, stopping, starting, started
 * Nodes are grouped into four "virtual states" that correspond to the time between events.
Line 61: Line 45:
 * pre-stop should be able to prevent stopping, but it happens after the event?
 * Green arrows leading from a node are followed while the goal is `start`.

 * Red arrows leading from a node are followed while the goal is `stop`.

=== Notes ===

 * State names are expected to roughly match the task being performed during that state:
   * waiting
   * pre-start
   * start ("exec process")
   * post-start
   * running
   * pre-stop
   * stop ("kill process")
   * post-stop

 * The following state transitions are observed:
  * The `starting` event is emitted when entering the `pre-start` state.
  * The `pre-start` task is not run until the `starting` event has completed.
  * The primary process is run when entering the `start` state.
  * The `post-start` task is run when entering the `post-start` state.
  * The `started` event is emitted when entering the `running` state.
  * The `pre-stop` task is run when entering the `pre-stop` state.
  * The `stopping` event is emitted when entering the `stop` state.
  * The primary process is not killed until the `stopping` event has completed.
  * The `post-stop` task is run when entering the `post-stop` state.
  * The `stopped` event is emitted when entering the `waiting` state.

 * Multiple state transitions can still occur in a single call, though it is no longer possible to skip past the emission of an event that can block.

 * A job remains in the `waiting` state if the goal is `stop`.

 * When a start request or event occurs, the goal is set to `start`:
  * If the job is in the `waiting` state, it can be moved to the `starting` state.
  * As noted by JobScripts, if the job is in the `pre-stop` state, it causes the `pre-stop` task to be terminated and the state will return to `running` when that task is reaped. The `stopping` event is never emitted.

 * When a stop request or event occurs, the goal is set to `stop`:
  * If the job is in the `running` state, it is moved to the `pre-stop` state.
  * As noted by JobScripts, if the job is in the `post-start` state, it causes the `post-start` task to be terminated and the state will change to `stop` when that task is reaped. The `started` event is never emitted.
 
 * When the primary process exits, the goal is changed to `stop` unless the job is marked `respawn` and the termination cause is listed in `normalexit`.
  * The next state, regardless of the goal, is always `stop`.
  * The `pre-stop` state is therefore skipped.
  * A job being respawned will iterate through the usual stopping states followed by the starting states.
Line 75: Line 102:
The events are backwards compatible, with just the addition of new intermediate ones.
These changes are backwards compatible with the previous behaviour.

Please check the status of this specification in Launchpad before editing it. If it is Approved, contact the Assignee or another knowledgeable person before making changes.

Summary

Upstart currently has a three-level state machine: goal, job state and process state. This specification proposes reducing that to a two-level state machine by combining the two lower levels into one.

Rationale

The current three-level state machine is confusing, and doesn't allow as much flexibility as we would like. In particular, it's not ideal for extending the states to allow for additional scripts and actions.

Use cases

  • Some daemons fork off into the background; we'd like an interim state that indicates this has happened.
  • Some processes don't terminate immediately; we'd like an interim state that indicates we have sent the signal but haven't yet reaped the process.
  • We'd like to introduce additional scripts, as defined in JobScripts; these need a state to run in.

Scope

The scope of this specification is limited to modifying the current state machine, specifying where the events defined in JobEvents are emitted and when the scripts specified in JobScripts are to be run.

Design

State Diagram

The easiest way to describe the proposed state machine is with a diagram ([attachment:states.dot dot source]):

attachment:states.png

Within this diagram, the following applies:

  • Diamond nodes represent rest states.
  • Oval nodes represent states during which tasks are to be performed.
  • Rectangular nodes represent events to be emitted when entering the next state; they are not discrete states themselves.
  • Nodes are grouped into four "virtual states" that correspond to the time between events.
  • Green arrows leading from a node are followed while the goal is start.
  • Red arrows leading from a node are followed while the goal is stop.

Notes

  • State names are expected to roughly match the task being performed during that state:
    • waiting
    • pre-start
    • start ("exec process")
    • post-start
    • running
    • pre-stop
    • stop ("kill process")
    • post-stop
  • The following state transitions are observed:
    • The starting event is emitted when entering the pre-start state.
    • The pre-start task is not run until the starting event has completed.
    • The primary process is run when entering the start state.
    • The post-start task is run when entering the post-start state.
    • The started event is emitted when entering the running state.
    • The pre-stop task is run when entering the pre-stop state.
    • The stopping event is emitted when entering the stop state.
    • The primary process is not killed until the stopping event has completed.
    • The post-stop task is run when entering the post-stop state.
    • The stopped event is emitted when entering the waiting state.

  • Multiple state transitions can still occur in a single call, though it is no longer possible to skip past the emission of an event that can block.
  • A job remains in the waiting state if the goal is stop.
  • When a start request or event occurs, the goal is set to start:
    • If the job is in the waiting state, it can be moved to the starting state.
    • As noted by JobScripts, if the job is in the pre-stop state, it causes the pre-stop task to be terminated and the state will return to running when that task is reaped. The stopping event is never emitted.
  • When a stop request or event occurs, the goal is set to stop:
    • If the job is in the running state, it is moved to the pre-stop state.
    • As noted by JobScripts, if the job is in the post-start state, it causes the post-start task to be terminated and the state will change to stop when that task is reaped. The started event is never emitted.
  • When the primary process exits, the goal is changed to stop unless the job is marked respawn and the termination cause is listed in normalexit.
    • The next state, regardless of the goal, is always stop.
    • The pre-stop state is therefore skipped.
    • A job being respawned will iterate through the usual stopping states followed by the starting states.
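
The transitions and event emissions listed above can be modelled as a small lookup table. The following is only an illustrative Python sketch, not the Upstart code. The diagram itself is not reproduced in this text, so the table is an assumption pieced together from the notes above; in particular, the pre-start to post-stop edge (goal stop) and the post-stop to pre-start edge (goal start) are guesses, and the names Goal, NEXT_STATE, EVENT_ON_ENTRY, next_state and run_to_rest are hypothetical. Tasks are treated as completing instantly and events as never blocking.

```python
from enum import Enum


class Goal(Enum):
    STOP = "stop"
    START = "start"


# state -> (next state while goal is start, next state while goal is stop).
# None means the job rests in that state for that goal.
# ASSUMPTION: reconstructed from the notes in this spec, not from the diagram.
NEXT_STATE = {
    "waiting":    ("pre-start",  None),
    "pre-start":  ("start",      "post-stop"),
    "start":      ("post-start", "stop"),
    "post-start": ("running",    "stop"),
    "running":    (None,         "pre-stop"),
    "pre-stop":   ("running",    "stop"),
    "stop":       ("post-stop",  "post-stop"),
    "post-stop":  ("pre-start",  "waiting"),
}

# Events emitted when entering a state, as listed in the notes.
# Whether "started" is emitted again after a cancelled pre-stop is not
# modelled here.
EVENT_ON_ENTRY = {
    "pre-start": "starting",
    "running": "started",
    "stop": "stopping",
    "waiting": "stopped",
}


def next_state(state, goal):
    """Return the state that follows `state` for the given goal."""
    on_start, on_stop = NEXT_STATE[state]
    target = on_start if goal is Goal.START else on_stop
    return state if target is None else target


def run_to_rest(state, goal):
    """Follow transitions until a rest state; return (state, events emitted)."""
    events = []
    while True:
        new = next_state(state, goal)
        if new == state:
            return state, events
        state = new
        if state in EVENT_ON_ENTRY:
            events.append(EVENT_ON_ENTRY[state])
```

For example, run_to_rest("waiting", Goal.START) walks pre-start, start and post-start before resting in running, collecting the starting and started events on the way.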

Implementation

Code

The implementation is largely confined to upstart/job.h, init/job.h and init/job.c.

The Job structure will lose its process_state member, and the ProcessState enum will be removed.

job_next_state and job_change_state will be amended to use the new states; other references will be adjusted accordingly.
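
To illustrate the two-level arrangement, here is a rough Python model of a job that carries only a goal and a single state, with no separate process state. It is a sketch under assumptions, not the C implementation: the function names job_next_state and job_change_state come from this spec, but their signatures, the Job fields, the NEXT table, the history list, and the primary_process_exited helper with its keep_goal parameter are all hypothetical. Whether the goal is kept is left to the caller, so the respawn and normalexit decision is not modelled.

```python
from dataclasses import dataclass, field

# state -> (next state while goal is "start", next state while goal is "stop").
# None means the job rests there. ASSUMPTION: reconstructed from the notes.
NEXT = {
    "waiting":    ("pre-start",  None),
    "pre-start":  ("start",      "post-stop"),
    "start":      ("post-start", "stop"),
    "post-start": ("running",    "stop"),
    "running":    (None,         "pre-stop"),
    "pre-stop":   ("running",    "stop"),
    "stop":       ("post-stop",  "post-stop"),
    "post-stop":  ("pre-start",  "waiting"),
}


@dataclass
class Job:
    name: str
    goal: str = "stop"        # level one: what we want the job to do
    state: str = "waiting"    # level two: the single merged state
    history: list = field(default_factory=list)  # states entered, for inspection


def job_next_state(job):
    """Return the state that follows job.state for the job's current goal."""
    on_start, on_stop = NEXT[job.state]
    target = on_start if job.goal == "start" else on_stop
    return job.state if target is None else target


def job_change_state(job, state):
    """Enter `state`, then keep advancing until the job reaches a rest state.

    Tasks are treated as finishing immediately; a real implementation would
    return while a script, process or blocking event is outstanding.
    """
    while job.state != state:
        job.state = state
        job.history.append(state)
        state = job_next_state(job)


def primary_process_exited(job, keep_goal):
    """Model the primary process exiting: the next state is always stop."""
    if not keep_goal:
        job.goal = "stop"
    job_change_state(job, "stop")
```

Starting a job then amounts to setting job.goal to "start" and calling job_change_state(job, job_next_state(job)). Calling primary_process_exited(job, keep_goal=True) afterwards walks stop and post-stop, then the starting states again, without ever entering pre-stop.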

Data preservation and migration

These changes are backwards compatible with the previous behaviour.


CategorySpec

JobStates (last edited 2006-12-13 20:09:59 by scott)