Motr  M0
confc Internals

State Specification

A state machine is embedded into m0_confc_ctx structure as its fc_mach member. m0_confc_ctx_init() initialises the state machine and sets its state to S_INITIAL.

[Figure: state diagram of the m0_confc_ctx::fc_mach state machine]

S_INITIAL

Summary: m0_confc_ctx has just been initialised.

m0_confc_open() populates m0_confc_ctx::fc_path array (path_copy()) and posts an AST to m0_confc::cc_group.

Note
m0_sm_ast_post() signals the group's clink. The current design of confc assumes that some thread will respond to this event by calling m0_sm_asts_run().

When the AST posted by m0_confc_open() is run, it moves the state machine (m0_confc_ctx::fc_mach) to S_CHECK state.
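The deferred nature of this transition can be sketched with a toy AST queue. This is only an illustration: toy_group, toy_ast, toy_ast_post() and toy_asts_run() are hypothetical stand-ins and do not reproduce the signatures of m0_sm_group, m0_sm_ast, m0_sm_ast_post() or m0_sm_asts_run().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not Motr definitions. */
enum toy_state { TOY_S_INITIAL, TOY_S_CHECK };

struct toy_ast {
	void          (*cb)(void *datum);
	void           *datum;
	struct toy_ast *next;
};

struct toy_group {
	struct toy_ast *head;
	struct toy_ast *tail;
	int             signalled; /* counts "clink signalled" events */
};

/* Queue the AST and signal the group; the callback does not run yet. */
static void toy_ast_post(struct toy_group *grp, struct toy_ast *ast)
{
	ast->next = NULL;
	if (grp->tail != NULL)
		grp->tail->next = ast;
	else
		grp->head = ast;
	grp->tail = ast;
	grp->signalled++;
}

/* Run every queued AST in posting order; some thread must call this. */
static void toy_asts_run(struct toy_group *grp)
{
	while (grp->head != NULL) {
		struct toy_ast *ast = grp->head;

		grp->head = ast->next;
		if (grp->head == NULL)
			grp->tail = NULL;
		ast->cb(ast->datum);
	}
}

/* AST callback: perform the S_INITIAL -> S_CHECK transition. */
static void toy_to_check(void *datum)
{
	*(enum toy_state *)datum = TOY_S_CHECK;
}
```

In this sketch the state stays TOY_S_INITIAL after toy_ast_post() and changes only when toy_asts_run() is called, which mirrors the assumption stated in the note above.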

S_CHECK

Summary: Traversing the path, checking whether the requested configuration object is accessible.

When S_CHECK state is entered, check_st_in() callback is invoked. It calls path_walk() and, depending on the value returned by this call, moves the state machine to another state:

+--------------------+-----------------+
| path_walk() result |   next state    |
+--------------------+-----------------+
|    M0_CS_READY     |  S_TERMINAL     |
|    M0_CS_MISSING   |  S_WAIT_REPLY   |
|    M0_CS_LOADING   |  S_WAIT_STATUS  |
|         < 0        |  S_FAILURE      |
+--------------------+-----------------+

The algorithm of path_walk() is described below (see Walking the DAG).
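The table above can be read as a single mapping function. The sketch below is a simplified model, not the body of check_st_in(): the TOY_* enumerators are hypothetical stand-ins for the statuses and states named in the table, and their numeric values are arbitrary.

```c
#include <assert.h>

/* Hypothetical stand-ins for the object statuses; values are arbitrary. */
enum toy_status { TOY_CS_MISSING, TOY_CS_LOADING, TOY_CS_READY };

/* Hypothetical stand-ins for the states a context can move to. */
enum toy_state {
	TOY_S_WAIT_REPLY,
	TOY_S_WAIT_STATUS,
	TOY_S_TERMINAL,
	TOY_S_FAILURE
};

/*
 * Map the result of a path walk (a negative error code or a toy_status
 * value) to the next state, following the table row by row.
 */
static enum toy_state toy_next_state(int walk_rc)
{
	if (walk_rc < 0)
		return TOY_S_FAILURE;

	switch (walk_rc) {
	case TOY_CS_READY:
		return TOY_S_TERMINAL;
	case TOY_CS_MISSING:
		return TOY_S_WAIT_REPLY;
	case TOY_CS_LOADING:
		return TOY_S_WAIT_STATUS;
	default:
		/* Not covered by the table; treated as a failure here. */
		return TOY_S_FAILURE;
	}
}
```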

S_WAIT_REPLY

Summary: Waiting for confd's reply to arrive.

When the state machine is about to enter S_WAIT_REPLY state, wait_reply_st_in() callback is executed. This callback sends a configuration request (m0_confc_ctx::fc_rpc_item) to the confd, using m0_rpc_post().

The state machine remains in S_WAIT_REPLY state until a reply from confd arrives. This event triggers on_replied() callback. If m0_rpc_item::ri_error is zero, on_replied() increments the rpc item's reference counter (m0_rpc_item_get()) and posts an AST, scheduling a transition to S_GROW_CACHE state. If ->ri_error is non-zero and the confc is not coupled with rconfc, on_replied() posts an AST that will eventually move the state machine to S_FAILURE state.

If the confc is coupled with rconfc and on_replied() finds ->ri_error non-zero, it posts an AST that moves the state machine from S_WAIT_REPLY to S_SKIP_CONFD. This lets rconfc reconnect the confc it controls to another confd server from rconfc's active list.
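The choice made when a reply arrives depends on two inputs only: the rpc item's error code and whether the confc is coupled with rconfc. A minimal sketch of that decision follows; toy_after_reply() and the TOY_* states are hypothetical names, and the AST posting and reference counting are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the states reachable from S_WAIT_REPLY. */
enum toy_state { TOY_S_GROW_CACHE, TOY_S_SKIP_CONFD, TOY_S_FAILURE };

/*
 * Pick the state to schedule once a reply (or an rpc error) arrives.
 * ri_error models m0_rpc_item::ri_error; coupled_with_rconfc is an
 * illustrative flag, not a field of any Motr structure.
 */
static enum toy_state toy_after_reply(int ri_error, bool coupled_with_rconfc)
{
	if (ri_error == 0)
		return TOY_S_GROW_CACHE;
	return coupled_with_rconfc ? TOY_S_SKIP_CONFD : TOY_S_FAILURE;
}
```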

S_WAIT_STATUS

Summary: Waiting for an object to be filled by another configuration request.

A state machine in S_WAIT_STATUS state remains idle until the channel (m0_conf_obj::co_chan) that m0_confc_ctx::fc_clink is registered with is signaled. Such an event triggers on_object_updated() callback, which de-registers the clink and posts an AST that will eventually move the state machine to S_CHECK state.

Note
Object's channel (m0_conf_obj::co_chan) is signaled (m0_chan_broadcast()) when
  1. object_enrich() completes loading of configuration data into this object and changes its status to M0_CS_READY (loading succeeded) or M0_CS_MISSING (loading failed);
  2. the object is closed and its number of references becomes zero. (This case is not applicable to S_WAIT_STATUS state.)

S_SKIP_CONFD

Summary: Skipping current confd the confc is connected to and reconnecting to some other confd server running the same configuration version.

m0_confc::cc_gops::go_skip() is called to let the confc be reconnected. If reconnection succeeds, the state machine moves to S_CHECK state to retry reading the object that was missing. If reconnection fails because no responsive confd servers remain in the active list, the state machine moves to S_FAILURE state.

S_GROW_CACHE

Summary: Applying configuration data contained in confd's reply.

When the state machine enters S_GROW_CACHE state, grow_cache_st_in() callback is invoked. If the error code contained in confd's response (m0_conf_fetch_resp::fr_rc) is zero, the callback calls cache_grow() function (see Growing the cache below). The callback then releases the rpc item by calling m0_rpc_item_put(). If ->fr_rc == 0 and cache_grow() succeeds, grow_cache_st_in() moves the state machine to S_CHECK state; otherwise it moves the state machine to S_FAILURE state.
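The order of operations in this state can be sketched as follows. The code is a simplified model under stated assumptions: toy_item, toy_grow_cache_in() and the grow callbacks are hypothetical, and decrementing a counter merely stands in for m0_rpc_item_put().

```c
#include <assert.h>

/* Hypothetical stand-ins; not Motr definitions. */
enum toy_state { TOY_S_CHECK, TOY_S_FAILURE };

struct toy_item {
	int refs; /* models the rpc item's reference counter */
};

/* Two illustrative cache-growing outcomes. */
static int toy_grow_ok(void)   { return 0; }
static int toy_grow_fail(void) { return -1; }

/*
 * Model of the S_GROW_CACHE entry action: grow the cache only if the
 * reply carries no error, always drop the reference taken when the
 * reply arrived, then choose the next state.
 */
static enum toy_state toy_grow_cache_in(struct toy_item *item, int fr_rc,
					int (*grow)(void))
{
	int rc = fr_rc;

	if (rc == 0)
		rc = grow();
	item->refs--; /* stands in for m0_rpc_item_put() */
	return rc == 0 ? TOY_S_CHECK : TOY_S_FAILURE;
}
```

Note that the reference is released on both the success and the failure path, so the item is not leaked when the reply carries an error.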

S_TERMINAL

Summary: Configuration retrieval succeeded.

S_FAILURE

Summary: Configuration retrieval failed.


Walking the DAG

path_walk() begins with locking the confc cache (m0_confc::cc_lock); it unlocks the cache before returning.

The function "moves" along the DAG of cached configuration objects, starting at m0_confc_ctx::fc_origin object and following m0_confc_ctx::fc_path. The next object is found by calling m0_conf_obj_ops::coo_lookup() with the current object and the path component as parameters. The iteration continues until ->coo_lookup() fails, or a stub is met, or the end of the path is reached.

path_walk_complete() applies the results of path walking: increments reference counter of M0_CS_READY object, allocates and fills m0_conf_fetch request for M0_CS_MISSING object, or registers clink with the channel of M0_CS_LOADING object.
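The walk and its three stopping conditions can be illustrated on a toy object graph. Everything below is a sketch: toy_obj, toy_lookup() and toy_path_walk() are hypothetical, paths are plain strings rather than Motr path components, locking is omitted, and of the completion steps only the reference increment for a ready object is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; not Motr definitions. */
enum toy_status { TOY_CS_MISSING, TOY_CS_LOADING, TOY_CS_READY };
enum { TOY_MAX_CHILDREN = 4 };

struct toy_obj {
	const char      *name;
	enum toy_status  status;
	struct toy_obj  *children[TOY_MAX_CHILDREN]; /* unused slots are NULL */
	int              refs;
};

/* Find a child by name; plays the role of ->coo_lookup(). */
static struct toy_obj *toy_lookup(struct toy_obj *dir, const char *name)
{
	size_t i;

	for (i = 0; i < TOY_MAX_CHILDREN && dir->children[i] != NULL; ++i)
		if (strcmp(dir->children[i]->name, name) == 0)
			return dir->children[i];
	return NULL;
}

/*
 * Follow a NULL-terminated path from origin. Stops when a lookup fails
 * (returns -1), when a stub (non-ready object) is met, or when the path
 * ends. On success *out is the object reached and its status is returned.
 */
static int toy_path_walk(struct toy_obj *origin, const char *const *path,
			 struct toy_obj **out)
{
	struct toy_obj *obj = origin;

	for (; *path != NULL && obj->status == TOY_CS_READY; ++path) {
		obj = toy_lookup(obj, *path);
		if (obj == NULL)
			return -1;
	}
	*out = obj;
	if (obj->status == TOY_CS_READY)
		obj->refs++; /* the caller now holds a reference */
	return obj->status;
}
```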


Growing the cache

cache_grow() locks the cache (m0_confc::cc_cache) and unlocks it before returning. The function performs the following operations for every object descriptor (m0_confx_obj, defined in conf/onwire.ff):

  1. m0_conf_obj_find(): tries to find an object with the same identity (type and id) in the registry of cached objects. If the object is not found, a stub is created and added to the cache.
  2. object_enrich(): compares cached object with the descriptor received from the confd. If a discrepancy is found (!m0_conf_obj_match()), the function returns an error code.

    If there is no discrepancy, and the cached object is a stub, object_enrich() fills the cached object with configuration data (m0_conf_obj_fill()) and signals object's channel.
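The two steps above can be sketched with a fixed-size toy cache. This is an illustration under simplifying assumptions: toy_desc stands in for m0_confx_obj, a single integer payload stands in for the configuration data, a counter stands in for signalling the object's channel, and the helper names are hypothetical rather than the signatures of m0_conf_obj_find() or object_enrich().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not Motr definitions. */
enum toy_status { TOY_CS_MISSING, TOY_CS_READY };
enum { TOY_CACHE_MAX = 8 };

struct toy_desc { int type; int id; int payload; };

struct toy_obj {
	int             type;
	int             id;
	int             payload;
	enum toy_status status;
	int             signals; /* counts channel broadcasts */
};

struct toy_cache {
	struct toy_obj objs[TOY_CACHE_MAX];
	size_t         nr;
};

/* Step 1: find an object by identity, creating a stub if it is absent. */
static struct toy_obj *toy_obj_find(struct toy_cache *c, int type, int id)
{
	size_t i;

	for (i = 0; i < c->nr; ++i)
		if (c->objs[i].type == type && c->objs[i].id == id)
			return &c->objs[i];
	if (c->nr == TOY_CACHE_MAX)
		return NULL; /* toy cache is full */
	c->objs[c->nr] = (struct toy_obj){
		.type = type, .id = id, .status = TOY_CS_MISSING
	};
	return &c->objs[c->nr++];
}

/* Step 2: reject a mismatch, or fill a stub and signal its channel. */
static int toy_obj_enrich(struct toy_obj *obj, const struct toy_desc *d)
{
	if (obj->status == TOY_CS_READY)
		return obj->payload == d->payload ? 0 : -1;
	obj->payload = d->payload;
	obj->status  = TOY_CS_READY;
	obj->signals++;
	return 0;
}

/* Apply both steps to every descriptor; stop at the first error. */
static int toy_cache_grow(struct toy_cache *c, const struct toy_desc *descs,
			  size_t n)
{
	size_t i;

	for (i = 0; i < n; ++i) {
		struct toy_obj *obj = toy_obj_find(c, descs[i].type,
						   descs[i].id);
		int             rc;

		if (obj == NULL)
			return -2;
		rc = toy_obj_enrich(obj, &descs[i]);
		if (rc != 0)
			return rc;
	}
	return 0;
}
```

Applying the same descriptors twice leaves the toy cache unchanged and signals nothing the second time, because an object that is already ready only needs to match.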


Threading and Concurrency Model

There are as many state machines in operation as there are unfinished m0_confc_open*() requests.

At most one state transition (m0_sm_state_descr::sd_in()) can be running at any given time. Synchronization of state transitions is achieved by using m0_sm_group (m0_confc::cc_group).

m0_confc instance and confc cache are protected from concurrent modifications by m0_confc::cc_lock mutex, aka cache lock. Group lock (m0_confc::cc_group::s_lock) cannot be used for this purpose, because it does not prevent the application from modifying the cache with m0_confc_close().

If a function needs both locks – group lock and cache lock – for its operation, group lock must be acquired first. Note that the "function" here cannot be something invoked from an AST callback, because it would then deadlock on the group mutex.

A user managing the state machine group (m0_confc::cc_group) is responsible for making sure m0_sm_asts_run() is called when m0_sm_group::s_clink is signaled. See State machine (search for ‘"ast" thread’).