Motr  M0
Spiel API DLD

Definition of Seagate Software Platform Library for Motr (SPIEL, SSPL).


Overview

Spiel library is used by a "management application" to control Motr:

  • to inform Motr about hardware resources it should use, their roles and arrangement;
  • to specify operational characteristics of Motr, such as fault-tolerance parameters;
  • to issue commands to modify the cluster state: start and stop operation, format storage, etc.

Motr stores information about cluster elements (hardware and software), their arrangement, functions and operational characteristics in an internal (replicated) data-base, called configuration data-base.

Cluster state is changed by sending operation requests (fops) to Motr services running on cluster nodes. This assumes that every node already runs a minimal Motr process, which is started on node bootup and can then be remotely commanded to start more services as necessary.


Definitions

  • Configuration: Data-base describing the Motr cluster in the detail necessary and sufficient for cluster components to operate. See Configuration data-base.
  • Version: A Configuration data-base snapshot reflecting the changes introduced by the "management application". A Version is intended to be uploaded to confd servers.
  • Transaction: The standard mechanism for distributing a Version among confd servers and putting it into effect. A Transaction guarantees that the Version is distributed consistently among confd servers and reaches a quorum sufficient for non-conflicting configuration reads. A Transaction must be opened explicitly. Later it may be either closed or committed. A Version is applied only if the commit succeeds.
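
The Transaction life cycle described above can be sketched as a small state machine. This is an illustration only, not the Spiel API: every name below (sketch_tx, sketch_tx_open() and so on) is hypothetical, and the quorum is modelled as a plain count of confd servers that accepted the Version, which is an assumption about how a quorum might be expressed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transaction states; not taken from the Motr sources. */
enum sketch_tx_state {
	SKETCH_TX_NEW,
	SKETCH_TX_OPEN,
	SKETCH_TX_COMMITTED,
	SKETCH_TX_CLOSED
};

struct sketch_tx {
	enum sketch_tx_state state;
	bool                 version_applied; /* true only after a successful commit */
};

static inline void sketch_tx_init(struct sketch_tx *tx)
{
	assert(tx != NULL);
	tx->state = SKETCH_TX_NEW;
	tx->version_applied = false;
}

/* A transaction must be opened explicitly before it can be used. */
static inline int sketch_tx_open(struct sketch_tx *tx)
{
	if (tx->state != SKETCH_TX_NEW)
		return -1;
	tx->state = SKETCH_TX_OPEN;
	return 0;
}

/*
 * Commit succeeds only if the transaction is open and at least `quorum`
 * confd servers accepted the Version (`acks`). On failure the transaction
 * stays open, so the caller can still close it.
 */
static inline int sketch_tx_commit(struct sketch_tx *tx,
				   unsigned acks, unsigned quorum)
{
	if (tx->state != SKETCH_TX_OPEN || acks < quorum)
		return -1;
	tx->state = SKETCH_TX_COMMITTED;
	tx->version_applied = true;
	return 0;
}

/* Closing an open transaction discards it; the Version is not applied. */
static inline void sketch_tx_close(struct sketch_tx *tx)
{
	if (tx->state == SKETCH_TX_OPEN)
		tx->state = SKETCH_TX_CLOSED;
}
```

In this sketch a commit that falls short of the quorum leaves version_applied false, and a transaction that is closed without being committed never applies its Version.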

Configuration data-base

Motr configuration data-base contains all the meta-data that is manipulated by a system administrator, as opposed to meta-data manipulated in the course of executing application requests.

The data-base is a graph. Graph nodes represent cluster elements, and arcs represent relations between elements. The graph conforms to a "schema", which defines the types of elements and the possible relations between them.

The following configuration elements are currently supported (see conf/obj.h for details):

  • a profile is the list of pools that a client can use; it is not used by servers;
  • a pool is a collection of hardware resources. Cluster hardware is divided into pools for administrative purposes (for example, for security reasons) and to encode fault-tolerance properties;
  • a pool version is a list of elements that belonged to a pool at a certain moment in time. As the system evolves, new hardware is added and old hardware is retired, so the contents of a pool might change (each change is reflected by the creation of a new pool version), but the pool identity remains unchanged;
  • a rack of enclosures (cf. "a knot of toads", http://www.oxforddictionaries.com/words/what-do-you-call-a-group-of);
  • an enclosure;
  • a controller;
  • a storage device: a rotational or solid-state drive;
  • a node is something capable of running processes. Controllers are one type of node, but a cluster can contain other nodes;
  • a process is a user-space process or kernel executing services;
  • a service is an executable entity that can accept and execute requests;
  • in addition, off each pool version hangs a tree of "v-objects" (rack-v, enclosure-v and controller-v) that specify which hardware elements belong to the pool version. A v-object contains a pointer to the "real object" (rack, enclosure or controller) and the list of children. This indirect arrangement makes it possible for pool versions to share hardware.
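
The v-object indirection can be illustrated with a minimal sketch. The structures below are hypothetical and are not the definitions found in conf/obj.h; the fixed-size child array and all names are assumptions chosen to keep the example short. The point shown is that two v-object trees may reference the same real object while listing different children.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "real" hardware object: rack, enclosure or controller. */
enum sketch_hw_kind { SKETCH_HW_RACK, SKETCH_HW_ENCLOSURE, SKETCH_HW_CONTROLLER };

struct sketch_hw {
	enum sketch_hw_kind kind;
	const char         *name;
};

enum { SKETCH_MAX_CHILDREN = 4 }; /* arbitrary limit for the sketch */

/* Hypothetical v-object: a pointer to the real object plus a child list. */
struct sketch_vobj {
	const struct sketch_hw   *real;
	const struct sketch_vobj *children[SKETCH_MAX_CHILDREN];
	size_t                    nr_children;
};

/*
 * Returns true if the hardware element `hw` belongs to the tree of
 * v-objects rooted at `root`, that is, to the pool version owning it.
 */
static inline bool sketch_vobj_contains(const struct sketch_vobj *root,
					const struct sketch_hw *hw)
{
	size_t i;

	assert(root != NULL && root->nr_children <= SKETCH_MAX_CHILDREN);
	if (root->real == hw)
		return true;
	for (i = 0; i < root->nr_children; ++i) {
		if (sketch_vobj_contains(root->children[i], hw))
			return true;
	}
	return false;
}
```

For example, two pool versions can each hold their own rack-v pointing at the same rack, with one rack-v listing only the first enclosure and the other listing only the second. The rack is shared, yet each pool version sees only its own enclosures.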

Interface

Each configuration element has a unique identifier, which is assigned by the management application. An identifier is 128 bits (m0_fid), with the 8 most significant bits representing the object type.
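
As an illustration of the identifier layout, the sketch below packs an 8-bit object type into the most significant bits of a 128-bit value. It assumes the 128 bits are held as two 64-bit halves; the structure and function names are hypothetical and are not the m0_fid definitions, and the type value used in the usage note is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit identifier stored as two 64-bit halves. */
struct sketch_fid {
	uint64_t hi; /* most significant half; its top 8 bits hold the type */
	uint64_t lo; /* least significant half */
};

enum { SKETCH_FID_TYPE_SHIFT = 56 }; /* 64 - 8 */

/*
 * Builds an identifier from an object type and the remaining 120 bits.
 * `hi_rest` must fit in 56 bits so that it cannot overwrite the type.
 */
static inline struct sketch_fid sketch_fid_make(uint8_t type,
						uint64_t hi_rest, uint64_t lo)
{
	struct sketch_fid fid;

	assert((hi_rest >> SKETCH_FID_TYPE_SHIFT) == 0);
	fid.hi = ((uint64_t)type << SKETCH_FID_TYPE_SHIFT) | hi_rest;
	fid.lo = lo;
	return fid;
}

/* Extracts the object type from the 8 most significant bits. */
static inline uint8_t sketch_fid_type(const struct sketch_fid *fid)
{
	return (uint8_t)(fid->hi >> SKETCH_FID_TYPE_SHIFT);
}
```

With this layout, sketch_fid_make(0x70, 1, 42) yields an identifier whose upper half is 0x7000000000000001, and sketch_fid_type() recovers 0x70 from it. Because the type lives in the identifier itself, an object's kind can be determined without consulting the data-base.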

The Spiel interface is divided into two parts: the configuration management interface and the command interface.

The configuration management interface is transactional. The command interface consists of individual, independent calls.

Invocation

Spiel interface is exported from the standard Motr library, which uses Motr networking for communication. As a result, spiel entry points can be invoked on any node in the cluster.