Motr  M0
idx_cass_instance Struct Reference

Data Fields

char * ci_keyspace
 
CassCluster * ci_cluster
 
CassSession * ci_session
 

Detailed Description

A few notes on data model for Cassandra index service.

The Motr index API defines a key-value interface supporting only a few simple operations (GET/PUT/DEL/NEXT). The driver for the Cassandra index service should make good use of Cassandra's advanced features to achieve good performance, load balancing, scalability and HA. The following requirements were kept in mind:

(1) The Motr index interface requires K-V pairs to be sorted by key, particularly for the NEXT operation.

(2) A Motr application may use the index API in different ways. Take S3 server as an example: it is an application which uses the Motr index API with Cassandra as the backend index service.

(a) There may be a large number of indices (for example, one index for each S3 bucket), and the number of K-V pairs in an index may vary from hundreds to hundreds of thousands. If each index is mapped directly to a column family (CF), the large number of CFs can cause heavy memory usage. Tens or hundreds of CFs is the suggested range.

(b) Cassandra is naturally optimised for write-heavy workloads, while S3 server exhibits a read-dominant workload. The Cassandra data model therefore has to be optimised for reads.

(3) Atomic modifications on index.

(4) Use BATCH and PREPARED queries whenever it is possible.
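Requirement (1) can be illustrated with a small in-memory model. The sketch below is not taken from idx_cass.c: `struct kv_key`, `kv_key_cmp()` and `kv_next()` are hypothetical names, keys are assumed to compare as unsigned bytes with a shorter prefix sorting first, and NEXT is assumed to return the first key strictly greater than the start key (whether the start key itself is included is an assumption here).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key type for illustration; not a Motr or Cassandra type. */
struct kv_key {
	const void *k_data;
	size_t      k_len;
};

/*
 * Unsigned byte-wise comparison. When one key is a strict prefix of the
 * other, the shorter key sorts first.
 */
int kv_key_cmp(const struct kv_key *a, const struct kv_key *b)
{
	size_t min = a->k_len < b->k_len ? a->k_len : b->k_len;
	int    rc  = min == 0 ? 0 : memcmp(a->k_data, b->k_data, min);

	if (rc != 0)
		return rc;
	return (a->k_len > b->k_len) - (a->k_len < b->k_len);
}

/*
 * NEXT over an array already sorted with kv_key_cmp(): binary search for
 * the position of the first key strictly greater than `start`.
 * Returns `nr` when no such key exists.
 */
size_t kv_next(const struct kv_key *keys, size_t nr,
	       const struct kv_key *start)
{
	size_t lo = 0;
	size_t hi = nr;

	assert(start != NULL && (keys != NULL || nr == 0));
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (kv_key_cmp(&keys[mid], start) <= 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}
```

Because the keys are kept sorted, a NEXT request costs one logarithmic search followed by a sequential read. That is the property the index service needs from its backend store.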

Motr defines the following schema for Cassandra index service:

CREATE TABLE cass_index_store (
    index_fid TEXT,
    key       BLOB,
    value     BLOB,
    PRIMARY KEY (index_fid, key)
);
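To make the link between the four index operations and this schema concrete, here is a minimal sketch of CQL statement text with `?` bind markers, suitable for preparing once and reusing as requirement (4) suggests. The helper `idx_query_format()`, the `idx_op` enum and the exact statement wording are assumptions for illustration, not the queries used by idx_cass.c. The value column is assumed to be named `value`, and NEXT is assumed to exclude the start key.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical operation codes mirroring GET/PUT/DEL/NEXT. */
enum idx_op { IDX_OP_GET, IDX_OP_PUT, IDX_OP_DEL, IDX_OP_NEXT, IDX_OP_NR };

/* Statement text before and after the "<keyspace>.<table>" name. */
struct idx_stmt {
	const char *s_head;
	const char *s_tail;
};

static const struct idx_stmt idx_stmts[IDX_OP_NR] = {
	[IDX_OP_GET]  = { "SELECT value FROM ",
			  " WHERE index_fid = ? AND key = ?" },
	[IDX_OP_PUT]  = { "INSERT INTO ",
			  " (index_fid, key, value) VALUES (?, ?, ?)" },
	[IDX_OP_DEL]  = { "DELETE FROM ",
			  " WHERE index_fid = ? AND key = ?" },
	/* Range scan inside one partition, in clustering-key order. */
	[IDX_OP_NEXT] = { "SELECT key, value FROM ",
			  " WHERE index_fid = ? AND key > ? LIMIT ?" },
};

/*
 * Writes the statement for `op` into `buf`.
 * Returns the statement length, or -1 on bad input or truncation.
 */
int idx_query_format(enum idx_op op, const char *keyspace, const char *table,
		     char *buf, size_t len)
{
	int n;

	assert(keyspace != NULL && table != NULL && buf != NULL);
	if ((unsigned)op >= IDX_OP_NR ||
	    strlen(keyspace) == 0 || strlen(table) == 0)
		return -1;
	n = snprintf(buf, len, "%s%s.%s%s", idx_stmts[op].s_head,
		     keyspace, table, idx_stmts[op].s_tail);
	return n < 0 || (size_t)n >= len ? -1 : n;
}
```

In a real driver the keyspace argument would presumably come from ci_keyspace, and the resulting text would be prepared on ci_session before the index FID, key and value are bound.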

A few notes on this schema:

(1) As shown in the schema definition, the row key is the index FID, which spreads rows across nodes (machines) by index.

(2) 'key' is the clustering key, so keys are physically sorted on disk (to efficiently support the NEXT query). 'key' is of BLOB type to allow applications to customise their keys and values.

(3) The schema implies that wide rows are used. A few limitations of wide rows: maximum key (name) size is 64KB, maximum size of a column value is 2GB, and the maximum number of cells in a row is 2 billion.

(4) A mapping layer is introduced to translate a Motr index to a physical Cassandra keyspace and column family. By making careful use of the index FID, it is also possible to group indices into different keyspaces or CFs. For example, Motr can map S3 bucket indices (object lists) into one or more CFs, while mapping the indices from object id to object metadata into another keyspace or CF.
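The mapping layer can be sketched as follows. This is only one possible scheme under stated assumptions, not the mapping implemented in idx_cass.c: `struct idx_fid` is a hypothetical stand-in for an index FID made of two 64-bit words, `IDX_NR_TABLES` is an assumed table count in the suggested "tens" range, and the `cass_index_store_<n>` table names and the hexadecimal row-key format are invented for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for an index FID: two 64-bit words. */
struct idx_fid {
	uint64_t f_container;
	uint64_t f_key;
};

/* Assumed number of physical tables shared by all indices. */
enum { IDX_NR_TABLES = 16 };

/* Picks the table slot for an index by folding both FID words together. */
unsigned idx_table_slot(const struct idx_fid *fid)
{
	assert(fid != NULL);
	return (unsigned)((fid->f_container ^ fid->f_key) % IDX_NR_TABLES);
}

/*
 * Formats the physical table name for an index.
 * Returns the name length, or -1 if `buf` is too small.
 */
int idx_table_name(const struct idx_fid *fid, char *buf, size_t len)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "cass_index_store_%u", idx_table_slot(fid));
	return n < 0 || (size_t)n >= len ? -1 : n;
}

/*
 * Formats the TEXT row key (the index_fid column) for an index.
 * Returns the key length, or -1 if `buf` is too small.
 */
int idx_row_key(const struct idx_fid *fid, char *buf, size_t len)
{
	int n;

	assert(fid != NULL && buf != NULL);
	n = snprintf(buf, len, "%" PRIx64 ":%" PRIx64,
		     fid->f_container, fid->f_key);
	return n < 0 || (size_t)n >= len ? -1 : n;
}
```

With this scheme every index still owns exactly one row, but the number of column families stays fixed regardless of how many indices exist. Grouping by purpose, such as bucket listings versus object metadata, could be added by selecting the keyspace from a different part of the FID.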

Definition at line 93 of file idx_cass.c.

Field Documentation

◆ ci_cluster

CassCluster* ci_cluster

Definition at line 95 of file idx_cass.c.

◆ ci_keyspace

char* ci_keyspace

Definition at line 94 of file idx_cass.c.

◆ ci_session

CassSession* ci_session

Definition at line 96 of file idx_cass.c.


The documentation for this struct was generated from the following file: