reindexer

Replication & cluster mode

Reindexer supports async logical master-slave replication. Master must be standalone server or golang application with builtinserver reindexer binding. Slave can be standalone server, or golang application with builtin or builtinserver bindings. Number of active slaves is not limited. Slave can get data from master or other slave node. All nodes must have unique server Id for correct work. Server Id = 0 used for compatibility with old (<2.10) version.

Replication is using 3 different mechanics:

Write ahead log (WAL)

Write ahead log is combination of rows updates and statements execution. WAL is stored as part of namespace storage. Each WAL record has unique 64-bit log sequence number (LSN). WAL is ring structure, therefore after N updates (1M by default) are recorded, oldest WAL records will be automatically removed. WAL in storage contains only records with statements (bulk updates, deletes and index structure changes). Rows updates are not stored as dedicated WAL records, but each document contains it’s own LSN - this is enough to restore complete WAL in RAM on namespace loading from disk.

Side effect of this mechanic is lack of exact sequence of indexes updates/data updates, and therefore in case of incompatible data migration (e.g. indexed field type changed) slave will fail to apply offline WAL, and will fallback to forced sync

WAL overhead is 16 byte of RAM per each ROW update record.

Data integrity check

Replication is complex mechanism and there are present potential possibilities to broke data consistence between master and slave. Reindexer is calculates lightweight incremental hash of all namespace data (DataHash). DataHash is used to quick check, that data of slave is really up to date with master.

Usage

Configuring replication

To configure replication special document of namespace #config is used:

{
	"type":"replication",
	"replication":{
		"role":"none",
		"master_dsn":"cproto://127.0.0.1:6534/db",
		"cluster_id":2,
		"force_sync_on_logic_error": false,
		"force_sync_on_wrong_data_hash": false,
		"namespaces":[]
	}
}

As second option replication can be configured by config file, which will be placed to database folder. Sample of replication config file is here

If config file is present, then it’s overrides settings from #config namespace on reindexer startup

Check replication status

Replication status is available in system namespace #memstats. e.g, execution of statament:

Reindexer> SELECT name,replication FROM #memstats WHERE name='media_items'

will return JSON object with status of namespace media_items replication

{
	"last_lsn":20000000000000004,
	"last_lsn_v2":{
		"server_id":20,
		"counter":5
	},
	"slave_mode":false,
	"replicator_enabled":false,
	"temporary":false,
	"incarnation_counter":0,
	"data_hash":6,
	"data_count":4,
	"updated_unix_nano":1594041127153561000,
	"status":"none",
	"origin_lsn":{
		"server_id":20,
		"counter":5
	},
	"last_self_lsn":{
		"server_id":20,
		"counter":5
	},
	"last_upstream_lsn":{
		"server_id":0,
		"counter":999999999999999
	},
	"wal_count":6,
	"wal_size":311
}

Maximum WAL size configuration

WAL size (maximum number of WAL records) may be configured via #config namespace. For example to set first_namespace’s WAL size to 4000000 and second_namespace’s to 100000 this command may be used:

Reindexer> \upsert #config { "type": "namespaces", "namespaces": [ { "namespace":"first_namespace", "wal_size": 4000000 }, { "namespace":"second_namespace", "wal_size": 100000 } ] }

Default WAL size is 4000000

Access namespace’s WAL with reindexer_tool

To view offline WAL contents from reindexer_tool SELECT statement with special condition to #lsn index is used:

Reindexer> SELECT * FROM epg LIMIT 2 where #lsn > 1000000

To view online WAL updates stream command \SUBSCRIBE is used:

Reindexer> \SUBSCRIBE ON

To test subscription for example execute some write statement:

Reindexer> DELETE FROM media_items WHERE id>100

If subscription is working, then something like this should be displayed

#31864 media_items WalUpdateQuery DELETE FROM media_items WHERE id > 100

Limitations and know issues

Replication is in beta stage, so there are some issues and limitations:

Work around for this issues:
Before enabling replication master make database backup with reindexer_tool, remove database, enable master mode, and then restore database backup. After this WAL will be initialized and master should work properly