Fenix @develop
 
Data Recovery

Functions for storing and restoring data in Fenix.

Macros

#define FENIX_DATA_GROUP_WORLD_ID   10
 
#define FENIX_GROUP_ID_MAX   11
 
#define FENIX_TIME_STAMP_MAX   12
 
#define FENIX_DATA_MEMBER_ALL   15
 
#define FENIX_DATA_MEMBER_ATTRIBUTE_BUFFER   11
 
#define FENIX_DATA_MEMBER_ATTRIBUTE_COUNT   12
 
#define FENIX_DATA_MEMBER_ATTRIBUTE_DATATYPE   13
 
#define FENIX_DATA_MEMBER_ATTRIBUTE_SIZE   14
 
#define FENIX_DATA_SNAPSHOT_LATEST   -1
 
#define FENIX_DATA_SNAPSHOT_ALL   16
 
#define FENIX_DATA_SUBSET_CREATED   2
 
#define FENIX_DATA_POLICY_IN_MEMORY_RAID   13
 

Functions

int Fenix_Data_group_create (int group_id, MPI_Comm comm, int start_time_stamp, int depth, int policy_name, void *policy_value, int *flag)
 Create a Data Group.
 
int Fenix_Data_member_create (int group_id, int member_id, void *buffer, int count, MPI_Datatype datatype)
 Create a data member for store/restore operations.
 
int Fenix_Data_group_get_redundancy_policy (int group_id, int *policy_name, void *policy_value, int *flag)
 Get the storage policy of a data group.
 
int Fenix_Data_wait (Fenix_Request request)
  UNIMPLEMENTED Block on completion of the store operation specified by the request.
 
int Fenix_Data_test (Fenix_Request request, int *flag)
  UNIMPLEMENTED Query completion of the store operation specified by the request.
 
int Fenix_Data_member_store (int group_id, int member_id, Fenix_Data_subset subset_specifier)
 Store a particular group member into the group's resilient storage space, in uncommitted storage.
 
int Fenix_Data_member_storev (int group_id, int member_id, Fenix_Data_subset subset_specifier)
  UNIMPLEMENTED As Fenix_Data_member_store, but subsets may vary rank-to-rank.
 
int Fenix_Data_member_istore (int group_id, int member_id, Fenix_Data_subset subset_specifier, Fenix_Request *request)
  UNIMPLEMENTED As Fenix_Data_member_store, but asynchronous.
 
int Fenix_Data_member_istorev (int group_id, int member_id, Fenix_Data_subset subset_specifier, Fenix_Request *request)
  UNIMPLEMENTED As Fenix_Data_member_istore, but subsets may vary rank-to-rank.
 
int Fenix_Data_commit (int group_id, int *time_stamp)
 Commit stored data members to the group's next snapshot.
 
int Fenix_Data_commit_barrier (int group_id, int *time_stamp)
 As commit, but ensures a globally consistent commit.
 
int Fenix_Data_barrier (int group_id)
  UNIMPLEMENTED Block until all ranks in the group have reached this point.
 
int Fenix_Data_member_restore (int group_id, int member_id, void *target_buffer, int max_count, int time_stamp, Fenix_Data_subset *found_data)
 Restore the data of a group member from a snapshot.
 
int Fenix_Data_member_lrestore (int group_id, int member_id, void *target_buffer, int max_count, int time_stamp, Fenix_Data_subset *found_data)
 Local-only version of Fenix_Data_member_restore.
 
int Fenix_Data_member_restore_from_rank (int group_id, int member_id, void *data, int max_count, int time_stamp, Fenix_Data_subset *found_data, int source_rank)
  UNIMPLEMENTED As Fenix_Data_member_restore, but restores from a specific rank's data.
 
int Fenix_Data_subset_create (int num_blocks, int start_offset, int end_offset, int stride, Fenix_Data_subset *subset_specifier)
 Create a data subset for use in store operations.
 
int Fenix_Data_subset_createv (int num_blocks, int *array_start_offsets, int *array_end_offsets, Fenix_Data_subset *subset_specifier)
 As Fenix_Data_subset_create, but with varying start and end offsets.
 
int Fenix_Data_subset_delete (Fenix_Data_subset *subset_specifier)
 Delete a data subset.
 
int Fenix_Data_group_get_number_of_members (int group_id, int *number_of_members)
  UNIMPLEMENTED Get the number of members in a data group.
 
int Fenix_Data_group_get_member_at_position (int group_id, int *member_id, int position)
  UNIMPLEMENTED Get a member ID based on the member's index.
 
int Fenix_Data_group_get_number_of_snapshots (int group_id, int *number_of_snapshots)
 Get the number of locally-available snapshots in a data group.
 
int Fenix_Data_group_get_snapshot_at_position (int group_id, int position, int *time_stamp)
 Get the time stamp of a snapshot at a given index.
 
int Fenix_Data_member_attr_get (int group_id, int member_id, int attributename, void *attributevalue, int *flag, int source_rank)
  UNIMPLEMENTED Get the value of a member's attribute.
 
int Fenix_Data_member_attr_set (int group_id, int member_id, int attribute_name, void *attribute_value, int *flag)
 Set the value of a member's attribute.
 
int Fenix_Data_snapshot_delete (int group_id, int time_stamp)
 Delete a snapshot from a data group.
 
int Fenix_Data_group_delete (int group_id)
 Delete a data group.
 
int Fenix_Data_member_delete (int group_id, int member_id)
 Delete a data member.
 

Variables

const Fenix_Data_subset FENIX_DATA_SUBSET_FULL
 A stand-in for checkpointing/recovering all available data in a member.
 
const Fenix_Data_subset FENIX_DATA_SUBSET_EMPTY
 A stand-in for checkpointing/recovering none of the available data in a member.
 

Overview

Functions for storing and restoring data in Fenix.

Fenix provides options for redundant storage of application data to facilitate application data recovery in a transparent manner. Fenix contains functions to control consistency of collections of such data, as well as their level of persistence. Functions with the prefix Fenix_Data_ perform store, versioning, restore, and other relevant operations and form the Fenix data recovery API. The user can select a specific set of application data, identified by its location in memory, label it using Fenix_Data_member_create, and copy it into Fenix's redundant storage space through Fenix_Data_member(i)store(v) at a point in time. Subsequently, Fenix_Data_commit finalizes all preceding Fenix store operations involving this data group and assigns a unique time stamp to the resulting data snapshot, marking the data as potentially recoverable after a loss of ranks. Individual pieces of data can then be restored whenever they are needed with Fenix_Data_member_restore, for example after a failure occurs. We note that Fenix's data storage and recovery facility aims primarily to support in-memory recovery.

Populating redundant data storage using Fenix may involve the dispersion of data created by one rank to other ranks within the system, making the store operation semantically a collective operation. However, Fenix does not require store operations to be globally synchronizing. For example, execution of Fenix_Data_member_store for a particular collection of data could be finished in some ranks but not yet in others. If certain ranks nominally participating in the storage operation have no actual data movement responsibility, Fenix is allowed to let them exit the operation immediately. Consequently, Fenix data storage functions should not be used for synchronization purposes.

Multiple distinct pieces (members) of data assigned to Fenix-managed redundant storage can be associated with a specific instance of a Fenix data group to form a semantic unit. Committing such a group ensures that the data involved is available for recovery.


Data Groups

A Fenix data group provides dual functionality. First, it serves as a container for a set of data objects (members) that are committed together, and hence provides transaction semantics. Second, it recognizes that Fenix_Data_member_store is an operation carried out collectively by a group of ranks, but not necessarily by all active ranks in the MPI environment. Hence, it adopts the convenient MPI vehicle of communicators to indicate the subset of ranks involved. Data groups are composed of members that describe the actual application data and the redundancy policy to be used for securely storing the members.

Data groups can and should be recreated after each failure (i.e. do not conditionally skip the creation after initialization).

See Fenix_Data_group_create for creating a data group.


Data Redundancy Policies

Fenix internally uses an extensible system for defining data policies to keep the door open to easily adding new data policies and configuring them on a per-data-group basis. We currently support a single, configurable, memory-based policy.

In Memory Redundancy Policy (IMR)

IMR is referenced with the FENIX_DATA_POLICY_IN_MEMORY_RAID definition, and takes as input an array of integers with the following usage:

The policy is designed to localize recovery as much as possible. Communication amongst group members is required (as failure during recovery operations can lead to inconsistent beliefs about which ranks have recovered data), but groups without recovering ranks may then all recover locally rather than communicating further. Groups need not wait for ranks outside of their group to enter or exit recovery.

These options enable users to trade reliability and computation for memory space, which may be necessary for applications with large memory usage.

Function Documentation

◆ Fenix_Data_commit()

int Fenix_Data_commit ( int group_id,
int * time_stamp )
collective local

Commit stored data members to the group's next snapshot.

This function freezes the current state of a data group, together with all of its application data that has been stored in Fenix's redundant storage, and labels it with a time stamp, thus creating a snapshot of the stored application data. Only data that has been committed is eligible for recovery through Fenix_Data_member_restore. An application needs to call Fenix_Data_wait for all pending asynchronous Fenix_Data_member_istore(v) operations in the group before committing.

Parameters
[in] group_id: The group to commit.
[out] time_stamp: The time stamp of the new snapshot.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.

◆ Fenix_Data_commit_barrier()

int Fenix_Data_commit_barrier ( int group_id,
int * time_stamp )
collective

As commit, but ensures a globally consistent commit.

This function does not behave as a traditional barrier. The commit proceeds once all non-failed ranks reach the barrier. This allows commits to be made when a rank fails after storing all of its data into resilient storage.

Parameters
[in] group_id: The group to commit.
[out] time_stamp: The time stamp of the new snapshot.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.

◆ Fenix_Data_group_create()

int Fenix_Data_group_create ( int group_id,
MPI_Comm comm,
int start_time_stamp,
int depth,
int policy_name,
void * policy_value,
int * flag )
collective

Create a Data Group.

If a group with this group_id was already created in the past and has not been deleted, the parameters of this call are ignored and this function simply serves to coordinate with any ranks that have not yet created this group (e.g. due to a failure).

All calling ranks must pass the same values for the parameters group_id, comm, start_time_stamp, policy_name, and policy_value.

Parameters
group_id: A unique identifier for this group.
comm: A resilient communicator on which the group is formed.
start_time_stamp: The time stamp to be used for the first commit in this group.
depth: The number of successive snapshots of this group that are retained by Fenix, in addition to the most recent one, and that can be recovered by calling Fenix data member restore functions. For example, a depth of 0 means Fenix will keep only the data necessary to restore the most recent snapshot, freeing or overwriting older snapshots automatically. A depth of -1 is currently not supported, but would ordinarily indicate that no snapshots should be removed automatically.
policy_name: Currently, may only be FENIX_DATA_POLICY_IN_MEMORY_RAID.
policy_value: Pointer to data passed along to the policy. See the specific policy for more information.
flag: Pointer to a location in which to store policy-specific status or errors.
Returns
FENIX_SUCCESS, or an error value.
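
The retention rule described for depth can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not Fenix's implementation: the type snapshot_model, the functions model_init and model_commit, and the constant MODEL_MAX_SNAPSHOTS are hypothetical names introduced only for this example. It shows that a group with a given depth holds at most depth + 1 snapshots, with the oldest one dropped on each further commit.

```c
#include <stddef.h>

/* Hypothetical capacity of the model; not a Fenix constant. */
#define MODEL_MAX_SNAPSHOTS 16

/* Illustrative model of a group's retained snapshots (oldest first). */
typedef struct {
    int depth;                        /* snapshots kept besides the newest */
    int count;                        /* snapshots currently retained      */
    int stamps[MODEL_MAX_SNAPSHOTS];  /* retained time stamps              */
} snapshot_model;

/* Returns 0 on success, -1 if depth is negative or exceeds the capacity. */
int model_init(snapshot_model *m, int depth)
{
    if (m == NULL || depth < 0 || depth + 1 > MODEL_MAX_SNAPSHOTS) {
        return -1;
    }
    m->depth = depth;
    m->count = 0;
    return 0;
}

/* Records a new snapshot; drops the oldest one once depth + 1 are held. */
void model_commit(snapshot_model *m, int time_stamp)
{
    if (m->count == m->depth + 1) {
        for (int i = 1; i < m->count; i++) {
            m->stamps[i - 1] = m->stamps[i];
        }
        m->count--;
    }
    m->stamps[m->count++] = time_stamp;
}
```

With depth set to 1, committing time stamps 10, 11, and 12 in this model leaves only 11 and 12 retained; with depth set to 0, only the latest time stamp survives.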

◆ Fenix_Data_group_delete()

int Fenix_Data_group_delete ( int group_id)
local

Delete a data group.

Parameters
[in] group_id: The group to delete.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.

◆ Fenix_Data_group_get_number_of_snapshots()

int Fenix_Data_group_get_number_of_snapshots ( int group_id,
int * number_of_snapshots )

Get the number of locally-available snapshots in a data group.

May include snapshots that are inconsistent across the group.

Parameters
[in] group_id: The group to query.
[out] number_of_snapshots: The number of snapshots in the group.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.

◆ Fenix_Data_group_get_redundancy_policy()

int Fenix_Data_group_get_redundancy_policy ( int group_id,
int * policy_name,
void * policy_value,
int * flag )

Get the storage policy of a data group.

Parameters
group_id: Identifier of the data group to query.
policy_name: The identifier of the data group's policy.
policy_value: A location in which to store the policy values this group's policy was configured with.
flag: A location set to true if a policy value was extracted, else false.
Returns
FENIX_SUCCESS, or an error value.

◆ Fenix_Data_group_get_snapshot_at_position()

int Fenix_Data_group_get_snapshot_at_position ( int group_id,
int position,
int * time_stamp )

Get the time stamp of a snapshot at a given index.

Snapshots are indexed in the reverse of the order in which the user committed them (e.g. the most recent available snapshot has position=0).

Parameters
[in] group_id: The group to query.
[in] position: The index of the snapshot, which must be in the range [0, number_of_snapshots).
[out] time_stamp: The time stamp of the snapshot.
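
The reverse indexing can be made concrete with a short stand-alone sketch. It does not call Fenix; the function stamp_at_position and the array commit_order are hypothetical names used only for illustration. Given time stamps listed in the order they were committed (oldest first), position 0 maps to the last entry.

```c
#include <stddef.h>

/* Illustrative only: commit_order holds n time stamps, oldest first.
 * Writes the time stamp at the given reverse-order position and returns 0,
 * or returns -1 if position is outside [0, n) or a pointer is NULL. */
int stamp_at_position(const int *commit_order, int n, int position,
                      int *time_stamp)
{
    if (commit_order == NULL || time_stamp == NULL ||
        position < 0 || position >= n) {
        return -1;
    }
    *time_stamp = commit_order[n - 1 - position];
    return 0;
}
```

For time stamps committed in the order 3, 4, 5, position 0 yields 5 and position 2 yields 3, while position 3 is rejected as out of range.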

◆ Fenix_Data_member_attr_set()

int Fenix_Data_member_attr_set ( int group_id,
int member_id,
int attribute_name,
void * attribute_value,
int * flag )

Set the value of a member's attribute.

Valid names are FENIX_DATA_MEMBER_ATTRIBUTE_BUFFER, FENIX_DATA_MEMBER_ATTRIBUTE_COUNT, and FENIX_DATA_MEMBER_ATTRIBUTE_DATATYPE.

The COUNT and DATATYPE attributes may only be set before the first store operation. Contrary to the Fenix specification, returning to Fenix_Init after a failure does not allow the user to set these attributes again.

Parameters
[in] group_id: The group to update.
[in] member_id: The member to update.
[in] attribute_name: The attribute to update.
[in] attribute_value: The new value of the attribute.
[out] flag: Set to true if the attribute was set, else false.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.

◆ Fenix_Data_member_create()

int Fenix_Data_member_create ( int group_id,
int member_id,
void * buffer,
int count,
MPI_Datatype datatype )
collective local

Create a data member for store/restore operations.

All calling ranks in the group's communicator must pass the same values for the parameters member_id, datatype, and group_id.

Parameters
group_id: Identifier of the data group within which to create the member.
member_id: An integer, unique within the data group, that identifies the data in buffer. Must be nonnegative and less than FENIX_MEMBER_ID_MAX, which is guaranteed to be at least 2^30.
buffer: Address of the data to be copied to redundant storage maintained by Fenix. This parameter may also be set using Fenix_Data_member_attr_set. Doing so is critical for non-survivor ranks after a failure: they hold an invalid address that was generated on the failed rank and must update it.
count: The maximum number of contiguous elements of type datatype to be stored. Need not be the same in all calling ranks.
datatype: The MPI_Datatype of the elements in buffer.
Returns
FENIX_SUCCESS, or an error value.

◆ Fenix_Data_member_delete()

int Fenix_Data_member_delete ( int group_id,
int member_id )
local

Delete a data member.

Parameters
[in] group_id: The group to delete from.
[in] member_id: The member to delete.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.

◆ Fenix_Data_member_lrestore()

int Fenix_Data_member_lrestore ( int group_id,
int member_id,
void * target_buffer,
int max_count,
int time_stamp,
Fenix_Data_subset * found_data )

Local-only version of Fenix_Data_member_restore.

This function restores the data of a group member from the local snapshot.

Parameters
[in] group_id: The group to restore from.
[in] member_id: The member to restore.
[out] target_buffer: The buffer in which to store the restored data.
[in] max_count: The maximum number of elements to restore.
[in] time_stamp: The time stamp of the snapshot to restore from.
[out] found_data: The subset of the data that was found in the snapshot.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.

◆ Fenix_Data_member_restore()

int Fenix_Data_member_restore ( int group_id,
int member_id,
void * target_buffer,
int max_count,
int time_stamp,
Fenix_Data_subset * found_data )
collective

Restore the data of a group member from a snapshot.

All ranks in the group’s resilient communicator must pass the same values for the parameters group_id, member_id, and time_stamp. This function is used to retrieve data from consistent snapshot members. This function can only be used if the size of the communicator used to store the data is the same as that at the time of data recovery (this implies non-shrinking communicator recovery in case of a rank loss).

If the size of the buffer needing to receive the recovery data is unknown for a particular rank, it can be queried using Fenix_Data_member_attr_get.

Parameters
[in] group_id: The group to restore from.
[in] member_id: The member to restore.
[out] target_buffer: The buffer in which to store the restored data.
[in] max_count: The maximum number of elements to restore.
[in] time_stamp: The time stamp of the snapshot to restore from.
[out] found_data: The subset of the data that was found in the snapshot.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.

◆ Fenix_Data_member_store()

int Fenix_Data_member_store ( int group_id,
int member_id,
Fenix_Data_subset subset_specifier )
collective

Store a particular group member into the group's resilient storage space, in uncommitted storage.

The user can safely modify the member's data buffer after this call, as the current state is copied immediately. Multiple calls may be used to incrementally store data (using subset_specifiers), or overwrite old data prior to a commit.

Parameters
group_id: All ranks must provide the same group_id.
member_id: All ranks must provide the same member_id.
subset_specifier: Which subset of the data to store. It is always valid for every rank to provide the same subset_specifier; depending on the group's policy, varying combinations of specifiers may be possible.
Returns
FENIX_SUCCESS, or an error value.

◆ Fenix_Data_snapshot_delete()

int Fenix_Data_snapshot_delete ( int group_id,
int time_stamp )
local

Delete a snapshot from a data group.

Parameters
[in] group_id: The group to delete from.
[in] time_stamp: The time stamp of the snapshot to delete.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.

◆ Fenix_Data_subset_create()

int Fenix_Data_subset_create ( int num_blocks,
int start_offset,
int end_offset,
int stride,
Fenix_Data_subset * subset_specifier )

Create a data subset for use in store operations.

Creates a subset based on num_blocks pairs of {start_offset,end_offset}, {start_offset+stride,end_offset+stride}, {start_offset+2*stride,end_offset+2*stride}, etc.

The value of start_offset must be smaller than or equal to the value of end_offset to indicate non-negative block size. Otherwise, the function returns an error code.

Created subsets must be deleted with Fenix_Data_subset_delete to free memory.

Parameters
[in] num_blocks: The number of contiguous data blocks.
[in] start_offset: The index of the first element in the first data block.
[in] end_offset: The index of the last element in the first data block.
[in] stride: Regular shift between successive data blocks.
[out] subset_specifier: The created subset.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.
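
The strided block layout described above can be spelled out with a short stand-alone sketch. It does not call Fenix and is not how Fenix stores subsets internally; the function expand_subset_blocks and its output arrays are hypothetical names used only to make the arithmetic concrete. Block i spans start_offset + i*stride through end_offset + i*stride.

```c
#include <stddef.h>

/* Illustrative only: writes the explicit {start, end} pair of each block
 * into starts[i] and ends[i], which must each hold num_blocks entries.
 * Returns 0 on success, or -1 if start_offset > end_offset, num_blocks is
 * negative, or an output pointer is NULL. */
int expand_subset_blocks(int num_blocks, int start_offset, int end_offset,
                         int stride, int *starts, int *ends)
{
    if (num_blocks < 0 || start_offset > end_offset ||
        starts == NULL || ends == NULL) {
        return -1;
    }
    for (int i = 0; i < num_blocks; i++) {
        starts[i] = start_offset + i * stride;
        ends[i]   = end_offset + i * stride;
    }
    return 0;
}
```

For num_blocks = 3, start_offset = 0, end_offset = 1, and stride = 4, the blocks are {0,1}, {4,5}, and {8,9}: two elements out of every four.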

◆ Fenix_Data_subset_createv()

int Fenix_Data_subset_createv ( int num_blocks,
int * array_start_offsets,
int * array_end_offsets,
Fenix_Data_subset * subset_specifier )

As Fenix_Data_subset_create, but with varying start and end offsets.

Creates a subset based on num_blocks pairs of {start_offset,end_offset}, taken element-wise from array_start_offsets and array_end_offsets. Each start offset must be smaller than or equal to its corresponding end offset to indicate non-negative block size. Otherwise, the function returns an error code.

Created subsets must be deleted with Fenix_Data_subset_delete to free memory.

Parameters
[in] num_blocks: The number of contiguous data blocks.
[in] array_start_offsets: The index of the first element in each data block.
[in] array_end_offsets: The index of the last element in each data block.
[out] subset_specifier: The created subset.
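
Before building offset arrays for this call, it can help to check them and to see how many elements they cover. The following stand-alone sketch does not call Fenix; count_subset_elements is a hypothetical helper introduced for illustration. It treats each end offset as inclusive, as the parameter descriptions state, and it assumes the blocks do not overlap (overlapping elements would be counted more than once).

```c
#include <stddef.h>

/* Illustrative only: returns the total number of elements covered by
 * num_blocks inclusive {start, end} pairs, or -1 if any start offset
 * exceeds its end offset, num_blocks is negative, or a pointer is NULL. */
long count_subset_elements(int num_blocks, const int *starts, const int *ends)
{
    if (num_blocks < 0 || starts == NULL || ends == NULL) {
        return -1;
    }
    long total = 0;
    for (int i = 0; i < num_blocks; i++) {
        if (starts[i] > ends[i]) {
            return -1;  /* negative block size */
        }
        total += (long)ends[i] - (long)starts[i] + 1;
    }
    return total;
}
```

For start offsets {0, 10, 20} and end offsets {3, 10, 29}, the blocks cover 4 + 1 + 10 = 15 elements.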

◆ Fenix_Data_subset_delete()

int Fenix_Data_subset_delete ( Fenix_Data_subset * subset_specifier)

Delete a data subset.

Frees the memory associated with a data subset object.

Parameters
[in] subset_specifier: The subset to delete.
Returns
FENIX_SUCCESS if successful; any other return code indicates an error.