1. Introduction

This manual describes the procedures for generating HBase backups and restoring the database from them. The general procedures for making HBase backups are covered in the Apache HBase manual, and this document highlights the particularities required for performing HBase backups of GBDS servers.

2. Configuration

The first steps for performing a database backup are described starting in Section 83 of the Apache HBase manual.

In Section 86, pay attention to the settings, which are a prerequisite for the procedure.

The procedures in Section 86.2 must be executed through the Ambari administration page:

http://<ambari-server>:8080/#/main/services/HBASE/configs

To configure the environment for backup, open the HBase service from the Ambari main page, then open the Configs tab and the Advanced menu, and apply the following configuration:

  1. Enter the Advanced hbase-site menu and set the hbase.coprocessor.region.classes property value to org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint,org.apache.hadoop.hbase.backup.BackupObserver

  2. Enter the Custom hbase-site menu and, through the Add Property option, add the following properties:

    hbase.backup.enable

    Value: true

    hbase.master.logcleaner.plugins

    Value: org.apache.hadoop.hbase.backup.master.BackupLogCleaner

    hbase.procedure.master.classes

    Value: org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager

    hbase.procedure.regionserver.classes

    Value: org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager

    hbase.master.hfilecleaner.plugins

    Value: org.apache.hadoop.hbase.backup.BackupHFileCleaner

    hbase.regionserver.thread.compaction.small

    Value: 3
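After the settings are saved and the HBase service is restarted, it can be useful to confirm that the properties reached the configuration file on a given host. The following is a minimal sketch, not an official procedure: it only checks that each property name listed above is present in a file, not that its value is correct. The function name check_backup_props is hypothetical, and the file location used in the usage note is an assumption that may differ in your installation.

```shell
#!/usr/bin/env bash
# Sketch: report which backup-related properties are absent from an hbase-site.xml file.
# Only the presence of each <name> element is checked; values are not validated.

required_props="hbase.coprocessor.region.classes
hbase.backup.enable
hbase.master.logcleaner.plugins
hbase.procedure.master.classes
hbase.procedure.regionserver.classes
hbase.master.hfilecleaner.plugins
hbase.regionserver.thread.compaction.small"

# check_backup_props <path-to-hbase-site.xml>
# Prints one "MISSING: <property>" line per absent property.
# Returns 1 if at least one property is missing, 0 otherwise.
check_backup_props() {
  local site_file="$1" status=0 prop
  for prop in $required_props; do
    if ! grep -qF "<name>${prop}</name>" "$site_file"; then
      echo "MISSING: ${prop}"
      status=1
    fi
  done
  return "$status"
}
```

A possible invocation, assuming the file lives under /etc/hbase/conf on the host, is: check_backup_props /etc/hbase/conf/hbase-site.xml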

3. Creating Backups

Section 87.1 describes the creation of the backup file, which is initially created within HDFS and must later be moved to a storage location outside the HBase volume.

The following command is used to create a backup file, whether full or incremental:

hbase backup create <type> hdfs://<NAME-NODE-SERVER>:8020/<HDFS BACKUP DIR> -t anomalies,people,transactions,uls,unresolvedlatent,quality,transactionkeys -w 3

The following variables must be changed according to your environment before running the command above:

  • type

    This variable defines whether the backup being created is full or incremental.

  • NAME-NODE-SERVER

    This variable refers to the NameNode server of the environment where the backup will be created. The same applies when exporting and restoring backups.

  • HDFS BACKUP DIR

    This variable refers to the HDFS directory where the backup file will be stored.

  • -t

    This option must be followed by the comma-separated list of HBase tables to include in the backup file. The example above contains the default tables that must be included in a GBDS backup file.

  • -w

    This option must be followed by the number of workers dedicated to performing the backup.
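The variables above can be filled in by a small helper that prints the final command for review before it is run. This is only a sketch under stated assumptions: the function name build_backup_cmd, the host namenode.example.com, and the directory backups/gbds are hypothetical placeholders, and the type is assumed to be either full or incremental, as described above.

```shell
#!/usr/bin/env bash
# Sketch: compose the "hbase backup create" command from the variables described above.
# The command is only printed (dry run); nothing is executed against the cluster.

# build_backup_cmd <type> <name-node-server> <hdfs-backup-dir> [workers]
build_backup_cmd() {
  local type="$1" name_node="$2" backup_dir="$3" workers="${4:-3}"
  # Default GBDS tables, taken from the command shown in this section.
  local tables="anomalies,people,transactions,uls,unresolvedlatent,quality,transactionkeys"

  # Reject anything other than the two backup types.
  case "$type" in
    full|incremental) ;;
    *) echo "type must be 'full' or 'incremental'" >&2; return 1 ;;
  esac

  echo "hbase backup create ${type} hdfs://${name_node}:8020/${backup_dir} -t ${tables} -w ${workers}"
}

# Dry run with hypothetical values: prints the command instead of running it.
build_backup_cmd full namenode.example.com backups/gbds
```

Printing the command first makes it easy to check the NameNode address and target directory before starting a long-running backup.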

Note

The default number of replicas created for the backup file is 3. To change this value, run the following command:

hdfs dfs -setrep -R 2 hdfs://<NAME-NODE-SERVER>:8020/<HDFS BACKUP DIR>

The HDFS BACKUP DIR must be the same path used as destination when creating the backup file.

The -setrep option sets a new replication factor for the backup file. It is followed by the -R option, which makes the operation recursive (it is applied to every file and folder within the specified path), and by the new number of replicas (in this case, 2).
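Because HDFS keeps one copy of every block per replica, the raw disk space a backup consumes is roughly its logical size multiplied by the replication factor. The arithmetic can be sketched as follows; the function name raw_hdfs_usage and the 500 GB figure are purely illustrative, not measurements from a GBDS environment.

```shell
#!/usr/bin/env bash
# Sketch: estimate raw HDFS space used by a backup for a given replication factor.

# raw_hdfs_usage <logical-size-gb> <replication-factor>
# Prints logical size multiplied by the replication factor (integer GB).
raw_hdfs_usage() {
  echo $(( $1 * $2 ))
}

raw_hdfs_usage 500 3   # prints 1500
raw_hdfs_usage 500 2   # prints 1000
```

In this illustrative case, lowering the replication factor from 3 to 2 frees about one third of the raw space, at the cost of one fewer redundant copy.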

4. Exporting the Backup File

To export the backup file, use the following command to copy it from HDFS to the local drive, from where it can be moved to an external location:

hdfs dfs -get hdfs://<NAME-NODE-SERVER>:8020/<HDFS> /<LOCAL-DRIVE-DIR>

The variables for this command are the same as those presented in Creating Backups.

The LOCAL-DRIVE-DIR variable must be replaced with the path on the local drive to which the backup file will be copied.
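After the export, one simple sanity check is to compare the number of files on the local drive with the number reported by HDFS. The sketch below covers only the local side; the function name local_file_count is hypothetical. Checksum side files (names matching .*.crc) are excluded as a precaution, in case the local copy contains them.

```shell
#!/usr/bin/env bash
# Sketch: count the regular files of an exported backup on the local drive.

# local_file_count <dir>
# Prints the number of regular files under <dir> (recursive),
# ignoring local checksum side files named like ".<file>.crc".
local_file_count() {
  find "$1" -type f ! -name '.*.crc' | wc -l | tr -d ' '
}

# Usage (not executed here, since it needs a live cluster):
#   hdfs dfs -count hdfs://<NAME-NODE-SERVER>:8020/<HDFS BACKUP DIR>
#     -> the second column of the output is the file count in HDFS
#   local_file_count /<LOCAL-DRIVE-DIR>
#     -> should report the same number if the copy is complete
```

Matching counts do not prove the contents are identical, but a mismatch is a quick indication that the export was interrupted.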

5. Restoring Backups

Section 87.2 describes the procedure for restoring the database state from a previous backup.

There are two options for restoring a backup file: to the same environment (without exporting the backup file from HDFS) or to a different environment (importing an external backup file). The following sections detail each case.

5.1. Restoring to the Same Environment

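The following commands are used to restore a backup within the same HDFS. The first lists the existing backups, so that the identifier of the backup to restore can be found, and the second performs the restoration: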

hbase backup history
hbase restore hdfs://<NAME-NODE-SERVER>:8020/<HDFS BACKUP DIR> <backup-id> -o -t anomalies,people,transactions,uls,unresolvedlatent,quality,transactionkeys

In the command above, HDFS BACKUP DIR must be replaced with the path to the backup file that will be restored, and backup-id with the unique identifier of the backup that will be restored.

The -o option causes the current data to be overwritten by the restored backup, and the -t option lists the HBase tables that must be restored from the backup.
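When the most recent backup is the one to restore, its identifier can be pulled out of the history listing instead of being copied by hand. The sketch below assumes that backup identifiers have the form backup_ followed by a numeric timestamp, so that the highest number is the newest backup. The exact layout of the history output is not assumed, and the sample lines in the demo are hypothetical. The function name latest_backup_id is also hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: pick the highest-numbered backup id from text read on standard input.
# Assumption: ids look like "backup_<digits>" and a larger number means a newer backup.

# latest_backup_id
# Reads text on stdin, extracts every "backup_<digits>" token,
# sorts the tokens numerically by their suffix and prints the largest one.
latest_backup_id() {
  grep -oE 'backup_[0-9]+' | sort -t_ -k2,2n | tail -n 1
}

# Demo with hypothetical sample lines (not real history output):
printf 'ID=backup_1467823988425\nID=backup_1467827588425\n' | latest_backup_id
# prints backup_1467827588425

# Against a live cluster the input would come from the history command:
#   hbase backup history | latest_backup_id
```

The helper does not distinguish full from incremental backups or check their status, so the selected identifier should still be confirmed against the history listing before restoring.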

5.2. Restoring to a Different Environment

To restore the backup in a different environment, the backup must first have been exported from the original environment through the process described in the Exporting the Backup File section. Once it has been moved to the local drive of the new environment, it must be put into HDFS with the following command:

hdfs dfs -put <LOCAL DRIVE BACKUP DIR> hdfs://<NAME-NODE-SERVER>:8020/<HDFS BACKUP DIR>

LOCAL DRIVE BACKUP DIR must be the path on the local drive where the backup file is located, and HDFS BACKUP DIR must be the path within the local HDFS where the backup will be stored. Once the backup is imported, the restoration process is the same as described in the Restoring to the Same Environment section.
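The two steps for a different environment, import and then restore, can be laid out together so that both use the same HDFS target. The following is a dry-run sketch only: the function name print_import_and_restore and all argument values in the example are hypothetical, and the commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: print, in order, the import and restore commands for a different environment.
# Nothing is executed; the output is meant to be reviewed and then run manually.

# print_import_and_restore <local-backup-dir> <name-node-server> <hdfs-backup-dir> <backup-id>
print_import_and_restore() {
  local local_dir="$1" name_node="$2" backup_dir="$3" backup_id="$4"
  local target="hdfs://${name_node}:8020/${backup_dir}"
  # Default GBDS tables, taken from the restore command shown earlier.
  local tables="anomalies,people,transactions,uls,unresolvedlatent,quality,transactionkeys"

  # Step 1: copy the exported backup from the local drive into HDFS.
  echo "hdfs dfs -put ${local_dir} ${target}"
  # Step 2: restore from the imported backup, overwriting current data (-o).
  echo "hbase restore ${target} ${backup_id} -o -t ${tables}"
}

# Example with hypothetical values:
print_import_and_restore /data/gbds-backup namenode.example.com backups/gbds backup_1467823988425
```

Deriving both commands from a single target path avoids restoring from a directory other than the one the backup was imported into.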