# Backup

## Introduction

This manual describes how to generate HBase backups and how to restore the database from those backups. The general HBase backup process is described in the [Apache HBase Manual](https://hbase.apache.org/book.html#backuprestore); this document covers the particularities of backing up HBase on the GBDS server.

## Settings

The first steps of the database backup procedure are described in [Section 83](https://hbase.apache.org/book.html#br.overview).

In [Section 86](https://hbase.apache.org/book.html#br.initial.setup), pay attention to the settings, which are prerequisites for the procedure.

The procedures in [Section 86.2](https://hbase.apache.org/book.html#_hbase_specific_changes) must be executed through the Ambari administration page:

> `http://<ambari-server>:8080/#/main/services/HBASE/configs`

To configure the environment for backup, open the **HBase** service from the **Ambari** main page, go to the **Configs** tab, and open the **Advanced** menu to apply the following settings:

1. Open the **Advanced hbase-site** menu and set the value of the property `hbase.coprocessor.region.classes` to `org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint,org.apache.hadoop.hbase.backup.BackupObserver`.
2. Open **Custom hbase-site** and, using the **Add Property** option, add the following properties:

   **hbase.backup.enable**

   > Value: `true`

   **hbase.master.logcleaner.plugins**

   > Value: `org.apache.hadoop.hbase.backup.master.BackupLogCleaner`

   **hbase.procedure.master.classes**

   > Value: `org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager`

   **hbase.procedure.regionserver.classes**

   > Value: `org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager`

   **hbase.master.hfilecleaner.plugins**

   > Value: `org.apache.hadoop.hbase.backup.BackupHFileCleaner`

   **hbase.regionserver.thread.compaction.small**

   > Value: `3`
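
Once the settings are saved and the HBase service restarted, it can be useful to confirm that the properties reached the rendered configuration file. The following is a minimal sketch, not part of the official procedure: `check_backup_props` is a hypothetical helper, and the default path `/etc/hbase/conf/hbase-site.xml` is an assumption that may differ in your installation.

```shell
#!/usr/bin/env bash
# Sketch: report which backup-related properties are present in an hbase-site.xml.
# The default path is an assumption; pass your own file as the first argument.

check_backup_props() {
  local site_file="${1:-/etc/hbase/conf/hbase-site.xml}"
  local missing=0 prop
  for prop in \
    hbase.coprocessor.region.classes \
    hbase.backup.enable \
    hbase.master.logcleaner.plugins \
    hbase.procedure.master.classes \
    hbase.procedure.regionserver.classes \
    hbase.master.hfilecleaner.plugins \
    hbase.regionserver.thread.compaction.small; do
    # Look for the literal <name>...</name> element of each property.
    if grep -qF "<name>${prop}</name>" "$site_file" 2>/dev/null; then
      echo "OK       ${prop}"
    else
      echo "MISSING  ${prop}"
      missing=1
    fi
  done
  # Returns 0 only when every property was found.
  return "$missing"
}
```

Calling `check_backup_props` with no argument checks the assumed default path. The check only confirms that each property name exists; the values still need to be compared against the list above.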

## Creating Backups

[Section 87.1](https://hbase.apache.org/book.html#br.creating.complete.backup) describes how to create the backup file, which is initially created inside HDFS and must then be moved to a location outside the HBase folders.

The following command is used to create the backup file, whether full or incremental:

```shell
hbase backup create <type> hdfs://<NAME-NODE-SERVER>:8020/<HDFS BACKUP DIR> -t anomalies,people,transactions,uls,unresolvedlatent,quality,transactionkeys -w 3
```

Replace the following variables with values from your environment before running the command above:

* **type**

  > Defines whether the backup is `full` or `incremental`.
* **NAME-NODE-SERVER**

  > The NameNode server of the environment where the backup will be created. The same value is used when exporting and restoring backups.
* **HDFS BACKUP DIR**

  > The HDFS directory where the backup file will be stored.
* **-t**

  > Must be followed by a comma-separated list of the HBase tables to include in the backup file. The example above lists the standard tables that should be included in a GBDS backup.
* **-w**

  > Must be followed by the number of workers dedicated to performing the backup.
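
The substitutions above can be sketched as a small helper that assembles the command and rejects an invalid backup type. This is an illustrative sketch only: `build_backup_cmd` and `GBDS_TABLES` are hypothetical names, and the helper prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Sketch: assemble the backup command from environment-specific values.
# build_backup_cmd is a hypothetical helper; it only prints the command.

GBDS_TABLES="anomalies,people,transactions,uls,unresolvedlatent,quality,transactionkeys"

build_backup_cmd() {
  local type="$1" name_node="$2" backup_dir="$3" workers="${4:-3}"
  case "$type" in
    full|incremental) ;;
    *) echo "type must be 'full' or 'incremental'" >&2; return 1 ;;
  esac
  # Strip a leading slash so the URI has a single "/" before the directory.
  backup_dir="${backup_dir#/}"
  echo "hbase backup create ${type} hdfs://${name_node}:8020/${backup_dir} -t ${GBDS_TABLES} -w ${workers}"
}
```

For example, `build_backup_cmd full namenode.example.com backups/gbds` prints the full-backup command for a placeholder host and directory, which can be reviewed before it is executed.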

{% hint style="info" %}
The default number of replicas that will be created for the backup file is `3`. To change this value, run the following command:

```shell
hdfs dfs -setrep -R 2 hdfs://<NAME-NODE-SERVER>:8020/<HDFS BACKUP DIR>
```

{% endhint %}

In this command, **HDFS BACKUP DIR** must be the same path used as the destination when the backup file was created.

The **-setrep** option sets a new replication factor for the backup file, given by the number that follows it (in this case, `2`). When the path is a directory, the change is applied recursively to every file under it; the **-R** flag is accepted for backward compatibility.
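
To see what a change in replication factor means for storage, recall that HDFS keeps one full copy of each block per replica, so the raw space used is the logical size multiplied by the replication factor. The sketch below works through that arithmetic; `raw_usage_bytes` is a hypothetical helper and the 500 GiB figure is an invented example, not a measured GBDS value.

```shell
#!/usr/bin/env bash
# Sketch: estimate raw HDFS space used by a backup for a given replication factor.
# Raw usage = logical size x replication factor.

raw_usage_bytes() {
  local logical_bytes="$1" replicas="$2"
  echo $(( logical_bytes * replicas ))
}

# Hypothetical example: a backup with a logical size of 500 GiB.
logical=$(( 500 * 1024 * 1024 * 1024 ))
echo "replicas=3: $(raw_usage_bytes "$logical" 3) bytes"
echo "replicas=2: $(raw_usage_bytes "$logical" 2) bytes"
```

Lowering the factor from `3` to `2` cuts the raw space by one third, at the cost of one fewer copy of each block.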

## Exporting the Backup File

To export a backup file, use the following command to copy it from HDFS to the local drive. From there, it can be moved to an external location:

```shell
hdfs dfs -get hdfs://<NAME-NODE-SERVER>:8020/<HDFS BACKUP DIR> /<LOCAL-DRIVE-DIR>
```

The variables of this command are described in [Creating Backups](#creating-backups).

Replace **LOCAL-DRIVE-DIR** with the path on the local drive where the backup file will be copied.
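
The export step can be sketched as a script that copies the backup out of HDFS and then packs it for transfer. This is an illustration under stated assumptions: `run` and `export_backup` are hypothetical helpers, the `tar` step is an optional addition that is not part of the documented procedure, and with `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/usr/bin/env bash
# Sketch: copy a backup out of HDFS and pack it into a single archive.
# DRY_RUN=1 (default) prints each command instead of executing it.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

export_backup() {
  local name_node="$1" hdfs_dir="${2#/}" local_dir="${3%/}"
  # Make sure the local destination exists before copying.
  run mkdir -p "$local_dir"
  # Copy the backup directory from HDFS to the local drive.
  run hdfs dfs -get "hdfs://${name_node}:8020/${hdfs_dir}" "$local_dir"
  # Optional (assumption): pack the result so it travels as one file.
  run tar -czf "${local_dir}.tar.gz" -C "$(dirname "$local_dir")" "$(basename "$local_dir")"
}
```

Running `export_backup namenode.example.com backups/gbds /data/gbds-backup` in dry-run mode lists the three commands for review; setting `DRY_RUN=0` executes them.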

## Restoring Backups

[Section 87.2](https://hbase.apache.org/book.html#br.restoring.backup) describes the process to restore the database to the state captured by the backup.

There are two options to restore a backup file: to the same environment (without exporting the backup file from HDFS) and to a different environment (importing an external backup file). The following sections detail both cases.

### Restoring to the Same Environment

The following commands list the existing backups, so that the backup ID can be identified, and then restore a backup within the same HDFS:

```shell
hbase backup history
hbase restore hdfs://<NAME-NODE-SERVER>:8020/<HDFS BACKUP DIR> <backup-id> -o -t anomalies,people,transactions,uls,unresolvedlatent,quality,transactionkeys
```

In the restore command above, replace **HDFS BACKUP DIR** with the path to the backup file that will be restored, and **backup-id** with the unique identifier of that backup, as listed by `hbase backup history`.

The **-o** option overwrites the current data in the target tables during the restoration, and the **-t** option lists the HBase tables that should be restored from the backup.
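
Picking the backup ID out of the history listing can be sketched as a small filter. This example assumes that backup IDs have the form `backup_<timestamp>`, as in the examples of the Apache HBase Manual; `latest_backup_id` is a hypothetical helper, and the exact layout of the history output should be checked in your environment.

```shell
#!/usr/bin/env bash
# Sketch: pick the most recent backup ID from `hbase backup history` output.
# Assumes IDs look like backup_<digits>; verify against your own output.

latest_backup_id() {
  # Reads history text on stdin and prints the ID with the largest timestamp.
  grep -oE 'backup_[0-9]+' | sort -t_ -k2,2n | tail -n 1
}
```

A typical use is `hbase backup history | latest_backup_id`. The most recent ID is not necessarily the backup you intend to restore, so confirm it against the full history listing first.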

### Restoring to a Different Environment

To restore the backup to a different environment, the backup must first be exported from the original environment, following the process described in [Exporting the Backup File](#exporting-the-backup-file). Once it has been moved to the local drive of the new environment, place it in HDFS using the following command:

```shell
hdfs dfs -put <LOCAL DRIVE BACKUP DIR> hdfs://<NAME-NODE-SERVER>:8020/<HDFS BACKUP DIR>
```

**LOCAL DRIVE BACKUP DIR** must be the path on the local drive where the backup file is located, and **HDFS BACKUP DIR** must be the path in the new environment's HDFS where the backup will be stored. Once the backup is imported, the restoration process is the same as described in [Restoring to the Same Environment](#restoring-to-the-same-environment).

