# Migration Configuration

## Configuration File

The GBDS migration configuration parameters are defined in a configuration file containing all parameters and their respective values. Omitted parameters assume their default values. This section describes the properties of the configuration file.

### File location

The configuration file is `/etc/griaule/conf/gbds-migration/gbds-migration.properties`.

### File properties

The configuration file must meet some requirements to be correctly interpreted by GBDS. These requirements are:

1. The file name and location must be exactly as mentioned;
2. There must be exactly one configuration parameter per line;
3. Each configuration parameter must be in the format `<parameter>=<value>`, without line breaks;
4. When more than one value is assigned to a single parameter, the values must be separated by commas.
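
The format rules above can be illustrated with a small parser. This is only a sketch for checking a file by hand, not the parser GBDS itself uses; the function name `parse_properties` and the sample host names are hypothetical. It assumes that lines starting with `#` are comments, as in the example file at the end of this page.

```python
def parse_properties(text):
    """Parse `<parameter>=<value>` lines into a dict of value lists."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and `#` comment lines.
        if not line or line.startswith("#"):
            continue
        # Split on the first `=` only, since values such as JDBC URLs contain `=`.
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"not in <parameter>=<value> format: {raw!r}")
        # Multiple values assigned to one parameter are comma-separated.
        params[key.strip()] = [v.strip() for v in value.split(",")]
    return params


sample = """
# GENERAL
gbscluster.zookeeper.quorum=host-a:2181,host-b:2181
gbds.rdb.url=jdbc:mysql://db:3306/gbds?useSSL=false&allowPublicKeyRetrieval=true
"""
config = parse_properties(sample)
print(config["gbscluster.zookeeper.quorum"])
```

Every value is returned as a list, so single-valued parameters come back as one-element lists.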

## Configuration parameters

This section describes each GBDS migration configuration parameter that may be listed in the configuration file and how it affects the system's operation.

### General

#### **gbscluster.zookeeper.quorum**

Defines the host name and port at which the ZooKeeper servers can be reached. If more than one server is available, separate the entries with commas.

**Default value:**

> `<hostname>:<port>`

### RDB Connection

#### gbds.rdb.driverClassName

Defines the JDBC driver class name for the relational database used to store unresolved latents.

**Default value:**

> `com.mysql.jdbc.Driver`

#### gbds.rdb.url

Defines the URL of the relational database to be accessed.

**Default value:**

> `jdbc:mysql://<hostname>:3306/gbds?useSSL=false&allowPublicKeyRetrieval=true`

#### gbds.rdb.username

Defines the username to be used to access the relational database.

**Default value:**

> `<rdb-username>`

#### **gbds.rdb.password**

Defines the password to be used to access the relational database.

**Default value:**

> `<rdb-base64-password>`

#### gbds.rdb.dialect

Defines the dialect to be used in the relational database.

**Default value:**

> `org.hibernate.dialect.MySQLDialect`

#### gbds.rdb.showSql

Defines whether SQL statements should be included in the application logs.

**Default value:**

> `false`

**Possible values:**

> * `true`
> * `false`

#### gbds.rdb.maxPoolSize

Maximum number of connections that a pool will maintain at any time.

**Default value:**

> `100`

#### gbds.rdb.minPoolSize

Minimum number of connections that a pool will maintain at any given time.

**Default value:**

> `1`

#### gbds.rdb.initialPoolSize

Number of connections that a pool will attempt to acquire at startup. This value must be in the range from `gbds.rdb.minPoolSize` to `gbds.rdb.maxPoolSize`.

**Default value:**

> `2`
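
The relationship between the three pool size parameters can be checked before deploying a configuration. The sketch below is a hypothetical helper (`check_pool_sizes` is not part of GBDS or c3p0) that only encodes the range constraint stated above.

```python
def check_pool_sizes(min_size, initial_size, max_size):
    """Return a list of problems found in the pool size settings."""
    problems = []
    if min_size > max_size:
        problems.append("minPoolSize is greater than maxPoolSize")
    # initialPoolSize must lie between minPoolSize and maxPoolSize.
    if not (min_size <= initial_size <= max_size):
        problems.append("initialPoolSize is outside [minPoolSize, maxPoolSize]")
    return problems


print(check_pool_sizes(1, 2, 100))  # documented defaults
print(check_pool_sizes(5, 2, 100))  # initial size below the minimum
```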

#### gbds.rdb.maxStatments

Defines the size of the c3p0 global PreparedStatement cache. If `gbds.rdb.maxStatments` is zero, the statement cache will not be enabled.

This parameter controls the total number of statements cached across all connections. If defined, it should be a fairly large number, since each pooled connection requires its own distinct set of cached statements. As a guide, consider how many distinct PreparedStatements are frequently used in your application and multiply that number by `gbds.rdb.maxPoolSize` to arrive at an appropriate value.

**Default value:**

> `0`
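
The sizing guideline for `gbds.rdb.maxStatments` is plain multiplication. In the sketch below, the figure of 20 frequently used PreparedStatements is a made-up assumption for illustration; measure your own workload before choosing a value.

```python
def estimate_max_statements(distinct_statements, max_pool_size):
    """Each pooled connection caches its own copy of every frequent statement."""
    return distinct_statements * max_pool_size


# Hypothetical workload: 20 frequent statements, default maxPoolSize of 100.
print(estimate_max_statements(20, 100))
```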

#### gbds.rdb.maxIdleTime

Defines, in seconds, how long a connection can remain in the pool but unused before being discarded. Zero means idle connections never expire.

**Default value:**

> `1800`

#### gbds.rdb.maxConnectionAge

Defines, in seconds, the maximum lifetime of a connection. A connection older than `gbds.rdb.maxConnectionAge` will be destroyed and removed from the pool. This differs from `gbds.rdb.maxIdleTime` in that it refers to absolute age: even a connection that has not been idle for long will be removed from the pool if it exceeds `gbds.rdb.maxConnectionAge`. Zero means no maximum absolute age is enforced.

**Default value:**

> `1800`

#### gbds.rdb.statementCacheNumDeferredCloseThreads

If set to a value greater than 0, the statement cache will track when Connections are in use and will destroy Statements only when their parent Connections are not in use. Although closing a Statement while the parent Connection is in use is formally within the specifications, some databases and/or JDBC drivers, primarily Oracle, do not handle this case well and hang, leading to deadlocks. Setting this parameter to a positive value should eliminate the problem. This parameter should only be set if you observe c3p0's attempts to close() cached statements hanging (usually you will see APPARENT DEADLOCKS in your logs). If set, this parameter almost always should be set to 1.

**Default value:**

> `1`

#### gbds.rdb.acquireIncrement

Determines how many connections at a time c3p0 will attempt to acquire when the pool is exhausted.

**Default value:**

> `10`

#### gbds.rdb.testConnectionOnCheckout

If true, an operation will be executed on each connection checkout to verify if the connection is valid. Testing connections on checkout is the simplest and most reliable way to test connections, but for better performance, consider checking connections periodically using `gbds.rdb.idleConnectionTestPeriod`.

**Default value:**

> `false`

#### gbds.rdb.testConnectionOnCheckin

If true, an operation will be executed asynchronously on each connection checkin to verify that the connection is valid. Use in combination with `gbds.rdb.idleConnectionTestPeriod` for fairly reliable and always asynchronous connection tests.

**Default value:**

> `true`

**Possible values:**

> * `true`
> * `false`

#### **gbds.rdb.acquireRetryAttempts**

Defines how many times c3p0 will try to obtain a new connection from the database before giving up. If this value is less than or equal to zero, c3p0 will continue trying to obtain a connection indefinitely.

**Default value:**

> `10`

#### gbds.rdb.idleConnectionTestPeriod

If this is a number greater than 0, c3p0 will test all idle connections that are pooled but not checked out, at an interval of this many seconds.

**Default value:**

> `30`

## HBase Column Families <a href="#hbase-column-families" id="hbase-column-families"></a>

### **Default column family**

These parameters are divided by biometric modality. They are column families used for template read operations.

```
gbds.hbase.templates.fingerprint.cf.name
gbds.hbase.templates.palmprint.cf.name
gbds.hbase.templates.face.cf.name
gbds.hbase.templates.iris.cf.name
gbds.hbase.templates.newborn-palmprint.cf.name
```

The default value for these parameters is `tpts`.

### **Fallback column family**

These parameters refer to the column family previously used to store biometric templates, separated by biometric modality.

```
gbds.hbase.templates.fallback.fingerprint.cf.name
gbds.hbase.templates.fallback.palmprint.cf.name
gbds.hbase.templates.fallback.face.cf.name
gbds.hbase.templates.fallback.iris.cf.name
gbds.hbase.templates.fallback.newborn-palmprint.cf.name
```

The default values represent the column family used before changing these parameters and are, respectively: `fingerprints`, `palmprints`, `faces`, `iris`, `newborn-palmprints`.

## GBDS Re-extractor

### General

#### **gbds.reextract.nodeNumber**

Number of this node among the nodes running the Reextractor. Together with the total number of nodes, it determines the scan range in HBase.

**Default value:**

> `1`

**Minimum value:**

> `1`

**Maximum value:**

> Value of `gbds.reextract.totalNodes`

#### **gbds.reextract.totalNodes**

Total number of nodes running the Reextractor.

**Default value:**

> `1`

#### **gbds.reextract.totalScanRegions**

Total number of regions to break the scans into.

**Default value:**

> `256` (regions 00-FF)

#### **gbds.reextract.scanners.number**

Number of scanners. Each scanner reads from HBase a range determined by `gbds.reextract.nodeNumber`, `gbds.reextract.totalNodes`, and `gbds.reextract.totalScanRegions`.

**Default value:**

> `5`
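
This page does not spell out how the node number, the total number of nodes, and the total number of scan regions are combined. The sketch below shows one plausible scheme, an even contiguous split of the regions among nodes. It is an assumption for illustration, not the Reextractor's documented algorithm, and the function name `node_regions` is hypothetical.

```python
def node_regions(node_number, total_nodes, total_scan_regions=256):
    """Region prefixes for a 1-based node number, assuming an even split."""
    if not 1 <= node_number <= total_nodes:
        raise ValueError("nodeNumber must be between 1 and totalNodes")
    # Integer arithmetic keeps the slices contiguous and non-overlapping.
    start = (node_number - 1) * total_scan_regions // total_nodes
    end = node_number * total_scan_regions // total_nodes
    return [f"{region:02X}" for region in range(start, end)]


regions = node_regions(2, 4)
print(regions[0], regions[-1], len(regions))
```

Under this assumption, node 2 of 4 would cover regions `40` through `7F`, a quarter of the `00`-`FF` space.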

#### **gbds.reextract.workers.number**

Number of workers. A worker holds a template extraction transaction.

**Default value:**

> `5`

#### **gbds.reextract.writers.number**

Number of writers. A writer obtains the extraction result and writes back to the transaction and to people, if needed.

**Default value:**

> `5`

#### **gbds.reextract.range**

External range configuration. Limits the automatic range.

* The range can be a 2-character hexadecimal (such as `00` or `A3`) or a 2-character hexadecimal range (such as `00-01` or `4A-50`).
* Always 2 hexadecimal characters.
* If absent, the GBS Reextractor will perform automatic partitioning as usual.
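
A value for `gbds.reextract.range` can be checked against the format described above with a short script. This is a sketch under the stated format only: the helper name `parse_range` is hypothetical, and the rejection of a range whose first bound is greater than its last is an added assumption, not documented behavior.

```python
import re

# Two hexadecimal characters, optionally followed by `-` and two more.
RANGE_PATTERN = re.compile(r"([0-9A-Fa-f]{2})(?:-([0-9A-Fa-f]{2}))?")


def parse_range(value):
    """Return (first, last) region numbers for a value such as `A3` or `4A-50`."""
    match = RANGE_PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"invalid range: {value!r}")
    first = int(match.group(1), 16)
    # A single value is treated as a range with identical bounds.
    last = int(match.group(2) or match.group(1), 16)
    if first > last:
        raise ValueError(f"range bounds are reversed: {value!r}")
    return first, last


print(parse_range("A3"))
print(parse_range("4A-50"))
```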

#### **gbds.reextract.validate.extraction**

Flag to validate previously reextracted templates.

* In each transaction:
  * When selected to be extracted, the GBS Reextractor will not validate it.
  * When it was extracted before but not validated, the GBS Reextractor will validate it.
  * When it is extracted and validated, the GBS Reextractor will ignore it.
  * The validation is saved in HBase in the column `transaction:<cf>-validated`.
* **To ensure that a transaction is validated in one call after extraction, remember to delete the SQLite file or set** `gbds.reextract.sqlite.resetOnStart=true`. **Otherwise, the entire range to which the transaction belongs will be skipped.**

This way, reextraction and validation can be done in different calls of the GBS Reextractor, giving time for HBase to write and consolidate templates in the `transactions` table.

**Default value:**

> `true`

**Possible values:**

> * `true`
> * `false`
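
The per-transaction rules listed for `gbds.reextract.validate.extraction` amount to a three-way decision. The sketch below restates them as a function; the boolean flags `extracted` and `validated` are hypothetical names, and reading "selected to be extracted" as "not yet extracted" is an interpretation, not something this page states.

```python
def next_action(extracted, validated):
    """Decide what a run does with one transaction when validation is enabled."""
    if not extracted:
        # Selected for extraction in this run: extract, do not validate yet.
        return "extract"
    if not validated:
        # Extracted in an earlier run but not yet validated.
        return "validate"
    # Already extracted and validated: nothing left to do.
    return "ignore"


print(next_action(extracted=False, validated=False))
print(next_action(extracted=True, validated=False))
print(next_action(extracted=True, validated=True))
```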

### Pipeline Queue

#### **gbds.reextract.workers.inqueueMaxSize**

Scanner queue size for the worker. The larger the size, the more scans are performed and kept in workers, but more memory is allocated.

**Default value:**

> `100`

#### **gbds.reextract.writers.inqueueMaxSize**

Worker queue size for the writer. The larger the size, the more scans are performed and kept in writers, but more memory is allocated.

**Default value:**

> `100`

### Modality to Reextract Flags

#### **gbds.reextract.modality.finger**

Determines whether fingers should be reextracted.

**Default value:**

> `true`

**Possible values:**

> * `true`
> * `false`

#### **gbds.reextract.modality.face**

Determines whether faces should be reextracted.

**Default value:**

> `true`

**Possible values:**

> * `true`
> * `false`

#### **gbds.reextract.modality.palm**

Determines whether palms should be reextracted.

**Default value:**

> `false`

**Possible values:**

> * `true`
> * `false`

#### **gbds.reextract.modality.iris**

Determines whether irises should be reextracted.

**Default value:**

> `false`

**Possible values:**

> * `true`
> * `false`

#### **gbds.reextract.modality.newborn-palm**

Determines whether newborn palms should be reextracted.

**Default value:**

> `false`

**Possible values:**

> * `true`
> * `false`

### Template extraction microservices

#### **gbds.reextract.msextraction.ginger.number**

Defines how many instances of fingerprint, palmprint, newborn and sequence control model extraction microservices will be available. If this setting is set to `0`, the extraction microservices for these modalities will not be started.

**Default value:**

> `10` (multiples of 10 recommended)

#### **gbds.reextract.msextraction.face.number**

Defines how many instances of facial template extraction microservices will be available. If this setting is set to `0`, the extraction microservices for this modality will not be started.

**Default value:**

> `1` (recommended: one tenth of `gbds.reextract.msextraction.ginger.number`)

#### **gbds.reextract.msextraction.initialPort**

This parameter sets the starting port number for the template extraction microservices.

Each microservice instance will increment its port number by 1. For example, considering the default port `6000`, the first instance will use port `6000`, the second will use port `6001`, the third `6002`, and so on.

{% hint style="info" %}
Choose ports that do not conflict with the template extraction API microservice ports (over 30,000), the quality extraction microservice ports (31,000), or the GBDS matching microservice ports (32,000).
{% endhint %}

{% hint style="info" %}
Make sure the firewall allows the ports that the microservices will use.
{% endhint %}

**Default value:**

> `6000`
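
The port numbering described for `gbds.reextract.msextraction.initialPort` can be previewed before opening firewall ports. This is a minimal sketch with hypothetical helper names. Treating every port from 30000 upward as reserved is a simplifying assumption based on the hint above, and whether fingerprint and face instances share one port sequence is not stated on this page.

```python
def instance_ports(initial_port, instance_count):
    """Ports used by `instance_count` instances, one consecutive port each."""
    return [initial_port + i for i in range(instance_count)]


def conflicts_with_reserved(ports, reserved_from=30000):
    """True if any port reaches the range assumed to be reserved."""
    return any(port >= reserved_from for port in ports)


ports = instance_ports(6000, 3)
print(ports)
print(conflicts_with_reserved(ports))
```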

#### **gbds.reextract.msextraction.maxTries**

Defines the maximum number of extraction attempts GBDS will perform on a single biometric feature before returning an error.

**Default value:**

> `3`

#### **gbds.reextract.msextraction.linkLibSegfault**

Turns the segmentation fault library debugger in the extraction microservice on or off.

**Default value:**

> `true`

#### **gbds.reextract.msextraction.checkTimeoutSecs**

Timeout in seconds to check if the template extraction microservice is up.

**Default value:**

> `30`

#### **gbds.reextract.msextraction.logLevel**

Log level of the template extraction microservice.

**Default value:**

> `INFO`

**Possible values:**

> * `INFO`
> * `TIME`
> * `DEBUG`

#### **gbds.reextract.msextraction.timeout**

Timeout in seconds for a single call to the template extraction microservice.

**Default value:**

> `60`

#### **gbds.reextract.msextraction.fingerprints.extractor.type**

This parameter sets the ginger extractor preset type to be used by GBDS Migration in `--reextract` mode, also known as the GBDS Reextractor. The Reextractor saves the ginger extractor type that was used in HBase and in the RDB in the same way the API does: in the `transactions` HBase column `<cf>:ginger-extractor-type` and in the `transactions` RDB column `ginger_extractor_type`.

**Default value:**

> `GRIAULE_2024`

**Possible values:**

> * `GRIAULE_FAST`: a simpler and faster version of GRIAULE\_BASIC (never used in the API).
> * `GRIAULE_BASIC`: default extraction for Verify.
> * `GRIAULE_2020`: old default extraction for Enroll, Update.
> * `GRIAULE_2024`: new default extraction for Enroll, Update.
> * `GRIAULE_2018`: use GRIAULE\_2018 (slower).

### Reextraction HBase column families

#### **gbds.reextract.cf.finger**

Name of the finger column family to receive the extracted template in transactions and people.

**Default value:**

> `fingerprint-reextract-1`

#### **gbds.reextract.cf.palm**

Name of the palm column family to receive the extracted template in transactions and people.

**Default value:**

> `palmprint-write`

#### **gbds.reextract.cf.face**

Name of the face column family to receive the extracted template in transactions and people.

**Default value:**

> `face-reextract-1`

#### **gbds.reextract.cf.iris**

Name of the iris column family to receive the extracted template in transactions and people.

**Default value:**

> `iris-write`

#### **gbds.reextract.cf.newborn-palm**

Name of the newborn palm column family to receive the extracted template in transactions and people.

**Default value:**

> `newborn-palmprint-write`

### SQLite

#### **gbds.reextract.sqlite.filePath**

File path for the local SQLite database.

**Default value:**

> `/home/<username>/reextract.db`

#### **gbds.reextract.sqlite.resetOnStart**

Flag to reset SQLite on startup.

**Default value:**

> `false`

**Possible values:**

> * `true`
> * `false`

## Configuration file example

{% hint style="info" %}
Replace `<hostname>`, `<rdb-username>`, `<rdb-base64-password>`, and `<username>` with the correct values. Also, if `zookeeper` and `mysql` are running on ports different from the defaults, replace the port numbers.
{% endhint %}

```
# GENERAL
gbscluster.zookeeper.quorum=<hostname>:2181

# RDB CONNECTION
gbds.rdb.driverClassName=com.mysql.jdbc.Driver
gbds.rdb.url=jdbc:mysql://<hostname>:3306/gbds?useSSL=false&allowPublicKeyRetrieval=true
gbds.rdb.username=<rdb-username>
gbds.rdb.password=<rdb-base64-password>
gbds.rdb.dialect=org.hibernate.dialect.MySQLDialect
gbds.rdb.showSql=false
gbds.rdb.maxPoolSize=100
gbds.rdb.minPoolSize=1
gbds.rdb.initialPoolSize=2
gbds.rdb.maxStatments=0
gbds.rdb.maxIdleTime=1800
gbds.rdb.maxConnectionAge=1800
gbds.rdb.statementCacheNumDeferredCloseThreads=1
gbds.rdb.acquireIncrement=10
gbds.rdb.testConnectionOnCheckout=false
gbds.rdb.testConnectionOnCheckin=true
gbds.rdb.acquireRetryAttempts=10
gbds.rdb.idleConnectionTestPeriod=30

# HBASE COLUMN FAMILIES - STANDARD
gbds.hbase.templates.fingerprint.cf.name=tpts
gbds.hbase.templates.palmprint.cf.name=tpts
gbds.hbase.templates.face.cf.name=tpts
gbds.hbase.templates.iris.cf.name=tpts
gbds.hbase.templates.newborn-palmprint.cf.name=tpts

# HBASE COLUMN FAMILIES - FALLBACK
gbds.hbase.templates.fallback.fingerprint.cf.name=fingerprints
gbds.hbase.templates.fallback.palmprint.cf.name=palmprints
gbds.hbase.templates.fallback.face.cf.name=faces
gbds.hbase.templates.fallback.iris.cf.name=iris
gbds.hbase.templates.fallback.newborn-palmprint.cf.name=newborn-palmprints

# REEXTRACTOR - GENERAL
gbds.reextract.nodeNumber=1
gbds.reextract.totalNodes=1
gbds.reextract.totalScanRegions=256
gbds.reextract.scanners.number=5
gbds.reextract.workers.number=5
gbds.reextract.writers.number=5
gbds.reextract.validate.extraction=false

# REEXTRACTOR - PIPELINE QUEUE
gbds.reextract.workers.inqueueMaxSize=100
gbds.reextract.writers.inqueueMaxSize=100

# REEXTRACTOR - MODALITY TO REEXTRACT FLAGS
gbds.reextract.modality.finger=true
gbds.reextract.modality.face=true
gbds.reextract.modality.palm=false
gbds.reextract.modality.iris=false
gbds.reextract.modality.newborn-palm=false

# REEXTRACTOR - TEMPLATE EXTRACTION MICROSERVICE
gbds.reextract.msextraction.ginger.number=10
gbds.reextract.msextraction.face.number=1
gbds.reextract.msextraction.initialPort=6000
gbds.reextract.msextraction.maxTries=3
gbds.reextract.msextraction.linkLibSegfault=true
gbds.reextract.msextraction.checkTimeoutSecs=30
gbds.reextract.msextraction.logLevel=INFO
gbds.reextract.msextraction.timeout=60
gbds.reextract.msextraction.fingerprints.extractor.type=GRIAULE_2024

# REEXTRACTOR - REEXTRACTION HBASE COLUMN FAMILIES
gbds.reextract.cf.finger=fingerprint-reextract-1
gbds.reextract.cf.palm=palmprint-write
gbds.reextract.cf.face=face-reextract-1
gbds.reextract.cf.iris=iris-write
gbds.reextract.cf.newborn-palm=newborn-palmprint-write

# REEXTRACTOR - SQLITE
gbds.reextract.sqlite.filePath=/home/<username>/reextract.db
gbds.reextract.sqlite.resetOnStart=false
```
