# SPID Monitoring

This manual describes the SPID environment: how to start it, shut it down and monitor it, along with other related resources. The SPID environment is composed of a set of server-side services: SPID Server and Control Panel, GBDS, SQL Server and Ambari Services.

## System Monitoring

Griaule recommends monitoring tools such as Zabbix, Cacti and other systems to automate tracking of system resources and performance.

### SPID Status

One way to monitor SPID is through its API, available at the URL `http://<hostname>:8082/gbs-spid-server/service/cluster/ping`.

In the default configuration, SPID listens on port 8082.

{% hint style="info" %}
This test can be performed from a browser.
{% endhint %}

If SPID is working, the following message will be displayed:

```default
Pong!
```
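For automated monitoring, the same test can be scripted. The following is a minimal sketch, assuming `curl` is installed on the monitoring host; the function name `check_spid_ping` and the 5-second timeout are illustrative choices, not part of SPID.

```sh
# Sketch: succeed only if the SPID ping endpoint answers "Pong!".
# check_spid_ping and the timeout value are illustrative assumptions.
check_spid_ping() {
    # $1 = hostname, $2 = port (defaults to 8082, the default SPID port)
    spid_url="http://$1:${2:-8082}/gbs-spid-server/service/cluster/ping"
    # -s: no progress output, -f: treat HTTP errors as failures,
    # --max-time: give up after 5 seconds
    spid_body=$(curl -sf --max-time 5 "$spid_url") || return 1
    case "$spid_body" in
        *'Pong!'*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A call such as `check_spid_ping <hostname> && echo "SPID is up"` can then be wired into a Zabbix or Cacti external check.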

#### Server-side

There are two ways to check status via the terminal.

```sh
service spid status
```

or

```sh
ps aux | grep spid-server | grep -v grep
```

The response to these commands should show the process running.
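The `ps` check can be wrapped in a small helper so that a monitoring script gets an exit code instead of raw output. This is a sketch only; `process_running` and `report_processes` are hypothetical helper names, and the same helper works for the other process patterns used on this page.

```sh
# Succeed if any process listed by `ps aux` matches the given pattern.
# process_running is an illustrative helper, not part of SPID.
process_running() {
    ps aux | grep -- "$1" | grep -v grep > /dev/null
}

# Print one status line per pattern; return non-zero if any is missing.
report_processes() {
    rc=0
    for pattern in "$@"; do
        if process_running "$pattern"; then
            echo "$pattern: running"
        else
            echo "$pattern: NOT running"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `report_processes spid-server spid-controlpanel` prints one line per process and fails if either one is missing.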

### SPID Control Panel Status

The SPID Control Panel is a web service and is available at the URL `http://<hostname>:58086/gbs-spid-controlpanel`. In the default configuration, the control panel runs on port 58086.
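The availability of this URL can also be probed from a script. The sketch below assumes `curl` is installed and treats any 2xx or 3xx HTTP status as "up"; the exact status the Control Panel returns (for instance a redirect to a login page) is not stated here, so that range is an assumption, as is the helper name `check_control_panel`.

```sh
# Sketch: succeed if the Control Panel URL answers with a 2xx or 3xx status.
# The accepted status range is an assumption; adjust it to your deployment.
check_control_panel() {
    # $1 = hostname, $2 = port (defaults to 58086, the default port)
    cp_url="http://$1:${2:-58086}/gbs-spid-controlpanel"
    # -o /dev/null discards the body; -w prints only the HTTP status code
    cp_code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$cp_url")
    case "$cp_code" in
        2??|3??) return 0 ;;
        *) return 1 ;;
    esac
}
```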

#### Server-side

There are two ways to check status via the terminal.

```sh
service spid-cp status
```

or

```sh
ps aux | grep spid-controlpanel | grep -v grep
```

The response to these commands should show the process running.

### Idnservice Server-side

Griaule's IDN Service is optional. When it is in use, its status can be checked with the following commands:

```sh
service idnservice status
```

or

```sh
ps aux | grep spid-idnservice | grep -v grep
```

The response to these commands should show the process running.

### GBDS API Status

One way to monitor the GBDS API is via the URL `http://<hostname>:8085/gbds/v2/operations/ping`.

Always point the URL to a GBDS node that hosts the API. In the default configuration, the API runs on port 8085. The API should return the following message:

```json
{
	"data": "pong!"
}
```
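Since the check has to be pointed at a node that hosts the API, it is convenient to script it. The following is a minimal sketch, assuming `curl` is available; `is_gbds_pong` and `check_gbds_ping` are hypothetical helper names, and the response is matched with `grep` rather than a JSON parser to avoid extra dependencies.

```sh
# Succeed if a GBDS ping response (read from stdin) carries "data": "pong!".
is_gbds_pong() {
    grep -q '"data"[[:space:]]*:[[:space:]]*"pong!"'
}

# Query the ping endpoint of one API node.
# $1 = hostname, $2 = port (defaults to 8085, the default API port)
check_gbds_ping() {
    gbds_url="http://$1:${2:-8085}/gbds/v2/operations/ping"
    curl -sf --max-time 5 "$gbds_url" | is_gbds_pong
}
```

With several API nodes, the helper can be called once per node, for example `for node in <node1> <node2>; do check_gbds_ping "$node" || echo "$node: API down"; done`.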

An extra check, which also tests database access, is available at the following address:

```html
http://<hostname>:8085/gbds/v2/exceptions/EndDate=1400000000000
```

This request fetches exceptions recorded up to May 13, 2014 (`1400000000000` is that date in *epoch time*, expressed in milliseconds). No exceptions are expected before that date, so the API should return an empty result. If the response is similar to the one below, the connection to the database is working.

```json
{
	"pagination": {
		"total": 0,
		"count": 0,
		"pageSize": 0,
		"currentPage": 0,
		"totalPages": 0
	}
}
```

{% hint style="info" %}
Instead of the ping, the exception listing can be used as a health check, but this operation demands more resources, so it should be used sparingly.
{% endhint %}
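The timestamp used in the exceptions URL and the empty-result check can both be reproduced from a shell. This sketch assumes GNU `date` (the `-d @SECONDS` form is a GNU extension) and assumes that a non-empty result reports a `total` greater than zero; both helper names are illustrative.

```sh
# Convert an epoch timestamp in milliseconds to a UTC calendar date.
# Assumes GNU date.
epoch_ms_to_utc_date() {
    date -u -d "@$(( $1 / 1000 ))" +%Y-%m-%d
}

# Succeed if an exceptions response (read from stdin) reports "total": 0.
has_no_exceptions() {
    grep -Eq '"total"[[:space:]]*:[[:space:]]*0([^0-9.]|$)'
}
```

Here `epoch_ms_to_utc_date 1400000000000` prints `2014-05-13`, which confirms the date used in the URL.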

#### Server-side

The GBDS API runs as a service named `gbsapid`. The following command can be used to check if this service is running.

{% hint style="warning" %}
Remember to repeat the command on each node where the API is running.
{% endhint %}

```sh
service gbsapid status
```

or

```sh
ps aux | grep gbsapi | grep -v grep
```

The response to these commands should show the API process running.

### GBDS Status

#### Server-side

GBDS runs as a process. Remember to repeat the command on each node of the GBDS cluster.

The first command can be used to see if the GBDS process is running:

```sh
ps aux | grep -v grep | grep griaulebiometrics.gbds.driver.Driver
```

If the process is running, the command prints its entry from the process list; empty output means GBDS is not running on that node.

The second command can be used to check the count of matchers:

```sh
ps aux | grep akka | grep -v grep | wc -l
```

The output of this command will show the number of matchers that are running.
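A monitoring script can compare that count against the number of matchers the node is expected to run. The sketch below is an illustration only: the expected count is deployment-specific and must be supplied by the operator, and `check_matcher_count` is a hypothetical helper name.

```sh
# Compare the number of running matchers with an expected count.
# $1 = expected number of matchers on this node (deployment-specific).
check_matcher_count() {
    expected="$1"
    # Same pipeline as above; tr strips padding some wc versions add.
    running=$(ps aux | grep akka | grep -v grep | wc -l | tr -d ' ')
    if [ "$running" -eq "$expected" ]; then
        echo "OK: $running matchers running"
    else
        echo "WARNING: $running of $expected matchers running"
        return 1
    fi
}
```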

## Troubleshooting

### GBDS

If GBDS presents problems, it should be restarted. First, check the service status and stop the cluster.

```sh
# Switch to the griaule user
su griaule

# Stop GBDS on all cluster nodes
/var/lib/griaule/gbds/scripts/kill-cluster.sh

# Run again until all nodes report that no service is running
/var/lib/griaule/gbds/scripts/kill-cluster.sh
```

Then, as user `griaule`, run the following script to start the driver.

```sh
/var/lib/griaule/gbds/scripts/start-cluster.sh
```

{% hint style="info" %}
More details for GBDS can be found in the logs.
{% endhint %}

### GBDS API

If there is any problem with the API, it should be restarted as the `griaule` user or as a superuser.

```sh
service gbsapid restart #restart API
service gbsapid status #check api status
```
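The restart-then-check sequence can be wrapped in a helper that reports failure through its exit code. This is a sketch under two assumptions: the init script follows the common convention of returning a non-zero exit code from `status` when the service is stopped, and a few seconds of settle time are enough before checking. The name `restart_and_verify` and the 5-second default are illustrative.

```sh
# Restart a service, wait briefly, then verify it with `status`.
# $1 = service name, $2 = settle time in seconds (defaults to 5)
restart_and_verify() {
    svc="$1"
    service "$svc" restart || return 1
    sleep "${2:-5}"
    # Assumes `status` exits non-zero when the service is not running.
    service "$svc" status
}
```

For example, `restart_and_verify gbsapid || echo "gbsapid did not come back"`. The same helper applies to the other services restarted in this section.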

### SPID

If there is any problem with SPID, it should be restarted as the `griaule` user or as a superuser.

```sh
service spid restart #restart spid
service spid status #check spid status
```

### SPID Control Panel

If there is any problem with the Control Panel, it should be restarted as the `griaule` user or as a superuser.

```sh
service spid-cp restart
service spid-cp status
```

### IDN Service

If there is any problem with the IDN Service, it should be restarted as the `griaule` user or as a superuser.

{% hint style="info" %}
Remember that Griaule's IDN Service is optional; users may choose to implement their own.
{% endhint %}

```sh
service idnservice restart
service idnservice status
```

### Logs

If any problem is found, the support team should be contacted. Once contact is made, it is important to send the logs related to the problem to reduce the time to fix it.

| Application with error      | Path to logs                                           |
| --------------------------- | ------------------------------------------------------ |
| HBase                       | /var/log/hbase/                                        |
| HDFS                        | /var/log/hadoop/hdfs/hadoop-hdfs-datanode-hostname.log |
| GBDS                        | /var/log/griaule/gbds/gbds.log                         |
| GBDS API (start up process) | /var/log/griaule/gbsapi/console.out                    |
| GBDS API                    | /var/log/griaule/gbsapi/gbsapi.log                     |
| SPID                        | /var/log/griaule/spid/ac.log                           |
| SPID Control Panel          | /var/log/griaule/spid/controlpanel.log                 |
| idnService                  | /var/log/griaule/idnservice/                           |
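To speed up a support request, the relevant logs can be bundled into a single archive. The following is a minimal sketch, assuming `tar` with gzip support is available; `collect_logs` is a hypothetical helper and the archive name is up to the operator. Paths that do not exist on the node are skipped.

```sh
# Bundle the log paths that exist on this node into one .tar.gz archive.
# $1 = archive to create, remaining arguments = log files or directories
collect_logs() {
    archive="$1"
    shift
    # Rebuild the argument list so it only contains existing paths.
    for path in "$@"; do
        shift
        [ -e "$path" ] && set -- "$@" "$path"
    done
    # Fail if none of the requested paths exist.
    [ "$#" -gt 0 ] || return 1
    tar -czf "$archive" "$@"
}
```

For example, `collect_logs /tmp/spid-logs.tar.gz /var/log/griaule/gbds/gbds.log /var/log/griaule/spid/ac.log` collects the GBDS and SPID logs listed in the table above.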

## Post-Cluster Restart Processes

If all cluster nodes are restarted simultaneously, Ambari services must be restarted manually. This procedure can also be used in case the environment goes offline as an initial approach to handle the incident.

### Ambari services restart

To access the Ambari Control Panel, go to the URL `http://<hostname>:8080` from a web browser and log in. By default, both the login and password are `admin`.

Then, in the left side panel, in the Services tab, click on `...` and then on *Start All*. At the end of the operation, all services should be running (highlighted by a green dot). If any service fails to start, it will be marked in red and should be started manually.

![image](/files/e9dc9f56afb6e5fc1ebc41475f9d16878eaf3477)

{% hint style="info" %}
The gear icon in the upper right corner of the screen shows the progress of the current startup operation.
{% endhint %}

Next, check that the *namenode* is not stuck in *Safemode* because of some problem. The command below prints a line only when *Safemode* is off.

```sh
hdfs dfsadmin -fs hdfs://<hostname>:8020 -safemode get | grep 'Safe mode is OFF'
```

![image](/files/18ccfe2dff80fa338d4288e168ea768d8e4174dd)
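In a script, the same check is easier to consume as an exit code plus a short message. This sketch assumes the `hdfs` client is on the `PATH` and uses the `Safe mode is OFF` text shown above; `report_safemode` is an illustrative helper name.

```sh
# Report whether the namenode has left safe mode.
# $1 = namenode hostname (port 8020, as in the command above)
report_safemode() {
    if hdfs dfsadmin -fs "hdfs://$1:8020" -safemode get | grep -q 'Safe mode is OFF'; then
        echo "namenode $1: safe mode is off"
    else
        echo "namenode $1: still in safe mode (or unreachable)"
        return 1
    fi
}
```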

If the *namenode* started in *Safemode*, run the following commands on Node 1 to switch to the `hdfs` user and leave *Safemode*.

```sh
sudo su - hdfs
hdfs dfsadmin -safemode leave
```

### Service startup

Start GBDS, GBDS API, SPID, SPID Control Panel and IDN Service as indicated in [Troubleshooting](#troubleshooting).

### Shutdown

The following procedure should be used whenever production servers are shut down. This procedure can also be used in case the environment goes offline as an initial approach to handle the incident.

{% hint style="info" %}
The commands below must be run as a superuser.
{% endhint %}

```sh
service spid stop

service spid-cp stop

service gbsapid stop

/var/lib/griaule/gbscluster/scripts/kill-gbscluster.sh

# Run again until all nodes report that no service is running
/var/lib/griaule/gbscluster/scripts/kill-gbscluster.sh
```

* Access the Ambari Control Panel via URL `http://<hostname>:8080`.
* Stop all Hadoop services by clicking `...` in “Services” and then on “Stop All”.

![image](/files/e9dc9f56afb6e5fc1ebc41475f9d16878eaf3477)

## Additional Information

GBDS is CPU-bound, which means it will always use as much CPU as possible to perform its operations. Therefore, it is common for monitoring software to report high CPU usage on cluster nodes.

