Monitoring

Introduction

Monitoring keeps your Fenine node reliable by surfacing issues before they become critical. This guide covers metrics collection, dashboards, alerting, log aggregation, and a health check script, using industry-standard tools.

Recommended Stack: Prometheus + Grafana + Node Exporter

Metrics to Monitor

Critical Metrics

Node Health

Sync Status: Node is fully synchronized
Peer Count: Connected to 20+ peers
Block Height: Matches network head
Chain Forks: No unexpected reorgs

Alert Thresholds:

Peers < 5: Warning
Peers < 2: Critical
Sync lag > 50 blocks: Warning
Sync lag > 200 blocks: Critical
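
The node-health thresholds above can be turned into a small classifier for use in scripts. This is a minimal sketch; the function names peer_severity and sync_lag_severity are hypothetical and not part of any Fenine tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: map a raw reading to ok / warning / critical
# using the Node Health alert thresholds listed above.

peer_severity() {
  local peers="$1"
  if [ "$peers" -lt 2 ]; then
    echo critical
  elif [ "$peers" -lt 5 ]; then
    echo warning
  else
    echo ok
  fi
}

sync_lag_severity() {
  local lag="$1"   # number of blocks behind the network head
  if [ "$lag" -gt 200 ]; then
    echo critical
  elif [ "$lag" -gt 50 ]; then
    echo warning
  else
    echo ok
  fi
}

peer_severity 3        # prints "warning"
sync_lag_severity 120  # prints "warning"
```

Keeping the thresholds in one place like this makes it easier to keep ad hoc scripts consistent with the Prometheus alert rules.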

System Resources

CPU Usage: Should stay < 80%
Memory Usage: Should stay < 85%
Disk Usage: Alert at 80% full
Disk I/O: Monitor for bottlenecks
Network Bandwidth: Track usage trends

Alert Thresholds:

CPU > 90% for 5min: Warning
Memory > 95%: Critical
Disk > 80%: Warning
Disk > 90%: Critical
Disk I/O wait > 50%: Warning

RPC Performance

Request Rate: Requests per second
Response Time: p50, p95, p99 latency
Error Rate: Failed requests
Concurrent Connections: Active clients

Alert Thresholds:

Response time p95 > 1s: Warning
Error rate > 5%: Warning
Error rate > 10%: Critical
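
To make the error-rate thresholds concrete, the sketch below computes the rate as a percentage of all requests in a window and classifies it. The helper names are hypothetical, and the request counts are assumed to come from whatever source you already have (logs, a proxy, or Prometheus).

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the RPC error-rate thresholds above.

# Print failed/total as a percentage with two decimals (0.00 if no traffic).
rpc_error_rate() {
  local failed="$1" total="$2"
  awk -v f="$failed" -v t="$total" 'BEGIN {
    if (t == 0) {
      print "0.00"
      exit
    }
    printf "%.2f\n", f / t * 100
  }'
}

# Classify a percentage: > 10 is critical, > 5 is warning.
rpc_error_severity() {
  local rate="$1"
  awk -v r="$rate" 'BEGIN {
    if (r > 10) { print "critical" } else if (r > 5) { print "warning" } else { print "ok" }
  }'
}

rate=$(rpc_error_rate 30 400)   # 30 failures out of 400 requests
echo "$rate% -> $(rpc_error_severity "$rate")"
```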

Network Activity

Block Production: New blocks every 3s
Transaction Pool: Pending tx count
Gas Price: Current base fee
Validator Set: Active validators

Alert Thresholds:

No new block > 15s: Warning
No new block > 30s: Critical
Mempool > 10,000 tx: Warning
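
The block-production thresholds can be checked from the timestamp of the latest block. The sketch below assumes the node exposes the standard Ethereum JSON-RPC method eth_getBlockByNumber, whose result carries a hex timestamp field; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the "no new block" thresholds above.

# Seconds between a block's hex timestamp (e.g. 0x65a0f3c2) and now.
# The second argument overrides "now" and exists mainly for testing.
block_age() {
  local ts_hex="$1" now="${2:-$(date +%s)}"
  local block_ts
  block_ts=$(printf '%d' "$ts_hex")   # printf accepts 0x-prefixed hex
  echo $(( now - block_ts ))
}

# Classify an age in seconds: > 30 is critical, > 15 is warning.
block_age_severity() {
  local age="$1"
  if [ "$age" -gt 30 ]; then
    echo critical
  elif [ "$age" -gt 15 ]; then
    echo warning
  else
    echo ok
  fi
}

# Usage (assumes the RPC endpoint on localhost:8545 and jq installed):
#   ts=$(curl -s -X POST http://localhost:8545 -H "Content-Type: application/json" \
#     -d '{"jsonrpc":"2.0","method":"eth_getBlockByNumber","params":["latest",false],"id":1}' \
#     | jq -r '.result.timestamp')
#   block_age_severity "$(block_age "$ts")"
```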

Prometheus Setup

Install Prometheus

# Download Prometheus
cd /tmp
PROM_VERSION="2.48.0"
wget "https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/prometheus-${PROM_VERSION}.linux-amd64.tar.gz"

# Extract and install
tar -xzf prometheus-${PROM_VERSION}.linux-amd64.tar.gz
sudo mv prometheus-${PROM_VERSION}.linux-amd64 /opt/prometheus

# Create directories
sudo mkdir -p /etc/prometheus /var/lib/prometheus

# Move config
sudo mv /opt/prometheus/prometheus.yml /etc/prometheus/

Configure Prometheus

Create /etc/prometheus/prometheus.yml:

global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'fenine-node'

scrape_configs:
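  # Fenine node metrics
  # Upstream go-ethereum serves Prometheus-format metrics at
  # /debug/metrics/prometheus, not the default /metrics. If fene-geth
  # keeps that behavior, uncomment metrics_path below.
  - job_name: 'fenine'
    # metrics_path: /debug/metrics/prometheus
    static_configs:
      - targets: ['localhost:6060']
        labels:
          instance: 'fenine-mainnet'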
  
  # System metrics
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']
  
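  # Process metrics (requires process-exporter, which this guide does not
  # install; remove this job if you are not running it)
  - job_name: 'process-exporter'
    static_configs:
      - targets: ['localhost:9256']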

Enable Metrics in Fene-Geth

Update /var/lib/fenine/config.toml:

[Metrics]
Enabled = true
HTTP = "0.0.0.0"
Port = 6060

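Alternatively, pass the equivalent flags in the systemd unit. Binding to 0.0.0.0 exposes the metrics endpoint on every interface; if Prometheus runs on the same host, as in the scrape config above, 127.0.0.1 is sufficient.

sudo nano /etc/systemd/system/fenine.service

Add --metrics, --metrics.addr, and --metrics.port to ExecStart: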

ExecStart=/usr/local/bin/fene-geth \
  --config /var/lib/fenine/config.toml \
  --metrics \
  --metrics.addr 0.0.0.0 \
  --metrics.port 6060 \
  --cache 4096 \
  --maxpeers 50

Restart node:

sudo systemctl daemon-reload
sudo systemctl restart fenine
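
Once the node has restarted, one way to confirm that metrics are being served is to read a single sample from the endpoint. The sketch below parses Prometheus text-format output from standard input; metric_value is a hypothetical helper that handles only samples without labels, and the URL path in the usage comment is an assumption based on upstream go-ethereum.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of one label-less metric from
# Prometheus text-format input on stdin. Returns non-zero if it is absent.
metric_value() {
  local name="$1"
  awk -v name="$name" '
    $1 == name { print $2; found = 1; exit }   # first matching sample wins
    END { exit !found }
  '
}

# Usage (path is an assumption; adjust to what fene-geth actually serves):
#   curl -s http://localhost:6060/debug/metrics/prometheus | metric_value fenine_p2p_peers
```

If the command prints nothing and returns non-zero, either the endpoint is not reachable or the metric name differs from the one used in the alert rules.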

Create Prometheus Service

sudo nano /etc/systemd/system/prometheus.service

Add:

[Unit]
Description=Prometheus Monitoring
After=network.target

[Service]
Type=simple
User=prometheus
Group=prometheus
ExecStart=/opt/prometheus/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus/ \
  --web.console.templates=/opt/prometheus/consoles \
  --web.console.libraries=/opt/prometheus/console_libraries

Restart=always

[Install]
WantedBy=multi-user.target

Create user and set permissions:

sudo useradd -rs /bin/false prometheus
sudo chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus /opt/prometheus

Start Prometheus:

sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus

Verify that the Prometheus UI loads at http://YOUR_IP:9090, then open Status → Targets to confirm each scrape job is up.

Node Exporter Setup

Install system metrics exporter:

# Download Node Exporter
cd /tmp
NODE_EXP_VERSION="1.7.0"
wget "https://github.com/prometheus/node_exporter/releases/download/v${NODE_EXP_VERSION}/node_exporter-${NODE_EXP_VERSION}.linux-amd64.tar.gz"

# Extract and install
tar -xzf node_exporter-${NODE_EXP_VERSION}.linux-amd64.tar.gz
sudo mv node_exporter-${NODE_EXP_VERSION}.linux-amd64/node_exporter /usr/local/bin/

# Create service
sudo nano /etc/systemd/system/node-exporter.service

Add:

[Unit]
Description=Node Exporter
After=network.target

[Service]
Type=simple
User=node_exporter
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target

Start service:

sudo useradd -rs /bin/false node_exporter
sudo systemctl daemon-reload
sudo systemctl enable node-exporter
sudo systemctl start node-exporter

Grafana Setup

Install Grafana

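# Add Grafana repository (apt-key is deprecated, so store the key in a dedicated keyring)
sudo apt-get install -y software-properties-common
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://packages.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://packages.grafana.com/oss/deb stable main" | sudo tee /etc/apt/sources.list.d/grafana.list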

# Install
sudo apt-get update
sudo apt-get install -y grafana

# Start service
sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server

Access Grafana at http://YOUR_IP:3000 (default login: admin/admin). Change the admin password when prompted on first login.

Add Prometheus Data Source

Navigate to Configuration → Data Sources
Click Add data source
Select Prometheus
Set URL to http://localhost:9090
Click Save & Test

Import Fenine Dashboard

Download dashboard JSON:

wget https://raw.githubusercontent.com/fenines-network/monitoring-dashboards/main/fenine-node.json

In Grafana:

Click + → Import
Upload fenine-node.json
Select Prometheus data source
Click Import

The official Fenine dashboard includes panels for sync status, peer count, block height, gas usage, TPS, memory/CPU, and disk I/O.

Alerting Setup

Configure Alertmanager

Install Alertmanager:

cd /tmp
AM_VERSION="0.26.0"
wget "https://github.com/prometheus/alertmanager/releases/download/v${AM_VERSION}/alertmanager-${AM_VERSION}.linux-amd64.tar.gz"

tar -xzf alertmanager-${AM_VERSION}.linux-amd64.tar.gz
sudo mv alertmanager-${AM_VERSION}.linux-amd64 /opt/alertmanager

sudo mkdir -p /etc/alertmanager

Create /etc/alertmanager/alertmanager.yml:

global:
  resolve_timeout: 5m

route:
  receiver: 'email-notifications'
  group_by: ['alertname', 'instance']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h

receivers:
  - name: 'email-notifications'
    email_configs:
      - to: 'your-email@example.com'
        from: 'alertmanager@fene.network'
        smarthost: 'smtp.gmail.com:587'
        auth_username: 'your-email@gmail.com'
        auth_password: 'your-app-password'

Define Alert Rules

Create /etc/prometheus/alerts.yml:

groups:
  - name: fenine_node_alerts
    interval: 30s
    rules:
      # Sync alerts
      - alert: NodeNotSyncing
        expr: fenine_chain_head_block - fenine_chain_current_block > 200
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node is out of sync"
          description: "Node is {{ $value }} blocks behind"
      
      # Peer alerts
      - alert: LowPeerCount
        expr: fenine_p2p_peers < 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low peer count"
          description: "Only {{ $value }} peers connected"
      
      # Resource alerts
      - alert: HighCPUUsage
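        # rate() averages over the whole window; irate() uses only the last two samples and is too spiky for alerting
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90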
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage"
          description: "CPU usage is {{ $value }}%"
      
      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 95
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High memory usage"
          description: "Memory usage is {{ $value }}%"
      
      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Disk space low"
          description: "Only {{ $value }}% disk space remaining"
      
      # RPC alerts
      - alert: HighRPCErrorRate
        # Scaled to a percentage so that $value matches the "%" in the description
        expr: rate(fenine_rpc_requests_failed[5m]) / rate(fenine_rpc_requests_total[5m]) * 100 > 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High RPC error rate"
          description: "RPC error rate is {{ $value }}%"
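
To sanity-check an alert expression against the live system, the same arithmetic can be reproduced locally. The sketch below mirrors the HighMemoryUsage expression, (MemTotal - MemAvailable) / MemTotal * 100, using /proc/meminfo; mem_used_percent is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: memory usage as an integer percentage, computed the
# same way as the HighMemoryUsage rule. Reads /proc/meminfo by default;
# a different file can be passed as the first argument.
mem_used_percent() {
  awk '
    $1 == "MemTotal:"     { total = $2 }
    $1 == "MemAvailable:" { avail = $2 }
    END {
      if (total == 0) exit 1
      printf "%d\n", (total - avail) / total * 100
    }
  ' "${1:-/proc/meminfo}"
}

# Usage on the node itself:
#   mem_used_percent
```

If the local figure and the value reported by the alert diverge noticeably, the scrape target or the rule is probably not measuring the host you expect.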

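Update the Prometheus config to load the rules and to point at Alertmanager, which must be running for alerts to be delivered and listens on port 9093 by default: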

# Add to /etc/prometheus/prometheus.yml
rule_files:
  - "alerts.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']

Restart Prometheus:

sudo systemctl restart prometheus
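
A syntax error in any of these files will stop Prometheus or Alertmanager from starting, so it can help to validate them before each restart. The sketch below uses promtool and amtool, which ship in the Prometheus and Alertmanager release tarballs; the paths assume the /opt install locations used in this guide, and validate_monitoring_config is a hypothetical wrapper name.

```shell
#!/usr/bin/env bash
# Paths assume the tarballs were moved to /opt as shown earlier in this guide.
PROMTOOL="${PROMTOOL:-/opt/prometheus/promtool}"
AMTOOL="${AMTOOL:-/opt/alertmanager/amtool}"

# Hypothetical wrapper: returns non-zero as soon as one file fails validation.
validate_monitoring_config() {
  "$PROMTOOL" check config /etc/prometheus/prometheus.yml || return 1
  "$PROMTOOL" check rules /etc/prometheus/alerts.yml || return 1
  "$AMTOOL" check-config /etc/alertmanager/alertmanager.yml || return 1
}

# Usage:
#   validate_monitoring_config && sudo systemctl restart prometheus
```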

Log Monitoring

Centralized Logging with Loki

Install Loki:

cd /tmp
LOKI_VERSION="2.9.3"
wget "https://github.com/grafana/loki/releases/download/v${LOKI_VERSION}/loki-linux-amd64.zip"
unzip loki-linux-amd64.zip
sudo mv loki-linux-amd64 /usr/local/bin/loki

# Config
sudo mkdir -p /etc/loki
sudo nano /etc/loki/config.yml

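Basic config. The /tmp paths below are fine for a trial run, but /tmp is typically cleared on reboot; point them at persistent storage such as /var/lib/loki for production: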

auth_enabled: false

server:
  http_listen_port: 3100

ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1

schema_config:
  configs:
    - from: 2024-01-01
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /tmp/loki/boltdb-shipper-active
    cache_location: /tmp/loki/boltdb-shipper-cache
    shared_store: filesystem
  filesystem:
    directory: /tmp/loki/chunks

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h

Install Promtail (Log Shipper)

wget "https://github.com/grafana/loki/releases/download/v${LOKI_VERSION}/promtail-linux-amd64.zip"
unzip promtail-linux-amd64.zip
sudo mv promtail-linux-amd64 /usr/local/bin/promtail

sudo mkdir -p /etc/promtail
sudo nano /etc/promtail/config.yml

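Config (for production, move the positions file out of /tmp so Promtail does not lose its read offsets after a reboot):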

server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://localhost:3100/loki/api/v1/push

scrape_configs:
  # Read the systemd journal directly. Journal files are binary, so they
  # cannot be tailed through static_configs and __path__.
  - job_name: fenine
    journal:
      path: /var/log/journal
      max_age: 12h
      labels:
        job: fenine-logs
    relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: 'unit'

Add Loki to Grafana as a data source (URL http://localhost:3100), then query the logs in Explore.
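
Logs can also be queried from the command line through Loki's HTTP API, which is useful when Grafana is unavailable. The sketch below only builds the LogQL query string; loki_selector is a hypothetical helper, it does not escape quotes in its arguments, and the job label matches the Promtail config above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a LogQL query that selects one job and keeps
# only lines containing a given substring.
loki_selector() {
  local job="$1" needle="$2"
  printf '{job="%s"} |= "%s"\n' "$job" "$needle"
}

loki_selector fenine-logs error   # prints {job="fenine-logs"} |= "error"

# Usage (assumes Loki is listening on localhost:3100 as configured above):
#   curl -G -s http://localhost:3100/loki/api/v1/query_range \
#     --data-urlencode "query=$(loki_selector fenine-logs error)" \
#     --data-urlencode "limit=20"
```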

Quick Health Check Script

Create /usr/local/bin/fenine-health.sh (the script requires curl and jq):

#!/bin/bash
# Exits non-zero if any check is critical, so callers such as cron can act on it.

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

RPC_URL="http://localhost:8545"
STATUS=0

# Send a parameterless JSON-RPC call and print its .result field
rpc() {
  curl -s --max-time 5 -X POST "$RPC_URL" \
    -H "Content-Type: application/json" \
    -d "{\"jsonrpc\":\"2.0\",\"method\":\"$1\",\"params\":[],\"id\":1}" | jq -r '.result'
}

echo "=== Fenine Node Health Check ==="

# Service status
if systemctl is-active --quiet fenine; then
  echo -e "${GREEN}✓${NC} Node service is running"
else
  echo -e "${RED}✗${NC} Node service is DOWN"
  exit 1
fi

# Sync status (eth_syncing returns false once the node is synced)
SYNC=$(rpc eth_syncing)

if [ "$SYNC" = "false" ]; then
  echo -e "${GREEN}✓${NC} Node is synced"
elif [ -z "$SYNC" ] || [ "$SYNC" = "null" ]; then
  echo -e "${RED}✗${NC} RPC endpoint did not answer"
  STATUS=1
else
  echo -e "${YELLOW}⚠${NC} Node is syncing"
fi

# Peer count (hex string such as 0x1a); treat a missing answer as 0 peers
PEERS=$(rpc net_peerCount)

if [[ "$PEERS" =~ ^0x[0-9a-fA-F]+$ ]]; then
  PEER_COUNT=$((16#${PEERS:2}))
else
  PEER_COUNT=0
fi

# Thresholds follow the Node Health alert thresholds: < 5 warning, < 2 critical
if [ "$PEER_COUNT" -ge 5 ]; then
  echo -e "${GREEN}✓${NC} $PEER_COUNT peers connected"
elif [ "$PEER_COUNT" -ge 2 ]; then
  echo -e "${YELLOW}⚠${NC} $PEER_COUNT peers connected (low)"
else
  echo -e "${RED}✗${NC} $PEER_COUNT peers connected (critical)"
  STATUS=1
fi

# Disk usage (-P keeps each filesystem on a single line)
DISK_USAGE=$(df -P /var/lib/fenine | awk 'NR==2 {print $5}' | tr -d '%')

if [ "$DISK_USAGE" -lt 80 ]; then
  echo -e "${GREEN}✓${NC} Disk usage: ${DISK_USAGE}%"
elif [ "$DISK_USAGE" -lt 90 ]; then
  echo -e "${YELLOW}⚠${NC} Disk usage: ${DISK_USAGE}% (warning)"
else
  echo -e "${RED}✗${NC} Disk usage: ${DISK_USAGE}% (critical)"
  STATUS=1
fi

# Memory usage
MEM_USAGE=$(free | awk '/^Mem:/ {print int($3/$2 * 100)}')

if [ "$MEM_USAGE" -lt 85 ]; then
  echo -e "${GREEN}✓${NC} Memory usage: ${MEM_USAGE}%"
else
  echo -e "${YELLOW}⚠${NC} Memory usage: ${MEM_USAGE}%"
fi

echo "================================"
exit $STATUS

Make executable:

sudo chmod +x /usr/local/bin/fenine-health.sh

Run health check:

fenine-health.sh
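
JSON-RPC methods such as net_peerCount return quantities as 0x-prefixed hex strings, which is why the script converts the value before comparing it. The sketch below shows a defensive variant of that conversion; hex_to_dec is a hypothetical helper, and it uses printf rather than the script's $((16#...)) form so that it also works in plain POSIX shells.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a 0x-prefixed hex quantity to decimal.
# Returns non-zero for empty or malformed input (for example "null",
# which jq -r prints when the RPC response has no result).
hex_to_dec() {
  local hex="${1#0x}"
  case "$hex" in
    ''|*[!0-9a-fA-F]*) return 1 ;;
  esac
  printf '%d\n' "0x$hex"
}

hex_to_dec 0x1a                                   # prints 26
hex_to_dec "null" || echo "no usable peer count"  # prints the fallback message
```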

Automated Monitoring Cron

Edit root's crontab (journalctl --vacuum-time needs root privileges):

sudo crontab -e

Add:

# Health check every 5 minutes, alert on failure
*/5 * * * * /usr/local/bin/fenine-health.sh || echo "Fenine node health check failed" | mail -s "Node Alert" your-email@example.com

# Disk cleanup weekly
0 2 * * 0 journalctl --vacuum-time=7d

Next Steps

Upgrade Guide: Keep your node up to date
Backup & Recovery: Protect your node data
Troubleshooting: Fix common issues
Grafana Dashboards: Official monitoring templates
