Introduction

Proper monitoring ensures your Fenine node runs reliably and helps you detect issues before they become critical. This guide covers monitoring setup using industry-standard tools.
Recommended Stack: Prometheus + Grafana + Node Exporter

Metrics to Monitor

Critical Metrics

Node health:
  • Sync Status: Node is fully synchronized
  • Peer Count: Connected to 20+ peers
  • Block Height: Matches network head
  • Chain Forks: No unexpected reorgs
Alert Thresholds:
  • Peers < 5: Warning
  • Peers < 2: Critical
  • Sync lag > 50 blocks: Warning
  • Sync lag > 200 blocks: Critical

System resources:
  • CPU Usage: Should stay < 80%
  • Memory Usage: Should stay < 85%
  • Disk Usage: Alert at 80% full
  • Disk I/O: Monitor for bottlenecks
  • Network Bandwidth: Track usage trends
Alert Thresholds:
  • CPU > 90% for 5min: Warning
  • Memory > 95%: Critical
  • Disk > 90%: Critical
  • Disk I/O wait > 50%: Warning

RPC performance:
  • Request Rate: Requests per second
  • Response Time: p50, p95, p99 latency
  • Error Rate: Failed requests
  • Concurrent Connections: Active clients
Alert Thresholds:
  • Response time p95 > 1s: Warning
  • Error rate > 5%: Warning
  • Error rate > 10%: Critical

Network activity:
  • Block Production: New blocks every 3s
  • Transaction Pool: Pending tx count
  • Gas Price: Current base fee
  • Validator Set: Active validators
Alert Thresholds:
  • No new block > 15s: Warning
  • No new block > 30s: Critical
  • Mempool > 10,000 tx: Warning
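
As a rough illustration of how the peer and sync-lag thresholds listed above translate into a check, the following shell sketch maps a value to a severity. The function names `classify_peers` and `classify_sync_lag` are hypothetical and not part of any Fenine tooling; only the cutoffs come from this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: map a measurement to a severity using the
# thresholds from this guide.

# Peers < 2 is critical, peers < 5 is a warning.
classify_peers() {
  local peers="$1"
  if [ "$peers" -lt 2 ]; then
    echo "critical"
  elif [ "$peers" -lt 5 ]; then
    echo "warning"
  else
    echo "ok"
  fi
}

# Sync lag > 200 blocks is critical, > 50 blocks is a warning.
classify_sync_lag() {
  local lag="$1"
  if [ "$lag" -gt 200 ]; then
    echo "critical"
  elif [ "$lag" -gt 50 ]; then
    echo "warning"
  else
    echo "ok"
  fi
}

classify_peers 3        # prints "warning"
classify_sync_lag 120   # prints "warning"
```

Checking the critical bound first matters: a peer count of 1 satisfies both conditions, and it should be reported at the higher severity.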

Prometheus Setup

Install Prometheus

# Download Prometheus
cd /tmp
PROM_VERSION="2.48.0"
wget "https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/prometheus-${PROM_VERSION}.linux-amd64.tar.gz"

# Extract and install
tar -xzf prometheus-${PROM_VERSION}.linux-amd64.tar.gz
sudo mv prometheus-${PROM_VERSION}.linux-amd64 /opt/prometheus

# Create directories
sudo mkdir -p /etc/prometheus /var/lib/prometheus

# Move config
sudo mv /opt/prometheus/prometheus.yml /etc/prometheus/
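
The download above is not verified before it is installed. Prometheus publishes a `sha256sums.txt` file alongside each release, so one option is to compare digests before extracting. The helper name `verify_sha256` below is hypothetical; this is a sketch of the idea rather than an official install step.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if FILE's SHA-256 digest equals EXPECTED.
verify_sha256() {
  local file="$1" expected="$2" actual
  actual=$(sha256sum "$file" 2>/dev/null | awk '{print $1}')
  # An empty digest means the file could not be read; treat that as a failure.
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage sketch (needs network access, so it is left commented out):
#   PROM_VERSION="2.48.0"
#   TARBALL="prometheus-${PROM_VERSION}.linux-amd64.tar.gz"
#   wget "https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/sha256sums.txt"
#   EXPECTED=$(awk -v f="$TARBALL" '$2 == f {print $1}' sha256sums.txt)
#   verify_sha256 "$TARBALL" "$EXPECTED" && echo "checksum OK"
```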

Configure Prometheus

Create /etc/prometheus/prometheus.yml:
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'fenine-node'

scrape_configs:
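  # Fenine node metrics
  # Upstream geth serves Prometheus metrics at /debug/metrics/prometheus rather
  # than the default /metrics. If fene-geth does the same, uncomment metrics_path.
  - job_name: 'fenine'
    # metrics_path: /debug/metrics/prometheus
    static_configs:
      - targets: ['localhost:6060']
        labels:
          instance: 'fenine-mainnet'

  # System metrics
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['localhost:9100']

  # Process metrics (requires process-exporter, which this guide does not install)
  - job_name: 'process-exporter'
    static_configs:
      - targets: ['localhost:9256']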

Enable Metrics in Fene-Geth

Update /var/lib/fenine/config.toml:
[Metrics]
Enabled = true
HTTP = "0.0.0.0"
Port = 6060
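Binding to 0.0.0.0 exposes the metrics port on every interface. If Prometheus runs on the same host, as in this guide, 127.0.0.1 is sufficient.
Alternatively, pass the same settings as flags in the systemd service: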
sudo nano /etc/systemd/system/fenine.service
Add --metrics --metrics.addr 0.0.0.0 --metrics.port 6060 to ExecStart:
ExecStart=/usr/local/bin/fene-geth \
  --config /var/lib/fenine/config.toml \
  --metrics \
  --metrics.addr 0.0.0.0 \
  --metrics.port 6060 \
  --cache 4096 \
  --maxpeers 50
Restart node:
sudo systemctl daemon-reload
sudo systemctl restart fenine
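
After the restart it helps to confirm that the metrics endpoint actually answers. The sketch below reads Prometheus text-format output on stdin and prints the value of one metric. The helper name `metric_value` is hypothetical, and the `/debug/metrics/prometheus` path in the usage comment assumes fene-geth follows upstream geth; the metric name is the one used in this guide's alert rules.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Prometheus text exposition format on stdin and
# print the value of the first sample whose metric name matches exactly.
# Assumes samples carry no trailing timestamp and label values contain no spaces.
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                     # skip HELP/TYPE comment lines
    {
      metric = $1
      sub(/\{.*/, "", metric)         # drop any {label="..."} suffix
      if (metric == name) { print $NF; exit }
    }'
}

# Usage sketch (requires the node to be running):
#   curl -s http://localhost:6060/debug/metrics/prometheus | metric_value fenine_p2p_peers
```

An empty result means either the endpoint is unreachable or the metric is exported under a different name, which is worth checking before writing alert rules against it.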

Create Prometheus Service

sudo nano /etc/systemd/system/prometheus.service
Add:
[Unit]
Description=Prometheus Monitoring
After=network.target

[Service]
Type=simple
User=prometheus
Group=prometheus
ExecStart=/opt/prometheus/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus/ \
  --web.console.templates=/opt/prometheus/consoles \
  --web.console.libraries=/opt/prometheus/console_libraries

Restart=always

[Install]
WantedBy=multi-user.target
Create user and set permissions:
sudo useradd -rs /bin/false prometheus
sudo chown -R prometheus:prometheus /etc/prometheus /var/lib/prometheus /opt/prometheus
Start Prometheus:
sudo systemctl daemon-reload
sudo systemctl enable prometheus
sudo systemctl start prometheus
Verify that the Prometheus UI loads at http://YOUR_IP:9090, then open Status → Targets to confirm the scrape targets are up.
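
Prometheus can take a few seconds to become ready after it starts, so a scripted check usually needs to retry. The sketch below is a generic retry loop; the `retry` helper is hypothetical, while `/-/ready` is Prometheus's built-in readiness endpoint.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping DELAY
# seconds between tries; return 0 as soon as the command succeeds.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    # Only sleep if another attempt is coming.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage sketch (requires Prometheus to be running locally):
#   retry 10 3 curl -sf http://localhost:9090/-/ready && echo "Prometheus is ready"
```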

Node Exporter Setup

Install Node Exporter to collect system metrics:
# Download Node Exporter
cd /tmp
NODE_EXP_VERSION="1.7.0"
wget "https://github.com/prometheus/node_exporter/releases/download/v${NODE_EXP_VERSION}/node_exporter-${NODE_EXP_VERSION}.linux-amd64.tar.gz"

# Extract and install
tar -xzf node_exporter-${NODE_EXP_VERSION}.linux-amd64.tar.gz
sudo mv node_exporter-${NODE_EXP_VERSION}.linux-amd64/node_exporter /usr/local/bin/

# Create service
sudo nano /etc/systemd/system/node-exporter.service
Add:
[Unit]
Description=Node Exporter
After=network.target

[Service]
Type=simple
User=node_exporter
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
Create the service user and start the service:
sudo useradd -rs /bin/false node_exporter
sudo systemctl daemon-reload
sudo systemctl enable node-exporter
sudo systemctl start node-exporter

Grafana Setup

Install Grafana

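# Add Grafana repository (apt-key is deprecated, so the key goes into a dedicated keyring)
sudo apt-get install -y apt-transport-https software-properties-common wget
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list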

# Install
sudo apt-get update
sudo apt-get install -y grafana

# Start service
sudo systemctl daemon-reload
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
Access Grafana at http://YOUR_IP:3000 (default login: admin/admin). Grafana prompts for a new password on first login.

Add Prometheus Data Source

  1. Navigate to Configuration → Data Sources
  2. Click Add data source
  3. Select Prometheus
  4. Set URL to http://localhost:9090
  5. Click Save & Test
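
The same data source can also be registered without the UI through Grafana's HTTP API (`POST /api/datasources`). The sketch below only builds the JSON payload; the helper name `datasource_payload` is hypothetical, and the credentials in the usage comment assume the default admin login is still in place.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the JSON body for a Prometheus data source.
# $1 = display name in Grafana, $2 = Prometheus URL.
datasource_payload() {
  printf '{"name":"%s","type":"prometheus","url":"%s","access":"proxy","isDefault":true}\n' "$1" "$2"
}

# Usage sketch (requires Grafana to be running locally):
#   curl -s -X POST -u admin:admin \
#     -H 'Content-Type: application/json' \
#     -d "$(datasource_payload Prometheus http://localhost:9090)" \
#     http://localhost:3000/api/datasources
```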

Import Fenine Dashboard

Download dashboard JSON:
wget https://raw.githubusercontent.com/fenines-network/monitoring-dashboards/main/fenine-node.json
In Grafana:
  1. Click + → Import
  2. Upload fenine-node.json
  3. Select Prometheus data source
  4. Click Import
The official Fenine dashboard includes sync status, peer count, block height, gas usage, TPS, memory/CPU, and disk I/O.

Alerting Setup

Configure Alertmanager

Install Alertmanager:
cd /tmp
AM_VERSION="0.26.0"
wget "https://github.com/prometheus/alertmanager/releases/download/v${AM_VERSION}/alertmanager-${AM_VERSION}.linux-amd64.tar.gz"

tar -xzf alertmanager-${AM_VERSION}.linux-amd64.tar.gz
sudo mv alertmanager-${AM_VERSION}.linux-amd64 /opt/alertmanager

sudo mkdir -p /etc/alertmanager
Create /etc/alertmanager/alertmanager.yml:
global:
  resolve_timeout: 5m

route:
  receiver: 'email-notifications'
  group_by: ['alertname', 'instance']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h

receivers:
  - name: 'email-notifications'
    email_configs:
      - to: 'your-email@example.com'
        from: 'alertmanager@fene.network'
        smarthost: 'smtp.gmail.com:587'
        auth_username: 'your-email@gmail.com'
        auth_password: 'your-app-password'
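
The steps above install and configure Alertmanager but do not start it. One way to run it is a systemd unit modeled on the Prometheus unit from this guide. The sketch below writes such a unit to a path you choose. The helper name `write_alertmanager_unit`, the reuse of the `prometheus` user, and the `/var/lib/alertmanager` storage directory are all assumptions, not part of the original setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a systemd unit for Alertmanager to the given path.
# Assumes the binary lives in /opt/alertmanager and runs as the "prometheus" user.
write_alertmanager_unit() {
  local unit_path="$1"
  cat > "$unit_path" <<'EOF'
[Unit]
Description=Alertmanager
After=network.target

[Service]
Type=simple
User=prometheus
Group=prometheus
ExecStart=/opt/alertmanager/alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --storage.path=/var/lib/alertmanager
Restart=always

[Install]
WantedBy=multi-user.target
EOF
}

# Usage sketch (run as root):
#   mkdir -p /var/lib/alertmanager
#   chown -R prometheus:prometheus /var/lib/alertmanager /etc/alertmanager
#   /opt/alertmanager/amtool check-config /etc/alertmanager/alertmanager.yml
#   write_alertmanager_unit /etc/systemd/system/alertmanager.service
#   systemctl daemon-reload && systemctl enable --now alertmanager
```

The `amtool check-config` step validates the YAML before the service starts, which catches indentation mistakes in the receiver block early.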

Define Alert Rules

Create /etc/prometheus/alerts.yml:
groups:
  - name: fenine_node_alerts
    interval: 30s
    rules:
      # Sync alerts
      - alert: NodeNotSyncing
        expr: fenine_chain_head_block - fenine_chain_current_block > 200
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Node is out of sync"
          description: "Node is {{ $value }} blocks behind"
      
      # Peer alerts
      - alert: LowPeerCount
        expr: fenine_p2p_peers < 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low peer count"
          description: "Only {{ $value }} peers connected"
      
      # Resource alerts
      - alert: HighCPUUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage"
          description: "CPU usage is {{ $value }}%"
      
      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 95
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High memory usage"
          description: "Memory usage is {{ $value }}%"
      
      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Disk space low"
          description: "Only {{ $value }}% disk space remaining"
      
      # RPC alerts
      - alert: HighRPCErrorRate
        expr: rate(fenine_rpc_requests_failed[5m]) / rate(fenine_rpc_requests_total[5m]) > 0.10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High RPC error rate"
          description: "RPC error rate is {{ $value | humanizePercentage }}"
Update Prometheus config to include rules:
# Add to /etc/prometheus/prometheus.yml
rule_files:
  - "alerts.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
Restart Prometheus:
sudo systemctl restart prometheus
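
A syntax error in `prometheus.yml` or `alerts.yml` will stop Prometheus from starting, so it can be worth validating both files before the restart. The Prometheus tarball ships `promtool`, which has `check config` and `check rules` subcommands. The wrapper below is a sketch: the function name `validate_and_restart` and the two override variables are hypothetical, and the `promtool` path assumes the install location used earlier in this guide.

```shell
#!/usr/bin/env bash
# Both variables can be overridden from the environment; the defaults follow
# the paths used in this guide.
PROMTOOL="${PROMTOOL:-/opt/prometheus/promtool}"
RESTART_CMD="${RESTART_CMD:-sudo systemctl restart prometheus}"

# Hypothetical helper: restart Prometheus only if both files validate.
validate_and_restart() {
  "$PROMTOOL" check config /etc/prometheus/prometheus.yml || return 1
  "$PROMTOOL" check rules /etc/prometheus/alerts.yml || return 1
  # Intentionally unquoted so the command string splits into words.
  $RESTART_CMD
}

# Usage sketch:
#   validate_and_restart && echo "Prometheus restarted with a valid config"
```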

Log Monitoring

Centralized Logging with Loki

Install Loki:
cd /tmp
LOKI_VERSION="2.9.3"
wget "https://github.com/grafana/loki/releases/download/v${LOKI_VERSION}/loki-linux-amd64.zip"
unzip loki-linux-amd64.zip
sudo mv loki-linux-amd64 /usr/local/bin/loki

# Config
sudo mkdir -p /etc/loki
sudo nano /etc/loki/config.yml
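Basic config (data is stored under /tmp/loki, which may be cleared on reboot; use a persistent directory such as /var/lib/loki for anything beyond testing):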
auth_enabled: false

server:
  http_listen_port: 3100

ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1

schema_config:
  configs:
    - from: 2024-01-01
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /tmp/loki/boltdb-shipper-active
    cache_location: /tmp/loki/boltdb-shipper-cache
    shared_store: filesystem
  filesystem:
    directory: /tmp/loki/chunks

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h

Install Promtail (Log Shipper)

wget "https://github.com/grafana/loki/releases/download/v${LOKI_VERSION}/promtail-linux-amd64.zip"
unzip promtail-linux-amd64.zip
sudo mv promtail-linux-amd64 /usr/local/bin/promtail

sudo mkdir -p /etc/promtail
sudo nano /etc/promtail/config.yml
Config:
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://localhost:3100/loki/api/v1/push

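scrape_configs:
  # The systemd journal is a binary format, so it is read with a journal
  # scrape config rather than a file glob under static_configs.
  - job_name: fenine
    journal:
      max_age: 12h
      path: /var/log/journal
      labels:
        job: fenine-logs
    relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: 'unit'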
Add Loki to Grafana as a data source (URL http://localhost:3100), then browse the logs in Explore.
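
Logs can also be queried from the command line through Loki's HTTP API, which is handy for confirming that Promtail is shipping anything at all. The sketch below builds a LogQL selector for one systemd unit. The helper name `logql_for_unit` is hypothetical, and it assumes the `job` and `unit` labels from the Promtail config above are actually populated.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a LogQL stream selector for one systemd unit,
# using the job label from this guide's Promtail config.
logql_for_unit() {
  printf '{job="fenine-logs", unit="%s"}\n' "$1"
}

# Usage sketch (requires Loki and Promtail to be running):
#   curl -G -s http://localhost:3100/loki/api/v1/query_range \
#     --data-urlencode "query=$(logql_for_unit fenine.service)" \
#     --data-urlencode 'limit=20'
```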

Quick Health Check Script

Create /usr/local/bin/fenine-health.sh:
#!/bin/bash
# Quick health check for a Fenine node.
# Requires curl and jq. Exits non-zero if any check is critical.

RPC_URL="http://localhost:8545"
STATUS=0

# Colors
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

# Call a parameterless JSON-RPC method and print its .result field
rpc_call() {
  curl -s -X POST "$RPC_URL" \
    -H "Content-Type: application/json" \
    -d "{\"jsonrpc\":\"2.0\",\"method\":\"$1\",\"params\":[],\"id\":1}" | jq -r '.result'
}

echo "=== Fenine Node Health Check ==="

# Service status
if systemctl is-active --quiet fenine; then
  echo -e "${GREEN}✓${NC} Node service is running"
else
  echo -e "${RED}✗${NC} Node service is DOWN"
  exit 1
fi

# Sync status: eth_syncing returns false once the node is synced
SYNC=$(rpc_call eth_syncing)

if [ "$SYNC" = "false" ]; then
  echo -e "${GREEN}✓${NC} Node is synced"
elif [ -z "$SYNC" ] || [ "$SYNC" = "null" ]; then
  echo -e "${RED}✗${NC} RPC did not respond at $RPC_URL"
  STATUS=1
else
  echo -e "${YELLOW}⚠${NC} Node is syncing"
fi

# Peer count: net_peerCount returns a hex string such as 0x19
PEERS=$(rpc_call net_peerCount)

if [[ "$PEERS" =~ ^0x[0-9a-fA-F]+$ ]]; then
  PEER_COUNT=$((16#${PEERS:2}))
else
  PEER_COUNT=0
fi

if [ "$PEER_COUNT" -gt 10 ]; then
  echo -e "${GREEN}✓${NC} $PEER_COUNT peers connected"
elif [ "$PEER_COUNT" -gt 5 ]; then
  echo -e "${YELLOW}⚠${NC} $PEER_COUNT peers connected (low)"
else
  echo -e "${RED}✗${NC} $PEER_COUNT peers connected (critical)"
  STATUS=1
fi

# Disk usage (-P keeps each filesystem on one line)
DISK_USAGE=$(df -P /var/lib/fenine | awk 'NR==2 {print $5}' | sed 's/%//')

if [ "$DISK_USAGE" -lt 80 ]; then
  echo -e "${GREEN}✓${NC} Disk usage: ${DISK_USAGE}%"
elif [ "$DISK_USAGE" -lt 90 ]; then
  echo -e "${YELLOW}⚠${NC} Disk usage: ${DISK_USAGE}% (warning)"
else
  echo -e "${RED}✗${NC} Disk usage: ${DISK_USAGE}% (critical)"
  STATUS=1
fi

# Memory usage
MEM_USAGE=$(free | awk '/^Mem:/ {print int($3/$2 * 100)}')

if [ "$MEM_USAGE" -lt 85 ]; then
  echo -e "${GREEN}✓${NC} Memory usage: ${MEM_USAGE}%"
else
  echo -e "${YELLOW}⚠${NC} Memory usage: ${MEM_USAGE}%"
fi

echo "================================"
exit "$STATUS"
Make executable:
sudo chmod +x /usr/local/bin/fenine-health.sh
Run health check:
fenine-health.sh
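
The peer count arrives from `net_peerCount` as a hex string (for example `0x19`), which the script converts to decimal. As a standalone illustration of that conversion with input validation, here is a small sketch; the helper name `hex_to_dec` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a JSON-RPC hex quantity such as "0x19" to
# decimal. Fails on anything that is not "0x" followed by hex digits, which
# covers the empty or "null" output produced when the RPC is unreachable.
hex_to_dec() {
  case "$1" in
    0x) return 1 ;;                       # prefix with no digits
    0x*[!0-9a-fA-F]*) return 1 ;;         # non-hex character after the prefix
    0x*) printf '%d\n' "$1" ;;            # printf accepts 0x-prefixed integers
    *) return 1 ;;                        # missing prefix, empty, or "null"
  esac
}

hex_to_dec 0x19                            # prints 25
hex_to_dec null || echo "invalid input"    # prints "invalid input"
```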

Automated Monitoring Cron

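Add to root's crontab (journalctl --vacuum-time needs root privileges):
sudo crontab -e
Add:
# Health check every 5 minutes; email on a non-zero exit (requires a working mail command, e.g. from mailutils)
*/5 * * * * /usr/local/bin/fenine-health.sh || echo "Fenine node health check failed" | mail -s "Node Alert" your-email@example.com

# Weekly journal cleanup (Sundays at 02:00): delete entries older than 7 days
0 2 * * 0 journalctl --vacuum-time=7d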

Next Steps

  • Upgrade Guide: Keep your node up to date
  • Backup & Recovery: Protect your node data
  • Troubleshooting: Fix common issues
  • Grafana Dashboards: Official monitoring templates