Automating SnapRAID Tasks

While SnapRAID is an amazing piece of software, it has its limitations. Mainly, it does not offer real-time protection: sync, scrub and diff jobs need to be triggered manually, unless…

There are many ways to automate these tasks. Zack Reed has created an amazing snapraid-helper script that he updates from time to time; it automates these tasks and has some added benefits. I use a version of this script modified to suit my needs.

Read the other parts:
Part 1: A DIY NAS with a Twist
Part 2: MergerFS + SnapRaid: Installation and Setup

Step 1: Automate the process:

Note: this script is based on Zack Reed’s amazing snapraid-helper script; I modified it to suit my needs.

Create the script for automation:

sudo nano /var/snapraid/snapraid-helper.sh

Copy and paste this script:

#!/usr/bin/env bash

#######################################################################
# This script is based on Zack Reed's original script found at 
# https://zackreed.me/posts/modern-snapraid-maintenance-script/
# Removed email notifications and added a few other optimizations
#
# What this script does (in order):
#   1) Optionally pauses configured Docker containers to prevent file
#      changes occurring during the sync window.
#   2) Verifies all SnapRAID content and parity files exist on disk.
#   3) Checks for files with zero sub-second timestamps and fixes them
#      with `snapraid touch` if needed.
#   4) Runs `snapraid diff` to detect what has changed since last sync.
#   5) Compares deleted/updated file counts against configured thresholds.
#      - If thresholds are exceeded: warns and optionally forces a sync
#        after N cumulative warnings (SYNC_WARN_THRESHOLD).
#      - If thresholds are not exceeded: authorises the sync.
#   6) Runs `snapraid sync` if authorised.
#   7) Runs `snapraid scrub` (partial, configurable %) if the array is
#      in a safe state (in-sync or sync just completed successfully).
#   8) Optionally logs SMART attributes via `snapraid smart`.
#   9) Optionally spins down disks via `snapraid down`.
#  10) Restores any paused Docker containers.
#  11) Saves the full run log to disk.
#
# Email and Healthchecks integrations have been removed.
# Use cron/systemd journal or a log shipper for alerting instead.
#######################################################################

#######################
# USER CONFIGURATION  #
#######################

# --- Threshold: how many files may be DELETED before sync is blocked.
# If this many files or more were removed since the last sync, the script
# will refuse to sync automatically (to protect against accidental mass
# deletion). Adjust to suit your expected churn.
DEL_THRESHOLD=100

# --- Threshold: how many files may be UPDATED before sync is blocked.
# Same safety principle as DEL_THRESHOLD but for modified files.
UP_THRESHOLD=500

# --- Forced-sync policy when a threshold is breached:
#   0  -> always force a sync regardless of thresholds (DANGEROUS – use carefully)
#  -1  -> never force; manual intervention required every time (safest)
#   N  -> automatically force a sync once N consecutive warnings have
#          accumulated (warning count is stored in SYNC_WARN_FILE below)
SYNC_WARN_THRESHOLD=-1

# --- Scrub settings.
# SCRUB_PERCENT: percentage of the array to verify per run.
#   0   = scrub disabled
#   100 = full array verified in one run (can take many hours)
#   3   = light rolling scrub (recommended for daily/weekly cron)
# SCRUB_AGE: only scrub blocks not verified in the last N days.
SCRUB_PERCENT=3
SCRUB_AGE=10

# --- Spin down array disks after all jobs finish.
#   1 = run `snapraid down`  (saves power; disks must spin up on next access)
#   0 = leave disks spinning (useful when other services are still active)
SPINDOWN_DISKS=0

# --- Log SMART disk health attributes after each run.
#   1 = run `snapraid smart`
#   0 = skip
SMART_LOG=1

# --- Binary paths. Override if your system uses non-standard locations.
SNAPRAID_BIN="/usr/local/bin/snapraid"
DOCKER_BIN="/usr/bin/docker"

# --- SnapRAID configuration file path.
SNAPRAID_CONF="/etc/snapraid.conf"

# --- Docker service management.
# Set MANAGE_SERVICES=1 to pause these containers before diff/sync and
# unpause them afterwards, preventing mid-sync file changes.
# PAUSED_SERVICES is populated at runtime — do not edit it here.
MANAGE_SERVICES=1
SERVICES=(sabnzbd sonarr radarr lidarr)
PAUSED_SERVICES=()

# --- Persistent warning counter file.
# Stores the number of consecutive threshold-breach warnings so the
# forced-sync logic (SYNC_WARN_THRESHOLD > 0) survives across cron runs.
SYNC_WARN_FILE="/tmp/snapRAID.warnCount"

# --- Lock file to prevent overlapping runs (e.g. from cron overlap).
LOCK_FILE="/tmp/snapraid-sync.lock"

# --- Fail-fast behaviour on snapraid command errors.
#   0 = log warning and continue (best-effort run)
#   1 = abort immediately on any non-zero exit (except diff rc=2 which is normal)
FAIL_FAST=1

# --- Directory where full run logs are written.
# Each run creates a timestamped file: snapraid-<host>-<timestamp>.log
LOG_DIR="/var/log/snapraid"

############################
# DO NOT EDIT BELOW THIS   #
############################

# Exit immediately on unset variables and propagate pipe failures.
set -u
set -o pipefail
# Enable lastpipe so the final command in a pipeline runs in the current
# shell (needed for correct PIPESTATUS capture on some bash versions).
shopt -s lastpipe 2>/dev/null || true

# --- Runtime state variables (not user-configurable) ---

# Wall-clock timer; bash increments $SECONDS automatically.
SECONDS=0

# Path to the temporary log file accumulating all output this run.
TMP_OUTPUT=""

# Path where the final log is persisted to disk.
FULL_LOG_FILE=""

# Change counts parsed from `snapraid diff` output.
DEL_COUNT=""
ADD_COUNT=""
MOVE_COUNT=""
COPY_COUNT=""
UPDATE_COUNT=""
RESTORED_COUNT=""

# Sync decision flags.
CHK_FAIL=0   # Set to 1 if a threshold is breached.
DO_SYNC=0    # Set to 1 if sync is authorised.

# Loaded from SYNC_WARN_FILE at runtime.
SYNC_WARN_COUNT=""

# Exit-code captures for each major job.
DIFF_RC=0
SYNC_RC=0
SCRUB_RC=0
SMART_RC=0
DOWN_RC=0
TOUCH_RC=0
SERVICE_RC=0

# Set to 1 if any job returns a non-zero exit code.
HAD_FAILURE=0

# Human-readable list of jobs that ran this execution (for the log summary).
JOBS_DONE=""

# Service management counters (for the end-of-run summary log).
SERVICES_PAUSED_COUNT=0
SERVICES_RESTORED_COUNT=0
SERVICES_FAILED_PAUSE=0
SERVICES_FAILED_RESTORE=0

#######################################################################
# HELPER FUNCTIONS
#######################################################################

# log MESSAGE
# Writes a message to stdout only. It is not appended to TMP_OUTPUT, so
# capture stdout (e.g. via the cron redirect) to keep these messages.
log() {
  printf '%s\n' "$*"
}

# die MESSAGE
# Logs a fatal error and exits with code 1.
# Always terminates the script — use only for unrecoverable errors.
die() {
  log "**ERROR** $*"
  exit 1
}

# have_cmd COMMAND
# Returns 0 (true) if COMMAND is found in PATH, 1 otherwise.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# format_duration SECONDS
# Prints a human-readable duration string, e.g. "2h 5m 30s".
format_duration() {
  local total=$1
  local h=$(( total / 3600 ))
  local m=$(( (total % 3600) / 60 ))
  local s=$(( total % 60 ))
  if (( h > 0 )); then
    printf '%dh %dm %ds' "$h" "$m" "$s"
  elif (( m > 0 )); then
    printf '%dm %ds' "$m" "$s"
  else
    printf '%ds' "$s"
  fi
}

# require_bins
# Verifies every binary the script depends on is present and executable.
# Calls die() and aborts if anything is missing.
require_bins() {
  [[ -x "$SNAPRAID_BIN" ]] || die "snapraid binary not found/executable: $SNAPRAID_BIN"
  [[ -f "$SNAPRAID_CONF" ]] || die "snapraid config not found: $SNAPRAID_CONF"

  if (( MANAGE_SERVICES == 1 )); then
    [[ -x "$DOCKER_BIN" ]] || die "docker binary not found/executable: $DOCKER_BIN"
  fi

  # Core utilities used throughout the script.
  for b in awk sed grep hostname date tee mkdir mktemp; do
    have_cmd "$b" || die "Required utility not found in PATH: $b"
  done
}

# section TITLE
# Prints a visible section divider to the log for readability.
section() {
  log
  log "----------------------------------------"
  log "$1"
}

#######################################################################
# JOB MARKERS AND COMMAND RUNNER
#######################################################################

# mark_begin JOB_NAME
# Writes a structured BEGIN marker into TMP_OUTPUT.
# These markers let us reliably detect whether a job started and ended,
# and let get_counts() extract only the DIFF block when parsing output.
mark_begin() {
  printf '__SNAPRAID_%s_BEGIN__ [%s]\n' "$1" "$(date)" >> "$TMP_OUTPUT"
}

# mark_end JOB_NAME EXIT_CODE
# Writes a structured END marker with the job's exit code into TMP_OUTPUT.
mark_end() {
  printf '__SNAPRAID_%s_END__ [%s] rc=%s\n\n' "$1" "$(date)" "$2" >> "$TMP_OUTPUT"
}

# marker_end_present JOB_NAME
# Returns 0 (true) if TMP_OUTPUT contains a completed END marker for JOB_NAME.
# Used as a safety check before running dependent jobs (e.g. scrub after sync).
marker_end_present() {
  grep -q "__SNAPRAID_${1}_END__" "$TMP_OUTPUT"
}

# is_snapraid_diff_ok EXIT_CODE
# Returns 0 (true) for acceptable diff exit codes.
# snapraid diff intentionally returns rc=2 when differences are found —
# this is normal and must NOT be treated as an error.
is_snapraid_diff_ok() {
  [[ "$1" -eq 0 || "$1" -eq 2 ]]
}

# run_cmd JOB_NAME COMMAND [ARGS...]
# Runs a snapraid command with:
#   - BEGIN/END markers written to TMP_OUTPUT
#   - stdout+stderr tee'd to TMP_OUTPUT
#   - exit code captured and checked
#   - HAD_FAILURE set on non-zero exit
#   - die() called if FAIL_FAST=1
run_cmd() {
  local name="$1"; shift

  mark_begin "$name"

  # Run the command, merging stderr into stdout and tee'ing to the log.
  {
    printf '###%s [%s]\n' "$name" "$(date)"
    "$@"
  } 2>&1 | tee -a "$TMP_OUTPUT"

  # PIPESTATUS[0] captures the exit code of the command before `tee`.
  local rc=${PIPESTATUS[0]}
  mark_end "$name" "$rc"

  if (( rc != 0 )); then
    HAD_FAILURE=1
    log "**WARNING** ${name} returned non-zero exit code: ${rc}"
    if (( FAIL_FAST == 1 )); then
      die "${name} failed with rc=${rc} (FAIL_FAST=1)"
    fi
  fi

  return "$rc"
}

#######################################################################
# DOCKER SERVICE MANAGEMENT
#######################################################################

# service_pause
# Iterates over SERVICES and pauses any that are currently running.
# Only running containers are paused; stopped/missing ones are skipped.
# Paused container names are collected in PAUSED_SERVICES for later restore.
service_pause() {
  local s running
  for s in "${SERVICES[@]}"; do
    # Ask Docker for the running state of this container.
    running="$("$DOCKER_BIN" inspect -f '{{.State.Running}}' "$s" 2>/dev/null || true)"

    if [[ "$running" == "true" ]]; then
      log "Pausing Service - ${s}"
      if "$DOCKER_BIN" pause "$s" >/dev/null 2>&1; then
        PAUSED_SERVICES+=("$s")
        (( SERVICES_PAUSED_COUNT++ ))
      else
        log "WARNING: failed to pause $s"
        (( SERVICES_FAILED_PAUSE++ ))
        SERVICE_RC=1
        HAD_FAILURE=1
      fi
    elif [[ "$running" == "false" ]]; then
      log "Service not running (skip pause) - ${s}"
    else
      # Container doesn't exist or Docker returned unexpected output.
      log "Service not found (skip pause) - ${s}"
    fi
  done
}

# service_unpause
# Unpauses every container that was successfully paused by service_pause().
# Only unpauses containers that are still in a "paused" state (handles the
# edge case where a container was manually resumed externally).
service_unpause() {
  local s st
  for s in "${PAUSED_SERVICES[@]}"; do
    st="$("$DOCKER_BIN" inspect -f '{{.State.Status}}' "$s" 2>/dev/null || true)"
    if [[ "$st" == "paused" ]]; then
      log "Unpausing Service - ${s}"
      if "$DOCKER_BIN" unpause "$s" >/dev/null 2>&1; then
        (( SERVICES_RESTORED_COUNT++ ))
      else
        log "WARNING: failed to unpause $s"
        (( SERVICES_FAILED_RESTORE++ ))
        SERVICE_RC=1
        HAD_FAILURE=1
      fi
    else
      # Not paused (already running, stopped, etc.) — nothing to do.
      log "Service not paused (skip unpause) - ${s} (status: $st)"
    fi
  done
}

# restore_services
# Public wrapper around service_unpause().
# Called both from the normal run flow and from the EXIT trap (cleanup).
restore_services() {
  (( MANAGE_SERVICES == 1 )) || return 0

  if [[ ${#PAUSED_SERVICES[@]} -eq 0 ]]; then
    log "No services to restore."
    return 0
  fi

  service_unpause
}

# cleanup
# EXIT/INT/TERM trap handler — always runs on script exit, even on error.
# Guarantees that paused Docker containers are always restored, and that
# the lock file is cleaned up on a clean exit.
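cleanup() {
  local exit_code=$?

  # Disarm the traps first so the `exit` below (or a second signal) does not
  # re-enter this handler and run the restore logic twice.
  trap - INT TERM EXIT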

  # Restore services regardless of why we're exiting.
  restore_services || {
    log "WARNING: Failed to restore services during cleanup" >&2
    # Don't overwrite an existing non-zero exit code.
    (( exit_code == 0 )) && exit_code=1
  }

  # Only remove the lock file on a clean (zero) exit so a failed run
  # leaves the lock in place for investigation.
  if (( exit_code == 0 )) && [[ -f "$LOCK_FILE" ]]; then
    rm -f "$LOCK_FILE" 2>/dev/null || true
  fi

  exit "$exit_code"
}

# Register cleanup to fire on any exit, interrupt, or termination signal.
trap cleanup INT TERM EXIT

#######################################################################
# SNAPRAID CONFIG PARSING
#######################################################################

# parse_snapraid_conf
# Reads SNAPRAID_CONF and populates:
#   CONTENT_FILES  - array of all content file paths
#   CONTENT_FILE   - primary content file (first entry)
#   PARITY_FILES   - array of all parity file paths
# Dies if either list is empty (misconfigured or wrong path).
parse_snapraid_conf() {
  # Extract "content <path>" lines, ignoring comments and blank lines.
  mapfile -t CONTENT_FILES < <(
    awk '
      /^[[:space:]]*($|#|;)/ { next }
      $1 == "content" && $2 != "" { print $2 }
    ' "$SNAPRAID_CONF"
  )
  (( ${#CONTENT_FILES[@]} > 0 )) || die "No content files found in $SNAPRAID_CONF"
  CONTENT_FILE="${CONTENT_FILES[0]}"

  # Extract parity lines. SnapRAID supports: parity, 2-parity … 6-parity, z-parity.
  # Values may be comma-separated (split parity across devices), so we split on commas.
  mapfile -t PARITY_FILES < <(
    awk '
      function trim(s) { gsub(/^[[:space:]]+|[[:space:]]+$/, "", s); return s }
      /^[[:space:]]*($|#|;)/ { next }
      $1 == "parity" || $1 ~ /^([2-6]|z)-parity$/ {
        if ($2 == "") next
        n = split($2, a, ",")
        for (i = 1; i <= n; i++) {
          path = trim(a[i])
          if (path != "") print path
        }
      }
    ' "$SNAPRAID_CONF"
  )
  (( ${#PARITY_FILES[@]} > 0 )) || die "No parity files found in $SNAPRAID_CONF"
}

# sanity_check
# Confirms every content and parity file actually exists on disk before
# proceeding. A missing file would indicate a disk is offline or unmounted,
# and running sync in that state could corrupt the array.
sanity_check() {
  local f
  log "Verifying all content files are present."
  for f in "${CONTENT_FILES[@]}"; do
    [[ -e "$f" ]] || die "Content file not found: $f"
  done

  log "Verifying all parity files are present."
  for f in "${PARITY_FILES[@]}"; do
    [[ -e "$f" ]] || die "Parity file not found: $f"
  done

  log "All content and parity files found. Continuing..."
}

#######################################################################
# SNAPRAID DIFF ANALYSIS
#######################################################################

# get_counts
# Parses the DIFF block in TMP_OUTPUT and sets:
#   ADD_COUNT, DEL_COUNT, UPDATE_COUNT, MOVE_COUNT, COPY_COUNT, RESTORED_COUNT
#
# SnapRAID diff summary lines look like:
#   "      50 added"
#   "       9 removed"
#   "       0 updated"
# We extract only the DIFF block (between BEGIN/END markers) to avoid
# accidentally picking up counts from other sections of the log.
get_counts() {
  # Extract only lines between the DIFF markers.
  local diff_block
  diff_block="$(
    awk '
      /__SNAPRAID_DIFF_BEGIN__/ { in_block=1; next }
      /__SNAPRAID_DIFF_END__/   { in_block=0 }
      in_block { print }
    ' "$TMP_OUTPUT"
  )"

  # If markers weren't found, fall back to parsing the entire log.
  [[ -n "$diff_block" ]] || diff_block="$(cat "$TMP_OUTPUT")"

  ADD_COUNT="$(     awk '/^[[:space:]]*[0-9]+[[:space:]]+added$/     { print $1; exit }' <<<"$diff_block" || true)"
  DEL_COUNT="$(     awk '/^[[:space:]]*[0-9]+[[:space:]]+removed$/   { print $1; exit }' <<<"$diff_block" || true)"
  UPDATE_COUNT="$(  awk '/^[[:space:]]*[0-9]+[[:space:]]+updated$/   { print $1; exit }' <<<"$diff_block" || true)"
  MOVE_COUNT="$(    awk '/^[[:space:]]*[0-9]+[[:space:]]+moved$/     { print $1; exit }' <<<"$diff_block" || true)"
  COPY_COUNT="$(    awk '/^[[:space:]]*[0-9]+[[:space:]]+copied$/    { print $1; exit }' <<<"$diff_block" || true)"
  RESTORED_COUNT="$(awk '/^[[:space:]]*[0-9]+[[:space:]]+restored$/  { print $1; exit }' <<<"$diff_block" || true)"

  # Default to 0 if the restored line was absent (older SnapRAID versions omit it).
  RESTORED_COUNT="${RESTORED_COUNT:-0}"
}

# chk_del
# Compares DEL_COUNT against DEL_THRESHOLD.
# Sets DO_SYNC=1 if within threshold, or CHK_FAIL=1 if exceeded.
chk_del() {
  if [[ -n "$DEL_COUNT" ]] && (( DEL_COUNT < DEL_THRESHOLD )); then
    log "Deleted files ($DEL_COUNT) below threshold ($DEL_THRESHOLD). SYNC authorised."
    DO_SYNC=1
  else
    log "**WARNING** Deleted files ($DEL_COUNT) exceeded threshold ($DEL_THRESHOLD)."
    CHK_FAIL=1
  fi
}

# chk_updated
# Compares UPDATE_COUNT against UP_THRESHOLD.
# Sets DO_SYNC=1 if within threshold, or CHK_FAIL=1 if exceeded.
chk_updated() {
  if (( UPDATE_COUNT < UP_THRESHOLD )); then
    log "Updated files ($UPDATE_COUNT) below threshold ($UP_THRESHOLD). SYNC authorised."
    DO_SYNC=1
  else
    log "**WARNING** Updated files ($UPDATE_COUNT) exceeded threshold ($UP_THRESHOLD)."
    CHK_FAIL=1
  fi
}

# chk_sync_warn
# Handles the forced-sync-after-N-warnings logic.
# Reads/writes SYNC_WARN_FILE to persist the warning count across runs.
# Sets DO_SYNC=1 if the warning count has reached SYNC_WARN_THRESHOLD.
chk_sync_warn() {
  if (( SYNC_WARN_THRESHOLD > -1 )); then
    log "Forced sync is enabled. [$(date)]"

    # Load existing warning count; default to 0 if file missing or invalid.
    if [[ -f "$SYNC_WARN_FILE" ]]; then
      SYNC_WARN_COUNT="$(awk 'NR==1 && /^[0-9]+$/ { print; exit }' "$SYNC_WARN_FILE" || true)"
    fi
    SYNC_WARN_COUNT="${SYNC_WARN_COUNT:-0}"

    if (( SYNC_WARN_COUNT >= SYNC_WARN_THRESHOLD )); then
      log "Warning count ($SYNC_WARN_COUNT) reached threshold ($SYNC_WARN_THRESHOLD). Forcing SYNC."
      DO_SYNC=1
    else
      (( SYNC_WARN_COUNT++ ))
      printf '%s\n' "$SYNC_WARN_COUNT" > "$SYNC_WARN_FILE"
      local remaining=$(( SYNC_WARN_THRESHOLD - SYNC_WARN_COUNT ))
      log "$remaining warning(s) remaining before forced sync. NOT proceeding with SYNC."
      DO_SYNC=0
    fi
  else
    log "Forced sync is disabled (SYNC_WARN_THRESHOLD=-1). NOT proceeding with SYNC."
    DO_SYNC=0
  fi
}

# chk_zero
# Runs `snapraid status` and checks for files with zero sub-second timestamps.
# If any are found, runs `snapraid touch` to fix them. Files with bad timestamps
# will be flagged as "updated" on every diff run even if their content hasn't
# changed, so fixing them reduces false-positive update counts.
chk_zero() {
  run_cmd "TOUCH_CHECK" "$SNAPRAID_BIN" status

  # Check the status output for the zero-timestamp warning line.
  local timelog
  timelog="$(grep -E 'You have [1-9][0-9]* files with zero sub-second timestamp\.' "$TMP_OUTPUT" | tail -n 1 || true)"

  if [[ -n "$timelog" ]]; then
    log "${timelog/You have/Found}"
    run_cmd "TOUCH" "$SNAPRAID_BIN" touch
    TOUCH_RC=$?
    JOBS_DONE="${JOBS_DONE:+$JOBS_DONE + }TOUCH"
  else
    log "No files with zero sub-second timestamps found."
  fi
}

#######################################################################
# LOG PERSISTENCE
#######################################################################

# persist_full_log
# Copies TMP_OUTPUT to a timestamped file in LOG_DIR.
# Sets FULL_LOG_FILE to the destination path.
# This is called at the end of the run so the complete log is archived.
persist_full_log() {
  mkdir -p "$LOG_DIR" || die "Unable to create log directory: $LOG_DIR"

  local ts host
  ts="$(date +'%Y%m%d-%H%M%S')"
  host="$(hostname)"
  FULL_LOG_FILE="${LOG_DIR}/snapraid-${host}-${ts}.log"

  cp -f "$TMP_OUTPUT" "$FULL_LOG_FILE" || die "Unable to write log to: $FULL_LOG_FILE"
  log "Full log saved to: $FULL_LOG_FILE"
}

#######################################################################
# MAIN EXECUTION
#######################################################################

main() {
  # Ensure standard system paths are in PATH before any command lookups
  # (important when called from cron, which provides a minimal PATH).
  export PATH="/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:$PATH"

  # --- Lock: prevent two instances running concurrently (e.g. from cron overlap).
  # Uses flock if available; silently skips locking if flock isn't installed.
  if have_cmd flock; then
    exec 200>"$LOCK_FILE"
    flock -n 200 || die "Another snapraid job is already running (lock: $LOCK_FILE)."
  fi

  # --- Initialisation ---
  require_bins

  # Create the temporary log accumulator for this run.
  TMP_OUTPUT="$(mktemp -t snapraid.out.XXXXXX)" || die "Unable to create temporary log file"
  : > "$TMP_OUTPUT"   # Ensure it starts empty.

  # Parse config so CONTENT_FILES and PARITY_FILES are populated before use.
  parse_snapraid_conf

  log "SnapRAID job started [$(date)]"

  ###################################################################
  # PHASE 1 — PREPROCESSING
  ###################################################################
  section "##Preprocessing"

  # Pause Docker containers to prevent writes during diff/sync.
  if (( MANAGE_SERVICES == 1 )); then
    log "### Stop Services [$(date)]"
    service_pause
  fi

  # Abort early if any disk is missing rather than running on an incomplete array.
  sanity_check

  ###################################################################
  # PHASE 2 — PROCESSING
  ###################################################################
  section "##Processing"

  # Fix zero sub-second timestamps before diff so they don't inflate UPDATE_COUNT.
  chk_zero

  # --- DIFF: detect all changes since the last sync ---
  # rc=2 from snapraid diff means "differences found" — this is normal, not an error.
  mark_begin "DIFF"
  {
    printf '###DIFF [%s]\n' "$(date)"
    "$SNAPRAID_BIN" diff
  } 2>&1 | tee -a "$TMP_OUTPUT"
  DIFF_RC=${PIPESTATUS[0]}
  mark_end "DIFF" "$DIFF_RC"
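  # Append rather than assign, so a TOUCH recorded by chk_zero is not lost.
  JOBS_DONE="${JOBS_DONE:+$JOBS_DONE + }DIFF"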

  # Validate the diff exit code.
  if ! is_snapraid_diff_ok "$DIFF_RC"; then
    HAD_FAILURE=1
    log "**WARNING** DIFF returned unexpected exit code: ${DIFF_RC}"
    if (( FAIL_FAST == 1 )); then
      die "DIFF failed with rc=${DIFF_RC} (FAIL_FAST=1)"
    fi
  fi

  # Parse the change counts from the DIFF output.
  get_counts

  # Bail out if any count could not be parsed — proceeding without reliable
  # counts could result in an unsafe sync bypassing the threshold checks.
  if [[ -z "${DEL_COUNT:-}" || -z "${ADD_COUNT:-}" || -z "${MOVE_COUNT:-}" || \
        -z "${COPY_COUNT:-}" || -z "${UPDATE_COUNT:-}" ]]; then
    log "**ERROR** Failed to parse change counts from DIFF output. Cannot proceed safely."
    persist_full_log
    exit 1
  fi

  log
  log "**SUMMARY of changes — Added [$ADD_COUNT] Deleted [$DEL_COUNT] Moved [$MOVE_COUNT] Copied [$COPY_COUNT] Updated [$UPDATE_COUNT]**"
  log

  # --- SYNC decision logic ---
  if (( DEL_COUNT > 0 || ADD_COUNT > 0 || MOVE_COUNT > 0 || COPY_COUNT > 0 || UPDATE_COUNT > 0 )); then
    if (( SYNC_WARN_THRESHOLD == 0 )); then
      # SYNC_WARN_THRESHOLD=0 means "always sync, ignore thresholds".
      DO_SYNC=1
    else
      # Check deletion threshold first.
      chk_del

      # Only check update threshold if deletion check passed.
      if (( CHK_FAIL == 0 )); then
        chk_updated
      fi

      # If a threshold was breached, apply the warning/forced-sync policy.
      if (( CHK_FAIL == 1 )); then
        chk_sync_warn
      fi
    fi
  else
    log "No changes detected. Skipping SYNC. [$(date)]"
    DO_SYNC=0
  fi

  # --- SYNC: update parity to reflect current state of data disks ---
  if (( DO_SYNC == 1 )); then
    run_cmd "SYNC" "$SNAPRAID_BIN" sync -q
    SYNC_RC=$?
    JOBS_DONE="${JOBS_DONE} + SYNC"

    # Reset the warning counter only once the sync has actually succeeded.
    if (( SYNC_RC == 0 )); then
      rm -f "$SYNC_WARN_FILE"
    fi
  fi

  # --- SCRUB: verify a portion of the array's data integrity ---
  if (( SCRUB_PERCENT > 0 )); then
    if (( CHK_FAIL == 1 && DO_SYNC == 0 )); then
      # Parity is out of date (threshold was breached and sync was skipped).
      # Scrubbing against stale parity is meaningless and potentially misleading.
      log "Scrub cancelled — parity is out of sync (threshold exceeded and sync skipped). [$(date)]"
    elif (( DO_SYNC == 1 )); then
      # Sync ran — verify it actually completed before trusting parity for scrub.
      if ! marker_end_present "SYNC"; then
        log "**WARNING** SYNC end marker missing. Skipping SCRUB. [$(date)]"
      elif (( SYNC_RC != 0 )); then
        log "**WARNING** SYNC failed (rc=${SYNC_RC}). Skipping SCRUB. [$(date)]"
      else
        # Sync completed cleanly — safe to scrub.
        run_cmd "SCRUB" "$SNAPRAID_BIN" scrub -p "$SCRUB_PERCENT" -o "$SCRUB_AGE" -q
        SCRUB_RC=$?
        JOBS_DONE="${JOBS_DONE} + SCRUB"
      fi
    else
      # No sync was needed — array is already in sync, safe to scrub.
      run_cmd "SCRUB" "$SNAPRAID_BIN" scrub -p "$SCRUB_PERCENT" -o "$SCRUB_AGE" -q
      SCRUB_RC=$?
      JOBS_DONE="${JOBS_DONE} + SCRUB"
    fi
  else
    log "Scrub disabled (SCRUB_PERCENT=0). Skipping. [$(date)]"
  fi

  ###################################################################
  # PHASE 3 — POSTPROCESSING
  ###################################################################
  section "##Postprocessing"

  # --- SMART: log disk health attributes ---
  if (( SMART_LOG == 1 )); then
    run_cmd "SMART" "$SNAPRAID_BIN" smart
    SMART_RC=$?
    JOBS_DONE="${JOBS_DONE} + SMART"
  fi

  # --- DOWN: spin down array disks to save power ---
  if (( SPINDOWN_DISKS == 1 )); then
    run_cmd "DOWN" "$SNAPRAID_BIN" down
    DOWN_RC=$?
    JOBS_DONE="${JOBS_DONE} + DOWN"
  else
    log "Spindown disabled (SPINDOWN_DISKS=0). Skipping \`snapraid down\`."
  fi

  # Restore any paused Docker containers before logging completion.
  restore_services

  log "All jobs completed. [$(date)]"
  log "Total duration: $(format_duration "$SECONDS")"
  log "Jobs run: ${JOBS_DONE}"

  ###################################################################
  # PHASE 4 — REPORTING
  ###################################################################

  # Persist the accumulated log to disk.
  persist_full_log

  # Print a final status summary to stdout/log.
  log
  log "============================================================"
  if (( HAD_FAILURE == 1 )); then
    log "STATUS: FAILED — one or more jobs returned a non-zero exit code."
  elif (( CHK_FAIL == 1 && DO_SYNC == 0 )); then
    log "STATUS: WARNING — threshold exceeded; sync was not run."
  else
    log "STATUS: COMPLETED successfully."
  fi
  log "============================================================"

  # Exit non-zero when any job failed so cron/systemd can detect it.
  exit "$HAD_FAILURE"
}

# --- Entry point ---
main "$@"
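
The threshold logic depends entirely on parsing the summary that `snapraid diff` prints, so it can help to see that step in isolation. The following is a minimal sketch, separate from the script above: the sample numbers are made up, and `count_of` and `decide` are hypothetical helpers that mirror what `get_counts`, `chk_del` and `chk_updated` do.

```shell
#!/usr/bin/env bash
# Sample summary in the shape get_counts() expects.
# The numbers are made up for illustration.
sample_diff='       3 equal
      50 added
       9 removed
       0 updated
       0 moved
       0 copied
       0 restored
There are differences!'

# count_of LABEL: hypothetical helper. Reads diff output on stdin and
# prints the number on the first "<number> LABEL" summary line.
count_of() {
  awk -v label="$1" 'NF == 2 && $2 == label && $1 ~ /^[0-9]+$/ { print $1; exit }'
}

# decide DEL UPD DEL_MAX UPD_MAX: hypothetical helper. Prints "sync" when
# both counts are below their thresholds and "hold" otherwise, using the
# same strict "less than" comparison as chk_del and chk_updated.
decide() {
  if [ "$1" -lt "$3" ] && [ "$2" -lt "$4" ]; then
    echo "sync"
  else
    echo "hold"
  fi
}

del="$(printf '%s\n' "$sample_diff" | count_of removed)"
upd="$(printf '%s\n' "$sample_diff" | count_of updated)"
echo "removed=$del updated=$upd -> $(decide "$del" "$upd" 100 500)"
# prints: removed=9 updated=0 -> sync
```

Note that a count equal to the threshold already holds the sync, which is worth keeping in mind when choosing DEL_THRESHOLD and UP_THRESHOLD.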

Make it executable:

sudo -i
chmod +x /var/snapraid/snapraid-helper.sh
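
Before handing the script to cron, a quick check can catch copy-and-paste damage. The sketch below is one way to do that and is not part of the original script: `check_script` is a hypothetical helper that uses `bash -n` to parse the file without executing it, and the loop prints where `snapraid`, `docker` and `flock` live so the paths can be compared with SNAPRAID_BIN and DOCKER_BIN in the configuration section.

```shell
#!/usr/bin/env bash
# check_script FILE: hypothetical helper, not part of the helper script.
# `bash -n` reads and parses FILE but does not execute any command in it.
check_script() {
  if bash -n "$1" 2>/dev/null; then
    echo "syntax OK: $1"
  else
    echo "check FAILED: $1"
    return 1
  fi
}

# Print the resolved path of each tool the script relies on, so it can be
# compared with SNAPRAID_BIN and DOCKER_BIN in the configuration section.
for tool in snapraid docker flock; do
  command -v "$tool" || echo "not in PATH: $tool"
done

# Path assumed from Step 1; adjust if the script was saved elsewhere.
check_script /var/snapraid/snapraid-helper.sh || true
```

A passing syntax check only shows that the file parses; it says nothing about whether the configured paths and thresholds suit your array.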

Step 2: Schedule via cron:

Open crontab and create a schedule:

sudo crontab -e

# add to the end of the file
0 4 * * * /bin/bash /var/snapraid/snapraid-helper.sh >> /var/snapraid/cron.log 2>&1
# runs daily at 4am
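
With a daily schedule, the script writes one timestamped log per run into LOG_DIR, so the directory grows steadily. The following is a minimal sketch of pruning old run logs, assuming a `find` that supports `-maxdepth` and `-delete` (GNU and BSD versions do). The 30-day retention is purely an example value, and `prune_logs` is a hypothetical helper that is not part of the original script.

```shell
#!/usr/bin/env bash
# prune_logs DIR DAYS: hypothetical helper, not part of the helper script.
# Deletes run logs (snapraid-<host>-<timestamp>.log) directly inside DIR
# whose last modification is more than DAYS days old, printing each path
# as it is removed.
prune_logs() {
  find "$1" -maxdepth 1 -type f -name 'snapraid-*.log' -mtime +"$2" -print -delete
}

# LOG_DIR default from the script; 30 days is an example retention period.
if [ -d /var/log/snapraid ]; then
  prune_logs /var/log/snapraid 30
fi
```

The `-name` pattern keeps the deletion limited to files the helper script created, leaving anything else in the directory alone.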

TIP: Use Crontab Guru to verify cron schedule syntax.

Next up is Installing the Tools for NAS Management
