Add stash-scheduler plugin (#708)

TheSheesh
2026-05-02 06:36:18 -07:00
committed by GitHub
parent 27a72abd05
commit da31e38e09
7 changed files with 1459 additions and 0 deletions


@@ -0,0 +1,308 @@
# Stash Scheduler Plugin
A plugin for [Stash](https://github.com/stashapp/stash) that automatically runs library scans on a schedule (hourly, daily, or weekly), with an optional identify pass after each scan.
---
## Features
- Schedule scans at **hourly**, **daily**, or **weekly** intervals
- Configure the **hour of day** (and day of week for weekly)
- Optionally run an **Identify** task automatically after each scan completes
- Identify is skipped safely if the scan fails, is cancelled, or no job ID is returned
- **Run Scan Now** task for an instant manual trigger (useful for testing)
- **Auto-start on system boot** via systemd, Windows startup, or a cron job (see below)
- All settings managed through Stash's built-in plugin settings UI — no config files to edit
---
## Requirements
- **Stash** v0.17.0 or later
- **Python 3.8+** (must be available as `python3` in your system PATH)
- pip packages: `apscheduler` and `stashapp-tools` (see Installation below)
- `curl` (only needed if you use the auto-start scripts)
---
## Installation
### 1 — Copy the plugin folder
Copy the entire `stash-scheduler/` directory into your Stash plugins folder:
```
<Stash data directory>/plugins/stash-scheduler/
```
The final layout should look like:
```
plugins/
└── stash-scheduler/
    ├── stash-scheduler.yml
    ├── stash_scheduler.py
    ├── requirements.txt
    ├── startup/
    │   ├── autostart.sh             (Linux/macOS)
    │   ├── autostart.bat            (Windows)
    │   └── stash-scheduler.service  (systemd)
    └── README.md
```
> Your Stash data directory is shown under **Settings → System**.
> Common locations: `~/.stash` (Linux/macOS) or `C:\Users\<you>\.stash` (Windows).
### 2 — Install Python dependencies
Open a terminal and run:
```bash
pip install apscheduler "stashapp-tools>=0.2.40"
```
Or install from the requirements file:
```bash
pip install -r /path/to/plugins/stash-scheduler/requirements.txt
```
### 3 — Reload plugins in Stash
In Stash, go to **Settings → Plugins** and click **Reload Plugins**. "Stash Scheduler" will appear in the list.
> **Note on auto-start:** After dropping the plugin into the `plugins/` folder, the scheduler does **not** start automatically — Stash's plugin system has no built-in startup hook. You must either start it manually each time (Settings → Tasks → Start Scheduler) or configure OS-level auto-start using the scripts in `startup/` (see the [Auto-start on boot](#auto-start-on-boot-recommended) section). The systemd unit is the most complete option, as it covers Stash restarts as well as system boot.
---
## Configuration
Open **Settings → Plugins → Stash Scheduler** and set your preferences:
| Setting | Description | Default |
|---|---|---|
| **Scan Frequency** | `hourly`, `daily`, or `weekly` | `daily` |
| **Time of Day (HH:MM)** | Time to run the scan in 24-hour `HH:MM` format. Used by Daily and Weekly; ignored for Hourly. | `02:00` |
| **Day of Week** | Day to scan when Frequency is Weekly. Use `mon`, `tue`, `wed`, `thu`, `fri`, `sat`, or `sun`. | `sun` |
| **Timezone** | IANA timezone name for interpreting Time of Day and Day of Week. Examples: `America/New_York`, `Europe/London`, `Asia/Tokyo`. Leave blank for UTC. | `UTC` |
| **Run Identify After Scan** | When enabled, runs an Identify task after each scan finishes successfully. | `false` |
| **Scan Completion Timeout (minutes)** | Max time to wait for the scan before giving up on Identify. | `120` |
| **Limit to Paths** | Restrict the scan (and the follow-up Identify) to specific directories. One path per line, or comma-separated. Leave blank for the full library. | *(full library)* |
| **Generate Covers** | Generate cover images for scenes during scan. | `false` |
| **Generate Video Previews** | Generate video preview clips during scan. | `false` |
| **Generate Image Previews** | Generate image preview strips during scan. | `false` |
| **Generate Sprites** | Generate sprite sheets (seek-bar previews) during scan. | `false` |
| **Generate Video Phashes** | Generate perceptual hashes for video files (duplicate detection). | `false` |
| **Generate Image Phashes** | Generate perceptual hashes for image files (duplicate detection). | `false` |
| **Generate Image Thumbnails** | Generate thumbnails for image files during scan. | `false` |
| **Generate Image Clip Previews** | Generate animated clip previews for image gallery files. | `false` |
| **Force Rescan** | Rescan all files even if modification time is unchanged. Useful after Stash upgrades. | `false` |
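
As a rough illustration of how the **Limit to Paths** value is interpreted, the sketch below splits a comma- or newline-separated string into a clean list, mirroring the splitting done in `stash_scheduler.py`. The function name `parse_scan_paths` is hypothetical and used only for this example.

```python
import re


def parse_scan_paths(raw):
    """Split a comma- or newline-separated string into a list of paths.

    Hypothetical helper for illustration; blank entries are dropped and
    surrounding whitespace is stripped from each path.
    """
    return [p.strip() for p in re.split(r"[,\n]+", raw or "") if p.strip()]


if __name__ == "__main__":
    # Both separators can be mixed in one value.
    print(parse_scan_paths("/media/new-imports, /media/archive\n/mnt/extra"))
```

An empty result means no path restriction, so the scan covers the full library.
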
### Timezone configuration example
To run a daily scan at **2:00 AM New York time**:
```
Scan Frequency: daily
Time of Day: 02:00
Timezone: America/New_York
```
To run a weekly scan at **3:30 AM London time every Sunday**:
```
Scan Frequency: weekly
Time of Day: 03:30
Day of Week: sun
Timezone: Europe/London
```
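
The sketch below shows one way the three schedule settings could translate into cron-style trigger fields. It is an assumption about the mapping, not the plugin's actual code: the function name `cron_fields` is hypothetical, and firing hourly scans at minute 0 is a guess made for the example.

```python
def cron_fields(frequency, time_of_day="02:00", day_of_week="sun"):
    """Map schedule settings to cron-style fields (illustrative only)."""
    hour, minute = (int(part) for part in time_of_day.split(":"))
    if frequency == "hourly":
        # Assumption: hourly runs fire on the hour; Time of Day is ignored.
        return {"minute": 0}
    if frequency == "daily":
        return {"hour": hour, "minute": minute}
    if frequency == "weekly":
        return {"day_of_week": day_of_week, "hour": hour, "minute": minute}
    raise ValueError(f"unknown frequency: {frequency!r}")


if __name__ == "__main__":
    print(cron_fields("daily", "02:00"))
    print(cron_fields("weekly", "03:30", "sun"))
```

Fields of this shape resemble the keyword arguments accepted by a cron trigger, with the timezone supplied separately.
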
A full list of valid timezone names is available at
https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
> **Note:** If the Timezone field is left blank or set to an unrecognised value, the scheduler falls back to **UTC** and logs a warning.
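
The fallback described in the note can be sketched with the standard library alone. This is a minimal illustration, assuming Python 3.9 or later for `zoneinfo`; the name `resolve_timezone` and the `warn` callback are hypothetical.

```python
from datetime import timezone
from zoneinfo import ZoneInfo


def resolve_timezone(name, warn=print):
    """Return a tzinfo for an IANA name, falling back to UTC (illustrative)."""
    name = (name or "").strip()
    if not name:
        # Blank setting: use UTC.
        return timezone.utc
    try:
        return ZoneInfo(name)
    except Exception:
        # Unknown or malformed names end up here.
        warn(f"Unrecognised timezone {name!r}; falling back to UTC.")
        return timezone.utc


if __name__ == "__main__":
    print(resolve_timezone("America/New_York"))
    print(resolve_timezone("Mars/Olympus_Mons"))
```

Lookup of valid names depends on timezone data being installed on the system, so a valid name can also fall back to UTC on a machine without it.
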
---
## Starting the Scheduler
The scheduler runs as a **long-lived task** inside Stash. You can start it manually or configure it to start automatically on system boot (recommended).
### Manual start
1. Go to **Settings → Tasks**.
2. Under **Stash Scheduler**, click **Start Scheduler**.
3. The task will appear as running and stay active until you stop it or restart Stash.
### Auto-start on boot (recommended)
Because the scheduler stops whenever Stash restarts and Stash has no startup hook to bring it back, auto-start scripts are the most reliable way to ensure the scheduler is always running. Choose the method that fits your setup:
#### Linux — systemd (recommended — handles Stash restarts automatically)
The systemd unit uses `BindsTo=stash.service`, which means it stops when Stash stops and starts when Stash starts. This covers both the initial system boot **and** any subsequent Stash restarts — ensuring the scheduler is always running as long as Stash is running.
```bash
# 1. Make the startup script executable
chmod +x ~/.stash/plugins/stash-scheduler/startup/autostart.sh
# 2. Copy the systemd unit to the system directory
sudo cp ~/.stash/plugins/stash-scheduler/startup/stash-scheduler.service \
/etc/systemd/system/stash-scheduler.service
# 3. Edit the unit file — set your Stash unit name, URL, and plugin path
# Find your Stash unit name with: systemctl list-units | grep -i stash
sudo nano /etc/systemd/system/stash-scheduler.service
# 4. Enable and start the unit
sudo systemctl daemon-reload
sudo systemctl enable stash-scheduler.service
sudo systemctl start stash-scheduler.service
```
The key settings inside the unit file:
```ini
[Unit]
# Stop this unit when Stash stops; restart it when Stash is restarted
BindsTo=stash.service
# Wait for Stash to start before trying to connect
After=stash.service
[Service]
# Retry if the connection script fails
Restart=on-failure
Environment=STASH_URL=http://localhost:9999
Environment=PLUGIN_DIR=/path/to/.stash/plugins/stash-scheduler
# Uncomment to enable API key auth:
# Environment=STASH_API_KEY=your-api-key-here
```
#### Linux / macOS — cron @reboot
```bash
crontab -e
```
Add this line (adjust paths):
```cron
@reboot sleep 30 && /path/to/plugins/stash-scheduler/startup/autostart.sh >> /tmp/stash-scheduler-autostart.log 2>&1
```
The `sleep 30` gives Stash time to start before the script tries to connect.
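
A fixed `sleep` is only a head start; the script itself polls until Stash answers. The polling pattern can be sketched as follows. The function name `wait_until_ready` and its `probe` and `sleep` parameters are hypothetical, and the defaults simply echo `MAX_WAIT` and `INTERVAL` from `autostart.sh`.

```python
import time


def wait_until_ready(probe, max_wait=180, interval=5, sleep=time.sleep):
    """Call probe() every `interval` seconds until it returns True.

    Returns True once the probe succeeds, or False after `max_wait`
    seconds of waiting. Illustrative sketch, not the script's code.
    """
    elapsed = 0
    while True:
        if probe():
            return True
        if elapsed >= max_wait:
            return False
        sleep(interval)
        elapsed += interval


if __name__ == "__main__":
    # A probe that succeeds immediately; a real probe would issue an
    # HTTP request against the Stash URL.
    print(wait_until_ready(lambda: True))
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.
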
#### Windows — Startup folder
1. Press `Win + R`, type `shell:startup`, press Enter.
2. Create a shortcut to `startup\autostart.bat` in that folder.
3. The script will run each time you log in.
Alternatively, use Task Scheduler to trigger `autostart.bat` on system startup (without needing a user session):
- Trigger: **At startup**
- Action: Start `C:\path\to\plugins\stash-scheduler\startup\autostart.bat`
- Check: **Run whether user is logged on or not**
#### Docker / custom entrypoint
Add the following to your Docker entrypoint or startup script, after Stash starts:
```bash
/app/plugins/stash-scheduler/startup/autostart.sh http://localhost:9999
```
---
## Testing Your Settings
Without waiting for the next scheduled run:
1. Go to **Settings → Tasks**.
2. Under **Stash Scheduler**, click **Run Scan Now**.
This triggers a scan (and identify, if enabled) immediately and marks itself complete when done.
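
The same task can also be triggered from a script, using the `runPluginTask` mutation that the auto-start scripts send. The sketch below only builds the HTTP request; the helper name `build_run_task_request` is hypothetical, and the URL and API key are placeholders.

```python
import json
import urllib.request


def build_run_task_request(stash_url, task_name, api_key=None,
                           plugin_id="stash-scheduler"):
    """Build a POST request for the runPluginTask mutation (illustrative)."""
    # json.dumps produces a quoted, escaped string literal for the query text.
    query = (
        "mutation { runPluginTask("
        f"plugin_id: {json.dumps(plugin_id)}, "
        f"task_name: {json.dumps(task_name)}) }}"
    )
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["ApiKey"] = api_key
    return urllib.request.Request(
        f"{stash_url.rstrip('/')}/graphql",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers=headers,
        method="POST",
    )


if __name__ == "__main__":
    req = build_run_task_request("http://localhost:9999", "Run Scan Now")
    print(req.full_url)
    print(req.data.decode("utf-8"))
```

To actually send it, pass the request to `urllib.request.urlopen(req, timeout=30)` while Stash is running.
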
---
## How It Works
```
System boots → autostart script runs → polls until Stash is ready
                          │
                          ▼
            Calls runPluginTask via GraphQL
            to start "Start Scheduler" task
                          │
                          ▼
  stash_scheduler.py reads plugin settings from Stash API
                          │
                          ▼
  APScheduler registers a cron job (hourly / daily / weekly)
                          │
            (fires at each scheduled time)
                          │
                          ▼
  metadataScan mutation → Stash starts a full library scan
                          │
    ┌─────────────────────┴──────────────────────┐
    │                                            │
run_identify = false                    run_identify = true
    │                                            │
   done                         Poll job queue until scan finishes
                                        or timeout elapses
                                                 │
                                      ┌──────────┴──────────┐
                                      │                     │
                               Scan succeeded         Scan failed /
                                      │               timed out /
                                      ▼               no job ID
                          metadataIdentify mutation         │
                          (uses Settings → Identify         ▼
                           sources & options)       Identify skipped
                                                   (logged as warning)
```
### About Identify sources
The Identify step uses whatever sources you have configured in **Settings → Metadata → Identify** (e.g., Stash-box connections, scrapers). If no sources are configured there, the identify step will be skipped and a warning will appear in the Stash log.
Identify is also skipped if:
- The scan returns no trackable job ID
- The scan fails or is cancelled
- The scan exceeds the configured timeout
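
The skip rules above can be condensed into a small decision function. This is a sketch rather than the plugin's code: the name `next_step` and its return values are hypothetical, while the status strings `FINISHED`, `CANCELLED`, and `FAILED` are the ones checked in `stash_scheduler.py`.

```python
def next_step(job_id, job, timed_out=False):
    """Decide what to do after one poll of the job queue (illustrative).

    job_id:    ID returned by the scan mutation, or None.
    job:       matching job-queue entry (a dict), or None if the job is
               no longer in the queue.
    timed_out: True once the configured timeout has elapsed.
    Returns "identify", "skip", or "wait".
    """
    if job_id is None:
        return "skip"      # nothing to track
    if timed_out:
        return "skip"      # scan exceeded the timeout
    if job is None:
        return "identify"  # job left the queue: treated as complete
    status = job.get("status", "UNKNOWN")
    if status == "FINISHED":
        return "identify"
    if status in ("CANCELLED", "FAILED"):
        return "skip"
    return "wait"          # still queued or running: poll again


if __name__ == "__main__":
    print(next_step("42", {"id": "42", "status": "RUNNING"}))
    print(next_step("42", None))
```
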
---
## Logs
All activity is written to the Stash log. To view it:
- Go to **Settings → Logs** (or the Stash log panel).
- Look for lines prefixed with `[Stash Scheduler]`.
---
## Troubleshooting
| Symptom | Fix |
|---|---|
| Plugin doesn't appear after Reload | Check that `stash-scheduler.yml` is in the correct folder and YAML syntax is valid. |
| `ModuleNotFoundError: No module named 'apscheduler'` | Run `pip install apscheduler stashapp-tools`. |
| `Could not connect to Stash` | Make sure Stash is running and accessible. |
| Identify is skipped every run | Go to **Settings → Metadata → Identify** and add at least one scraper or Stash-box source. |
| Identify skipped with "no job ID" warning | This is a safety measure — the scan still ran. It can happen on older Stash versions that don't return a job ID for the scan mutation. |
| Scan runs but Identify never starts | Increase **Scan Completion Timeout** if your library is large. |
| Auto-start script says "Stash not available" | Increase the `sleep` delay before calling the script, or raise `MAX_WAIT` inside `autostart.sh`. |
| Schedule fires at wrong time | Check the **Timezone** setting. Set it to your local IANA timezone (e.g. `America/Chicago`) so the Time of Day is interpreted correctly. |
---
## Version History
| Version | Notes |
|---|---|
| 0.6.0 | Added "Limit to Paths" setting — scan and identify can now be restricted to specific directories |
| 0.5.0 | Fixed identify-after-scan (null jobQueue crash); added 9 scan generation flag settings (covers, previews, sprites, phashes, thumbnails, clip previews, force rescan) |
| 0.4.0 | Added Check Status task; daemon logs written to file (`/tmp/stash-scheduler-daemon.log`) |
| 0.3.0 | Added Timezone setting — Time of Day and Day of Week are now interpreted in any IANA timezone instead of always UTC |
| 0.2.0 | Added auto-start scripts (Linux/macOS/Windows/systemd), configurable identify timeout, strict scan→identify sequencing (identify skipped on scan failure/unknown/timeout), improved logging |
| 0.1.0 | Initial release |
---
## License
MIT — use freely, modify as you like.


@@ -0,0 +1,2 @@
apscheduler>=3.10,<4
stashapp-tools>=0.2.40


@@ -0,0 +1,59 @@
@echo off
REM autostart.bat — Windows auto-start script for the Stash Scheduler.
REM
REM Usage:
REM autostart.bat [STASH_URL] [API_KEY]
REM
REM Examples:
REM autostart.bat
REM autostart.bat http://localhost:9999
REM autostart.bat http://localhost:9999 your-api-key-here
REM
REM To run automatically on Windows startup, place a shortcut to this
REM script in: shell:startup (press Win+R, type shell:startup, hit Enter)
setlocal
SET STASH_URL=%~1
IF "%STASH_URL%"=="" SET STASH_URL=http://localhost:9999
SET API_KEY=%~2
SET GRAPHQL=%STASH_URL%/graphql
SET MUTATION={"query":"mutation { runPluginTask(plugin_id: \"stash-scheduler\", task_name: \"Start Scheduler\") }"}
SET MAX_WAIT=120
SET INTERVAL=5
SET ELAPSED=0
echo [Stash Scheduler] Waiting for Stash at %STASH_URL% ...
:WAIT_LOOP
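REM Same lightweight GraphQL ping as autostart.sh, sent as JSON
curl -sf --max-time 5 -H "Content-Type: application/json" -d "{\"query\":\"{version{version}}\"}" %GRAPHQL% >nul 2>&1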
IF %ERRORLEVEL% EQU 0 GOTO STASH_UP
IF %ELAPSED% GEQ %MAX_WAIT% (
echo [Stash Scheduler] ERROR: Stash not available after %MAX_WAIT%s. Aborting.
exit /b 1
)
timeout /t %INTERVAL% /nobreak >nul
SET /A ELAPSED=%ELAPSED%+%INTERVAL%
GOTO WAIT_LOOP
:STASH_UP
echo [Stash Scheduler] Stash is up. Starting scheduler task...
timeout /t 3 /nobreak >nul
IF "%API_KEY%"=="" (
curl -sf --max-time 30 -H "Content-Type: application/json" -d "%MUTATION%" %GRAPHQL%
) ELSE (
curl -sf --max-time 30 -H "Content-Type: application/json" -H "ApiKey: %API_KEY%" -d "%MUTATION%" %GRAPHQL%
)
IF %ERRORLEVEL% EQU 0 (
echo [Stash Scheduler] Scheduler task started successfully.
) ELSE (
echo [Stash Scheduler] ERROR: Failed to start scheduler task.
exit /b 1
)
endlocal


@@ -0,0 +1,88 @@
#!/usr/bin/env bash
# autostart.sh — Start the Stash Scheduler plugin task whenever Stash starts.
#
# Designed to run as a long-lived service (Type=simple) managed by systemd,
# with BindsTo=stash.service so it restarts every time Stash restarts.
# Also works as a one-shot script for cron @reboot and Docker entrypoints.
#
# Usage:
# ./autostart.sh [STASH_URL] [API_KEY]
#
# Examples:
# ./autostart.sh
# ./autostart.sh http://localhost:9999
# ./autostart.sh http://localhost:9999 your-api-key-here
set -euo pipefail
STASH_URL="${1:-${STASH_URL:-http://localhost:9999}}"
API_KEY="${2:-${STASH_API_KEY:-}}"
MAX_WAIT=180 # seconds to wait for Stash to become reachable
INTERVAL=5 # polling interval in seconds
GRAPHQL_ENDPOINT="${STASH_URL}/graphql"
HEALTH_ENDPOINT="${STASH_URL}/healthz"
log() { echo "[Stash Scheduler autostart] $*"; }
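# Build auth args for curl (empty when no API key is set).
# $(curl_auth_args) is expanded unquoted, so the header must be emitted as a
# single word: a space after "ApiKey:" would split the key into a separate
# argument that curl would treat as a URL.
curl_auth_args() {
  if [ -n "$API_KEY" ]; then
    printf '%s' "-HApiKey:${API_KEY}"
  fi
}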
# GraphQL mutation to trigger the Start Scheduler plugin task
RUN_TASK_QUERY='{"query":"mutation { runPluginTask(plugin_id: \"stash-scheduler\", task_name: \"Start Scheduler\") }"}'
log "Waiting for Stash to become available at ${STASH_URL}"
elapsed=0
while true; do
# Try the health endpoint first, fall back to a lightweight GraphQL ping
if curl -sf --max-time 5 $(curl_auth_args) "${HEALTH_ENDPOINT}" > /dev/null 2>&1 || \
curl -sf --max-time 5 $(curl_auth_args) \
-H "Content-Type: application/json" \
-d '{"query":"{version{version}}"}' \
"${GRAPHQL_ENDPOINT}" > /dev/null 2>&1; then
log "Stash is reachable."
break
fi
if [ "$elapsed" -ge "$MAX_WAIT" ]; then
log "ERROR: Stash did not become available within ${MAX_WAIT}s. Aborting." >&2
exit 1
fi
log "Not ready yet, retrying in ${INTERVAL}s… (${elapsed}s elapsed)"
sleep "$INTERVAL"
elapsed=$((elapsed + INTERVAL))
done
# Brief pause to let Stash finish loading plugins after HTTP is up
sleep 5
log "Starting 'Start Scheduler' plugin task via Stash API…"
RESPONSE=$(curl -sf --max-time 30 \
$(curl_auth_args) \
-H "Content-Type: application/json" \
-d "$RUN_TASK_QUERY" \
"$GRAPHQL_ENDPOINT")
if echo "$RESPONSE" | grep -q '"errors"'; then
log "GraphQL errors returned by Stash:" >&2
echo "$RESPONSE" >&2
exit 1
fi
log "Scheduler task started successfully."
# When managed by systemd (Type=simple + BindsTo=stash.service), staying alive
# here keeps the unit "active" for the duration of the Stash session.
# When run as a one-shot script (cron, Docker), exit immediately is fine.
if [ -n "${INVOCATION_ID:-}" ]; then
# We're running under systemd — sleep until Stash (and therefore this
# unit) is stopped by the BindsTo relationship.
log "Running under systemd — sleeping until Stash stops."
exec sleep infinity
fi


@@ -0,0 +1,59 @@
# stash-scheduler.service — systemd unit that starts the Stash Scheduler
# automatically whenever the Stash service starts or restarts.
#
# HOW IT WORKS:
# BindsTo=stash.service — this unit stops when Stash stops and starts
# when Stash starts, covering both initial boot
# and any subsequent Stash restarts.
# Type=simple + Restart — if the autostart script or Stash connectivity
# fails, systemd retries automatically.
#
# INSTALL:
# 1. Find your Stash systemd unit name:
# systemctl list-units | grep -i stash
# Common names: stash.service, stashapp.service
#
# 2. Edit this file:
# a. Replace every "stash.service" with your actual unit name.
# b. Set PLUGIN_DIR to the full path of the stash-scheduler/ folder.
# c. If Stash requires an API key, uncomment the STASH_API_KEY line.
#
# 3. Copy to systemd and enable:
# sudo cp stash-scheduler.service /etc/systemd/system/
# sudo systemctl daemon-reload
# sudo systemctl enable stash-scheduler.service
#
# 4. The unit will now start/restart automatically with Stash.
# To start it manually right now:
# sudo systemctl start stash-scheduler.service
[Unit]
Description=Stash Scheduler — auto-start scan scheduler alongside Stash
# Replace "stash.service" with your actual Stash service unit name
BindsTo=stash.service
After=stash.service
[Service]
Type=simple
# --- Edit these two variables ---
Environment=STASH_URL=http://localhost:9999
Environment=PLUGIN_DIR=/path/to/.stash/plugins/stash-scheduler
# Uncomment and set if Stash requires API key authentication:
# Environment=STASH_API_KEY=your-api-key-here
ExecStart=/bin/bash -c '${PLUGIN_DIR}/startup/autostart.sh ${STASH_URL} ${STASH_API_KEY}'
# Restart only when the script exits with an error (e.g., Stash was not
# reachable in time); restarts of Stash itself are covered by BindsTo.
Restart=on-failure
RestartSec=15
# Start timeout for this unit. The wait for Stash itself (up to 3 minutes)
# is controlled by MAX_WAIT in autostart.sh.
TimeoutStartSec=180
[Install]
# "BindsTo" already handles the restart-with-Stash coupling.
# WantedBy ensures the unit is active in normal runlevels.
WantedBy=multi-user.target


@@ -0,0 +1,149 @@
name: Stash Scheduler
description: Schedule automatic library scans with an optional identify pass after each scan.
version: "0.6.0"
url: https://github.com/stashapp/stash
exec:
  - python3
  - "{pluginDir}/stash_scheduler.py"
interface: raw
tasks:
  - name: Start Scheduler
    description: >
      Start the background scheduler. It will run according to your configured
      frequency settings and keep running until Stash is restarted or the task
      is stopped. Start this once after configuring your schedule settings.
    defaultArgs:
      mode: start_scheduler
  - name: Run Scan Now
    description: >
      Trigger a library scan immediately, then run Identify afterwards if
      "Run Identify After Scan" is enabled in settings.
    defaultArgs:
      mode: run_now
  - name: Force Scan + Identify Now
    description: >
      Force a full scan AND identify run immediately, regardless of the
      "Run Identify After Scan" setting. Use this to test that everything
      is working correctly.
    defaultArgs:
      mode: force_now
  - name: Check Status
    description: >
      Show whether the scheduler daemon is running and display the last 30
      lines of its log file. Useful for confirming the schedule loaded
      correctly or diagnosing why a scan did not fire.
    defaultArgs:
      mode: check_status
settings:
  frequency:
    displayName: Scan Frequency
    description: >
      How often to run the library scan.
      Valid values: hourly, daily, weekly.
      Hourly ignores the Time of Day setting.
    type: STRING
  time_of_day:
    displayName: Time of Day (HH:MM)
    description: >
      The time to run the scan in 24-hour HH:MM format. Used for Daily and
      Weekly schedules; ignored for Hourly. Examples: 02:00, 14:30, 20:45.
      Defaults to 02:00 if not set.
    type: STRING
  day_of_week:
    displayName: Day of Week (Weekly only)
    description: >
      The day of the week on which to run the scan when Frequency is set to
      Weekly. Valid values: mon, tue, wed, thu, fri, sat, sun.
      Defaults to sun if not set.
    type: STRING
  timezone:
    displayName: Timezone
    description: >
      IANA timezone name used to interpret the Time of Day and Day of Week
      settings. Examples: America/New_York, Europe/London, Asia/Tokyo,
      Australia/Sydney. Leave blank to use UTC. Full list at
      https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
    type: STRING
  run_identify:
    displayName: Run Identify After Scan
    description: >
      When enabled, an Identify task will automatically run after each scan
      completes. Identify uses the sources and options configured under
      Settings > Identify in Stash.
    type: BOOLEAN
  identify_timeout_minutes:
    displayName: Scan Completion Timeout (minutes)
    description: >
      How long (in minutes) to wait for the scan to finish before running
      Identify. If the scan takes longer than this, Identify is skipped for
      that run. Only used when Run Identify After Scan is enabled.
      Defaults to 120 if not set.
    type: NUMBER
  scanPaths:
    displayName: Limit to Paths (one per line)
    description: >
      Restrict the scan (and the follow-up Identify, if enabled) to these
      directories only. Enter one absolute path per line, or separate paths
      with commas. Leave blank to scan and identify the full library.
      Example: /media/new-imports
    type: STRING
  scanGenerateCovers:
    displayName: Generate Covers
    description: Generate cover images for scenes during scan.
    type: BOOLEAN
  scanGeneratePreviews:
    displayName: Generate Video Previews
    description: Generate video preview clips for scenes during scan.
    type: BOOLEAN
  scanGenerateImagePreviews:
    displayName: Generate Image Previews
    description: Generate image preview strips for scenes during scan.
    type: BOOLEAN
  scanGenerateSprites:
    displayName: Generate Sprites
    description: Generate sprite sheets (used for seek bar previews) during scan.
    type: BOOLEAN
  scanGeneratePhashes:
    displayName: Generate Video Phashes
    description: Generate perceptual hashes for video files during scan. Used for duplicate detection.
    type: BOOLEAN
  scanGenerateImagePhashes:
    displayName: Generate Image Phashes
    description: Generate perceptual hashes for image files during scan. Used for duplicate detection.
    type: BOOLEAN
  scanGenerateThumbnails:
    displayName: Generate Image Thumbnails
    description: Generate thumbnails for image files during scan.
    type: BOOLEAN
  scanGenerateClipPreviews:
    displayName: Generate Image Clip Previews
    description: Generate animated clip previews for image gallery files during scan.
    type: BOOLEAN
  rescan:
    displayName: Force Rescan
    description: >
      Force a rescan of all files even if their modification time has not
      changed. Useful after upgrading Stash or changing scan settings.
    type: BOOLEAN


@@ -0,0 +1,794 @@
#!/usr/bin/env python3
"""
stash_scheduler.py — Stash Scheduler Plugin
All plugin tasks (invoked by Stash via stdin JSON) exit IMMEDIATELY so they
never occupy the Stash job queue. Long-running work is handed off to detached
background subprocesses that run outside Stash's queue entirely.
Plugin task modes (mode comes from defaultArgs in the manifest):
start_scheduler — Save config, kill any existing daemon, start daemon in
background, return immediately.
run_now — Fire scan mutation, return immediately. If "Run Identify
After Scan" is enabled, launch a background subprocess to
wait for the scan and then trigger identify.
force_now — Fire scan mutation and ALWAYS launch a background subprocess
for identify afterwards, regardless of settings.
check_status — Report whether the daemon is running and show recent log
lines. Safe to run at any time; never modifies state.
Background subprocess modes (invoked via sys.argv, never by Stash directly):
--daemon — Run the APScheduler loop.
--after-identify <job_id> <mins> — Wait for scan job to finish, then
trigger identify.
"""
import json
import logging
import os
import signal
import subprocess
import sys
import tempfile
import threading
import time
# ---------------------------------------------------------------------------
# Paths (all in temp dir so subprocesses find them regardless of cwd)
# ---------------------------------------------------------------------------
_TMP = tempfile.gettempdir()
CONFIG_FILE = os.path.join(_TMP, "stash-scheduler-config.json")
PID_FILE = os.path.join(_TMP, "stash-scheduler-daemon.pid")
LOG_FILE = os.path.join(_TMP, "stash-scheduler-daemon.log")
# ---------------------------------------------------------------------------
# GraphQL
# ---------------------------------------------------------------------------
SCAN_MUTATION = """
mutation MetadataScan($input: ScanMetadataInput!) {
metadataScan(input: $input)
}
"""
IDENTIFY_MUTATION = """
mutation MetadataIdentify($input: IdentifyMetadataInput!) {
metadataIdentify(input: $input)
}
"""
JOB_QUEUE_QUERY = """
query JobQueue {
jobQueue {
id
status
description
progress
}
}
"""
CONFIGURATION_QUERY = """
query Configuration {
configuration {
plugins
defaults {
identify {
sources {
source {
stash_box_endpoint
scraper_id
}
options {
fieldOptions {
field
strategy
createMissing
}
setCoverImage
setOrganized
includeMalePerformers
}
}
options {
fieldOptions {
field
strategy
createMissing
}
setCoverImage
setOrganized
includeMalePerformers
}
}
}
}
}
"""
# ---------------------------------------------------------------------------
# Config file I/O
# ---------------------------------------------------------------------------
def save_config(server_connection, settings):
    with open(CONFIG_FILE, "w") as f:
        json.dump({"server_connection": server_connection, "settings": settings}, f)
def load_config():
    with open(CONFIG_FILE) as f:
        return json.load(f)
# ---------------------------------------------------------------------------
# Stash connection
# ---------------------------------------------------------------------------
def make_stash(server_connection):
    from stashapi.stashapp import StashInterface
    return StashInterface(server_connection)
def call_gql(stash, query, variables=None):
    return stash.call_GQL(query, variables or {})
# ---------------------------------------------------------------------------
# Settings
# ---------------------------------------------------------------------------
def get_plugin_settings(stash, plugin_id="stash-scheduler"):
    defaults = {
        "frequency": "daily",
        "time_of_day": "02:00",
        "day_of_week": "sun",
        "timezone": "UTC",
        "run_identify": False,
        "identify_timeout_minutes": 120,
        # Comma- or newline-separated list of paths to restrict scan + identify.
        # Empty string = full library (default).
        "scanPaths": "",
        # Scan generation flags — all off by default (non-breaking)
        "scanGenerateCovers": False,
        "scanGeneratePreviews": False,
        "scanGenerateImagePreviews": False,
        "scanGenerateSprites": False,
        "scanGeneratePhashes": False,
        "scanGenerateImagePhashes": False,
        "scanGenerateThumbnails": False,
        "scanGenerateClipPreviews": False,
        "rescan": False,
    }
    try:
        result = call_gql(stash, CONFIGURATION_QUERY)
        plugins_cfg = result.get("configuration", {}).get("plugins", {})
        saved = plugins_cfg.get(plugin_id, {})
        defaults.update(saved)
    except Exception as exc:
        stash.log.warning(f"[Stash Scheduler] Could not read plugin settings: {exc}")
    return defaults
def _parse_time_of_day(raw, warn):
    raw = str(raw).strip()
    try:
        parts = raw.split(":")
        if len(parts) != 2:
            raise ValueError("expected HH:MM")
        hh, mm = int(parts[0]), int(parts[1])
        if not (0 <= hh <= 23 and 0 <= mm <= 59):
            raise ValueError(f"values out of range: {hh}:{mm:02d}")
        return hh, mm
    except (ValueError, TypeError) as exc:
        warn(f"[Stash Scheduler] Invalid time_of_day {raw!r} ({exc}) — defaulting to 02:00.")
        return 2, 0
def validate_and_coerce_settings(settings, warn):
    VALID_FREQUENCIES = {"hourly", "daily", "weekly"}
    VALID_DAYS = {"mon", "tue", "wed", "thu", "fri", "sat", "sun"}
    freq = str(settings.get("frequency", "daily")).strip().lower()
    if freq not in VALID_FREQUENCIES:
        warn(f"[Stash Scheduler] Invalid frequency {freq!r} — defaulting to 'daily'.")
        freq = "daily"
    settings["frequency"] = freq
    raw_time = settings.get("time_of_day", "02:00")
    hour, minute = _parse_time_of_day(raw_time, warn)
    settings["time_of_day"] = f"{hour:02d}:{minute:02d}"
    settings["hour"] = hour
    settings["minute"] = minute
    dow = str(settings.get("day_of_week", "sun")).strip().lower()
    if dow not in VALID_DAYS:
        warn(f"[Stash Scheduler] Invalid day_of_week {dow!r} — defaulting to 'sun'.")
        dow = "sun"
    settings["day_of_week"] = dow
    try:
        timeout = int(settings.get("identify_timeout_minutes", 120))
        if timeout < 1:
            raise ValueError(f"must be >= 1, got {timeout}")
    except (ValueError, TypeError) as exc:
        warn(f"[Stash Scheduler] Invalid identify_timeout_minutes ({exc}) — defaulting to 120.")
        timeout = 120
    settings["identify_timeout_minutes"] = timeout
    settings["run_identify"] = bool(settings.get("run_identify", False))
    tz_raw = str(settings.get("timezone", "")).strip()
    if not tz_raw:
        tz_raw = "UTC"
    try:
        import zoneinfo
        zoneinfo.ZoneInfo(tz_raw)
    except Exception:
        # Python 3.8 has no zoneinfo module; try the backport before giving up.
        try:
            from backports import zoneinfo as _bz
            _bz.ZoneInfo(tz_raw)
        except Exception:
            warn(
                f"[Stash Scheduler] Unrecognised timezone {tz_raw!r} — defaulting to UTC."
            )
            tz_raw = "UTC"
    settings["timezone"] = tz_raw
    # scanPaths — parse comma/newline-separated string into a clean list
    raw_paths = str(settings.get("scanPaths", "") or "")
    import re as _re
    scan_paths = [p.strip() for p in _re.split(r"[,\n]+", raw_paths) if p.strip()]
    settings["scan_paths"] = scan_paths  # normalised list; raw "scanPaths" kept for saving
    # Scan generation flags — coerce to bool
    for flag in (
        "scanGenerateCovers", "scanGeneratePreviews", "scanGenerateImagePreviews",
        "scanGenerateSprites", "scanGeneratePhashes", "scanGenerateImagePhashes",
        "scanGenerateThumbnails", "scanGenerateClipPreviews", "rescan",
    ):
        settings[flag] = bool(settings.get(flag, False))
    return settings
# ---------------------------------------------------------------------------
# Scan / Identify helpers (used by both plugin tasks and the daemon)
# ---------------------------------------------------------------------------
_SCAN_FLAGS = (
"scanGenerateCovers",
"scanGeneratePreviews",
"scanGenerateImagePreviews",
"scanGenerateSprites",
"scanGeneratePhashes",
"scanGenerateImagePhashes",
"scanGenerateThumbnails",
"scanGenerateClipPreviews",
"rescan",
)
def trigger_scan(stash_or_log, gql_fn, settings=None):
    """
    Trigger a library scan and return the job ID.
    stash_or_log: a StashInterface, a logging.Logger, or None.
    gql_fn: callable(query, variables) -> result dict.
    settings: validated settings dict (used to set scanGenerate* flags).
    """
    scan_input = {}
    if settings:
        for flag in _SCAN_FLAGS:
            if settings.get(flag):
                scan_input[flag] = True
        paths = settings.get("scan_paths") or []
        if paths:
            scan_input["paths"] = paths
    active_flags = [f for f in _SCAN_FLAGS if scan_input.get(f)]
    paths_desc = (
        f" paths=[{', '.join(scan_input['paths'])}]" if scan_input.get("paths") else ""
    )
    if active_flags:
        _log_info(
            stash_or_log,
            "[Stash Scheduler] Triggering library scan with flags: "
            + ", ".join(active_flags) + paths_desc,
        )
    else:
        _log_info(
            stash_or_log,
            "[Stash Scheduler] Triggering library scan"
            + (paths_desc if paths_desc else " (full library)"),
        )
    result = gql_fn(SCAN_MUTATION, {"input": scan_input})
    job_id = result.get("metadataScan")
    if job_id:
        _log_info(stash_or_log, f"[Stash Scheduler] Scan job started (id={job_id})")
    else:
        _log_warn(stash_or_log, "[Stash Scheduler] metadataScan returned no job ID.")
    return job_id
def trigger_identify(stash_or_log, gql_fn, paths=None):
    """
    Trigger an identify task and return the job ID.
    paths: optional list of path strings — when set, only scenes within those
    paths are identified (mirrors the scan path restriction).
    """
    try:
        result = gql_fn(CONFIGURATION_QUERY, {})
        identify = result.get("configuration", {}).get("defaults", {}).get("identify", {})
    except Exception as exc:
        _log_warn(stash_or_log, f"[Stash Scheduler] Could not read identify defaults: {exc}")
        return None
    if not identify or not identify.get("sources"):
        _log_warn(
            stash_or_log,
            "[Stash Scheduler] Identify skipped — no sources configured in "
            "Settings → Identify.",
        )
        return None
    paths_desc = f" (paths: {', '.join(paths)})" if paths else " (full library)"
    _log_info(stash_or_log, f"[Stash Scheduler] Triggering identify task{paths_desc}")
    try:
        identify_input = {"sources": identify["sources"]}
        if identify.get("options"):
            identify_input["options"] = identify["options"]
        if paths:
            identify_input["paths"] = paths
        result = gql_fn(IDENTIFY_MUTATION, {"input": identify_input})
        job_id = result.get("metadataIdentify")
        _log_info(stash_or_log, f"[Stash Scheduler] Identify job started (id={job_id})")
        return job_id
    except Exception as exc:
        _log_err(stash_or_log, f"[Stash Scheduler] Failed to start identify: {exc}")
        return None


def wait_for_scan_and_identify(stash_or_log, gql_fn, job_id, timeout_minutes, paths=None):
    """
    Poll until the scan job finishes, then trigger identify.

    Runs in a daemon thread or a detached subprocess, never in a Stash task.
    paths: optional list of path strings passed through to trigger_identify.
    """
    if job_id is None:
        _log_warn(stash_or_log, "[Stash Scheduler] No scan job ID — identify skipped.")
        return

    deadline = time.time() + timeout_minutes * 60
    _log_info(
        stash_or_log,
        f"[Stash Scheduler] Waiting for scan {job_id} to finish "
        f"(timeout {timeout_minutes} min)…",
    )
    while time.time() < deadline:
        try:
            result = gql_fn(JOB_QUEUE_QUERY, {})
            queue = result.get("jobQueue") or []
            job = next((j for j in queue if str(j.get("id")) == str(job_id)), None)
            if job is None:
                # A job that has left the queue is treated as completed.
                _log_info(stash_or_log, "[Stash Scheduler] Scan complete — starting identify.")
                trigger_identify(stash_or_log, gql_fn, paths)
                return
            status = job.get("status", "UNKNOWN")
            if status == "FINISHED":
                _log_info(stash_or_log, "[Stash Scheduler] Scan finished — starting identify.")
                trigger_identify(stash_or_log, gql_fn, paths)
                return
            if status in ("CANCELLED", "FAILED"):
                _log_warn(
                    stash_or_log,
                    f"[Stash Scheduler] Scan ended with status {status}. Identify skipped.",
                )
                return
        except Exception as exc:
            # A failed poll is logged and retried until the deadline.
            _log_warn(stash_or_log, f"[Stash Scheduler] Error polling job queue: {exc}")
        time.sleep(15)

    _log_warn(
        stash_or_log,
        f"[Stash Scheduler] Timed out after {timeout_minutes} min. Identify skipped.",
    )
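

# Illustrative sketch, not part of the plugin: the polling loop in
# wait_for_scan_and_identify measures its deadline with time.time(), which can
# jump if the system clock is adjusted. A variant built on time.monotonic()
# might look like the helper below. The name _example_poll_until and all of
# its parameters are hypothetical and are not used anywhere else in this file.
import time


def _example_poll_until(check, timeout_seconds, interval_seconds=15.0, sleep=time.sleep):
    """Call check() until it returns a non-None value or the deadline passes.

    Returns whatever check() returned, or None on timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        outcome = check()
        if outcome is not None:
            return outcome
        sleep(interval_seconds)
    return None

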
# ---------------------------------------------------------------------------
# Logging helpers
# stash_or_log is either a StashInterface (plugin tasks) or a logging.Logger
# (daemon / background subprocesses). None = silent.
# ---------------------------------------------------------------------------
def _log(obj, level, msg):
    """Send msg to a logging.Logger, or to the .log attribute of a StashInterface."""
    if obj is None:
        return
    if isinstance(obj, logging.Logger):
        getattr(obj, level)(msg)
        return
    try:
        getattr(obj.log, level)(msg)
    except Exception:
        # Logging through Stash must never break the caller.
        pass


def _log_info(obj, msg):
    _log(obj, "info", msg)


def _log_warn(obj, msg):
    _log(obj, "warning", msg)


def _log_err(obj, msg):
    _log(obj, "error", msg)


def _make_file_logger(name="stash-scheduler"):
    """Return a Logger that writes to LOG_FILE with timestamps."""
    logger = logging.getLogger(name)
    if logger.handlers:
        # Already configured in this process; avoid duplicate handlers.
        return logger
    logger.setLevel(logging.DEBUG)
    fh = logging.FileHandler(LOG_FILE, encoding="utf-8")
    fh.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(fh)
    # Also mirror to stderr so it shows up when running interactively
    sh = logging.StreamHandler(sys.stderr)
    sh.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(sh)
    return logger
# ---------------------------------------------------------------------------
# Subprocess / daemon helpers
# ---------------------------------------------------------------------------
def _self():
    return os.path.abspath(__file__)


def launch_detached(*args):
    """
    Launch this script with the given argv in a fully detached process.

    The daemon subprocess redirects stdout/stderr to LOG_FILE so its output
    is preserved. Other subprocesses discard output.
    """
    is_daemon = bool(args) and args[0] == "--daemon"
    log_fd = open(LOG_FILE, "a", encoding="utf-8") if is_daemon else None
    out = log_fd if log_fd is not None else subprocess.DEVNULL
    try:
        subprocess.Popen(
            [sys.executable, _self(), *args],
            stdin=subprocess.DEVNULL,
            stdout=out,
            stderr=out,
            close_fds=True,
            start_new_session=True,  # POSIX only
        )
    finally:
        # The child has its own copy of the handle; close the parent's copy
        # so the file descriptor is not leaked.
        if log_fd is not None:
            log_fd.close()


def kill_existing_daemon():
    if not os.path.exists(PID_FILE):
        return
    # signal.SIGKILL does not exist on Windows; fall back to SIGTERM there.
    sigkill = getattr(signal, "SIGKILL", signal.SIGTERM)
    try:
        with open(PID_FILE) as f:
            pid = int(f.read().strip())
        os.kill(pid, signal.SIGTERM)
        time.sleep(1)
        try:
            # Force-kill if the process is still around after the grace period.
            os.kill(pid, sigkill)
        except OSError:
            pass
    except (OSError, ValueError):
        pass
    finally:
        try:
            os.remove(PID_FILE)
        except OSError:
            pass


def write_pid():
    with open(PID_FILE, "w") as f:
        f.write(str(os.getpid()))


def daemon_alive():
    """Return (alive: bool, pid: int|None)."""
    if not os.path.exists(PID_FILE):
        return False, None
    try:
        with open(PID_FILE) as f:
            pid = int(f.read().strip())
        # Signal 0 is an existence check on POSIX only. On Windows, os.kill
        # with any value other than CTRL_C_EVENT / CTRL_BREAK_EVENT terminates
        # the target process.
        os.kill(pid, 0)
        return True, pid
    except (OSError, ValueError):
        return False, None
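

# Illustrative sketch, not part of the plugin: daemon_alive treats every
# OSError from os.kill(pid, 0) as "not running". On POSIX, PermissionError
# means the process exists but belongs to another user, so a stricter check
# could separate the two cases. _example_pid_exists is a hypothetical name,
# and the sketch deliberately refuses to run on Windows, where os.kill(pid, 0)
# terminates the process instead of probing it.
import os


def _example_pid_exists(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if os.name == "nt":
        raise NotImplementedError("os.kill(pid, 0) is not a safe probe on Windows")
    if pid <= 0:
        # PID 0 and negative PIDs address process groups, not one process.
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True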


def tail_log(n=30):
    """Return the last n lines of the daemon log file as a string."""
    if not os.path.exists(LOG_FILE):
        return "(log file not found)"
    try:
        with open(LOG_FILE, encoding="utf-8", errors="replace") as f:
            lines = f.readlines()
        return "".join(lines[-n:]).strip()
    except Exception as exc:
        return f"(could not read log: {exc})"
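

# Illustrative sketch, not part of the plugin: tail_log loads the whole log
# with readlines() before slicing. For a log that grows without rotation, a
# bounded deque keeps only the last n lines in memory while still reading the
# file once. _example_tail and its parameters are hypothetical names.
import collections


def _example_tail(path, n=30):
    """Return the last n lines of the file at path as one stripped string."""
    with open(path, encoding="utf-8", errors="replace") as f:
        # deque(maxlen=n) discards older lines as newer ones are appended.
        last_lines = collections.deque(f, maxlen=n)
    return "".join(last_lines).strip()

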
# ---------------------------------------------------------------------------
# Background mode: --daemon
# ---------------------------------------------------------------------------
def run_daemon():
    log = _make_file_logger()
    log.info("=== Stash Scheduler daemon starting ===")
    try:
        cfg = load_config()
    except Exception as exc:
        log.error(f"Cannot read config file: {exc}")
        sys.exit(1)

    write_pid()
    log.info(f"PID {os.getpid()} written to {PID_FILE}")

    try:
        stash = make_stash(cfg["server_connection"])
    except Exception as exc:
        log.error(f"Cannot connect to Stash: {exc}")
        sys.exit(1)

    settings = validate_and_coerce_settings(cfg["settings"], lambda m: log.warning(m))

    try:
        from apscheduler.schedulers.background import BackgroundScheduler
    except ImportError:
        log.error("APScheduler not installed. Run: pip install apscheduler")
        sys.exit(1)

    frequency = settings["frequency"]
    hour = settings["hour"]
    minute = settings["minute"]
    day_of_week = settings["day_of_week"]
    timezone = settings["timezone"]
    run_identify = settings["run_identify"]
    identify_timeout = settings["identify_timeout_minutes"]
    scan_paths = settings.get("scan_paths") or []

    # Build a simple GQL callable for the daemon (not stash.log-based)
    def gql(query, variables=None):
        return call_gql(stash, query, variables)

    def scheduled_job():
        log.info("Scheduled scan cycle firing.")
        try:
            job_id = trigger_scan(log, gql, settings)
        except Exception as exc:
            log.error(f"Scan failed: {exc}")
            return
        if run_identify:
            # Wait for the scan in a separate thread so the scheduler is not blocked.
            threading.Thread(
                target=wait_for_scan_and_identify,
                args=(log, gql, job_id, identify_timeout, scan_paths or None),
                daemon=True,
            ).start()

    scheduler = BackgroundScheduler(timezone=timezone)
    job_kwargs = {"func": scheduled_job, "misfire_grace_time": 3600, "coalesce": True}
    if frequency == "hourly":
        scheduler.add_job(trigger="cron", minute=0, **job_kwargs)
        log.info(f"Schedule: every hour at :00 ({timezone})")
    elif frequency == "weekly":
        scheduler.add_job(
            trigger="cron", day_of_week=day_of_week, hour=hour, minute=minute, **job_kwargs
        )
        log.info(f"Schedule: weekly {day_of_week.upper()} at {hour:02d}:{minute:02d} ({timezone})")
    else:
        scheduler.add_job(trigger="cron", hour=hour, minute=minute, **job_kwargs)
        log.info(f"Schedule: daily at {hour:02d}:{minute:02d} ({timezone})")

    log.info(f"Identify after scan: {'yes' if run_identify else 'no'}")
    if scan_paths:
        log.info("Scan/identify paths: " + ", ".join(scan_paths))
    else:
        log.info("Scan/identify paths: full library")
    active_flags = [f for f in _SCAN_FLAGS if settings.get(f)]
    if active_flags:
        log.info("Scan flags: " + ", ".join(active_flags))
    else:
        log.info("Scan flags: none (bare scan)")

    scheduler.start()
    log.info("Daemon is running. Waiting for scheduled events…")

    # Block the main thread until SIGTERM or SIGINT arrives.
    stop = threading.Event()
    signal.signal(signal.SIGTERM, lambda *_: stop.set())
    signal.signal(signal.SIGINT, lambda *_: stop.set())
    stop.wait()
    scheduler.shutdown(wait=False)
    log.info("Daemon stopped.")
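

# Illustrative sketch, not part of the plugin: the three frequency branches in
# run_daemon map onto APScheduler cron trigger fields as shown below. Leaving
# "hour" out of the hourly case lets the cron trigger fire in every hour.
# _example_cron_fields is a hypothetical helper and is not called by the
# daemon; the default argument values are assumptions for the example only.
def _example_cron_fields(frequency, hour=0, minute=0, day_of_week="mon"):
    """Return the cron keyword arguments for a given schedule frequency."""
    if frequency == "hourly":
        return {"minute": 0}
    if frequency == "weekly":
        return {"day_of_week": day_of_week, "hour": hour, "minute": minute}
    # Anything else is treated as daily, matching the else branch above.
    return {"hour": hour, "minute": minute}

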
# ---------------------------------------------------------------------------
# Background mode: --after-identify <job_id> <timeout_minutes>
# ---------------------------------------------------------------------------
def run_after_identify(job_id_arg, timeout_arg):
    log = _make_file_logger()
    log.info("=== after-identify subprocess starting ===")
    try:
        cfg = load_config()
    except Exception as exc:
        log.error(f"Cannot read config: {exc}")
        sys.exit(1)

    try:
        stash = make_stash(cfg["server_connection"])
    except Exception as exc:
        log.error(f"Cannot connect to Stash: {exc}")
        sys.exit(1)

    def gql(query, variables=None):
        return call_gql(stash, query, variables)

    # The literal string "none" on the command line means no scan job ID.
    job_id = None if job_id_arg == "none" else job_id_arg
    try:
        timeout = int(timeout_arg)
    except (ValueError, TypeError):
        timeout = 120

    # Read scan_paths from the saved settings so identify is scoped consistently
    settings = validate_and_coerce_settings(
        cfg.get("settings", {}), lambda m: log.warning(m)
    )
    scan_paths = settings.get("scan_paths") or None
    wait_for_scan_and_identify(log, gql, job_id, timeout, scan_paths)
    log.info("=== after-identify subprocess done ===")
# ---------------------------------------------------------------------------
# Plugin task modes — exit quickly, never block
# ---------------------------------------------------------------------------
def task_start_scheduler(stash, server_connection, settings):
    # Persist the connection and settings so the detached daemon can read them.
    save_config(server_connection, settings)
    kill_existing_daemon()
    launch_detached("--daemon")

    freq = settings["frequency"]
    time_str = settings.get("time_of_day", "??:??")
    dow = settings.get("day_of_week", "")
    tz = settings.get("timezone", "UTC")
    if freq == "hourly":
        schedule_desc = f"every hour at :00 ({tz})"
    elif freq == "weekly":
        schedule_desc = f"weekly on {dow.upper()} at {time_str} ({tz})"
    else:
        schedule_desc = f"daily at {time_str} ({tz})"

    stash.log.info(
        f"[Stash Scheduler] Daemon launched. Schedule: {schedule_desc}. "
        f"Identify after scan: {'yes' if settings['run_identify'] else 'no'}. "
        f"Daemon log: {LOG_FILE}"
    )
    print(json.dumps({"output": f"Scheduler daemon started ({schedule_desc})."}))


def task_run_now(stash, settings, force_identify=False):
    stash.log.info("[Stash Scheduler] Triggering scan now…")
    try:
        job_id = trigger_scan(stash, lambda q, v=None: call_gql(stash, q, v), settings)
    except Exception as exc:
        stash.log.error(f"[Stash Scheduler] Scan could not be started: {exc}")
        print(json.dumps({"output": f"Error starting scan: {exc}"}))
        return

    run_identify = force_identify or settings["run_identify"]
    timeout = settings["identify_timeout_minutes"]
    if run_identify:
        # Hand the wait off to a detached subprocess so this task exits quickly.
        jid_arg = str(job_id) if job_id else "none"
        launch_detached("--after-identify", jid_arg, str(timeout))
        stash.log.info("[Stash Scheduler] Identify will follow scan completion (background).")
        print(json.dumps({"output": "Scan started. Identify will follow in the background."}))
    else:
        print(json.dumps({"output": "Scan started."}))


def task_check_status(stash):
    alive, pid = daemon_alive()
    status_line = f"Daemon: RUNNING (PID {pid})" if alive else "Daemon: NOT RUNNING"
    recent = tail_log(30)
    # The log path is printed once; the recent lines follow under their own heading.
    output = f"{status_line}\nLog file: {LOG_FILE}\n\nRecent log:\n{recent}"
    stash.log.info(f"[Stash Scheduler] {status_line}")
    print(json.dumps({"output": output}))
# ---------------------------------------------------------------------------
# Entry point
# ---------------------------------------------------------------------------
def main():
    # Background subprocess modes
    if len(sys.argv) > 1:
        mode = sys.argv[1]
        if mode == "--daemon":
            run_daemon()
            return
        if mode == "--after-identify":
            job_id_arg = sys.argv[2] if len(sys.argv) > 2 else "none"
            timeout_arg = sys.argv[3] if len(sys.argv) > 3 else "120"
            run_after_identify(job_id_arg, timeout_arg)
            return
        print(f"[Stash Scheduler] Unknown argument: {mode}", file=sys.stderr)
        sys.exit(1)

    # Plugin task mode
    raw = sys.stdin.read()
    try:
        plugin_input = json.loads(raw)
    except json.JSONDecodeError as exc:
        print(f"[Stash Scheduler] Failed to parse plugin input: {exc}", file=sys.stderr)
        sys.exit(1)

    # "server_connection" and "args" may be present but null, so fall back to {}.
    server_connection = plugin_input.get("server_connection") or {}
    task_mode = (plugin_input.get("args") or {}).get("mode", "start_scheduler")

    try:
        from stashapi.stashapp import StashInterface
        stash = StashInterface(server_connection)
    except ImportError:
        print("[Stash Scheduler] stashapp-tools not installed.", file=sys.stderr)
        sys.exit(1)
    except Exception as exc:
        print(f"[Stash Scheduler] Could not connect to Stash: {exc}", file=sys.stderr)
        sys.exit(1)

    settings = get_plugin_settings(stash)
    settings = validate_and_coerce_settings(settings, lambda m: stash.log.warning(m))

    if task_mode == "start_scheduler":
        task_start_scheduler(stash, server_connection, settings)
    elif task_mode == "run_now":
        task_run_now(stash, settings, force_identify=False)
    elif task_mode == "force_now":
        stash.log.info("[Stash Scheduler] Force mode — identify will always run after scan.")
        task_run_now(stash, settings, force_identify=True)
    elif task_mode == "check_status":
        task_check_status(stash)
    else:
        stash.log.error(f"[Stash Scheduler] Unknown task mode: {task_mode!r}")
        sys.exit(1)


if __name__ == "__main__":
    main()