Docker rewrite and optimizations (#321)

* Optimizations of dockerfile

Massive test optimizations, with the image size dropping from about 1.2 GB to about 256 MB. The drawback is that the Playwright version in the Dockerfile currently has to be kept matched to the version in package.json.

* further optimizations

Removed redundant (hopefully) sessions directory creation during build

* Fix docker cron dependencies

Small fix that should make cron run properly

* Major docker update!

- **Dockerfile rewritten as a multi-stage build**
  - Split into a “builder” stage (`node:18-slim`) to install dependencies and compile TypeScript, and a “runtime” stage (official Playwright image) to run the script.
  - This keeps build tools and dependencies out of the final image, making it smaller, faster to pull, and more secure.

- **Entrypoint script (`entrypoint.sh`)**
  - Introduced an entrypoint that runs inside the container at startup to:
    1. Set the container’s timezone (`TZ`) correctly, based on the environment or defaulting to UTC.
    2. Validate that the user provided a `CRON_SCHEDULE` (exiting early with an error if missing).
    3. Optionally perform an initial run of the script immediately (when `RUN_ON_START=true`), without any random sleep.
  - Centralizing setup in an entrypoint keeps the Dockerfile simpler and ensures proper signal handling.
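The first two startup checks above can be sketched as small shell functions. This is a minimal illustration only, not the actual `entrypoint.sh`; the function names `resolve_tz` and `require_cron_schedule` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the startup checks; not the real entrypoint.sh.

# Print the timezone to use: TZ if set and non-empty, otherwise UTC.
resolve_tz() {
  printf '%s\n' "${TZ:-UTC}"
}

# Return non-zero (with an error on stderr) when CRON_SCHEDULE is missing.
require_cron_schedule() {
  if [ -z "${CRON_SCHEDULE:-}" ]; then
    echo "ERROR: CRON_SCHEDULE is not set" >&2
    return 1
  fi
}
```

An entrypoint built this way might run `require_cron_schedule || exit 1` before installing the crontab, so a missing schedule stops the container immediately.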

- **`run_daily.sh` improvements**
  - Removed custom browser-path override so Playwright uses bundled browsers in the official image.
  - Added a lock using `flock` to prevent overlapping runs if a previous run is still in progress.
  - Retained the random sleep between 5 and 50 minutes before each run.
  - Logs are timestamped and clearly report success or failure.
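The locking behaviour described above relies on `flock -n`, which fails immediately instead of waiting when another process holds the lock. A minimal sketch, assuming the util-linux `flock` utility is installed; the helper name `acquire_lock` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: take an exclusive, non-blocking lock on a file.
# Returns 0 when the lock is acquired; it is then held until fd 9 is
# closed or the shell exits. Returns non-zero when another holder exists.
acquire_lock() {
  local lockfile="$1"
  exec 9>"$lockfile"   # open (or create) the lock file on fd 9
  flock -n 9           # -n: do not wait if the lock is already taken
}
```

A script could call `acquire_lock /tmp/run_daily.lock || exit 0` as its first step, so that a second invocation exits quietly while the first is still running.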

- **Cron template tweaks**
  - Updated `src/crontab.template` so that each job line redirects both stdout and stderr into Docker’s stdout (`>> /proc/1/fd/1 2>&1`), making it easy to view logs via `docker logs`.
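The effect of `>> target 2>&1` can be reproduced outside the container. The sketch below uses a temporary file as a stand-in for `/proc/1/fd/1`, and `job` is a hypothetical placeholder for the cron job.

```shell
#!/usr/bin/env bash
# Stand-in for a cron job that writes to both streams (hypothetical).
job() {
  echo "to stdout"
  echo "to stderr" >&2
}

# Stand-in for /proc/1/fd/1; a temp file keeps the sketch runnable anywhere.
log="$(mktemp)"

# Append stdout to the log, then point stderr at the same destination.
job >> "$log" 2>&1

cat "$log"
```

The order of the two redirections matters: `2>&1` has to come after `>> "$log"`, otherwise stderr is duplicated from the original stdout and never reaches the log.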

- **Initial-run logic**
  - The entrypoint checks `RUN_ON_START=true` and, if set, invokes `npm start` immediately (without random sleep). This provides an immediate first execution on container startup.
  - Scheduled runs via cron still go through the normal `run_daily.sh` (with sleep and locking).

- **Cron logging and visibility**
  - By redirecting cron job output to the container’s stdout, all logs (initial run and scheduled runs) appear in `docker logs`, avoiding the need to tail log files manually.

- **Error handling and validation**
  - The entrypoint exits early if `CRON_SCHEDULE` is missing, preventing silent misconfiguration.
  - If the initial run fails, it logs a warning but still starts cron so future scheduled runs can proceed.
  - `run_daily.sh` will exit early if a previous run is still active (locking), avoiding overlapping executions.
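The warn-and-continue behaviour for the initial run can be sketched as follows. This is an assumption about how such logic might look, not the actual entrypoint code; `initial_run` is a hypothetical name and the log prefix is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the given command once at startup if requested.
# Always returns 0 so the caller can go on to start cron.
initial_run() {
  if [ "${RUN_ON_START:-false}" != "true" ]; then
    return 0
  fi
  echo "[$(date)] [entrypoint] Starting initial run..."
  if "$@"; then
    echo "[$(date)] [entrypoint] Initial run completed."
  else
    echo "[$(date)] [entrypoint] WARNING: initial run failed; scheduled runs will still start." >&2
  fi
  return 0
}
```

Called as `initial_run npm start`, a failing first run only produces a warning on stderr, and the entrypoint can continue to launch cron.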

* Docker (multi-stage) improvements

- added cron logging in the entrypoint and fixed timezone support for cron-invoked script runs
- further optimized the multi-stage Dockerfile
- bumped the Playwright version to 1.52.0 in the Dockerfile and package.json
- added customization and an enable/disable switch for randomization of cron start times
- added an optional container health monitor and resource limits in compose.yaml
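The configurable randomization can be sketched as a function that turns a minute range into a delay in seconds. This is a minimal illustration, assuming the range is given in whole minutes; `random_delay_seconds` is a hypothetical name, and the guard for an empty range is an addition that avoids a modulo by zero when both bounds are equal.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a random delay in seconds within
# [min, max) minutes. Defaults mirror the 5-50 minute window.
random_delay_seconds() {
  local min_min="${1:-5}" max_min="${2:-50}"
  local min_sec=$(( min_min * 60 )) max_sec=$(( max_min * 60 ))
  local span=$(( max_sec - min_sec ))
  if (( span <= 0 )); then
    # Empty or inverted range: fall back to the minimum, no randomness.
    echo "$min_sec"
    return 0
  fi
  echo $(( min_sec + RANDOM % span ))
}
```

Bash's `RANDOM` yields values from 0 to 32767, so ranges wider than 32767 seconds (about 9 hours) would not be covered in full by this approach.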
Michael Cammarata
2025-07-17 06:16:22 -04:00
committed by GitHub
parent e7c27ac16e
commit f51daf06d6
6 changed files with 189 additions and 83 deletions


@@ -1 +1,2 @@
-${CRON_SCHEDULE} TZ=${TZ} /bin/bash /usr/src/microsoft-rewards-script/src/run_daily.sh >> /proc/1/fd/1 2>> /proc/1/fd/2
+# Run automation according to CRON_SCHEDULE; redirect both stdout & stderr to Docker logs
+${CRON_SCHEDULE} TZ=${TZ} /bin/bash /usr/src/microsoft-rewards-script/src/run_daily.sh >> /proc/1/fd/1 2>&1

src/run_daily.sh Normal file → Executable file

@@ -1,32 +1,42 @@
-#!/bin/bash
+#!/usr/bin/env bash
+set -euo pipefail
-# Set up environment variables
-export PATH=$PATH:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin
+# Ensure Playwright uses the preinstalled browsers
+export PLAYWRIGHT_BROWSERS_PATH=/ms-playwright
-# Ensure TZ is set
-export TZ=${TZ}
+# Ensure TZ is set (entrypoint sets TZ system-wide); fallback if missing
+export TZ="${TZ:-UTC}"
-# Change directory to the application directory
+# Change to project directory
 cd /usr/src/microsoft-rewards-script
-# Define the minimum and maximum wait times in seconds
-MINWAIT=$((5*60)) # 5 minutes
-MAXWAIT=$((50*60)) # 50 minutes
+# Optional: prevent overlapping runs
+LOCKFILE=/tmp/run_daily.lock
+exec 9>"$LOCKFILE"
+if ! flock -n 9; then
+  echo "[$(date)] [run_daily.sh] Previous instance still running; exiting."
+  exit 0
+fi
-# Calculate a random sleep time within the specified range
-SLEEPTIME=$((MINWAIT + RANDOM % (MAXWAIT - MINWAIT)))
+# Random sleep between configurable minutes (default 5-50 minutes)
+MINWAIT=${MIN_SLEEP_MINUTES:-5}
+MAXWAIT=${MAX_SLEEP_MINUTES:-50}
+MINWAIT_SEC=$((MINWAIT*60))
+MAXWAIT_SEC=$((MAXWAIT*60))
-# Convert the sleep time to minutes for logging
-SLEEP_MINUTES=$((SLEEPTIME / 60))
+# Skip sleep if SKIP_RANDOM_SLEEP is set to true
+if [ "${SKIP_RANDOM_SLEEP:-false}" != "true" ]; then
+  SLEEPTIME=$(( MINWAIT_SEC + RANDOM % (MAXWAIT_SEC - MINWAIT_SEC) ))
+  SLEEP_MINUTES=$(( SLEEPTIME / 60 ))
+  echo "[$(date)] [run_daily.sh] Sleeping for $SLEEP_MINUTES minutes ($SLEEPTIME seconds) to randomize execution..."
+  sleep "$SLEEPTIME"
+else
+  echo "[$(date)] [run_daily.sh] Skipping random sleep (SKIP_RANDOM_SLEEP=true)"
+fi
-# Log the sleep duration
-echo "Sleeping for $SLEEP_MINUTES minutes ($SLEEPTIME seconds)..."
-# Sleep for the calculated time
-sleep $SLEEPTIME
-# Log the start of the script
-echo "Starting script..."
-# Execute the Node.js script directly
-npm run start
+echo "[$(date)] [run_daily.sh] Starting script..."
+if npm start; then
+  echo "[$(date)] [run_daily.sh] Script completed successfully."
+else
+  echo "[$(date)] [run_daily.sh] ERROR: Script failed!" >&2
+fi