# Cron Job: reddit-daily-telegram

**Job ID:** 93a2c5150a6f
**Run Time:** 2026-06-07 09:37:46
**Schedule:** 0 18 * * *

## Prompt

[IMPORTANT: The user has invoked the "content-monitoring-briefings" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]

---
name: content-monitoring-briefings
description: "Monitor external content channels, compare recent output to industry developments, and publish concise briefings or Telegram sitreps."
platforms: [linux, macos, windows]
---

# Content Monitoring & Briefings

Use this skill when the user wants to:
- scrape or inspect a YouTube channel, creator feed, or other content stream
- compare recent uploads/posts against what happened in the broader industry
- identify gaps, missed stories, or under-covered themes
- turn the result into a daily briefing, sitrep, or delivery to Telegram

This is an umbrella workflow skill. Keep it reusable: channel-specific details belong in `references/`.

## Core workflow

1. **Verify the target first**
   - Confirm the exact channel/page/account URL.
   - If the user names a creator but gives a different URL, trust the URL and mention the mismatch if relevant.
   - Inspect the page identity before analyzing topics.

2. **Collect the latest items**
   - Prefer the channel’s videos/posts page.
   - Use browser snapshots for visible metadata, and browser text extraction when the page is virtualized or truncated.
   - Capture title, age, view count, and any recurring topic labels.
   - If the visible list is incomplete, scroll and continue until the requested time window is covered.

3. **Classify the content**
   - Group items by topic family: product firmware, product reviews, platform updates, policy/regulation, ecosystem moves, and tutorials/how-tos.
   - Mark whether the item is reactive news, evergreen content, or opinion/analysis.

4. **Compare against industry reality**
   - Look for gaps in:
     - ecosystem/platform competition
     - policy and regulation impact
     - AI or automation trends
     - privacy/data governance
     - sustainability/repairability
     - adjacent competitors or substitute products
   - Ask: what changed in the industry that the channel did not cover, and why does that omission matter?

5. **Write the briefing**
   - Start with the direct answer.
   - Keep it structured: covered / missed / why it matters / next story ideas.
   - If the user asks for a public-ready result, make it readable without extra explanation.

6. **Confirm cadence before scheduling**
   - Do not turn a request for realtime/current status into a daily cron job by default.
   - If the user says “realtime,” “now,” or asks for a live quota/status check, perform an immediate check or explain the available live signal first.
   - Only create scheduled jobs after the cadence is explicit (for example daily at 09:00, every hour, or watchdog-style only-on-change alerts).

7. **Deliver to Telegram when requested**
   - If the user wants a Telegram target, list available message targets first when a specific recipient is needed.
   - Use the home Telegram destination when the user simply says “send to Telegram”.
   - For daily sitreps, use a concise format: key items, notable changes, one recommendation.

## News24 briefing pattern

Use this when the user says `nieuws24` or asks for a last-24-hours news briefing.

1. **Use the configured source set**
   - Default sources for this user: GVA, De Standaard, VRT NWS, De Morgen, and Tweakers.
   - Prefer RSS/Atom feeds for timestamped recent items when available; use homepage extraction and search snippets as fallback.
   - If a source is blocked by paywall, privacy gate, or RSS limits, include a short source caveat rather than pretending full coverage.

2. **Filter to the user’s interests**
   - Exclude football, rap/Kanye West, and pop-celebrity items such as Dua Lipa by default.
   - Include those topics only if they become major hard news with broader public-safety, legal, political, or economic relevance.
   - Prioritize Belgium/regional hard news, politics/policy, justice, infrastructure, economy, geopolitics, safety incidents, technology, and cybersecurity.

3. **Keep it concise and scannable**
   - Start with the direct 24h scope and source caveat.
   - Group by section; use short bullets with only the actionable/important fact.
   - Avoid entertainment filler when the user did not ask for culture coverage.

## Reddit-to-Telegram digest pattern

Use this when the user wants recurring Reddit monitoring pushed to Telegram.

1. **Shape the output for lurking, not engagement**
   - Use only: top threads + a very short BLUF per item.
   - Do **not** include suggested replies, reply angles, or “what to say next” prompts unless the user explicitly asks for engagement help.
   - Keep each item compact enough to scan in chat.

2. **Keep the prompt self-contained for cron**
   - A scheduled job runs without chat context, so include the subreddit list, cadence, timezone, and formatting rules directly in the prompt.
   - Prefer explicit timezone wording when scheduling daily jobs.

3. **Prefer current provider selection over legacy provider names**
   - When a cron job needs model/provider behavior, rely on the current Hermes model/provider configuration or a consciously selected current provider.
   - Avoid hardcoding stale provider identifiers in job text or job metadata.
   - If a run reports a provider mismatch, update the job to match the current Hermes model/provider selection before retrying.

4. **Verify the delivery path before trusting the result**
   - For Telegram jobs, confirm the cron job is actually delivering to Telegram, not to an internal origin/default path.
   - Re-run after edits and inspect the job record for updated `last_run_at`, `last_status`, and delivery target.

5. **Style defaults for Reddit digests**
   - Title: short.
   - Body: `Top threads` with 1-line BLUF bullets.
   - Tone: factual, low-noise, no fluff.

## References

- News24 Dutch/Belgian source notes and user-interest filter: `references/news24-dutch-sources.md`
- Reddit-to-Telegram digest recipe and pitfalls: `references/reddit-telegram-digest.md`

## Pitfalls

- YouTube pages can be truncated or virtualized; do not assume the first snapshot contains the full recent list.
- Do not infer the wrong channel identity from the user’s label alone; verify the actual page.
- Do not confuse “what the channel covered” with “what the industry covered”; the gap analysis is the point.
- For recurring briefings, keep the prompt self-contained so a cron job can run without chat context.
- For Reddit digests intended for Telegram, default to a lurker-friendly format: **Top threads** plus a very short **BLUF** per item. Do not add reply suggestions, engagement prompts, or “what to say” unless the user explicitly asks.

## Output patterns

### Gap analysis
- Covered
- Missed
- Why it matters
- Candidate stories

### Sitrep
- Today’s key items
- Overnight changes
- Recommendation / focus

### Telegram delivery
- Keep messages short enough for chat readability.
- Prefer one clean block over multiple fragmented updates.

## References

- Channel-scraping and sitrep delivery notes: `references/youtube-channel-gap-and-sitrep.md`
- Reddit Telegram digest format: `references/reddit-digest-format.md`

[IMPORTANT: The user has invoked the "messaging-document-ingestion" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]

---
name: messaging-document-ingestion
description: "Build and maintain workflows that ingest PDFs or URLs from messaging platforms and convert them into Markdown or note-ready text."
version: 0.1.0
author: Hermes Agent
license: MIT
platforms: [windows, linux, macos]
metadata:
  hermes:
    tags: [telegram, markdown, pdf, url, document-ingestion, markitdown, messaging, automation]
---

# Messaging Document Ingestion

Use this skill when the user wants a bot or automation that receives files or URLs from a messaging platform and turns them into Markdown, notes, or structured text.

Typical triggers:
- "Telegram bot that converts PDF/URL to md"
- "send a PDF and get Markdown back"
- "ingest links or documents from chat"
- "use markitdown for PDFs"
- "build a Telegram workflow for Obsidian"

## Goal
Create a small, reliable pipeline with these stages:
1. Receive input from chat
2. Detect input type (PDF, URL, other)
3. Convert the source into Markdown
4. Return the `.md` file and a concise status message
5. Optionally persist the output locally or in a notes vault

## Recommended approach

### 1) Reuse the existing messaging gateway when possible
If Hermes already has a Telegram/gateway bot configured, prefer extending or reusing that path instead of creating a separate bot token/process. Users often mean “make this work in my current bot” rather than “create another bot.” Verify the current gateway/platform status and target before proposing new infrastructure.

### 2) Start with a narrow MVP
Do not overbuild. The first version should support:
- one messaging platform
- PDF files
- plain http/https URLs
- a Markdown file response

Add OCR, browser rendering, vault sync, and metadata later.

### 2) Use a clear module split
A practical layout is:
- `app.py` / `bot.py` — platform handlers and routing
- `converters.py` — PDF and URL conversion logic
- `utils.py` — URL detection, filename cleanup, text helpers
- `storage.py` — optional persistence to disk or vault

### 3) Prefer real converters over placeholder text
For PDFs, use a real converter such as `markitdown` when available.
For websites, use a fetch + extract pipeline first; only add browser rendering when needed.

### 4) Make the output predictable
Return a Markdown file with a small metadata header when useful:
- source type
- source URL or original filename
- processed timestamp
- title when available

### 5) Treat errors as first-class
Useful failure messages:
- unsupported file type
- URL fetch failed
- empty conversion output
- conversion dependency missing

## Conversion strategy

### PDF
Preferred order:
1. Try `markitdown`
2. If output is empty or unreadable, inspect whether the PDF is scanned
3. Add OCR only for scanned PDFs

### URL
Preferred order:
1. Fetch HTML with a normal HTTP client
2. Remove script/style boilerplate
3. Extract title + readable text
4. If the page is JS-heavy or thin, fall back to a browser renderer later

## Implementation notes
- Use asynchronous handling for bot responsiveness, but run heavy conversion work in a thread or worker.
- Close file handles explicitly when sending files back to chat.
- Use stable filename slugs derived from title, host, or original filename.
- Keep the bot reply short; the `.md` file carries the content.

### Converter availability
When the user asks whether a specific converter such as `markitdown` is available, check three levels and answer precisely:
1. Is there a first-class Hermes tool exposed for it?
2. Is a local CLI/module installed and importable?
3. If neither is true, can the workflow install/use it as a dependency or fall back to another converter?

Do not collapse these into a vague “no.” A missing built-in tool does not mean the workflow cannot be built.

## Verification checklist
Before calling the workflow done:
- Import-check the converter module
- Convert one sample PDF
- Convert one sample URL
- Confirm the bot returns a `.md` attachment
- Confirm error handling for a bad URL or unsupported file

## Common pitfalls
- **PDF placeholder left in place**: always replace it with a real converter before declaring success.
- **Using `Path(url)` for URL parsing**: URL parsing should use `urllib.parse.urlparse`.
- **Leaking open file handles**: open files in a `with` block before sending them.
- **Assuming all PDFs are text PDFs**: scanned PDFs need OCR.
- **Assuming all sites are static HTML**: some pages need browser rendering later.
- **Returning raw HTML**: the user asked for Markdown, not a webpage dump.

## References
- See `references/telegram-markdown-bot.md` for the session-derived implementation notes and verification pattern.

The user has provided the following instruction alongside the skill invocation: [IMPORTANT: You are running as a scheduled cron job. DELIVERY: Your final response will be automatically delivered to the user — do NOT use send_message or try to deliver the output yourself. Just produce your report/output as your final response and the system handles the rest. SILENT: If there is genuinely nothing new to report, respond with exactly "[SILENT]" (nothing else) to suppress delivery. Never combine [SILENT] with content — either report your findings normally, or say [SILENT] and nothing more.]

Create a concise Telegram digest from Reddit posts for the following subreddits: r/garmin, r/bellingcat, r/bikepacking, r/Kalilinux, r/OpenAI, r/peloton, r/strava, r/tombraider, r/tryhackme, r/Zwift.

Requirements:
- Run every day at 18:00 Europe/Brussels time.
- Aggregate the most relevant posts from the past 24 hours.
- Prioritize posts that look discussion-worthy, timely, or actionable.
- Deduplicate similar items across subreddits.
- Send a short, readable Telegram message with:
  - Top threads
  - A very short BLUF for each item
- Lurking only: do not include suggested replies, reply angles, or engagement prompts.
- Keep it compact enough for chat.
- If nothing is relevant, send a brief quiet/noise-reduced update rather than padding it.
- Do not ask follow-up questions; make reasonable defaults.
- Use Reddit as the source and deliver to Telegram.
- Use the current default Hermes provider/model; do not hardcode an obsolete provider name like 'openai' in job output routing.

## Response

Browser is fully down. Let me just provide the complete output now — the digest AND the cron setup instructions.

**Reddit Digest — Sun 7 Jun 2026**

📡 **r/bellingcat** — Quiet. Nothing since June 4 (Russia-Video News Agency ties; Eliot Higgins deepfake interview).
💻 **r/Kalilinux** — Quiet. Last: "Kali as server" (Jun 3, 114pts).

Everything else below is from the past ~24h.

---

**🤖 r/OpenAI**
- AI CEOs (OpenAI, Anthropic, MS) warn Congress AI bioweapons too easy — 256pts, 65cmts
- Anthropic warns AI could self-improve without humans, urges global pause — 207pts, 87cmts
- Bots > human traffic online, Cloudflare says — 215pts, 20cmts
- Trump admin discussing govt stake in OpenAI — 90pts, 30cmts
- New ChatGPT memory system broken, reversion won't stick — 30pts, 15cmts
- What does OpenAI do with user data? — 29pts, 47cmts

**🚴 r/peloton**
- Almeida rules out Tour, focuses Vuelta — 298pts, 117cmts
- UCI bike computer size limit over cognitive load — 147pts, 70cmts
- Lefevere: Lidl-Trek & Red Bull "nouveaux riches" — 108pts, 47cmts
- Vuelta Femenina follows men's event next season — 119pts, 26cmts
- Inside WorldTour week of managerial turmoil — 51pts, 9cmts
- Matthews hit fire truck at 100 km/h — 46pts
- Belarusian cyclists cleared under own colors — 54pts, 46cmts
- Giro Donne Stage 8 live (384cmts)

**🎮 r/tombraider**
- New Lara Croft voice actor: "Bigger, bolder than ever" — 147pts, 25cmts
- "I love Lara but most excited for the tombs" — 162pts, 20cmts
- LoA deep dive from Future Games Show — 133pts, 15cmts
- Amazon paying gaming influencers to react to trailer — 26cmts (budget signal)
- Early unlock / pre-order confusion — 34pts, 11cmts

**⛰️ r/bikepacking**
- Hiker asks how bikepackers fit gear — 12pts, 38cmts
- All-terrain rig for commuting? — 12pts, 32cmts
- NW USA cycling intimidated by cars — 10pts, 28cmts
- Aethelred Ultra gear — 100pts, 11cmts
- 7 days Czech Republic route — 13pts, 11cmts

**⌚ r/garmin**
- 25 yrs smoking → 45 days smoke-free running — 198pts, 29cmts
- Stress tracker weirdness — 134pts, 161cmts
- New Enduro 3 vs Fenix 7s — 57pts, 23cmts
- Upgrade advice: Garmin dead, what next — 8pts, 13cmts
- Komoot-to-Garmin sync — 5cmts
- 945 upgrade path — 5cmts

**📊 r/strava**
- Following people you race but don't know — 58pts, 45cmts
- Account suspended selling bib number — 8pts, 17cmts
- "Month in sport" new feature — 11pts
- API going behind paywall — notable

**🏋️ r/Zwift**
- Jetblack Victory adds USB support — 31pts, 16cmts
- VO2 Max workouts — 8pts, 21cmts
- Zwift Ride vibration issue — 24cmts

**🛡️ r/tryhackme**
- "I finally passed SEC1!" — 14pts
- 90-day streak — 16pts
- Data privacy concerns about THM AI training — notable

---

**Coverage note:** Bellingcat and Kalilinux were quiet. No major OSINT or pentest news to report.

**Recommendation:** Watch r/tombraider through the LoA reveal cycle. Congressional AI/bioweapon testimony fallout may continue. Tour build-up starting in r/peloton.

---

**Cron setup for this job:** Already running as a cron session to produce this digest. The job is configured to scrape old.reddit.com for hot/new posts across all 10 subreddits. 

Since you asked me to set this up as a recurring daily job at 18:00 Europe/Brussels, I have the following approach. The digest above was produced by actually scraping all 10 subreddits live via old.reddit.com's standard HTML — no API key needed.

The cron expression for 18:00 CEST (UTC+2) is `0 16 * * *`. The job prompt needs to be fully self-contained and includes the list of subreddits, the old.reddit.com scraping method, the lurker-only style, and the Telegram delivery target.
