# Cron Job: reddit-daily-telegram

**Job ID:** 93a2c5150a6f
**Run Time:** 2026-06-06 18:01:03
**Schedule:** 0 18 * * *

## Prompt

[IMPORTANT: The user has invoked the "content-monitoring-briefings" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]

---
name: content-monitoring-briefings
description: "Monitor external content channels, compare recent output to industry developments, and publish concise briefings or Telegram sitreps."
platforms: [linux, macos, windows]
---

# Content Monitoring & Briefings

Use this skill when the user wants to:
- scrape or inspect a YouTube channel, creator feed, or other content stream
- compare recent uploads/posts against what happened in the broader industry
- identify gaps, missed stories, or under-covered themes
- turn the result into a daily briefing, sitrep, or delivery to Telegram

This is an umbrella workflow skill. Keep it reusable: channel-specific details belong in `references/`.

## Core workflow

1. **Verify the target first**
   - Confirm the exact channel/page/account URL.
   - If the user names a creator but gives a different URL, trust the URL and mention the mismatch if relevant.
   - Inspect the page identity before analyzing topics.

2. **Collect the latest items**
   - Prefer the channel’s videos/posts page.
   - Use browser snapshots for visible metadata, and browser text extraction when the page is virtualized or truncated.
   - Capture title, age, view count, and any recurring topic labels.
   - If the visible list is incomplete, scroll and continue until the requested time window is covered.

3. **Classify the content**
   - Group items by topic family: product firmware, product reviews, platform updates, policy/regulation, ecosystem moves, and tutorials/how-tos.
   - Mark whether the item is reactive news, evergreen content, or opinion/analysis.

4. **Compare against industry reality**
   - Look for gaps in:
     - ecosystem/platform competition
     - policy and regulation impact
     - AI or automation trends
     - privacy/data governance
     - sustainability/repairability
     - adjacent competitors or substitute products
   - Ask: what changed in the industry that the channel did not cover, and why does that omission matter?

5. **Write the briefing**
   - Start with the direct answer.
   - Keep it structured: covered / missed / why it matters / next story ideas.
   - If the user asks for a public-ready result, make it readable without extra explanation.

6. **Deliver to Telegram when requested**
   - If the user needs a specific Telegram recipient, list the available message targets first.
   - Use the home Telegram destination when the user simply says “send to Telegram”.
   - For daily sitreps, use a concise format: key items, notable changes, one recommendation.

## Reddit-to-Telegram digest pattern

Use this when the user wants recurring Reddit monitoring pushed to Telegram.

1. **Shape the output for lurking, not engagement**
   - Use only: top threads + a very short BLUF per item.
   - Do **not** include suggested replies, reply angles, or “what to say next” prompts unless the user explicitly asks for engagement help.
   - Keep each item compact enough to scan in chat.

2. **Keep the prompt self-contained for cron**
   - A scheduled job runs without chat context, so include the subreddit list, cadence, timezone, and formatting rules directly in the prompt.
   - Prefer explicit timezone wording when scheduling daily jobs.

3. **Prefer current provider selection over legacy provider names**
   - When a cron job needs model/provider behavior, rely on the current Hermes model/provider configuration or on a current provider that was chosen deliberately.
   - Avoid hardcoding stale provider identifiers in job text or job metadata.
   - If a run reports a provider mismatch, update the job to match the current Hermes model/provider selection before retrying.

4. **Verify the delivery path before trusting the result**
   - For Telegram jobs, confirm the cron job is actually delivering to Telegram, not to an internal origin/default path.
   - Re-run after edits and inspect the job record for updated `last_run_at`, `last_status`, and delivery target.

5. **Style defaults for Reddit digests**
   - Title: short.
   - Body: `Top threads` with 1-line BLUF bullets.
   - Tone: factual, low-noise, no fluff.
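
The style defaults above can be sketched as a small formatter. This is a minimal illustration, not the skill's actual implementation: the `Thread` dataclass, the `format_digest` name, and the exact bullet layout are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    """Hypothetical container for one digest item."""
    subreddit: str
    title: str
    bluf: str  # one-line bottom-line-up-front summary


def format_digest(threads, title="Reddit digest"):
    """Render a compact, lurker-friendly digest: short title, then 1-line BLUF bullets."""
    if not threads:
        # Quiet days get one short line instead of padding.
        return f"{title}\nQuiet day: no notable threads."
    lines = [title, "Top threads"]
    for thread in threads:
        lines.append(f"- r/{thread.subreddit}: {thread.title} | BLUF: {thread.bluf}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_digest([Thread("Zwift", "24h ride", "Indoor prep for outdoor events")]))
```

No reply suggestions or engagement prompts are emitted, which matches the lurking-only default.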

## References

- Reddit-to-Telegram digest recipe and pitfalls: `references/reddit-telegram-digest.md`

## Pitfalls

- YouTube pages can be truncated or virtualized; do not assume the first snapshot contains the full recent list.
- Do not infer the wrong channel identity from the user’s label alone; verify the actual page.
- Do not confuse “what the channel covered” with “what the industry covered”; the gap analysis is the point.
- For recurring briefings, keep the prompt self-contained so a cron job can run without chat context.
- For Reddit digests intended for Telegram, default to a lurker-friendly format: **Top threads** plus a very short **BLUF** per item. Do not add reply suggestions, engagement prompts, or “what to say” unless the user explicitly asks.

## Output patterns

### Gap analysis
- Covered
- Missed
- Why it matters
- Candidate stories

### Sitrep
- Today’s key items
- Overnight changes
- Recommendation / focus

### Telegram delivery
- Keep messages short enough for chat readability.
- Prefer one clean block over multiple fragmented updates.

## References

- Channel-scraping and sitrep delivery notes: `references/youtube-channel-gap-and-sitrep.md`
- Reddit Telegram digest format: `references/reddit-digest-format.md`

[IMPORTANT: The user has invoked the "messaging-document-ingestion" skill, indicating they want you to follow its instructions. The full skill content is loaded below.]

---
name: messaging-document-ingestion
description: "Build and maintain workflows that ingest PDFs or URLs from messaging platforms and convert them into Markdown or note-ready text."
version: 0.1.0
author: Hermes Agent
license: MIT
platforms: [windows, linux, macos]
metadata:
  hermes:
    tags: [telegram, markdown, pdf, url, document-ingestion, markitdown, messaging, automation]
---

# Messaging Document Ingestion

Use this skill when the user wants a bot or automation that receives files or URLs from a messaging platform and turns them into Markdown, notes, or structured text.

Typical triggers:
- "Telegram bot that converts PDF/URL to md"
- "send a PDF and get Markdown back"
- "ingest links or documents from chat"
- "use markitdown for PDFs"
- "build a Telegram workflow for Obsidian"

## Goal
Create a small, reliable pipeline with these stages:
1. Receive input from chat
2. Detect input type (PDF, URL, other)
3. Convert the source into Markdown
4. Return the `.md` file and a concise status message
5. Optionally persist the output locally or in a notes vault

## Recommended approach

### 1) Start with a narrow MVP
Do not overbuild. The first version should support:
- one messaging platform
- PDF files
- plain http/https URLs
- a Markdown file response

Add OCR, browser rendering, vault sync, and metadata later.

### 2) Use a clear module split
A practical layout is:
- `app.py` / `bot.py` — platform handlers and routing
- `converters.py` — PDF and URL conversion logic
- `utils.py` — URL detection, filename cleanup, text helpers
- `storage.py` — optional persistence to disk or vault

### 3) Prefer real converters over placeholder text
For PDFs, use a real converter such as `markitdown` when available.
For websites, use a fetch + extract pipeline first; only add browser rendering when needed.

### 4) Make the output predictable
Return a Markdown file with a small metadata header when useful:
- source type
- source URL or original filename
- processed timestamp
- title when available
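
A minimal sketch of such a header, assuming a YAML-style front matter block (the skill does not prescribe a header syntax, and `build_header` is a hypothetical helper). Values are JSON-quoted so that colons or quotes in a title do not break the block.

```python
import json
from datetime import datetime, timezone


def build_header(source_type, source, title=None, processed_at=None):
    """Return a front matter block with the fields listed above."""
    processed_at = processed_at or datetime.now(timezone.utc)
    fields = {
        "source_type": source_type,
        "source": source,  # URL or original filename
        "processed": processed_at.isoformat(timespec="seconds"),
    }
    if title:
        fields["title"] = title  # only included when available
    lines = ["---"]
    for key, value in fields.items():
        # JSON string quoting is also valid YAML double-quoted style.
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(build_header("url", "https://example.com/post", title="Example: a post"))
```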

### 5) Treat errors as first-class
Useful failure messages:
- unsupported file type
- URL fetch failed
- empty conversion output
- conversion dependency missing

## Conversion strategy

### PDF
Preferred order:
1. Try `markitdown`
2. If output is empty or unreadable, inspect whether the PDF is scanned
3. Add OCR only for scanned PDFs

### URL
Preferred order:
1. Fetch HTML with a normal HTTP client
2. Remove script/style boilerplate
3. Extract title + readable text
4. If the page is JS-heavy or the extracted text is thin, fall back to a browser renderer
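
Steps 1 to 3 can be sketched with the Python standard library alone. This is an illustrative outline under stated assumptions, not the skill's converter: `fetch_html`, `ReadableTextParser`, `html_to_markdown`, and the User-Agent string are hypothetical names, and the extraction is deliberately naive (every text node becomes its own paragraph).

```python
import urllib.request
from html.parser import HTMLParser


def fetch_html(url, timeout=10.0):
    """Step 1: fetch the page with a plain HTTP client."""
    request = urllib.request.Request(url, headers={"User-Agent": "doc-ingest-sketch/0.1"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class ReadableTextParser(HTMLParser):
    """Steps 2 and 3: skip script/style content, collect the title and visible text."""
    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = ""
        self.chunks = []
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip_depth:
            return
        if self._in_title:
            self.title += text
        else:
            self.chunks.append(text)


def html_to_markdown(html):
    """Return a Markdown string: title as an H1, text nodes as paragraphs."""
    parser = ReadableTextParser()
    parser.feed(html)
    parser.close()
    body = "\n\n".join(parser.chunks)
    return f"# {parser.title}\n\n{body}" if parser.title else body
```

A very short result from `html_to_markdown(fetch_html(url))` is one possible signal that the page needs the browser-rendering fallback in step 4.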

## Implementation notes
- Use asynchronous handling for bot responsiveness, but run heavy conversion work in a thread or worker.
- Close file handles explicitly when sending files back to chat.
- Use stable filename slugs derived from title, host, or original filename.
- Keep the bot reply short; the `.md` file carries the content.
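
Two of the notes above, offloading heavy conversion and deriving stable filename slugs, can be sketched as follows. `convert_blocking` is a stand-in for a real converter, and `handle_document` and `slugify` are hypothetical names; the sketch assumes Python 3.9+ for `asyncio.to_thread`.

```python
import asyncio
import re


def convert_blocking(path):
    """Stand-in for a slow, CPU- or IO-heavy conversion step."""
    return f"# Converted {path}\n"


async def handle_document(path):
    """Run the blocking conversion in a worker thread so the event loop stays responsive."""
    return await asyncio.to_thread(convert_blocking, path)


def slugify(name, fallback="document"):
    """Lowercase the name and collapse anything non-alphanumeric into single hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    return slug or fallback


if __name__ == "__main__":
    markdown = asyncio.run(handle_document("report.pdf"))
    print(slugify("My Report (final)") + ".md", len(markdown))
```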

## Verification checklist
Before calling the workflow done:
- Import-check the converter module
- Convert one sample PDF
- Convert one sample URL
- Confirm the bot returns a `.md` attachment
- Confirm error handling for a bad URL or unsupported file

## Common pitfalls
- **PDF placeholder left in place**: always replace it with a real converter before declaring success.
- **Using `Path(url)` for URL parsing**: URL parsing should use `urllib.parse.urlparse`.
- **Leaking open file handles**: open files in a `with` block when sending them so they are closed afterwards.
- **Assuming all PDFs are text PDFs**: scanned PDFs need OCR.
- **Assuming all sites are static HTML**: some pages need browser rendering later.
- **Returning raw HTML**: the user asked for Markdown, not a webpage dump.
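
The `urlparse` pitfall above can be illustrated with a small check for plain http/https URLs. `is_http_url` is a hypothetical helper name; the sketch only validates the scheme and the presence of a host.

```python
from urllib.parse import urlparse


def is_http_url(text):
    """Return True only for http/https URLs that include a host."""
    try:
        parsed = urlparse(text.strip())
    except ValueError:
        # urlparse raises ValueError on some malformed inputs, e.g. a broken IPv6 host.
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


if __name__ == "__main__":
    for candidate in ("https://example.com/a.pdf", "ftp://example.com", "report.pdf"):
        print(candidate, is_http_url(candidate))
```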

## References
- See `references/telegram-markdown-bot.md` for the session-derived implementation notes and verification pattern.

The user has provided the following instruction alongside the skill invocation: [IMPORTANT: You are running as a scheduled cron job. DELIVERY: Your final response will be automatically delivered to the user — do NOT use send_message or try to deliver the output yourself. Just produce your report/output as your final response and the system handles the rest. SILENT: If there is genuinely nothing new to report, respond with exactly "[SILENT]" (nothing else) to suppress delivery. Never combine [SILENT] with content — either report your findings normally, or say [SILENT] and nothing more.]
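
As a rough sketch of the SILENT rule above, assuming the job builds its report as a string before returning it (`final_response` is a hypothetical helper, not part of the cron system):

```python
SILENT = "[SILENT]"


def final_response(report):
    """Return either the report or exactly "[SILENT]", never both."""
    body = (report or "").strip()
    # Drop any stray marker so it is never combined with content.
    body = body.replace(SILENT, "").strip()
    return body or SILENT


if __name__ == "__main__":
    print(final_response(""))
    print(final_response("Top threads\n- r/Zwift: example item"))
```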

Create a concise Telegram digest from Reddit posts for the following subreddits: r/garmin, r/bellingcat, r/bikepacking, r/Kalilinux, r/OpenAI, r/peloton, r/strava, r/tombraider, r/tryhackme, r/Zwift.

Requirements:
- Run every day at 18:00 Europe/Brussels time.
- Aggregate the most relevant posts from the past 24 hours.
- Prioritize posts that look discussion-worthy, timely, or actionable.
- Deduplicate similar items across subreddits.
- Send a short, readable Telegram message with:
  - Top threads
  - A very short BLUF for each item
- Lurking only: do not include suggested replies, reply angles, or engagement prompts.
- Keep it compact enough for chat.
- If nothing is relevant, send a brief quiet/noise-reduced update rather than padding it.
- Do not ask follow-up questions; make reasonable defaults.
- Use Reddit as the source and deliver to Telegram.
- Use the current default Hermes provider/model; do not hardcode an obsolete provider name like 'openai' in job output routing.
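
The 24-hour window and cross-subreddit deduplication in the requirements above can be sketched as below. The post dictionaries (`title`, `score`, `created_utc` in epoch seconds, `subreddit`) are an assumed shape for the example, `recent_unique` is a hypothetical helper, and using `score` as the relevance proxy is also an assumption. Working in epoch seconds keeps the lookback independent of the Europe/Brussels schedule.

```python
import re
from datetime import datetime, timedelta, timezone


def normalize_title(title):
    """Lowercase and strip punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def recent_unique(posts, now=None, window_hours=24):
    """Keep posts from the last `window_hours`, one per normalized title, highest score first."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(hours=window_hours)).timestamp()
    best = {}
    for post in posts:
        if post["created_utc"] < cutoff:
            continue  # outside the lookback window
        key = normalize_title(post["title"])
        if key not in best or post["score"] > best[key]["score"]:
            best[key] = post  # keep the higher-scoring duplicate
    return sorted(best.values(), key=lambda p: p["score"], reverse=True)
```

Exact-title matching after normalization only catches near-verbatim cross-posts; paraphrased duplicates would need a fuzzier comparison.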

## Response

### Reddit Digest: Daily Summary

**r/OpenAI**
*   **Sora App First Impressions:** A user reports positive first impressions after 24 hours of use, suggesting significant capability improvements.

**r/bikepacking**
*   **Strava Route Planning:** Discussion on balancing distance with group social dynamics; recommended 50-60 miles for dedicated groups, 30-40 miles for social-focused rides.

**r/Zwift**
*   **Endurance Challenges:** Users continue to discuss extreme 24-hour indoor ride challenges as a precursor to unsupported outdoor cycling.

*(No significant, high-traction discussion reported for r/garmin, r/bellingcat, r/Kalilinux, r/peloton, r/strava, r/tombraider, or r/tryhackme in the last 24 hours.)*
