Job Scraper

Monitor job openings at privacy-focused and open-source companies. The scraper runs daily and flags postings that have appeared or disappeared since the previous run.
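
The change detection amounts to a set diff against the previous run's snapshot in SQLite. A minimal sketch of the idea, assuming a jobs table keyed by a per-posting ID (the real schema lives in db.py and may differ):

import sqlite3

def diff_jobs(conn: sqlite3.Connection, scraped_ids: set[str]) -> tuple[set[str], set[str]]:
    # Compare the IDs seen in this scrape against what the database holds.
    stored = {row[0] for row in conn.execute("SELECT id FROM jobs")}
    new_ids = scraped_ids - stored      # postings that appeared
    gone_ids = stored - scraped_ids     # postings that disappeared
    return new_ids, gone_ids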

Quick Start (Local)

# Create venv and install deps
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Run once
python main.py

# View dashboard
open data/dashboard.html  # macOS; use xdg-open on Linux

Deploy to Debian Server

1. Install Docker

# Install Docker
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Log out and back in so the group change takes effect

# Install the Compose plugin (the convenience script usually includes it already)
sudo apt install docker-compose-plugin

2. Clone/Copy the project

# Copy project to server
scp -r job-scraper user@your-server:~/

# Or clone from git if you pushed it
git clone <your-repo> ~/job-scraper

3. Run with Docker Compose

cd ~/job-scraper

# Run scraper once to populate data
docker compose run --rm scraper

# Start dashboard + scheduled scraper
docker compose up -d scraper-scheduled dashboard

# View logs
docker compose logs -f

4. Access the dashboard

Open http://your-server:8080 in your browser.

Optional: Use a reverse proxy

If you want HTTPS or a custom domain, put nginx or Caddy in front:

# Example with Caddy (auto HTTPS)
sudo apt install caddy
echo "jobs.yourdomain.com {
    reverse_proxy localhost:8080
}" | sudo tee /etc/caddy/Caddyfile
sudo systemctl reload caddy

Commands

# Run scraper once
docker compose run --rm scraper

# Run scraper on a schedule (daily at 9 AM)
docker compose up -d scraper-scheduled

# Start web dashboard
docker compose up -d dashboard

# View all jobs
docker compose run --rm scraper python main.py --list

# Stop everything
docker compose down

# View logs
docker compose logs -f scraper-scheduled
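
The commands above assume three services defined in docker-compose.yaml. A hedged sketch of how they might be wired together; the repo's actual compose file and nginx.conf are authoritative, and the --schedule flag below is hypothetical:

services:
  scraper:
    build: .
    volumes:
      - ./data:/app/data
  scraper-scheduled:
    build: .
    command: python main.py --schedule   # hypothetical flag for the daily 9 AM loop
    restart: unless-stopped
    volumes:
      - ./data:/app/data
  dashboard:
    image: nginx:alpine
    ports:
      - "8080:80"
    volumes:
      - ./data:/usr/share/nginx/html:ro
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro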

Configuration

Edit config.yaml to:

  • Add/remove companies
  • Change location filters
  • Configure email/Slack notifications
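
For orientation, a sketch of what config.yaml might contain. The key names below are illustrative, so treat the file shipped with the repo as the real schema:

companies:
  - name: GitLab
    platform: greenhouse      # one of: greenhouse, lever, ashby
    slug: gitlab              # board token / company slug used by the platform API
filters:
  titles: [engineer, developer, sre]
  locations: [remote, canada, berlin]
notifications:                # hypothetical keys
  email: you@example.com
  slack_webhook: https://hooks.slack.com/services/...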

Dashboard Features

  • Dark theme, monospace font
  • Filter jobs by typing (press / to focus, Esc to clear)
  • Color-coded tags: remote, canada, berlin
  • Jump-to-company links
  • Updates automatically whenever the scraper runs

Project Structure

job-scraper/
├── main.py           # CLI entry point
├── db.py             # SQLite database
├── dashboard.py      # HTML generator
├── notify.py         # Notifications
├── scrapers/         # Platform scrapers
│   ├── base.py       # Base class
│   ├── greenhouse.py # Greenhouse API
│   ├── lever.py      # Lever API
│   └── ashby.py      # Ashby API
├── config.yaml       # Company list & settings
├── Dockerfile
├── docker-compose.yaml
└── data/
    ├── jobs.db       # SQLite database
    └── dashboard.html # Generated dashboard
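
Each platform scraper boils down to one JSON API call per company. Greenhouse's public board API is shown here; Lever and Ashby expose similar endpoints. A sketch of roughly what scrapers/greenhouse.py might do, where the function name and returned fields are illustrative and requests is assumed to be in requirements.txt:

import requests

GREENHOUSE_JOBS = "https://boards-api.greenhouse.io/v1/boards/{token}/jobs"

def fetch_greenhouse(board_token: str) -> list[dict]:
    # One GET per company; the board token is the company's slug on Greenhouse.
    resp = requests.get(GREENHOUSE_JOBS.format(token=board_token), timeout=30)
    resp.raise_for_status()
    return [
        {
            "id": str(job["id"]),
            "title": job["title"],
            "location": (job.get("location") or {}).get("name", ""),
            "url": job["absolute_url"],
        }
        for job in resp.json()["jobs"]
    ]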