# Job Scraper
Monitor job openings from 14 privacy-focused and open-source companies (1Password, DuckDuckGo, GitLab, and others) across the Greenhouse, Lever, and Ashby job boards. A daily run stores listings in SQLite to detect changes, filters them by engineering job titles and location preferences, and generates a static HTML dashboard with search and filtering.
## Quick Start (Local)
```bash
# Create venv and install deps
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Run once
python main.py
# View dashboard (use xdg-open on Linux)
open data/dashboard.html
```
## Deploy to Debian Server
### 1. Install Docker
```bash
# Install Docker
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER
# Log out and back in for the group change to take effect
# Install the Docker Compose plugin (the install script may already include it)
sudo apt install docker-compose-plugin
```
### 2. Clone/Copy the project
```bash
# Copy project to server
scp -r job-scraper user@your-server:~/
# Or clone from git if you pushed it
git clone <your-repo> ~/job-scraper
```
### 3. Run with Docker Compose
```bash
cd ~/job-scraper
# Run scraper once to populate data
docker compose run --rm scraper
# Start dashboard + scheduled scraper
docker compose up -d scraper-scheduled dashboard
# View logs
docker compose logs -f
```
### 4. Access the dashboard
Open `http://your-server:8080` in your browser.
### Optional: Use a reverse proxy
If you want HTTPS or a custom domain, add nginx/caddy in front:
```bash
# Example with Caddy (auto HTTPS)
sudo apt install caddy
echo "jobs.yourdomain.com {
    reverse_proxy localhost:8080
}" | sudo tee /etc/caddy/Caddyfile
sudo systemctl reload caddy
```
## Commands
```bash
# Run scraper once
docker compose run --rm scraper
# Run scraper on a schedule (daily at 9 AM)
docker compose up -d scraper-scheduled
# Start web dashboard
docker compose up -d dashboard
# View all jobs
docker compose run --rm scraper python main.py --list
# Stop everything
docker compose down
# View logs
docker compose logs -f scraper-scheduled
```
## Configuration
Edit `config.yaml` to:
- Add/remove companies
- Change location filters
- Configure email/Slack notifications
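The exact schema is defined by the repo's `config.yaml`; as a rough illustration only, a configuration covering those three areas might look something like this (all key names below are assumptions, not the project's actual schema):

```yaml
# Hypothetical sketch -- key names are illustrative, check config.yaml for the real schema
companies:
  - name: GitLab
    platform: greenhouse      # greenhouse | lever | ashby
    board: gitlab             # board/company slug used by the platform API
  - name: 1Password
    platform: lever
    board: 1password

filters:
  titles: ["engineer", "developer", "sre"]
  locations: ["remote", "canada", "berlin"]

notifications:
  email: you@example.com      # optional
  slack_webhook: ""           # optional
```

After editing, re-run the scraper (e.g. `docker compose run --rm scraper`) so the new settings take effect.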
## Dashboard Features
- Dark theme, monospace font
- Filter jobs by typing (press `/` to focus, `Esc` to clear)
- Color-coded tags: `remote`, `canada`, `berlin`
- Jump-to-company links
- Updates automatically when scraper runs
## Project Structure
```
job-scraper/
├── main.py             # CLI entry point
├── db.py               # SQLite database
├── dashboard.py        # HTML generator
├── notify.py           # Notifications
├── scrapers/           # Platform scrapers
│   ├── base.py         # Base class
│   ├── greenhouse.py   # Greenhouse API
│   ├── lever.py        # Lever API
│   └── ashby.py        # Ashby API
├── config.yaml         # Company list & settings
├── Dockerfile
├── docker-compose.yaml
└── data/
    ├── jobs.db         # SQLite database
    └── dashboard.html  # Generated dashboard
```
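Each module under `scrapers/` wraps one job-board API behind the shared base class. The actual interface lives in `scrapers/base.py`; the snippet below is only a standalone sketch of the kind of fetch the Greenhouse scraper performs, using Greenhouse's public job-board endpoint (the function name and the returned dict shape are assumptions, not the repo's API):

```python
import requests


def fetch_greenhouse_jobs(board_token: str) -> list[dict]:
    """Illustrative sketch only -- not the interface defined in scrapers/base.py.

    Greenhouse exposes a public job-board API; a single GET returns every open
    posting for the company identified by its board token (e.g. "gitlab").
    """
    url = f"https://boards-api.greenhouse.io/v1/boards/{board_token}/jobs"
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return [
        {
            "title": job["title"],
            "location": job.get("location", {}).get("name", ""),
            "url": job["absolute_url"],
        }
        for job in resp.json().get("jobs", [])
    ]


if __name__ == "__main__":
    for job in fetch_greenhouse_jobs("gitlab"):
        print(f'{job["title"]} ({job["location"]})')
```

The real scrapers additionally normalize the results, write them to `data/jobs.db` via `db.py`, and let `dashboard.py` render the diff into `data/dashboard.html`.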