CV Matcher Example — Offline Edition

AI-powered CV / Job Description matching pipeline built with kdeps. Runs entirely offline using Ollama for both LLM inference and text embeddings — no cloud API keys required.

For each CV + JD pair the workflow:

1. Parses both documents (PDF, DOCX, JSON, TXT, or URL via scraper)
2. Extracts structured data and skills with llama3.2 (local Ollama LLM)
3. Indexes skill embeddings into a SQLite vector DB with nomic-embed-text (local Ollama embedding model, cached)
4. Scores the match by skill category (software_dev, platform, cloud, data, ml_ai, security, general)
5. If the overall score meets the match threshold (overall_score >= 0.65):
   - Generates a motivation letter and a tailored CV
   - Renders a full match-report PDF
   - Uploads the PDF to S3 (presigned URL) and/or Google Drive (optional)
   - Appends a row to an existing Google Sheet via a Python inline script (optional)
   - Emails the distribution list with an HTML summary and the PDF attachment (optional)
6. Returns a structured JSON result via apiResponse

Also serves an interactive web UI at GET / — open it in your browser to submit CVs and JDs without writing any curl commands.

Prerequisites

Tool / Service	Required for
Ollama + `llama3.2`	LLM extraction, scoring, letter generation
Ollama + `nomic-embed-text`	Offline skill embeddings
`wkhtmltopdf`	PDF generation (`generate-report-pdf` step)
SMTP server	Email distribution (`send-email` step, optional)
Google Cloud service account	Google Sheets append (`append-sheet` step, optional)
S3 presigned URL	S3 upload (`upload-s3` step, optional)
Google OAuth2 token	GDrive upload (`upload-gdrive` step, optional)

Install Ollama and pull models

# Install Ollama: https://ollama.com/download
ollama pull llama3.2
ollama pull nomic-embed-text
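
Before starting the workflow it can help to confirm that both models are actually installed. The sketch below is not part of kdeps; it assumes Ollama's default local endpoint (`http://localhost:11434`) and its `GET /api/tags` model listing, and the helper names `missing_models` and `fetch_tags` are hypothetical.

```python
import json
import urllib.request

REQUIRED_MODELS = ("llama3.2", "nomic-embed-text")


def missing_models(tags_payload, required=REQUIRED_MODELS):
    """Return the required models that are absent from an /api/tags payload."""
    # Ollama reports names with a tag suffix such as "llama3.2:latest",
    # so compare on the part before the colon (any tag counts as installed).
    installed = {m["name"].split(":", 1)[0] for m in tags_payload.get("models", [])}
    return [name for name in required if name not in installed]


def fetch_tags(base_url="http://localhost:11434"):
    """Fetch the installed-model list from a running Ollama server (assumed default URL)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With Ollama running, `missing_models(fetch_tags())` should return an empty list once both `ollama pull` commands have completed.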

Install wkhtmltopdf

# macOS
brew install wkhtmltopdf

# Debian / Ubuntu
sudo apt install wkhtmltopdf

Usage

Start the kdeps server:

kdeps run examples/cv-matcher/workflow.yaml

The server listens on port 16399.

Web UI (browser)

Open http://localhost:16399/ in your browser.

Fill in the CV and JD file paths and click Analyze Match. The results panel renders inline and shows the match score, category breakdown, matched-skills table, missing skills, and PDF output paths.

API (curl)

curl -X POST http://localhost:16399/match \
  -H "Content-Type: application/json" \
  -d '{
    "cv_path": "/data/candidates/jane-smith.pdf",
    "jd_path": "/data/jobs/senior-backend-engineer.pdf"
  }'
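
The same call can be made from a script. This is a minimal sketch using only the Python standard library; the function names and the 600-second timeout are assumptions (local LLM inference can be slow), while the URL, method, and header mirror the curl command above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:16399"  # port taken from the Usage section


def build_match_request(cv_path, jd_path, **optional_fields):
    """Build (but do not send) the POST /match request."""
    body = {"cv_path": cv_path, "jd_path": jd_path, **optional_fields}
    return urllib.request.Request(
        f"{BASE_URL}/match",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_match(cv_path, jd_path, timeout=600, **optional_fields):
    """Send the request and return the decoded JSON result."""
    req = build_match_request(cv_path, jd_path, **optional_fields)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

For example, `post_match("/data/candidates/jane-smith.pdf", "/data/jobs/senior-backend-engineer.pdf")` sends the same body as the curl command.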

Request body

{
  "cv_path": "/path/to/candidate.pdf",
  "cv_type": "pdf",
  "jd_path": "/path/to/job-description.pdf",
  "jd_type": "pdf",
  "distribution_list": ["hr@example.com", "hiring-manager@example.com"],
  "s3_presigned_url": "https://bucket.s3.amazonaws.com/upload?...",
  "gdrive_token": "ya29...",
  "gdrive_folder_id": "1AbCdEf...",
  "sheets_id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms",
  "sheets_tab": "Matches"
}

Field	Type	Required	Description
`cv_path`	string	yes	Local file path or URL to the CV
`cv_type`	string	no	`pdf`, `docx`, `json`, `txt`, `url` (default: `pdf`)
`jd_path`	string	yes	Local file path or URL to the job description
`jd_type`	string	no	Same options as `cv_type`
`distribution_list`	[]string	no	Email recipients for the match summary
`s3_presigned_url`	string	no	S3 PUT presigned URL for the match-report PDF
`gdrive_token`	string	no	Google OAuth2 bearer token for Drive upload
`gdrive_folder_id`	string	no	Google Drive folder ID to upload into
`sheets_id`	string	no	Google Spreadsheet ID to append a match row
`sheets_tab`	string	no	Sheet/tab name (default: `Matches`)
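
A client may want to check a request body against the table above before sending it. The following is a sketch of such a check, not the validation kdeps itself performs; `validate_match_body` is a hypothetical helper, and it only covers the required fields, the type options, and the shape of `distribution_list`.

```python
ALLOWED_TYPES = {"pdf", "docx", "json", "txt", "url"}
REQUIRED_FIELDS = ("cv_path", "jd_path")


def validate_match_body(body):
    """Return a list of human-readable problems; an empty list means the body looks OK."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not isinstance(body.get(field), str) or not body[field]:
            problems.append(f"{field} is required and must be a non-empty string")
    for field in ("cv_type", "jd_type"):
        if field in body and body[field] not in ALLOWED_TYPES:
            problems.append(f"{field} must be one of {sorted(ALLOWED_TYPES)}")
    recipients = body.get("distribution_list", [])
    if not (isinstance(recipients, list) and all(isinstance(r, str) for r in recipients)):
        problems.append("distribution_list must be a list of strings")
    return problems
```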

Response

{
  "candidate": {
    "name": "Jane Smith",
    "email": "jane@example.com",
    "phone": "+31612345678"
  },
  "position": {
    "title": "Senior Backend Engineer",
    "company": "TechCorp",
    "contact_name": "Jane Recruiter",
    "contact_email": "jane@techcorp.com"
  },
  "match": {
    "overall_score": 0.87,
    "is_match": true,
    "category_scores": {
      "software_dev": 0.92,
      "platform": 0.85,
      "cloud": 0.80,
      "data": 0.75,
      "ml_ai": 0.0,
      "security": 0.0,
      "general": 0.60
    },
    "matched_skills": [
      {"jd_skill": "Python", "cv_skill": "Python", "similarity": 1.0, "cv_duration_months": 38}
    ],
    "missing_skills": ["Kubernetes"],
    "summary": "Strong match — 8 of 10 required skills present, 3+ years Python."
  },
  "outputs": {
    "report_pdf": "/tmp/kdeps-cv-matcher/report_Jane_Smith_TechCorp.pdf",
    "tailored_cv": "/tmp/kdeps-cv-matcher/cv_Jane_Smith_TechCorp.pdf",
    "file_link": "",
    "sheet_updated": false,
    "email_sent": false
  }
}
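
To illustrate how a caller might consume this structure, the sketch below condenses a response into one line. `summarize_match` is a hypothetical helper; it assumes every field shown in the sample response is present and that `category_scores` is non-empty.

```python
def summarize_match(result):
    """Condense a /match response into a one-line summary string."""
    match = result["match"]
    # Rank categories from strongest to weakest score.
    ranked = sorted(match["category_scores"].items(), key=lambda kv: kv[1], reverse=True)
    best_name, best_score = ranked[0]
    verdict = "MATCH" if match["is_match"] else "NO MATCH"
    missing = ", ".join(match["missing_skills"]) or "none"
    return (
        f"{result['candidate']['name']} -> {result['position']['title']}: "
        f"{verdict} ({match['overall_score']:.0%}), "
        f"strongest category {best_name} ({best_score:.0%}), missing: {missing}"
    )
```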

Environment Variables

Variable	Description
`SMTP_HOST`	SMTP server hostname (default: `smtp.gmail.com`)
`SMTP_USER`	SMTP authentication username
`SMTP_PASS`	SMTP authentication password
`SMTP_FROM`	Sender address (defaults to `SMTP_USER`)
`GOOGLE_SA_JSON`	Path to Google Cloud service-account JSON key file
`GOOGLE_SPREADSHEET_ID`	Fallback spreadsheet ID if not in request body
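
The defaults in the table can be expressed as a small lookup. This sketch only illustrates the documented fallback behaviour (`SMTP_HOST` defaulting to `smtp.gmail.com`, `SMTP_FROM` defaulting to `SMTP_USER`); `load_smtp_settings` and the returned key names are hypothetical, and the actual resolution inside the workflow may differ.

```python
import os


def load_smtp_settings(env=None):
    """Resolve SMTP settings from environment variables, applying the documented defaults."""
    env = os.environ if env is None else env
    user = env.get("SMTP_USER", "")
    return {
        "host": env.get("SMTP_HOST", "smtp.gmail.com"),
        "user": user,
        "password": env.get("SMTP_PASS", ""),
        # SMTP_FROM falls back to SMTP_USER when unset or empty.
        "sender": env.get("SMTP_FROM") or user,
    }
```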

Pipeline Steps

Step	Resource type	Description
`scrape-cv`	`scraper`	Download / read the CV file
`scrape-jd`	`scraper`	Download / read the JD file
`extract-cv`	`chat` (ollama/llama3.2)	Extract name, work history, skills from CV
`extract-jd`	`chat` (ollama/llama3.2)	Extract required / preferred / nice-to-have skills from JD
`embed-cv-skills`	`embedding` (ollama/nomic-embed-text)	Index CV skills into SQLite vector DB
`embed-jd-skills`	`embedding` (ollama/nomic-embed-text)	Index JD skills into SQLite vector DB
`compute-match`	`chat` (ollama/llama3.2)	Score the CV against the JD by skill category
`generate-letter`	`chat` (ollama/llama3.2)	Write a personalised motivation letter (skipped when no match)
`generate-tailored-cv`	`chat` (ollama/llama3.2)	Produce a CV tailored to the job description (skipped when no match)
`generate-report-pdf`	`pdf`	Render the full match report as a PDF
`upload-s3`	`httpClient`	PUT the PDF to an S3 presigned URL (optional)
`upload-gdrive`	`httpClient`	POST the PDF to Google Drive REST API (optional)
`append-sheet`	`python`	Append a result row to a Google Sheet (optional)
`send-email`	`email`	Email the distribution list with HTML summary + PDF (optional)
`web-ui`	`apiResponse`	Serves HTML web UI (GET /) or JSON result (POST /match)
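
The `similarity` value reported for each matched skill suggests that embedded skills are compared pairwise. The README does not say which metric is used or how the vector DB is queried, so the sketch below assumes cosine similarity and uses toy two-dimensional vectors in place of real nomic-embed-text output; both function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_cv_match(jd_vector, cv_vectors):
    """Return (cv_skill, similarity) for the CV skill closest to one JD skill vector."""
    return max(
        ((skill, cosine_similarity(jd_vector, vec)) for skill, vec in cv_vectors.items()),
        key=lambda pair: pair[1],
    )
```

A JD skill whose best similarity falls below some cutoff would then be reported under `missing_skills`; the cutoff itself is not documented here.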

Skill categories

Category	Examples	Weight
`software_dev`	Python, Go, Java, C++, TypeScript	1.0
`platform`	Docker, Kubernetes, Terraform, Ansible	0.9
`cloud`	AWS, GCP, Azure, Cloudflare	0.9
`data`	SQL, Spark, Kafka, dbt, Airflow	0.85
`ml_ai`	PyTorch, TensorFlow, scikit-learn, LLMs	0.85
`security`	OWASP, penetration testing, SIEM	0.8
`general`	Jira, Confluence, MS Office, Slack	0.5

Match threshold: overall_score >= 0.65.
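
In the workflow the overall score comes from the `compute-match` LLM step, and the README does not give a formula for combining the weights. The sketch below shows one plausible reading, a weighted mean over the categories being considered, together with the documented threshold check. The formula and function names are assumptions and will not necessarily reproduce the `overall_score` in the sample response.

```python
CATEGORY_WEIGHTS = {
    "software_dev": 1.0, "platform": 0.9, "cloud": 0.9, "data": 0.85,
    "ml_ai": 0.85, "security": 0.8, "general": 0.5,
}
MATCH_THRESHOLD = 0.65  # from "Match threshold: overall_score >= 0.65"


def weighted_overall(category_scores, relevant=None):
    """Weighted mean of category scores (assumed formula, not taken from the workflow)."""
    # `relevant` optionally restricts the mean to categories the JD actually asks for.
    names = list(relevant) if relevant is not None else list(category_scores)
    total_weight = sum(CATEGORY_WEIGHTS[n] for n in names)
    if total_weight == 0:
        return 0.0
    return sum(CATEGORY_WEIGHTS[n] * category_scores[n] for n in names) / total_weight


def is_match(overall_score, threshold=MATCH_THRESHOLD):
    """Apply the documented threshold: a score equal to the threshold counts as a match."""
    return overall_score >= threshold
```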

Differences from `cv-matcher-online`

	`cv-matcher` (this example)	`cv-matcher-online`
LLM	Ollama / llama3.2 (local)	Anthropic / claude-haiku
Embeddings	Ollama / nomic-embed-text (local)	OpenAI / text-embedding-3-small
API keys needed	None	`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`
Web UI	Yes — `GET /`	No
Internet required	Not for the AI steps (URL inputs and the optional upload, Sheets, and email steps still need network access)	Yes
