cv-matcher
workflow v1.0.0
Fully offline AI-powered CV / Job Description matching pipeline using Ollama for LLM inference.
Install
Then run locally:
Configure LLM provider in ~/.kdeps/config.yaml (created automatically on first run).
CV Matcher Example — Offline Edition
AI-powered CV / Job Description matching pipeline built with kdeps. Runs entirely offline using Ollama for both LLM inference and text embeddings — no cloud API keys required.
For each CV + JD pair the workflow:
- Parses both documents (PDF, DOCX, JSON, TXT, or URL via scraper)
- Extracts structured data and skills with llama3.2 (local Ollama LLM)
- Indexes skill embeddings into a SQLite vector DB with nomic-embed-text (local Ollama embedding model, cached)
- Scores the match by skill category (software_dev, platform, cloud, data, ml_ai, security, general)
- If the overall score meets the match threshold (overall_score >= 0.65):
  - Generates a motivation letter and a tailored CV
- Renders a full match-report PDF
- Uploads the PDF to S3 (presigned URL) and/or Google Drive (optional)
- Appends a row to an existing Google Sheet via a Python inline script (optional)
- Emails the distribution list with an HTML summary and the PDF attachment (optional)
- Returns a structured JSON result via apiResponse
The workflow also serves an interactive web UI at GET /. Open it in your browser to submit CVs and JDs without writing any curl commands.
Prerequisites
| Tool / Service | Required for |
|---|---|
| Ollama + llama3.2 | LLM extraction, scoring, letter generation |
| Ollama + nomic-embed-text | Offline skill embeddings |
| wkhtmltopdf | PDF generation (generate-report-pdf step) |
| SMTP server | Email distribution (send-email step, optional) |
| Google Cloud service account | Google Sheets append (append-sheet step, optional) |
| S3 presigned URL | S3 upload (upload-s3 step, optional) |
| Google OAuth2 token | GDrive upload (upload-gdrive step, optional) |
Install Ollama and pull models
# Install Ollama: https://ollama.com/download
ollama pull llama3.2
ollama pull nomic-embed-text
Install wkhtmltopdf
# macOS
brew install wkhtmltopdf
# Debian / Ubuntu
apt install wkhtmltopdf
Usage
Start the kdeps server:
kdeps run examples/cv-matcher/workflow.yaml
The server listens on port 16399.
Web UI (browser)
Open http://localhost:16399/ in your browser.
Fill in the CV and JD file paths, click Analyze Match, and the results panel renders inline — match score, category breakdown, matched skills table, missing skills, and PDF output paths.
API (curl)
curl -X POST http://localhost:16399/match \
-H "Content-Type: application/json" \
-d '{
"cv_path": "/data/candidates/jane-smith.pdf",
"jd_path": "/data/jobs/senior-backend-engineer.pdf"
}'
Request body
{
"cv_path": "/path/to/candidate.pdf",
"cv_type": "pdf",
"jd_path": "/path/to/job-description.pdf",
"jd_type": "pdf",
"distribution_list": ["hr@example.com", "hiring-manager@example.com"],
"s3_presigned_url": "https://bucket.s3.amazonaws.com/upload?...",
"gdrive_token": "ya29...",
"gdrive_folder_id": "1AbCdEf...",
"sheets_id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms",
"sheets_tab": "Matches"
}
| Field | Type | Required | Description |
|---|---|---|---|
| cv_path | string | yes | Local file path or URL to the CV |
| cv_type | string | no | pdf, docx, json, txt, url (default: pdf) |
| jd_path | string | yes | Local file path or URL to the job description |
| jd_type | string | no | Same options as cv_type |
| distribution_list | []string | no | Email recipients for the match summary |
| s3_presigned_url | string | no | S3 PUT presigned URL for the match-report PDF |
| gdrive_token | string | no | Google OAuth2 bearer token for Drive upload |
| gdrive_folder_id | string | no | Google Drive folder ID to upload into |
| sheets_id | string | no | Google Spreadsheet ID to append a match row |
| sheets_tab | string | no | Sheet/tab name (default: Matches) |
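The request fields above can also be assembled from Python instead of curl. The following is a minimal sketch using only the standard library; the helper names `build_match_request` and `post_match` are hypothetical and not part of kdeps, and the client-side type check simply mirrors the values listed for cv_type in the table.

```python
import json
import urllib.request

# Values listed for cv_type / jd_type in the request table.
ALLOWED_TYPES = {"pdf", "docx", "json", "txt", "url"}


def build_match_request(cv_path, jd_path, base_url="http://localhost:16399", **optional):
    """Build (but do not send) a POST /match request. Hypothetical helper."""
    if not cv_path or not jd_path:
        raise ValueError("cv_path and jd_path are required")
    for key in ("cv_type", "jd_type"):
        value = optional.get(key)
        if value is not None and value not in ALLOWED_TYPES:
            raise ValueError(f"{key} must be one of {sorted(ALLOWED_TYPES)}")
    payload = {"cv_path": cv_path, "jd_path": jd_path}
    # Leave out unset optional fields so the server can apply its own defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        f"{base_url}/match",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_match(request):
    """Send the request to a running kdeps server and decode the JSON reply."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `post_match(build_match_request("/data/candidates/jane-smith.pdf", "/data/jobs/senior-backend-engineer.pdf"))` would be the rough equivalent of the curl call shown earlier.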
Response
{
"candidate": {
"name": "Jane Smith",
"email": "jane@example.com",
"phone": "+31612345678"
},
"position": {
"title": "Senior Backend Engineer",
"company": "TechCorp",
"contact_name": "Jane Recruiter",
"contact_email": "jane@techcorp.com"
},
"match": {
"overall_score": 0.87,
"is_match": true,
"category_scores": {
"software_dev": 0.92,
"platform": 0.85,
"cloud": 0.80,
"data": 0.75,
"ml_ai": 0.0,
"security": 0.0,
"general": 0.60
},
"matched_skills": [
{"jd_skill": "Python", "cv_skill": "Python", "similarity": 1.0, "cv_duration_months": 38}
],
"missing_skills": ["Kubernetes"],
"summary": "Strong match — 8 of 10 required skills present, 3+ years Python."
},
"outputs": {
"report_pdf": "/tmp/kdeps-cv-matcher/report_Jane_Smith_TechCorp.pdf",
"tailored_cv": "/tmp/kdeps-cv-matcher/cv_Jane_Smith_TechCorp.pdf",
"file_link": "",
"sheet_updated": false,
"email_sent": false
}
}
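A client will usually want to condense this response into something shorter. The sketch below assumes the response has exactly the shape shown above; `summarize_match` is a hypothetical helper name, not something the workflow provides.

```python
def summarize_match(result):
    """Condense a /match response (shape as in the example above) into one line."""
    match = result["match"]
    name = result["candidate"]["name"]
    title = result["position"]["title"]
    verdict = "match" if match["is_match"] else "no match"
    # Rank categories from strongest to weakest and keep the top three.
    ranked = sorted(match["category_scores"].items(), key=lambda kv: kv[1], reverse=True)
    top = ", ".join(f"{category}={score:.2f}" for category, score in ranked[:3])
    missing = ", ".join(match["missing_skills"]) or "none"
    return (
        f"{name} -> {title}: {match['overall_score']:.2f} ({verdict}); "
        f"top: {top}; missing: {missing}"
    )
```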
Environment Variables
| Variable | Description |
|---|---|
| SMTP_HOST | SMTP server hostname (default: smtp.gmail.com) |
| SMTP_USER | SMTP authentication username |
| SMTP_PASS | SMTP authentication password |
| SMTP_FROM | Sender address (defaults to SMTP_USER) |
| GOOGLE_SA_JSON | Path to Google Cloud service-account JSON key file |
| GOOGLE_SPREADSHEET_ID | Fallback spreadsheet ID if not in request body |
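The defaulting rules in this table (SMTP_HOST falls back to smtp.gmail.com, SMTP_FROM falls back to SMTP_USER) can be illustrated with a small sketch. This is not how kdeps reads its configuration internally; `SmtpSettings` and `load_smtp_settings` are hypothetical names used only to show the fallbacks.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SmtpSettings:
    host: str
    user: str
    password: str
    sender: str


def load_smtp_settings(env=None):
    """Resolve SMTP settings using the defaults documented in the table above."""
    env = os.environ if env is None else env
    user = env.get("SMTP_USER", "")
    return SmtpSettings(
        host=env.get("SMTP_HOST", "smtp.gmail.com"),
        user=user,
        password=env.get("SMTP_PASS", ""),
        # SMTP_FROM defaults to SMTP_USER when unset or empty.
        sender=env.get("SMTP_FROM") or user,
    )
```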
Pipeline Steps
| Step | Resource type | Description |
|---|---|---|
| scrape-cv | scraper | Download / read the CV file |
| scrape-jd | scraper | Download / read the JD file |
| extract-cv | chat (ollama/llama3.2) | Extract name, work history, skills from CV |
| extract-jd | chat (ollama/llama3.2) | Extract required / preferred / nice-to-have skills from JD |
| embed-cv-skills | embedding (ollama/nomic-embed-text) | Index CV skills into SQLite vector DB |
| embed-jd-skills | embedding (ollama/nomic-embed-text) | Index JD skills into SQLite vector DB |
| compute-match | chat (ollama/llama3.2) | Score the CV against the JD by skill category |
| generate-letter | chat (ollama/llama3.2) | Write a personalised motivation letter (skipped when no match) |
| generate-tailored-cv | chat (ollama/llama3.2) | Produce a CV tailored to the job description (skipped when no match) |
| generate-report-pdf | pdf | Render the full match report as a PDF |
| upload-s3 | httpClient | PUT the PDF to an S3 presigned URL (optional) |
| upload-gdrive | httpClient | POST the PDF to Google Drive REST API (optional) |
| append-sheet | python | Append a result row to a Google Sheet (optional) |
| send-email | email | Email the distribution list with HTML summary + PDF (optional) |
| web-ui | apiResponse | Serves HTML web UI (GET /) or JSON result (POST /match) |
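The embedding steps and the `similarity` field in the response suggest that JD skills are paired with their nearest CV skills in vector space. The README does not spell out the exact procedure, so the sketch below is an assumption: it uses cosine similarity over toy vectors, and `best_matches` and its `min_similarity` cutoff are hypothetical. Real vectors would come from nomic-embed-text.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_matches(jd_vectors, cv_vectors, min_similarity=0.8):
    """Pair each JD skill with its most similar CV skill.

    min_similarity is a hypothetical cutoff, not a documented kdeps setting.
    """
    matched, missing = [], []
    for jd_skill, jd_vec in jd_vectors.items():
        cv_skill, score = max(
            ((name, cosine_similarity(jd_vec, vec)) for name, vec in cv_vectors.items()),
            key=lambda pair: pair[1],
            default=(None, 0.0),
        )
        if cv_skill is not None and score >= min_similarity:
            matched.append(
                {"jd_skill": jd_skill, "cv_skill": cv_skill, "similarity": round(score, 2)}
            )
        else:
            missing.append(jd_skill)
    return matched, missing
```

The two returned lists correspond loosely to `matched_skills` and `missing_skills` in the response.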
Skill categories
| Category | Examples | Weight |
|---|---|---|
| software_dev | Python, Go, Java, C++, TypeScript | 1.0 |
| platform | Docker, Kubernetes, Terraform, Ansible | 0.9 |
| cloud | AWS, GCP, Azure, Cloudflare | 0.9 |
| data | SQL, Spark, Kafka, dbt, Airflow | 0.85 |
| ml_ai | PyTorch, TensorFlow, scikit-learn, LLMs | 0.85 |
| security | OWASP, penetration testing, SIEM | 0.8 |
| general | Jira, Confluence, MS Office, Slack | 0.5 |
Match threshold: overall_score >= 0.65.
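The README does not state how the category weights are combined, and the compute-match step is an LLM call, so the following is only an illustrative assumption: a weighted mean over the categories supplied, followed by the documented threshold check. `weighted_overall` and `is_match` are hypothetical helpers, and this formula should not be expected to reproduce the overall_score in the sample response.

```python
# Weights copied from the skill-category table above.
CATEGORY_WEIGHTS = {
    "software_dev": 1.0,
    "platform": 0.9,
    "cloud": 0.9,
    "data": 0.85,
    "ml_ai": 0.85,
    "security": 0.8,
    "general": 0.5,
}
MATCH_THRESHOLD = 0.65  # documented: overall_score >= 0.65


def weighted_overall(category_scores):
    """Weighted mean of the supplied category scores (an assumed formula)."""
    total_weight = sum(CATEGORY_WEIGHTS[c] for c in category_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(CATEGORY_WEIGHTS[c] * s for c, s in category_scores.items())
    return weighted_sum / total_weight


def is_match(overall_score):
    """Apply the documented match threshold."""
    return overall_score >= MATCH_THRESHOLD
```

Only the threshold comparison is taken directly from the documentation; the averaging step is one plausible reading of the Weight column.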
Differences from cv-matcher-online
| | cv-matcher (this example) | cv-matcher-online |
|---|---|---|
| LLM | Ollama / llama3.2 (local) | Anthropic / claude-haiku |
| Embeddings | Ollama / nomic-embed-text (local) | OpenAI / text-embedding-3-small |
| API keys needed | None | ANTHROPIC_API_KEY, OPENAI_API_KEY |
| Web UI | Yes — GET / | No |
| Internet required | Not for AI inference (optional uploads and email need network access) | Yes |
Versions
| Version | Published | Status |
|---|---|---|
| 1.0.0 | 4/11/2026 | active |
Details
- Author: kdeps
- License: Apache-2.0
- Latest Version: 1.0.0
- Published: 4/11/2026