cv-matcher-online
workflow v1.0.0
Online AI-powered CV/JD matching pipeline using cloud LLM APIs
Install
Then run locally:
Configure LLM provider in ~/.kdeps/config.yaml (created automatically on first run).
README
CV Matcher Example
AI-powered CV / Job Description matching pipeline built with kdeps.
For each CV + JD pair the workflow:
- Parses both documents (PDF, DOCX, JSON, TXT, or URL via scraper)
- Extracts structured data and skills with an LLM
- Indexes skill embeddings into a SQLite vector DB (cached — skipped if already known)
- Scores the match by skill category (software_dev, platform, cloud, data, ml_ai, security, general)
- If the overall score meets the match threshold (overall_score >= 0.65):
- Generates a motivation letter and a tailored CV
- Renders a full match-report PDF
- Uploads the PDF to S3 (presigned URL) and/or Google Drive
- Appends a row to an existing Google Sheet via a Python inline script
- Emails the distribution list with an HTML summary and the PDF attachment
- Returns a structured JSON result via apiResponse
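The "cached, skipped if already known" behaviour of the indexing step can be pictured with a minimal sketch. It assumes a table keyed by the normalised skill name; the table name `skill_embeddings`, the `fake_embed` helper, and the JSON-encoded vector column are all hypothetical and are not taken from the kdeps implementation.

```python
from __future__ import annotations

import hashlib
import json
import sqlite3


def fake_embed(skill: str) -> list[float]:
    """Hypothetical stand-in for a real embedding call (tiny deterministic vector)."""
    digest = hashlib.sha256(skill.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:4]]


def index_skills(conn: sqlite3.Connection, skills: list[str]) -> int:
    """Store embeddings for skills not seen before; return how many were newly embedded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS skill_embeddings ("
        "skill TEXT PRIMARY KEY, vector TEXT NOT NULL)"
    )
    added = 0
    for skill in skills:
        key = skill.strip().lower()  # assumed normalisation rule
        known = conn.execute(
            "SELECT 1 FROM skill_embeddings WHERE skill = ?", (key,)
        ).fetchone()
        if known is not None:
            continue  # already indexed: skip the embedding call
        conn.execute(
            "INSERT INTO skill_embeddings (skill, vector) VALUES (?, ?)",
            (key, json.dumps(fake_embed(key))),
        )
        added += 1
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
first = index_skills(conn, ["Python", "Docker"])       # both are new
second = index_skills(conn, ["python", "Kubernetes"])  # "python" is already cached
```

In this sketch the second call only embeds "Kubernetes", because "python" was stored by the first call.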
Prerequisites
| Tool / Service | Required for |
|---|---|
| wkhtmltopdf | PDF generation (generate-report-pdf step) |
| Ollama or compatible LLM | Skill extraction, scoring, letter generation |
| SMTP server | Email distribution (send-email step) |
| Google Cloud service account | Google Sheets append (append-sheet step) |
| S3 presigned URL | S3 upload (upload-s3 step, optional) |
| Google OAuth2 token | GDrive upload (upload-gdrive step, optional) |
Install wkhtmltopdf:
# macOS
brew install wkhtmltopdf
# Debian / Ubuntu
apt install wkhtmltopdf
Configuration
Environment Variables
| Variable | Description |
|---|---|
| SMTP_HOST | SMTP server hostname (e.g. smtp.gmail.com) |
| SMTP_PORT | SMTP port (default: 587) |
| SMTP_USERNAME | SMTP authentication username |
| SMTP_PASSWORD | SMTP authentication password |
| SMTP_FROM | Sender address |
| GOOGLE_APPLICATION_CREDENTIALS | Path to Google Cloud service account JSON |
| LLM_MODEL | Model name (e.g. llama3, claude-haiku) |
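As a rough illustration of how the SMTP variables fit together, the sketch below loads them into a small settings object and applies the documented default port of 587. The `SmtpSettings` class and `load_smtp_settings` function are hypothetical names, and treating every variable except `SMTP_PORT` as mandatory is an assumption; the table above does not say which ones are optional.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SmtpSettings:
    host: str
    port: int
    username: str
    password: str
    sender: str


def load_smtp_settings(env=None) -> SmtpSettings:
    """Read SMTP_* variables from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Assumption: everything except SMTP_PORT must be set.
    required = ("SMTP_HOST", "SMTP_USERNAME", "SMTP_PASSWORD", "SMTP_FROM")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return SmtpSettings(
        host=env["SMTP_HOST"],
        port=int(env.get("SMTP_PORT", "587")),  # documented default
        username=env["SMTP_USERNAME"],
        password=env["SMTP_PASSWORD"],
        sender=env["SMTP_FROM"],
    )


settings = load_smtp_settings(
    {
        "SMTP_HOST": "smtp.gmail.com",
        "SMTP_USERNAME": "matcher",
        "SMTP_PASSWORD": "example-password",
        "SMTP_FROM": "hr@example.com",
    }
)
```

Passing a plain dictionary, as above, makes the check easy to run before starting the server; calling `load_smtp_settings()` with no argument reads the real environment.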
LLM backend
Set the model in settings.agentSettings inside workflow.yaml, or override it
at request time via the llm_model body field.
Usage
Start the kdeps server:
kdeps run examples/cv-matcher/workflow.yaml
The API listens on port 16399 and accepts POST requests at /match.
Request body
{
  "cv_path": "/path/to/candidate.pdf",
  "cv_type": "pdf",
  "jd_path": "/path/to/job-description.pdf",
  "jd_type": "pdf",
  "distribution_list": ["hr@example.com", "hiring-manager@example.com"],
  "s3_presigned_url": "https://bucket.s3.amazonaws.com/upload?...",
  "gdrive_token": "ya29...",
  "gdrive_folder_id": "1AbCdEf...",
  "sheets_id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms",
  "sheets_tab": "Matches"
}
| Field | Type | Required | Description |
|---|---|---|---|
| cv_path | string | yes | Local file path or URL to the CV |
| cv_type | string | no | pdf, docx, json, txt, url (default: pdf) |
| jd_path | string | yes | Local file path or URL to the job description |
| jd_type | string | no | Same options as cv_type |
| distribution_list | []string | yes | Email recipients for the match summary |
| s3_presigned_url | string | no | S3 PUT presigned URL for the match-report PDF |
| gdrive_token | string | no | Google OAuth2 bearer token for Drive upload |
| gdrive_folder_id | string | no | Google Drive folder ID to upload into |
| sheets_id | string | no | Google Spreadsheet ID to append a match row |
| sheets_tab | string | no | Sheet/tab name (default: Matches) |
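A client can check a request body against the table above before sending it. The following is a minimal sketch, not the server's own validation: `validate_match_request` is a hypothetical helper, and it assumes that jd_type shares the pdf default of cv_type, which the table only implies.

```python
ALLOWED_TYPES = {"pdf", "docx", "json", "txt", "url"}
REQUIRED_FIELDS = ("cv_path", "jd_path", "distribution_list")


def validate_match_request(body: dict) -> list:
    """Return a list of human-readable problems; an empty list means the body looks valid."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not body.get(field):
            errors.append(f"missing required field: {field}")
    for field in ("cv_type", "jd_type"):
        value = body.get(field, "pdf")  # assumed default for both fields
        if value not in ALLOWED_TYPES:
            errors.append(f"{field} must be one of {sorted(ALLOWED_TYPES)}, got {value!r}")
    recipients = body.get("distribution_list")
    if recipients is not None and not (
        isinstance(recipients, list) and all(isinstance(r, str) for r in recipients)
    ):
        errors.append("distribution_list must be a list of strings")
    return errors


problems = validate_match_request(
    {"cv_path": "/path/to/candidate.pdf", "cv_type": "odt"}
)
```

Here `problems` reports the two missing required fields and the unsupported cv_type.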
Example
curl -X POST http://localhost:16399/match \
  -H "Content-Type: application/json" \
  -d '{
    "cv_path": "/data/candidates/jane-smith.pdf",
    "jd_path": "/data/jobs/senior-backend-engineer.pdf",
    "distribution_list": ["hr@example.com", "cto@example.com"],
    "sheets_id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms"
  }'
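The same call can be made from Python using only the standard library. This sketch builds the request object without sending it; `build_match_request` is a hypothetical helper, and the base URL assumes the server runs locally on the port given above.

```python
import json
import urllib.request


def build_match_request(payload: dict, base_url: str = "http://localhost:16399") -> urllib.request.Request:
    """Create a POST /match request carrying the payload as JSON."""
    return urllib.request.Request(
        url=f"{base_url}/match",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_match_request(
    {
        "cv_path": "/data/candidates/jane-smith.pdf",
        "jd_path": "/data/jobs/senior-backend-engineer.pdf",
        "distribution_list": ["hr@example.com", "cto@example.com"],
        "sheets_id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms",
    }
)

# To actually send it (requires the kdeps server to be running):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Keeping construction separate from sending makes the payload easy to inspect or log before any network traffic happens.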
Response
{
  "candidate_name": "Jane Smith",
  "job_title": "Senior Backend Engineer",
  "overall_score": 0.87,
  "is_match": true,
  "category_scores": {
    "software_dev": 0.92,
    "platform": 0.85,
    "cloud": 0.80,
    "data": 0.75,
    "general": 0.60
  },
  "report_pdf": "/tmp/kdeps/cv-match-jane-smith-20260307.pdf",
  "gdrive_link": "https://drive.google.com/file/d/1AbCdEf.../view",
  "s3_link": "https://bucket.s3.amazonaws.com/reports/jane-smith.pdf",
  "email_sent": true,
  "sheet_row_appended": true
}
When is_match is false, the report_pdf, gdrive_link, s3_link, and
email_sent fields are omitted and sheet_row_appended reflects whether the
non-match was still recorded in the spreadsheet.
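Because several fields disappear on a non-match, client code should read them defensively. A minimal sketch follows; `summarise_match` is a hypothetical helper, and the 0.41 score in the usage line is made up for illustration.

```python
def summarise_match(result: dict) -> str:
    """Build a one-line summary without assuming match-only fields exist."""
    name = result.get("candidate_name", "unknown candidate")
    score = float(result.get("overall_score", 0.0))
    if not result.get("is_match", False):
        # report_pdf, gdrive_link, s3_link and email_sent are absent here.
        recorded = "recorded" if result.get("sheet_row_appended") else "not recorded"
        return f"{name}: no match ({score:.2f}), {recorded} in sheet"
    links = [result[key] for key in ("gdrive_link", "s3_link") if key in result]
    where = ", ".join(links) if links else result.get("report_pdf", "no report location")
    return f"{name}: match ({score:.2f}), report at {where}"


no_match = summarise_match(
    {
        "candidate_name": "Jane Smith",
        "overall_score": 0.41,  # illustrative value, not from a real run
        "is_match": False,
        "sheet_row_appended": True,
    }
)
```

Using `dict.get` and membership tests, rather than direct indexing, keeps the same code path working for both response shapes.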
Pipeline Steps
| Step | Resource type | Description |
|---|---|---|
| scrape-cv | scraper | Download / read the CV file |
| scrape-jd | scraper | Download / read the JD file |
| extract-cv | chat | LLM extracts name, work history, skills from CV |
| extract-jd | chat | LLM extracts required / preferred / nice-to-have skills from JD |
| embed-cv-skills | embedding | Index CV skills into SQLite vector DB |
| embed-jd-skills | embedding | Index JD skills into SQLite vector DB |
| compute-match | chat | Score the CV against the JD by skill category |
| generate-letter | chat | Write a personalised motivation letter (skipped when no match) |
| generate-tailored-cv | chat | Produce a CV tailored to the job description |
| generate-report-pdf | pdf | Render the full match report as a PDF |
| upload-s3 | httpClient | PUT the PDF to an S3 presigned URL (optional) |
| upload-gdrive | httpClient | POST the PDF to Google Drive REST API (optional) |
| append-sheet | python | Append a result row to a Google Sheet |
| send-email | email | Email the distribution list with HTML summary + PDF |
| api-response | apiResponse | Return structured JSON result |
Skill categories
Skills are classified into the following categories for scoring:
| Category | Examples | Weight |
|---|---|---|
| software_dev | Python, Go, Java, C++, TypeScript | 1.0 |
| platform | Docker, Kubernetes, Terraform, Ansible | 0.9 |
| cloud | AWS, GCP, Azure, Cloudflare | 0.9 |
| data | SQL, Spark, Kafka, dbt, Airflow | 0.85 |
| ml_ai | PyTorch, TensorFlow, scikit-learn, LLMs | 0.85 |
| security | OWASP, penetration testing, SIEM | 0.8 |
| general | Jira, Confluence, MS Office, Slack | 0.5 |
Match threshold: overall_score >= 0.65.
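This README does not state how the category weights are combined into overall_score; in the pipeline the scoring is delegated to the compute-match LLM step. One plausible reading is a weighted mean over the categories present, sketched below purely as an assumption. It is not expected to reproduce the exact overall_score values the workflow returns.

```python
CATEGORY_WEIGHTS = {
    "software_dev": 1.0,
    "platform": 0.9,
    "cloud": 0.9,
    "data": 0.85,
    "ml_ai": 0.85,
    "security": 0.8,
    "general": 0.5,
}
MATCH_THRESHOLD = 0.65


def weighted_overall_score(category_scores: dict) -> float:
    """Weighted mean of the category scores (assumed formula, unknown categories ignored)."""
    known = {c: s for c, s in category_scores.items() if c in CATEGORY_WEIGHTS}
    total_weight = sum(CATEGORY_WEIGHTS[c] for c in known)
    if total_weight == 0:
        return 0.0
    return sum(CATEGORY_WEIGHTS[c] * s for c, s in known.items()) / total_weight


def is_match(category_scores: dict) -> bool:
    """Apply the documented threshold: overall_score >= 0.65."""
    return weighted_overall_score(category_scores) >= MATCH_THRESHOLD


scores = {"software_dev": 0.92, "platform": 0.85, "cloud": 0.80, "data": 0.75, "general": 0.60}
score = weighted_overall_score(scores)
matched = is_match(scores)
```

Under this assumed formula a low score in a lightly weighted category such as general moves the total less than the same score in software_dev would.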
Versions
| Version | Published | Status |
|---|---|---|
| 1.0.0 | 4/11/2026 | active |
Details
- Author: kdeps
- License: Apache-2.0
- Latest Version: 1.0.0
- Published: 4/11/2026