cv-matcher-online

workflow v1.0.0

Online AI-powered CV/JD matching pipeline using cloud LLM APIs

Install

kdeps registry install cv-matcher-online

Then run locally:

kdeps exec cv-matcher-online

Configure the LLM provider in ~/.kdeps/config.yaml (created automatically on first run).

README

CV Matcher Example

AI-powered CV / Job Description matching pipeline built with kdeps.

For each CV + JD pair the workflow:

  1. Parses both documents (PDF, DOCX, JSON, TXT, or URL via scraper)
  2. Extracts structured data and skills with an LLM
  3. Indexes skill embeddings into a SQLite vector DB (cached — skipped if already known)
  4. Scores the match by skill category (software_dev, platform, cloud, data, ml_ai, security, general)
  5. If the overall score reaches the match threshold (overall_score >= 0.65):
    • Generates a motivation letter and a tailored CV
    • Renders a full match-report PDF
    • Uploads the PDF to S3 (presigned URL) and/or Google Drive
    • Appends a row to an existing Google Sheet via a Python inline script
    • Emails the distribution list with an HTML summary and the PDF attachment
  6. Returns a structured JSON result via apiResponse
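
The caching behaviour in step 3 can be pictured with a small sketch. This is not the kdeps implementation: the table name, the key scheme (a hash of the normalised skill text), and the embed callback are all assumptions made for illustration.

```python
import hashlib
import json
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a SQLite file holding one row per known skill."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS skill_embeddings ("
        "skill_key TEXT PRIMARY KEY, skill TEXT NOT NULL, vector TEXT NOT NULL)"
    )
    return conn


def skill_key(skill):
    """Hypothetical cache key: SHA-256 of the trimmed, lower-cased skill."""
    return hashlib.sha256(skill.strip().lower().encode("utf-8")).hexdigest()


def index_skill(conn, skill, embed):
    """Embed and store a skill unless it is already cached.

    Returns True when embed() was called, False on a cache hit.
    """
    key = skill_key(skill)
    hit = conn.execute(
        "SELECT 1 FROM skill_embeddings WHERE skill_key = ?", (key,)
    ).fetchone()
    if hit is not None:
        return False  # already known, so the embedding call is skipped
    vector = embed(skill)  # embed is any callable returning a list of floats
    conn.execute(
        "INSERT INTO skill_embeddings (skill_key, skill, vector) VALUES (?, ?, ?)",
        (key, skill, json.dumps(vector)),
    )
    conn.commit()
    return True
```

Keying on normalised text means "Python" and " python " share one row, so repeated CVs with overlapping skills trigger fewer embedding calls.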

Prerequisites

| Tool / Service | Required for |
| --- | --- |
| wkhtmltopdf | PDF generation (generate-report-pdf step) |
| Ollama or compatible LLM | Skill extraction, scoring, letter generation |
| SMTP server | Email distribution (send-email step) |
| Google Cloud service account | Google Sheets append (append-sheet step) |
| S3 presigned URL | S3 upload (upload-s3 step, optional) |
| Google OAuth2 token | GDrive upload (upload-gdrive step, optional) |

Install wkhtmltopdf:

# macOS
brew install wkhtmltopdf

# Debian / Ubuntu
apt install wkhtmltopdf

Configuration

Environment Variables

| Variable | Description |
| --- | --- |
| SMTP_HOST | SMTP server hostname (e.g. smtp.gmail.com) |
| SMTP_PORT | SMTP port (default: 587) |
| SMTP_USERNAME | SMTP authentication username |
| SMTP_PASSWORD | SMTP authentication password |
| SMTP_FROM | Sender address |
| GOOGLE_APPLICATION_CREDENTIALS | Path to Google Cloud service account JSON |
| LLM_MODEL | Model name (e.g. llama3, claude-haiku) |
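
Before starting the server it can help to check that the SMTP variables are set. The sketch below reads them the way the table describes, including the 587 default for SMTP_PORT. The SmtpSettings class and load_smtp_settings function are hypothetical helpers, and treating the other four SMTP variables as mandatory is an assumption rather than documented kdeps behaviour.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SmtpSettings:
    host: str
    port: int
    username: str
    password: str
    sender: str


def load_smtp_settings(env: Optional[Mapping[str, str]] = None) -> SmtpSettings:
    """Collect the SMTP_* variables, failing early when one is missing."""
    env = os.environ if env is None else env
    # Assumption: every SMTP variable except SMTP_PORT must be present.
    required = ("SMTP_HOST", "SMTP_USERNAME", "SMTP_PASSWORD", "SMTP_FROM")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return SmtpSettings(
        host=env["SMTP_HOST"],
        port=int(env.get("SMTP_PORT") or 587),  # documented default
        username=env["SMTP_USERNAME"],
        password=env["SMTP_PASSWORD"],
        sender=env["SMTP_FROM"],
    )
```

Passing a plain dict instead of os.environ makes the check easy to exercise in tests.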

LLM backend

Set the model in settings.agentSettings inside workflow.yaml, or override it at request time via the llm_model body field.


Usage

Start the kdeps server:

kdeps run examples/cv-matcher/workflow.yaml

The API listens on port 16399 at POST /match.

Request body

{
  "cv_path": "/path/to/candidate.pdf",
  "cv_type": "pdf",
  "jd_path": "/path/to/job-description.pdf",
  "jd_type": "pdf",
  "distribution_list": ["hr@example.com", "hiring-manager@example.com"],
  "s3_presigned_url": "https://bucket.s3.amazonaws.com/upload?...",
  "gdrive_token": "ya29...",
  "gdrive_folder_id": "1AbCdEf...",
  "sheets_id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms",
  "sheets_tab": "Matches"
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| cv_path | string | yes | Local file path or URL to the CV |
| cv_type | string | no | pdf, docx, json, txt, url (default: pdf) |
| jd_path | string | yes | Local file path or URL to the job description |
| jd_type | string | no | Same options as cv_type |
| distribution_list | []string | yes | Email recipients for the match summary |
| s3_presigned_url | string | no | S3 PUT presigned URL for the match-report PDF |
| gdrive_token | string | no | Google OAuth2 bearer token for Drive upload |
| gdrive_folder_id | string | no | Google Drive folder ID to upload into |
| sheets_id | string | no | Google Spreadsheet ID to append a match row |
| sheets_tab | string | no | Sheet/tab name (default: Matches) |

Example

curl -X POST http://localhost:16399/match \
  -H "Content-Type: application/json" \
  -d '{
    "cv_path": "/data/candidates/jane-smith.pdf",
    "jd_path": "/data/jobs/senior-backend-engineer.pdf",
    "distribution_list": ["hr@example.com", "cto@example.com"],
    "sheets_id": "1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms"
  }'
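
The same call can be made from Python with only the standard library. This is a minimal sketch: build_match_request and send_match_request are hypothetical helper names, and the 300-second timeout is an assumption (LLM-backed steps may take a while), not a documented value.

```python
import json
import urllib.request

MATCH_URL = "http://localhost:16399/match"


def build_match_request(cv_path, jd_path, distribution_list, url=MATCH_URL, **optional):
    """Build a POST request for /match; optional fields set to None are dropped."""
    payload = {
        "cv_path": cv_path,
        "jd_path": jd_path,
        "distribution_list": list(distribution_list),
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_match_request(request, timeout=300):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_match_request(
        "/data/candidates/jane-smith.pdf",
        "/data/jobs/senior-backend-engineer.pdf",
        ["hr@example.com", "cto@example.com"],
        sheets_id="1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgVE2upms",
    )
    print(send_match_request(req))
```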

Response

{
  "candidate_name": "Jane Smith",
  "job_title": "Senior Backend Engineer",
  "overall_score": 0.87,
  "is_match": true,
  "category_scores": {
    "software_dev": 0.92,
    "platform": 0.85,
    "cloud": 0.80,
    "data": 0.75,
    "general": 0.60
  },
  "report_pdf": "/tmp/kdeps/cv-match-jane-smith-20260307.pdf",
  "gdrive_link": "https://drive.google.com/file/d/1AbCdEf.../view",
  "s3_link": "https://bucket.s3.amazonaws.com/reports/jane-smith.pdf",
  "email_sent": true,
  "sheet_row_appended": true
}

When is_match is false, the report_pdf, gdrive_link, s3_link, and email_sent fields are omitted. The sheet_row_appended field is still returned and indicates whether the non-match was recorded in the spreadsheet.
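
Because several fields are absent for non-matches, client code should read them defensively. The sketch below shows one way to do that; summarize_match is a hypothetical helper, and the wording of its output is arbitrary.

```python
def summarize_match(result):
    """Return a one-line summary of a /match response dict.

    Uses .get() throughout because report_pdf, gdrive_link, s3_link and
    email_sent are omitted when is_match is false.
    """
    name = result.get("candidate_name", "unknown candidate")
    score = result.get("overall_score", 0.0)
    if not result.get("is_match", False):
        recorded = "recorded" if result.get("sheet_row_appended") else "not recorded"
        return f"{name}: no match ({score:.2f}), sheet row {recorded}"
    # Collect whichever upload links are present; both uploads are optional.
    links = [result[key] for key in ("gdrive_link", "s3_link") if result.get(key)]
    emailed = "emailed" if result.get("email_sent") else "not emailed"
    return f"{name}: match ({score:.2f}), {emailed}, {len(links)} upload link(s)"
```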


Pipeline Steps

| Step | Resource type | Description |
| --- | --- | --- |
| scrape-cv | scraper | Download / read the CV file |
| scrape-jd | scraper | Download / read the JD file |
| extract-cv | chat | LLM extracts name, work history, skills from CV |
| extract-jd | chat | LLM extracts required / preferred / nice-to-have skills from JD |
| embed-cv-skills | embedding | Index CV skills into SQLite vector DB |
| embed-jd-skills | embedding | Index JD skills into SQLite vector DB |
| compute-match | chat | Score the CV against the JD by skill category |
| generate-letter | chat | Write a personalised motivation letter (skipped when no match) |
| generate-tailored-cv | chat | Produce a CV tailored to the job description |
| generate-report-pdf | pdf | Render the full match report as a PDF |
| upload-s3 | httpClient | PUT the PDF to an S3 presigned URL (optional) |
| upload-gdrive | httpClient | POST the PDF to Google Drive REST API (optional) |
| append-sheet | python | Append a result row to a Google Sheet |
| send-email | email | Email the distribution list with HTML summary + PDF |
| api-response | apiResponse | Return structured JSON result |

Skill categories

Skills are classified into the following categories for scoring:

| Category | Examples | Weight |
| --- | --- | --- |
| software_dev | Python, Go, Java, C++, TypeScript | 1.0 |
| platform | Docker, Kubernetes, Terraform, Ansible | 0.9 |
| cloud | AWS, GCP, Azure, Cloudflare | 0.9 |
| data | SQL, Spark, Kafka, dbt, Airflow | 0.85 |
| ml_ai | PyTorch, TensorFlow, scikit-learn, LLMs | 0.85 |
| security | OWASP, penetration testing, SIEM | 0.8 |
| general | Jira, Confluence, MS Office, Slack | 0.5 |

Match threshold: overall_score >= 0.65.
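
This README does not say how the category weights are combined into overall_score. Scoring happens in the LLM-driven compute-match step, so the exact arithmetic may differ. As an illustration only, the sketch below assumes a weighted mean over the categories that were scored and then applies the documented threshold. It is not expected to reproduce the overall_score shown in the example response.

```python
CATEGORY_WEIGHTS = {
    "software_dev": 1.0,
    "platform": 0.9,
    "cloud": 0.9,
    "data": 0.85,
    "ml_ai": 0.85,
    "security": 0.8,
    "general": 0.5,
}
MATCH_THRESHOLD = 0.65


def weighted_overall(category_scores, weights=CATEGORY_WEIGHTS):
    """Weighted mean of per-category scores (an assumed formula).

    Only categories present in category_scores contribute; an unknown
    category name raises KeyError.
    """
    total_weight = sum(weights[name] for name in category_scores)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(score * weights[name] for name, score in category_scores.items())
    return weighted_sum / total_weight


def is_match(overall_score, threshold=MATCH_THRESHOLD):
    """Documented rule: a match requires overall_score >= 0.65."""
    return overall_score >= threshold
```

Note that the comparison is inclusive, so a score of exactly 0.65 counts as a match.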

Versions

| Version | Published | Status |
| --- | --- | --- |
| 1.0.0 | 4/11/2026 | active |

Details

Author: kdeps
License: Apache-2.0
Latest Version: 1.0.0
Published: 4/11/2026

Tags

cv, ai, matching