Project: Build a Smart CV / Resume Analyser with Claude

Hiring teams review dozens to hundreds of CVs for every open role. Reading each one thoroughly takes time, and the most qualified candidates can be buried under a stack of applications. An AI-powered CV analyser automates the extraction and scoring phase — freeing recruiters to focus their energy on the candidates that actually match.
This project builds a complete CV analyser that takes a PDF or plain-text CV as input, extracts structured candidate data, scores the candidate against a provided job description, and generates a plain-English hiring recommendation. The full implementation fits in a single Python file of under 300 lines, and needs only hardening touches — error handling, retries, logging — before production use.
What We Are Building
The CV analyser does three things:
- Extracts structured data from the CV: name, email, years of experience, current role, skills, education, and employment history
- Scores the candidate against a job description on five dimensions: technical skills match, experience level, domain relevance, education requirement, and seniority alignment
- Generates a hiring recommendation: a plain-English paragraph explaining whether to interview the candidate and why
Prerequisites
- Python 3.9 or later
- Anthropic Python SDK: pip install anthropic
- For PDF support: pip install pymupdf (the implementation below uses PyMuPDF)
- An Anthropic API key set as ANTHROPIC_API_KEY
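Before running anything against the API, it can be worth a quick sanity check that the SDK and API key are in place. A minimal sketch (the `check_environment` helper is hypothetical, not part of the SDK, and deliberately makes no billable API call):

```python
import importlib.util
import os

def check_environment() -> list:
    """Return a list of setup problems (empty means ready to go)."""
    problems = []
    # Is the SDK installed? Checked without importing, so this never raises.
    if importlib.util.find_spec("anthropic") is None:
        problems.append("anthropic SDK not installed: pip install anthropic")
    # Is the API key present in the environment?
    if not os.environ.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is not set")
    return problems
```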
Project Architecture
The system has three components:
- CV ingestion: Reads the CV file (PDF or text) and prepares the content for Claude
- Structured extraction: Uses Claude with tool use to extract typed fields from the CV content
- Scoring and recommendation: Uses Claude to score and explain the match against the job description
Complete Implementation
```python
import anthropic
import json
from pathlib import Path

client = anthropic.Anthropic()


# ─── Step 1: CV Ingestion ────────────────────────────────────────────────────

def read_cv(file_path: str) -> str:
    """Read CV content from a text or PDF file."""
    path = Path(file_path)

    if path.suffix.lower() == ".pdf":
        try:
            import fitz  # PyMuPDF
        except ImportError:
            raise RuntimeError("Install PyMuPDF: pip install pymupdf")
        doc = fitz.open(file_path)
        text = ""
        for page in doc:
            text += page.get_text()
        return text

    elif path.suffix.lower() in [".txt", ".md"]:
        return path.read_text(encoding="utf-8")

    else:
        raise ValueError(f"Unsupported file type: {path.suffix}")


# ─── Step 2: Structured Extraction ──────────────────────────────────────────

EXTRACTION_TOOL = {
    "name": "extract_candidate_data",
    "description": "Extract structured information from a candidate CV or resume",
    "input_schema": {
        "type": "object",
        "properties": {
            "full_name": {"type": "string", "description": "Candidate's full name"},
            "email": {"type": "string", "description": "Email address, or null if not found"},
            "phone": {"type": "string", "description": "Phone number, or null if not found"},
            "current_title": {"type": "string", "description": "Most recent job title"},
            "years_experience": {
                "type": "number",
                "description": "Estimated total years of professional experience"
            },
            "skills": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Technical and professional skills mentioned"
            },
            "education": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "degree": {"type": "string"},
                        "field": {"type": "string"},
                        "institution": {"type": "string"},
                        "year": {"type": "string"}
                    }
                },
                "description": "Educational qualifications"
            },
            "work_history": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string"},
                        "company": {"type": "string"},
                        "duration": {"type": "string"},
                        "key_achievements": {"type": "array", "items": {"type": "string"}}
                    }
                },
                "description": "Employment history, most recent first"
            }
        },
        "required": ["full_name", "current_title", "years_experience", "skills", "work_history"]
    }
}


def extract_candidate_data(cv_text: str) -> dict:
    """Use Claude to extract structured data from CV text."""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=4096,
        tools=[EXTRACTION_TOOL],
        tool_choice={"type": "tool", "name": "extract_candidate_data"},
        messages=[
            {
                "role": "user",
                "content": f"Extract all candidate information from this CV:\n\n{cv_text}"
            }
        ]
    )

    for block in response.content:
        if block.type == "tool_use":
            return block.input

    raise RuntimeError("Extraction failed — no tool use block in response")


# ─── Step 3: Scoring and Recommendation ─────────────────────────────────────

SCORING_TOOL = {
    "name": "score_candidate",
    "description": "Score a candidate against a job description",
    "input_schema": {
        "type": "object",
        "properties": {
            "scores": {
                "type": "object",
                "properties": {
                    "technical_skills": {
                        "type": "integer",
                        "description": "Score 1-10: how well candidate skills match job requirements"
                    },
                    "experience_level": {
                        "type": "integer",
                        "description": "Score 1-10: match between candidate experience and required level"
                    },
                    "domain_relevance": {
                        "type": "integer",
                        "description": "Score 1-10: relevance of candidate's domain experience to this role"
                    },
                    "education": {
                        "type": "integer",
                        "description": "Score 1-10: education qualification match"
                    },
                    "overall": {
                        "type": "integer",
                        "description": "Overall score 1-10"
                    }
                },
                "required": ["technical_skills", "experience_level", "domain_relevance", "education", "overall"]
            },
            "strengths": {
                "type": "array",
                "items": {"type": "string"},
                "description": "2-3 specific strengths relative to this role"
            },
            "gaps": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Key gaps or concerns relative to this role"
            },
            "recommendation": {
                "type": "string",
                "enum": ["strong_yes", "yes", "maybe", "no"],
                "description": "Interview recommendation"
            },
            "recommendation_reasoning": {
                "type": "string",
                "description": "2-3 sentence explanation of the recommendation"
            }
        },
        "required": ["scores", "strengths", "gaps", "recommendation", "recommendation_reasoning"]
    }
}


def score_candidate(candidate_data: dict, job_description: str) -> dict:
    """Score the candidate against the job description."""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        tools=[SCORING_TOOL],
        tool_choice={"type": "tool", "name": "score_candidate"},
        messages=[
            {
                "role": "user",
                "content": f"""Score this candidate against the job description.

JOB DESCRIPTION:
{job_description}

CANDIDATE PROFILE:
{json.dumps(candidate_data, indent=2)}

Provide an honest assessment. Score 1-10 on each dimension.
Include specific strengths and concrete gaps."""
            }
        ]
    )

    for block in response.content:
        if block.type == "tool_use":
            return block.input

    raise RuntimeError("Scoring failed — no tool use block in response")


# ─── Main Analyser ────────────────────────────────────────────────────────────

def analyse_cv(cv_file_path: str, job_description: str) -> dict:
    """Run the full CV analysis pipeline."""
    print(f"Reading CV from {cv_file_path}...")
    cv_text = read_cv(cv_file_path)

    print("Extracting candidate data...")
    candidate_data = extract_candidate_data(cv_text)

    print("Scoring candidate...")
    scoring = score_candidate(candidate_data, job_description)

    return {
        "candidate": candidate_data,
        "assessment": scoring
    }


def print_report(analysis: dict) -> None:
    """Print a human-readable analysis report."""
    c = analysis["candidate"]
    a = analysis["assessment"]

    print("\n" + "=" * 60)
    print(f"CANDIDATE: {c['full_name']}")
    print(f"Current Role: {c['current_title']}")
    print(f"Experience: {c['years_experience']} years")
    print(f"Key Skills: {', '.join(c['skills'][:8])}")
    print("=" * 60)

    scores = a["scores"]
    print("\nSCORES:")
    print(f"  Technical Skills: {scores['technical_skills']}/10")
    print(f"  Experience Level: {scores['experience_level']}/10")
    print(f"  Domain Relevance: {scores['domain_relevance']}/10")
    print(f"  Education:        {scores['education']}/10")
    print("  ─────────────────────────────")
    print(f"  Overall:          {scores['overall']}/10")

    print(f"\nRECOMMENDATION: {a['recommendation'].upper().replace('_', ' ')}")
    print(f"\n{a['recommendation_reasoning']}")

    print("\nSTRENGTHS:")
    for s in a["strengths"]:
        print(f"  ✓ {s}")

    if a["gaps"]:
        print("\nGAPS:")
        for g in a["gaps"]:
            print(f"  ✗ {g}")
    print("=" * 60)


# ─── Example Usage ───────────────────────────────────────────────────────────

if __name__ == "__main__":
    job_description = """
    Senior DevOps Engineer — 5+ years experience
    Requirements:
    - Strong Python and Bash scripting
    - Kubernetes and Docker container orchestration
    - AWS or Azure cloud infrastructure experience
    - CI/CD pipeline design (GitHub Actions, Jenkins, or similar)
    - Infrastructure as Code (Terraform or Pulumi)
    - Experience with monitoring stacks (Prometheus, Grafana)
    Preferred: Experience with GitOps, ArgoCD, service mesh (Istio)
    """

    # Analyse a CV
    analysis = analyse_cv("candidate_cv.pdf", job_description)
    print_report(analysis)

    # Also save JSON for downstream use
    with open("analysis_result.json", "w") as f:
        json.dump(analysis, f, indent=2)
```
Extending the Project
- Batch processing: Add a loop to process an entire folder of CVs and produce a ranked shortlist using the Batch API for 50% cost savings
- Web interface: Wrap the analyser in a FastAPI or Flask endpoint that accepts file uploads and returns JSON analysis results
- Database integration: Store extracted candidate data and scores in PostgreSQL for filtering, searching, and tracking candidates across multiple roles
- Files API: For high-volume environments, upload CV PDFs via the Files API to avoid re-uploading the same document when analysing against multiple job descriptions
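The batch-processing idea can be sketched as a plain loop over a folder. This is a hypothetical `rank_candidates` helper, not part of the project above; the Batch API version would submit all requests in one call, but the ranking logic is the same. The `analyse` parameter is the `analyse_cv` function, passed in explicitly so the helper is easy to test with a stub:

```python
from pathlib import Path
from typing import Callable

def rank_candidates(cv_folder: str, job_description: str,
                    analyse: Callable[[str, str], dict]) -> list:
    """Analyse every PDF CV in a folder; return results sorted best-first
    by the overall assessment score."""
    results = []
    for cv_path in sorted(Path(cv_folder).glob("*.pdf")):
        analysis = analyse(str(cv_path), job_description)
        analysis["file"] = cv_path.name  # keep track of which CV this was
        results.append(analysis)
    # Best candidates first, ranked by the overall 1-10 score
    return sorted(results,
                  key=lambda r: r["assessment"]["scores"]["overall"],
                  reverse=True)
```

Call it as `rank_candidates("cv_inbox/", job_description, analyse_cv)` and print or persist the top few entries.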
Use tool_choice: tool for Reliable Extraction
The scoring and extraction steps both use tool_choice: {type: 'tool', name: ...} to force Claude to produce structured output every time. This is more reliable than asking Claude to return JSON in the message text, because the tool use mechanism enforces the schema. Combined with Python's jsonschema library for post-extraction validation, this pattern produces highly consistent structured output across diverse CV formats.
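To show the shape of that validation step without adding a dependency, here is a stdlib-only stand-in (the real `jsonschema` library additionally handles nested objects, enums, and formats; `validate_extraction` is a hypothetical helper checking required fields and top-level types against a JSON-Schema-style dict such as `EXTRACTION_TOOL["input_schema"]`):

```python
def validate_extraction(data: dict, schema: dict) -> list:
    """Return a list of validation errors (empty means the dict conforms).

    Checks only required fields and top-level types; use the jsonschema
    package for full JSON Schema validation.
    """
    type_map = {"string": str, "number": (int, float), "integer": int,
                "array": list, "object": dict, "boolean": bool}
    errors = []
    for field in schema.get("required", []):
        if field not in data:
            errors.append(f"missing required field: {field}")
    for field, spec in schema.get("properties", {}).items():
        expected = type_map.get(spec.get("type"))
        if field in data and expected and not isinstance(data[field], expected):
            errors.append(f"{field}: expected {spec['type']}")
    return errors
```

Running this on the dict returned by `extract_candidate_data` catches the rare case where a forced tool call still omits or mistypes a field, before the bad record reaches scoring or storage.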
Summary
This project demonstrates the three-step pattern that underlies most document-processing AI applications: ingest → extract → analyse. The same approach applies to processing contracts, financial statements, technical specifications, and any other structured document type.
- Extraction: Use tool_choice to guarantee structured JSON output from any document
- Scoring: Let Claude apply business logic — matching against requirements, identifying gaps — that would be complex to code manually
- Recommendation: Let Claude generate the natural language reasoning that explains its structured scores
Next project: Build a Customer Support Chatbot with the Claude API.
This post is part of the Anthropic AI Tutorial Series. Previous post: AI Agents Refresher: Key Concepts, Patterns, and Pitfalls.
