Wavesteam
AI Solution Case Study · 2026

A high-recall, explainable
AI recruiting system.

Built for a multi-business-line enterprise with a heavy tech-hiring funnel, the system combines LLMs, multi-stage filtering and semantic matching to handle resume parsing, candidate scoring and role matching, with decision support at every step.

Client: Multi-business-line enterprise
Timeline: 2025 Q4 – 2026 Q1
Volume: 12,000+ resumes / 2 weeks
Stack: RPA · OCR · Vector search · LLM
/ 00 How the project started

It started simple —
“The HR team can't keep up with the resumes anymore.”

In late 2025, this multi-business-line enterprise client kicked off another tech-hiring push. The first instinct was the obvious one — hire more recruiters. It didn't help.

01
Open roles
Java · Frontend · QA · Data
4 tech tracks hiring in parallel
02
Sourcing channels
BOSS · Zhaopin · Liepin · Email
Inbound from every direction
03
New resumes
12,000+
Two weeks, duplicates and noise included
04
Time-to-first-review
> 72h
Java roles, worst-case average
/ Field note

Every morning, screening resumes wasn't the first thing recruiters did.

They were burning hours just organizing them. The real screening problems only surfaced after that.

  1. Log into each job board
  2. Download attachments, rename files
  3. Dedupe, merge, categorize
  4. Forward to the right business unit
/ 01 The real bottleneck wasn't volume

It was this: “tech roles have outgrown the judgement range of traditional HR.”

The JD says “high concurrency, microservices, distributed cache.” The candidate writes “flash-sale system, Spring Cloud refactor, Redis hot keys.” To engineering they're the same skill. To a non-technical recruiter, there's no way to make that call in 30 seconds.

What the JD says
What the resume says
  • High-concurrency systems experience
    Built a flash-sale system
  • Microservices architecture
    Spring Cloud refactor
  • Distributed cache
    Tuned Redis hot keys
  • Message queue
    Kafka peak shaving
P-01

Three recruiters, three completely different shortlists

Same role: recruiter A weighs degrees, B weighs FAANG-style backgrounds, C counts keyword hits. The interview pool ends up wildly inconsistent.

This was the first time the client realized that their hiring bar had never actually been structured.

P-02

Traditional ATS stopped working

Search “Go developer” and resumes saying “Golang / cloud-native backend / microservices” never surface. Search “user growth” and “Growth / growth ops / viral strategy” gets dropped too.

ATS finds “resumes that look like the JD,” not “people who actually fit.”

P-03

Engineering stopped trusting HR's first pass

Teams kept saying “great people never even reached us.” Engineering managers ended up doing first-round screening themselves — recruiter throughput dropped, engineers duplicated work, time-to-hire kept stretching.

At worst, time-to-first-review on a Java role hit > 72 hours.

/ 02 Why we didn't go all-in on the LLM

The client asked — “can we just point GPT at the resumes?”
We didn't.

An LLM-only approach falls apart in production for three predictable reasons. That call shaped everything that came after.

· 01

Unstable scores

Same resume, different time, different prompt — scores drift. Recruiters can't build trust, and trust is the first gate any AI tool has to pass.

· 02

Weak explainability

Recruiters don't ask “why is this an 82.” They ask “why is this person ranked above the next one.” Without traceable skills, experience and role-fit, they'll re-screen by hand anyway.

· 03

Costs out of control

With a six-figure resume pool, full LLM inference on every candidate means latency, token spend and concurrency all blow up. The AI becomes the new bottleneck.

/ 03 Reframing the problem

This isn't a “chat AI” problem.
It's a high-recall, consistent, explainable search-system problem.

/ Final architecture
Semantic recall + Rule-based scoring + LLM enrichment

Let each layer do what it's good at: rules for stability, vector recall for coverage, LLM for context. AI no longer has the deciding vote — it's one input into the scoring system.
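The layering can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the client's implementation: `recall`, `rule_score`, `enrich` and `run_funnel` are hypothetical names, set overlap stands in for vector recall, the score formula is invented, and the "LLM" stage is a stub that runs only on the shortlist and never changes a score.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    name: str
    skills: set[str]
    years: int
    notes: list[str] = field(default_factory=list)


def recall(pool: list[Candidate], wanted: set[str]) -> list[Candidate]:
    """Stage 1 (coverage): keep anyone sharing at least one wanted skill.
    Set overlap is a stand-in for vector search in this sketch."""
    return [c for c in pool if c.skills & wanted]


def rule_score(c: Candidate, wanted: set[str]) -> float:
    """Stage 2 (stability): deterministic, so the same input always scores the same.
    The 70/30 split is an assumption made for illustration."""
    coverage = len(c.skills & wanted) / len(wanted)
    seniority = min(c.years, 10) / 10
    return round(70 * coverage + 30 * seniority, 1)


def enrich(c: Candidate) -> None:
    """Stage 3 (context): placeholder for an LLM call. It only adds notes."""
    c.notes.append(f"context summary requested for {c.name}")


def run_funnel(pool: list[Candidate], wanted: set[str], top_k: int = 2) -> list[Candidate]:
    recalled = recall(pool, wanted)
    ranked = sorted(recalled, key=lambda c: rule_score(c, wanted), reverse=True)
    shortlist = ranked[:top_k]
    for c in shortlist:  # the expensive stage touches top_k candidates, not the whole pool
        enrich(c)
    return shortlist
```

The design point is the ordering: the cheap, broad stage runs on everyone, the deterministic stage decides the ranking, and the costly stage sees only the few candidates that survive.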

/ 04 How the final system works

The system splits into four main pipelines —
each one maps to a pain point that used to stall the client.

Engineering isn't just calling an LLM. We broke the recruiting flow apart so each module owns what it's best at — and AI stays controllable, explainable and trustworthy.

Pipeline · the full journey of one resume
STEP 01

Fix “we can't even ingest the resumes” first

The easiest part to overlook. The first step isn't AI — it's RPA + OCR + schema normalization: pull from every platform, dedupe, parse PDF / Word / images into one canonical structure.

RPA ingestion · OCR parsing · Field normalization · Dedupe & merge
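As a rough illustration of the dedupe-and-merge step, here is a minimal sketch. The record fields (`email`, `phone`, `source`, `received`) and the identity rule (email first, phone as fallback) are assumptions made for this example; the case does not describe the actual schema or matching logic.

```python
import re


def normalize_phone(raw: str) -> str:
    """Keep digits only, so '138-0000-0000' and '138 0000 0000' compare equal."""
    return re.sub(r"\D", "", raw)


def candidate_key(record: dict) -> str:
    """Hypothetical identity key: lowercased email, else normalized phone."""
    email = record.get("email", "").strip().lower()
    return email or normalize_phone(record.get("phone", ""))


def dedupe(records: list[dict]) -> list[dict]:
    """Merge records sharing a key. Newer non-empty fields overwrite older ones,
    and every source channel is remembered."""
    merged: dict[str, dict] = {}
    for rec in sorted(records, key=lambda r: r["received"]):  # ISO dates sort as strings
        key = candidate_key(rec)
        if not key:
            # Assumption: records with no usable identity are kept separately.
            key = f"unkeyed-{len(merged)}"
        entry = merged.setdefault(key, {"sources": []})
        entry["sources"].append(rec["source"])
        entry.update({k: v for k, v in rec.items() if k != "source" and v})
    return list(merged.values())
```

A known limit of this sketch: a candidate who appears once with only an email and once with only a phone number would not be merged. A production system would need a richer matching rule.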
STEP 02

Semantic search, not keyword search

When the JD says “high-concurrency experience,” the system understands the connections between Redis, Kafka, flash-sale systems, distributed cache and service governance. It matches on “has this person actually done this kind of work,” not “does the word appear.”

Vector recall · Intent understanding · Role semantic graph
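To make the idea of matching on meaning concrete, here is a toy sketch of vector recall with cosine similarity. The three-dimensional vectors are hand-written for illustration and do not come from any real embedding model; the axis meanings, the `recall` function and the `min_sim` threshold are all assumptions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy "embeddings", hand-written for this example only.
# Assumed axes: [backend scale, frontend / UI, data analysis]
RESUME_VECTORS = {
    "Built a flash-sale system, tuned Redis hot keys": [0.9, 0.1, 0.2],
    "Kafka peak shaving for an order pipeline": [0.8, 0.0, 0.4],
    "React component library and design system": [0.1, 0.9, 0.1],
}


def recall(query_vec: list[float], k: int = 2, min_sim: float = 0.5) -> list[str]:
    """Return up to k resume snippets ranked by similarity to the query vector."""
    scored = [(cosine(query_vec, vec), text) for text, vec in RESUME_VECTORS.items()]
    scored.sort(reverse=True)
    return [text for sim, text in scored[:k] if sim >= min_sim]


# Assumed vector for the JD phrase "high-concurrency systems experience".
jd_vector = [1.0, 0.0, 0.2]
matches = recall(jd_vector)
```

Neither recalled snippet contains the words "high concurrency", which is the behaviour a keyword search cannot reproduce.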
STEP 03

Five-dimension scoring — AI doesn't get the deciding vote

We started with the model emitting raw scores. They drifted, they were unstable, and they didn't explain themselves. We moved to rules-led, AI-assisted: weighted scores across skills, experience, contextual depth, logical structure and role fit. AI handles context; rules make the call.

Rule-led · Explainable · Five-dim weighted
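A weighted five-dimension score with a per-dimension breakdown might look like the sketch below. The weights are hypothetical and chosen only for illustration, since the production weights are not given in this case. `score` and `explain_gap` are likewise invented names.

```python
# Hypothetical weights in percent. They must sum to 100.
WEIGHTS = {
    "skills": 30,
    "experience": 15,
    "context_depth": 15,
    "logical_structure": 10,
    "role_fit": 30,
}


def score(sub_scores: dict[str, int]) -> tuple[float, dict[str, float]]:
    """Return the overall score plus each dimension's contribution to it."""
    if set(sub_scores) != set(WEIGHTS):
        raise ValueError("sub_scores must cover exactly the five dimensions")
    contributions = {dim: WEIGHTS[dim] * sub_scores[dim] / 100 for dim in WEIGHTS}
    return sum(contributions.values()), contributions


def explain_gap(a: dict[str, int], b: dict[str, int]) -> list[tuple[str, float]]:
    """Answer 'why is A ranked above B': contribution gaps, largest first."""
    _, ca = score(a)
    _, cb = score(b)
    gaps = [(dim, round(ca[dim] - cb[dim], 2)) for dim in WEIGHTS]
    return sorted(gaps, key=lambda g: abs(g[1]), reverse=True)


# Sub-scores taken from the demo candidate profile in this case.
demo = {"skills": 88, "experience": 74, "context_depth": 81,
        "logical_structure": 70, "role_fit": 85}
total, parts = score(demo)
```

Because the total is a plain sum of contributions, a recruiter can see which dimension moved one candidate above another, rather than being handed a single opaque number.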
STEP 04

Interview-question generation — the part that surprised the client

When a candidate says “I optimized a high-concurrency system,” a non-technical recruiter has nothing to follow up with. The system generates the follow-ups automatically: “how would you handle cache penetration?” “why did you partition Kafka that way?” — questions that actually probe.

Project probes · Architecture probes · Risk probes
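The shape of this step can be sketched with a simple lookup from topics mentioned in a candidate's claim to follow-up questions. This is a simplified stand-in, not how the system generates questions: the `PROBES` table and `generate_probes` are hypothetical, and only the first two questions come from this case (the third is made up for illustration).

```python
import re

# Hypothetical topic-to-question table.
PROBES = {
    "redis": ["How would you handle cache penetration?"],
    "kafka": ["Why did you partition Kafka that way?"],
    "high-concurrency": ["What was the peak load, and how did you measure it?"],
}


def generate_probes(claim: str, limit: int = 3) -> list[str]:
    """Return follow-up questions for every known topic found in the claim."""
    text = claim.lower()
    questions: list[str] = []
    for topic, topic_questions in PROBES.items():
        # Whole-word match, so 'redistribute' does not trigger the 'redis' probes.
        if re.search(rf"\b{re.escape(topic)}\b", text):
            questions.extend(topic_questions)
    return questions[:limit]


probes = generate_probes("I optimized a high-concurrency system with Redis")
```

A fixed table like this only covers topics someone has listed in advance, which is the gap a generative model is meant to fill.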
What the client said afterwards: “Engineering managers used to help HR run the first screen. We can finally split that work apart.”
/ 05 One resume, end to end, in under 10 seconds

This was the moment the client actually “got” AI.

We ran a real (anonymized) resume through the full pipeline: 3 pages of PDF, messy formatting, conversational phrasing — under 10 seconds end to end.

  1. STEP 01 · ~2s

    OCR + structured parsing

    Pulls out skills, projects, timeline, education and work history.

  2. STEP 02 · ~3s

    Semantic understanding

    Maps “Redis + Kafka optimization” to distributed cache, message queue, high-concurrency governance, then vector-matches against the role.

  3. STEP 03 · ~2s

    Five-dimension scoring

    Weighted across skills, experience, contextual depth, logical structure and role fit — rendered as a radar.

  4. STEP 04 · ~3s

    Interview outline generated

    Technical probes + project probes + architecture questions + risk questions — handed straight to the interviewer.

/ Candidate profile · demo
82 / 100 overall match

Java backend · 5 years · distributed experience match

Skills: 88
Experience: 74
Context depth: 81
Logical structure: 70
Role fit: 85
/ 09 Actual system screenshots

Not a mockup.
The system running in production at the client

From ingestion and talent pool to semantic search, deep analysis and interview-question generation: five core screens covering the pipelines in this case.

wavesteam · AI Recruit Console
Multi-channel resume ingestion
STEP 01 · Get the data flowing first

Outlook, Gmail, NetEase Mail, QQ Mail and WeCom — 5 source types connected in one click. RPA keeps watching for new mail, parses attachments, normalizes fields and feeds them downstream. This is where the “we can't even collect the resumes” problem actually gets solved.

  • 5 mail / IM channels integrated
  • RPA polling · auto-ingest
  • Field normalization · auto-dedupe
/ 06 The hard part isn't the AI

It's the engineering —
because real-world recruiting data is very, very messy.

These problems are harder than “call an LLM.” They decide whether an AI system actually survives in production.

01
Image-only resumes
Scans / screenshots / phone photos
02
Duplicate submissions
Same person, many channels, many versions
03
Inconsistent fields across platforms
Education / years of experience / project structure
04
Multiple versions per candidate
External / internal / mass-apply / targeted
05
Non-standard file names
“final-v-final-final.pdf”
06
PDF parsing errors
Fonts / tables / two-column layouts
07
Conversational skill phrasing
“messed with,” “fiddled with,” “touched on”
/ 07 What happened after launch

The client didn't cut any HR roles.
Recruiters moved back to “judging people” — the part that matters.

For the first time, the whole recruiting flow worked as a real human + AI loop.

/ AI handles
Collection · Search · Recall · First-pass screening · Suggestion generation

We handed the first-pass read to the system: rules for stability, models for understanding, scores that explain themselves.

/ HR handles
Final decision · Cultural fit · Risk read · Conversation and follow-through

Recruiters went back to the work that matters — talking to people, judging fit, moving the conversation forward, spotting risk. The part AI doesn't replace.

Time-to-first-review
72h → 10s
Java roles at peak
Resumes added in 2 weeks
12,000+
RPA auto-ingest
Scoring dimensions
5-dim
Rule-led · AI-assisted
New HR headcount
0
Software, not more people
/ 08 Retro takeaway
“Recruiting isn't keyword filtering.
It's about understanding people.”
— internal project retro

What actually worked wasn't “AI replaces HR.” It was “AI does the first-pass read for HR.”

/ Transferable · Same underlying capability

What's reusable underneath isn't “recruiting.”
It's large-scale, unstructured talent-data understanding

Campus recruiting screen · Headhunter talent pool · Blue-collar hiring · Internal mobility recommendations · Flexible-workforce platforms

Engineering Delivery

If you're staring at a concrete engineering problem, we can skip the small talk and start from scope, integrations and milestones.

These projects usually involve business systems, device integration, AI workflows or multi-role back-offices. We assess feasibility against real delivery constraints and give recommendations close to the implementation stage.

Business email
contact@boilingwater.cn
Office
10F, South Tower, Kingkey Yujing Times, Longgang District, Shenzhen


Wavesteam

Wavesteam ships production-grade AI software for B2B teams — mini programs, business systems, AI workflows, industry platforms and long-term engineering support.

© 2026 Wavesteam Technology. All rights reserved.