
Face Verification, Matching & Deduplication: The Complete Guide

Face verification is a 1:1 check that confirms two faces are the same person; deduplication extends it to a 1:N database search. Both run on 512-dimensional embeddings through one Taareef API.

Every identity product eventually has to answer two questions from a face: "is this the same person we expect?" and "have we seen this person before?" The first is face verification, a 1:1 check that compares a live selfie against a reference photo. The second is face deduplication, a 1:N search across a database of enrolled faces. Both run on the same underlying idea: turn a face into numbers and measure the distance between them.

Together with liveness detection, these are the building blocks of modern identity verification, KYC, and fraud prevention. This guide walks through each one in plain terms — verification, matching, embeddings, deduplication, and the trade-offs that decide whether a match is accepted — and shows how Taareef does all of it through a single API.

What is face verification?

Face verification is a 1:1 check that confirms whether two faces belong to the same person, by comparing one face against a single reference — for example, a live selfie against the photo on a government ID. The service scores how similar the two faces are, and if that score clears your threshold, the identity is confirmed.

It's a claim check, not a database lookup: verification only confirms or rejects the identity a user has already claimed. That makes it the natural fit for digital onboarding, account recovery, and remote KYC.

[Figure: 1:1 verification. One live face (the selfie, captured now) is compared against one reference face (the ID photo). Example result: 0.97, above threshold, verified.]

Face verification vs. face identification

The difference comes down to how many faces you compare against — one, or many.

|             | Verification (1:1)              | Identification (1:N)                  |
|-------------|---------------------------------|---------------------------------------|
| Question    | "Is this the claimed person?"   | "Who is this, and have we seen them?" |
| Compares    | One face against one reference  | One face against a whole database     |
| Typical use | Selfie-to-ID onboarding, login  | Deduplication, watchlist search       |

What is a face embedding?

Before any faces can be compared, each one has to be turned into something a computer can measure. Face extraction detects the face in an image or video and separates it from the background; face embedding then converts that face's geometry — shape, contours, and key landmark points — into a compact list of numbers, called a vector.

Taareef produces a 512-dimensional face embedding through a single API, along with the face's location (a bounding box), a detection-confidence score, and its key landmark points.

Two photos of the same person produce embeddings that sit close together in vector space; two different people's embeddings sit far apart. That distance is what powers accurate, real-time matching at scale.

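An embedding is not an image, and it can't simply be decoded back into the original photo. Storing embeddings instead of raw images therefore reduces exposure. It does not remove it: published template-inversion attacks can reconstruct approximate faces from embeddings, so stored vectors should still be protected as sensitive biometric data.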

[Figure: Each face becomes a 512-dimensional vector, for example [0.02, -0.15, 0.87, 0.34, …]; similarity is the distance between vectors. Same person: close. Different people: far.]
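The "close together, far apart" idea can be made concrete with a small sketch. It computes cosine similarity with the Python standard library only; the 4-dimensional vectors are made-up stand-ins for real 512-dimensional embeddings, and nothing here reflects how Taareef computes similarity internally.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length embedding")
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors (illustrative values, not real embeddings).
selfie = [0.02, -0.15, 0.87, 0.34]
same_person = [0.03, -0.14, 0.85, 0.36]
stranger = [0.71, 0.40, -0.22, 0.05]

print(cosine_similarity(selfie, same_person))  # close to 1
print(cosine_similarity(selfie, stranger))     # much lower
```

Cosine similarity ignores vector length and looks only at direction, which is one reason it is a common choice for comparing embeddings.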

What does face extraction return?

For every detected face, Taareef returns a complete, structured result you can act on downstream:

  • 512-dimensional embedding — the vector used for matching, comparison, and deduplication
  • Bounding box — the face's location in the image, for cropping, UI overlays, and audit trails
  • Detection score — the model's confidence that a face was found, for quality gating
  • Landmark points — eyes, nose, and contour points, used to align the face before the vector is generated
  • Aligned face crop — a normalized version of the face, useful for review and logging
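One way to carry these fields through your own code is a small typed record. The sketch below is an assumption throughout: the key names (`embedding`, `bounding_box`, `detection_score`, `landmarks`), the bounding-box layout, and the 0.9 quality gate are hypothetical and are not taken from Taareef's response schema. The aligned face crop is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class FaceResult:
    embedding: List[float]                      # 512 floats used for matching
    bounding_box: Tuple[int, ...]               # assumed (x, y, width, height) in pixels
    detection_score: float                      # detector confidence, assumed 0..1
    landmarks: Dict[str, Tuple[float, ...]]     # e.g. {"left_eye": (x, y)}

def parse_face(payload: dict) -> FaceResult:
    """Build a FaceResult from a decoded JSON object (hypothetical key names)."""
    embedding = [float(v) for v in payload["embedding"]]
    if len(embedding) != 512:
        raise ValueError(f"expected 512 dimensions, got {len(embedding)}")
    return FaceResult(
        embedding=embedding,
        bounding_box=tuple(payload["bounding_box"]),
        detection_score=float(payload["detection_score"]),
        landmarks={name: tuple(point) for name, point in payload["landmarks"].items()},
    )

def passes_quality_gate(face: FaceResult, min_score: float = 0.9) -> bool:
    """Reject low-confidence detections before they reach matching."""
    return face.detection_score >= min_score
```

Validating the embedding length at the boundary keeps a truncated or malformed response from silently producing bad matches later.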

Can you extract a face from a video?

Yes. Taareef samples frames from a video, runs face detection on each, and selects the best frame to work from. Once a face is found, the same extraction pipeline applies — bounding box, landmark points, detection score, and embedding — which is what makes video-based liveness checks and identity verification from recorded clips possible.
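The sample-then-select step might look like the sketch below. The text does not say how the "best" frame is chosen, so picking the highest detection score is an assumption, as are the fixed sampling step and the function names.

```python
from typing import Iterable, Optional, Tuple

def sample_indices(total_frames: int, step: int) -> range:
    """Indices of every `step`-th frame, starting at frame 0."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return range(0, total_frames, step)

def best_frame(scores: Iterable[Tuple[int, Optional[float]]]) -> Optional[int]:
    """Return the index of the frame with the highest detection score.

    `scores` yields (frame_index, detection_score) pairs; a score of None
    means no face was detected in that frame. Returns None when no frame
    contained a face.
    """
    best_index, best_score = None, None
    for index, score in scores:
        if score is None:
            continue
        if best_score is None or score > best_score:
            best_index, best_score = index, score
    return best_index

# Made-up per-frame detection scores for a 40-frame clip sampled every 10 frames.
detections = {0: None, 10: 0.62, 20: 0.97, 30: 0.88}
chosen = best_frame((i, detections[i]) for i in sample_indices(40, 10))
print(chosen)  # 20
```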

What is face matching?

Face matching determines whether two faces belong to the same person by comparing their embeddings. Given a selfie (image or video) and a document, the service detects and embeds both faces, measures their similarity using cosine similarity, and returns a normalized confidence score between 0 and 1 — the higher the score, the more similar the faces. The response also reports which detector and recognition model produced the result.
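Cosine similarity ranges from -1 to 1, so a score between 0 and 1 implies some rescaling. The source does not say which normalization Taareef applies; the linear mapping below is one common choice and is shown purely as an assumption, with hypothetical function names.

```python
def normalized_score(cosine: float) -> float:
    """Map cosine similarity from [-1, 1] onto [0, 1] (assumed linear rescale)."""
    clipped = max(-1.0, min(1.0, cosine))  # guard against floating-point drift
    return (clipped + 1.0) / 2.0

def is_match(score: float, threshold: float) -> bool:
    """Accept the pair when the normalized score reaches the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be within [0, 1]")
    return score >= threshold
```

Under this mapping a cosine similarity of 0.94 becomes a score of about 0.97, which would pass a threshold of 0.9, while a cosine similarity of 0.2 becomes 0.6 and would not.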

What is a good match threshold?

There's no single universal threshold: the right one depends on your metric (with cosine similarity a higher value means a closer match, with Euclidean distance a lower one does) and on your risk tolerance. The common practice is to set it against an acceptable False Accept Rate (FAR). A stricter threshold produces fewer false accepts at the cost of more false rejects; a more lenient one does the reverse. Tune it on your own data, review it over time, and pair it with liveness detection so a high score alone can't wave through a spoof.
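Setting a threshold against a target FAR can be sketched as a search over candidate thresholds using labeled score samples. Everything below is illustrative: the scores are made up, the 0.01 grid is an arbitrary choice, and the helper names are hypothetical. Both score lists are assumed to be non-empty and on a 0 to 1 scale where higher means more similar.

```python
from typing import Sequence

def false_accept_rate(impostor_scores: Sequence[float], threshold: float) -> float:
    """Fraction of different-person pairs that would be accepted."""
    return sum(s >= threshold for s in impostor_scores) / len(impostor_scores)

def false_reject_rate(genuine_scores: Sequence[float], threshold: float) -> float:
    """Fraction of same-person pairs that would be rejected."""
    return sum(s < threshold for s in genuine_scores) / len(genuine_scores)

def threshold_for_far(impostor_scores: Sequence[float], target_far: float,
                      steps: int = 100) -> float:
    """Lowest threshold on an evenly spaced grid whose FAR is within target_far."""
    for i in range(steps + 1):
        threshold = i / steps
        if false_accept_rate(impostor_scores, threshold) <= target_far:
            return threshold
    raise ValueError("no threshold on the grid meets the target FAR")

# Made-up scores from labeled pairs in a validation set.
impostor = [0.12, 0.30, 0.41, 0.55, 0.62, 0.48, 0.20, 0.35, 0.71, 0.27]
genuine = [0.91, 0.88, 0.97, 0.69, 0.93, 0.85, 0.95, 0.78, 0.99, 0.90]

lenient = threshold_for_far(impostor, target_far=0.1)
strict = threshold_for_far(impostor, target_far=0.0)
print(lenient, false_reject_rate(genuine, lenient))
print(strict, false_reject_rate(genuine, strict))
```

On this toy data the stricter FAR target pushes the threshold up and starts rejecting a genuine pair, which is the trade-off in miniature. Real tuning needs far more pairs than ten per class for the rates to mean anything.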

What is face deduplication?

Face deduplication extends matching from 1:1 to a 1:N search — scanning a new face against every face already enrolled to see if it's a repeat. It's how you stop one person from enrolling twice: duplicate accounts, sign-up-bonus abuse, and re-registration under a new name all get caught before they're created.

[Figure: 1:N deduplication. The query face from a new sign-up is searched against every enrolled face to catch a repeat; when a duplicate is found, re-enrollment is blocked.]
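A minimal sketch of the 1:N idea is a linear scan: compare the query embedding with every enrolled embedding and keep the best hit if it clears a threshold. This is not a description of how Taareef searches; the function names, the in-memory dictionary, and the toy 4-dimensional vectors are all assumptions. A brute-force scan costs time proportional to the number of enrolled faces, so large galleries are usually served by a nearest-neighbour index instead.

```python
import math
from typing import Dict, List, Optional, Tuple

def unit(vector: List[float]) -> List[float]:
    """Scale a vector to length 1 so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("zero-length embedding")
    return [x / norm for x in vector]

def find_duplicate(query: List[float], enrolled: Dict[str, List[float]],
                   min_cosine: float) -> Optional[Tuple[str, float]]:
    """Return (enrolled_id, cosine) for the closest face, or None if none is close enough."""
    q = unit(query)
    best_id, best_sim = None, -math.inf
    for person_id, embedding in enrolled.items():
        sim = sum(a * b for a, b in zip(q, unit(embedding)))
        if sim > best_sim:
            best_id, best_sim = person_id, sim
    if best_id is not None and best_sim >= min_cosine:
        return best_id, best_sim
    return None

# Toy gallery with hypothetical IDs and made-up vectors.
gallery = {
    "user-1": [0.71, 0.40, -0.22, 0.05],
    "user-2": [0.03, -0.14, 0.85, 0.36],
}
print(find_duplicate([0.02, -0.15, 0.87, 0.34], gallery, min_cosine=0.8))
print(find_duplicate([-1.0, 0.0, 0.0, 0.0], gallery, min_cosine=0.8))  # None
```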

What happens when a face is hard to detect?

Poor lighting, sharp angles, occlusions, low resolution, or motion blur can push detection confidence down — or return no detection at all. Forcing a match on a bad capture risks unreliable embeddings and false rejects. To keep low-quality selfies and scans from being thrown out unfairly, Taareef falls back to a secondary detector, giving borderline captures a fair chance at matching without lowering the bar on accuracy.
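The fallback pattern can be sketched as trying detectors in order until one returns a usable face. The trigger condition here (no detection, or a score under `min_score`) and the 0.5 default are assumptions; the source does not specify when the secondary detector takes over. The detectors are passed in as plain callables so the sketch stays independent of any real detection library.

```python
from typing import Callable, Optional, Tuple

# A detector takes raw image bytes and returns a detection score, or None for no face.
Detector = Callable[[bytes], Optional[float]]

def detect_with_fallback(image: bytes, primary: Detector, secondary: Detector,
                         min_score: float = 0.5) -> Optional[Tuple[str, float]]:
    """Try the primary detector first, then the secondary one.

    Returns (detector_name, score) for the first usable detection,
    or None if neither detector finds a face that clears min_score.
    """
    for name, detector in (("primary", primary), ("secondary", secondary)):
        score = detector(image)
        if score is not None and score >= min_score:
            return name, score
    return None
```

Returning the detector name alongside the score lets downstream code log which path produced the result.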

From matching to a full KYC check

A face match shows that two faces belong to the same person, not that a real person is present. That gap is where presentation attacks and deepfakes live, so Taareef runs passive liveness detection and deepfake and spoof protection on the same endpoint as verification.

Because face verification and document OCR share that endpoint, a selfie-to-ID KYC flow — read the Emirates ID, match the cardholder photo to a live selfie, confirm liveness — becomes a couple of API calls against one vendor instead of a stitched-together stack.
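However the calls are made, the final decision should require every signal to pass, so a high match score alone never approves an applicant. The sketch below shows only that combining logic; the inputs are assumed to have been extracted from the API responses already, and the names and reason strings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class KycDecision:
    approved: bool
    reasons: List[str] = field(default_factory=list)  # why the check failed, if it did

def decide_kyc(document_read: bool, match_score: float, is_live: bool,
               threshold: float) -> KycDecision:
    """Approve only when the document was read, the faces match, and liveness passed."""
    reasons = []
    if not document_read:
        reasons.append("document could not be read")
    if match_score < threshold:
        reasons.append("face match below threshold")
    if not is_live:
        reasons.append("liveness check failed")
    return KycDecision(approved=not reasons, reasons=reasons)

# A strong match on a spoofed selfie is still rejected.
print(decide_kyc(document_read=True, match_score=0.97, is_live=False, threshold=0.9))
```

Collecting every failure reason, rather than stopping at the first, gives reviewers and audit logs the full picture.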

Where face verification is used

  • Digital onboarding & KYC — match a selfie to an ID photo so account opening finishes remotely, in one sitting.
  • Account recovery — re-verify a returning user by face instead of leaning on knowledge-based questions.
  • Access control — confirm identity at a door, kiosk, or device without a card or PIN.
  • Duplicate-account prevention — run 1:N deduplication at sign-up to block repeat and bonus-abuse enrollments.
  • Remote identity proofing — verify identity for high-value actions without an in-person visit.

Why it matters

Face verification, matching, and deduplication aren't three separate integrations; they're the same embedding pipeline pointed at different questions. Verification asks "is this the claimed person?", deduplication asks "have we seen them before?", and liveness asks "is a real person here right now?"

Answer all three through one API, and identity verification stops being a chain of vendors and becomes a step in your onboarding flow — accurate, tunable to your own risk tolerance, and fast enough to run in real time.

Ready to integrate?

Get 100 free credits and turn any identity document into structured JSON — one API, 195+ countries, no credit card.