Engineering Case Study · Machine Learning · Computer Vision

Face Recognition Attendance System

Engineering a production-grade face recognition pipeline using InsightFace embeddings and cosine similarity — no model training at deployment time.

September 2025
Devkumar Patel
Face Recognition · InsightFace · Applied ML

Problem

Automate classroom attendance. One photo, 10–15 students recognized in real time, attendance logged with timestamps.

Hard constraints:

  • Only 2–3 enrollment images per student
  • No model retraining at deployment
  • Low false acceptance rate (wrong person marked = unacceptable)

Architecture Decision

Closed-set classifiers (softmax over student IDs) were rejected because they require full retraining for every new student. Chose open-set matching on pretrained embeddings instead: enrolling a student only adds embeddings to storage.

| Stage | Component |
| --- | --- |
| Face Detection | InsightFace (handles alignment internally) |
| Feature Extraction | ArcFace, 512-D L2-normalized embedding |
| Similarity | Cosine similarity |
| Decision | Empirically calibrated threshold |
| Storage | Local serialized embeddings (float32) |
| Output | Timestamped CSV attendance log |
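
The Similarity and Decision stages reduce to a normalized dot product followed by a threshold test. The following is a minimal sketch, assuming embeddings have already been extracted upstream. The function names, the `student_ids` list, and the random stand-in vectors are hypothetical and not the author's code.

```python
from typing import Optional, Sequence, Tuple

import numpy as np


def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def match_face(
    probe: np.ndarray,
    gallery: np.ndarray,
    student_ids: Sequence[str],
    threshold: float,
) -> Tuple[Optional[str], float]:
    """Return (student_id, score) for the best match, or (None, score) if it is below threshold."""
    probe = l2_normalize(np.asarray(probe, dtype=np.float32))
    gallery = l2_normalize(np.asarray(gallery, dtype=np.float32))
    if gallery.shape[0] != len(student_ids):
        raise ValueError("one gallery row is required per student id")
    scores = gallery @ probe            # cosine similarity against every enrolled student
    best = int(np.argmax(scores))
    score = float(scores[best])
    if score >= threshold:
        return student_ids[best], score
    return None, score                  # open-set outcome: nobody enrolled is close enough


if __name__ == "__main__":
    # Random vectors stand in for real face embeddings (assumption for the demo).
    rng = np.random.default_rng(0)
    gallery = rng.normal(size=(3, 512))
    ids = ["student_a", "student_b", "student_c"]
    known = gallery[1] + 0.1 * rng.normal(size=512)   # noisy view of student_b
    stranger = rng.normal(size=512)                   # not enrolled
    print(match_face(known, gallery, ids, threshold=0.40))
    print(match_face(stranger, gallery, ids, threshold=0.40))
```

Returning `None` for a weak best match is what makes the matcher open-set: an unenrolled face is rejected instead of being forced onto the nearest student.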

No training code runs in production. The system is entirely inference + similarity search.
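
The Output stage, a timestamped CSV log with at most one entry per student per date, could be sketched as follows. The `AttendanceLog` class, the column names, and the file layout are assumptions made for illustration; the source only states that the log is a timestamped CSV and that duplicates are suppressed by a per-day in-memory cache.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional, Set, Tuple


class AttendanceLog:
    """Append-only CSV log that records each student at most once per calendar date."""

    def __init__(self, csv_path) -> None:
        self.csv_path = Path(csv_path)
        # In-memory cache of (student_id, ISO date) pairs already written.
        self._seen: Set[Tuple[str, str]] = set()

    def mark(self, student_id: str, when: Optional[datetime] = None) -> bool:
        """Write one row and return True, or return False if already marked that day."""
        when = when or datetime.now()
        key = (student_id, when.date().isoformat())
        if key in self._seen:
            return False
        self._seen.add(key)
        write_header = not self.csv_path.exists()
        with self.csv_path.open("a", newline="") as handle:
            writer = csv.writer(handle)
            if write_header:
                writer.writerow(["student_id", "timestamp"])
            writer.writerow([student_id, when.isoformat(timespec="seconds")])
        return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = AttendanceLog(Path(tmp) / "attendance.csv")
        print(log.mark("student_a"))   # first sighting today
        print(log.mark("student_a"))   # repeat sighting, same date
        print((Path(tmp) / "attendance.csv").read_text())
```

Because the cache lives only in memory, a process restart forgets who was already marked; rebuilding the set from the existing CSV on startup would be one way to close that gap.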


Failures & Fixes

| Failure | Root Cause | Fix |
| --- | --- | --- |
| Wrong person recognized with high confidence | Embeddings not L2-normalized before comparison | Explicit normalization after extraction |
| Recognition quality degraded silently over time | Embeddings stored as Python lists, precision lost on reload | Enforced float32, validated shape on load |
| Same student marked multiple times per session | No temporal deduplication logic | Per-day in-memory cache, one entry per identity per date |
| Worked in tests, failed in real classroom | Test images too clean, no pose/lighting variation | Multi-angle enrollment, averaged prototype embeddings |
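
Three of these fixes concern how embeddings are represented: explicit normalization, averaged prototypes, and a float32 store that is validated on load. Here is a minimal sketch of those three together, assuming NumPy `.npz` files as the serialization format (the source only says "local serialized embeddings"). The helper names and the `EMBEDDING_DIM` constant are hypothetical.

```python
import numpy as np

EMBEDDING_DIM = 512  # ArcFace embedding size stated in the architecture table


def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale vectors to unit length along the last axis."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def build_prototype(enrollment_embeddings) -> np.ndarray:
    """Average a student's enrollment embeddings into one unit-length float32 prototype."""
    shots = l2_normalize(np.asarray(enrollment_embeddings, dtype=np.float32))
    return l2_normalize(shots.mean(axis=0))   # re-normalize: a mean of unit vectors is shorter than 1


def save_gallery(path, student_ids, prototypes) -> None:
    """Persist ids and prototypes, forcing float32 on the way out."""
    np.savez(
        path,
        ids=np.array(list(student_ids)),
        embeddings=np.asarray(prototypes, dtype=np.float32),
    )


def load_gallery(path):
    """Load the gallery and fail loudly if dtype, shape, or norms look wrong."""
    with np.load(path) as data:
        ids = data["ids"].tolist()
        embeddings = data["embeddings"]
    if embeddings.dtype != np.float32:
        raise TypeError(f"expected float32 embeddings, got {embeddings.dtype}")
    if embeddings.shape != (len(ids), EMBEDDING_DIM):
        raise ValueError(f"unexpected embedding shape {embeddings.shape}")
    if not np.allclose(np.linalg.norm(embeddings, axis=1), 1.0, atol=1e-3):
        raise ValueError("stored embeddings are not L2-normalized")
    return ids, embeddings
```

Raising on load turns a silent representation drift into an immediate, visible failure, which fits the pattern in the table where most bugs showed up as quiet quality loss.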

Threshold Calibration

Initial threshold of 0.6 (intuition-based) caused silent false positives — different people matched with high confidence.

Calibration approach:

  1. Compute similarity distributions for same-identity and different-identity pairs
  2. Find the boundary where distributions separate
  3. Bias toward lower FAR (false acceptance), even at cost of higher FRR (false rejection)

Same-identity average similarity:       0.75
Different-identity average similarity:  0.25
Final threshold chosen:                 0.40

The threshold is not the output of a metric alone; it is a deliberate business decision. A wrong match in attendance is worse than a missed one.
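
One way to make the FAR/FRR trade-off explicit is to sweep candidate thresholds over the two score distributions and keep the lowest one whose false acceptance rate stays under a chosen cap. The sketch below does this on synthetic scores. Only the two means (0.75 and 0.25) come from the numbers above; the spread, the sample sizes, the `max_far` value, and the function names are assumptions, so the printed threshold is illustrative and is not expected to reproduce the 0.40 reported here.

```python
import numpy as np


def far_frr(genuine, impostor, threshold):
    """FAR: share of different-identity pairs accepted. FRR: share of same-identity pairs rejected."""
    genuine = np.asarray(genuine, dtype=np.float64)
    impostor = np.asarray(impostor, dtype=np.float64)
    far = float(np.mean(impostor >= threshold))
    frr = float(np.mean(genuine < threshold))
    return far, frr


def pick_threshold(genuine, impostor, max_far, candidates=None):
    """Return the lowest candidate threshold whose FAR is at most max_far.

    FAR never increases as the threshold rises, so the first passing candidate
    in ascending order also gives the smallest FRR under that FAR cap.
    """
    if candidates is None:
        candidates = np.linspace(-1.0, 1.0, 201)   # cosine similarity lies in [-1, 1]
    for threshold in sorted(float(c) for c in candidates):
        far, _ = far_frr(genuine, impostor, threshold)
        if far <= max_far:
            return threshold
    raise ValueError("no candidate threshold satisfies the FAR cap")


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic pair scores; the 0.10 spread is an assumption.
    genuine = np.clip(rng.normal(0.75, 0.10, size=2_000), -1.0, 1.0)
    impostor = np.clip(rng.normal(0.25, 0.10, size=20_000), -1.0, 1.0)
    threshold = pick_threshold(genuine, impostor, max_far=0.001)
    far, frr = far_frr(genuine, impostor, threshold)
    print(f"threshold={threshold:.2f}  FAR={far:.4f}  FRR={frr:.4f}")
```

The business decision then lives in a single number, `max_far`, and the resulting FRR shows what that choice costs in missed attendance.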


Lessons

  1. Embedding-based systems fail silently without similarity distribution validation
  2. Most bugs were representation bugs, not model bugs
  3. Threshold selection is a business constraint, not a purely mathematical one
  4. Production ML is about containing failure modes — not maximizing accuracy numbers

Have a Better Approach?

Open-set face recognition has many valid approaches. If you know a better calibration method, a more robust embedding model for low-resource settings, or have thoughts on liveness detection, I'd love to hear from you.
