POST /api/spot-check

Re-runs the spot-check audit on an already-extracted job without re-running extraction. Useful when you want to re-validate extraction accuracy after a pipeline update or to get a fresh confidence score.

Request


{
  "job_id": "uuid-string"
}

Headers: Authorization: Bearer <token>

Max duration: 120 seconds (Vercel serverless timeout)
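
As a rough illustration of the request above, the following sketch builds the URL, headers, and body for a call to this endpoint. The helper name `buildSpotCheckRequest`, the `baseUrl` parameter, and the `Content-Type` header are assumptions made for the example; only the path, the `job_id` body field, and the `Authorization: Bearer <token>` header come from this page.

```typescript
// Hypothetical helper (not part of the API): assembles the pieces of a
// POST /api/spot-check request so they can be handed to any HTTP client.
interface SpotCheckRequestParts {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildSpotCheckRequest(
  baseUrl: string, // assumed deployment origin, e.g. "https://example.test"
  token: string,
  jobId: string,
): SpotCheckRequestParts {
  return {
    // Strip trailing slashes so the path is not doubled.
    url: `${baseUrl.replace(/\/+$/, "")}/api/spot-check`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        // Assumption: the endpoint expects a JSON body with this content type.
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ job_id: jobId }),
    },
  };
}
```

The result can be passed to `fetch(url, init)`. Because the function may run for up to 120 seconds, a caller would probably want a client-side timeout at least that long.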

Prerequisites

The job must have an output_storage_path (i.e., /api/extract must have completed successfully). If there is no output file, the endpoint returns a 400 error.

Processing Steps

1. Authenticate via checkAuth().
2. Fetch the job and verify that it has an output file.
3. Download the original file from Supabase Storage and parse it into a CellGrid.
4. Download the output file and read the extracted rows from the Rent Roll sheet via the XLSX parser.
5. Run the spot-check: call spotCheckExtraction() to independently re-extract a sample of units and compare them against the existing output.
6. Update the job: merge the new spot-check results into structure_result and update warnings.
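
The comparison in the spot-check step can be pictured with a small sketch. This page does not document the internals of spotCheckExtraction(), so everything below is hypothetical: the `UnitRow` and `Discrepancy` shapes, the `compareSample` name, and the idea of matching rows by a unit key. It only illustrates comparing re-extracted sample rows against existing output rows field by field, and it does not attempt to show how the confidence score is derived.

```typescript
// Hypothetical row shape: column name -> extracted value.
type UnitRow = Record<string, string | number | null>;

interface Discrepancy {
  unit: string;
  field: string;
  expected: string | number | null; // value from the independent re-extraction
  actual: string | number | null; // value found in the existing output
}

// Compares re-extracted sample rows against the existing output rows,
// matching rows by a unit key and reporting every field that differs.
function compareSample(
  reextracted: UnitRow[],
  existing: UnitRow[],
  unitKey: string,
): Discrepancy[] {
  const byUnit = new Map<string, UnitRow>();
  for (const row of existing) {
    byUnit.set(String(row[unitKey]), row);
  }

  const discrepancies: Discrepancy[] = [];
  for (const sample of reextracted) {
    const unit = String(sample[unitKey]);
    const current = byUnit.get(unit);
    for (const field of Object.keys(sample)) {
      if (field === unitKey) continue;
      const expected = sample[field] ?? null;
      // A unit missing from the output counts as null for every field.
      const actual = current ? (current[field] ?? null) : null;
      if (expected !== actual) {
        discrepancies.push({ unit, field, expected, actual });
      }
    }
  }
  return discrepancies;
}
```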

Job Updates

The following are updated on the job row:

structure_result._spot_check_confidence: the new confidence score
structure_result._spot_check_discrepancies: the new discrepancy list
warnings: any previous spot-check warnings are removed; a new spot-check warning is added only if confidence < 50
warning_count: recalculated from the updated warnings array
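
The warning update can be sketched as follows. The `JobWarning` shape and the `"spot_check"` type tag are assumptions made for illustration; the actual job row may identify spot-check warnings differently. The threshold of 50 and the recalculation of warning_count come from this page.

```typescript
// Hypothetical warning shape: the real job row may identify spot-check
// warnings in another way (for example, by a message prefix).
interface JobWarning {
  type: string;
  message: string;
}

const SPOT_CHECK_WARNING_TYPE = "spot_check"; // assumed tag, not from the source
const CONFIDENCE_THRESHOLD = 50; // documented: warn when confidence < 50

function updateSpotCheckWarnings(
  existing: JobWarning[],
  confidence: number,
): { warnings: JobWarning[]; warning_count: number } {
  // Always drop warnings left over from a previous spot-check run.
  const warnings = existing.filter((w) => w.type !== SPOT_CHECK_WARNING_TYPE);

  // Add a fresh warning only when the new score is below the threshold.
  if (confidence < CONFIDENCE_THRESHOLD) {
    warnings.push({
      type: SPOT_CHECK_WARNING_TYPE,
      message: `Spot-check confidence is ${confidence}`, // placeholder wording
    });
  }

  // warning_count is recalculated from the updated array.
  return { warnings, warning_count: warnings.length };
}
```

Under these assumptions, a job that previously carried a spot-check warning and now scores 92 ends up with only its unrelated warnings, while a score of 49 replaces the old warning with a new one.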

Response

Success (200):


{
  "job_id": "uuid-string",
  "spot_check_confidence": 92,
  "discrepancies": 1,
  "sample_size": 8
}

| Field | Type | Description |
| --- | --- | --- |
| `job_id` | string | The job UUID |
| `spot_check_confidence` | number | 0–100 confidence score from the audit |
| `discrepancies` | number | Count of discrepancies found (not the full discrepancy objects) |
| `sample_size` | number | Number of units sampled for the audit |
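
For clients written in TypeScript, the success payload could be modeled roughly as below. The interface and guard names are illustrative, not something the API exports; the fields and the 0 to 100 range follow the description above.

```typescript
// Mirrors the documented 200 response fields.
interface SpotCheckSuccess {
  job_id: string;
  spot_check_confidence: number; // 0-100
  discrepancies: number; // a count, not the discrepancy objects
  sample_size: number;
}

// Runtime check that a parsed JSON value has the documented shape.
function isSpotCheckSuccess(value: unknown): value is SpotCheckSuccess {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const confidence = v["spot_check_confidence"];
  return (
    typeof v["job_id"] === "string" &&
    typeof confidence === "number" &&
    confidence >= 0 &&
    confidence <= 100 &&
    typeof v["discrepancies"] === "number" &&
    typeof v["sample_size"] === "number"
  );
}
```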

Validation Errors

| Check | Status | Message |
| --- | --- | --- |
| Missing `job_id` | 400 | `Missing job_id` |
| Job not found | 500 | Error message from database or `"Job not found"` |
| No output file | 400 | `No output file — run extraction first` |
| No rows in output | 400 | `No rows in output file` |
| Missing Rent Roll sheet | 500 | `Output file missing 'Rent Roll' sheet` |
| Bad/missing token | 401 | `Unauthorized` |

Error (500):


{
  "error": "Something went wrong. Please try again."
}
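
The checks that run before any file is downloaded can be summarized in a short sketch. The function name, the `JobLike` shape, and the order of the checks are assumptions made for illustration; the statuses and messages are taken from the Validation Errors table.

```typescript
// Hypothetical view of a job row: only the field this check needs.
interface JobLike {
  output_storage_path: string | null;
}

interface ValidationFailure {
  status: number;
  message: string;
}

// Returns the first failing check, or null when the request can proceed.
// The ordering is assumed; statuses and messages follow the table.
function validateSpotCheck(
  authorized: boolean,
  jobId: string | undefined,
  job: JobLike | null,
): ValidationFailure | null {
  if (!authorized) return { status: 401, message: "Unauthorized" };
  if (!jobId) return { status: 400, message: "Missing job_id" };
  if (!job) return { status: 500, message: "Job not found" };
  if (!job.output_storage_path) {
    return { status: 400, message: "No output file — run extraction first" };
  }
  return null;
}
```

The remaining two checks (no rows in the output, missing Rent Roll sheet) depend on the contents of the output file, so they would come after it has been downloaded and parsed.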

Notes

This endpoint does not re-run extraction. It reads the existing output file and re-audits a sample against the source spreadsheet.
A spot-check warning is recorded only when confidence < 50. If the new confidence is >= 50, any previous spot-check warning is removed and no new one is added.
Error messages are sanitized before returning to the client (same sanitization as /api/extract).