
Hooks

The frontend uses three custom hooks to manage the ETL pipeline lifecycle, real-time progress tracking, and a client-side demo simulation.

useEtlPipeline

File: hooks/useEtlPipeline.ts

Orchestrates the full ETL pipeline from the client side. Manages file upload, structure analysis, user review, and data extraction as a linear state machine with cancellation support.

State Machine

idle --> uploading --> analyzing --> awaiting_review --> extracting --> complete
            |              |               |                  |
            +--------------+---------------+------------------+--> failed
            |              |               |                  |
            +--------------+---------------+------------------+--> cancelled
State            Description
idle             No pipeline running. Initial state.
uploading        File is being sent to /api/upload via FormData.
analyzing        Structure analysis in progress via /api/analyze.
awaiting_review  Pipeline paused. Structure result is available for user review. The user must call confirmStructure() to continue.
extracting       Data extraction in progress via /api/extract.
complete         Pipeline finished successfully.
failed           An error occurred at any stage.
cancelled        User cancelled via the cancel() method.
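The legal transitions can be sketched as a pure lookup (a minimal sketch; the hook's actual state handling in useEtlPipeline.ts is not reproduced here, and the helper names are assumptions):

```typescript
type PipelineState =
  | "idle" | "uploading" | "analyzing" | "awaiting_review"
  | "extracting" | "complete" | "failed" | "cancelled";

// Hypothetical transition table: failed and cancelled are reachable
// from every active state; terminal states return to idle via reset().
const NEXT: Record<PipelineState, PipelineState[]> = {
  idle: ["uploading"],
  uploading: ["analyzing", "failed", "cancelled"],
  // analyzing may skip review and go straight to extracting
  analyzing: ["awaiting_review", "extracting", "failed", "cancelled"],
  awaiting_review: ["extracting", "failed", "cancelled"],
  extracting: ["complete", "failed", "cancelled"],
  complete: ["idle"],
  failed: ["idle"],
  cancelled: ["idle"],
};

function canTransition(from: PipelineState, to: PipelineState): boolean {
  return NEXT[from].includes(to);
}
```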

Return Type: PipelineResult

type FailedStep = "upload" | "analyze" | "extract" | "confirm" | null;

interface PipelineResult {
  state: PipelineState;
  jobId: string | null;
  error: string | null;
  failedStep: FailedStep;
  structureResult: StructureResult | null;
  start: (file: File) => Promise<void>;
  confirmStructure: (overrides?: Record<string, unknown>) => Promise<void>;
  cancel: () => void;
  reset: () => void;
}

StructureResult

interface StructureResult {
  column_mapping?: Record<string, string>;
  column_headers?: Record<string, string>;
  header_row?: number;
  data_start_row?: number;
  data_end_row?: number;
  charge_orientation?: string;
  multi_row_per_unit?: boolean;
  property_name?: string;
  report_date?: string;
  notes?: string;
}

API Calls

Method                        Endpoint                     When              Body
start()                       POST /api/upload             uploading state   FormData with the file
start()                       POST /api/analyze            analyzing state   { job_id }
confirmStructure()            POST /api/structure-confirm  after review      { job_id, structure_overrides? }
confirmStructure() / start()  POST /api/extract            extracting state  { job_id }

All requests include auth headers from getAuthHeaders() and support AbortController cancellation.
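The shared request pattern can be sketched as a small builder (illustrative only; the real getAuthHeaders() and the hook's fetch wiring may differ, so the names below are assumptions):

```typescript
// Hypothetical stand-in for the real getAuthHeaders() helper.
function getAuthHeaders(): Record<string, string> {
  return { Authorization: "Bearer <token>" };
}

interface EtlRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
    signal: AbortSignal;
  };
}

// Builds a cancellable JSON POST for one pipeline phase; the caller
// passes the phase's AbortController so cancel() can abort it.
function buildEtlRequest(
  endpoint: string,
  body: Record<string, unknown>,
  controller: AbortController,
): EtlRequest {
  return {
    url: endpoint,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json", ...getAuthHeaders() },
      body: JSON.stringify(body),
      signal: controller.signal,
    },
  };
}
```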

Key Behaviors

  • Review checkpoint: After analysis, if structure_result exists in the database, the pipeline pauses at awaiting_review. If no structure result is found, it skips directly to extraction.
  • Cancellation: Each phase (analyze, extract) gets its own AbortController, which is cleared at phase boundaries. The cancel() method aborts the current controller, nulls the ref, and updates the Supabase job row to status: 'failed' with stage_message: 'Cancelled by user'.
  • Error extraction: Handles non-JSON error responses from the server (e.g., Vercel 504 timeouts return HTML). Falls back to generic messages for 504 and 502 status codes.
  • Reset: Aborts any in-flight request, clears all state, and returns to idle.
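The error-extraction behavior can be sketched as a pure helper (a sketch under assumed names; the exact fallback messages in the hook may differ):

```typescript
// Derives a user-facing message from a failed response whose body may
// not be JSON (e.g. Vercel gateway timeouts return an HTML page).
function extractErrorMessage(status: number, rawBody: string): string {
  try {
    const parsed = JSON.parse(rawBody);
    if (parsed && typeof parsed.error === "string") return parsed.error;
  } catch {
    // Fall through: body was HTML or plain text, not JSON.
  }
  if (status === 504) return "The server timed out. Please try again.";
  if (status === 502) return "The server is temporarily unavailable.";
  return `Request failed with status ${status}`;
}
```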

useJobProgress

File: hooks/useJobProgress.ts

Subscribes to real-time updates for a specific ETL job via Supabase Realtime. Maintains a rolling list of milestone messages for the timeline UI.

Usage

const progress = useJobProgress(jobId);
// progress.status, progress.progress_pct, progress.milestones, etc.

JobProgress Interface

interface JobProgress {
  status: string;
  progress_pct: number;
  stage_message: string | null;
  property_name: string | null;
  report_date: string | null;
  unit_count: number | null;
  error_count: number;
  warning_count: number;
  errors: string[];
  warnings: string[];
  output_storage_path: string | null;
  original_filename: string | null;
  milestones: Milestone[];
  total_rows: number | null;
  total_chunks: number | null;
  file_size_bytes: number | null;
  extraction_method: string | null;
  spot_check_confidence: number | null;
  spot_check_discrepancies: SpotCheckDiscrepancy[];
  unit_breakdown: UnitBreakdown | null;
  expected_data_rows: number | null;
}

Milestone

interface Milestone {
  message: string;
  pct: number;
  timestamp: number; // Date.now() when received
}

Milestones are accumulated client-side in a ref. Each new stage_message from the database becomes a milestone entry. Duplicate consecutive messages are deduplicated.
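The accumulation and dedup rule can be sketched as a pure helper (illustrative; the function name and signature are assumptions, and the hook actually keeps the list in a ref):

```typescript
interface Milestone {
  message: string;
  pct: number;
  timestamp: number;
}

// Appends a milestone only when the incoming stage_message differs from
// the latest one, mirroring the consecutive-duplicate check above.
function appendMilestone(
  milestones: Milestone[],
  message: string | null,
  pct: number,
  now: number = Date.now(),
): Milestone[] {
  if (!message) return milestones;
  const last = milestones[milestones.length - 1];
  if (last && last.message === message) return milestones;
  return [...milestones, { message, pct, timestamp: now }];
}
```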

SpotCheckDiscrepancy

interface SpotCheckDiscrepancy {
  unit_id: string;
  field: string;
  extracted: unknown;
  expected: unknown;
  severity: "error" | "warning";
  explanation: string;
}

UnitBreakdown

interface UnitBreakdown {
  successful: number;
  flagged: number;
  failed: number;
  total: number;
}

Supabase Realtime Subscription

On mount (when jobId is non-null), the hook:

  1. Fetches initial state — queries etl_jobs for all tracked columns using supabase.from('etl_jobs').select(...).
  2. Subscribes to changes — opens a Supabase Realtime channel named etl_job_{jobId} listening for postgres_changes UPDATE events on the etl_jobs table, filtered by id=eq.{jobId}.
  3. Applies updates — each payload is mapped into the JobProgress shape and a new milestone is appended if the stage_message changed.
  4. Cleans up — on unmount or jobId change, removes the channel via supabase.removeChannel(channel).
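Step 3's payload mapping can be sketched as a pure function (a partial sketch: only a few of the tracked columns are shown, and the row shape is an assumption about the etl_jobs schema):

```typescript
// Partial shape of an etl_jobs row as delivered in a Realtime payload.
interface EtlJobRow {
  status?: string;
  progress_pct?: number;
  stage_message?: string | null;
  error_count?: number;
  errors?: string[];
}

interface PartialProgress {
  status: string;
  progress_pct: number;
  stage_message: string | null;
  error_count: number;
  errors: string[];
}

// Maps snake_case DB columns into the JobProgress shape, applying the
// same defaults as DEFAULT_PROGRESS for missing fields.
function mapRowToProgress(row: EtlJobRow): PartialProgress {
  return {
    status: row.status ?? "pending",
    progress_pct: row.progress_pct ?? 0,
    stage_message: row.stage_message ?? null,
    error_count: row.error_count ?? 0,
    errors: row.errors ?? [],
  };
}
```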

Default State

When no jobId is provided (or on reset), the hook returns DEFAULT_PROGRESS:

const DEFAULT_PROGRESS: JobProgress = {
  status: "pending",
  progress_pct: 0,
  stage_message: null,
  property_name: null,
  report_date: null,
  unit_count: null,
  error_count: 0,
  warning_count: 0,
  errors: [],
  warnings: [],
  output_storage_path: null,
  original_filename: null,
  milestones: [],
  total_rows: null,
  total_chunks: null,
  file_size_bytes: null,
  extraction_method: null,
  spot_check_confidence: null,
  spot_check_discrepancies: [],
  unit_breakdown: null,
  expected_data_rows: null,
};

useDemoMode

File: hooks/useDemoMode.ts

Client-side simulation of the full ETL pipeline. Makes zero API calls — runs entirely from hardcoded step data and timers. Used on the public landing page to demonstrate the pipeline experience.

See the Demo Mode page for full documentation.

Return Type

{
  active: boolean;       // Whether the demo is currently running
  state: string;         // Current simulated pipeline state
  progress: JobProgress; // Full JobProgress object (same shape as real pipeline)
  start: () => void;     // Begin the demo simulation
  reset: () => void;     // Stop and reset to idle
}
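The timer-driven simulation can be sketched as follows (a minimal sketch: the step data, timings, and function names below are assumptions, not the contents of useDemoMode.ts):

```typescript
// Hypothetical hardcoded demo steps: each entry advances the simulated
// pipeline after a delay, with no network calls involved.
interface DemoStep {
  state: string;
  message: string;
  pct: number;
  delayMs: number;
}

const DEMO_STEPS: DemoStep[] = [
  { state: "uploading", message: "Uploading file...", pct: 10, delayMs: 800 },
  { state: "analyzing", message: "Analyzing structure...", pct: 40, delayMs: 1500 },
  { state: "extracting", message: "Extracting units...", pct: 80, delayMs: 2000 },
  { state: "complete", message: "Done", pct: 100, delayMs: 500 },
];

// Plays the steps with setTimeout, invoking onStep for each; returns a
// cancel function that clears all pending timers, mirroring reset().
function playDemo(onStep: (s: DemoStep) => void): () => void {
  const timers: ReturnType<typeof setTimeout>[] = [];
  let elapsed = 0;
  for (const step of DEMO_STEPS) {
    elapsed += step.delayMs;
    timers.push(setTimeout(() => onStep(step), elapsed));
  }
  return () => timers.forEach(clearTimeout);
}
```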