Overview

RollFormat

RollFormat converts messy, non-standardized rent roll Excel files into a clean, canonical format. Drop in any rent roll workbook (.xlsx or .xls) and get back a standardized spreadsheet with the same 40+ canonical fields every time, regardless of the source.

How It Works

The pipeline follows a consistent data flow:

File → Sheets → Tables → Units → Clean Output

  • Sheet — a tab in the Excel file. Most rent rolls have 1–3.
  • Table — a contiguous block of rows within a Sheet. One Sheet often has several: unit data, summaries, metadata.
  • Unit — a single apartment or space extracted from a Table. Can be one row or multiple rows depending on layout.
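The Sheet and Table definitions above can be illustrated with a minimal sketch. It assumes a Table boundary is simply a blank row, which is a simplification: the real Scan stage also classifies Tables and may use other signals. The `Row`, `Sheet`, and `Table` types and the `isBlank` and `splitIntoTables` functions are hypothetical names, not part of RollFormat's actual code.

```typescript
// Hypothetical types for illustration; not RollFormat's actual data model.
type Row = (string | number | null)[];

interface Sheet {
  name: string;
  rows: Row[];
}

interface Table {
  startRow: number; // 0-based index of the Table's first row within the Sheet
  rows: Row[];
}

// A row counts as blank when every cell is null or whitespace-only.
function isBlank(row: Row): boolean {
  return row.every((cell) => cell === null || String(cell).trim() === "");
}

// Each maximal run of non-blank rows becomes one Table (assumed rule).
function splitIntoTables(sheet: Sheet): Table[] {
  const tables: Table[] = [];
  let current: Table | null = null;

  for (let index = 0; index < sheet.rows.length; index++) {
    const row = sheet.rows[index];
    if (isBlank(row)) {
      current = null; // a blank row closes the current block
      continue;
    }
    if (current === null) {
      current = { startRow: index, rows: [] };
      tables.push(current);
    }
    current.rows.push(row);
  }
  return tables;
}

const exampleSheet: Sheet = {
  name: "Rent Roll",
  rows: [
    ["Property", "Maple Court"], // metadata block
    [null, null],
    ["Unit", "Rent"],            // unit data block
    ["101", 1200],
    ["102", 1350],
    ["", "  "],
    ["Total", 2550],             // summary block
  ],
};

const exampleTables = splitIntoTables(exampleSheet);
```

Here one Sheet yields three Tables (metadata, unit data, summary), matching the "one Sheet often has several" case described above.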

Pipeline Stages

  1. Upload — User uploads an .xlsx or .xls file (max 10 MB)
  2. Scan — Every Sheet is scanned in parallel to find Tables and classify them
  3. Analyze — Deep structure analysis: column mapping, row boundaries, charge layout detection
  4. Review — User reviews and optionally overrides the detected structure
  5. Extract — Units are extracted via LLM (Claude Sonnet, parallel chunks)
  6. Validate — Per-Unit quality checks: missing IDs, duplicates, charge mismatches
  7. Output — Clean Output XLSX with 40+ canonical fields per Unit, plus a validation report sheet
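As a rough illustration of the Validate stage, the sketch below runs the three named checks (missing IDs, duplicates, charge mismatches) over a list of Units. The `Unit` shape, the field names, the `validateUnits` function, and the 0.01 tolerance are all assumptions made for this example; the real canonical schema has 40+ fields and the actual checks may differ.

```typescript
// Hypothetical, heavily reduced Unit shape; not the real canonical schema.
interface Unit {
  unitId: string | null;
  totalCharges: number | null; // stated total from the rent roll, if any
  charges: { code: string; amount: number }[];
}

type IssueKind = "missing_id" | "duplicate_id" | "charge_mismatch";

interface Issue {
  unitIndex: number;
  kind: IssueKind;
  message: string;
}

// Runs per-Unit checks and returns every issue found, in Unit order.
function validateUnits(units: Unit[], tolerance = 0.01): Issue[] {
  const issues: Issue[] = [];
  const firstSeen = new Map<string, number>(); // trimmed ID -> index of first Unit using it

  units.forEach((unit, unitIndex) => {
    const id = unit.unitId === null ? "" : unit.unitId.trim();
    if (id === "") {
      issues.push({ unitIndex, kind: "missing_id", message: "Unit has no ID" });
    } else if (firstSeen.has(id)) {
      issues.push({
        unitIndex,
        kind: "duplicate_id",
        message: `ID ${id} already used by Unit at index ${firstSeen.get(id)}`,
      });
    } else {
      firstSeen.set(id, unitIndex);
    }

    // Compare itemized charges with the stated total only when a total exists.
    if (unit.totalCharges !== null) {
      const sum = unit.charges.reduce((acc, charge) => acc + charge.amount, 0);
      if (Math.abs(sum - unit.totalCharges) > tolerance) {
        issues.push({
          unitIndex,
          kind: "charge_mismatch",
          message: `Charges sum to ${sum.toFixed(2)} but stated total is ${unit.totalCharges.toFixed(2)}`,
        });
      }
    }
  });

  return issues;
}

const exampleUnits: Unit[] = [
  { unitId: "101", totalCharges: 1250, charges: [{ code: "RENT", amount: 1200 }, { code: "PARK", amount: 50 }] },
  { unitId: null, totalCharges: null, charges: [] },
  { unitId: "101 ", totalCharges: 900, charges: [{ code: "RENT", amount: 950 }] },
];

const exampleIssues = validateUnits(exampleUnits);
```

Returning a flat list of issues, rather than throwing on the first problem, fits the idea of a validation report sheet: every finding can be written out alongside the Clean Output.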

Tech Stack

Layer      Technology
Frontend   Next.js 16 (App Router), React 19, Mantine 8
Backend    Next.js API Routes (serverless)
Database   Supabase (PostgreSQL + Realtime)
Storage    Supabase Storage
AI         Anthropic Claude (Opus for structure, Sonnet for extraction)
Parsing    XLSX.js
Hosting    Vercel

  • Architecture — System design, module inventory, data flow
  • Pipeline — Detailed breakdown of each processing phase
  • Canonical Schema — All 40+ output fields documented
  • API Reference — Every endpoint with request/response formats
  • Database — Schema, migrations, realtime setup
  • Frontend — Hooks, components, demo mode