Automated Report Generation System
Developed a backend-driven analytics and automated report generation platform with structured outputs and real-time data processing.
Year
2023
Category
Automation
Stack
NestJS, Flutter, PostgreSQL

Key Results
4 min
Per Report
down from 6–8 hrs
3×
Report Volume
no added headcount
180+
Monthly Reports
fully automated
0
Format Errors
since launch
01
The Challenge
A healthcare consultancy was producing 50+ client reports per month entirely by hand — pulling data from three incompatible systems, formatting in Word, and emailing PDFs. Each report took 6–8 hours. The team couldn't take on new clients without hiring, but hiring wasn't in the budget.
02
What We Built
A NestJS backend with BullMQ job queues for async report generation, a PostgreSQL warehouse normalising inputs from all three source systems, and a WeasyPrint PDF renderer driven by HTML templates. A Flutter app lets account managers trigger and track reports from their phones.
03
Results
Report generation time dropped from 6–8 hours to 4 minutes. Monthly report volume tripled without adding headcount. Formatting errors fell to zero. The system now generates over 180 reports per month automatically.
04
Before & After
Before
Analysts manually exported data from three incompatible systems every report cycle
Each report took 6–8 hours of senior analyst time to compile and format
Word document formatting was inconsistent — every report looked slightly different
Reports were emailed manually — no tracking, no audit trail, no self-serve
After
Scheduled ETL jobs pull, normalise, and warehouse data from all three sources automatically
Reports generated in under 4 minutes via an async job queue
HTML templates with WeasyPrint produce pixel-perfect, brand-consistent PDFs every time
Clients receive a secure download link automatically. Account managers track delivery in the Flutter app
05
How We Built It
We started with data before UI — until the ETL pipeline and warehouse schema were solid, building a report generator on unreliable data would have been wasted effort.
Data audit & warehouse design
Audited all three source systems, mapped their schemas, and designed a normalised PostgreSQL warehouse. Built ETL jobs in NestJS scheduled tasks that run nightly and on-demand.
Report template system
Designed HTML/CSS report templates with the client's brand team. Built the NestJS template engine that injects warehouse data into templates and passes the result to WeasyPrint for PDF rendering.
BullMQ job queue & delivery
Wrapped report generation in a BullMQ job so the API returns immediately with a job ID. Jobs run in workers, upload the finished PDF to S3, and notify the client with a signed download URL.
Flutter management app
Built the Flutter app for triggering ad-hoc reports, monitoring job status via WebSocket, configuring delivery schedules, and viewing report history with download links.
06
System Architecture
A NestJS API handles requests. BullMQ workers do the heavy lifting asynchronously. WeasyPrint runs as a sidecar process. Generated PDFs go to S3 — the app only ever touches signed URLs.
ETL Pipeline
NestJS Scheduled Tasks
Three NestJS services, one per source system, run on a nightly cron. Each pulls incremental data, normalises to the warehouse schema, and upserts to PostgreSQL.
Data Warehouse
PostgreSQL
Normalised schema as the single source of truth. Materialized views pre-compute expensive aggregations so report generation runs fast SELECTs, not heavy joins.
Report API
NestJS REST
Accepts report trigger requests, validates parameters, enqueues a BullMQ job, and returns a job ID immediately. Exposes job status and history endpoints.
Job Queue
BullMQ + Redis
Workers pick up report jobs, query the warehouse, render the HTML template, call WeasyPrint, and upload the PDF to S3. Redis stores job state for real-time status updates.
PDF Renderer
WeasyPrint
Runs as a Python sidecar called via child_process from NestJS. Converts fully-rendered HTML/CSS to PDF with pixel-perfect fidelity to designed templates.
Flutter App
Flutter + Riverpod + WebSocket
Account managers trigger reports, monitor job progress via WebSocket subscription, and download finished PDFs. Riverpod manages real-time job state.
07
Tech Stack
Backend
Queue
Mobile
Database
Storage
08
How We Approached the Problem
The core insight was that this problem was 90% data plumbing and 10% PDF generation. Most teams would reach for a reporting tool. We went the other way — warehouse the data first, then report generation becomes a simple template render against clean, reliable data.
Alternatives considered & rejected
Off-the-shelf reporting tool (Metabase / Looker)
These tools are excellent for internal dashboards but poor for client-facing branded PDFs. The client needed reports that looked like they came from their design team.
Puppeteer for PDF generation
Puppeteer added a full Chromium dependency. WeasyPrint produces equivalent output from pure HTML/CSS with a much smaller footprint and is easier to run in Docker.
Direct DB connections to source systems
Two source systems were SaaS products with rate-limited APIs. The ETL-to-warehouse approach decoupled us from source system availability and rate limits entirely.
09
Data Modelling
The warehouse schema is deliberately denormalised for read performance. Materialized views pre-aggregate KPIs so the template render is a simple SELECT, not a multi-join aggregation.
10
API Layer
ReportService enqueues BullMQ jobs and returns immediately. The ReportProcessor runs in a worker — queries the warehouse, renders the template, generates the PDF, uploads to S3, and updates the job record.
11
Database Functions
Refresh the materialized view after each ETL run and fetch report history with a single query for the Flutter history screen.
12
Frontend Connection
The Flutter app triggers jobs via REST and tracks real-time progress via WebSocket. A Riverpod notifier subscribes to the WebSocket gateway and rebuilds the UI as job state changes — no manual refresh needed.
13
Lessons Learned
Warehouse first, reporting second
Starting with a clean warehouse meant report generation was a 2-day task once the data was right. The ETL work took 3 weeks — that ratio was exactly correct and paid dividends immediately.
Materialized views are underused
Every expensive aggregation timed out in production against real data volumes. Moving them to materialized views — refreshed after each ETL run — dropped report query time from 45 seconds to under 200ms.
WeasyPrint needs explicit page break rules
Tables split across pages at awkward points. Adding CSS page-break-inside: avoid on rows and page-break-before: always on section headers gave clean pagination without custom logic.
BullMQ progress events need throttling
The processor emitted a progress event on every DB row processed, flooding the WebSocket gateway. Throttling to once per second reduced WebSocket traffic by 95% with no noticeable UX difference.
Start a project

