Case Study

ebia-insights

A production analytics and semantic-search platform — RAG over a real corpus, statistically grounded trends, and data that refreshes without a redeploy.

EB-1A immigration appeals intelligence · Live · 2026

RAG · Vertex AI · Cloud Run · React 19 · TypeScript · GCP

By the numbers

The shape of the system

3,383 · Decisions analyzed
7,682 · Semantic chunks indexed
558 · Denial reasons canonicalized
768-dim · Retrieval embeddings
~8,900 · Lines of TypeScript
6 · API endpoints

Overview

What it is, and the problem it solves

A gated intelligence dashboard over thousands of federal immigration appeal decisions. It pairs natural-language semantic search with criterion-level trend analytics, built as a clean, read-only consumer of an upstream data pipeline.

Practitioners reasoning about EB-1A petitions have no good way to ask, in plain language, what has actually persuaded the appeals office, or to see which arguments are gaining or losing ground over time. The raw decisions are unstructured, voluminous, and continuously updated. The platform turns that corpus into searchable, trend-aware intelligence that stays current without manual intervention.

Engineering

Technical highlights

The decisions and techniques that make the system fast, current, and trustworthy.

Production RAG over a real corpus

Natural-language queries are embedded with Gemini (768-dim, retrieval-tuned), matched against a Vertex AI Vector Search index, then re-ranked with a hybrid relevance-plus-recency score (exponential decay, ~2-year half-life, normalized so semantic relevance still dominates). A lexical fallback keeps search working even when the embedding or vector service is unavailable.
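As a rough illustration of the hybrid relevance-plus-recency score described above, the sketch below blends a semantic similarity with an exponentially decaying recency term. The `Hit` shape, the field names, and the 0.2 recency weight are assumptions made for this example; only the roughly two-year half-life comes from the description.

```typescript
// Hypothetical shape of a vector-search hit; field names are assumptions.
interface Hit {
  id: string;
  similarity: number; // semantic similarity in [0, 1], higher is better
  decidedAt: Date;    // date of the decision the chunk belongs to
}

const MS_PER_DAY = 86400000;
const HALF_LIFE_DAYS = 730; // ~2-year half-life, per the description above
const RECENCY_WEIGHT = 0.2; // assumed weight, small so relevance still dominates

// Exponential decay: 1.0 for a decision made today, 0.5 after one half-life.
function recencyScore(decidedAt: Date, now: Date): number {
  const ageDays = Math.max(0, (now.getTime() - decidedAt.getTime()) / MS_PER_DAY);
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

// Blend the two signals and return a new array sorted best-first.
function rerank(hits: Hit[], now: Date = new Date()): Hit[] {
  const score = (h: Hit): number =>
    (1 - RECENCY_WEIGHT) * h.similarity +
    RECENCY_WEIGHT * recencyScore(h.decidedAt, now);
  return [...hits].sort((a, b) => score(b) - score(a));
}
```

With these weights, recency mostly acts as a tie-breaker: two equally relevant chunks are ordered newest first, while a clearly more relevant older decision still outranks a weakly relevant recent one.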

Zero-redeploy data architecture

Every request checks a tiny version manifest; when the upstream pipeline publishes a new snapshot, the API loads the JSONL artifacts from object storage in parallel into in-memory maps and hot-swaps them. Fresh data goes live in seconds with no deploy, no cache-bust, and no downtime.
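The manifest check and hot-swap can be sketched roughly as follows. This is a minimal sketch, not the project's code: `ArtifactSource`, `ArtifactStore`, the artifact names, and the assumption that every JSONL row carries an `id` field are all hypothetical. In a real deployment the source would read from object storage; here it is an injected interface so the swap logic stands alone.

```typescript
// Hypothetical snapshot shape; the three maps mirror cases, reasons, and chunks.
interface Snapshot {
  version: string;
  cases: Map<string, unknown>;
  reasons: Map<string, unknown>;
  chunks: Map<string, unknown>;
}

// Assumed storage abstraction (e.g. a thin wrapper around an object-storage client).
interface ArtifactSource {
  readManifestVersion(): Promise<string>;
  readJsonl(name: string, version: string): Promise<string>;
}

// Parse JSONL text into a map keyed by each row's `id` (an assumed field).
function parseJsonl(text: string): Map<string, unknown> {
  const out = new Map<string, unknown>();
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue;
    const row = JSON.parse(line) as { id: string };
    out.set(row.id, row);
  }
  return out;
}

class ArtifactStore {
  private readonly source: ArtifactSource;
  private current: Snapshot | null = null;
  private loading: Promise<Snapshot> | null = null;

  constructor(source: ArtifactSource) {
    this.source = source;
  }

  // Called per request: cheap manifest check, full reload only on a version change.
  async get(): Promise<Snapshot> {
    const version = await this.source.readManifestVersion();
    if (this.current !== null && this.current.version === version) {
      return this.current;
    }
    if (this.loading === null) {
      // Concurrent requests share one in-flight load instead of each reloading.
      this.loading = this.load(version).finally(() => {
        this.loading = null;
      });
    }
    return this.loading;
  }

  private async fetchMap(name: string, version: string): Promise<Map<string, unknown>> {
    return parseJsonl(await this.source.readJsonl(name, version));
  }

  private async load(version: string): Promise<Snapshot> {
    // Fetch and parse the three artifacts in parallel.
    const [cases, reasons, chunks] = await Promise.all([
      this.fetchMap("cases", version),
      this.fetchMap("reasons", version),
      this.fetchMap("chunks", version),
    ]);
    const next: Snapshot = { version, cases, reasons, chunks };
    this.current = next; // single reference swap: readers see old or new, never a mix
    return next;
  }
}
```

Building the new maps off to the side and swapping one reference at the end is what keeps in-flight requests consistent: a request that already holds the old snapshot finishes against it, and the next request picks up the new one.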

Shipped full-stack, solo

A React 19 + Vite SPA, an Express API on Cloud Run, Firebase authentication, and a Firestore beta-access gate enforced with atomic transactions — plus a custom design system with light/dark themes and reduced-motion support. One person, the whole stack, in production.

Analytics you can defend

Criterion trends are classified with a two-proportion z-test on volume-normalized shares (not raw counts), compared over fixed 12-month windows to eliminate partial-year bias, and gated on significance, sample size, and effect size — producing rising / falling / steady labels that hold up to scrutiny.
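One way to picture the trend gate is the sketch below: a pooled two-proportion z-test on the share of decisions citing a criterion in two 12-month windows, followed by the three gates. The thresholds (`Z_CRITICAL`, `MIN_TOTAL`, `MIN_EFFECT`) and all names are assumptions for illustration; the description above does not state the actual cutoffs.

```typescript
// Counts for one fixed 12-month window (hypothetical shape).
interface WindowCounts {
  hits: number;  // decisions in the window that cite the criterion
  total: number; // all decisions in the window
}

type Trend = "rising" | "falling" | "steady";

const Z_CRITICAL = 1.96; // assumed: two-sided test at the 5% level
const MIN_TOTAL = 30;    // assumed minimum decisions per window
const MIN_EFFECT = 0.03; // assumed minimum absolute change in share

// Pooled two-proportion z statistic for the change in share (curr minus prev).
function twoProportionZ(prev: WindowCounts, curr: WindowCounts): number {
  const p1 = prev.hits / prev.total;
  const p2 = curr.hits / curr.total;
  const pooled = (prev.hits + curr.hits) / (prev.total + curr.total);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / prev.total + 1 / curr.total));
  return se === 0 ? 0 : (p2 - p1) / se;
}

// A label other than "steady" requires all three gates to pass.
function classifyTrend(prev: WindowCounts, curr: WindowCounts): Trend {
  if (prev.total < MIN_TOTAL || curr.total < MIN_TOTAL) return "steady"; // sample size
  const diff = curr.hits / curr.total - prev.hits / prev.total;
  if (Math.abs(diff) < MIN_EFFECT) return "steady";                      // effect size
  if (Math.abs(twoProportionZ(prev, curr)) < Z_CRITICAL) return "steady"; // significance
  return diff > 0 ? "rising" : "falling";
}
```

Working on shares rather than raw counts means a criterion is not labeled rising merely because more decisions were published in the later window.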

Architecture

How a request flows

Edge
Firebase Hosting serves the SPA over a CDN and rewrites /api/** to Cloud Run.
App
React 19 SPA — React Router, TanStack Query for server-state caching, Framer Motion transitions.
API
Express on Cloud Run: Firebase ID-token auth, a Firestore-backed access gate, and a small set of analytics + search endpoints.
Retrieval
Gemini query embeddings → Vertex AI Vector Search, with recency-aware re-ranking.
Data
A version-aware artifact store loads cases, reasons, and chunks from Cloud Storage and caches them in memory.
State
Firestore tracks beta-access seats with first-come-first-served atomic grants.
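The first-come-first-served seat grant in the State layer can be sketched as a single transaction that reads the seat and the counter, then writes both or neither. The `Tx` and `Db` interfaces below are assumptions loosely modeled on the shape of a Firestore transaction, not the Admin SDK's own types, and the document paths and `claimSeat` name are hypothetical.

```typescript
// Minimal transaction surface (assumed for illustration, not the Admin SDK types).
interface Tx {
  get(path: string): Promise<Record<string, unknown> | undefined>;
  set(path: string, data: Record<string, unknown>): void;
}

interface Db {
  runTransaction<T>(fn: (tx: Tx) => Promise<T>): Promise<T>;
}

type GrantResult = "granted" | "already-granted" | "full";

// Hypothetical document paths.
const COUNTER_PATH = "meta/betaCounter";
const seatPath = (uid: string): string => `betaSeats/${uid}`;

async function claimSeat(db: Db, uid: string, capacity: number): Promise<GrantResult> {
  return db.runTransaction<GrantResult>(async (tx) => {
    // All reads happen before any write, as Firestore transactions require.
    const seat = await tx.get(seatPath(uid));
    const counter = await tx.get(COUNTER_PATH);

    if (seat !== undefined) return "already-granted";

    const raw = counter !== undefined ? counter["taken"] : undefined;
    const taken = typeof raw === "number" ? raw : 0;
    if (taken >= capacity) return "full";

    // Both writes commit together or not at all, so two racing users
    // cannot both claim the last seat.
    tx.set(seatPath(uid), { grantedAt: Date.now() });
    tx.set(COUNTER_PATH, { taken: taken + 1 });
    return "granted";
  });
}
```

The guarantee comes from the transaction, not from this function: if another request changes the counter between the read and the commit, the database is expected to retry or reject the transaction, which is what makes the check-then-write safe.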

Stack

Tech stack

Frontend

React 19 · TypeScript · Vite 6 · React Router 7 · TanStack Query · Tailwind CSS · Radix UI · Framer Motion

Backend

Node 22 · Express 5 · Zod · Firebase Admin SDK

AI / Retrieval

Gemini embeddings (768-dim) · Vertex AI Vector Search

Cloud / Infra

Google Cloud Run · Firebase Hosting · Firestore · Cloud Storage

Quality

Vitest · Supertest · TypeScript strict mode

Why this matters for your team

What this project demonstrates

Applied AI that survives contact with real data

I build retrieval systems — embeddings, vector search, hybrid ranking, sensible fallbacks — that work on messy production data, not curated notebook demos.

Architecture designed for operations

Data that refreshes without redeploys, least-privilege service accounts, atomic access control, and memory and cost sized to the workload. I design for the day after launch.

End-to-end delivery

SPA, API, authentication, infrastructure, tests, and documentation — shipped by one engineer. A single engagement can move the whole system, not just one layer.

Next step

Building something similar?

Bring the problem, the constraints, and the timeline. I’ll help you design and ship it.

Start an Inquiry