PBC Index

Behind the scenes

What happens when you search

You type a company name. Seconds later it is either in the index, with its sources, or you get a plain reason why not. Here is the whole journey, in five steps.

You → Check → Find → Decide → Publish

01

We check you're real

A quick, invisible check makes sure a real person is asking, not a bot or a flood of traffic.

Vercel BotID · Firestore fixed-window rate limit, 8/min per IP
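
A fixed-window limit of 8 requests per minute per IP can be sketched as follows. This is a minimal illustration, not the site's code: an in-memory `Map` stands in for the Firestore store, and the names `allowRequest`, `WindowCounter`, and `counters` are hypothetical.

```typescript
// Fixed-window rate limiter sketch. An in-memory Map stands in for the
// Firestore collection; all names here are hypothetical.
interface WindowCounter {
  windowStart: number; // start of the current window, ms since epoch
  count: number; // requests seen so far in that window
}

const WINDOW_MS = 60_000; // one minute
const LIMIT = 8; // 8 requests per window per IP, as stated on the page

const counters = new Map<string, WindowCounter>();

function allowRequest(ip: string, now: number = Date.now()): boolean {
  // Align windows to fixed clock boundaries (the "fixed" in fixed-window).
  const windowStart = Math.floor(now / WINDOW_MS) * WINDOW_MS;
  const current = counters.get(ip);

  if (!current || current.windowStart !== windowStart) {
    // First request in a new window: start counting from 1.
    counters.set(ip, { windowStart, count: 1 });
    return true;
  }
  if (current.count >= LIMIT) {
    return false; // over the limit until the next window begins
  }
  current.count += 1;
  return true;
}
```

Against a shared store such as Firestore, the read-then-increment step would presumably need to run in a transaction so that concurrent requests cannot both slip under the limit.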

02

Maybe we already know

If the company is already listed, or we looked it up recently, you get the answer instantly.

Firestore lookups cache · 30-day TTL · keyed by normalized name
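
The cache lookup hinges on two things: a stable key derived from the company name, and a freshness check against the 30-day TTL. The page does not say how names are normalized, so the rule below (strip accents, lowercase, collapse punctuation) is an assumption, as are the names `normalizeName`, `CachedLookup`, and `isFresh`.

```typescript
const TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days, as stated on the page

// Hypothetical normalization rule: the real one is not documented.
function normalizeName(name: string): string {
  return name
    .normalize("NFKD") // split accented letters into base + combining mark
    .replace(/[\u0300-\u036f]/g, "") // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, " ") // punctuation and symbols become spaces
    .trim()
    .replace(/\s+/g, "-"); // single hyphen between words
}

interface CachedLookup {
  key: string; // normalized company name
  storedAt: number; // ms since epoch when the result was cached
  result: unknown; // whatever the classifier returned
}

function isFresh(entry: CachedLookup, now: number = Date.now()): boolean {
  return now - entry.storedAt < TTL_MS;
}
```

With this rule, "Patagonia, Inc." and "patagonia inc" map to the same key, so a second search for either spelling can be answered from the cache.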

03

Exa reads the live web

For a new company, Exa searches the open web and hands back the six most relevant pages, with their actual text, so we are reading real sources, not guessing. The classifier may only cite pages Exa returned.

Exa Search API (type: auto) + Contents API · text, highlights, summaries · 6 results
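
The rule that the classifier may only cite pages Exa returned can be enforced with a simple allow-list. The sketch below assumes citations are compared by exact URL string; the `SearchResult` shape and the `allowedCitations` helper are hypothetical, not part of exa-js.

```typescript
// Hypothetical shape for one page handed back by the search step.
interface SearchResult {
  url: string;
  title: string;
  text: string;
}

// Keep only citations that point at a page the search actually returned.
function allowedCitations(results: SearchResult[], cited: string[]): string[] {
  const returned = new Set(results.map((r) => r.url));
  return cited.filter((url) => returned.has(url));
}
```

Any URL the model produces that is not in the returned set is dropped, so a hallucinated source cannot reach the published entry.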

04

Gemini weighs the evidence

Google's Gemini reads those pages and decides one thing: is this company's mission locked in by how it is governed?

Gemini 3.1 Flash-Lite on Vertex AI · temperature 0 · structured JSON (responseJsonSchema)
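
Structured JSON output still arrives as text, so it is worth parsing and checking before trusting it. The stack list names Zod and Ajv for validation; the hand-written guard below is a dependency-free stand-in, and the field names (`missionLocked`, `confidence`, `citedUrls`) are assumptions rather than the real schema.

```typescript
// Hypothetical verdict shape; the real response schema is not shown on the page.
interface ModelVerdict {
  missionLocked: boolean;
  confidence: number; // expected in [0, 1]
  citedUrls: string[];
}

// Returns a typed verdict, or null if the text is not valid JSON of that shape.
function parseVerdict(raw: string): ModelVerdict | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof data !== "object" || data === null) return null;

  const { missionLocked, confidence, citedUrls } = data as Record<string, unknown>;
  if (typeof missionLocked !== "boolean") return null;
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) return null;
  if (!Array.isArray(citedUrls) || !citedUrls.every((u) => typeof u === "string")) return null;

  return { missionLocked, confidence, citedUrls: citedUrls as string[] };
}
```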

05

Publish, or explain

Confident and well-sourced? The company joins the index automatically, with a little confetti. If not, we show exactly why, or offer a quick human review.

confidence ≥ 0.65 · Next.js Cache Tags · revalidateTag('companies-list')
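
The publish gate can be read as a single predicate: confident (at least 0.65) and well-sourced. The sketch below assumes "well-sourced" means at least one cited source and that the verdict carries a `qualifies` flag; both are guesses, as are the `Verdict` and `Outcome` types. How the site chooses between a plain explanation and human review is not specified, so both fall under one non-published outcome here.

```typescript
const MIN_CONFIDENCE = 0.65; // threshold stated on the page

// Hypothetical verdict fields.
interface Verdict {
  qualifies: boolean;
  confidence: number;
  sources: string[];
  reason: string;
}

type Outcome =
  | { kind: "published" }
  | { kind: "not-published"; reason: string };

function decide(v: Verdict): Outcome {
  const wellSourced = v.sources.length > 0; // assumption: one source is enough
  if (v.qualifies && v.confidence >= MIN_CONFIDENCE && wellSourced) {
    return { kind: "published" };
  }
  return { kind: "not-published", reason: v.reason };
}
```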

Outcome A

Published

Added to the index with its sources, ready to share.

Outcome B

Explained or queued

A clear reason it does not qualify, or a path to human review.

What it runs on

Exa

exa-js · Search API (type: auto) + Contents API: text, highlights, summaries

Google Gemini

@google/genai · gemini-3.1-flash-lite · Vertex AI (global) · structured output

Vercel

Functions on Fluid Compute · BotID · Cache Tags / ISR

Next.js · React

Next.js 16 App Router · React 19 · SSG + ISR · Zod + Ajv validation

Google Firestore

firebase-admin · REST transport · nam5 · cache + rate-limit store

PostHog

posthog-js · autocapture · reverse-proxied via /ingest

Notes for engineers

  • Firestore runs over REST (preferRest) to dodge the gRPC cold-start hang on serverless Functions.
  • The Gemini call sets the @google/genai httpOptions.timeout (a real fetch abort) and retries once, so a dead keep-alive socket can't wedge a warm instance.
  • Scoped cache tags: a publish revalidates only companies-list, not every company page.
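
The timeout-plus-one-retry pattern from the Gemini note can be sketched generically. This is not the @google/genai mechanism itself (the page says that is configured through httpOptions.timeout); it is a stand-in built on `AbortController`, and `withTimeoutAndRetry` is a hypothetical helper name. It only protects against hangs if the wrapped call honors the signal, as `fetch` does.

```typescript
// Runs `call` with an AbortSignal that fires after `timeoutMs`.
// On any failure (including a timeout abort) it tries exactly once more.
async function withTimeoutAndRetry<T>(
  call: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < 2; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      return await call(controller.signal);
    } catch (err) {
      lastError = err; // remember the failure and fall through to the retry
    } finally {
      clearTimeout(timer); // never leave a timer pending on a warm instance
    }
  }
  throw lastError;
}
```

A caller would pass the signal straight through, for example `withTimeoutAndRetry((signal) => fetch(url, { signal }), 10_000)`, so that a dead socket is aborted instead of waited on indefinitely.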

Architecture overview · informational.