The Goal
I wanted to put an AI API behind a Lightning paywall. The requirements: no monthly hosting cost, instant payments, and a standard protocol that any client can implement. I ended up with a Cloudflare Worker that charges 21 sats per request using L402.
What is L402?
L402 is a protocol built on HTTP 402 (Payment Required), the status code that has been "reserved for future use" since the HTTP/1.1 specifications of the late 1990s. The future arrived.
The flow is simple:
- Client sends a request to a paid endpoint
- Server returns HTTP 402 with a Lightning invoice in the WWW-Authenticate header
- Client pays the invoice
- Client retries the request with the payment hash as proof
- Server verifies payment and returns the response
It's like HTTP Basic Auth, but instead of a password, you prove you paid.
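From the client's side, the five steps above can be sketched as one function. This is a minimal sketch, not a reference client: `fetchWithPayment` and the `payInvoice` callback are hypothetical names, and it assumes the payment hash travels in the JSON body on the retry, as in the curl example later in this post.

```javascript
// Sketch of a client for the flow above.
// `payInvoice` is a hypothetical callback (for example, a wrapper around a
// wallet API) that pays a BOLT11 invoice and resolves once it has settled.
async function fetchWithPayment(url, body, payInvoice, fetchImpl = fetch) {
  const post = (payload) =>
    fetchImpl(url, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    });

  // 1. Try the request without any payment proof.
  const first = await post(body);
  if (first.status !== 402) return first;

  // 2. The 402 body carries the invoice and its payment hash.
  const challenge = await first.json();

  // 3. Pay the invoice with whatever wallet the client has.
  await payInvoice(challenge.payment_request);

  // 4. Retry, sending the payment hash so the server can look it up.
  return post({ ...body, payment_hash: challenge.payment_hash });
}
```

Passing `fetchImpl` as a parameter is only there so the function can be exercised against a fake server in tests; in normal use the default global `fetch` is enough.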
The Stack
Total monthly cost: $0.
- Cloudflare Workers — 100,000 requests/day free tier. Runs JavaScript at the edge.
- LNbits — self-hosted Lightning wallet with a REST API for creating and checking invoices.
- Groq — free tier AI inference. Fast enough for real-time API responses.
Creating Invoices
When a request hits a paid endpoint without payment proof, I create a Lightning invoice via LNbits:
async function createInvoice(env, amount, memo) {
  // out: false asks LNbits for an incoming invoice; amount is in sats.
  const resp = await fetch(env.WALLET_API + '/api/v1/payments', {
    method: 'POST',
    headers: {
      'X-Api-Key': env.API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ out: false, amount, memo })
  });
  // Anything other than 201 Created means no invoice was issued.
  if (resp.status !== 201) return null;
  // Resolves to an object that includes payment_hash and payment_request.
  return resp.json();
}
The payment_request is a BOLT11 invoice string. The payment_hash is the unique identifier for this payment.
The 402 Response
The key to L402 is the WWW-Authenticate header. It tells the client exactly how to pay:
return new Response(JSON.stringify({
  status: "payment_required",
  message: "Pay 21 sats to access this endpoint",
  price_sats: 21,
  payment_request: invoice.payment_request,
  payment_hash: invoice.payment_hash,
}), {
  status: 402,
  headers: {
    "Content-Type": "application/json",
    "WWW-Authenticate": `L402 invoice="${invoice.payment_request}", payment_hash="${invoice.payment_hash}"`,
  },
});
Any L402-compatible client (or a human with a Lightning wallet) can parse this, pay the invoice, and retry.
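To make "parse this" concrete, here is a minimal sketch of how a client might read the header. The function name `parseL402Challenge` is hypothetical, and the parser only understands the header layout shown in this post; other L402 servers may send additional fields (such as a macaroon) that this sketch ignores.

```javascript
// Parse a header of the shape used in this post:
//   L402 invoice="lnbc...", payment_hash="..."
// Returns null if the scheme is not L402 or a required field is missing.
function parseL402Challenge(header) {
  if (typeof header !== 'string') return null;
  const match = header.match(/^L402\s+(.*)$/i);
  if (!match) return null;

  // Collect key="value" pairs from the parameter list.
  const params = {};
  for (const [, key, value] of match[1].matchAll(/([A-Za-z_]+)="([^"]*)"/g)) {
    params[key] = value;
  }
  if (!params.invoice || !params.payment_hash) return null;
  return { invoice: params.invoice, paymentHash: params.payment_hash };
}
```

A client would call it with `response.headers.get('WWW-Authenticate')` after receiving a 402 and hand the `invoice` field to its wallet.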
Checking Payment
When the client retries with a payment hash, I verify it against LNbits:
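async function checkPayment(env, hash) {
  const resp = await fetch(
    env.WALLET_API + '/api/v1/payments/' + encodeURIComponent(hash),
    { headers: { 'X-Api-Key': env.API_KEY } }
  );
  // Treat lookup failures (unknown hash, wallet error) as unpaid.
  if (!resp.ok) return false;
  const data = await resp.json();
  return data.paid === true;
}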
If paid is true, the client gets their response. If not, another 402.
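Putting the two helpers together, the serve-or-402 decision might look roughly like the sketch below. It is an illustration under assumptions, not the deployed worker: `handlePaidRequest`, `runModel`, and the `deps` object are hypothetical, and the helpers are injected so the logic can run outside a Worker.

```javascript
// Sketch: decide between "serve" and "402" for one request.
// `deps` bundles createInvoice and checkPayment from this post plus a
// hypothetical `runModel` that produces the AI response.
const PRICE_SATS = 21;

const json = (payload, status, extraHeaders = {}) =>
  new Response(JSON.stringify(payload), {
    status,
    headers: { 'Content-Type': 'application/json', ...extraHeaders },
  });

async function handlePaidRequest(body, env, deps) {
  const { createInvoice, checkPayment, runModel } = deps;

  // A retry carries the payment hash; serve only if the wallet confirms it.
  if (body.payment_hash && (await checkPayment(env, body.payment_hash))) {
    return json({ result: await runModel(body.prompt) }, 200);
  }

  // No proof, or an unpaid hash: issue a fresh invoice and ask for payment.
  const invoice = await createInvoice(env, PRICE_SATS, 'API request');
  if (!invoice) return json({ error: 'invoice_unavailable' }, 502);

  return json({
    status: 'payment_required',
    price_sats: PRICE_SATS,
    payment_request: invoice.payment_request,
    payment_hash: invoice.payment_hash,
  }, 402, {
    'WWW-Authenticate': `L402 invoice="${invoice.payment_request}", payment_hash="${invoice.payment_hash}"`,
  });
}
```

Note that this sketch does not record which hashes have already been redeemed, so on its own it would let a paid hash be reused; tracking redeemed hashes is left out to keep the example short.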
Free Tier Rate Limiting
I added a free tier: 1 request per IP per 24 hours. This uses Cloudflare's Cache API as a lightweight rate limiter:
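// Assumes `request` is the incoming Request; Cloudflare sets this header.
const clientIP = request.headers.get('CF-Connecting-IP');
// Synthetic URL used only as a cache key; it is never fetched.
const cacheKey = new Request('https://rate/' + clientIP);
const cached = await caches.default.match(cacheKey);
if (!cached) {
  // First request: serve for free and remember the IP for 24 hours
  await caches.default.put(cacheKey,
    new Response('used', { headers: { 'Cache-Control': 'max-age=86400' } })
  );
  return serveFreeResponse();
}
// Already used free tier: fall through and require payment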
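No database needed. The cache entry expires automatically after 24 hours. One caveat: the Cache API is local to each Cloudflare data center, so the limit is best-effort rather than global.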
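The same check can be written as a small function that takes the cache as an argument, which makes it testable outside a Worker. This is a sketch under assumptions: `claimFreeRequest` is a hypothetical name, and in a Worker the `cache` argument would be `caches.default`.

```javascript
// Sketch: the free-tier check with the cache passed in.
// `cache` can be any object with async match(request) and
// put(request, response) methods.
const FREE_WINDOW_SECONDS = 24 * 60 * 60; // 86400

async function claimFreeRequest(cache, clientIP) {
  // Synthetic URL used only as a cache key; it is never fetched.
  const key = new Request('https://rate/' + encodeURIComponent(clientIP));
  if (await cache.match(key)) return false; // free request already used

  await cache.put(key, new Response('used', {
    headers: { 'Cache-Control': `max-age=${FREE_WINDOW_SECONDS}` },
  }));
  return true; // caller may serve this request for free
}
```

The `encodeURIComponent` call is a small precaution so that IPv6 addresses, which contain colons, still produce a clean key.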
x402: USDC Payments Too
I also added x402 support for clients that prefer to pay with USDC on Base Sepolia, Base's test network. Same pattern as L402 but with an EVM transaction hash instead of a Lightning payment hash. This opens the API to Ethereum-native clients without forcing them into Lightning.
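The post does not show how the transaction hash is verified, so the following is only one plausible sketch. It assumes the server has already fetched the transaction receipt (for example via the JSON-RPC method `eth_getTransactionReceipt`) and checks it for a standard ERC-20 `Transfer` event to the receiving address. The function name and the `token`, `payTo`, and `minAmount` settings are hypothetical.

```javascript
// keccak256("Transfer(address,address,uint256)"), the standard ERC-20 event topic.
const TRANSFER_TOPIC =
  '0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef';

// Sketch: does a transaction receipt contain a large enough token transfer
// to the receiving address? `token`, `payTo`, and `minAmount` are
// hypothetical config values (minAmount is in the token's smallest unit).
function isSufficientTransfer(receipt, { token, payTo, minAmount }) {
  if (!receipt || receipt.status !== '0x1') return false; // missing or reverted

  const lower = (s) => String(s).toLowerCase();
  // Indexed address topics are 32 bytes: the address left-padded with zeros.
  const paddedPayTo = '0x' + lower(payTo).slice(2).padStart(64, '0');

  return (receipt.logs || []).some((log) =>
    lower(log.address) === lower(token) &&
    log.topics.length === 3 &&
    lower(log.topics[0]) === TRANSFER_TOPIC &&
    lower(log.topics[2]) === paddedPayTo &&
    BigInt(log.data) >= BigInt(minAmount)
  );
}
```

USDC uses 6 decimals, so a `minAmount` of `10000n` would correspond to 0.01 USDC. As with Lightning hashes, a real deployment would also need to remember which transaction hashes have been redeemed so one payment cannot be reused.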
Testing It
# First request is free
curl -X POST https://maximumsats.com/api/dvm \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Summarize the Bitcoin whitepaper"}'

# Second request returns 402 with invoice
# Pay the invoice, then retry with payment_hash
curl -X POST https://maximumsats.com/api/dvm \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Summarize the Bitcoin whitepaper", "payment_hash": "<hash>"}'
What I'd Do Differently
- Invoice expiry. I should set a 15-minute TTL on invoices. Right now they linger.
- Idempotency. If the payment confirms but my response fails to send, the client has to pay again. A payment-hash-to-response cache would fix this.
- Streaming. For longer AI responses, Server-Sent Events after payment would be better than waiting for the full response.
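The idempotency item could start from something like the sketch below: a small cache that maps a payment hash to the response already produced for it, with a time limit. It is an assumption-heavy illustration, not what the worker does today. `createPaidResponseCache` is a hypothetical name, and an in-memory `Map` only lives as long as a single Worker isolate, so a durable store would be needed in practice.

```javascript
// Sketch: remember the response produced for each paid hash for a while,
// so a retry after a dropped connection gets the same answer without
// paying again. `now` is injectable to make expiry testable.
function createPaidResponseCache(ttlMs, now = () => Date.now()) {
  const entries = new Map(); // payment_hash -> { body, expiresAt }

  return {
    get(paymentHash) {
      const entry = entries.get(paymentHash);
      if (!entry) return undefined;
      if (entry.expiresAt <= now()) {
        entries.delete(paymentHash); // drop stale entries lazily
        return undefined;
      }
      return entry.body;
    },
    set(paymentHash, body) {
      entries.set(paymentHash, { body, expiresAt: now() + ttlMs });
    },
  };
}
```

The handler would call `get` before running the model for a paid hash and `set` once the response body exists, so a retry within the window is served from the cache.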
The full worker code is deployed at maximumsats.com. Total infrastructure cost remains $0/month.