I built pay-per-use AI APIs with Lightning (L402) - satsapi.com

Good question on rate limiting!

The L402 model handles this elegantly: you don't get a result until you pay. So there's no traditional "rate limiting" needed - unpaid requests just return a 402 with an invoice.
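To make the flow concrete, here is a rough Python sketch of that gate. It is not the code behind satsapi.com. The HMAC-signed token stands in for a real macaroon, and `new_invoice` and `run_model` are hypothetical callbacks for the Lightning node and the AI backend. The only L402-specific parts are the `402` status, the `WWW-Authenticate` challenge, and the check that the preimage hashes to the invoice's payment hash.

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)  # hypothetical signing key, kept server-side


def _sign(payment_hash: str) -> str:
    return hmac.new(SERVER_KEY, payment_hash.encode(), hashlib.sha256).hexdigest()


def _challenge(new_invoice):
    # new_invoice() is an assumed callback returning (payment_hash_hex, bolt11_invoice).
    payment_hash, invoice = new_invoice()
    token = f"{payment_hash}.{_sign(payment_hash)}"  # simplified stand-in for a macaroon
    header = f'L402 macaroon="{token}", invoice="{invoice}"'
    return 402, {"WWW-Authenticate": header}, "Payment Required"


def handle_request(headers: dict, new_invoice, run_model):
    """Return (status, headers, body). run_model() is only called after payment."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("L402 "):
        return _challenge(new_invoice)
    try:
        token, preimage_hex = auth[len("L402 "):].split(":", 1)
        payment_hash, signature = token.split(".", 1)
        preimage = bytes.fromhex(preimage_hex)
    except ValueError:
        return _challenge(new_invoice)
    # The token must be one this server issued...
    if not hmac.compare_digest(signature, _sign(payment_hash)):
        return _challenge(new_invoice)
    # ...and the preimage must hash to the payment hash, which proves the invoice was paid.
    if hashlib.sha256(preimage).hexdigest() != payment_hash:
        return _challenge(new_invoice)
    return 200, {}, run_model()
```

The expensive call sits behind every check, so an unpaid or malformed request never reaches it.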

For abuse prevention (hammering the endpoint without paying):
- IP-based soft limits on invoice generation
- Invoice expiry (unpaid invoices expire after ~10 min)
- The cost itself acts as a natural spam deterrent
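The first two items could be sketched roughly like this. The ~10 minute expiry comes from the list above. The window size and per-IP cap are made-up numbers, and `InvoiceGate` is a hypothetical name, not something from a library.

```python
import time
from collections import defaultdict, deque

INVOICE_TTL = 600   # seconds; the ~10 min expiry mentioned above
WINDOW = 60         # seconds; assumed window for the soft limit
MAX_INVOICES = 5    # assumed cap on invoices per IP per window


class InvoiceGate:
    """In-memory soft limit on invoice requests, plus expiry tracking."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.recent = defaultdict(deque)  # ip -> timestamps of recent invoice requests
        self.pending = {}                 # payment_hash -> expiry timestamp

    def allow(self, ip: str) -> bool:
        """True if this IP may request another invoice right now."""
        now = self.clock()
        stamps = self.recent[ip]
        while stamps and now - stamps[0] >= WINDOW:
            stamps.popleft()  # forget requests that fell out of the window
        if len(stamps) >= MAX_INVOICES:
            return False
        stamps.append(now)
        return True

    def register(self, payment_hash: str) -> None:
        """Record a freshly issued invoice and when it stops being payable."""
        self.pending[payment_hash] = self.clock() + INVOICE_TTL

    def is_live(self, payment_hash: str) -> bool:
        """True while the invoice is known and unexpired; expired entries are dropped."""
        expires = self.pending.get(payment_hash)
        if expires is None:
            return False
        if self.clock() >= expires:
            del self.pending[payment_hash]
            return False
        return True
```

This keeps everything in process memory, which is fine for a single server. Anything running on multiple instances would need shared storage for both maps.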

Basically, the payment *is* the rate limit. No payment = no compute burned.

That said, if someone wanted to DDoS by requesting invoices... valid concern. Currently invoice generation is lightweight enough that it's not a major issue, but at scale I'd add IP throttling or proof-of-work challenges.
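For the proof-of-work idea, a hashcash-style challenge is one option. This is a sketch of the general technique, not something that is deployed: the server hands out a random challenge, and the client has to find a counter whose SHA-256 digest starts with a given number of zero bits before it is allowed to request an invoice. The function names and the difficulty value are assumptions.

```python
import hashlib
import secrets

DIFFICULTY_BITS = 16  # assumed difficulty; each extra bit doubles the client's expected work


def new_challenge() -> str:
    """Server side: a random challenge to send to the client."""
    return secrets.token_hex(16)


def verify(challenge: str, counter: int, bits: int = DIFFICULTY_BITS) -> bool:
    """Server side: one hash to check that the digest has `bits` leading zero bits."""
    digest = hashlib.sha256(f"{challenge}:{counter}".encode()).digest()
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0


def solve(challenge: str, bits: int = DIFFICULTY_BITS) -> int:
    """Client side: brute-force a counter that satisfies verify()."""
    counter = 0
    while not verify(challenge, counter, bits):
        counter += 1
    return counter
```

The asymmetry is the point: verifying costs the server one hash, while solving costs the client about 2^bits hashes on average. The server would also need to remember issued challenges and reject reuse, which this sketch leaves out.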

Thanks for the thoughtful feedback!