Error Handling
Production integrations should fail clearly and recover when recovery is possible. Separate validation problems, authentication failures, rate limits, empty results, and transient outages so each one gets the right response.
Classify errors before retrying
Do not retry every error
Retry 5xx server errors, network failures, and timeouts on requests that are safe to repeat. For 4xx validation failures, fix the request before sending it again; resending it unchanged returns the same error.
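The classification above can be sketched as a small decision function. This is a minimal illustration, not a provider-specific implementation: the `Action` enum, the `classify` function, and the mapping of 401 and 403 to "stop and alert" are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    RETRY = "retry"                  # transient: try again with backoff
    FIX_REQUEST = "fix_request"      # the request itself must change first
    RATE_LIMITED = "rate_limited"    # slow down before trying again
    STOP_AND_ALERT = "stop_and_alert"  # credentials problem: do not retry


def classify(status: Optional[int], network_error: bool = False) -> Action:
    """Map a failed call to a handling decision (hypothetical helper).

    `status` is the HTTP status code, or None when no response arrived.
    """
    if network_error or status is None:
        return Action.RETRY
    if status == 429:
        return Action.RATE_LIMITED
    if status in (401, 403):
        return Action.STOP_AND_ALERT
    if 500 <= status <= 599:
        return Action.RETRY
    if 400 <= status <= 499:
        return Action.FIX_REQUEST
    # Anything else is not an error status this sketch knows how to handle.
    raise ValueError(f"not an error status: {status}")
```

Checking 429 and the authentication codes before the general 4xx branch keeps them from being treated as ordinary validation failures.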
Handle 429 separately
A 429 response means the request was valid but arrived too fast. Respond with pacing, backoff, a queue, or a plan change. Immediate repeated retries usually make the problem worse.
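One way to pace retries after a 429 is to honor the `Retry-After` response header when it carries a number of seconds, and otherwise fall back to exponential backoff with jitter. The sketch below assumes the provider may send that header; the function name `rate_limit_delay` and the `base` and `cap` defaults are illustrative choices, not documented values.

```python
import random
from typing import Optional


def rate_limit_delay(attempt: int, retry_after: Optional[str] = None,
                     base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    if retry_after is not None:
        try:
            # Retry-After expressed as a delay in seconds.
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            # The HTTP-date form of Retry-After is not parsed in this sketch,
            # so fall through to computed backoff.
            pass
    # Exponential backoff with full jitter: pick a random delay between
    # zero and the capped exponential ceiling.
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, ceiling)
```

The jitter spreads out clients that were rate limited at the same moment, so they do not all retry together.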
Fail closed on authentication
If a key is missing, invalid, expired, or unauthorized, stop the workflow and alert the operator. Do not keep retrying with the same credentials.
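Failing closed can be expressed as a guard that raises instead of returning, so the calling workflow cannot continue by accident. In this sketch the exception class, the `ensure_authenticated` function, and the `alert` callback are hypothetical names, and treating 401 and 403 as credential failures is an assumption about the provider.

```python
from typing import Callable, Optional


class AuthenticationFailure(Exception):
    """Credentials are missing, invalid, expired, or unauthorized."""


def ensure_authenticated(api_key: Optional[str], status: Optional[int],
                         alert: Callable[[str], None]) -> None:
    """Stop the workflow and notify an operator on any credential problem.

    `status` is the last HTTP status received, or None before the first call.
    The key itself is never included in the alert text.
    """
    if not api_key:
        alert("API key is missing")
        raise AuthenticationFailure("missing credentials")
    if status in (401, 403):
        alert(f"provider rejected credentials (HTTP {status})")
        raise AuthenticationFailure(f"credentials rejected with HTTP {status}")
```

Because the guard raises a dedicated exception type, a retry loop can let it propagate rather than catching it along with transient errors.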
Protect the user experience
Show graceful fallbacks
For user-facing apps, display cached data, partial results, empty states, or a clear retry state when a live request fails.
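A fallback chain like the one described can be sketched as a function that returns both the data and a state the UI can render. The state names (`live`, `empty`, `stale`, `retry`), the in-memory dictionary cache, and the `load_with_fallback` helper are assumptions for illustration.

```python
from typing import Any, Callable, Dict


def load_with_fallback(fetch_live: Callable[[], Any],
                       cache: Dict[str, Any], key: str) -> Dict[str, Any]:
    """Return live data when possible, otherwise cached data or a retry state."""
    try:
        data = fetch_live()
    except Exception:
        # A broad catch keeps the sketch short; real code would catch only
        # the failure types it has classified as recoverable.
        if key in cache:
            return {"state": "stale", "data": cache[key]}
        return {"state": "retry", "data": None}
    cache[key] = data  # remember the latest good response
    if not data:
        return {"state": "empty", "data": data}
    return {"state": "live", "data": data}
```

Returning an explicit state lets the interface distinguish "nothing to show" from "could not load", which are different messages for the user.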
Keep internal details private
Do not expose provider messages, keys, hosts, request headers, or stack traces to end users. Log details internally and show a clear product-level message.
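The split between internal detail and user-facing text can be sketched with a fixed table of product-level messages. The failure kinds, the message wording, and the `to_public_error` helper below are hypothetical; only the standard `logging` module is assumed.

```python
import logging
from typing import Dict

logger = logging.getLogger("integration")

# Product-level messages keyed by failure kind (illustrative wording).
PUBLIC_MESSAGES: Dict[str, str] = {
    "validation": "Some of the information provided is not valid.",
    "rate_limit": "We are handling a lot of requests. Please try again shortly.",
    "outage": "This feature is temporarily unavailable.",
}
DEFAULT_MESSAGE = "Something went wrong. Please try again."


def to_public_error(exc: Exception, kind: str) -> Dict[str, str]:
    """Log the full exception internally and return only a safe message."""
    # The traceback and provider text go to the internal log only.
    logger.error("provider call failed (%s)", kind, exc_info=exc)
    return {"error": kind, "message": PUBLIC_MESSAGES.get(kind, DEFAULT_MESSAGE)}
```

Since the returned message is always taken from the table, nothing in the exception text can reach the end user.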
Make support easier
Attach a request ID, job ID, or correlation ID to failures so your team can trace the event without asking users to reproduce it from scratch.
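A minimal sketch of attaching a correlation ID: generate one at the point of failure, write it to the internal log, and return it to the user as a support reference. The `report_failure` helper, the log format, and the response shape are assumptions for the example.

```python
import logging
import uuid
from typing import Dict, Optional

logger = logging.getLogger("integration")


def report_failure(exc: Exception, job_id: Optional[str] = None) -> Dict[str, str]:
    """Log a failure under a fresh correlation ID and return that ID."""
    correlation_id = uuid.uuid4().hex  # 32 hex characters, random per failure
    # The same ID appears in the log line and in the user-facing reference,
    # so support can search logs for it directly.
    logger.error("job=%s correlation_id=%s failed: %r", job_id, correlation_id, exc)
    return {
        "message": "Something went wrong. Contact support and quote the reference.",
        "reference": correlation_id,
    }
```

If the provider returns its own request ID, logging it alongside the correlation ID would let the team trace the event on both sides.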