Building a Standard Error Response for Our APIs - Part I

Errors are inevitable. Systems fail, data comes in broken, or networks time out. As API developers and architects, we cannot prevent every failure. But we can control how we communicate errors back to consumers.

In this blog post, we will build an example of a standard error response. We will define a set of fields, datatypes, and values chosen specifically for this tutorial. Your organization may have different needs and preferences, and that is expected. What matters most is not the exact structure we choose here, but the goal: to design a consistent, clear, and actionable error response that fits your context.


Why We Need a Standard Error Response

Without a shared structure, error responses become confusing. One API may return only an HTTP status. Another might include a vague message. A third might provide a wall of technical logs. API consumers will struggle to understand what went wrong and what to do next.

When we standardize error responses, we:
  • Simplify debugging. Developers know exactly where to look.
  • Improve traceability. Errors can be followed across systems and layers.
  • Strengthen consistency. Every API in our organization speaks the same language.
  • Reduce frustration. Consumers spend less time guessing and more time fixing.
Without this, the opposite happens: fragmented error handling, wasted hours of troubleshooting, and unreliable integrations.


What Should an Error Response Tell Us?

Every error response should answer three questions:
  1. What went wrong? The type of failure and human-readable message.
  2. Where did it happen? Which system, application, or layer raised the error.
  3. What should I do next? Should I retry, fix an input, or escalate the issue?
Who looks at these responses?
  • API consumers want to fix their request and know if they can retry.
  • Support teams want to trace the error across logs and systems.
  • Architects want to diagnose systemic issues across layers.
Example Scenarios
  • A mobile app receives a VALIDATION_ERROR for a missing required field. The developer corrects the payload.
  • A partner API client gets a CONNECTIVITY_ERROR. The client retries later.
  • An operations team sees an AUTH_ERROR in production. They investigate expired credentials.
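
The second scenario, where a client retries later, can be sketched from the consumer's side. This is a minimal illustration, not a recommended client: `call_with_retry`, the `send` callable, and the linear back-off are all hypothetical, and we assume a response is an error whenever it carries an `errorCode` field.

```python
import time
from typing import Any, Callable


def call_with_retry(
    send: Callable[[], dict[str, Any]],
    max_attempts: int = 3,
    delay_seconds: float = 1.0,
) -> dict[str, Any]:
    """Call `send` until it succeeds, the error is not retryable, or attempts run out."""
    response: dict[str, Any] = {}
    for attempt in range(1, max_attempts + 1):
        response = send()
        # Assumption: a body without "errorCode" is a successful response.
        if "errorCode" not in response:
            return response
        # Stop on errors the API marks as not retryable, or on the last attempt.
        if not response.get("retryable", False) or attempt == max_attempts:
            break
        # Wait a little longer before each new attempt (simple linear back-off).
        time.sleep(delay_seconds * attempt)
    return response
```

A validation error with `retryable` set to false comes back after a single call, so the developer sees it immediately and can correct the payload.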


Fields in Our Standard Error Response

Let’s start building our Standard Error Response. In this tutorial, the error response contains the fields listed below. Your own list may be longer or shorter, because no single standard error response works for every organization.

Field                    | Why it’s useful
timestamp                | When the error occurred. Useful for debugging across systems.
correlationId            | A unique ID per request, carried across APIs for traceability.
transactionId (optional) | A business transaction identifier, if applicable.
application              | Identifies which app generated the error.
environment              | Distinguishes errors across dev, test, and prod.
apiLayer                 | Shows if the error came from System, Process, or Experience layer. Helps architects diagnose.
httpStatus               | The HTTP status code (e.g., 400, 500).
errorCode                | A canonical error code. Combines business and technical context.
errorCategory            | A simple categorization such as VALIDATION_ERROR, AUTH_ERROR, TIMEOUT, CONNECTIVITY_ERROR.
message                  | A human-readable description for consumers.
details                  | A key-value structure for field-specific or contextual errors.
retryable                | A Boolean flag showing if the client should retry.
docsLink (optional)      | A link to documentation for this error code.
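
To make the field list concrete, here is a minimal sketch of a helper that assembles such a response as a dictionary ready for JSON serialization. The function name `build_error_response`, the application name `customer-api`, and the ISO 8601 UTC timestamp format are assumptions made for illustration; the tutorial itself does not prescribe them.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Optional


def build_error_response(
    *,
    application: str,
    environment: str,
    api_layer: str,
    http_status: int,
    error_code: str,
    error_category: str,
    message: str,
    retryable: bool,
    details: Optional[dict[str, Any]] = None,
    correlation_id: Optional[str] = None,
    transaction_id: Optional[str] = None,
    docs_link: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble an error body that uses the field names from our list."""
    body: dict[str, Any] = {
        # Assumption: ISO 8601 in UTC; the post does not fix a timestamp format.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Reuse the caller's correlation ID when present, otherwise create one.
        "correlationId": correlation_id or str(uuid.uuid4()),
        "application": application,
        "environment": environment,
        "apiLayer": api_layer,
        "httpStatus": http_status,
        "errorCode": error_code,
        "errorCategory": error_category,
        "message": message,
        "details": details or {},
        "retryable": retryable,
    }
    # Optional fields are left out entirely when they do not apply.
    if transaction_id is not None:
        body["transactionId"] = transaction_id
    if docs_link is not None:
        body["docsLink"] = docs_link
    return body


example = build_error_response(
    application="customer-api",  # hypothetical application name
    environment="dev",
    api_layer="Experience",
    http_status=400,
    error_code="CUST-VAL-001",
    error_category="VALIDATION_ERROR",
    message="The email field is not a valid email address.",
    retryable=False,
)
print(json.dumps(example, indent=2))
```

Building the body in one place is one way to keep every API emitting the same field names; a shared library could play the same role.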

Our error response must be consistent, but also flexible enough to handle different scenarios. To achieve this, we’ll look more closely at three of these fields and define a sub-data type for each:
  • Error Category – A predefined group that classifies the error by type, ensuring clarity and consistency.
  • Error Code – A unique, canonical identifier that makes each error traceable and recognizable across systems.
  • Error Details – A structured breakdown of field-level or contextual information that explains exactly what went wrong.

Each sub-data type focuses on a specific aspect of the error, keeping our design structured and reusable.
By doing this, we avoid mixing too much information in a single field. We also make it easier for both humans and machines to read, parse, and act on the error.


Error Categories

Errors can come in many shapes, but we need a way to group them consistently. Categories act as buckets for errors, so that anyone reading the response understands the nature of the failure at a glance.
We defined ten categories:
  • VAL – Validation
  • AUTH – Authentication
  • AUTHZ – Authorization
  • RES – Resource not found
  • BUS – Business rule violation
  • DEP – Dependency failure
  • SYS – System error
  • NET – Network error
  • SEC – Security violation
  • UNK – Unknown
Each error code will reference exactly one category. This avoids ambiguity and makes it easier to design dashboards, alerts, or automated recovery logic. With these categories in place:
  • Architects can see patterns of failure across environments.
  • Consumers can act differently depending on the error type (e.g., retry a NET error, fix input for a VAL error).
  • Organizations avoid a “wild west” of free-text error types that no one understands.
By limiting categories, we avoid chaos. Each error falls into one of these groups, which keeps diagnosis clean and simple.
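
One way to keep the category list closed is an enumeration. The sketch below mirrors the ten categories above; the `suggested_action` helper is hypothetical. Only the NET and VAL rules come from the text, and sending every other category to "escalate" is a placeholder assumption.

```python
from enum import Enum


class ErrorCategory(Enum):
    """The ten predefined categories, keyed by their short code."""

    VAL = "Validation"
    AUTH = "Authentication"
    AUTHZ = "Authorization"
    RES = "Resource not found"
    BUS = "Business rule violation"
    DEP = "Dependency failure"
    SYS = "System error"
    NET = "Network error"
    SEC = "Security violation"
    UNK = "Unknown"


def suggested_action(category_code: str) -> str:
    """Map a category code to a coarse next step for the consumer."""
    try:
        category = ErrorCategory[category_code]
    except KeyError:
        # Codes outside the predefined list are treated as Unknown.
        category = ErrorCategory.UNK
    if category is ErrorCategory.NET:
        return "retry"
    if category is ErrorCategory.VAL:
        return "fix input"
    # Assumption: the post gives no rule for the other categories.
    return "escalate"
```

Because unknown codes fall back to UNK, a free-text category cannot slip through silently.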


Error Codes

The error code is a canonical identifier for the error. Its purpose is to give every error a stable name that developers, support teams, and business users can refer to.

We use the format:
<DOMAIN>-<CATEGORY>-NNN

Where:

  • DOMAIN – Which part of the business or system this error belongs to. Example: CUST for customer, PAYM for payments.
  • CATEGORY – A short code that shows what type of error this is (taken from our predefined categories).
  • NNN – A number that uniquely identifies the error within the domain and category.
Examples:
  • CUST-VAL-001 → Customer domain, validation error, first in the series.
  • PAYM-BUS-010 → Payments domain, business error, tenth in the series.
This structure brings order. It prevents duplication. It also helps us search, filter, and analyze errors across logs and monitoring tools. With that, consumers can automate handling for specific codes and support teams have a common language when escalating issues.
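
Because the format is regular, tools can parse it mechanically. The following sketch validates and splits a code. The names `parse_error_code` and `ParsedErrorCode` are hypothetical, and we assume that DOMAIN is uppercase letters only and that NNN is exactly three digits.

```python
import re
from typing import NamedTuple

# Assumption: DOMAIN is uppercase letters and NNN is exactly three digits.
ERROR_CODE_PATTERN = re.compile(
    r"(?P<domain>[A-Z]+)-(?P<category>[A-Z]+)-(?P<number>\d{3})"
)

# The ten category codes defined in the Error Categories section.
KNOWN_CATEGORIES = {
    "VAL", "AUTH", "AUTHZ", "RES", "BUS", "DEP", "SYS", "NET", "SEC", "UNK",
}


class ParsedErrorCode(NamedTuple):
    domain: str
    category: str
    number: int


def parse_error_code(code: str) -> ParsedErrorCode:
    """Split a <DOMAIN>-<CATEGORY>-NNN code, rejecting malformed input."""
    match = ERROR_CODE_PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"Malformed error code: {code!r}")
    category = match["category"]
    if category not in KNOWN_CATEGORIES:
        raise ValueError(f"Unknown category in error code: {code!r}")
    return ParsedErrorCode(match["domain"], category, int(match["number"]))
```

With this in place, `parse_error_code("CUST-VAL-001")` yields the domain `CUST`, the category `VAL`, and the number 1, which a log filter or dashboard can group on.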


Error Details

Not all errors are simple. Some affect specific fields, inputs, or contextual values. The details structure allows us to express these fine-grained issues without overloading the main error message. We define an ErrorDetail type with:
  • fieldName – Which field caused the error.
  • reason – Why it failed.
  • expectedValue – What the field should have been.
  • actualValue – What was actually received.
  • context – Any extra notes that help explain the problem.
Example:

"details": {
  "fieldName": "email",
  "reason": "Invalid format",
  "expectedValue": "A valid email address",
  "actualValue": "abc123",
  "context": "Email provided during customer signup"
}

This structure gives us several benefits:
  • Consumers immediately know what to fix.
  • Errors affecting multiple fields can all be reported together.
  • Operations teams gain rich context for troubleshooting.
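
The ErrorDetail type can be sketched as a small data class. Note one assumption: the example above shows a single detail object, so carrying several field-level problems as a list of ErrorDetail entries is our guess at how they would be reported together. The second entry (`birthDate`) is invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorDetail:
    """One field-level or contextual problem within an error response."""

    field_name: str
    reason: str
    expected_value: Optional[str] = None
    actual_value: Optional[str] = None
    context: Optional[str] = None

    def to_dict(self) -> dict[str, str]:
        """Serialize with the camelCase names used in the response body."""
        pairs = {
            "fieldName": self.field_name,
            "reason": self.reason,
            "expectedValue": self.expected_value,
            "actualValue": self.actual_value,
            "context": self.context,
        }
        # Leave out attributes that were not provided.
        return {key: value for key, value in pairs.items() if value is not None}


# Assumption: several problems are reported together as a list of details.
details = [
    ErrorDetail(
        field_name="email",
        reason="Invalid format",
        expected_value="A valid email address",
        actual_value="abc123",
        context="Email provided during customer signup",
    ),
    # Hypothetical second problem, added only to show the list form.
    ErrorDetail(field_name="birthDate", reason="Missing required field"),
]
serialized = [detail.to_dict() for detail in details]
```

Keeping the detail entries separate from `message` lets the main message stay short while each entry names exactly one field to fix.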

Conclusion

A standard error response is more than a convenience. It is a contract between our APIs and their consumers. It saves time, reduces confusion, and builds trust.

In this post, we defined why standardization matters and introduced the structure of our error response. In the next post, we will bring this model to life with RAML datatypes, examples, traits, and a shared library that developers can reuse across APIs.
