When using MuleSoft Object Store, one of the most important design decisions is whether to enable persistence. Not every use case requires it, and the choice cuts both ways: leaving persistence off when the data must be durable can lead to data loss or unexpected behavior, while enabling it when it’s not needed can add latency and degrade performance, especially in high-throughput applications.
In this post, we’ll explore how to decide if our app really needs persistence in the Object Store and walk through the key questions that guide that decision.
What is the Object Store in Mule?
MuleSoft Object Store is a key-value store provided as part of the Anypoint Platform, used to persist data across MuleSoft applications and runtimes. It’s primarily used for storing temporary, transient, or shared state, such as:
- API throttling and rate-limiting counters
- Cached data shared between flows
- Tokens, IDs, or correlation data
- Retry or watermark information in batch jobs
What does Persistence mean in the Object Store?
In the context of MuleSoft Object Store, persistence refers to the ability to store data in a durable way so that it survives application restarts, crashes, or redeployments. Unlike in-memory storage, which is lost when the Mule application restarts, a persistent Object Store ensures critical data remains intact and available across the lifecycle of the application. This is essential for scenarios like maintaining watermarks, managing retry states, or sharing tokens between flows and instances.
By enabling persistence, we can make our integrations more reliable, fault-tolerant, and capable of handling stateful processing across distributed and scaled environments.
We should use Object Store with persistence when our MuleSoft application needs to retain state across application restarts, runtime crashes, or even redeployments.
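In a Mule 4 configuration, this choice usually comes down to the `persistent` attribute on the Object Store definition. The fragment below is a minimal sketch, assuming the Object Store connector is already part of the project; the store names are hypothetical.

```xml
<!-- Assumes the os namespace is declared on the <mule> root element:
     xmlns:os="http://www.mulesoft.org/schema/mule/os" -->

<!-- Durable store: entries are written to durable storage and are
     meant to outlive an application restart. -->
<os:object-store name="durableStore" persistent="true"/>

<!-- Transient store: entries live in memory only and are lost
     when the application stops. -->
<os:object-store name="transientStore" persistent="false"/>
```

As far as we know, the connector treats a store as persistent unless told otherwise, so opting out of persistence is typically the explicit step. It is worth confirming the default against the connector version in use.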
Here’s the list of questions I’d use to decide when and why to use a persistent Object Store:
1. Does the data need to survive application restarts or redeployments?
We ask this question to determine whether the data we're storing is volatile or needs to be durable. In many MuleSoft applications, memory is cleared when the app is redeployed, scaled down, or restarted, especially in CloudHub and Runtime Fabric environments. If our application logic depends on data being available after such events, persistence becomes essential.
We need persistence when the data must remain available even after the application stops or restarts. For example, if we're running a batch job that processes customer records based on a last-processed timestamp, and we store that timestamp in the Object Store, we must persist it to ensure the job resumes correctly after a failure.
On the other hand, we don’t need persistence if the data is only used temporarily within the same runtime instance. For example, if we're storing a variable for logging correlation within a single execution thread or flow, there's no need to persist it, since it loses relevance after the flow completes.
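The last-processed timestamp example could be sketched as follows. This is an illustration, not a complete job: the store name, flow name, key, and fallback date are all hypothetical, and the actual record processing is left as a placeholder comment.

```xml
<!-- Hypothetical names throughout; assumes the os namespace is declared. -->
<os:object-store name="watermarkStore" persistent="true"/>

<flow name="syncCustomersFlow">
    <!-- Read the last processed timestamp into vars.lastProcessed.
         On the very first run the key is missing, so fall back to a fixed date. -->
    <os:retrieve key="lastProcessed" objectStore="watermarkStore" target="lastProcessed">
        <os:default-value>#["2000-01-01T00:00:00Z"]</os:default-value>
    </os:retrieve>

    <!-- Placeholder: query and process records newer than vars.lastProcessed. -->

    <!-- Save the new watermark only after processing has succeeded. -->
    <os:store key="lastProcessed" objectStore="watermarkStore">
        <os:value>#[now() as String]</os:value>
    </os:store>
</flow>
```

Using `now()` keeps the sketch short. A real job would more likely store the highest timestamp among the records it actually processed, so that nothing arriving mid-run is skipped.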
2. How long should the data live?
We ask about the expected lifespan of the data because short-lived and long-lived data typically require different storage strategies. If data is only relevant for a few milliseconds or seconds, it's often fine to store it in memory. But if the data must be available for minutes, hours, or days, then we should consider persistence.
When we expect to reuse the data later, possibly long after it was created, persistence ensures that the data won’t disappear due to memory eviction or process termination. An example would be storing a session token retrieved from a third-party API that’s valid for 12 hours. If our application restarts, we still want to reuse that token rather than re-authenticate unnecessarily.
If, however, we’re caching a quick computation result for the next few seconds or simply passing temporary data between processors in a flow, there's no need to persist it. In-memory storage or flow variables will suffice.
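When the data has a known lifetime, like the 12-hour token above, the store definition can carry a matching time-to-live so stale entries expire on their own. A minimal sketch, with a hypothetical store name and illustrative values:

```xml
<!-- Hypothetical store; assumes the os namespace is declared. -->
<os:object-store name="tokenStore"
                 persistent="true"
                 entryTtl="12"
                 entryTtlUnit="HOURS"
                 expirationInterval="10"
                 expirationIntervalUnit="MINUTES"/>
<!-- entryTtl: how long an entry stays valid. Here it matches the token
     lifetime; a slightly shorter value leaves a safety margin.
     expirationInterval: how often expired entries are checked for. -->
```

How strictly these settings are honored can depend on where the app runs, so it is worth checking the Object Store documentation for the target platform before relying on exact expiry times.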
3. What happens if the application crashes or is redeployed?
This question focuses on risk and recovery. We need to understand the business or operational impact of unexpected failures. If our application crashes and the data is lost, can we recover without issues? Or does that cause duplication, inconsistency, or data loss?
We use persistence when the loss of data due to failure would compromise functionality or correctness. For instance, consider a system that sends email verification codes to users. If the app restarts and we’ve stored the codes only in memory, users won’t be able to verify themselves anymore. Persistence is needed to avoid broken user experiences or redundant work.
If nothing breaks and we can recompute or refetch the data easily, then persistence might be unnecessary. An example would be a cache of non-critical metadata that we can easily retrieve again if the app restarts.
4. Will this data be used across multiple flows or applications?
We ask this to understand the scope of data access. If the data must be shared across different flows or even different applications, then it must outlive the scope of a single thread or memory space. This typically points to the need for a persistent store.
Persistence is required when the data must be accessed from multiple parts of the application over time. Suppose we generate a correlation ID in one flow and another flow later needs to look it up to track request progress. If we store the correlation ID in a named, persistent Object Store, both flows can safely access it.
If the data is only used within one flow and won’t be needed again, we don’t need persistence. A common example is storing user input temporarily to pass it between processors in the same request-response cycle.
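The correlation ID scenario could look roughly like this: one globally defined store, referenced by name from two flows. The flow names, the status values, and `vars.trackingId` are hypothetical, and we assume `vars.trackingId` is set earlier in each flow (for example, from the incoming request).

```xml
<!-- Hypothetical names; assumes the os namespace is declared. -->
<os:object-store name="correlationStore" persistent="true"/>

<flow name="acceptRequestFlow">
    <!-- Record that a request with this tracking ID was received. -->
    <os:store key="#[vars.trackingId]" objectStore="correlationStore">
        <os:value>#["RECEIVED"]</os:value>
    </os:store>
</flow>

<flow name="checkStatusFlow">
    <!-- Look up the status later, from a different flow.
         Unknown IDs fall back to a default instead of raising an error. -->
    <os:retrieve key="#[vars.trackingId]" objectStore="correlationStore">
        <os:default-value>#["UNKNOWN"]</os:default-value>
    </os:retrieve>
</flow>
```

Both flows refer to the store by its global name, which is what lets the second flow read what the first one wrote.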
5. What is the cost of data loss?
This question helps us evaluate the risk associated with losing the data. If the consequences of data loss are high, such as regulatory violations, broken business processes, or customer dissatisfaction, we should lean toward persistence.
We need persistence when the data loss would require expensive reprocessing, violate SLAs, or lead to incorrect outcomes. For example, if we’re counting API requests for rate limiting and lose the counter, we might allow abuse or overuse of our APIs, resulting in financial or reputational damage.
Conversely, if the cost of data loss is negligible or we can easily restore the data, persistence may be overkill. Let’s say we temporarily cache the name of the last accessed customer record for convenience; if it disappears, we can simply reload it.
6. Is this part of a retry, deduplication, or watermarking mechanism?
This question targets specific patterns where state tracking is crucial for correctness. Retries, deduplication, and watermarks all rely on stored state to avoid reprocessing or to resume accurately after failure.
We should use persistence when we're tracking identifiers, timestamps, or status values used to prevent duplication or resume safely. For example, in a deduplication use case, we may store previously processed message IDs in Object Store. If those IDs are lost, we risk processing the same message multiple times.
In scenarios where retries or duplicates are managed in a short-lived memory space, such as within a single synchronous HTTP flow, and we're not worried about crashes or parallelism, then persistence may not be required.
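For the deduplication case, one option in Mule 4 is the idempotent message validator backed by a persistent store, so that the processed IDs outlive a restart. This is a sketch under assumptions: the store name, flow name, one-day retention, and the `payload.orderId` field are all hypothetical.

```xml
<!-- Hypothetical names; assumes the os namespace is declared. -->
<!-- Keep processed IDs for one day, then let them expire. -->
<os:object-store name="dedupStore"
                 persistent="true"
                 entryTtl="1"
                 entryTtlUnit="DAYS"/>

<flow name="processOrderFlow">
    <!-- Raises an error if this ID is already in the store;
         otherwise records the ID and lets the message continue. -->
    <idempotent-message-validator idExpression="#[payload.orderId]"
                                  objectStore="dedupStore"/>

    <!-- Placeholder: the actual order processing goes here. -->
</flow>
```

The retention period is a trade-off: it should be at least as long as the window in which a duplicate can realistically arrive, but not so long that the store grows without bound.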
7. Is the Mule app running in multiple replicas (CloudHub 2.0, RTF)?
This question is about infrastructure. In distributed environments like CloudHub 2.0 or Runtime Fabric, when multiple replicas are deployed for the same app, each replica has its own memory. If we store something in memory or in an in-memory Object Store, it won't be visible to other instances.
Persistence is necessary when we want data consistency across replicas. For example, if one replica stores a value that another replica needs to read, say for coordinating retry attempts or token sharing, we must use a persistent Object Store.
If we’re running a single-replica, single-instance app where data is only used locally within that instance, in-memory or transient storage may be sufficient.
8. Is the data security-sensitive?
This question helps us evaluate whether persistence will introduce security concerns. When dealing with credentials, tokens, or personal data, we must consider how it’s stored, for how long, and in what format.
We need persistence when secure data must be retained across invocations, but we must also ensure it’s encrypted and managed safely. An example is storing an OAuth access token retrieved during a handshake. Persisting it securely avoids needing to re-authenticate for every request.
However, if we only need the data for a single, short-lived operation and it’s safe to discard afterward, in-memory storage is safer and simpler. For instance, if we're decrypting a one-time password and immediately using it, there's no need to persist it.
9. Is the stored data used for analytics, monitoring, or delayed processing?
We ask this to assess whether the data needs to be held for analysis or future actions. If so, it must persist until it is processed, even if the app restarts.
Persistence is required when we store values like error counts, failed requests, or processing metrics that are aggregated or reviewed later. For instance, we might log usage data into Object Store throughout the day and push it to a monitoring platform at night. That data must persist until the push happens.
If the data is used instantly and doesn’t require delayed handling—like recording a log message sent directly to a log aggregator—then persistence may not be necessary.
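The collect-by-day, push-at-night pattern could be sketched with a scheduled flow that drains a persistent store. The store name, flow name, and 02:00 schedule are hypothetical, and the push itself is left as a placeholder.

```xml
<!-- Hypothetical names; assumes the os namespace is declared. -->
<os:object-store name="metricsStore" persistent="true"/>

<flow name="nightlyMetricsPushFlow">
    <!-- Run once a day at 02:00 (Quartz-style cron expression). -->
    <scheduler>
        <scheduling-strategy>
            <cron expression="0 0 2 * * ?"/>
        </scheduling-strategy>
    </scheduler>

    <!-- Load every stored entry into the payload as key/value pairs. -->
    <os:retrieve-all objectStore="metricsStore"/>

    <!-- Placeholder: send the payload to the monitoring platform. -->

    <!-- Empty the store once the push has succeeded. -->
    <os:clear objectStore="metricsStore"/>
</flow>
```

One caveat with this simple version: anything written between the retrieve and the clear would be removed without having been pushed. Removing only the keys that were actually sent is safer if writes can happen during the nightly run.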
10. Will the data be needed if the app is scaled horizontally (multiple replicas in CloudHub 2.0 or an RTF deployment)?
This question is similar to question seven but emphasizes scale-out scenarios. When we deploy applications with multiple instances or scale pods/workers for high availability or performance, they won’t share memory. If we store non-persistent data, each instance will have a separate, isolated copy.
We need persistence when data must be consistent across all replicas. For example, in a Runtime Fabric deployment where three replicas of the app process incoming messages, and all of them need to access a shared retry counter, persistence ensures they all see the same values.
If the data is only used locally and doesn’t need to be coordinated across instances—for example, flow-level state used in each worker independently—then persistence isn’t needed.