While the costs of XA transactions are well known (e.g. increased data contention, higher latency, significant disk I/O for logging, and availability challenges), in many cases they are the most attractive option for coordinating logical transactions across multiple resources.
There are a few common approaches to integrating Coherence into applications through an application server's transaction manager:
Disabling hardware multicast (by configuring well-known addresses, also known as WKA) will place significant stress on the network. For messages that must be sent to multiple servers, rather than sending a single packet to the switch and letting the switch broadcast that packet to the rest of the cluster, the server must send a separate packet to each of the other servers. While hardware varies significantly, consider that a server with a single gigabit connection can send at most ~70,000 packets per second.
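To put the packet-rate ceiling in perspective, the arithmetic can be sketched as follows. This is only a back-of-the-envelope model: the class and method names are hypothetical, the 70,000 packets-per-second figure is the rough limit quoted above, and it assumes each message fits in a single packet and that no other traffic shares the link.

```java
// Hypothetical helper for illustration; not part of Coherence.
final class PacketBudget {

    /** Packets the sender must emit so that every other member receives one message. */
    static long packetsPerMessage(int clusterSize, boolean multicast) {
        if (clusterSize < 2) {
            return 0; // nobody else to send to
        }
        // With multicast the switch fans the packet out; with WKA the sender does.
        return multicast ? 1 : clusterSize - 1;
    }

    /** Upper bound on cluster-wide messages per second for a given packet-rate ceiling. */
    static long maxClusterWideMessagesPerSecond(long packetsPerSecond,
                                                int clusterSize,
                                                boolean multicast) {
        long perMessage = packetsPerMessage(clusterSize, multicast);
        if (perMessage == 0) {
            throw new IllegalArgumentException("cluster must have at least two members");
        }
        return packetsPerSecond / perMessage;
    }

    public static void main(String[] args) {
        long linkLimit = 70_000; // assumed ceiling for a single gigabit connection
        for (int size : new int[] {2, 10, 50, 100}) {
            System.out.println(size + " members: multicast="
                    + maxClusterWideMessagesPerSecond(linkLimit, size, true)
                    + " unicast="
                    + maxClusterWideMessagesPerSecond(linkLimit, size, false));
        }
    }
}
```

Under these assumptions the sender's budget for cluster-wide messages shrinks roughly in proportion to the number of other members once multicast is disabled.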
Applications that rely on partial caches of databases, and use read-through to maintain those caches, have some trade-offs if queries are required. Coherence does not support push-down queries, so queries will apply only to data that currently exists in the cache. This is technically consistent with "read committed" semantics, but the potential absence of data may make the results so unintuitive as to be useless for most use cases (depending on how much of the database is held in cache).
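The effect of querying a partial cache can be illustrated with a small stand-alone model. This sketch deliberately uses plain Java maps rather than the Coherence API: the class `PartialCacheQuery`, its methods, and the sample data are all hypothetical, and the "database" is just an in-memory map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

// Hypothetical model of a read-through cache; not the Coherence API.
final class PartialCacheQuery {
    private final Map<Integer, String> database;            // stand-in for the backing store
    private final Map<Integer, String> cache = new HashMap<>();

    PartialCacheQuery(Map<Integer, String> database) {
        this.database = database;
    }

    /** Read-through: a cache miss loads the value from the backing store and caches it. */
    String get(int key) {
        return cache.computeIfAbsent(key, database::get);
    }

    /** Query evaluated against cached entries only; the backing store is never consulted. */
    Set<Integer> keysMatching(Predicate<String> filter) {
        Set<Integer> result = new TreeSet<>();
        for (Map.Entry<Integer, String> entry : cache.entrySet()) {
            if (filter.test(entry.getValue())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> db = new HashMap<>();
        db.put(1, "open");
        db.put(2, "open");
        db.put(3, "closed");

        PartialCacheQuery orders = new PartialCacheQuery(db);
        orders.get(1); // only key 1 has been read through so far
        System.out.println(orders.keysMatching("open"::equals));
    }
}
```

In this model the query misses any matching row that has not yet been read through, which is the behaviour the paragraph above warns about.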
Some NamedCache methods (including clear(), entrySet(Filter), aggregate(Filter, …), invoke(Filter, …)) may generate large intermediate results. These results may be large enough to cause out-of-memory exceptions on cache servers, and in some cases on cache clients. This may be particularly problematic if out-of-memory exceptions occur on more than one server (since these operations may be cluster-wide) or if these exceptions cause additional memory use on the surviving servers as they take over partitions from the failed servers.
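One general way to bound intermediate results, offered here as an assumption rather than as the approach the original text goes on to recommend, is to work through the keys in fixed-size batches so that only one batch of values is in memory at a time. The sketch below is plain Java; `BatchedScan`, its methods, and the `fetch` callback (a stand-in for whatever bulk lookup the application uses) are hypothetical names. It assumes the key set itself is small enough to hold in memory.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of Coherence.
final class BatchedScan {

    /** Splits the keys into consecutive batches of at most batchSize elements. */
    static <K> List<List<K>> batches(Collection<K> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<K>> out = new ArrayList<>();
        List<K> current = new ArrayList<>(batchSize);
        for (K key : keys) {
            current.add(key);
            if (current.size() == batchSize) {
                out.add(current);
                current = new ArrayList<>(batchSize);
            }
        }
        if (!current.isEmpty()) {
            out.add(current); // final, possibly short, batch
        }
        return out;
    }

    /** Sums values batch by batch, so only one batch of values is held at a time. */
    static <K> long sum(Collection<K> keys, int batchSize,
                        Function<Collection<K>, Map<K, Integer>> fetch) {
        long total = 0;
        for (List<K> batch : batches(keys, batchSize)) {
            for (int value : fetch.apply(batch).values()) {
                total += value;
            }
        }
        return total;
    }
}
```

The trade-off is more round trips in exchange for a predictable memory ceiling per request.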
When integrating Coherence into applications, each application has its own set of requirements with respect to data integrity guarantees. Developers often describe these requirements using expressions like "avoiding dirty reads" or "making sure that updates are transactional", but we often find that even in a small group of people, there may be a wide range of opinions as to what these terms mean. This may simply be due to a lack of familiarity, but given that Coherence sits at an intersection of several (mostly) unrelated fields, it may be a matter of conflicting vocabularies (e.g.
Large clusters (measured in terms of the number of storage-enabled members participating in the largest cache services) may introduce challenges when issuing queries. There is no particular cluster-size threshold for this; rather, there is a gradually increasing tendency for issues to arise as the cluster grows.
A recent A-Team engagement required the development of a custom PartitionAssignmentStrategy (PAS). By way of background, a PAS is an implementation of a Java interface that controls how a Coherence partitioned cache service assigns partitions (primary and backup copies) across the available set of storage-enabled members. While seemingly straightforward, this is actually a very difficult problem to solve. Traditionally, Coherence used a distributed algorithm spread across the cache servers (and as of Coherence 3.7, this is still the default implementation).
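To make the shape of the problem concrete, here is a deliberately naive assignment written in plain Java. It is neither Coherence's algorithm nor an implementation of the PartitionAssignmentStrategy interface; `NaiveAssignment` and its methods are hypothetical, and members are represented simply as indices 0 to memberCount - 1.

```java
// Hypothetical round-robin assignment for illustration; not Coherence's algorithm.
final class NaiveAssignment {

    /**
     * Returns owners[p][0] = primary owner of partition p and
     * owners[p][1] = backup owner, or -1 when there is only one member.
     */
    static int[][] assign(int partitionCount, int memberCount) {
        if (partitionCount <= 0 || memberCount <= 0) {
            throw new IllegalArgumentException("counts must be positive");
        }
        int[][] owners = new int[partitionCount][2];
        for (int p = 0; p < partitionCount; p++) {
            owners[p][0] = p % memberCount;
            // The backup goes to the next member so it never shares a member with the primary.
            owners[p][1] = memberCount > 1 ? (p + 1) % memberCount : -1;
        }
        return owners;
    }

    /** Counts how many primary partitions each member owns. */
    static int[] primaryLoad(int[][] owners, int memberCount) {
        int[] load = new int[memberCount];
        for (int[] owner : owners) {
            load[owner[0]]++;
        }
        return load;
    }

    public static void main(String[] args) {
        int[][] owners = assign(257, 4);
        System.out.println(java.util.Arrays.toString(primaryLoad(owners, 4)));
    }
}
```

A static scheme like this balances primaries and keeps each backup on a different member, but it ignores membership changes and the cost of moving partitions between members, which is presumably where much of the difficulty lies.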