Oracle Application Continuity (AC & TAC)
Modern business applications are built on the foundation of uninterrupted service delivery and high availability. Planned or unplanned disruptions at the database layer can negatively impact the end-user experience and jeopardize business continuity. Oracle offers various technologies under the Oracle Maximum Availability Architecture (MAA) umbrella to overcome these challenges. One such critical feature, Oracle Application Continuity (AC), is designed specifically to ensure high availability at the application layer.
1.1 Definitions: Application Continuity (AC) and Transparent Application Continuity (TAC)
Application Continuity (AC) is an Oracle Database feature that enables the seamless and rapid replay of an in-flight request against the database following a recoverable error that makes the database session unusable. Its primary goal is to ensure that the interruption appears to the end-user as nothing more than a delay in request processing. AC works by completely reconstructing the database session after an outage, including all states, cursors, variables, and the last transaction (if any). This effectively masks disruptions caused by planned maintenance (e.g., patching, configuration changes) or unplanned outages (e.g., network errors, instance failures).
Transparent Application Continuity (TAC), introduced with Oracle Database 18c, is an extension or mode of AC. TAC transparently tracks and records session and transactional state, enabling the recovery of a database session after recoverable outages. The key characteristic of TAC is its ability to operate without requiring any application code changes or specific knowledge of the application by the database administrator (DBA). This transparency is achieved through a state-tracking infrastructure that categorizes session state usage.
Both AC and TAC can be used with Oracle Real Application Clusters (RAC), Oracle RAC One Node, Oracle Active Data Guard, and Oracle Autonomous Database (both shared and dedicated infrastructure). These features enhance the fault tolerance of systems and applications by masking database outages and recovering in-flight work that would otherwise be lost.
1.2 Problem Solved: Masking Interruptions and Ensuring Business Continuity
Without AC/TAC, database outages cause significant problems for applications. Applications receive error messages, users are left uncertain about the status of their transactions (e.g., money transfers, flight reservations, orders), and middleware servers might even need restarting to handle the surge of login requests post-outage. This leads to both end-user dissatisfaction and operational inefficiency.
AC and TAC enable the Oracle Database, Oracle drivers, and Oracle connection pools to collaborate, safely and reliably masking many planned and unplanned outages. By automatically handling recoverable errors, they improve the end-user experience and reduce the need for application developers to write complex error-handling code. This boosts developer productivity and aims for uninterrupted application operation.
The evolution from Oracle’s basic failover mechanisms (like TAF – Transparent Application Failover) to AC and then TAC reflects a strategic shift towards making high availability increasingly transparent and reducing application-specific coding dependencies. TAF (pre-12c) had significant limitations, especially around DML operations and session state management. AC (12c) addressed DML replay but required awareness of connection pool usage and request boundaries. TAC (18c+) further reduced complexity by automating state tracking and boundary detection. This progression shows Oracle recognized the adoption barriers of earlier solutions and prioritized ease of use alongside capability. Consequently, TAC has become Oracle’s preferred solution for modern applications, especially in cloud and Autonomous Database environments, while AC remains relevant for specific legacy systems or customization needs.
1.3 Role within Oracle Maximum Availability Architecture (MAA)
AC and TAC extend Oracle’s MAA principles to the application tier. MAA is a set of best practices, configurations, and architectural blueprints designed to achieve zero data loss and zero application downtime goals. AC and TAC contribute to these goals by recovering in-flight transactions and the application stack.
These features work in conjunction with other Oracle HA solutions like RAC, Data Guard, and Fast Application Notification (FAN) to form the building blocks for continuous availability. The MAA framework aims to keep applications continuously available by hiding planned and unplanned events, as well as load imbalances at the database tier. AC and TAC are integral parts of this architecture, minimizing the impact of database outages on the application.
2. Core Concepts and Working Mechanism
The fundamental principle behind Application Continuity is to recover ongoing work during an interruption and allow it to continue without the user noticing. This is achieved through a complex replay process, accurate definition of request boundaries, and meticulous management of session state.
2.1 The Replay Process: How AC/TAC Recovers Sessions
The working mechanism of AC and TAC involves the following steps when a recoverable error is detected:
- Error Detection: The system identifies a recoverable error (e.g., network interruption, temporary instance failure) that renders the session unusable.
- New Session Establishment: A new database session is established on another available database instance.
- Session State Restoration: The state of the original session before the interruption (non-transactional state, variables, PL/SQL package states, etc.) is reconstructed in the new session. This is managed through service parameters like FAILOVER_RESTORE and SESSION_STATE_CONSISTENCY, and mechanisms like Database Templates in 23ai.
- Replay of Database Calls: The database calls (SQL queries, DML operations) made from the beginning of the interrupted request are executed sequentially in the new session.
- Consistency Check and Idempotence: During replay, data consistency is checked. The Transaction Guard mechanism ensures that the transaction is committed only once (idempotence), especially if the interruption occurred during the COMMIT operation.
- Continuation or Error: If the replay is successful, the application perceives the interruption merely as a delay and continues from where it left off. However, if data inconsistency is detected during replay (e.g., a replayed query returns different results) or an unrecoverable state is encountered, the replay is rejected, and the application receives the original error. Unrecoverable errors (e.g., invalid data input) are never replayed.
This process ensures that the user is unaffected by the interruption and the transaction is either completed safely or the original error state is accurately reported.
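The replay steps above can be sketched as a toy model. This is a minimal illustration in plain Python, not Oracle's implementation: ToySession, ReplayingClient, and the checksum comparison are hypothetical stand-ins for the driver's call recording and the database's consistency validation, and the Transaction Guard step is left out.

```python
import hashlib


class RecoverableError(Exception):
    """Stands in for an outage that makes the session unusable."""


class ToySession:
    """Hypothetical in-memory 'database session' used only for illustration."""

    def __init__(self, data, fail_on_call=None):
        self.data = data
        self.calls_made = 0
        self.fail_on_call = fail_on_call  # simulate an outage on the Nth call

    def execute(self, key):
        self.calls_made += 1
        if self.calls_made == self.fail_on_call:
            raise RecoverableError("connection lost")
        return self.data[key]


def checksum(value):
    """Fingerprint a result so a replayed call can be compared with the original."""
    return hashlib.sha256(repr(value).encode()).hexdigest()


class ReplayingClient:
    """Records the calls of the current request and replays them after an outage."""

    def __init__(self, session, new_session_factory):
        self.session = session
        self.new_session_factory = new_session_factory
        self.history = []  # (key, checksum) pairs for the current request

    def begin_request(self):
        self.history.clear()

    def execute(self, key):
        try:
            result = self.session.execute(key)
        except RecoverableError as original_error:
            self._replay(original_error)
            result = self.session.execute(key)  # re-run the interrupted call
        self.history.append((key, checksum(result)))
        return result

    def _replay(self, original_error):
        # New session, then re-execute every recorded call in order.
        self.session = self.new_session_factory()
        for key, expected in self.history:
            if checksum(self.session.execute(key)) != expected:
                # Replayed result differs: reject replay, surface the original error.
                raise original_error


rows = {"a": 1, "b": 2}
demo = ReplayingClient(ToySession(rows, fail_on_call=2), lambda: ToySession(rows))
demo.begin_request()
demo.execute("a")
demo.execute("b")  # the simulated outage on this call is masked by replay
```

If the replacement session returned different data for an already recorded call, the same code would re-raise the original error instead of continuing, which mirrors the rejection case described above.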
2.2 Understanding Request Boundaries
A “request” is a logical unit of work from the application’s perspective. Defining the start and end points of these work units, known as request boundaries, is critical for the correct functioning of AC and TAC. These boundaries define the scope of work to be replayed and allow the system to discard unnecessary call history, using resources efficiently.
- Typical Boundary: Often, a request boundary spans the time between an application borrowing a database connection from a connection pool (checkout) and returning it (check-in). This is the default behavior for ODP.NET and Oracle connection pools.
- Explicit Boundaries: If Oracle connection pools are not used, or if AC is managed manually, the application must explicitly mark the request boundaries. This is usually done via driver-provided API calls like BeginRequest and EndRequest (or equivalents). This method improves resource consumption and ensures replay occurs within the correct scope.
- Implicit/Discovered Boundaries (TAC): A significant advantage of TAC is its ability to automatically detect request boundaries. Especially with modern drivers (JDBC 18c+, OCI 19c+), TAC can determine boundaries by monitoring the application’s behavior. Conditions for discovering a boundary typically include no open transaction, cursors being closed or cached, and the session state being restorable.
- Importance: Besides defining the replay scope, request boundaries are fundamental for functions like connection draining during planned maintenance, load balancing, and resource management.
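As a rough illustration of explicit boundaries, the sketch below wraps a unit of work in a context manager that marks the start and end of a request. ToyConnection and request are hypothetical names invented for this example; a real application would call the driver's own begin/end request API (for instance, JDBC 4.3 defines beginRequest and endRequest on java.sql.Connection) or rely on pool checkout and check-in.

```python
from contextlib import contextmanager


class ToyConnection:
    """Hypothetical connection that records calls only inside a request."""

    def __init__(self):
        self.in_request = False
        self.call_history = []

    def begin_request(self):
        # Start of the replay scope: earlier history is no longer needed.
        self.in_request = True
        self.call_history.clear()

    def end_request(self):
        # End of the replay scope: recorded calls can be discarded.
        self.in_request = False
        self.call_history.clear()

    def execute(self, sql):
        if self.in_request:
            self.call_history.append(sql)
        return f"ran: {sql}"


@contextmanager
def request(conn):
    """Mark one logical unit of work, even if the body raises."""
    conn.begin_request()
    try:
        yield conn
    finally:
        conn.end_request()


conn = ToyConnection()
with request(conn):
    conn.execute("SELECT 1 FROM dual")
    conn.execute("UPDATE orders SET status = 'PAID' WHERE id = 42")
    calls_inside = len(conn.call_history)
```

Using a context manager keeps the end-of-request call in a finally block, so the boundary is closed and the call history released even when the work fails.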
2.3 Session State Management and Restoration (FAILOVER_RESTORE)
For a successful replay, the state of the original session before the interruption must be consistent with the state of the new session where the replay occurs. This includes not only the in-flight transaction but also non-transactional session settings, PL/SQL package states, temporary objects, and other session attributes.
Oracle provides various mechanisms and service parameters to manage this state:
- FAILOVER_RESTORE Service Parameter: Determines the extent to which session state is restored after failover.
  - LEVEL1 (or BASIC): Available for TAF and AC since Oracle Database 12.2, this setting restores commonly used basic initial session states (e.g., NLS settings).
  - AUTO (TAC): The recommended setting used with TAC. It enables automatic tracking, validation, and restoration of session state.
- SESSION_STATE_CONSISTENCY Service Parameter: Controls how session state is handled during a request.
  - DYNAMIC (Older default for AC): If non-transactional session state changes during the request, replay is internally disabled until the next request.
  - STATIC (Older, limited support for AC): Assumes the application does not change non-transactional state during the request. As of 23ai, its use with FAILOVER_TYPE=TRANSACTION is not supported.
  - AUTO (TAC): State is tracked and validated transparently. After a disablement, failover is automatically re-enabled when possible. Recommended for TAC.
- Restoring Application-Specific State: If there are application-specific initial states not covered by FAILOVER_RESTORE=LEVEL1 (e.g., custom PL/SQL package variables), additional mechanisms are needed:
  - Connection Initialization Callbacks (Java): The application can register a callback function to be invoked when a connection is obtained or during replay.
  - TAF Callbacks (OCI – Legacy): A similar mechanism for OCI applications.
  - UCP/WLS Connection Labeling: State management can be achieved by assigning labels to connections and defining callbacks that initialize state based on these labels.
  - It is crucial that these callback mechanisms are idempotent, meaning they produce the same result if run multiple times, as an outage could occur during the callback itself.
- Oracle 23ai Database Templates: This new feature introduced in 23ai provides more advanced checkpointing and restoration of session state, enhancing the scope and reliability of AC.
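The idempotence requirement for initialization callbacks can be made concrete with a small sketch. The session state here is just a Python dict and both functions are hypothetical; the point is only the difference between setting absolute values and accumulating state.

```python
def init_session_idempotent(session_state):
    """Sets absolute values, so running it twice leaves the same state."""
    session_state["nls_date_format"] = "YYYY-MM-DD"
    session_state["app_module"] = "orders"
    return session_state


def init_session_not_idempotent(session_state):
    """Accumulates on every run, so a repeated call changes the result."""
    session_state.setdefault("audit_tags", []).append("initialized")
    return session_state


# Simulate the callback running once, and running again after an outage.
once = init_session_idempotent({})
twice = init_session_idempotent(init_session_idempotent({}))
same_after_rerun = once == twice
```

A callback written in the first style can safely be re-run during replay, whereas the second style would leave the replayed session in a different state from the original.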
The complexity of session state management has been a significant barrier to AC/TAC adoption. The evolution from manual callbacks to FAILOVER_RESTORE=LEVEL1, then SESSION_STATE_CONSISTENCY=AUTO, and finally Database Templates demonstrates Oracle’s continuous effort to automate and simplify this critical aspect. Inconsistent session state between the original and replayed session is a primary cause of replay failure. TAC’s AUTO setting and 23ai’s Templates aim to make state management transparent, significantly increasing the likelihood of successful replay and broadening applicability. However, this implies that applications with complex, non-standard session state might still require careful design or potentially fall back to AC with custom callbacks. Understanding the application’s state usage is crucial for selecting the right approach and configuration. The addition of the RESET_STATE feature further underscores the need to manage state cleanly between requests.
2.4 Transactional State and Idempotence
AC and TAC aim to preserve the integrity of the last transaction during the replay of an interrupted request. This becomes critical, especially when an interruption occurs after the COMMIT command is sent but before the acknowledgment is received. This is where Transaction Guard (TG) comes into play.
TG determines the definitive outcome (COMMIT_OUTCOME) of the transaction, preventing the same transaction from being committed multiple times during replay. AC and TAC rely on this idempotence guarantee provided by TG to perform the replay safely. The detailed mechanism of Transaction Guard is discussed in Section 4.
3. Essential Enabling Components
The seamless operation of Application Continuity necessitates an ecosystem approach, requiring the coordinated function of the database, drivers, connection pools, and notification mechanisms.
3.1 Transaction Guard (TG): Guaranteeing Definitive Commit Outcomes
Transaction Guard is a cornerstone of AC/TAC. It prevents duplicate transactions by ensuring at-most-once execution during replay. AC and TAC depend on TG to determine the transaction’s status post-outage and ensure safe replay. (Details in Section 4).
3.2 Application Continuity Aware Drivers
For replay to occur, the client-side Oracle drivers must support Application Continuity and be capable of capturing database operations for potential replay. Key supported drivers include:
- JDBC: Oracle JDBC Replay Driver. Version 12c or later for AC, 18c or later for TAC. With the Oracle 23ai driver, AC support is automatically enabled.
- OCI (Oracle Call Interface): OCI Session Pool. Version 12.2 or later for AC, 19c or later for TAC.
- ODP.NET: Unmanaged Provider (in pooled mode). Version 12.2 or later for AC, 18c or later for TAC. Core and Managed ODP.NET support was added in later releases.
- SQL*Plus: Version 19c (specifically 19.3) or newer supports AC/TAC.
- Others: Support for languages like Python, PHP, Node.js is typically provided via the respective OCI or JDBC drivers.
Unsupported drivers or configurations include Asynchronous ODP.NET, older drivers, JDBC OCI Type 2 driver, OLE DB, ODBC, OCCI, and pre-compilers.
3.3 Connection Pools
Connection pools play a critical role in the effectiveness of AC and TAC. Pools manage the lifecycle of connections, which simplifies the determination of request boundaries. They also integrate with FAN/FCF to respond quickly to outage notifications and manage connections. Major supported pools include:
- Oracle Universal Connection Pool (UCP) (12c+)
- WebLogic Server (WLS) Active GridLink (12c+)
- Third-party JDBC application servers using UCP (e.g., JBoss, HikariCP)
- OCI Session Pool
- ODP.NET Connection Pool
Best practice dictates returning connections to the pool immediately after each request completes. Holding connections unnecessarily hinders draining during planned maintenance and compromises high availability.
3.4 Fast Application Notification (FAN) and Fast Connection Failover (FCF)
Fast Application Notification (FAN) is an Oracle Clusterware mechanism that publishes event notifications about the status of cluster and database services (e.g., instance crash, service start/stop, load balancing advisories).
FAN’s critical role is providing immediate notification about outages. This allows clients and connection pools to react swiftly instead of waiting for TCP/IP timeouts. FAN is a mandatory component for effective AC/TAC.
Fast Connection Failover (FCF) is a client-side feature, typically embedded within connection pools (UCP, WLS Active GridLink), that subscribes to FAN events. FCF uses the received FAN events to perform actions like:
- Immediately terminating or removing connections belonging to failed instances from the pool.
- Initiating connection draining for planned maintenance.
- Performing runtime connection load balancing.
FAN events are transported via the Oracle Notification Service (ONS), typically requiring port 6200 to be open.
The tight coupling between AC/TAC, specific drivers, connection pools, and FAN/FCF underscores that Application Continuity is not just a database feature but an ecosystem requiring coordinated configuration across tiers (client, mid-tier, database). Replay requires driver intelligence. Request boundaries are often managed by pools. Fast failure detection relies on FAN/FCF. Without all pieces working together (correct driver versions, pool configuration, network paths for ONS, service settings), AC/TAC will not function effectively, or at all. This necessitates a holistic view and collaboration between DBAs, application developers, and potentially network administrators for AC/TAC implementation. Simply enabling a service feature is insufficient. The checklist approach in MAA documentation reinforces the need for comprehensive configuration. While the introduction of automatic ONS configuration aims to simplify this, understanding the components remains crucial.
4. Deep Dive: Transaction Guard (TG)
Transaction Guard is a fundamental technology offered by Oracle Database that significantly enhances application reliability, especially after interruptions. It is the key mechanism behind the safe and automated replay capability of Application Continuity.
4.1 Purpose: Ensuring At-Most-Once Execution
The core problem TG solves is the uncertainty that arises when the COMMIT acknowledgment sent to the client is lost following a recoverable error (like a network outage). The application cannot know if the transaction actually succeeded. Retrying the transaction in this state of uncertainty could lead to the same transaction being executed multiple times (duplicate transaction), causing logical data corruption.
TG addresses this by providing idempotence. In this context, idempotence is the ability to guarantee that a transaction, if retried after an error, is executed at most once. TG enables the application to learn the definitive outcome of the last transaction before the interruption, thereby preventing logical corruption caused by duplicate transactions.
4.2 How It Works: Logical Transaction ID (LTXID)
At the heart of Transaction Guard is a globally unique identifier called the Logical Transaction ID (LTXID). The working principle involves these steps:
- LTXID Assignment: When a database session is established, it is automatically assigned an LTXID. This ID typically consists of the session’s logical number and a commit number that increments with each COMMIT or ROLLBACK within the session.
- LTXID Tracking: The database tracks the LTXID for each transaction within the session. A copy of the LTXID is held both on the client (in the OCI session handle or JDBC/ODP.NET connection object) and on the server.
- Commit Outcome Association: When a transaction is committed, its outcome (success/failure) is associated with the corresponding LTXID, and this information is persistently stored in the database.
- Reliable Outcome Retrieval: When an interruption or error occurs and the COMMIT acknowledgment is lost, the application (or AC) uses the LTXID held on the client for the failed session.
- GET_LTXID_OUTCOME Call: The application invokes the DBMS_APP_CONT.GET_LTXID_OUTCOME PL/SQL procedure, passing this LTXID.
- Outcome Return: Based on the stored LTXID information, the database returns the definitive outcome of the transaction (committed/not committed, completed/not completed) to the application. This information allows the application to safely decide whether to retry the transaction.
- At-Most-Once Enforcement: When the outcome is requested using the LTXID, the database can block any earlier in-flight transaction with the same LTXID from committing, thus ensuring at-most-once execution.
- Retention Period: The database retains the LTXID and its associated commit outcome for a configurable duration (default 24 hours, set by the RETENTION_TIMEOUT service parameter), giving applications sufficient time for recovery and outcome querying.
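The LTXID workflow above can be modelled in a few lines. This is a simplified sketch under stated assumptions: ToyTransactionGuard and recover are hypothetical names, the LTXID is represented as a (session, commit number) tuple, and only the committed flag is modelled, whereas the real GET_LTXID_OUTCOME procedure also reports whether the user call completed.

```python
class ToyTransactionGuard:
    """Hypothetical in-memory model of commit-outcome tracking by LTXID."""

    def __init__(self):
        self._outcomes = {}  # ltxid -> "COMMITTED" or "BLOCKED"

    def commit(self, ltxid):
        # A commit arriving after its LTXID was blocked must be rejected.
        if self._outcomes.get(ltxid) == "BLOCKED":
            raise RuntimeError(f"LTXID {ltxid} was blocked; commit rejected")
        self._outcomes[ltxid] = "COMMITTED"

    def get_ltxid_outcome(self, ltxid):
        """Return True if committed; otherwise block the LTXID and return False."""
        if self._outcomes.get(ltxid) == "COMMITTED":
            return True
        self._outcomes[ltxid] = "BLOCKED"
        return False


def recover(guard, ltxid, resubmit):
    """Resubmit work only when the earlier attempt is known not to have committed."""
    if guard.get_ltxid_outcome(ltxid):
        return "already committed"
    resubmit()
    return "resubmitted"


guard = ToyTransactionGuard()
guard.commit(("session-1", 1))  # the commit succeeded, but the reply was lost
decision = recover(guard, ("session-1", 1), resubmit=lambda: None)
```

Blocking the LTXID at the moment the outcome is queried is what closes the race: an in-flight commit that arrives afterwards fails instead of producing a duplicate.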
4.3 Benefits and Use Cases
The primary benefits provided by Transaction Guard include:
- Provides definitive commit outcomes.
- Prevents logical corruption by avoiding duplicate transactions.
- Enables safe transaction replay.
- Forms the foundation for Application Continuity.
- Improves user experience and reduces support costs.
- Increases developer productivity.
Use cases include critical applications where duplicate transactions are unacceptable (banking, e-commerce order systems, etc.), enabling AC/TAC, and allowing applications to safely implement their own custom recovery logic.
4.4 Configuration Requirements
To enable and use Transaction Guard, the following steps are required:
- Database Version: Oracle Database 12.1 or newer must be used.
- Application Service: All database work must go through a specifically created application service. The default database service should not be used. The service is created with srvctl for RAC or DBMS_SERVICE for single instance.
- COMMIT_OUTCOME Parameter: The COMMIT_OUTCOME parameter must be set to TRUE on the application service.
- Grant Permission: EXECUTE privilege on the DBMS_APP_CONT package must be granted to the application users who will call the GET_LTXID_OUTCOME procedure.
- DDL_LOCK_TIMEOUT (Optional): If TG is to be used with DDL statements, increasing the DDL_LOCK_TIMEOUT parameter might be considered.
- Recommendations: FAN configuration (for RAC/Data Guard), checking RETENTION_TIMEOUT, using connection pools.
4.5 Relationship with Application Continuity
Transaction Guard is a foundational technology underlying Application Continuity. AC/TAC internally and automatically uses TG when performing replay after an outage to reliably determine the status of the previous transaction.
Thanks to this integration, developers using AC/TAC generally do not need to interact directly with TG APIs; AC/TAC manages this process behind the scenes. However, if an application needs to implement its own custom recovery or replay logic, TG can also be used independently.
In essence, Transaction Guard lays the groundwork for safe replay by addressing the fundamental problem of uncertainty during commit in distributed systems. AC/TAC builds upon this foundation to offer an automated and transparent application continuity solution. Understanding TG is important not only for AC/TAC but also for building robust Oracle applications that need to handle failures during transactions.
5. Application Continuity (AC) vs. Transparent Application Continuity (TAC)
Oracle offers two primary mechanisms for ensuring application continuity: Application Continuity (AC) and Transparent Application Continuity (TAC). While both serve the same fundamental purpose, they differ significantly in their operation, configuration requirements, and impact on the application. Choosing the right solution depends on the application’s architecture, the technologies used, and the desired level of transparency.
5.1 Key Differences Summarized
The following table summarizes the core differences between AC and TAC:
Feature | Application Continuity (AC) | Transparent Application Continuity (TAC) |
---|---|---|
Transparency Level | Lower (Requires pool/boundary awareness, potential code changes for state/side-effects) | Higher (Aims for zero code changes, automatic state/boundary management) |
Request Boundaries | Explicit (App/Pool defined) or Implicit (Pool defined) | Implicit/Discovered (Driver/Database detects) |
Session State Management | Requires FAILOVER_RESTORE=LEVEL1 + potential Callbacks/Labeling | Uses FAILOVER_RESTORE=AUTO, SESSION_STATE_CONSISTENCY=AUTO |
Side Effect Handling (Default) | Replays side effects | Does not replay side effects |
Customization (Callbacks, Side Effects) | Yes (Allows callbacks, explicit side-effect replay) | No (Designed for transparency, avoids complex customization) |
DB Version Introduced | 12.1 (JDBC), 12.2 (OCI/ODP.NET) | 18c/19c |
Key Service Setting (FAILOVER_TYPE) | TRANSACTION | AUTO |
This table provides a concise, direct comparison of the most critical differentiating factors between AC and TAC to support the choice between them.
5.2 Transparency and Configuration Effort
TAC’s primary design goal is to require minimal or zero changes to application code. Features like automatic state tracking and request boundary discovery mean TAC configuration is generally simpler than AC, often achieved by using AUTO values in service parameters.
In contrast, AC might necessitate code adjustments, particularly when using older oracle.sql.* concrete classes, when custom session state management is needed, or when request boundaries must be manually defined if not using a connection pool.
5.3 Handling Side Effects (Non-Idempotent Operations)
Side effects are actions that occur outside the main database transaction and leave persistent results. Examples include sending emails via UTL_SMTP, writing to the file system, calling external web services, or autonomous transactions.
- AC’s Default Behavior: AC replays statements with side effects by default. This can lead to undesired outcomes in some cases (e.g., sending the same email twice). AC provides mechanisms to manage this, such as disabling replay for specific code blocks (disableReplay API) or potentially using callbacks for custom handling.
- TAC’s Default Behavior: TAC does not replay side effects by default. TAC automatically detects calls known to have side effects and prevents their replay. This provides a safer default behavior for most applications.
The ACCHK tool can report on non-replayable side effects.
5.4 Customization Capabilities (Callbacks, Initial State)
AC allows customization for complex initial session state setups via connection initialization callbacks or connection labeling. This is useful when the application needs to be brought to a specific initial state post-failover.
TAC generally does not support such customizations, as its core philosophy is transparency and automation.
5.5 When to Choose AC vs. TAC?
- TAC: Is the default and recommended solution for most modern applications, especially where code changes are undesirable or not feasible. It’s simpler and leverages automatic features.
- AC: May be preferred or required in the following situations:
- Using older driver or database versions (pre-18c/19c for TAC).
- Needing fine-grained control over side-effect replay (e.g., intentionally replaying or suppressing specific side effects).
- Requiring complex initial state setup via callbacks.
- The application uses state patterns that cannot be automatically managed by TAC (e.g., persistent temporary tables not cleaned up between requests).
The choice between AC and TAC is not purely technical; it reflects a trade-off between transparency/simplicity (TAC) and control/customization (AC). TAC lowers the barrier to adoption by automating many complex aspects. However, this automation comes at the cost of flexibility. AC provides hooks for developers to handle edge cases or specific requirements that don’t fit the TAC model. Architects need a thorough understanding of the application’s behavior regarding session state and side effects to make the right choice. Migrating an application designed for AC to TAC might require analysis to understand if the default TAC behavior (e.g., not replaying side effects) is acceptable. The existence of both options caters to a wider range of application architectures and legacy constraints.
6. Requirements and Compatibility
Successfully implementing Application Continuity depends on meeting a set of requirements spanning database and client versions, service configuration, and application design. Incompatible or incomplete configurations can lead to the feature not working as expected, or not working at all.
6.1 Database and Client Version Prerequisites
Specific minimum Oracle database and client versions are required for AC, TAC, and related technologies to function. The table below summarizes these requirements:
Component | AC Requirement | TAC Requirement | TG Requirement | 23ai Features |
---|---|---|---|---|
Database | 12.1+ | 18c+ (19c+ recommended) | 12.1+ | 23ai |
JDBC Driver | 12c+ Replay | 18c+ Replay (19c+ rec.) | 12.1+ Thin | 23ai (auto-on) |
OCI Client/Driver | 12.2+ | 19c+ | 12.1+ | 23ai |
ODP.NET Unmanaged (Pooled) | 12.2+ | 18c+ | 12.1+ | 23ai (full) |
SQL*Plus | 19.3+ | 19.3+ | – | 23ai |
This table centralizes critical version compatibility information necessary for planning deployments or upgrades, preventing users from attempting unsupported combinations.
Using the latest client drivers is always strongly recommended for best results and full feature support. Mismatches between driver and database versions can lead to issues.
6.2 Essential Service Configuration Parameters
Using application-specific services managed by Oracle Clusterware (or created with DBMS_SERVICE for single instance) instead of the default database service is mandatory for AC and TAC. These services must be configured using the srvctl modify service (or add service) command with the following key parameters:
- -failovertype: Determines replay behavior. Use TRANSACTION for AC, AUTO for TAC.
- -commit_outcome: Enables Transaction Guard. Must be set to TRUE for AC and TAC.
- -failover_restore: Sets the session state restoration level. Typically LEVEL1 for AC, AUTO for TAC.
- -session_state: Defines the session state consistency mode (the SESSION_STATE_CONSISTENCY service attribute). AUTO is recommended for TAC.
- -replay_init_time: Specifies the maximum time (in seconds) allowed for replay to begin. A crucial tuning parameter.
- -retention: Determines how long (in seconds) the Transaction Guard commit outcome is retained (the RETENTION_TIMEOUT service attribute).
- -drain_timeout: Specifies the time (in seconds) allowed for draining active sessions during planned maintenance.
- -stopoption: Defines how the service is stopped during planned maintenance (e.g., IMMEDIATE).
- -notification: Determines if FAN events are published for the service (should be set to TRUE).
- -rlbgoal, -clbgoal: Set runtime and connection-time load balancing goals.
- -failoverretry, -failoverdelay: Connection retry settings.
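To show how these parameters combine for AC versus TAC, the helper below assembles a srvctl modify service command line as a string. It is only a sketch: the function name, the database and service names, and the timeout values (600, 86400, 60) are illustrative assumptions rather than recommendations, and the flag spellings (in particular -retention for the retention timeout) should be confirmed against srvctl's built-in help for the release in use.

```python
import shlex

# Mode-specific settings taken from the parameter list above.
MODE_SETTINGS = {
    "AC": {"-failovertype": "TRANSACTION", "-failover_restore": "LEVEL1"},
    "TAC": {"-failovertype": "AUTO", "-failover_restore": "AUTO"},
}


def srvctl_modify_service(db, service, mode,
                          replay_init_time=600, retention=86400, drain_timeout=60):
    """Build (not run) a srvctl command for an AC or TAC application service.

    The numeric defaults are placeholder values chosen for this example.
    """
    settings = dict(MODE_SETTINGS[mode.upper()])
    settings.update({
        "-commit_outcome": "TRUE",        # enables Transaction Guard
        "-replay_init_time": replay_init_time,
        "-retention": retention,          # commit-outcome retention, in seconds
        "-drain_timeout": drain_timeout,
        "-stopoption": "IMMEDIATE",
        "-notification": "TRUE",          # publish FAN events for the service
    })
    args = ["srvctl", "modify", "service", "-db", db, "-service", service]
    for flag, value in settings.items():
        args += [flag, str(value)]
    return shlex.join(args)


tac_command = srvctl_modify_service("orcl", "app_svc", "TAC")
ac_command = srvctl_modify_service("orcl", "app_svc", "AC")
```

Keeping the mode-specific values in one table makes the practical difference visible: switching a service between AC and TAC comes down to the failover type and the restore level, while the Transaction Guard, draining, and notification settings are shared.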
6.3 Application Design Considerations
The application’s design and coding practices directly impact the effectiveness of AC/TAC:
- Pooling: Using Oracle connection pools (UCP, WLS Active GridLink, OCI/ODP.NET pools) is strongly recommended. Connections should be returned to the pool immediately after use.
- Request Boundaries: Clear request boundaries must be ensured, either implicitly via pools or explicitly via API calls if necessary.
- Statelessness: Aim for stateless application logic between requests whenever possible. If state exists, ensure it is managed correctly (restorable via FAILOVER_RESTORE or callbacks for AC). The RESET_STATE feature in 23ai can also assist here.
- Error Handling: Applications must still have robust error handling for unrecoverable errors not handled by AC/TAC.
- Avoid Legacy Concrete Classes: JDBC applications should avoid using legacy concrete classes from the oracle.sql.* package, preferring standard JDBC or oracle.jdbc.* interfaces. ACCHK can detect these.
- Mutable Functions: Understand how AC/TAC handles mutable functions like SYSDATE, SYS_GUID, sequence.NEXTVAL (values are preserved during replay). Grant KEEP privileges if necessary.
6.4 Supported and Unsupported Configurations/Operations
- Supported: Standard SQL and PL/SQL operations with compatible drivers, pools, and database versions are generally supported. XA transactions are supported with specific replay data sources (OracleXADataSourceImpl).
- Unsupported/Limitations:
  - Asynchronous ODP.NET.
  - Specific drivers (JDBC Type 2 OCI, OLE DB, ODBC, OCCI, Pro*C, etc.).
  - LONG/LONG RAW data types: Because of their streaming nature, these can cause replay failures if not handled carefully. Using LOBs is recommended instead.
  - Certain session state changes or PL/SQL calls might temporarily disable replay.
  - Operations using legacy oracle.sql.* classes.
Successful AC/TAC deployment hinges on meeting a cascade of prerequisites involving database version, client version, specific driver types, connection pooling strategy, and service configuration. It is not a plug-and-play feature. Each component (Database, driver, pool, FAN) has evolved with AC/TAC capabilities. Using mismatched versions can lead to partial functionality or complete failure. Service parameters directly control AC/TAC behavior. Application design choices (pooling, state management) interact directly with AC/TAC mechanisms. Therefore, a comprehensive checklist approach covering all tiers is essential before implementation. Overlooking any requirement (e.g., using an unsupported driver, not configuring FAN, incorrect service settings) will likely lead to failed replays and frustration. The version requirements also imply that leveraging the latest AC/TAC features often necessitates upgrading both server and client components.
7. Benefits, Limitations, and Considerations
While Application Continuity offers significant availability advantages for applications relying on Oracle databases, it also has specific limitations and requires careful planning and configuration for successful implementation.
7.1 Advantages of Implementing AC/TAC
- Outage Masking: Hides both planned maintenance (patching, configuration) and unplanned outages (instance, network, storage failures) from applications and end-users.
- Improved User Experience: Users experience only brief delays instead of error messages during interruptions, increasing satisfaction.
- Increased Application Availability: Minimizes downtime, ensuring business continuity.
- Developer Productivity: Reduces the need for complex error-handling code for recoverable errors.
- Transaction Integrity (Idempotence): Prevents duplicate transaction commits during replay, thanks to Transaction Guard.
- TAC Transparency: TAC offers easy configuration without requiring code changes.
- Broad Platform Support: Supports various platforms like Java, .NET, Python, and common connection pools.
7.2 Potential Drawbacks, Limitations, and Common Pitfalls
- Recoverable Errors Only: Only handles recoverable database errors like network issues or instance failures. Unrecoverable errors originating from application logic or invalid data must still be managed by the application.
- Compatibility Requirements: Requires specific and compatible Oracle database and client versions, dedicated drivers, and connection pools (See Section 6).
- Configuration Complexity: Needs careful service configuration, FAN setup, and potentially application adjustments. Misconfiguration leads to replay failure.
- Session State Management: Restoring session state correctly can be complex, especially for AC or stateful applications. Unrestorable state prevents replay.
- Performance Impact: State tracking and potential replay can introduce some performance overhead, although usually minimal. Tuning parameters like replay_init_time is important.
- Debugging Challenges: Diagnosing replay failures can be difficult, requiring tracing and tools like ACCHK.
- Non-Replayable Operations: Not all operations are replayable (e.g., those using certain legacy classes, some complex PL/SQL or external calls if not handled carefully).
- Application Assumptions: Code assuming ROWIDs don’t change or relying on middle-tier timing might encounter issues.
7.3 Understanding and Managing Side Effects
Managing side effects (actions outside the database transaction, such as sending email or writing files) is a key difference between AC and TAC and requires careful consideration.
- Default Behaviors: AC replays side effects by default, while TAC does not.
- Strategies for AC:
  - If replay of the side effect is acceptable (it’s idempotent or business logic allows), the default behavior can be used.
  - To prevent unwanted replay, disable replay for specific code blocks using the disableReplay() API.
  - Callbacks can be used for custom logic, though they add complexity.
- Strategy for TAC: Rely on automatic detection and suppression. If a side effect must be replayed, TAC might not be suitable, and AC should be considered.
- Analysis: Identifying side effects during application analysis is crucial.
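The default behaviors described above can be summarized as a small decision model. The following Python sketch is illustrative only: it is not an Oracle API, and the names Mode, Request, and will_replay are hypothetical. It simply encodes the rules stated in this section, assuming a request either contains a side effect or does not.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    AC = "AC"    # service configured with failovertype TRANSACTION
    TAC = "TAC"  # service configured with failovertype AUTO


@dataclass(frozen=True)
class Request:
    name: str
    has_side_effect: bool = False  # e.g. sends mail or writes a file
    replay_disabled: bool = False  # application called disableReplay()


def will_replay(mode: Mode, request: Request) -> bool:
    """Return True if the in-flight request would be replayed after a recoverable error."""
    if request.replay_disabled:
        # An explicit opt-out always wins.
        return False
    if mode is Mode.TAC and request.has_side_effect:
        # TAC detects the side effect and suppresses replay.
        return False
    # AC replays by default, side effects included.
    return True


if __name__ == "__main__":
    mail = Request("send_invoice_mail", has_side_effect=True)
    guarded = Request("send_invoice_mail", has_side_effect=True, replay_disabled=True)
    for mode, req in [(Mode.AC, mail), (Mode.TAC, mail), (Mode.AC, guarded)]:
        print(mode.value, req.name, "replay" if will_replay(mode, req) else "no replay")
```

A table like this is a convenient way to record the outcome of the side-effect analysis: one row per request type, with the expected replay behavior under the chosen mode.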
While AC and TAC offer significant availability benefits, they are not a panacea. Their effectiveness depends heavily on the application’s architecture, the nature of the failure, and meticulous configuration and validation. The limitations make it clear that AC/TAC targets a specific class of problems: it does not fix application bugs or handle every possible state. The pitfalls around configuration and state indicate that implementation requires expertise and testing. Side effect management remains a key differentiator and potential point of complexity. Therefore, setting realistic expectations is vital. Organizations should view AC/TAC as powerful tools within a larger HA strategy, not a complete replacement for robust application design and error handling. Thorough testing and validation with tools like ACCHK are non-negotiable before production deployment.
8. Integration with Oracle High Availability Solutions
While Application Continuity is a valuable feature on its own, its power and effectiveness are maximized when integrated with Oracle’s other high availability solutions. Particularly when used alongside technologies like Oracle RAC and Active Data Guard, it provides multi-layered protection, creating a comprehensive defense against application interruptions.
8.1 AC/TAC and Real Application Clusters (RAC)
Oracle RAC allows a single database to run across multiple servers (nodes), providing instance-level redundancy and scalability. AC/TAC plays a critical role in RAC environments in the following ways:
- Failover Target: When a RAC node or instance fails, AC/TAC directs the session to another surviving instance in the cluster and initiates the replay process.
- Clusterware Services and FAN Integration: In RAC, database services are managed by Oracle Clusterware. These services distribute workloads and define preferred and available instances. FAN, as an integral part of Clusterware, instantly broadcasts changes in node or service status (crash, start, stop). AC/TAC uses these FAN notifications to detect failures rapidly and trigger the failover process.
- Planned Maintenance and Load Balancing: RAC services are used to manage workload redirection or draining during planned maintenance (e.g., rolling patching). AC/TAC helps ensure continuity during these processes. Furthermore, load balancing information provided via FAN allows connection pools to direct new connections to less loaded instances.
- 23ai Enhancements: Features introduced in Oracle 23ai, such as Smart Connection Rebalance and Smooth Reconfiguration of RAC Instances, further enhance overall application availability in RAC environments.
8.2 AC/TAC and Active Data Guard (ADG)
Oracle Active Data Guard provides disaster recovery and data protection by creating one or more synchronized physical standby copies of a primary database. The standby can also be used for read-only queries (Active). AC/TAC is also supported in ADG environments:
- Disaster Recovery Scenarios: When the primary database becomes completely unavailable (e.g., site disaster), Data Guard performs an automatic or manual failover to the standby database. Planned role changes use a switchover instead.
- Role-Based Services: In ADG environments, services are typically defined based on a specific role (PRIMARY or STANDBY). An AC/TAC-enabled service is defined to run on the database holding the primary role.
- Post-Failover Replay: After the role transition (failover/switchover) completes, the application service starts on the new primary database. AC/TAC directs client connections to this new primary and attempts to rebuild and replay the interrupted session and transaction there.
- Data Loss Consideration: Crucially, if data loss occurred during the Data Guard role transition, AC/TAC will not attempt replay. Therefore, running Data Guard in Maximum Availability or Maximum Protection mode is preferred for seamless failover with AC/TAC.
8.3 Relationship with Connection Draining for Planned Maintenance
The primary mechanism for achieving zero downtime during planned maintenance (e.g., database or OS patching, hardware upgrades) is connection draining.
- Process: Services on the instance undergoing maintenance are stopped or relocated with a specific drain timeout (drain_timeout), using srvctl relocate service or srvctl stop service. FAN notifies connection pools of this status. Pools stop giving new connections to the instance marked for draining and close existing idle connections. Active connections are closed when they finish their work and are returned to the pool.
- AC/TAC’s Role: AC/TAC acts as a backup mechanism during the draining process. If a session cannot complete its work within the defined drain_timeout and its connection is forcibly terminated, AC/TAC attempts to replay that session on a surviving instance. This helps ensure continuity even if the drain timeout is insufficient or for unexpectedly long-running transactions.
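The interplay between draining and replay can be pictured with a small classification routine. This is a simplified model, not Oracle code: Session, seconds_of_work_left, and plan_drain are hypothetical names, and the model assumes the remaining work time of each session is known in advance, which a real pool never knows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    sid: int
    seconds_of_work_left: float  # 0 means the connection is idle in the pool


def plan_drain(sessions, drain_timeout):
    """Classify sessions on an instance that is being drained.

    Returns three lists of session ids: closed while idle, drained gracefully
    (work finishes within the timeout), and left for AC/TAC replay (still busy
    when the timeout expires and the connection is terminated).
    """
    idle, drained, replayed = [], [], []
    for s in sessions:
        if s.seconds_of_work_left <= 0:
            idle.append(s.sid)
        elif s.seconds_of_work_left <= drain_timeout:
            drained.append(s.sid)
        else:
            replayed.append(s.sid)
    return idle, drained, replayed


if __name__ == "__main__":
    sessions = [Session(101, 0), Session(102, 45), Session(103, 900)]
    idle, drained, replayed = plan_drain(sessions, drain_timeout=600)
    print("closed while idle:", idle)
    print("drained in time:", drained)
    print("handed to AC/TAC replay:", replayed)
```

Choosing drain_timeout so that the third list is usually empty keeps replay as the exception rather than the normal path during planned maintenance.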
AC/TAC is most potent when deployed within a comprehensive MAA framework involving RAC and/or Active Data Guard, leveraging the redundancy and failover capabilities of the underlying infrastructure. RAC provides immediate local failover targets. ADG provides disaster recovery failover targets. FAN, integral to Clusterware/Data Guard Broker, provides the necessary rapid notifications. Draining handles planned events gracefully. AC/TAC fills the gap by handling the application session recovery during these infrastructure events. While AC/TAC can provide some benefit on a single instance, its true value emerges in clustered or replicated environments where alternative processing resources are readily available. The synergy between these technologies (RAC/ADG + FAN + Pools + AC/TAC) creates a multi-layered defense against disruption.
9. Enhancements in Oracle Database 23ai
Oracle continues to invest in Application Continuity and related high availability technologies. The Oracle Database 23ai release introduces significant new features and improvements in this area, expanding the capabilities of AC/TAC and simplifying its usage.
9.1 Overview of Key HA/Continuity Features in 23ai
Oracle 23ai includes a range of innovations focused on enhancing continuous availability. The main enhancements directly or indirectly related to Application Continuity are:
23ai Feature | Description | Benefit |
---|---|---|
AC Session State Restore (Database Templates) | Uses templates to checkpoint/restore session state for AC replay/migration. | Simplifies/broadens AC use, improves state restore reliability, reduces planned downtime. |
AC Batch Support (Resumable Cursors) | TAC automatically manages and allows replay of long-running cursors common in batch jobs. | Extends AC/TAC protection to batch workloads. |
JDBC Auto-Enabled AC | AC support is enabled by default in 23ai JDBC drivers; only requires AC-enabled service. | Lowers the entry barrier for Java applications. |
JDBC True Cache Integration | JDBC driver can route read-only workloads to True Cache instances. | Improves performance/scalability (indirectly related to replay). |
Database Native Transaction Guard | Persists LTXID as part of commit, reducing overhead compared to separate table. | Improves TG performance, requires no client changes. |
Smart Connection Rebalance | Automatically moves sessions between RAC instances based on performance. | Transparently improves performance/resource utilization. |
Smooth Reconfiguration of RAC Instances | Reduces downtime when nodes join/leave a RAC cluster. | Enhances continuous availability during cluster changes. |
JDBC Self-Driver Diagnosability | Single production JAR; dumps in-memory trace on first failure. | Simplifies debugging and diagnostics for JDBC/AC issues. |
9.2 Database Templates for Session State Restoration
Prior to Oracle 23ai, restoring complex session states, especially with AC, could require mechanisms like manual callbacks or connection labeling. Database Templates, introduced in 23ai, significantly simplify and automate this process. These templates are used to periodically checkpoint the session state, both server-side and client-visible aspects. When an outage occurs and replay is needed, AC uses these templates to quickly and reliably restore the session state at the start of the replay. Enhancements in the JDBC driver allow these templates to be shared across sessions and manage state variations. This simplifies AC configuration and increases the likelihood of successful replay.
9.3 Support for Batch Applications (Resumable Cursors)
Traditionally, long-running database cursors (often used in batch jobs or reporting) could prevent AC/TAC from automatically detecting request boundaries, as the request wasn’t considered finished while the cursor was open. Oracle 23ai addresses this with TAC and the JDBC driver. With Resumable Cursors support, long-running cursors that meet certain criteria (e.g., open with no open transaction and restorable session state) no longer prevent the implicit determination of request boundaries. This extends TAC protection to batch-type workloads that might have previously been excluded.
9.4 JDBC Driver Enhancements (Auto-Enablement, True Cache)
The Oracle 23ai JDBC driver includes significant innovations simplifying AC usage:
- Auto-Enablement: With the 23ai driver, Application Continuity support is enabled by default in all data sources. All that’s required for an application to benefit from AC is to connect to an AC/TAC-enabled database service. To disable AC, the new oracle.jdbc.enableACSupport=false connection property or system property can be used.
- True Cache Integration: True Cache is an in-memory, consistent, read-only replica of the primary database. When enabled via the oracle.jdbc.useTrueCacheDriverConnection=true property, the 23ai JDBC driver can automatically route read-only workloads to appropriate True Cache instances, improving performance and scalability.
- Self-Diagnosability: To simplify debugging, the 23ai JDBC driver records critical execution state in memory and dumps this recording when an error occurs, providing valuable diagnostic information on the first occurrence of the problem. The need for separate debug and metrics JAR files has also been eliminated.
9.5 Transaction Guard Improvements
Oracle 23ai introduces Database Native Transaction Guard. This enhancement persists the LTXID and commit outcome as part of the transaction’s commit record itself, rather than in a separate table. This eliminates the extra redo generation and performance overhead associated with Transaction Guard and requires no client-side changes.
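The at-most-once guarantee behind Transaction Guard can be illustrated with a toy model. The sketch below is not Oracle's implementation and uses no Oracle API: ToyDatabase, submit_once, and the string LTXIDs are hypothetical stand-ins. A real client obtains the commit outcome from the database (for example through DBMS_APP_CONT.GET_LTXID_OUTCOME) and receives a fresh LTXID from its new session. The point shown is only that the outcome is recorded together with the commit and consulted before any resubmission.

```python
class ToyDatabase:
    """In-memory stand-in that records the commit outcome per logical transaction id."""

    def __init__(self):
        self.rows = []
        self.outcomes = {}  # ltxid -> True once the commit is durable

    def commit(self, ltxid, row, outage=None):
        if outage == "before":
            raise ConnectionError("session lost before commit")
        self.rows.append(row)
        self.outcomes[ltxid] = True  # outcome persisted together with the commit
        if outage == "after":
            raise ConnectionError("commit reply lost")  # client never sees the result

    def committed(self, ltxid):
        return self.outcomes.get(ltxid, False)


def submit_once(db, ltxid, row, outage=None):
    """Commit a row; after an outage, resubmit only if the first attempt did not commit."""
    try:
        db.commit(ltxid, row, outage=outage)
        return "committed"
    except ConnectionError:
        if db.committed(ltxid):
            return "already committed, not resubmitted"
        # Hypothetical new LTXID standing in for the one issued to the new session.
        db.commit(ltxid + "-retry", row)
        return "resubmitted"


if __name__ == "__main__":
    db = ToyDatabase()
    print(submit_once(db, "tx1", {"order": 1}, outage="after"))
    print(submit_once(db, "tx2", {"order": 2}, outage="before"))
    print(len(db.rows), "rows stored")
```

In both outage cases each order is stored exactly once, which is the property replay relies on.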
9.6 Smart Connection Rebalance and Smooth Reconfiguration
While not direct AC/TAC mechanisms, these two features enhance overall availability and performance in RAC environments, making the platform AC/TAC runs on more stable:
- Smart Connection Rebalance: Automatically redistributes sessions across RAC instances based on real-time performance, balancing load and optimizing performance.
- Smooth Reconfiguration: Reduces the brief service interruptions (brownouts) experienced during RAC topology changes, such as nodes joining or leaving the cluster.
Oracle 23ai represents a significant leap forward in making Application Continuity more powerful, easier to manage, and applicable to a broader range of workloads (notably batch). Features like Templates, Resumable Cursor support, and JDBC Auto-enablement directly address previous limitations or complexities. Native TG improves efficiency. RAC enhancements strengthen the underlying platform. For organizations considering AC/TAC, 23ai offers compelling reasons to upgrade. It lowers adoption barriers and expands protection, aligning with trends towards autonomous operations and broader HA coverage. True Cache also signals a move towards optimizing read performance in HA architectures.
10. Configuration, Validation, and Best Practices
Transforming the theoretical benefits of Application Continuity into reality requires careful configuration, thorough validation, and adherence to established best practices. Ensuring the correct settings on both the database server and client-side, along with designing the application to be compatible with AC/TAC, is critically important.
10.1 Configuring Database Services for AC/TAC
Enabling AC/TAC and controlling its behavior is done through application services managed via Oracle Clusterware (for RAC) or the DBMS_SERVICE package (for single instance). Key parameters set using the srvctl add service or srvctl modify service commands include:
- Example Configuration for TAC:

      srvctl add service -db <db_unique_name> -pdb <pdb_name> -service GOLD_TAC \
        -preferred <instance1> -available <instance2> \
        -failovertype AUTO \
        -commit_outcome TRUE \
        -failover_restore AUTO \
        -session_state AUTO \
        -replay_init_time 1800 \
        -retention 86400 \
        -notification TRUE \
        -drain_timeout 600 \
        -stopoption IMMEDIATE

  This command uses the recommended AUTO settings for TAC (failovertype, failover_restore, session_state) and enables Transaction Guard (commit_outcome=TRUE). Note that srvctl spells the session state consistency setting -session_state; session_state_consistency is the name used by DBMS_SERVICE. Parameters like replay_init_time (replay initiation timeout), retention (TG outcome retention time), and drain_timeout (draining timeout) should be tuned based on the workload.
- Example Configuration for AC:

      srvctl add service -db <db_unique_name> -pdb <pdb_name> -service GOLD_AC \
        -preferred <instance1> -available <instance2> \
        -failovertype TRANSACTION \
        -commit_outcome TRUE \
        -failover_restore LEVEL1 \
        -replay_init_time 1800 \
        -retention 86400 \
        -notification TRUE \
        -drain_timeout 600 \
        -stopoption IMMEDIATE

  This command uses failovertype=TRANSACTION for AC and failover_restore=LEVEL1 for basic state restoration.
Careful tuning of parameters (especially replay_init_time and drain_timeout) based on workload and expected outage durations is essential.
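When the same service definition has to be created in several environments, it can help to keep the settings as data and generate the command from them. The following Python sketch assembles the argument list for the TAC example; the function name srvctl_add_service, the TAC_SERVICE dictionary, and the sample database and instance names are hypothetical. The session-state flag is deliberately left out here and should be added using the spelling your srvctl version accepts. The only check performed is that Transaction Guard (commit_outcome) is enabled whenever failovertype is TRANSACTION or AUTO.

```python
import shlex

# Settings taken from the TAC example in this section; the numeric values
# are placeholders that should be tuned for the workload.
TAC_SERVICE = {
    "failovertype": "AUTO",
    "commit_outcome": "TRUE",
    "failover_restore": "AUTO",
    "replay_init_time": 1800,
    "retention": 86400,
    "notification": "TRUE",
    "drain_timeout": 600,
    "stopoption": "IMMEDIATE",
}


def srvctl_add_service(db, pdb, service, preferred, available, settings):
    """Return the argument list for an `srvctl add service` call."""
    failovertype = str(settings.get("failovertype", "")).upper()
    commit_outcome = str(settings.get("commit_outcome", "")).upper()
    if failovertype in ("TRANSACTION", "AUTO") and commit_outcome != "TRUE":
        raise ValueError("AC/TAC services need commit_outcome TRUE (Transaction Guard)")
    args = ["srvctl", "add", "service", "-db", db, "-pdb", pdb, "-service", service,
            "-preferred", preferred, "-available", available]
    for flag, value in settings.items():
        args += [f"-{flag}", str(value)]
    return args


if __name__ == "__main__":
    cmd = srvctl_add_service("mydb", "mypdb", "GOLD_TAC", "inst1", "inst2", TAC_SERVICE)
    print(shlex.join(cmd))  # review the command before running it on a cluster node
```

Returning a list instead of a string keeps the result safe to pass to subprocess.run without shell quoting issues.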
10.2 Client-Side Configuration
Clients must connect correctly to AC/TAC-enabled services and be configured to leverage failover mechanisms:
- TNS Connect String: The recommended connect string should include parameters for timeouts (CONNECT_TIMEOUT, TRANSPORT_CONNECT_TIMEOUT), retry count (RETRY_COUNT), and delay (RETRY_DELAY). For RAC environments, SCAN addresses (HOST=<scan_vip> within ADDRESS_LIST) and SERVICE_NAME must be used (SID usage is unsupported and discouraged).

      Alias =
        (DESCRIPTION =
          (CONNECT_TIMEOUT=3)(RETRY_COUNT=4)(RETRY_DELAY=2)(TRANSPORT_CONNECT_TIMEOUT=3)
          (ADDRESS_LIST =
            (LOAD_BALANCE=on)
            (ADDRESS = (PROTOCOL = TCP)(HOST=<scan_vip_or_host1>)(PORT=1521))
            (ADDRESS = (PROTOCOL = TCP)(HOST=<host2_vip>)(PORT=1521))...
          )
          (CONNECT_DATA =
            (SERVICE_NAME = GOLD_TAC)
          )
        )

- JDBC Configuration:
  - DataSource: For pre-23ai drivers, oracle.jdbc.replay.OracleDataSourceImpl (or OracleXADataSourceImpl for XA) must be used. In 23ai, the standard OracleDataSource is sufficient as AC is auto-enabled.
  - Pool Properties: In pools like UCP, properties like setFastConnectionFailoverEnabled(true) must be set to enable FCF.
  - JAR Files: Ensure necessary Oracle JDBC driver (e.g., ojdbc11.jar), UCP (ucp.jar), and ONS (ons.jar) JAR files are in the CLASSPATH.
- OCI/ODP.NET Configuration:
  - oraaccess.xml: FAN and Runtime Load Balancing (RLB) settings can be configured via this file.
  - Connection String Attributes: Properties like HA Events=true (enable FAN), Load Balancing=true (enable RLB), Pooling=true can be specified in the connection string.
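Connect descriptors are easy to mistype, so some teams generate them. The sketch below builds a descriptor with the timeout and retry parameters listed above. It is a minimal illustration: build_descriptor and the example host names are hypothetical, the default values simply mirror the sample alias, and the output is the single-line form of the descriptor.

```python
def build_descriptor(hosts, service_name, port=1521, connect_timeout=3,
                     retry_count=4, retry_delay=2, transport_connect_timeout=3):
    """Build a connect descriptor with timeout and retry settings for an AC/TAC service."""
    if not hosts:
        raise ValueError("at least one host (ideally the SCAN name) is required")
    addresses = "".join(
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))" for host in hosts
    )
    return (
        "(DESCRIPTION="
        f"(CONNECT_TIMEOUT={connect_timeout})(RETRY_COUNT={retry_count})"
        f"(RETRY_DELAY={retry_delay})(TRANSPORT_CONNECT_TIMEOUT={transport_connect_timeout})"
        f"(ADDRESS_LIST=(LOAD_BALANCE=on){addresses})"
        f"(CONNECT_DATA=(SERVICE_NAME={service_name})))"
    )


if __name__ == "__main__":
    # Host names are placeholders for a SCAN name and a second address.
    print(build_descriptor(["db-scan.example.com", "db2-vip.example.com"], "GOLD_TAC"))
```

Because the function always emits SERVICE_NAME and never SID, descriptors produced this way stay within the supported form described above.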
10.3 Validating Coverage with ACCHK
Simply performing destructive testing (e.g., shutting down an instance) is insufficient to confirm correct configuration and understand the extent to which the application is protected by AC/TAC. Oracle provides the ACCHK (Application Continuity Checker) tool for this purpose.
- Purpose: ACCHK analyzes and reports the AC/TAC protection level while a specific workload is running. It shows the protection percentage, number of protected/unprotected operations, reasons for lack of protection, and usage of legacy concrete classes.
- Steps:
  - Create Views: Before first use, create the ACCHK views (DBA_ACCHK_*) and the ACCHK_READ role in the database using execute dbms_app_cont_admin.acchk_views(); (COMPATIBLE must be >= 12.2).
  - Grant Privilege: Run GRANT ACCHK_READ TO <user_name>; for each user who needs to read the reports.
  - Enable Tracing: Enable AC tracing for a specific duration using execute dbms_app_cont_admin.acchk_set(true, <duration_seconds>);. This can also be done at the database or session level with ALTER SYSTEM/SESSION SET EVENTS 'trace[progint_appcont_rdbms]';.
  - Run Workload: Execute the application workload to be validated while tracing is active.
  - Report/Query: After the tracing period ends (or is manually stopped with acchk_set(false)), query views like DBA_ACCHK_EVENTS and DBA_ACCHK_STATISTICS, or use tools like ORAchk to analyze the protection level and details.
Using ACCHK before deployment and after application changes is critical to verify the protection level and proactively identify potential issues.
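The enable, run, and disable steps can be wrapped so that tracing is always switched off again, even when the workload fails. The sketch below assumes a python-oracledb style cursor that accepts bind values as keyword arguments. The names run_acchk_window and RecordingCursor are hypothetical, and RecordingCursor exists only so the flow can be dry-run without a database. The PL/SQL calls are the ones named in the steps above.

```python
def run_acchk_window(cursor, run_workload, duration_seconds=600):
    """Enable ACCHK tracing, run the workload, then switch tracing off again.

    `cursor` is a cursor with an execute() method, opened by a suitably
    privileged user; `run_workload` is a caller-supplied callable.
    """
    cursor.execute(
        "begin dbms_app_cont_admin.acchk_set(true, :secs); end;",
        secs=duration_seconds,
    )
    try:
        run_workload()
    finally:
        # Stop tracing even if the workload raises.
        cursor.execute("begin dbms_app_cont_admin.acchk_set(false); end;")


class RecordingCursor:
    """Stand-in cursor for a dry run: records statements instead of executing them."""

    def __init__(self):
        self.statements = []

    def execute(self, statement, **binds):
        self.statements.append((statement, binds))


if __name__ == "__main__":
    cur = RecordingCursor()
    run_acchk_window(cur, run_workload=lambda: None, duration_seconds=300)
    for statement, binds in cur.statements:
        print(statement, binds)
```

After the window closes, the DBA_ACCHK_* views can be queried as described in the Report/Query step.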
10.4 Recommended Best Practices for Developers and DBAs
- DBAs:
  - Use application-specific services configured with correct parameters (FAILOVERTYPE, COMMIT_OUTCOME, etc.).
  - Enable FAN and ensure client accessibility (e.g., port 6200).
  - Monitor AC statistics (AWR, ACCHK views).
  - Tune timeouts (replay_init_time, drain_timeout) based on workload.
  - Apply recommended database patches.
- Developers:
  - Use Oracle connection pools and return connections promptly after use.
  - Write stateless code where possible, or manage state carefully.
  - Implement robust error handling for unrecoverable errors.
  - Avoid legacy oracle.sql.* concrete classes.
  - Understand and manage side effects (consider disableReplay for AC or TAC’s default behavior).
  - Validate coverage with ACCHK.
- General:
  - Perform thorough testing in QA environments that mimic production.
  - Use the latest drivers and patches.
  - Follow Oracle MAA guidelines.
Configuration and validation are iterative processes, not one-time tasks. Tuning and verification are essential to realize the full benefits of AC/TAC. Default settings may not be optimal. Application behavior can change, potentially introducing unprotected calls. ACCHK provides the necessary visibility into actual protection levels, allowing informed adjustments and fixes before a real outage reveals weaknesses. Successful AC/TAC implementation requires ongoing diligence, involving not just initial setup but also performance monitoring, coverage validation, and potentially adjusting configuration or application code based on findings. Collaboration between DBAs and developers is key.
11. Conclusion
Oracle Application Continuity (AC) and its more transparent variant, Transparent Application Continuity (TAC), offer powerful solutions to the challenges of uninterrupted service and high availability faced by modern applications. These technologies mask the impact of planned and unplanned database outages from end-users and applications, aiming to make disruptions feel like nothing more than brief processing delays.
AC and TAC function by recovering interrupted database sessions, including their state and in-flight transactions, and automatically replaying them. Transaction Guard plays a crucial role in this process, guaranteeing that transactions are committed at most once, thereby preserving data integrity. Fast Application Notification (FAN) and compatible connection pools and drivers are critical ecosystem components enabling rapid failure detection and seamless transition. While TAC provides transparent protection, often without code changes, AC offers greater customization and control.
These technologies are particularly potent when deployed alongside other MAA components like Oracle RAC and Active Data Guard, creating a highly resilient and continuously available infrastructure. Innovations in Oracle 23ai, such as Database Templates, enhanced support for batch jobs, and auto-enablement in the JDBC driver, further advance the capabilities of AC/TAC and ease their adoption.
However, fully leveraging the benefits of AC/TAC requires more than just enabling a feature. Success depends on selecting the correct database and client versions, meticulous service configuration, designing applications according to best practices, and performing thorough validation using tools like ACCHK. Organizations should view AC/TAC as powerful tools within the MAA framework that complement robust application design and overall error management strategies. With proper configuration, continuous monitoring, and adherence to best practices, Oracle Application Continuity can play a critical role in achieving the level of application availability demanded by today’s challenging business requirements.