The Redo Log and a Capture Process
A capture process is an optional Oracle background process that scans the database redo log to capture DML and DDL changes made to database objects. When a capture process is configured to capture changes from a redo log, the database where the changes were generated is called the source database.
A capture process can run on the source database or on a remote database. When a capture process runs on the source database, the capture process is a local capture process. When a capture process runs on a remote database, the capture process is called a downstream capture process, and the remote database is called the downstream database.
Logical Change Records (LCRs)
A capture process reformats changes captured from the redo log into LCRs. An LCR is a message with a specific format that describes a database change. A capture process captures two types of LCRs: row LCRs and DDL LCRs.
After capturing an LCR, a capture process enqueues a message containing the LCR into a queue. A capture process is always associated with a single ANYDATA queue, and it enqueues messages into this queue only. For improved performance, captured messages always are stored in a buffered queue, which is System Global Area (SGA) memory associated with an ANYDATA queue. You can create multiple queues and associate a different capture process with each queue.
Capture Process Rules
A capture process either captures or discards changes based on rules that you define. Each rule specifies the database objects and types of changes for which the rule evaluates to TRUE. You can place these rules in a positive rule set or negative rule set for the capture process.
If a rule evaluates to TRUE for a change, and the rule is in the positive rule set or a capture process, then the capture process captures the change. If a rule evaluates to TRUE for a change, and the rule is in the negative rule set for a capture process, then the capture process discards the change. If a capture process has both a positive and a negative rule set, then the negative rule set is always evaluated first.
Types of DML Changes Captured
When you specify that DML changes made to certain tables should be captured, a capture process captures the following types of DML changes made to these tables
Insert
Update
Delete
Merge
Piecewise updates to LOB
A capture process converts each MERGE change into an INSERT or UPDATE change. MERGE is not a valid command type in a row LCR.
A capture process can capture changes made to an index-organized table only if
the index-organized table does not contain any columns of the following
datatypes
ROWID
UROWID
User defined types (object types, REFs, varrays, and nested tables)
A capture process ignores CALL, EXPLAIN PLAN, or LOCK TABLE statements.
A capture process cannot capture DML changes made to temporary tables or object tables. A capture process raises an error if it attempts to capture such changes.
If you share a sequence at multiple databases, then sequence values used for individual rows at these databases might vary. Also, changes to actual sequence values are not captured. For example, if a user references a NEXTVAL or sets the sequence, then a capture process does not capture changes resulting from these operations.
DDL Changes and Capture Processes
A capture process captures the DDL changes that satisfy its rule sets, except for the following types of DDL changes
Alter Database
Create Controlfile
Create Database
Create Pfile
Create SPfile
Flashback Database
A capture process can capture DDL statements, but not the results of DDL statements
unless the DDL statement is a CREATE TABLE AS SELECT statement.
Some types of DDL changes that are captured by a capture process cannot be applied by an apply process. If an apply process receives a DDL LCR that specifies an operation that cannot be applied, then the apply process ignores the DDL LCR and records information about it in the trace file for the apply process.
Other Types of Changes Ignored by a Capture Process
The session control statements ALTER SESSION and SET ROLE.
The system control statement ALTER SYSTEM.
Invocations of PL/SQL procedures, which means that a call to a PL/SQL procedure is not captured. However, if a call to a PL/SQL procedure causes changes to database objects, then these changes can be captured by a capture process if the changes satisfy the capture process rule sets.
Changes made to a table or schema by online redefinition using the DBMS_REDEFINITION package. Online table redefinition is supported on a table for which a capture process captures changes, but the logical structure of the table before online redefinition must be the same as the logical structure after online redefinition.
Streams Capture Processes and Oracle Real Application Clusters
You can configure a Streams capture process to capture changes in an Oracle Real Application Clusters (RAC) environment. If you use one or more capture processes and RAC in the same environment, then all archived logs that contain changes to be captured by a capture process must be available for all instances in the RAC environment. In a RAC environment, a capture process reads changes made by all instances.
Each capture process is started and stopped on the owner instance for its ANYDATA queue, even if the start or stop procedure is run on a different instance. Also, a capture process will follow its queue to a different instance if the current owner instance becomes unavailable. The queue itself follows the rules for primary instance and secondary instance ownership. If the owner instance for a queue table containing a queue used by a capture process becomes unavailable, then queue ownership is transferred automatically to another instance in the cluster. In addition, if the captureprocess was enabled when the owner instance became unavailable, then the capture process is restarted automatically on the new owner instance. If the capture process was disabled when the owner instance became unavailable, then the capture process remains disabled on the new owner instance.
The DBA_QUEUE_TABLES data dictionary view contains information about the owner instance for a queue table. Also, any parallel execution servers used by a single capture process run on a single instance in a RAC environment.
LogMiner supports the LOG_ARCHIVE_DEST_n initialization parameter, and Streams capture processes use LogMiner to capture changes from the redo log. If an archived log file is inaccessible from one destination, a local capture process can read it from another accessible destination. On a RAC database, this ability also enables you to use cross instance archival (CIA) such that each instance archives its files to all other instances. This solution cannot detect or resolve gaps caused by missing archived log files. Hence, it can be used only to complement an existing solution to have the archived files shared between all instances.
Capture Process Architecture
A capture process is an optional Oracle background process whose process name is cnnn, where nnn is a capture process number. Valid capture process names include c001 through c999. A capture process captures changes from the redo log by using the infrastructure of LogMiner. Streams configures LogMiner automatically. You can create, alter, start, stop, and drop a capture process, and you can define capture process rules that control which changes a capture process captures.
Capture Process Components
One reader server that reads the redo log and divides the redo log into regions.
One or more preparer servers that scan the regions defined by the reader server in parallel and perform prefiltering of changes found in the redo log. Prefiltering involves sending partial information about changes, such as schema and object name for a change, to the rules engine for evaluation, and receiving the results of the evaluation.
One builder server that merges redo records from the preparer servers. These redo records either evaluated to TRUE during partial evaluation or partial evaluation was inconclusive for them. The builder server preserves the SCN order of these redo records and passes the merged redo records to the capture process.
Each reader server, preparer server, and builder server is a parallel execution server. A capture process (cnnn) is an Oracle background process.
Capture Process States
INITIALIZING - Starting up.
WAITING FOR DICTIONARY REDO - Waiting for redo log files containing the dictionary build related to the first SCN to be added to the capture process session. A capture process cannot begin to scan the redo log files until all of the log files containing the dictionary build have been added.
DICTIONARY INITIALIZATION - Processing a dictionary build.
MINING (PROCESSED SCN = scn_value) - Mining a dictionary build at the SCN scn_value.
LOADING (step X of Y) - Processing information from a dictionary build and currently at step X in a process that involves Y steps, where X and Y are numbers.
CAPTURING CHANGES - Scanning the redo log for changes that evaluate to TRUE against the capture process rule sets.
WAITING FOR REDO - Waiting for new redo log files to be added to the capture process session. The capture process has finished processing all of the redo log files added to its session. This state is possible if there is no activity at a source database. For a downstream capture process, this state is possible if the capture process is waiting for new log files to be added to its session.
EVALUATING RULE - Evaluating a change against a capture process rule set.
CREATING LCR - Converting a change into an LCR.
ENQUEUING MESSAGE - Enqueuing an LCR that satisfies the capture process rule sets into the capture process queue.
PAUSED FOR FLOW CONTROL - Unable to enqueue LCRs either because of low memory or because propagations and apply processes are consuming messages slower than the capture process is creating them. This state indicates flow control that is used to reduce spilling of captured messages when propagation or apply has fallen behind or is unavailable.
SHUTTING DOWN - Stopping.
Multiple Capture Processes in a Single Database
If you run multiple capture processes in a single database, consider increasing the size of the System Global Area (SGA) for each instance. Use the SGA_MAX_SIZE initialization parameter to increase the SGA size. Also, if the size of the Streams pool is not managed automatically in the database, (The size of the Streams pool is managed automatically if the SGA_TARGET initialization parameter is set to a nonzero value.) then you should increase the size of the Streams pool by 10 MB for each capture process parallelism.
If you have two capture processes running in a database, and the parallelism parameter is set to 4 for one of them and 1 for the other, then increase the Streams pool by 50 MB (4 + 1 = 5 parallelism).
Oracle recommends that each ANYDATA queue used by a capture process, propagation, or apply process have captured messages from at most one capture process from a particular source database. Therefore, a separate queue should be used for each capture process that captures changes originating at a particular source database.
Capture Process Checkpoints
A checkpoint is information about the current state of a capture process that is stored persistently in the data dictionary of the database running the capture process. A capture process tries to record a checkpoint at regular intervals called checkpoint intervals.
Required Checkpoint SCN
The SCN that corresponds to the lowest checkpoint for which a capture process requires redo data is the required checkpoint SCN. The redo log file that contains the required checkpoint SCN, and all subsequent redo log files, must be available to the capture process. If a capture process is stopped and restarted, then it starts scanning the redo log from the SCN that corresponds to its required checkpoint SCN. The required checkpoint SCN is important for recovery if a database stops unexpectedly.
Maximum Checkpoint SCN
The SCN that corresponds to the last checkpoint recorded by a capture process is the maximum checkpoint SCN. If you create a capture process that captures changes from a source database, and other capture processes already exist which capture changes from the same source database, then the maximum checkpoint SCNs of the existing capture processes can help you decide whether the new capture process should create a new LogMiner data dictionary or share one of the existing LogMiner data dictionaries.
Checkpoint Retention Time
The checkpoint retention time is the amount of time, in number of days, that a capture process retains checkpoints before purging them automatically.
Whenever the first SCN is reset for a capture process, the capture process purges information about archived redo log files prior to the new first SCN from its LogMiner data dictionary. After this information is purged, the archived redo log files remain on the hard disk, but the files are not needed by the capture process. The PURGEABLE column in the DBA_REGISTERED_ARCHIVED_LOG view displays YES for the archived redo log files that are no longer needed. These files can be removed from disk or moved to another location without affecting the capture process.
If you create a capture process using the CREATE_CAPTURE procedure in the DBMS_CAPTURE_ADM package, then you can specify the checkpoint retention time, in days, using the checkpoint_retention_time parameter. The default checkpoint retention time is 60 days if the checkpoint_retention_time parameter is not specified in the CREATE_CAPTURE procedure, or if you use the DBMS_STREAMS_ADM package to create the capture process.
You can change the checkpoint retention time for a capture process by specifying a new time in the ALTER_CAPTURE procedure in the DBMS_CAPTURE_ADM package. If you do not want checkpoints for a capture process to be purged automatically, then specify DBMS_CAPTURE_ADM.INFINITE for the checkpoint_retention_time parameter in CREATE_CAPTURE or ALTER_CAPTURE.
Capture Process Creation
You can create a capture process using the DBMS_STREAMS_ADM package or the DBMS_CAPTURE_ADM package. Using the DBMS_STREAMS_ADM package to create a capture process is simpler because defaults are used automatically for some configuration options.
Alternatively, using the DBMS_CAPTURE_ADM package to create a capture process is more flexible, and you create one or more rule sets and rules for the capture process either before or after it is created. You can use the procedures in the DBMS_STREAMS_
ADM package or the DBMS_RULE_ADM package to add rules to a rule set for the capture process.
When you create a capture process using a procedure in the DBMS_STREAMS_ADM package and generate one or more rules in the positive rule set for the capture process, the objects for which changes are captured are prepared for instantiation automatically, unless it is a downstream capture process and there is no database link from the downstream database to the source database.
When you create a capture process using the CREATE_CAPTURE procedure in the DBMS_CAPTURE_ADM package, you should prepare for instantiation any objects for which you plan to capture changes as soon as possible after capture process creation.
The LogMiner Data Dictionary for a Capture Process
A capture process requires a data dictionary that is separate from the primary data dictionary for the source database. This separate data dictionary is called a LogMiner data dictionary. There can be more than one LogMiner data dictionary for a particular source database. If there are multiple capture processes capturing changes from the source database, then two or more capture processes can share a LogMiner data dictionary, or each capture process can have its own LogMiner data dictionary. If the LogMiner data dictionary needed by a capture process does not exist, then the capture process populates it using information in the redo log when the capture process is started for the first time.
The Streams Data Dictionary
Propagations and apply processes use a Streams data dictionary to keep track of the database objects from a particular source database. A Streams data dictionary is populated whenever one or more database objects are prepared for instantiation at a source database. Specifically, when a database object is prepared for instantiation, it is recorded in the redo log. When a capture process scans the redo log, it uses this information to populate the local Streams data dictionary for the source database. In the case of local capture, this Streams data dictionary is at the source database. In the case of downstream capture, this Streams data dictionary is at the downstream database.
ARCHIVELOG Mode and a Capture Process
A local capture process reads online redo logs whenever possible and archived redo log files otherwise. Therefore, the source database must be running in ARCHIVELOG mode when a local capture process is configured to capture changes.
A real-time downstream capture process reads online redo data from its source database whenever possible and archived redo log files that contain redo data from the source database otherwise. In this case, the redo data from the source database is stored in the standby redo log at the downstream database, and the archiver at the downstream database archives the redo data in the standby redo log. Therefore, both the source database and the downstream database must be running in ARCHIVELOG mode when a real-time downstream capture process is configured to capture changes.
An archived-log downstream capture process always reads archived redo log files from its source database. Therefore, the source database must be running in ARCHIVELOG mode when an archived-log downstream capture process is configured to capture changes.
You can query the REQUIRED_CHECKPOINT_SCN column in the DBA_CAPTURE data dictionary view to determine the required checkpoint SCN for a capture process. When the capture process is restarted, it scans the redo log from the required checkpoint SCN forward. Therefore, the redo log file that includes the required checkpoint SCN, and all subsequent redo log files, must be available to the capture process.
You must keep an archived redo log file available until you are certain that no capture process will need that file. The first SCN for a capture process can be reset to a higher value, but it cannot be reset to a lower value. Therefore, a capture process will never need the redo log files that contain information prior to its first SCN. Query the DBA_LOGMNR_PURGED_LOG data dictionary view to determine which archived redo log files will never be needed by any capture process.
RMAN and Archived Redo Log Files Required by a Capture Process
Some Recovery Manager (RMAN) commands delete archived redo log files. If one of these RMAN commands is used on a database that is running one or more local capture processes, then the RMAN command does not delete archived redo log files that are needed by a local capture process. That is, the RMAN command does not delete archived redo log files that contain changes with SCN values that are equal to or greater than the required checkpoint SCN for a local capture process.
The RMAN command DELETE OBSOLETE permanently purges the archived redo log files that are no longer needed. This command only deletes the archived redo log files in which all of the changes are less than the required checkpoint SCN for a local capture process.
The RMAN command BACKUP ARCHIVELOG ALL DELETE INPUT copies the archived redo log files and deletes the original files after completing the backup. This command only deletes the archived redo log files in which all of the changes are less than the required checkpoint SCN for a local capture process. If archived redo log files are not deleted because they contain changes required by a capture process, then RMAN display a warning message about skipping the delete operation for these files.
If a database is a source database for a downstream capture process, then these RMAN commands might delete archived redo log files that have not been transferred to the downstream database and are required by a downstream capture process. Therefore, before running these commands on the source database, make sure any archived redo log files needed by a downstream database have been transferred to the downstream database.
Capture Process Parameters
The parallelism capture process parameter controls the number of preparer servers used by a capture process. The preparer servers concurrently format changes found in the redo log into LCRs. Each reader server, preparer server, and builder server is a parallel execution server, and the number of preparer servers equals the number specified for the parallelism capture process parameter. So, if parallelism is set to 5, then a capture process uses a total of seven parallel execution servers, assuming seven parallel execution servers are available: one reader server, five preparer servers, and one builder server.
The time_limit capture process parameter specifies the amount of time a capture process runs, and the message_limit capture process parameter specifies the number of messages a capture process can capture. The capture process stops automatically when it reaches one of these limits.
The disable_on_limit parameter controls whether a capture process becomes disabled or restarts when it reaches a limit. If you set the disable_on_limit parameter to y, then the capture process is disabled when it reaches a limit and does not restart until you restart it explicitly. If, however, you set the disable_on_limit parameter to n, then the capture process stops and restarts automatically when it reaches a limit.
Capture Process Rule Evaluation
A capture process maintains a persistent status when the database running the capture process is shut down and restarted. If a capture process is enabled when the database is shut down, then the capture process automatically starts when the database is restarted. Similarly, if a capture process is disabled or aborted when a database is shut down, then the capture process is not started and retains the disabled or aborted status when the database is restarted.