Performing a Bulk Extract
A bulk extract sends events that describe the current state of object instances in the database through the event stream in bulk, without waiting for those instances to be updated.
The most common reason for this is to fill a data sink that sits downstream in your event streaming pipeline.
Use the
Before performing a bulk extract, make sure that it is necessary for your use case and that you have considered the performance concerns listed later in this section.
Before you attempt any bulk extract operations, ensure that the JadeChangeCaptureApp application is running and configured correctly and that the classes you want to extract are specified in the Change Capture Specification file. For details, see "Change Capture Specification File", elsewhere in this document.
After configuring the Change Capture Specification file, write an appropriate script to perform the bulk extract and then run it. The following example script performs a bulk extract; its comments provide additional context.
bulkExtractClass();

vars
    instance         : Object;
    instanceArray    : ObjectArray;
    maximumInstances : Integer64;
begin
    // Collect all the instances of this class
    create instanceArray transient;

    /* Set the maximum number of instances to extract. Be careful, Max_Integer
       (max value allowed for this method) is used for demonstration, but make
       sure you know how many you're about to extract before you start */
    maximumInstances := Max_Integer;

    /* Last parameter indicates we don't want instances that are subclasses */
    C1.allInstances(instanceArray, maximumInstances, false);

    beginTransaction;
    foreach instance in instanceArray do
        /* Make sure the instance hasn't been deleted by a concurrent transaction */
        if app.isValidObject(instance) then
            // Here, any object will do as a receiver. Only the parameter matters
            instance.updateObjectAndSlobOrBlobEditions(instance);
        endif;
    endforeach;
    commitTransaction;
epilog
    delete instanceArray;
end;
Note the comments in the previous method example, specifically:
- Consider carefully the value specified as the maximum number of instances. Extracting a large number of instances is likely to have a significant performance impact on both Jade and the event stream. Make sure that you know roughly how many instances will be extracted, and what the performance impact will be, before you attempt a bulk extract on a production system. You could mitigate the performance impact by:
    - Splitting the transaction; for example, if 10,000 instances are too many to extract at once, you could get the loop to extract the first 1,000 instances in one transaction, the next 1,000 in another transaction, and so on.
    - Filtering the instances you want to extract; for example, call the updateObjectAndSlobOrBlobEditions method only on instances where a timestamp property of the instance falls within a certain range.
- The allInstances method returns all of the persistent instances that exist at the time that the call is made. On an active system, objects of the class could be created or deleted between this call and the updateObjectAndSlobOrBlobEditions method calls.
    - Newly created objects are not captured by this script, but the event stream captures them as create events, so they are still extracted.
    - Deleted objects leave invalid references in the instanceArray. The script in the previous method example checks that each reference is valid, to avoid an error 4 (Object not found).
    - Because of the performance impact and the possibility of state changing during the extract, we recommend taking an active system offline for a large bulk extract. It is possible to perform a bulk extract without first taking the system offline, but consider the potential impact carefully.
- The last parameter of the allInstances method:
    - Specifies whether to include instances of subclasses of the specified class. If you want to include subclasses, make sure these classes are specified in the Change Capture Specification file (use the recursiveSubClasses option to include subclasses of a specified class).
    - Does not find subclasses recursively (unlike the recursiveSubClasses option). In your script, you must specify any classes further down the hierarchy that you want to extract.
      In addition, if you want to extract only some subclasses using this script, we recommend specifying them individually, because calling the method on classes that you are not extracting incurs an unnecessary performance cost.
      The allInstances method and the recursiveSubClasses option do not search for subclasses across schemas. Write your own scripts to find these subclasses if you want to include them in the bulk extract.
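The transaction-splitting mitigation described in the notes above could be sketched as follows. This is a minimal, untested sketch, not a definitive implementation: the method name bulkExtractClassInBatches and the BatchSize constant (set to 1,000 to match the example figures above) are hypothetical, and it assumes that it is acceptable for your data sink to receive the extract as a series of separately committed batches. Everything else follows the earlier example script.

```jade
bulkExtractClassInBatches();

constants
    BatchSize = 1000;    // hypothetical batch size; tune it for your system
vars
    instance         : Object;
    instanceArray    : ObjectArray;
    maximumInstances : Integer64;
    count            : Integer;
begin
    create instanceArray transient;
    maximumInstances := Max_Integer;
    C1.allInstances(instanceArray, maximumInstances, false);

    beginTransaction;
    foreach instance in instanceArray do
        // Skip instances deleted by a concurrent transaction
        if app.isValidObject(instance) then
            instance.updateObjectAndSlobOrBlobEditions(instance);
            count := count + 1;
            // After every BatchSize extracted instances, commit the current
            // transaction and start a new one
            if count mod BatchSize = 0 then
                commitTransaction;
                beginTransaction;
            endif;
        endif;
    endforeach;
    // Commit the final batch, which may be partial or empty
    commitTransaction;
epilog
    delete instanceArray;
end;
```

Only successfully extracted instances are counted, so each transaction contains at most BatchSize updates regardless of how many invalid references are skipped.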
-
2025.0.01 and higher
