Large Batch Processes (Adds, Deletes, or Updates)

The suggestions in this section apply whenever you are doing a large number of adds, deletes, or updates.

Consider running the batch process as a server execution method, as a server application, or in single user mode. All of these options generally improve performance. For the potential impact on other users, see "serverExecution Method Option", earlier in this document.

If running in multiuser mode, consider breaking up your transactions so that updates to a centralized collection that multiple processes may want to access are separated from the main bulk updates. This may help to reduce single‑threading when Jade takes its exclusive lock on the updated collection. In a situation where there is one updater and one or more readers, using update locks can also help to reduce single‑threading.
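The single‑updater case might look like the following sketch, which takes an explicit update lock on the shared collection so that readers holding share locks are not blocked until the change is actually applied. The lock instruction signature and the constant names (Update_Lock, Transaction_Duration, Lock_Wait_Forever) are from the JADE lock facility as the author recalls it; check the Developer's Reference for the exact names before using them.

beginTransaction;
// Take an update lock rather than relying on the implicit exclusive
// lock; concurrent share-lock readers can continue until commit.
lock(app.myRoot.allCompanies, Update_Lock, Transaction_Duration, Lock_Wait_Forever);
// ... update app.myRoot.allCompanies ...
commitTransaction;    // releases the transaction-duration lock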

For example, in the following code fragment, as soon as a new company is discovered, it is created and added to the collection off the root object. Jade places an exclusive lock on the collection app.myRoot.allCompanies, which means that no one else can read this collection until this process finishes and performs the commitTransaction instruction to release everything that has been locked.

beginTransaction;
foreach record in company sales input file do    // pseudocode: iterate over the input records
    company := app.myRoot.allCompanies.getAtKey(record.companyID);
    if company = null then
        create company;
        company.ID := record.companyID;
        company.name := record.companyName;
        company.myRoot := app.myRoot;            // also adds the company to allCompanies
    endif;
    create sale;
    sale.date := record.date;
    sale.myCompany := company;
endforeach;
commitTransaction;

A better approach may be to read through the input file first, just to establish whether any new companies need to be added, and to add them at that stage. You would then reprocess the file and perform the sale update process.
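The two‑pass approach might look like the following sketch, which simply splits the earlier fragment (the foreach over the input file is pseudocode, as before):

// Pass 1: add any new companies in one short transaction, so the
// exclusive lock on allCompanies is held only briefly.
beginTransaction;
foreach record in company sales input file do
    company := app.myRoot.allCompanies.getAtKey(record.companyID);
    if company = null then
        create company;
        company.ID := record.companyID;
        company.name := record.companyName;
        company.myRoot := app.myRoot;
    endif;
endforeach;
commitTransaction;

// Pass 2: create the sales; allCompanies is now only read, not updated,
// so readers are not locked out by an exclusive collection lock.
beginTransaction;
foreach record in company sales input file do
    company := app.myRoot.allCompanies.getAtKey(record.companyID);
    create sale;
    sale.date := record.date;
    sale.myCompany := company;
endforeach;
commitTransaction;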

You could also perform a commitTransaction every 1,000 to 10,000 objects or after a specific interval. This can help to keep transactions from becoming too large and to allow other processes to gain access to locked resources.

The best number of objects per transaction can be difficult to determine, so it may be easier to simply commit based on elapsed time, perhaps every 500 milliseconds or every 2 seconds. You should base the specified time on how long locks can be held without introducing problems, and how large the transaction is. Overly large transactions can cause additional resource consumption.

Another advantage of time‑based commits over count‑based commits is that they automatically reduce the impact on other users during peak periods. You may process 1,000 objects in 0.5 seconds during idle times, but it may take 6 seconds during peak times, resulting in locks being held longer when you can least afford them. Time‑based commits automatically include fewer objects in each commit during peak times.
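A time‑based commit loop might be sketched as follows. The startTimer and elapsedMs helpers are hypothetical stand‑ins for whatever timing facility you use; the 500 millisecond threshold is the example interval from the discussion above.

startTimer();                           // hypothetical: reset the elapsed timer
beginTransaction;
foreach record in company sales input file do
    // ... create and update objects as in the earlier fragment ...
    if elapsedMs() >= 500 then          // hypothetical: milliseconds since startTimer
        commitTransaction;              // release locks held so far
        beginTransaction;
        startTimer();
    endif;
endforeach;
commitTransaction;

During peak load, fewer objects are processed before the 500 milliseconds elapse, so each commit is smaller and locks are released sooner, which is exactly the behavior described above.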

Breaking a single large update into multiple smaller ones can help performance, but it can also introduce recovery and concurrency issues. If the server goes down, you need to be able to restart and reprocess from the point to which the database recovered. If other users can access the objects during the load, the objects must be committed in a coherent state.
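One common way to make such a load restartable is to persist the position of the last committed input record in the same transaction as the data itself, so that recovery leaves the two consistent. The following sketch assumes the input records are sorted by company ID; myLoadStatus, lastRecordID, and commitThresholdReached are hypothetical names.

beginTransaction;
foreach record in company sales input file do
    // Skip records already committed before the failure.
    if record.companyID > app.myRoot.myLoadStatus.lastRecordID then
        // ... create and update objects as in the earlier fragment ...
        // Record the restart point inside the same transaction, so it
        // is recovered together with the data it describes.
        app.myRoot.myLoadStatus.lastRecordID := record.companyID;
    endif;
    if commitThresholdReached() then    // hypothetical count or time check
        commitTransaction;
        beginTransaction;
    endif;
endforeach;
commitTransaction;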