Duplicate records in MDC repository

Problem

A table in the MDC repository contains multiple records with the same MDC id, either an instance id or a master id.

Why this is happening

With VLDB persistence, the NME engine does not perform any UPDATE operations on the repository; it executes INSERTs and DELETEs only. (DB persistence, by contrast, supports regular INSERT, UPDATE and DELETE operations.) Cleanup is implemented through a separate component, the Logical Transaction Collector (LTC), sometimes referred to as MDC's garbage collector (GC). Until all transactions are complete and the LTC has run, the repository may temporarily contain duplicate ids, especially under heavy load.
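To confirm the symptom, a query along the following lines can list the ids that occur more than once. This is only a sketch: C__MY_TABLE and ID are hypothetical placeholders, not names taken from the MDC repository, so substitute the affected table and the instance or master id column that shows the duplicates.

```sql
-- Sketch only: C__MY_TABLE and ID are hypothetical placeholders.
-- Lists every id that is stored more than once, most-duplicated first.
select sysdate as snapshot_time, ID, count(1) as copies
from C__MY_TABLE
group by ID
having count(1) > 1
order by copies desc
```

Running it twice a few minutes apart may help tell a transient state (the LTC has not yet caught up) from duplicates that persist.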

What to do if it happens

• Obtain the MDC log for the prior 36-48 hours;
• Take a thread dump of the MDC Java process: in Admin Center > Resources > Thread, select Thread Dump (to browser's window);
• Get the JVM GC log for the same period (if log rotation is enabled), or since the MDC server started;
• Check Execution Status in the MDC console to see if there are any running operations;
• Check Event Handlers to see if there are any running operations;
• Check the overall Persistence Status and "Detail debug info" for your persistence (typically called C_). Download the pages as simple HTML for future reference;
• Check the <prefix>_TREG table with the following SQL:

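-- C__TREG assumes the typical persistence prefix C_; substitute your own <prefix>_TREG.

-- SQL #1: number of transactions per status and phase
select sysdate as snapshot_time, tstatus, tphase, count(1) as transactions
from C__TREG
group by tstatus, tphase

-- SQL #2: details of the transactions with tstatus=1 or tphase=1
select sysdate as snapshot_time, TID, tstatus, tphase, start_ts, end_ts
from C__TREG
where tstatus=1 or tphase=1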

• Restart the MDC server to get back to normal. Pending transactions will be rolled back automatically (with Event Handlers, some data might be re-published upon restart), and the Logical Transaction Collector (LTC) will kick off automatically. The restart might take longer than usual, because some of the data needs to be collected before the system startup completes.
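Before restarting, the TREG checks above can be complemented with a summary that also shows how old the transactions in each group are. The query below is a sketch that uses only the columns already referenced in SQL #1 and SQL #2; it assumes the typical C_ prefix and that start_ts holds a sortable date or timestamp.

```sql
-- Sketch: transaction count plus oldest and newest start time per status/phase.
-- Assumes the typical prefix C_ and a sortable start_ts column.
select sysdate as snapshot_time, tstatus, tphase,
       count(1) as transactions,
       min(start_ts) as oldest_start,
       max(start_ts) as newest_start
from C__TREG
group by tstatus, tphase
order by tstatus, tphase
```

Saving the output together with its snapshot_time gives Support a point of comparison if the query is repeated later.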

After completing the steps listed above, reach out to Ataccama Support and make sure you attach the collected information to the support ticket. In the worst case, restart the MDC server to make it work again.

Note that a restart erases all evidence needed for the investigation, so collect the information before restarting; without it, the issue could happen again undiagnosed.