Friday, September 6, 2013

Failed to purge nodes - DeletedNodeCleanupWorker


We recently faced issue with alfresco OOB scheduler - DeletedNodeCleanupWorker.
We were getting below error:

"21:00:40,348 WARN [node.db.DeletedNodeCleanupWorker] Failed to purge nodes. If the purgable set is too large for the available DB resources then the nodes can be purged manually as well.
...................
### Error updating database.  Cause: java.sql.SQLException: The DELETE statement conflicted with the REFERENCE constraint "fk_alf_cass_pnode". The conflict occurred in database "alfrescoDB", table "dbo.alf_child_assoc", column 'parent_node_id"

Reason of this error is - Alfresco is referencing to deleted node such that purge is failing because of the constraint on db
This error may affect lucene indexing. Indexes can be out-of-sync and search might not return all results.

To solve this error, We followed below steps.Basically we need to delete corrupted nodes manually from Database.

1)First we need to find out problematic node, For that Run full indexing. You may enable below logger for detail info.


log4j.logger.org.alfresco.repo.node.index.AbstractReindexComponent=debug
log4j.logger.org.alfresco.repo.node.index.IndexTransactionTracker=debug
log4j.logger.org.alfresco.repo.node.index.FullIndexRecoveryComponent=debug
log4j.logger.org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent=debug

2)Check Full indexing logs.Nodes with below error are corrupted nodes which is causing issue.

"Caused by: org.springframework.dao.ConcurrencyFailureException: Attempt to follow reference workspace://SpacesStore/fc4450e4-c316-4531-b90a-b94a0e73a4a5 to deleted node 1260437"

3) We would consider this node 1260437 and we need to delete this node manually from DB. Before we delete this node, we need to make sure it doesn't affect any other contents and it doesn't have any child.

4)Need to execute below queries. For example, I got 2 corrupted nodes - 1260437,467222. I have also provided query result so you can know which nodes we needs to consider again from result and how we can go bottom of the tree.

From below query, Mainly we need to consider :SELECT * FROM alf_child_assoc WHERE parent_node_id IN(1260437,467222)

SELECT * FROM alf_node_assoc WHERE source_node_id IN (1260437,467222); - No Records
SELECT * FROM alf_node_assoc WHERE target_node_id IN (1260437,467222); - No Records
SELECT * FROM alf_usage_delta WHERE node_id IN (1260437,467222); - No Records
SELECT * FROM alf_node_aspects WHERE node_id IN (1260437,467222); - No Records
SELECT * FROM alf_node_properties WHERE node_id IN (1260437,467222); - No Records
SELECT * FROM alf_child_assoc WHERE child_node_id IN (1260437,467222); - No Records

SELECT * FROM alf_child_assoc WHERE parent_node_id IN (1260437,467222);
id version parent_node_id type_qname_id child_node_name_crc child_node_name child_node_id qname_ns_id qname_localname qname_crc is_primary assoc_index
1164570 1 1260437 193 -3852862330 68a0807e-99d1-4208-a28c-8153c45805ea 1260445 6 webpreview 1387062285 1 -1

SELECT * FROM alf_node WHERE alf_node.id IN (1260437,467222);
id version store_id uuid transaction_id node_deleted type_qname_id acl_id audit_creator audit_created audit_modifier audit_modified audit_accessed locale_id
467222 6 6 7295c299-f750-4b5f-84d5-381059f117a5 1581812 1 32 (null) HFRHH300 2012-07-09T19:11:03.430-04:00 HFRHH300 2013-05-19T16:16:42.048-04:00 (null) 1
1260437 4 6 65b506c4-e24d-4347-9961-331ee7df36dd 1473392 1 32 (null) HFINN500 2013-04-23T16:49:00.372-04:00 HFINN500 2013-04-23T16:49:01.658-04:00 (null) 1

Above query result, we can identify that node 1260437 has child with id: 1260445.
So again we need to execute query for node: 1260445 to make sure it doesn't have any child.

SELECT * FROM alf_child_assoc WHERE parent_node_id IN (1260445); - No Result

From query result we can confirm node 1260445 doesn't have any child. So we are good to delete three nodes - 1260437, 467222, 1260445

5) Need to execute below DELETE query.

DELETE * FROM alf_node_assoc WHERE source_node_id IN (1260437,467222,1260445); 
DELETE * FROM alf_node_assoc WHERE target_node_id IN (1260437,467222,1260445);
DELETE * FROM alf_usage_delta WHERE node_id IN (1260437,467222,1260445);
DELETE * FROM alf_node_aspects WHERE node_id IN (1260437,467222,1260445);
DELETE * FROM alf_node_properties WHERE node_id IN (1260437,467222,1260445);
DELETE * FROM alf_child_assoc WHERE child_node_id IN (1260437,467222,1260445);
DELETE * FROM alf_child_assoc WHERE parent_node_id IN (1260437,467222,1260445);
DELETE * FROM alf_node WHERE alf_node.id IN (1260437,467222,1260445);

NOTE: Must need to take care of below things, before you delete nodes from DB.

1) Need to execute delete in given order
2) Must need to take DB backup. Once this nodes are deleted and if any issue then there is no way we can recover it.
3) Also take lucene backup if in case any issue.

Final Steps

1) Shutdown alfresco
2) Take database backup
3) Take lucene backup
4) Execute delete queries
5) Start alfresco with FULL index ON. [Optional]
6) Monitor scheduler for few days.Issue should be resolved.

We successfully executed in our env - alfresco 4.0.1EE and  DB: SQL server 2008.

Hope this helps! 

1 comment: