Tuesday, December 24, 2013

Configuring Solr and Alfresco on different Tomcat instances

Alfresco supports Solr, an open source enterprise search platform.

In recent Alfresco versions, Solr is the default search engine, and if you installed Alfresco using the setup wizard, it comes with the Alfresco bundle. In newer versions, Solr is not deployed under tomcat/webapps because of the different behavior of Tomcat 7.

In a production environment, you might need Solr to run on a different application server than Alfresco. Here I describe the detailed steps to configure Solr on a separate standalone application server.

This assumes you have a standalone Tomcat running in your environment on port 8080 (SSL port 8443) and an existing Alfresco installed on another Tomcat on port 9090 (SSL port 8444).

We will refer to the following locations:
TOMCAT_HOME_ALFRESCO - where Alfresco is installed
TOMCAT_HOME_SOLR - where Solr is installed
SOLR_HOME - where we will extract all Solr-related files

Step 1: Get alfresco-enterprise-solr-4.2.0.zip and extract it to SOLR_HOME.

Step 2: Create a solr.xml file under TOMCAT_HOME_SOLR\conf\Catalina\localhost and copy the contents of context.xml into it. [You can find context.xml under SOLR_HOME.]

Step 3: Edit solr.xml and set docBase and solr/home:
docBase should point to - SOLR_HOME/apache-solr-1.4.1.war
solr/home should point to - SOLR_HOME
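
Step 3 can also be scripted. The following is only a minimal Python sketch: SAMPLE_CONTEXT is a simplified stand-in for the shipped context.xml (its placeholder values are hypothetical), and point_context_at is a helper name made up for this example. It sets docBase on the root element and the value of the solr/home Environment entry, and leaves everything else untouched.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the shipped context.xml; placeholder values are hypothetical.
SAMPLE_CONTEXT = """<Context docBase="PLACEHOLDER/apache-solr-1.4.1.war">
  <Environment name="solr/home" type="java.lang.String" value="PLACEHOLDER" override="true"/>
</Context>"""


def point_context_at(context_xml: str, solr_home: str) -> str:
    """Return the descriptor with docBase and solr/home pointing at solr_home."""
    root = ET.fromstring(context_xml)
    # Forward slashes also work on Windows and avoid backslash escaping issues.
    home = solr_home.replace("\\", "/").rstrip("/")
    root.set("docBase", f"{home}/apache-solr-1.4.1.war")
    for env in root.iter("Environment"):
        if env.get("name") == "solr/home":
            env.set("value", home)
    return ET.tostring(root, encoding="unicode")


print(point_context_at(SAMPLE_CONTEXT, "D:/Alfresco_SOLR/solr"))
```

The printed descriptor is what you would save as solr.xml under TOMCAT_HOME_SOLR\conf\Catalina\localhost.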

Step 4: Edit solrcore.properties to set data.dir.root, the location where your Solr index will be stored. Modify it for both cores - workspace and archive.

File location:
SOLR_HOME\solr\workspace-SpacesStore\conf
SOLR_HOME\solr\archive-SpacesStore\conf

For example:
data.dir.root=D:/Alfresco_SOLR/solr/solr-index

You can define the same value for both cores. A sub-directory is created for each core.

Step 5: Edit solrcore.properties and set your Alfresco server details for both cores.

alfresco.host=localhost
alfresco.port=9090
alfresco.port.ssl=8444
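
Steps 4 and 5 come down to setting a few keys in the solrcore.properties of each core. As a rough sketch (set_properties is a hypothetical helper, not part of Alfresco), the Python below rewrites existing key=value lines and appends any key that is missing, so the same update can be applied to both files.

```python
def set_properties(text: str, updates: dict) -> str:
    """Replace existing key=value lines and append keys that are missing."""
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_setting = "=" in line and not line.lstrip().startswith("#")
        if is_setting and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)  # comments and unrelated settings stay as they are
    out.extend(f"{k}={v}" for k, v in remaining.items())
    return "\n".join(out) + "\n"


# Values taken from the steps above; the input text is a made-up example.
updates = {
    "data.dir.root": "D:/Alfresco_SOLR/solr/solr-index",
    "alfresco.host": "localhost",
    "alfresco.port": "9090",
    "alfresco.port.ssl": "8444",
}
print(set_properties("# sample\nalfresco.host=example\nalfresco.port=8080\n", updates))
```

To use it for real, read each core's solrcore.properties, pass the text through set_properties and write the result back, after taking a copy of the original file.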

Before we go ahead, let's generate secure keys for communication between Alfresco and Solr.

Step 6: Edit generate_keystores.bat (Windows) or generate_keystores.sh (Linux), located at SOLR_HOME\solr\alf_data\keystore, and set the environment variables below.

ALFRESCO_HOME
ALFRESCO_KEYSTORE_HOME
SOLR_HOME

For example:
ALFRESCO_HOME=D:\MyAlfresco
ALFRESCO_KEYSTORE_HOME=D:\Alfresco_SOLR\solr\alf_data\keystore
SOLR_HOME=D:/Alfresco_SOLR/solr

Also make sure JAVA_HOME, REPO_CERT_DNAME and SOLR_CLIENT_CERT_DNAME are set to proper values.

Step 7: Run the script, then set the dir.keystore value in your alfresco-global.properties file.

For example:
dir.keystore=D:/Alfresco_SOLR/solr/alf_data/keystore

Step 8: Edit the server.xml file of both the Solr and the Alfresco Tomcat to use the keystore and truststore for HTTPS requests.

The SSL connector settings have changed since 4.1, so make sure you configure them as below.
Removed configuration: maxSavePostSize
Added/changed configuration: clientAuth="want" maxHttpHeaderSize="32768"

For Alfresco, edit <TOMCAT_HOME_ALFRESCO>/conf/server.xml and configure the connector on port 8444.
For Solr, edit <TOMCAT_HOME_SOLR>/conf/server.xml and configure the connector on port 8443.

Step 9: Edit your alfresco-global.properties file.
Set the search subsystem, dir.keystore (already set in Step 7) and the Solr server details.

index.subsystem.name=solr
dir.keystore=D:/Alfresco_SOLR/solr/alf_data/keystore
solr.port.ssl=8443
solr.host=localhost
solr.port=8080
solr.secureComms=https

Step 10: Edit tomcat-users.xml located at TOMCAT_HOME_SOLR/conf to configure an identity for the Alfresco server.

Step 11: Edit tomcat-users.xml located at TOMCAT_HOME_ALFRESCO/conf to configure an identity for the Solr server.

Now we are done with the required configuration. Start your Alfresco and Solr Tomcat servers and verify the Solr indexes and search.
While searching in Alfresco, you might get the error below:

org.apache.tomcat.util.net.jsse.JSSESupport handShake WARNING: SSL server initiated renegotiation is disabled, closing connection

Or, in the Solr Tomcat logs, you might get the timeout error below:

WARN  [org.alfresco.solr.tracker.CoreTracker] Tracking communication timed out.

These errors indicate a problem in the communication between Alfresco and Solr over the SSL connection.
For troubleshooting, see http://docs.alfresco.com/4.0/topic/com.alfresco.enterprise.doc/concepts/solr-troubleshooting.html

Alfresco version: 4.2EE


Friday, December 6, 2013

Alfresco Share webscript Connection timeout


Recently we faced an HTTP connection timeout in an Alfresco Share webscript while it was calling a repository webscript.
This timeout can occur for several reasons, such as high load on the server or a custom operation on a huge amount of data that makes Alfresco explorer take too long to respond.
So sometimes you might need to change the timeout parameters used by the Share remote client. Here I will describe how to change them.

Initially these parameters were hard-coded; in newer versions of Alfresco you can set them through configuration.
The default connection timeout is 10s.

Follow the steps below.

Step 1:
Locate spring-webscripts-application-context.xml. The settings are defined in this file.

This file is inside spring-webscripts-1.2.0-SNAPSHOT.jar [tomcat\webapps\alfresco\WEB-INF\lib], under org/springframework/extensions/webscripts/.

Step 2:
Search for "connector.remoteclient.abstract" and copy the entire bean definition.


Step 3:
Go to tomcat\shared\classes\alfresco\web-extension and create custom-slingshot-application-context.xml if you don't have it; otherwise modify the existing file.

Copy the remoteclient bean definition into it and update the "connectTimeout" value as per your requirement.
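
If you prefer to script the change to the copied bean, the Python sketch below shows one way to do it. SAMPLE_BEAN is a cut-down stand-in and not the shipped definition, set_connect_timeout is a hypothetical helper, and treating the value as milliseconds is an assumption based on the 10s default mentioned above, so check the unit in your own copy of the bean.

```python
import xml.etree.ElementTree as ET

# Cut-down stand-in for the copied bean; the real definition has more attributes and properties.
SAMPLE_BEAN = """<bean id="connector.remoteclient.abstract" abstract="true">
  <property name="connectTimeout"><value>10000</value></property>
</bean>"""


def local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://...}bean'."""
    return tag.rsplit("}", 1)[-1]


def set_connect_timeout(bean_xml: str, millis: int) -> str:
    """Return the bean XML with the connectTimeout property set to millis."""
    root = ET.fromstring(bean_xml)
    for prop in root.iter():
        if local(prop.tag) == "property" and prop.get("name") == "connectTimeout":
            # Handle both <property value="..."/> and <property><value>..</value></property>.
            if prop.get("value") is not None:
                prop.set("value", str(millis))
            for child in prop:
                if local(child.tag) == "value":
                    child.text = str(millis)
    return ET.tostring(root, encoding="unicode")


print(set_connect_timeout(SAMPLE_BEAN, 30000))
```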

Step 4:
Restart server and verify.

You can increase the log level to debug any issue:
log4j.logger.org.springframework.extensions.webscripts.connector.RemoteClient=debug

NOTE: This might affect your system performance, so wherever possible fix the code that is causing the issue.

Alfresco version: 4.2

Hope this helps!

Wednesday, December 4, 2013

Alfresco share custom evaluator to check node associations

In Share, evaluators decide whether templates, actions and indicators are displayed.
We can also have our own custom evaluators.

I see that many people need to show/hide an action based on content/folder associations, which is NOT available OOB.
So in this example I will show how you can implement it.

We will implement a custom evaluator that shows the OOB action "Delete Document" only if the content has a "cm:references" association. You can use your own custom action or any other action instead.

First we will start with the Share side changes.

Step 1: Define the custom evaluator

Create the file custom-documentlibrary-context.xml and put it in \tomcat\shared\classes\alfresco\web-extension

Add the bean definition for the custom evaluator to this file.

Step 2: Create the custom evaluator class.

This class calls an Alfresco repository webscript that accepts the content noderef and an association name,
checks whether the content has the given association, and returns true or false accordingly.

Step 3: Add the new custom evaluator to the delete document action in share-config-custom.xml

Now we are done with the Share side changes.
We still need to create the Alfresco repository webscript that checks whether a node has the given association. This is a simple JavaScript webscript.

Step 1: Create the descriptor file node-association.get.desc.xml.
Put it in the extension folder: \tomcat\shared\classes\alfresco\extension\templates\webscripts\custom\example\node\association

Step 2: Create node-association.get.js

Step 3: Create node-association.get.json.ftl

Deploy your custom code and restart the server. Verify the Delete action for content with and without the cm:references association.
I have not considered multiple associations; you can extend it the same way.

Hope it's helpful!
 

Friday, September 20, 2013

Manage Alfresco through JMX - Revert persistence property from database


We can monitor and manage Alfresco through JMX. For how to connect to Alfresco through JMX, please refer to http://wiki.alfresco.com/wiki/JMX

JMX allows us to change the configuration of several Alfresco subsystems without restarting the server.
But we need to know that these settings are persisted in the database [not all configurations], and the changes are applied to all nodes in a cluster, which is NOT always good.

For example, say you have 2 nodes in a cluster and each node has a different location for its Lucene index.

node1: dir.indexes=/opt/alfresco/lucene/lucene-indexes
node2: dir.indexes=/opt/alfresco/luceneNew/lucene-indexes

Now, if you change the node1 location from JMX, it will also affect node2, and on the next restart you cannot override the change from your alfresco-global.properties file.
A property value saved in the database takes precedence over the property file.
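
The precedence rule can be pictured with a small Python sketch. This is only a toy model of the behaviour described above, not Alfresco code, and effective_properties is a made-up name: values persisted in the database are layered over each node's own properties file, so a persisted dir.indexes hides node2's local setting until the persisted value is removed.

```python
def effective_properties(file_props: dict, persisted: dict) -> dict:
    """Model the lookup order: persisted (database) values win over the file."""
    merged = dict(file_props)
    merged.update(persisted)
    return merged


node2_file = {"dir.indexes": "/opt/alfresco/luceneNew/lucene-indexes"}
# Value changed on node1 through JMX and therefore stored in the shared database.
persisted = {"dir.indexes": "/opt/alfresco/lucene/lucene-indexes"}

print(effective_properties(node2_file, persisted)["dir.indexes"])  # node1's path
print(effective_properties(node2_file, {})["dir.indexes"])  # node2's own path
```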

How to revert changes

To revert these changes, go to the Operations tab as shown in the image below and click on revert, which will restore your default/original value.



Which configurations are persistent

You can find out which MBean properties are persistent from your JMX client [I am using VisualVM].
Go to the Metadata tab as shown in the image below. An MBean with the description "Persistent Managed Bean" stores its property values in the database.





Hope this information will help!

Alfresco 4.2 New Features


Alfresco 4.2 has many new and enhanced features. I tried the new version in my local environment and have listed the features here.
The Alfresco theme has been changed and is now very light and sleek.

Alfresco Share NEW UI

New Header
  • The header theme has been changed for better visualization


My Files
  • A new link, My Files, has been added for users' personal documents
  • In the back end (Alfresco explorer), it saves content to the user's Home Folder.


Shared Files
  • A new link, Shared Files, has been added.
  • In the Share Repository view, a new folder, Shared, has been added under Company Home
  • This is the default folder for all users to add/share content, which can help to hide system folders like Sites, Data Dictionary etc.
  • The Repository link is unchanged and shows all folders under Company Home

Site Interface
  • The site UI has been changed, again with a new look for better visualization.
  • The site page header has been removed. The site customization header and page headers are combined in a single toolbar
  • You can navigate to recently accessed sites from the Sites menu

  • In the Document Library, many new views have been introduced, like Filmstrip, Table view etc.


Admin Console
  • The Admin console has been divided into two parts
  • Share Admin Tool
    • This includes
      • Tools - Application, Category Manager, Node Browser etc..
      • File Management -Trashcan
      • Content Publishing - Channel Manager
      • Repository - Replication Jobs
      • Users and Groups - Groups, Users


  • Repository Administration Console
    • It runs independently of the Alfresco app, so there is no direct link available from the Share or Explorer UI
    • Use this URL to launch the Repository admin console - http://<host-name>:<port>/alfresco/service/enterprise/admin/admin-directorymanagement
    • This includes
      • System Summary
      • Email Services
      • General - License, Repository Information etc.
      • Repository Services - Activities Feed, Repository Server Clustering, Process Engines, Replication Service, Search Service etc. New, enhanced Repository Server Clustering has been added here.
      • Support Tools -Download JMX Dump 
      • Directories - Directory Management 
      • Virtual File Systems - File Servers, IMAP Service


User Trashcan
  • A user trashcan has been added to the user profile page
  • Users can recover/purge deleted items from their own trash

Site level activity feed
  • Users can control the activity feed at the individual site level

Download as Zip
  • A new action has been added to download multiple files together as a zip
  • Select multiple files, then choose the Download as Zip action from the menu to create a zip file
  • You can also download entire folders as a zip bundle

Google Docs Integration
  • Google Docs integration is much improved.
  • You can create a doc, xls or PPT file in just one click; you can edit/update your content and save it back to Alfresco
  • Here is Jeff Potts' blog post with a video of the detailed features - http://ecmarchitect.com/archives/2012/10/10/1715

New Search Dashlets

Site Search
  • It lets you search content across your sites if you add it to your personal dashboard, OR within a specific site if it is added to the site dashboard

Saved Search
  • It lets you save a search term.
  • You configure the search term in the dashlet configuration
  • The dashlet then lists all content related to your search term

You may want to try the new version. It has a really good user experience!

Reference Link: http://docs.alfresco.com/4.2/index.jsp

Friday, September 6, 2013

Failed to purge nodes - DeletedNodeCleanupWorker


We recently faced an issue with the Alfresco OOB scheduler - DeletedNodeCleanupWorker.
We were getting the error below:

"21:00:40,348 WARN [node.db.DeletedNodeCleanupWorker] Failed to purge nodes. If the purgable set is too large for the available DB resources then the nodes can be purged manually as well.
...................
### Error updating database.  Cause: java.sql.SQLException: The DELETE statement conflicted with the REFERENCE constraint "fk_alf_cass_pnode". The conflict occurred in database "alfrescoDB", table "dbo.alf_child_assoc", column 'parent_node_id'."

The reason for this error is that Alfresco still references a deleted node, so the purge fails because of the constraint in the DB.
This error may affect Lucene indexing. Indexes can get out of sync and search might not return all results.

To solve this error, we followed the steps below. Basically, we need to delete the corrupted nodes manually from the database.

1) First we need to find the problematic nodes. For that, run a full reindex. You may enable the loggers below for detailed info.


log4j.logger.org.alfresco.repo.node.index.AbstractReindexComponent=debug
log4j.logger.org.alfresco.repo.node.index.IndexTransactionTracker=debug
log4j.logger.org.alfresco.repo.node.index.FullIndexRecoveryComponent=debug
log4j.logger.org.alfresco.repo.node.index.AVMFullIndexRecoveryComponent=debug

2) Check the full indexing logs. Nodes with the error below are the corrupted nodes that are causing the issue.

"Caused by: org.springframework.dao.ConcurrencyFailureException: Attempt to follow reference workspace://SpacesStore/fc4450e4-c316-4531-b90a-b94a0e73a4a5 to deleted node 1260437"

3) We take this node, 1260437, as the one to delete manually from the DB. Before we delete it, we need to make sure the deletion doesn't affect any other content and that the node doesn't have any children.

4) Execute the queries below. For example, I got 2 corrupted nodes - 1260437 and 467222. I have also provided the query results so you can see which nodes from the results need to be checked again and how to walk to the bottom of the tree.

Of the queries below, the main one to consider is: SELECT * FROM alf_child_assoc WHERE parent_node_id IN (1260437,467222)

SELECT * FROM alf_node_assoc WHERE source_node_id IN (1260437,467222); - No Records
SELECT * FROM alf_node_assoc WHERE target_node_id IN (1260437,467222); - No Records
SELECT * FROM alf_usage_delta WHERE node_id IN (1260437,467222); - No Records
SELECT * FROM alf_node_aspects WHERE node_id IN (1260437,467222); - No Records
SELECT * FROM alf_node_properties WHERE node_id IN (1260437,467222); - No Records
SELECT * FROM alf_child_assoc WHERE child_node_id IN (1260437,467222); - No Records

SELECT * FROM alf_child_assoc WHERE parent_node_id IN (1260437,467222);
id version parent_node_id type_qname_id child_node_name_crc child_node_name child_node_id qname_ns_id qname_localname qname_crc is_primary assoc_index
1164570 1 1260437 193 -3852862330 68a0807e-99d1-4208-a28c-8153c45805ea 1260445 6 webpreview 1387062285 1 -1

SELECT * FROM alf_node WHERE alf_node.id IN (1260437,467222);
id version store_id uuid transaction_id node_deleted type_qname_id acl_id audit_creator audit_created audit_modifier audit_modified audit_accessed locale_id
467222 6 6 7295c299-f750-4b5f-84d5-381059f117a5 1581812 1 32 (null) HFRHH300 2012-07-09T19:11:03.430-04:00 HFRHH300 2013-05-19T16:16:42.048-04:00 (null) 1
1260437 4 6 65b506c4-e24d-4347-9961-331ee7df36dd 1473392 1 32 (null) HFINN500 2013-04-23T16:49:00.372-04:00 HFINN500 2013-04-23T16:49:01.658-04:00 (null) 1

From the above query result, we can see that node 1260437 has a child with id 1260445.
So we need to run the query again for node 1260445 to make sure it doesn't have any children.

SELECT * FROM alf_child_assoc WHERE parent_node_id IN (1260445); - No Result

From the query result we can confirm that node 1260445 doesn't have any children. So we are good to delete three nodes - 1260437, 467222, 1260445
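
Repeating the parent_node_id query until no more rows come back is a breadth-first walk down the tree. The Python sketch below illustrates that idea only: it runs against an in-memory SQLite table holding just the two columns the walk needs, whereas the real database in this post is SQL Server, and collect_descendants is a hypothetical helper. It is read-only and deletes nothing.

```python
import sqlite3


def collect_descendants(conn, root_ids):
    """Return root_ids plus every node reachable through alf_child_assoc."""
    seen = set(root_ids)
    frontier = list(root_ids)
    while frontier:
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT child_node_id FROM alf_child_assoc WHERE parent_node_id IN ({marks})",
            frontier,
        ).fetchall()
        # Only follow children we have not visited yet, so the loop always ends.
        frontier = list({row[0] for row in rows} - seen)
        seen.update(frontier)
    return sorted(seen)


# Stand-in table with the single association found in the query result above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alf_child_assoc (parent_node_id INTEGER, child_node_id INTEGER)")
conn.execute("INSERT INTO alf_child_assoc VALUES (1260437, 1260445)")
print(collect_descendants(conn, [1260437, 467222]))
```

Every id returned still needs the manual checks from step 3 before it is added to the delete list.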

5) Execute the DELETE queries below.

DELETE FROM alf_node_assoc WHERE source_node_id IN (1260437,467222,1260445);
DELETE FROM alf_node_assoc WHERE target_node_id IN (1260437,467222,1260445);
DELETE FROM alf_usage_delta WHERE node_id IN (1260437,467222,1260445);
DELETE FROM alf_node_aspects WHERE node_id IN (1260437,467222,1260445);
DELETE FROM alf_node_properties WHERE node_id IN (1260437,467222,1260445);
DELETE FROM alf_child_assoc WHERE child_node_id IN (1260437,467222,1260445);
DELETE FROM alf_child_assoc WHERE parent_node_id IN (1260437,467222,1260445);
DELETE FROM alf_node WHERE alf_node.id IN (1260437,467222,1260445);
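
Because the order of the statements matters, it can help to generate them from one list of node ids instead of editing eight lines by hand. The Python sketch below only builds and prints the statements, in the same table order as above and using plain DELETE FROM syntax; it does not connect to any database, and build_delete_statements is a hypothetical helper.

```python
# Same table/column order as the statements above; referencing rows go first, alf_node last.
DELETE_ORDER = [
    ("alf_node_assoc", "source_node_id"),
    ("alf_node_assoc", "target_node_id"),
    ("alf_usage_delta", "node_id"),
    ("alf_node_aspects", "node_id"),
    ("alf_node_properties", "node_id"),
    ("alf_child_assoc", "child_node_id"),
    ("alf_child_assoc", "parent_node_id"),
    ("alf_node", "id"),
]


def build_delete_statements(node_ids):
    """Return the DELETE statements for node_ids in the required order."""
    ids = ",".join(str(int(n)) for n in node_ids)  # int() rejects non-numeric input
    return [f"DELETE FROM {table} WHERE {column} IN ({ids});" for table, column in DELETE_ORDER]


for statement in build_delete_statements([1260437, 467222, 1260445]):
    print(statement)
```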

NOTE: Take care of the following before you delete nodes from the DB.

1) Execute the deletes in the given order.
2) Take a DB backup. Once these nodes are deleted, there is no way to recover them if any issue arises.
3) Also take a Lucene index backup in case of any issue.

Final Steps

1) Shut down Alfresco
2) Take a database backup
3) Take a Lucene index backup
4) Execute the delete queries
5) Start Alfresco with FULL index ON. [Optional]
6) Monitor the scheduler for a few days. The issue should be resolved.

We executed this successfully in our environment - Alfresco 4.0.1EE, DB: SQL Server 2008.

Hope this helps!