Thursday, July 24, 2014

Alfresco CMIS example with the Mule CMIS connector using RAML


As we know, Alfresco implements the Content Management Interoperability Services (CMIS) standard, which allows our applications to manage content and its metadata in an Alfresco repository or in Alfresco Cloud.

Here, we will see how we can use Mule to integrate Alfresco into an enterprise architecture using CMIS. Mule's tooling also supports the RESTful API Modeling Language (RAML), which enables a design-first approach to REST APIs.

We will implement a simple HTTP service which gets content metadata from Alfresco, as follows:
  • Design and write our API definition in RAML
  • Use this RAML file to create a simple REST API using Anypoint Studio
  • Use the CMIS connector to connect to the Alfresco repository
  • Get the metadata of a given node and display it in JSON format.
Assumption: You have already installed Alfresco and Anypoint Studio.

I have executed this example with Alfresco 4.2.1 EE and Mule server 3.5.1 EE.

Our URL and Response will look like:

Script URL: http://localhost:8080/mule/test/content/props?nodeRef=c76fe91c-c1f1-45cb-9b75-6742398482a6
Response:
{
  "title" : "test content",
  "name" : "test.ppt",
  "creator" : "admin",
  "description" : "My first RAML file"
}

Write API Definition in RAML: 

We will use the web-based API Designer to write our RAML file.

  • Our RAML file mainly contains a title, URI, HTTP method (GET, POST, etc.), query parameters and responses.
  • Below is the RAML file for our example.
  • Once your RAML file is ready, you can test it.
  • Click on GET, which will show the Request, Response and Try it tabs.

  • Request shows the description and query parameter details that we have provided in our RAML file.

  • Response is the static response that we provided as an example in the RAML file, with the proper HTTP status code: 200 on success, or 500 if any error occurs.

  • Try it – you can try your script and check its response. Click on GET, which will show you the response.

  • Once your RAML file is ready, save it to your local drive.
  • We are done with the RAML file.
Create Mule Project:

We will create a Mule project and import this RAML file, which will automatically generate the flows.
  • Click on File – New – Mule Project
  • Provide the name of your project
  • Check Add APIkit Component and select From RAML file or URL
  • Select the RAML file that you saved on your local drive.
  • Click Finish.


  • Studio creates the new project and automatically generates the main flow from the RAML file; you can find your RAML file under src/main/api

  • In the Flows view you can see:
    • The main flow with an HTTP endpoint and an APIkit Router
    • Global exception strategy mappings
    • A backend flow for each resource-action pairing

  • In the backend flow, the hard-coded example response from the RAML file is initially set as the payload.
  • In the main flow, click on the HTTP connector, which will open its properties, then click on the Advanced tab.
  • Note that Studio has not populated the address as per the baseUri of our RAML file. Change its value to match the RAML file – http://localhost:8080/mule/test/content – and save it.
  • Let's first run this project and test it in the console.
  • Right-click on the project – Run As – Mule Application.
  • On server startup, it will launch the APIkit Console.
  • You can verify your script the same way we did before.

  • We are done with importing the RAML file and creating the flows.

CMIS Connector:

Now, we will implement the actual logic for our flow: connect to Alfresco using the CMIS connector and get the metadata of the given content.
  • In your flow get:/props:alfrescoContentMetaData-config, add the CMIS connector.

  • Open the CMIS connector properties window.
  • In the General tab, under Connector Configuration, click the plus sign to add the Alfresco CMIS repository and set the properties as follows. Here we will use Alfresco CMIS 1.1.

    • Name: CMIS [provide any name]
    • UserName: admin
    • Password: admin
    • BaseURL: http://localhost:9090/alfresco/api/-default-/public/cmis/versions/1.1/atom
    • Repository id: -default-
    • Use Alfresco Extension: true [The default value is false. You need to set it to true if you need to access the Alfresco OpenCMIS extension API. Our example will work with either value.]
  • Once done, test the connection.
  • If everything is OK, click OK.

  • In Basic Settings, select the Operation – here we want to get the CMIS object for a given node id, so select "Get object by Id".
  • In General, in Object id, we need to provide the noderef. You can test it by giving any hard-coded noderef of content in your Alfresco repository, but we want to read it from the query parameter, so set it as below: #[groovy:message.getProperty('nodeRef', org.mule.api.transport.PropertyScope.INBOUND)]
  • We are done with the CMIS connector configuration (for reference, a standalone sketch of the equivalent OpenCMIS calls follows this list).
  • For testing, run your project and hit the URL http://localhost:8080/mule/test/content/props?nodeRef=c76fe91c-c1f1-45cb-9b75-6742398482a6 with an actual noderef. This will still return the hard-coded JSON response that we provided in the RAML file because, as mentioned earlier, the imported flow sets this hard-coded message as the payload.
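The Mule CMIS connector uses Apache Chemistry OpenCMIS under the hood, so its "Get object by Id" operation corresponds roughly to the standalone calls below. This is only a minimal sketch using the same connection settings as above (the class name CmisConnectionTest is purely illustrative); it can be handy for verifying the repository connection outside Mule.

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.chemistry.opencmis.client.api.CmisObject;
    import org.apache.chemistry.opencmis.client.api.Session;
    import org.apache.chemistry.opencmis.client.api.SessionFactory;
    import org.apache.chemistry.opencmis.client.runtime.SessionFactoryImpl;
    import org.apache.chemistry.opencmis.commons.SessionParameter;
    import org.apache.chemistry.opencmis.commons.enums.BindingType;

    public class CmisConnectionTest {

        public static void main(String[] args) {
            // Same connection settings as the CMIS connector configuration above
            Map<String, String> params = new HashMap<String, String>();
            params.put(SessionParameter.USER, "admin");
            params.put(SessionParameter.PASSWORD, "admin");
            params.put(SessionParameter.ATOMPUB_URL,
                    "http://localhost:9090/alfresco/api/-default-/public/cmis/versions/1.1/atom");
            params.put(SessionParameter.BINDING_TYPE, BindingType.ATOMPUB.value());
            params.put(SessionParameter.REPOSITORY_ID, "-default-");

            SessionFactory factory = SessionFactoryImpl.newInstance();
            Session session = factory.createSession(params);

            // Equivalent of the connector's "Get object by Id" operation
            CmisObject object = session.getObject("c76fe91c-c1f1-45cb-9b75-6742398482a6");
            System.out.println(object.getName());
        }
    }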
Process CMIS object:

Now we need to process the CMIS object to get the required properties as per our JSON response.
  • The CMIS connector's "Get object by Id" operation gives us a CMIS object.
  • You can process it with a scripting language like Groovy or with Java. Here I will show you Java.
  • Add a Java component as shown in the image and set its class to com.test.mule.alfresco.cmis.content.GetContentProps

  • In src/main/java, add the GetContentProps class and also add ContentPropsBean. A sketch of both classes follows this list.
  • The GetContentProps class will process the CMIS object to get the required properties and set them on an instance of ContentPropsBean.
  • Now add an Object to JSON transformer, as shown in the image, to get our response in JSON format.

  • Also remove or comment out the hard-coded payload.
  • We are done with our coding. Below is our main flow.
  • Run your Mule project.
  • Hit the Mule service with an actual noderef (make sure Alfresco is up and running): http://localhost:8080/mule/test/content/props?nodeRef=c76fe91c-c1f1-45cb-9b75-6742398482a6
  • Verify your JSON response.
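Below is a minimal sketch of what the two classes might look like, assuming a Mule 3 Java component that receives the CmisObject returned by the connector as the message payload. The cm:title and cm:description property ids are assumptions that depend on your content model and on whether the Alfresco extension is enabled; the complete source is on GitHub (see below).

    // GetContentProps.java
    package com.test.mule.alfresco.cmis.content;

    import org.apache.chemistry.opencmis.client.api.CmisObject;
    import org.mule.api.MuleEventContext;
    import org.mule.api.lifecycle.Callable;

    // Mule 3 Java component: takes the CmisObject returned by "Get object by Id"
    // from the message payload and maps the properties we need onto a simple bean.
    public class GetContentProps implements Callable {

        @Override
        public Object onCall(MuleEventContext eventContext) throws Exception {
            CmisObject cmisObject = (CmisObject) eventContext.getMessage().getPayload();

            ContentPropsBean props = new ContentPropsBean();
            props.setName(cmisObject.getName());                                      // cmis:name
            props.setCreator((String) cmisObject.getPropertyValue("cmis:createdBy"));
            // Assumed Alfresco aspect property ids; adjust to your model/binding.
            props.setTitle((String) cmisObject.getPropertyValue("cm:title"));
            props.setDescription((String) cmisObject.getPropertyValue("cm:description"));

            // The Object to JSON transformer after this component serialises the bean.
            return props;
        }
    }

    // ContentPropsBean.java
    package com.test.mule.alfresco.cmis.content;

    // Simple POJO holding the properties exposed in the JSON response.
    public class ContentPropsBean {

        private String title;
        private String name;
        private String creator;
        private String description;

        public String getTitle() { return title; }
        public void setTitle(String title) { this.title = title; }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }

        public String getCreator() { return creator; }
        public void setCreator(String creator) { this.creator = creator; }

        public String getDescription() { return description; }
        public void setDescription(String description) { this.description = description; }
    }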
You can download the source code of this example from GitHub.






Thursday, July 17, 2014

Basics of Audit Analysis and Reporting in Alfresco


In Alfresco, there is a common requirement to collect audit data and report on it. Alfresco does provide an audit feature that we can enable, but there is no out-of-the-box support for audit reports.

Here we will discuss an audit and reporting add-on: Alfresco Audit Analysis and Reporting (A.A.A.R.).

What is Alfresco Audit Analysis and Reporting (A.A.A.R.)

This solution provides very detailed reports on Alfresco audit data. It fetches the audit data from Alfresco, stores it in a data mart, creates reports in PDF, XLS, etc., and uploads these reports back to Alfresco.

This solution creates the AAAR_DataMart database to store the Alfresco audit data.

Reports in different formats are generated from this data mart with Pentaho Report Designer.

Pentaho Data Integration (also known as Kettle) is an ETL system which provides powerful capabilities to Extract, Transform and Load data. Here, Pentaho Data Integration is used to extract data from Alfresco, load it into the data mart, generate the reports and load them back into Alfresco.

How to install AAAR

We need:
  • Alfresco 4.X
  • MySQL/PostgreSQL
  • Pentaho Report Designer
  • Pentaho Data Integration
NOTE: I have tested this with Alfresco 4.2.1 EE and a MySQL database on Windows 7.

Install Alfresco 4.2.1 EE

Please refer to this link to install Alfresco: http://docs.alfresco.com/4.2/concepts/master-ch-install.html
  • Once Alfresco is installed, enable auditing. Edit alfresco-global.properties, located at <AlfrescoInstallDir>/tomcat/shared/classes:
    ### Audit Configuration
    audit.enabled=true
    audit.alfresco-access.enabled=true

    ### FTP Server Configuration ###
    ftp.enabled=true
    ftp.port=21
  • Restart the server and verify that auditing is enabled and working OK.
  • Hit this webscript, which will give you a JSON response like the one below: http://<alfresco_url>:<alfresco_port>/alfresco/service/api/audit/control
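The exact response varies by Alfresco version, but it should look roughly like this, with "enabled" set to true both globally and for the alfresco-access application (other audit applications may also be listed):

    {
       "enabled" : true,
       "applications" :
       [
          {
             "name" : "alfresco-access",
             "path" : "/alfresco-access",
             "enabled" : true
          }
       ]
    }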
Create Data Mart :
  • We need to create the AAAR_DataMart database.
  • Open a command prompt (cmd).
  • Go to the MySQL bin folder using cd C:/Program Files (x86)/MySQL/MySQL Server 5.5/bin
  • Execute the AAAR_DataMart.sql script. You can find this script either at biserver-ce\pentaho-solutions\system\AAAR\endpoints\kettle\src\MySql or at <your AAAR installed folder>\AAAR\endpoints\kettle\src\PostgreSql, depending on your database.
  • Execute this command: mysql -u root -p<password> < "<AAAR folder>\AAAR_DataMart.sql" OR you can execute the entire script from your MySQL editor.
  • Check that the AAAR_DataMart database has been created.
Install Pentaho Data Integration/Kettle :
  • Make sure you have Java 7 installed.
  • Download pdi-ce-5.0.1-stable.zip from the official website or from SourceForge: http://sourceforge.net/projects/pentaho/files/Data%20Integration/5.0.1-stable/
  • Unzip pdi-ce-5.0.1-stable.zip.
  • To run it with MySQL, add the MySQL JDBC driver JAR file to the data-integration/lib folder.
  • PDI is composed of Spoon, Kitchen and Pan.
  • Execute Spoon.bat, located at pdi-ce-5.1.0.0-752\data-integration, to create all configuration folders and files.
  • Now we need to set up the PDI/Kettle repository.
  • Just as we executed a script to create AAAR_DataMart, we need to execute AAAR_Kettle_v5.sql to create the AAAR_Kettle database.
Set Pentaho Data Integration repository :
  • The next step is to set up the PDI repository, which stores the ETL jobs and transformations.
  • Go to pdi-ce-5.1.0.0-752\data-integration and run Spoon.bat.
  • Click on the green plus sign to add a new repository and define a new repository connection in the database.


  • Add a new database connection to the repository.
  • Select "Kettle Database Repository", which will allow you to select the DB connection settings.



  • Select General
  • Give Connection Name – AAAR_Kettle
  • Select Connection Type – MySQL
  • Select Access – Native(JDBC)
  • In Settings, provide
    • Host Name: localhost
    • Database Name: AAAR_Kettle
    • Port Number: 3306
    • User Name: root
    • Password: root
  • Once done, test the connection and make sure you are able to connect to the AAAR_Kettle DB
  • Click OK

  • In the Repository Connection dialog you will find AAAR_Kettle
  • User Name: admin and Password: admin
  • Click OK
  • We are done with the AAAR_Kettle DB setup
  • Now let's configure AAAR_DataMart
  • From the Pentaho Data Integration panel, click on Tools -> Repository -> Explore

  • Click on the 'Connections' tab and edit the AAAR_DataMart connection.

  • Edit this DB connection and provide the same details as in the AAAR_Kettle DB setup, but pointing to the AAAR_DataMart database.

  • We are done with PDI repository setup
Install Pentaho Business Analytics Platform 5 [Pentaho BI Server 5] :
Install A.A.A.R. from the Pentaho Marketplace
  • Log in as an admin user at http://<server>:8080/pentaho
  • Go to Home -> Marketplace
  • Install:
    • Community Data Access
    • Community Dashboard Editor
    • Alfresco Audit Analysis and Reporting

  • Once it is installed, restart your BI Server
  • Log in again as admin
  • Go to Tools -> AAAR
  • Click on Configuration
  • Provide details for Alfresco, Data Mart and PDI/Kettle
  • Alfresco details:
    • Protocol : http
    • Host: localhost
    • Port: 9090
    • Login: admin
    • Password: admin
    • FTP Path : alfresco
    • FTP Port : 2121
    • Max audit: 50000
  • Data Mart details:
    • Type: MySql
    • Host: localhost
    • Port: 3306
    • Login: root
    • Password: root
    • Bin Path [MySql bin dir’s path]:  C:/Program Files (x86)/MySQL/MySQL Server 5.5/bin
  • PDI/Kettle details
    • Path[Where you have installed your PDI]: D:/Pentaho/pdi-ce-5.1.0.0-752/data-integration

  • Save your data
  • Click on the Install tab, then click Install
  • Check your logs
  • After successful installation, go to Tools -> Refresh -> CDA Cache
  • Click Use

Use A.A.A.R. :
  • Log in as admin or another user at http://<server>:8080/pentaho
  • Go to Tools -> AAAR
  • Click on Use
  • Here you will find:
    • Extract: get audit data from Alfresco into the data mart
    • Publish: upload reports to Alfresco
    • Analyse: analyze data from the dashboard
  • Extract data:
    • You can schedule this script to run as per your requirements, or you can run it manually for testing
    • To run it manually, go to biserver-ce\pentaho-solutions\system\AAAR\endpoints\kettle\script and execute the AAAR_Extract.bat script
    • Execute this script, OR
    • Open cmd and change your working directory to data-integration
    • Execute this command: kitchen.bat /rep:"AAAR_Kettle" /job:"Get all" /dir:/Alfresco /user:admin /pass:admin /level:Basic

  • Publish to Alfresco:
    • We can publish the extracted reports from the PDI repository to Alfresco. These will be static reports.
    • You can schedule this script to run, or you can run it manually
    • To run it manually, go to biserver-ce\pentaho-solutions\system\AAAR\endpoints\kettle\script and execute the AAAR_Publish.bat script
    • Execute this script, OR
    • Open cmd and change your working directory to data-integration
    • Execute this command: kitchen.bat /rep:"AAAR_Kettle" /job:"Report all" /dir:/Alfresco /user:admin /pass:admin /level:Basic
    • Once the reports are published, you can go to Alfresco and check that the reports have been generated under Company Home


  • Analyze data:
    • Go to the Analyze tab and click on Analytics; you can analyze real-time data from here
    • OR you can access it through http://<server>:8080/pentaho/api/repos/:public:AAAR:main.wcdf/generatedContent

Reference: https://addons.alfresco.com/addons/alfresco-audit-analysis-and-reporting-aaar








Thursday, July 3, 2014

Sorting Search Results with Alfresco Solr


In Alfresco, we can configure our search subsystem to use Solr.

During a search, we often need to sort results based on a property. So here I would like to focus on sorting on a property, specifically when you define your own custom property.

Sorting Search Results:

When you want to sort on a specific property, you need to set "tokenised" in its index definition. It should be set to either "false" or "both".

<index enabled="true">
    <atomic>true</atomic>
    <stored>false</stored>
    <tokenised>false</tokenised>
</index>

By default, "tokenised" is set to "true", and when you try to sort your results, Solr will throw the error below:

2014-06-30 13:23:06,768 ERROR [solr.core.SolrCore] [http-8443-5] java.lang.UnsupportedOperationException: Ordering not supported for @custom:myTestProperty
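For illustration, this is roughly how a sorted query on such a custom property looks through Alfresco's Java Foundation API (SearchService). The class below is only a sketch, and the namespace URI of the custom model is an assumption – replace it with your own:

    import org.alfresco.service.cmr.repository.StoreRef;
    import org.alfresco.service.cmr.search.ResultSet;
    import org.alfresco.service.cmr.search.SearchParameters;
    import org.alfresco.service.cmr.search.SearchService;

    // Minimal sketch: sort search results on a custom property via the SearchService.
    // With Solr, the sort only works if the property's "tokenised" setting is
    // "false" or "both"; otherwise Solr throws the UnsupportedOperationException above.
    public class SortedSearchExample {

        private SearchService searchService; // injected via Spring in a real module

        public ResultSet findSorted() {
            SearchParameters sp = new SearchParameters();
            sp.addStore(StoreRef.STORE_REF_WORKSPACE_SPACESSTORE);
            sp.setLanguage(SearchService.LANGUAGE_FTS_ALFRESCO);
            sp.setQuery("TYPE:\"cm:content\"");

            // Assumed namespace URI for the custom model; replace with your own.
            sp.addSort("@{http://www.test.com/model/custom/1.0}myTestProperty", true);

            return searchService.query(sp); // remember to close the ResultSet when done
        }
    }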

Sorting on Date and Date Range Query:

As mentioned above, if you want to sort your results on a property, here specifically a d:datetime property, you can set its "tokenised" to either "false" or "both".

If you set your date property's "tokenised" to "false", then sorting will work, but range queries will NOT return results.

This query won't give you any results with "tokenised" set to "false":
@custom\:myTestDate:[2011-06-13 TO 2014-06-01]

So you need to set "tokenised" to "both" so that both sorting and range queries work.

Numeric Sort:

In some scenarios, you need to sort your search results numerically, for example for a d:int property.

If you set "tokenised" to "false" for your numeric property, then Solr will sort the results in lexical order [the same as a text property such as cm:name] rather than numeric order.

So, to sort your results numerically, you need to set "tokenised" to "both".

With Lucene, you can sort results regardless of the "tokenised" setting. If you are moving your search subsystem from Lucene to Solr, then you need to review your "tokenised" settings.

Whenever you make a change like this in your content model, you need to rebuild the indexes, so make sure of your searching and sorting requirements in advance.