SQL-Alfresco:Alfresco SQL views
From SIDE-Labs.org
Unsurprisingly, performances on querying the views with several joins are disastrous! It is the reason SIDE 1.0 and SIDE 2.0 implementations now use real tables, synchronized through an Alfresco policy, instead of views.
Help & support are available at http://www.side-labs.org/forums/
Contents |
Overview
This component permits to use SQL queries to select data stored through Alfresco. It permits to manage structured data with Alfresco.
Architecture
Alfresco stores data in a "CMS way", that means meta informations are mixed in one table, whatever the main type is. BlueXML's solution provides a way :
- to reconstruct one table for each type
- to manage inheritance
- to manage class associations.
Alfresco Database schema
( from the Alfresco's wiki)
The table alf_node
+-----+---------+-----------+-------------+--------------------------------------+-------------------------------------------------------------------------------------------+--------+
| id | version | protocol | identifier | uuid | type_qname | acl_id |
+-----+---------+-----------+-------------+--------------------------------------+-------------------------------------------------------------------------------------------+--------+
| 841 | 1 | workspace | SpacesStore | 986570b5-4a1b-11dd-823c-f5095e006c11 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Entreprise | NULL |
| 842 | 1 | workspace | SpacesStore | 98709447-4a1b-11dd-823c-f5095e006c11 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Entreprise | NULL |
| 843 | 1 | workspace | SpacesStore | 987bdee9-4a1b-11dd-823c-f5095e006c11 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Entreprise | NULL |
| 844 | 1 | workspace | SpacesStore | 9887298b-4a1b-11dd-823c-f5095e006c11 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Entreprise | NULL |
It stores all the nodes and thus all the class instances defined in the model. Here we can remark the identifiers (id) of the different objects of type Entreprise (type_qname column). The column protocol is also important since it enables one to filter the elements from the workspace (actually, an element which was removed is still existing in the database with the protocol archive).
Table alf_node_properties
+---------+-------------+--------------+----------------+---------------+------------+-------------+--------------+----------------+-----------------+--------------------+--------------------------------------------------------------------------------+
| node_id | actual_type | multi_valued | persisted_type | boolean_value | long_value | float_value | double_value | string_value | attribute_value | serializable_value | qname |
+---------+-------------+--------------+----------------+---------------+------------+-------------+--------------+----------------+-----------------+--------------------+--------------------------------------------------------------------------------+
| 1032 | STRING | | STRING | | 0 | 0 | 0 | Monsieur | NULL | NULL | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Contact_civilite |
| 1032 | STRING | | STRING | | 0 | 0 | 0 | Kermagoret | NULL | NULL | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Contact_nom |
| 1032 | STRING | | STRING | | 0 | 0 | 0 | Laurent | NULL | NULL | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Contact_prenom |
| 1032 | STRING | | STRING | | 0 | 0 | 0 | Lolo | NULL | NULL | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Contact_surnom |
| 1037 | STRING | | STRING | | 0 | 0 | 0 | Bois | NULL | NULL
For each instance defined in the table alf_node, we can define the set of values of the corresponding attributes. For example, for a given class Contact, we can define the different instance attributes (1032), for which values Monsieur, Kermagoret and Laurent are given respectively to the attributes civilite, nom and prenom. Note that the type of the attribute is given by the column qname. The latter matches the class to which the attribute name is added.
Inheriting management
In Alfresco, nodes store the attributes (and the associations) regarding the real container. Inheriting is then interpreted with help of a specific ORM (hypothesis). For example, if an object Contact(email) inherits from an object Personne(civilite,nom,prenom), an instance of a Contact object will produce a recording of the information (civilite, nom, prenom) in three properties associated to a node of type (type_qname) Contact; however, the attributes types (in the table alf_node_properties) is actually Personne_X (where X in {civilite, nom, prenom}).
Table alf_node_assoc
+----+---------+----------------+----------------+-----------------------------------------------------------------------------------------------------------+
| id | version | source_node_id | target_node_id | type_qname |
+----+---------+----------------+----------------+-----------------------------------------------------------------------------------------------------------+
| 4 | 0 | 613 | 610 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Newsletter_hasReceived |
| 6 | 0 | 613 | 610 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Action_action_realisee |
| 7 | 0 | 613 | 610 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_NewsletterWithReplySlip_haveReplied_yes |
| 8 | 0 | 610 | 617 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Contact_rattache_a |
| 9 | 0 | 617 | 610 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Entite_rattache_a |
| 10 | 0 | 617 | 732 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Entite_rattache_a |
| 11 | 0 | 732 | 617 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Contact_rattache_a |
| 12 | 0 | 728 | 733 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Entite_rattache_a |
| 13 | 0 | 733 | 728 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Contact_rattache_a |
| 14 | 0 | 617 | 728 | {http://www.bluexml.com/model/content/crm/1.0}com_bluexml_sem_crm_Entite_filiale_de |
+----+---------+----------------+----------------+-----------------------------------------------------------------------------------------------------------+
Here we define the associations between classes through the notion of association between nodes. Associations defines a direction (implicitly identified with the couple source/target). Note that another similar table alf_child_assoc exists; the latter is used to store parent/child (hierarchical relations) associations (in BxDS, these relations are used to manage compositions and aggregations).
Performed work
Let's consider the following BxDS model.
Class Management
Each class generates an SQL view enclosing the set of attributes and the set of inherited attributes. Inherited attribtues are prefixed with the name of the class which defines it in order to avoid potential name conflicts.
Generated SQL query
- Example: Contact Class
create or replace view Contact (id, email, Person_firstName, Person_lastName, Person_salutation, Person_birthDate) as select id, email.string_value, firstName.string_value, lastName.string_value, salutation.string_value, birthDate.string_value from alf_node left outer join alf_node_properties email on (id = email.node_id and email.qname like '%org_test_contact_Contact_email') left outer join alf_node_properties firstName on (id = firstName.node_id and firstName.qname like '%org_test_contact_Person_firstName') left outer join alf_node_properties lastName on (id = lastName.node_id and lastName.qname like '%org_test_contact_Person_lastName') left outer join alf_node_properties salutation on (id = salutation.node_id and salutation.qname like '%org_test_contact_Person_salutation') left outer join alf_node_properties birthDate on (id = birthDate.node_id and birthDate.qname like '%org_test_contact_Person_birthDate') where ( type_qname like '%org_test_contact_Contact' ) and protocol = 'workspace';
Query sample
- Example: Contact Class
mysql> select * from Contact; +-----+--------------------+------------------+-----------------+-------------------+-------------------------------+ | id | email | Person_firstName | Person_lastName | Person_salutation | Person_birthDate | +-----+--------------------+------------------+-----------------+-------------------+-------------------------------+ | 470 | bpajot@bluexml.com | Brice | PAJOT | M. | 1981-05-03T10:12:00.000+02:00 | | 478 | bch@bluexml.com | Benjamin | CHEVALLEREAU | M. | 1984-02-07T16:56:00.000+01:00 | +-----+--------------------+------------------+-----------------+-------------------+-------------------------------+
Inheriting Management
Abstract classes or more generally all the classes which are not leaves in a specialisation arborescence define views.
Generated SQL query
- Example: Person Class
create or replace view Person (id, firstName, lastName, salutation, birthDate) as select id, firstName.string_value, lastName.string_value, salutation.string_value, birthDate.string_value from alf_node left outer join alf_node_properties firstName on (id = firstName.node_id and firstName.qname like '%org_test_contact_Person_firstName') left outer join alf_node_properties lastName on (id = lastName.node_id and lastName.qname like '%org_test_contact_Person_lastName') left outer join alf_node_properties salutation on (id = salutation.node_id and salutation.qname like '%org_test_contact_Person_salutation') left outer join alf_node_properties birthDate on (id = birthDate.node_id and birthDate.qname like '%org_test_contact_Person_birthDate') where ( type_qname like '%org_test_contact_Person' or type_qname like '%org_test_contact_Contact' ) and protocol = 'workspace';
Query Example
- Example: Person Class
mysql> select * from Person; +-----+-----------+--------------+------------+------------+ | id | firstName | lastName | salutation | birthDate | +-----+-----------+--------------+------------+------------+ | 470 | Brice | PAJOT | M. | 1981-05-03 | | 478 | Benjamin | CHEVALLEREAU | M. | 1984-02-07 | +-----+-----------+--------------+------------+------------+
Note that the birthDate column is of (Mysql) date type, whereas Alfresco stores this data as a string (ISO-8601 format); thus it enables to query the database with date constraints. Also remark that attributes are not prefixed with the name of the class contrarily to the preceding query on Contact; although it is a possibility, we do not recommend this view since it brings potential name conflicts (with aspect and inherited attributes).
Aspect Management
Each aspect generates a view with an identifier (id) which is the same as the one on which the aspect is attached (1-1 relation). Furthermore, aspects are integrated in the views of the classes which declare these aspects. Normally, classical views without aspects are replaced, but it is possible to configure the generation in order to obtain a view with a different name; for example, we can configure the generation in order to add the suffix _WA to class names, giving thus Person_WA and Contact_WA for the classes Person and Contact which defines the aspect BankAccount.
Generated SQL Query
- Example: BankAccount Class
create or replace view BankAccount (id, number) as select aspect_id.node_id, number.string_value from alf_node_aspects aspect_id left outer join alf_node_properties number on (aspect_id.node_id = number.node_id and number.qname like '%BankAccount_number') where aspect_id.qname like '%BankAccount';
Query Example
- Example: BankAccount Class
mysql> select * from BankAccount; +-----+------------+ | id | number | +-----+------------+ | 478 | 1234567890 | +-----+------------+
Generated SQL Query
- Example: Contact_WA Class
create or replace view Contact_WA (id, Person_firstName, Person_lastName, Person_salutation, Person_birthDate, BankAccount_number, email) as select id, firstName.string_value, lastName.string_value, salutation.string_value, birthDate.string_value, number.string_value, email.string_value from alf_node left outer join alf_node_properties firstName on (id = firstName.node_id and firstName.qname like '%org_test_contact_Person_firstName') left outer join alf_node_properties lastName on (id = lastName.node_id and lastName.qname like '%org_test_contact_Person_lastName') left outer join alf_node_properties salutation on (id = salutation.node_id and salutation.qname like '%org_test_contact_Person_salutation') left outer join alf_node_properties birthDate on (id = birthDate.node_id and birthDate.qname like '%org_test_contact_Person_birthDate') left outer join alf_node_properties number on (id = number.node_id and number.qname like '%org_test_contact_BankAccount_number') left outer join alf_node_properties email on (id = email.node_id and email.qname like '%org_test_contact_Contact_email') where ( type_qname like '%org_test_contact_Contact' ) and protocol = 'workspace';
Query Example
- Example: Contact_WA Class
mysql> select * from Contact_WA; +-----+------------------+-----------------+-------------------+-------------------------------+--------------------+--------------------+ | id | Person_firstName | Person_lastName | Person_salutation | Person_birthDate | BankAccount_number | email | +-----+------------------+-----------------+-------------------+-------------------------------+--------------------+--------------------+ | 470 | Brice | PAJOT | M. | 1981-05-03T10:12:00.000+02:00 | NULL | bpajot@bluexml.com | | 478 | Benjamin | CHEVALLEREAU | M. | 1984-02-07T16:56:00.000+01:00 | 1234567890 | bch@bluexml.com | +-----+------------------+-----------------+-------------------+-------------------------------+--------------------+--------------------+
Note that the number attribute defined by the BankAccount aspect is prefixed by the name of the aspect, but not by the name of the inherited class; this could raise potential name conflicts in specific (and erroneous) cases where the aspect would be defined on a child and on one of its parents.
Association Management
Contrarily to the classical representation in the database (based classically on a Entity-Relation Model), where One-To-Many (or Many-To-One) relations are represented with help of foreign keys, we decided in the sake of simplicity, to generate only association tables. Thus each association defined in the BxDS model generates a (association) table between a source class and a target class. The view name is determined with the name of the both classes, the association name and the association direction (navigability). Therefore, for a given association R between two classes A and B: A -- R --> B, we define a view of name A_R_B. When the association is oriented in both ways, two views are thus generated.
Defining an association of type containment (aggregation or composition) does not generate any difference in the view creation (only tables on which are performed the queries are different).
Remark: maybe there is still a bug in the generation of the view when the direction of the association is opposite to the navigability direction (a particularity of the modelling with BxDS).
Generated SQL Query
- Example: Person_addresses_Address Association
create or replace view Person_addresses_Address (id, Person, Address) as select assoc__.id, assoc__.source_node_id, assoc__.target_node_id from alf_node_assoc assoc__ where assoc__.type_qname like '%org_test_contact_Person_addresses_org_test_contact_Address';
Query Example
- Example: Person_addresses_Address Association
mysql> select * from Person_addresses_Address; +----+--------+---------+ | id | Person | Address | +----+--------+---------+ | 1 | 470 | 469 | +----+--------+---------+
Query Example
- Example: Person / Address Join
mysql> select Person.*, Address.* from Person, Address, Person_addresses_Address link where (link.Person = Person.id) and (link.Address = Address.id); +-----+-----------+----------+------------+-------------------------------+-----+-------------------------+---------+----------+--------+ | id | firstName | lastName | salutation | birthDate | id | street | zipCode | city | number | +-----+-----------+----------+------------+-------------------------------+-----+-------------------------+---------+----------+--------+ | 470 | Brice | PAJOT | M. | 1981-05-03T10:12:00.000+02:00 | 469 | chemin du marché Besson | 85300 | CHALLANS | 25 | +-----+-----------+----------+------------+-------------------------------+-----+-------------------------+---------+----------+--------+
The problem with this query is that it produces results for the links defined between Person and Address; thus it only displays results for the persons which define an address. This problem can be solved by the use of left outer joins.
mysql> select Person.*, Address.* from Person left outer join Person_addresses_Address link on (link.Person = Person.id) left outer join Address on (link.Address = Address.id); +-----+-----------+--------------+------------+-------------------------------+------+-------------------------+---------+----------+--------+ | id | firstName | lastName | salutation | birthDate | id | street | zipCode | city | number | +-----+-----------+--------------+------------+-------------------------------+------+-------------------------+---------+----------+--------+ | 470 | Brice | PAJOT | M. | 1981-05-03T10:12:00.000+02:00 | 469 | chemin du marché Besson | 85300 | CHALLANS | 25 | | 478 | Benjamin | CHEVALLEREAU | M. | 1984-02-07T16:56:00.000+01:00 | NULL | NULL | NULL | NULL | NULL | +-----+-----------+--------------+------------+-------------------------------+------+-------------------------+---------+----------+--------+
Association-Class Management
Association-classes complete the information on an association. Thus for each association-class, we generate a view on the association containing an identifier of the decorating association-class. For example, for an association-class C decorating the association R defined between A and B: A -- R/C --> B, we define a view A_R_B having a source A, a target B, and and an association-class R.
Generated SQL Query
- Example: Contact_works_in_Entreprise Association
create or replace view Contact_works_in_Entreprise (id, Contact, Entreprise, Role) as select assoc__.id, assoc__.source_node_id, assoc__.target_node_id, Role.target_node_id from alf_node_assoc assoc__ left outer join alf_node_assoc Role on ( assoc__.source_node_id = Role.source_node_id and Role.type_qname like '%org_test_contact_Contact_works_in_org_test_contact_Role') where assoc__.type_qname like '%org_test_contact_Contact_works_in_org_test_contact_Entreprise';
Query Example
- Example: Contact_works_in_Entreprise Association
mysql> select * from Contact_works_in_Entreprise; +----+---------+------------+------+ | id | Contact | Entreprise | Role | +----+---------+------------+------+ | 9 | 470 | 472 | 501 | | 8 | 478 | 472 | NULL | +----+---------+------------+------+
Limitations and Problems
- Security mechanism provided by Alfresco are not propagated for the moment while selecting data. By offering direct access to data, we indeed bypass almost all the security mechanism provided by Alfresco. Only a basic security level is provided for the moment. We are currently studying various ways to solve this problem.
- Currently, the views are only defined on the Mysql database. Switching to another database would be however almost straightforward.
- Unsurprisingly, performance on querying the views with several joins are disastrous! It is the reason the current implementation uses real tables, synchronized through an Alfresco policy, instead of views.
- Currently, we normally use fully-qualified names for classes and attributes, thus prefixing each name by a package definition like com_bluexml_test_. This problem avoid conflicting names in large procjects where several sub-models are combined together to build a complete application. However, using this approach on an internal project raised a problem on Mysql for which the names of the tables are limited to 64 characters: several fully-qualified names of our objects were more than 64-characters long, thus requiring to work with simplified names.
- More help can be found at http://www.side-labs.org/forums/
