hippo 7 and xml

hippo 7 and xml

Sergei Kozyrenko
Hello,
I'm currently prototyping a system that would let my company integrate a
repository, our XML schema, and a web WYSIWYG editor (Xopus) into one
package. I've started looking into Hippo 7, but can't find anything on
your website related to how the content is actually stored -- and if
it's stored as XML, could we plug in our own XML schema? If so, is there
any documentation on how to do that?

Thanks,
        Sergei
_______________________________________________
Hippo-cms7-user mailing list and forums
http://www.onehippo.org/cms7/support/community.html
Re: hippo 7 and xml

Arje Cahn
Administrator
Hi Sergei!

Sergei Kozyrenko wrote:
> I'm currently prototyping a system that would let my company integrate a
> repository, our XML schema, and a web WYSIWYG editor (Xopus) into one
> package. I've started looking into Hippo 7, but can't find anything on
> your website related to how the content is actually stored -- and if
> it's stored as XML, could we plug in our own XML schema? If so, is there
> any documentation on how to do that?

First of all: in Hippo CMS 7, the standard way of setting up templates is by using the template editor [1]. You can extend the templates as described in [2].

If you want to use your own XML Schema, and you're using Xopus [3] as your web-based XML editor, my best bet would be to:
- Create a Xopus editing template plugin that wraps Xopus as a single field in the editor
- Store the XML structure that comes out of Xopus in one node in your node type

In this way, you have all the power of Xopus and especially XML Schema. On the other hand, you don't have the XML structure split out into a dedicated node structure in the repository. If you want that, you might have to build some extractors to do that for you. Hippo CMS 7 uses the concept of node types as described in the JCR spec [4] [5].

This procedure is the same as how it used to work in Hippo CMS 6. One thing that would be 'nice to have' is a way to extract data from an XML fragment with an XPath expression and store it as a property on the document node.
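
As an illustration of that 'nice to have', here is a minimal sketch in plain Java using only the JDK's javax.xml.xpath API. It is not existing Hippo CMS code: the class name XPathPropertyExtractor, the property names, and the use of a Map as a stand-in for properties on the document node are all assumptions made for the example. In a real repository, each extracted value would presumably be written to the node with something like the JCR call Node.setProperty(name, value).

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Hippo CMS or Jackrabbit.
final class XPathPropertyExtractor {

    // Evaluates each configured XPath expression against the XML fragment and
    // returns a map of property name -> extracted string value.
    static Map<String, String> extract(String xml, Map<String, String> xpathByProperty)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Map<String, String> properties = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : xpathByProperty.entrySet()) {
            // Use a fresh InputSource per expression, because the reader is
            // consumed by each evaluation.
            InputSource source = new InputSource(new StringReader(xml));
            properties.put(entry.getKey(), xpath.evaluate(entry.getValue(), source));
        }
        return properties;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><title>foo</title><body>bar</body></doc>";
        // The property names below are made up for this example.
        Map<String, String> config = new LinkedHashMap<>();
        config.put("myproject:title", "/doc/title");
        config.put("myproject:body", "/doc/body");
        System.out.println(extract(xml, config));
    }
}
```

An expression that matches nothing yields an empty string here, so a caller may want to skip empty values rather than store them.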

We also have no plugin available for Xopus in the Hippo CMS codebase (yet!). But if you'd like to build one, I'd be very interested and we'd love to help you out :)

Hope this helps,

Arjé

[1] http://www.onehippo.org/cms7/delve_into/user/information_architects/howto/document_types/editor.html
[2] http://www.onehippo.org/cms7/delve_into/custom/howto/customize_editing_template.html
[3] For more information on Xopus, see www.xopus.com
[4] http://jackrabbit.apache.org/jcr-api.html
[5] http://jackrabbit.apache.org/node-types.html

RE: hippo 7 and xml

Ard
Hello Sergei and Arje,

>
> In this way, you have all the power of Xopus and especially
> XML Schema - on the other hand, you don't have the XML
> structure split out in a dedicated nodestructure in the
> repository. If you want that, you might have to build some
> extractors to do that for you. Hippo CMS 7 uses this concept
> of nodetypes as described in the JCR spec [4] [5].

Basically, Jackrabbit has an XML persistence manager that splits XML out
into a node structure. A 10 MB XML file, for example, already results in
a very large node tree, so performance is not optimal.

OTOH, you can store the XML as binary data and have it indexed with an
XML text extractor. In other words, only the extracted text is indexed
by Lucene to make it searchable. This is obviously blistering fast.
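
To make the text extraction step concrete, here is a rough sketch using only the JDK's SAX parser. It covers just the extraction half: it collects the character data of an XML stream into one string, which an indexer could then hand to Lucene. The class name XmlTextExtractor is made up for this example and is not the extractor Jackrabbit or Hippo actually ships.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not the extractor shipped with Jackrabbit or Hippo.
final class XmlTextExtractor {

    // Streams through the XML and returns the text of all elements,
    // separated by single spaces. Markup and attributes are dropped.
    static String extractText(InputStream xml) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            // SAX may deliver one text node in several chunks, so buffer
            // until the next element boundary before trimming.
            private final StringBuilder current = new StringBuilder();

            @Override
            public void characters(char[] ch, int start, int length) {
                current.append(ch, start, length);
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                flush();
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flush();
            }

            private void flush() {
                String chunk = current.toString().trim();
                current.setLength(0);
                if (!chunk.isEmpty()) {
                    if (text.length() > 0) {
                        text.append(' ');
                    }
                    text.append(chunk);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<doc>\n  <title>foo</title>\n  <body>bar</body>\n</doc>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(extractText(new ByteArrayInputStream(xml)));
    }
}
```

Because SAX streams the document, memory use stays flat even for large files, which fits the point that the XML is never expanded into a node tree.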

So plugging in an XML schema, editing the XML content with Xopus, and
storing it again as a binary field is pretty straightforward. The only
thing that then still needs to be added is logical, configurable
metadata that can be searched on. Arje refers to extractors for this,
but metadata indexing might be a more enlightening term for it: for
example, index the <title> field, the <date> field, and so on, to be
able to search on them.

Anyway, what I wanted to add to the discussion (and I haven't yet looked
into the feasibility of the idea) is that I would like to have the XML
available as a node structure while still having the fast binary data
storage. Hippo Repository has added pluggable virtual providers to
Jackrabbit: node structures created on the fly, based on metadata of the
entire repository, filtered structures of the physical content, the
Lucene term space (not yet built), and so on. Note that all these
structures are exposed over JCR, so the client won't notice whether it
is getting physical or virtual nodes.

So, I'm not yet sure whether we can easily accomplish it, but I think a
virtual provider that can deliver an XML binary file as a virtual node
structure would be great. For example, suppose I have the XML file:

<doc>
        <title>foo</title>
        <body>bar</body>
</doc>  

And this xml file would be stored in the node mynode as a binary field
like:

-rootNode
        `-mynode
                - xmldata: stream
                - someprop

And, if you would access this node structure through the virtual
provider, you would get:

-rootNode
        `-mynode
                `-doc
                   |-title
                   `-body
                       

And you could do:

Node n = rootNode.getNode("mynode/doc/title");

This way, I think we could deliver fast XML storage and at the same
time be able to deliver it as a node structure.
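
The idea can be mimicked outside the repository with a small sketch. The code below is not the Hippo virtual provider API (which is not shown in this thread). It is a stand-alone approximation using the JDK's DOM parser: the XML stays stored as bytes, and a path such as "doc/title" is resolved against it only on request. The class name VirtualXmlView and its methods are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical stand-in for a virtual provider; uses DOM nodes, not JCR nodes.
final class VirtualXmlView {

    // Parses the stored XML bytes on demand and walks the slash-separated
    // path one element name at a time. Returns null if any step is missing.
    static Element resolve(byte[] xmlData, String relPath) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlData));
        Node current = document; // plays the role of "mynode" in the tree above
        for (String name : relPath.split("/")) {
            current = firstChildElement(current, name);
            if (current == null) {
                return null;
            }
        }
        return (Element) current;
    }

    // Returns the first child element with the given name, or null.
    private static Node firstChildElement(Node parent, String name) {
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE && child.getNodeName().equals(name)) {
                return child;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        byte[] xmldata = "<doc><title>foo</title><body>bar</body></doc>"
                .getBytes(StandardCharsets.UTF_8);
        Element title = resolve(xmldata, "doc/title");
        System.out.println(title == null ? "not found" : title.getTextContent());
    }
}
```

This sketch reparses the binary on every lookup. A real provider would presumably cache the parsed structure or stream to the requested path.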

Only querying within the virtual node structure would not be possible;
that would need some predefined indexing, which is what Arje is
referring to with extractors.

Regards Ard
