Plone should…

Include something like PloneQueueCatalog, so we can get deferred indexing without SOLR

23 votes
    jonstahl (Admin, Plone) shared this idea

    9 comments

      • Hanno Schlichting commented

        Another way of doing this is to write deferred indexing operations into a "write-ahead log" on one specialized process and have catalog searches take the entries in this log into account. From what Laurence tells me, this is essentially what Postgres does. This would make the "catalog" a specialized process for an entire ZEO cluster, though, and thus introduce some new challenges. It would also have the advantage that you could remove duplicate indexing operations from the queue when processing it.
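A minimal sketch of the write-ahead-log idea, in plain Python rather than real ZODB or catalog code. The class `LoggedCatalog` and its methods are hypothetical names made up for illustration; it only shows how a search could overlay pending log entries on the index, and how duplicates could be collapsed when the log is processed.

```python
class LoggedCatalog:
    """Toy catalog: an index plus a write-ahead log of pending operations."""

    def __init__(self):
        self.index = {}  # uid -> title, the already-indexed state
        self.log = []    # pending (op, uid, title) entries, oldest first

    def queue(self, op, uid, title=None):
        """Record an indexing operation without touching the index."""
        self.log.append((op, uid, title))

    def pending(self):
        """Collapse the log so only the last operation per uid survives."""
        latest = {}
        for op, uid, title in self.log:
            latest[uid] = (op, title)
        return latest

    def search(self, word):
        """Search the index, overlaid with whatever is still in the log."""
        view = dict(self.index)
        for uid, (op, title) in self.pending().items():
            if op == "unindex":
                view.pop(uid, None)
            else:
                view[uid] = title
        return sorted(uid for uid, title in view.items() if word in title)

    def flush(self):
        """Apply the collapsed log to the index; return operations applied."""
        operations = self.pending()
        for uid, (op, title) in operations.items():
            if op == "unindex":
                self.index.pop(uid, None)
            else:
                self.index[uid] = title
        self.log.clear()
        return len(operations)


catalog = LoggedCatalog()
catalog.queue("index", "doc1", "Draft title")
catalog.queue("index", "doc1", "Final title")
found_before_flush = catalog.search("Final")  # the log is consulted here
applied = catalog.flush()                     # two entries collapse into one
```

In this toy version the search sees the new title before the log is flushed, which is the property that would keep navigation from showing stale data.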

      • Wichert commented

        I do not agree with Jon: any delay is bad for the user experience, even a single second. If a user creates a new object and does not see it anywhere after pressing save, or changes a title and still sees the old title, the result will be confusing.

        This could be solved by only deferring some indexes/columns (such as SearchableText). This guarantees that the indexes used for navigation are always up to date. It also avoids the need for new navtree and folder contents code.
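The partial-deferral idea could look roughly like the sketch below. It is plain Python, not Plone's catalog API: `SplitCatalog`, `DEFERRED`, and the method names are assumptions for illustration, and only `SearchableText` is taken from the comment above.

```python
# Assumption: only the expensive full-text index is deferred.
DEFERRED = {"SearchableText"}


class SplitCatalog:
    """Toy catalog that updates cheap indexes now and expensive ones later."""

    def __init__(self, deferred=DEFERRED):
        self.deferred = set(deferred)
        self.indexes = {}  # index name -> {uid: value}
        self.queue = []    # pending (uid, index name, value) tuples

    def index_object(self, uid, values):
        """Index immediately, except for indexes marked as deferred."""
        for name, value in values.items():
            if name in self.deferred:
                self.queue.append((uid, name, value))
            else:
                self.indexes.setdefault(name, {})[uid] = value

    def process_queue(self):
        """Apply the deferred updates; return how many were processed."""
        processed = len(self.queue)
        for uid, name, value in self.queue:
            self.indexes.setdefault(name, {})[uid] = value
        self.queue = []
        return processed


catalog = SplitCatalog()
catalog.index_object("doc1", {"Title": "Hello", "SearchableText": "hello world"})
title_ready = "doc1" in catalog.indexes.get("Title", {})
text_ready = "doc1" in catalog.indexes.get("SearchableText", {})
processed = catalog.process_queue()
```

Right after saving, the title is already visible to navigation-style lookups while the full-text entry waits in the queue.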

      • IanFHood commented

        and ask uservoice to make posts editable by the original author? LOL

      • Andreas Zeidler commented

        wow, better scratch that — copy & paste was playing tricks on me... :) here's the same comment again, but this time only _one_ copy:

        just to clarify: `collective.solr` (or solr alone for that matter) doesn't give you deferred indexing. what `collective.indexing` does is collect indexing calls in order to optimize them, i.e. remove duplicates, and then process them at the transaction boundary. for the typical plone site that already gives you a 10% performance gain for editing operations.
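As a rough illustration of "collect, optimize, then process at the transaction boundary", here is a sketch in plain Python. It does not use the real `collective.indexing` API: `IndexQueue`, its methods, and the merge rules are simplified assumptions, and `commit()` merely stands in for the transaction boundary.

```python
INDEX, REINDEX, UNINDEX = "index", "reindex", "unindex"


class IndexQueue:
    """Collects indexing calls and applies them once, at commit time."""

    def __init__(self, catalog):
        self.catalog = catalog  # plain dict: uid -> data
        self.calls = []         # raw (op, uid, data) calls, in order

    def index(self, uid, data):
        self.calls.append((INDEX, uid, data))

    def reindex(self, uid, data):
        self.calls.append((REINDEX, uid, data))

    def unindex(self, uid):
        self.calls.append((UNINDEX, uid, None))

    def optimized(self):
        """Merge the calls so each uid needs at most one operation."""
        merged = {}
        for op, uid, data in self.calls:
            previous = merged.get(uid)
            if previous and previous[0] == INDEX and op == UNINDEX:
                # Added and removed in the same transaction: nothing to do.
                del merged[uid]
            elif previous and previous[0] == INDEX and op == REINDEX:
                # Still a new object, just with newer data.
                merged[uid] = (INDEX, data)
            else:
                merged[uid] = (op, data)
        return merged

    def commit(self):
        """Apply the merged operations; return how many were needed."""
        merged = self.optimized()
        for uid, (op, data) in merged.items():
            if op == UNINDEX:
                self.catalog.pop(uid, None)
            else:
                self.catalog[uid] = data
        self.calls = []
        return len(merged)


portal_catalog = {}
pending = IndexQueue(portal_catalog)
pending.index("doc1", {"Title": "Draft"})
pending.reindex("doc1", {"Title": "Final"})
pending.reindex("doc1", {"Title": "Final", "Description": "Done"})
pending.index("tmp", {"Title": "Scratch"})
pending.unindex("tmp")
applied = pending.commit()  # five calls shrink to a single catalog write
```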

        so `collective.indexing` is already somewhat useful by itself. however, it also provides an infrastructure to hook up additional index processors. one of these is implemented in `collective.solr` and allows indexing to be dispatched to a solr server. the package also comes with a hook for plone's site search (and other searches as well). the idea here was to use solr's superior search capabilities and speed and reduce the size of plone's catalog and thereby also write conflicts at the same time.

        other indexing processors might be used to index things in additional catalogs within the zodb, for example the one membrane comes with in its latest version. and — coming back to deferred indexing — there's also an implementation of a truly asynchronous index processor (by enfold systems & avail from their public repository), which could be used as an alternative to `PloneQueueCatalog` (while still keeping the added flexibility of the `collective.indexing` approach).
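The pluggable-processor idea, including an asynchronous one, might be sketched like this. Everything here is hypothetical plain Python (`CatalogProcessor`, `AsyncProcessor`, `dispatch`), not the interfaces of `collective.indexing`, `collective.solr`, or the enfold systems package; a worker thread stands in for whatever a real asynchronous processor would use.

```python
import queue
import threading


class CatalogProcessor:
    """Synchronous processor: applies operations to an in-memory index."""

    def __init__(self):
        self.indexed = {}  # uid -> data

    def process(self, operations):
        for op, uid, data in operations:
            if op == "unindex":
                self.indexed.pop(uid, None)
            else:
                self.indexed[uid] = data


class AsyncProcessor:
    """Hands operations to a worker thread and returns immediately."""

    def __init__(self, target):
        self.target = target
        self.jobs = queue.Queue()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def process(self, operations):
        self.jobs.put(list(operations))

    def _run(self):
        while True:
            operations = self.jobs.get()
            self.target.process(operations)
            self.jobs.task_done()

    def wait(self):
        """Block until every queued batch has been processed."""
        self.jobs.join()


def dispatch(operations, processors):
    """Send the same batch of operations to every registered processor."""
    for processor in processors:
        processor.process(operations)


navigation = CatalogProcessor()          # kept up to date synchronously
fulltext = CatalogProcessor()            # updated in the background
background = AsyncProcessor(fulltext)
dispatch([("index", "doc1", {"Title": "Hello"})], [navigation, background])
navigation_ready = "doc1" in navigation.indexed
background.wait()
```

The save request only waits for the synchronous processor; the background one catches up on its own schedule.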

        all of this can be employed today, thereby improving the "save page experience" as jon put it. however, as geir already pointed out, it's not always desirable to index all data asynchronously, so separating concerns by splitting up the current (portal) catalog into many — or for starters into at least two for search and navigation — would be a big win. incidentally, `collective.indexing` could be of some use for such an effort... :)

      • Andreas Jung commented

        The issue with concurrent writes is basically an issue of the ZODB. A big win would be to decrease the transaction size for changes to AT-based content. A 50-100KB transaction for a trivial change to a Plone document is just a performance killer.

      • jonstahl (Admin, Plone) commented

        Really? How long would indexing be likely to be deferred? I mean, I could live with a few seconds (my main objective is to speed up the "save page" experience) until things showed up in navigation & folder contents views.

      • Geir Bækholt commented

        Agreed. Use collective.indexing with deferred indexing, regardless of which search engine one uses. But we will need new navigation tree and folder contents code for this. That is a rather big change; it needs to go in Plone 5, not 4, IMO.
