Include something like PloneQueueCatalog, so we can get deferred indexing without SOLR
9 comments
-
Hanno Schlichting
commented
Another way of doing this is to write deferred indexing operations into a "write-ahead log" on one specialized process and let catalog searches take the entries in this log into account. From what I understand of what Laurence tells me, this is essentially what Postgres does. This would turn the "catalog" into a specialized process for an entire ZEO cluster, though, and thus introduce some new challenges. It would also have the advantage that duplicate indexing operations could be removed from the queue when it is processed.
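A rough sketch of the write-ahead-log idea, in plain Python rather than any real catalog API. The class `LoggedCatalog` and its methods are hypothetical names made up for illustration, and a title substring match stands in for a real query. Searches overlay the pending log on the settled index, and processing collapses duplicate operations per object first.

```python
class LoggedCatalog:
    """Toy catalog: a settled index plus a write-ahead log of pending operations."""

    def __init__(self):
        self.index = {}  # uid -> title, already indexed
        self.log = []    # pending (op, uid, title) tuples, oldest first

    def defer(self, op, uid, title=None):
        # Writers only append to the log; the index itself is not touched.
        self.log.append((op, uid, title))

    def _pending(self):
        # Collapse the log so that only the last operation per uid survives.
        latest = {}
        for op, uid, title in self.log:
            latest[uid] = (op, title)
        return latest

    def search(self, text):
        # Overlay pending operations on the settled index before matching.
        view = dict(self.index)
        for uid, (op, title) in self._pending().items():
            if op == "unindex":
                view.pop(uid, None)
            else:
                view[uid] = title
        return sorted(uid for uid, title in view.items() if text in title)

    def process(self):
        # Apply the collapsed log to the index; return how many operations ran.
        pending = self._pending()
        for uid, (op, title) in pending.items():
            if op == "unindex":
                self.index.pop(uid, None)
            else:
                self.index[uid] = title
        self.log.clear()
        return len(pending)
```

With this shape, an object saved twice before the log is processed is findable under its newest title right away, and costs one indexing operation instead of two once `process()` runs.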
-
Wichert
commented
I do not agree with Jon: any delay is bad for the user experience, even a single second. If a user creates a new object and does not see it anywhere after pressing save, or changes a title and still sees the old title, the result will be confusing.
This could be solved by only deferring some indexes/columns (such as SearchableText). This guarantees that the indexes used for navigation are always up to date. This also prevents the need for new navtree and folder contents code.
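To make the partial-deferral idea concrete, here is a minimal sketch in plain Python. `SplitIndexer` and `DEFERRED_INDEXES` are hypothetical names, not part of Plone or any add-on; the only thing taken from the comment above is that `SearchableText` is a candidate for deferral while navigation-related values stay synchronous.

```python
# Assumption: the set of indexes that are allowed to lag behind.
DEFERRED_INDEXES = {"SearchableText"}


class SplitIndexer:
    """Index cheap values immediately and queue the expensive ones."""

    def __init__(self, deferred=DEFERRED_INDEXES):
        self.deferred = set(deferred)
        self.catalog = {}  # uid -> {index name: value}
        self.queue = []    # (uid, {index name: value}) awaiting later processing

    def index(self, uid, values):
        now = {k: v for k, v in values.items() if k not in self.deferred}
        later = {k: v for k, v in values.items() if k in self.deferred}
        # Navigation-relevant values are visible as soon as this returns.
        self.catalog.setdefault(uid, {}).update(now)
        if later:
            self.queue.append((uid, later))

    def process_queue(self):
        # Run later, e.g. from a background worker, to catch up deferred indexes.
        while self.queue:
            uid, values = self.queue.pop(0)
            self.catalog.setdefault(uid, {}).update(values)
```

The title and path of a new object would show up in listings immediately; only full-text search would lag until `process_queue()` has run.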
-
IanFHood
commented
and ask uservoice to make posts editable by the original author? LOL
-
Andreas Zeidler
commented
wow, better scratch that — copy & paste was playing tricks on me... :) here's the same comment again, but this time only _one_ copy:
just to clarify: `collective.solr` (or solr alone for that matter) doesn't give you deferred indexing. what `collective.indexing` does is collect indexing calls in order to optimize them, i.e. remove duplicates, and then process them at the transaction boundary. for the typical plone site that already gives you a 10% performance gain for editing operations.
so `collective.indexing` is already somewhat useful by itself. however, it also provides an infrastructure to hook up additional index processors. one of these is implemented in `collective.solr` and allows indexing to be dispatched to a solr server. the package also comes with a hook for plone's site search (and other searches as well). the idea here was to use solr's superior search capabilities and speed and reduce the size of plone's catalog and thereby also write conflicts at the same time.
other indexing processors might be used to index things in additional catalogs within the zodb, for example the one membrane comes with in its latest version. and, coming back to deferred indexing, there's also an implementation of a truly asynchronous index processor (by enfold systems, available from their public repository), which could be used as an alternative to `PloneQueueCatalog` (while still keeping the added flexibility of the `collective.indexing` approach).
all of this can be employed today, thereby improving the "save page experience" as jon put it. however, as geir already pointed out, it's not always desirable to index all data asynchronously, so separating concerns by splitting up the current (portal) catalog into many — or for starters into at least two for search and navigation — would be a big win. incidentally, `collective.indexing` could be of some use for such an effort... :)
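for illustration, the collect / optimize / dispatch pattern described above can be sketched in plain python. this is not the `collective.indexing` api: `IndexQueue`, `optimize`, `RecordingProcessor` and the merge rules are assumptions made up for this example, and `commit()` merely stands in for the transaction boundary.

```python
INDEX, REINDEX, UNINDEX = "index", "reindex", "unindex"


def optimize(ops):
    """Merge queued (op, uid) pairs so each uid is processed at most once."""
    merged = {}
    for op, uid in ops:
        prev = merged.get(uid)
        if prev == INDEX and op == REINDEX:
            continue                  # a fresh index already covers the change
        if prev == INDEX and op == UNINDEX:
            del merged[uid]           # added and removed again: no work at all
            continue
        if prev == UNINDEX and op == INDEX:
            merged[uid] = REINDEX     # removed and re-added: one reindex
            continue
        merged[uid] = op
    return [(op, uid) for uid, op in merged.items()]


class RecordingProcessor:
    """Stand-in for a real index processor (zodb catalog, solr, ...)."""

    def __init__(self):
        self.calls = []

    def process(self, op, uid):
        self.calls.append((op, uid))


class IndexQueue:
    def __init__(self, processors):
        self.processors = list(processors)
        self.ops = []

    def add(self, op, uid):
        # Called during the request; nothing is indexed yet.
        self.ops.append((op, uid))

    def commit(self):
        # Stand-in for the transaction boundary: optimize once, then fan out.
        ops = optimize(self.ops)
        self.ops = []
        for processor in self.processors:
            for op, uid in ops:
                processor.process(op, uid)
        return ops
```

an asynchronous processor would fit the same slot as `RecordingProcessor`: its `process()` would hand the operation to a background worker instead of indexing inline.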
-
Andreas Zeidler
commented
just to clarify: `collective.solr` (or solr alone for that matter) doesn't give you deferred indexing. what `collective.indexing` does is collect indexing calls in order to optimize them, i.e. remove duplicates, and then process them at the transaction boundary. for the typical plone site that already gives you a 10% performance gain for editing operations.
so `collective.indexing` is already somewhat useful by itself. however, it also provides an infrastructure to hook up additional index processors. one of these is implemented in `collective.solr` and allows indexing to be dispatched to a solr server. the package also comes with a hook for plone's site search (and other searches as well). the idea here was to use solr's superior search capabilities and speed and reduce the size of plone's catalog and thereby also write conflicts at the same time.
other indexing processors might be used to index things in additional catalogs within the zodb, for example the one membrane comes with in its latest version. and, coming back to deferred indexing, there's also an implementation of a truly asynchronous index processor (by enfold systems, available from their public repository), which could be used as an alternative to `PloneQueueCatalog` (while still keeping the added flexibility of the `collective.indexing` approach).
all of this can be employed today, thereby improving the "save page experience" as jon put it. however, as geir already pointed out, it's not always desirable to index all data asynchronously, so separating concerns by splitting up the current (portal) catalog into many — or for starters into at least two for search and navigation — would be a big win. incidentally, `collective.indexing` could be of some use for such an effort... :)
-
Andreas Jung
commented
The issue with concurrent writes is basically an issue of the ZODB. A big win would be to decrease the transaction size for changes on AT-based content. A 50-100KB transaction size for a stupid change with a Plone document is just a performance killer.
-
T. Kim Nguyen
commented
We need something like this to get a MUCH better story for concurrent writes
-
jonstahl
(Admin, Plone)
commented
Really? How long would indexing be likely to be deferred? I mean, I could live with a few seconds (my main objective is to speed up the "save page" experience) until things showed up in navigation & folder contents views.
-
Geir Bækholt
commented
Agree. collective.indexing with deferred indexing regardless of which search engine one uses. But we will need a new navigation tree and folder contents code for this. Rather big change. Needs to go in Plone 5, not 4 IMO.