Apache Solr Notes - Drupalcon DC

05 Mar 2009
Posted by jcfiala

Apache Solr module
Jacob Singh and Peter Wolanin

cloud based for free on Acquia - drupal.org is using solr.

relevant and fast results in apache solr. Filter down based on metadata - sortable. And content recommendation - that sounds good.

change how you navigate sites - showing craigslist as a bad example.

Information architecture is the science and art of guessing what your users want to see or do on your site and helping them get there.

archetypes - who
content - you got it
organize and prosper - name groups as navigation
sports - menu item, international news - menu item
That's how we make sites - 1996 - what's wrong with that? Primary problem is that websites are more complicated than they used to be and content comes from other places.
also, websites do a lot more than they used to. And they grow, sometimes very fast, and in directions you didn't expect. So, in a month your content and menus are overflowing.

visitors are too varied to think of them all and the content doesn't fit your menu.
amazon.com - handmade paper has no good category.

content is too vast, too many clicks to find things.
content is contributed by users and unpredictable - wrong taxonomy, duplicated terms, etc, etc.

drupal is a social publishing platform - use it so users will add content, and if it's dynamic content, why is navigation static?
So, go with web 2.0 jargon to save the day: tag clouds, content recommendation, social networking, social benchmarking.
how search died and how we bring it back - web 0.5 :)
Search has not been a priority. In an RFP, search is on the list, but the details aren't. Most drupal websites really ignore search.
Big sites emphasize search - top left, big wide, not small and right.

Why did search die?
It was too slow, not smart enough, and couldn't be trusted.

golden rule of ia: no dead ends - user leaves site when they reach a dead end.
Search on g.d.o - did search for job results, did not actually get any jobs until page 5.

Put same data into solr - filter by type block, showing jobs!
Find out where to look - filter by type, event type, etc, sort by date, etc.
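
To make the filter and sort idea concrete, here is a minimal sketch of how such a request could be put together as a Solr query string. The parameters q, fq, sort, facet and facet.field are standard Solr request parameters; the field names "type" and "created", the host, and the build_solr_query helper are assumptions for illustration, not what the module actually sends.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, keywords, node_type=None, sort_field=None, descending=True):
    """Build a Solr select URL with an optional type filter and sort order."""
    params = [
        ("q", keywords),
        ("facet", "true"),
        ("facet.field", "type"),  # "type" is a hypothetical field name
    ]
    if node_type:
        # fq narrows the result set without changing relevance scoring
        params.append(("fq", f"type:{node_type}"))
    if sort_field:
        direction = "desc" if descending else "asc"
        params.append(("sort", f"{sort_field} {direction}"))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host and field names: show only jobs, newest first.
url = build_solr_query(
    "http://localhost:8983/solr",
    "drupal developer",
    node_type="job",
    sort_field="created",
)
print(url)
```

The filter goes in fq rather than in q, so picking "jobs" in the filter block narrows the results without changing how the keywords are scored.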

stemming - search for 'moderating' and get 'moderator', 'moderators', and the French 'moderation'. spelling suggestions - 'did you mean drupal' for a 'drupaul' search.
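
A toy sketch of the stemming idea: strip a few suffixes at index time and at query time, so different forms of a word land on the same index entry. The suffix list and helper names are made up for illustration; Solr relies on Lucene's analyzers and proper stemming algorithms, not on anything this crude.

```python
# Longest suffixes first so "ors" is tried before "or" and "s".
SUFFIXES = ("ions", "ion", "ing", "ors", "or", "s")


def stem(word):
    """Reduce a word to a crude stem by stripping one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Map each stem to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(stem(word), set()).add(doc_id)
    return index


def search(index, term):
    """Look up a term by its stem and return matching ids in order."""
    return sorted(index.get(stem(term), set()))


docs = {
    1: "forum moderator wanted",
    2: "moderators meeting notes",
    3: "politique de moderation",
    4: "solr search module",
}
index = build_index(docs)
print(search(index, "moderating"))
```

A search for 'moderating' finds the three documents that only contain other forms of the word, because all of them reduce to the same stem.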

Drupal not in dictionary, but solr looks at your data to decide on spellings.
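
A small sketch of what "looks at your data" could mean: the suggestion vocabulary is built from the indexed text itself, so 'drupal' is a known word even though no dictionary has it. This uses Python's difflib for the fuzzy match, which is a stand-in and not Solr's spellchecker; the sample documents and helper names are assumptions.

```python
import difflib


def build_vocabulary(docs):
    """Collect every lowercased word that appears in the indexed text."""
    vocab = set()
    for text in docs:
        vocab.update(word.lower() for word in text.split())
    return vocab


def suggest(term, vocab, cutoff=0.7):
    """Return the closest known word, or None if the term is already known."""
    term = term.lower()
    if term in vocab:
        return None
    matches = difflib.get_close_matches(term, sorted(vocab), n=1, cutoff=cutoff)
    return matches[0] if matches else None


docs = ["Drupal modules for search", "Installing Drupal with Solr"]
vocab = build_vocabulary(docs)
print(suggest("drupaul", vocab))
```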

shaping results - result biasing. More comments can be rated higher in the list, promoted to home page, etc. Lots of ways to shape traffic.
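
A rough sketch of result biasing, assuming each result already has a relevance score from the search engine. The formula, the weights and the sample results are invented for illustration; the talk did not give the actual boost calculation.

```python
import math


def biased_score(relevance, comment_count=0, promoted=False,
                 comment_weight=0.1, promoted_boost=1.5):
    """Scale a relevance score by comment activity and front-page promotion."""
    # log1p keeps a thread with 400 comments from drowning everything else.
    score = relevance * (1.0 + comment_weight * math.log1p(comment_count))
    if promoted:
        score *= promoted_boost
    return score


# Hypothetical results with made-up relevance scores.
results = [
    {"title": "Quiet post", "relevance": 1.0, "comments": 0, "promoted": False},
    {"title": "Busy thread", "relevance": 0.9, "comments": 40, "promoted": False},
    {"title": "Front page story", "relevance": 0.8, "comments": 2, "promoted": True},
]
ranked = sorted(
    results,
    key=lambda r: biased_score(r["relevance"], r["comments"], r["promoted"]),
    reverse=True,
)
for r in ranked:
    print(r["title"])
```

With these weights the promoted story and the busy thread both move ahead of the post with the best raw keyword match.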

Not vaporware - been working on it for a while. Dries is using it, human rights watch, cms report, 'learn by the drop' - other sites as well.
Drupal.org - buy drinks for d.o. admins. - can use search on drupal.org again!

Tailor made navigation for every user. :)

Stop trying to think for our users, let them find what they want and use the language they want to use. Turning it over to Peter for details.

stable and proven, written in java, uses lucene - top open source search library, distributed - scales out.

solr - Acquia developments, contributed back.
You can run solr yourself - 12 or so steps to set it all up.
Acquia is going to do it - sign up on acquia.com and install search module and turn it on. They will take care of all else.

search data goes to master server, replicated to slave servers, and they respond when you do searches.
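
A toy model of that flow, assuming the slaves copy the index from the master and only the slaves answer searches. The class and method names are hypothetical and the "index" is just a dict; this is about the shape of the setup, not how Solr replication is implemented.

```python
import itertools


class Master:
    """Receives all index updates; never answers searches in this sketch."""

    def __init__(self):
        self.docs = {}
        self.version = 0

    def index(self, doc_id, text):
        self.docs[doc_id] = text
        self.version += 1


class Replica:
    """Holds a copy of the master's index and answers searches."""

    def __init__(self):
        self.docs = {}
        self.version = 0

    def pull(self, master):
        # Copy the index only when the master has something newer.
        if master.version > self.version:
            self.docs = dict(master.docs)
            self.version = master.version

    def search(self, term):
        term = term.lower()
        return sorted(d for d, text in self.docs.items() if term in text.lower())


master = Master()
replicas = [Replica(), Replica()]
master.index(1, "Apache Solr module")
master.index(2, "Drupal job board")
for replica in replicas:
    replica.pull(master)

# Spread searches across the replicas in turn.
next_replica = itertools.cycle(replicas)
hits = next(next_replica).search("solr")
print(hits)
```

Writes go to one place and reads are spread over the copies, which is why adding slaves helps under heavy search load.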
search results run in usually < 200 ms, even under high load.
Small and medium size sites - easy access to enterprise search.
large sites and acquia partners - all taken care of for you.

Get projects done faster.
Currently free for beta - leaves core search as it is. Free 30 day trial.
screencasts of solr:

quick look at the admin interface:
Different methods of shaping results again - weights by node type, types to exclude from search index... weight for tags, etc.
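
As a sketch of what those admin settings amount to: a per-type weight table plus a list of types that never reach the index. The type names, the weights and the helper functions are hypothetical, not the module's real settings.

```python
# Hypothetical admin settings: a weight per node type and types to skip.
TYPE_WEIGHTS = {"story": 2.0, "page": 1.0, "forum": 0.5}
EXCLUDED_TYPES = {"webform"}


def indexable(nodes):
    """Drop nodes whose type is excluded from the search index."""
    return [n for n in nodes if n["type"] not in EXCLUDED_TYPES]


def weighted(relevance, node_type):
    """Multiply relevance by the type weight; unknown types stay neutral."""
    return relevance * TYPE_WEIGHTS.get(node_type, 1.0)


nodes = [
    {"nid": 1, "type": "story"},
    {"nid": 2, "type": "webform"},
    {"nid": 3, "type": "forum"},
]
kept = indexable(nodes)
print([n["nid"] for n in kept], weighted(1.0, "story"), weighted(1.0, "forum"))
```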

roadmap - what's the killer feature?
want to know what the best feature is.
File attachment indexing very popular, so is geographic search

Date faceting nearly ready!
Part of range faceting - returning data like price ranges, date ranges, etc.
filter by modification date: 2006, 2007, 2008, 2009 etc.

by year, month, day, god forbid, by hour - drill down of faceting.
Also, it shows in a breadcrumb - elit > 2007 > September 2007 > September 21, 2007.
no dead ends! Can click back through breadcrumb on search.
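
A small sketch of the drill-down, done in plain Python over a list of dates rather than with Solr's own date faceting. Each level counts documents per bucket inside the bucket picked one level up, and the breadcrumb is the trail of picks. The helper names and sample dates are made up; 'elit' is the search term from the demo.

```python
from collections import Counter
from datetime import date


def facet_key(d, level):
    """Return the bucket a date falls in at the given level."""
    if level == "year":
        return (d.year,)
    if level == "month":
        return (d.year, d.month)
    return (d.year, d.month, d.day)


def facet_counts(dates, level, within=()):
    """Count dates per bucket at `level`, restricted to the bucket `within`."""
    counts = Counter()
    for d in dates:
        key = facet_key(d, level)
        if key[: len(within)] == within:
            counts[key] += 1
    return dict(counts)


def breadcrumb(keywords, selected):
    """Build the trail: the search terms, then each level drilled into."""
    trail = [keywords]
    for depth in range(1, len(selected) + 1):
        trail.append("-".join(f"{part:02d}" for part in selected[:depth]))
    return " > ".join(trail)


dates = [
    date(2006, 5, 1), date(2007, 9, 21), date(2007, 9, 21),
    date(2007, 9, 30), date(2007, 11, 2), date(2008, 1, 15),
]
print(facet_counts(dates, "year"))
print(facet_counts(dates, "month", within=(2007,)))
print(facet_counts(dates, "day", within=(2007, 9)))
print(breadcrumb("elit", (2007, 9, 21)))
```

Every entry in the trail is a bucket that still has results, so stepping back through it never lands on an empty page.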

One more way that apache solr gives you a lot of options for users.

BoF: Hands on with apache solr search - 3:45 on Friday
UX Testers wanted at acquia blog.

questions - can't search panels; no views integration yet, but it is coming.