Chapter 11: Curry Indexer and search
In this chapter we will discuss the Indexer and how to customize search results. The Indexer is a backend module that indexes the content of each page module. A page module must have the "Search Visibility" checkbox selected in order to be visible to the Indexer.
If you are using the Indexer for the first time in a project, you will first need to create the index. You can then update the index in future by rebuilding it.
The index is stored on disk rather than in the database, in the path cms/data/searchindex. You can find a number of other interesting folders in the cms/data/ folder. For example, database JSON backups are stored in the backup folder, files deleted from the File Finder are stored in the trash folder, and so on. I am not sure why the log folder is there though. Curry_Core::log never puts its logs into that folder. I use Monolog instead.
Curry uses Zend Lucene for its search engine.
Create a search page
After creating the site index, we must provide site visitors with a page where they can enter keywords and another page where search results can be presented to them.
Let's create the following pages:
- Demo/Search
- Demo/Search/Results
On the Demo/Search/Results page, add the core Search module to the content target.
Now, on the Search page we need to add a form in which a site visitor can type search keywords. The form submits to the Results page. The Search module on the Results page looks for a URL query parameter named query. If it is found, the Search module does the necessary processing to yield the results.
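To make the hand-off between the two pages concrete, here is a minimal sketch of the URL that a GET form requests when it submits a field named query. The helper name buildResultsUrl is a hypothetical illustration and not part of Curry; only the parameter name query and the page path come from the text above.

```php
<?php
// Hypothetical helper, for illustration only: builds the URL that a GET form
// with a single "query" field would request on submit.
function buildResultsUrl($resultsPath, $keywords)
{
    // http_build_query() URL-encodes the value (a space becomes "+").
    return $resultsPath . '?' . http_build_query(array('query' => $keywords));
}

echo buildResultsUrl('demo/search/results/', 'recycling plant'), "\n";
// prints: demo/search/results/?query=recycling+plant
```

The Search module only needs that one parameter, so any link or form that produces a URL of this shape should trigger a search.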
Create a new Twig template named SearchForm.html.twig and type in the following code:
<form method="get" action="demo/search/results/">
    <dl>
        <dt><label for="query">Query</label></dt>
        <dd><input id="query" type="text" name="query" required value="{{ curry.request.get.query }}" /></dd>
        <dt></dt>
        <dd><button type="submit">Search</button></dd>
    </dl>
</form>
Now, I am not going to write a page module just to put this content onto the web page. Curry provides a core module named Template for a scenario like this. Add a Template module to the Demo/Search page and have it render SearchForm.html.twig.
Now, test the search functionality. Type "article" in the search form and you should see some results.
Index model classes
The search functionality that we created above indexes page modules only. It is not able to index a model class as yet. If you type the name of a company, you will not see any valid results.
To make a model searchable, the model class must implement the Curry_ISearchable interface. This interface declares the getSearchDocument method, which should return a Zend_Search_Lucene_Document object. The returned object must contain the following Zend_Search_Lucene_Field fields: title, body, description and url. You can add additional fields to the index. For example, if I want to customize my search I can store the model instance id field in the index. In this case I would store it as an UnIndexed field, which is stored with the document but is not searchable.
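As a rough illustration of such an extra field, the sketch below builds a document with the four required fields plus an UnIndexed id. The function name buildCompanyDocument and the sample values are hypothetical. The sketch also assumes that Zend Framework 1 is on the include path. Inside a Curry project the Zend classes are already loaded and the require_once line is not needed.

```php
<?php
// Assumption: Zend Framework 1 is on the include path.
require_once 'Zend/Search/Lucene.php';

// Hypothetical helper, for illustration only.
function buildCompanyDocument($id, $name, $body, $url)
{
    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::Text('title', $name));
    $doc->addField(Zend_Search_Lucene_Field::Text('description', $name));
    $doc->addField(Zend_Search_Lucene_Field::Text('body', $body));
    $doc->addField(Zend_Search_Lucene_Field::Keyword('url', $url));
    // UnIndexed: stored with the hit so it can be read back, but not searchable.
    $doc->addField(Zend_Search_Lucene_Field::UnIndexed('id', $id));
    return $doc;
}

$doc = buildCompanyDocument(42, 'Acme Recycling', 'Paper and glass', 'demo/search/');
echo $doc->getFieldValue('id'), "\n";
```

Because the field is stored, code that customizes the search results can read the id back from a hit and load the matching model instance.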
For more information on Zend_Search_Lucene, please refer to the documentation at https://framework.zend.com/manual/1.12/en/zend.search.lucene.overview.html.
Let's make the Company model indexable. Modify the model class in the path cms/propel/build/classes/project/ to implement the Curry_ISearchable interface. Then define the getSearchDocument() method as follows:
class Company extends BaseCompany implements Curry_ISearchable
{
    // Code removed to focus on relevant portions

    /**
     * Return a document to index or null to
     * skip indexing this object.
     * @return Zend_Search_Lucene_Document
     */
    public function getSearchDocument()
    {
        $doc = new Zend_Search_Lucene_Document();
        $doc->addField(Zend_Search_Lucene_Field::Text('title', $this->getName()));
        // if we had some description field, we could use it here.
        $doc->addField(Zend_Search_Lucene_Field::Text('description', strip_tags($this->getName())));
        $doc->addField(Zend_Search_Lucene_Field::Text('body', strip_tags($this->getBody())));
        $doc->addField(Zend_Search_Lucene_Field::Keyword('url', $this->getUrl()));
        return $doc;
    }

    protected function getBody()
    {
        $html=<<<HTML
<pre>
Company Name: {$this->getName()}
Org Nbr: {$this->getOrgNumber()}
City: {$this->getCity()->getName()}
Recycling types: {$this->getRecyclingTypeList()}
</pre>
HTML;
        return $html;
    }

    public function getRecyclingTypeList()
    {
        $list = RecyclingTypeQuery::create()
            ->useCoRtQuery()
                ->filterByCompany($this)
            ->endUse()
            ->find()
            ->toKeyValue('PrimaryKey', 'Name');
        return implode(', ', $list);
    }

    // we don't have a company page as yet
    // so return an empty string
    // or the url of the search page.
    public function getUrl()
    {
        return 'demo/search/';
    }

    // rest of code...
}
Perfect! Now Rebuild and Optimize the index with Curry Indexer and test. You can test in the backend to see if you get hits.
The backend will only show you the fields that have been indexed. You can see the actual results in the frontend.
Clicking the links is of no use because we will simply be redirected to the search page. That's the URL that we specified for each company object in the index. Maybe I should show you a useful feature provided by Curry to create such dynamic pages. Let's learn to work with Model Routes in the next chapter.
Update the index dynamically
Let's say you have a database of products. The editor may add product information to the database but may not want to display some products on the front end. In such a case, you may not want to index such products. Furthermore, when these products are ready to be displayed, you may want to add them individually to the index.
One way to handle this scenario is to add an is_visible field to the product table. When an editor saves a product in the backend, you would check the value of the is_visible field and either index the product or remove the "hit" from the index.
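The model side of this idea can be sketched as follows: getSearchDocument() returns null for a hidden product, which signals that the object should not be indexed. ProductSketch is a hypothetical stand-in, not a real Propel class. A real Product model would extend BaseProduct, implement Curry_ISearchable and read is_visible through its generated getter. The sketch also assumes that Zend Framework 1 is on the include path.

```php
<?php
// Assumption: Zend Framework 1 is on the include path.
require_once 'Zend/Search/Lucene.php';

// Hypothetical stand-in for a Propel-generated Product model.
class ProductSketch
{
    private $name;
    private $isVisible;

    public function __construct($name, $isVisible)
    {
        $this->name = $name;
        $this->isVisible = (bool) $isVisible;
    }

    /**
     * @return Zend_Search_Lucene_Document|null null means "do not index"
     */
    public function getSearchDocument()
    {
        if (!$this->isVisible) {
            return null; // hidden products stay out of the index
        }
        $doc = new Zend_Search_Lucene_Document();
        $doc->addField(Zend_Search_Lucene_Field::Text('title', $this->name));
        $doc->addField(Zend_Search_Lucene_Field::Text('description', $this->name));
        $doc->addField(Zend_Search_Lucene_Field::Text('body', $this->name));
        $doc->addField(Zend_Search_Lucene_Field::Keyword('url', 'demo/search/'));
        return $doc;
    }
}

$hidden = new ProductSketch('Prototype', false);
$shown = new ProductSketch('Glass crusher', true);
var_dump($hidden->getSearchDocument() === null); // bool(true)
```

Keeping the visibility check inside the model means a full rebuild and a single-item update apply the same rule.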
In the backend module, check what type of form you created to edit a product item. This form is set as the value of the modelForm property of Curry_ModelView_List. If the form is of type Curry_Form_ModelForm, you will have to create an instance of Curry_ModelView_Form. After that you will have to hook into the PostSave hook.
Let's say your code is similar to this skeleton:
...
$form = new Curry_Form_ModelForm('Product', array(
    // Additional options
));
$list = new Curry_ModelView_List('Product', array(
    'modelForm' => $form,
    ...
));
...
You will then have to make the following adjustments:
...
$form = new Curry_Form_ModelForm('Product', array(
    // Additional options
));
$mf = new Curry_ModelView_Form($form);
$mf->setPostSave(array($this, 'afterProductSave'));
$list = new Curry_ModelView_List('Product', array(
    'modelForm' => $mf,
    ...
));
...
In the PostSave hook we write the code to either index or unindex the product.
public function afterProductSave(Product $product, Curry_Form_ModelForm $form)
{
    /** @var Zend_Search_Lucene_Document|null $doc */
    $doc = $product->getSearchDocument();
    $index = Curry_Core::getSearchIndex();
    $modelId = serialize($product->getPrimaryKey());
    $hits = $index->find('model:Product');
    // remove document from index.
    foreach ($hits as $hit) {
        // model_id is the serialized model id stored in the index (see below).
        // $hit->id is Lucene's internal document number, which delete() expects.
        if ($hit->model_id == $modelId) {
            $index->delete($hit->id);
            break;
        }
    }
    // NOTE: If the is_visible field is false, Product::getSearchDocument()
    // should return null. In this case, we have unindexed the product (above).
    if ($doc) {
        // add document to index.
        $doc->addField(Zend_Search_Lucene_Field::Keyword('model', get_class($product)));
        $doc->addField(Zend_Search_Lucene_Field::Keyword('model_id', $modelId));
        $index->addDocument($doc);
    }
    ...
}
The code above updates the index but does not optimize it. You may want to do this too. I will leave this decision to you. If you are updating the index with a cron job, then I recommend that you optimize the index once, at the end of the job. You can read about index optimization at https://framework.zend.com/manual/1.12/en/zend.search.lucene.index-creation.html#zend.search.lucene.index-creation.optimization
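A batch job along those lines might look like the sketch below, which adds all documents first and calls optimize() once at the end. The function name indexBatch, the sample titles and the throwaway index in the temp directory are assumptions made for illustration. A Curry project would obtain its index from Curry_Core::getSearchIndex() instead. The sketch also assumes that Zend Framework 1 is on the include path.

```php
<?php
// Assumption: Zend Framework 1 is on the include path.
require_once 'Zend/Search/Lucene.php';

// Hypothetical batch job: add every document, then optimize once at the end
// rather than after each addDocument() call.
function indexBatch(Zend_Search_Lucene_Interface $index, array $titles)
{
    foreach ($titles as $title) {
        $doc = new Zend_Search_Lucene_Document();
        $doc->addField(Zend_Search_Lucene_Field::Text('title', $title));
        $index->addDocument($doc);
    }
    $index->commit();   // flush pending documents to disk
    $index->optimize(); // merge index segments once, after all updates
    return $index->numDocs();
}

// Throwaway index for the sketch only.
$path = sys_get_temp_dir() . '/search-sketch-' . uniqid();
$index = Zend_Search_Lucene::create($path);
echo indexBatch($index, array('Glass crusher', 'Paper baler')), "\n";
```

Optimizing merges the index segments, which costs time on a large index, so doing it once per job is usually cheaper than doing it after every save.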