EZP-14742 (eZ Publish / Platform)

Add support for character mapping in Solr

Details

    Description

      While the version of Solr bundled with eZ Find supports solr.MappingCharFilterFactory, it may be an early implementation (see https://issues.apache.org/jira/browse/SOLR-822 ), and it does not support solr.CharStreamAwareWhitespaceTokenizerFactory, which appears to be required (see http://wiki.apache.org/solr/SchemaDesign?highlight=(MappingCharFilterFactory)#head-cbd09984c67526fbfde825739d72e9c37139f52c ).

      I've tried using MappingCharFilterFactory to map characters, with no success.
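The behaviour being asked for can be illustrated outside Solr. The following is a minimal Python sketch, not eZ Find or Solr code: `analyze` and `matches` are hypothetical helpers, and the analysis chain (character mapping, whitespace split, lowercasing) is an assumption about what the "text" field type does rather than a copy of the shipped schema.

```python
# Assumed macron-to-ASCII mapping; each source is a single precomposed character.
MACRON_MAP = str.maketrans("āĀēĒīĪōŌūŪ", "aAeEiIoOuU")


def analyze(text, use_char_filter=True):
    """Toy analyzer: optional character mapping, whitespace split, lowercase."""
    if use_char_filter:
        # The mapping runs on the raw character stream, before tokenization.
        text = text.translate(MACRON_MAP)
    return [token.lower() for token in text.split()]


def matches(document, query, use_char_filter=True):
    """True if every analyzed query token occurs among the document tokens."""
    indexed = set(analyze(document, use_char_filter))
    return all(token in indexed for token in analyze(query, use_char_filter))


if __name__ == "__main__":
    print(matches("Te reo Māori", "maori"))                         # mapping on
    print(matches("Te reo Māori", "maori", use_char_filter=False))  # mapping off
```

With the mapping applied on both the index and the query side, "maori" and "Māori" reduce to the same token. Without it they do not, which is consistent with the failure described here.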

      Steps to reproduce

      1. Edit ezfind/java/solr/conf/schema.xml and add the charFilter line below to both the "query" and "index" analyzers:

      <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
          <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-macrons.txt"/>
      

      2. Create a file in the same directory called mapping-macrons.txt and paste the following:

      "ā" => a
      "Ā" => A
      "ē" => e
      "Ē" => E
      "ī" => i
      "Ī" => I
      "ō" => o
      "Ō" => O
      "ū" => u
      "Ū" => U
      

      3. Add some of the Unicode characters from mapping-macrons.txt to an XML block in a content object on your eZ Publish site, e.g. "Māori".

      4. Restart Solr and re-index your site using the --clean option.

      5. Search for the mapped equivalent, e.g. "maori".
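Before blaming the analyzer, it may be worth checking the mapping file from step 2 on its own. The following is a minimal Python sketch under stated assumptions: `parse_rules`, `apply_rules` and the `RULE` pattern are hypothetical and do not reproduce Solr's parser. The pattern deliberately accepts both the bare targets used in step 2 and quoted targets; the mapping files shipped with Solr quote both sides (for example `"À" => "A"`), so that difference may be worth ruling out.

```python
import re

# Hypothetical rule pattern: quoted source, "=>", then a quoted or bare target.
RULE = re.compile(r'^\s*"(?P<src>[^"]+)"\s*=>\s*"?(?P<dst>[^"\s]*)"?\s*$')

# Mapping rules as listed in step 2.
MACRONS = '''
"ā" => a
"Ā" => A
"ē" => e
"Ē" => E
"ī" => i
"Ī" => I
"ō" => o
"Ō" => O
"ū" => u
"Ū" => U
'''


def parse_rules(text):
    """Return a {source: target} dict built from mapping-file text."""
    rules = {}
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comment lines
        match = RULE.match(line)
        if match is None:
            raise ValueError(f"unparseable rule: {line!r}")
        rules[match.group("src")] = match.group("dst")
    return rules


def apply_rules(rules, text):
    """Replace each mapped source string, trying longer sources first."""
    for src in sorted(rules, key=len, reverse=True):
        text = text.replace(src, rules[src])
    return text


if __name__ == "__main__":
    print(apply_rules(parse_rules(MACRONS), "Māori"))
```

To check the real file rather than the inline copy, read it with `open(path, encoding="utf-8").read()` and pass the result to `parse_rules`. A decoding error or an unparseable rule at that point would suggest the file rather than Solr is at fault.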

          People

            pborgerm
            gbentley