Opened 3 years ago
Last modified 3 years ago
#22 new defect
BibIndex (fulltext): Demo fulltext searching on inspire-hep-dev doesn't find 'rattazzon'
| Reported by: | tbrooks | Owned by: | |
|---|---|---|---|
| Priority: | minor | Milestone: | |
| Component: | BibIndex | Version: | |
| Keywords: | | Cc: | |
Description
Surprisingly:
astro-ph/0607086
does not find 'rattazzon' even though it is in the fulltext (see the snippet for the "honor theorist" search in fulltext).
This may be due to its enclosure in quotes in the text...

The problem seems to be due to non-ASCII UTF-8 quotes.
If one searches for ‘rattazzon’, one finds it:
http://inspire-hep-dev.cern.ch/search?p=‘rattazzon’&f=fulltext
The current word breaking sequencer handles only ASCII quotes.
That is, ` not ‘, and ' not ’.
We should add all the common UTF-8 characters of that kind
to the config.
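
A minimal sketch of the effect, not the actual BibIndex code: the separator sets and the `make_tokenizer` helper below are hypothetical names invented for illustration, and the real configuration may express its separators differently. It only shows why a word wrapped in ‘ ’ is not indexed as a bare word when the word breaker knows ASCII quotes alone.

```python
import re

# Hypothetical separator sets for illustration; not taken from the real config.
ASCII_QUOTES = "`'\""
# Common Unicode quotation marks: curly single/double quotes, low-9 quotes,
# and single/double guillemets.
UNICODE_QUOTES = (
    "\u2018\u2019\u201a\u201b\u201c\u201d\u201e\u201f"
    "\u00ab\u00bb\u2039\u203a"
)


def make_tokenizer(separators):
    """Return a function splitting text on whitespace and the given characters."""
    pattern = re.compile("[\\s" + re.escape(separators) + "]+")

    def tokenize(text):
        # Drop the empty strings produced by leading or trailing separators.
        return [token for token in pattern.split(text) if token]

    return tokenize


ascii_only = make_tokenizer(ASCII_QUOTES)
with_unicode = make_tokenizer(ASCII_QUOTES + UNICODE_QUOTES)

sample = "honor theorist \u2018rattazzon\u2019 here"

# With ASCII quotes only, the curly quotes stay glued to the word.
print(ascii_only(sample))
# With the Unicode quotes added, the bare word becomes its own token.
print(with_unicode(sample))
```

With the ASCII-only set the indexed term is ‘rattazzon’ including its quotes, which matches the observation that searching for the quoted form succeeds while the bare word does not.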