Configuring Apache Tika in Solr
Let's go ahead and create a new core called tika-example in our Solr instance. To make things easier, you can copy the core from the Chapter 6 folder of the ZIP file that comes with this book. After creating the core, we'll need to configure solrconfig.xml.
We need to load the extraction libraries from the %SOLR_HOME%/contrib/extraction/lib folder, along with the solr-cell library, by adding the following lines to solrconfig.xml:
<lib dir="${solr.install.dir:../../..}/contrib/extraction/lib" regex=".*\.jar"/>
<lib dir="${solr.install.dir:../../..}/dist/" regex="solr-cell-\d.*\.jar"/>
We can then configure ExtractingRequestHandler in solrconfig.xml:
<requestHandler name="/update/extract" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="fmap.content">content</str>
    <str name="lowernames">true</str>
    <str name="uprefix">attr_</str>
    <str name="captureAttr">false</str...