SOLR Index & search on multiple datasource - mysql

I have a problem searching across two data sources. When I run a full import, I can see that all my records are imported, but when I search, the results contain only records from the second dataSource.
In my data-config.xml:
<document>
<entity name="one" dataSource="ds-1" query="SELECT * FROM artist">
<field column="name" name="name" />
</entity>
<entity name="two" dataSource="ds-2" query="SELECT * FROM faqdata">
<field column="thema" name="thema" />
</entity>
</document>
And in my schema.xml:
<fields>
<field name="id" type="int" indexed="true" stored="true" required="true" />
<field name="slug" type="string" indexed="false" stored="true"/>
<field name="name" type="text" indexed="true" stored="true" />
<field name="alt_name" type="text" indexed="false" stored="true"/>
<field name="created_at" type="date" indexed="false" stored="true"/>
<field name="updated_at" type="date" indexed="false" stored="true"/>
<field name="thema" type="text" indexed="true" stored="true" />
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
<dynamicField name="*" type="ignored" multiValued="true" />
</fields>
<uniqueKey>id</uniqueKey>
<defaultSearchField>text</defaultSearchField>
<solrQueryParser defaultOperator="OR"/>
<copyField source="name" dest="text"/>
<copyField source="thema" dest="text"/>
What is the problem?
Thanks.

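IDs in Solr need to be unique.
If you insert entities with the same ID, the previous record gets overwritten.
Solr does not update records in place. It deletes and reinserts them.
Here both entities write to the same id field, so a faqdata row replaces any artist row that has the same id.
If you want both sets of records, define an id that is unique across both tables.
e.g. prefix the id with the table name so that artist and faqdata rows don't overwrite each other (in MySQL use CONCAT, because || is a logical OR there by default):
SELECT A.*, CONCAT('ARTIST_', A.id) AS PRIMARY_ID FROM artist A
SELECT A.*, CONCAT('FAQDATA_', A.id) AS PRIMARY_ID FROM faqdata A
and use PRIMARY_ID as the primary id and unique field. It is no longer numeric, so that field needs the type string rather than int.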
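As a rough sketch of how that could look in the question's files: the alias primary_id and the explicit column lists are assumptions for illustration. Selecting named columns instead of * is meant to keep the raw numeric id column from also reaching the id field.

```xml
<!-- data-config.xml (sketch): each entity builds an id that is unique across both tables -->
<document>
  <entity name="one" dataSource="ds-1"
          query="SELECT CONCAT('ARTIST_', id) AS primary_id, name FROM artist">
    <field column="primary_id" name="id" />
    <field column="name" name="name" />
  </entity>
  <entity name="two" dataSource="ds-2"
          query="SELECT CONCAT('FAQDATA_', id) AS primary_id, thema FROM faqdata">
    <field column="primary_id" name="id" />
    <field column="thema" name="thema" />
  </entity>
</document>

<!-- schema.xml (sketch): the prefixed id is text, so the field type becomes string -->
<field name="id" type="string" indexed="true" stored="true" required="true" />
```

After changing the type of id, a full re-import into an emptied index is the safe way to pick up the new ids.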

Related

Multiple indexes in Solr

I want to index two tables from MySQL using Apache Solr. Please see my data-config and schema files below.
<dataConfig>
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/test" user="root" password="root" batchSize="1" />
<document name="tb_location">
<entity name="tb_location" query="SELECT * FROM tb_location">
<field column="loc_code" name="id"/>
<field column="loc_code" name="loc_code"/>
<field column="loc_name" name="loc_name"/>
<field column="loc_name" name="loc_name_ci"/>
<field column="ADM1_FULL_NAME" name="state"/>
</entity>
</document>
<document name="person">
<entity name="person" query="SELECT * FROM person">
<field column="id" name="personid"/>
<field column="fname" name="fname"/>
<field column="lname" name="lname"/>
<field column="town" name="town"/>
</entity>
</document>
</dataConfig>
Schema.xml
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="loc_code" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="loc_name" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="loc_name_ci" type="string_ci" indexed="true" stored="true" required="true" multiValued="false" />
<field name="state" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="personid" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="fname" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="lname" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="town" type="string" indexed="true" stored="true" required="true" multiValued="false" />
I also created a unique id for each table (id and personid). But when I run the data import, nothing is fetched or indexed. Can someone help me figure out where exactly the problem is?
Please check the below link for Multiple indexes...
Multiple indexes
Fixed it! The data-config.xml should be as follows, with both entities inside a single <document> element.
<dataConfig>
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/test" user="root" password="root" batchSize="1" />
<document name="tb_location">
<entity name="tb_location" query="SELECT * FROM tb_location">
<field column="loc_code" name="id"/>
<field column="loc_code" name="loc_code"/>
<field column="loc_name" name="loc_name"/>
<field column="loc_name" name="loc_name_ci"/>
<field column="ADM1_FULL_NAME" name="state"/>
</entity>
<entity name="person" query="SELECT * FROM person">
<field column="id" name="personid"/>
<field column="fname" name="fname"/>
<field column="lname" name="lname"/>
<field column="town" name="town"/>
</entity>
</document>
</dataConfig>
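One thing that may still bite with this layout: every field in the schema above is marked required="true", but a tb_location document never carries fname or town, and a person document never carries loc_code. A hedged sketch of a schema that tolerates both document shapes follows. It assumes id is the uniqueKey and that the person entity also supplies a value for it, for instance through an aliased column in its query (hypothetical, not shown in the original).

```xml
<!-- schema.xml (sketch): only the shared key is required; per-table fields are optional -->
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="loc_code" type="string" indexed="true" stored="true" multiValued="false" />
<field name="loc_name" type="string" indexed="true" stored="true" multiValued="false" />
<field name="loc_name_ci" type="string_ci" indexed="true" stored="true" multiValued="false" />
<field name="state" type="string" indexed="true" stored="true" multiValued="false" />
<field name="personid" type="string" indexed="true" stored="true" multiValued="false" />
<field name="fname" type="string" indexed="true" stored="true" multiValued="false" />
<field name="lname" type="string" indexed="true" stored="true" multiValued="false" />
<field name="town" type="string" indexed="true" stored="true" multiValued="false" />
<uniqueKey>id</uniqueKey>
```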

New FIELDS are not showing in search

I did a basic Solr setup, configured the DataImportHandler, created a very simple data config file with two fields, and indexed it. It all worked fine. But now I am adding new fields and running a full import afterwards, and for some reason the new fields are just not showing up in the search results (I am using the Solr admin interface to search). I have tried restarting Solr and running a config reload, to no effect.
This is my data config file. I'm not sure what's wrong here.
<dataConfig>
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/msl4" user="root" password=""/>
<document>
<entity name="hub_contents" query="select * from hub_contents" deltaQuery="select * from hub_contents where last_modified > '${dataimporter.last_index_time}'">
<field column="id_original" name="id" />
<field column="title" name="title" />
<field column="parent_id" name="parent_id" />
<field column="item_type" name="item_type" />
<field column="status" name="status" />
<field column="updated_at" name="updated_at" />
</entity>
</document>
</dataConfig>
You can add the fields below to your schema.xml:
<field name="id" type="long" indexed="true" stored="true"/>
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="parent_id" type="long" indexed="true" stored="true"/>
<field name="item_type" type="text_general" indexed="true" stored="true"/>
<field name="status" type="text_general" indexed="true" stored="true" />
<field name="updated_at" type="date" indexed="true" stored="true"/>
Which type (fieldType) you use for each field is up to you, depending on your requirements.
indexed: true if this field should be indexed (searchable or sortable)
stored: true if this field should be retrievable
Also add the tag below:
<uniqueKey>id</uniqueKey>
It is used to determine and enforce document uniqueness.

Solr: Indexed and stored field returning cannot be queried

Solr 4.8.1. I have a field called age, which stores a single character like A or C, and it is defined as:
<field name="age" type="text_general" indexed="true" stored="true"/>
When I get results back from other searches I can see the field age and its value but when I search for example age:* it returns 0 results. This just happened recently as I have been working with this field for a month and it worked fine but now nothing returns. I altered the schema a few times but nothing regarding this field. The only thing I can think of is that I accidentally put an invalid value into the age field of the mysql database that I import from, but fixed that and re-imported it.
I have searched for this problem and found suggestions that <defaultSearchField> needs to be set, but those results were older and that element is now deprecated.
EDIT:
My data config is:
<dataConfig>
<dataSource type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/gtw"
user="root"
password=""/>
<document>
<entity name="id" query="select id,price,title,description,main_image,retailer,link,age,gender,type,category,creation_date from solr_listings">
<field column="category" name="category" splitBy=","></field>
</entity>
</document>
</dataConfig>
The only thing that is different from the default example schema are the fields I added below:
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="title" type="text_general" indexed="true" stored="true" />
<field name="description" type="text_general" indexed="true" stored="true"/>
<field name="retailer" type="text_general" indexed="true" stored="true"/>
<field name="category" type="text_general" indexed="true" stored="true" />
<field name="main_image" type="text_general" indexed="true" stored="true"/>
<field name="last_modified" type="date" indexed="true" stored="true"/>
<field name="link" type="string" indexed="true" stored="true" />
<field name="gender" type="text_general" indexed="true" stored="true"/>
<field name="age" type="text_general" indexed="true" stored="true"/>
<field name="type" type="text_general" indexed="true" stored="true" />
<field name="creation_date" type="date" indexed="true" stored="true" />
<field name="price" type="float" indexed="true" stored="true"/>

Solr data indexing , returns only one field

I am trying to use Solr to index data from my database.
After I index the data, when I query *.*
I get just the id field in the results, not all the fields I had in my query.
My data-config.xml:
<document name="content">
<entity name="documen" query="SELECT indexId ,brand_id, category_id, product_name from Production">
<field column="indexId" name="id" />
<field column="category_id" name="categoryid" />
<field column="brand_id" name="brandid" />
<field column="product_name" name="id" />
</entity>
</document>
My schema.xml looks like this :
<field name="id" type="int" indexed="true" stored="true" required="true"/>
<field name="categoryid" type="int" indexed="true" stored="true"/>
<field name="brandid" type="int" indexed="true" stored="true" />
<field name="productname" type="string" indexed="true" stored="true"/>
When I query using *.* I get
<doc>
<str name="id">1</str>
<long name="_version_">1426653005792411648</long></doc>
<doc>
<str name="id">2</str>
<long name="_version_">1426653005793460224</long></doc>
<doc>
I get only the "id" field in the results.
Actually, whatever field is in the <uniqueKey> tag is what gets returned in the query results.
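Looking at the config above, product_name is mapped to id, and no column is mapped to the productname field declared in the schema. A minimal sketch of the mapping as it was presumably intended (this is an assumption about intent, not a confirmed fix):

```xml
<document name="content">
  <entity name="documen" query="SELECT indexId, brand_id, category_id, product_name FROM Production">
    <field column="indexId" name="id" />
    <field column="category_id" name="categoryid" />
    <field column="brand_id" name="brandid" />
    <!-- product_name now targets the productname schema field instead of id -->
    <field column="product_name" name="productname" />
  </entity>
</document>
```

The response also shows id as a <str> although the schema declares it as int, which may be worth checking: it could mean the running core is reading a different schema.xml than the one being edited.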

Required field missing when importing dates from Mysql to Solr

I'm having problems getting solr and mysql dates to play nice. If I comment out the sent field from the schema everything works fine. However, as soon as I add back in the date field I get this error for every document.
org.apache.solr.common.SolrException: [doc=116] missing required field: sent
Here's how I have Solr configured. I've checked to make sure that there are no empty/null dates, and there are none. I've also tried dateTimeFormat=yyyy-MM-dd'T'hh:mm:ss and no dateTimeFormat at all. I've also tried both date and tdate as the type of sent in the schema.
data-config.xml
<dataConfig>
<dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/hoplite" user="root" password="root"/>
<document>
<entity name="document" query="select * from document">
<field column="ID" name="id" />
<field column="RAW_TEXT" name="raw_text" />
<entity name="email" query="select * from email where document_id='${document.id}'">
<field column="TIME_SENT" name="sent" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'"/>
<field column="BODY" name="body" />
</entity>
</entity>
</document>
</dataConfig>
schema.xml
<field name="id" type="tint" indexed="true" stored="true" required="true" />
<field name="raw_text" type="text_general" indexed="true" stored="false" required="true" multiValued="true"/>
<field name="sent" type="date" indexed="true" stored="true" required="true" /> <!-- Import succeeds if I comment this line out -->
<field name="body" type="text_general" indexed="true" stored="true" required="true" />
Apparently for dates the field name has to be the same as the column name. So changing the files to the below fixed the problem. Note that time_sent is now both the column and field name.
data-config.xml
<dataConfig>
<dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/hoplite" user="root" password="root"/>
<document>
<entity name="document" query="select * from document">
<field column="ID" name="id" />
<field column="RAW_TEXT" name="raw_text" />
<entity name="email" query="select * from email where document_id='${document.id}'">
<field column="TIME_SENT" name="time_sent" dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss'Z'"/>
<field column="BODY" name="body" />
</entity>
</entity>
</document>
</dataConfig>
schema.xml
<field name="id" type="tint" indexed="true" stored="true" required="true" />
<field name="raw_text" type="text_general" indexed="true" stored="false" required="true" multiValued="true"/>
<field name="time_sent" type="date" indexed="true" stored="true" required="true" />
<field name="body" type="text_general" indexed="true" stored="true" required="true" />
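For context, dateTimeFormat is an attribute read by DIH's DateFormatTransformer, and it is only applied when the entity names that transformer. A minimal sketch, assuming TIME_SENT is a string column holding values like 2013-01-31T14:05:00Z (HH is the 24-hour pattern letter, hh is the 12-hour one):

```xml
<entity name="email"
        transformer="DateFormatTransformer"
        query="select * from email where document_id='${document.id}'">
  <!-- parsed with SimpleDateFormat rules; the pattern must match the stored string exactly -->
  <field column="TIME_SENT" name="time_sent" dateTimeFormat="yyyy-MM-dd'T'HH:mm:ss'Z'" />
  <field column="BODY" name="body" />
</entity>
```

If TIME_SENT is a native DATETIME or TIMESTAMP column, the JDBC driver should already hand over a date object, in which case the dateTimeFormat attribute can probably be dropped altogether.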