Thanks, I edited my data-config.xml file.
It now looks like this:
<dataConfig>
<dataSource type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/lol"
user="root"
password="n"/>
<document name="content">
<entity name="id">
query="SELECT id from foo"
</entity>
</document>
</dataConfig>
When I open
http://localhost:8983/solr/dataimport?command=full-import
in the browser, I get this response:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </lst>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Time Elapsed">0:0:6.299</str>
    <str name="Total Requests made to DataSource">1</str>
    <str name="Total Rows Fetched">0</str>
    <str name="Total Documents Processed">0</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2013-06-28 11:17:34</str>
    <str name="">Indexing failed. Rolled back all changes.</str>
    <str name="Rolledback">2013-06-28 11:17:34</str>
  </lst>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>
I believe the configuration should look more like this (note that query should be an attribute of the entity element, not text inside it):
<document name="content">
<entity name="id" query="SELECT id from foo">
<!--I assume you have a field in Solr and a column in MySQL, both of which are named "id"-->
</entity>
</document>
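Putting the fix together with the dataSource block from the question, the whole data-config.xml might look like the sketch below. The explicit field mapping is an assumption: it presumes the Solr schema defines a field named id, as the comment above says.

```xml
<dataConfig>
  <!-- Connection settings copied unchanged from the question -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/lol"
              user="root"
              password="n"/>
  <document name="content">
    <!-- query is an attribute of <entity>, not text inside the element -->
    <entity name="id" query="SELECT id from foo">
      <!-- Assumption: schema.xml defines a field named "id" -->
      <field column="id" name="id"/>
    </entity>
  </document>
</dataConfig>
```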
I want to use Solr with MongoDB and MySQL together, and I need to combine both in a single core.
For example, I have a MongoDB collection that depends on a table in MySQL.
I tried each with a separate Solr core and that works fine, but I want both in a single core. I don't know whether that is possible, and if it is, how to set it up.
Updated
Here are my DIH (DataImportHandler) configurations:
- Solr with MySQL
<dataConfig>
<dataSource
name="MySQl"
type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/test"
user="root" password="root"
batchSize="-1"/>
<document>
<entity
query="select * from master_table"
name="master">
</entity>
</document>
</dataConfig>
- Solr with MongoDB
<dataConfig>
<dataSource
name="MyMongo"
type="MongoDataSource"
database="test" />
<document>
<entity
processor="MongoEntityProcessor"
query=""
collection="MarketCity"
datasource="MyMongo"
transformer="MongoMapperTransformer"
name="sample_entity">
<field column="_id" name="id" mongoField="_id" />
<field column="keyName" name="keyName" mongoField="keyName"/>
</entity>
</document>
</dataConfig>
So I want to do this with a single core.
You can read the data from MySQL and MongoDB, merge the matching records into a single record, and index that record into Solr.
To get the data from MySQL, use any programming language to fetch it.
For example, you can use Java to fetch the data from MySQL.
Apply the same logic to MongoDB: fetch all the required records from MongoDB using Java.
Now use the SolrJ APIs to create a SolrInputDocument. Read more about SolrInputDocument and the other APIs in the SolrJ documentation.
Once you have created the SolrInputDocument instance, add the data that you fetched from MySQL and MongoDB to it using the method below.
addField(String name, Object value)
This adds a field to the document.
You can prepare the document like this:
SolrInputDocument document = new SolrInputDocument();
document.addField("id", "123456");
document.addField("name", "Kevin Ross");
document.addField("price", "100.00");
solr.add(document);   // send the document to Solr
solr.commit();        // make the change visible to searches
Here, solr is an instance of HttpSolrClient.
Once the SolrInputDocument is ready, index it into Solr with add() and commit() as shown.
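If you would rather stay with the DataImportHandler than write SolrJ code, one option that may be worth trying is a single data-config.xml that declares both data sources. DIH allows several named dataSource elements in one config, and each entity selects one through its dataSource attribute. The sketch below simply places the two configurations from the question side by side as two root entities. It is untested, it assumes the third-party MongoDataSource and MongoEntityProcessor behave the same when a second data source is present, and it assumes both entities supply a value for the schema's unique key. It indexes MySQL rows and MongoDB documents as separate Solr documents; it does not merge a MongoDB record with its MySQL row, which is what the SolrJ approach does.

```xml
<dataConfig>
  <!-- Both data sources are named so that each entity can select one -->
  <dataSource name="MySQl"
              type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/test"
              user="root" password="root"
              batchSize="-1"/>
  <dataSource name="MyMongo"
              type="MongoDataSource"
              database="test"/>
  <document>
    <!-- Rows from MySQL; dataSource refers to the name declared above -->
    <entity name="master"
            dataSource="MySQl"
            query="select * from master_table">
    </entity>
    <!-- Documents from MongoDB, copied from the question.
         Assumption: the attribute is spelled dataSource (capital S),
         as DIH expects, rather than datasource as in the question. -->
    <entity name="sample_entity"
            processor="MongoEntityProcessor"
            query=""
            collection="MarketCity"
            dataSource="MyMongo"
            transformer="MongoMapperTransformer">
      <field column="_id" name="id" mongoField="_id"/>
      <field column="keyName" name="keyName" mongoField="keyName"/>
    </entity>
  </document>
</dataConfig>
```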
I'm using Solr 4.4, and my config file is given below.
The first time, I did a full import of 40000 rows, and they were indexed.
Now my application adds one more row, so the total count is 40001. Do I need
to do a full import or a delta import?
I understand that a delta import applies to a row that is already indexed.
What is the approach when a new row is added to MySQL? Do we need to run a full
import of all 40001 rows?
<dataConfig>
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/mydb" user="uname" password="pwd" batchSize="1" />
<document name="resource">
<entity name="resource" query="SELECT * FROM resource"
deltaImportQuery="SELECT * FROM resource WHERE ref = '${dataimporter.delta.ref}'"
deltaQuery="SELECT * FROM resource WHERE last_modified > '${dataimporter.last_index_time}'" transformer="RegexTransformer">
<field column="ref" name="ref"/>
<field column="name" name="name"/>
................
............
</entity>
</document>
</dataConfig>
Here is a good article from the Solr wiki that describes an efficient approach: running a delta import via full import. Have a look at https://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport . Hope this helps :)
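As a rough sketch of what that article describes, applied to the resource entity from the question: the main query gets a WHERE clause that matches every row on a normal full import, but only rows modified since the last run when the request carries clean=false. Two assumptions are made here: that ref is the primary key (it is the key used in the question's deltaImportQuery), and that last_modified is set when a row is inserted, so that new rows are picked up as well.

```xml
<entity name="resource" pk="ref" transformer="RegexTransformer"
        query="SELECT * FROM resource
               WHERE '${dataimporter.request.clean}' != 'false'
                  OR last_modified &gt; '${dataimporter.last_index_time}'">
  <field column="ref" name="ref"/>
  <field column="name" name="name"/>
  <!-- remaining fields as in the original config -->
</entity>
```

With this in place, command=full-import&clean=false should import only the new and changed rows, while a plain command=full-import still rebuilds the whole index.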
I can't figure out how to import the correct datetime from MySQL into Solr via the DataImportHandler. After the import, the datetime values have 2 hours subtracted.
In MySQL, "created_at 2013-04-05 15:04:21" becomes "created_at":"2013-04-05T13:04:21Z" in Solr.
In MySQL, @@global.time_zone and @@session.time_zone are both SYSTEM and display the correct CET time.
Here is my data-config.xml:
<dataConfig>
<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/test"
user="+++" password="++++"/>
<document>
<entity name="id"
query="SELECT table.created_at, ... from table">
<field column="created_at" name="created_at"/>
</entity>
</document>
</dataConfig>
I tried to use the CONVERT_TZ function. It worked when I ran the query in MySQL directly, but with Solr I had no success: the created_at value is then not indexed at all.
<entity name="id"
query="SELECT CONVERT_TZ(table.created_at,'+00:00','+01:00'), ... from table">
<field column="created_at" name="created_at"/>
</entity>
Try this; I used it in my own indexing:
$query= "SELECT DATE_FORMAT(CONVERT_TZ(table.created_at,'+00:00','+01:00'),'%Y-%m-%dT%TZ'),.. FROM table ";
See the Solr DateField manual.
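In a DIH data-config.xml the same idea might look like the sketch below. The key detail is the AS created_at alias: without it MySQL names the result column after the whole expression, so the field mapping for created_at finds nothing, which would explain why the value was not indexed in the question. The offsets and the placeholder table name are copied from the snippets above and are not verified against the asker's server settings.

```xml
<entity name="id"
        query="SELECT DATE_FORMAT(CONVERT_TZ(table.created_at,'+00:00','+01:00'),
                                  '%Y-%m-%dT%TZ') AS created_at
               FROM table">
  <!-- The alias makes the result column name match this mapping -->
  <field column="created_at" name="created_at"/>
</entity>
```

Keep in mind that Solr stores and returns dates in UTC, so a value shifted this way carries a Z suffix while actually holding local time.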
I want to index a MySQL table with Solr 4.0, row by row. I have installed the necessary Java. My database is called 'twitter_db' and the table I want to index is called "tweets".
I log in as user root with no password.
I added the schema to the data config as follows:
<dataConfig>
<dataSource driver="org.hsqldb.jdbcDriver" url="jdbc:mysql://localhost/twitter_db" user="root" password="" />
<document name="tweet">
<entity name="tweet" query="select * from tweets">
<field column="tweet_id" name="tweet_id" />
<field column="text" name="text" />
<field column="user" name="user" />
<field column="tweet_time" name="tweet_time" />
<field column="topic_kw" name="topic_kw" />
<field column="timestamp" name="timestamp" />
</entity>
</document>
</dataConfig>
and the change to solrconfig.xml is:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">db-data-config.xml</str>
</lst>
</requestHandler>
When I hit [root]:8983/solr/db/dataimport?command=full-import
for a full import,
it fails. The error message in the GUI is:
Indexing failed. Rolled back all changes.
and part of the error message in the log is:
SEVERE: Exception while processing: tweet document : SolrInputDocument[]:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select * from tweets Processing Document # 1
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:71)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:252)
at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:209)
I wonder whether I specified the wrong database or table.
I looked at a similar question describing a similar problem, but I didn't find the answer there.
Check driver="org.hsqldb.jdbcDriver": that is the HSQLDB driver, but it should point to the MySQL driver class. Try updating the driver class to the appropriate driver for MySQL; you can also run the import in debug mode.
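For illustration, the corrected dataSource element might look like the sketch below. com.mysql.jdbc.Driver is the classic MySQL Connector/J driver class, the same one used in the other configurations on this page; the sketch assumes the Connector/J jar is available on Solr's classpath.

```xml
<!-- MySQL driver class instead of the HSQLDB one; URL and credentials unchanged -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost/twitter_db"
            user="root"
            password=""/>
```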
I have Solr 4.0 up and running and using DataImportHandler to import data from MySQL.
I have noticed that if I point DataImportHandler at a MySQL 5.5 data source, everything works as expected. However, when using exactly the same Solr/DataImportHandler config and exactly the same database running on MySQL 5.0, certain fields come back base64 encoded.
Relevant entries in data-config.xml
<dataSource type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
name="DB-SOURCE"
url="jdbc:mysql://dbhost/dbname"
user="user"
password="password"
/>
<document name="articles">
<entity name="article_ph" transformer="HTMLStripTransformer" dataSource="DB-SOURCE" pk="article_id"
query="SELECT 'Politics Home' AS article_site,
CONCAT('ph-article-', article_id) AS article_id,
article_title,
article_text_plain AS article_content,
article_articletype_id,
article_datetime AS article_date,
'Uncategorised' AS article_section,
'Non Member' AS article_source
FROM articles
WHERE
article_datetime!='0000-00-00 00:00:00'
AND article_datetime is NOT NULL
AND article_live=1
AND article_text_plain!=''
AND article_text_plain IS NOT NULL
AND article_title is NOT NULL
AND article_title !=''">
<field column="ARTICLE_SITE" name="article_site" />
<field column="ARTICLE_ID" name="article_id" />
<field column="ARTICLE_TITLE" name="article_title" />
<field column="ARTICLE_CONTENT" name="article_content" stripHTML="true" />
<field column="ARTICLE_DATE" name="article_date" />
<field column="ARTICLE_SECTION" name="article_section" />
<field column="ARTICLE_SOURCE" name="article_source" />
<entity name="articletype_name" dataSource="DB-SOURCE"
query="SELECT
articletype_name
FROM articletypes
WHERE articletype_id='${article_ph.article_articletype_id}'">
<field column="articletype_name" name="article_type"/>
</entity>
</entity>
When I run import pointing at MySQL 5.5 I get :
<arr name="article_id"><str>ph-article-124</str></arr>
When I run import pointing at MySQL 5.0 I get articles with base64-encoded IDs :
<arr name="article_id"><str>cGgtYXJ0aWNsZS0xMjQ=</str></arr>
All other fields come back correctly.
Collation and character sets on both DBs are the same.
Any help appreciated.
Try converting it back to a string:
CONCAT('ph-article-', CAST(article_id AS CHAR(50)))