Solr index multiple tables from MySQL - mysql

I have the following MySQL tables:
1. user(user_id,email)
2. tweets(tweet_id,user_id,tweet)
3. tags(tag_id,tag)
4. tweets_tags(tweet_id,tag_id)
I want to show the current user's tweets under the "My Tweets" tab in the application. I want to get the following data from Solr:
user_id
email
tweet where user_id=x
tags where tweet_id=xx
How do I index those MySQL tables in Solr? I only want to know the contents of schema.xml and data-config.xml for full/delta import.
Note: I am not asking about the MySQL connector etc.; I have already set that up.

The use case you've described doesn't seem to justify using Solr. You could just make sure you have proper keys and indexes and do it in MySQL directly.
If for some reason you MUST use Solr, you could prepare all the data and feed it to Solr in a user/tweet/tag structure like this:
user1 - tweet1 - tag1
user1 - tweet1 - tag2
user1 - tweet2 - tag1
and so on.
Then in Solr you query by user, and sort and group by tweet and then by tag.
However, I must state again that this is safer to implement in plain SQL, and you can be more confident in the result.
If you provide more details on your desired outcome, I'd be happy to suggest the database structure along with the necessary foreign keys and indexes and the queries you need to get your data out.
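For illustration, here is a minimal sketch of the plain-SQL route, written in Python against an in-memory SQLite database so it can be run as-is. The table and column names come from the question; the index name, the sample rows, and the helper names `build_demo_db` and `my_tweets` are assumptions made for the example. The same JOIN should carry over to MySQL with its own connector and placeholder style.

```python
import sqlite3

def build_demo_db():
    """Create the four tables from the question with keys and one index."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user (user_id INTEGER PRIMARY KEY, email TEXT);
        CREATE TABLE tweets (
            tweet_id INTEGER PRIMARY KEY,
            user_id INTEGER REFERENCES user(user_id),
            tweet TEXT
        );
        CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, tag TEXT);
        CREATE TABLE tweets_tags (
            tweet_id INTEGER REFERENCES tweets(tweet_id),
            tag_id INTEGER REFERENCES tags(tag_id),
            PRIMARY KEY (tweet_id, tag_id)
        );
        -- Assumed index name; speeds up "tweets where user_id = x".
        CREATE INDEX idx_tweets_user ON tweets(user_id);

        -- Made-up sample rows.
        INSERT INTO user VALUES (1, 'a@example.com'), (2, 'b@example.com');
        INSERT INTO tweets VALUES (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'other');
        INSERT INTO tags VALUES (1, 'solr'), (2, 'mysql');
        INSERT INTO tweets_tags VALUES (1, 1), (1, 2), (2, 1);
        """
    )
    return conn

def my_tweets(conn, user_id):
    """Return one dict per tweet of the given user, each with its tag list."""
    rows = conn.execute(
        """
        SELECT u.user_id, u.email, t.tweet_id, t.tweet, g.tag
        FROM user u
        JOIN tweets t ON t.user_id = u.user_id
        LEFT JOIN tweets_tags tt ON tt.tweet_id = t.tweet_id
        LEFT JOIN tags g ON g.tag_id = tt.tag_id
        WHERE u.user_id = ?
        ORDER BY t.tweet_id, g.tag
        """,
        (user_id,),
    ).fetchall()
    by_tweet = {}
    for uid, email, tweet_id, tweet, tag in rows:
        entry = by_tweet.setdefault(
            tweet_id,
            {"user_id": uid, "email": email, "tweet": tweet, "tags": []},
        )
        if tag is not None:  # tweets without tags come back with a NULL tag
            entry["tags"].append(tag)
    return list(by_tweet.values())
```

Calling `my_tweets(build_demo_db(), 1)` returns the two tweets of user 1, each carrying the e-mail address and a list of tags, which is the data the "My Tweets" tab needs.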

If you are using DIH (DataImportHandler), I think this link should be the solution for you:
Import with sub entities
If you have trouble writing the exact configuration, let me know and I can help.

Related

LUIS to MySQL query - Azure Chatbot

How can I generate MySQL queries with LUIS and fetch data from a DB hosted in Azure?
It should turn a natural language question into a MySQL query.
e.g.
How much beer was drunk at the Oktoberfest 2018?
--> GET amountOfBeer FROM Oktoberfest WHERE Year ==2018;
Does anyone have an idea how to get this to work?
I have already created small intents in LUIS, e.g. GetAmountOfBeer.
I don't know how to generate the MySQL statements or how to get the data from the DB.
Thanks.
You should be able to achieve this, or something similar, using intents and entities. How successful this can be depends on how many and how diverse your queries need to be. First, let's start with the phrase you mentioned: "How much beer was drunken on the oktoberfest 2018". You can easily (as you've done) add this as an utterance for an intent, GetAmountOfBeer. Though I'm a fan of intent names that you can read as "I want to GetAmountOfBeer", here you may want to name the intent amountOfBeer so you can use it in your query directly.
Next you need to set up your entities. For year (or datetime rather) that should be easy, as I believe there are some predefined entities for this. I think you need to use a datetime recognizer to parse out the right attribute (like year), but I haven't tried to do this before. Next, Oktoberfest seems to be a specific holiday or event in your DB, so you could create a list entity of all the events you have.
What you are left with is something like (pseudocode) GET topIntent FROM eventEntity WHERE Year ==datetime.Year, or something like that.
If your query set is more complex, you might have to have multiple GET statements, but you could put those in a switch statement by topIntent so that, no matter what the intent is, you can parse out the correct values. You also might want to build this into a dialog where you can check if the entities exist, and if not, you can prompt the user for the missing data.
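To make the pseudocode more concrete, here is a rough Python sketch of mapping an intent and its entities onto a parameterized query. It does not call LUIS at all: the `entities` dict shape, the two lookup tables, and the function name `build_query` are assumptions for the example, and a real LUIS response would first have to be reduced to this form. Column and table names are taken from whitelists, so user text never ends up in the SQL string itself.

```python
# Hypothetical whitelists: intent name -> column, event entity -> table.
INTENT_COLUMNS = {"amountOfBeer": "amountOfBeer"}
EVENT_TABLES = {"oktoberfest": "Oktoberfest"}

def build_query(top_intent, entities):
    """Return (sql, params) for a recognized intent, or raise an error.

    `entities` is assumed to be a plain dict such as
    {"event": "Oktoberfest", "year": "2018"}.
    """
    column = INTENT_COLUMNS.get(top_intent)
    if column is None:
        raise ValueError(f"unsupported intent: {top_intent!r}")

    # Collect missing entities so a dialog can prompt the user for them.
    missing = [name for name in ("event", "year") if not entities.get(name)]
    if missing:
        raise LookupError("missing entities: " + ", ".join(missing))

    table = EVENT_TABLES.get(entities["event"].lower())
    if table is None:
        raise ValueError(f"unknown event: {entities['event']!r}")

    # "?" is the SQLite placeholder style; MySQL connectors typically use "%s".
    sql = f"SELECT {column} FROM {table} WHERE Year = ?"
    return sql, (int(entities["year"]),)
```

The `LookupError` branch is where a dialog could ask the user for the missing event or year before trying again.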

Firebase Database: how to compare two values

In my Firebase database, I have a data structure similar to this:
The post ID (1a3b3c4d5e) is generated by the childByAutoId() function.
The user ID (fn394nf9u3) is the UID of the user.
In my app, I have a UILabel (author) and I would like to update it with the 'full name' of the user who created the post.
Since I have a reference to the post ID in the users part of the database, I assume there must be some code (if statement?) to check if the value exists and if so, update the label.
Can you help with that?
While it is possible to do the query (ref.child("Users").queryOrdered(byChild: "Posts/1a3b3c4d5e").queryEqual(toValue:true)), you will need to have an index on each specific user's posts to allow this query to run efficiently. This is not a feasible strategy.
As usual when working with NoSQL databases: if you need to do something that your current data model doesn't allow, change your data model to allow the use-case.
In this case that can either be adding the UID of the user to each post, or alternatively adding the user name to each post (as Andre suggests) and deciding if/how you deal with user name changes.
Having such relational data in both directions to allow efficient lookups in both directions is very common in NoSQL databases such as Firebase and Firestore. In fact I wrote a separate answer about dealing with many-to-many relations.
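As a rough illustration of the two-direction model, here is a small Python sketch that uses a plain dict in place of the database. It does not use the Firebase SDK; the key name `authorUid`, the sample values, and the helper `author_name` are assumptions for the example. The point is only that once the post stores the author's UID, finding the name takes two direct reads and no query.

```python
# A plain dict standing in for the database tree (not the Firebase SDK).
db = {
    "Posts": {
        "1a3b3c4d5e": {
            "title": "Hello User",
            "authorUid": "fn394nf9u3",  # assumed key: link from post to user
        },
    },
    "Users": {
        "fn394nf9u3": {
            "full name": "Example Name",    # made-up value
            "Posts": {"1a3b3c4d5e": True},  # existing link from user to post
        },
    },
}

def author_name(tree, post_id):
    """Look up the author's full name with two direct reads."""
    post = tree["Posts"].get(post_id)
    if post is None:
        return None
    user = tree["Users"].get(post.get("authorUid"))
    return user.get("full name") if user else None
```

In the app, the second read would be a single-value fetch on the user's node, whose result then goes into the author label.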
If you can change the structure, that would be best, because I don't think the current database structure is well suited to this.
Add one more key, createdBy, inside the Post node, so the structure would actually be:
{description:"Thus the post is here", title:"Hello User", createdBy:"Javed Multani"}
Once you do this, it will be very easy to get the user's details.
OR
A workaround (not recommended):
When you load posts from the post node of Firebase, you will get the auto-generated post ID, like:
1a3b3c4d5e
First fetch only the posts. Then, once the data has been fetched and parsed successfully, fetch the users and search within each user with a condition like postId == UserPostId. If a match is found, take the fullname value from there.

How to speed up solr DIH with subqueries

I would like to speed up the DIH for a solr configuration that has the following structure:
user entity (mapped to user table)
the user entity has 1..n values, each mapped to a field of the user entity, so there are n additional fields
every field is gathered through a subquery on the value table.
example:
entity:user (select * from user)
user has the following fields:
value_1: (select * from value where uid=user.id and category=1)
value_2: (select * from value where uid=user.id and category=2)
value_3: (select * from value where uid=user.id and category=3)
As there are many subqueries, the import takes too long.
What's the best approach to this using Solr and DIH (MySQL)?
You can speed things up using SortedMapBackedCache
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler#UploadingStructuredDataStoreDatawiththeDataImportHandler-EntityProcessors
I have dealt with this exact same issue, and at heart the problem is that neither MySQL nor Solr's SQL DIH has the built-in capacity to use a field value to help name a MySQL result column or a Solr field.
Ideally, you could do something like this: THIS DOESN'T WORK!
<entity name="value" query="select myfield, category from t1 where uid=${user.id}">
  <field column="myfield" name="value_${value.category}"/>
</entity>
Without this wonderful, non-existent feature, there are several ways to get similar functionality with less convenience.
This page shows a great analysis of two different methods for creating this functionality, either with the ScriptTransformer (he found it simple to implement but it slowed down the import badly), or the TemplateTransformer (which requires you to compile a very short Java snippet, but is apparently much more efficient).
Again, this is likely the solution you want.
In my own case, I hadn't found this solution, and instead wrote a short Java program to make the SQL requests, build the SolrInputDocument and then submit them in batches to Solr. And then later the whole thing was made irrelevant when we decided to put all of the values into Solr as a single JSON-encoded field.
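That Java program isn't shown here, so the following is only a Python sketch of the same idea: read the value table once, pivot the rows into value_<category> fields per user, and hand the resulting documents over in batches. The column names `uid`, `category`, and `myfield` follow the examples above; the `name` column, the function names, and the batch helper are assumptions, and the step that actually sends each batch to Solr is left out.

```python
from collections import defaultdict

def build_user_docs(conn):
    """Build one dict per user with value_<category> list fields.

    `conn` is any DB-API connection whose execute() returns rows,
    for example a sqlite3 connection.
    """
    # One pass over the value table instead of one subquery per user and category.
    values_by_uid = defaultdict(dict)
    rows = conn.execute(
        "SELECT uid, category, myfield FROM value "
        "ORDER BY uid, category, myfield"
    )
    for uid, category, myfield in rows:
        values_by_uid[uid].setdefault(f"value_{category}", []).append(myfield)

    docs = []
    # "name" is a hypothetical user column; select whatever fields you index.
    for uid, name in conn.execute("SELECT id, name FROM user ORDER BY id"):
        doc = {"id": uid, "name": name}
        doc.update(values_by_uid.get(uid, {}))
        docs.append(doc)
    return docs

def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each batch from `batches(build_user_docs(conn), 1000)` could then be posted to Solr with whatever client you use. For a very large value table you would want to stream and group the rows rather than hold them all in memory.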
Good luck!

Correct foreign key approach in JPA, or basic approach to saving data

This question is somewhat subjective.
I am getting a list of objects from a third-party site,
and I want to save that data in a database.
Suppose the data is a List. It is the response to a query that I sent to that site.
I want to save two things:
1) query name
2) the response(List) (answer)
The myobject can have many answers corresponding to my query. I want to save all of these answers separately so that each answer can be fetched independently.
My current DB approach is:
one table for the query and query id
a second table consisting of the query id and the query answer, where the query id is a foreign key to the first table
My question is: am I following the right approach?
Initially I thought of saving the whole list in the database, but as far as I know we cannot save a list in a database directly, although with JPA 2.0 we can save a list in the DB (correct me if I am wrong).
Please guide me on my current approach, or tell me if there is a better one.
I am using JPA 2.0 with EclipseLink.
Regards
Anil Sharma
What is your object model?
You can use OneToMany or ManyToMany to store a collection of Entity objects.
If you have a List of basic values or of embeddable objects, you can store it using an ElementCollection.
But you may be better off creating an Answer or AnswerReference Entity.
See,
http://en.wikibooks.org/wiki/Java_Persistence/ElementCollection

How to extract relevant data from MySQL?

I'm using a table named "url2" with the MySQL InnoDB engine. It holds a lot of data: the full HTML of each page, the URL of the page, and so on. When I use the following SQL query I get a lot of results:
SELECT url FROM url2 WHERE html LIKE '%Yuva%' OR url LIKE '%Yuva%'
The search term yuva changes depending on the user's request.
It selects a lot of data, most of which I don't need. How can I avoid that?
The output of the above query is
www.123musiq.com
www.123musiq.com/home.html
www.123musiq.com/yuva.html
www.sensongs.com/
www.sensongs.com/hindi.html
www.sensongs.com/yuva.html
The output I need is below.
It should be sorted by relevance, like this:
www.123musiq.com/yuva.html
www.sensongs.com/yuva.html
www.sensongs.com/hindi.html
Following a friend's comment I changed the table to MyISAM, but I get about 25 123musiq.com files first and only after that the sensongs ones. How can I get 2 from 123musiq.com and 2 from sensongs.com, ordered by relevance?
It seems you're asking for a full-text index, which in MySQL before version 5.6 is only available on MyISAM tables (InnoDB supports FULLTEXT indexes from 5.6 onward).
If you're using InnoDB tables on an older version, the easiest solution is to create a new (MyISAM) table with only the text content and an index to join with the original table (this also helps with seek efficiency in some common cases).
Perhaps you want to use LIMIT?
SELECT * FROM url2 WHERE html LIKE '%Yuva%' OR url LIKE '%Yuva%' LIMIT 2
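LIMIT 2 on its own caps the whole result rather than each site. Here is a rough Python sketch of one way to get "URL matches first, at most 2 per host", run against SQLite so it is self-contained. Ranking URL hits above HTML-only hits is just a simple heuristic chosen for the example, not MySQL's full-text relevance score, and the function name `search` is made up. The per-host cap is applied in Python to keep the SQL portable.

```python
import sqlite3
from urllib.parse import urlsplit

def search(conn, term, per_host=2):
    """Return matching URLs, URL hits first, at most `per_host` per host."""
    pattern = f"%{term}%"
    rows = conn.execute(
        """
        SELECT url, (url LIKE ?) AS url_hit
        FROM url2
        WHERE html LIKE ? OR url LIKE ?
        ORDER BY url_hit DESC, url
        """,
        (pattern, pattern, pattern),
    ).fetchall()

    taken = {}  # host -> number of URLs already returned for it
    results = []
    for url, _url_hit in rows:
        # The URLs in the question have no scheme, so add "//" for parsing.
        host = urlsplit(url if "//" in url else "//" + url).hostname
        if taken.get(host, 0) < per_host:
            taken[host] = taken.get(host, 0) + 1
            results.append(url)
    return results
```

With the six URLs from the question, and assuming every page's HTML contains the term, this returns the two yuva.html pages first, followed by one more URL from each site.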