I have a very small MySQL database that stores information about goods and users. I am trying to implement a search for users who bought certain goods, by first name and last name. The Sphinx search engine comes highly recommended, so I am using it. My search currently works like this:
1. Use Sphinx to find the IDs of users matching the first name and last name.
2. Query MySQL (not Sphinx) for goods matching the given filters (ID or category, price, etc.), where user_id is IN the list of IDs from step 1.
How can I implement this with a single JOIN query?
You can't do it directly because, as you say, the Sphinx index and the database live in different 'systems'.
So the 'join' has to happen in your application. It sounds like you are already implementing what is effectively a join.
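That application-side join can be sketched like this (a minimal sketch in Python; the goods table, its columns, and the filter parameters are assumptions, and in a real version the ID list would come from the Sphinx query):

```python
# Sketch of the application-level "join": step 1 yields user IDs from
# Sphinx, step 2 filters goods in MySQL by those IDs.
# Table/column names and filters here are assumptions for illustration.

def build_goods_query(user_ids, category_id, max_price):
    """Build the step-2 MySQL query from the IDs Sphinx returned."""
    if not user_ids:
        return None  # the name search matched nobody
    placeholders = ", ".join(["%s"] * len(user_ids))
    sql = ("SELECT * FROM goods "
           "WHERE category_id = %s AND price <= %s "
           "AND user_id IN (" + placeholders + ")")
    params = [category_id, max_price] + list(user_ids)
    return sql, params

sql, params = build_goods_query([3, 7, 42], category_id=5, max_price=100)
```

The query and parameter list would then be passed to whatever MySQL client library the application uses.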
But there are two alternatives if you really don't want to continue with that approach:
1) SphinxSE. It's a 'fake' MySQL storage engine: when you query the virtual table, a query is made in the background to the Sphinx index, and the results of that query are presented to MySQL as a table. Because it looks like a normal MySQL table, MySQL can then join it with your database table(s) to produce the result set, combining the search results and the data. (There are still two separate systems, but MySQL implements the joining logic.)
2) Attributes. Sphinx can store data in the index alongside the full-text index, and can return those attributes in result sets. This way you avoid the need for the join altogether, because you get the search results together with the data (which you would otherwise have fetched from MySQL) in one go.
(In effect, you create one big 'denormalized' index.)
Related
I have three to five search fields in my application and am planning to integrate this with Apache Solr. I tried the same with a single table and it is working fine. Here are my questions.
Can we index multiple tables in the same core? Or should I create a separate core for each index (I guess this concept is wrong)?
Suppose I have four tables: users, careers, education and location. I have two search boxes in a PHP page: one is to search for simple locations (just like an autocomplete box) and the other is to search for a keyword which should be checked against the careers and education tables. If multiple indexes are possible under a single core:
2.1 How do we define the query here?
2.2 Can we specify the index name in the query (like a table name in MySQL)?
Links that can answer my concerns are enough.
If you're expecting to query the same data as part of the same request, such as auto-completing users, educations and locations at the same time, indexing them to the same core is probably what you want.
The term "core" is probably identical to the term "index" in your usage, and having multiple sets of data in the same index is usually achieved by having a field that indicates the type of document (and then applying a filter query if you want to get documents of only one type, such as fq=type:location). You can use Solr's grouping feature to get separate result sets of documents back for each type as well.
If you're only ever going to query the data separately, having them in separate indexes is probably the way to go, as you'll be able to scale, analyze and tune each index independently (and avoid always needing a filter query to get the type of content you're looking for).
Specifying the index name is the same as specifying the core, and is part of the URL to Solr: http://localhost:8983/solr/index1/ or http://localhost:8983/solr/index2/.
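For completeness, such a request with a filter query can be assembled like this (a sketch; the core name index1 and the fields being queried are assumptions):

```python
from urllib.parse import urlencode

# Assembling a select request against one core with a filter query.
# The core name (index1) and the queried fields are assumptions.
base = "http://localhost:8983/solr/index1/select"
params = {
    "q": "name:bos*",       # e.g. an autocomplete prefix query
    "fq": "type:location",  # restrict to documents of one type
    "rows": 10,
}
url = base + "?" + urlencode(params)
```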
I am developing a web application with Spring, Hibernate and MySQL, and I would like to know how to fetch data from the database quickly. There are thousands of records in my database, so selecting records takes a long time. How can I minimize the fetch time? Please give me some suggestions so that I can optimize my web application.
Note: my database uses foreign key mappings, so I am joining many tables to produce the final result.
To optimize query time, start with the execution plan (EXPLAIN). Here is the documentation for MySQL.
In general, here are some recommendations:
Choose a good index. For instance, if you have to choose between a Long and a String key, prefer the Long.
In the SELECT clause, specify only the fields you need.
Joins are expensive in terms of time. Make sure that you use all the keys that relate the two tables together and don't join to unused tables; always try to join on indexed fields. The join type matters as well (INNER, OUTER, ...).
There are other tips as well, but the ones listed here can really improve your query time.
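The execution-plan recommendation is easy to try out. A small illustration using SQLite's EXPLAIN QUERY PLAN (standing in for MySQL's EXPLAIN; the table and data are made up) shows how an index turns a full scan into an index search:

```python
import sqlite3

# Illustration of the execution-plan tip: the same lookup before and
# after adding an index. SQLite stands in for MySQL; the table and
# data are made up for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [("user%d@example.com" % i,) for i in range(1000)])

query = "EXPLAIN QUERY PLAN SELECT id FROM users WHERE email = ?"
plan_before = conn.execute(query, ("user500@example.com",)).fetchall()

conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = conn.execute(query, ("user500@example.com",)).fetchall()
# plan_before reports a full table scan, plan_after an index search
```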
I'm trying to do a Sphinx search with a result set limited by a MySQL table that holds user-to-network relationships.
Users should only be able to search within networks they are members of. Since there is a near-infinite number of possible user-to-network combinations, the only way I've been able to accomplish this is to run the Sphinx search first and then feed the results into a MySQL query that joins on the network table and includes an IN clause with the list of document IDs.
This is not very efficient, and I've already noticed that as the site gets larger this is going to be a really big issue.
When the data searched by Sphinx was in a MySQL full-text column this wasn't an issue, but ever since we added Sphinx for faster searching, it has complicated the way we get the final results.
I've thought about doing the opposite: getting a list of all the networks the user is in and then doing the Sphinx search with that as a limiting factor (a network ID attribute).
Does anyone have a better solution for this? Is there any way I can join this data directly against the Sphinx data and limit it by a MySQL result set?
Thanks
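The reversed approach from the question (a network ID attribute in the index) could be sketched, e.g. in SphinxQL, like this; the index name posts_index and the attribute name network_id are assumptions:

```python
# Sketch of the reversed approach: fetch the user's network IDs from
# MySQL first, then let Sphinx do the filtering on a network_id
# attribute. The index name (posts_index) is an assumption; this only
# builds a SphinxQL statement rather than running it, and match_terms
# would still need proper escaping in production.

def build_sphinxql(match_terms, network_ids, limit=20):
    ids = ", ".join(str(int(n)) for n in network_ids)  # int() keeps the ID list safe
    return ("SELECT id FROM posts_index "
            "WHERE MATCH('" + match_terms + "') "
            "AND network_id IN (" + ids + ") "
            "LIMIT %d" % limit)

query = build_sphinxql("hello world", [4, 9, 17])
```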
On our new site (a shopping site), we will use Solr for our site's search engine. In the Solr index we keep a list of product id's, and a list of keywords for each product. The search query is done against the keywords.
Solr returns a list of product id's. These id's are then inserted into a MySQL query to select all product data from the database. MySQL also handles the sorting of results. E.g., the MySQL query might look like:
SELECT * FROM product WHERE id IN (1,4,42,32,46,...,39482) ORDER BY price ASC
We have around 100,000 products on the site. This method works fine when there are a couple of thousand results, but becomes slow when there are, for example, 50,000 results.
My assumption is that the bottleneck is the "WHERE IN" clause. A long-term solution will be to move all product data to Solr so it can handle sorting the results and also applying refine filters to the search (e.g., perhaps the user only wants to view products in a certain price range). However, we are inexperienced with Solr and need a short-term fix before we can implement this.
One option is to abandon Solr in the short-term and store keywords in a table in MySQL and do the search against this using a FULL-TEXT search.
Am I missing any other options?
The main problem for you is that Solr is going to return the results sorted by number of matching keywords, but you want the results to be sorted by price. Like you correctly mention, moving all your data to Solr is the best option - you would be very happy with Solr for your searching, sorting, faceting and pagination needs.
For the short term, however, it is well worth just adding the price field to Solr. For a search query like tooth paste, you can issue a Solr query like
q=keywords:(tooth AND paste)&rows=10&fl=id&sort=price%20asc
to get only the first 10 results and then do pagination by specifying the start parameter, so like:
q=keywords:(tooth AND paste)&rows=10&start=10&fl=id&sort=price%20asc
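Those two requests can be built programmatically, e.g. like this sketch (urlencode takes care of escaping the sort value; the keywords field is the one from the question):

```python
from urllib.parse import urlencode

# The two requests above, assembled programmatically; urlencode takes
# care of escaping the sort value ("price asc").
def solr_search_params(keywords, page=0, rows=10):
    q = "keywords:(" + " AND ".join(keywords.split()) + ")"
    params = {"q": q, "rows": rows, "fl": "id", "sort": "price asc"}
    if page > 0:
        params["start"] = page * rows
    return urlencode(params)

first_page = solr_search_params("tooth paste")
second_page = solr_search_params("tooth paste", page=1)
```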
I am trying to make a database of products that can be searched by many facets (like Newegg or Amazon). At first I was going to do the whole thing with MySQL, but further research led me to believe that is a bad idea, so instead I am thinking about using Sphinx.
My question is: how would I set up the MySQL tables for this? Would I just have one table for the products and another with all the facets, containing a couple of large varchar fields and a foreign key to the product?
I am not a huge Sphinx expert, but I'd say that you don't have to stick all your data in one table. Sphinx can handle associations just fine. If you are planning to use Rails for your front-end then take a look at thinking_sphinx gem. It definitely allows you to specify attributes based on data spread out into many tables. In my experience I didn't have to change my data structure to accommodate Sphinx.
I'll pipe in.
You don't really need to, actually. Facets in Sphinx are just IDs (at least in 0.9.9, the current stable release). I am going to assume that you have a standard product table with your different facets stored as foreign keys to other tables.
So assuming you have this you can just select over the main product table and set up the facets in sphinx as per the documentation.
I would really need to see your table structure to comment further. It sounds like you have your products spread over multiple tables. In that case, as you mentioned, I would go with a single table to index, populated with the contents of all the others.
The great thing about Sphinx is that you can use a MySQL query to get your data into Sphinx. This allows you to structure your database in a way that's optimized for your business logic, without having to worry about how search will perform. As long as you're creative with the query you write for sql_query, you can normalize your database however you'd like, and still be able to grab all the text to be indexed with a single query. For example, if you need to get strings from a many-to-one relationship into your index, you can do so using a subquery.
sql_query = SELECT p.*, (SELECT GROUP_CONCAT(pa.text SEPARATOR ' ') \
    FROM products_attr pa WHERE pa.product_id = p.id) AS attr_text \
    FROM products p;
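Note that a scalar subquery can only return one row per product, so when a product has several attribute rows, GROUP_CONCAT is the usual way to collapse them into one indexable string. A quick illustration (using SQLite for brevity; the table names follow the config above):

```python
import sqlite3

# Quick check of the GROUP_CONCAT idea: each product's attribute rows
# collapse into one indexable string. SQLite is used for brevity
# (MySQL's GROUP_CONCAT behaves the same for this purpose).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE products_attr (product_id INTEGER, text TEXT)")
conn.execute("INSERT INTO products VALUES (1, 'widget')")
conn.executemany("INSERT INTO products_attr VALUES (?, ?)",
                 [(1, "red"), (1, "metal")])

rows = conn.execute("""
    SELECT p.id, p.name,
           (SELECT GROUP_CONCAT(pa.text, ' ')
            FROM products_attr pa
            WHERE pa.product_id = p.id) AS attr_text
    FROM products p
""").fetchall()
```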
Additionally, if you have drop-downs where you search on attribute IDs, you can use Sphinx's multi-value attributes. This way, you can search by attribute ID as well as by the text of the attribute.
sql_attr_multi = uint attributes from query; \
SELECT product_id AS id, id AS attribute FROM product_attributes ;