Let's say I want to code a search engine for a website that has several tables (users, posts, articles, products), and create a generic search across all of them.
Rather than how to do it, I would like to know whether it makes sense to do this in a single query, given that the site has really low traffic.
Are there any pros I am missing besides a possible performance gain, or would it be the same if I ran one query per table?
This is a very subjective question, and you will soon get some feedback asking you to revise it, as Stack Overflow is not a discussion area. Anyway,
My view is that the choice between a join and a single query has no relevance here. A user can be joined to a post, but in a search you want to list a user even if he has not written a post. Also, the fields of a user are different from those of a product or a post. So why combine them, when they are displayed differently?
Even if you want to fire a single query, you can use UNION between the queries, but imagine the front end. For a user you probably want to show the avatar with the name and public details, for a product you will show its image and price, and for a post you might go with a Google-like result; each lives in a different grid or area of the page.
A better option is to use AJAX, search each table separately, and show the results in separate boxes on your result page, like Google ads or this site does.
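To make the trade-off concrete, here is a minimal sketch of both approaches. It uses Python's built-in sqlite3 module with made-up tables, columns, and rows (none of them come from the question), so treat it as an illustration rather than a drop-in implementation. The UNION version has to squeeze every row into the same three columns, while the per-table version keeps the fields each result box needs.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts    (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    INSERT INTO users    (name)        VALUES ('Redwood Fan'), ('Alice');
    INSERT INTO posts    (title)       VALUES ('Caring for a redwood'), ('Hello world');
    INSERT INTO products (name, price) VALUES ('Redwood desk', 199.0), ('Steel lamp', 25.0);
""")


def search_union(term):
    """One round trip; every row is flattened to (kind, id, label)."""
    pattern = f"%{term}%"
    sql = """
        SELECT 'user' AS kind, id, name AS label FROM users WHERE name LIKE ?
        UNION ALL
        SELECT 'post', id, title FROM posts WHERE title LIKE ?
        UNION ALL
        SELECT 'product', id, name FROM products WHERE name LIKE ?
    """
    return conn.execute(sql, (pattern, pattern, pattern)).fetchall()


def search_separately(term):
    """One query per table; each result keeps its own columns."""
    pattern = f"%{term}%"
    return {
        "users": conn.execute(
            "SELECT id, name FROM users WHERE name LIKE ?", (pattern,)
        ).fetchall(),
        "posts": conn.execute(
            "SELECT id, title FROM posts WHERE title LIKE ?", (pattern,)
        ).fetchall(),
        "products": conn.execute(
            "SELECT id, name, price FROM products WHERE name LIKE ?", (pattern,)
        ).fetchall(),
    }
```

With the per-table version, each AJAX endpoint can call just its own query and render its own box, and the product box still has the price available without any extra lookup.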
I want to create a complex filtering system.
The user browses different categories and sees different filters for the model depending on the category he is in.
It all needs to be on one HTML page. Any ideas?
(Sorry if it's a bit hard to understand I just don't really know how to describe it.)
In my MySQL database I have a table of products which contains almost 625k rows. The table has 162 columns.
Now there is a search box on my home page where you can search for anything and, if your search term matches any of my product titles, it gives you a list of 15 products. This is similar to Amazon and other e-commerce websites.
What I did so far was to create a JSON file with all the product IDs and titles. When the user types at least 3 characters into the search field, an AJAX request is made and fetches the list. My issue is that the JSON file is almost 12 MB in size, and the AJAX call downloads it every time the user types or deletes a character. It worked fine on my local machine, but as soon as I made it live it stopped working for users with an internet connection slower than 5 Mbps. So I am looking for advice on how to make it as fast as Amazon's, that is, a search with auto-suggestion across 625k products.
I am really sorry, but there is no more advice to give here than "go do some reading on database design and schema normalization".
If you have 162 columns in a table you will never be able to do an efficient search. The database (especially MySQL) will not hold the table in memory and indexes will not help either. Yes, you can throw it all into an ElasticSearch instance and it will fix some of your problems. But, honestly, this solution does not clean up the mess you have.
You should have a separate table that holds the relevant information (titles, names, etc.) in one column (and possibly a numeric column for prices, etc.). Each row of this metadata table should reference the main table, and the text column should be fulltext-indexed. This way you ask for matches, filter the results, and JOIN the relevant rows from the main table. This will work quickly with very few resources.
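As a rough sketch of that layout: the example below uses Python's sqlite3 module, and SQLite's FTS5 virtual table stands in for a MySQL FULLTEXT index (an assumption on my part; it also requires an SQLite build with FTS5 enabled). The table names, columns, and sample products are invented for illustration. The limit of 15 mirrors the number of suggestions mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the wide main table (hypothetical columns).
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT, price REAL, description TEXT);
    -- Narrow search table: one fulltext-indexed column, rowid points at products.id.
    CREATE VIRTUAL TABLE product_search USING fts5(title);
""")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        (1, "Blue cotton shirt", 19.9, "..."),
        (2, "Blue denim jacket", 59.0, "..."),
        (3, "Red wool scarf", 15.5, "..."),
    ],
)
conn.execute("INSERT INTO product_search (rowid, title) SELECT id, title FROM products")


def suggest(term, limit=15):
    """Return up to `limit` products whose title has a word starting with `term`."""
    # Quote the user's text so FTS5 reads it as a plain string, then add * for prefix matching.
    query = '"' + term.replace('"', '""') + '"*'
    sql = """
        SELECT p.id, p.title, p.price
        FROM product_search
        JOIN products p ON p.id = product_search.rowid
        WHERE product_search MATCH ?
        ORDER BY p.id
        LIMIT ?
    """
    return conn.execute(sql, (query, limit)).fetchall()
```

The browser then sends only the typed term and receives at most 15 small rows, instead of downloading the whole title list on every keystroke.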
I have a question about web usability related to tables. This is my use case:
I have a view with one or more tables (N > 0), and each table has a title (for example "Photo list", "Video list", "Sound list").
Using JavaScript, users can change the "view level", that is, the level of detail of the view. By clicking different action buttons (basic, medium, advanced view), users change the number of rows in each table. So some of the tables could end up empty (no rows).
My question: What is the best usability practice to manage empty tables?
When you have identified tables that show certain information, you shouldn't hide them when they are empty, at least not without indicating in some way that there is no data for the empty table.
If you don't show the table, your users may not realize that there is a data entity that happens to be empty; if you show it, they will. This is important.
It could, however, be less important depending on how you present your data. Let's say, for example, that the top of your view shows a list of the different data types with the number of records in each one. If you keep a reminder there that data type X has 0 records, you can hide the table header in the view body, as all the information your users need is already in the view.
On the contrary, if your users have no way to know that a specific data type is empty other than seeing an empty table, you need to keep the table in your view so that they don't lose information.
Keep in mind that information is key in our world. Design is important to support and improve the user experience, but you shouldn't put it before information.
In building a web app recently, I started thinking about the information returned from a query I was making:
Find the user information and (for simplicity's sake) the phone numbers tied to this user. Something as simple as:
SELECT a.fname, a.lname, b.phone
FROM users a
JOIN users_phones b
ON (a.userid = b.userid)
WHERE a.userid = 12345;
No problem here (yes, I'm preventing injection, etc.; that's not the point of this question). When I think about the data that is returned, though, I am returning (potentially) several rows with that user's name on each one. Let's say that single user has 1000 phone numbers associated with it. That's the same first name and last name being returned 1000 times in one call. Let's also assume I want to return a lot more than just the first and last name of that user, so I'm starting to return quite a bit of extra data that I really only needed once.
Are there circumstances in which it is "more appropriate" to make multiple calls to a database?
e.g.
SELECT fname, lname
FROM users
WHERE userid = 12345;
SELECT phone
FROM users_phones
WHERE userid = 12345;
If the answer is yes, is there a good/proper method of determining when to use multiple queries versus a single one?
I think that really depends on your use case. In the example you gave, it seems to make sense to run it as two queries, especially if you're passing that info back to a mobile device where you want to send as little data as possible (not everyone has unlimited data...).
I'd probably stick a DISTINCT in those queries as well if that's going to make a difference based on your tables.
A query with a JOIN may be slower than two independent queries. It really depends on the type of access you're doing.
For your example, I'd go with the two query approach. These queries could be executed in parallel, they could be cached, and there's no real reason to JOIN other than for arbitrary presentation concerns.
You'll also want to be concerned about returning duplicate data. In your example it looks like fname and lname would be repeated for each and every phone number, resulting in a lot of data being transmitted that's actually not useful. This is because of the one-to-many relationship you've described.
Generally you'll want to JOIN if it means sending less data, or because the two queries are not independent.
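To put a number on the duplication, here is a small sketch using Python's sqlite3 module with the tables from the question. The sample name and phone numbers are made up, and counting returned values is only a crude proxy for bytes on the wire, so take it as an illustration of the one-to-many effect, not a benchmark.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE users_phones (userid INTEGER, phone TEXT);
    CREATE INDEX idx_users_phones_userid ON users_phones (userid);
    INSERT INTO users VALUES (12345, 'Ada', 'Lovelace');
""")
# One user with 1000 phone numbers (fabricated sample data).
conn.executemany(
    "INSERT INTO users_phones VALUES (12345, ?)",
    [(f"555-{n:04d}",) for n in range(1000)],
)


def fetch_joined(userid):
    """Single JOIN: fname and lname are repeated on every phone row."""
    sql = """
        SELECT a.fname, a.lname, b.phone
        FROM users a
        JOIN users_phones b ON a.userid = b.userid
        WHERE a.userid = ?
    """
    return conn.execute(sql, (userid,)).fetchall()


def fetch_split(userid):
    """Two independent queries: the name is fetched exactly once."""
    user = conn.execute(
        "SELECT fname, lname FROM users WHERE userid = ?", (userid,)
    ).fetchone()
    phones = [
        phone
        for (phone,) in conn.execute(
            "SELECT phone FROM users_phones WHERE userid = ?", (userid,)
        )
    ]
    return user, phones


joined = fetch_joined(12345)
user, phones = fetch_split(12345)
joined_values = sum(len(row) for row in joined)  # 3 values per row
split_values = len(user) + len(phones)           # 2 name values + one per phone
```

Here the JOIN returns 3000 values and the split version 1002, and the gap widens with every extra user column you select.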
This should be driven by the application. Basically, you retrieve in one query all the information needed in one place. If you take this question page as an example, you see your user ID, the reputation counter, and the badge counters. There's no need to retrieve other user profile information when the question page is first displayed.
Only when someone clicks on the user ID does the rest of the profile need to be queried, and maybe not even all of it, as there are several tabs on the profile page.
However, if your application is guaranteed to access all 1000 phone numbers at once, along with the user's name, then you probably should fetch them all together.
Apologies if this is redundant, and it probably is. I had a look but couldn't find a question here that covered what I wanted to know.
Basically we have a table with about 50,000 rows, and it's expected to grow much bigger than that. We need to allow admin users to add custom data to an item based on its category, and users can then pick which of the administrator-defined fields they want to add info to.
Initially I had gone with an item_categories_fields table which pairs up entries from item_fields with item_categories, so admins can add custom fields and reuse them across categories for consistency. item_fields has a relationship to item_field_values, which links values with fields; this is how we handled things in .NET. The project is using CakePHP though, and we're just learning as we go, so it can get a bit annoying at times.
I'm however thinking of maybe just adding an item_custom_fields table that is essentially the item_id and a text field that stores XMLish formatted data. This is just for the values of the custom fields.
No problems if I want to fetch the item by its id as the required data is stored in the items table, but what if I wanted to do a search based on a custom field? Would a
SELECT * FROM item_custom_fields
WHERE custom_data LIKE '%<material>Plastic</material>%'
(user-input-related issues aside) be practical if I wanted to fetch items made of plastic in this case? How slow would that be?
Thanks.
Edit: I was afraid of that as realistically this thing will be around 400k rows for that one table at launch, thanks guys.
Any LIKE pattern that starts with % cannot use an index on the column, so the query will scan the whole table to find the result.
The response time for that depends highly on your machine and the size of the table, but it definitely won't be efficient in any shape or form.
Your previous/existing solution (if well indexed) should be quite a bit faster.
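You can see the difference by asking the database for its query plan. The sketch below uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; the column layout of item_field_values, the index names, and the field id 7 are assumptions made up for the example, since the question does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- XML-ish blob per item, with an index that the LIKE below cannot seek into.
    CREATE TABLE item_custom_fields (item_id INTEGER PRIMARY KEY, custom_data TEXT);
    CREATE INDEX idx_custom_data ON item_custom_fields (custom_data);
    -- Hypothetical layout of the normalized alternative.
    CREATE TABLE item_field_values (item_id INTEGER, field_id INTEGER, value TEXT);
    CREATE INDEX idx_field_value ON item_field_values (field_id, value);
""")


def plan(sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


# Leading wildcard: the engine has to visit every row (or every index entry).
like_plan = plan(
    "SELECT * FROM item_custom_fields "
    "WHERE custom_data LIKE '%<material>Plastic</material>%'"
)

# Equality on indexed columns: the engine can seek straight to the matches.
# field_id 7 is a hypothetical id for the "material" field.
indexed_plan = plan(
    "SELECT item_id FROM item_field_values WHERE field_id = ? AND value = ?",
    (7, "Plastic"),
)

print(like_plan)
print(indexed_plan)
```

In SQLite the first plan is reported as a SCAN and the second as a SEARCH using idx_field_value. The exact wording differs between engines and versions, but MySQL's EXPLAIN should show the same contrast for a leading-wildcard LIKE.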