I have 100+ games on my website. Each game has its own highscore table. The structure of the table is always the same (same columns etc.).
On a central page, as a summary, I would like to show the highscores a user has for all the games.
What would be the most efficient way to do this? I was thinking about just doing a SELECT query 100+ times (one for each table), but that doesn't sound very performance friendly:
$stmt = $mysqli->prepare("SELECT * FROM table WHERE user_id=? AND something_else=?");
$stmt->bind_param('is', $user_id, $something_else);
Tips?
The most efficient way is to fix your data model. Having 100+ tables with the same structure is a sign of a very poor design.
With one table -- and an additional column for the game -- you would simply do:
select ahs.*
from all_high_scores ahs
where ahs.user_id = ?;
Without that, you are stuck. MySQL is not great about optimizing queries, so I don't think this will perform well:
create view v_all_high_scores as
select * from high_scores_01 union all
select * from high_scores_02 union all
select * from high_scores_03 union all
. . .;
select *
from v_all_high_scores ahs
where user_id = ?
You might get lucky -- many databases will push the condition to the individual tables. And MySQL might do that in its current version.
Another approach is a brute force approach, with a zillion parameters:
select * from high_scores_1 where user_id = ? union all
select * from high_scores_2 where user_id = ? union all
select * from high_scores_3 where user_id = ? union all
. . .;
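Writing a hundred branches by hand is error-prone, so the brute force query can be generated. The following is only a sketch under assumptions: it uses Python's sqlite3 module as a stand-in for MySQL, the `score` column and the helper name `fetch_user_scores` are hypothetical, and only three tables are created to keep it short.

```python
import sqlite3

def fetch_user_scores(conn, tables, user_id):
    """Run one UNION ALL query that covers every per-game table."""
    # Table names cannot be bound as parameters, so they must come from a
    # trusted, hard-coded list (never from user input).
    parts = [
        f"SELECT '{t}' AS source_table, user_id, score FROM {t} WHERE user_id = ?"
        for t in tables
    ]
    sql = " UNION ALL ".join(parts)
    # One bound parameter per branch, all carrying the same user id.
    return conn.execute(sql, [user_id] * len(tables)).fetchall()

# Demo data: three per-game tables with a hypothetical (user_id, score) layout.
conn = sqlite3.connect(":memory:")
tables = [f"high_scores_{i}" for i in range(1, 4)]
for i, t in enumerate(tables, start=1):
    conn.execute(f"CREATE TABLE {t} (user_id INTEGER, score INTEGER)")
    conn.execute(f"INSERT INTO {t} VALUES (7, ?), (8, ?)", (i * 100, i * 10))

rows = fetch_user_scores(conn, tables, 7)
print(rows)
```

The literal `source_table` column is there so each row still says which game it came from once the results are merged.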
However, you should fix the data model, so you can write this efficiently in one query.
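As a rough illustration of what fixing the data model could look like, the sketch below copies the per-game tables into a single all_high_scores table with a game_id column. It runs against SQLite through Python's sqlite3 module rather than MySQL, and the `score` column, the index name and the helper `scores_for_user` are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
game_ids = (1, 2, 3)

# Existing per-game tables (hypothetical columns, three games for brevity).
for game_id in game_ids:
    conn.execute(f"CREATE TABLE high_scores_{game_id} (user_id INTEGER, score INTEGER)")
    conn.execute(
        f"INSERT INTO high_scores_{game_id} VALUES (7, ?), (8, ?)",
        (game_id * 100, game_id * 10),
    )

# The consolidated table: same columns plus one that identifies the game.
conn.execute(
    "CREATE TABLE all_high_scores ("
    "game_id INTEGER NOT NULL, user_id INTEGER NOT NULL, score INTEGER NOT NULL)"
)
# An index on user_id lets the summary page find one user's rows directly.
conn.execute("CREATE INDEX idx_all_high_scores_user ON all_high_scores (user_id, game_id)")

# One-off migration: copy each old table in, tagging rows with their game id.
for game_id in game_ids:
    conn.execute(
        "INSERT INTO all_high_scores (game_id, user_id, score) "
        f"SELECT ?, user_id, score FROM high_scores_{game_id}",
        (game_id,),
    )

def scores_for_user(conn, user_id):
    """Return (game_id, score) pairs for one user with a single query."""
    return conn.execute(
        "SELECT game_id, score FROM all_high_scores WHERE user_id = ? ORDER BY game_id",
        (user_id,),
    ).fetchall()

print(scores_for_user(conn, 7))
```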
It would be better to be able to access the stats for all games from either one table, or a similar construct.
Have you considered using partitioned tables? This will allow the table to be split into many sub tables (one for each game) - based on a partitioning property (in this case the game ID). However, within SQL, it appears as one table, not 100 sub tables.
When you query based on a game ID say, the query will be directed to the one sub table that holds data for that game. This way, it will allow you to retain the optimisation you had by holding the game data in multiple tables, as before.
The same will happen, when you perform an insert - the new row will be stored in the relevant sub table.
However, if you want to select data across multiple sub-tables, then you can query the partitioned table without specifying the partitioning value, and it will query all sub tables, and act as if it is returning data from one table.
The following blog explains more:
https://www.percona.com/blog/2017/07/27/what-is-mysql-partitioning/
Related
I have a dozen tables with the same structure. All of their names match question_20%. Each table has an indexed column named loaded which can have values of 0 and 1.
I want to count all of the records where loaded = 1. If I had only one table, I would run select count(*) from question_2015 where loaded = 1.
Is there a query I can run that finds the tables in INFORMATION_SCHEMA.TABLES, sums over all of these counts, and produces a single output?
You can do what you want with dynamic SQL.
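One way to picture the dynamic SQL approach is to build the statement from the catalog in application code. This is a sketch under assumptions, not the only way to do it: it uses Python's sqlite3 module, where sqlite_master plays the role that INFORMATION_SCHEMA.TABLES plays in MySQL, and the helper name `count_loaded` and the demo tables are made up for the example.

```python
import sqlite3

def count_loaded(conn, name_pattern):
    """Sum COUNT(*) WHERE loaded = 1 over every table whose name matches."""
    # Step 1: look up the matching table names in the catalog.
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name LIKE ? ORDER BY name",
            (name_pattern,),
        )
    ]
    if not names:
        return 0
    # Step 2: generate one COUNT branch per table and add the counts up.
    union = " UNION ALL ".join(
        f'SELECT COUNT(*) AS n FROM "{name}" WHERE loaded = 1' for name in names
    )
    return conn.execute(f"SELECT SUM(n) FROM ({union}) AS t").fetchone()[0]

# Demo data: two matching tables and one that must be ignored.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question_2014 (loaded INTEGER);
CREATE TABLE question_2015 (loaded INTEGER);
CREATE TABLE answers_2015 (loaded INTEGER);
INSERT INTO question_2014 VALUES (1), (0), (1);
INSERT INTO question_2015 VALUES (1), (1);
INSERT INTO answers_2015 VALUES (1);
""")

print(count_loaded(conn, "question_20%"))
```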
However, you have a problem with your data structure. Having multiple parallel tables is usually a very bad idea. SQL supports very large tables, so having all the information in one table is a great convenience, from the perspective of querying (as you are now learning) and maintainability.
SQL offers indexes and partitioning schemes for addressing performance issues on large tables.
Sometimes, separate tables are necessary, to meet particular system requirements. If so, then a view should be available to combine all the tables:
create view v_tables as
select t1.*, 'table1' as which from table1 union all
select t2.*, 'table2' as which from table2 union all
. . .
If you had such a view, then your query would simply be:
select which, count(*)
from v_tables
where loaded = 1
group by which;
My problem is this:
select * from
(
select * from barcodesA
UNION ALL
select * from barcodesB
)
as barcodesTotal, boxes
where barcodesTotal.code=boxes.code;
Table barcodesA has 4000 entries
Table barcodesB has 4000 entries
Table boxes has about 180,000 entries
It takes 30 seconds to process the query.
Another problematic query:
select * from
viewBarcodesTotal, boxes
where viewBarcodesTotal.code=boxes.code;
viewBarcodesTotal contains the UNION ALL from both barcodes tables. It also takes forever.
Meanwhile,
select * from barcodesA , boxes where barcodesA.code=boxes.code
UNION ALL
select * from barcodesB , boxes where barcodesB.code=boxes.code
This one takes less than 1 second.
The question is obviously: why? Is my code bugged? Is MySQL bugged?
I have to migrate from Access to MySQL, and I would have to rewrite all my code if the first option is bugged.
Add an index on boxes.code if you don't already have one. Joining 8000 records (4K+4K) to the 180,000 will benefit from an index on the 180K side of the equation.
Also, be explicit and specify the fields you need in your SELECT statements. Using * in a production query is bad form: it encourages not thinking about which fields come back (and how big they might be), and in your example you are UNIONing two different tables, barcodesA and barcodesB, with potentially different data types and column orders.
The REASON for the performance difference...
The first query says: first, do a complete union of EVERY record in A with EVERY record in B, THEN join the result to boxes on the code. The result of the union does not have an index to be optimized against.
In the query that joins each barcodes table to boxes separately, each join IS optimized individually (judging by its performance there apparently IS an index, but I would ensure both tables have an index on the "code" column).
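The rewrite is safe because both shapes return the same rows; only the work the engine has to do differs. The sketch below checks that equivalence on a few made-up rows. It uses SQLite via Python's sqlite3 module, so it says nothing about MySQL timings, and the `label` column and sample codes are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE barcodesA (code TEXT);
CREATE TABLE barcodesB (code TEXT);
CREATE TABLE boxes (code TEXT, label TEXT);
-- Index on the large side of the join, as recommended above.
CREATE INDEX idx_boxes_code ON boxes (code);
INSERT INTO barcodesA VALUES ('a1'), ('a2');
INSERT INTO barcodesB VALUES ('b1'), ('zz');
INSERT INTO boxes VALUES ('a1', 'box 1'), ('a2', 'box 2'), ('b1', 'box 3'), ('c9', 'box 4');
""")

# Shape 1: union first, then join the combined result to boxes.
union_then_join = """
SELECT t.code, boxes.label
FROM (SELECT code FROM barcodesA UNION ALL SELECT code FROM barcodesB) AS t
JOIN boxes ON t.code = boxes.code
"""

# Shape 2: join each table to boxes, then union the two join results.
join_then_union = """
SELECT barcodesA.code, boxes.label
FROM barcodesA JOIN boxes ON barcodesA.code = boxes.code
UNION ALL
SELECT barcodesB.code, boxes.label
FROM barcodesB JOIN boxes ON barcodesB.code = boxes.code
"""

first_shape = sorted(conn.execute(union_then_join).fetchall())
second_shape = sorted(conn.execute(join_then_union).fetchall())
print(first_shape == second_shape)
```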
I have multiple select statements from different tables on the same database. I was using multiple, separate queries then loading to my array and sorting (again, after ordering in query).
I would like to combine into one statement to speed up results and make it easier to "load more" (see bottom).
Each query uses SELECT, LEFT JOIN, WHERE and ORDER BY commands which are not the same for each table.
I may not need order by in each statement, but I want the end result, ultimately, to be ordered by a field representing a time (not necessarily the same field name across all tables).
I would want to limit total query results to a number, in my case 100.
I then use a loop through results and for each row I test if OBJECTNAME_ID (ie; comment_id, event_id, upload_id) isset then LOAD_WHATEVER_OBJECT which takes the row and pushes data into an array.
I won't have to sort the array afterwards because it was loaded in order via mysql.
Later in the app, I will "load more" by skipping the first 100, 200 or whatever page*100 is and limit by 100 again with the same query.
The end result from the database would pref look like "this":
RESULT - selected fields from a table - field to sort on is greatest
RESULT - selected fields from a possibly different table - field to sort on is next greatest
RESULT - selected fields from a possibly different table - field to sort on is third greatest
etc, etc
I see a lot of simpler combined statements, but nothing quite like this.
Any help would be GREATLY appreciated.
The easiest way might be a UNION (http://dev.mysql.com/doc/refman/5.0/en/union.html):
(SELECT a,b,c FROM t1)
UNION
(SELECT d AS a, e AS b, f AS c FROM t2)
ORDER BY a DESC
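To connect this to the "load more" requirement in the question, the sketch below aliases each table's id and time column to common names, orders the combined result by time, and pages through it with LIMIT and OFFSET. It is only an illustration: it runs on SQLite through Python's sqlite3 module, and the tables `comments` and `events`, their columns, and the helper `load_page` are hypothetical.

```python
import sqlite3

# Each branch maps its own columns onto the shared names kind, id and ts.
FEED_SQL = """
SELECT 'comment' AS kind, comment_id AS id, posted_at AS ts FROM comments
UNION ALL
SELECT 'event' AS kind, event_id AS id, starts_at AS ts FROM events
ORDER BY ts DESC
LIMIT ? OFFSET ?
"""

def load_page(conn, page, page_size=100):
    """Return one page of the combined feed, newest first."""
    return conn.execute(FEED_SQL, (page_size, page * page_size)).fetchall()

# Demo data with integer timestamps to keep the ordering easy to follow.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (comment_id INTEGER, posted_at INTEGER);
CREATE TABLE events (event_id INTEGER, starts_at INTEGER);
INSERT INTO comments VALUES (1, 10), (2, 30);
INSERT INTO events VALUES (1, 20), (2, 40);
""")

print(load_page(conn, 0, page_size=3))
print(load_page(conn, 1, page_size=3))
```

The `kind` column does the job of the isset checks described in the question: the application can dispatch on it to decide which object to load for each row.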
I have a mysql database like this
Post – 500,000 rows (Postid,Userid)
Photo – 200,000 rows (Photoid,Postid)
About 50,000 posts have photos, average 4 each, most posts do not have photos.
I need to get a feed of all posts with photos for a userid, average 50 posts each.
Which approach would be more efficient?
1: Big Join
select *
from post
left join photo on post.postid=photo.postid
where post.userid=123
2: Multiple queries
select * from post where userid=123
while (loop through rows) {
select * from photo where postid=row[postid]
}
I've not tested this, but I very much suspect (at an almost cellular level) that a join would be vastly, vastly faster - what you're attempting is pretty much the reason why joins exist after all.
Additionally, there would be considerably less overhead in terms of scripting language <-> MySQL communications, etc., but I suspect that's somewhat of a moot point.
The JOIN is always faster with proper indexing (as mentioned before) but several smaller queries may be more easily cached, provided of course that you are using the query cache. The more tables a query contains the greater the chances of more frequent invalidations.
As for the parsing and optimization procedure, I believe MySQL maintains its own statistics internally and this usually happens once. What you are losing when executing multiple queries is the roundtrip time and the client buffering lag, which is small if the resultset is relatively small in size.
A join will be much faster.
Each separate query needs to be parsed, optimized and executed, and that overhead adds up.
Just don't forget to create the following indexes:
post (userid)
photo (postid)
With proper indexing on the postid columns, the join should be superior.
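A minimal sketch of the join approach with the suggested indexes, using Python's sqlite3 module as a stand-in for MySQL. The sample rows and the helper name `photos_by_post` are invented for the example. An inner join is used so that only posts with photos come back, and the rows are regrouped per post in application code.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (postid INTEGER PRIMARY KEY, userid INTEGER);
CREATE TABLE photo (photoid INTEGER PRIMARY KEY, postid INTEGER);
-- The two indexes recommended in the answers above.
CREATE INDEX idx_post_userid ON post (userid);
CREATE INDEX idx_photo_postid ON photo (postid);
INSERT INTO post VALUES (1, 123), (2, 123), (3, 456);
INSERT INTO photo VALUES (10, 1), (11, 1), (12, 3);
""")

def photos_by_post(conn, userid):
    """One round trip: fetch all photos for a user's posts, grouped by post."""
    rows = conn.execute(
        "SELECT post.postid, photo.photoid "
        "FROM post JOIN photo ON photo.postid = post.postid "
        "WHERE post.userid = ? "
        "ORDER BY post.postid, photo.photoid",
        (userid,),
    )
    grouped = defaultdict(list)
    for postid, photoid in rows:
        grouped[postid].append(photoid)
    return dict(grouped)

print(photos_by_post(conn, 123))
```

Post 2 belongs to user 123 but has no photos, so the inner join leaves it out of the result.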
There's also the possibility of a sub-query:
SELECT * FROM photo WHERE postid IN (SELECT postid FROM post WHERE userid = 123);
I'd start with optimizing your queries, e.g. select * from post where userid=123 is obviously not needed as you only use row[postid] in your loop, so don't select * if you want to split the query. Then I'd run a couple of tests to see which one is faster, but JOINing just two tables is usually the fastest (don't forget to create an index where needed).
If you're planning to make your "big query" very big (by joining more tables), things can get very slow and you may need to split your query. I once joined seven tables and the query took 30 seconds to run. Splitting the query made it run in a fraction of a second.
I'm not sure about this but there is another option. It might be much slower or faster depending upon indexes used.
In your case, something like:
select t1.postid FROM (select postid from post where userid = 123) AS t1 JOIN photo ON t1.postid = photo.postid
If the number of rows in table t1 is going to be small compared to table post there might be a chance for considerable performance improvement. But I haven't tested it yet.
SELECT * FROM photo, post
WHERE post.userid = 123 AND photo.postid = post.postid;
If you only want posts with photos, construct your query starting with the photo table as your base table. Note, you will get the post info repeated with each result row.
If you didn't want to return all of the post info with each row, an alternative would be to
SELECT DISTINCT photo.postid FROM photo, post WHERE post.userid = 123 AND photo.postid = post.postid;
Then foreach postid, you could
SELECT * from photo WHERE postid = $inpostid;
What are views in MySQL? What's the point of views and how often are they used in the real world?
Normal views are nothing more than queryable queries.
Example:
You have two tables, orders and customers, orders has the fields id, customer_id, performance_date and customers has id, first_name, last_name.
Now let's say you want to show the order id, performance date and customer name together. Instead of issuing this query:
SELECT o.id AS order_id, CONCAT(c.first_name, ' ', c.last_name) AS customer_name,
o.performance_date
FROM orders o INNER JOIN customers c ON o.customer_id = c.id
you could create that query as a view and name it orders_with_customers. In your application you can now issue the query
SELECT *
FROM orders_with_customers
One benefit is abstraction: you could alter the way you store the customer's name, for example by including a middle name, and just change the view's query. All applications that use the view continue to work unchanged, but now get the middle name as well.
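The abstraction point can be shown in a few lines. This is a sketch under assumptions: it uses SQLite through Python's sqlite3 module (so string concatenation is written with || rather than MySQL's CONCAT), and the sample data, the `middle_name` column and the helper `define_view` are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, performance_date TEXT);
INSERT INTO customers VALUES (1, 'Ada', 'Lovelace');
INSERT INTO orders VALUES (100, 1, '2024-01-05');
""")

# The query the application issues; it never changes in this example.
APP_QUERY = "SELECT order_id, customer_name FROM orders_with_customers"

def define_view(conn, name_expr):
    """(Re)create the view with the given SQL expression for the customer name."""
    conn.execute("DROP VIEW IF EXISTS orders_with_customers")
    conn.execute(
        "CREATE VIEW orders_with_customers AS "
        f"SELECT o.id AS order_id, {name_expr} AS customer_name, o.performance_date "
        "FROM orders o INNER JOIN customers c ON o.customer_id = c.id"
    )

define_view(conn, "c.first_name || ' ' || c.last_name")
before = conn.execute(APP_QUERY).fetchall()

# Change how names are stored, then adjust only the view definition.
conn.execute("ALTER TABLE customers ADD COLUMN middle_name TEXT DEFAULT ''")
conn.execute("UPDATE customers SET middle_name = 'King' WHERE id = 1")
define_view(conn, "c.first_name || ' ' || c.middle_name || ' ' || c.last_name")
after = conn.execute(APP_QUERY).fetchall()

print(before, after)
```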
It's simple: views are virtual tables.
Views are based on SELECT queries on "real" tables, but the difference is that views, unlike real tables, do not store the information. A view only references the tables and combines them the way its SELECT tells it to. This makes often-used queries a lot simpler.
Here's a simple example for you. Let's suppose you have a table of employees and a table of departments, and you'd like to see their salaries. First you can create a view for the salaries.
CREATE VIEW SALARIES
AS
SELECT e.name AS employee_name,
e.salary,
d.name AS department_name
FROM employees AS e, departments AS d
WHERE e.depid = d.depid
ORDER BY e.salary DESC
This query lists the name of the employee, his/her salary and department and orders them by their salaries in descending order. When you've done this you can use queries such as:
SELECT * from SALARIES
On a larger scale you could make a view that calculates the average salary of the employees and lists who has a salary that's less than the average salary. In real life such queries are much more complex.
In MySQL, a view is a stored SELECT statement.
Look at this answer: MySQL Views - When to use & when not to
You can think of the view as an on-the-fly generated table. In your queries it behaves like an ordinary table, but instead of being stored on disk, it is created on the fly when necessary from the SQL statement that was defined when the view was created.
To create a view, use:
CREATE VIEW first_names AS SELECT first_name FROM people WHERE surname='Smith'
You can then use this view just like an ordinary table. The trick is that when you update the table people, the first_names view reflects the change as well, because it is just the result of the SELECT statement.
This query:
SELECT * FROM first_names
will return all the first names of people named Smith in the table people. If you update the people table and re-run the query, you will see the updated results.
Basically, you could replace views with nested SELECT statements. However, views have some advantages:
Shorter queries - nested SELECT statements make the query longer
Improved readability - if the view has a sensible name, the query is much easier to understand
Better speed - the view's SELECT statement is stored in the database engine and is pre-parsed, therefore it doesn't need to be transferred from the client and parsed over and over again
Caching and optimizations - the database engine can cache the view and perform other optimizations
They're a shorthand for common filters.
Say you have a table of records with a deleted column. Normally you wouldn't be interested in deleted records and hence you could make a view called Records that filters out deleted records from the AllRecords table.
This will make your code cleaner since you don't have to append/prepend deleted != 1 to every statement.
SELECT * FROM Records
would return all not-deleted records.
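A small sketch of such a filter view, again on SQLite via Python's sqlite3 module rather than MySQL. The columns of AllRecords and the helper `visible_ids` are assumptions for the example. It also shows that marking a row as deleted in the base table is immediately visible through the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AllRecords (
    id INTEGER PRIMARY KEY,
    title TEXT,
    deleted INTEGER NOT NULL DEFAULT 0
);
INSERT INTO AllRecords (id, title) VALUES (1, 'first'), (2, 'second'), (3, 'third');
-- The view hides soft-deleted rows so callers never repeat the filter.
CREATE VIEW Records AS
SELECT id, title FROM AllRecords WHERE deleted != 1;
""")

def visible_ids(conn):
    """Ids that are currently visible through the Records view."""
    return [row[0] for row in conn.execute("SELECT id FROM Records ORDER BY id")]

before = visible_ids(conn)
# Soft-delete one row in the underlying table.
conn.execute("UPDATE AllRecords SET deleted = 1 WHERE id = 2")
after = visible_ids(conn)

print(before, after)
```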
In simple words
In SQL, a view is a virtual table based on the result-set of an SQL statement.
A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database.
You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were coming from one single table.
Source
In addition to this, you can insert rows into an underlying table through a view, provided that the view refers to only one table and the columns not referred to in the view allow nulls.
A view is a way of pre-defining certain queries. It can be queried just like a table, but is defined by a query rather than a set of on-disk data. Querying it allows you to query that query's results.
In most cases, querying a view can be seen as equivalent to using the view's defining query as a subquery in your main query. Views allow queries to be shorter and more modular (as the common parts are defined separately in the view). They also provide opportunity for optimization (although not all databases do so; I am not sure if MySQL provides any optimizations to make views faster or not).
If you update the underlying tables, queries against the view will automatically reflect those changes.