How to dynamically add SELECT statements if there are not enough results? - mysql

I am querying a table for a set of data that may or may not have enough results for the correct operation of my page.
Typically, I would just broaden the range of my SELECT statement to ensure enough results, but in this particular case we need to start with as small a range of results as possible before expanding (if we need to).
My goal, therefore, is to create a query that will search the db, determine whether it got a sufficient number of rows, and continue searching if it didn't. In PHP something like this would be very simple, but I can't figure out how to dynamically add SELECT statements in a query based on a current count of rows.
Here is a crude illustration of what I'd like to do - in a SINGLE query:
SELECT *, COUNT(`id`) FROM `blogs` WHERE `date` BETWEEN '2011-01-01' AND '2011-01-02' LIMIT 25
IF COUNT(`id`) < 25 {
SELECT * FROM `blogs` WHERE `date` BETWEEN '2011-01-02' AND '2011-01-03' LIMIT 25
}
Is this possible to do with a single query?

You have 2 possible solutions:
Compare the count on the programming-language side, and if there are not enough rows, perform one more query. (It is not as bad as you think: the query cache, memcached, proper indexes, enough memory on the server, etc. - there are a lot of ways to improve performance.)
Create a stored procedure, as sketched below.
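A minimal sketch of the stored-procedure route, reusing the table, dates, and threshold of 25 from the question (the procedure name and the exact way the range is widened are illustrative assumptions, not a prescribed design):
DELIMITER //
CREATE PROCEDURE fetch_blogs()
BEGIN
DECLARE n INT DEFAULT 0;
-- Count the narrow range first.
SELECT COUNT(*) INTO n FROM `blogs`
WHERE `date` BETWEEN '2011-01-01' AND '2011-01-02';
IF n >= 25 THEN
-- Enough rows: return the narrow range.
SELECT * FROM `blogs`
WHERE `date` BETWEEN '2011-01-01' AND '2011-01-02'
LIMIT 25;
ELSE
-- Not enough: widen the date range and try again.
SELECT * FROM `blogs`
WHERE `date` BETWEEN '2011-01-01' AND '2011-01-03'
LIMIT 25;
END IF;
END //
DELIMITER ;
The application then issues a single CALL fetch_blogs(); and the count check happens inside the server.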

Related

Two selects in one query

I have a product table.
I'm going to create a pagination on my frontend.
So I need to do this select:
select * from `product` limit 20 offset 0
I also need the total number of records in the product table (I'll show the number of pages). I have this query:
select count(*) as all from `product`
I wanted to run these two queries in just one query.
Something like:
select *, count(*) as total from `product` limit 20 offset 0
The above query does not work. Is there any way to do this?
I don't believe you can run this as one query unless you are willing to give up the "LIMIT" and just retrieve all records, do a count on the result set, and throw away all but the first 20.
I recommend you run two queries. One to get the total COUNT(*) from the table, and then run a separate query for your pagination.
In theory you can do this:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
This promises to let you get the information of how many rows match the query as if you had not used LIMIT. Read https://dev.mysql.com/doc/refman/5.7/en/information-functions.html#function_found-rows for details.
It does give you the correct count, but using this method usually incurs a high performance cost.
Read https://www.percona.com/blog/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/ for a benchmark showing that running separate queries is usually the better strategy. Admittedly, that blog is from 2007 and the test was performed on MySQL 5.0; you can try to reproduce it on the current version of MySQL to see whether things have improved since then.
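On MySQL 8.0 and later (a newer version than those discussed above), a window function can also return the total alongside the page in a single query; a sketch reusing the question's product table:
SELECT p.*, COUNT(*) OVER () AS total
FROM `product` p
LIMIT 20 OFFSET 0;
-- total carries the full pre-LIMIT row count on every returned row.
This still computes the count over the whole result set, so it is worth benchmarking against the two-query approach rather than assuming it is faster.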

Does MySQL not use LIMIT to optimize query select functions?

I've got a complex query I have to run in an application that is giving me some performance trouble. I've simplified it here. The database is MySQL 5.6.35 on CentOS.
SELECT a.`po_num`,
Count(*) AS item_count,
Sum(b.`quantity`) AS total_quantity,
Group_concat(`web_sku` SEPARATOR ' ') AS web_skus
FROM `order` a
INNER JOIN `order_item` b
ON a.`order_id` = b.`order_key`
WHERE `store` LIKE '%foobar%'
LIMIT 200 OFFSET 0;
The key part of this query is where I've placed "foobar" as a placeholder. If this value is something like big_store, the query takes much longer (roughly 0.4 seconds in the query provided here, much longer in the query I'm actually using) than if the value is small_store (roughly 0.1 seconds in the query provided). big_store would return significantly more results if there were no LIMIT.
But there is a LIMIT, and that's what surprises me. Both datasets have more rows than the LIMIT, which is only 200. It appears to me that MySQL is performing the aggregate functions COUNT, SUM, and GROUP_CONCAT for all big_store/small_store rows and then applying the LIMIT retroactively. I would imagine that it would be best to stop when you get to 200.
Could it not perform the COUNT, SUM, and GROUP_CONCAT actions after grabbing the 200 rows it will use, making my query much quicker? This seems feasible to me except in cases where there's an ORDER BY on one of those columns.
Does MySQL not use LIMIT to optimize a query's aggregate functions? If not, is there a good reason for that? If so, did I make a mistake in my thinking above?
It can stop short due to the LIMIT, but that is not a reasonable query since there is no ORDER BY.
Without ORDER BY, it will pick whatever 200 rows it feels like and stop short.
With an ORDER BY, it will have to scan the entire table that contains store (please qualify columns with which table they come from!). This is because of the leading wildcard. Only then can it trim to 200 rows.
Another problem -- without a GROUP BY, aggregates (SUM, etc.) are performed across the entire table (or at least across the rows that remain after filtering). The LIMIT does not apply until after that.
Perhaps what you are asking about is MariaDB's LIMIT ROWS EXAMINED clause (available since MariaDB 5.5.21).
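In MariaDB that clause looks roughly like this (a sketch; the threshold of 10000 examined rows is an arbitrary illustration):
SELECT a.`po_num`
FROM `order` a
WHERE a.`store` LIKE '%foobar%'
LIMIT 200 ROWS EXAMINED 10000;
-- Stops the query once 10000 rows have been examined,
-- returning whatever (possibly partial) result was built so far.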
Think of it this way ... All of the components of a SELECT are done in the order specified by the syntax. Since LIMIT is last, it does not apply until after the other stuff is performed.
(There are a couple of exceptions: (1) SELECT col... must be done after FROM ..., since it would not know which table(s); (2) The optimizer readily reorders JOINed table and clauses in WHERE ... AND ....)
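One way to picture the order being described (a rough sketch; as noted, the optimizer reorders freely behind the scenes):
-- Conceptual evaluation order of a SELECT:
-- FROM / JOIN -> WHERE -> GROUP BY -> aggregates -> HAVING
-- -> SELECT list -> ORDER BY -> LIMIT / OFFSET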
More details on that query.
The optimizer peeks ahead, and sees that the WHERE is filtering on order (that is where store is, yes?), so it decides to start with the table order.
It fetches all rows from order that match %foobar%.
For each such row, find the row(s) in order_item. Now it has some number of rows (possibly more than 200) with which to do the aggregates.
Perform the aggregates - COUNT, SUM, GROUP_CONCAT. (Actually this will probably be done as it gathers the rows -- another optimization.)
There is now 1 row (with an unpredictable value for a.po_num).
Skip 0 rows for the OFFSET part of the LIMIT. (OK, another out-of-order thingie.)
Deliver up to 200 rows. (There is only 1.)
Add ORDER BY (but no GROUP BY) -- big deal, sort the 1 row.
Add GROUP BY (but no ORDER BY) in, now you may have more than 200 rows coming out, and it can stop short.
Add GROUP BY and ORDER BY and they are identical, then it may have to do a sort for the grouping, but not for the ordering, and it may stop at 200.
Add GROUP BY and ORDER BY and they are not identical, then it may have to do a sort for the grouping, and will have to re-sort for the ordering, and cannot stop at 200 until after the ORDER BY. That is, virtually all the work is performed on all the data.
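For reference, here is how the query might look with an explicit GROUP BY, so the aggregates are computed per po_num rather than across everything the WHERE lets through (a sketch; it assumes store lives on `order` and web_sku on `order_item`, which the original query leaves unqualified):
SELECT a.`po_num`,
COUNT(*) AS item_count,
SUM(b.`quantity`) AS total_quantity,
GROUP_CONCAT(b.`web_sku` SEPARATOR ' ') AS web_skus
FROM `order` a
INNER JOIN `order_item` b ON a.`order_id` = b.`order_key`
WHERE a.`store` LIKE '%foobar%'
GROUP BY a.`po_num`
LIMIT 200 OFFSET 0;
With no ORDER BY, this is the case above where the server may stop short once 200 groups have been produced.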
Oh, and all of this gets worse if you don't have the optimal index. Oh, did I fail to insist on providing SHOW CREATE TABLE?
I apologize for my tone. I have thrown quite a few tips in your direction; please learn from them.

SORTING OUT MULTIPLE SIMILAR TABLES IN A MYSQL VIEW

Hi all,
I have 2 similar, very LARGE tables (1M rows each) with the same layout, and I would like to UNION them and sort by a common column, start. I would also like to put a condition on start, i.e. start > X.
The problem is that the view does not take advantage of start's index, so the complexity rises sharply: a simple query takes about 15 seconds, and adding a LIMIT does not fix it because the results are cut off first.
CREATE VIEW CDR AS
(SELECT start, duration, clid FROM cdr_md ORDER BY start LIMIT 1000)
UNION ALL
(SELECT start, duration, clid FROM cdr_1025 ORDER BY start LIMIT 1000)
ORDER BY start;
A query such as:
SELECT * FROM CDR WHERE start>10
does not return the expected results, because the LIMIT keyword cuts rows off beforehand.
The expected results would be those of a query like this:
CREATE VIEW CDR AS
(SELECT start, duration, clid FROM cdr_md WHERE start>X ORDER BY start LIMIT 1000)
UNION ALL
(SELECT start, duration, clid FROM cdr_1025 WHERE start>X ORDER BY start LIMIT 1000)
ORDER BY start;
Is there a way to avoid this problem?
Thanks all,
Fabrizio
I have 2 similar tables ... with the same layout
This is contrary to the Principle of Orthogonal Design.
Don't do it, at least not without very good reason. With suitable indexes, 1 million records per table is easily enough for MySQL to handle without any need for partitioning; and even if one did need to partition the data, there are better ways than this manual kludge (which can give rise to ambiguous, potentially inconsistent data and lead to redundancy and complexity in your data manipulation code).
Instead, consider combining your tables into a single one with suitable columns to distinguish the records' differences. For example:
CREATE TABLE cdr_combined AS
SELECT *, 'md' AS orig FROM cdr_md
UNION ALL
SELECT *, '1025' AS orig FROM cdr_1025
;
DROP TABLE cdr_md, cdr_1025;
If you will always be viewing your data along the previously "partitioned" axis, include the distinguishing columns as index prefixes and performance will generally improve versus having separate tables.
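For example, with the combined table above, an index that leads with the distinguishing column might look like this (a sketch; the index name and the choice of start as the second column are assumptions based on the queries shown):
ALTER TABLE cdr_combined
ADD INDEX idx_orig_start (orig, start);
-- Queries that filter on orig (the old per-table axis) and then
-- range-scan or sort on start can use this index end to end.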
You then won't need to perform any UNION and your VIEW definition effectively becomes:
CREATE VIEW CDR AS
SELECT start, duration, clid FROM cdr_combined ORDER BY start
However, be aware that queries on views may not always perform as well as using the underlying tables directly. As documented under Restrictions on Views:
View processing is not optimized:
It is not possible to create an index on a view.
Indexes can be used for views processed using the merge algorithm. However, a view that is processed with the temptable algorithm is unable to take advantage of indexes on its underlying tables (although indexes can be used during generation of the temporary tables).
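If you do keep a view over the combined table, you can at least ask for the merge algorithm explicitly (a sketch; MySQL silently falls back to TEMPTABLE when the view body rules merging out, e.g. with UNION, GROUP BY, or aggregates):
CREATE ALGORITHM = MERGE VIEW CDR AS
SELECT start, duration, clid
FROM cdr_combined;
-- ORDER BY is left to the outer query, keeping index access flexible.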

Best way to count rows from mysql database

After facing a slow loading-time issue with a MySQL query, I'm now looking for the best way to count row numbers. I had stupidly used the mysql_num_rows() function to do this and have now realized it's the worst way to do it.
I was actually building pagination in PHP.
I have found several ways to count the number of rows, but I'm looking for the fastest way to do it.
The table type is MyISAM.
So the question now is:
Which is the best and fastest way to count -
1. `SELECT count(*) FROM table_name`
2. `SELECT TABLE_ROWS
FROM INFORMATION_SCHEMA.TABLES WHERE table_schema = 'database_name'
AND table_name LIKE 'table_name'`
3. `SHOW TABLE STATUS LIKE 'table_name'`
4. `SELECT FOUND_ROWS()`
If there are other, better ways to do this, please let me know them as well.
If possible, please describe along with the answer why it is the best and fastest, so that I can understand and use the right method based on my requirements.
Thanks.
Quoting the MySQL Reference Manual on COUNT
COUNT(*) is optimized to return very quickly if the SELECT retrieves
from one table, no other columns are retrieved, and there is no WHERE
clause. For example:
mysql> SELECT COUNT(*) FROM student;
This optimization applies only to
MyISAM tables, because an exact row count is stored for this
storage engine and can be accessed very quickly. For transactional
storage engines such as InnoDB, storing an exact row count is more
problematic because multiple transactions may be occurring, each of
which may affect the count.
Also read this question
MySQL - Complexity of: SELECT COUNT(*) FROM MyTable;
I would start by using SELECT count(*) FROM table_name because it is the most portable and the easiest to understand, and because it is likely that the DBMS developers optimise common idiomatic queries of this sort.
Only if that wasn't fast enough would I benchmark the approaches you list to find if any were significantly faster.
It's slightly faster to count a constant:
select count('x') from table;
When the parser hits count(*), it has to go figure out which columns of the table are represented by the * and get ready to accept them inside count().
Using a constant bypasses this (albeit slight) column-checking overhead.
As an aside, although not faster, one cute option is:
select sum(1) from table;
I've looked around quite a bit for this recently. It seems there are a few options here that I'd never seen before.
Special needs: this database is about 6 million records and is getting crushed by multi-insert queries all the time, so getting a true count is difficult to say the least.
SELECT TABLE_ROWS FROM INFORMATION_SCHEMA.TABLES WHERE table_schema = 'admin_worldDomination' AND table_name LIKE 'master'
Showing rows 0 - 0 ( 1 total, Query took 0.0189 sec)
This is decent: very fast but inaccurate. It showed results ranging from 4 million to almost 8 million rows.
SELECT count( * ) AS counter FROM `master`
No time displayed; it took 8 seconds of real time and will get much worse as the table grows. This had been killing my site prior to today.
SHOW TABLE STATUS LIKE 'master'
Seems to be as fast as the first, though no time is displayed. It offers lots of other table information, but not much of it is worth anything (average record length, maybe).
SELECT FOUND_ROWS() FROM `master`
Showing rows 0 - 29 ( 4,824,232 total, Query took 0.0004 sec)
This is good, but it's an average. The spread is closer than the others (4-5 million), so I'll probably end up taking a sample from a few of these queries and averaging.
EDIT: This was really slow when doing the query in PHP, so I ended up going with the first. The query runs 30 times quickly and I take an average; under 1 second... it still ranges between 5.3 and 5.5 million.
One idea I had, to throw this out there, is to try to find a way to estimate the row count. Since it's just to give your user an idea of the number of pages, maybe you don't need to be exact and could even say Page 1 of ~4837 or Page 1 of about 4800 or something.
I couldn't quickly find an estimate-count function, but you could try getting the table size and dividing by a determined/constant average row size. I don't know whether, or why, getting the table size from TABLE STATUS would be faster than getting the row count from TABLE STATUS.
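A sketch of that estimation idea using INFORMATION_SCHEMA (the schema and table names are placeholders; on InnoDB both TABLE_ROWS and AVG_ROW_LENGTH are themselves estimates, so this is only as accurate as the statistics behind it):
SELECT TABLE_ROWS,
FLOOR(DATA_LENGTH / NULLIF(AVG_ROW_LENGTH, 0)) AS rows_from_size
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'database_name'
AND TABLE_NAME = 'table_name';
-- rows_from_size is the table size divided by the average row size:
-- the back-of-the-envelope estimate described above.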

Count from a table, but stop counting at a certain number

Is there a way in MySQL to COUNT(*) from a table where if the number is greater than x, it will stop counting there? Basically, I only want to know if the number of records returned from a query is more or less than a particular number. If it's more than that number, I don't really care how many rows there are, if it's less, tell me the count.
I've been able to fudge it like this:
-- let x be 100
SELECT COUNT(*) FROM (
SELECT `id` FROM `myTable`
WHERE myCriteria = 1
LIMIT 100
) AS temp
...but I was wondering if there was some handy built-in way to do this?
Thanks for the suggestions, but I should have been clearer about the reasons behind this question. It's selecting from a couple of joined tables, each with tens of millions of records. Running COUNT(*) using an indexed criterion still takes about 80 seconds; running one without an index takes about 30 minutes. It's more about optimising the query than getting the correct output.
SELECT * FROM WhateverTable WHERE WhateverCriteria
LIMIT 100, 1
LIMIT 100, 1 returns the 101st record, if there is one, and no record otherwise. You might be able to use the above query as a sub-query in EXISTS clauses, if that helps.
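For instance, wrapped in EXISTS it might look like this (a sketch reusing the question's table and criteria; it returns 1 when more than 100 rows match and 0 otherwise):
SELECT EXISTS (
SELECT `id` FROM `myTable`
WHERE myCriteria = 1
LIMIT 100, 1
) AS has_more_than_100;
-- The subquery can only ever produce the 101st matching row,
-- so EXISTS answers "are there more than 100?" without a full count.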
I can't think of anything. It looks to me like what you're doing exactly fulfils the purpose, and SQL certainly doesn't seem to go out of its way to make it easy for you to do this more succinctly.
Consider this: what you're trying to do doesn't make sense in a strict context of set arithmetic. Mathematically, the answer is to count everything and then take the minimum of that count and 100, which is what you're (pragmatically and with good reason) trying to avoid.
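Spelled out in SQL, that mathematically correct version would be something like this (a sketch; LEAST() stands in for the minimum-with-100 idea, and it deliberately takes the full-count route the question is trying to avoid):
SELECT LEAST(COUNT(*), 100) AS capped_count
FROM `myTable`
WHERE myCriteria = 1;
-- Counts every matching row first, then caps the result at 100.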
This works:
select count(*) from ( select * from stockinfo s limit 100 ) s
but was not any faster (that I could tell) than just:
select count(*) from stockinfo
which returned 5170965.