I have a problem with paginating two large tables:
Receipts table: id, receipt_date, record_details (650k records)
Z Reports table: id, receipt_date, record_details (88k records)
What I want to do is combine both tables with a UNION, sort the result by receipt_date, and then paginate it. Currently I have this SQL (not exactly, but this is the main idea):
SELECT c.id, c.receipt_date, c.coltype FROM (
SELECT a.id, a.receipt_date, 'receipt' AS coltype
FROM `terminal_receipts` a
WHERE `a`.`deleted` IS NULL
UNION ALL
SELECT b.id, b.receipt_date, 'zreport' AS coltype
FROM `z_reports` b WHERE `b`.`deleted` IS NULL
) c
ORDER BY receipt_date DESC LIMIT 50 OFFSET 0
This way, the server selects all records from two tables, orders them by date and then applies the pagination.
But as the row counts increase, this query will take longer to complete. Is there another algorithm that gives the same result without depending on the table sizes?
There is a technique called the Seek Method, which you can read about here. According to it, you need to identify a set of columns that uniquely identifies each row. This set of columns is then used in a predicate when searching through the database.
The link I mentioned has some examples, but here's another, very simple one:
CREATE TABLE IF NOT EXISTS `docs` (
`id` int(6) unsigned NOT NULL,
`rev` int(3) unsigned NOT NULL,
`content` varchar(200) NOT NULL,
PRIMARY KEY (`id`,`rev`)
) DEFAULT CHARSET=utf8;
INSERT INTO `docs` (`id`, `rev`, `content`) VALUES
('1', '1', 'The earth is flat'),
('2', '1', 'One hundred angels can dance on the head of a pin'),
('1', '2', 'The earth is flat and rests on a bull\'s horn'),
('1', '3', 'The earth is like a ball.');
SELECT *
FROM `docs`
ORDER BY rev, id
LIMIT 2;
SELECT *
FROM `docs`
WHERE (rev, id) > (1, 2) # let's use the last row from the previous select with the predicate
ORDER BY rev, id
LIMIT 2;
SELECT *
FROM `docs`
WHERE (rev, id) > (3, 1) # same idea
ORDER BY rev, id
LIMIT 2;
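Having an index whose columns match the ORDER BY (here, one on (rev, id)) will further speed up pagination, because the database can jump straight to the first row of each page instead of sorting the whole table.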
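To apply the same idea to the UNION from the question, one option is to let each branch fetch only its own next 50 rows and then merge those. This is a sketch rather than a tested drop-in: it assumes id is unique within each table, the index names and the @last_* sample values are made up for illustration, and the sort key (receipt_date, coltype, id) is my own choice of tie-breaker.

```sql
-- Assumed supporting indexes (hypothetical names), so that each branch can
-- read its newest rows from the index instead of sorting the whole table.
CREATE INDEX idx_receipts_date_id ON terminal_receipts (receipt_date, id);
CREATE INDEX idx_zreports_date_id ON z_reports (receipt_date, id);

-- Sort key of the last row shown on the previous page (hypothetical values).
SET @last_date = '2020-05-01 18:00:00';
SET @last_type = 'receipt';
SET @last_id   = 14116299;

SELECT c.id, c.receipt_date, c.coltype
FROM (
    -- Each branch returns at most 50 rows that sort after the last seen row.
    (SELECT a.id, a.receipt_date, 'receipt' AS coltype
     FROM terminal_receipts a
     WHERE a.deleted IS NULL
       AND (a.receipt_date, 'receipt', a.id) < (@last_date, @last_type, @last_id)
     ORDER BY a.receipt_date DESC, a.id DESC
     LIMIT 50)
    UNION ALL
    (SELECT b.id, b.receipt_date, 'zreport' AS coltype
     FROM z_reports b
     WHERE b.deleted IS NULL
       AND (b.receipt_date, 'zreport', b.id) < (@last_date, @last_type, @last_id)
     ORDER BY b.receipt_date DESC, b.id DESC
     LIMIT 50)
) c
-- Merge the (at most) 100 candidate rows and keep the next page.
ORDER BY c.receipt_date DESC, c.coltype DESC, c.id DESC
LIMIT 50;
```

For the first page, leave out the two row-comparison predicates. As far as I know, MySQL does not always use an index range scan for row comparisons with < or >, so if EXPLAIN shows a full scan it may help to expand the predicate by hand, e.g. a.receipt_date < @last_date OR (a.receipt_date = @last_date AND ...).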
I hope this helps.
Related
I have a database where I save information about my products. I use a query for getting those products from my table. The query looks like this:
SELECT * FROM products WHERE stock > 0 ORDER BY RAND();
This query returns all the products that have stock > 0 in a random order, and it works ok. However, now I want to get those products with stock = 0, but I want them to appear at the end of the query (also in a random way but always after products that have stock > 0). So I tried a new query which looks like this:
(SELECT * FROM products WHERE stock > 0 ORDER BY RAND())
UNION
(SELECT * FROM products WHERE stock = 0 ORDER BY RAND());
...this query returns the zero-stock products at the end, but it seems to ignore the ORDER BY RAND() clauses and I always get them in the same order. So my question is: how can I get a random response from the query while maintaining the condition that zero-stock products come last?
You don't need UNION:
SELECT *
FROM products
ORDER BY stock = 0, RAND();
The condition stock = 0 in the ORDER BY clause makes sure that the zero-stock products are placed last and the 2nd level of sorting with RAND() randomizes the rows in each of the 2 groups.
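As for why the UNION version did not shuffle: the MySQL manual says that an ORDER BY inside a parenthesized UNION branch without a LIMIT is optimized away, because a UNION result is unordered anyway. If you do want to keep the UNION shape, a sketch like the following should behave the same as the query above. The out_of_stock column and the alias u are names I made up for the example.

```sql
SELECT u.*
FROM (
    -- Tag each branch with a sort key instead of ordering inside the branch.
    SELECT p.*, 0 AS out_of_stock FROM products p WHERE p.stock > 0
    UNION ALL
    SELECT p.*, 1 AS out_of_stock FROM products p WHERE p.stock = 0
) u
-- One ORDER BY on the outer query: in-stock rows first, random within each group.
ORDER BY u.out_of_stock, RAND();
```

Note that the result carries the extra out_of_stock column; list the columns explicitly in the outer SELECT if you do not want it.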
Use a CASE expression to create a column to order by, e.g.:
CREATE TABLE IF NOT EXISTS `products` (
`id` int(6) unsigned NOT NULL,
`stock` int(3) unsigned NOT NULL,
`product` varchar(200) NOT NULL,
PRIMARY KEY (`id`,`product`)
) DEFAULT CHARSET=utf8;
INSERT INTO `products` (`id`, `stock`, `product`) VALUES
('1', '10', 'Timber'),
('2', '12', 'Nails'),
('1', '0', 'Glue'),
('1', '0', 'Left handed wrench.');
And run
SELECT stock, product, case when stock > 0 then 1 else 2 end as SetOrder
FROM products
ORDER BY SetOrder, RAND()
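Gets you (the order within each SetOrder group changes from run to run):
| stock | product             | SetOrder |
|-------|---------------------|----------|
| 10    | Timber              | 1        |
| 12    | Nails               | 1        |
| 0     | Glue                | 2        |
| 0     | Left handed wrench. | 2        |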
I have two tables. This one is books,
and this one is payment.
Now I want to select the records in books that have a similar title (e.g. select * from a where a.field like '%blabla%') or even the same title, but that do not exist in payment.
I tried NOT EXISTS, but the query takes so long to execute that I'm not sure it is the right approach.
Given the information, I have tried to put together an example. I hope this is helpful and gets you close to what you want.
CREATE TABLE books
(`number` int, `verification_date` date, `title` varchar(6))
;
INSERT INTO books
(`number`, `verification_date`, `title`)
VALUES
(14116299, '2020-05-01 18:00:00', 'Title1'),
(12331189, '2020-07-01 18:00:00', 'Title2'),
(13123321, NULL, 'Title4'),
(12318283, '2020-12-31 18:00:00', 'Title3'),
(12318284, '2021-01-31 18:00:00', 'Title2')
;
CREATE TABLE payments
(`number` int, `title` varchar(6), `pay_date` date)
;
INSERT INTO payments
(`number`, `title`, `pay_date`)
VALUES
(14116299, 'Title1', '2020-05-01 18:00:00'),
(12318283, 'Title3', '2020-12-31 17:00:00')
;
We select all columns from books and keep only the records that have no match in the payments table. More info on this: How to select rows with no matching entry in another table?. An additional condition in the WHERE clause then filters the books by title.
SELECT b.*
FROM books b
LEFT JOIN payments p ON b.number = p.number
WHERE p.number IS NULL
AND b.title LIKE '%2';
Output:
| number   | verification_date | title  |
|----------|-------------------|--------|
| 12331189 | 2020-07-01        | Title2 |
| 12318284 | 2021-01-31        | Title2 |
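Since the question mentions that NOT EXISTS was slow, here is the equivalent NOT EXISTS form of the query above for comparison. It is only a sketch: the index name is made up, and I am assuming the slowness came from a missing index on the join column rather than from NOT EXISTS itself.

```sql
-- Assumed index (hypothetical name) so each lookup into payments is an index probe.
CREATE INDEX idx_payments_number ON payments (number);

SELECT b.*
FROM books b
WHERE b.title LIKE '%2'
  AND NOT EXISTS (
        -- True only when no payment row shares this book's number.
        SELECT 1
        FROM payments p
        WHERE p.number = b.number
      );
```

On the sample data this should return the same two Title2 rows. Keep in mind that a LIKE pattern with a leading % cannot use an ordinary index on title, so the books side is still scanned in full.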
I'm trying to write a SQL query, but I can't get it to work, and I can't find an example like mine.
I have a simple table People with 3 columns and 7 records:
For each team, I'd like to get the average points of the 2 best people.
My Query:
SELECT team
, (SELECT AVG(point)
FROM People t2
WHERE t1.team = t2.team
ORDER
BY point DESC
LIMIT 2) as avg
FROM People t1
GROUP
BY team
Current result: (average on all people of each team)
Apparently, it's not possible to use a LIMIT in the subquery this way: "ORDER BY point DESC LIMIT 2" is ignored.
Result expected:
I want the average points of the 2 best people (those with the highest points) in each team, not the average points of all people in each team.
How can I do that? If anyone has any ideas...
I'm on a MySQL database.
Link of Fiddle : http://sqlfiddle.com/#!9/8c80ef/1
Thanks !
You can try this.
Use a correlated subquery to count, for each row, how many rows of the same team rank at or above it when ordered by point descending.
Then keep only the top 2 rows of each team; if you want a different top N, just change the number in the WHERE clause.
CREATE TABLE `People` (
`id` int(11) NOT NULL,
`name` varchar(20) NOT NULL,
`team` varchar(20) NOT NULL,
`point` int(4) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `People` (`id`, `name`, `team`, `point`) VALUES
(1, 'Luc', 'Jupiter', 10),
(2, 'Marie', 'Saturn', 0),
(3, 'Hubert', 'Saturn', 0),
(4, 'Albert', 'Jupiter', 50),
(5, 'Lucy', 'Jupiter', 50),
(6, 'William', 'Saturn', 20),
(7, 'Zeus', 'Saturn', 40);
ALTER TABLE `People`
ADD PRIMARY KEY (`id`);
ALTER TABLE `People`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=8;
Query 1:
SELECT team, AVG(point) AS avg_point
FROM People t1
WHERE (
    -- number of teammates ranked at or above this row (ties broken by id)
    SELECT COUNT(*)
    FROM People t2
    WHERE t2.team = t1.team
      AND (t2.point > t1.point
           OR (t2.point = t1.point AND t2.id >= t1.id))
) <= 2 -- if you want a different top N, just change this number
GROUP BY team
Results:
| team    | avg_point |
|---------|-----------|
| Jupiter | 50        |
| Saturn  | 30        |
This is a pain in MySQL. If you want the two highest point values, you can do:
SELECT p.team, AVG(p.point)
FROM People p
WHERE p.point >= (SELECT DISTINCT p2.point
                  FROM People p2
                  WHERE p2.team = p.team
                  ORDER BY p2.point DESC
                  LIMIT 1, 1 -- get the second one
                 )
GROUP BY p.team;
Ties make this tricky, and your question isn't clear on what to do about them.
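If you are on MySQL 8.0 or later, window functions make the "top 2 per team" step explicit. This is a sketch under that version assumption; the aliases ranked, rn and avg_top2 are mine, and breaking ties by id is an arbitrary choice.

```sql
SELECT team, AVG(point) AS avg_top2
FROM (
    SELECT team,
           point,
           -- Number the rows of each team from highest to lowest point;
           -- id breaks ties so exactly two rows per team get rn <= 2.
           ROW_NUMBER() OVER (PARTITION BY team ORDER BY point DESC, id) AS rn
    FROM People
) ranked
WHERE rn <= 2
GROUP BY team;
```

With the sample data this should average 50 and 50 for Jupiter and 40 and 20 for Saturn. If tied rows should all count, swapping ROW_NUMBER() for DENSE_RANK() keeps every row that has one of the two highest point values.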
I'm trying to find duplicates in a table and, for each one, select the row with the lowest combination of values.
So far I'm only able to select the result that has the lowest value in one column using MIN(). I thought it would be easy to just replace MIN with LEAST and change the columns.
Here's a layout:
CREATE TABLE `index`.`products` ( `id` INT NOT NULL AUTO_INCREMENT , `name` VARCHAR(10) NOT NULL , `price` INT NOT NULL , `availability` INT NOT NULL , PRIMARY KEY (`id`)) ENGINE = InnoDB;
INSERT INTO `products` (`id`, `name`, `price`, `availability`) VALUES
(NULL, 'teste', '10', '1'),
(NULL, 'teste', '5', '2'),
(NULL, 'teste', '3', '3');
The simplified layout:
id - name - price - availability
1 - teste - 10 - 1
2 - teste - 5 - 2
3 - teste - 3 - 3
using the following query:
select name, MIN(price) from products group by name having count(*) > 1
gets me the lowest price. I'm trying to get the lowest price and lowest availability.
select name, LEAST(price, availability) from products group by name having count(*) > 1
This doesn't work.
Clarification: I want to select the row with the lowest price and lowest availability. In this case it should be the first one, I guess.
I should clarify that 1 = available, 2 = not available and 3 = coming soon.
The statement to select lowest price for the best availability is:
set sql_mode=only_full_group_by;
SELECT
name, MIN(price), availability
FROM
products
JOIN
(
SELECT
name, MIN(availability) availability
FROM
products
GROUP BY name
) as x
USING (name , availability)
GROUP BY name , availability;
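The reason the LEAST attempt fails is that LEAST() is a per-row function that compares its arguments within a single row, whereas MIN() is an aggregate over the rows of a group. If you need the whole row (including id) rather than just the two values, a window-function version may be easier to read. This sketch assumes MySQL 8.0 or later; the aliases ranked, rn and copies are made up.

```sql
SELECT id, name, price, availability
FROM (
    SELECT p.*,
           -- Best availability first, then lowest price; id breaks exact ties.
           ROW_NUMBER() OVER (PARTITION BY name
                              ORDER BY availability, price, id) AS rn,
           -- How many rows share this name, to keep only real duplicates.
           COUNT(*) OVER (PARTITION BY name) AS copies
    FROM products p
) ranked
WHERE rn = 1
  AND copies > 1;
```

On the three sample rows this should pick the row with id 1 (price 10, availability 1), which is the one the question expects.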
So I am trying to calculate the proportion of repeat customers in my system per restaurant. A repeat customer is a user (identified by their email address, eo_email) who has ordered more than once from that restaurant. Examples are below the schema.
Here is the table that represents my restaurants
CREATE TABLE IF NOT EXISTS `lf_restaurants` (
`r_id` int(8) NOT NULL AUTO_INCREMENT,
`r_name` varchar(128) NOT NULL,
PRIMARY KEY (`r_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 ;
INSERT INTO `lf_restaurants` (`r_id`, `r_name`) VALUES
('1', 'Restaurant X'),
('2', 'Cafe Y');
And this is my orders table
CREATE TABLE IF NOT EXISTS `ecom_orders` (
`eo_id` mediumint(9) NOT NULL AUTO_INCREMENT,
`eo_ref_id` varchar(12) NOT NULL,
`eo_email` varchar(255) NOT NULL,
`eo_order_parent` int(11) NOT NULL,
PRIMARY KEY (`eo_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ;
INSERT INTO `ecom_orders` (`eo_id`, `eo_ref_id`, `eo_email`, `eo_order_parent`) VALUES
(NULL, '', 'a#a.com', '1'),
(NULL, '', 'a#a.com', '1'),
(NULL, '', 'a#a.com', '1'),
(NULL, '', 'a#a.com', '1'),
(NULL, '', 'a#a.com', '1'),
(NULL, '', 'b#b.com', '1'),
(NULL, '', 'b#b.com', '1'),
(NULL, '', 'c#c.com', '1'),
(NULL, '', 'd#d.com', '1'),
(NULL, '', 'e#e.com', '1'),
(NULL, '', 'a#a.com', '2'),
(NULL, '', 'c#c.com', '2'),
(NULL, '', 'c#c.com', '2'),
(NULL, '', 'e#e.com', '2');
So Restaurant X (r_id 1) has 10 orders. Users a#a.com and b#b.com have ordered from that restaurant multiple times, and c#c.com, d#d.com, and e#e.com have only ordered once, so it would need to return 40%
Cafe Y (r_id 2) has 4 orders. User c#c.com has ordered twice, users a#a.com and e#e.com have only ordered once, so it would need to return 33%
I am not sure that posting what I have so far will do much good, as I keep running into 'Subquery has more than 1 result', or, if I wrap that subquery in its own dummy query with a count, it won't let me use fields I need from the main query such as r_id. But here goes:
SELECT r_name,
(SELECT COUNT(*) AS cnt_users
FROM (
SELECT *
FROM ecom_orders
WHERE eo_order_parent = r_id
GROUP BY eo_email
) AS cnt_dummy
) AS num_orders,
(SELECT COUNT(*) AS cnt
FROM ecom_orders
WHERE eo_order_parent = r_id
GROUP BY eo_order_parent, eo_email
) AS num_rep_orders
FROM lf_restaurants
ORDER BY num_orders DESC
The num_orders subquery says it doesn't recognise r_id; I am guessing this is due to the order in which things are executed.
The num_rep_orders subquery comes back with multiple rows, but really I want it to come back with a single value. I could do that if I wrote it like the num_orders subquery, but then I would run into the "r_id doesn't exist" problem.
So my question is: how do I get these two values without running into "subquery has more than 1 row" or "r_id does not exist"?
Then from those 2 values I can work out the percentage and all should be gravy :) Any help much appreciated!
Okay. So let's start with getting the number of repeating customers.
SELECT eo_order_parent, eo_email, COUNT(eo_email) AS orders FROM ecom_orders
GROUP BY eo_order_parent, eo_email
HAVING orders > 1;
And the total number of different customers:
SELECT eo_order_parent, COUNT(DISTINCT eo_email) FROM ecom_orders
GROUP BY eo_order_parent;
But we can do this in one go:
SELECT eo_order_parent,
SUM(CASE WHEN orders > 1 THEN 1 ELSE 0 END) AS repeats,
SUM(1) AS total FROM
(
SELECT eo_order_parent, eo_email, COUNT(*) AS orders FROM ecom_orders
GROUP BY eo_order_parent, eo_email
) AS eo_group_1
GROUP BY eo_order_parent;
This gives:
+-----------------+---------+-------+
| eo_order_parent | repeats | total |
+-----------------+---------+-------+
| 1 | 2 | 5 |
| 2 | 1 | 3 |
+-----------------+---------+-------+
2 rows in set (0.00 sec)
Then 2/5 is your 40%, and 1/3 is 33%.
The following query computes the number of repeat customers and the total number of customers per restaurant:
SELECT
    u.r_id,
    u.r_name,
    SUM(u.no_orders > 1) AS repeats,
    SUM(u.no_orders) AS orders,
    COUNT(u.eo_email) AS customers
FROM (
    -- one row per restaurant/customer pair, with that customer's order count
    SELECT
        r.r_id,
        r.r_name,
        o.eo_email,
        COUNT(o.eo_id) AS no_orders
    FROM lf_restaurants r
    LEFT JOIN ecom_orders o ON o.eo_order_parent = r.r_id
    GROUP BY r.r_id, r.r_name, o.eo_email
) u
GROUP BY
    u.r_id, u.r_name;
The subquery first computes the number of orders per customer/restaurant pair. From this, the outer query computes the number of repeating customers, the total number of orders and the total number of customers per restaurant. You can also compute the percentage (but this does not have to be done in the query).
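If you do want the percentage straight from the database, something along these lines should work. It is a sketch only: the alias g and the column names customers, repeats and repeat_pct are my own, and rounding to a whole percent is an assumption about the output you want.

```sql
SELECT
    r.r_id,
    r.r_name,
    COUNT(g.eo_email)              AS customers,
    COALESCE(SUM(g.orders > 1), 0) AS repeats,
    -- NULLIF avoids dividing by zero for restaurants without any orders;
    -- their percentage comes back as NULL.
    ROUND(100 * SUM(g.orders > 1) / NULLIF(COUNT(g.eo_email), 0)) AS repeat_pct
FROM lf_restaurants r
LEFT JOIN (
    -- one row per restaurant/customer pair with that customer's order count
    SELECT eo_order_parent, eo_email, COUNT(*) AS orders
    FROM ecom_orders
    GROUP BY eo_order_parent, eo_email
) g ON g.eo_order_parent = r.r_id
GROUP BY r.r_id, r.r_name;
```

With the sample data, Restaurant X has 2 repeat customers out of 5 and Cafe Y has 1 out of 3, so repeat_pct should come out as 40 and 33.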