How to combine sql request with data in another table - mysql

I have two tables, first "users_counts"
id int(11) AUTO_INCREMENT
name varchar(250)
And I have second table "counts_data"
id int(11) AUTO_INCREMENT
id_user int(11)
count int(11)
date datetime
I want to select all records from the first table, get some data from the second, and merge them. For a single request I want to create two temporary columns: one holding the latest count (ordered by date in the second table) and one holding the penultimate count (same ordering).
INSERT INTO `users_counts` (`id`,`name`) VALUES ('1','John');
INSERT INTO `users_counts` (`id`,`name`) VALUES ('2','Michael');
INSERT INTO `users_counts` (`id`,`name`) VALUES ('3','Den');
INSERT INTO `counts_data` (`id`,`id_user`, `count`, `date`) VALUES ('1','1', '200', '2012.09.09');
INSERT INTO `counts_data` (`id`,`id_user`, `count`, `date`) VALUES ('2','1', '212', '2012.09.01');
INSERT INTO `counts_data` (`id`,`id_user`, `count`, `date`) VALUES ('3','2', '20', '2012.01.09');
INSERT INTO `counts_data` (`id`,`id_user`, `count`, `date`) VALUES ('4','3', '210', '2012.02.09');
INSERT INTO `counts_data` (`id`,`id_user`, `count`, `date`) VALUES ('5','3', '2033', '2012.03.09');
INSERT INTO `counts_data` (`id`,`id_user`, `count`, `date`) VALUES ('6','3', '1', '2012.04.09');
In the end, after a request I want to get something like this
id name count count_before
1 John 200 212
2 Michael 20 0
3 Den 1 2033
Thanks.

Another possible way to do this:
select uc.id,
uc.name,
(select count
from counts_data cd
where cd.id_user = uc.id
order by date desc limit 1) as count,
ifnull((select count
from counts_data cd
where cd.id_user = uc.id
order by date desc limit 1 offset 1),0) as count_before
from users_counts uc;
Since you only need one value from counts_data for each row, you can use inline (correlated) subqueries in MySQL.
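To check the correlated-subquery approach against the sample rows from the question, here is a minimal sketch. It assumes an in-memory SQLite database standing in for MySQL, ISO-formatted dates, and a hypothetical helper name last_two_counts; the SQL is otherwise the same idea as the query above.

```python
import sqlite3

def last_two_counts():
    """Return (id, name, count, count_before) for every user."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here (assumption)
    con.executescript("""
        CREATE TABLE users_counts (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE counts_data (id INTEGER PRIMARY KEY, id_user INTEGER,
                                  "count" INTEGER, "date" TEXT);
        INSERT INTO users_counts VALUES (1, 'John'), (2, 'Michael'), (3, 'Den');
        INSERT INTO counts_data VALUES
            (1, 1, 200,  '2012-09-09'), (2, 1, 212,  '2012-09-01'),
            (3, 2, 20,   '2012-01-09'),
            (4, 3, 210,  '2012-02-09'), (5, 3, 2033, '2012-03-09'),
            (6, 3, 1,    '2012-04-09');
    """)
    rows = con.execute("""
        SELECT uc.id, uc.name,
               -- newest count for this user
               (SELECT cd."count" FROM counts_data cd
                 WHERE cd.id_user = uc.id
                 ORDER BY cd."date" DESC LIMIT 1) AS "count",
               -- second newest count, or 0 when the user has only one row
               IFNULL((SELECT cd."count" FROM counts_data cd
                        WHERE cd.id_user = uc.id
                        ORDER BY cd."date" DESC LIMIT 1 OFFSET 1), 0) AS count_before
          FROM users_counts uc
         ORDER BY uc.id
    """).fetchall()
    con.close()
    return rows

print(last_two_counts())
```

The rows come back in the shape the question asks for: one row per user, with 0 as count_before when there is no penultimate record.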

select uc.id
, uc.name
, cd1.count
, coalesce(cd3.count, 0) as count_before
from users_counts uc
left join
counts_data cd1
on cd1.id_user = uc.id
and cd1.date =
(
select max(date)
from counts_data cd2
where cd2.id_user = uc.id
)
left join
counts_data cd3
on cd3.id_user = uc.id
and cd3.date =
(
select max(date)
from counts_data cd4
where cd4.id_user = uc.id
and cd4.date <> cd1.date
)

Related

How to get percentage of result set for each day?

I am trying to retrieve the percentage of available products at specific merchants over the last 30 days.
Desired result example:
20210504 merchant1 20%
20210504 merchant2 30%
20210505 merchant1 25%
20210505 merchant2 35%
There are 3 tables:
availability (containing availability info for each product and merchant and day)
products (where the manufacturer_id is, that we want to filter for)
merchants (merchant info)
Minimal example: https://www.db-fiddle.com/f/wtnK5R4DWi7Dy6LwLaP4mX/0
This returns the percentage for only one merchant and one day:
-- get percentage of available products per merchant over time
SELECT
m.name AS metric,
t.s AS AMOUNT_AVAILABLE,
count(*) AS AMOUNT_TOTAL,
t.s / count(*) AS percentage
FROM availability p
CROSS JOIN (
SELECT count(*) AS s FROM availability p2
INNER JOIN products mp on p2.SKU = mp.SKU
WHERE
availability = 'sofort lieferbar'
AND date = curdate() - interval 1 day -- testing for one day, but we want a time series
AND mp.MANUFACTURER_ID = 1
-- AND p2.merchant_id = p.merchant_id -- does not work
-- AND merchant_id = 2
-- GROUP BY merchant_id
) t
INNER JOIN products mp on p.SKU = mp.SKU
INNER JOIN merchants m ON m.id = p.MERCHANT_ID
WHERE
p.date = curdate() - interval 1 day
and mp.MANUFACTURER_ID = 1
-- and merchant_id = 2
GROUP BY
merchant_id
Now I am trying to somehow merge the cross join with the from table so I get the info for each merchant and day. How can a cross join be joined with the from table?
Data & Schema:
create table merchants
(
id tinyint unsigned not null
primary key,
name varchar(255) null
);
INSERT INTO merchants (id, name) VALUES (1, 'Amazon');
INSERT INTO merchants (id, name) VALUES (2, 'eBay');
create table availability
(
DATE date not null,
SKU char(10) not null,
merchant_id tinyint unsigned not null,
availability enum ('sofort lieferbar', 'verzögert lieferbar', 'nicht lieferbar', 'außer Handel') null,
constraint DATE
unique (DATE, SKU, merchant_id)
);
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-11', '1', 1, 'sofort lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-11', '1', 2, 'nicht lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-12', '1', 1, 'sofort lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-12', '1', 2, 'nicht lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-13', '1', 1, 'nicht lieferbar');
INSERT INTO test.availability (DATE, SKU, merchant_id, availability) VALUES ('2021-05-13', '1', 2, 'sofort lieferbar');
create table products
(
SKU char(8) not null
primary key,
NAME varchar(255) null,
MANUFACTURER_ID mediumint unsigned null,
updated datetime default CURRENT_TIMESTAMP not null on update CURRENT_TIMESTAMP
);
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('1', 'Sneaker', 1, '2021-05-12 02:27:46');
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('2', 'Ball', 1, '2021-05-12 02:27:46');
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('3', 'Pen', 2, '2021-05-12 02:27:46');
INSERT INTO test.products (SKU, NAME, MANUFACTURER_ID, updated) VALUES ('4', 'Paper', 2, '2021-05-12 02:27:46');
I have written a query which seems to work for the data you have provided. Let me know if there's any issue and I'll see what I can do.
SELECT CONCAT('merchant', t.ID) as merchant,
t.Date,
g.prod_available / t.all_prod_from_merch AS percentage_available
# t: total number of products per merchant and day in the time range
FROM (SELECT ID,
Date,
COUNT(merchant_ID) AS all_prod_from_merch
FROM merchants m
JOIN availability a
ON m.ID = a.merchant_ID
WHERE Date < CURDATE()
AND Date >= curdate() - INTERVAL 10 DAY
GROUP BY merchant_ID,
Date ) t
LEFT JOIN (SELECT merchant_ID,
Date,
COUNT(merchant_ID) AS prod_available
FROM availability
WHERE AVAILABILITY = 'sofort lieferbar'
AND date IN (SELECT Date
FROM availability
WHERE date < CURDATE()
AND date >= CURDATE() - INTERVAL 10 DAY
GROUP BY Date )
GROUP BY merchant_ID,
Date ) g
ON g.merchant_ID = t.ID
AND g.Date = t.Date
ORDER BY t.date;
The first select in the join gets the total number of products in the time range for each merchant. The second one gets those available from each merchant. So the select at the beginning just does the fraction.
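As an alternative to joining two aggregated subqueries, the same fraction can be computed in one pass with conditional aggregation. This is a different technique from the query above, sketched under some assumptions: an in-memory SQLite database stands in for MySQL, the date-range filter is left out so the fixed sample dates always match, and availability_share is a hypothetical helper name.

```python
import sqlite3

def availability_share():
    """Return (date, merchant, percent available) for manufacturer 1."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here (assumption)
    con.executescript("""
        CREATE TABLE merchants (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, manufacturer_id INTEGER);
        CREATE TABLE availability (date TEXT, sku TEXT, merchant_id INTEGER,
                                   availability TEXT);
        INSERT INTO merchants VALUES (1, 'Amazon'), (2, 'eBay');
        INSERT INTO products VALUES ('1', 'Sneaker', 1), ('2', 'Ball', 1),
                                    ('3', 'Pen', 2), ('4', 'Paper', 2);
        INSERT INTO availability VALUES
            ('2021-05-11', '1', 1, 'sofort lieferbar'),
            ('2021-05-11', '1', 2, 'nicht lieferbar'),
            ('2021-05-12', '1', 1, 'sofort lieferbar'),
            ('2021-05-12', '1', 2, 'nicht lieferbar'),
            ('2021-05-13', '1', 1, 'nicht lieferbar'),
            ('2021-05-13', '1', 2, 'sofort lieferbar');
    """)
    rows = con.execute("""
        SELECT a.date, m.name,
               -- the comparison evaluates to 1 or 0, so AVG is the available share
               100.0 * AVG(a.availability = 'sofort lieferbar') AS percentage
          FROM availability a
          JOIN products p ON p.sku = a.sku
          JOIN merchants m ON m.id = a.merchant_id
         WHERE p.manufacturer_id = 1
         GROUP BY a.date, m.id, m.name
         ORDER BY a.date, m.id
    """).fetchall()
    con.close()
    return rows

for row in availability_share():
    print(row)
```

Grouping by both date and merchant yields the time series directly, so no cross join is needed. In MySQL a date condition such as the one in the answer above would be added to the WHERE clause.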

Mysql, select many rows and assign different values to each row transactionally

I have a simple MySQL 8 table like this:
CREATE TABLE IF NOT EXISTS `test_codes` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`segment_id` int(11) unsigned NOT NULL,
`code_id` int(11) DEFAULT 0,
PRIMARY KEY (`id`)
);
The table is populated with some random values
INSERT INTO `test_codes` (`segment_id`) VALUES ('1');
INSERT INTO `test_codes` (`segment_id`) VALUES ('1');
INSERT INTO `test_codes` (`segment_id`) VALUES ('1');
INSERT INTO `test_codes` (`segment_id`) VALUES ('1');
INSERT INTO `test_codes` (`segment_id`) VALUES ('1');
INSERT INTO `test_codes` (`segment_id`) VALUES ('1');
INSERT INTO `test_codes` (`segment_id`) VALUES ('2');
INSERT INTO `test_codes` (`segment_id`) VALUES ('2');
INSERT INTO `test_codes` (`segment_id`) VALUES ('2');
INSERT INTO `test_codes` (`segment_id`) VALUES ('1');
INSERT INTO `test_codes` (`segment_id`) VALUES ('1');
My application has the following task: a request comes in with an array of codes, e.g. [1,200,10,18], and I need to get 4 rows (equal to the array size) from the database where code_id = 0 and update the code_id of each row with the values 1, 200, 10, 18 transactionally.
A concurrent request that wants to update the code from another running thread, should not access the selected rows of the first thread.
How can I do this?
After the update the first selected row will have code_id 1, the second selected row code_id 200, the third 10 and the last one 18. In other words the task must find rows with unassigned codes (code_id=0) and set a value to each row.
Link: http://sqlfiddle.com/#!9/60e555/1
You can do it with a single statement which you may build in the app
update test_codes
join
(
select id, row_number() over (order by id) rn
from test_codes
where code_id = 0
) t on t.id = test_codes.id
join (
-- a table of new values with their positions
select 1 rn, 1 val union all
select 2, 200 union all
select 3, 10 union all
select 4, 18
) v on v.rn = t.rn
set code_id = v.val
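Building that statement in the app could look roughly like the sketch below. The function name build_assign_codes_sql is hypothetical, and the %s placeholders assume a MySQL driver that uses that parameter style; the sketch only builds the statement and its parameters and does not address locking between concurrent requests.

```python
def build_assign_codes_sql(codes):
    """Build the UPDATE ... JOIN statement and its parameters for a list of codes."""
    if not codes:
        raise ValueError("codes must not be empty")
    # One "select position, value" row per code; only the first row names the columns.
    value_rows = " union all ".join(
        ["select %s as rn, %s as val"] + ["select %s, %s"] * (len(codes) - 1)
    )
    sql = (
        "update test_codes "
        "join (select id, row_number() over (order by id) rn "
        "from test_codes where code_id = 0) t on t.id = test_codes.id "
        f"join ({value_rows}) v on v.rn = t.rn "
        "set code_id = v.val"
    )
    # Parameters alternate between position (1-based) and the code to assign.
    params = []
    for position, code in enumerate(codes, start=1):
        params.extend([position, code])
    return sql, params

sql, params = build_assign_codes_sql([1, 200, 10, 18])
print(sql)
print(params)
```

The resulting pair would be passed to the driver's execute call inside a transaction. Only the first len(codes) unassigned rows are touched, because the join on rn drops the rest.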
MySQL doesn't support arrays. I would suggest that you first load the array into a table. This is just a convenience, but it is handy.
Then you can use a complex update with join to handle this:
update test_codes tc join
(select tc2.*,
row_number() over (order by rand()) as seqnum
from test_codes tc2
where tc2.code_id = 0
) tc2
on tc2.id = tc.id join
(select nc.*,
row_number() over (order by code_id) as seqnum
from new_codes nc
) nc
on tc2.seqnum = nc.seqnum
set tc.code_id = nc.code_id;
EDIT:
You can construct the query directly from the codes:
update test_codes tc join
(select tc2.*,
row_number() over (order by rand()) as seqnum
from test_codes tc2
where tc2.code_id = 0
) tc2
on tc2.id = tc.id join
(select ? as code, 1 as seqnum union all
select ? as code, 2 union all
. . .
) nc
on tc2.seqnum = nc.seqnum
set tc.code_id = nc.code;

How to get users that purchased items ONLY in a specific time period (MySQL Database)

I have a table that contains all purchased items.
I need to check which users purchased items in a specific period of time (say between 2013-03-21 to 2013-04-21) and never purchased anything after that.
I can select users that purchased items in that period of time, but I don't know how to filter those users that never purchased anything after that...
SELECT `userId`, `email` FROM my_table
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21' GROUP BY `userId`
Give this a try
SELECT
user_id
FROM
my_table
WHERE
purchase_date >= '2012-05-01' -- your_start_date
GROUP BY
user_id
HAVING
max(purchase_date) <= '2012-06-01'; -- your_end_date
It works by selecting all records on or after the start date, grouping the result set by user_id, and finding the maximum purchase date for every user, which must be <= the end date. Since this query uses neither a join nor a subquery, it could be faster.
Test data
CREATE table user_purchases(user_id int, purchase_date date);
insert into user_purchases values (1, '2012-05-01');
insert into user_purchases values (2, '2012-05-06');
insert into user_purchases values (3, '2012-05-20');
insert into user_purchases values (4, '2012-06-01');
insert into user_purchases values (4, '2012-09-06');
insert into user_purchases values (1, '2012-09-06');
Output
| USER_ID |
-----------
| 2 |
| 3 |
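The output above can be reproduced with a short script. This is a sketch that assumes an in-memory SQLite database in place of MySQL and a hypothetical helper name buyers_only_in_period; the table and rows are the test data shown above.

```python
import sqlite3

def buyers_only_in_period(start, end):
    """Return user ids whose purchases on or after start all fall on or before end."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here (assumption)
    con.executescript("""
        CREATE TABLE user_purchases (user_id INTEGER, purchase_date TEXT);
        INSERT INTO user_purchases VALUES
            (1, '2012-05-01'), (2, '2012-05-06'), (3, '2012-05-20'),
            (4, '2012-06-01'), (4, '2012-09-06'), (1, '2012-09-06');
    """)
    rows = con.execute("""
        SELECT user_id
          FROM user_purchases
         WHERE purchase_date >= ?          -- start of the period
         GROUP BY user_id
        HAVING MAX(purchase_date) <= ?     -- nothing bought after the period
         ORDER BY user_id
    """, (start, end)).fetchall()
    con.close()
    return [user_id for (user_id,) in rows]

print(buyers_only_in_period('2012-05-01', '2012-06-01'))
```

Users 1 and 4 drop out because each also has a purchase on 2012-09-06, which pushes their maximum purchase date past the end date.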
This is probably a standard way to accomplish that:
SELECT `userId`, `email` FROM my_table mt
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
AND NOT EXISTS (
SELECT * FROM my_table mt2 WHERE
mt2.`userId` = mt.`userId`
and mt2.`date` > '2013-04-21'
)
GROUP BY `userId`
SELECT `userId`, `email` FROM my_table WHERE (`date` BETWEEN '2013-03-21' AND '2013-04-21') and `date` >= '2013-04-21' GROUP BY `userId`
This will select only the users who purchased during that timeframe AND purchased after that timeframe.
Hope this helps.
Try the following
SELECT `userId`, `email`
FROM my_table WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
and `userId` not in
(select `userId` from my_table
where `date` < '2013-03-21' or `date` > '2013-04-21' )
GROUP BY `userId`
You'll have to do it in two stages - one query to get the list of users who did buy within the time period, then another query to take that list of users and see if they bought anything afterwards, e.g.
SELECT userID, email, count(after.*) AS purchases
FROM my_table AS after
LEFT JOIN (
SELECT DISTINCT userID
FROM my_table
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
) AS during ON after.userID = during.userID
WHERE after.date > '2013-04-21'
HAVING purchases = 0;
Inner query gets the list of userIDs who purchased at least one thing during that period. That list is then joined back against the same table, but filtered for purchases AFTER the period , and counts how many purchases they made and filters down to only those users with 0 "after" purchases.
probably won't work as written - haven't had my morning tea yet.
SELECT
a.userId,
a.email
FROM
my_table AS a
WHERE a.date BETWEEN '2013-03-21'
AND '2013-04-21'
AND a.userId NOT IN
(SELECT
b.userId
FROM
my_table AS b
WHERE b.date BETWEEN '2013-04-22'
AND CURDATE()
GROUP BY b.userId)
GROUP BY a.userId
This filters out anyone who has purchased anything between the day after the end date and the present.

MySQL multiple COUNTs

I have a table like this:
Fiddle: http://sqlfiddle.com/#!2/44d9e/14
CREATE TABLE IF NOT EXISTS `mytable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(20) NOT NULL,
`money_earned` int(20) NOT NULL,
PRIMARY KEY (`id`)
) ;
INSERT INTO mytable (user_id,money_earned) VALUES ("111","10");
INSERT INTO mytable (user_id,money_earned) VALUES ("111","6");
INSERT INTO mytable (user_id,money_earned) VALUES ("111","40");
INSERT INTO mytable (user_id,money_earned) VALUES ("222","45");
INSERT INTO mytable (user_id,money_earned) VALUES ("222","1");
INSERT INTO mytable (user_id,money_earned) VALUES ("333","5");
INSERT INTO mytable (user_id,money_earned) VALUES ("333","19");
I need to know how many rows the table has, how many different users there are, and how many times each user has earned money.
I need this result:
TOTAL_ROWS: 7
TOTAL_INDIVIDUAL_USERS: 3
USER_ID USER_TIMES
111 3
222 2
333 2
Is your problem that you want the total as well? If so, then you can get this using rollup:
SELECT coalesce(cast(user_id as char(20)), 'TOTAL USER_TIMES'),
COUNT(*) as times
FROM mytable
GROUP BY user_id with rollup;
You can get the user counts in a separate column with this trick:
SELECT coalesce(cast(user_id as char(20)), 'TOTAL USER_TIMES'),
COUNT(*) as times, count(distinct user_id) as UserCount
FROM mytable
GROUP BY user_id with rollup;
You realize that a SQL query just returns a table of values. You are asking for very specific formatting, which is typically done better at the application level. That said, you can get close to what you want with something like this:
select user, times
from ((SELECT 3 as ord, cast(user_id as char(20)) as user, COUNT(*) as times
FROM mytable
GROUP BY user_id
)
union all
(select 1, 'Total Rows', count(*)
from mytable
)
union all
(select 2, 'Total Users', count(distinct user_id)
from mytable
)
) t
order by ord;
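A runnable sketch of the UNION ALL idea follows, assuming an in-memory SQLite database in place of MySQL and a hypothetical helper name count_summary. SQLite does not accept parentheses around the individual SELECTs of a compound query, so they are dropped here, and the labels are chosen for this example.

```python
import sqlite3

def count_summary():
    """Return total rows, distinct users, then one (user_id, times) row per user."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here (assumption)
    con.executescript("""
        CREATE TABLE mytable (id INTEGER PRIMARY KEY AUTOINCREMENT,
                              user_id INTEGER NOT NULL,
                              money_earned INTEGER NOT NULL);
        INSERT INTO mytable (user_id, money_earned) VALUES
            (111, 10), (111, 6), (111, 40),
            (222, 45), (222, 1),
            (333, 5), (333, 19);
    """)
    rows = con.execute("""
        SELECT label, times
          FROM (SELECT 1 AS ord, 'Total Rows' AS label, COUNT(*) AS times
                  FROM mytable
                UNION ALL
                SELECT 2, 'Total Users', COUNT(DISTINCT user_id)
                  FROM mytable
                UNION ALL
                SELECT 3, CAST(user_id AS TEXT), COUNT(*)
                  FROM mytable
                 GROUP BY user_id) t
         ORDER BY ord, label   -- ord fixes the section order, label orders the users
    """).fetchall()
    con.close()
    return rows

for label, times in count_summary():
    print(label, times)
```

The ord column exists only to keep the two summary rows ahead of the per-user rows; the application would still be responsible for any final formatting.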
I think this could be a typo. Anyway, if you are trying to sum your COUNT(*) times, simply sum money_earned instead:
SELECT user_id,
COUNT(*) AS 'times',
SUM(money_earned) AS 'sum_money'
FROM mytable GROUP BY user_id;

MySQL query, MAX() + GROUP BY

Daft SQL question. I have a table like so ('pid' is auto-increment primary col)
CREATE TABLE theTable (
`pid` INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
`timestamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
`cost` INT UNSIGNED NOT NULL,
`rid` INT NOT NULL
) Engine=InnoDB;
Actual table data:
INSERT INTO theTable (`pid`, `timestamp`, `cost`, `rid`)
VALUES
(1, '2011-04-14 01:05:07', 1122, 1),
(2, '2011-04-14 00:05:07', 2233, 1),
(3, '2011-04-14 01:05:41', 4455, 2),
(4, '2011-04-14 01:01:11', 5566, 2),
(5, '2011-04-14 01:06:06', 345, 1),
(6, '2011-04-13 22:06:06', 543, 2),
(7, '2011-04-14 01:14:14', 5435, 3),
(8, '2011-04-14 01:10:13', 6767, 3)
;
I want to get the PID of the latest row for each rid (1 result per unique RID). For the sample data, I'd like:
pid | MAX(timestamp) | rid
-----------------------------------
5 | 2011-04-14 01:06:06 | 1
3 | 2011-04-14 01:05:41 | 2
7 | 2011-04-14 01:14:14 | 3
I've tried running the following query:
SELECT MAX(timestamp),rid,pid FROM theTable GROUP BY rid
and I get:
max(timestamp) ; rid; pid
----------------------------
2011-04-14 01:06:06; 1 ; 1
2011-04-14 01:05:41; 2 ; 3
2011-04-14 01:14:14; 3 ; 7
The PID returned is always the first occurrence of PID for an RID (row / pid 1 is the first time rid 1 is used, row / pid 3 the first time rid 2 is used, row / pid 7 the first time rid 3 is used). Although the max timestamp is returned for each rid, the pids are not the pids that belong to those timestamps in the original table. What query would give me the results I'm looking for?
(Tested in PostgreSQL 9.something)
Identify the rid and timestamp.
select rid, max(timestamp) as ts
from test
group by rid;
1 2011-04-14 18:46:00
2 2011-04-14 14:59:00
Join to it.
select test.pid, test.cost, test.timestamp, test.rid
from test
inner join
(select rid, max(timestamp) as ts
from test
group by rid) maxt
on (test.rid = maxt.rid and test.timestamp = maxt.ts)
select *
from (
select `pid`, `timestamp`, `cost`, `rid`
from theTable
order by `timestamp` desc
) as mynewtable
group by mynewtable.`rid`
order by mynewtable.`timestamp`
Hope I helped !
SELECT t.pid, t.cost, t.timestamp, t.rid
FROM test as t
JOIN (
SELECT rid, max(timestamp) AS maxtimestamp
FROM test GROUP BY rid
) AS tmax
ON t.rid = tmax.rid and t.timestamp = tmax.maxtimestamp
I created an index on rid and timestamp.
SELECT test.pid, test.cost, test.timestamp, test.rid
FROM theTable AS test
LEFT JOIN theTable maxt
ON maxt.rid = test.rid
AND maxt.timestamp > test.timestamp
WHERE maxt.rid IS NULL
Showing rows 0 - 2 (3 total, Query took 0.0104 sec)
This method will select all the desired values from theTable (test), left joining itself (maxt) on all timestamps higher than the one on test with the same rid. When the timestamp is already the highest one on test there are no matches on maxt - which is what we are looking for - values on maxt become NULL. Now we use the WHERE clause maxt.rid IS NULL or any other column on maxt.
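The self-join described above can be checked against the sample rows from the question. This sketch assumes an in-memory SQLite database in place of MySQL and a hypothetical helper name latest_pid_per_rid; an ORDER BY on rid is added only to make the output order deterministic.

```python
import sqlite3

def latest_pid_per_rid():
    """Return (pid, timestamp, rid) of the newest row for each rid."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here (assumption)
    con.executescript("""
        CREATE TABLE theTable (pid INTEGER PRIMARY KEY, timestamp TEXT,
                               cost INTEGER NOT NULL, rid INTEGER NOT NULL);
        INSERT INTO theTable (pid, timestamp, cost, rid) VALUES
            (1, '2011-04-14 01:05:07', 1122, 1),
            (2, '2011-04-14 00:05:07', 2233, 1),
            (3, '2011-04-14 01:05:41', 4455, 2),
            (4, '2011-04-14 01:01:11', 5566, 2),
            (5, '2011-04-14 01:06:06', 345, 1),
            (6, '2011-04-13 22:06:06', 543, 2),
            (7, '2011-04-14 01:14:14', 5435, 3),
            (8, '2011-04-14 01:10:13', 6767, 3);
    """)
    rows = con.execute("""
        SELECT test.pid, test.timestamp, test.rid
          FROM theTable AS test
          LEFT JOIN theTable AS maxt
            ON maxt.rid = test.rid
           AND maxt.timestamp > test.timestamp   -- any newer row for the same rid
         WHERE maxt.rid IS NULL                  -- keep rows that have no newer row
         ORDER BY test.rid
    """).fetchall()
    con.close()
    return rows

for row in latest_pid_per_rid():
    print(row)
```

If two rows of the same rid shared the maximum timestamp, both would be returned, since neither has a strictly newer match.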
You could also have subqueries like that:
SELECT ( SELECT MIN(t2.pid)
FROM test t2
WHERE t2.rid = t.rid
AND t2.timestamp = maxtimestamp
) AS pid
, MAX(t.timestamp) AS maxtimestamp
, t.rid
FROM test t
GROUP BY t.rid
But this way, you'll need one more subquery if you want cost included in the shown columns, etc.
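So, the group by and join is a better solution.
If you want to avoid a JOIN, you can use the following (note that it assumes pid always increases with timestamp, which does not hold for the sample data):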
SELECT pid, rid FROM theTable t1 WHERE t1.pid IN ( SELECT MAX(t2.pid) FROM theTable t2 GROUP BY t2.rid);
Try:
select pid,cost, timestamp, rid from theTable order by timestamp DESC limit 2;