Get a certain number of rows for every id in an IN clause - MySQL

I have the following query:
SELECT a.id, a.brand_id
FROM articles a
WHERE a.deleted=0 AND a.brand_id IN (5,6)
LIMIT 4
How can I get 4 articles spread across all the brand_ids named in the IN clause? For example, I would like to get 2 articles with brand_id=5 and 2 articles with brand_id=6.

You can use UNION ALL:
(
SELECT a.id, a.brand_id
FROM articles a
WHERE a.deleted=0 AND a.brand_id = 5 limit 2
)
union all
(
SELECT a.id, a.brand_id
FROM articles a
WHERE a.deleted=0 AND a.brand_id = 6 limit 2
)
UPDATE: this can also be achieved with n-per-group logic. One way is as follows.
Consider the table
mysql> select * from articles ;
+------+----------+---------+
| id | brand_id | deleted |
+------+----------+---------+
| 1 | 5 | 0 |
| 2 | 6 | 0 |
| 3 | 2 | 0 |
| 4 | 4 | 1 |
| 5 | 5 | 0 |
| 6 | 5 | 1 |
| 7 | 5 | 0 |
| 8 | 6 | 0 |
| 9 | 4 | 0 |
| 10 | 4 | 0 |
| 11 | 4 | 1 |
| 12 | 6 | 0 |
| 13 | 5 | 1 |
| 14 | 5 | 0 |
+------+----------+---------+
The query below then returns n rows per group:
select
id,
brand_id
from (
select
id,
brand_id,
@r := if(@brand = brand_id,@r+1,1) as row_num,
@brand:= brand_id
from articles,(select @r:=0,@brand:='')rr
where
brand_id in (4,5,6)
and deleted = 0
order by brand_id
)x
where x.row_num <=2 limit 6;
+------+----------+
| id | brand_id |
+------+----------+
| 9 | 4 |
| 10 | 4 |
| 1 | 5 |
| 5 | 5 |
| 2 | 6 |
| 8 | 6 |
+------+----------+
6 rows in set (0.00 sec)
So here the limit is always the number of items in the IN clause * 2.
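On MySQL 8.0 and later, the same n-per-group result can probably be written with ROW_NUMBER() instead of user variables. Here is a minimal sketch of that idea. It runs the query against an in-memory SQLite database from Python only so that it is self-contained; the helper name top_n_per_brand is made up for illustration, and it is an assumption that the SQL carries over unchanged to MySQL 8.0+.

```python
import sqlite3

# Sample rows copied from the articles table above: (id, brand_id, deleted).
ROWS = [
    (1, 5, 0), (2, 6, 0), (3, 2, 0), (4, 4, 1), (5, 5, 0), (6, 5, 1), (7, 5, 0),
    (8, 6, 0), (9, 4, 0), (10, 4, 0), (11, 4, 1), (12, 6, 0), (13, 5, 1), (14, 5, 0),
]

QUERY = """
SELECT id, brand_id
FROM (
    SELECT id, brand_id,
           ROW_NUMBER() OVER (PARTITION BY brand_id ORDER BY id) AS row_num
    FROM articles
    WHERE deleted = 0 AND brand_id IN ({placeholders})
) x
WHERE row_num <= ?
ORDER BY brand_id, id
"""


def top_n_per_brand(conn, brand_ids, n):
    """Return up to n non-deleted (id, brand_id) rows for each brand (hypothetical helper)."""
    placeholders = ",".join("?" * len(brand_ids))
    sql = QUERY.format(placeholders=placeholders)
    return conn.execute(sql, (*brand_ids, n)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER, brand_id INTEGER, deleted INTEGER)")
conn.executemany("INSERT INTO articles VALUES (?, ?, ?)", ROWS)

result = top_n_per_brand(conn, [4, 5, 6], 2)
print(result)
```

Because the row number restarts for every brand_id, no outer LIMIT is needed, and the ORDER BY inside OVER makes it explicit which rows of each brand are kept.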

(
SELECT a.id, a.brand_id
FROM articles a
WHERE a.deleted=0 AND a.brand_id=5
LIMIT 2
)
UNION ALL
(
SELECT a.id, a.brand_id
FROM articles a
WHERE a.deleted=0 AND a.brand_id=6
LIMIT 2
)

Count occurrences in MySQL

Let's say that in a given table num_table there is a column in which only numbers from 1 to 35 are stored.
The code to count the nums in the last 25 rows is:
select num, count(*)
from (select C_1 as num from num_table order by id desc limit 25) n
group by num
order by num asc;
Result:
| num | count(*) |
|------|----------|
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 5 | 2 |
| 10 | 1 |
| 11 | 1 |
| 12 | 1 |
| 15 | 1 |
| 16 | 2 |
| 17 | 1 |
| 20 | 1 |
| 21 | 1 |
| 22 | 1 |
| 23 | 1 |
| 25 | 1 |
| 28 | 2 |
| 29 | 2 |
| 30 | 1 |
| 32 | 2 |
|------|----------|
How can I get a result in which the nums from 1 to 35 that occurred 0 times within the last 25 rows are also displayed?
Example of desired result:
| num | count(*) |
|------|----------|
| 1 | 0 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 5 | 2 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 0 |
| 10 | 1 |
| ... | ... |
| 35 | 0 |
Maybe the quickest way is to turn your existing query into a subquery and LEFT JOIN your num_table to it, like this:
SELECT A.C_1, IFNULL(cnt,0) total_count
FROM num_table A
LEFT JOIN
(SELECT num, COUNT(*) cnt
FROM (SELECT C_1 AS num FROM num_table ORDER BY id DESC LIMIT 25) n
GROUP BY num) B
ON A.C_1=B.num
GROUP BY A.C_1, cnt
ORDER BY A.C_1 ASC;
Here's a fiddle for reference:
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=3ced94d698fd8a55a8ad07a9d3b42f3d
By the way, the counts in the result you're showing add up to only 24 rows even though the first subquery uses LIMIT 25, so the result in my example fiddle is slightly different.
Here is another way to solve your problem.
In this solution you first need a set of the numbers from 1 to 35 that exists only for the duration of the query. You can then LEFT JOIN it to your existing num_table; the left join keeps the numbers that have no match, so they show up with a count of 0.
You can do it like this:
WITH RECURSIVE numbers(id) AS (
SELECT 1 as id
UNION ALL
SELECT id+1 FROM numbers WHERE id < 35
)
SELECT numbers.id AS num, count(nt.C_1) AS total
FROM numbers
LEFT JOIN (SELECT C_1 FROM num_table ORDER BY id DESC LIMIT 25) nt ON (nt.C_1 = numbers.id)
GROUP BY numbers.id
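To see the zero counts appear, here is a small sketch of the recursive-CTE approach on made-up data. It uses Python's built-in sqlite3 module so it can run anywhere; the sample values, the 1 to 5 range and the "last 4 rows" window are all assumptions chosen to keep the output short, and I would expect the same SQL shape to work on MySQL 8.0+.

```python
import sqlite3

QUERY = """
WITH RECURSIVE numbers(id) AS (
    SELECT 1
    UNION ALL
    SELECT id + 1 FROM numbers WHERE id < :max_num
)
SELECT numbers.id AS num, COUNT(nt.C_1) AS total
FROM numbers
LEFT JOIN (SELECT C_1 FROM num_table ORDER BY id DESC LIMIT :last_rows) nt
       ON nt.C_1 = numbers.id
GROUP BY numbers.id
ORDER BY numbers.id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE num_table (id INTEGER PRIMARY KEY, C_1 INTEGER)")
# Made-up values; ids 1..6 are assigned in insertion order,
# so the last 4 rows hold the values 3, 3, 5 and 1.
conn.executemany(
    "INSERT INTO num_table (C_1) VALUES (?)",
    [(2,), (4,), (3,), (3,), (5,), (1,)],
)

counts = conn.execute(QUERY, {"max_num": 5, "last_rows": 4}).fetchall()
print(counts)
```

COUNT(nt.C_1) counts only matched rows, which is why numbers with no match get 0 rather than 1 (COUNT(*) would count the NULL-extended row as well).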

How to fetch only continuous records from mysql?

I have a database table like below
___________
id | speed
-----------
1 | 3
2 | 2
3 | 0
4 | 0
5 | 0
6 | 2
7 | 0
8 | 0
9 | 2
10 | 0
Now I want to get the records where speed is 0, but only records 3 to 5, which form the longest continuous run. I don't want records 7 and 8 or record 10. How can I achieve this?
Probably the fastest method is to use MySQL session variables to increment the "group" each time the speed changes, as you scan through the rows.
select n.*, @groupid:=IF(@prev_speed=speed,@groupid,@groupid+1) as groupid, @prev_speed:=speed
from (select @groupid:=0, @prev_speed:=-1) _init
cross join n
order by id;
+----+-------+---------+--------------------+
| id | speed | groupid | @prev_speed:=speed |
+----+-------+---------+--------------------+
| 1 | 3 | 1 | 3 |
| 2 | 2 | 2 | 2 |
| 3 | 0 | 3 | 0 |
| 4 | 0 | 3 | 0 |
| 5 | 0 | 3 | 0 |
| 6 | 2 | 4 | 2 |
| 7 | 0 | 5 | 0 |
| 8 | 0 | 5 | 0 |
| 9 | 2 | 6 | 2 |
| 10 | 0 | 7 | 0 |
+----+-------+---------+--------------------+
Then using the above query as a derived table, calculate the lowest and highest id per group, and the count of rows. Sort the groups by the count of rows.
select min(id) as minid, max(id) as maxid, count(*) as count
from (
select n.*, @groupid:=IF(@prev_speed=speed,@groupid,@groupid+1) as groupid, @prev_speed:=speed
from (select @groupid:=0, @prev_speed:=-1) _init
cross join n
order by id
) as t1
group by t1.groupid
order by count desc;
+-------+-------+-------+
| minid | maxid | count |
+-------+-------+-------+
| 3 | 5 | 3 |
| 7 | 8 | 2 |
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 6 | 6 | 1 |
| 9 | 9 | 1 |
| 10 | 10 | 1 |
+-------+-------+-------+
Then using the first row from the above as another derived table, join to the original table for the rows in the range from the min to max id.
select n.*
from (
select min(id) as minid, max(id) as maxid, count(*) as count
from (
select n.*, @groupid:=IF(@prev_speed=speed,@groupid,@groupid+1) as groupid, @prev_speed:=speed
from (select @groupid:=0, @prev_speed:=-1) _init
cross join n
order by id
) as t1
group by t1.groupid
order by count desc limit 1
) as t2
inner join n on n.id between t2.minid and t2.maxid
+----+-------+
| id | speed |
+----+-------+
| 3 | 0 |
| 4 | 0 |
| 5 | 0 |
+----+-------+
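For MySQL 8.0+ the same "islands" can probably be found without session variables, using the difference of two ROW_NUMBER() sequences as the group key. The sketch below is one way to do it, run through Python's sqlite3 module so it is self-contained. Two things here are my own assumptions: that the SQL behaves the same on MySQL 8.0+, and that only runs with speed = 0 should compete (the query above picks the longest run of any speed).

```python
import sqlite3

QUERY = """
WITH grouped AS (
    SELECT id, speed,
           ROW_NUMBER() OVER (ORDER BY id)
         - ROW_NUMBER() OVER (PARTITION BY speed ORDER BY id) AS grp
    FROM n
),
longest AS (
    SELECT MIN(id) AS minid, MAX(id) AS maxid
    FROM grouped
    WHERE speed = 0
    GROUP BY grp
    ORDER BY COUNT(*) DESC, MIN(id)
    LIMIT 1
)
SELECT n.id, n.speed
FROM n
JOIN longest ON n.id BETWEEN longest.minid AND longest.maxid
ORDER BY n.id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE n (id INTEGER PRIMARY KEY, speed INTEGER)")
# Speeds for ids 1..10, copied from the question.
speeds = [3, 2, 0, 0, 0, 2, 0, 0, 2, 0]
conn.executemany("INSERT INTO n VALUES (?, ?)", list(enumerate(speeds, start=1)))

longest_run = conn.execute(QUERY).fetchall()
print(longest_run)
```

Within one speed value, consecutive ids keep the same difference between the two row numbers, so each continuous run gets its own grp. If two runs tie on length, the MIN(id) tie-break keeps the earliest one.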

SUM from the results of a subquery of N results as max for each user

Let's suppose this schema:
CREATE TABLE test
(
test_Id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
user_Id INT NOT NULL,
date DATE,
result VARCHAR(255) NOT NULL
) engine=innodb;
My goal is to pick at most the last 5 results for each distinct user_Id, ordered from newest to oldest. Besides that, depending on the result column, I want to calculate a ratio over those last results, so that I can pick the 3 users with the best ratio.
So let's take this data as example:
test_Id | user_Id | date | result
1 | 1 |2016-09-05 | A
2 | 3 |2016-09-13 | A
3 | 3 |2016-09-30 | A
4 | 4 |2016-09-22 | A
5 | 4 |2016-09-11 | C
6 | 7 |2016-09-18 | D
7 | 4 |2016-09-08 | B
8 | 6 |2016-09-20 | E
9 | 7 |2016-09-16 | A
10 | 7 |2016-09-29 | E
11 | 7 |2016-09-23 | A
12 | 7 |2016-09-16 | B
13 | 4 |2016-09-15 | B
14 | 7 |2016-09-07 | C
15 | 7 |2016-09-09 | A
16 | 3 |2016-09-26 | A
17 | 4 |2016-09-11 | C
18 | 4 |2016-09-30 | E
What I have been able to achieve is this query:
SELECT p.user_Id, p.RowNumber, p.date, p.result,
SUM(CASE WHEN p.result='A' OR p.result='B'
THEN 1 ELSE 0 END) as avg
FROM (
SELECT @row_num := IF(@prev_value=user_Id,@row_num+1,1)
AS RowNumber, test_Id, user_Id, date, result,
@prev_value := user_Id
FROM test,
(SELECT @row_num := 1) x,
(SELECT @prev_value := '') y
WHERE @prev_value < 5
ORDER BY user_Id, YEAR(date) DESC, MONTH(date) DESC,
DAY(date) DESC
) p
WHERE p.RowNumber <=10
GROUP BY p.user_Id, p.test_Id
ORDER BY p.user_Id, p.RowNumber;
This query provides me this kind of output:
RowNumber |test_Id | user_Id | date | result | avg
1 | 1 | 1 |2016-09-05 | A | 1
1 | 3 | 3 |2016-09-30 | A | 1
2 | 16 | 3 |2016-09-26 | A | 1
3 | 2 | 3 |2016-09-13 | A | 1
1 | 18 | 4 |2016-09-30 | E | 0
2 | 4 | 4 |2016-09-22 | A | 1
3 | 13 | 4 |2016-09-15 | B | 1
4 | 5 | 4 |2016-09-11 | C | 0
5 | 17 | 4 |2016-09-11 | C | 0
1 | 8 | 6 |2016-09-20 | E | 0
1 | 10 | 7 |2016-09-29 | E | 0
2 | 11 | 7 |2016-09-23 | A | 1
3 | 6 | 7 |2016-09-18 | D | 0
4 | 9 | 7 |2016-09-16 | A | 1
5 | 12 | 7 |2016-09-16 | B | 1
What I was expecting is for the avg column to hold, for each user, the total number of results that match the condition (an A or B value), so that I can calculate a ratio out of the 5 results for each user_Id (0, 0.2, 0.4, 0.6, 0.8, 1).
Something like this:
RowNumber |test_Id | user_Id | date | result | avg
1 | 1 | 1 |2016-09-05 | A | 1
1 | 3 | 3 |2016-09-30 | A | 3
2 | 16 | 3 |2016-09-26 | A | 3
3 | 2 | 3 |2016-09-13 | A | 3
1 | 18 | 4 |2016-09-30 | E | 2
2 | 4 | 4 |2016-09-22 | A | 2
3 | 13 | 4 |2016-09-15 | B | 2
4 | 5 | 4 |2016-09-11 | C | 2
5 | 17 | 4 |2016-09-11 | C | 2
1 | 8 | 6 |2016-09-20 | E | 0
1 | 10 | 7 |2016-09-29 | E | 3
2 | 11 | 7 |2016-09-23 | A | 3
3 | 6 | 7 |2016-09-18 | D | 3
4 | 9 | 7 |2016-09-16 | A | 3
5 | 12 | 7 |2016-09-16 | B | 3
Am I being restricted by the GROUP BY p.user_Id, p.test_Id clause when doing the SUM? I tried the query with only user_Id as GROUP BY clause and only test_Id too as GROUP BY clause, without getting the expected results.
I think you need to calculate the avg separately and then join:
select a.rn,a.test_id,a.user_id,a.date,a.result,u.avg from
(
select t1.*
, if (t1.user_id <> @p, @rn:=1,@rn:=@rn+1) rn
, @p:=t1.user_id p
from (select @rn:=0, @p:='') rn,test t1
order by t1.user_id, t1.date desc
) a
join
(
select s.user_id
, sum(case when s.result = 'A' or s.result = 'B' then 1 else 0 end) as avg
from
(
select t1.*
, if (t1.user_id <> @p, @rn:=1,@rn:=@rn+1) rn
, @p:=t1.user_id p
from (select @rn:=0, @p:='') rn,test t1
order by t1.user_id, t1.date desc
) s
where s.rn <= 5
group by s.user_id
) u on u.user_id = a.user_id
where a.rn <= 5
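If window functions are available (MySQL 8.0+), the ranking and the per-user total can probably be done in a single pass, without repeating the ranking subquery. A minimal sketch, using the sample data from the question and Python's sqlite3 module so it runs on its own; the column alias ab_count is my own name, and it is an assumption that MySQL 8.0+ accepts the same SQL.

```python
import sqlite3

# Sample rows copied from the question: (test_Id, user_Id, date, result).
ROWS = [
    (1, 1, "2016-09-05", "A"), (2, 3, "2016-09-13", "A"), (3, 3, "2016-09-30", "A"),
    (4, 4, "2016-09-22", "A"), (5, 4, "2016-09-11", "C"), (6, 7, "2016-09-18", "D"),
    (7, 4, "2016-09-08", "B"), (8, 6, "2016-09-20", "E"), (9, 7, "2016-09-16", "A"),
    (10, 7, "2016-09-29", "E"), (11, 7, "2016-09-23", "A"), (12, 7, "2016-09-16", "B"),
    (13, 4, "2016-09-15", "B"), (14, 7, "2016-09-07", "C"), (15, 7, "2016-09-09", "A"),
    (16, 3, "2016-09-26", "A"), (17, 4, "2016-09-11", "C"), (18, 4, "2016-09-30", "E"),
]

QUERY = """
SELECT rn, test_Id, user_Id, date, result,
       SUM(CASE WHEN result IN ('A', 'B') THEN 1 ELSE 0 END)
           OVER (PARTITION BY user_Id) AS ab_count
FROM (
    SELECT test_Id, user_Id, date, result,
           ROW_NUMBER() OVER (PARTITION BY user_Id
                              ORDER BY date DESC, test_Id) AS rn
    FROM test
) ranked
WHERE rn <= 5
ORDER BY user_Id, rn
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (test_Id INTEGER PRIMARY KEY, user_Id INTEGER, date TEXT, result TEXT)"
)
conn.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", ROWS)

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The outer SUM ... OVER is evaluated after the WHERE rn <= 5 filter, so it only sees each user's newest five rows. Dividing ab_count by 5 (or by the number of rows the user actually has) would then give the ratio.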

SQL, difficult fetching data query

Suppose I have such a table:
+-----+---------+-------+
| ID | TIME | DAY |
+-----+---------+-------+
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 1 |
| 1 | 1 | 2 |
| 2 | 2 | 2 |
| 3 | 3 | 2 |
| 1 | 1 | 3 |
| 2 | 2 | 3 |
| 3 | 3 | 3 |
| 1 | 1 | 4 |
| 2 | 2 | 4 |
| 3 | 3 | 4 |
| 1 | 1 | 5 |
| 2 | 2 | 5 |
| 3 | 3 | 5 |
+-----+---------+-------+
I want to fetch a table showing the 2 IDs with the largest sum of TIME within the last 3 days (that is, days 3 to 5 in the DAY column).
So the correct result would be:
+-----+---------+
| ID | SUM |
+-----+---------+
| 3 | 9 |
| 2 | 6 |
+-----+---------+
The original table is much larger and more complex, so I need a generic approach.
Thanks in advance.
And so I just learned that MySQL uses LIMIT instead of TOP...
CREATE TABLE tbl (ID INT,tm INT,dy INT);
INSERT INTO tbl (id, tm, dy) VALUES
(1,1,1)
,(2,2,1)
,(3,3,1)
,(1,1,2)
,(1,1,1);
SELECT ID
,SUM(SumTimeForDay) SumTimeFromLastThreeDays
FROM (SELECT ID
,SUM(tm) SumTimeForDay
FROM tbl
WHERE dy > (SELECT MAX(dy) FROM tbl) - 3
GROUP BY ID, dy) a
GROUP BY id
ORDER BY SUM(SumTimeForDay) DESC
LIMIT 2
select t1.`id`, sum(t1.`time`) as `sum`
from `table` t1
inner join ( select distinct `day` from `table` order by `day` desc limit 3 ) t2
on t2.`day` = t1.`day`
group by t1.`id`
order by sum(t1.`time`) desc
limit 2
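As a quick check of the join-based approach against the sample data from the question, here is a small sketch using Python's sqlite3 module. The table name readings is a placeholder I picked (the real table name is not given), and I am assuming the query behaves the same way on MySQL.

```python
import sqlite3

QUERY = """
SELECT t1.id, SUM(t1.time) AS total
FROM readings t1
JOIN (SELECT DISTINCT day FROM readings ORDER BY day DESC LIMIT 3) t2
  ON t2.day = t1.day
GROUP BY t1.id
ORDER BY total DESC
LIMIT 2
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER, time INTEGER, day INTEGER)")
# Rebuild the sample table: ids 1..3 on each of days 1..5, with time equal to id.
sample = [(i, i, d) for d in range(1, 6) for i in range(1, 4)]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", sample)

top_two = conn.execute(QUERY).fetchall()
print(top_two)
```

Taking the last three distinct day values in a derived table, rather than computing MAX(day) - 3, also works when some days are missing from the data.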

Advanced MySQL: Find correlations between poll responses

I've got four MySQL tables:
users (id, name)
polls (id, text)
options (id, poll_id, text)
responses (id, poll_id, option_id, user_id)
Given a particular poll and a particular option, I'd like to generate a table that shows which options from other polls are most strongly correlated.
Suppose this is our data set:
TABLE users:
+------+-------+
| id | name |
+------+-------+
| 1 | Abe |
| 2 | Bob |
| 3 | Che |
| 4 | Den |
+------+-------+
TABLE polls:
+------+-----------------------+
| id | text |
+------+-----------------------+
| 1 | Do you like apples? |
| 2 | What is your gender? |
| 3 | What is your height? |
| 4 | Do you like polls? |
+------+-----------------------+
TABLE options:
+------+----------+---------+
| id | poll_id | text |
+------+----------+---------+
| 1 | 1 | Yes |
| 2 | 1 | No |
| 3 | 2 | Male |
| 4 | 2 | Female |
| 5 | 3 | Short |
| 6 | 3 | Tall |
| 7 | 4 | Yes |
| 8 | 4 | No |
+------+----------+---------+
TABLE responses:
+------+----------+------------+----------+
| id | poll_id | option_id | user_id |
+------+----------+------------+----------+
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 2 |
| 3 | 1 | 2 | 3 |
| 4 | 1 | 2 | 4 |
| 5 | 2 | 3 | 1 |
| 6 | 2 | 3 | 2 |
| 7 | 2 | 3 | 3 |
| 8 | 2 | 4 | 4 |
| 9 | 3 | 5 | 1 |
| 10 | 3 | 6 | 2 |
| 10 | 3 | 5 | 3 |
| 10 | 3 | 6 | 4 |
| 10 | 4 | 7 | 1 |
| 10 | 4 | 7 | 2 |
| 10 | 4 | 7 | 3 |
| 10 | 4 | 7 | 4 |
+------+----------+------------+----------+
Given the poll ID 1 and the option ID 2, the generated table should be something like this:
+----------+------------+-----------------------+
| poll_id | option_id | percent_correlated |
+----------+------------+-----------------------+
| 4 | 7 | 100 |
| 2 | 3 | 66.66 |
| 3 | 6 | 66.66 |
| 2 | 4 | 33.33 |
| 3 | 5 | 33.33 |
| 4 | 8 | 0 |
+----------+------------+-----------------------+
So basically, we're identifying all of the users who responded to poll ID 1 and selected option ID 2, and we're looking through all the other polls to see what percentage of them also selected each other option.
I don't have an instance handy to test this. Can you check whether it gives the proper results?
select
poll_id,
option_id,
((psum - (sum1 * sum2 / n)) / sqrt((sum1sq - pow(sum1, 2.0) / n) * (sum2sq - pow(sum2, 2.0) / n))) AS r,
n
from
(
select
poll_id,
option_id,
SUM(score) AS sum1,
SUM(score_rev) AS sum2,
SUM(score * score) AS sum1sq,
SUM(score_rev * score_rev) AS sum2sq,
SUM(score * score_rev) AS psum,
COUNT(*) AS n
from
(
select
responses.poll_id,
responses.option_id,
CASE
WHEN user_resp.user_id IS NULL THEN 0
ELSE 1
END as score,
CASE
WHEN user_resp.user_id IS NULL THEN 1
ELSE 0
END as score_rev
from responses left outer join
(
select
user_id
from
responses
where
poll_id = 1 and
option_id = 2
)user_resp
ON (user_resp.user_id = responses.user_id)
) temp1
group by
poll_id,
option_id
)components
After a few hours of trial and error, I managed to put together a query that works correctly:
SELECT poll_id AS p_id,
option_id AS o_id,
COUNT(*) AS optCount,
(SELECT COUNT(*) FROM response WHERE option_id = o_id AND user_id IN
(SELECT user_id FROM response WHERE poll_id = '1' AND option_id = '2')) /
(SELECT COUNT(*) FROM response WHERE poll_id = p_id AND user_id IN
(SELECT user_id FROM response WHERE poll_id = '1' AND option_id = '2'))
AS percentage
FROM response
INNER JOIN
(SELECT user_id FROM response WHERE poll_id = '1' AND option_id = '2') AS user_ids
ON response.user_id = user_ids.user_id
WHERE poll_id != '1'
GROUP BY option_id DESC
ORDER BY percentage DESC, optCount DESC
Based on tests with a small data set, this query looks to be reasonably fast, but I'd like to modify it so the "IN" subquery is not repeated three times. Any suggestions?
This seems to give the right results for me:
select poll_stats.poll_id,
option_stats.option_id,
(100 * option_responses / poll_responses) as percent_correlated
from (select response.poll_id,
count(*) as poll_responses
from response selecting_response
join response on response.user_id = selecting_response.user_id
where selecting_response.poll_id = 1 and selecting_response.option_id = 2
group by response.poll_id) poll_stats
join (select options.poll_id,
options.id as option_id,
count(response.id) as option_responses
from options
left join response on response.poll_id = options.poll_id
and response.option_id = options.id
and exists (
select 1 from response selecting_response
where selecting_response.user_id = response.user_id
and selecting_response.poll_id = 1
and selecting_response.option_id = 2)
group by options.poll_id, options.id
) as option_stats
on option_stats.poll_id = poll_stats.poll_id
where poll_stats.poll_id <> 1
order by 3 desc, option_responses desc
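Another way to avoid repeating the "selecting users" subquery is to name it once in a common table expression. The sketch below is only an illustration of that idea on the sample data from the question, run through Python's sqlite3 module so it is self-contained. The CTE names (selecting, matched, poll_totals) are my own, the table is called responses as in the schema listing, and it is an assumption that MySQL 8.0+ accepts the same SQL.

```python
import sqlite3

QUERY = """
WITH selecting AS (
    SELECT user_id FROM responses
    WHERE poll_id = :poll AND option_id = :opt
),
matched AS (
    SELECT r.poll_id, r.option_id
    FROM responses r
    JOIN selecting s ON s.user_id = r.user_id
    WHERE r.poll_id <> :poll
),
poll_totals AS (
    SELECT poll_id, COUNT(*) AS poll_responses
    FROM matched
    GROUP BY poll_id
)
SELECT o.poll_id, o.id AS option_id,
       100.0 * COUNT(m.option_id) / pt.poll_responses AS percent_correlated
FROM options o
JOIN poll_totals pt ON pt.poll_id = o.poll_id
LEFT JOIN matched m ON m.option_id = o.id
GROUP BY o.poll_id, o.id, pt.poll_responses
ORDER BY percent_correlated DESC, o.poll_id, o.id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE options (id INTEGER PRIMARY KEY, poll_id INTEGER)")
conn.execute(
    "CREATE TABLE responses (id INTEGER PRIMARY KEY, poll_id INTEGER,"
    " option_id INTEGER, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO options VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3), (7, 4), (8, 4)],
)
# (poll_id, option_id, user_id) copied from the question; ids are auto-assigned.
conn.executemany(
    "INSERT INTO responses (poll_id, option_id, user_id) VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 2, 2), (1, 2, 3), (1, 2, 4),
     (2, 3, 1), (2, 3, 2), (2, 3, 3), (2, 4, 4),
     (3, 5, 1), (3, 6, 2), (3, 5, 3), (3, 6, 4),
     (4, 7, 1), (4, 7, 2), (4, 7, 3), (4, 7, 4)],
)

correlated = conn.execute(QUERY, {"poll": 1, "opt": 2}).fetchall()
for row in correlated:
    print(row)
```

The denominator is counted per poll, so a poll that some of the selecting users skipped is still scored against the users who actually answered it.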