Count rows where value in row is also in previous row - mysql

I want to count how often a value in one row also appears in the previous row.
Row | Item1 | Item2 | Item3
1 | Dog | Cat | Rat
2 | Bird | Cat | Horse
3 | Horse | Dog | Rat
4 | Bird | Cat | Horse
5 | Horse | Bird | Cat
Row 2 would increase the count of Cat because Cat is in row 1 and 2
Row 3 would increase the count of Horse because Horse is also in Row 2
Row 4 would increase the count of Horse because Horse is also in Row 3
Row 5 would increase the count of Horse AND Cat because both of those appear in row 4.
There can be a maximum of 100 items or SKUs, and I can index on any or all fields. At any given time there are probably between 1000 and 2000 rows.
I can't even wrap my head around where to begin with this query other than "SELECT * FROM table WHERE"

First, create a table holding every unique SKU value:
CREATE TABLE results(
id VARCHAR(255) NOT NULL PRIMARY KEY
);
-- All fields should be listed here one-by-one.
INSERT IGNORE INTO results (select Item1 from example);
INSERT IGNORE INTO results (select Item2 from example);
INSERT IGNORE INTO results (select Item3 from example);
The previous row can be obtained by left-joining the primary table with itself, i.e. LEFT JOIN example AS previous ON previous.id + 1 = example.id.
After that, we have to check that each unique value exists in the current row of the example table and also in the previous row, which finally gives this:
SELECT
r.*,
SUM(
CASE WHEN r.id IN (
prv.Item1, prv.Item2, prv.Item3 -- All fields should be listed here.
) THEN 1 ELSE 0 END
) AS total
FROM
results AS r
LEFT JOIN
example AS cur ON r.id IN (
cur.Item1, cur.Item2, cur.Item3 -- All fields should be listed here.
)
LEFT JOIN
example AS prv ON prv.id + 1 = cur.id
GROUP BY
r.id
ORDER BY
cur.id
;
See working example http://www.sqlfiddle.com/#!9/7ebd85/1/0

This can be done with window functions (available in MySQL 8.0).
An option is to unpivot the resultset, and then use lag() to check the previous record. Assuming that ids are always increasing by 1, you can do:
select
item,
sum(case when id = lag_id + 1 then 1 else 0 end) cnt_consecutive
from (
select
t.*,
lag(id) over(partition by item order by id) lag_id
from (
select id, item1 item from mytable
union all select id, item2 from mytable
union all select id, item3 from mytable
) t
) t
group by item
order by item
If you don't have an incremented column, you can generate one with dense_rank():
select
item,
sum(case when new_id = lag_new_id + 1 then 1 else 0 end) cnt_consecutive
from (
select
t.*,
lag(new_id) over(partition by item order by new_id) lag_new_id
from (
select
t.*,
dense_rank() over(order by id) new_id
from (
select id, item1 item from mytable
union all select id, item2 from mytable
union all select id, item3 from mytable
) t
) t
) t
group by item
order by item
In this DB Fiddle, both queries return:
item | cnt_consecutive
:---- | --------------:
Bird | 1
Cat | 2
Dog | 0
Horse | 3
Rat | 0
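The expected counts above can be cross-checked outside the database. The following is a minimal Python sketch, not part of either answer: the `rows` dictionary and the `count_consecutive` helper are hypothetical names, and "previous row" is taken to mean the row with the next-lower id, as in the dense_rank() variant.

```python
from collections import Counter

# Sample rows from the question, keyed by row id.
rows = {
    1: ["Dog", "Cat", "Rat"],
    2: ["Bird", "Cat", "Horse"],
    3: ["Horse", "Dog", "Rat"],
    4: ["Bird", "Cat", "Horse"],
    5: ["Horse", "Bird", "Cat"],
}


def count_consecutive(rows):
    """Count, per item, the rows in which the item also appears in the row before."""
    # Start every known item at 0 so items that never repeat still show up.
    counts = Counter({item: 0 for items in rows.values() for item in items})
    ids = sorted(rows)
    # Walk over (previous, current) pairs in id order.
    for prev_id, cur_id in zip(ids, ids[1:]):
        for item in set(rows[cur_id]) & set(rows[prev_id]):
            counts[item] += 1
    return dict(sorted(counts.items()))


result = count_consecutive(rows)
print(result)
```

For the sample data this agrees with the table above: Bird 1, Cat 2, Dog 0, Horse 3, Rat 0.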

I see @frost-nzcr4's suggestion is very good; I was writing my own version yesterday and it is quite similar. My approach differs a bit because I didn't create a table specifically to store the unique values. Instead, I used a UNION sub-query similar to @GMB's, and it ended up like this:
SELECT B.row, A.allitem,
SUM(CASE WHEN A.allitem IN (C.Item1, C.Item2, C.Item3) THEN 1
ELSE 0 END) AS total
FROM
-- this sub-query will be dynamic and UNION will eliminate any duplicate
(SELECT item1 AS allitem FROM mytable UNION
SELECT item2 FROM mytable UNION
SELECT item3 FROM mytable) AS A
LEFT JOIN mytable AS B ON A.allitem IN (B.Item1, B.Item2, B.Item3)
LEFT JOIN mytable AS C ON C.row + 1 = B.row
GROUP BY A.allitem
ORDER BY B.row;
Fiddle here : https://www.db-fiddle.com/f/bUUEsaeyPpAMfR2bK1VpBb/2
As you can see, this is almost the same query as frost's suggestion with only a minor modification. The allitem sub-query picks up new values as soon as they are inserted, so you don't need to keep inserting new unique data into a separate table.
Also, this query would normally fail with the "this is incompatible with sql_mode=only_full_group_by" error on MySQL 5.7 and above unless you remove ONLY_FULL_GROUP_BY from sql_mode.

Related

MySQL select - If a column value is redundant, only show the newest by timestamp

I have a table like this:
timesent |nr | value
2018-10-31 05:23:06 | 4 | Value 3
2018-10-31 05:20:19 | 4 | Value 2
2018-10-31 05:19:35 | 4 | Value 1
2018-10-31 04:55:56 | 3 | Value 2
2018-10-31 03:05:15 | 3 | Value 1
2018-10-31 01:31:49 | 2 | Value 1
2018-10-30 04:11:16 | 1 | Value 1
At the moment, my select looks like this:
SELECT * FROM values WHERE ORDER BY timesent DESC
I want to do an sql-select statement which gives me back only the most recent value of each "nr".
My skills are not good enough to translate that into an SQL statement. I don't even know what I should google for.
Values is a reserved keyword in MySQL. Consider changing your table name to something else; otherwise you will have to use backticks around it.
There are various ways to achieve the result for your problem. One way is to do a "Self-Left-Join" on nr (field on which you want to get the maximum timesent value row only).
SELECT v1.*
FROM `values` AS v1
LEFT JOIN `values` AS v2
ON v1.nr = v2.nr AND
v1.timesent < v2.timesent
WHERE v2.nr IS NULL
For MySQL version >= 8.0.2, you can use Window Functions. We will determine Row_Number() for each row over a partition of nr, with timesent in Descending order (Highest timesent value will have row number = 1). Then, use this result-set in a Derived Table and consider only those rows, where row number is equal to 1.
SELECT dt.timesent,
dt.nr,
dt.value
FROM
(
SELECT v.timesent, v.nr, v.value,
ROW_NUMBER() OVER (PARTITION BY v.nr
ORDER BY v.timesent DESC) AS row_num
FROM `values` AS v
) AS dt
WHERE dt.row_num = 1
Yet another approach is to get the maximum value of timesent for each nr group in a Derived Table. Then join this result-set to the main table, so that only the rows corresponding to the maximum value appear:
SELECT v.timesent,
v.nr,
v.value
FROM
`values` AS v
JOIN
(
SELECT nr, MAX(timesent) AS max_timesent
FROM `values`
GROUP BY nr
) AS dt ON dt.nr = v.nr AND
dt.max_timesent = v.timesent
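To see that the self-left-join and the MAX() derived-table approaches return the same rows, here is a small sketch that runs both against the sample data. It makes two assumptions that are not from the answer: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and it uses a hypothetical table name, sent_values, to sidestep the reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "sent_values" is a hypothetical name, chosen to avoid the reserved word VALUES.
conn.execute("CREATE TABLE sent_values (timesent TEXT, nr INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO sent_values (timesent, nr, value) VALUES (?, ?, ?)",
    [
        ("2018-10-31 05:23:06", 4, "Value 3"),
        ("2018-10-31 05:20:19", 4, "Value 2"),
        ("2018-10-31 05:19:35", 4, "Value 1"),
        ("2018-10-31 04:55:56", 3, "Value 2"),
        ("2018-10-31 03:05:15", 3, "Value 1"),
        ("2018-10-31 01:31:49", 2, "Value 1"),
        ("2018-10-30 04:11:16", 1, "Value 1"),
    ],
)

# Approach 1: keep rows for which no later row with the same nr exists.
anti_join = """
    SELECT v1.timesent, v1.nr, v1.value
    FROM sent_values AS v1
    LEFT JOIN sent_values AS v2
      ON v1.nr = v2.nr AND v1.timesent < v2.timesent
    WHERE v2.nr IS NULL
    ORDER BY v1.nr
"""

# Approach 2: join back to the per-nr maximum timesent.
max_join = """
    SELECT v.timesent, v.nr, v.value
    FROM sent_values AS v
    JOIN (SELECT nr, MAX(timesent) AS max_timesent
          FROM sent_values
          GROUP BY nr) AS dt
      ON dt.nr = v.nr AND dt.max_timesent = v.timesent
    ORDER BY v.nr
"""

latest_anti = conn.execute(anti_join).fetchall()
latest_max = conn.execute(max_join).fetchall()
print(latest_anti)
```

The ORDER BY is added only to make the two result lists directly comparable. The timestamps are stored as ISO-formatted text here, so plain string comparison orders them correctly.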

selecting multiple max values

I have a table like this in a MySQL database:
id | item
-----------
1 | 2
2 | 2
3 | 4
4 | 5
5 | 8
6 | 8
7 | 8
I want the result to be the 3 records with the highest item value.
select max(item) returns only 1 value.
How can I select multiple max values?
Thank you
You can use a derived table to get the maximum value and join it back to the original table to see all rows corresponding to it.
select t.id, t.item
from tablename t
join (select max(item) as mxitem from tablename) x
on x.mxitem = t.item
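The derived-table join above can be tried on the sample data from the question. This is a sketch under the assumption that Python's built-in sqlite3 module is an acceptable stand-in for MySQL; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, item INTEGER)")
conn.executemany(
    "INSERT INTO tablename (id, item) VALUES (?, ?)",
    [(1, 2), (2, 2), (3, 4), (4, 5), (5, 8), (6, 8), (7, 8)],
)

# The derived table x holds a single row with the overall maximum;
# joining on it keeps every row that shares that maximum.
top_rows = conn.execute(
    """
    SELECT t.id, t.item
    FROM tablename AS t
    JOIN (SELECT MAX(item) AS mxitem FROM tablename) AS x
      ON x.mxitem = t.item
    ORDER BY t.id
    """
).fetchall()
print(top_rows)
```

With the sample data this returns the three rows whose item is 8 (ids 5, 6 and 7).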
Edit:
select t.co_travelers_id, t.booking_id, t.accounts_id
from a_co_travelers t
join (select accounts_id, max(booking_id) as mxitem
from a_co_travelers
group by accounts_id) x
on x.mxitem = t.booking_id and t.accounts_id = x.accounts_id
If you use an aggregate function without GROUP BY, only one row will be returned.
You may use GROUP BY with aggregate functions.
SELECT id,max(item) AS item
FROM table_name
GROUP BY id
ORDER BY item DESC
LIMIT 3
Hope this helps.
Here is a MySQL script (low abstraction level, no inner join or similar):
select *
from ocena, uczen
where ocena.ocena = (SELECT MAX(ocena.ocena) FROM ocena WHERE ocena.przedmiot_id="4" and ocena.uczen_id="1")
and ocena.uczen_id=uczen.id
and ocena.przedmiot_id="4"
and uczen_id="1"

Mysql count with case when statement

Consider:
SELECT(count(c.id),
case when(count(c.id) = 0)
then 'loser'
when(count(c.id) BETWEEN 1 AND 4)
then 'almostaloser'
when(count(c.id) >= 5)
then 'notaloser'
end as status,
...
When all is said and done, the query as a whole produces a set of results that look similar to this:
Count | status
--------|-------------
2 | almostaloser //total count is between 2 and 4
--------|-------------
0 | loser // loser because total count = 0
--------|-------------
3 | almostaloser //again, total count between 2 and 4
--------|-------------
What I would like to achieve:
a method to retain the information from the above table, but add a third column that gives a total count of each status, something like
select count(c.id)
case when(count(c.id) = 0 )
then loser as status AND count how many of the total count does this apply to
results would look similar to:
Count | status |total_of each status |
--------|-------------|---------------------|
2 | almostaloser| 2 |
--------|-------------|---------------------|
0 | loser | 1 |
--------|-------------|---------------------|
3 | almostaloser| 2 |
--------|-------------|----------------------
I've been told this could be achieved using a derived table, but I've not yet been able to get both, only one or the other.
This can be achieved with this query (you must place your original query as subquery in two places):
SELECT t1.*, t2.total_of_each_status
FROM (
-- put here your query --
) t1
INNER JOIN (
SELECT status, count(*) AS total_of_each_status
FROM (
-- put here your query --
) t2
GROUP BY status
) t2 ON t2.status = t1.status
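The shape of that query (the original rows joined to a per-status count of the same rows) can be illustrated in plain Python. This is only a sketch of the logic: the `status_for` helper and the literal `counts` list are hypothetical stand-ins for the asker's elided query.

```python
from collections import Counter


def status_for(count):
    """Mirror the CASE expression from the question."""
    if count == 0:
        return "loser"
    if 1 <= count <= 4:
        return "almostaloser"
    return "notaloser"


# Stand-in for the per-group counts produced by the original query.
counts = [2, 0, 3]

# t1: the original result set (count, status).
t1 = [(c, status_for(c)) for c in counts]

# t2: the same result set grouped by status and counted.
t2 = Counter(status for _, status in t1)

# INNER JOIN t1 and t2 on status.
joined = [(c, status, t2[status]) for c, status in t1]
print(joined)
```

Each row keeps its own count and status and gains the number of rows sharing that status, which matches the desired table in the question.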

Return NULL for missing values in an IN list

I have a table like this:
id | val
---------
1 | abc
2 | def
5 | xyz
6 | foo
8 | bar
and a query like
SELECT id, val FROM tab WHERE id IN (1,2,3,4,5)
which returns
id | val
---------
1 | abc
2 | def
5 | xyz
Is there a way to make it return NULLs on missing ids, that is
id | val
---------
1 | abc
2 | def
3 | NULL
4 | NULL
5 | xyz
I guess there should be a tricky LEFT JOIN with itself, but can't wrap my head around it.
EDIT: I see people are thinking I want to "fill the gaps" in a sequence, but actually what I want is to substitute NULL for the missing values from the IN list. For example, this
SELECT id, val FROM tab WHERE id IN (1,100,8,200)
should return
id | val
---------
1 | abc
100 | NULL
8 | bar
200 | NULL
Also, the order doesn't matter much.
EDIT2: Just adding a couple of related links:
How to select multiple rows filled with constants?
Is it possible to have a tableless select with multiple rows?
You could use this trick:
SELECT v.id, t.val
FROM
(SELECT 1 AS id
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5) v
LEFT JOIN tab t
ON v.id = t.id
Please see fiddle here.
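Since the list of ids usually comes from application code, the UNION ALL derived table can be generated rather than typed by hand. Below is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL; `lookup_with_nulls` is a hypothetical helper name and the ORDER BY is there only to make the output predictable.

```python
import sqlite3


def lookup_with_nulls(conn, ids):
    """Return (id, val) for every requested id; val is None when the id is absent."""
    if not ids:
        return []
    # One "SELECT ?" per requested id, glued together with UNION ALL.
    selects = ["SELECT ? AS id"] + ["SELECT ?"] * (len(ids) - 1)
    sql = (
        "SELECT v.id, t.val FROM ("
        + " UNION ALL ".join(selects)
        + ") AS v LEFT JOIN tab AS t ON v.id = t.id ORDER BY v.id"
    )
    # The ids are bound as parameters, never interpolated into the SQL text.
    return conn.execute(sql, list(ids)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany(
    "INSERT INTO tab (id, val) VALUES (?, ?)",
    [(1, "abc"), (2, "def"), (5, "xyz"), (6, "foo"), (8, "bar")],
)

rows = lookup_with_nulls(conn, [1, 100, 8, 200])
print(rows)
```

For the second example in the question, ids 100 and 200 come back paired with None (NULL), while 1 and 8 keep their values.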
Yes, you can. But that will be tricky since there are no sequences in MySQL.
I assume you want just any selection, so it's:
SELECT
*
FROM
(SELECT
(two_1.id + two_2.id + two_4.id +
two_8.id + two_16.id) AS id
FROM
(SELECT 0 AS id UNION ALL SELECT 1 AS id) AS two_1
CROSS JOIN (SELECT 0 id UNION ALL SELECT 2 id) AS two_2
CROSS JOIN (SELECT 0 id UNION ALL SELECT 4 id) AS two_4
CROSS JOIN (SELECT 0 id UNION ALL SELECT 8 id) AS two_8
CROSS JOIN (SELECT 0 id UNION ALL SELECT 16 id) AS two_16
) AS sequence
LEFT JOIN
tab AS t
ON sequence.id=t.id
WHERE
sequence.id IN (1,2,3,4,5);
(check the fiddle)
It works by combining powers of 2 to generate a table of consecutive numbers (0 to 31 here). Your values are passed in the WHERE clause, so you can substitute any set of values there.
I would recommend doing this in your application instead, because it will be faster. Doing it in SQL may make sense if you want to use this row set somewhere else (i.e. in other queries); if not, it's a job for your application.
If you need higher values, add more rows to the sequence generator, like in this fiddle.
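Why the cross join of five two-row tables yields every number exactly once can be seen with a short Python sketch. It mirrors the idea only; the names `bits`, `sequence` and `selected` are illustrative and not part of the answer.

```python
from itertools import product

# Each pair plays the role of one two-row derived table: (0, 2**k).
bits = [(0, 1), (0, 2), (0, 4), (0, 8), (0, 16)]

# product() is the CROSS JOIN; summing one value per pair gives one row id.
sequence = sorted(sum(combo) for combo in product(*bits))

# The WHERE ... IN (...) filter from the query.
wanted = {1, 2, 3, 4, 5}
selected = [n for n in sequence if n in wanted]

print(len(sequence), sequence[0], sequence[-1])
print(selected)
```

Every integer from 0 to 31 has exactly one binary representation, so the 32 combinations produce 32 distinct ids. Adding a sixth pair (0, 32) would extend the range to 63.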

Identifying groups in Group By

I am running a complicated group by statement and I get all my results in their respective groups. But I want to create a custom column with their "group id". Essentially all the items that are grouped together would share an ID.
This is what I get:
partID | Description
-------+---------+--
11000 | "Oven"
12000 | "Oven"
13000 | "Stove"
13020 | "Stove"
12012 | "Grill"
This is what I want:
partID | Description | GroupID
-------+-------------+----------
11000 | "Oven" | 1
12000 | "Oven" | 1
13000 | "Stove" | 2
13020 | "Stove" | 2
12012 | "Grill" | 3
"GroupID" does not exist as data in any of the tables, it would be a custom generated column (alias) that would be associated to that group's key,id,index, whatever it would be called.
How would I go about doing this?
I think this is the query that returns the five rows:
select partId, Description
from part p;
Here is one way (using standard SQL) to get the groups:
select partId, Description,
(select count(distinct Description)
from part p2
where p2.Description <= p.Description
) as GroupId
from part p;
This is using a correlated subquery. The subquery finds all the description values less than or equal to the current one and counts the distinct values. Note that this gives a different set of values from the ones in the OP. These will be assigned alphabetically rather than by first encounter in the data. If that is important, the OP should add that to the question. Based on the question, the particular ordering did not seem important.
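The effect of the correlated subquery can be mirrored in a few lines of Python, which makes the alphabetical numbering easy to see. This is a sketch of the logic only; `parts` and `group_id` are hypothetical names, and Python's string comparison stands in for the database collation.

```python
# Sample rows from the question: (partID, Description).
parts = [
    (11000, "Oven"),
    (12000, "Oven"),
    (13000, "Stove"),
    (13020, "Stove"),
    (12012, "Grill"),
]


def group_id(description, parts):
    """Count the distinct descriptions that sort at or before this one."""
    return len({d for _, d in parts if d <= description})


with_groups = [(pid, d, group_id(d, parts)) for pid, d in parts]
for row in with_groups:
    print(row)
```

Here Grill gets 1, Oven gets 2 and Stove gets 3, which is the alphabetical assignment described above rather than the first-encounter numbering shown in the question.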
Here's one way to get it:
SELECT p.partID,p.Description,b.groupID
FROM (
SELECT Description,@rn := @rn + 1 AS groupID
FROM (
SELECT distinct description
FROM part,(SELECT @rn:= 0) c
) a
) b
INNER JOIN part p ON p.description = b.description;
This assigns a different groupID to each description, and then joins the original table on that description.
Based on your comments in response to Gordon's answer, I think what you need is a derived table to generate your groupids, like so:
select
t1.description,
@cntr := @cntr + 1 as GroupID
FROM
(select distinct table1.description from table1) t1
cross join
(select @cntr:=0) t2
which will give you:
DESCRIPTION GROUPID
Oven 1
Stove 2
Grill 3
Then you can use that in your original query, joining on description:
select
t1.partid,
t1.description,
t2.GroupID
from
table1 t1
inner join
(
select
t1.description,
@cntr := @cntr + 1 as GroupID
FROM
(select distinct table1.description from table1) t1
cross join
(select @cntr:=0) t2
) t2
on t1.description = t2.description
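The counter-variable technique amounts to numbering each distinct description the first time it is seen and then looking that number up for every row. Here is a plain-Python sketch of that idea; `parts` and `group_ids` are illustrative names, and the sketch assumes the distinct descriptions arrive in the order they appear in the sample, which MySQL does not guarantee without an ORDER BY.

```python
# Sample rows from the question: (partID, Description).
parts = [
    (11000, "Oven"),
    (12000, "Oven"),
    (13000, "Stove"),
    (13020, "Stove"),
    (12012, "Grill"),
]

# Derived table: one id per distinct description, in first-seen order.
group_ids = {}
for _, description in parts:
    if description not in group_ids:
        group_ids[description] = len(group_ids) + 1

# Join back to the original rows on description.
with_groups = [(pid, d, group_ids[d]) for pid, d in parts]
for row in with_groups:
    print(row)
```

With the sample order this gives Oven 1, Stove 2 and Grill 3, matching the desired output in the question.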
SELECT partID , Description, @s:=@s+1 GroupID
FROM part, (SELECT @s:= 0) AS s
GROUP BY Description