In a MySQL database, how can I collapse each run of consecutive duplicates into a single row while maintaining order in a SELECT output?
data:
id fruit
----------
1 Apple
2 Banana
3 Banana
4 Banana
5 Apple
6 Mango
7 Mango
8 Apple
Output I want:
fruit
-------
Apple
Banana
Apple
Mango
Apple
This is very easy to do in Unix with the uniq command, but DISTINCT is not as flexible.
IDs are not sequential, and gaps are possible. I was oversimplifying in my example.
Select could be like this:
data:
id fruit
----------
100 Apple
2 Banana
30 Banana
11 Banana
50 Apple
62 Mango
7 Mango
4 Apple
Try this - assuming consecutive IDs with no gaps.
SELECT T.fruit
FROM YOURTABLE T
LEFT JOIN YOURTABLE T2
ON T2.ID = T.ID + 1
WHERE T2.fruit <> T.fruit
OR T2.ID IS NULL
ORDER BY T.ID
Use:
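SELECT x.fruit
FROM (SELECT t.fruit,
-- bump the run number whenever the fruit differs from the previous row
CASE
WHEN @fruit <> t.fruit THEN @rownum := @rownum + 1
ELSE @rownum
END AS run_no,
-- remember this row's fruit for the next comparison
@fruit := t.fruit AS last_fruit
FROM YOUR_TABLE t
JOIN (SELECT @rownum := 0, @fruit := NULL) r
ORDER BY t.id) x
GROUP BY x.fruit, x.run_no
ORDER BY x.run_no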
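On MySQL 8.0 or later, the same result can probably be had without user variables by comparing each row with its predecessor via LAG(). The following is only a sketch and is not taken from the answers above: it runs the query against an in-memory SQLite database through Python's sqlite3 module (SQLite 3.25+ is assumed for window functions), and it adds a hypothetical pos column to stand for the row order, since the IDs in the question are not sequential.

```python
import sqlite3

# Hypothetical table: `pos` records the desired row order, because the
# question's ids are neither sequential nor sorted.
rows = [(1, 100, "Apple"), (2, 2, "Banana"), (3, 30, "Banana"),
        (4, 11, "Banana"), (5, 50, "Apple"), (6, 62, "Mango"),
        (7, 7, "Mango"), (8, 4, "Apple")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (pos INTEGER, id INTEGER, fruit TEXT)")
conn.executemany("INSERT INTO fruits VALUES (?, ?, ?)", rows)

# Keep a row only when its fruit differs from the previous row's fruit.
# LAG() yields NULL for the first row, so that row is kept explicitly.
query = """
SELECT fruit
FROM (SELECT pos, fruit,
             LAG(fruit) OVER (ORDER BY pos) AS prev_fruit
      FROM fruits) AS t
WHERE prev_fruit IS NULL OR prev_fruit <> fruit
ORDER BY pos
"""
collapsed = [fruit for (fruit,) in conn.execute(query)]
print(collapsed)
```

The SELECT itself uses only LAG() and a derived table, so it should carry over to MySQL 8.0 unchanged, with whatever column defines your row order in place of pos.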
Related
I have data like this
id otherid name
1 123 banana
2 123 banana
3 123 banana
4 456 grape
5 456 grape
6 789 orange
7 111 banana
How can I get output like this with a MySQL query?
name count
banana 2
grape 1
orange 1
Try this:
SELECT
f.`name`,
COUNT(DISTINCT (f.`otherid`))
FROM
`fruits` f
GROUP BY f.`name`
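As a quick check of the COUNT(DISTINCT ...) approach, here is a small sketch that loads the question's sample rows into an in-memory SQLite database via Python's sqlite3 module. The table name fruits follows the answer above; the cnt alias and the ORDER BY are additions for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (id INTEGER, otherid INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO fruits VALUES (?, ?, ?)",
    [(1, 123, "banana"), (2, 123, "banana"), (3, 123, "banana"),
     (4, 456, "grape"), (5, 456, "grape"),
     (6, 789, "orange"), (7, 111, "banana")],
)

# Count how many different otherid values each name appears with.
query = """
SELECT name, COUNT(DISTINCT otherid) AS cnt
FROM fruits
GROUP BY name
ORDER BY name
"""
counts = conn.execute(query).fetchall()
for name, cnt in counts:
    print(name, cnt)
```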
You can use COUNT with GROUP BY:
SELECT
`name`, COUNT(*)
FROM
write_you_table_name
GROUP BY
`name`,`otherid`
Using DISTINCT inside the COUNT function is recommended. But if you prefer not to use it, try this:
select name,count(*) from
(select name from fruit group by name,otherid) t
group by name;
SELECT name , count(*) as count FROM tb_stock GROUP BY other_id
I have a column fruits and it has rows like
banana
pineapple
orange
grapes
apple
mango
pomegranate
Kiwi
grapefruit
peach
or maybe like this
pineapple
grapefruit
orange
grapes
apple
mango
pomegranate
Kiwi
banana
peach
I want to retrieve all of them with grapefruit always in the middle, like the following, no matter whether there is an even or odd number of rows:
banana
pineapple
orange
grapes
grapefruit
apple
pomegranate
Kiwi
mango
I know the basic SQL query SELECT fruit FROM FRUITTABLE
but don't know how to go further.
You can order by is_even/is_odd on some enumeration. (here: the difference in row_number() over nothing)
\i tmp.sql
CREATE TABLE fruits(fruit text);
INSERT INTO fruits(fruit) VALUES
('banana') ,('pineapple') ,('orange') ,('grapes') ,('apple') ,('mango') ,('pomegranate') ,('Kiwi') ,('grapefruit') ,('peach')
;
with numbered AS (
select fruit, row_number() OVER () rn
FROM fruits
)
, gnum AS (
SELECT rn FROM numbered
WHERE fruit = 'grapefruit'
)
SELECT n.fruit, n.rn
FROM numbered n
JOIN gnum g ON true
ORDER BY ((n.rn - g.rn) %2), (n.rn <> g.rn)
;
Result:
CREATE TABLE
INSERT 0 10
fruit | rn
-------------+----
mango | 6
grapes | 4
Kiwi | 8
pineapple | 2
grapefruit | 9
banana | 1
orange | 3
apple | 5
pomegranate | 7
peach | 10
(10 rows)
Edit: the tie-breaker is not always correct (caused by modulo on negative numbers). A better order would be
ORDER BY (ABS(n.rn - g.rn) %2) , (n.rn <> g.rn) DESC
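The question does not say how the remaining rows should be ordered, so the target is open to interpretation. Under one possible reading (keep the other rows in their original order and place grapefruit at the midpoint), the intended result can be sketched in a few lines of Python. The function name center_item is hypothetical, and this is an illustration of the goal rather than a replacement for the SQL above.

```python
def center_item(rows, item="grapefruit"):
    """Return a copy of rows with `item` moved to the middle position."""
    if item not in rows:
        return list(rows)
    others = [r for r in rows if r != item]
    # With an odd number of other rows, the extra one ends up after `item`.
    mid = len(others) // 2
    return others[:mid] + [item] + others[mid:]


fruits = ["banana", "pineapple", "orange", "grapes", "apple",
          "mango", "pomegranate", "Kiwi", "grapefruit", "peach"]
centered = center_item(fruits)
print(centered)
```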
I think you should explain why you want to do this; maybe there is a better way to obtain the result you want.
But I think that you could add a weight column to your table
and order by that value in the select.
Just bear in mind that whenever you add a new row you have to update the weights in order to keep grapefruit in the middle.
And you have to define how to handle an even number of rows.
Table with examples
SELECT *
FROM Fruits
ORDER BY Weight;
Could you do:
(select * from fruit where id < (select max(id)/2 from fruit))
union select * from fruit where id = grapefruit
union
(select * from fruit where id > (select max(id)/2 from fruit))
Or something like that. If you don't have an id you might have to use rownumber, but it should be doable.
I'm building a table in MySQL and need a "grouped index" column that increments but resets for each new value in another column. Like this:
1 Apple
2 Apple
3 Apple
1 Banana
2 Banana
1 Pear
2 Pear
3 Pear
4 Pear
Any ideas how I can do this?
If you are running MySQL 8.0, just use row_number():
select
row_number() over(partition by fruit order by ?) rn,
fruit
from mytable
Note that, for the query to produce consistent results, you need another column that can be used to order the records. I represented that as ? in the query.
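To make the ? concrete, the sketch below assumes a hypothetical auto-incrementing id column as the ordering key and runs the same ROW_NUMBER() query on an in-memory SQLite database through Python's sqlite3 module (SQLite 3.25+ assumed for window functions). The outer ORDER BY is an addition so that the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `id` is a hypothetical ordering column; the question's table has none.
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, fruit TEXT)")
fruits = ["Apple"] * 3 + ["Banana"] * 2 + ["Pear"] * 4
conn.executemany("INSERT INTO mytable (fruit) VALUES (?)",
                 [(f,) for f in fruits])

# Number the rows within each fruit, restarting at 1 for every new fruit.
query = """
SELECT ROW_NUMBER() OVER (PARTITION BY fruit ORDER BY id) AS rn,
       fruit
FROM mytable
ORDER BY fruit, rn
"""
numbered = conn.execute(query).fetchall()
for rn, fruit in numbered:
    print(rn, fruit)
```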
If you use MySQL 5.x, you can use this query:
CREATE TABLE fruit (
`fruit` VARCHAR(6)
);
INSERT INTO fruit
( `fruit`)
VALUES
( 'Apple'),
( 'Apple'),
( 'Apple'),
( 'Banana'),
( 'Banana'),
( 'Pear'),
( 'Pear'),
( 'Pear'),
( 'Pear');
SELECT
IF(fruit = @fruit, @row_number := @row_number +1,@row_number := 1) rownumber
,@fruit := fruit
FROM
(SELECT * From fruit ORDER BY fruit ASC) t, (SELECT @row_number := 0) a,(SELECT @fruit := '') b ;
rownumber | @fruit := fruit
--------: | :--------------
1 | Apple
2 | Apple
3 | Apple
1 | Banana
2 | Banana
1 | Pear
2 | Pear
3 | Pear
4 | Pear
The columns have to be in this order for the algorithm to work. If you need a different column order, wrap the query in an outer SELECT.
I would like your assistance in solving an issue which I am battling now for days without even coming close to a solution. Unfortunately, I have already posted my issue and was not able to make any improvement with the suggestion delivered.
What I would like to achieve is somewhat attained by GROUP BY and HAVING with the possibility of CASE WHEN but whatever I do I am not getting to what I desire.
What I want to achieve is a GROUP BY only when the contents of the group exceed 3 rows and leave the individual items i.e. not grouped when group is less than or equal to three.
EXAMPLE
ID DESC VAL1 VAL2 VAL3
1 DESC1 2 2 4
2 DESC2 2 2 4
3 DESC3 2 2 4
4 DESC4 2 2 4
5 DESC5 1 1 2
6 DESC6 1 1 2
The grouping is on VAL1, VAL2, VAL3, using the following:
SELECT * FROM TABLE1 GROUP BY VAL1,VAL2,VAL3
This will yield the following:
ID DESC VAL1 VAL2 VAL3
1 DESC1 2 2 4
5 DESC5 1 1 2
However what I need is the following:
ID DESC VAL1 VAL2 VAL3
1 DESC1 2 2 4
5 DESC5 1 1 2
6 DESC6 1 1 2
Can this be achieved with GROUP BY? I suspect it needs a subquery, but I cannot get it to work. Your assistance will be very much appreciated.
DBMS is MySQL.
Try this one. It might require some minor tweaks, as I didn't test it. But I think you will get the idea.
Select *
from table1
where md5(concat(val1,val2,val3)) in (
SELECT md5(concat(val1,val2,val3))
FROM TABLE1
GROUP BY VAL1,VAL2,VAL3
having count(*) > 3)
group by VAL1,VAL2,VAL3
union
Select *
from table1
where md5(concat(val1,val2,val3)) not in (
SELECT md5(concat(val1,val2,val3))
FROM TABLE1
GROUP BY VAL1,VAL2,VAL3
having count(*) > 3)
With UNION ALL, for the 2 different cases:
select t.* from tablename t inner join (
select min(id) minid
from tablename
group by val1, val2, val3
having count(*) > 3
) g on g.minid = t.id
union all
select * from tablename t
where (
select count(*) from tablename
where val1 = t.val1 and val2 = t.val2 and val3 = t.val3
) <= 3
If you are using MySQL 8.0, you can achieve this simply with window functions COUNT() and ROW_NUMBER():
SELECT id, descr, val1, val2, val3
FROM (
SELECT
t.*,
COUNT(*) OVER(PARTITION BY val1, val2, val3) cnt,
ROW_NUMBER() OVER(PARTITION BY val1, val2, val3 ORDER BY id) rn
FROM mytable t
) x WHERE cnt <= 3 OR rn = 1
ORDER BY id
In the inner query, cnt indicates how many records have the same val1, val2, val3 as the current one. rn assigns a rank to each record within groups of records having the same val1, val2, val3. The outer query then uses these two pieces of information to filter the relevant records.
Demo on DB Fiddle:
| id | descr | val1 | val2 | val3 |
| --- | ----- | ---- | ---- | ---- |
| 1 | DESC1 | 2 | 2 | 4 |
| 5 | DESC5 | 1 | 1 | 2 |
| 6 | DESC6 | 1 | 1 | 2 |
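The window-function approach can be tried out locally with the sketch below, which runs on an in-memory SQLite database through Python's sqlite3 module (SQLite 3.25+ assumed). It uses cnt <= 3 to match the question's "less than or equal to three", and it adds a hypothetical third group of exactly three rows (ids 7 to 9, not in the question) to exercise that boundary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE mytable
                (id INTEGER, descr TEXT, val1 INTEGER, val2 INTEGER, val3 INTEGER)""")
rows = [(1, "DESC1", 2, 2, 4), (2, "DESC2", 2, 2, 4),
        (3, "DESC3", 2, 2, 4), (4, "DESC4", 2, 2, 4),
        (5, "DESC5", 1, 1, 2), (6, "DESC6", 1, 1, 2),
        # Hypothetical extra group of exactly three rows (not in the question).
        (7, "DESC7", 3, 3, 6), (8, "DESC8", 3, 3, 6), (9, "DESC9", 3, 3, 6)]
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?, ?)", rows)

# Groups of more than three rows collapse to their first row (rn = 1);
# groups of three or fewer rows are returned in full.
query = """
SELECT id, descr, val1, val2, val3
FROM (SELECT t.*,
             COUNT(*)     OVER (PARTITION BY val1, val2, val3)             AS cnt,
             ROW_NUMBER() OVER (PARTITION BY val1, val2, val3 ORDER BY id) AS rn
      FROM mytable t) AS x
WHERE cnt <= 3 OR rn = 1
ORDER BY id
"""
kept_ids = [row[0] for row in conn.execute(query)]
print(kept_ids)
```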
I want to analyse outliers of grouped data. Let's say I have this data:
+--------+---------+-------+
| fruit | country | price |
+--------+---------+-------+
| apple | UK | 1 |
| apple | USA | 3 |
| apple | LT | 2 |
| apple | LV | 5 |
| apple | EE | 4 |
| pear | SW | 6 |
| pear | NO | 2 |
| pear | FI | 3 |
| pear | PL | 7 |
+--------+---------+-------+
Let's take pears. If my method of finding outliers were to take the 25% highest and the 25% lowest prices of pears, the pear outliers would be
+--------+---------+-------+
| pear | NO | 2 |
| pear | PL | 7 |
+--------+---------+-------+
As for apples:
+--------+---------+-------+
| apple | UK | 1 |
| apple | LV | 5 |
+--------+---------+-------+
What I want is to create a view that shows the union of the outliers of all fruits. With this view I could analyse only the tails, and also intersect the view with the main table to get a table without outliers; that's my goal. A solution would be:
(SELECT * FROM fruits f WHERE f.fruit = 'pear' ORDER BY f.price ASC
LIMIT (SELECT ROUND(COUNT(*) * 0.25,0)
FROM fruits f2
WHERE f2.fruit = 'pear')
)
union all
(SELECT * FROM fruits f WHERE f.fruit = 'pear' ORDER BY f.price DESC
LIMIT (SELECT ROUND(COUNT(*) * 0.25,0)
FROM fruits f2
WHERE f2.fruit = 'pear')
)
union all
(SELECT * FROM fruits f WHERE f.fruit = 'apple' ORDER BY f.price ASC
LIMIT (SELECT ROUND(COUNT(*) * 0.25,0)
FROM fruits f2
WHERE f2.fruit = 'apple')
)
union all
(SELECT * FROM fruits f WHERE f.fruit = 'apple' ORDER BY f.price DESC
LIMIT (SELECT ROUND(COUNT(*) * 0.25,0)
FROM fruits f2
WHERE f2.fruit = 'apple')
)
This would give me the table I want; however, the code after LIMIT doesn't seem to be valid. Another problem is the number of groups. In this example there are only two groups (pears, apples), but in my actual data there are around 100. So the 'union all' should somehow go through all unique fruits automatically, without code written for each one: find the number of outliers for each fruit, take only that number of rows, and show it all in another table (view).
You can't supply LIMIT with a value from a subquery, in any RDBMS I'm aware of. Some dbs don't even allow host variables/parameters in their versions of the clause (I'm thinking of iSeries DB2).
This is essentially a greatest-n-per-group problem. Similar queries in most other RDBMSs are solved with what are called Windowing functions - essentially, you're looking at a movable selection of data.
MySQL before 8.0 doesn't have this functionality, so there we have to counterfeit it. The actual mechanics of the query will depend on the actual data you need, so I can only speak to what you're attempting here. The techniques should be generally adaptable, but may require rather more creativity than otherwise.
To start with, you want a function that returns a number indicating each row's position. I'm assuming duplicate prices should be given the same rank (ties), and that doing so won't create a gap in the numbering. This is essentially the DENSE_RANK() windowing function. We can get these results by doing the following:
SELECT fruit, country, price,
@Rnk := IF(@last_fruit <> fruit, 1,
IF(@last_price = price, @Rnk, @Rnk + 1)) AS Rnk,
@last_fruit := fruit,
@last_price := price
FROM Fruits
JOIN (SELECT @Rnk := 0) n
ORDER BY fruit, price
... Which generates the following for the 'apple' group:
fruit country price rank
=============================
apple UK 1 1
apple LT 2 2
apple USA 3 3
apple EE 4 4
apple LV 5 5
Now, you're trying to get the top/bottom 25% of rows. In this case, you need a count of distinct prices:
SELECT fruit, COUNT(DISTINCT price)
FROM Fruits
GROUP BY fruit
... And now we just need to join this to the previous statement to limit the top/bottom:
SELECT RankedFruit.fruit, RankedFruit.country, RankedFruit.price
FROM (SELECT fruit, COUNT(DISTINCT price) AS priceCount
FROM Fruits
GROUP BY fruit) CountedFruit
JOIN (SELECT fruit, country, price,
@Rnk := IF(@last_fruit <> fruit, 1,
IF(@last_price = price, @Rnk, @Rnk + 1)) AS rnk,
@last_fruit := fruit,
@last_price := price
FROM Fruits
JOIN (SELECT @Rnk := 0) n
ORDER BY fruit, price) RankedFruit
ON RankedFruit.fruit = CountedFruit.fruit
AND (RankedFruit.rnk > ROUND(CountedFruit.priceCount * .75)
OR RankedFruit.rnk <= ROUND(CountedFruit.priceCount * .25))
...which yields the following:
fruit country price
=======================
apple UK 1
apple LV 5
pear NN 2
pear NO 2
pear PL 7
(I duplicated a pear row to show "tied" prices.)
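Where window functions are available (MySQL 8.0 and later, for example), the tails can probably be selected more directly. The sketch below is a variation on the idea above, not the same method: it uses ROW_NUMBER() rather than a dense rank, so tied prices are not grouped together, and it counts rows rather than distinct prices. It runs on an in-memory SQLite database through Python's sqlite3 module (SQLite 3.25+ assumed) with the question's original nine rows; the aliases rn_low, rn_high and cnt are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fruits (fruit TEXT, country TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO fruits VALUES (?, ?, ?)",
    [("apple", "UK", 1), ("apple", "USA", 3), ("apple", "LT", 2),
     ("apple", "LV", 5), ("apple", "EE", 4),
     ("pear", "SW", 6), ("pear", "NO", 2), ("pear", "FI", 3), ("pear", "PL", 7)],
)

# Number each fruit's rows from the cheapest and from the most expensive
# end, then keep the rows that fall in the lowest or highest 25% per fruit.
query = """
SELECT fruit, country, price
FROM (SELECT fruit, country, price,
             ROW_NUMBER() OVER (PARTITION BY fruit ORDER BY price)      AS rn_low,
             ROW_NUMBER() OVER (PARTITION BY fruit ORDER BY price DESC) AS rn_high,
             COUNT(*)     OVER (PARTITION BY fruit)                     AS cnt
      FROM fruits) AS ranked
WHERE rn_low <= ROUND(cnt * 0.25) OR rn_high <= ROUND(cnt * 0.25)
ORDER BY fruit, price
"""
outliers = conn.execute(query).fetchall()
for row in outliers:
    print(row)
```

Because the query needs no per-fruit code, it could serve as the body of the view the question asks for, however many groups there are.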
Does round not need 2 / 3 arguments? I.e. do you not need to put in, to what decimal place you wish to round?
so
...
LIMIT (SELECT ROUND(COUNT(*) * 0.25)
FROM #fruits f2
WHERE f2.fruit = 'apple')
becomes
...
LIMIT (SELECT ROUND(COUNT(*) * 0.25,2)
FROM #fruits f2
WHERE f2.fruit = 'apple')
Also, from just a quick look over lunch, it seems you're only expecting the min / max values. Could you not just use those functions instead?