MySQL - GROUP BY, select only the first row when grouping

I'm having a problem with grouping specific columns into one. When I use GROUP BY, the last row always gets selected when it should be the first row.
The main query is:
SELECT cpme_id,
medicine_main_tbl.med_id,
Concat(med_name, ' (', med_dosage, ') ', med_type) AS Medicine,
med_purpose,
med_quantity,
med_expiredate
FROM medicine_main_tbl
JOIN medicine_inventory_tbl
ON medicine_main_tbl.med_id = medicine_inventory_tbl.med_id
WHERE Coalesce(med_quantity, 0) != 0
AND Abs(Datediff(med_expiredate, Now()))
ORDER BY med_expiredate;
SELECT without GROUP BY
If I GROUP BY using any duplicate column value (in this case, I used med_id):
SELECT with GROUP BY
I'm trying to get this output
Expected Output
The output should only be the first two from the first query. Obviously, I cannot use LIMIT.

Since you are using MariaDB, I recommend using ROW_NUMBER here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY med_id ORDER BY med_expireDate) rn
FROM yourTable
)
SELECT cpme_id, med_id, Medicine, med_purpose, med_quantity, med_expireDate
FROM cte
WHERE rn = 1;
This assumes that the "first" row for a given medicine is the one having the earliest expire date. This was the only interpretation of your data which agreed with the expected output.
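To see the ROW_NUMBER approach in action without a MariaDB server, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. The `inventory` table and its sample rows are made up for illustration, and it assumes an SQLite build with window functions (3.25 or newer).

```python
import sqlite3

# Hypothetical sample data: two inventory rows per medicine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (cpme_id INTEGER, med_id INTEGER, med_expiredate TEXT);
    INSERT INTO inventory VALUES
        (1, 10, '2024-03-01'),
        (2, 10, '2024-01-15'),
        (3, 20, '2024-02-10'),
        (4, 20, '2024-06-30');
""")

# Number the rows of each med_id by expiry date and keep only row 1,
# i.e. the earliest-expiring row per medicine.
first_rows = conn.execute("""
    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY med_id ORDER BY med_expiredate) AS rn
        FROM inventory
    )
    SELECT cpme_id, med_id, med_expiredate
    FROM cte
    WHERE rn = 1
    ORDER BY med_id
""").fetchall()

print(first_rows)  # [(2, 10, '2024-01-15'), (3, 20, '2024-02-10')]
```

Unlike GROUP BY, which picks an arbitrary row for the non-aggregated columns, the window function makes the choice of "first" explicit through its ORDER BY.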

Related

MySQL Select, unique in one column

Let's say I have a table with the following rows/values:
I need a way to select the values in amount but only once if they're duplicated. So from this example I'd want to select the amount for A, B and C only once. The SQL result should then look like this:
Use the LAG() function to compare the previous row's amount with the current row's amount for each name.
-- MySQL (v8.0)
SELECT t.name
, CASE WHEN t.amount = t.prev_val THEN '' ELSE amount END amount
FROM (SELECT *
, LAG(amount) OVER (PARTITION BY name ORDER BY name) prev_val
FROM test) t
You can check it at https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=8c1af9afcadf2849a85ad045df7ed580
You can handle situations like these with different functions, depending on what you need:
Case 1: If you have the same values per name:
select distinct name, amount from [table name]
Case 2: You have duplicates with different values for each name and you want to pick the one with the highest value. Use min() instead if you need the lowest one to show up.
select name, max(amount) from [table name] group by 1
Case 3: The one you need with blanks for the rest of the duplications.
ROW_NUMBER() numbers the rows within each name in order of amount; since the values are the same, the numbers simply increment. You can then use IF to create a new column that is blank wherever RANK_ > 1. This also covers the case where you want to show only the minimum value and leave the rest of the rows for that name blank.
With ranking as (
select
*,
ROW_NUMBER() OVER(PARTITION BY NAME ORDER BY AMOUNT) AS RANK_
from [table]
)
SELECT
*,
IF(RANK_ > 1,"",AMOUNT) AS NEW_AMOUNT
FROM ranking
Case 4: You need to select the maximum and leave the other rows for each name blank.
Just change the ORDER BY clause of ROW_NUMBER() to DESC. This assigns rank 1 to the highest amount per name, and the rest are filled with blanks.
With ranking as (
select
*,
ROW_NUMBER() OVER(PARTITION BY NAME ORDER BY AMOUNT DESC) AS RANK_
from [table]
)
SELECT
*,
IF(RANK_ > 1,"",AMOUNT) AS NEW_AMOUNT
FROM ranking
If you are using mysql 8 you can use row_number for this:
with x as (
select *, row_number() over(partition by name order by amount) rn
from t
)
select name, case when rn=1 then amount else '' end amount
from x
See example Fiddle
The other answers are missing a really important point: a SQL query returns an unordered set unless there is an explicit order by.
The data that you have provided has rows that are exact duplicates. For this reason, I think the best approach uses row_number() and an order by in the outer query:
select name, (case when seqnum = 1 then amount end) as amount
from (select t.*,
row_number() over (partition by name, amount) as seqnum
from t
) t
order by name, seqnum;
Note that MySQL does not require an order by argument for row_number().
More commonly, though, you would have some other column (say a date or id) that would be used for ordering. I should also emphasize that this type of formatting is often handled at the application layer and not in the database.
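As a rough illustration of the row_number() plus outer order by idea, here is a small sketch run through Python's sqlite3 module (SQLite standing in for MySQL 8; window functions need SQLite 3.25+). The table `t` and its rows are invented sample data, and NULL is used for the blanked cells, as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (name TEXT, amount INTEGER);
    INSERT INTO t VALUES
        ('A', 10), ('A', 10),
        ('B', 20),
        ('C', 5), ('C', 5), ('C', 5);
""")

# seqnum restarts at 1 for every (name, amount) pair; only the first
# row of each pair keeps its amount, the others become NULL (None).
blanked = conn.execute("""
    SELECT name, CASE WHEN seqnum = 1 THEN amount END AS amount
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY name, amount) AS seqnum
          FROM t) AS numbered
    ORDER BY name, seqnum
""").fetchall()

print(blanked)
# [('A', 10), ('A', None), ('B', 20), ('C', 5), ('C', None), ('C', None)]
```

The outer ORDER BY is what guarantees the filled row comes first within each name; without it the result order would be up to the engine.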

Identify the last row of a distinct set of data in the field for an Alias Column

How can I identify the last row of a distinct set of data in the field for an alias column (signaling it somehow, with "1" for example)?
For this example I need to know when each ordered GROUP (CARS, COLORS, DRINKS, FRUITS) ends.
Check my intended result on this image:
My base query:
SELECT * FROM `MY_DB` ORDER BY `ITEM`, `GROUP` ASC
As a starter: rows of a SQL table are unordered. There is no inherent ordering of rows. For your question to make sense, you need a column that defines the ordering of the rows in each group - I assumed id.
Then: in MySQL 8.0, one option uses window functions:
select t.*,
(row_number() over(partition by grp order by id desc) = 1) as last_group_flag
from mytable t
In earlier versions, you could use a subquery:
select t.*,
(id = (select max(t1.id) from mytable t1 where t1.grp = t.grp)) as last_group_flag
from mytable t
Note: group is a language keyword, hence not a good choice for a column name. I used grp instead in the query.
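Both variants can be compared side by side in a quick sketch. This one uses Python's sqlite3 module with SQLite (3.25+ for window functions) as a stand-in for MySQL; the `items` table, the `id` ordering column and the sample rows are assumptions made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, grp TEXT, item TEXT);
    INSERT INTO items VALUES
        (1, 'CARS', 'Audi'), (2, 'CARS', 'BMW'),
        (3, 'COLORS', 'Red'),
        (4, 'DRINKS', 'Tea'), (5, 'DRINKS', 'Water');
""")

# Window-function variant: the row with the highest id in each grp gets flag 1.
with_window = conn.execute("""
    SELECT id, grp, item,
           (ROW_NUMBER() OVER (PARTITION BY grp ORDER BY id DESC) = 1) AS last_group_flag
    FROM items
    ORDER BY id
""").fetchall()

# Correlated-subquery variant for engines without window functions.
with_subquery = conn.execute("""
    SELECT i.id, i.grp, i.item,
           (i.id = (SELECT MAX(i2.id) FROM items i2 WHERE i2.grp = i.grp)) AS last_group_flag
    FROM items i
    ORDER BY i.id
""").fetchall()

print(with_window)
# [(1, 'CARS', 'Audi', 0), (2, 'CARS', 'BMW', 1), (3, 'COLORS', 'Red', 1),
#  (4, 'DRINKS', 'Tea', 0), (5, 'DRINKS', 'Water', 1)]
```

Both queries return the same flags here; which one performs better depends on the engine and the indexes available.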
You need to order by the item column within each group to find the last record per distinct value of the group column.
Use row_number as follows:
select t.*,
Case when row_number() over(partition by `group`
order by item desc) = 1
then 1 else 0 end as last_group_flag
from your_table t

Calculating Running totals across rows and grouping by ID

I want to compute running row totals across a table, however the totals must start over for new IDs
https://imgur.com/a/YgQmYQA
My code:
set @csum := 0;
select ID, name, marks, (@rt := @rt + marks) as Running_total from students order by ID;
The output returns the totals however doesn't break or start over for new IDs
Bro try this... It is tested on MSSQL..
select ID, name, marks,
marks + isnull(SUM(marks) OVER ( PARTITION BY ID ORDER BY ID ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) ,0) as Running_total
from students
You need to partition your running total by ID. A running total also needs an ordering column that defines the sequence in which values are accumulated. Assuming the running total within each ID is ordered by marks:
Approach 1: It can be written in a simple query if your DBMS supports Analytical Functions
SELECT ID
,name
,marks
,Running_total = SUM(marks) OVER (PARTITION BY ID ORDER BY marks ASC)
FROM students
Approach 2: You can make use of OUTER APPLY if your database version / DBMS itself does not support Analytical Functions
SELECT S.ID
,S.name
,S.marks
,Running_total = OA.runningtotalmarks
FROM students S
OUTER APPLY (
SELECT runningtotalmarks = SUM(SI.marks)
FROM students SI
WHERE SI.ID = S.ID
AND SI.marks <= S.marks
) OA;
Note: The above queries have been tested on MS SQL Server.
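For anyone who wants to try the partitioned running total without SQL Server, here is a minimal sketch using Python's sqlite3 module (SQLite 3.25+ assumed for window functions). The `students` rows are invented, and the explicit ROWS frame is my own addition so that rows with equal marks are accumulated one at a time rather than together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (ID INTEGER, name TEXT, marks INTEGER);
    INSERT INTO students VALUES
        (1, 'Ann', 10), (1, 'Ann', 20), (1, 'Ann', 5),
        (2, 'Bob', 7),  (2, 'Bob', 3);
""")

# PARTITION BY ID makes the sum start over for every new ID;
# ORDER BY marks defines the order in which values are accumulated.
running = conn.execute("""
    SELECT ID, name, marks,
           SUM(marks) OVER (
               PARTITION BY ID
               ORDER BY marks
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS Running_total
    FROM students
    ORDER BY ID, marks
""").fetchall()

print(running)
# [(1, 'Ann', 5, 5), (1, 'Ann', 10, 15), (1, 'Ann', 20, 35),
#  (2, 'Bob', 3, 3), (2, 'Bob', 7, 10)]
```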

Find most frequent value in SQL column

How can I find the most frequent value in a given column in an SQL table?
For example, for this table it should return two since it is the most frequent value:
one
two
two
three
SELECT
<column_name>,
COUNT(<column_name>) AS `value_occurrence`
FROM
<my_table>
GROUP BY
<column_name>
ORDER BY
`value_occurrence` DESC
LIMIT 1;
Replace <column_name> and <my_table>. Increase the LIMIT value from 1 to N if you want to see the N most common values of the column.
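A quick way to try this query on the sample values from the question is an in-memory SQLite database driven from Python. The table name `words` and column name `val` are placeholders chosen for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE words (val TEXT);
    INSERT INTO words VALUES ('one'), ('two'), ('two'), ('three');
""")

# Count each distinct value, sort by the count, keep the top one.
most_frequent = conn.execute("""
    SELECT val, COUNT(val) AS value_occurrence
    FROM words
    GROUP BY val
    ORDER BY value_occurrence DESC
    LIMIT 1
""").fetchone()

print(most_frequent)  # ('two', 2)
```

Note that LIMIT 1 returns a single row even when several values share the top count, so ties are silently dropped.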
Try something like:
SELECT `column`
FROM `your_table`
GROUP BY `column`
ORDER BY COUNT(*) DESC
LIMIT 1;
Let us consider table name as tblperson and column name as city. I want to retrieve the most repeated city from the city column:
select city,count(*) as nor from tblperson
group by city
having count(*) =(select max(nor) from
(select city,count(*) as nor from tblperson group by city) tblperson)
Here nor is an alias name.
The query below works well for me in a SQL Server database:
select column, COUNT(column) AS MOST_FREQUENT
from TABLE_NAME
GROUP BY column
ORDER BY COUNT(column) DESC
Result:
column MOST_FREQUENT
item1 highest count
item2 second highest
item3 third highest
..
..
For use with SQL Server,
as it does not support the LIMIT clause.
You can use TOP 1 to find the most frequently occurring value in a particular column (in this case, value):
SELECT TOP 1
[value],
COUNT([value]) AS [value_occurrence]
FROM
[my_table]
GROUP BY
[value]
ORDER BY
[value_occurrence] DESC;
Assuming Table is 'SalesLT.Customer' and the Column you are trying to figure out is 'CompanyName' and AggCompanyName is an Alias.
Select CompanyName, Count(CompanyName) as AggCompanyName from SalesLT.Customer
group by CompanyName
Order By Count(CompanyName) Desc;
If you can't use LIMIT, or LIMIT is not an option in your query tool, you can use ROWNUM instead (as in Oracle), but you will need a subquery:
SELECT FIELD_1, ALIAS1
FROM(SELECT FIELD_1, COUNT(FIELD_1) ALIAS1
FROM TABLENAME
GROUP BY FIELD_1
ORDER BY COUNT(FIELD_1) DESC)
WHERE ROWNUM = 1
If you have an ID column and you want to find the most frequent category in another column for each ID, you can use the query below:
Table:
Query:
SELECT ID, CATEGORY, COUNT(*) AS FREQ
FROM TABLE
GROUP BY 1,2
QUALIFY ROW_NUMBER() OVER(PARTITION BY ID ORDER BY FREQ DESC) = 1;
Result:
Return all most frequent rows in case of tie
Find the most frequent value in mysql,display all in case of a tie gives two possible approaches:
Scalar subquery:
SELECT
"country",
COUNT(country) AS "cnt"
FROM "Sales"
GROUP BY "country"
HAVING
COUNT("country") = (
SELECT COUNT("country") AS "cnt"
FROM "Sales"
GROUP BY "country"
ORDER BY "cnt" DESC
LIMIT 1
)
ORDER BY "country" ASC
With the RANK window function, available since MySQL 8.0:
SELECT "country", "cnt"
FROM (
SELECT
"country",
COUNT("country") AS "cnt",
RANK() OVER (ORDER BY COUNT(*) DESC) "rnk"
FROM "Sales"
GROUP BY "country"
) AS "sub"
WHERE "rnk" = 1
ORDER BY "country" ASC
This method might save a second recount compared to the first one.
RANK works by ranking all rows, such that if two rows are at the top, both get rank 1. So it basically directly solves this type of use case.
RANK is also available on SQLite and PostgreSQL; it is part of the window functions defined by the SQL standard.
In the above queries I also sorted by country to have more deterministic results.
Tested on SQLite 3.34.0, PostgreSQL 14.3, GitHub upstream.
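The tie behaviour of RANK can be checked with a small sketch like the following, run through Python's sqlite3 module (SQLite 3.25+ assumed). The `Sales` rows are invented so that two countries tie for first place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "Sales" ("country" TEXT);
    INSERT INTO "Sales" VALUES ('FR'), ('FR'), ('DE'), ('DE'), ('US');
""")

# RANK() gives every country with the top count rank 1, so ties survive
# the rnk = 1 filter, unlike a plain ORDER BY ... LIMIT 1.
top_countries = conn.execute("""
    SELECT "country", "cnt"
    FROM (
        SELECT "country",
               COUNT("country") AS "cnt",
               RANK() OVER (ORDER BY COUNT(*) DESC) AS "rnk"
        FROM "Sales"
        GROUP BY "country"
    ) AS "sub"
    WHERE "rnk" = 1
    ORDER BY "country" ASC
""").fetchall()

print(top_countries)  # [('DE', 2), ('FR', 2)]
```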
Most frequent for each GROUP BY group
MySQL: MySQL SELECT most frequent by group
PostgreSQL:
Get most common value for each value of another column in SQL
https://dba.stackexchange.com/questions/193307/find-most-frequent-values-for-a-given-column
SQLite: SQL query for finding the most frequent value of a grouped by value
SELECT TOP 20 WITH TIES COUNT(Counted_Column) AS Count, OtherColumn1,
OtherColumn2, OtherColumn3, OtherColumn4
FROM Table_or_View_Name
WHERE
(Date_Column >= '01/01/2023') AND
(Date_Column <= '03/01/2023') AND
(Counted_Column = 'Desired_Text')
GROUP BY OtherColumn1, OtherColumn2, OtherColumn3, OtherColumn4
ORDER BY COUNT(Counted_Column) DESC
20 can be changed to any desired number
WITH TIES allows all ties in the count to be displayed
Date range used if date/time column exists and can be modified to search a date range as desired
Counted_Column 'Desired_Text' can be modified to only count certain entries in that column
Works in INSQL for my instance
One way I like to use is:
select <given_column>, COUNT(<given_column>) as VAR1 from Table_Name
group by <given_column>
order by VAR1 desc
limit 1

group_concat in SQL Server 2008 [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Combine multiple results in a subquery into a single comma-separated value
Concat groups in SQL Server
I want to be able to get the duplicates removed
SELECT Count(Data) as Cnt, Id
FROM [db].[dbo].[View_myView]
Group By Data
HAVING Count(Data) > 1
In MySQL it was as simple as this:
SELECT Count(Data) AS Cnt, group_concat(Id)
FROM View_myView
Group By Data
Having Cnt > 1
Does anyone know of a solution? Examples are a plus!
In SQL Server as of version 2005 and newer, you can use a CTE (Common Table Expression) with the ROW_NUMBER function to eliminate duplicates:
;WITH LastPerUser AS
(
SELECT
ID, UserID, ClassID, SchoolID, Created,
ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY Created DESC) AS 'RowNum'
FROM dbo.YourTable
)
SELECT
ID, UserID, ClassID, SchoolID, Created
FROM LastPerUser
WHERE RowNum = 1
This CTE "partitions" your data by UserID, and for each partition, the ROW_NUMBER function hands out sequential numbers, starting at 1 and ordered by Created DESC - so the latest row gets RowNum = 1 (for each UserID) which is what I select from the CTE in the SELECT statement after it.
Using the same CTE, you can also easily delete duplicates:
;WITH LastPerUser AS
(
SELECT
ID, UserID, ClassID, SchoolID, Created,
ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY Created DESC) AS 'RowNum'
FROM dbo.YourTable
)
DELETE t
FROM dbo.YourTable t
INNER JOIN LastPerUser cte ON t.ID = cte.ID
WHERE cte.RowNum > 1
Same principle applies: you "group" (or partition) your data by some criteria, you consecutively number all the rows for each data partition, and those with values larger than 1 for the "partitioned row number" are weeded out by the DELETE.
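The "number the rows, delete everything above 1" principle can be tried out with a small sketch like this one. It uses SQLite via Python's sqlite3 module rather than SQL Server, so the DELETE is phrased with an IN subquery instead of T-SQL's DELETE ... FROM join syntax; the `your_table` rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (ID INTEGER PRIMARY KEY, UserID INTEGER, Created TEXT);
    INSERT INTO your_table VALUES
        (1, 100, '2020-01-01'),
        (2, 100, '2020-02-01'),
        (3, 200, '2020-01-05'),
        (4, 100, '2020-03-01');
""")

# Number each user's rows from newest to oldest, then delete every row
# whose number is greater than 1, leaving only the latest row per user.
conn.execute("""
    DELETE FROM your_table
    WHERE ID IN (
        SELECT ID
        FROM (SELECT ID,
                     ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY Created DESC) AS RowNum
              FROM your_table)
        WHERE RowNum > 1
    )
""")

remaining = conn.execute("SELECT ID, UserID, Created FROM your_table ORDER BY ID").fetchall()
print(remaining)  # [(3, 200, '2020-01-05'), (4, 100, '2020-03-01')]
```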
Just use distinct to remove duplicates. It sounds like you were using group_concat to join duplicates without actually wanting to use its value. In that case, MySQL also has a distinct you could have been using:
SELECT DISTINCT Count(Data) as Cnt, Id
FROM [db].[dbo].[View_myView]
GROUP BY Id
HAVING Count(Data) > 1
Also, every non-aggregated column in the SELECT list (here Id) has to appear in the GROUP BY, so I think you mean to group by Id. I corrected it in the example above.