MySQL GROUP BY Multiple Columns - mysql

I need to perform a GROUP BY on 2 columns separately...
In common terms, I'd like the query to say: GROUP BY column 1, then once this grouping has been performed, and the rows returned have been refined, go back to the top and GROUP BY column 2 to refine the rows returned again.
For instance, instead of stating:
GROUP BY column_1, column_2
I want to state (I understand this is incorrect syntax):
GROUP BY column_1
GROUP BY column_2
If this is unclear I can include a sample query with expected returned results.

Are you trying to do something like this?
select ...
from (
select ...
from some_table
where ...
group by column1
) as dt
group by column2
That's the closest thing I can think of that matches what your question appears to be asking.
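To make the two-step grouping concrete, here is a minimal sketch that runs against an in-memory SQLite database from Python (SQLite is only a stand-in for MySQL here). The sales table, its columns, and the function name are hypothetical and not from the question. The inner query groups by store, and the outer query groups those per-store rows by region.

```python
import sqlite3


def average_store_total_per_region(conn):
    """Group by store first, then group the per-store rows by region."""
    query = """
        SELECT region, COUNT(*) AS stores, AVG(store_total) AS avg_store_total
        FROM (
            -- first grouping: one row per store
            SELECT store, MIN(region) AS region, SUM(amount) AS store_total
            FROM sales
            GROUP BY store
        ) AS dt
        -- second grouping: applied to the result of the first
        GROUP BY region
        ORDER BY region
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("s1", "north", 10),
        ("s1", "north", 30),
        ("s2", "north", 20),
        ("s3", "south", 5),
        ("s3", "south", 15),
    ],
)
result = average_store_total_per_region(conn)
```

Note that the inner query has to expose the second grouping column (here through MIN(region), assuming each store belongs to one region), otherwise the outer GROUP BY has nothing to group on.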

You can group by multiple columns in MySQL:
select * from table group by col1, col2
But that won't give you the result you want. To group in two separate steps, you have to use a subquery:
select * from (select * from table group by col2) tabl group by col1

Related

MYSQL nested select query from same table

What I am trying to do is select each distinct column1 value from table1 and then select all the columns from those rows returned from the above. Is this possible at all?
What I have so far, however, nothing is returned:
SELECT * FROM (SELECT DISTINCT column1 FROM table1)
I've thought about putting a unique/distinct restriction in the where clause of the query:
SELECT * FROM table1 WHERE some_unique_determiner column1
Any ideas how I could go about achieving the desired output?
OK, so answering my own question: what I needed to do was group the data by column1, without a nested query. Many thanks to @VR46 for the help.
SELECT * FROM table1 GROUP BY column1
This returned one row, with all of its columns, for each unique value of column1.
In future posts, please include your table structures, sample input, and desired output so it is easier for us to understand.
If I understood correctly, there are two possibilities:
Either you have fully duplicated rows and want to eliminate them, in which case the correct query is
select distinct COLUMNa,COLUMNb,COLUMNc... ETC
which drops duplicates where the entire row is the same.
Or you want to eliminate rows that share the same column1, regardless of whether the other columns match.
In that case, you need to tell us which row of each group to keep: the most recent, the oldest, a random one, etc. Right now it is impossible to write a query that selects all the columns after the DISTINCT, because every duplicate would still be returned, like this:
SELECT * FROM TABLE WHERE COLUMN1 IN(SELECT DISTINCT COLUMN1 FROM TABLE)
This query is pointless, since it does not filter out any of the duplicates.
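A quick way to convince yourself of this is the sketch below, which uses Python's sqlite3 module as a stand-in for MySQL. The column named other and the sample rows are made up for illustration. It compares the IN (SELECT DISTINCT ...) query with a plain SELECT *, and also shows what SELECT DISTINCT over all columns removes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 TEXT, other TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("a", "x"), ("a", "y"), ("b", "z"), ("b", "z")],
)

# Every row of the table, for comparison.
all_rows = conn.execute(
    "SELECT * FROM table1 ORDER BY column1, other"
).fetchall()

# The IN (SELECT DISTINCT ...) form: every row's column1 is in the list.
filtered_rows = conn.execute(
    """
    SELECT * FROM table1
    WHERE column1 IN (SELECT DISTINCT column1 FROM table1)
    ORDER BY column1, other
    """
).fetchall()

# DISTINCT over the whole row only drops rows that are identical everywhere.
distinct_rows = conn.execute(
    "SELECT DISTINCT column1, other FROM table1 ORDER BY column1, other"
).fetchall()
```

Here filtered_rows has the same four rows as all_rows, while distinct_rows only loses the one fully identical copy.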

Select two last inserted rows from mysql based on a field

I have the following query:
SELECT DISTINCT field1, field2 FROM table1 WHERE something = 'x' ORDER BY time_of_insertion DESC LIMIT 2
What I want is to get the last two inserted rows of the table, one with a certain field1 ('1' for example) and another with another field1 ('2' for example). So, it's not really the last two rows, what I want is the last one from one certain field1 and the last one from a different field1. When I tried the query, DISTINCT was not respected. Any ideas on why and on how to solve this?
I think a UNION will do what you want. Select a single row with your exact criteria, and then combine its resultset with that of another SELECT statement that selected the other row you wanted.
It's hard to be concrete and definitive when your example is rather generalized, but something along the lines of this:
(SELECT field1, field2 FROM table1 WHERE something = 'x' ORDER BY time_of_insertion DESC LIMIT 1)
UNION
(SELECT field1, field2 FROM table1 WHERE something = 'y' ORDER BY time_of_insertion DESC LIMIT 1);
Notice it's one statement: only the second SELECT is followed by a terminating semicolon. The parentheses are needed so that each ORDER BY ... LIMIT applies to its own SELECT rather than to the whole UNION.
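Here is a small runnable sketch of the same idea, using Python's sqlite3 module as a stand-in for MySQL. It filters on field1 as the question describes. The sample rows and the integer time_of_insertion values are assumptions. Because SQLite does not accept ORDER BY ... LIMIT directly inside one branch of a compound SELECT, each branch is wrapped in an aliased derived table, a form MySQL should accept as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (field1 TEXT, field2 TEXT, time_of_insertion INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("1", "a", 1), ("1", "b", 3), ("2", "c", 2), ("2", "d", 4), ("3", "e", 5)],
)

# One branch per wanted field1 value; each branch keeps only its newest row.
query = """
    SELECT field1, field2 FROM (
        SELECT field1, field2 FROM table1
        WHERE field1 = ?
        ORDER BY time_of_insertion DESC LIMIT 1
    ) AS first_pick
    UNION ALL
    SELECT field1, field2 FROM (
        SELECT field1, field2 FROM table1
        WHERE field1 = ?
        ORDER BY time_of_insertion DESC LIMIT 1
    ) AS second_pick
"""
# The order of the combined rows is not guaranteed, so sort them in Python.
latest_rows = sorted(conn.execute(query, ("1", "2")).fetchall())
```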

Combine multiple rows (of one column) from mysql request into one field

I'd like to gather the result of one query into one single field if possible.
The request is :
select group_concat(col1) from table1
group by col3, col2
having count(*)>1
The result is like :
'123','124','125'
'123','125'
'126','127'
'123','127'
The result I'm looking for :
'123','124','125','123','125','126','127','123','127'
I tried applying group_concat again, using the concat function, and using this whole query as a subquery, without much success...
You're looking for the GROUP_CONCAT aggregation function, which you found. But you need to group the intermediate results again.
Apparently you have tried that, but since you didn't post your query, I'll give it to you:
select group_concat(col1intermediate) as col1total
from
(select group_concat(col1) as col1intermediate
from table1
group by col3, col2
having count(*)>1) as alias_subquery
I don't know why your own attempt failed. Maybe you forgot to add an alias (col1intermediate) to the aggregated column?
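For reference, the nested GROUP_CONCAT can be tried out with the sketch below, which uses Python's sqlite3 module as a stand-in for MySQL (SQLite also has group_concat with a comma as the default separator). The sample rows are made up. The order of values inside the concatenated string is not guaranteed, so the sketch sorts them before comparing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (123, "a", "x"),
        (124, "a", "x"),
        (125, "a", "x"),
        (126, "b", "y"),
        (127, "b", "y"),
        (128, "c", "z"),  # its group has a single row, so HAVING drops it
    ],
)

query = """
    SELECT group_concat(col1intermediate) AS col1total
    FROM (
        SELECT group_concat(col1) AS col1intermediate
        FROM table1
        GROUP BY col3, col2
        HAVING count(*) > 1
    ) AS alias_subquery
"""
col1total = conn.execute(query).fetchone()[0]
# Split the single field back into its values to inspect them.
values = sorted(col1total.split(","))
```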

using distinct with all attributes

We can use * to select all attributes from a table. I am using DISTINCT and my table contains 16 columns. How can I use DISTINCT with it? I cannot do select distinct Id,* from abc;
What would be the best way?
Another way could be select distinct id,col1,col2 etc.
If you want one row per id in the results, you can use GROUP BY id. But then it's not advisable to use the other columns bare in the SELECT list (even if MySQL allows it, which depends on whether the ONLY_FULL_GROUP_BY SQL mode is enabled). It's advisable to use the other columns with aggregate functions like MIN(), MAX(), COUNT(), etc. In MySQL, there is also a GROUP_CONCAT() aggregate function that collects all values from a column for a group:
SELECT
id
, COUNT(*) AS number_of_rows_with_same_id
, MIN(col1) AS min_col1
, MAX(col1) AS max_col1
--
, GROUP_CONCAT(col1) AS gc_col1
, GROUP_CONCAT(col2) AS gc_col2
--
, GROUP_CONCAT(col16) AS gc_col16
FROM
abc
GROUP BY
id ;
The query:
SELECT *
FROM abc
GROUP BY id ;
is not valid in standard SQL up to SQL-92, because it has non-aggregated columns in the SELECT list. SQL:2003 and later allow such a query, but only when the other columns are functionally dependent on the grouping column (id), which is not the case here. MySQL unfortunately allows such queries when ONLY_FULL_GROUP_BY is disabled, and then does no checking of functional dependency.
So you never know which row (of the many with the same id) will be returned, or even whether (horror!) you get values from different rows with the same id. As @Andriy comments, the consequence is that values for columns other than id are chosen arbitrarily. If you want predictable results, just don't use this technique.
An example solution: If you want just one row from every id, and you have a datetime or timestamp (or some other) column that you can use for ordering, you can do this:
SELECT t.*
FROM abc AS t
JOIN
( SELECT id
, MIN(some_column) AS m -- or MAX()
FROM abc
GROUP BY id
) AS g
ON g.id = t.id
AND g.m = t.some_column ;
This will work as long as the (id, some_column) combination is unique.
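As a runnable illustration of this join, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The col1 column and the sample rows are assumptions. Each (id, some_column) pair is unique in the sample data, matching the condition stated above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE abc (id INTEGER, some_column INTEGER, col1 TEXT)")
conn.executemany(
    "INSERT INTO abc VALUES (?, ?, ?)",
    [(1, 10, "a"), (1, 20, "b"), (2, 5, "c"), (2, 7, "d"), (3, 1, "e")],
)

# For each id, keep the full row whose some_column is the smallest.
query = """
    SELECT t.*
    FROM abc AS t
    JOIN (
        SELECT id, MIN(some_column) AS m
        FROM abc
        GROUP BY id
    ) AS g
      ON g.id = t.id
     AND g.m = t.some_column
    ORDER BY t.id
"""
one_row_per_id = conn.execute(query).fetchall()
```

If two rows shared the same id and the same minimum some_column, both would come back, which is why the uniqueness condition matters.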
Use GROUP BY instead of DISTINCT:
group by col1, col2, col3
It behaves like DISTINCT.
SELECT DISTINCT * FROM `some_table`
is absolutely valid syntax.
The error is caused by selecting Id, *: MySQL does not accept an unqualified * after another item in the select list. Besides, * already includes the Id column, which is usually unique anyway.
So what you'll need in your case is just:
SELECT DISTINCT * FROM `abc`
SELECT * FROM abc where id in(select distinct id from abc);
You can totally do this.
Hope this helps.
Initially I thought this would work, but GROUP BY is the better option. This query does the same as select * from abc. Sorry, guys.

Find and remove duplicate rows by two columns

I read all the relevant duplicated questions/answers and I found this to be the most relevant answer:
INSERT IGNORE INTO temp(MAILING_ID,REPORT_ID)
SELECT DISTINCT MAILING_ID,REPORT_ID FROM table_1
;
The problem is that I want to remove duplicates by col1 and col2, but I also want the insert to include all the other fields of table_1.
I tried to add all the relevant columns this way:
INSERT IGNORE INTO temp(M_ID,MAILING_ID,REPORT_ID,
MAILING_NAME,VISIBILITY,EXPORTED) SELECT DISTINCT
M_ID,MAILING_ID,REPORT_ID,MAILING_NAME,VISIBILITY,
EXPORTED FROM table_1
;
The columns are M_ID(int,primary),MAILING_ID(int),REPORT_ID(int),
MAILING_NAME(varchar),VISIBILITY(varchar),EXPORTED(int).
But it inserted all rows into temp (including duplicates).
The best way to delete duplicate rows by multiple columns is the simplest one:
Add a UNIQUE index:
ALTER IGNORE TABLE your_table ADD UNIQUE (field1,field2,field3);
The IGNORE above makes sure that only the first row found is kept and the rest are discarded. Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions.
(You can then drop that index if you need to allow future duplicates and/or know they won't happen again.)
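A related way to get the same effect without ALTER IGNORE is to copy the rows into a table that already has the unique key and let the insert skip the conflicting rows, which is the temp-table idea from the question. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, so it uses SQLite's INSERT OR IGNORE where MySQL would use INSERT IGNORE. The target table name deduped and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE table_1 (
        M_ID INTEGER PRIMARY KEY,
        MAILING_ID INTEGER,
        REPORT_ID INTEGER,
        MAILING_NAME TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?, ?)",
    [(1, 10, 100, "a"), (2, 10, 100, "b"), (3, 11, 100, "c"), (4, 10, 100, "d")],
)

# Same columns, plus a unique key on the two columns that define a duplicate.
conn.execute(
    """
    CREATE TABLE deduped (
        M_ID INTEGER PRIMARY KEY,
        MAILING_ID INTEGER,
        REPORT_ID INTEGER,
        MAILING_NAME TEXT,
        UNIQUE (MAILING_ID, REPORT_ID)
    )
    """
)

# Rows that would violate the unique key are skipped instead of raising.
conn.execute(
    """
    INSERT OR IGNORE INTO deduped
    SELECT M_ID, MAILING_ID, REPORT_ID, MAILING_NAME
    FROM table_1
    ORDER BY M_ID
    """
)
pairs = conn.execute(
    "SELECT MAILING_ID, REPORT_ID FROM deduped ORDER BY MAILING_ID, REPORT_ID"
).fetchall()
```

Unlike SELECT DISTINCT over all columns, the unique key only looks at MAILING_ID and REPORT_ID, so rows that differ in M_ID or MAILING_NAME still count as duplicates.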
This works perfectly in any version of MySQL including 5.7+. It also handles the error You can't specify target table 'my_table' for update in FROM clause by using a double-nested subquery. It only deletes ONE duplicate row (the later one) so if you have 3 or more duplicates, you can run the query multiple times. It never deletes unique rows.
DELETE FROM my_table
WHERE id IN (
SELECT calc_id FROM (
SELECT MAX(id) AS calc_id
FROM my_table
GROUP BY identField1, identField2
HAVING COUNT(id) > 1
) temp
)
I needed this query because I wanted to add a UNIQUE index on two columns but there were some duplicate rows that I needed to discard first.
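Since each run removes only one duplicate per group, the "run it multiple times" step can be scripted. The sketch below repeats the DELETE until nothing is left to delete, using Python's sqlite3 module as a stand-in for MySQL (SQLite does not raise the target-table error, but the double-nested form runs there too). The sample rows are made up.

```python
import sqlite3

DELETE_SQL = """
    DELETE FROM my_table
    WHERE id IN (
        SELECT calc_id FROM (
            SELECT MAX(id) AS calc_id
            FROM my_table
            GROUP BY identField1, identField2
            HAVING COUNT(id) > 1
        ) temp
    )
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER PRIMARY KEY, identField1 TEXT, identField2 TEXT)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "a", "x"),
        (2, "a", "x"),
        (3, "a", "x"),
        (4, "b", "y"),
        (5, "b", "y"),
        (6, "c", "z"),
    ],
)

# Each pass deletes the highest id of every group that still has duplicates.
passes = 0
while True:
    cursor = conn.execute(DELETE_SQL)
    if cursor.rowcount == 0:
        break
    passes += 1

remaining = conn.execute("SELECT * FROM my_table ORDER BY id").fetchall()
```

With three copies of one pair in the sample data, two deleting passes are needed, and the row with the lowest id in each group survives.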
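For MySQL:
DELETE t1 FROM yourtable t1
INNER JOIN yourtable t2 ON t1.id < t2.id
AND t1.identField1 = t2.identField1
AND t1.identField2 = t2.identField2;
This deletes every row that has a duplicate with a larger id, so the row with the highest id in each group is kept.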
You will first need to find your duplicates by grouping on the two fields with a having clause.
Select identField1, identField2, count(*) FROM yourTable
GROUP BY identField1, identField2
HAVING count(*) >1
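If this returns what you want, you can then use it as a subquery in the DELETE. The subquery must return only the columns being compared, and it is wrapped in a derived table because MySQL does not let a DELETE select directly from its own target table:
DELETE FROM yourTable WHERE (identField1, identField2) IN (
SELECT identField1, identField2 FROM (
SELECT identField1, identField2 FROM yourTable
GROUP BY identField1, identField2
HAVING count(*) >1 ) AS dup )
Be aware that this removes every row of each duplicated pair, not just the extra copies.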
You can always get the primary ids by grouping on the two fields that should be unique:
select count(*) as cnt, max(id) as id from your_table group by col_a, col_b having count(*) > 1;
and then
delete from your_table where id in (select id from (select max(id) as id from your_table group by col_a, col_b having count(*) > 1) as dup);
Because max() picks one id per group, run the delete repeatedly until no duplicates remain.
NOTE: This solution is an alternative & old school solution.
If you couldn't achieve what you wanted, then you can try my "oldschool" method:
First, run this query to get the duplicate records:
select column1,
column2,
count(*)
from table
group by column1,
column2
having count(*) > 1
order by count(*) desc
After that, select those results and paste them into Notepad++.
Now use Notepad++'s regex find and replace to turn each line into a "delete" query followed by an "insert" query, like this (from now on, for security reasons, my values will be AAAA).
Special note: add an empty new line after the last line of your data in Notepad++, because the regex matches the '\r\n' at the end of each line:
Find what regex: \D*(\d+)\D*(\d+)\D*\r\n
Replace with string: delete from table where column1 = $1 and column2 = $2; insert into table set column1 = $1, column2 = $2;\r\n
Finally, paste those queries into the MySQL Workbench query console and execute them. You will then see only one occurrence of each duplicate record.
This answer is for a relation table constructed of just two columns without ID. I think you can apply it to your situation.
On a large data set, suppose you are selecting multiple columns in the select clause, e.g.:
select x,y,z from table1.
and the requirement is to remove duplicates based on two columns, say y and z from the example above.
Then, instead of combining "group by" with a subquery, which performs badly, you can use a window function (available in MySQL 8.0 and later):
select x,y,z
from (
select x,y,z , row_number() over (partition by y,z) as index_num
from table1) main
where main.index_num=1
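The ROW_NUMBER() approach can be tried with the sketch below, which uses Python's sqlite3 module as a stand-in for MySQL and needs an SQLite build with window functions (3.25 or later). The sample rows are made up. The ORDER BY x inside OVER is my addition, not part of the answer: without it, which row of each (y, z) group gets number 1 is not defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (x INTEGER, y TEXT, z TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, "a", "p"), (2, "a", "p"), (3, "b", "q"), (4, "b", "q"), (5, "c", "r")],
)

# Number the rows inside each (y, z) group and keep only the first one.
query = """
    SELECT x, y, z
    FROM (
        SELECT x, y, z,
               ROW_NUMBER() OVER (PARTITION BY y, z ORDER BY x) AS index_num
        FROM table1
    ) AS main
    WHERE main.index_num = 1
    ORDER BY x
"""
first_per_group = conn.execute(query).fetchall()
```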