SELECT DISTINCT field1, field2, field3, ......
FROM table;
I am trying to accomplish the following SQL statement, but I want it to return all columns.
Is this possible?
Something like this:
SELECT DISTINCT field1, *
FROM table;
You're looking for a group by:
select *
from table
group by field1
Which can occasionally be written with a distinct on statement:
select distinct on (field1) *
from table
On most platforms, however, neither of the above will work because the behavior on the other columns is unspecified. (The first works in MySQL, if that's what you're using.)
You could fetch the distinct fields and stick to picking a single arbitrary row each time.
On some platforms (e.g. PostgreSQL, Oracle, T-SQL) this can be done directly using window functions:
select *
from (
select *,
row_number() over (partition by field1 order by field2) as row_number
from table
) as rows
where row_number = 1
On others (e.g. older versions of MySQL and SQLite that lack window functions), you'll need to write subqueries that join the entire table with itself (example), so this is not recommended.
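To make the window-function approach above concrete, here is a minimal sketch in Python using the standard sqlite3 module. It assumes the bundled SQLite is 3.25 or newer (the first version with window functions); the table t and its rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (field1 TEXT, field2 INTEGER, field3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", 2, "x"), ("a", 1, "y"), ("b", 5, "z"), ("b", 7, "w")],
)

# Number the rows inside each field1 group (lowest field2 first)
# and keep only the first row of every group.
rows = conn.execute(
    """
    SELECT field1, field2, field3
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY field1 ORDER BY field2) AS rn
        FROM t
    ) AS ranked
    WHERE rn = 1
    ORDER BY field1
    """
).fetchall()
print(rows)
```

The ORDER BY inside OVER decides which row of each group survives, so the result is deterministic as long as field2 is unique within a group.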
From the phrasing of your question, I understand that you want to select the distinct values for a given field and, for each such value, to have all the other column values in the same row listed. Most DBMSs will not allow this with either DISTINCT or GROUP BY, because the result is not determined.
Think of it like this: if a value of field1 occurs more than once, which value of field2 should be listed (given that two rows have the same value for field1 but two distinct values of field2)?
You can, however, use aggregate functions (explicitly, for every field that you want to be shown) with a GROUP BY instead of DISTINCT:
SELECT field1, MAX(field2), COUNT(field3), SUM(field4), ....
FROM table GROUP BY field1
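As a small illustration of that aggregate query, here is a sketch using Python's sqlite3 module. The table t, its four columns and the sample rows are assumptions made up for the example.

```python
import sqlite3

# Made-up rows: two for field1 = 'a' (one with a NULL field3), one for 'b'.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (field1 TEXT, field2 INTEGER, field3 TEXT, field4 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [("a", 1, "x", 10), ("a", 3, None, 20), ("b", 2, "y", 5)],
)

# Every non-grouped column is wrapped in an aggregate, so each output
# row is fully determined. COUNT(field3) skips NULL values.
summary = conn.execute(
    """
    SELECT field1, MAX(field2), COUNT(field3), SUM(field4)
    FROM t
    GROUP BY field1
    ORDER BY field1
    """
).fetchall()
print(summary)
```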
If I understood your problem correctly, it's similar to one I just had. You want to be able to limit DISTINCT to a specified field, rather than applying it to all the data.
If you use GROUP BY without an aggregate function, whichever field you GROUP BY will be your DISTINCT field.
If you make your query:
SELECT * from table GROUP BY field1;
It will show all your results based on a single instance of field1.
For example, if you have a table with name, address and city. A single person has multiple addresses recorded, but you just want a single address for the person, you can query as follows:
SELECT * FROM persons GROUP BY name;
The result will be that only one instance of that name will appear with its address, and the other ones will be omitted from the resulting table. Caution: if your fields hold atomic values such as firstName and lastName, you want to group by both.
SELECT * FROM persons GROUP BY lastName, firstName;
because if two people have the same last name and you only group by lastName, one of those persons will be omitted from the results. You need to take those things into consideration. Hope this helps.
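For anyone who wants to try this behaviour, here is a rough sketch with Python's sqlite3 module. SQLite is one of the engines that accepts SELECT * together with GROUP BY. The persons rows are invented for the example, and which address is kept for a name is arbitrary, so the code only looks at the names.

```python
import sqlite3

# Made-up data: one person with two addresses, one with a single address.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (name TEXT, address TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO persons VALUES (?, ?, ?)",
    [
        ("Alice", "1 Main St", "Springfield"),
        ("Alice", "9 Oak Ave", "Shelbyville"),
        ("Bob", "4 Elm Rd", "Springfield"),
    ],
)

# SQLite returns one row per name; the non-grouped columns come from
# an arbitrary row of each group.
rows = conn.execute(
    "SELECT * FROM persons GROUP BY name ORDER BY name"
).fetchall()
names = [row[0] for row in rows]
print(names)
```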
That's a really good question. I have read some useful answers here already, but probably I can add a more precise explanation.
Reducing the number of query results with a GROUP BY statement is easy as long as you don't query additional information. Let's assume you have the following table 'locations'.
--country-- --city--
France Lyon
Poland Krakow
France Paris
France Marseille
Italy Milano
Now the query
SELECT country FROM locations
GROUP BY country
will result in:
--country--
France
Poland
Italy
However, the following query
SELECT country, city FROM locations
GROUP BY country
...throws an error in MS SQL, because how could your computer know which of the three French cities "Lyon", "Paris" or "Marseille" you want to read in the field to the right of "France"?
In order to correct the second query, you must add this information. One way to do this is to use the functions MAX() or MIN(), selecting the biggest or smallest value among all candidates. MAX() and MIN() are not only applicable to numeric values, but also compare the alphabetical order of string values.
SELECT country, MAX(city) FROM locations
GROUP BY country
will result in:
--country-- --city--
France Paris
Poland Krakow
Italy Milano
or:
SELECT country, MIN(city) FROM locations
GROUP BY country
will result in:
--country-- --city--
France Lyon
Poland Krakow
Italy Milano
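The MIN()/MAX() results above can be reproduced with a short sketch in Python's sqlite3 module. It uses the same locations rows as this answer; running it on SQLite rather than MS SQL is my own choice for the sake of a self-contained example.

```python
import sqlite3

# The 'locations' table from the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locations (country TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO locations VALUES (?, ?)",
    [
        ("France", "Lyon"),
        ("Poland", "Krakow"),
        ("France", "Paris"),
        ("France", "Marseille"),
        ("Italy", "Milano"),
    ],
)

# MIN() and MAX() compare strings alphabetically, so they pick the
# first and last city name of each country.
extremes = conn.execute(
    """
    SELECT country, MIN(city), MAX(city)
    FROM locations
    GROUP BY country
    ORDER BY country
    """
).fetchall()
print(extremes)
```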
These functions are a good solution as long as you are fine with selecting your value from either end of the alphabetical (or numeric) order. But what if this is not the case? Let us assume that you need a value with a certain characteristic, e.g. starting with the letter 'M'. Now things get complicated.
The only solution I could find so far is to put your whole query into a subquery, and to construct the additional column outside of it by hand:
SELECT
countrylist.*,
(SELECT TOP 1 city
FROM locations
WHERE
country = countrylist.country
AND city like 'M%'
)
FROM
(SELECT country FROM locations
GROUP BY country) countrylist
will result in:
--country-- --city--
France Marseille
Poland NULL
Italy Milano
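Here is a sketch of the same correlated-subquery idea in Python's sqlite3 module. SQLite has no TOP 1, so the sketch swaps in LIMIT 1; the AS city alias is also my addition. The data is the locations table from this answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE locations (country TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO locations VALUES (?, ?)",
    [
        ("France", "Lyon"),
        ("Poland", "Krakow"),
        ("France", "Paris"),
        ("France", "Marseille"),
        ("Italy", "Milano"),
    ],
)

# For each distinct country, look up one city starting with 'M'.
# The scalar subquery yields NULL (None in Python) when nothing matches.
m_cities = conn.execute(
    """
    SELECT countrylist.country,
           (SELECT city
            FROM locations
            WHERE country = countrylist.country
              AND city LIKE 'M%'
            LIMIT 1) AS city
    FROM (SELECT country FROM locations GROUP BY country) AS countrylist
    ORDER BY countrylist.country
    """
).fetchall()
print(m_cities)
```

If a country had several cities starting with 'M', an ORDER BY inside the subquery would be needed to make the pick deterministic.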
SELECT c2.field1 ,
field2
FROM (SELECT DISTINCT
field1
FROM dbo.TABLE AS C
) AS c1
JOIN dbo.TABLE AS c2 ON c1.field1 = c2.field1
Great question #aryaxt -- you can tell it was a great question because you asked it 5 years ago and I stumbled upon it today trying to find the answer!
I just tried to edit the accepted answer to include this, but in case my edit does not make it in:
If your table was not that large, and assuming your primary key was an auto-incrementing integer you could do something like this:
SELECT
table.*
FROM table
--be able to take out dupes later
LEFT JOIN (
SELECT field, MAX(id) as id
FROM table
GROUP BY field
) as noDupes on noDupes.id = table.id
WHERE
--this will result in only the last instance being seen
noDupes.id is not NULL
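A small runnable sketch of this "join back on MAX(id)" idea, using Python's sqlite3 module. The items table and its rows are hypothetical, and I use an INNER JOIN, which should give the same rows as the LEFT JOIN plus IS NOT NULL filter above.

```python
import sqlite3

# Hypothetical table with an auto-incrementing integer primary key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, field TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO items (field, note) VALUES (?, ?)",
    [("x", "first"), ("y", "only"), ("x", "last")],
)

# The derived table holds the highest id per field value; joining on
# that id keeps the full row of the last instance only.
latest = conn.execute(
    """
    SELECT items.*
    FROM items
    JOIN (
        SELECT field, MAX(id) AS id
        FROM items
        GROUP BY field
    ) AS noDupes ON noDupes.id = items.id
    ORDER BY items.id
    """
).fetchall()
print(latest)
```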
Try
SELECT table.* FROM table
WHERE otherField = 'otherValue'
GROUP BY table.fieldWantedToBeDistinct
limit x
You can do it with a WITH clause.
For example:
WITH c AS (SELECT DISTINCT a, b, c FROM tableName)
SELECT * FROM tableName r, c WHERE c.rowid=r.rowid AND c.a=r.a AND c.b=r.b AND c.c=r.c
This also allows you to select only the rows selected in the WITH clause's query.
For SQL Server you can use the dense_rank and additional windowing functions to get all rows AND columns with duplicated values on specified columns. Here is an example...
with t as (
select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r1' union all
select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r2' union all
select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r3' union all
select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r4' union all
select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r5' union all
select col1 = 'a', col2 = 'a', col3 = 'a', other = 'r6'
), tdr as (
select
*,
total_dr_rows = count(*) over(partition by dr)
from (
select
*,
dr = dense_rank() over(order by col1, col2, col3),
dr_rn = row_number() over(partition by col1, col2, col3 order by other)
from
t
) x
)
select * from tdr where total_dr_rows > 1
This is taking a row count for each distinct combination of col1, col2, and col3.
select min(table.id), table.column1
from table
group by table.column1
SELECT *
FROM tblname
GROUP BY duplicate_values
ORDER BY ex.VISITED_ON DESC
LIMIT 0 , 30
The ORDER BY column here is just an example; you can also use the ID field instead.
Found this elsewhere here but this is a simple solution that works:
WITH cte AS /* Declaring a new table named 'cte' to be a clone of your table */
(SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY val1 DESC) AS rn
FROM MyTable /* Selecting only unique values based on the "id" field */
)
SELECT * /* Here you can specify several columns to retrieve */
FROM cte
WHERE rn = 1
This way you can get 2 unique columns with only 1 query:
select Distinct col1,col2 from '{path}' group by col1,col2
You can add more columns if needed.
Add a GROUP BY on the field you want to check for duplicates.
Your query may look like
SELECT field1, field2, field3, ...... FROM table GROUP BY field1
field1 will be checked to exclude duplicate records.
Or you may query like
SELECT * FROM table GROUP BY field1
Duplicate records of field1 are excluded from the SELECT.
Just include all of your fields in the GROUP BY clause.
It can be done by inner query
$query = "SELECT *
FROM (SELECT field
FROM table
ORDER BY id DESC) as rows
GROUP BY field";
SELECT * from table where field in (SELECT distinct field from table)
SELECT DISTINCT FIELD1, FIELD2, FIELD3 FROM TABLE1 works if the values of all three columns are unique in the table.
If, for example, you have multiple identical values for first name, but the last name and other information in the selected columns is different, the record will be included in the result set.
I would suggest using
SELECT * from table where field1 in
(
select distinct field1 from table
)
this way if you have the same value in field1 across multiple rows, all the records will be returned.
Related
I have this very simple table:
CREATE TABLE MyTable
(
Id INT(6) PRIMARY KEY,
Name VARCHAR(200) /* NOT UNIQUE */
);
If I want the Name(s) that is(are) the most frequent and the corresponding count(s), I can neither do this
SELECT Name, total
FROM table2
WHERE total = (SELECT MAX(total) FROM (SELECT Name, COUNT(*) AS total
FROM MyTable GROUP BY Name) table2);
nor this
SELECT Name, total
FROM (SELECT Name, COUNT(*) AS total FROM MyTable GROUP BY Name) table1
WHERE total = (SELECT MAX(total) FROM table1);
Also, (let's say the maximum count is 4) in the second proposition, if I replace the third line by
WHERE total = 4;
it works.
Why is that so?
Thanks a lot
You can try the following:
WITH stats as
(
SELECT Name
,COUNT(id) as count_ids
FROM MyTable
GROUP BY Name
)
SELECT Name
,count_ids
FROM
(
SELECT Name
,count_ids
,RANK() OVER(ORDER BY count_ids DESC) as rank_ -- this ranks all names
FROM stats
) s
WHERE rank_ = 1 -- the most popular
This should work in TSQL.
Your queries can't be executed because "total" is not a column of any table the outer query can see. It's not sufficient to define it within a subquery: the subquery has to be executed and produce the desired result before you can use it, and the alias of a derived table (table1, table2) cannot be referenced as a table from inside another subquery.
You should also consider using a window function, as proposed in Dimi's answer.
The advantage of such a function is that it can be much easier to read.
But you need to be careful since such functions often differ depending on the DB type.
If you want to go your way with a sub query, you can do something like this:
SELECT name, COUNT(name) AS total FROM myTable
GROUP BY name
HAVING COUNT(name) =
(SELECT MAX(sub.total) AS highestCount FROM
(SELECT Name, COUNT(*) AS total
FROM MyTable GROUP BY Name) sub);
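For reference, here is a self-contained sketch of the HAVING approach in Python's sqlite3 module. The names inserted into MyTable are made up; two of them tie for the highest count to show that all top names are returned.

```python
import sqlite3

# Made-up rows: Ann and Bob appear twice each, Cy once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, Name VARCHAR(200))")
conn.executemany(
    "INSERT INTO MyTable (Id, Name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Ann"), (4, "Cy"), (5, "Bob")],
)

# The inner derived table computes the count per name, the middle
# subquery takes the maximum, and HAVING keeps the names that reach it.
most_frequent = conn.execute(
    """
    SELECT Name, COUNT(*) AS total
    FROM MyTable
    GROUP BY Name
    HAVING COUNT(*) = (
        SELECT MAX(sub.total)
        FROM (SELECT COUNT(*) AS total FROM MyTable GROUP BY Name) AS sub
    )
    ORDER BY Name
    """
).fetchall()
print(most_frequent)
```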
I created a fiddle example which shows both queries mentioned here will produce the same and correct result:
db<>fiddle
I'm stuck.
I'm trying to query one of my tables to obtain the maximum 'canister_change_date' with grouped pairs 'canister_type' and 'test_cell'.
I've put together a table with some dummy data (below) If you want the create table schema, let me know and I'll put it in the comments.
The final result would either need to have the id's or the whole row with id.
expected result (below) would have id's - 1, 2, 3, 5, 7, 8
6 should be removed as matching pair (test_cell =4, canister_type=Carbon Monoxide) and 7 to be taken as it has the later 'canister_change_date' date.
The expected result would either need to have the id's or the id's and the rest of the fields.
Thanks!
You can use GROUP BY on multiple columns just like that
SELECT COUNT(*) FROM my_table GROUP BY column1, column2
If you want to find the highest value of some column within each group, you need the MAX() aggregate function. You can combine it with GROUP BY like this
SELECT column1, column2, MAX(max_column)
FROM my_table
GROUP BY column1, column2
This example assumes you want to find highest value of max_column for each unique pair of column1 and column2
With NOT EXISTS:
select t.* from tablename t
where not exists (
select 1 from tablename
where test_cell = t.test_cell and canister_type = t.canister_type
and canister_change_date > t.canister_change_date
)
or if your version of MySQL is 8.0+ and supports window functions:
select t.* from (
select *,
row_number() over (partition by test_cell, canister_type order by canister_change_date desc) rn
from tablename
) t
where t.rn = 1
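Below is a minimal sketch of the NOT EXISTS variant, run through Python's sqlite3 module. The question's dummy data is not reproduced here, so the four rows are invented for illustration, and the dates are stored as ISO strings so that plain comparison orders them correctly.

```python
import sqlite3

# Invented rows: ids 2 and 3 share the same (test_cell, canister_type) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tablename (
        id INTEGER PRIMARY KEY,
        test_cell INTEGER,
        canister_type TEXT,
        canister_change_date TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [
        (1, 1, "Carbon Monoxide", "2019-01-10"),
        (2, 4, "Carbon Monoxide", "2019-02-01"),
        (3, 4, "Carbon Monoxide", "2019-03-15"),
        (4, 4, "Propane", "2019-01-20"),
    ],
)

# Keep a row only if no other row of the same pair has a later date.
ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.id
        FROM tablename t
        WHERE NOT EXISTS (
            SELECT 1
            FROM tablename x
            WHERE x.test_cell = t.test_cell
              AND x.canister_type = t.canister_type
              AND x.canister_change_date > t.canister_change_date
        )
        ORDER BY t.id
        """
    )
]
print(ids)
```

If two rows of the same pair share the latest date, both are returned; the row_number() version picks just one of them.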
I want to use ORDER BY on each query of a UNION ALL, but I can't figure out the right syntax. This is what I want:
(
SELECT id, user_id, other_id, name
FROM tablename
WHERE user_id = 123 AND user_in IN (...)
ORDER BY name
)
UNION ALL
(
SELECT id, user_id, other_id, name
FROM tablename
WHERE user_id = 456 AND user_id NOT IN (...)
ORDER BY name
)
EDIT:
Just to be clear: I need two ordered lists like this, not one:
1
2
3
1
2
3
4
5
Thank you very much!
Something like this should work in MySQL:
SELECT a.*
FROM (
SELECT ... FROM ... ORDER BY ...
) a
UNION ALL
SELECT b.*
FROM (
SELECT ... FROM ... ORDER BY ...
) b
to return rows in an order we'd like them returned. i.e. MySQL seems to honor the ORDER BY clauses inside the inline views.
But, without an ORDER BY clause on the outermost query, the order that the rows are returned is not guaranteed.
If we need the rows returned in a particular sequence, we can include an ORDER BY on the outermost query. In a lot of use cases, we can just use an ORDER BY on the outermost query to satisfy the results.
But when we have a use case where we need all the rows from the first query returned before all the rows from the second query, one option is to include an extra discriminator column in each of the queries. For example, add ,'a' AS src in the first query, ,'b' AS src to the second query.
Then the outermost query could include ORDER BY src, name, to guarantee the sequence of the results.
FOLLOWUP
In your original query, the ORDER BY in your queries is discarded by the optimizer; since there is no ORDER BY applied to the outer query, MySQL is free to return the rows in whatever order it wants.
The "trick" in query in my answer (above) is dependent on behavior that may be specific to some versions of MySQL.
Test case:
populate tables
CREATE TABLE foo2 (id INT PRIMARY KEY, role VARCHAR(20)) ENGINE=InnoDB;
CREATE TABLE foo3 (id INT PRIMARY KEY, role VARCHAR(20)) ENGINE=InnoDB;
INSERT INTO foo2 (id, role) VALUES
(1,'sam'),(2,'frodo'),(3,'aragorn'),(4,'pippin'),(5,'gandalf');
INSERT INTO foo3 (id, role) VALUES
(1,'gimli'),(2,'boromir'),(3,'elron'),(4,'merry'),(5,'legolas');
query
SELECT a.*
FROM ( SELECT s.id, s.role
FROM foo2 s
ORDER BY s.role
) a
UNION ALL
SELECT b.*
FROM ( SELECT t.id, t.role
FROM foo3 t
ORDER BY t.role
) b
resultset returned
id role
------ ---------
3 aragorn
2 frodo
5 gandalf
4 pippin
1 sam
2 boromir
3 elron
1 gimli
5 legolas
4 merry
The rows from foo2 are returned "in order", followed by the rows from foo3, again, "in order".
Note (again) that this behavior is NOT guaranteed. (The behavior we observe is a side effect of how MySQL processes inline views (derived tables). This behavior may be different in versions after 5.5.)
If you need the rows returned in a particular order, then specify an ORDER BY clause for the outermost query. And that ordering will apply to the entire resultset.
As I mentioned earlier, if I needed the rows from the first query first, followed by the second query, I would include a "discriminator" column in each query, and then include the "discriminator" column in the ORDER BY clause. I would also do away with the inline views, and do something like this:
SELECT s.id, s.role, 's' AS src
FROM foo2 s
UNION ALL
SELECT t.id, t.role, 't' AS src
FROM foo3 t
ORDER BY src, role
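The discriminator-column query can be checked with a short sketch in Python's sqlite3 module. It reuses the foo2 and foo3 rows from the test case above; running it on SQLite instead of MySQL is an assumption made so the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo2 (id INT PRIMARY KEY, role VARCHAR(20))")
conn.execute("CREATE TABLE foo3 (id INT PRIMARY KEY, role VARCHAR(20))")
conn.executemany(
    "INSERT INTO foo2 (id, role) VALUES (?, ?)",
    [(1, "sam"), (2, "frodo"), (3, "aragorn"), (4, "pippin"), (5, "gandalf")],
)
conn.executemany(
    "INSERT INTO foo3 (id, role) VALUES (?, ?)",
    [(1, "gimli"), (2, "boromir"), (3, "elron"), (4, "merry"), (5, "legolas")],
)

# The src column marks which SELECT a row came from; the single outer
# ORDER BY sorts by source first and by role within each source.
roles = [
    row[1]
    for row in conn.execute(
        """
        SELECT s.id, s.role, 's' AS src FROM foo2 s
        UNION ALL
        SELECT t.id, t.role, 't' AS src FROM foo3 t
        ORDER BY src, role
        """
    )
]
print(roles)
```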
Don't use ORDER BY in an individual SELECT statement inside a UNION, unless you're using LIMIT with it.
The MySQL docs on UNION explain why (emphasis mine):
To apply ORDER BY or LIMIT to an individual SELECT, place the clause
inside the parentheses that enclose the SELECT:
(SELECT a FROM t1 WHERE a=10 AND B=1 ORDER BY a LIMIT 10) UNION
(SELECT a FROM t2 WHERE a=11 AND B=2 ORDER BY a LIMIT 10);
However, use of ORDER BY for individual SELECT statements implies
nothing about the order in which the rows appear in the final result
because UNION by default produces an unordered set of rows. Therefore,
the use of ORDER BY in this context is typically in conjunction with
LIMIT, so that it is used to determine the subset of the selected rows
to retrieve for the SELECT, even though it does not necessarily affect
the order of those rows in the final UNION result. If ORDER BY appears
without LIMIT in a SELECT, it is optimized away because it will have
no effect anyway.
To use an ORDER BY or LIMIT clause to sort or limit the entire UNION
result, parenthesize the individual SELECT statements and place the
ORDER BY or LIMIT after the last one. The following example uses both
clauses:
(SELECT a FROM t1 WHERE a=10 AND B=1)
UNION
(SELECT a FROM t2 WHERE a=11 AND B=2)
ORDER BY a LIMIT 10;
It seems like an ORDER BY clause like the following will get you what you want:
ORDER BY user_id, name
You just use one ORDER BY at the very end.
The Union turns two selects into one logical select. The order-by applies to the entire set, not to each part.
Don't use any parens either. Just:
SELECT 1 as Origin, blah blah FROM foo WHERE x
UNION ALL
SELECT 2 as Origin, blah blah FROM foo WHERE y
ORDER BY Origin, z
(SELECT id, user_id, other_id, name
FROM tablename
WHERE user_id = 123
AND user_in IN (...))
UNION ALL
(SELECT id, user_id, other_id, name
FROM tablename
WHERE user_id = 456
AND user_id NOT IN (...))
ORDER BY name
You can also simplify this query:
SELECT id, user_id, other_id, name
FROM tablename
WHERE (user_id = 123 AND user_in IN (...))
OR (user_id = 456 AND user_id NOT IN (...))
let's say I have the following Table:
ID, Name
1, John
2, Jim
3, Steve
4, Tom
I run the following query
SELECT Id FROM Table WHERE NAME IN ('John', 'Jim', 'Bill');
I want to get something like:
ID
1
2
NULL or 0
Is it possible?
How about this?
SELECT Id FROM Table WHERE NAME IN ('John', 'Jim', 'Bill')
UNION
SELECT null;
Start by creating a subquery of names you're looking for, then left join the subquery to your table:
SELECT myTable.ID
FROM (
SELECT 'John' AS Name
UNION SELECT 'Jim'
UNION SELECT 'Bill'
) NameList
LEFT JOIN myTable ON NameList.Name = myTable.Name
This will return null for each name that isn't found. To return a zero instead, just start the query with SELECT COALESCE(myTable.ID, 0) instead of SELECT myTable.ID.
There's a SQL Fiddle here.
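For completeness, here is a sketch of the same left join in Python's sqlite3 module, using the four rows from the question. The table name people is my own stand-in, and the Name column is added to the output only to make the result easier to read.

```python
import sqlite3

# The sample table from the question (named 'people' here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "John"), (2, "Jim"), (3, "Steve"), (4, "Tom")],
)

# Every requested name produces a row; COALESCE turns the NULL of an
# unmatched name into 0.
found = conn.execute(
    """
    SELECT NameList.Name, COALESCE(people.ID, 0)
    FROM (
        SELECT 'John' AS Name
        UNION SELECT 'Jim'
        UNION SELECT 'Bill'
    ) AS NameList
    LEFT JOIN people ON NameList.Name = people.Name
    ORDER BY NameList.Name
    """
).fetchall()
print(found)
```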
The question is a bit confusing. "IN" is a valid operator in SQL and it means a match with any of the values (see here):
SELECT Id FROM Table WHERE NAME IN ('John', 'Jim', 'Bill');
Is the same as:
SELECT Id FROM Table WHERE NAME = 'John' OR NAME = 'Jim' OR NAME = 'Bill';
In your question you seem to want a reply for each of the values, in order. This is accomplished by joining the results with UNION ALL (plain UNION eliminates duplicates and can change the order):
SELECT max(Id) FROM Table WHERE NAME = 'John' UNION ALL
SELECT max(Id) FROM Table WHERE NAME = 'Jim' UNION ALL
SELECT max(Id) FROM Table WHERE NAME = 'Bill';
The above will return one Id (the max) if there are matches and NULL if there are none (e.g. for Bill). Note that in general more than one row can match some of the names in your list; I used "max" to select one. You may be better off keeping the loop over the values outside the query, or using the (ID, Name) table in a join with other tables in your database, instead of making the list of IDs and then using it.
Is it possible to group_concat records by distinct Ids:
GROUP_CONCAT(Column2 BY DISTINCT Column1)
I need to get the values from column2 by distinct values from column1. Because there are repeating values in column 2, that's why I can't use distinct on column2.
Any thoughts on this? Thanks!
EDIT 1
Sample Table Records:
ID Value
1 A
1 A
2 B
3 B
4 C
4 C
Using GROUP_CONCAT the way I wanted [GROUP_CONCAT(Value BY DISTINCT Id)], I will have an output:
A, B, B, C
EDIT 2
Somehow got my group_concat working:
GROUP_CONCAT(DISTINCT CONCAT(Id, '|', Value))
This will display the concatenated values by distinct id, just need to get rid of the Id somewhere. You can do it without the concat function, but I need the separator. This may not be a good answer but I'll post it anyway.
Try this one (the simplest way):
SELECT GROUP_CONCAT(VALUE)
FROM
(
SELECT DISTINCT ID, VALUE
FROM TableName
) a
SQLFiddle Demo
The GROUP_CONCAT function supports the DISTINCT keyword.
SELECT id, GROUP_CONCAT(DISTINCT value) FROM table_name GROUP BY id
This should work:
SELECT GROUP_CONCAT(value) FROM (SELECT id, value FROM table GROUP BY id, value) AS d
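Here is a quick sketch of the "deduplicate first, then concatenate" approach in Python's sqlite3 module, using the sample records from the question. SQLite's group_concat stands in for MySQL's GROUP_CONCAT here, and the table name samples is an assumption. SQLite does not promise an order for the concatenated values, so the code sorts them before printing.

```python
import sqlite3

# Sample records from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (ID INTEGER, Value TEXT)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(1, "A"), (1, "A"), (2, "B"), (3, "B"), (4, "C"), (4, "C")],
)

# The inner query removes duplicate (ID, Value) pairs; the outer query
# concatenates what is left, one value per distinct ID.
(joined,) = conn.execute(
    """
    SELECT GROUP_CONCAT(Value)
    FROM (SELECT DISTINCT ID, Value FROM samples) AS a
    """
).fetchone()
values = sorted(joined.split(","))
print(values)
```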