How to count occurrences with derived tables in SQL? - mysql

I have this very simple table:
CREATE TABLE MyTable
(
Id INT(6) PRIMARY KEY,
Name VARCHAR(200) /* NOT UNIQUE */
);
If I want the Name(s) that is(are) the most frequent and the corresponding count(s), I can neither do this
SELECT Name, total
FROM table2
WHERE total = (SELECT MAX(total) FROM (SELECT Name, COUNT(*) AS total
FROM MyTable GROUP BY Name) table2);
nor this
SELECT Name, total
FROM (SELECT Name, COUNT(*) AS total FROM MyTable GROUP BY Name) table1
WHERE total = (SELECT MAX(total) FROM table1);
Also, (let's say the maximum count is 4) in the second proposition, if I replace the third line by
WHERE total = 4;
it works.
Why is that so?
Thanks a lot

You can try the following:
WITH stats as
(
SELECT Name
,COUNT(id) as count_ids
FROM MyTable
GROUP BY Name
)
SELECT Name
,count_ids
FROM
(
SELECT Name
,count_ids
,RANK() OVER(ORDER BY count_ids DESC) as rank_ -- this ranks all names
FROM stats
) s
WHERE rank_ = 1 -- the most popular
This should work in T-SQL, and also in MySQL 8.0 or later, which added CTEs and window functions.

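Your queries can't be executed because a derived table alias (table1, table2) and its "total" column exist only inside the query block whose FROM clause defines them. Your first query refers to table2 in the outer FROM although it is only defined inside the subquery, and your second query refers to table1 in the FROM of a separate subquery, where it is not visible as a table. Comparing with a literal (WHERE total = 4) works because it needs no second reference to the derived table.
You should also consider using a window function, as proposed in Dimi's answer.
The advantage of such a function is that the query can be much easier to read.
But you need to be careful, since support for such functions differs between database systems.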
If you want to go your way with a sub query, you can do something like this:
SELECT name, COUNT(name) AS total FROM myTable
GROUP BY name
HAVING COUNT(name) =
(SELECT MAX(sub.total) AS highestCount FROM
(SELECT Name, COUNT(*) AS total
FROM MyTable GROUP BY Name) sub);
I created a fiddle example which shows that both queries mentioned here produce the same, correct result:
db<>fiddle
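For anyone who wants to check this locally, here is a minimal sketch that runs both approaches against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here, which is an assumption (the window-function variant needs SQLite 3.25 or later), and the sample names and the helper most_frequent_names are made up for illustration.

```python
import sqlite3

def most_frequent_names(rows):
    """Return (by_having, by_rank): the most frequent names found two ways."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, Name VARCHAR(200))")
    con.executemany("INSERT INTO MyTable (Id, Name) VALUES (?, ?)", rows)

    # Subquery approach: compare each group's count with the maximum count.
    by_having = con.execute("""
        SELECT Name, COUNT(*) AS total
        FROM MyTable
        GROUP BY Name
        HAVING COUNT(*) = (SELECT MAX(sub.total)
                           FROM (SELECT COUNT(*) AS total
                                 FROM MyTable GROUP BY Name) sub)
        ORDER BY Name
    """).fetchall()

    # Window-function approach: rank the groups by count and keep rank 1.
    by_rank = con.execute("""
        WITH stats AS (SELECT Name, COUNT(Id) AS count_ids
                       FROM MyTable GROUP BY Name)
        SELECT Name, count_ids
        FROM (SELECT Name, count_ids,
                     RANK() OVER (ORDER BY count_ids DESC) AS rank_
              FROM stats) s
        WHERE rank_ = 1
        ORDER BY Name
    """).fetchall()
    con.close()
    return by_having, by_rank

# Hypothetical sample data: Ann and Bob tie for the highest count.
sample = [(1, "Ann"), (2, "Bob"), (3, "Ann"), (4, "Cid"), (5, "Bob")]
by_having, by_rank = most_frequent_names(sample)
print(by_having)
print(by_rank)
```

Both queries return every name that ties for the highest count, which is what the question asks for.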

Related

MySQL Rollup format

How do I format the "total" row of a rollup?
Background
I have a MySQL select statement that uses GROUP BY with ROLLUP. It works; however, for formatting reasons I need to identify which row is a "detail" and which row is a "total", using a simple RowType column that is 1 or 0. I figured this would work:
select
if (MyId is null, 1,0) as RowType,
MyId,
sum(Quantity) as Quantity
from MyTable
group by MyId with rollup
This does not work. However, if I create a view from the grouped select statement and then select from that view, it does:
Create view MyView as
select
MyId,
Sum(Quantity) as Quantity
from MyTable
group by MyId with rollup;
select
if (MyId is null, 1,0) as RowType,
MyId,
Quantity
from MyView;
Is there a better way? I am going to have to do this for a fair number of queries, and maintaining two sets is a good way to introduce errors.
edit: screwed up my second select code and fixed it
You can put the grouped query into a subquery.
This is needed because grouping happens after selecting, so the SELECT list can't refer to the value created by WITH ROLLUP in the same query.
select
if(MyId is null, 1,0) as RowType, MyId, Quantity
FROM (
SELECT MyId, sum(Quantity) as Quantity
from MyTable
group by MyId with rollup
) AS x

How to display multiple rows for one id in MySql? [duplicate]

SELECT DISTINCT field1, field2, field3, ......
FROM table;
I am trying to accomplish the following SQL statement, but I want it to return all columns.
Is this possible?
Something like this:
SELECT DISTINCT field1, *
FROM table;
You're looking for a group by:
select *
from table
group by field1
In PostgreSQL, this can also be written with DISTINCT ON:
select distinct on (field1) *
from table
On most platforms, however, neither of the above will work because the behavior on the other columns is unspecified. (The first works in MySQL, if that's what you're using.)
You could fetch the distinct fields and stick to picking a single arbitrary row each time.
On some platforms (e.g. PostgreSQL, Oracle, T-SQL) this can be done directly using window functions:
select *
from (
select *,
row_number() over (partition by field1 order by field2) as row_number
from table
) as rows
where row_number = 1
On others without window functions (MySQL before 8.0, SQLite before 3.25), you'll need to write subqueries that join the entire table with itself, so it's not recommended.
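As a rough illustration of the window-function variant, here is a sketch using Python's sqlite3 module with an in-memory database (this assumes SQLite 3.25 or later). The table name items, its columns, and the helper first_row_per_group are hypothetical stand-ins for the question's table and fields.

```python
import sqlite3

def first_row_per_group(rows):
    """Keep one full row per field1 value: the one with the lowest field2."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE items (field1 TEXT, field2 INTEGER, field3 TEXT)")
    con.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    result = con.execute("""
        SELECT field1, field2, field3
        FROM (SELECT *,
                     ROW_NUMBER() OVER (PARTITION BY field1
                                        ORDER BY field2) AS rn
              FROM items) AS numbered
        WHERE rn = 1
        ORDER BY field1
    """).fetchall()
    con.close()
    return result

# Hypothetical rows: field1 = 'a' appears twice, 'b' once.
picked = first_row_per_group([("a", 2, "x"), ("a", 1, "y"), ("b", 5, "z")])
print(picked)
```

The ORDER BY inside OVER decides which row survives for each field1 value, so the choice is explicit rather than arbitrary.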
From the phrasing of your question, I understand that you want to select the distinct values for a given field and, for each such value, to have all the other column values in the same row listed. Most DBMSs will not allow this with either DISTINCT or GROUP BY, because the result is not determined.
Think of it like this: if your field1 occurs more than once, what value of field2 will be listed (given that you have the same value for field1 in two rows but two distinct values of field2 in those two rows).
You can, however, use aggregate functions (explicitly, for every field that you want to be shown) and use a GROUP BY instead of DISTINCT:
SELECT field1, MAX(field2), COUNT(field3), SUM(field4), ....
FROM table GROUP BY field1
If I understood your problem correctly, it's similar to one I just had. You want to be able to limit DISTINCT to a specified field, rather than applying it to all the data.
If you use GROUP BY without an aggregate function, whichever field you GROUP BY will be your DISTINCT field.
If you make your query:
SELECT * from table GROUP BY field1;
It will show all your results based on a single instance of field1.
For example, if you have a table with name, address and city. A single person has multiple addresses recorded, but you just want a single address for the person, you can query as follows:
SELECT * FROM persons GROUP BY name;
The result will be that only one instance of that name will appear with its address, and the other one will be omitted from the resulting table. Caution: if your fields hold atomic values such as firstName and lastName, you want to group by both:
SELECT * FROM persons GROUP BY lastName, firstName;
because if two people have the same last name and you only group by lastName, one of those persons will be omitted from the results. You need to take those things into consideration. Hope this helps.
That's a really good question. I have read some useful answers here already, but probably I can add a more precise explanation.
Reducing the number of query results with a GROUP BY statement is easy as long as you don't query additional information. Let's assume you got the following table 'locations'.
--country-- --city--
France Lyon
Poland Krakow
France Paris
France Marseille
Italy Milano
Now the query
SELECT country FROM locations
GROUP BY country
will result in:
--country--
France
Poland
Italy
However, the following query
SELECT country, city FROM locations
GROUP BY country
...throws an error in MS SQL, because how could your computer know which of the three French cities "Lyon", "Paris" or "Marseille" you want to read in the field to the right of "France"?
In order to correct the second query, you must add this information. One way to do this is to use the functions MAX() or MIN(), selecting the biggest or smallest value among all candidates. MAX() and MIN() are not only applicable to numeric values, but also compare the alphabetical order of string values.
SELECT country, MAX(city) FROM locations
GROUP BY country
will result in:
--country-- --city--
France Paris
Poland Krakow
Italy Milano
or:
SELECT country, MIN(city) FROM locations
GROUP BY country
will result in:
--country-- --city--
France Lyon
Poland Krakow
Italy Milano
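If you want to reproduce the two result tables above, here is a small sketch using Python's sqlite3 module with an in-memory database. SQLite is an assumption made for convenience (the answer talks about MS SQL), the helper city_per_country is hypothetical, and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

LOCATIONS = [("France", "Lyon"), ("Poland", "Krakow"), ("France", "Paris"),
             ("France", "Marseille"), ("Italy", "Milano")]

def city_per_country(aggregate):
    """Run SELECT country, <aggregate>(city) ... GROUP BY country."""
    if aggregate not in ("MAX", "MIN"):
        raise ValueError("aggregate must be MAX or MIN")
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE locations (country TEXT, city TEXT)")
    con.executemany("INSERT INTO locations VALUES (?, ?)", LOCATIONS)
    result = con.execute(
        f"SELECT country, {aggregate}(city) FROM locations "
        "GROUP BY country ORDER BY country"
    ).fetchall()
    con.close()
    return result

max_cities = city_per_country("MAX")
min_cities = city_per_country("MIN")
print(max_cities)
print(min_cities)
```

With this data, only France has more than one candidate city, so it is the only row where MAX and MIN differ.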
These functions are a good solution as long as you are fine with selecting your value from either end of the alphabetical (or numeric) order. But what if this is not the case? Let us assume that you need a value with a certain characteristic, e.g. starting with the letter 'M'. Now things get complicated.
The only solution I could find so far is to put your whole query into a subquery, and to construct the additional column outside of it by hand:
SELECT
countrylist.*,
(SELECT TOP 1 city
FROM locations
WHERE
country = countrylist.country
AND city like 'M%'
)
FROM
(SELECT country FROM locations
GROUP BY country) countrylist
will result in:
--country-- --city--
France Marseille
Poland NULL
Italy Milano
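The same idea can be tried outside MS SQL. This sketch uses Python's sqlite3 module, so TOP 1 is swapped for LIMIT 1 (SQLite has no TOP); that swap, the ORDER BY clauses, the m_city alias, and the helper m_city_per_country are my additions, not part of the answer's query.

```python
import sqlite3

def m_city_per_country(locations):
    """For each country, pick one city starting with 'M', or None if none exists."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE locations (country TEXT, city TEXT)")
    con.executemany("INSERT INTO locations VALUES (?, ?)", locations)
    result = con.execute("""
        SELECT countrylist.country,
               (SELECT city
                FROM locations
                WHERE locations.country = countrylist.country
                  AND city LIKE 'M%'
                ORDER BY city
                LIMIT 1) AS m_city
        FROM (SELECT country FROM locations GROUP BY country) countrylist
        ORDER BY countrylist.country
    """).fetchall()
    con.close()
    return result

rows = m_city_per_country([("France", "Lyon"), ("Poland", "Krakow"),
                           ("France", "Paris"), ("France", "Marseille"),
                           ("Italy", "Milano")])
print(rows)
```

The scalar subquery yields NULL (None in Python) for Poland because no Polish city in the table starts with 'M'.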
SELECT c2.field1 ,
field2
FROM (SELECT DISTINCT
field1
FROM dbo.TABLE AS C
) AS c1
JOIN dbo.TABLE AS c2 ON c1.field1 = c2.field1
Great question @aryaxt -- you can tell it was a great question because you asked it 5 years ago and I stumbled upon it today trying to find the answer!
I just tried to edit the accepted answer to include this, but in case my edit does not make it in:
If your table was not that large, and assuming your primary key was an auto-incrementing integer you could do something like this:
SELECT
table.*
FROM table
-- join against the highest id per field so dupes can be dropped
LEFT JOIN (
SELECT field, MAX(id) as id
FROM table
GROUP BY field
) as noDupes on noDupes.id = table.id
WHERE
-- this keeps only the last instance of each field value
noDupes.id is not NULL
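Here is a sketch of that join run through Python's sqlite3 module on an in-memory database. The table name t, the column other, and the helper last_row_per_field are hypothetical; the sketch also uses a plain inner JOIN, which returns the same rows as the LEFT JOIN plus the IS NOT NULL filter.

```python
import sqlite3

def last_row_per_field(rows):
    """Keep, for each field value, the full row with the highest id."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, field TEXT, other TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = con.execute("""
        SELECT t.*
        FROM t
        JOIN (SELECT field, MAX(id) AS id
              FROM t
              GROUP BY field) AS noDupes ON noDupes.id = t.id
        ORDER BY t.id
    """).fetchall()
    con.close()
    return result

# Hypothetical data: field 'a' appears twice, so only id 3 should survive for it.
kept = last_row_per_field([(1, "a", "old"), (2, "b", "only"), (3, "a", "new")])
print(kept)
```

This relies on id being unique, as the answer assumes; otherwise the join could match several rows per field.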
Try
SELECT table.* FROM table
WHERE otherField = 'otherValue'
GROUP BY table.fieldWantedToBeDistinct
limit x
You can do it with a WITH clause.
For example:
WITH c AS (SELECT DISTINCT a, b, c FROM tableName)
SELECT * FROM tableName r, c WHERE c.rowid=r.rowid AND c.a=r.a AND c.b=r.b AND c.c=r.c
This also allows you to select only the rows selected in the WITH clause's query.
For SQL Server you can use the dense_rank and additional windowing functions to get all rows AND columns with duplicated values on specified columns. Here is an example...
with t as (
select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r1' union all
select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r2' union all
select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r3' union all
select col1 = 'a', col2 = 'b', col3 = 'c', other = 'r4' union all
select col1 = 'c', col2 = 'b', col3 = 'a', other = 'r5' union all
select col1 = 'a', col2 = 'a', col3 = 'a', other = 'r6'
), tdr as (
select
*,
total_dr_rows = count(*) over(partition by dr)
from (
select
*,
dr = dense_rank() over(order by col1, col2, col3),
dr_rn = row_number() over(partition by col1, col2, col3 order by other)
from
t
) x
)
select * from tdr where total_dr_rows > 1
This is taking a row count for each distinct combination of col1, col2, and col3.
select min(table.id), table.column1
from table
group by table.column1
SELECT *
FROM tblname AS ex
GROUP BY duplicate_values
ORDER BY ex.VISITED_ON DESC
LIMIT 0 , 30
The column in ORDER BY is just an example; you can also order by an ID field there.
Found this elsewhere here but this is a simple solution that works:
WITH cte AS /* Declaring a new table named 'cte' to be a clone of your table */
(SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY val1 DESC) AS rn
FROM MyTable /* Selecting only unique values based on the "id" field */
)
SELECT * /* Here you can specify several columns to retrieve */
FROM cte
WHERE rn = 1
This way you can get 2 unique columns with only 1 query:
select Distinct col1,col2 from '{path}' group by col1,col2
You can add more columns if needed.
Add a GROUP BY on the field you want to check for duplicates.
Your query may look like
SELECT field1, field2, field3, ...... FROM table GROUP BY field1
field1 will be checked to exclude duplicate records.
Or you may query like
SELECT * FROM table GROUP BY field1
Duplicate records of field1 are excluded from the SELECT.
Just include all of your fields in the GROUP BY clause.
It can be done by inner query
$query = "SELECT *
FROM (SELECT field
FROM table
ORDER BY id DESC) as rows
GROUP BY field";
SELECT * from table where field in (SELECT distinct field from table)
SELECT DISTINCT FIELD1, FIELD2, FIELD3 FROM TABLE1 works if the values of all three columns are unique in the table.
If, for example, you have multiple identical values for first name, but the last name and other information in the selected columns is different, the record will be included in the result set.
I would suggest using
SELECT * from table where field1 in
(
select distinct field1 from table
)
this way if you have the same value in field1 across multiple rows, all the records will be returned.

using select statement in then clause of case

I'm trying to include a select statement in the THEN of a CASE statement, but the output is not as expected. I know there are different methods to do this, but can it be done the way I'm trying to do it?
Using the following example data:
create table example(name varchar(10));
insert into example values
('abc'),('bcd'),('xyz');
I have tried this query (here is the fiddle):
select
case when ((select * from example where name='abc')>=1)
then (select * from example where name='abc')
else (select count(*) from example)
end
from example
But it outputs
3
3
3
Expected output if name='abc' exist
name
abc
if not the count(*)
Thanks in advance
Your subquery in the example is (select * from example where name='abc') which is a result set, not a scalar value. Currently it "works" because it is comparing the only column in the table to the value 1 but if you had more than one column in the table it would error out. Perhaps you intended (select count(*) from example where name='abc')?
Similarly, the THEN clause in a case can only be used to provide a single column value. In order to do this, perhaps you meant the following:
select
case when exists (select * from example where name='abc')
then (select name from example where name='abc')
else (select count(*) from example)
end
from example
But even here you will get three rows and there is no correlation between the rows in example and the result set, so I am not really sure what you're trying to do. I imagine there is a higher purpose though so I will leave it at that.
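To see what a single-row version of this returns, here is a sketch using Python's sqlite3 module with an in-memory database. It drops the trailing FROM example so only one row comes back, and adds LIMIT 1 to keep the THEN subquery scalar; both changes, and the helper name_or_count, are my assumptions rather than part of the query above. SQLite is dynamically typed, so the mixed text/integer result is returned as-is, whereas MySQL would coerce both branches to one type.

```python
import sqlite3

def name_or_count(names, wanted):
    """Return the wanted name if it exists in the table, else the total row count."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE example (name VARCHAR(10))")
    con.executemany("INSERT INTO example VALUES (?)", [(n,) for n in names])
    row = con.execute("""
        SELECT CASE WHEN EXISTS (SELECT 1 FROM example WHERE name = ?)
                    THEN (SELECT name FROM example WHERE name = ? LIMIT 1)
                    ELSE (SELECT COUNT(*) FROM example)
               END
    """, (wanted, wanted)).fetchone()
    con.close()
    return row[0]

found = name_or_count(["abc", "bcd", "xyz"], "abc")
missing = name_or_count(["abc", "bcd", "xyz"], "zzz")
print(found, missing)
```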
This should do the trick
select distinct
case when ((select count(name) from example where name='abc')>=1)
then (select * from example where name='abc')
else (select count(*) from example)
end
from example
Let me know if it works.
Point 1:
For the query you are trying, the trailing from example causes the CASE expression to be evaluated once for every row in the table, so you get one output row per record. To avoid that, remove it.
Point 2:
You can't combine a multi-row select * in the true branch with a single-row count(*) in the false branch. You should limit the select to a single row.
Example:
select
case when ( select count(*) from example where name='abc' ) >= 1
then ( select * from example where name='abc' limit 1 )
else ( select count(*) from example )
end as name
No need to bother with the complex queries.
SELECT COUNT(*) AS ct
FROM example
GROUP BY name = 'abc'
ORDER BY name = 'abc' DESC
LIMIT 1;
If you really want to use CASE just for the sake of using it:
SELECT
CASE name
WHEN 'abc' THEN 'abc'
ELSE 'others'
END AS name, COUNT(*) AS ct
FROM example
GROUP BY name = 'abc'
ORDER BY name = 'abc' DESC
LIMIT 1;
Try the query below, which will work even if you insert a second, duplicate row with the value 'abc'. Most of the queries suggested above will not work once you insert this duplicate row, although your condition (>=1) allows for multiple rows with the name 'abc'.
SELECT
CASE WHEN b.cnt>=1
THEN a.name
ELSE (SELECT COUNT(*) FROM EXAMPLE)
END
FROM (SELECT DISTINCT NAME FROM EXAMPLE WHERE NAME='abc') a
JOIN (SELECT NAME,COUNT(*) AS cnt FROM EXAMPLE WHERE NAME='abc') b
ON a.name=b.name

select 2nd row of every ID in mysql

I have a table :
ID | time
1 | 300
1 | 100
1 | 200
2 | 200
2 | 500
I want to get 2nd row for every ID
I know that I can get 1st row as
select ID,time from T group by ID;
But I don't know about how to get 2nd row for every ID.
I know about limit and offset clause in mysql, but can't figure out how to use them here.
How can I do it ?
EDIT : Actually, time is not ordered. I forgot to specify that. I have made an edit in the table.
I just have an idea of how to do it, but I couldn't get it to work; maybe you can fix it. Any suggestion to correct my query is appreciated.
First, this selects the first row of each id:
SELECT min(id) id
FROM TableName t2
group by id
Then select the min(id) values that are not in the first query, to get the next min(id) (which is the second row),
like this:
SELECT min(id) id ,time
FROM TableName
WHERE id NOT IN (
SELECT min(id) id
FROM TableName
GROUP BY id
)
GROUP BY id
** As I said, it's just a suggestion. It returns 0 rows for me. If you fix it, let me know so I can edit my post to be helpful.
here a demo
SELECT ID, MAX(time) time
FROM
(
select ID, Time
from TableName a
where
(
select count(*)
from TableName as f
where f.ID = a.ID and f.time <= a.time
) <= 2
) s
GROUP BY ID
SQLFiddle Demo
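A quick way to try this query on the question's data is the following sketch, which uses Python's sqlite3 module and an in-memory database in place of MySQL (an assumption). Note that it interprets "2nd row" as the second-lowest time per ID, since the table itself has no row order; the helper second_time_per_id is hypothetical.

```python
import sqlite3

def second_time_per_id(rows):
    """Return (ID, time) pairs holding the second-lowest time for each ID."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE T (ID INTEGER, time INTEGER)")
    con.executemany("INSERT INTO T (ID, time) VALUES (?, ?)", rows)
    result = con.execute("""
        SELECT ID, MAX(time) AS time
        FROM (SELECT a.ID, a.time
              FROM T AS a
              -- keep rows that have at most 2 times at or below their own
              WHERE (SELECT COUNT(*)
                     FROM T AS f
                     WHERE f.ID = a.ID AND f.time <= a.time) <= 2) AS s
        GROUP BY ID
        ORDER BY ID
    """).fetchall()
    con.close()
    return result

second_rows = second_time_per_id([(1, 300), (1, 100), (1, 200), (2, 200), (2, 500)])
print(second_rows)
```

An ID with only one row still shows up in the output, with that single time, because the inner filter keeps it and MAX then returns it.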
SELECT x.*
FROM test x
JOIN test y
ON y.id = x.id
AND y.time >= x.time
GROUP
BY id,time
HAVING COUNT(*) = n;
Note that any ids with fewer than n rows will be omitted.
You cannot do this with the tables that you have. You could make a valiant attempt with:
select id, time
from (select id, time
from t
group by id
) t
where not exists (select 1 from t t2 where t2.id = t.id and t2.time = t.time)
group by id
That is, attempt to filter out the first row.
The reason this is not possible is that tables are inherently unordered, so there is no real definition of "second" in your tables. This gives the SQL engine the opportunity to rearrange the rows as it sees fit during processing -- which can result in great performance gains.
Even the construct that you are using:
select id, time
from t
group by id
is not guaranteed to return time from the first row. This is a (mis)feature of MySQL called Hidden Columns. It is really only intended for the case where all the values are the same. I will admit that in practice it seems to get the value from the first row, but you cannot guarantee that.
Probably your best solution is to select the data into a new table that has an auto-incrementing column:
create table newtable (
autoid int auto_increment primary key,
id int,
time int
);
insert into newtable(id, time)
select id, time from t;
In practice, this will probably keep the same order as the original table, and you can then use the autoid to get the second row. I want to emphasize, though, the "in practice". There is no guarantee that the values are in the correct order, but they probably will be.

MySQL: check that a set of queries returns the same row count, but I don't know what the count is

We read values from a set of sensors. Occasionally a reading or two is lost for a particular sensor, so now and again I run a query to see if all sensors have the same record count.
GROUP BY sensor_id HAVING COUNT(*) != xxx;
So I run a query once to visually get a value of xxx and then run it again to see if any vary.
But is there any clever way of doing this automatically in a single query?
You could do:
HAVING COUNT(*) != (SELECT MAX(count) FROM (
SELECT COUNT(*) AS count FROM my_table GROUP BY sensor_id
) t)
Or else group again by the count in each group (and ignore the first result):
SELECT count, GROUP_CONCAT(sensor_id) AS sensors
FROM (
SELECT sensor_id, COUNT(*) AS count FROM my_table GROUP BY sensor_id
) t
GROUP BY count
ORDER BY count DESC
LIMIT 1, 18446744073709551615
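To illustrate the first suggestion (comparing each sensor's count with the maximum count), here is a sketch using Python's sqlite3 module and an in-memory database instead of MySQL. The value column, the sample readings, and the helper sensors_missing_readings are hypothetical; only my_table and sensor_id come from the queries above.

```python
import sqlite3

def sensors_missing_readings(readings):
    """Return (sensor_id, count) for sensors with fewer rows than the busiest sensor."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE my_table (sensor_id INTEGER, value REAL)")
    con.executemany("INSERT INTO my_table VALUES (?, ?)", readings)
    result = con.execute("""
        SELECT sensor_id, COUNT(*) AS readings
        FROM my_table
        GROUP BY sensor_id
        HAVING COUNT(*) != (SELECT MAX(cnt)
                            FROM (SELECT COUNT(*) AS cnt
                                  FROM my_table
                                  GROUP BY sensor_id) t)
        ORDER BY sensor_id
    """).fetchall()
    con.close()
    return result

# Hypothetical data: sensors 1 and 2 have 3 readings each, sensor 3 lost one.
data = [(1, 0.1), (1, 0.2), (1, 0.3),
        (2, 1.1), (2, 1.2), (2, 1.3),
        (3, 2.1), (3, 2.2)]
short = sensors_missing_readings(data)
print(short)
```

An empty result means every sensor has the same number of rows. This treats the highest count as the expected one, so it would not notice a reading that is missing from every sensor.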
SELECT sensor_id,COUNT(*) AS count
FROM table
GROUP BY sensor_id
ORDER BY count
Will show a list of the sensor_id along with a count of all the records it has, you can then manually check to see if any vary.
SELECT * FROM (
SELECT sensor_id,COUNT(*) AS count
FROM table
GROUP BY sensor_id
) AS t1
GROUP BY count
Will show all the counts that vary, but the group by will lose information about which sensor_ids have which counts.
---EDIT---
I took a bit from both my answer and eggyal's and created this. The count that is most frequent is reported on a single row with the id 'default', and any ids whose counts stand out get separate rows. This way the result stays readable as a table when many ids deviate (multiple rows), and collapses to a single row when all counts are the same. If, however, you are happy with the concatenated strings, then go with eggyal's answer.
It might be a bit over the top, but here goes:
select 'default' as id,t5.c1 as count from(
select id,count(*) as c1 from your_table group by id having count(*)=
(select t4.count from
(
select max(t3.count2) as max,t3.count as count from
(
select count(*) as count2,t2.count from
(
SELECT id,COUNT(*) AS count
FROM your_table
GROUP BY id
) as t2
GROUP BY count
) as t3
) as t4)) as t5 group by count
union all
select t5.id as id,t5.c1 as count from(
select id,count(*) as c1 from your_table group by id having count(*)<>
(select t4.count from
(
select max(t3.count2) as max,t3.count as count from
(
select count(*) as count2,t2.count from
(
SELECT id,COUNT(*) AS count
FROM your_table
GROUP BY id
) as t2
GROUP BY count
) as t3
) as t4)) as t5