I hope you can help me with this one. I've been looking for ways to set up a MySQL query that selects rows based on the number of times a certain value occurs, but have had no luck so far. I'm pretty sure i need to use count(*) somewhere, but i can only found how to count all values or all distinct values, instead of counting all occurences.
I have a table as such:
info setid
-- --
A 1
B 1
C 2
D 1
E 2
F 3
G 1
H 3
What i need is a query that will select all the lines where a setid occurs a certain number (x) of times.
So using x=2 should give me
C 2
E 2
F 3
H 3
because both setIds 2 and 3 each occur two times. Using x=1 or x = 3 should not give any results, and choosing x=4 should give me
A 1
B 1
D 1
G 1
Because only setid 1 occurs 4 times.
I hope you guys can help me. At this point i've been looking for the answer for so long that i'm not even sure this can be done in MySQL anymore. :)
select * from mytable
where setid in (
select setid from mytable
group by setid
having count(*) = 2
)
you can specify the # of times a setid needs to occur in the table in the having count(*) part of the subquery
Consider the following statement that uses an uncorrelated subquery:
SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);
The optimizer rewrites the statement to a correlated subquery:
SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);
If the inner and outer queries return M and N rows, respectively, the execution time becomes on the order of O(M×N), rather than O(M+N) as it would be for an uncorrelated subquery.
But this time the subquery in Fuzzy Tree's solution is complety superfluous:
SELECT
set_id,
GROUP_CONCAT(info ORDER BY info) infos
COUNT(*) total
FROM
tablename
GROUP_BY set_id
HAVING COUNT(*) = 2
Related
I am facing a problem with MySQL query which is a variant of "Id for row with max value". I am either getting error or incorrect result for all my trials.
Here is the table structure
Row_id
Group_id
Grp_col1
Grp_col2
Field_for_aggregate_func
Another_field_for_row
For all rows with a particular group_id, I want to group by fields Grp_col1, Grp_col2 then get max value of Field_for_aggregate_func and then corresponding value of Another_field_for_row.
Query I have tried is like below
SELECT c.*
FROM mytable as c left outer join mytable as c1
on (
c.group_id=c1.group_id and
c.Grp_col1 = c1.Grp_col1 and
c.Grp_col2 = c1.Grp_col2 and
c.Field_for_aggregate_func > c1.Field_for_aggregate_func
)
where c.group_id=2
Among alternative solutions for this problem I want a high performance solution as this will be used for large set of data.
EDIT: Here is the sample set of row and expected answer
Group_ID Grp_col1 Grp_col2 Field_for_aggregate_func Another_field_for_row
2 -- N 12/31/2015 35
2 -- N 1/31/2016 15 select 15 from group for max value 1/31/2016
2 -- Y 12/31/2015 5
2 -- Y 1/1/2016 15
2 -- Y 1/2/2016 25
2 -- Y 1/3/2016 30 select 30 from group for max value 1/3/2016
You can use a sub-query to find the maximums, then join that with the original table, along the lines of:
select m1.group_id, m1.grp_col1, m1.grp_col2, m1.another_field_for_row, max_value
from mytable m1, (
select group_id, grp_col1, grp_col2, max(field_for_aggregate_func) as max_value
from mytable
group by group_id, grp_col1, grp_col2) as m2
where m1.group_id=m2.group_id
and m1.grp_col1=m2.grp_col1
and m1.grp_col2=m2.grp_col2
and m1.field_for_aggregate_func=m2.max_value;
Watch out for when there is more than one max_value for the given grouping. You'll get multiple rows for that grouping. Fiddle here.
Try this.
See Fiddle demo here
http://sqlfiddle.com/#!9/9a3c26/8
Select t1.* from table1 t1 inner join
(
Select a.group_id,a.grp_col2,
A.Field_for_aggregate_func,
count(*) as rnum from table1 a
Inner join table1 b
On a.group_id=b.group_id
And a.grp_col2=b.grp_col2
And a.Field_for_aggregate_func
<=b.Field_for_aggregate_func
Group by a.group_id,
a.grp_col2,
a.Field_for_aggregate_func) t2
On t1.group_id=t2.group_id
And t1.grp_col2=t2.grp_col2
And t1.Field_for_aggregate_func
=t2.Field_for_aggregate_func
And t2.rnum=1
Here first I am assigning a rownumber in descending order based on date. The selecting all the records for that date.
I have a mySQL dataset that looks like this:
ID PARENT_ID VALUE
1 100 This comment should be approved
2 100 Y
3 101 Another approved comment
4 101 Y
5 102 This comment is not approved
6 102 N
I need to construct an SQL query to select the rows that have a matching parent_id and corresponding value of Y (but ignore the rows with single letters as a value in the result) to result in:
ID PARENT_ID VALUE
1 100 This comment should be approved
3 101 Another approved comment
My idea is to use GROUP BY to combine the columns, but I can't work out how to select based on the Y/N values.
There is possibly a solution here How do I select a row based on a priority value in another row? but I don't think it is asking quite the same question.
Any ideas?
Although you can express this as an aggregation, you can express this using exists:
select d.*
from dataset d
where d.value <> 'Y' and
exists (select 1
from dataset d2
where d2.parent_id = d.parent_id and d2.value = 'Y'
);
This version is probably more efficient.
First, if you possibly can, change your table schema. Your table is storing two kinds of data in the same field (yes no flags and comments). This breaks normality and will haunt you later.
But if its not your table to change, you will need to self join. Try this.
SELECT a.id, a.parent_Id, a.value
FROM table a inner join table b
ON a.parent_id =b.parent_id
WHERE a.value <> 'Y' and b.value ='Y'
Given these entries in a table table:
user entry
A 1
A 2
A 5
A 6
B 1
B 2
B 3
B 4
B 5
B 6
C 1
C 4
D 1
D 2
D 5
D 6
D 7
D 9
And we have a subset entries_A to work with, which is the array [1,2,5,6].
Problems:
Find all users that have the same entries [1,2,5,6] and more, e.g. [1,2,5,6,7] or [1,2,3,5,6].
Find all users that have a lot of the same entries (and more), e.g. [1,2,5,9] or [2,5,6,3].
The best solution to the first problem I could come up with, is the following select query:
SELECT DISTINCT user AS u FROM table WHERE EXISTS (SELECT * FROM table WHERE entry=1 AND user=u)
AND EXISTS(SELECT * FROM table WHERE entry=2 AND user=u)
AND EXISTS(SELECT * FROM table WHERE entry=5 AND user=u)
AND EXISTS(SELECT * FROM table WHERE entry=6 AND user=u)
On the other hand, I get a feeling there's some algebraic vector-problem lurking below the surface (especially for problem two) but I can't seem to wrap my head around it.
All ideas welcome!
I think the easiest way to perform this type of query is using aggregation and having. Here is an example.
To get A's that have exactly those four elements:
select user
from table
group by user
having sum(entry in (1,2,5,6)) > 0 and
count(distinct entry) = 4;
To get A's that have those four elements and perhaps others:
select user
from table
group by user
having sum(entry in (1,2,5,6)) > 0 and
count(distinct entry) >= 4;
To order users by the number of matches they have and the number of other matches:
select count(distinct case when entry in (1, 2, 5, 6) then entry end) as Matches,
count(distinct case when entry not in (1, 2, 5, 6) then entry end) as Others,
user
from table
group by user
order by Matches desc, Others;
For the first problem:
SELECT user FROM (
SELECT
DISTINCT user
FROM
table
WHERE entry IN (1,2,5,6)
) a JOIN table b ON a.user = b.user
GROUP BY a.user
HAVING COUNT(*) >= 4
For the second problem just decrease the count in the having clause.
This is how I would to your first query (though I think Gordon Linoff's answer is more efficient):
select distinct user from so s1
where not exists (
select * from so s2 where s2.entry in (1,2,5,6)
and not exists (
select * from so s3 where s2.entry = s3.entry and s1.user = s3.user
)
);
For the second problem, you would need to specify what a lot should mean... three, four, ...
Lets say I have a MySQL table that has the following entries:
1
2
3
2
5
6
7
6
6
8
When I do an "SELECT * ..." I get back all the entries. But I want to get back only these entries, that exist only once within the table. Means the rows with the values 2 (exists two times) and 6 (exists three times) have to be dropped completely out of my result.
I found a keyword DISTINCT but as far as I understood it only avoids entries are shown twice, it does not filters them completely.
I think it can be done somehow with COUNT, but all I tried was not really successful. So what is the correct SQL statement here?
Edit: to clarify that, the result I want to get back is
1
3
5
7
8
You can use COUNT() in combination with a GROUP BY and a HAVING clause like this:
SELECT yourCol
FROM yourTable
GROUP BY yourCol
HAVING COUNT(*) < 2
Example fiddle.
You want to mix GROUP BY and COUNT().
Assuming the column is called 'id' and the table is called 'table', the following statement will work:
SELECT * FROM `table` GROUP BY id HAVING COUNT(id) = 1
This will filter out duplicate results entirely (e.g. it'll take out your 2's and 6's)
Three ways. One with GROUP BY and HAVING:
SELECT columnX
FROM tableX
GROUP BY columnX
HAVING COUNT(*) = 1 ;
one with a correlated NOT EXISTS subquery:
SELECT columnX
FROM tableX AS t
WHERE NOT EXISTS
( SELECT *
FROM tableX AS t2
WHERE t2.columnX = t.columnX
AND t2.pk <> t.pk -- pk is the primary key of the table
) ;
and an improvement on the first way (if you have a primary key pk column and an index on (columnX, pk):
SELECT columnX
FROM tableX
GROUP BY columnX
HAVING MIN(pk) = MAX(pk) ;
select id from foo group by id having count(*) < 2;
I have a MySQL table like this
id Name count
1 ABC 1
2 CDF 3
3 FGH 4
using simply select query I get the values as
1 ABC 1
2 CDF 3
3 FGH 4
How I can get the result like this
1 ABC 1
2 CDF 3
3 FGH 4
4 NULL 0
You can see Last row. When Records are finished an extra row in this format
last_id+1, Null ,0 should be added. You can see above. Even I have no such row in my original table. There may be N rows not fixed 3,4
The answer is very simple
select (select max(id) from mytable)+1 as id, NULL as Name, 0 as count union all select id,Name,count from mytable;
This looks a little messy but it should work.
SELECT a.id, b.name, coalesce(b.`count`) as `count`
FROM
(
SELECT 1 as ID
UNION
SELECT 2 as ID
UNION
SELECT 3 as ID
UNION
SELECT 4 as ID
) a LEFT JOIN table1 b
ON a.id = b.id
WHERE a.ID IN (1,2,3,4)
UPDATE 1
You could simply generate a table that have 1 column preferably with name (ID) that has records maybe up 10,000 or more. Then you could simply join it with your table that has the original record. For Example, assuming that you have a table named DummyRecord with 1 column and has 10,000 rows on it
SELECT a.id, b.name, coalesce(b.`count`) as `count`
FROM DummyRecord a LEFT JOIN table1 b
ON a.id = b.id
WHERE a.ID >= 1 AND
a.ID <= 4
that's it. Or if you want to have from 10 to 100, then you could use this condition
...
WHERE a.ID >= 10 AND
a.ID <= 100
To clarify this is how one can append an extra row to the result set
select * from table union select 123 as id,'abc' as name
results
id | name
------------
*** | ***
*** | ***
123 | abc
Simply use mysql ROLLUP.
SELECT * FROM your_table
GROUP BY Name WITH ROLLUP;
select
x.id,
t.name,
ifnull(t.count, 0) as count
from
(SELECT 1 AS id
-- Part of the query below, you will need to generate dynamically,
-- just as you would otherwise need to generate 'in (1,2,3,4)'
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
) x
LEFT JOIN YourTable t
ON t.id = x.id
If the id does not exist in the table you're selecting from, you'll need to LEFT JOIN against a list of every id you want returned - this way, it will return the null values for ones that don't exist and the true values for those that do.
I would suggest creating a numbers table that is a single-columned table filled with numbers:
CREATE TABLE `numbers` (
id int(11) unsigned NOT NULL
);
And then inserting a large amount of numbers, starting at 1 and going up to what you think the highest id you'll ever see plus a thousand or so. Maybe go from 1 to 1000000 to be on the safe side. Regardless, you just need to make sure it's more-than-high enough to cover any possible id you'll run into.
After that, your query can look like:
SELECT n.id, a.*
FROM
`numbers` n
LEFT JOIN table t
ON t.id = n.id
WHERE n.id IN (1,2,3,4);
This solution will allow for a dynamically growing list of ids without the need for a sub-query with a list of unions; though, the other solutions provided will equally work for a small known list too (and could also be dynamically generated).