Get number of values that only appear once in a column - mysql

Firstly, if it is relevant, I'm using MySQL, though I assume a solution would work across DB products. My problem is thus:
I have a simple table with a single column. There are no constraints on the column. Within this column there is some simple data, e.g.
a
a
b
c
d
d
I need to get the number/count of values that only appear once. From the example above that would be 2 (since only b and c occur once in the column).
Hopefully it's clear I don't want DISTINCT values, but UNIQUE values. I have actually done this before, by creating an additional table with a UNIQUE constraint on the column and simply INSERTing to the new table from the old one, handling the duplicates accordingly.
I was hoping to find a solution that did not require the temporary table, and could somehow just be accomplished with a nifty SELECT.

Assuming your table is called T and your field is called F:
SELECT COUNT(F)
FROM (
SELECT F
FROM T
GROUP BY F
HAVING COUNT(*) = 1
) AS ONLY_ONCE
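To check the query above against the sample data from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the table name T and column name F follow the answer.

```python
import sqlite3

# SQLite stands in for MySQL here; T and F are the names assumed in the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (F TEXT)")
conn.executemany("INSERT INTO T (F) VALUES (?)", [(v,) for v in "aabcdd"])

# The inner query keeps only values whose group has exactly one row;
# the outer query counts how many such values there are.
only_once = conn.execute(
    """
    SELECT COUNT(F)
    FROM (
        SELECT F
        FROM T
        GROUP BY F
        HAVING COUNT(*) = 1
    ) AS ONLY_ONCE
    """
).fetchone()[0]

print(only_once)  # 2 (only b and c occur once)
```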

select count(*) from
(
select
col1, count(*)
from
Table
group by
Col1
Having
Count(Col1) = 1
) as only_once

just nest it a little...
select count( cnt ) from
( select count(mycol) cnt from mytab group by mycol ) counts
where cnt = 1

select field1, count(field1) from my_table group by field1 having count(field1) = 1
select count(*) from (select field1, count(field1) from my_table group by field1 having count(field1) = 1) as t
The first query returns the values that appear only once, and the second returns how many such values there are.

Could it be as simple as this:
Select count(*) From MyTable Group By MyColumn Having Count(MyColumn) = 1

This is what I did and it worked:
SELECT name
FROM people JOIN stars ON stars.person_id = people.id
JOIN movies ON movies.id = stars.movie_id
WHERE year = 2004
GROUP BY name, person_id ORDER BY birth;
note: I was working with several tables here.
CS50 Problem Set 7 (pset7) 9.sql fix!!

Related

How to use AVG() function after GROUP BY with CASE in MySQL [duplicate]

I am running this query on MySQL
SELECT ID FROM (
SELECT ID, msisdn
FROM (
SELECT * FROM TT2
)
);
and it is giving this error:
Every derived table must have its own alias.
What's causing this error?
Every derived table (AKA sub-query) must indeed have an alias. I.e. each query in brackets must be given an alias (AS whatever), which can then be used to refer to it in the rest of the outer query.
SELECT ID FROM (
SELECT ID, msisdn FROM (
SELECT * FROM TT2
) AS T
) AS T
In your case, of course, the entire query could be replaced with:
SELECT ID FROM TT2
I think it's asking you to do this:
SELECT ID
FROM (SELECT ID,
msisdn
FROM (SELECT * FROM TT2) as myalias
) as anotheralias;
But why would you write this query in the first place?
Here's a different example that can't be rewritten without aliases (you can't GROUP BY DISTINCT).
Imagine a table called purchases that records purchases made by customers at stores, i.e. it's a many-to-many table, and the software needs to know which customers have made purchases at more than one store:
SELECT DISTINCT customer_id, SUM(1)
FROM ( SELECT DISTINCT customer_id, store_id FROM purchases)
GROUP BY customer_id HAVING 1 < SUM(1);
This will break with the error Every derived table must have its own alias. To fix:
SELECT DISTINCT customer_id, SUM(1)
FROM ( SELECT DISTINCT customer_id, store_id FROM purchases) AS custom
GROUP BY customer_id HAVING 1 < SUM(1);
(Note the AS custom alias.)
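As a rough check of the aliased purchases query, here is a small sketch using Python's sqlite3 module in place of MySQL. The sample rows are invented for illustration, and COUNT(*) is used where the answer wrote SUM(1). Note that SQLite does not itself require the alias, so this only shows that the aliased form returns the expected customers; the error message is specific to MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer_id INTEGER, store_id INTEGER)")
# Hypothetical data: customer 2 only ever shops at store 10.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 20), (2, 10), (2, 10), (3, 30), (3, 40), (3, 50)],
)

# The derived table removes repeat purchases at the same store, so the
# outer COUNT(*) is the number of distinct stores per customer.
multi_store = conn.execute(
    """
    SELECT customer_id, COUNT(*) AS store_count
    FROM (SELECT DISTINCT customer_id, store_id FROM purchases) AS custom
    GROUP BY customer_id
    HAVING COUNT(*) > 1
    ORDER BY customer_id
    """
).fetchall()

print(multi_store)  # [(1, 2), (3, 3)]
```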
I arrived here because I thought I should check in SO if there are adequate answers, after a syntax error that gave me this error, or if I could possibly post an answer myself.
OK, the answers here explain what this error is, so not much more to say, but nevertheless I will give my 2 cents, using my own words:
This error is caused by the fact that you basically generate a new table with your subquery for the FROM command.
That's what a derived table is, and as such, it needs to have an alias (actually a name reference to it).
Given the following hypothetical query:
SELECT id, key1
FROM (
SELECT t1.ID id, t2.key1 key1, t2.key2 key2, t2.key3 key3
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
WHERE t2.key3 = 'some-value'
) AS tt
At the end, the whole subquery inside the FROM command will produce the table that is aliased as tt and it will have the following columns id, key1, key2, key3.
Then, with the initial SELECT, we finally select the id and key1 from that generated table (tt).

From the sample database, how can I write a query that finds all the regions (region id) where the Kivuto id is duplicated?

I need to fetch the 3 lines highlighted in green in the result, i.e. rows with separate region ids but the same Kivuto id. I need to find such products so that I can correct their Kivuto ids.
Try this.
select * from table_name
where kivuto_id in (
select kivuto_id from table_name
group by kivuto_id
having count(*) > 1
)
You can refer to this as well: Find rows that have the same value on a column in MySQL
You can simply use exists:
select t.*
from t
where exists (select 1
from t t2
where t2.kivuto_id = t.kivuto_id and
t2.region_id <> t.region_id
);
For performance, you want an index on (kivuto_id, region_id).
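A minimal sketch of the EXISTS approach, run against SQLite through Python's sqlite3 module rather than MySQL. The table name products, the index name, and the sample rows are hypothetical; only the kivuto_id and region_id column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (region_id INTEGER, kivuto_id INTEGER)")
# The index suggested in the answer above.
conn.execute("CREATE INDEX idx_kivuto_region ON products (kivuto_id, region_id)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(1, 100), (2, 100), (3, 100), (1, 200), (4, 300), (4, 300)],
)

# A row qualifies only if another row shares its kivuto_id in a different region.
# Kivuto id 300 is repeated, but within one region, so it is not returned.
shared = conn.execute(
    """
    SELECT t.region_id, t.kivuto_id
    FROM products AS t
    WHERE EXISTS (
        SELECT 1
        FROM products AS t2
        WHERE t2.kivuto_id = t.kivuto_id
          AND t2.region_id <> t.region_id
    )
    ORDER BY t.kivuto_id, t.region_id
    """
).fetchall()

print(shared)  # [(1, 100), (2, 100), (3, 100)]
```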

Obtain a list with the items found the minimum amount of times in a table

I have a MySQL table where I have a certain id as a foreign key coming from another table. This id is not unique to this table so I can have many records holding the same id.
I need to find out which ids are seen the least amount of times in this table and pull up a list containing them.
For example, if I have 5 records with id=1, 3 records with id=2 and 3 records with id=3, I want to pull up only ids 2 & 3. However, the data in the table changes quite often so I don't know what that minimum value is going to be at any given moment. The task is quite trivial if I use two queries but I'm trying to do it with just one. Here's what I have:
SELECT id
FROM table
GROUP BY id
HAVING COUNT(*) = MIN(SELECT COUNT(*) FROM table GROUP BY id)
If I substitute COUNT(*) = 3, then the results come up but using the query above gives me an error that MIN is not used properly. Any tips?
I would try with:
SELECT id
FROM table
GROUP BY id
HAVING COUNT(*) = (SELECT COUNT(*) FROM table GROUP BY id ORDER BY COUNT(*) LIMIT 1);
This gets the minimum by selecting the first row from the set of counts in ascending order.
You need a double select in the having clause:
SELECT id
FROM table
GROUP BY id
HAVING COUNT(*) = (SELECT MIN(cnt) FROM (SELECT COUNT(*) as cnt FROM table GROUP BY id) t);
The MIN() aggregate function is supposed to take a column, not a query. So, I see two ways to solve this:
To properly write the subquery, or
To use temp variables
First alternative:
select id
from yourTable
group by id
having count(id) = (
select min(c) from (
select count(*) as c from yourTable group by id
) as a
)
Second alternative:
set @minCount = (
select min(c) from (
select count(*) as c from yourTable group by id
) as a
);
select id
from yourTable
group by id
having count(*) = @minCount;
You need a GROUP BY to produce the set of per-id counts and an additional SELECT to get the MIN value from that set. Only then can you match it in the HAVING clause:
SELECT id FROM table GROUP BY id
HAVING COUNT(*) =
(SELECT MIN(X.CNT) AS M FROM(SELECT COUNT(*) CNT FROM table GROUP BY id) AS X)
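To see the nested-MIN pattern on the numbers from the question (5 records with id=1, 3 with id=2, 3 with id=3), here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The table name records is an assumption, chosen to avoid the reserved word table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER)")
rows = [(1,)] * 5 + [(2,)] * 3 + [(3,)] * 3
conn.executemany("INSERT INTO records (id) VALUES (?)", rows)

# Innermost query: one count per id. Middle query: the smallest of those counts.
# Outer query: every id whose own count equals that minimum.
least_seen = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
        FROM records
        GROUP BY id
        HAVING COUNT(*) = (
            SELECT MIN(cnt)
            FROM (SELECT COUNT(*) AS cnt FROM records GROUP BY id) AS t
        )
        ORDER BY id
        """
    )
]

print(least_seen)  # [2, 3]
```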

Keep all records in "WHERE IN()" clause, even if they are not found

I have the following mysql query:
SELECT id, sum(views) as total_views
FROM table
WHERE id IN (1,2,3)
GROUP BY id
ORDER BY total_views ASC
If only ids 1 and 3 are found in the database, I still want id 2 to appear, with total_views set to 0.
Is there any way to do that? This cannot use any other table.
This query hard-codes the list of possible IDs using a sub-query consisting of unions... it then left joins this set of ids to the table containing the information to be counted.
This will preserve an ID in your results even if there are no occurrences:
SELECT ids.id, COALESCE(sum(views), 0) as total_views
FROM (
SELECT 1 AS ID
UNION ALL SELECT 2 AS ID
UNION ALL SELECT 3 AS ID
) ids
LEFT JOIN table
ON table.ID = ids.ID
GROUP BY ids.id
ORDER BY total_views ASC
Alternately, if you had a numbers table, you could do the following query:
SELECT numbers.number, sum(views) as total_views
FROM
numbers
LEFT JOIN table
ON table.ID = numbers.number
WHERE numbers.number IN (1, 2, 3)
GROUP BY numbers.number
ORDER BY total_views ASC
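Here is a rough sketch of the hard-coded id list approach above, run on SQLite via Python's sqlite3 module instead of MySQL. The table name page_views and the sample rows are hypothetical. COALESCE is added so that an id with no rows reports 0, as the question asks, rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (id INTEGER, views INTEGER)")
# Hypothetical data: id 2 has no rows at all.
conn.executemany("INSERT INTO page_views VALUES (?, ?)", [(1, 5), (1, 7), (3, 2)])

# The inline UNION ALL list drives the query; the LEFT JOIN keeps every listed
# id, and COALESCE turns the NULL sum of an unmatched id into 0.
totals = conn.execute(
    """
    SELECT ids.id, COALESCE(SUM(v.views), 0) AS total_views
    FROM (
        SELECT 1 AS id
        UNION ALL SELECT 2
        UNION ALL SELECT 3
    ) AS ids
    LEFT JOIN page_views AS v ON v.id = ids.id
    GROUP BY ids.id
    ORDER BY total_views ASC
    """
).fetchall()

print(totals)  # [(2, 0), (3, 2), (1, 12)]
```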
Here's an alternative to Michael's solution (not a bad solution, mind you -- even with "a lot" of IDs), so long as you're not querying against a cluster.
create temporary table __ids (
id int unsigned primary key
) engine=MEMORY;
insert into __ids (id) values
(1),
(2),
(3)
;
SELECT __ids.id, sum(views) as total_views
FROM __ids left join table using (id)
GROUP BY __ids.id
ORDER BY total_views ASC
And if your query becomes complex, I could even conceive of it running more efficiently this way. But, if I were you, I'd benchmark this option with Michael's ad-hoc UNION'ed table option using real data.
In @Michael's answer, if you do have a table with the ids you care about, you can use it as "ids" in place of Michael's in-line data.
Check this fiddle... http://www.sqlfiddle.com/#!2/a9392/3
Select B.ID, sum(A.views) sum from tableB B
left outer join tableA A
on B.ID = A.ID
group by B.ID
also check
http://www.sqlfiddle.com/#!2/a1bb7/1
try this
SELECT id
(CASE 1
IF EXISTS THEN views = mytable.views END
IF NOT EXIST THEN views = 0 END
CASE 2
IF EXISTS THEN views = mytable.views END
IF NOT EXIST THEN views = 0 END
CASE 3
IF EXISTS THEN views = mytable.views END
IF NOT EXIST THEN views = 0 END), sum(views) as total_views
FROM mytable
WHERE id IN (1,2,3)
GROUP BY id
ORDER BY total_views ASC
Does it have to be rows or could you pivot the data to give you one row and a column for every id?
SELECT
SUM(IF (id=1, views, 0)) views_1,
SUM(IF (id=2, views, 0)) views_2,
SUM(IF (id=3, views, 0)) views_3
FROM table

Check membership of elements of one column in another, mySQL

How would I go about counting values that appear in column 1 but not in column 2? Both columns are in the same table, and I'd like to do it without subqueries or anything fancy. The rows may or may not share other common column values (like col 3 = col 4), but this doesn't matter.
I have it almost working with subqueries, but cannot figure out how to do it without them. The only problem (I think) is that it will count something twice if the primary keys (composed of col1, col3, col4) are different but col1 is the same.
SELECT DISTINCT COUNT(*)
FROM mytable t1
WHERE NOT EXISTS (
SELECT DISTINCT *
FROM mytable
WHERE t1.column1 = mytable.column2
);
But like I said, I'm trying to figure this without subqueries anyways
How about:
SELECT COUNT(*)
FROM mytable mt1
LEFT JOIN mytable mt2 ON mt1.column1 = mt2.column2
WHERE mt2.column2 IS NULL
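A small sketch of this LEFT JOIN anti-join, using Python's sqlite3 module as a stand-in for MySQL with invented sample rows. It also shows the double-counting the question worries about: COUNT(*) counts every unmatched row, while COUNT(DISTINCT ...) counts each unmatched value once. Which of the two is wanted is an assumption left to the reader.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (column1 TEXT, column2 TEXT)")
# Hypothetical data: 'c' appears twice in column1 and never in column2.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("a", "x"), ("a", "y"), ("b", "a"), ("c", "b"), ("c", "q"), ("d", "z")],
)

# Rows of mt1 with no partner in mt2 come back with mt2.column2 set to NULL.
row_count, value_count = conn.execute(
    """
    SELECT COUNT(*), COUNT(DISTINCT mt1.column1)
    FROM mytable AS mt1
    LEFT JOIN mytable AS mt2 ON mt1.column1 = mt2.column2
    WHERE mt2.column2 IS NULL
    """
).fetchone()

print(row_count, value_count)  # 3 2 (rows c, c, d; distinct values c, d)
```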
Please see this:
SELECT
SUM(IF(column1 = column2, 0, 1)) as c
FROM
mytable