How can I count existing and non-existing values with MySQL? - mysql

I am new to MySQL.
I have a table with answer ids.
Answers can look like this: a1, a2, a3, ..., but due to some problems some are NULL, some are blank, and some are other values like 1 a etc.
Now I want to count the ids for each of a1, a2, and a3 separately. How can I do this while leaving out the others (NULLs, blanks, and garbage)?
The output should look like this
atype count
a1 45
a2 0
a3 56
If there is no row entry for a particular answer, the count should be 0.

Solution 1: Two queries
You should use a table that contains your desired (correct) answer types:
| id | answer |
---------------
| 1 | a1 |
| 2 | a2 |
etc.
Then you can count the results that actually exist in your table:
SELECT atype, COUNT( * ) cnt FROM answers JOIN mytable
ON mytable.atype=answers.answer GROUP BY answers.answer;
(Replace mytable with the appropriate table name).
Of course, this will only return existing results. To count the zero rows, you can look for answers that do not appear in your table:
SELECT answer, '0' AS cnt FROM answers WHERE answer NOT IN(
SELECT DISTINCT answer FROM answers JOIN mytable WHERE answer=mytable.atype );
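The two queries above can also be folded into one by LEFT JOINing from the list of valid answers. The following is a minimal sketch rather than part of the original answer: it uses Python's sqlite3 module as a stand-in for MySQL so that it runs as-is, and the sample rows are made up. The table and column names (answers.answer, mytable.atype) are the ones used above.

```python
import sqlite3

# SQLite stands in for MySQL; the sample data below is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE answers (id INTEGER PRIMARY KEY, answer TEXT);
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, atype TEXT);
    INSERT INTO answers (answer) VALUES ('a1'), ('a2'), ('a3');
    INSERT INTO mytable (atype) VALUES ('a1'), ('a1'), ('a3'), (NULL), (''), ('1 a');
""")

# COUNT(mytable.atype) counts only matched rows, so answers without a match get 0.
rows = conn.execute("""
    SELECT answers.answer, COUNT(mytable.atype) AS cnt
    FROM answers
    LEFT JOIN mytable ON mytable.atype = answers.answer
    GROUP BY answers.answer
    ORDER BY answers.answer
""").fetchall()

print(rows)
```

NULL, blank, and garbage values never match a row in answers, so they drop out without an explicit filter.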
Solution 2: A counter table
Another way would be to use a counter table:
| id | answer | cnt |
---------------------
| 1 | a1 | 0 |
| 2 | a2 | 0 |
etc.
Then every time you want to count the results, do:
UPDATE answers SET cnt=0;
UPDATE answers SET cnt=
(SELECT cnt FROM
((SELECT answers.answer, COUNT(*) AS cnt
FROM answers JOIN mytable ON answers.answer=mytable.atype
GROUP BY answers.answer) AS tbl)
WHERE answers.answer=tbl.answer)
WHERE EXISTS
(SELECT cnt FROM
((SELECT answers.answer, COUNT(*) AS cnt
FROM answers JOIN mytable ON answers.answer=mytable.atype
GROUP BY answers.answer) AS tbl)
WHERE answers.answer=tbl.answer);
This will update the counter values in your answers table, and you can just SELECT * FROM answers ORDER BY answer to get your result.
Be warned, though: I believe the second version, while convenient, will take a lot more computing power than the first one, due to all the subqueries needed.
Solution 3: Update upon write
The best and least performance-hungry solution for use cases like yours, in my opinion, is to create a counter table like the one I described in #2, but to update the counter values at the time users answer the questions, instead of recalculating all the entries every time you want to know the count.
This can easily be done. Every time a question is answered correctly, increase the counter in the answers table:
UPDATE answers SET cnt=cnt+1 WHERE answers.answer='a1';
And again, your query will be
SELECT * FROM answers ORDER BY answer;
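As a rough sketch of the update-upon-write idea, assuming the counter table from Solution 2: the helper record_answer below is hypothetical, and sqlite3 is used in place of MySQL only so that the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE answers (
        id INTEGER PRIMARY KEY,
        answer TEXT UNIQUE,
        cnt INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO answers (answer) VALUES ('a1'), ('a2'), ('a3');
""")


def record_answer(conn, atype):
    """Increment the counter for a known answer type (hypothetical helper).

    Returns True if a counter row was updated, False for NULL/blank/garbage.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE answers SET cnt = cnt + 1 WHERE answer = ?", (atype,)
        )
    return cur.rowcount == 1


# Made-up incoming values, including the kinds of garbage from the question.
for value in ["a1", "a3", "a1", None, "", "1 a"]:
    record_answer(conn, value)

counts = conn.execute("SELECT answer, cnt FROM answers ORDER BY answer").fetchall()
print(counts)
```

Values that are not in the answers table simply update zero rows, so they never distort the counters.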

select a.atype, count(b.atype) as `Count`
from
(select 'a1' as atype union all
select 'a2' as atype union all
select 'a3' as atype) a
left join <your_table> b
on a.atype = b.atype
group by a.atype
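One detail worth checking with this kind of LEFT JOIN is which COUNT expression is used: COUNT(*) also counts the NULL-extended row that the outer join produces for a missing type, while COUNT(b.atype) skips it. The sketch below compares the two; tb2 and its rows are placeholders for your table, and sqlite3 is used instead of MySQL only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb2 (id INTEGER PRIMARY KEY, atype TEXT);
    INSERT INTO tb2 (atype) VALUES ('a1'), ('a1'), ('a3'), (NULL), ('');
""")

# Same query twice; only the COUNT expression differs.
QUERY = """
    SELECT a.atype, {count_expr} AS cnt
    FROM (SELECT 'a1' AS atype UNION ALL
          SELECT 'a2' UNION ALL
          SELECT 'a3') a
    LEFT JOIN tb2 b ON a.atype = b.atype
    GROUP BY a.atype
    ORDER BY a.atype
"""

with_star = conn.execute(QUERY.format(count_expr="COUNT(*)")).fetchall()
with_column = conn.execute(QUERY.format(count_expr="COUNT(b.atype)")).fetchall()

print(with_star)    # a2 is reported as 1: the unmatched row is still a row
print(with_column)  # a2 is reported as 0: NULLs from the outer join are not counted
```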

I have tried to present all the major choices. Each query is followed by a demo link, and each link contains a description.
select AType,count(*) as Count from tb2
where atype!='' and atype is not null and atype!='0' group by atype
Link1 (Best One): counts answers of each (existing) type other than blank, 0, and NULL.
select (select count(id) from tb2 where atype='a1') as A1,(select count(id) from
tb2 where atype='a2') as A2,(select count(id) from tb2 where atype='a3') as A3;
Link2 (simple and most suitable for you): counts answers of types a1, a2, or a3 only (my older method to get results similar to Link3).
select AType,count(*) as Count from tb2
where atype in('a1','a2','a3') group by atype
Link3: counts answers of types a1, a2, or a3 only.
The first and last queries do not take into account types that are entirely absent from the data. For example, if there is no answer with type a2, the first and third queries report only the counts for a1 and a3 and do not mention a2 at all. The second query, however, does.

Related

GROUP BY inverse (mysql)

Is there any way to get the inverse of a group by statement in mysql? My use case is to delete all duplicates.
Say my table looks like this:
ID | columnA | ...
1 | A
2 | A
3 | A
4 | B
5 | B
6 | C
I want my result set to look like this:
ID | columnA | ...
2 | A
3 | A
5 | B
(Essentially this finds all duplicates leaving one behind. Could be used to purge all duplicate records down to 1, or to perform other analysis later).
One way is to take all but the first id for each value of ColumnA:
select t.*
from t
where t.id > (select min(t2.id) from t t2 where t2.columnA = t.columnA);
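Here is a small runnable sketch of that query against the sample data from the question. It uses sqlite3 in place of MySQL, and the purge step deletes by id from Python instead of using a self-referencing DELETE, because MySQL does not allow a DELETE to select from its own target table in a subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, columnA TEXT);
    INSERT INTO t (id, columnA) VALUES
        (1, 'A'), (2, 'A'), (3, 'A'), (4, 'B'), (5, 'B'), (6, 'C');
""")

# Every row whose id is larger than the smallest id of its columnA group.
duplicates = conn.execute("""
    SELECT t.id, t.columnA
    FROM t
    WHERE t.id > (SELECT MIN(t2.id) FROM t t2 WHERE t2.columnA = t.columnA)
    ORDER BY t.id
""").fetchall()

# Purge the duplicates one id at a time, keeping the first row of each group.
with conn:
    conn.executemany("DELETE FROM t WHERE id = ?", [(row[0],) for row in duplicates])

remaining = conn.execute("SELECT id, columnA FROM t ORDER BY id").fetchall()
print(duplicates)
print(remaining)
```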
Your result seems
select max(id), columnA from t group by columnA
This should perform a lot better than inner-select-based queries.
SELECT
*
FROM
TABLE
QUALIFY
RANK() OVER (partition by columnA order by ID ASC ) = 1
EDIT: This apparently won't work in MySQL. Guess the only answer is to buy an Oracle license, or use another answer. ;)
I worked out my own solution based on @scaisEdge's response before he edited it. I need the opposite of my GROUP BY, so I use a subquery:
SELECT * FROM mytable WHERE ID NOT IN (SELECT MIN(ID) FROM mytable GROUP BY columnA);
I am confident this will help.
create table test.temptable select distinct * from YourTable;
truncate YourTable;
insert into YourTable select * from test.temptable ;

Select where field in values (if field in value not exist return as null)

I want to select data using a WHERE clause. Say that I have $values: 1,2,3,4
and I want to select the rows from a table whose id is in those values.
SELECT `date` from another_table where id in (1,2,3,4)
If another_table only has rows with ids 1, 2, and 3, then id 4 does not exist. I want id 4 to still be returned, with a value of NULL, zero, or NEVER.
another_table
+----+-----------+
| id | date      |
+----+-----------+
| 1  | Yesterday |
| 2  | Today     |
| 3  | Tomorrow  |
| 5  | Today     |
+----+-----------+
The expected result would be
+----+-----------+
| id | date      |
+----+-----------+
| 1  | Yesterday |
| 2  | Today     |
| 3  | Tomorrow  |
| 4  | Never     |
+----+-----------+
How to do this?
Thanks in advance
One option is to use an inline view as a row source and perform an outer join. We can use an IF expression to check whether a matching row was returned from another_table: a.id will not be NULL if there was a matching row, because the join predicate (in the ON clause) guarantees that.
As an example:
SELECT n.id
, IF(a.id IS NULL,'Never',a.date) AS `date`
FROM ( SELECT 1 AS id
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
) n
LEFT
JOIN another_table a
ON a.id = n.id
ORDER BY n.id
The inline view query gets executed, and the results are materialized into a temporary table (MySQL calls it a derived table). When the outer query runs, n is effectively a table containing 4 rows.
Obviously, if the list of id values has to change, you'd need to change the view definition. The SQL text for the inline view can be generated dynamically from an array in a programming language.
For a large number of values, the inline view becomes unwieldy, and you'd get better performance from a table, rather than the view.
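Generating the inline view from an array could look roughly like the sketch below. The function build_id_lookup is hypothetical, sqlite3 stands in for MySQL, and a portable CASE expression replaces MySQL's IF(). With a MySQL connector the placeholder style is usually %s instead of ?, which is an assumption to check against the driver you use.

```python
import sqlite3


def build_id_lookup(ids):
    """Return (sql, params) that LEFT JOINs another_table against an inline id list."""
    if not ids:
        raise ValueError("ids must not be empty")
    # One bound placeholder per id; the values never end up in the SQL text itself.
    selects = ["SELECT ? AS id"] + ["SELECT ?"] * (len(ids) - 1)
    inline_view = " UNION ALL ".join(selects)
    sql = (
        "SELECT n.id, CASE WHEN a.id IS NULL THEN 'Never' ELSE a.date END AS date "
        f"FROM ({inline_view}) n "
        "LEFT JOIN another_table a ON a.id = n.id "
        "ORDER BY n.id"
    )
    return sql, list(ids)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE another_table (id INTEGER PRIMARY KEY, date TEXT);
    INSERT INTO another_table VALUES
        (1, 'Yesterday'), (2, 'Today'), (3, 'Tomorrow'), (5, 'Today');
""")

sql, params = build_id_lookup([1, 2, 3, 4])
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Binding the ids as parameters, rather than pasting them into the SQL string, keeps the generated statement safe even when the array comes from user input.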

Select row based on identifier and value from another row

I have a MySQL dataset that looks like this:
ID PARENT_ID VALUE
1 100 This comment should be approved
2 100 Y
3 101 Another approved comment
4 101 Y
5 102 This comment is not approved
6 102 N
I need to construct an SQL query to select the rows that have a matching parent_id and corresponding value of Y (but ignore the rows with single letters as a value in the result) to result in:
ID PARENT_ID VALUE
1 100 This comment should be approved
3 101 Another approved comment
My idea is to use GROUP BY to combine the columns, but I can't work out how to select based on the Y/N values.
There is possibly a solution here How do I select a row based on a priority value in another row? but I don't think it is asking quite the same question.
Any ideas?
Although you can express this as an aggregation, you can also express it using EXISTS:
select d.*
from dataset d
where d.value <> 'Y' and
exists (select 1
from dataset d2
where d2.parent_id = d.parent_id and d2.value = 'Y'
);
This version is probably more efficient.
First, if you possibly can, change your table schema. Your table is storing two kinds of data in the same field (yes/no flags and comments). This breaks normalization and will haunt you later.
But if it's not your table to change, you will need a self join. Try this:
SELECT a.id, a.parent_id, a.value
FROM dataset a INNER JOIN dataset b
ON a.parent_id = b.parent_id
WHERE a.value <> 'Y' AND b.value = 'Y'
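To compare the EXISTS and self-join approaches, here is a small sketch on the sample data from the question, using sqlite3 in place of MySQL. The extra row with id 7 is a made-up case added only to show where the two queries start to differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset (id INTEGER PRIMARY KEY, parent_id INTEGER, value TEXT);
    INSERT INTO dataset VALUES
        (1, 100, 'This comment should be approved'), (2, 100, 'Y'),
        (3, 101, 'Another approved comment'),        (4, 101, 'Y'),
        (5, 102, 'This comment is not approved'),    (6, 102, 'N');
""")

EXISTS_QUERY = """
    SELECT d.id FROM dataset d
    WHERE d.value <> 'Y'
      AND EXISTS (SELECT 1 FROM dataset d2
                  WHERE d2.parent_id = d.parent_id AND d2.value = 'Y')
    ORDER BY d.id
"""
JOIN_QUERY = """
    SELECT a.id FROM dataset a
    INNER JOIN dataset b ON a.parent_id = b.parent_id
    WHERE a.value <> 'Y' AND b.value = 'Y'
    ORDER BY a.id
"""


def ids(query):
    """Run a query and return the selected ids as a plain list."""
    return [row[0] for row in conn.execute(query)]


before = (ids(EXISTS_QUERY), ids(JOIN_QUERY))

# Hypothetical extra row: a second 'Y' flag for parent 100.
conn.execute("INSERT INTO dataset VALUES (7, 100, 'Y')")
after = (ids(EXISTS_QUERY), ids(JOIN_QUERY))

print(before)
print(after)
```

On the original data both queries return ids 1 and 3. Once a parent has more than one 'Y' row, the join returns the comment once per flag, while EXISTS still returns it once.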

Adding one extra row to the result of MySQL select query

I have a MySQL table like this
id Name count
1 ABC 1
2 CDF 3
3 FGH 4
using simply select query I get the values as
1 ABC 1
2 CDF 3
3 FGH 4
How I can get the result like this
1 ABC 1
2 CDF 3
3 FGH 4
4 NULL 0
Note the last row. After the last record, an extra row in the format
last_id+1, NULL, 0 should be added, even though no such row exists in my original table. The number of rows is not fixed at 3 or 4; there may be N rows.
The answer is very simple
select (select max(id) from mytable)+1 as id, NULL as Name, 0 as count union all select id,Name,count from mytable;
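A runnable sketch of the same UNION ALL idea, using sqlite3 in place of MySQL and the table from the question. Two details here are my own additions rather than part of the answer above: the extra row is placed last with an explicit ORDER BY, and COALESCE guards against an empty table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, count INTEGER);
    INSERT INTO mytable VALUES (1, 'ABC', 1), (2, 'CDF', 3), (3, 'FGH', 4);
""")

# The second SELECT produces exactly one row: (last id + 1, NULL, 0).
rows = conn.execute("""
    SELECT id, name, count FROM mytable
    UNION ALL
    SELECT COALESCE(MAX(id), 0) + 1, NULL, 0 FROM mytable
    ORDER BY id
""").fetchall()

print(rows)
```

Without the ORDER BY, the position of the extra row in the result is not guaranteed.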
This looks a little messy but it should work.
SELECT a.id, b.name, coalesce(b.`count`, 0) as `count`
FROM
(
SELECT 1 as ID
UNION
SELECT 2 as ID
UNION
SELECT 3 as ID
UNION
SELECT 4 as ID
) a LEFT JOIN table1 b
ON a.id = b.id
WHERE a.ID IN (1,2,3,4)
UPDATE 1
You could simply generate a table that has one column, preferably named ID, holding records from 1 up to 10,000 or more. Then you could join it with the table that has the original records. For example, assuming that you have a table named DummyRecord with one column and 10,000 rows in it:
SELECT a.id, b.name, coalesce(b.`count`, 0) as `count`
FROM DummyRecord a LEFT JOIN table1 b
ON a.id = b.id
WHERE a.ID >= 1 AND
a.ID <= 4
That's it. Or, if you want ids from 10 to 100, you could use this condition:
...
WHERE a.ID >= 10 AND
a.ID <= 100
To clarify, this is how one can append an extra row to the result set:
select id, name from mytable union all select 123 as id, 'abc' as name
results
id | name
------------
*** | ***
*** | ***
123 | abc
Simply use MySQL's WITH ROLLUP.
SELECT * FROM your_table
GROUP BY Name WITH ROLLUP;
select
x.id,
t.name,
ifnull(t.count, 0) as count
from
(SELECT 1 AS id
-- Part of the query below, you will need to generate dynamically,
-- just as you would otherwise need to generate 'in (1,2,3,4)'
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
) x
LEFT JOIN YourTable t
ON t.id = x.id
If the id does not exist in the table you're selecting from, you'll need to LEFT JOIN against a list of every id you want returned - this way, it will return the null values for ones that don't exist and the true values for those that do.
I would suggest creating a numbers table that is a single-columned table filled with numbers:
CREATE TABLE `numbers` (
id int(11) unsigned NOT NULL
);
And then inserting a large amount of numbers, starting at 1 and going up to what you think the highest id you'll ever see plus a thousand or so. Maybe go from 1 to 1000000 to be on the safe side. Regardless, you just need to make sure it's more-than-high enough to cover any possible id you'll run into.
After that, your query can look like:
SELECT n.id, t.*
FROM
`numbers` n
LEFT JOIN YourTable t
ON t.id = n.id
WHERE n.id IN (1,2,3,4);
This solution will allow for a dynamically growing list of ids without the need for a sub-query with a list of unions; though, the other solutions provided will equally work for a small known list too (and could also be dynamically generated).
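A minimal sketch of the numbers-table approach, using sqlite3 in place of MySQL. The table table1 and its rows come from the question; filling numbers with 1 to 1000 is an arbitrary choice for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE numbers (id INTEGER PRIMARY KEY);
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, count INTEGER);
    INSERT INTO table1 VALUES (1, 'ABC', 1), (2, 'CDF', 3), (3, 'FGH', 4);
""")

# Fill the numbers table once; 1000 is just a demo-sized upper bound.
with conn:
    conn.executemany(
        "INSERT INTO numbers (id) VALUES (?)", ((i,) for i in range(1, 1001))
    )

# Ids without a match in table1 come back with NULL name and a count of 0.
rows = conn.execute("""
    SELECT n.id, t.name, COALESCE(t.count, 0) AS count
    FROM numbers n
    LEFT JOIN table1 t ON t.id = n.id
    WHERE n.id BETWEEN 1 AND 4
    ORDER BY n.id
""").fetchall()

print(rows)
```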

Getting the good row in a MySQL GROUP BY query

I have a MySQL table named 'comments':
id | date | movie_id | comment_value
1 2011/11/05 10 comment_value_1
2 2012/01/10 10 comment_value_2
3 2011/10/10 15 comment_value_3
4 2011/11/20 15 comment_value_4
5 2011/12/10 30 comment_value_5
I am trying to get the most recent comment for each movie with this query:
SELECT MAX(date),id,date,movie_id,comment_value FROM comments GROUP BY movie_id
MAX(date) returns the most recent date, but the rest of the row (movie_id, id, comment_value, date) does not match it. The query returns the values of the first comment for each movie, like this:
MAX(date) | id | date | movie_id | comment_value
2012/01/10 1 2011/11/05 10 comment_value_1
2011/11/20 3 2011/10/10 15 comment_value_3
2011/12/10 5 2011/12/10 30 comment_value_5
So, my question is: how can I get the most recent comment for each movie in only one query? (I'm currently using a second query to fetch the right comment.)
Using two queries isn't so bad. Otherwise you can do something like
SELECT c.id, c.date, c.movie_id, c.comment_value FROM comments c JOIN
(SELECT movie_id, MAX(date) date FROM comments GROUP BY movie_id) x
ON x.movie_id=c.movie_id AND x.date=c.date GROUP BY c.movie_id;
Try this:
SELECT c1.*
FROM comments c1
LEFT JOIN comments c2 ON (c1.movie_id = c2.movie_id AND c1.date < c2.date)
WHERE c2.id IS NULL
Because of the join condition, a row in c1 finds a match in c2 only if some other comment for the same movie has a later date. Rows holding the maximum date find no match, so filtering with c2.id IS NULL leaves exactly those rows.
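The self-join above can be tried on the sample data from the question with the sketch below. It uses sqlite3 in place of MySQL and stores the dates as ISO-formatted text (an assumption made for the demo) so that they compare correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY, date TEXT, movie_id INTEGER, comment_value TEXT
    );
    INSERT INTO comments VALUES
        (1, '2011-11-05', 10, 'comment_value_1'),
        (2, '2012-01-10', 10, 'comment_value_2'),
        (3, '2011-10-10', 15, 'comment_value_3'),
        (4, '2011-11-20', 15, 'comment_value_4'),
        (5, '2011-12-10', 30, 'comment_value_5');
""")

# c2 matches only later comments for the same movie; no match means c1 is the latest.
latest = conn.execute("""
    SELECT c1.id, c1.date, c1.movie_id, c1.comment_value
    FROM comments c1
    LEFT JOIN comments c2
           ON c1.movie_id = c2.movie_id AND c1.date < c2.date
    WHERE c2.id IS NULL
    ORDER BY c1.movie_id
""").fetchall()

print(latest)
```

If two comments for the same movie share the latest date, this query returns both of them.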
create table comments (id int, movie_dt datetime, movie_id int, comment_value nvarchar(100));
insert into comments values (1,'2011/11/05',10,'comment_value_1');
insert into comments values (2,'2012/01/10',10,'comment_value_2');
insert into comments values (3,'2011/10/10',15,'comment_value_3');
insert into comments values (4,'2011/11/20',15,'comment_value_4');
insert into comments values (5,'2011/12/10',30,'comment_value_5');
select a.id, m.movie_dt, m.movie_id,a.comment_value
from comments a
inner join
(
SELECT MAX(movie_dt) movie_dt,movie_id
FROM comments
GROUP BY movie_id
) m on (a.movie_dt = m.movie_dt and a.movie_id = m.movie_id)
Is it possible to use a DATETIME field instead of just DATE? That would make the query a lot easier plus give better reporting capabilities. You can always aggregate the DATETIME field down to something more specific if needed.