Number of different records (rows) in a table - mysql

Quite often I have to count the number of distinct records in a table. However, in MySQL neither
select count(distinct *) from t;
nor
select count(distinct t.*) from t;
works. I know that I can work around that with
select count(*) as countdistinctrows
from (
select distinct * from t
) x;
but this is ugly. Is there really no way to ask for the number of distinct rows directly? By the way,
select distinct count(*) from t;
is not the answer, since there distinct is applied to the single row count rather than to the rows, and so it returns the same result as
select count(*) from t;

If you don't want to use this:
select count(*) as countdistinctrows
from (
select distinct * from t
) x;
the only alternative I can think of is to specify all the column names manually:
select count(distinct id, col1, col2, col3, ...)
from t;
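One caveat when choosing between the two forms: in MySQL, COUNT(DISTINCT expr, expr, ...) only counts combinations in which every expression is non-NULL, while SELECT DISTINCT * treats NULLs as equal and keeps such rows. A minimal sketch of the difference, using a hypothetical demo table t_demo that is not part of the question:

```sql
-- Hypothetical demo table; the name and columns are illustrative only.
CREATE TEMPORARY TABLE t_demo (a INT, b INT);
INSERT INTO t_demo VALUES (1, 1), (1, 1), (1, NULL), (2, 2);

-- Derived-table form: distinct rows are (1,1), (1,NULL), (2,2) -> 3
SELECT COUNT(*) AS countdistinctrows
FROM (SELECT DISTINCT * FROM t_demo) x;

-- Column-list form: the row containing NULL is skipped -> 2
SELECT COUNT(DISTINCT a, b) AS countdistinctrows
FROM t_demo;
```

So the two queries are only interchangeable when none of the listed columns can be NULL.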

Related

Selecting row with max date from multiple almost identical tables

If I can retrieve the most recent name for each id from a table in a MySQL database like so:
SELECT n.id, n.name, n.date
FROM $table AS n
INNER JOIN
(SELECT id, MAX(date) AS date
FROM $table GROUP BY id)
AS max
USING (id, date);
How could I retrieve the most recent name from three almost identical tables (call them $table, $table2, $table3)? They all share the same column structure and the id found from one table may or may not be present in the other two. Think of it as one large table split into three (but with two of them containing two extra columns that are irrelevant in this instance). Would UNION be the best solution? If so, is there a way to do it without a mile-long query?
Constraint:
id is not an auto-incrementing unique integer unfortunately
You can use union all. One slight simplification is the group_concat()/substring_index() trick:
select id, max(date) as date,
substring_index(group_concat(name order by date desc), ',', 1) as MostRecentName
from (select id, name, date from $table union all
select id, name, date from $table2 union all
select id, name, date from $table3
) t
group by id;
This does make certain assumptions. The name cannot contain a comma (although it is easy enough to change the separator). In addition, the intermediate result of group_concat() cannot exceed a certain length, which is set by the group_concat_max_len system variable; longer results are truncated.
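Both assumptions can be relaxed. The following is a sketch only: names_demo is a hypothetical table standing in for the unioned rows, '|' is assumed never to occur in a name, and the new limit of 1000000 is an arbitrary choice.

```sql
-- Hypothetical demo table standing in for the unioned rows.
CREATE TEMPORARY TABLE names_demo (id INT, name VARCHAR(50), date DATE);
INSERT INTO names_demo VALUES
    (1, 'Smith, J.',   '2020-01-01'),
    (1, 'Smith, John', '2021-06-01'),
    (2, 'Doe',         '2019-03-15');

-- Raise the per-session limit on GROUP_CONCAT() output (the default is 1024).
SET SESSION group_concat_max_len = 1000000;

-- Use '|' as the separator so that commas inside names are harmless.
SELECT id,
       MAX(date) AS date,
       SUBSTRING_INDEX(
           GROUP_CONCAT(name ORDER BY date DESC SEPARATOR '|'),
           '|', 1) AS MostRecentName
FROM names_demo
GROUP BY id;
```

With this data, id 1 yields 'Smith, John' even though the name itself contains a comma.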
You could try:
SELECT n.id, n.name, n.date
FROM table1 AS n where n.id in (select max(id) from table1)
union
SELECT n.id, n.name, n.date
FROM table2 AS n where n.id in (select max(id) from table2)
union
SELECT n.id, n.name, n.date
FROM table3 AS n where n.id in (select max(id) from table3)
Every inner query selects the highest id from its table, and the outer query then fetches the corresponding fields. Note that this returns the rows with the highest id in each table, not the most recent row for each id.
This ended up being the only solution I could think of:
SELECT n.id, n.name, n.date FROM (
SELECT id, name, date FROM $table
UNION ALL
SELECT id, name, date FROM $table2
UNION ALL
SELECT id, name, date FROM $table3
) AS n INNER JOIN (
SELECT id, MAX(date) AS date FROM (
SELECT id, date FROM $table
UNION ALL
SELECT id, date FROM $table2
UNION ALL
SELECT id, date FROM $table3
) AS t
GROUP BY id
) AS max USING (id, date)
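For comparison, here is a sketch of a different technique, not the asker's method: on MySQL 8.0 or later, a window function avoids writing the union twice. The tables names_a, names_b and names_c are hypothetical stand-ins for $table, $table2 and $table3.

```sql
-- Hypothetical stand-ins for $table, $table2 and $table3.
CREATE TEMPORARY TABLE names_a (id INT, name VARCHAR(50), date DATE);
CREATE TEMPORARY TABLE names_b (id INT, name VARCHAR(50), date DATE);
CREATE TEMPORARY TABLE names_c (id INT, name VARCHAR(50), date DATE);
INSERT INTO names_a VALUES (1, 'old name', '2019-01-01');
INSERT INTO names_b VALUES (1, 'new name', '2021-01-01'), (2, 'only name', '2020-05-05');

-- Requires MySQL 8.0+: number the rows of each id from newest to oldest,
-- then keep only the newest one.
SELECT id, name, date
FROM (
    SELECT id, name, date,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
    FROM (
        SELECT id, name, date FROM names_a
        UNION ALL
        SELECT id, name, date FROM names_b
        UNION ALL
        SELECT id, name, date FROM names_c
    ) AS u
) AS ranked
WHERE rn = 1;
```

One behavioural difference: if two rows for the same id share the latest date, the join on (id, date) returns both, whereas ROW_NUMBER() keeps exactly one of them, chosen arbitrarily.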

Select and order by a union

Is it possible to:
SELECT * FROM table1 , table2 ORDER BY (a UNION)
I tried that, but it doesn't work.
I searched Google for answers but found nothing, and I no longer know what to search for, so asking here is my last resort. Maybe one of you knows a clause I don't that would help in my case. I don't know how else to approach this query.
The union is made between two columns from two (or more) tables, so I want to order every possible row by this new column produced by the union. Something like this (to keep it generic):
SELECT * FROM table1 , table2 ORDER BY ((SELECT col1 AS col FROM table1) UNION ALL (SELECT col2 AS col FROM table2) ORDER BY col DESC);
Try a query like this:
SELECT * FROM(
SELECT * FROM table1
UNION
SELECT * FROM table2
) as tab ORDER BY col_name
If you want to do the union and then order, you can do:
select t1.*
from table1 t1
union
select t2.*
from table2 t2
order by a;
Notes:
Use union all rather than union, unless you specifically want to incur the overhead of removing duplicates.
The use of * implies that the two tables have the same columns in the same order (and compatible types).
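To illustrate the notes above, here is a small sketch with explicit column lists. The tables u1 and u2 and their columns are hypothetical. A trailing ORDER BY applies to the combined result, and the column names of the result come from the first SELECT.

```sql
-- Hypothetical demo tables with matching column lists.
CREATE TEMPORARY TABLE u1 (a INT, label VARCHAR(10));
CREATE TEMPORARY TABLE u2 (a INT, label VARCHAR(10));
INSERT INTO u1 VALUES (3, 'x'), (1, 'y');
INSERT INTO u2 VALUES (2, 'z');

-- The ORDER BY sorts the whole union, not just the second branch.
-- Rows come back with a = 3, then 2, then 1.
SELECT a, label FROM u1
UNION ALL
SELECT a, label FROM u2
ORDER BY a DESC;
```

Listing the columns explicitly also removes the dependency on both tables having identical column order.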

How to do a count on a union query

I have the following query:
select distinct profile_id from userprofile_...
union
select distinct profile_id from productions_...
How would I get the count of the total number of results?
If you want a total count for all records, then you would do this:
SELECT COUNT(*)
FROM
(
select distinct profile_id
from userprofile_...
union all
select distinct profile_id
from productions_...
) x
You should use UNION ALL if the same rows can appear in both tables and you want to count each occurrence, because UNION removes duplicates:
select count(*) from
(select distinct profile_id from userprofile_...
union ALL
select distinct profile_id from productions_...) x
In this case, if the same profile_id exists in both tables (the id is probably a number, so that is possible), UNION collapses the two occurrences into one. For example, if both tables contain id = 1, it appears once instead of twice and you lose one row from the count.
This will perform pretty well:
select count(*) from (
select profile_id
from userprofile_...
union
select profile_id
from productions_...
) x
The use of union guarantees distinct values - union removes duplicates, union all preserves them. This means you don't need the distinct keyword (the other answers don't exploit this fact and end up doing more work).
Edited:
If you want the total number of different profile_id values in each table, where a value that appears in both tables is counted once per table, use this:
select sum(count) from (
select count(distinct profile_id) as count
from userprofile_...
union all
select count(distinct profile_id)
from productions_...
) x
This query will out-perform all other answers, because the database can efficiently count distinct values within a table much faster than from the unioned list. The sum() simply adds the two counts together.
These will not work if one of the COUNT(*) results is equal to 0.
This will be better:
SELECT SUM(total)
FROM
(
select COUNT(distinct profile_id) AS total
from userprofile_...
union all
select COUNT(distinct profile_id) AS total
from productions_...
) x
As OMG Ponies has already pointed out, there is no point in combining distinct with UNION, so you can use UNION ALL in your case:
SELECT COUNT(*)
FROM
(
select distinct profile_id from userprofile_...
union all
select distinct profile_id from productions_...
) AS t1
The best solution is to add the counts of the two queries together. This is not a problem even if the tables contain a large number of records, and you don't need a union query at all.
Ex:
SELECT (select COUNT(distinct profile_id) from userprofile_...) +
(select COUNT(distinct profile_id) from productions_...) AS total
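The answers above actually compute two different numbers, which is easy to miss. A minimal sketch with hypothetical demo tables (userprofile_demo and productions_demo are illustrative names, since the real table names are elided in the question):

```sql
-- Hypothetical demo tables; profile_id 2 appears in both.
CREATE TEMPORARY TABLE userprofile_demo (profile_id INT);
CREATE TEMPORARY TABLE productions_demo (profile_id INT);
INSERT INTO userprofile_demo VALUES (1), (2), (2);
INSERT INTO productions_demo VALUES (2), (3);

-- Distinct ids across both tables: 1, 2, 3 -> 3
SELECT COUNT(*) AS distinct_overall
FROM (
    SELECT profile_id FROM userprofile_demo
    UNION
    SELECT profile_id FROM productions_demo
) x;

-- Distinct ids per table, added together: 2 + 2 -> 4
SELECT (SELECT COUNT(DISTINCT profile_id) FROM userprofile_demo)
     + (SELECT COUNT(DISTINCT profile_id) FROM productions_demo) AS distinct_per_table;
```

Which one is right depends on whether an id shared by both tables should count once or twice.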

distinct count(*)

How do I get a distinct count(*) in MySQL?
For example, table1 has 10 million records, some of which are duplicates.
I want to find the number of distinct rows in the table.
I know I can do
select distinct * from table1
but I don't want to fetch 10 million records, nor do I want to insert the distinct records into another table, like
create table table2 select distinct * from table1
So please help me with any other option.
Help from anyone is welcome.
SELECT COUNT(DISTINCT field) FROM table
or
SELECT COUNT(*) FROM table GROUP BY field;
(btw - this has been answered quite a few times elsewhere on this site)
Try using a subquery:
SELECT COUNT(*) FROM (SELECT DISTINCT * FROM table1) T1
Maybe like:
SELECT COUNT(*) FROM ( SELECT COUNT(*) as cnt FROM tab GROUP BY some_value ) AS x

How do you store result row count in nested SELECT?

I have a MySQL query where I have a nested SELECT that returns an array to the parent:
SELECT ...
FROM ...
WHERE ... IN (SELECT .... etc)
I would like to store the number of returned results (row count) from the nested SELECT, but doing something like IN (SELECT count(...), columnA) does not work, as the IN expects just one result.
Is there a way to store the returned result count for later use within the parent statement?
You're probably going to have to select the results of your nested statement into a temporary table. Then you can do an IN and a count on it later. I'm more familiar with MS-SQL, but I think you should be able to do it like this:
CREATE TEMPORARY TABLE tmp_table AS
SELECT something
FROM your_table;
SELECT ...
FROM ...
WHERE ... IN (SELECT * FROM tmp_table);
SELECT count(*) FROM tmp_table;
If that doesn't work, you may have to give the temporary table creation statement full column definitions, as you would with a normal "CREATE TABLE". See the MySQL manual for details.
CREATE TEMPORARY TABLE tmp_table
(
tableid INT,
somedata VARCHAR(50)
);
INSERT INTO tmp_table
SELECT ...
FROM ...;
SELECT ...
FROM ...
WHERE ... IN (SELECT * FROM tmp_table);
SELECT count(*) FROM tmp_table;
Rich
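If the count is needed later in the same session, one option is to store it in a user-defined variable once the temporary table exists. This is a sketch under assumptions: tabA_demo, tmp_ids and @matched are hypothetical names chosen for illustration.

```sql
-- Hypothetical demo data.
CREATE TEMPORARY TABLE tabA_demo (colA INT, colB INT);
INSERT INTO tabA_demo VALUES (1, 1), (2, 1), (3, 0);

-- Materialise the nested SELECT once.
CREATE TEMPORARY TABLE tmp_ids AS
SELECT colA FROM tabA_demo WHERE colB = 1;

-- Keep the row count in a session variable for later use.
SELECT COUNT(*) INTO @matched FROM tmp_ids;

-- Use both the stored rows and the stored count in the parent query.
SELECT colA, colB, @matched AS matched_rows
FROM tabA_demo
WHERE colA IN (SELECT colA FROM tmp_ids);
```

The variable keeps its value until the session ends or it is reassigned, so later statements can reuse it without re-running the count.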
You mentioned in your comment that your query looks like this:
SELECT
tabA.colA,
tabA.colB
FROM tabA
WHERE tabA.colA IN ( SELECT tabA.colA FROM tabA WHERE tabA.colB = 1 )
I might be missing something, but you don't need a subquery for this. Why don't you do it in a regular where condition:
SELECT
tabA.colA,
tabA.colB
FROM tabA
WHERE tabA.colB = 1
You can use the IN predicate with multiple columns like this:
SELECT *
FROM table
WHERE (col1, col2) IN
(
SELECT col3, col4
FROM othertable
)
If you want to select COUNT(*) along with each value, use this:
SELECT colA, colB, cnt
FROM (
SELECT COUNT(*) AS cnt
FROM tabA
WHERE colB = 1
) q,
tabA
WHERE colB = 1
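On MySQL 8.0 or later, the same "count alongside each row" result can be sketched with a window function instead of the cross join. This is an alternative technique rather than the answer's own, and tabA_win is a hypothetical demo table.

```sql
-- Hypothetical demo table.
CREATE TEMPORARY TABLE tabA_win (colA INT, colB INT);
INSERT INTO tabA_win VALUES (1, 1), (2, 1), (3, 0);

-- Requires MySQL 8.0+: COUNT(*) OVER () counts all rows that pass the
-- WHERE clause and repeats that total on every returned row.
SELECT colA, colB, COUNT(*) OVER () AS cnt
FROM tabA_win
WHERE colB = 1;
```

This way the filter is written only once, so the count cannot drift out of sync with the rows it describes.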