Group By Two Tables - mysql

I have two tables with identical schema. I want to get a count of all the people with a given surname in both tables, and have found I can do it like this:
SELECT surname, count(*) AS cnt
FROM
(
SELECT surname
FROM people.NorthKorea
UNION ALL
SELECT surname
FROM peopleGlobal.NorthKorea
) AS t
GROUP BY surname
ORDER BY cnt DESC
This is fine for small tables, but I have tables with up to 250 million rows, so I was wondering whether there is a more efficient way of doing this, such as inserting the result of the COUNT from one table into a table, and then updating / inserting (REPLACE?) the result of the COUNT on the second table.
N.B. I actually want to store the result of the COUNT on both tables in another table.

An index on the surname column should help a lot. I would try this query; if there are a lot more rows than surnames, I expect it to run faster:
SELECT surname, SUM(cnt) AS cnt
FROM
(
SELECT surname, COUNT(*) AS cnt
FROM people.NorthKorea
GROUP BY surname
UNION ALL
SELECT surname, COUNT(*) AS cnt
FROM peopleGlobal.NorthKorea
GROUP BY surname
) AS t
GROUP BY surname
ORDER BY cnt DESC
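As a rough sketch of the "aggregate each table first, then sum" idea, including storing the result in a third table as the question asks, here is a small runnable example. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table names (people_a, people_b, surname_counts) and sample rows are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; table names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people_a (surname TEXT);
    CREATE TABLE people_b (surname TEXT);
    CREATE INDEX idx_a_surname ON people_a (surname);
    CREATE INDEX idx_b_surname ON people_b (surname);
    INSERT INTO people_a VALUES ('Kim'), ('Kim'), ('Ri'), ('Pak');
    INSERT INTO people_b VALUES ('Kim'), ('Ri'), ('Ri'), ('Ri'), ('Choe');
    CREATE TABLE surname_counts (surname TEXT PRIMARY KEY, cnt INTEGER);
""")

# Aggregate each table on its own, then sum the much smaller partial
# results and store them in the summary table.
conn.execute("""
    INSERT INTO surname_counts (surname, cnt)
    SELECT surname, SUM(cnt) AS cnt
    FROM (
        SELECT surname, COUNT(*) AS cnt FROM people_a GROUP BY surname
        UNION ALL
        SELECT surname, COUNT(*) AS cnt FROM people_b GROUP BY surname
    ) AS t
    GROUP BY surname
""")

surname_counts = conn.execute(
    "SELECT surname, cnt FROM surname_counts ORDER BY cnt DESC, surname"
).fetchall()
print(surname_counts)
```

Whether this is actually faster on 250 million rows depends on the indexes and the number of distinct surnames, so it is worth checking the query plan on the real data.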

Related

Need a field in SELECT DISTINCT but I do not want it to be printed

I need to have the ID field in the SELECT DISTINCT in order to tell two cases apart: true duplicates, and different people who merely share a name (namesakes).
In other words, the same person may be duplicated many times, and different people with the same name and surname may exist in the same db.
If I do not place the ID field in the SELECT, the query collapses duplicates and namesakes alike.
I have to include the ID to eliminate duplicates only. But at the same time, I would like not to print the ID. Is this possible without using GROUP BY ID?
SELECT DISTINCT ID, Name, Surname
FROM (SUBQUERY THAT RETURNS DUPLICATES)
Sure:
Select c.Name, c.Surname
From (
SELECT DISTINCT ID, Name, Surname
FROM (SUBQUERY THAT RETURNS DUPLICATES)
) as c;
A simple way is a SELECT wrapper:
select Name, Surname from (
SELECT DISTINCT
ID
, Name
, Surname
FROM (SUBQUERY THAT RETURNS DUPLICATES) ) T
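To illustrate the wrapper approach, here is a minimal sketch using Python's sqlite3 module in place of MySQL. The raw_people table is a hypothetical stand-in for the subquery that returns duplicates.

```python
import sqlite3

# SQLite stands in for MySQL; raw_people is a hypothetical stand-in for
# the "subquery that returns duplicates".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_people (ID INTEGER, Name TEXT, Surname TEXT);
    INSERT INTO raw_people VALUES
        (1, 'Ann', 'Lee'), (1, 'Ann', 'Lee'),   -- same person, duplicated
        (2, 'Ann', 'Lee'),                      -- namesake with another ID
        (3, 'Bob', 'Ray');
""")

# The inner query removes true duplicates (same ID); the outer query
# simply leaves ID out of the printed columns.
rows = conn.execute("""
    SELECT c.Name, c.Surname
    FROM (
        SELECT DISTINCT ID, Name, Surname FROM raw_people
    ) AS c
    ORDER BY c.ID
""").fetchall()
print(rows)
```

The duplicated row for ID 1 is collapsed, while the namesake with ID 2 is kept, so 'Ann Lee' appears twice in the output.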

Overall Unique count from two tables in MySQL

I have two tables, both having a device_id column that I want to count. For the purposes of demonstration, the schema looks like:
Table 1: 'id', 'save_val', 'device_id_major'
Table 2: 'id', 'save_val', 'location', 'device_id_team'
Table 1 could have many of the same 'device_id_major'.
I basically want to get the unique device_id's from both tables, then from that result set, get the count of unique device_id's (the same device_id can appear in both tables).
Is this possible in one query?
select distinct aa.device_id, count(*)
from(select distinct device_id from table1
union all
select distinct device_id from table2) as aa
group by device_id
order by device_id
Or something like this (as I don't have the schema to hand, I can't fully validate it):
SELECT count(DISTINCT aa.id)
FROM (SELECT DISTINCT major_id AS id FROM `major`
UNION ALL
SELECT DISTINCT team_id AS id FROM `team`)
AS aa
This seems to do the trick.
You could use a query that takes the UNION of both tables, then SELECT the unique values.
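A small sketch of the UNION idea, using Python's sqlite3 module as a stand-in for MySQL. The table names table1 and table2 and the sample rows are assumptions; the column names follow the schema in the question. UNION (without ALL) already removes duplicates, so counting the rows of the union gives the overall number of unique device ids.

```python
import sqlite3

# SQLite stands in for MySQL; table names and sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, save_val TEXT, device_id_major TEXT);
    CREATE TABLE table2 (id INTEGER, save_val TEXT, location TEXT,
                         device_id_team TEXT);
    INSERT INTO table1 VALUES (1, 'a', 'd1'), (2, 'b', 'd1'), (3, 'c', 'd2');
    INSERT INTO table2 VALUES (1, 'x', 'here', 'd2'), (2, 'y', 'there', 'd3');
""")

# UNION deduplicates across both tables, so COUNT(*) over the derived
# table is the number of distinct device ids overall.
unique_devices = conn.execute("""
    SELECT COUNT(*)
    FROM (
        SELECT device_id_major AS device_id FROM table1
        UNION
        SELECT device_id_team FROM table2
    ) AS aa
""").fetchone()[0]
print(unique_devices)
```

Here d1 appears twice in table1 and d2 appears in both tables, yet each is counted once.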

SQL Query - How to find out which employees worked at more than one store

I have a relationship table such that it has
employeeID | storeID
What would be the query to find out which employees worked at more than one store?
SELECT employeeID WHERE ???
And possibly also list each different stores just once per employee...
Use group by and having, as in:
select employeeID, count(*) from table group by employeeID having count(distinct storeID) > 1
This will give you the employees working at more than one store. Use this as a subquery to list the stores for each such employee.
You can try:
select distinct employeeID, storeID from table1
where employeeID in
(
select employeeID from table1 group by employeeID having count(distinct storeID) > 1
)
For storing the count and also showing the store ID in one query, you can use the following query:
select a.employeeID, a.storeID, b.cnt
from table1 a,
(select employeeID, count(*) cnt
from table1
group by employeeID
having count(distinct storeID) > 1) b
where a.employeeID = b.employeeID
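For a quick check of the GROUP BY / HAVING approach, here is a minimal sketch with Python's sqlite3 module standing in for MySQL. The sample rows are invented, and the explicit JOIN is just one way to list each store once per qualifying employee.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (employeeID INTEGER, storeID INTEGER);
    INSERT INTO table1 VALUES
        (1, 10), (1, 10), (1, 20),   -- employee 1: two stores, one repeated
        (2, 10),                     -- employee 2: one store only
        (3, 20), (3, 30);            -- employee 3: two stores
""")

# The derived table finds employees with more than one distinct store;
# joining back lists each (employee, store) pair exactly once.
rows = conn.execute("""
    SELECT DISTINCT a.employeeID, a.storeID
    FROM table1 AS a
    JOIN (
        SELECT employeeID
        FROM table1
        GROUP BY employeeID
        HAVING COUNT(DISTINCT storeID) > 1
    ) AS b ON a.employeeID = b.employeeID
    ORDER BY a.employeeID, a.storeID
""").fetchall()
print(rows)
```

Employee 2 is excluded, and the repeated (1, 10) row shows up only once.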

MySQL query to find distinct rows shows incorrect result

I wish to find the total number of distinct records in a table.
I have a table with the following columns
id, name, product, rating, manufacturer, price
This has around 128 rows with some duplicates based on different column names.
I only want to select distinct rows:
select distinct name, product, rating, manufacturer, price from table
This returns 47 rows
For pagination purposes, I need to find the total number of distinct records, so I have another statement:
select distinct count(name), product, rating, manufacturer, price from table
But this returns 128 instead of 47.
How can I get the total number of distinct rows? Any help will be much appreciated. Thanks
You have the distinct and count reversed.
SELECT COUNT(DISTINCT column_name) FROM table_name
Also, I would drop the extra fields when counting; otherwise the values returned for those fields will be unpredictable.
It is not quite clear whether you want the count in the same query as the results or in a separate query. Here are both solutions. First, with the count as a new column in the result:
select distinct name, product, rating, manufacturer, price, (
select count(*) from (
select distinct name, product, rating, manufacturer, price from table1
) as resultCount) as resultCount
from table1
Notice that the previous solution repeats the count on every row, which is neither efficient nor visually appealing. Instead, try running two queries: one that gets the actual data, and another that gets the number of records matching that data:
select distinct name, product, rating, manufacturer, price from table1;
select count(*) from (
select distinct name, product, rating, manufacturer, price from table1
) as result;
Hope this helps
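The two-query approach can be sketched as follows, with Python's sqlite3 module standing in for MySQL. The sample rows, the page size of 2, and the ORDER BY are assumptions added for the pagination example.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows and page size are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (
        id INTEGER PRIMARY KEY, name TEXT, product TEXT,
        rating INTEGER, manufacturer TEXT, price REAL
    );
    INSERT INTO table1 (name, product, rating, manufacturer, price) VALUES
        ('A', 'widget', 5, 'Acme', 9.99),
        ('A', 'widget', 5, 'Acme', 9.99),   -- exact duplicate of the row above
        ('A', 'widget', 4, 'Acme', 9.99),   -- same name, different rating
        ('B', 'gadget', 3, 'Bolt', 19.5);
""")

distinct_rows = "SELECT DISTINCT name, product, rating, manufacturer, price FROM table1"

# Query 1: total number of distinct rows, for the pager.
total = conn.execute(
    f"SELECT COUNT(*) FROM ({distinct_rows}) AS result"
).fetchone()[0]

# Query 2: one page of the distinct rows.
page = conn.execute(
    f"{distinct_rows} ORDER BY name, rating LIMIT 2 OFFSET 0"
).fetchall()

# For comparison: COUNT(DISTINCT name) counts names only, not whole rows.
distinct_names = conn.execute(
    "SELECT COUNT(DISTINCT name) FROM table1"
).fetchone()[0]
print(total, page, distinct_names)
```

Note that counting distinct names alone gives a smaller number than counting distinct rows, which is why the subquery form is used for the total.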
Try adding a GROUP BY name, product, rating, manufacturer, price clause.
It would require running your actual query twice: an inner query for the distinct rows, whose count is returned as a single row, which is then joined to the original SELECT DISTINCT:
select distinct
t1.name,
t1.product,
t1.rating,
t1.manufacturer,
t1.price,
JustTheCount.DistCnt
from
table t1,
( select count(*) as DistCnt
from ( select distinct
t2.name,
t2.product,
t2.rating,
t2.manufacturer,
t2.price
from
table t2 ) DistinctRows
) JustTheCount
In the following query, DISTINCT applies to the whole select list, so you get rows that are distinct across all five columns, not just distinct names:
SELECT DISTINCT name, product, rating, manufacturer, price FROM table
To count the distinct values of a single column, use the following format:
SELECT COUNT(DISTINCT name) FROM table
Notice that DISTINCT goes inside the COUNT function so that you're counting the distinct names. To count rows that are distinct across all five columns, wrap the SELECT DISTINCT in a subquery and count its rows. You probably don't want to include the other columns in the count query because their values will be picked arbitrarily from the set. Of course, if you want an arbitrary sample, then include them.
Most applications will run the count query first, followed by the query to return the results. Also keep in mind that the count is a snapshot: if rows are inserted or deleted between the two queries, it may differ from the actual number of records returned.
SELECT DISTINCT COUNT(name), product FROM table isn't even a valid query in MySQL 4.x. You can't mix aggregate and non-aggregate columns. In 5.x, it'll run, but the values for the non-aggregate columns will be picked arbitrarily from the set.
At the risk of sparking some flames here.. you could always use:
SQL_CALC_FOUND_ROWS as the first part of your SQL. This is very mysql specific though.
http://dev.mysql.com/doc/refman/5.0/en/information-functions.html
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();

Count multiple appearances in DISTINCT statement with two parameters?

I am trying to filter (or count) companies with multiple contact persons from a table containing a company_id and a person_id. Currently I just do this:
SELECT DISTINCT company_id,person_id FROM mytable GROUP BY company_id ORDER BY company_id
as well as
SELECT DISTINCT company_id FROM mytable
The first query returns a couple of rows more. Hence it is obvious that there are companies with multiple contact persons. From the different row count between the two queries I can even tell how many of them. Though I'd like to know how I can select exactly those companies that have more than one person_id assigned.
Thx in advance for any help!
How about this?
SELECT company_id, COUNT(DISTINCT person_id)
FROM mytable
GROUP BY company_id
HAVING COUNT(DISTINCT person_id) > 1
SELECT
company_id
FROM
mytable
GROUP BY
company_id
HAVING
COUNT(person_id) > 1
ORDER BY
company_id
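The two HAVING variants above can give different answers when the same (company_id, person_id) pair is stored more than once. A small sketch, using Python's sqlite3 module in place of MySQL and invented sample rows, shows the difference:

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (company_id INTEGER, person_id INTEGER);
    INSERT INTO mytable VALUES
        (1, 100), (1, 100),   -- one person, stored twice
        (2, 200), (2, 201),   -- two different persons
        (3, 300);             -- one person
""")

# Counts distinct persons per company.
with_distinct = conn.execute("""
    SELECT company_id
    FROM mytable
    GROUP BY company_id
    HAVING COUNT(DISTINCT person_id) > 1
    ORDER BY company_id
""").fetchall()

# Counts rows per company, so a duplicated pair also qualifies.
without_distinct = conn.execute("""
    SELECT company_id
    FROM mytable
    GROUP BY company_id
    HAVING COUNT(person_id) > 1
    ORDER BY company_id
""").fetchall()
print(with_distinct, without_distinct)
```

If the table is guaranteed to hold each pair only once, both forms return the same companies; otherwise COUNT(DISTINCT person_id) is the safer choice.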