Is it possible to select distinct company names from the customers table while also displaying the related IDs?
At the moment I'm using
SELECT company,id, COUNT(*) as count FROM customers GROUP BY company HAVING COUNT(*) > 1;
which returns
MyDuplicateCompany1 64 2
MyDuplicateCompany2 20 3
MyDuplicateCompany6 175 2
but what I'm after is all the duplicate IDs for each company.
so
CompanyName, TimesDuplicated, DuplicateId1, DuplicateId2, DuplicateId3
or a row for each so
MyDuplicateCompany1, DuplicateId1, TimesDuplicated
MyDuplicateCompany1, DuplicateId2, TimesDuplicated
MyDuplicateCompany2, DuplicateId1, TimesDuplicated
MyDuplicateCompany2, DuplicateId2, TimesDuplicated
MyDuplicateCompany2, DuplicateId3, TimesDuplicated
is this possible?
Not sure if this would be acceptable, but MySQL has a function, GROUP_CONCAT(field), that combines the values from multiple rows into one. It lets you keep one row per company while still showing every value of a specified column (ID in this case):
SELECT company
, COUNT(*) as count
, group_concat(ID) as DupCompanyIDs
FROM customers
GROUP BY company
HAVING COUNT(*) > 1;
SQL Fiddle showing similar results, with the IDs of each duplicate company listed in one field.
If you need it in multiple columns or multiple rows, you could wrap the above as an inline view and inner join it back to customers on the name to list the duplicates and times duplicated.
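For the one-row-per-ID variant, the inline view idea described above could look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it can be run anywhere; the sample companies and the alias times_duplicated are made up for illustration, and the SQL itself is plain GROUP BY plus a join.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the query uses only standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, company TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO customers (id, company) VALUES (?, ?)",
    [(1, "Acme"), (2, "Acme"), (3, "Bolt"), (4, "Acme"), (5, "Cord"), (6, "Bolt")],
)

# The inline view finds each duplicated company and its count;
# joining it back to customers lists every id that shares the name.
duplicate_rows = conn.execute(
    """
    SELECT c.company, c.id, d.times_duplicated
    FROM customers AS c
    INNER JOIN (
        SELECT company, COUNT(*) AS times_duplicated
        FROM customers
        GROUP BY company
        HAVING COUNT(*) > 1
    ) AS d ON d.company = c.company
    ORDER BY c.company, c.id
    """
).fetchall()

for row in duplicate_rows:
    print(row)
```

Companies that appear only once (Cord in this made-up data) drop out because the inner join finds no matching row in the inline view.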
You can use GROUP_CONCAT(id) to join the ids of each company into one comma-separated string. Your query should be:
SELECT company, GROUP_CONCAT(id) as ids, COUNT(id) as cant FROM customers GROUP BY company HAVING cant > 1
You can test the query with this
CREATE TABLE IF NOT EXISTS `customers` (
`id` int(11) NOT NULL,
`company` varchar(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `customers` (`id`, `company`) VALUES
(1, 'MyDuplicateCompany1'),
(2, 'MyDuplicateCompany1'),
(3, 'MyDuplicateCompany1'),
(4, 'MyDuplicateCompany2'),
(5, 'MyDuplicateCompany2'),
(6, 'MyDuplicateCompany3'),
(7, 'MyDuplicateCompany3'),
(8, 'MyDuplicateCompany3'),
(9, 'MyDuplicateCompany3'),
(10, 'MyDuplicateCompany4');
Read more at:
http://monksealsoftware.com/mysql-group_concat-and-postgres-array_agg/
You are not looking for companies with more than 1 entry (GROUP BY company), but for duplicate company IDs (GROUP BY company, id):
SELECT company, id, COUNT(*)
FROM customers
GROUP BY company, id
HAVING COUNT(*) > 1;
This should give exactly what you're looking for without GROUP_CONCAT()
SELECT
company, id,
( SELECT COUNT(*) from customers AS b
WHERE a.company = b.company
) AS cnt
FROM customers AS a
GROUP BY company, id
HAVING cnt > 1
;
Note: GROUP_CONCAT does the same thing, just all in one row per company.
Related
I am relatively new to SQL and I am trying to extract rows where they have the highest values.
For example, the table look like this:
user_id fruits
1 apple
1 orange
2 apple
1 pear
I would like to extract the data such that it would look like this:
user_id fruits
1 3
If user_id 2 has 3 fruits, it should display:
user_id fruits
1 3
2 3
I can only manage to get the first row if I use LIMIT 1 with a DESC order, but that is not the right way to do it. Otherwise I am getting only:
user_id fruits
1 3
2 1
Not sure where to store the max value to put in the where clause. Appreciate any help, thank you
Use RANK():
WITH cte AS (
SELECT user_id, COUNT(*) AS cnt, RANK() OVER (ORDER BY COUNT(*) DESC) rnk
FROM yourTable
GROUP BY user_id
)
SELECT user_id, cnt AS fruits
FROM cte
WHERE rnk = 1;
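To see how RANK() keeps ties, here is a small runnable sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or later for window functions (on the MySQL side they need 8.0 or later). The extra rows for users 2 and 3 are made up, and the counting is split into its own CTE, which is only a stylistic variation on the query above.

```python
import sqlite3

# Assumption: SQLite >= 3.25 stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (user_id INTEGER, fruits TEXT)")
conn.executemany(
    "INSERT INTO yourTable (user_id, fruits) VALUES (?, ?)",
    [
        (1, "apple"), (1, "orange"), (1, "pear"),
        (2, "apple"), (2, "orange"), (2, "pear"),
        (3, "apple"),
    ],
)

# Step 1: count fruits per user. Step 2: rank the counts, highest first.
# RANK() gives tied counts the same rank, so every top user gets rnk = 1.
top_users = conn.execute(
    """
    WITH counts AS (
        SELECT user_id, COUNT(*) AS cnt
        FROM yourTable
        GROUP BY user_id
    ),
    ranked AS (
        SELECT user_id, cnt, RANK() OVER (ORDER BY cnt DESC) AS rnk
        FROM counts
    )
    SELECT user_id, cnt AS fruits
    FROM ranked
    WHERE rnk = 1
    ORDER BY user_id
    """
).fetchall()

print(top_users)
```

Users 1 and 2 both have three fruits, so both come back; with LIMIT 1 only one of them would.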
Here's one answer (with sample data):
CREATE TABLE something (user_id INT NOT NULL, fruits VARCHAR(10) NOT NULL, PRIMARY KEY (user_id, fruits));
INSERT INTO something VALUES (1, 'apple');
INSERT INTO something VALUES (1, 'orange');
INSERT INTO something VALUES (2, 'apple');
INSERT INTO something VALUES (1, 'pear');
INSERT INTO something VALUES (2, 'orange');
INSERT INTO something VALUES (2, 'pear');
SELECT user_id, COUNT(*) AS cnt
FROM something
GROUP BY user_id
HAVING COUNT(*) >= ALL (SELECT COUNT(*) FROM something GROUP BY user_id);
The table structure is as below,
My first SQL query is as below,
SELECT DISTINCT(IndustryVertical)
, COUNT(IndustryVertical) AS IndustryVerticalCount
, City
FROM `records`
WHERE City!=''
GROUP
BY IndustryVertical
, City
ORDER
BY `IndustryVerticalCount` DESC
by running the above query I'm getting the below,
What I'm trying to achieve is to get the List of all the DISTINCT CITY with ONLY ONE MAX(IndustryVerticalCount) and IndustryVertical.
Tried several things with no hope.
Anyone, please guide me.
There are several records for each City value. What I'm trying to get is:
All the distinct City Values
The MAX COUNT of industryVertical
Name of industryVertical
The record I'm getting is as below,
What I'm trying to get,
The above record is for reference purposes. Here you can see only distinct city values, each with the one vertical name that has the max count.
Since you are using group by, it will automatically select only distinct rows. Since you are using group by on two columns, you will get rows in which only combination of both columns is distinct.
What you now have to do is use this resulting table, and perform a query on it to find the maximum count grouped by city.
SELECT IndustryVertical, IndustryVerticalCount, City
FROM (
    SELECT IndustryVertical
         , COUNT(IndustryVertical) AS IndustryVerticalCount
         , City
    FROM `records`
    WHERE City != ''
    GROUP BY IndustryVertical, City
) AS tbl
WHERE IndustryVerticalCount IN (
    SELECT MAX(IndustryVerticalCount)
    FROM (
        SELECT IndustryVertical
             , COUNT(IndustryVertical) AS IndustryVerticalCount
             , City
        FROM `records`
        WHERE City != ''
        GROUP BY IndustryVertical, City
    ) AS tbl2
    WHERE tbl.City = tbl2.City
)
This may not be the most efficient method, but I think it will work.
How about this? I think it should work (T-SQL syntax, using a table variable that already holds the counts):
DECLARE @DataSet TABLE (
City VARCHAR(50),
IndustryVertical VARCHAR(50),
IndustryVerticalCount INT
)
INSERT INTO @DataSet SELECT 'Bangalore', 'Consumer Internet', 279
INSERT INTO @DataSet SELECT 'Bangalore', 'Technology', 269
INSERT INTO @DataSet SELECT 'Bangalore', 'Logistics', 179
INSERT INTO @DataSet SELECT 'Mumbai', 'Technology', 194
INSERT INTO @DataSet SELECT 'Mumbai', 'Consumer Internet', 89
SELECT
table_a.*
FROM @DataSet table_a
LEFT JOIN @DataSet table_b
ON table_a.City = table_b.City
AND table_a.IndustryVerticalCount < table_b.IndustryVerticalCount
WHERE table_b.IndustryVerticalCount IS NULL
I think you simply want a HAVING clause:
SELECT r.IndustryVertical,
COUNT(*) AS IndustryVerticalCount,
r.City
FROM records r
WHERE r.City <> ''
GROUP BY r.IndustryVertical, r.City
HAVING COUNT(*) = (SELECT COUNT(*)
FROM records r2
WHERE r2.City = r.City
GROUP BY r2.IndustryVertical
ORDER BY COUNT(*) DESC
LIMIT 1
)
ORDER BY IndustryVerticalCount DESC;
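As a cross-check of the "count per city and vertical, then keep the per-city maximum" idea, here is a minimal runnable sketch with Python's built-in sqlite3 module. The handful of records is invented for illustration, and it uses a CTE joined to its own per-city MAX rather than the correlated HAVING subquery above, so treat it as one possible formulation rather than the same query.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (City TEXT, IndustryVertical TEXT)")
conn.executemany(
    "INSERT INTO records (City, IndustryVertical) VALUES (?, ?)",
    [
        ("Bangalore", "Technology"), ("Bangalore", "Technology"),
        ("Bangalore", "Logistics"),
        ("Mumbai", "Consumer Internet"), ("Mumbai", "Consumer Internet"),
        ("Mumbai", "Consumer Internet"), ("Mumbai", "Technology"),
        ("", "Technology"),  # empty city, filtered out below
    ],
)

# counts: one row per (city, vertical) with its count.
# The join keeps only the rows whose count equals the city's maximum.
top_per_city = conn.execute(
    """
    WITH counts AS (
        SELECT City, IndustryVertical, COUNT(*) AS IndustryVerticalCount
        FROM records
        WHERE City <> ''
        GROUP BY City, IndustryVertical
    )
    SELECT c.City, c.IndustryVertical, c.IndustryVerticalCount
    FROM counts AS c
    INNER JOIN (
        SELECT City, MAX(IndustryVerticalCount) AS max_count
        FROM counts
        GROUP BY City
    ) AS m ON m.City = c.City AND m.max_count = c.IndustryVerticalCount
    ORDER BY c.City
    """
).fetchall()

print(top_per_city)
```

If two verticals tie for the maximum in a city, both rows are returned, so an extra tie-break rule would be needed to get strictly one row per city.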
TABLE [tbl_hobby]
person_id (int) , hobby_id(int)
has many records. I want a SQL query that finds all pairs of person_id values that have exactly the same hobbies (the same set of hobby_id values).
If A has hobby_id 1, B has it too; if A doesn't have hobby_id 2, B doesn't have it either. In that case we output the person_ids of A and B.
If A, B and C all meet the condition, we output A & B, B & C and A & C.
I've solved it with a very clumsy method: multiple self-joins of the table plus multiple sub-queries. And of course my team lead laughed at it.
Is there a high-performance way to do this in SQL?
I have been thinking hard about this for the last 36 hours...
sample data in mysql dump
CREATE TABLE `tbl_hobby` (
`person_id` int(11) NOT NULL,
`hobby_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `tbl_hobby` (`person_id`, `hobby_id`) VALUES
(1, 1),(1, 2),(1, 3),(1, 4),(1, 5),(2, 2),
(2, 3),(2, 4),(3, 1),(3, 2),(3, 3),(3, 4),
(4, 1),(4, 3),(4, 4),(5, 1),(5, 5),(5, 9),
(6, 2),(6, 3),(6, 4),(7, 1),(7, 3),(7, 7),
(8, 2),(8, 3),(8, 4),(9, 1),(9, 2),(9, 3),
(9, 4),(10, 1),(10, 5),(10, 9),(10, 11);
COMMIT;
Expected result: (2, 6 and 8 are the same; 3 and 9 are the same)
2,6
2,8
6,8
3,9
The order of the result records and the order of the two numbers within a record are not important. Results in one column or in two columns are both accepted, since they can easily be concatenated or separated.
Aggregate per person to get a string of each person's hobbies. Then aggregate per hobby list to find out which lists belong to more than one person.
select hobbies, group_concat(person_id order by person_id) as persons
from
(
select person_id, group_concat(hobby_id order by hobby_id) as hobbies
from tbl_hobby
group by person_id
) persons
group by hobbies
having count(*) > 1
order by hobbies;
This gives a list of persons per hobby list, which is the easiest way to output a solution, as we would otherwise have to build all possible pairs.
UPDATE: If you want pairs, you'll have to query the table twice:
select p1.person_id as person1, p2.person_id as person2
from
(
select person_id, group_concat(hobby_id order by hobby_id) as hobbies
from tbl_hobby
group by person_id
) p1
join
(
select person_id, group_concat(hobby_id order by hobby_id) as hobbies
from tbl_hobby
group by person_id
) p2 on p2.person_id > p1.person_id and p2.hobbies = p1.hobbies
order by person1, person2;
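The same two-step idea (one hobby set per person, then group people by identical set) can be checked outside the database. The sketch below does it in plain Python on the sample data from the question; it is only an illustration of the logic, not a replacement for the SQL.

```python
from collections import defaultdict
from itertools import combinations

# (person_id, hobby_id) rows copied from the question's sample data.
rows = [
    (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (2, 2),
    (2, 3), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4),
    (4, 1), (4, 3), (4, 4), (5, 1), (5, 5), (5, 9),
    (6, 2), (6, 3), (6, 4), (7, 1), (7, 3), (7, 7),
    (8, 2), (8, 3), (8, 4), (9, 1), (9, 2), (9, 3),
    (9, 4), (10, 1), (10, 5), (10, 9), (10, 11),
]

# Step 1: aggregate per person (the inner GROUP BY person_id).
hobbies_by_person = defaultdict(set)
for person_id, hobby_id in rows:
    hobbies_by_person[person_id].add(hobby_id)

# Step 2: aggregate per hobby set (the outer GROUP BY hobbies).
persons_by_hobby_set = defaultdict(list)
for person_id, hobbies in hobbies_by_person.items():
    persons_by_hobby_set[frozenset(hobbies)].append(person_id)

# Step 3: expand every group of two or more persons into pairs.
pairs = sorted(
    pair
    for persons in persons_by_hobby_set.values()
    for pair in combinations(sorted(persons), 2)
)

print(pairs)
```

A frozenset plays the role of the ordered GROUP_CONCAT string here: two people land in the same group only when their hobby sets are identical.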
Alternative version, without using any proprietary string handling:
select distinct t1.person_id, t2.person_id
from tbl_hobby t1
join tbl_hobby t2
on t1.person_id < t2.person_id
where 2 = all (select count(*)
from tbl_hobby
where person_id in (t1.person_id, t2.person_id)
group by hobby_id);
Perhaps less efficient, but portable!
my table:
drop table if exists new_table;
create table if not exists new_table(
obj_type int(4),
user_id varchar(30),
payer_id varchar(30)
);
insert into new_table (obj_type, user_id, payer_id) values
(1, 'user1', 'payer1'),
(1, 'user2', 'payer1'),
(2, 'user3', 'payer1'),
(1, 'user1', 'payer2'),
(1, 'user2', 'payer2'),
(2, 'user3', 'payer2'),
(3, 'user1', 'payer3'),
(3, 'user2', 'payer3');
I am trying to select all the payer id's whose obj_type is only one value and not any other values. In other words, even though each payer has multiple users, I only want the payers who are only using one obj_type.
I have tried using a query like this:
select * from new_table
where obj_type = 1
group by payer_id;
But this returns rows whose payers also have other users with other obj_types. I am trying to get a result that looks like:
obj | user | payer
----|-------|--------
3 | user1 | payer3
3 | user2 | payer3
Thanks in advance.
That is actually easy:
SELECT payer_id
FROM new_table
GROUP BY payer_id
HAVING COUNT(DISTINCT obj_type) = 1
HAVING filters rows just like WHERE, but it does so after the aggregation.
The difference is best explained by an example:
SELECT dept_id, SUM(salary)
FROM employees
WHERE salary > 100000
GROUP BY dept_id
This will give you, per department, the sum of the salaries of the people earning more than 100000 each.
SELECT dept_id, SUM(salary)
FROM employees
GROUP BY dept_id
HAVING SUM(salary) > 100000
The second query will give you the departments where all employees together earn more than 100000 even if no single employee earns that much.
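To make the WHERE versus HAVING difference concrete, here is a small runnable sketch using Python's built-in sqlite3 module. The employees table and its salaries are made up for the example; the HAVING variant filters on SUM(salary), i.e. on the aggregated value.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (dept_id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees (dept_id, salary) VALUES (?, ?)",
    [(10, 60000), (10, 70000), (20, 120000), (20, 50000), (30, 40000)],
)

# WHERE filters individual rows before they are grouped.
where_result = conn.execute(
    """
    SELECT dept_id, SUM(salary)
    FROM employees
    WHERE salary > 100000
    GROUP BY dept_id
    ORDER BY dept_id
    """
).fetchall()

# HAVING filters whole groups after the aggregation.
having_result = conn.execute(
    """
    SELECT dept_id, SUM(salary)
    FROM employees
    GROUP BY dept_id
    HAVING SUM(salary) > 100000
    ORDER BY dept_id
    """
).fetchall()

print(where_result)
print(having_result)
```

Department 10 shows up only in the HAVING result: nobody there earns more than 100000 alone, but together they do.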
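If you want to return all rows without grouping them you can use analytic functions. Note that COUNT(DISTINCT ...) OVER is accepted by Oracle but not by MySQL, where the IN variant further down is the one to use:
SELECT * FROM (
SELECT obj_type, user_id,
payer_id,
COUNT(DISTINCT obj_type) OVER (PARTITION BY payer_id) AS distinct_obj_type
FROM new_table) t
WHERE distinct_obj_type = 1
Or you can use IN with the grouped query above as a subquery: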
SELECT *
FROM new_table
WHERE payer_id IN (SELECT payer_id
FROM new_table
GROUP BY payer_id
HAVING COUNT(DISTINCT obj_type) = 1)
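The IN variant can be checked against the sample data from the question. The sketch below runs it through Python's built-in sqlite3 module, which is only a stand-in for the real database; an ORDER BY is added so the output is deterministic.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows are the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE new_table (obj_type INTEGER, user_id TEXT, payer_id TEXT)")
conn.executemany(
    "INSERT INTO new_table (obj_type, user_id, payer_id) VALUES (?, ?, ?)",
    [
        (1, "user1", "payer1"), (1, "user2", "payer1"), (2, "user3", "payer1"),
        (1, "user1", "payer2"), (1, "user2", "payer2"), (2, "user3", "payer2"),
        (3, "user1", "payer3"), (3, "user2", "payer3"),
    ],
)

# The subquery keeps payers with exactly one distinct obj_type;
# the outer query returns all of their rows, ungrouped.
single_type_rows = conn.execute(
    """
    SELECT obj_type, user_id, payer_id
    FROM new_table
    WHERE payer_id IN (SELECT payer_id
                       FROM new_table
                       GROUP BY payer_id
                       HAVING COUNT(DISTINCT obj_type) = 1)
    ORDER BY user_id
    """
).fetchall()

for row in single_type_rows:
    print(row)
```

Only payer3 qualifies, because payer1 and payer2 each use both obj_type 1 and obj_type 2.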
I have the following table:
CREATE TABLE entries(
`id` INT UNSIGNED AUTO_INCREMENT,
`level` INT UNSIGNED,
`type` CHAR(2),
`attribute` INT UNSIGNED,
PRIMARY KEY(id)
);
From this table, I'm currently doing the same query for 3 different columns:
SELECT level, COUNT(*) FROM entries GROUP BY level;
SELECT type, COUNT(*) FROM entries GROUP BY type;
SELECT attribute, COUNT(*) FROM entries GROUP BY attribute;
I know I can use GROUP_CONCAT to get the DISTINCT entries for each of these in a single SQL call:
SELECT GROUP_CONCAT(DISTINCT level) AS levels, GROUP_CONCAT(DISTINCT type) AS types, GROUP_CONCAT(attribute) AS attributes FROM entries;
But can I manipulate this query to include the counts? OR is there a different way that I can get the distinct values and counts for these columns in a single call?
EDIT: here's some data to add to the table
INSERT INTO entries (level, type, attribute) VALUES (1, 'VA', 5), (1, 'CD', NULL), (NULL, 'VA', 3), (NULL, 'CD', NULL), (1, 'VA', 1);
And the sample output
LEVELS LEVEL_COUNTS TYPES TYPES_COUNTS ATTRIBUTES ATTRIBUTES_COUNTS
1 3 VA,CD 3,2 5,3,1 1,1,1
You can use the below query. The only things remaining are to add some column aliases, and to maybe add a condition to ignore rows where there is NULL.
SELECT *
FROM
(SELECT GROUP_CONCAT(lvlCount.level) as LEVELS,
GROUP_CONCAT(lvlCount.cnt) as LEVELS_COUNTS
FROM
(SELECT LEVEL,
COUNT(*) AS cnt
FROM entries where NOT(LEVEL IS NULL)
GROUP BY LEVEL
ORDER BY LEVEL DESC) AS lvlCount) AS LEVEL,
(SELECT GROUP_CONCAT(typeCount.type) as TYPES,
GROUP_CONCAT(typeCount.cnt) as TYPES_COUNTS
FROM
(SELECT TYPE,
COUNT(*) AS cnt
FROM entries where NOT(TYPE IS NULL)
GROUP BY TYPE
ORDER BY TYPE DESC) AS typeCount) AS TYPE,
(SELECT GROUP_CONCAT(attrCount.attribute) as ATTRIBUTES,
GROUP_CONCAT(attrCount.cnt) as ATTRIBUTES_COUNTS
FROM
(SELECT attribute,
COUNT(*) AS cnt
FROM entries where NOT(attribute IS NULL)
GROUP BY attribute
ORDER BY attribute DESC) AS attrCount) AS attribute;
SQLFiddle: http://sqlfiddle.com/#!2/4ea92/44