The table structure is as below,
My first SQL query is as below,
SELECT DISTINCT(IndustryVertical)
, COUNT(IndustryVertical) AS IndustryVerticalCount
, City
FROM `records`
WHERE City!=''
GROUP
BY IndustryVertical
, City
ORDER
BY `IndustryVerticalCount` DESC
Running the above query, I get the result below.
What I'm trying to achieve is a list of all the DISTINCT cities, each with ONLY ONE row: the IndustryVertical that has the MAX(IndustryVerticalCount) for that city.
Tried several things with no hope.
Anyone, please guide me.
There are several records for each City value. What I'm trying to get is:
All the distinct City Values
The MAX COUNT of industryVertical
Name of industryVertical
The record I'm getting is as below,
What I'm trying to get,
The above record is for reference purposes. Here you can see only distinct city values, each with the one vertical name that has the max count.
Since you are using group by, it will automatically select only distinct rows. Since you are using group by on two columns, you will get rows in which only combination of both columns is distinct.
What you now have to do is use this resulting table, and perform a query on it to find the maximum count grouped by city.
SELECT IndustryVertical, IndustryVerticalCount, City
FROM (
    SELECT IndustryVertical
         , COUNT(IndustryVertical) AS IndustryVerticalCount
         , City
    FROM `records`
    WHERE City != ''
    GROUP BY IndustryVertical, City
) AS tbl
WHERE IndustryVerticalCount IN (
    SELECT MAX(IndustryVerticalCount)
    FROM (
        SELECT IndustryVertical
             , COUNT(IndustryVertical) AS IndustryVerticalCount
             , City
        FROM `records`
        WHERE City != ''
        GROUP BY IndustryVertical, City
    ) AS tbl2
    WHERE tbl.City = tbl2.City
)
This may not be the most efficient method, but I think it will work.
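To see the idea end to end, here is a minimal sketch that runs the same "count per (vertical, city), then keep the per-city maximum" logic against a tiny in-memory SQLite database from Python. The sample rows are made up for illustration, and the grouped counts are written once as a CTE named counts (an assumption of this sketch; CTEs need MySQL 8.0+ if you port it back).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (IndustryVertical TEXT, City TEXT)")

# Hypothetical sample data: one row per record, as in the question.
sample = (
    [("Consumer Internet", "Bangalore")] * 3
    + [("Technology", "Bangalore")] * 2
    + [("Technology", "Mumbai")] * 2
    + [("Logistics", "Mumbai")]
    + [("Technology", "")]  # blank city, filtered out by the WHERE clause
)
conn.executemany("INSERT INTO records VALUES (?, ?)", sample)

query = """
WITH counts AS (
    -- one row per (vertical, city) with its record count
    SELECT IndustryVertical, City, COUNT(*) AS IndustryVerticalCount
    FROM records
    WHERE City <> ''
    GROUP BY IndustryVertical, City
)
SELECT c.City, c.IndustryVertical, c.IndustryVerticalCount
FROM counts AS c
WHERE c.IndustryVerticalCount = (
    -- the highest count seen in the same city
    SELECT MAX(c2.IndustryVerticalCount)
    FROM counts AS c2
    WHERE c2.City = c.City
)
ORDER BY c.City
"""
top_per_city = conn.execute(query).fetchall()
print(top_per_city)
```

Note that if two verticals tie for the maximum in a city, both rows are returned.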
How about this? I think it should work (SQL Server syntax, using a table variable):
DECLARE @DataSet TABLE (
City VARCHAR(50),
IndustryVertical VARCHAR(50),
IndustryVerticalCount INT
)
INSERT INTO @DataSet SELECT 'Bangalore', 'Consumer Internet', 279
INSERT INTO @DataSet SELECT 'Bangalore', 'Technology', 269
INSERT INTO @DataSet SELECT 'Bangalore', 'Logistics', 179
INSERT INTO @DataSet SELECT 'Mumbai', 'Technology', 194
INSERT INTO @DataSet SELECT 'Mumbai', 'Consumer Internet', 89
SELECT
table_a.*
FROM @DataSet table_a
LEFT JOIN @DataSet table_b
ON table_a.City = table_b.City
AND table_a.IndustryVerticalCount < table_b.IndustryVerticalCount
WHERE table_b.IndustryVerticalCount IS NULL
I think you simply want a HAVING clause:
SELECT r.IndustryVertical,
COUNT(*) AS IndustryVerticalCount,
r.City
FROM records r
WHERE r.City <> ''
GROUP BY r.IndustryVertical, r.City
HAVING COUNT(*) = (SELECT COUNT(*)
FROM records r2
WHERE r2.City = r.City
GROUP BY r2.IndustryVertical
ORDER BY COUNT(*) DESC
LIMIT 1
)
ORDER BY IndustryVerticalCount DESC;
Related
========================================================
this is the sample db
I just want to get the user who has both 2 and 14 in the skills column. The answer should be "2".
Try this:
SELECT seekerID
FROM mytable
WHERE skillID IN (2, 14)
GROUP BY seekerID
HAVING COUNT(DISTINCT skillID) = 2
The DISTINCT keyword is necessary only if skillID values can occur multiple times for a single seekerID.
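A small sketch of why DISTINCT matters here, run against an in-memory SQLite database from Python. The rows are invented for illustration: seeker 1 has skill 2 recorded twice and never has skill 14, so a plain COUNT(*) would wrongly let that seeker through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (seekerID INTEGER, skillID INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (1, 2), (1, 2),   # skill 2 twice, skill 14 never
        (2, 2), (2, 14),  # both required skills
        (3, 14), (3, 7),  # only one of the required skills
    ],
)

template = """
SELECT seekerID
FROM mytable
WHERE skillID IN (2, 14)
GROUP BY seekerID
HAVING {count_expr} = 2
ORDER BY seekerID
"""

# Counting distinct skills: only seekers holding both 2 and 14 qualify.
with_distinct = conn.execute(
    template.format(count_expr="COUNT(DISTINCT skillID)")
).fetchall()

# Counting rows: the duplicated (1, 2) rows also add up to 2.
without_distinct = conn.execute(
    template.format(count_expr="COUNT(*)")
).fetchall()

print(with_distinct, without_distinct)
```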
The easiest way to do this would be
select seekerID, count(*) as cnt
from table_name
where skillid in (2,14)
group by seekerID
having cnt = 2
Use this:
select seekerID from table_name where skillid="2" and seekerID in ( select seekerID from table_name where skillid="14")
Is it possible to select distinct company names from the customer table while also displaying the related IDs?
at the minute I'm using
SELECT company,id, COUNT(*) as count FROM customers GROUP BY company HAVING COUNT(*) > 1;
which returns
MyDuplicateCompany1 64 2
MyDuplicateCompany2 20 3
MyDuplicateCompany6 175 2
but what I'm after is all the duplicate ID's for each.
so
CompanyName, TimesDuplicated, DuplicateId1, DuplicateId2, DuplicateId3
or a row for each so
MyDuplicateCompany1, DuplicateId1, TimesDuplicated
MyDuplicateCompany1, DuplicateId2, TimesDuplicated
MyDuplicateCompany2, DuplicateId1, TimesDuplicated
MyDuplicateCompany2, DuplicateId2, TimesDuplicated
MyDuplicateCompany2, DuplicateId3, TimesDuplicated
is this possible?
Not sure if this would be acceptable, but MySQL has a function, GROUP_CONCAT(field), that combines the values from multiple rows into a single string per group. That gives you one row per company, with all of its IDs listed in one column:
SELECT company
, COUNT(*) as count
, group_concat(ID) as DupCompanyIDs
FROM customers
GROUP BY company
HAVING COUNT(*) > 1;
SQL Fiddle
showing similar results with duplicate companies listed in one field.
If you need it in multiple columns or multiple rows, you could wrap the above as an inline view and inner join it back to customers on the name to list the duplicates and times duplicated.
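The "inline view joined back to customers" variant could look roughly like the sketch below, which produces one row per duplicate id. It runs on an in-memory SQLite database from Python; the alias d and the column name times_duplicated are names chosen for this example, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, company TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO customers (id, company) VALUES (?, ?)",
    [
        (1, "MyDuplicateCompany1"),
        (2, "MyDuplicateCompany1"),
        (3, "MyUniqueCompany"),
        (4, "MyDuplicateCompany2"),
        (5, "MyDuplicateCompany2"),
        (6, "MyDuplicateCompany2"),
    ],
)

query = """
SELECT c.company, c.id, d.times_duplicated
FROM customers AS c
JOIN (
    -- inline view: companies that appear more than once
    SELECT company, COUNT(*) AS times_duplicated
    FROM customers
    GROUP BY company
    HAVING COUNT(*) > 1
) AS d ON d.company = c.company
ORDER BY c.company, c.id
"""
duplicates = conn.execute(query).fetchall()
for row in duplicates:
    print(row)
```

Companies that occur only once drop out because the inner join finds no matching row in the inline view.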
You can use GROUP_CONCAT(id) to concatenate the ids, separated by commas. Your query should be:
SELECT company, GROUP_CONCAT(id) as ids, COUNT(id) as cant FROM customers GROUP BY company HAVING cant > 1
You can test the query with this
CREATE TABLE IF NOT EXISTS `customers` (
`id` int(11) NOT NULL,
`company` varchar(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `customers` (`id`, `company`) VALUES
(1, 'MyDuplicateCompany1'),
(2, 'MyDuplicateCompany1'),
(3, 'MyDuplicateCompany1'),
(4, 'MyDuplicateCompany2'),
(5, 'MyDuplicateCompany2'),
(6, 'MyDuplicateCompany3'),
(7, 'MyDuplicateCompany3'),
(8, 'MyDuplicateCompany3'),
(9, 'MyDuplicateCompany3'),
(10, 'MyDuplicateCompany4');
Output:
Read more at:
http://monksealsoftware.com/mysql-group_concat-and-postgres-array_agg/
You are not looking for companies with more than 1 entry (GROUP BY company), but for duplicate company IDs (GROUP BY company, id):
SELECT company, id, COUNT(*)
FROM customers
GROUP BY company, id
HAVING COUNT(*) > 1;
This should give exactly what you're looking for without GROUP_CONCAT()
SELECT
company, id,
( SELECT COUNT(*) from customers AS b
WHERE a.company = b.company
) AS cnt
FROM customers AS a
GROUP BY company, id
HAVING cnt > 1
;
Note: GROUP_CONCAT does the same thing, just all in one row per company.
We read values from a set of sensors. Occasionally a reading or two is lost for a particular sensor, so now and again I run a query to see if all sensors have the same record count.
GROUP BY sensor_id HAVING COUNT(*) != xxx;
So I run a query once to visually get a value of xxx and then run it again to see if any vary.
But is there any clever way of doing this automatically in a single query?
You could do:
HAVING COUNT(*) != (SELECT MAX(count) FROM (
SELECT COUNT(*) AS count FROM my_table GROUP BY sensor_id
) t)
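Put together as a complete statement, the first suggestion might look like the sketch below, run on an in-memory SQLite database from Python. The reading column and the sample rows are assumptions made for the example; only sensor_id and my_table come from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (sensor_id INTEGER, reading REAL)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (1, 20.1), (1, 20.3), (1, 20.2),
        (2, 18.9), (2, 19.0), (2, 19.1),
        (3, 22.4), (3, 22.5),  # sensor 3 lost one reading
    ],
)

query = """
SELECT sensor_id, COUNT(*) AS n
FROM my_table
GROUP BY sensor_id
HAVING COUNT(*) <> (
    -- the largest per-sensor count is taken as the expected value
    SELECT MAX(n)
    FROM (
        SELECT COUNT(*) AS n FROM my_table GROUP BY sensor_id
    ) AS t
)
"""
short_sensors = conn.execute(query).fetchall()
print(short_sensors)
```

This treats the maximum count as the "complete" count, so it reports nothing when every sensor has lost the same number of readings.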
Or else group again by the count in each group (and ignore the first result):
SELECT count, GROUP_CONCAT(sensor_id) AS sensors
FROM (
SELECT sensor_id, COUNT(*) AS count FROM my_table GROUP BY sensor_id
) t
GROUP BY count
ORDER BY count DESC
LIMIT 1, 18446744073709551615
SELECT sensor_id,COUNT(*) AS count
FROM table
GROUP BY sensor_id
ORDER BY count
This shows each sensor_id along with a count of all the records it has; you can then manually check whether any vary.
SELECT * FROM (
SELECT sensor_id,COUNT(*) AS count
FROM table
GROUP BY sensor_id
) AS t1
GROUP BY count
This shows all the distinct counts, but the GROUP BY loses the information about which sensor_ids have which counts.
---EDIT---
I've taken a bit from both my answer and eggyal's and created this. The count that is most frequent gets the id 'default', and any ids whose counts stand out get separate rows. This way you keep the readability of a table when there are many results (the multi-row case), but also get a simple single row when all counts are the same (the one-row case). If, however, you are happy with the concatenated strings, then go with eggyal's answer.
Might be a bit over the top but here goes:
select 'default' as id,t5.c1 as count from(
select id,count(*) as c1 from your_table group by id having count(*)=
(select t4.count from
(
select max(t3.count2) as max,t3.count as count from
(
select count(*) as count2,t2.count from
(
SELECT id,COUNT(*) AS count
FROM your_table
GROUP BY id
) as t2
GROUP BY count
) as t3
) as t4)) as t5 group by count
union all
select t5.id as id,t5.c1 as count from(
select id,count(*) as c1 from your_table group by id having count(*)<>
(select t4.count from
(
select max(t3.count2) as max,t3.count as count from
(
select count(*) as count2,t2.count from
(
SELECT id,COUNT(*) AS count
FROM your_table
GROUP BY id
) as t2
GROUP BY count
) as t3
) as t4)) as t5
I have a table with 3 columns: id, date and name. What I am looking for is to delete the records that have a duplicate name. The rule should be to keep the record that has the oldest date. For instance, in the example below there are 3 records with the name Paul, so I would like to keep the one that has the oldest date (id = 1) and remove all the others (id = 4 and 6). I know how to write insert, update, etc. queries, but here I do not see how to make the trick work.
id, date, name
1, 2012-03-10, Paul
2, 2012-03-10, James
4, 2012-03-12, Paul
5, 2012-03-11, Ricardo
6, 2012-03-13, Paul
mysql_query(?);
The best suggestion I can give you is create a unique index on name and avoid all the trouble.
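Follow steps 2 and 3 as Peter Kiss described. Then do this:
ALTER TABLE tablename ADD UNIQUE INDEX name (name)
Then follow step 4: insert everything from the temporary table into the original.
With the unique index in place, duplicate rows are rejected on insert; in MySQL, use INSERT IGNORE so they are skipped instead of aborting the statement.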
1. Select all the records that you want to keep
2. Insert them into a temporary table
3. Delete everything from the original table
4. Insert everything from the temporary table into the original
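The steps above can be sketched as follows, using an in-memory SQLite database from Python and the sample rows from the question. The table name people and the temporary table name keepers are made up for this example, and "records to keep" is taken to mean the row with the oldest date for each name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, date TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "2012-03-10", "Paul"),
        (2, "2012-03-10", "James"),
        (4, "2012-03-12", "Paul"),
        (5, "2012-03-11", "Ricardo"),
        (6, "2012-03-13", "Paul"),
    ],
)

# Steps 1 and 2: select the rows to keep and store them in a temporary table.
conn.execute("""
CREATE TEMP TABLE keepers AS
SELECT p.id, p.date, p.name
FROM people AS p
WHERE p.date = (SELECT MIN(p2.date) FROM people AS p2 WHERE p2.name = p.name)
""")

# Step 3: empty the original table.
conn.execute("DELETE FROM people")

# Step 4: copy the kept rows back.
conn.execute("INSERT INTO people SELECT id, date, name FROM keepers")
conn.execute("DROP TABLE keepers")

remaining = conn.execute("SELECT id, date, name FROM people ORDER BY id").fetchall()
print(remaining)
```

If two rows share the same name and the same oldest date, this selection keeps both, so a tie-breaker (for example the lowest id) would still be needed.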
Like Matt, but without the join:
DELETE FROM `table` WHERE `id` NOT IN (
SELECT `id` FROM (
SELECT `id` FROM `table` GROUP BY `name` ORDER BY `date`
) as A
)
Without the first SELECT you will get "You can't specify target table 'table' for update in FROM clause"
Something like this would work:
DELETE FROM tablename WHERE id NOT IN (
    SELECT tablename.id FROM (
        /* select the minimum date and the name, for each name */
        SELECT MIN(date) AS dateCol, name FROM tablename GROUP BY name
    ) AS MyInnerQuery
    /* select the id by joining on the minimum date and the name */
    INNER JOIN tablename ON MyInnerQuery.dateCol = tablename.date
        AND MyInnerQuery.name = tablename.name
) /* delete everything that isn't in the list of ids with the minimum date for each name */
DELETE t
FROM tableX AS t
LEFT JOIN
( SELECT name
, MIN(date) AS first_date
FROM tableX
GROUP BY name
) AS grp
ON grp.name = t.name
AND grp.first_date = t.date
WHERE
grp.name IS NULL
DELETE FROM thetable tt
WHERE EXISTS (
SELECT *
FROM thetable tx
WHERE tx.thename = tt.thename
AND tx.thedate < tt.thedate -- an older row with the same name exists
);
(Note that "date" is a reserved word (a type) in SQL, and "name" is a reserved word in some SQL implementations.)
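As a quick check of the correlated EXISTS approach, the sketch below deletes every row for which an older row with the same name exists, so only the oldest row per name survives. It runs on an in-memory SQLite database from Python with the question's sample rows; the column names thedate and thename follow the non-reserved naming used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE thetable (id INTEGER PRIMARY KEY, thedate TEXT, thename TEXT)"
)
conn.executemany(
    "INSERT INTO thetable VALUES (?, ?, ?)",
    [
        (1, "2012-03-10", "Paul"),
        (2, "2012-03-10", "James"),
        (4, "2012-03-12", "Paul"),
        (5, "2012-03-11", "Ricardo"),
        (6, "2012-03-13", "Paul"),
    ],
)

# Delete a row when some other row has the same name and an earlier date.
conn.execute("""
DELETE FROM thetable
WHERE EXISTS (
    SELECT 1
    FROM thetable AS tx
    WHERE tx.thename = thetable.thename
      AND tx.thedate < thetable.thedate
)
""")

surviving_ids = [row[0] for row in conn.execute("SELECT id FROM thetable ORDER BY id")]
print(surviving_ids)
```

ISO-formatted date strings compare correctly as text here; with a real DATE column the comparison works the same way.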
I have a table like this:
jobid, orderid
And with some data inside:
jobid, orderid
1245, 6767
1235, 9058
6783, 6767
4991, 6767
9512, 9058
5123, 1234
Now I want the following output:
jobid, orderid, orderid(total)
1245, 6767, 3
1235, 9058, 2
6783, 6767, 3
4991, 6767, 3
9512, 9058, 2
5123, 1234, 1
Now, COUNT() doesn't work the way I want it to, and I probably need some GROUP BY, but I don't know how.
Thanks in advance.
It looks like you're trying to get rows which look like jobid, orderid, number of times that orderid appears. For that, you could use a subquery:
SELECT jobid, orderid,
(SELECT COUNT(*) FROM
MY_TABLE INR
WHERE INR.orderid = OTR.orderid) as "orderid(total)"
FROM MY_TABLE OTR
Why are you doing it this way? You will be doing a lot of redundant counting and putting a lot of unnecessary pressure on your server. I would do this with two queries:
SELECT jobid, orderid FROM my_table
to get the complete list, and then:
SELECT orderid, COUNT(*) FROM my_table GROUP BY orderid
to get the total count for each orderid. Then combine these two results in your application. This avoids re-counting the same orderid for every row, so it should be faster than the correlated subquery.
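Combining the two result sets in the application could look like the sketch below, which uses Python with an in-memory SQLite database and the sample data from the question. The variable names totals, jobs and combined are just for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (jobid INTEGER, orderid INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (1245, 6767),
        (1235, 9058),
        (6783, 6767),
        (4991, 6767),
        (9512, 9058),
        (5123, 1234),
    ],
)

# Query 1: the complete list, in insertion order.
jobs = conn.execute(
    "SELECT jobid, orderid FROM my_table ORDER BY rowid"
).fetchall()

# Query 2: one total per orderid, turned into a lookup table.
totals = dict(
    conn.execute(
        "SELECT orderid, COUNT(*) FROM my_table GROUP BY orderid"
    ).fetchall()
)

# Combine in the application: attach each order's total to its jobs.
combined = [(jobid, orderid, totals[orderid]) for jobid, orderid in jobs]
for row in combined:
    print(row)
```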
SELECT jobid, orderid, count(orderid)
FROM sometable
GROUP BY orderid, jobid
SELECT t.jobid
, t.orderid
, grp.orderid_total
FROM
tableX AS t
JOIN
( SELECT orderid
, COUNT(*) AS orderid_total
FROM tableX
GROUP BY orderid
) AS grp
ON grp.orderid = t.orderid
select jobid, orderid, count(*) from table group by orderid;