MySQL count duplicate row - mysql

I have a table like this:
jobid, orderid
And with some data inside:
jobid, orderid
1245, 6767
1235, 9058
6783, 6767
4991, 6767
9512, 9058
5123, 1234
Now I want the following output:
jobid, orderid, orderid(total)
1245, 6767, 3
1235, 9058, 2
6783, 6767, 3
4991, 6767, 3
9512, 9058, 2
5123, 1234, 1
Now, COUNT() doesn't work the way I want it to, and I probably need a GROUP BY, but I don't know how.
Thanks in advance.

It looks like you're trying to get rows which look like jobid, orderid, number of times that orderid appears. For that, you could use a subquery:
SELECT jobid, orderid,
(SELECT COUNT(*) FROM
MY_TABLE INR
WHERE INR.orderid = OTR.orderid) as "orderid(total)"
FROM MY_TABLE OTR

Why are you doing it this way? You will be doing a lot of redundant counting and putting a lot of unnecessary pressure on your server. I would do this with two queries:
SELECT jobid, orderid FROM my_table
to get the complete list, and then:
SELECT orderid, COUNT(*) FROM my_table GROUP BY orderid
to get the total count for each orderid. Then combine these two results in your application. This will be much faster than your solution.
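As a rough sketch of the "combine these two results in your application" step, here is one way it could look in Python. It assumes an in-memory SQLite database as a stand-in for MySQL, loaded with the sample rows from the question; the variable names are only illustrative.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption of this sketch);
# both statements are plain SQL that MySQL also accepts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (jobid INTEGER, orderid INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(1245, 6767), (1235, 9058), (6783, 6767),
     (4991, 6767), (9512, 9058), (5123, 1234)],
)

# Query 1: the complete list of rows.
rows = conn.execute("SELECT jobid, orderid FROM my_table").fetchall()

# Query 2: one count per orderid, kept in a dict for quick lookup.
totals = dict(
    conn.execute("SELECT orderid, COUNT(*) FROM my_table GROUP BY orderid")
)

# Combine in the application: attach the matching total to every row.
combined = [(jobid, orderid, totals[orderid]) for jobid, orderid in rows]
for row in combined:
    print(row)
```

Each orderid is counted once in the database, and the per-row lookup happens in the application.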

SELECT jobid, orderid, count(orderid)
FROM sometable
GROUP BY orderid, jobid

SELECT t.jobid
, t.orderid
, grp.orderid_total
FROM
tableX AS t
JOIN
( SELECT orderid
, COUNT(*) AS orderid_total
FROM tableX
GROUP BY orderid
) AS grp
ON grp.orderid = t.orderid
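To check the derived-table join above against the sample data from the question, a small script like the following could be used. It is only a sketch: SQLite (via Python's sqlite3) stands in for MySQL, and the ORDER BY is added just to make the output order predictable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableX (jobid INTEGER, orderid INTEGER)")
conn.executemany(
    "INSERT INTO tableX VALUES (?, ?)",
    [(1245, 6767), (1235, 9058), (6783, 6767),
     (4991, 6767), (9512, 9058), (5123, 1234)],
)

# Join every row to a derived table holding one total per orderid.
# ORDER BY is not part of the answer; it only fixes the printed order.
result = conn.execute("""
    SELECT t.jobid, t.orderid, grp.orderid_total
    FROM tableX AS t
    JOIN (SELECT orderid, COUNT(*) AS orderid_total
          FROM tableX
          GROUP BY orderid) AS grp
      ON grp.orderid = t.orderid
    ORDER BY t.jobid
""").fetchall()

for row in result:
    print(row)
```

Every jobid keeps its own row, and rows sharing an orderid carry the same total.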

select jobid, orderid, count(*) from table group by orderid;

Related

How to count occurrences with derived tables in SQL?

I have this very simple table:
CREATE TABLE MyTable
(
Id INT(6) PRIMARY KEY,
Name VARCHAR(200) /* NOT UNIQUE */
);
If I want the Name(s) that is(are) the most frequent and the corresponding count(s), I can neither do this
SELECT Name, total
FROM table2
WHERE total = (SELECT MAX(total) FROM (SELECT Name, COUNT(*) AS total
FROM MyTable GROUP BY Name) table2);
nor this
SELECT Name, total
FROM (SELECT Name, COUNT(*) AS total FROM MyTable GROUP BY Name) table1
WHERE total = (SELECT MAX(total) FROM table1);
Also, (let's say the maximum count is 4) in the second proposition, if I replace the third line by
WHERE total = 4;
it works.
Why is that so?
Thanks a lot
You can try the following:
WITH stats as
(
SELECT Name
,COUNT(id) as count_ids
FROM MyTable
GROUP BY Name
)
SELECT Name
,count_ids
FROM
(
SELECT Name
,count_ids
,RANK() OVER(ORDER BY count_ids DESC) as rank_ -- this ranks all names
FROM stats
) s
WHERE rank_ = 1 -- the most popular
This should work in TSQL.
Your queries can't be executed because "total" is not a column in your table. It's not sufficient to define it within a subquery; you also have to make sure the subquery is executed and produces the desired result before you can use it.
You should also consider using a window function, as proposed in Dimi's answer.
The advantage of such a function is that it can be much easier to read.
But you need to be careful, since such functions often differ between database types.
If you want to go your way with a subquery, you can do something like this:
SELECT name, COUNT(name) AS total FROM myTable
GROUP BY name
HAVING COUNT(name) =
(SELECT MAX(sub.total) AS highestCount FROM
(SELECT Name, COUNT(*) AS total
FROM MyTable GROUP BY Name) sub);
I created a fiddle example which shows that both queries mentioned here produce the same, correct result:
db<>fiddle
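For anyone who wants to try the HAVING variant locally, here is a minimal sketch. It assumes SQLite (through Python's sqlite3) in place of MySQL, and the five names inserted are made-up test data, not from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, Name VARCHAR(200))")
# Hypothetical test data: 'Ann' and 'Bob' tie for the most frequent name.
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Ann"), (4, "Bob"), (5, "Cy")],
)

# Keep only the groups whose count equals the largest group count.
most_frequent = conn.execute("""
    SELECT Name, COUNT(*) AS total
    FROM MyTable
    GROUP BY Name
    HAVING COUNT(*) = (
        SELECT MAX(sub.total)
        FROM (SELECT COUNT(*) AS total FROM MyTable GROUP BY Name) AS sub
    )
    ORDER BY Name
""").fetchall()

print(most_frequent)
```

Because the comparison is against the maximum count, tied names are all returned rather than just one of them.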

Need list of data using DISTINCT, COUNT, MAX

The table structure is as below,
My first SQL query is as below,
SELECT DISTINCT(IndustryVertical)
, COUNT(IndustryVertical) AS IndustryVerticalCount
, City
FROM `records`
WHERE City!=''
GROUP
BY IndustryVertical
, City
ORDER
BY `IndustryVerticalCount` DESC
by running the above query I'm getting the below,
What I'm trying to achieve is to get the List of all the DISTINCT CITY with ONLY ONE MAX(IndustryVerticalCount) and IndustryVertical.
Tried several things with no hope.
Anyone, please guide me.
There are several records for each City value. What I'm trying to get is:
All the distinct City Values
The MAX COUNT of industryVertical
Name of industryVertical
The record I'm getting is as below,
What I'm trying to get,
The above record is for reference purposes. Here, you can see only distinct city values, each with only the one vertical name that has the max count.
Since you are using GROUP BY, the query automatically returns only distinct rows. Because you group by two columns, each row is a distinct combination of both columns.
What you now have to do is use this resulting table and run a query on it to find the maximum count grouped by city.
SELECT IndustryVertical, IndustryVerticalCount, City from
( SELECT IndustryVertical
, COUNT(IndustryVertical) AS IndustryVerticalCount
, City
FROM `records`
WHERE City!=''
GROUP
BY IndustryVertical
, City) as tbl where IndustryVerticalCount IN (Select max(IndustryVerticalCount) from ( SELECT IndustryVertical
, COUNT(IndustryVertical) AS IndustryVerticalCount
, City
FROM `records`
WHERE City!=''
GROUP
BY IndustryVertical
, City) as tbl2 where tbl.City=tbl2.city)
This may not be the most efficient method, but I think it will work.
How about this? I think it should work:
DECLARE @DataSet TABLE (
City VARCHAR(50),
IndustryVertical VARCHAR(50),
IndustryVerticalCount INT
)
INSERT INTO @DataSet SELECT 'Bangalore', 'Consumer Internet', 279
INSERT INTO @DataSet SELECT 'Bangalore', 'Technology', 269
INSERT INTO @DataSet SELECT 'Bangalore', 'Logistics', 179
INSERT INTO @DataSet SELECT 'Mumbai', 'Technology', 194
INSERT INTO @DataSet SELECT 'Mumbai', 'Consumer Internet', 89
SELECT
table_a.*
FROM @DataSet table_a
LEFT JOIN @DataSet table_b
ON table_a.City = table_b.City
AND table_a.IndustryVerticalCount < table_b.IndustryVerticalCount
WHERE table_b.IndustryVerticalCount IS NULL
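The self-join above keeps a row only when no other row for the same city has a higher count. A small sketch of that idea, using the five sample rows from this answer, could look like the following. It assumes SQLite via Python's sqlite3 and an ordinary table named DataSet instead of a table variable.

```python
import sqlite3

# In-memory SQLite stands in for the real server (an assumption of this
# sketch); an ordinary table replaces the table variable.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE DataSet (
        City VARCHAR(50),
        IndustryVertical VARCHAR(50),
        IndustryVerticalCount INT
    )
""")
conn.executemany(
    "INSERT INTO DataSet VALUES (?, ?, ?)",
    [("Bangalore", "Consumer Internet", 279),
     ("Bangalore", "Technology", 269),
     ("Bangalore", "Logistics", 179),
     ("Mumbai", "Technology", 194),
     ("Mumbai", "Consumer Internet", 89)],
)

# A row survives only if the LEFT JOIN finds no row in the same city with
# a strictly higher count, i.e. the joined side is NULL.
top_per_city = conn.execute("""
    SELECT a.City, a.IndustryVertical, a.IndustryVerticalCount
    FROM DataSet AS a
    LEFT JOIN DataSet AS b
      ON a.City = b.City
     AND a.IndustryVerticalCount < b.IndustryVerticalCount
    WHERE b.IndustryVerticalCount IS NULL
    ORDER BY a.City
""").fetchall()

for row in top_per_city:
    print(row)
```

Note that if two verticals tie for the highest count in a city, both rows come back.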
I think you simply want a HAVING clause:
SELECT r.IndustryVertical,
COUNT(*) AS IndustryVerticalCount,
r.City
FROM records r
WHERE r.City <> ''
GROUP BY r.IndustryVertical, r.City
HAVING COUNT(*) = (SELECT COUNT(*)
FROM records r2
WHERE r2.City = r.City
GROUP BY r2.IndustryVertical
ORDER BY COUNT(*) DESC
LIMIT 1
)
ORDER BY IndustryVerticalCount DESC;

MySQL: create new table using same table twice

I want to use 1 table to create a new table using 2 sets of queries.
To test out the code: http://sqlfiddle.com/#!9/02e3ff/5
Reference table:
Desired table:
They share the same order_id.
type = A, updated_at = pDate
type = B, updated_at = dDate
Query 1:
select t.order_id, t.updated_at as pDate, weekday(t.updated_at) from transactions t
where t.type = 'A' group by t.order_id
Query 2:
select t.order_id, max(t.updated_at) as dDate, weekday(max(t.updated_at)) from transactions t
where t.type= 'B'
group by t.order_id;
For type = A, I want to get the earliest updated_at date, while for type = B, I want to get the latest updated_at date.
I tried UNION, but it gives me 2 rows instead of the desired table.
How do I join or union these 2 queries to get the desired table?
Alternatively, is there a better method to do this? Thanks!
You can try something like this:
SELECT order_id, min(pDate) pDate, max(dDate) dDate FROM(
SELECT
order_id,
if(type='A',updated_at,null) pDate,
if(type='B',updated_at,null) dDate
FROM transactions
) as d
GROUP BY order_id
SQLFiddle
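The idea in the query above is conditional aggregation: each row contributes its date only to the column for its type, and MIN/MAX skip the NULLs. Here is a minimal sketch of it, assuming SQLite via Python's sqlite3 as a stand-in for MySQL. The transactions rows are made up for illustration, and CASE is used in place of MySQL's IF() because SQLite's support for an equivalent function depends on its version.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (order_id INTEGER, type TEXT, updated_at TEXT)"
)
# Hypothetical rows; ISO-formatted timestamps sort correctly as text.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, "A", "2017-01-02 10:00:00"),
     (1, "A", "2017-01-01 09:00:00"),
     (1, "B", "2017-01-05 12:00:00"),
     (1, "B", "2017-01-07 08:00:00"),
     (2, "A", "2017-02-01 10:00:00"),
     (2, "B", "2017-02-03 10:00:00")],
)

# CASE yields NULL for rows of the other type, and MIN/MAX ignore NULLs,
# so each column only sees the rows of its own type.
per_order = conn.execute("""
    SELECT order_id,
           MIN(CASE WHEN type = 'A' THEN updated_at END) AS pDate,
           MAX(CASE WHEN type = 'B' THEN updated_at END) AS dDate
    FROM transactions
    GROUP BY order_id
    ORDER BY order_id
""").fetchall()

for row in per_order:
    print(row)
```

This gives one row per order_id, with the earliest type A date and the latest type B date side by side.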

MYSQL - Group By / Order By not working

I have the following data inside a table:
id person_id item_id price
1 1 1 10
2 1 1 20
3 1 3 50
Now what I want to do is group by the item ID, select the id that has the highest value and take the price.
E.g. the sum would be: (20 + 50) and ignore the 10.
I am using the following:
SELECT SUM(`price`)
FROM
(SELECT id, person_id, item_id, price
FROM `table` tbl
INNER JOIN person p USING (person_id)
WHERE p.person_id = 1
ORDER BY id DESC) x
GROUP BY item_id
However, this query is still adding (10 + 20 + 50), which is obviously not what I need to have.
Any ideas as to where I am going wrong?
Here is what you are trying to achieve. First, you need the grouping in a subquery, not in the outer query. In the outer query you only need the sum:
SELECT SUM(`price`)
FROM
(SELECT MAX(price) as price
FROM `table` tbl
INNER JOIN person p USING (person_id)
WHERE p.person_id = 1
GROUP BY item_id) x
http://sqlfiddle.com/#!9/40803/5
SELECT SUM(t1.price)
FROM tbl t1
LEFT JOIN tbl t2
ON t1.person_id= t2.person_id
AND t1.item_id = t2.item_id
AND t1.id<t2.id
WHERE t1.person_id = 1
AND t2.id IS NULL;
I'm not sure if this is the only requirement you have. If so, try this.
SELECT SUM(price)
FROM
(SELECT MAX(price) AS price
FROM `table`
WHERE person_id = 1
GROUP BY item_id) x
First of all, you don't need the person table, because the other table already contains the person_id. So I removed it from the examples.
Your query returns a sum of prices for each item.
If you replace SELECT SUM(price) with SELECT item_id, SUM(price) you will get
item_id SUM(`price`)
1 30
3 50
But that is not what you want. Neither is it what you wrote in the question " (10 + 20 + 50)".
Now, if you replace the first line with SELECT id, item_id, price, you will get one row for each item: the one with the highest id.
id item_id price
2 1 20
3 3 50
This works because of an "undocumented feature" of MySQL, which allows you to select columns that are not listed in the GROUP BY clause and get the first row of the subselect for each group (each item in this case).
Now you only need to sum the price column in an additional outer select
SELECT SUM(price)
FROM (
SELECT id, item_id ,price
FROM (
SELECT id, person_id, item_id, price
FROM `table` tbl
WHERE tbl.person_id = 1
ORDER BY id DESC ) x
GROUP BY item_id
) y
However, I do not recommend using that "feature". While it still works on MySQL 5.6, you never know whether it will work with newer versions. It already doesn't work on MariaDB.
Instead, you can determine the MAX(id) for each item in a subselect, select only the rows with those ids, and sum their prices.
SELECT SUM(`price`)
FROM `table` tbl
WHERE tbl.id IN (
SELECT MAX(tbl2.id)
FROM `table` tbl2
WHERE tbl2.person_id = 1
GROUP BY tbl2.item_id
)
Another solution (which internally does the same) is
SELECT SUM(`price`)
FROM `table` tbl
JOIN (
SELECT MAX(tbl2.id) as id
FROM `table` tbl2
WHERE tbl2.person_id = 1
GROUP BY tbl2.item_id
) x ON x.id = tbl.id
Alex's solution also works fine, if the groups (number of rows per person and item) are rather small.
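To see the MAX(id) join in action on the three rows from the question, something like this sketch could be run. It assumes SQLite via Python's sqlite3 instead of MySQL, and the table is named items here (a hypothetical name) because `table` is a reserved word.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
# 'items' is a hypothetical name for the question's table.
conn.execute("""
    CREATE TABLE items (
        id INTEGER PRIMARY KEY,
        person_id INTEGER,
        item_id INTEGER,
        price INTEGER
    )
""")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 10), (2, 1, 1, 20), (3, 1, 3, 50)],
)

# The derived table picks the highest id per item; joining on it keeps
# exactly those rows, whose prices are then summed.
total = conn.execute("""
    SELECT SUM(t.price)
    FROM items AS t
    JOIN (
        SELECT MAX(id) AS id
        FROM items
        WHERE person_id = 1
        GROUP BY item_id
    ) AS x ON x.id = t.id
""").fetchone()[0]

print(total)  # 70 = 20 (id 2) + 50 (id 3)
```

The row with id 1 is left out because id 2 is the latest row for item 1.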
You used GROUP BY in the main query, but it belongs in the subquery, like this:
SELECT SUM(`price`) FROM ( SELECT MAX(price) AS price FROM `table` tbl WHERE tbl.person_id = 1 GROUP BY item_id ) AS x

Count duplicates records in Mysql table?

I have a table with the following structure.
tbl
id name
1 AAA
2 BBB
3 BBB
4 BBB
5 AAA
6 CCC
select count(name) c from tbl
group by name having c >1
The query returns this result:
AAA(2) duplicate
BBB(3) duplicate
CCC(1) not duplicate
The duplicated names are AAA and BBB. The final result I want is the count of these duplicated names.
Result should be like this:
Total duplicate products (2)
The approach is to have a nested query that has one line per duplicate, and an outer query returning just the count of the results of the inner query.
SELECT count(*) AS duplicate_count
FROM (
SELECT name FROM tbl
GROUP BY name HAVING COUNT(name) > 1
) AS t
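A quick way to confirm this against the six rows from the question is the sketch below. It assumes SQLite via Python's sqlite3 as a stand-in for MySQL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, "AAA"), (2, "BBB"), (3, "BBB"), (4, "BBB"), (5, "AAA"), (6, "CCC")],
)

# The inner query yields one row per duplicated name;
# the outer query counts those rows.
duplicate_count = conn.execute("""
    SELECT COUNT(*) AS duplicate_count
    FROM (
        SELECT name FROM tbl
        GROUP BY name HAVING COUNT(name) > 1
    ) AS t
""").fetchone()[0]

print(duplicate_count)  # 2: AAA and BBB
```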
Use the IF() function to get your desired output:
SELECT name, COUNT(*) AS times, IF (COUNT(*)>1,"duplicated", "not duplicated") AS duplicated FROM <MY_TABLE> GROUP BY name
Output:
AAA 2 duplicated
BBB 3 duplicated
CCC 1 not duplicated
For List:
SELECT COUNT(`name`) AS adet, name
FROM `tbl` WHERE `status`=1 GROUP BY `name`
ORDER BY `adet` DESC
For Total Count:
SELECT COUNT(*) AS Total
FROM (SELECT COUNT(name) AS cou FROM tbl GROUP BY name HAVING cou>1 ) AS virtual_tbl
// Total: 5
why not just wrap this in a sub-query:
SELECT Count(*) TotalDups
FROM
(
select Name, Count(*)
from yourTable
group by name
having Count(*) > 1
) x
See SQL Fiddle with Demo
The accepted answer counts the number of rows that have duplicates, not the number of duplicates. If you want to count the actual number of duplicates, use this:
SELECT COALESCE(SUM(cnt) - COUNT(1), 0) AS dupes FROM(
SELECT COUNT(1) AS cnt
FROM `yourtable`
GROUP BY `name`
HAVING cnt > 1
) x
This totals the rows in each duplicated group, then subtracts the number of such groups. The reason is that the group totals are not all duplicates: one record in each group is the original, unique row.
Fiddle: http://sqlfiddle.com/#!2/29639a/3
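To make the arithmetic concrete, here is a sketch run against the sample rows from the question (AAA twice, BBB three times, CCC once). It assumes SQLite via Python's sqlite3 in place of MySQL, and the alias cnt is just an illustrative name.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, "AAA"), (2, "BBB"), (3, "BBB"), (4, "BBB"), (5, "AAA"), (6, "CCC")],
)

# Inner query: the size of every group that has more than one row.
# Outer query: total rows in those groups minus one "original" per group.
dupes = conn.execute("""
    SELECT COALESCE(SUM(cnt) - COUNT(*), 0) AS dupes
    FROM (
        SELECT COUNT(*) AS cnt
        FROM yourtable
        GROUP BY name
        HAVING COUNT(*) > 1
    ) AS x
""").fetchone()[0]

print(dupes)  # 3 = (2 + 3) rows in duplicated groups - 2 groups
```

COALESCE covers the case where no name is duplicated, since SUM over zero rows is NULL.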
SQL code is:
SELECT VERSION_ID, PROJECT_ID, VERSION_NO, COUNT(VERSION_NO) AS dup_cnt
FROM MOVEMENTS
GROUP BY VERSION_NO
HAVING (dup_cnt > 1 && PROJECT_ID = 11660)
I'm using this query for my own table in PHP, but it only gives me one result, whereas I'd like to get the number of duplicates per username. Is that possible?
SELECT count(*) AS duplicate_count
FROM (
SELECT username FROM login_history
GROUP BY username HAVING COUNT(time) > 1
) AS t;