Sum unequal and removing duplicates from SQL query results


My base query:
SELECT project_id,
name,
stories_produced,
on_date
FROM project_prod
WHERE on_date IN ('2017-03-01', '2017-06-10')
ORDER BY project_id
It can get me these outputs:
Output example:
id name stories_produced on_date
1042 project 1 1001 (wanted) 2017-03-01
1042 project 1 1801 (wanted) 2017-06-10
1568 project 2 355 (wanted) 2017-06-10
1405 project 3 1 (not wanted) 2017-03-10
1405 project 3 1 (not wanted) 2017-06-10
Note: there is a unique constraint on (id, on_date), so a project can have at most one production record on a given date.
Duplicate records that share an id, exist on both dates, and have different production values (wanted)
Single records that exist on only one of the dates (wanted)
The problem:
Duplicate records that share an id, exist on both dates, and have equal production values (not wanted)
My current query, which needs to change:
select project_id,
name,
CASE
WHEN max(stories_produced) - min(stories_produced) = 0
THEN max(stories_produced)
ELSE max(stories_produced) - min(stories_produced)
END AS 'stories_produced'
from project_prod
WHERE on_date IN ('2017-03-01', '2017-06-10')
group by project_id;
output example:
id name stories_produced
1042 project 1 800 (wanted)
1568 project 2 355 (wanted)
1405 project 3 1 (not wanted)
The CASE is currently not taking care of the third constraint (Duplicate records, that have the same id, and exist in both dates and have EQUAL production values (not wanted))
Is there any possible condition that can accommodate this?

One option uses not exists to drop rows that have the same id, and exist in both dates and have equal production values:
select
p.project_id,
p.name,
p.stories_produced,
p.on_date
from project_prod p
where
on_date in ('2017-03-01', '2017-06-10')
and not exists (
select 1
from project_prod p1
where
p1.on_date in ('2017-03-01', '2017-06-10')
and p1.on_date <> p.on_date
and p1.project_id = p.project_id
and p1.stories_produced = p.stories_produced
)
order by project_id
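As a rough check of the not exists approach, here is a minimal sketch that runs the same filter against the sample rows from the question. It uses Python's built-in sqlite3 module rather than MySQL purely so it can run without a server, and it assumes project 3's first record falls on 2017-03-01 so that both of its rows match the date filter.

```python
import sqlite3

# Sample rows modelled on the question's output. Assumption: project 3's
# first record is dated 2017-03-01 so that it matches the date filter.
rows = [
    (1042, "project 1", 1001, "2017-03-01"),
    (1042, "project 1", 1801, "2017-06-10"),
    (1568, "project 2", 355, "2017-06-10"),
    (1405, "project 3", 1, "2017-03-01"),
    (1405, "project 3", 1, "2017-06-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project_prod ("
    " project_id INTEGER, name TEXT, stories_produced INTEGER, on_date TEXT,"
    " UNIQUE (project_id, on_date))"
)
conn.executemany("INSERT INTO project_prod VALUES (?, ?, ?, ?)", rows)

# Keep a row unless the same project has a row on the other date
# with the same production value.
NOT_EXISTS_SQL = """
SELECT p.project_id, p.name, p.stories_produced, p.on_date
FROM project_prod p
WHERE p.on_date IN ('2017-03-01', '2017-06-10')
  AND NOT EXISTS (
      SELECT 1
      FROM project_prod p1
      WHERE p1.on_date IN ('2017-03-01', '2017-06-10')
        AND p1.on_date <> p.on_date
        AND p1.project_id = p.project_id
        AND p1.stories_produced = p.stories_produced
  )
ORDER BY p.project_id, p.on_date
"""

kept = conn.execute(NOT_EXISTS_SQL).fetchall()
for row in kept:
    print(row)
```

Both rows of project 1 and the single row of project 2 survive; the two equal rows of project 3 are dropped.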
In MySQL 8.0, you can use window functions:
select
project_id,
name,
stories_produced,
on_date
from (
select
p.*,
min(stories_produced) over(partition by project_id) min_stories_produced,
max(stories_produced) over(partition by project_id) max_stories_produced,
count(*) over(partition by project_id) cnt
from project_prod p
where on_date in ('2017-03-01', '2017-06-10')
) t
where not (cnt = 2 and min_stories_produced = max_stories_produced)
order by project_id
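If the grouped output from the question is what is needed (one row per project with the difference), the same condition can probably be expressed as a HAVING clause on the original GROUP BY query. The sketch below is not taken from the answer above; it keeps the asker's CASE expression and only adds the filter. It runs on SQLite through Python's sqlite3 module so it can be tried without a MySQL server, and again assumes project 3 has a row on 2017-03-01.

```python
import sqlite3

# Same sample data as the question. Assumption: project 3's first row is
# dated 2017-03-01 so that both of its rows pass the date filter.
rows = [
    (1042, "project 1", 1001, "2017-03-01"),
    (1042, "project 1", 1801, "2017-06-10"),
    (1568, "project 2", 355, "2017-06-10"),
    (1405, "project 3", 1, "2017-03-01"),
    (1405, "project 3", 1, "2017-06-10"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE project_prod ("
    " project_id INTEGER, name TEXT, stories_produced INTEGER, on_date TEXT)"
)
conn.executemany("INSERT INTO project_prod VALUES (?, ?, ?, ?)", rows)

# HAVING removes groups that have a row on both dates with equal values;
# the CASE expression is the one from the question.
GROUPED_SQL = """
SELECT project_id,
       name,
       CASE
           WHEN MAX(stories_produced) - MIN(stories_produced) = 0
               THEN MAX(stories_produced)
           ELSE MAX(stories_produced) - MIN(stories_produced)
       END AS stories_produced
FROM project_prod
WHERE on_date IN ('2017-03-01', '2017-06-10')
GROUP BY project_id, name
HAVING NOT (COUNT(*) = 2 AND MIN(stories_produced) = MAX(stories_produced))
ORDER BY project_id
"""

grouped = conn.execute(GROUPED_SQL).fetchall()
for row in grouped:
    print(row)
```

COUNT(*) = 2 is a safe test for "exists on both dates" only because of the constraint on (id, on_date) mentioned in the question.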

Related

Select 5 most recent unique entries in a database

Struggling with an SQL query to select the 5 most recent, unique, entries in a MySQL 5.7.22 table. For example, here's the 'activity' table:
uaid nid created
9222 29722 2018-05-17 03:19:33
9221 31412 2018-05-17 03:19:19
9220 31160 2018-05-16 23:47:34
9219 31160 2018-05-16 23:47:30
9218 31020 2018-05-16 22:35:59
9217 31020 2018-05-16 22:35:54
9216 28942 2018-05-16 22:35:20
...
The desired query should return the 5 most recent, unique entries by the 'nid' attribute, in this order (but only need the nid attribute):
uaid nid created
9222 29722 2018-05-17 03:19:33
9221 31412 2018-05-17 03:19:19
9220 31160 2018-05-16 23:47:34
9218 31020 2018-05-16 22:35:59
9216 28942 2018-05-16 22:35:20
I have tried a variety of combinations of DISTINCT but none work, e.g.:
select distinct nid from activity order by created desc limit 5
What is the proper query to return the 5 most recent, unique entries by nid?

Your problem is the simplest form of the top-N-per-group problem. In general, this problem is a real headache to handle in MySQL, which doesn't support analytic functions (at least not in most versions folks are using in production these days). However, since you only want the first record per group, we can do a join to subquery which finds the max created value for each nid group.
SELECT a1.*
FROM activity a1
INNER JOIN
(
SELECT nid, MAX(created) AS max_created
FROM activity
GROUP BY nid
) a2
ON a1.nid = a2.nid AND a1.created = a2.max_created
ORDER BY a1.created DESC
LIMIT 5;

You can use a subquery and a join:
select m.* from activity m
inner join (
select nid, max(created) max_date
from activity
group by nid
order by max_date desc
limit 5
) t on t.nid = m.nid and t.max_date = m.created
order by m.created desc
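Since only the nid values are needed, grouping by nid and ordering by each group's latest created value may be enough on its own. The sketch below tries that on the sample rows using Python's sqlite3 module (SQLite instead of MySQL 5.7, so no server is needed). The last row, with nid 27000, is a hypothetical extra added so that LIMIT 5 actually cuts something off.

```python
import sqlite3

activity_rows = [
    (9222, 29722, "2018-05-17 03:19:33"),
    (9221, 31412, "2018-05-17 03:19:19"),
    (9220, 31160, "2018-05-16 23:47:34"),
    (9219, 31160, "2018-05-16 23:47:30"),
    (9218, 31020, "2018-05-16 22:35:59"),
    (9217, 31020, "2018-05-16 22:35:54"),
    (9216, 28942, "2018-05-16 22:35:20"),
    # Hypothetical older row, not in the question, to exercise LIMIT 5.
    (9215, 27000, "2018-05-16 21:00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (uaid INTEGER, nid INTEGER, created TEXT)")
conn.executemany("INSERT INTO activity VALUES (?, ?, ?)", activity_rows)

# One row per nid, ranked by that nid's most recent activity.
RECENT_SQL = """
SELECT nid, MAX(created) AS last_created
FROM activity
GROUP BY nid
ORDER BY last_created DESC
LIMIT 5
"""

recent = conn.execute(RECENT_SQL).fetchall()
recent_nids = [nid for nid, _ in recent]
print(recent_nids)
```

The plain DISTINCT attempt fails because created is not part of the distinct row, so the ordering has no single value per nid to work with; aggregating with MAX(created) gives it one.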

Create date_from and date_to columns in mysql

I've got a table named 'T1' which I want to transpose and have date_from and date_to columns. The table itself has the data of who is a manager of a particular company. So I want to know since when to when a user was responsible for a company. I can do it easily in BigQuery with the following query but I'm struggling to do the same in MySQL.
WITH T1 AS ( SELECT 9 as rating, 'company1' as cid, 100 as user, '2017-08-20' AS created UNION ALL
SELECT 9 as rating, 'company1' as cid, 101 as user, '2017-08-22' AS created UNION ALL
SELECT 10 as rating, 'company1' as cid, 101 as user, '2017-08-21' AS created
)
SELECT cid, rating, user, CAST(created as DATE) as date_from,
CAST(COALESCE(MIN(CAST(created as DATE)) OVER(PARTITION BY cid, rating ORDER BY CAST(created as DATE) DESC ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING),
DATE_ADD(current_date(), INTERVAL 1 DAY)) as DATE) AS date_to
FROM T1
The original table format:
rating cid user created
9 company1 100 2017-08-20
9 company1 101 2017-08-22
10 company1 101 2017-08-21
The final table should have the following format:
cid rating user date_from date_to
1 company1 9 101 2017-08-22 2018-02-24
2 company1 9 100 2017-08-20 2017-08-22
3 company1 10 101 2017-08-21 2018-02-24
Thank you!

You really need lead(), which is not available in MySQL before 8.0 (and which would make the BigQuery query simpler). On older versions, one method uses a correlated subquery; date_to is NULL for any row that has no later record:
select t1.*, t1.created as date_from,
(select min(tt1.created)
from t1 tt1
where tt1.cid = t1.cid
and tt1.rating = t1.rating -- same grouping as the BigQuery query (cid, rating)
and tt1.created > t1.created
) as date_to
from t1;
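To see what the correlated subquery produces on the three sample rows, here is a small sketch run on SQLite through Python's sqlite3 module (an assumption made only so it runs without a MySQL server). It matches on both cid and rating, mirroring the PARTITION BY in the BigQuery query, and leaves open-ended ranges as NULL rather than substituting tomorrow's date.

```python
import sqlite3

t1_rows = [
    (9, "company1", 100, "2017-08-20"),
    (9, "company1", 101, "2017-08-22"),
    (10, "company1", 101, "2017-08-21"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (rating INTEGER, cid TEXT, user INTEGER, created TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?, ?)", t1_rows)

# For each row, date_to is the earliest later 'created' value
# within the same (cid, rating) group, or NULL if there is none.
RANGES_SQL = """
SELECT t1.cid, t1.rating, t1.user, t1.created AS date_from,
       (SELECT MIN(tt1.created)
        FROM t1 AS tt1
        WHERE tt1.cid = t1.cid
          AND tt1.rating = t1.rating
          AND tt1.created > t1.created) AS date_to
FROM t1
ORDER BY t1.rating, t1.created
"""

ranges = conn.execute(RANGES_SQL).fetchall()
for row in ranges:
    print(row)
```

In MySQL the NULL could be replaced with COALESCE(..., DATE_ADD(CURDATE(), INTERVAL 1 DAY)) to reproduce the "tomorrow" end date used in the BigQuery version.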

Get last updated value SQL

I have the following table structure..
emp_id | base_rate | base_sal | effective_on
1001 26.22 1200 2015-10-12
1001 26.00 1100 2015-11-12
1001 26.00 1100 2015-12-12
1002 18 1200 2015-10-12
1002 19 1100 2015-11-12
I need to get the last updated base_rate, with its effective_on date, for each emp_id.
Expected output:
1001 26.00 1100 2015-11-12
1002 19 1100 2015-11-12
Note that for 1001, 2015-11-12 is selected instead of the latest date, 2015-12-12, because the base_rate is the same on both dates and has therefore been in effect since 2015-11-12.
I have tried.. everything.. not able to find the exact query..

This method is simple and easy to understand.
1) Rank the effective dates in descending order, partitioned by employee.
2) Select the required fields from the inner query for the top-ranked (most recent) effective date.
Note that this returns the most recent row per employee, so for 1001 it gives 2015-12-12 rather than the 2015-11-12 the question asks for.
SELECT emp_id, base_rate, base_sal, effective_on
FROM
(
SELECT t.*,
ROW_NUMBER() OVER ( PARTITION BY emp_id ORDER BY effective_on DESC ) AS rn
FROM your_table t -- placeholder table name
) ranked
WHERE rn = 1;

One method is to generate a subset of employees with max effective on and join back to the base set..
In the below we generate set "B" with Emp_ID and ME (max effective) and then we join back to the entire data set in the table and use the columns emp_ID and ME to limit the data in the base set and return all columns we care about.
Put in English:
We generated a data set for all the employees with only their max effective date, and then joined this data set back to the base set to limit the data in the base set to only contain records for employees with their most recent effective_on date.
SELECT A.Emp_ID, A.Base_Rate, A.Base_Sal, min(C.Effective_On)
FROM Table A
INNER JOIN (SELECT emp_ID, Max(Effective_on) ME
FROM Table A
GROUP BY Emp_ID) B
on A.Emp_ID = B.Emp_ID
and A.Effective_ON = B.ME
INNER JOIN TABLE C
on C.Emp_ID = A.Emp_ID
and C.Base_Rate= A.Base_rate
and C.base_Sal = A.Base_Sal
GROUP BY A.Emp_ID, A.Base_Rate, A.Base_Sal
This is more or less database agnostic, whereas row_number would not work on MySQL before 8.0, which doesn't support window functions.
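The join-back approach can be tried on the sample rows with the sketch below. It runs on SQLite via Python's sqlite3 module so no server is needed; the table name emp_rates is a stand-in, since the question does not name the table.

```python
import sqlite3

rate_rows = [
    (1001, 26.22, 1200, "2015-10-12"),
    (1001, 26.00, 1100, "2015-11-12"),
    (1001, 26.00, 1100, "2015-12-12"),
    (1002, 18, 1200, "2015-10-12"),
    (1002, 19, 1100, "2015-11-12"),
]

conn = sqlite3.connect(":memory:")
# emp_rates is a hypothetical table name used only for this example.
conn.execute(
    "CREATE TABLE emp_rates ("
    " emp_id INTEGER, base_rate REAL, base_sal INTEGER, effective_on TEXT)"
)
conn.executemany("INSERT INTO emp_rates VALUES (?, ?, ?, ?)", rate_rows)

# b: latest effective_on per employee.
# a: the full row for that latest date.
# c: every row of the same employee with the same rate and salary,
#    from which the earliest effective_on is taken.
LAST_RATE_SQL = """
SELECT a.emp_id, a.base_rate, a.base_sal, MIN(c.effective_on) AS effective_on
FROM emp_rates a
JOIN (SELECT emp_id, MAX(effective_on) AS me
      FROM emp_rates
      GROUP BY emp_id) b
  ON a.emp_id = b.emp_id AND a.effective_on = b.me
JOIN emp_rates c
  ON c.emp_id = a.emp_id
 AND c.base_rate = a.base_rate
 AND c.base_sal = a.base_sal
GROUP BY a.emp_id, a.base_rate, a.base_sal
ORDER BY a.emp_id
"""

last_rates = conn.execute(LAST_RATE_SQL).fetchall()
for row in last_rates:
    print(row)
```

One limitation worth keeping in mind: if an employee's rate changes and later returns to an earlier value, MIN(c.effective_on) would pick up the date of the first period rather than the start of the current one.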

You can first get the minimum date each base_rate becomes effective on for every employee and then take the max from there. Here is how you can do it using row_number() in oracle:
with temp(emp_id, base_rate, base_sal, effective_on)
as (select 1001, 26.22, 1200, '2015-10-12' from dual union all
select 1001, 26.00, 1100, '2015-11-12' from dual union all
select 1001, 26.00, 1100, '2015-12-12' from dual union all
select 1002, 18, 1200, '2015-10-12' from dual union all
select 1002, 19, 1100, '2015-11-12' from dual
)
SELECT emp_id,base_rate,base_sal,effective_on FROM(
SELECT temp2.*,
row_number() OVER (PARTITION BY EMP_ID ORDER BY effective_on DESC) AS rn2
FROM
(
SELECT temp.*,
row_number() OVER (PARTITION BY EMP_ID, BASE_RATE ORDER BY effective_on) AS rn
FROM temp
) temp2
WHERE rn = 1
)
WHERE rn2 = 1;

Distinct outside group in sql

I have a table called order_status_log, which logs who changed order statuses.
Simplified table and query below:
order_id user_id status time
1 1 1 2016-01-27 19:35:44
2 2 2 2016-01-27 19:36:45
4 3 2 2016-01-27 19:37:43
2 1 5 2016-01-27 19:38:41
I also have SQL which counts changes by each user:
SELECT
COUNT(*) as count,
user_id
FROM order_status_log
WHERE status = 1
GROUP BY user_id
ORDER BY count
Now I want to improve my query to count only the first status change of each order.
In other words, I need each unique order_id with its oldest time.
How can I change my query to do that?

Something like this?
SELECT *
FROM order_status_log o
WHERE NOT EXISTS
( SELECT 'x'
FROM order_status_log o2
WHERE o2.order_id = o.order_id
AND o2.time < o.time)
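Combining the NOT EXISTS filter with the original per-user count might look like the sketch below. It treats "first status change" as the earliest row per order_id, which is one reading of the question, and leaves out the status = 1 filter from the original query. It runs on SQLite through Python's sqlite3 module so it can be tried without a server.

```python
import sqlite3

log_rows = [
    (1, 1, 1, "2016-01-27 19:35:44"),
    (2, 2, 2, "2016-01-27 19:36:45"),
    (4, 3, 2, "2016-01-27 19:37:43"),
    (2, 1, 5, "2016-01-27 19:38:41"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_status_log ("
    " order_id INTEGER, user_id INTEGER, status INTEGER, time TEXT)"
)
conn.executemany("INSERT INTO order_status_log VALUES (?, ?, ?, ?)", log_rows)

# Keep only rows with no earlier row for the same order,
# then count those first changes per user.
FIRST_CHANGES_SQL = """
SELECT o.user_id, COUNT(*) AS first_changes
FROM order_status_log o
WHERE NOT EXISTS (
    SELECT 1
    FROM order_status_log o2
    WHERE o2.order_id = o.order_id
      AND o2.time < o.time
)
GROUP BY o.user_id
ORDER BY o.user_id
"""

first_changes = conn.execute(FIRST_CHANGES_SQL).fetchall()
for row in first_changes:
    print(row)
```

User 1's later change to order 2 is not counted, because user 2 changed that order first.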

How to use count after using group by and count in sql?

I am trying to see statistics on how many passengers passed through my application. After running this query:
select count(person_id), person_id from Passenger group by person_id;
count(person_id) person_id
6 123
2 421
1 542
3 612
1 643
2 876
I see that passenger "123" passed 6 times, "421" passed 2 times, "542" passed 1 time, etc. So I want to analyze this and say:
> 1 passenger passed 6 times,
> 2 passenger passed 2 times,
> 2 passenger passed 1 times,
> 1 passenger passed 3 times..

You can use a SELECT with a subquery to obtain the result you want:
SELECT Concat(COUNT(*), ' passenger passed ', t.theCount, ' times,') FROM
(
SELECT COUNT(person_id) AS theCount, person_id
FROM Passenger
GROUP BY person_id
) t
GROUP BY t.theCount

select cnt, count(person_id)
from
(
select count(person_id) as cnt, person_id
from Passenger
group by person_id
) tmp
group by cnt

Is this what you are looking for?
select count(person_id) as "Num passengers", times
from (
select count(person_id) as times, person_id
from Passenger
group by person_id
) sub
group by times order by times ASC
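All three answers share the same shape: count per passenger in a subquery, then group those counts. A minimal sketch of that on data rebuilt from the counts in the question, run on SQLite through Python's sqlite3 module (chosen only so it runs without a server):

```python
import sqlite3

# Number of passes per passenger, as listed in the question.
passes_per_person = {123: 6, 421: 2, 542: 1, 612: 3, 643: 1, 876: 2}

# Expand the counts back into one row per pass.
passenger_rows = [
    (person_id,)
    for person_id, n in passes_per_person.items()
    for _ in range(n)
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Passenger (person_id INTEGER)")
conn.executemany("INSERT INTO Passenger VALUES (?)", passenger_rows)

# Inner query: passes per passenger. Outer query: passengers per pass count.
HISTOGRAM_SQL = """
SELECT times, COUNT(*) AS num_passengers
FROM (
    SELECT person_id, COUNT(*) AS times
    FROM Passenger
    GROUP BY person_id
) sub
GROUP BY times
ORDER BY times
"""

histogram = conn.execute(HISTOGRAM_SQL).fetchall()
for times, num_passengers in histogram:
    print(f"{num_passengers} passenger(s) passed {times} time(s)")
```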
