Show duplicate records for a SQL query

I am trying to see the duplicate records for an object over a week period. I am interested in seeing the duplicates, not objects that have had only a single instance. This is what I have written so far:
SELECT a.asset, t.ticketnum, t.symptom_mask, t.setsolution, t.`otherdesc`
FROM lamarinfo AS a
JOIN lfso AS t
ON (a.id = t.asset_id)
WHERE open_dt BETWEEN CURDATE() - INTERVAL 7 DAY AND SYSDATE()
GROUP BY a.`asset` HAVING COUNT(*) > 1;
This returns the records that are duplicate, but not each record for the duplicates. Any ideas?

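You should be able to handle this with a subquery: find the assets that have more than one row in the period, then select every row for those assets. The subquery has to repeat the join and the date filter, otherwise it counts rows in lamarinfo alone:
SELECT a.asset, t.ticketnum, t.symptom_mask, t.setsolution, t.`otherdesc`
FROM lamarinfo AS a
JOIN lfso AS t
ON (a.id = t.asset_id)
WHERE open_dt BETWEEN CURDATE() - INTERVAL 7 DAY AND SYSDATE()
AND a.asset IN (SELECT a2.asset FROM lamarinfo AS a2 JOIN lfso AS t2 ON (a2.id = t2.asset_id)
WHERE open_dt BETWEEN CURDATE() - INTERVAL 7 DAY AND SYSDATE()
GROUP BY a2.asset HAVING COUNT(*) > 1);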
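A minimal sketch of the subquery idea, runnable with Python's built-in sqlite3 module. SQLite stands in for MySQL here, the table contents are made up, the tables are cut down to a few columns, and I am assuming open_dt lives in the ticket table. The fixed start and end dates replace CURDATE() - INTERVAL 7 DAY and SYSDATE() so the result is reproducible.

```python
import sqlite3

# Hypothetical miniature versions of the two tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lamarinfo (id INTEGER PRIMARY KEY, asset TEXT);
    CREATE TABLE lfso (ticketnum INTEGER PRIMARY KEY, asset_id INTEGER, open_dt TEXT);
    INSERT INTO lamarinfo VALUES (1, 'pump-A'), (2, 'pump-B');
    INSERT INTO lfso VALUES
        (101, 1, '2021-06-01'),
        (102, 1, '2021-06-03'),
        (103, 2, '2021-06-02');
""")


def duplicate_tickets(conn, start, end):
    """Return every ticket in [start, end] whose asset has more than one ticket there."""
    sql = """
        SELECT a.asset, t.ticketnum
        FROM lamarinfo AS a
        JOIN lfso AS t ON a.id = t.asset_id
        WHERE t.open_dt BETWEEN :start AND :end
          AND a.asset IN (
              -- assets with more than one ticket in the same period
              SELECT a2.asset
              FROM lamarinfo AS a2
              JOIN lfso AS t2 ON a2.id = t2.asset_id
              WHERE t2.open_dt BETWEEN :start AND :end
              GROUP BY a2.asset
              HAVING COUNT(*) > 1)
        ORDER BY t.ticketnum
    """
    return conn.execute(sql, {"start": start, "end": end}).fetchall()


rows = duplicate_tickets(conn, "2021-05-28", "2021-06-04")
print(rows)
```

Both tickets of pump-A come back, while pump-B (a single ticket) is left out. Narrowing the window so that pump-A has only one ticket in range returns nothing.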

Related

MySQL SELECT all rows between date time with interval

I have a column in my SQL table called loggedTime, which is a datetime field. I want to select rows between two dates, startDate and endDate, at a given interval (5 minutes, 10 minutes, 1 hour, etc.). I tried to write the SQL query, but MySQL reports a syntax error near INTERVAL, and I am not sure what is wrong with my query. If I remove INTERVAL 5 MINUTE the query works fine, but I want to pass the interval along with the dates so that it selects the rows between the two dates at that interval.
Here is the SQL:
SELECT * FROM mytable WHERE loggedTime BETWEEN '2021-06-01' and '2021-06-03' INTERVAL 5 MINUTE

If you have a unique, consecutively increasing column such as ID, you can use an INNER JOIN that pairs each row with the previous one:
SELECT *
FROM mytable a
INNER JOIN mytable b
ON a.ID = b.ID + 1
WHERE TIMESTAMPDIFF(MINUTE, b.loggedTime, a.loggedTime) = 5;
If you do not have such a column in your table, use a user variable to carry the previous row's timestamp:
SELECT *
FROM (SELECT mt.*,
TIMESTAMPDIFF(MINUTE, @prevTS, mt.`loggedTime`) AS timeinterval,
@prevTS := mt.`loggedTime`
FROM mytable mt,
(SELECT @prevTS := (SELECT MIN(`loggedTime`)
FROM mytable)) vars
ORDER BY mt.`loggedTime`) subquery_alias
WHERE loggedTime BETWEEN '2021-06-01' AND '2021-06-03'
AND timeinterval = 5;
Check this thread as reference too.

Parsed date in where clause

How can I use a parsed date in a WHERE clause across two tables? For example:
SELECT *
FROM companies
INNER JOIN acquisitions ON companies.id = acquisitions.company_id
WHERE companies.created_at >= acquisitions.delivery_date
companies.created_at is a DATE column and acquisitions.delivery_date is a DATETIME column.
If I do this, one record is skipped:
companies.created_at = '2021-04-16'
acquisitions.delivery_date = '2021-04-16 10:00:00'
Here created_at is not greater than or equal to delivery_date, BUT both are on the same day. So how can I convert delivery_date to a date and then compare? I've tried date(acquisitions.delivery_date) and cast(acquisitions.delivery_date as DATE), and neither worked.

https://www.db-fiddle.com/f/jB1PyysytiusorEJoHuwx8/0
SELECT *
FROM companies
INNER JOIN acquisitions
ON companies.id= acquisitions.company_id
WHERE companies.created_at >= date(acquisitions.delivery_date);

SELECT *
FROM companies
INNER JOIN acquisitions
ON companies.id= acquisitions.company_id
WHERE companies.created_at + INTERVAL 1 DAY > acquisitions.delivery_date
I'd prefer this variant because companies presumably has fewer rows than acquisitions. So the query should be slightly faster than the one with WHERE companies.created_at >= DATE(acquisitions.delivery_date).
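For anyone who wants to try the three comparisons side by side, here is a small sketch using Python's built-in sqlite3 module. Treat it as an analogy rather than a MySQL test: SQLite stores these values as text and compares them as strings, and date(x, '+1 day') is SQLite's spelling of x + INTERVAL 1 DAY. The single sample row is the one from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, created_at TEXT);
    CREATE TABLE acquisitions (company_id INTEGER, delivery_date TEXT);
    INSERT INTO companies VALUES (1, '2021-04-16');
    INSERT INTO acquisitions VALUES (1, '2021-04-16 10:00:00');
""")

BASE = """
    SELECT companies.id
    FROM companies
    INNER JOIN acquisitions ON companies.id = acquisitions.company_id
    WHERE {condition}
"""


def matching_ids(condition):
    """Run the join with the given WHERE condition and return the company ids."""
    return [row[0] for row in conn.execute(BASE.format(condition=condition))]


# Raw comparison: the time part makes delivery_date sort after created_at.
raw = matching_ids("companies.created_at >= acquisitions.delivery_date")
# Variant 1: strip the time part before comparing.
by_date = matching_ids("companies.created_at >= date(acquisitions.delivery_date)")
# Variant 2: compare against the start of the following day instead.
next_day = matching_ids("date(companies.created_at, '+1 day') > acquisitions.delivery_date")

print(raw, by_date, next_day)
```

The raw comparison skips the row, and both variants keep it.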

MySQL Limit and Order Left Join

I have two tables: Processes and Validations; p and v respectively.
For each process there are many validations.
The aim is to:
Retrieve the latest validation for each process.
Generate a dynamic date (Due_Date) as to when the next validation is due (being 365 days after the latest validation date).
Filter the results to any due dates that fall in the current month.
In short terms; I want to see what processes are due to be validated in the current month.
I'm 99% there with the query code. Having read through some posts on here I'm fairly certain I'm on the right track. My problem is that my query still returns all of the results for each process, instead of the top 1.
FYI: The processes table uses "Process_ID" as a primary key; whereas the Validations Table uses "Validation_Process_ID" as a foreign key.
Code at present :
Select p.Process_ID,
p.Process_Name,
v.Validation_Date,
Date_Add(v.Validation_Date, Interval 365 Day) as Due_Date
From processes_active p
left JOIN processes_validations v
on p.Process_ID = (select v.validation_process_id
from processes_validations
order by validation_date desc
limit 1)
Having Month(Due_Date) = Month(Now()) and Year(Due_Date) = Year(Now())
Any help would be thoroughly appreciated! I'm probably pretty close just can't sort that final section!
Thanks

Your current query is wrong: the subquery returns the single latest record in the whole validations table instead of the latest one per process ID.
You should break the problem down to get what you need.
1) compute the latest validation for each process in the validation table:
SELECT validation_process_id, MAX(validation_date) AS maxdate
FROM processes_validations
GROUP BY validation_process_id
2) For each process in the process table, get the latest validation, and compute the next validation date (use interval 1 YEAR and not 365 DAY... think leap years)
SELECT p.Process_ID, p.Process_Name, v.maxdate,
Date_Add(v.maxdate, Interval 1 year) as Due_Date
FROM processes_active p
LEFT JOIN
(
SELECT validation_process_id, MAX(validation_date) AS maxdate
FROM processes_validations
GROUP BY validation_process_id
) v
ON p.Process_ID = v.validation_process_id
3) Filter to keep only the due dates that fall in the current month. This can be done with a WHERE on query 2; I use a nested query here only to make the steps clear
SELECT * FROM
(
SELECT p.Process_ID, p.Process_Name, v.maxdate,
Date_Add(v.maxdate, Interval 1 year) as Due_Date
FROM processes_active p
LEFT JOIN
(
SELECT validation_process_id, MAX(validation_date) AS maxdate
FROM processes_validations
GROUP BY validation_process_id
) v
ON p.Process_ID = v.validation_process_id
) T
WHERE Month(Due_Date) = Month(Now()) and Year(Due_Date) = Year(Now())
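The three steps can be tried out end to end with the sketch below, which uses Python's built-in sqlite3 module. Assumptions: SQLite stands in for MySQL, so date(x, '+1 year') replaces Date_Add(x, Interval 1 year) and strftime('%Y-%m', ...) replaces the Month()/Year() pair; the sample rows are invented; and "today" is passed in as a parameter instead of Now() so the result does not depend on when you run it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processes_active (Process_ID INTEGER PRIMARY KEY, Process_Name TEXT);
    CREATE TABLE processes_validations (validation_process_id INTEGER, validation_date TEXT);
    INSERT INTO processes_active VALUES (1, 'Mixing'), (2, 'Packing'), (3, 'Never validated');
    INSERT INTO processes_validations VALUES
        (1, '2019-06-10'), (1, '2020-06-15'),
        (2, '2020-03-01'), (2, '2020-09-20');
""")


def due_in_month(conn, today):
    """Processes whose next validation (latest validation + 1 year) falls in today's month."""
    sql = """
        SELECT p.Process_ID, p.Process_Name, v.maxdate,
               date(v.maxdate, '+1 year') AS Due_Date
        FROM processes_active p
        LEFT JOIN (
            -- step 1: latest validation per process
            SELECT validation_process_id, MAX(validation_date) AS maxdate
            FROM processes_validations
            GROUP BY validation_process_id
        ) v ON p.Process_ID = v.validation_process_id
        -- step 3: keep only due dates in the same year and month as :today
        WHERE strftime('%Y-%m', date(v.maxdate, '+1 year')) = strftime('%Y-%m', :today)
    """
    return conn.execute(sql, {"today": today}).fetchall()


rows = due_in_month(conn, "2021-06-02")
print(rows)
```

Only one row per process can come back, because the derived table already collapses the validations to one maxdate per process. A process with no validations has a NULL due date and is dropped by the month filter.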

Incorrect rows returned with date condition

I have a query that left joins two tables with a date condition. I want to fetch rows for yesterday's transactions only.
When I add the AND condition, all the rows are still returned, but with NULL values for those not matching the condition.
Here is the query:
SELECT
B.txn_id,
B.txn_time,
B.svc_method,
B.customer_number,
B.amount,
B.amount_commission,
B.status,
A.partner_txn_id,
A.session_id as partner_session_id
FROM Partner A
LEFT JOIN Transaction B
ON A.log_id = B.txn_id
AND B.txn_time >= (CURDATE() - INTERVAL 1 DAY);

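Because the date condition is part of the LEFT JOIN's ON clause, every row of Partner is still returned, with NULLs where no transaction matches. You should either change LEFT JOIN to INNER JOIN, or move the condition to the WHERE clause:
WHERE B.txn_time >= (CURDATE() - INTERVAL 1 DAY)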
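The difference between filtering in ON and filtering in WHERE is easy to see on two rows. Below is a small sketch with Python's built-in sqlite3 module; SQLite stands in for MySQL, the data is invented, the transaction table is named Txn here (a hypothetical name, chosen to avoid quoting the keyword TRANSACTION), and a fixed cutoff replaces CURDATE() - INTERVAL 1 DAY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Partner (log_id INTEGER, partner_txn_id TEXT);
    CREATE TABLE Txn (txn_id INTEGER, txn_time TEXT);
    INSERT INTO Partner VALUES (1, 'p-old'), (2, 'p-new');
    INSERT INTO Txn VALUES (1, '2021-06-01 09:00:00'), (2, '2021-06-09 09:00:00');
""")

cutoff = {"cutoff": "2021-06-08"}

# Date condition in ON: it only decides which B rows attach; all A rows survive.
on_filter = conn.execute("""
    SELECT A.partner_txn_id, B.txn_id
    FROM Partner A
    LEFT JOIN Txn B ON A.log_id = B.txn_id AND B.txn_time >= :cutoff
    ORDER BY A.log_id
""", cutoff).fetchall()

# Date condition in WHERE: rows whose B side is NULL or too old are removed.
where_filter = conn.execute("""
    SELECT A.partner_txn_id, B.txn_id
    FROM Partner A
    LEFT JOIN Txn B ON A.log_id = B.txn_id
    WHERE B.txn_time >= :cutoff
    ORDER BY A.log_id
""", cutoff).fetchall()

print(on_filter)
print(where_filter)
```

With the condition in ON, the old partner row is still listed with a NULL transaction. With the condition in WHERE, only the recent one remains, which is the same result an INNER JOIN would give.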

SQL query union best practice

I am trying to classify data as I extract it from a table. The data has a history kept via "valid_from" and "valid_to" date fields in each row.
I want to extract the data and qualify it as follows:
NEW => WHERE CURRENT_DATE BETWEEN valid_from AND (valid_from + 1 MONTH)
CURRENT => WHERE CURRENT_DATE > (valid_from + 1 MONTH)
RETIRED => the rest of the rows, so the "dish_id" items not in the tables above, BUT
returning the values from the row containing MAX(valid_to) date.
Am I doing this the best / most efficient way? Thanks in advance!
SELECT
menu_table.dish_id,
menu_table.dish_title,
menu_table.marketing_desc,
menu_table_status.rrp_inc_gst,
menu_table_status.lowest_rrp,
menu_table_status.highest_rrp,
'n' as status
FROM
menu_table,
menu_table_status
WHERE
CURRENT_DATE BETWEEN menu_table_status.valid_from_date AND DATE_ADD(menu_table_status.valid_from_date, INTERVAL 1 MONTH)
AND CURRENT_DATE < menu_table_status.valid_to_date
AND menu_table.dish_id = menu_table_status.dish_id
UNION
SELECT
menu_table.dish_id,
menu_table.dish_title,
menu_table.marketing_desc,
menu_table_status.rrp_inc_gst,
menu_table_status.lowest_rrp,
menu_table_status.highest_rrp,
'c' as status
FROM
menu_table,
menu_table_status
WHERE
CURRENT_DATE > DATE_ADD(menu_table_status.valid_from_date, INTERVAL 1 MONTH)
AND CURRENT_DATE < menu_table_status.valid_to_date
AND menu_table.dish_id = menu_table_status.dish_id
UNION
SELECT
menu_table.dish_id,
menu_table.dish_title,
menu_table.marketing_desc,
menu_table_status.rrp_inc_gst,
menu_table_status.lowest_rrp,
menu_table_status.highest_rrp,
'r' as status
FROM
menu_table,
menu_table_status
WHERE
menu_table_status.valid_to_date
AND menu_table.dish_id NOT IN (SELECT inside_table1.dish_id
FROM menu_table_status AS inside_table1
WHERE CURRENT_DATE BETWEEN inside_table1.valid_from_date
AND inside_table1.valid_to_date)
AND menu_table_status.valid_to_date = (SELECT MAX(inside_table2.valid_to_date)
FROM menu_table_status AS inside_table2
WHERE inside_table2.dish_id = menu_table_status.dish_id)
AND menu_table.dish_id = menu_table_status.dish_id

Without looking at it too closely: you are certainly mixing up the dates in your last WHERE clause. In any case, your statement is far too complicated. Simply select all records (which you want to do anyway) and look at each record's dates to decide which status to assign:
SELECT
menu_table.dish_id,
menu_table.dish_title,
menu_table.marketing_desc,
menu_table_status.rrp_inc_gst,
menu_table_status.lowest_rrp,
menu_table_status.highest_rrp,
CASE
WHEN
CURRENT_DATE BETWEEN menu_table_status.valid_from_date AND DATE_ADD(menu_table_status.valid_from_date, INTERVAL 1 MONTH)
AND CURRENT_DATE < menu_table_status.valid_to_date
THEN 'n'
WHEN
CURRENT_DATE > DATE_ADD(menu_table_status.valid_from_date, INTERVAL 1 MONTH)
AND CURRENT_DATE < menu_table_status.valid_to_date
THEN 'c'
ELSE 'r'
END as status
FROM menu_table
INNER JOIN menu_table_status ON menu_table.dish_id = menu_table_status.dish_id;
BTW: Please don't use that old join syntax where you list all tables comma-separated. It's prone to errors, which is why there is a "new" syntax available as of 1992.
EDIT: I've spotted your error. Instead of checking CURRENT_DATE < menu_table_status.valid_to_date, you check menu_table_status.valid_to_date on its own, which treats the date as a boolean value; MySQL allows this, so you get no error.
One more remark: when unioning sets that are distinct (yours are, because of the different status letters), use UNION ALL, not UNION. UNION removes duplicates. Why have the DBMS check all your records when you know there are no duplicates?
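To check how a single CASE expression assigns the three status letters, here is a minimal sketch with Python's built-in sqlite3 module. Assumptions: SQLite stands in for MySQL, so date(x, '+1 month') replaces DATE_ADD(x, INTERVAL 1 MONTH); only the status table is modelled, with three invented rows; and "today" is passed in as a parameter instead of CURRENT_DATE so the output is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE menu_table_status (dish_id INTEGER, valid_from_date TEXT, valid_to_date TEXT);
    INSERT INTO menu_table_status VALUES
        (1, '2021-05-20', '2022-01-01'),  -- started less than a month before the test date
        (2, '2021-01-01', '2022-01-01'),  -- started long ago, still valid
        (3, '2020-01-01', '2021-01-01');  -- validity already ended
""")


def classify(conn, today):
    """Label each status row 'n' (new), 'c' (current) or 'r' (retired) as of today."""
    sql = """
        SELECT dish_id,
               CASE
                   WHEN :today BETWEEN valid_from_date AND date(valid_from_date, '+1 month')
                        AND :today < valid_to_date THEN 'n'
                   WHEN :today > date(valid_from_date, '+1 month')
                        AND :today < valid_to_date THEN 'c'
                   ELSE 'r'
               END AS status
        FROM menu_table_status
        ORDER BY dish_id
    """
    return conn.execute(sql, {"today": today}).fetchall()


statuses = classify(conn, "2021-06-01")
print(statuses)
```

Each row is read once and gets exactly one letter, so no UNION and no duplicate elimination are needed.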

If you don't need to do this in one go, I would recommend extracting step one into a temporary table, and then defining step two as a left join on dish_id against that temporary table, keeping the rows where its dish_id is NULL:
CREATE TEMPORARY TABLE step1 AS (
SELECT
mt.dish_id,
mt.dish_title,
mt.marketing_desc,
mts.rrp_inc_gst,
mts.lowest_rrp,
mts.highest_rrp,
(
if(CURRENT_DATE<DATE_ADD(mts.valid_from_date, INTERVAL 1 MONTH),
'n', 'c')
) as status
FROM
menu_table mt
JOIN menu_table_status mts ON mt.dish_id=mts.dish_id
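WHERE CURRENT_DATE BETWEEN mts.valid_from_date AND mts.valid_to_date - INTERVAL 1 DAY
);
-- Note: MySQL may refuse to reference a TEMPORARY table twice in one statement
-- ("Can't reopen table"); if it does, create step1 as a regular table instead.
SELECT step1.*
FROM step1
UNION
SELECT
mt.dish_id,
mt.dish_title,
mt.marketing_desc,
mts.rrp_inc_gst,
mts.lowest_rrp,
mts.highest_rrp,
'r' as status
FROM
menu_table mt
LEFT JOIN step1 s1 on s1.dish_id=mt.dish_id
JOIN menu_table_status mts ON mt.dish_id=mts.dish_id
WHERE s1.dish_id is NULL;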
