Find and Delete Duplicate rows in MySQL


I'm having trouble finding duplicates in a database table with the following setup:
==========================================================================
| stock_id | product_id | store_id | stock_qty | updated_at |
==========================================================================
| 9990 | 51 | 1 | 13 | 2014-10-25 16:30:01 |
| 9991 | 90 | 2 | 5 | 2014-10-25 16:30:01 |
| 9992 | 161 | 1 | 3 | 2014-10-25 16:30:01 |
| 9993 | 254 | 1 | 18 | 2014-10-25 16:30:01 |
| 9994 | 284 | 2 | 12 | 2014-10-25 16:30:01 |
| 9995 | 51 | 1 | 11 | 2014-10-25 17:30:02 |
| 9996 | 90 | 2 | 5 | 2014-10-25 17:30:02 |
| 9997 | 161 | 1 | 3 | 2014-10-25 17:30:02 |
| 9998 | 254 | 1 | 16 | 2014-10-25 17:30:02 |
| 9999 | 284 | 2 | 12 | 2014-10-25 17:30:02 |
==========================================================================
Stock updates are imported into this table every hour. I'm trying to find duplicate stock entries (any rows that have a matching product id and store id) so I can delete the oldest. The query below is my attempt: by comparing product ids and store ids on a self-join, I can find one set of duplicates:
SELECT s.`stock_id`, s.`product_id`, s.`store_id`, s.`stock_qty`, s.`updated_at`
FROM `stock` s
INNER JOIN `stock` j ON s.`product_id`=j.`product_id` AND s.`store_id`=j.`store_id`
GROUP BY `stock_id`
HAVING COUNT(*) > 1
ORDER BY s.updated_at DESC, s.product_id ASC, s.store_id ASC, s.stock_id ASC;
While this query works, it doesn't find ALL duplicates, only one set. If an import goes awry and isn't noticed until the morning, we could be left with tons of duplicate stock entries. My MySQL skills are sadly lacking, and I'm at a complete loss about how to find and delete all duplicates in a fast, reliable manner.
Any help or ideas are welcome. Thanks

You can use this query:
DELETE st FROM stock st, stock st2
WHERE st.stock_id < st2.stock_id AND st.product_id = st2.product_id AND
st.store_id = st2.store_id;
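This query deletes every row for which another row with the same product_id and store_id and a higher stock_id exists, so only the row with the highest stock_id is kept for each pair. That is the latest record as long as stock_id increases with each import.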
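To try the idea without touching a real MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. SQLite has no multi-table DELETE, so the self-join is written as a correlated EXISTS, which should behave the same way. The helper name delete_older_duplicates and the extra row 10000 are made up for illustration; the other rows come from the question.

```python
import sqlite3


def delete_older_duplicates(conn):
    """Delete each stock row that has a newer sibling (same product_id and
    store_id, higher stock_id). Hypothetical helper, SQLite syntax."""
    conn.execute(
        """
        DELETE FROM stock
        WHERE EXISTS (
            SELECT 1
            FROM stock AS newer
            WHERE newer.product_id = stock.product_id
              AND newer.store_id = stock.store_id
              AND newer.stock_id > stock.stock_id
        )
        """
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (stock_id INTEGER PRIMARY KEY, product_id INT, "
    "store_id INT, stock_qty INT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO stock VALUES (?, ?, ?, ?, ?)",
    [
        (9990, 51, 1, 13, "2014-10-25 16:30:01"),
        (9991, 90, 2, 5, "2014-10-25 16:30:01"),
        (9995, 51, 1, 11, "2014-10-25 17:30:02"),
        (9996, 90, 2, 5, "2014-10-25 17:30:02"),
        # Invented third import, to show that more than one old copy is removed.
        (10000, 51, 1, 9, "2014-10-25 18:30:02"),
    ],
)

delete_older_duplicates(conn)
remaining = [row[0] for row in conn.execute("SELECT stock_id FROM stock ORDER BY stock_id")]
print(remaining)
```

Only the newest row per product and store should survive, no matter how many older copies existed.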

A self-join on store_id and product_id with an 'is older' condition, combined with DISTINCT, should give you all rows for which a newer version also exists:
> SHOW CREATE TABLE stock;
CREATE TABLE `stock` (
`stock_id` int(11) NOT NULL,
`product_id` int(11) DEFAULT NULL,
`store_id` int(11) DEFAULT NULL,
`stock_qty` int(11) DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`stock_id`)
);
> select * from stock;
+----------+------------+----------+-----------+---------------------+
| stock_id | product_id | store_id | stock_qty | updated_at |
+----------+------------+----------+-----------+---------------------+
| 1 | 1 | 1 | 1 | 2001-01-01 12:00:00 |
| 2 | 2 | 2 | 1 | 2001-01-01 12:00:00 |
| 3 | 2 | 2 | 1 | 2002-01-01 12:00:00 |
+----------+------------+----------+-----------+---------------------+
> SELECT DISTINCT s1.stock_id, s1.store_id, s1.product_id, s1.updated_at
FROM stock s1 JOIN stock s2
ON s1.store_id = s2.store_id
AND s1.product_id = s2.product_id
AND s1.updated_at < s2.updated_at;
+----------+----------+------------+---------------------+
| stock_id | store_id | product_id | updated_at |
+----------+----------+------------+---------------------+
| 2 | 2 | 2 | 2001-01-01 12:00:00 |
+----------+----------+------------+---------------------+
> DELETE stock FROM stock
JOIN stock s2 ON stock.store_id = s2.store_id
AND stock.product_id = s2.product_id
AND stock.updated_at < s2.updated_at;
Query OK, 1 row affected (0.02 sec)
> select * from stock;
+----------+------------+----------+-----------+---------------------+
| stock_id | product_id | store_id | stock_qty | updated_at |
+----------+------------+----------+-----------+---------------------+
| 1 | 1 | 1 | 1 | 2001-01-01 12:00:00 |
| 3 | 2 | 2 | 1 | 2002-01-01 12:00:00 |
+----------+------------+----------+-----------+---------------------+

Or you can use a stored procedure:
DROP PROCEDURE IF EXISTS removeDuplicates;
DELIMITER //
CREATE PROCEDURE removeDuplicates(
stockID INT
)
BEGIN
DECLARE stockToKeep INT;
DECLARE storeID INT;
DECLARE productID INT;
-- gets the store and product value
SELECT DISTINCT store_id, product_id
FROM stock
WHERE stock_id = stockID
LIMIT 1
INTO
storeID, productID;
SELECT stock_id
FROM stock
WHERE product_id = productID AND store_id = storeID
ORDER BY updated_at DESC
LIMIT 1
INTO
stockToKeep;
DELETE FROM stock
WHERE product_id = productID AND store_id = storeID
AND stock_id != stockToKeep;
END //
DELIMITER ;
Afterwards, call it for every row via a cursor procedure. The cursor iterates over the stock_id values, and each call cleans up the product id and store id pair that the row belongs to:
DELIMITER //
CREATE PROCEDURE updateTable() BEGIN
DECLARE done BOOLEAN DEFAULT FALSE;
DECLARE stockID INT UNSIGNED;
DECLARE cur CURSOR FOR SELECT DISTINCT stock_id FROM stock;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done := TRUE;
OPEN cur;
testLoop: LOOP
FETCH cur INTO stockID;
IF done THEN
LEAVE testLoop;
END IF;
CALL removeDuplicates(stockID);
END LOOP testLoop;
CLOSE cur;
END//
DELIMITER ;
Then just call the second procedure:
CALL updateTable();

Related

MySQL subquery select first row for each group

I need to create a MySQL stored procedure that selects, for each User, the SUM of all the Points they've earned.
The query should group Game by StartDate and only select the first row of each group ordered by Points. I'm trying to ignore duplicate StartDate values for each User but still keep the first one. This should avoid cheating if the User saves the same game twice.
If the User hasn't been in any Game, the query should still return the User, with a NULL Value.
CREATE PROCEDURE `spGetPoints`(
IN _StartDate DATETIME,
IN _EndDate DATETIME,
IN _Limit INT,
IN _Offset INT
)
BEGIN
SELECT `User`.`UserId`, `User`.`Username`,
(SELECT SUM(`Game`.`Points`)
FROM `Game`
WHERE `Game`.`UserId` = `User`.`UserId` AND
`Game`.`StartDate` > _StartDate AND `Game`.`StartDate` < _EndDate
GROUP BY `Game`.`StartDate`
ORDER BY `Game`.`Points` DESC
LIMIT 1
) AS `Value`
FROM `User`
ORDER BY `Value` DESC, `User`.`Username` ASC
LIMIT _Limit OFFSET _Offset;
END
Sample User Table
+--------+----------+
| UserId | Username |
+--------+----------+
| 1 | JaneDoe |
| 2 | JohnDoe |
+--------+----------+
Sample Game Table
+--------+--------+-------------------------+--------+
| GameId | UserId | StartDate | Points |
+--------+--------+-------------------------+--------+
| 1 | 1 | 2019-01-09 12:43:00 AM | 1789 |
| 2 | 1 | 2019-01-09 11:35:00 AM | 1048 |
| 3 | 1 | 2019-01-09 9:22:00 AM | 900 |
| 4 | 1 | 2019-01-09 12:43:00 AM | 1789 |
| 5 | 1 | 2019-01-09 11:35:00 AM | 1048 |
| 6 | 1 | 2019-01-09 9:22:00 AM | 900 |
| 7 | 1 | 2019-01-09 12:43:00 AM | 1789 |
| 8 | 1 | 2019-01-09 11:35:00 AM | 1048 |
| 9 | 2 | 2019-01-17 12:05:00 AM | 552 |
| 10 | 2 | 2019-01-24 12:08:00 AM | 512 |
| 11 | 2 | 2019-01-27 5:13:00 PM | 0 |
+--------+--------+-------------------------+--------+
Current Result
+--------+----------+-------+
| UserId | Username | Value |
+--------+----------+-------+
| 1 | JaneDoe | 5367 |
| 2 | JohnDoe | 552 |
+--------+----------+-------+
Expected Result
+--------+----------+-------+
| UserId | Username | Value |
+--------+----------+-------+
| 1 | JaneDoe | 3737 |
| 2 | JohnDoe | 1064 |
+--------+----------+-------+
I was able to get the expected result with the following statement by selecting the SUM from a subquery and hardcoding the UserId.
SELECT SUM(`x`.`Points`) FROM
(SELECT `Points`
FROM `Game`
WHERE `Game`.`UserId` = 1 AND
`Game`.`StartDate` > STR_TO_DATE('01/09/2019', '%m/%d/%Y') AND `Game`.`StartDate` < STR_TO_DATE('02/09/2019', '%m/%d/%Y')
GROUP BY `Game`.`StartDate`
ORDER BY `Game`.`Points` ASC) AS `x`;
When I try to put that statement in a subquery, as in the following statement, I get the error message "Error Code: 1054. Unknown column 'User.UserId' in 'where clause'". I get this error because UserId isn't visible in the second (nested) subquery.
SELECT `User`.`UserId`, `User`.`Username`,
(SELECT SUM(`x`.`Points`) FROM (SELECT `Game`.`Points`
FROM `Game`
WHERE `Game`.`UserId` = `User`.`UserId` AND
`Game`.`StartDate` > STR_TO_DATE('01/09/2019', '%m/%d/%Y') AND `Game`.`StartDate` < STR_TO_DATE('02/09/2019', '%m/%d/%Y')
GROUP BY `Game`.`StartDate`
ORDER BY `Game`.`Points` DESC) AS `x`
) AS `Value`
FROM `User`
ORDER BY `Value` DESC, `User`.`Username` ASC;

I changed the query to use a LEFT JOIN on Game. I also added GROUP BY `Game`.`UserId`, `Game`.`StartDate` inside the joined subquery and GROUP BY `User`.`UserId` on the outer query.
CREATE PROCEDURE `spGetPoints`(
IN _StartDate DATETIME,
IN _EndDate DATETIME,
IN _Limit INT,
IN _Offset INT
)
BEGIN
SELECT `User`.`UserId`, `User`.`Username`,
SUM(`Game`.`Points`) AS `Value`
FROM `User`
LEFT JOIN (SELECT *
FROM `Game`
WHERE `Game`.`StartDate` > _StartDate AND `Game`.`StartDate` < _EndDate
GROUP BY `Game`.`UserId`, `Game`.`StartDate`
ORDER BY `Game`.`Points`
) AS `Game` ON `User`.`UserId` = `Game`.`UserId`
GROUP BY `User`.`UserId`
ORDER BY `Value` DESC, `User`.`Username` ASC
LIMIT _Limit OFFSET _Offset;
END
This link also helped. Select first row in each GROUP BY group?
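For anyone who wants to check the numbers, here is a small sketch of the same idea using Python's sqlite3 module rather than MySQL. One deliberate difference: the derived table selects MAX(Points) per (UserId, StartDate) instead of SELECT *, because selecting non-aggregated columns alongside GROUP BY is rejected when ONLY_FULL_GROUP_BY is enabled. The function name points_per_user and the third user NoGames are invented for the example; the game rows are the ones from the question, rewritten as 24-hour timestamps.

```python
import sqlite3


def points_per_user(conn, start, end):
    """Sum one Points value per (UserId, StartDate) so that a game saved
    twice only counts once. Hypothetical helper, SQLite syntax."""
    return conn.execute(
        """
        SELECT u.UserId, u.Username, SUM(g.Points) AS Value
        FROM User AS u
        LEFT JOIN (
            SELECT UserId, StartDate, MAX(Points) AS Points
            FROM Game
            WHERE StartDate > ? AND StartDate < ?
            GROUP BY UserId, StartDate
        ) AS g ON g.UserId = u.UserId
        GROUP BY u.UserId, u.Username
        ORDER BY Value DESC, u.Username ASC
        """,
        (start, end),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE User (UserId INTEGER PRIMARY KEY, Username TEXT)")
conn.execute(
    "CREATE TABLE Game (GameId INTEGER PRIMARY KEY, UserId INT, StartDate TEXT, Points INT)"
)
conn.executemany(
    "INSERT INTO User VALUES (?, ?)",
    [(1, "JaneDoe"), (2, "JohnDoe"), (3, "NoGames")],  # user 3 is invented
)
conn.executemany(
    "INSERT INTO Game VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2019-01-09 00:43:00", 1789),
        (2, 1, "2019-01-09 11:35:00", 1048),
        (3, 1, "2019-01-09 09:22:00", 900),
        (4, 1, "2019-01-09 00:43:00", 1789),
        (5, 1, "2019-01-09 11:35:00", 1048),
        (6, 1, "2019-01-09 09:22:00", 900),
        (7, 1, "2019-01-09 00:43:00", 1789),
        (8, 1, "2019-01-09 11:35:00", 1048),
        (9, 2, "2019-01-17 00:05:00", 552),
        (10, 2, "2019-01-24 00:08:00", 512),
        (11, 2, "2019-01-27 17:13:00", 0),
    ],
)

totals = points_per_user(conn, "2019-01-09 00:00:00", "2019-02-09 00:00:00")
print(totals)
```

With the sample data this should reproduce the expected 3737 and 1064, plus a NULL (None in Python) for the user without games.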

Join a table and calculate a percentage from this new table

I'm trying to build a report of financial data for my company.
I currently have two tables:
___BillableDatas:
|--------|------------|----------|----------|--------------|---------------------|
| BIL_Id | BIL_Date | BIL_Type | BIL_Rate | BIL_Quantity | BIL_ApplicableTaxes |
|--------|------------|----------|----------|--------------|---------------------|
| 1 | 2017-01-01 | Night | 95 | 1 | 1 |
| 2 | 2017-01-02 | Night | 95 | 1 | 1 |
| 3 | 2017-01-15 | Night | 105 | 1 | 1 |
| 4 | 2017-01-15 | Item | 8 | 2 | 1,2 |
| 5 | 2017-02-14 | Night | 95 | 1 | 1 |
| 6 | 2017-02-15 | Night | 95 | 1 | 1 |
| 7 | 2017-02-16 | Night | 95 | 1 | 1 |
| 8 | 2017-03-20 | Night | 89 | 1 | 1 |
| 9 | 2017-03-21 | Night | 89 | 1 | 1 |
| 10 | 2017-03-21 | Item | 8 | 3 | 1,2 |
|--------|------------|----------|----------|--------------|---------------------|
___SalesTaxes:
|--------|------------|
| STX_Id | STX_Amount |
|--------|------------|
| 1 | 14.00 |
| 2 | 5.00 |
|--------|------------|
I need to know, for each month, the sum of my revenue with and without taxes.
I can already build the report, but I don't know how to bring in the rates from the ___SalesTaxes table.
What I have so far:
SELECT month(BIL_Date) AS month,
sum(BIL_Rate * BIL_Quantity) AS sumval
FROM `___BillableDatas`
WHERE BIL_Date BETWEEN "2017-01-01" AND "2017-12-31"
AND (BIL_Type = "Night" OR BIL_Type = "Item")
GROUP BY year(BIL_Date), month(BIL_Date)
Thanks for your help.

As kbball mentioned, you have an unresolved many-to-many relationship in your main table. A properly designed table should never hold more than one value per field. Resolving a many-to-many relationship is quite simple: create a new table, named something like bill_taxType. Besides the standard primary key, the new table has two fields: bill_id and applicable tax id. For a row with a 1,2 value, such as bill id 4, the new table will look like this:
primary key, bill id, applicable tax id
1 4 1
2 4 2
In your final query you will join all three together on the appropriate primary key-foreign key relationship. This final query should have the data that you need.
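A rough sketch of that three-table join is below, using Python's sqlite3 module as a stand-in for MySQL. The table and column names (bills, sales_taxes, bill_taxes) and the helper monthly_revenue are invented for the example, the junction table uses a composite key for brevity, and the sketch assumes STX_Amount is a percentage and that several taxes on one bill simply add up. The question does not confirm either assumption.

```python
import sqlite3


def monthly_revenue(conn):
    """Return (month, revenue without tax, revenue with tax) per month.
    Assumes tax amounts are percentages that add up per bill."""
    return conn.execute(
        """
        SELECT strftime('%m', b.bil_date) AS bill_month,
               SUM(b.bil_rate * b.bil_quantity) AS without_tax,
               ROUND(SUM(b.bil_rate * b.bil_quantity
                         * (1 + COALESCE(t.pct, 0) / 100.0)), 2) AS with_tax
        FROM bills AS b
        LEFT JOIN (
            -- Total tax percentage per bill, resolved through the junction table.
            SELECT bt.bil_id, SUM(st.stx_amount) AS pct
            FROM bill_taxes AS bt
            JOIN sales_taxes AS st ON st.stx_id = bt.stx_id
            GROUP BY bt.bil_id
        ) AS t ON t.bil_id = b.bil_id
        GROUP BY bill_month
        ORDER BY bill_month
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bills (bil_id INTEGER PRIMARY KEY, bil_date TEXT,
                        bil_type TEXT, bil_rate REAL, bil_quantity INT);
    CREATE TABLE sales_taxes (stx_id INTEGER PRIMARY KEY, stx_amount REAL);
    CREATE TABLE bill_taxes (bil_id INT, stx_id INT, PRIMARY KEY (bil_id, stx_id));
    """
)
conn.executemany(
    "INSERT INTO bills VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2017-01-01", "Night", 95, 1), (2, "2017-01-02", "Night", 95, 1),
        (3, "2017-01-15", "Night", 105, 1), (4, "2017-01-15", "Item", 8, 2),
        (5, "2017-02-14", "Night", 95, 1), (6, "2017-02-15", "Night", 95, 1),
        (7, "2017-02-16", "Night", 95, 1), (8, "2017-03-20", "Night", 89, 1),
        (9, "2017-03-21", "Night", 89, 1), (10, "2017-03-21", "Item", 8, 3),
    ],
)
conn.executemany("INSERT INTO sales_taxes VALUES (?, ?)", [(1, 14.0), (2, 5.0)])
# One junction row per (bill, tax): bills 4 and 10 carry taxes 1 and 2.
conn.executemany(
    "INSERT INTO bill_taxes VALUES (?, ?)",
    [(b, 1) for b in range(1, 11)] + [(4, 2), (10, 2)],
)

report = monthly_revenue(conn)
for row in report:
    print(row)
```

On MySQL you would likely use MONTH(bil_date) in place of strftime; the join structure stays the same.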

This will work. I've created the following example, which should help a lot with debugging and implementation. Note that it is written in SQL Server (T-SQL) syntax and uses temp tables. Try to implement it as below:
If(OBJECT_ID('tempdb..#___BillableDatas') Is Not Null)
Begin
Drop Table #___BillableDatas
End
If(OBJECT_ID('tempdb..#___SalesTaxes') Is Not Null)
Begin
Drop Table #___SalesTaxes
End
CREATE TABLE #___BillableDatas
(
BIL_Id INT IDENTITY (1,1),
BIL_Date DATETIME,
BIL_Type VARCHAR(50),
BIL_Rate FLOAT,
BIL_Quantity INT,
BIL_ApplicableTaxes VARCHAR(10)
);
INSERT INTO #___BillableDatas (BIL_Date,BIL_Type,BIL_Rate,BIL_Quantity,BIL_ApplicableTaxes)
VALUES ('2017-01-01','Night',95,1,'1'),
('2017-01-02','Night',95,1,'1'),
('2017-01-15','Night',105,1,'1'),
('2017-01-15','Item',8,2,'1,2'),
('2017-02-14','Night',95,1,'1'),
('2017-02-15','Night',95,1,'1'),
('2017-02-16','Night',95,1,'1'),
('2017-03-20','Night',89,1,'1'),
('2017-03-21','Night',89,1,'1'),
('2017-03-21','Item',8,3,'1,2')
CREATE TABLE #___SalesTaxes
(
STX_Id INT IDENTITY (1,1),
STX_Amount FLOAT
);
INSERT INTO #___SalesTaxes (STX_Amount) VALUES (14.00),(5.00)
-----------------------------------------------------------------
SELECT * FROM #___BillableDatas
SELECT * FROM #___SalesTaxes
SELECT MONTH(BD.BIL_Date) AS [Month],SUM(BD.BIL_Rate * BD.BIL_Quantity) AS 'Without Tax'
,(SUM(BD.BIL_Rate * BD.BIL_Quantity)+((SUM(BD.BIL_Rate * BD.BIL_Quantity)/100)*BD.Tax1)) AS 'With Tax 1'
,(SUM(BD.BIL_Rate * BD.BIL_Quantity)+((SUM(BD.BIL_Rate * BD.BIL_Quantity)/100)*BD.Tax2)) AS 'With Tax 2'
FROM
(
SELECT *,
(SELECT ST1.STX_Amount FROM Func_Split(BIL_ApplicableTaxes,',') AS F LEFT JOIN #___SalesTaxes AS ST1 ON F.Element=ST1.STX_Id WHERE F.Element='1') AS Tax1 ,
(SELECT ST1.STX_Amount FROM Func_Split(BIL_ApplicableTaxes,',') AS F LEFT JOIN #___SalesTaxes AS ST1 ON F.Element=ST1.STX_Id WHERE F.Element='2') AS Tax2
FROM #___BillableDatas) AS BD
WHERE BD.BIL_Date BETWEEN '2017-01-01' AND '2017-12-31' AND (BD.BIL_Type = 'Night' OR BD.BIL_Type = 'Item')
GROUP BY YEAR(BD.BIL_Date), MONTH(BD.BIL_Date),BD.Tax1,BD.Tax2
The solution above requires the function Func_Split. Use this:
CREATE FUNCTION [dbo].[func_Split]
(
    @DelimitedString varchar(8000),
    @Delimiter varchar(100)
)
RETURNS @tblArray TABLE
(
    ElementID int IDENTITY(1,1), -- Array index
    Element varchar(1000) -- Array element contents
)
AS
BEGIN
    -- Local Variable Declarations
    -- ---------------------------
    DECLARE @Index smallint,
            @Start smallint,
            @DelSize smallint
    SET @DelSize = LEN(@Delimiter)
    -- Loop through source string and add elements to destination table array
    -- ----------------------------------------------------------------------
    WHILE LEN(@DelimitedString) > 0
    BEGIN
        SET @Index = CHARINDEX(@Delimiter, @DelimitedString)
        IF @Index = 0
        BEGIN
            INSERT INTO
                @tblArray
                (Element)
            VALUES
                (LTRIM(RTRIM(@DelimitedString)))
            BREAK
        END
        ELSE
        BEGIN
            INSERT INTO
                @tblArray
                (Element)
            VALUES
                (LTRIM(RTRIM(SUBSTRING(@DelimitedString, 1, @Index - 1))))
            SET @Start = @Index + @DelSize
            SET @DelimitedString = SUBSTRING(@DelimitedString, @Start, LEN(@DelimitedString) - @Start + 1)
        END
    END
    RETURN
END

Database to track visits vs dynamic targets

I'm designing a new database to track accounts' achieved visits against monthly targets. The final report will be requested for one account and a start/end date range, and should show each month in between with its monthly target and sum of visits.
The complication started when I learned that there are more than ten thousand accounts and that targets may change monthly, but only for some accounts (i.e. each target has a start and end date; if there is no end date, the target remains valid indefinitely). At this point I'm lost and need help.
For simplicity, assume I have a table of date periods and the simplest situation, as follows:
accounts
+----+---------+
|id | name |
+----+---------+
| 1 | account1|
| 2 | account2|
+----+---------+
targets
+---+------------+------------+-----------+----------------+
|id | account_id | start_date | end_date | monthly_target |
+---+------------+------------+-----------+----------------+
|1 | 1 | 1-1-2016 | 31-1-2016 | 5 |
|2 | 1 | 1-2-2016 | 31-5-2016 | 4 |
|3 | 1 | 1-7-2016 | null | 7 |
|4 | 2 | 1-1-2016 | null | 10 |
+---+------------+------------+-----------+----------------+
visits
+---+-----------+------------+
|id | date | account_id |
+---+-----------+------------+
|1 | 15-1-2016 | 1 |
|2 | 20-1-2016 | 1 |
|3 | 10-5-2016 | 1 |
|4 | 20-5-2016 | 1 |
|5 | 20-5-2016 | 2 |
+---+-----------+------------+
calendar (Optional)
+---------+----------+
|start | end |
+---------+----------+
|1-1-2016 | 31-1-2016|
|1-2-2016 | 29-2-2016|
|1-3-2016 | 31-3-2016|
|1-4-2016 | 30-4-2016|
|1-5-2016 | 31-5-2016|
|1-6-2016 | 30-6-2016|
|1-7-2016 | 31-7-2016|
|1-8-2016 | 31-8-2016|
+---------+----------+
Expected report for account1 coverage from 1-4-2016 to 31-7-2016
+---------+-----------+--------+----+
|start | end | target | sum|
+---------+-----------+--------+----+
|1-4-2016 | 30-4-2016 | 4 | 0 |
|1-5-2016 | 31-5-2016 | 4 | 2 |
|1-6-2016 | 30-6-2016 | 0 | 0 |
|1-7-2016 | 31-7-2016 | 7 | 0 |
+---------+-----------+--------+----+
I can accept changes to my initial design if it causes problems, but I assume the current design of the targets table is the most practical one for the system admin.
I need help with the SQL needed to generate the final report.

I modified the date ranges in targets to have an explicit end date, even if that means the end of the year. By avoiding the NULL, the SQL can do its range comparisons properly. The dates also use the ISO 8601 standard format. The query is implemented in a stored procedure that takes 3 parameters: account_id, start date, and end date.
The derived table aliased v prevents double counting, compared with a flat-out LEFT JOIN against the visits table. For instance, that 2 would be an errant 7 without that strategy. It uses the LAST_DAY() function to group the visits by month.
Schema:
create table accounts
( id int not null,
name varchar(100) not null
);
insert accounts values
(1,'account1'),
(2,'account2');
-- drop table targets;
create table targets
( id int not null,
account_id int not null,
start_date date not null,
end_date date not null,
monthly_target int not null
);
-- truncate targets;
insert targets values
(1,1,'2016-01-01','2016-01-31',5),
(2,1,'2016-02-01','2016-05-31',4),
(3,1,'2016-07-01','2016-12-31',7),
(4,2,'2016-01-01','2016-12-31',10);
create table visits
( id int not null,
date date not null,
account_id int not null
);
-- truncate visits;
insert visits values
(1,'2016-01-15',1),
(2,'2016-01-20',1),
(3,'2016-05-10',1),
(4,'2016-05-20',1),
(5,'2016-05-20',2);
create table calendar
( start date not null,
end date not null
);
insert calendar values
('2016-01-01','2016-01-31'),
('2016-02-01','2016-02-29'),
('2016-03-01','2016-03-31'),
('2016-04-01','2016-04-30'),
('2016-05-01','2016-05-31'),
('2016-06-01','2016-06-30'),
('2016-07-01','2016-07-31'),
('2016-08-01','2016-08-31'),
('2016-09-01','2016-09-30'),
('2016-10-01','2016-10-31'),
('2016-11-01','2016-11-30'),
('2016-12-01','2016-12-31');
Stored Proc:
DROP PROCEDURE IF EXISTS uspGetRangeReport007;
DELIMITER $$
CREATE PROCEDURE uspGetRangeReport007
( p_account_id INT,
p_start DATE,
p_end DATE
)
BEGIN
SELECT c.start,c.end,
IFNULL(t.monthly_target,0) as target,
-- IFNULL(sum(v.id),0) as visits
IFNULL(v.theCount,0) as visits
FROM calendar c
LEFT JOIN targets t
ON account_id=p_account_id
AND c.start BETWEEN t.start_date AND t.end_date
AND c.end BETWEEN t.start_date AND t.end_date
LEFT JOIN
( SELECT LAST_DAY(date) as lastDayOfMonth,
count(id) as theCount
FROM visits
WHERE account_id=p_account_id
GROUP BY LAST_DAY(date)
) v
ON v.lastDayOfMonth BETWEEN c.start AND c.end
WHERE c.start BETWEEN p_start AND p_end
AND c.end BETWEEN p_start AND p_end
GROUP BY c.start,c.end,t.monthly_target
ORDER BY c.start;
END;$$
DELIMITER ;
Test:
call uspGetRangeReport007(1,'2016-04-01','2016-07-31');
+------------+------------+--------+--------+
| start | end | target | visits |
+------------+------------+--------+--------+
| 2016-04-01 | 2016-04-30 | 4 | 0 |
| 2016-05-01 | 2016-05-31 | 4 | 2 |
| 2016-06-01 | 2016-06-30 | 0 | 0 |
| 2016-07-01 | 2016-07-31 | 7 | 0 |
+------------+------------+--------+--------+

SELECT c.start,
c.end,
t.monthly_target AS target,
(
SELECT COUNT(*)
FROM visits
WHERE `date` BETWEEN c.start AND c.end
AND account_id = ? -- Specify '1'
) AS `sum` -- Correlated subquery for counting visits
FROM calendar AS c
JOIN targets AS t ON t.account_id = ? -- Same account as above
AND c.start >= t.start_date
AND ( t.end_date IS NULL
OR c.start < t.end_date )
WHERE c.start >= ? -- Specify date range
AND c.end <= ?
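To check the report against the question's data, including the open-ended targets (NULL end_date) and the month without any target, here is a sketch using Python's sqlite3 module in place of MySQL. It uses a LEFT JOIN so that June still shows up with a target of 0. The helper name range_report is invented, and the calendar and visit date columns are renamed (month_start, month_end, visit_date) to avoid keyword clashes in SQLite.

```python
import sqlite3


def range_report(conn, account_id, start, end):
    """Return (month_start, month_end, target, visit_count) per calendar month.
    A NULL end_date is treated as 'still valid'. Hypothetical helper."""
    return conn.execute(
        """
        SELECT c.month_start, c.month_end,
               COALESCE(t.monthly_target, 0) AS target,
               (SELECT COUNT(*)
                  FROM visits AS v
                 WHERE v.account_id = :account
                   AND v.visit_date BETWEEN c.month_start AND c.month_end
               ) AS visit_count
        FROM calendar AS c
        LEFT JOIN targets AS t
               ON t.account_id = :account
              AND c.month_start >= t.start_date
              AND (t.end_date IS NULL OR c.month_end <= t.end_date)
        WHERE c.month_start >= :start AND c.month_end <= :end
        ORDER BY c.month_start
        """,
        {"account": account_id, "start": start, "end": end},
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE targets (id INT, account_id INT, start_date TEXT,
                          end_date TEXT, monthly_target INT);
    CREATE TABLE visits (id INT, visit_date TEXT, account_id INT);
    CREATE TABLE calendar (month_start TEXT, month_end TEXT);
    """
)
conn.executemany(
    "INSERT INTO targets VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "2016-01-01", "2016-01-31", 5),
        (2, 1, "2016-02-01", "2016-05-31", 4),
        (3, 1, "2016-07-01", None, 7),   # open-ended target
        (4, 2, "2016-01-01", None, 10),  # open-ended target
    ],
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1, "2016-01-15", 1), (2, "2016-01-20", 1), (3, "2016-05-10", 1),
        (4, "2016-05-20", 1), (5, "2016-05-20", 2),
    ],
)
conn.executemany(
    "INSERT INTO calendar VALUES (?, ?)",
    [
        ("2016-01-01", "2016-01-31"), ("2016-02-01", "2016-02-29"),
        ("2016-03-01", "2016-03-31"), ("2016-04-01", "2016-04-30"),
        ("2016-05-01", "2016-05-31"), ("2016-06-01", "2016-06-30"),
        ("2016-07-01", "2016-07-31"), ("2016-08-01", "2016-08-31"),
    ],
)

report = range_report(conn, 1, "2016-04-01", "2016-07-31")
for row in report:
    print(row)
```

This relies on targets for one account not overlapping; overlapping ranges would produce more than one row for the same month.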

Update the next row of the target row in MySQL

Suppose I have a table that tracks if a payment is missed like this:
+----+---------+------------+------------+---------+--------+
| id | loan_id | amount_due | due_at | paid_at | missed |
+----+---------+------------+------------+---------+--------+
| 1 | 1 | 100 | 2013-08-17 | NULL | NULL |
| 5 | 1 | 100 | 2013-09-17 | NULL | NULL |
| 7 | 1 | 100 | 2013-10-17 | NULL | NULL |
+----+---------+------------+------------+---------+--------+
And, for example, I ran a query that checks if a payment is missed like this:
UPDATE loan_payments
SET missed = 1
WHERE DATEDIFF(NOW(), due_at) >= 10
AND paid_at IS NULL
Then suppose the row with id = 1 is affected. I want the amount_due of the row with id = 1 to be added to the amount_due of the next row, so the table would look like this:
+----+---------+------------+------------+---------+--------+
| id | loan_id | amount_due | due_at | paid_at | missed |
+----+---------+------------+------------+---------+--------+
| 1 | 1 | 100 | 2013-08-17 | NULL | 1 |
| 5 | 1 | 200 | 2013-09-17 | NULL | NULL |
| 7 | 1 | 100 | 2013-10-17 | NULL | NULL |
+----+---------+------------+------------+---------+--------+
Any advice on how to do it?
Thanks

Take a look at this :
SQL Fiddle
MySQL 5.5.32 Schema Setup:
CREATE TABLE loan_payments
(`id` int, `loan_id` int, `amount_due` int,
`due_at` varchar(10), `paid_at` varchar(4), `missed` varchar(4))
;
INSERT INTO loan_payments
(`id`, `loan_id`, `amount_due`, `due_at`, `paid_at`, `missed`)
VALUES
(1, 1, 100, '2013-09-17', NULL, NULL),
(3, 2, 100, '2013-09-17', NULL, NULL),
(5, 1, 100, '2013-10-17', NULL, NULL),
(7, 1, 100, '2013-11-17', NULL, NULL)
;
UPDATE loan_payments AS l
LEFT OUTER JOIN (SELECT loan_id, MIN(ID) AS ID
FROM loan_payments
WHERE DATEDIFF(NOW(), due_at) < 0
GROUP BY loan_id) AS l2 ON l.loan_id = l2.loan_id
LEFT OUTER JOIN loan_payments AS l3 ON l2.id = l3.id
SET l.missed = 1, l3.amount_due = l3.amount_due + l.amount_due
WHERE DATEDIFF(NOW(), l.due_at) >= 10
AND l.paid_at IS NULL
;
Query 1:
SELECT *
FROM loan_payments
Results:
| ID | LOAN_ID | AMOUNT_DUE | DUE_AT | PAID_AT | MISSED |
|----|---------|------------|------------|---------|--------|
| 1 | 1 | 100 | 2013-09-17 | (null) | 1 |
| 3 | 2 | 100 | 2013-09-17 | (null) | 1 |
| 5 | 1 | 200 | 2013-10-17 | (null) | (null) |
| 7 | 1 | 100 | 2013-11-17 | (null) | (null) |

Unfortunately I don't have time at the moment to write out full-blown SQL, but here's the pseudocode I think you need to implement:
select all DISTINCT loan_id from table loan_payments
for each loan_id:
set missed = 1 for all outstanding payments for loan_id (as determined by date)
select the sum of all outstanding payments for loan_id
add this sum to the amount_due for the loan's next due date after today
Refer to this for how to loop using pure MySQL: http://dev.mysql.com/doc/refman/5.7/en/cursors.html
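The pseudocode above can be sketched in application code instead of a MySQL cursor. The version below uses Python's sqlite3 module as a stand-in for MySQL and is only one possible reading of the steps. The function name roll_over_missed, the today parameter, and the extra "missed IS NULL" filter (so that re-running it does not carry the same amount twice) are my own assumptions, not part of the original.

```python
import sqlite3


def roll_over_missed(conn, today, grace_days=10):
    """For each loan: flag unpaid payments overdue by at least grace_days as
    missed and add their total to the next payment due after `today`."""
    loan_ids = [r[0] for r in conn.execute("SELECT DISTINCT loan_id FROM loan_payments")]
    for loan_id in loan_ids:
        overdue = conn.execute(
            """
            SELECT id, amount_due FROM loan_payments
            WHERE loan_id = ? AND paid_at IS NULL AND missed IS NULL
              AND julianday(?) - julianday(due_at) >= ?
            """,
            (loan_id, today, grace_days),
        ).fetchall()
        if not overdue:
            continue
        conn.executemany(
            "UPDATE loan_payments SET missed = 1 WHERE id = ?",
            [(payment_id,) for payment_id, _ in overdue],
        )
        carried = sum(amount for _, amount in overdue)
        next_payment = conn.execute(
            """
            SELECT id FROM loan_payments
            WHERE loan_id = ? AND paid_at IS NULL AND due_at > ?
            ORDER BY due_at LIMIT 1
            """,
            (loan_id, today),
        ).fetchone()
        if next_payment is not None:
            conn.execute(
                "UPDATE loan_payments SET amount_due = amount_due + ? WHERE id = ?",
                (carried, next_payment[0]),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loan_payments (id INTEGER PRIMARY KEY, loan_id INT, "
    "amount_due INT, due_at TEXT, paid_at TEXT, missed INT)"
)
conn.executemany(
    "INSERT INTO loan_payments VALUES (?, ?, ?, ?, NULL, NULL)",
    [(1, 1, 100, "2013-08-17"), (5, 1, 100, "2013-09-17"), (7, 1, 100, "2013-10-17")],
)

roll_over_missed(conn, today="2013-08-28")
rows = conn.execute("SELECT id, amount_due, missed FROM loan_payments ORDER BY id").fetchall()
print(rows)
```

With the question's three rows and a run date of 2013-08-28, row 1 should be flagged as missed and its 100 carried over to row 5.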

I fixed my own problem by adding a missed_at field. I store the current timestamp in a variable ($now) before updating the first row to missed = 1 and missed_at = $now, and then run this query to update the next row's amount_due:
UPDATE loan_payments lp1 JOIN loan_payments lp2 ON lp1.due_at > lp2.due_at
SET lp1.amount_due = lp2.amount_due + lp1.amount_due
WHERE lp2.missed_at = $now AND DATEDIFF(lp1.due_at, lp2.due_at) <= DAYOFMONTH(LAST_DAY(lp1.due_at))
I wish I could just add LIMIT 1 to that query, but it turns out that's not possible for an UPDATE query with a JOIN.
So all in all, I used two queries to achieve what I want. It did the trick.
Please advise if you have better solutions.
Thanks!

query to fetch records and their rank in the DB

I have a table that holds usernames and results.
When a user inserts his result into the DB, I want to execute a query that returns
the top X results (with their rank in the DB) and also returns that user's result
and his rank in the DB.
The result should look like this:
1 playername 4500
2 otherplayer 4100
3 anotherone 3900
...
134 current player 140
I have tried a query with UNION, but then I didn't get the current player's rank.
Any ideas?
The DB is MySQL.
Thanks a lot and have a great weekend :)
EDIT
This is what I have tried:
(select substr(first_name,1,10) as first_name, result
FROM top_scores ts
WHERE result_date >= NOW() - INTERVAL 1 DAY
LIMIT 10)
union
(select substr(first_name,1,10) as first_name, result
FROM top_scores ts
where first_name='XXX' and result=3030);

SET @X = 0;
SELECT @X:=@X+1 AS `rank`, username, result
FROM myTable
ORDER BY result DESC
LIMIT 10;
Re your comment:
How about this:
SET @X = 0;
SELECT ranked.*
FROM (
SELECT @X:=@X+1 AS `rank`, username, result
FROM myTable
ORDER BY result DESC
) AS ranked
WHERE ranked.`rank` <= 10 OR username = 'current';

Based on what I am reading here:
Your table structure is:
+--------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------+-------------+------+-----+---------+-------+
| name | varchar(50) | YES | | NULL | |
| result | int(11) | YES | | NULL | |
+--------+-------------+------+-----+---------+-------+
Table Data looks like:
+---------+--------+
| name | result |
+---------+--------+
| Player1 | 4500 |
| Player2 | 4100 |
| Player3 | 3900 |
| Player4 | 3800 |
| Player5 | 3700 |
| Player6 | 3600 |
| Player7 | 3500 |
| Player8 | 3400 |
+---------+--------+
You want a result set to look like this:
+------+---------+--------+
| rank | name | result |
+------+---------+--------+
| 1 | Player1 | 4500 |
| 2 | Player2 | 4100 |
| 3 | Player3 | 3900 |
| 4 | Player4 | 3800 |
| 5 | Player5 | 3700 |
| 6 | Player6 | 3600 |
| 7 | Player7 | 3500 |
| 8 | Player8 | 3400 |
+------+---------+--------+
SQL:
set @rank = 0;
select
top_scores.*
from
(select ranks.* from (select @rank:=@rank+1 AS `rank`, name, result from ranks order by result desc) ranks) top_scores
where
top_scores.`rank` <= 5
or (top_scores.result = 3400 and name ='Player8');
That will do what you want it to do

Assuming your table has the following columns:
playername
score
calculated_rank
your query should look something like this:
select calculated_rank,playername, score
from tablename
order by calculated_rank limit 5

I assume you have PRIMARY KEY on this table. If you don't, just create one. My table structure (because you didn't supply your own) is like this:
id INTEGER
result INTEGER
first_name VARCHAR
The SQL query should look like this:
(SELECT @i := @i+1 AS position, first_name, result FROM top_scores, (SELECT @i := 0) t ORDER BY result DESC LIMIT 10)
UNION
(SELECT (SELECT COUNT(id) FROM top_scores t2 WHERE t2.result > t1.result AND t2.id > t1.id) AS position, first_name, result FROM top_scores t1 WHERE id = LAST_INSERT_ID());
I added additional condition into subquery ("AND t2.id > t1.id") to prevent multiple people with same result having same position.
EDIT: If you have some login system, it would be better to save userid with result and get current user result using it.
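As an alternative that avoids user variables, the position can be computed with a correlated COUNT: a row's rank is one plus the number of rows that beat it, with ties broken by the lower id. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the helper name leaderboard, its parameters, and the tie-break rule are assumptions for illustration. The sample scores are the ones from the earlier answer.

```python
import sqlite3


def leaderboard(conn, top_n, current_id):
    """Return (position, first_name, result) for the top_n rows plus the row
    with id = current_id. Ties are broken by lower id (an assumption)."""
    return conn.execute(
        """
        SELECT r.position, r.first_name, r.result
        FROM (
            SELECT t1.id, t1.first_name, t1.result,
                   1 + (SELECT COUNT(*)
                          FROM top_scores AS t2
                         WHERE t2.result > t1.result
                            OR (t2.result = t1.result AND t2.id < t1.id)
                       ) AS position
            FROM top_scores AS t1
        ) AS r
        WHERE r.position <= :top_n OR r.id = :current_id
        ORDER BY r.position
        """,
        {"top_n": top_n, "current_id": current_id},
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE top_scores (id INTEGER PRIMARY KEY, first_name TEXT, result INT)"
)
conn.executemany(
    "INSERT INTO top_scores VALUES (?, ?, ?)",
    [
        (1, "Player1", 4500), (2, "Player2", 4100), (3, "Player3", 3900),
        (4, "Player4", 3800), (5, "Player5", 3700), (6, "Player6", 3600),
        (7, "Player7", 3500), (8, "Player8", 3400),
    ],
)

board = leaderboard(conn, top_n=3, current_id=8)
for row in board:
    print(row)
```

If the current player is already in the top rows, he appears only once, because the filter is a single OR rather than a UNION. The correlated count scans the table once per row, so on a large table a precomputed rank may be a better fit.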
