MySQL move rows from one table to another by condition - mysql

We have a MySQL database of users and the products they have sold and purchased. We need to move inactive users, meaning those who have not logged in for the last 3 years and have never purchased or sold anything on our website, to another table. Each table has millions of entries.
This is a table for sold items. Around 40m entries;
CREATE TABLE `sold` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`uid` int(10) unsigned NOT NULL,
`item` bigint(20) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `uid` (`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
This is a table for purchased items. Around 6m entries;
CREATE TABLE `purchased` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`uid` int(10) unsigned NOT NULL,
`item` bigint(20) DEFAULT '0',
PRIMARY KEY (`id`),
KEY `uid` (`uid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
This is a user table, around 17m entries;
CREATE TABLE `users` (
`uid` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(32) DEFAULT '',
`email` varchar(64) DEFAULT NULL,
`lastlogin` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`uid`),
UNIQUE KEY `email_index` (`email`),
KEY `lastlogin` (`lastlogin`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
This is the table where we need to move inactive users to
CREATE TABLE `inactiveusers` (
`uid` int(10) unsigned NOT NULL,
`name` varchar(32) DEFAULT '',
`email` varchar(64) DEFAULT NULL,
`lastlogin` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`uid`),
UNIQUE KEY `email_index` (`email`),
KEY `lastlogin` (`lastlogin`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Any suggestions on how to achieve this with minimum downtime?

@Yumoji, I guess your problem is how to determine whether a user had any activity in the last three years. Right?
If that is the case, you should add a "created_at" column to the sold and purchased tables.
Once the column is in place, you can easily find the activity of the last three years.
Here it is:
SELECT uid FROM sold WHERE created_at > DATE_SUB(NOW(), INTERVAL 3 YEAR)
UNION SELECT uid FROM purchased WHERE created_at > DATE_SUB(NOW(), INTERVAL 3 YEAR);
This query gives you the IDs of users with recent activity; the inactive users are the ones whose uid is not in that list. Then:
INSERT INTO inactiveusers SELECT * FROM users WHERE uid IN (/* the inactive user ids */);
DELETE FROM users WHERE uid IN (/* the inactive user ids */);
I hope this solves the problem. Cheers.
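For the move itself, one possible approach is to work in small uid ranges so that each transaction stays short and holds few locks. The following is only a sketch under assumptions, not a tested migration: the range 1 to 10000 is an arbitrary example batch, and "inactive" is taken from the question as a lastlogin older than three years with no rows at all in sold or purchased (so this variant does not need a created_at column).

```sql
-- One batch; repeat with the next uid range until the whole table is covered.
START TRANSACTION;

-- Copy users in this range who have not logged in for 3 years
-- and have no rows in sold or purchased.
INSERT INTO inactiveusers (uid, name, email, lastlogin)
SELECT u.uid, u.name, u.email, u.lastlogin
FROM users AS u
WHERE u.uid BETWEEN 1 AND 10000
  AND u.lastlogin < DATE_SUB(NOW(), INTERVAL 3 YEAR)
  AND NOT EXISTS (SELECT 1 FROM sold AS s WHERE s.uid = u.uid)
  AND NOT EXISTS (SELECT 1 FROM purchased AS p WHERE p.uid = u.uid);

-- Delete only the rows of this range that now exist in inactiveusers.
DELETE u
FROM users AS u
JOIN inactiveusers AS i ON i.uid = u.uid
WHERE u.uid BETWEEN 1 AND 10000;

COMMIT;
```

Both NOT EXISTS checks can use the existing uid indexes on sold and purchased. The batch size is a tuning knob: smaller ranges mean shorter locks on users at the cost of more round trips.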

Related

Partitioning large table by dates

I have implemented a custom URL shortener in my app and I have one table for it. The table structure looks like this:
CREATE TABLE `urls` (
`id` int(11) NOT NULL,
`url_id` varchar(10) DEFAULT NULL,
`long_url` varchar(255) DEFAULT NULL,
`clicked` mediumint(5) NOT NULL DEFAULT 0,
`user_id` varchar(7) DEFAULT NULL,
`type` varchar(15) DEFAULT NULL,
`ad_id` int(11) DEFAULT NULL,
`campaign` int(11) DEFAULT,
`increment` tinyint(1) NOT NULL DEFAULT 0,
`date` date DEFAULT NULL,
`del` enum('1','0') NOT NULL DEFAULT '0'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT
ALTER TABLE `urls`
ADD PRIMARY KEY (`id`),
ADD KEY `url_id` (`url_id`),
ADD KEY `type` (`type`),
ADD KEY `campaign` (`campaign`),
ADD KEY `ad_id` (`ad_id`),
ADD KEY `date` (`date`),
ADD KEY `user_id` (`user_id`);
The table now has 20,000,000 records and is currently growing by 300k-400k records per day.
The url_id column is a unique varchar(10), and a URL looks like this: http://example.com/asdfghjklu
I have now partitioned this table into 10 partitions by HASH(id):
PARTITION BY HASH (`id`)
PARTITIONS 10;
When I try to generate reports and join this table to others, the queries get really slow, so slow that I can't even get a one-week report.
Almost every big query on this table filters by date, so I think it would be much better to partition this table by the date column.
Is that a good idea?
As I understand it, to partition this table by date I need to include date in a composite primary key: PRIMARY KEY(id, date)
What do you think about this? How do I improve my query performance?
I would recommend hash partitioning on the date, month, or year:
CREATE TABLE `urls` (
`id` int(11) NOT NULL,
`url_id` varchar(10) DEFAULT NULL,
`long_url` varchar(255) DEFAULT NULL,
`clicked` mediumint(5) NOT NULL DEFAULT 0,
`user_id` varchar(7) DEFAULT NULL,
`type` varchar(15) DEFAULT NULL,
`ad_id` int(11) DEFAULT NULL,
`campaign` int(11) DEFAULT,
`increment` tinyint(1) NOT NULL DEFAULT 0,
`date` date DEFAULT NULL,
`del` enum('1','0') NOT NULL DEFAULT '0',
PartitionsID int(4) unsigned NOT NULL,
KEY PartitionsID (PartitionsID)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY HASH (PartitionsID)
PARTITIONS 366;
In PartitionsID you just need to insert TO_DAYS(date), so you have only one value for an entire day.
That makes it easy to partition by day, or you can partition by month instead, depending on your data size.
For selects, you can use the query below as an example:
SELECT *
FROM TT ACT
WHERE ACT.CustomerID = vCustomerID
AND ACT.TransactionTime BETWEEN vInvoiceEndDate AND vPaymentDueDate
AND ACT.TrxnInfoTypeID IN (19, 23)
AND ACT.PaymentType = '1'
AND ACT.PartitionsID BETWEEN TO_DAYS(vInvoiceEndDate) AND TO_DAYS(vPaymentDueDate);
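As an alternative to the extra PartitionsID column, the composite primary key idea from the question could be sketched roughly as follows. This is an illustration under assumptions: it presumes no row has a NULL date (a primary key column must be NOT NULL), the partition names and boundary dates are made up, and both statements rebuild the table, which will take a while on 20 million rows.

```sql
-- Every unique key on a partitioned table must include the partitioning
-- column, so `date` becomes part of the primary key.
ALTER TABLE urls
  MODIFY `date` date NOT NULL,
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (`id`, `date`);

-- Replace the HASH(id) partitioning with monthly ranges on `date`.
-- Partition names and boundaries are hypothetical examples.
ALTER TABLE urls
  PARTITION BY RANGE COLUMNS (`date`) (
    PARTITION p2019_01 VALUES LESS THAN ('2019-02-01'),
    PARTITION p2019_02 VALUES LESS THAN ('2019-03-01'),
    PARTITION pmax     VALUES LESS THAN (MAXVALUE)
  );

-- Check which partitions a one-week report would read.
EXPLAIN
SELECT user_id, SUM(clicked) AS clicks
FROM urls
WHERE `date` BETWEEN '2019-01-07' AND '2019-01-13'
GROUP BY user_id;
```

On MySQL 5.7 and later the EXPLAIN output has a partitions column; older versions need EXPLAIN PARTITIONS instead. If only p2019_01 is listed, pruning is working for that date filter.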

User and game database many-to-many connection

I'm trying to create a game database. I have already created a user table where the username, password and email are stored.
CREATE TABLE `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`username` varchar(50) DEFAULT NULL,
`password` char(40) DEFAULT NULL,
`email` varchar(60) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=111 DEFAULT CHARSET=utf8;
I have also created a games table where the game type, the name of the game, its duration, description, active flag, and the user who created the game are stored.
CREATE TABLE `games` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`type` varchar(30) DEFAULT NULL,
`name` varchar(60) DEFAULT NULL,
`duration` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`description` varchar(255) DEFAULT NULL,
`Active` tinyint(1) DEFAULT NULL,
`Completed` tinyint(1) DEFAULT NULL,
`createdBy` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=21 DEFAULT CHARSET=utf8;
The users will be able to create a game and invite x number of users, hence it will be a many-to-many relation. I have tried to create a table called active_games but I'm not sure how I should proceed. I need a connection so that I know who has created the game and who is playing that game.
CREATE TABLE `active_games` (
`user_id` int(11) NOT NULL DEFAULT '0',
`game_id` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`user_id`,`game_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
What would be the next step? If a user creates a game and sends the "invite" to other friends, is it possible to automatically assign a game ID and link the invited users to that game ID? And if I want to find all the active games for a specific user, how can I do that?
You need two more tables. UserGames would have information about users who create games:
CREATE TABLE UserGames (
UserGameId int not null auto_increment primary key,
userid int NOT NULL references users(id),
gameid int not null references games(id),
CreationDate datetime,
. . .
);
And GameInvites:
CREATE TABLE GameInvites (
GameInviteId int not null auto_increment primary key,
UserGameId int not null references UserGames(UserGameId),
invited_userid int NOT NULL references users(id),
AcceptedFlag bit,
. . .
);
The . . . represents additional information that you might want to store about each relationship.
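For the two remaining questions (assigning the game ID automatically and listing a user's active games), here is a minimal sketch that uses only the question's own games and active_games tables. The user IDs 42, 43 and 44 and the game values are hypothetical examples.

```sql
-- Create the game; games.id is assigned automatically by AUTO_INCREMENT.
INSERT INTO games (type, name, description, Active, Completed, createdBy)
VALUES ('quiz', 'Friday quiz', 'Example game', 1, 0, 42);

-- LAST_INSERT_ID() returns the id generated by the INSERT above
-- on this connection.
SET @game_id = LAST_INSERT_ID();

-- Link the creator and the invited users to the new game.
INSERT INTO active_games (user_id, game_id)
VALUES (42, @game_id), (43, @game_id), (44, @game_id);

-- All active games for user 43.
SELECT g.id, g.name, g.type, g.createdBy
FROM active_games AS ag
JOIN games AS g ON g.id = ag.game_id
WHERE ag.user_id = 43
  AND g.Active = 1;
```

Because LAST_INSERT_ID() is tracked per connection, concurrent game creation by other users should not interfere with the value read here.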

MySql - Create view to read from Multiple Tables

I have archived some old line items for invoices that are no longer current, but I still need to reference them. I think I need to create a VIEW, but I don't really understand how. Can someone help so I can run a query to pull the invoice and then the total of all the line items assigned (no matter which table the items are in)?
CREATE TABLE `Invoice` (
`Invoice_ID` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`Invoice_CreatedDateTime` DATETIME DEFAULT NULL,
`Invoice_Status` ENUM('Paid','Sent','Unsent','Hold') DEFAULT NULL,
`LastUpdatedAt` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`Invoice_ID`),
KEY `LastUpdatedAt` (`LastUpdatedAt`)
) ENGINE=MYISAM DEFAULT CHARSET=latin1
CREATE TABLE `Invoice_LineItem` (
`LineItem_ID` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`LineItem_ChargeType` VARCHAR(64) NOT NULL DEFAULT '',
`LineItem_InvoiceID` INT(11) UNSIGNED DEFAULT NULL,
`LineItem_Amount` DECIMAL(11,4) DEFAULT NULL,
`LastUpdatedAt` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`LineItem_ID`),
KEY `LastUpdatedAt` (`LastUpdatedAt`),
KEY `LineItem_InvoiceID` (`LineItem_InvoiceID`)
) ENGINE=MYISAM AUTO_INCREMENT=1 DEFAULT CHARSET=latin1
CREATE TABLE `Invoice_LineItem_Archived` (
`LineItem_ID` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`LineItem_ChargeType` VARCHAR(64) NOT NULL DEFAULT '',
`LineItem_InvoiceID` INT(11) UNSIGNED DEFAULT NULL,
`LineItem_Amount` DECIMAL(11,4) DEFAULT NULL,
`LastUpdatedAt` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`LineItem_ID`),
KEY `LastUpdatedAt` (`LastUpdatedAt`),
KEY `LineItem_InvoiceID` (`LineItem_InvoiceID`)
) ENGINE=MYISAM AUTO_INCREMENT=1 DEFAULT CHARSET=latin1
Typically I would just run the following query to get the amount due on the invoices
SELECT
Invoice_ID,
Invoice_CreatedDateTime,
Invoice_Status,
(SELECT SUM(LineItem_Amount) AS totAmt FROM Invoice_LineItem WHERE LineItem_InvoiceID=Invoice_ID) AS Invoice_Total
FROM
Invoice
WHERE
Invoice_Status='Sent'
Also how can I select all the line items from both tables in one query?
SELECT
LineItem_ID,
LineItem_ChargeType,
LineItem_Amount
FROM
Invoice_LineItem
WHERE
LineItem_InvoiceID='1234'
You can use the MERGE Storage Engine to create a virtual table that's the union of two real tables:
CREATE TABLE Invoice_LineItem_All
(
`LineItem_ID` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`LineItem_ChargeType` VARCHAR(64) NOT NULL DEFAULT '',
`LineItem_InvoiceID` INT(11) UNSIGNED DEFAULT NULL,
`LineItem_Amount` DECIMAL(11,4) DEFAULT NULL,
`LastUpdatedAt` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
KEY (`LineItem_ID`),
KEY `LastUpdatedAt` (`LastUpdatedAt`),
KEY `LineItem_InvoiceID` (`LineItem_InvoiceID`)
) ENGINE=MERGE UNION=(Invoice_LineItem_Archived, Invoice_LineItem);
You can use UNION:
SELECT a.* FROM a
UNION
SELECT b.* FROM b;
You just need to have the same number and types of columns in each query.
As far as I remember, you can add conditions in the sub-queries, but I'm not sure you can order the global result.
http://dev.mysql.com/doc/refman/4.1/en/union.html
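Applied to the tables in the question, the view that was asked for might look like the sketch below. The view name Invoice_LineItem_AllItems is a made-up example. UNION ALL is used rather than UNION so that two line items that happen to have identical values are not collapsed into one, which would distort the total.

```sql
-- Hypothetical view name; combines current and archived line items.
CREATE VIEW Invoice_LineItem_AllItems AS
SELECT LineItem_ID, LineItem_ChargeType, LineItem_InvoiceID, LineItem_Amount
FROM Invoice_LineItem
UNION ALL
SELECT LineItem_ID, LineItem_ChargeType, LineItem_InvoiceID, LineItem_Amount
FROM Invoice_LineItem_Archived;

-- Invoice totals across both tables.
SELECT i.Invoice_ID,
       i.Invoice_CreatedDateTime,
       i.Invoice_Status,
       (SELECT SUM(li.LineItem_Amount)
        FROM Invoice_LineItem_AllItems AS li
        WHERE li.LineItem_InvoiceID = i.Invoice_ID) AS Invoice_Total
FROM Invoice AS i
WHERE i.Invoice_Status = 'Sent';

-- All line items of one invoice, regardless of table.
SELECT LineItem_ID, LineItem_ChargeType, LineItem_Amount
FROM Invoice_LineItem_AllItems
WHERE LineItem_InvoiceID = 1234;
```

A view built on a UNION is typically materialized into a temporary table, so on large archives it may be worth comparing its speed against querying the two tables directly.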

MySQL - delete rows rejected on INNODB table

I have two tables - a user table and a userlog table.
CREATE TABLE `client_user` (
`id_client_user` int(11) NOT NULL auto_increment,
`Nom` varchar(45) NOT NULL,
`Prenom` varchar(45) NOT NULL,
`email` varchar(255) NOT NULL,
`userid` varchar(45) NOT NULL,
`password` varchar(45) NOT NULL,
`active` tinyint(1) NOT NULL default '0',
`lastaccess` timestamp NULL default NULL,
`user_must_change_pwd` tinyint(1) NOT NULL default '0',
PRIMARY KEY (`id_client_user`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `user_log` (
`id_user_log` int(11) NOT NULL auto_increment,
`access` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
`zone_updated` varchar(255) NOT NULL,
`id_client_user` int(11) NOT NULL,
PRIMARY KEY (`id_user_log`),
KEY `fk_user_log_client_user1` (`id_client_user`),
CONSTRAINT `fk_user_log_client_user1`
FOREIGN KEY (`id_client_user`)
REFERENCES `client_user` (`id_client_user`)
ON DELETE NO ACTION
ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I create a user in the client_user table and then his activity is logged within the user_log table.
I now need to delete rows in the user_log table.
This is rejected because of the foreign key constraint - that much I have understood.
After having looked at the documentation, I have not seen how I can change the foreign key to allow me to delete the user_log records.
What I need is a foreign key (1:n), client_user (1) to user_log (n), where user_log records can be deleted without impacting the associated client_user record.
I am sure that this is possible with innodb, but I cannot see how.
Help ?
From the specification
InnoDB supports the use of ALTER TABLE to drop foreign keys:
ALTER TABLE tbl_name DROP FOREIGN KEY fk_symbol;
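With the names from the question's schema filled in, that statement could be used roughly as below. The one-year cutoff in the DELETE is an assumed example, not something the question specifies.

```sql
-- Shows the table definition, including the foreign key's constraint name.
SHOW CREATE TABLE user_log;

-- Drop the constraint that appears in the question's schema.
ALTER TABLE user_log DROP FOREIGN KEY fk_user_log_client_user1;

-- Example cleanup: remove log rows older than one year.
DELETE FROM user_log
WHERE access < DATE_SUB(NOW(), INTERVAL 1 YEAR);
```

It is worth checking the exact error message before dropping anything. A foreign key of this shape normally blocks deleting a client_user row that still has log rows; it does not block deleting the user_log rows themselves, so the rejection may have another cause.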

SQL: Refactoring a multi-join query

I have a query that should be quite simple and yet it causes me a lot of headaches.
I have a simple ads system that requires filtering ads according to a few variables.
I need to limit the number of views/clicks per day and the total number of views/clicks for a given ad. Also each ad is linked to one or more slots in which the ad can appear. I have a table that saves the statistics that I need about each ad. Note that the statistics table changes very frequently.
These are the tables that I'm using:
CREATE TABLE `t_ads` (
`id` int(10) unsigned NOT NULL auto_increment,
`name` varchar(255) NOT NULL,
`content` text NOT NULL,
`is_active` tinyint(1) unsigned NOT NULL,
`start_date` date NOT NULL,
`end_date` date NOT NULL,
`max_views` int(10) unsigned NOT NULL,
`type` tinyint(3) unsigned NOT NULL default '0',
`refresh` smallint(5) unsigned NOT NULL default '0',
`max_clicks` int(10) unsigned NOT NULL,
`max_daily_clicks` int(10) unsigned NOT NULL,
`max_daily_views` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `t_ad_slots` (
`id` int(10) unsigned NOT NULL auto_increment ,
`name` varchar(255) NOT NULL,
`width` int(10) unsigned NOT NULL,
`height` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `t_ads_to_slots` (
`ad_id` int(10) unsigned NOT NULL,
`slot_id` int(10) unsigned NOT NULL,
`value` int(10) unsigned NOT NULL,
PRIMARY KEY (`ad_id`,`slot_id`),
KEY `slot_id` (`slot_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `t_ads_to_slots`
ADD CONSTRAINT `t_ads_to_slots_ibfk_1` FOREIGN KEY (`ad_id`) REFERENCES `t_ads` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
ADD CONSTRAINT `t_ads_to_slots_ibfk_2` FOREIGN KEY (`slot_id`) REFERENCES `t_ad_slots` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION;
CREATE TABLE `t_ad_stats` (
`ad_id` int(10) unsigned NOT NULL,
`slot_id` int(10) unsigned NOT NULL,
`date` date NOT NULL COMMENT,
`views` int(10) unsigned NOT NULL,
`unique_views` int(10) unsigned NOT NULL,
`clicks` int(10) unsigned NOT NULL default '0',
PRIMARY KEY (`ad_id`,`slot_id`,`date`),
KEY `slot_id` (`slot_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
ALTER TABLE `t_ad_stats`
ADD CONSTRAINT `t_ad_stats_ibfk_1` FOREIGN KEY (`ad_id`) REFERENCES `t_ads` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
ADD CONSTRAINT `t_ad_stats_ibfk_2` FOREIGN KEY (`slot_id`) REFERENCES `t_ad_slots` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION;
This is the query that I use to get ads for a given slot (Note that in this example I hard coded 20 as the slot id and 0,1,2 as the ad type, I get this data from a php script which invokes this query)
SELECT `ads`.`content`, `slots`.`value`, `ads`.`id`, `ads`.`refresh`, `ads`.`type`,
SUM(`total_stats`.`views`) AS "total_views",
SUM(`total_stats`.`clicks`) AS "total_clicks"
FROM (`t_ads` AS `ads`,
`t_ads_to_slots` AS `slots`)
LEFT JOIN `t_ad_stats` AS `total_stats`
ON `total_stats`.`ad_id` = `ads`.`id`
LEFT JOIN `t_ad_stats` AS `daily_stats`
ON (`daily_stats`.`ad_id` = `ads`.`id`) AND
(`daily_stats`.`date` = CURDATE())
WHERE (`ads`.`id` = `slots`.`ad_id`) AND
(`ads`.`type` IN(0,1,2)) AND
(`slots`.`slot_id` = 20) AND
(`ads`.`is_active` = 1) AND
(`ads`.`end_date` >= NOW()) AND
(`ads`.`start_date` <= NOW()) AND
((`ads`.`max_views` = 0) OR
(`ads`.`max_views` > "total_views")) AND
((`ads`.`max_clicks` = 0) OR
(`ads`.`max_clicks` > "total_clicks")) AND
((`ads`.`max_daily_clicks` = 0) OR
(`ads`.`max_daily_clicks` > IFNULL(`daily_stats`.`clicks`,0))) AND
((`ads`.`max_daily_views` = 0) OR
(`ads`.`max_daily_views` > IFNULL(`daily_stats`.`views`,0)))
GROUP BY (`ads`.`id`)
I believe that this query is self-explanatory, even though it's quite long. Note that the MySQL version that I'm using is: 5.0.51a-community. It seems to me that the big issue here is the double join to the stats table (I did that so that I would be able to get the data from a specific record and from multiple records (sum)).
How would you implement this query in order to get better results? (Note that I can't change from InnoDB).
Hopefully everything is clear about my question, but if that is not the case, please ask and I will clarify.
Thanks in advance,
Kfir
Add indexes to the following columns:
t_ads.is_active
t_ads.start_date
t_ads.end_date
Change the order of the primary key on t_ad_stats to:
(`ad_id`,`date`,`slot_id`)
or add a covering index to t_ad_stats on
(`ad_id`,`date`)
Change from 0 meaning "no limit" to 2147483647 meaning no limit, so you can change things like:
((`ads`.`max_views` = 0) OR (`ads`.`max_views` > "total_views"))
to
(`ads`.`max_views` > "total_views")
You could greatly improve this if you kept running totals instead of having to calculate them each time.
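In DDL terms, the index and "no limit" suggestions above might look like the following sketch. The index names are invented for illustration, and the UPDATE statements assume that 0 is currently used only to mean "no limit" in these columns.

```sql
-- Single-column indexes on the filter columns of t_ads (index names are hypothetical).
ALTER TABLE t_ads
  ADD INDEX idx_is_active (is_active),
  ADD INDEX idx_start_date (start_date),
  ADD INDEX idx_end_date (end_date);

-- Secondary index for looking up one ad's stats by day.
ALTER TABLE t_ad_stats
  ADD INDEX idx_ad_date (ad_id, `date`);

-- Replace the 0 sentinel with a large value so the "= 0 OR" branches can go.
UPDATE t_ads SET max_views        = 2147483647 WHERE max_views = 0;
UPDATE t_ads SET max_clicks       = 2147483647 WHERE max_clicks = 0;
UPDATE t_ads SET max_daily_views  = 2147483647 WHERE max_daily_views = 0;
UPDATE t_ads SET max_daily_clicks = 2147483647 WHERE max_daily_clicks = 0;
```

Any application code that writes 0 for "no limit" would have to change at the same time, otherwise new ads with 0 would never be shown.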
Expanding on a comment above I believe that the following columns should be indexed:
ads.id
ads.type
ads.start_date
ads.end_date
daily_stats.date
As well as these:
slots.slot_id
ads.is_active
And these as well:
ads.max_views
ads.max_clicks
ads.max_daily_clicks
ads.max_daily_views
daily_stats.clicks
daily_stats.views
Do note that applying indexes on these columns will speed up your SELECTs but slow down your INSERTs, since the indexes will need updating as well. But you don't have to apply all of this at once. You can do it incrementally and see how the performance shakes out for selects as well as inserts. If you cannot find a good middle ground, then I would suggest denormalization.
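Independently of indexing, the double join itself can be avoided. The sketch below is one possible rewrite, not a benchmarked solution: it joins t_ad_stats once, computes the daily figures with conditional sums, and moves the limit checks into HAVING. In the original query the conditions compare against "total_views" and "total_clicks", which MySQL in its default SQL mode reads as string literals rather than the aggregates, so the total limits were likely never enforced. The sketch assumes that daily limits apply per ad across all slots.

```sql
SELECT ads.content, slots.value, ads.id, ads.refresh, ads.type,
       ads.max_views, ads.max_clicks,
       ads.max_daily_views, ads.max_daily_clicks,
       IFNULL(SUM(stats.views), 0)  AS total_views,
       IFNULL(SUM(stats.clicks), 0) AS total_clicks,
       -- Today's figures come from the same join via conditional sums.
       IFNULL(SUM(IF(stats.`date` = CURDATE(), stats.views, 0)), 0)  AS daily_views,
       IFNULL(SUM(IF(stats.`date` = CURDATE(), stats.clicks, 0)), 0) AS daily_clicks
FROM t_ads AS ads
JOIN t_ads_to_slots AS slots ON slots.ad_id = ads.id
LEFT JOIN t_ad_stats AS stats ON stats.ad_id = ads.id
WHERE slots.slot_id = 20
  AND ads.type IN (0, 1, 2)
  AND ads.is_active = 1
  AND ads.end_date >= NOW()
  AND ads.start_date <= NOW()
GROUP BY ads.id, slots.value
-- Aggregates can only be filtered after grouping, hence HAVING.
HAVING (ads.max_views = 0 OR ads.max_views > total_views)
   AND (ads.max_clicks = 0 OR ads.max_clicks > total_clicks)
   AND (ads.max_daily_views = 0 OR ads.max_daily_views > daily_views)
   AND (ads.max_daily_clicks = 0 OR ads.max_daily_clicks > daily_clicks);
```

The max_* columns are included in the select list because MySQL's HAVING clause may only refer to grouped columns, aggregates, and select-list columns. The query still sums every stats row of each candidate ad, so running totals remain the bigger win if the stats table keeps growing.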