MySQL INSERT multiple rows if certain values don't exist - mysql

I have the following query which works correctly:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
SELECT 10, 'user_other_unit_moved', now(), 8383
FROM Events
WHERE NOT EXISTS (SELECT event_ID FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383)
LIMIT 1;
What the query does is check to see if a row exists in my Events table that matches the event type and unit ID I wish to INSERT. If it finds an existing record, it does not proceed with the INSERT. However, if it does not find a record then it proceeds with the INSERT.
This is the structure of my Events table:
CREATE TABLE `Events` (
`event_ID` int(11) NOT NULL,
`user_ID` int(11) NOT NULL,
`event_type` varchar(35) NOT NULL,
`event_creation_datetime` datetime NOT NULL,
`unit_ID` int(10) UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
ALTER TABLE `Events`
ADD PRIMARY KEY (`event_ID`),
ADD KEY `unit_ID` (`unit_ID`);
ALTER TABLE `Events`
MODIFY `event_ID` int(11) NOT NULL AUTO_INCREMENT;
COMMIT;
The problem I have is trying to get the above query to work correctly when trying to INSERT multiple rows. I know how to INSERT multiple rows using comma delimited VALUES, but no matter what I try I get syntax errors. Here is the query I have been playing with:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
VALUES (
(SELECT 10, 'user_other_unit_moved', now(), 8383
FROM Events
WHERE NOT EXISTS (SELECT event_ID FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383)
LIMIT 1)),
(SELECT 10, 'user_other_unit_moved', now(), 8380
FROM Events
WHERE NOT EXISTS (SELECT event_ID FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8380)
LIMIT 1))
);
However, no matter what I try (inserting, removing parentheses etc.) I get either the generic "You have an error in your SQL syntax;" or "Operand should contain only 1 column".
I have also tried this alternative based on other StackOverflow posts:
INSERT IGNORE INTO Events (event_ID, user_ID, event_type, event_creation_datetime, unit_ID)
VALUES
(SELECT (SELECT event_ID FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383), 10, 'user_other_unit_moved', now(), 8383),
(SELECT (SELECT event_ID FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383), 10, 'user_other_unit_moved', now(), 8383);
But this fails with "Can't specify target table for update in FROM clause" even if I try to return results using temporary tables.
Is it just an error with my syntax, or am I trying to do something not possible with the way my query is laid out? And if it's just an error, how would I write the query so that it works as I've intended? Note that I do not want to use multi-queries - I want the query to work as one statement.
Thanks,
Arj

Don't use VALUES. Use INSERT ... SELECT without FROM Events, and combine the rows with UNION ALL.
This code works for MySQL 5.6:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
SELECT *
FROM (
SELECT 10 user_ID, 'user_other_unit_moved' event_type,
now() event_creation_datetime, 8383 unit_ID
UNION ALL
SELECT 10, 'user_other_unit_moved', now(), 8380
) t
WHERE NOT EXISTS (
SELECT 1 FROM Events e
WHERE e.event_type = t.event_type AND e.unit_ID = t.unit_ID
);
This code works for MySQL 5.7+:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
SELECT * FROM (
SELECT 10, 'user_other_unit_moved', now(), 8383
WHERE NOT EXISTS (SELECT 1 FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383)
UNION ALL
SELECT 10, 'user_other_unit_moved', now(), 8380
WHERE NOT EXISTS (SELECT 1 FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8380)
) t;
And this for MySQL 8.0+:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
SELECT 10, 'user_other_unit_moved', now(), 8383
WHERE NOT EXISTS (SELECT 1 FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8383)
UNION ALL
SELECT 10, 'user_other_unit_moved', now(), 8380
WHERE NOT EXISTS (SELECT 1 FROM Events WHERE event_type = 'user_other_unit_moved' AND unit_ID = 8380);
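The derived-table variant (the 5.6-style query above) can be exercised end to end with a small harness. This is only a sketch under stated assumptions: it runs against an in-memory SQLite database as a stand-in for MySQL (so `datetime('now')` replaces `now()`), and the helper names `make_events_db`, `insert_missing_events` and `count_events` are made up for illustration.

```python
import sqlite3


def make_events_db():
    """Create an in-memory stand-in for the Events table (SQLite, not MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Events (
            event_ID INTEGER PRIMARY KEY AUTOINCREMENT,
            user_ID INTEGER NOT NULL,
            event_type TEXT NOT NULL,
            event_creation_datetime TEXT NOT NULL,
            unit_ID INTEGER NOT NULL
        )
        """
    )
    return conn


def insert_missing_events(conn, rows):
    """Insert each (user_ID, event_type, unit_ID) row that has no matching event yet."""
    if not rows:
        return
    # One SELECT per candidate row, glued together with UNION ALL.
    candidates = " UNION ALL ".join(
        "SELECT ? AS user_ID, ? AS event_type, ? AS unit_ID" for _ in rows
    )
    sql = f"""
        INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
        SELECT t.user_ID, t.event_type, datetime('now'), t.unit_ID
        FROM ({candidates}) AS t
        WHERE NOT EXISTS (
            SELECT 1 FROM Events e
            WHERE e.event_type = t.event_type AND e.unit_ID = t.unit_ID
        )
    """
    # Flatten the row tuples into one parameter list, in placeholder order.
    params = [value for row in rows for value in row]
    conn.execute(sql, params)
    conn.commit()


def count_events(conn):
    """Return the number of rows currently in Events."""
    return conn.execute("SELECT COUNT(*) FROM Events").fetchone()[0]
```

Calling `insert_missing_events` twice with the same rows should leave the row count unchanged after the first call, because the second call finds a match for every candidate.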

Although you can write this with just union all:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
SELECT x.user_id, x.event_type, now(), x.unit_id
FROM (SELECT 10 as user_id, 8383 as unit_id, 'user_other_unit_moved' as event_type
) x
WHERE NOT EXISTS (SELECT 1 FROM Events e2 WHERE e2.event_type = x.event_type AND e2.unit_ID = x.unit_ID)
UNION ALL
SELECT x.user_id, x.event_type, now(), x.unit_id
FROM (SELECT 10 as user_id, 8380 as unit_id, 'user_other_unit_moved' as event_type
) x
WHERE NOT EXISTS (SELECT 1 FROM Events e2 WHERE e2.event_type = x.event_type AND e2.unit_ID = x.unit_ID);
I suspect there is a better way. If a unit_id can have only one row for each event type, then you should specify that using a unique constraint or index:
CREATE UNIQUE INDEX unq_events_unit_id_event_type ON Events(unit_ID, event_type);
It is better to have the database ensure integrity. In particular, your version is subject to race conditions, and to duplicates being inserted within the same statement.
Then you can use ON DUPLICATE KEY UPDATE to prevent duplicate rows:
INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
VALUES (10, 'user_other_unit_moved', now(), 8383),
(10, 'user_other_unit_moved', now(), 8380)
ON DUPLICATE KEY UPDATE unit_ID = VALUES(unit_ID);
The update actually does nothing (because unit_ID already has that value). But it does prevent an error and a duplicate row from being inserted.
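The unique-index approach can be tried out with a small harness. This is a sketch, not MySQL itself: it uses an in-memory SQLite database, where the closest analogue of ON DUPLICATE KEY UPDATE is `ON CONFLICT ... DO UPDATE` (this assumes SQLite 3.24 or newer). The helper names `make_events_db` and `add_events` are invented for the example.

```python
import sqlite3


def make_events_db():
    """In-memory SQLite stand-in for Events, with the unique index in place."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Events (
            event_ID INTEGER PRIMARY KEY AUTOINCREMENT,
            user_ID INTEGER NOT NULL,
            event_type TEXT NOT NULL,
            event_creation_datetime TEXT NOT NULL,
            unit_ID INTEGER NOT NULL
        )
        """
    )
    conn.execute(
        "CREATE UNIQUE INDEX unq_events_unit_id_event_type "
        "ON Events (unit_ID, event_type)"
    )
    return conn


def add_events(conn, rows):
    """Insert all (user_ID, event_type, unit_ID) rows in one multi-row statement."""
    if not rows:
        return
    values = ", ".join("(?, ?, datetime('now'), ?)" for _ in rows)
    sql = f"""
        INSERT INTO Events (user_ID, event_type, event_creation_datetime, unit_ID)
        VALUES {values}
        ON CONFLICT (unit_ID, event_type)
        DO UPDATE SET unit_ID = excluded.unit_ID  -- no-op update, as in the answer
    """
    params = [value for row in rows for value in row]
    conn.execute(sql, params)
    conn.commit()
```

Because the index does the checking, a duplicate inside the same statement is also absorbed, which the NOT EXISTS versions do not guarantee.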

Related

MySQL select and then insert or update if exists

I need to find multiple rows related to users and then insert them into another table, or update if a record exists for the current day.
I am doing it this way:
SELECT CASE WHEN
(
SELECT
DISTINCT `userid`,
COUNT(DISTINCT `userip`,`userid`) AS `count`,
#date:=UNIX_TIMESTAMP(CURDATE())
FROM `tablename`
WHERE (`date` >= UNIX_TIMESTAMP(CURDATE())) GROUP BY `userid`
)
THEN
(
UPDATE `tablename2` SET `count`=`count`,`userid`=`userid`,`date`=`date` WHERE `date` >= UNIX_TIMESTAMP(CURDATE()))
)
ELSE
(
INSERT INTO `tablename2` (`count`,`userid`,`date`) VALUES(`count`,`userid`,`date`);
)
END
But this gives me a syntax error near UPDATE.
How can I fix this?
I am guessing that you want one row per user and date in tablename2. If so, enforce this rule with a unique index:
CREATE UNIQUE INDEX idx_tablename2_userid_date ON `tablename2`(`userid`, `date`);
Then the database enforces it.
Your SQL is a mess, but I think I can see what you are trying to do. The basic idea is INSERT . . . ON DUPLICATE KEY UPDATE. I think the following does what you want:
INSERT INTO `tablename2` (`userid`, `count`, `date`)
SELECT `userid`, COUNT(DISTINCT `userip`, `userid`) AS `count`,
UNIX_TIMESTAMP(CURDATE())
FROM `tablename`
WHERE `date` >= UNIX_TIMESTAMP(CURDATE())
GROUP BY `userid`
ON DUPLICATE KEY UPDATE `count` = VALUES(`count`);

How to optimize a query with multiple GROUP BYs, subqueries and WHERE IN over a large table?

I am working on a scraping project to crawl items and their scores over different schedules. A schedule is a user-defined period (date) when the script is intended to run.
Table structure is as follows:
--
-- Table structure for table `test_join`
--
CREATE TABLE IF NOT EXISTS `test_join` (
`schedule_id` int(11) NOT NULL,
`player_name` varchar(50) NOT NULL,
`type` enum('celebrity','sportsperson') NOT NULL,
`score` int(11) NOT NULL,
PRIMARY KEY (`schedule_id`,`player_name`,`type`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
--
-- Dumping data for table `test_join`
--
INSERT INTO `test_join` (`schedule_id`, `player_name`, `type`, `score`) VALUES
(1, 'sachin', 'sportsperson', 100),
(1, 'ganguly', 'sportsperson', 80),
(1, 'dravid', 'sportsperson', 60),
(1, 'sachin', 'celebrity', 100),
(2, 'sachin', 'sportsperson', 120),
(2, 'ganguly', 'sportsperson', 100),
(2, 'sachin', 'celebrity', 120);
The scraping is done over periods, and each schedule is expected to have 10k+ entries. Schedules could be created on a daily basis, so the data would grow to around 2 million rows in 5-6 months.
Over this data I need to run queries that aggregate the players who appear in every schedule within a selected range of schedules.
For example:
I need to aggregate the players who appear across multiple schedules. If schedules 1 and 2 are selected, only items that appear in both schedules will be selected.
I am using the following query to aggregate results based on the type.
For schedule 1:
SELECT fullt.type,COUNT(*) as count,SUM(fullt.score) FROM
(SELECT tj.*
FROM `test_join` tj
RIGHT JOIN
(SELECT `player_name`,`type`,COUNT(`schedule_id`) as c FROM `test_join` WHERE `schedule_id` IN (1,2) GROUP BY `player_name`,`type` HAVING c=2) stj
on tj.player_name = stj.player_name
WHERE tj.`schedule_id`=1
GROUP BY tj.`type`,tj.`player_name`)AS fullt
GROUP BY fullt.type
Reason for c = 2:
WHERE `schedule_id` IN (1,2) GROUP BY `player_name`,`type` HAVING c=2
Here we are selecting two schedules, 1 and 2. Hence the count 2 makes the query fetch only records that belong to both schedules, i.e. that occur twice.
It would generate results as follows:
Schedule 1: expected results
Schedule 2: expected results
These are my expected results, and the query returns them as shown above.
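The "count equals the number of selected schedules" idea can be checked against the sample rows with a short harness. This is a sketch under assumptions: it uses an in-memory SQLite database rather than MySQL, the helper names are hypothetical, and unlike the query above it joins the derived table on both player_name and type, which is my own choice rather than something taken from the question.

```python
import sqlite3

SAMPLE_ROWS = [
    (1, "sachin", "sportsperson", 100),
    (1, "ganguly", "sportsperson", 80),
    (1, "dravid", "sportsperson", 60),
    (1, "sachin", "celebrity", 100),
    (2, "sachin", "sportsperson", 120),
    (2, "ganguly", "sportsperson", 100),
    (2, "sachin", "celebrity", 120),
]


def make_test_join_db():
    """Load the sample data into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE test_join (
            schedule_id INTEGER NOT NULL,
            player_name TEXT NOT NULL,
            type TEXT NOT NULL,
            score INTEGER NOT NULL,
            PRIMARY KEY (schedule_id, player_name, type)
        )
        """
    )
    conn.executemany("INSERT INTO test_join VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
    return conn


def aggregate_common_players(conn, schedule_ids, report_schedule):
    """Per type: count and total score, in report_schedule, of the
    (player, type) pairs that appear in every schedule of schedule_ids."""
    placeholders = ", ".join("?" for _ in schedule_ids)
    sql = f"""
        SELECT tj.type, COUNT(*) AS player_count, SUM(tj.score) AS total_score
        FROM test_join tj
        JOIN (
            SELECT player_name, type
            FROM test_join
            WHERE schedule_id IN ({placeholders})
            GROUP BY player_name, type
            -- the primary key allows one row per schedule, so a count equal
            -- to the number of schedules means "present in all of them"
            HAVING COUNT(*) = ?
        ) common
          ON common.player_name = tj.player_name AND common.type = tj.type
        WHERE tj.schedule_id = ?
        GROUP BY tj.type
        ORDER BY tj.type
    """
    params = [*schedule_ids, len(schedule_ids), report_schedule]
    return conn.execute(sql, params).fetchall()
```

With the sample data, dravid appears only in schedule 1 and is therefore left out of the totals for either schedule.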
(In the real case I have to work across pretty big MySQL tables.)
As I understand it, subqueries, WHERE IN, comparisons on varchar columns and multiple GROUP BYs all hurt query performance.
I need the aggregate results in real time, so query speed as well as standards are a concern. How could this be optimized for better performance in this context?
EDIT:
I have now reduced the subqueries:
SELECT fullt.type,COUNT(*) as count,SUM(fullt.score) FROM (
SELECT t.*
FROM `test_join` t
INNER JOIN test_join t1 ON t.`player_name` = t1.player_name AND t1.schedule_id = 1
INNER JOIN test_join t2 ON t.player_name = t2.player_name AND t2.schedule_id = 2
WHERE t.schedule_id = 2
GROUP BY t.`player_name`,t.`type`) AS fullt
GROUP BY fullt.type
Is this a better way to do it? I have replaced WHERE IN with JOINs.
Any advice would be highly appreciated. I would be happy to provide any supporting information if needed.
Try the SQL query below in MySQL:
SELECT tj.`type`,COUNT(*) as count,SUM(tj.`score`) FROM
`test_join` tj
where tj.`schedule_id`=1
and `player_name` in
(
select tj1.`player_name` from `test_join` tj1
group by tj1.`player_name` having count(tj1.`player_name`) > 1
)
group by tj.`type`
Actually, I tried the same data in Sybase, as I don't have MySQL installed on my machine. It worked as expected!
CREATE TABLE #test_join
(
schedule_id int NOT NULL,
player_name varchar(50) NOT NULL,
type1 varchar(15) NOT NULL,
score int NOT NULL,
)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES
(1, 'sachin', 'sportsperson', 100)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(1, 'ganguly', 'sportsperson', 80)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(1, 'dravid', 'sportsperson', 60)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(1, 'sachin', 'celebrity', 100)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(2, 'sachin', 'sportsperson', 120)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(2, 'ganguly', 'sportsperson', 100)
INSERT INTO #test_join (schedule_id, player_name, type1, score) VALUES(2, 'sachin', 'celebrity', 120)
select * from #test_join
Print 'Solution #1 : Inner join'
select type1,count(*),sum(score) from
#test_join
where schedule_id=1 and player_name in (select player_name from #test_join t1 group by player_name having count(player_name) > 1 )
group by type1
select player_name,type1,sum(score) Score into #test_join_temp
from #test_join
group by player_name,type1
having count(player_name) > 1
Print 'Solution #2 using Temp Table'
--select * from #test_join_temp
select type1,count(*),sum(score) from
#test_join
where schedule_id=1 and player_name in (select player_name from #test_join_temp )
group by type1
I hope this helps :)

How to get users that purchased items ONLY in a specific time period (MySQL Database)

I have a table that contains all purchased items.
I need to check which users purchased items in a specific period of time (say between 2013-03-21 to 2013-04-21) and never purchased anything after that.
I can select users that purchased items in that period of time, but I don't know how to filter those users that never purchased anything after that...
SELECT `userId`, `email` FROM my_table
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21' GROUP BY `userId`
Give this a try
SELECT
user_id
FROM
my_table
WHERE
purchase_date >= '2012-05-01' -- your_start_date
GROUP BY
user_id
HAVING
max(purchase_date) <= '2012-06-01'; -- your_end_date
It works by getting all the records >= the start date, grouping the result set by user_id, and then finding the max purchase date for every user. The max purchase date should be <= the end date. Since this query does not use a join or inner query, it could be faster.
Test data
CREATE table user_purchases(user_id int, purchase_date date);
insert into user_purchases values (1, '2012-05-01');
insert into user_purchases values (2, '2012-05-06');
insert into user_purchases values (3, '2012-05-20');
insert into user_purchases values (4, '2012-06-01');
insert into user_purchases values (4, '2012-09-06');
insert into user_purchases values (1, '2012-09-06');
Output
| USER_ID |
-----------
| 2 |
| 3 |
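The GROUP BY ... HAVING MAX(...) query can be replayed against the test data above with a few lines of Python. This is a sketch that assumes an in-memory SQLite database in place of MySQL, with dates stored as ISO text so that string comparison matches date order. The function name `buyers_only_in_period` is made up for the example.

```python
import sqlite3

PURCHASES = [
    (1, "2012-05-01"),
    (2, "2012-05-06"),
    (3, "2012-05-20"),
    (4, "2012-06-01"),
    (4, "2012-09-06"),
    (1, "2012-09-06"),
]


def make_purchases_db():
    """Load the answer's test data into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user_purchases (user_id INTEGER, purchase_date TEXT)")
    conn.executemany("INSERT INTO user_purchases VALUES (?, ?)", PURCHASES)
    return conn


def buyers_only_in_period(conn, start_date, end_date):
    """Users with a purchase on or after start_date and none after end_date."""
    sql = """
        SELECT user_id
        FROM user_purchases
        WHERE purchase_date >= ?
        GROUP BY user_id
        HAVING MAX(purchase_date) <= ?
        ORDER BY user_id
    """
    return [row[0] for row in conn.execute(sql, (start_date, end_date))]
```

Users 1 and 4 drop out because their latest purchase (2012-09-06) falls after the end date.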
This is probably a standard way to accomplish that:
SELECT `userId`, `email` FROM my_table mt
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
AND NOT EXISTS (
SELECT * FROM my_table mt2 WHERE
mt2.`userId` = mt.`userId`
and mt2.`date` > '2013-04-21'
)
GROUP BY `userId`
SELECT `userId`, `email` FROM my_table WHERE (`date` BETWEEN '2013-03-21' AND '2013-04-21') and `date` >= '2013-04-21' GROUP BY `userId`
Note that both conditions apply to the same row, so this only matches purchases dated exactly 2013-04-21. It does not find users who purchased during that timeframe and never afterwards.
Hope this helps.
Try the following
SELECT `userId`, `email`
FROM my_table WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
and `userId` not in
(select `userId` from my_table
where `date` < '2013-03-21' or `date` > '2013-04-21' )
GROUP BY `userId`
You'll have to do it in two stages: one query to get the list of users who did buy within the time period, then a join that takes that list and checks whether they bought anything afterwards, e.g.
SELECT during.userID, during.email
FROM (
SELECT DISTINCT userID, email
FROM my_table
WHERE `date` BETWEEN '2013-03-21' AND '2013-04-21'
) AS during
LEFT JOIN my_table AS after
ON after.userID = during.userID AND after.`date` > '2013-04-21'
WHERE after.userID IS NULL;
The inner query gets the list of userIDs who purchased at least one thing during that period. That list is then left-joined against the same table, restricted to purchases AFTER the period, and only users with no matching "after" row are kept.
Untested as written - haven't had my morning tea yet.
SELECT
a.userId,
a.email
FROM
my_table AS a
WHERE a.date BETWEEN '2013-03-21'
AND '2013-04-21'
AND a.userId NOT IN
(SELECT
b.userId
FROM
my_table AS b
WHERE b.date BETWEEN '2013-04-22'
AND CURDATE()
GROUP BY b.userId)
GROUP BY a.userId
This filters out anyone who has purchased anything between the day after the end date and the present.

MySQL multiple COUNTs

I have a table like this:
Fiddle: http://sqlfiddle.com/#!2/44d9e/14
CREATE TABLE IF NOT EXISTS `mytable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(20) NOT NULL,
`money_earned` int(20) NOT NULL,
PRIMARY KEY (`id`)
) ;
INSERT INTO mytable (user_id,money_earned) VALUES ("111","10");
INSERT INTO mytable (user_id,money_earned) VALUES ("111","6");
INSERT INTO mytable (user_id,money_earned) VALUES ("111","40");
INSERT INTO mytable (user_id,money_earned) VALUES ("222","45");
INSERT INTO mytable (user_id,money_earned) VALUES ("222","1");
INSERT INTO mytable (user_id,money_earned) VALUES ("333","5");
INSERT INTO mytable (user_id,money_earned) VALUES ("333","19");
I need to know how many rows the table has, how many different users there are, and how many times each user has earned.
I need this result:
TOTAL_ROWS: 7
TOTAL_INDIVIDUAL_USERS: 3
USER_ID USER_TIMES
111 3
222 2
333 2
Is your problem that you want the total as well? If so, then you can get this using rollup:
SELECT coalesce(cast(user_id as char(20)), 'TOTAL USER_TIMES'),
COUNT(*) as times
FROM mytable
GROUP BY user_id with rollup;
You can get the user counts in a separate column with this trick:
SELECT coalesce(cast(user_id as char(20)), 'TOTAL USER_TIMES'),
COUNT(*) as times, count(distinct user_id) as UserCount
FROM mytable
GROUP BY user_id with rollup;
You realize that a SQL query just returns a table of values. You are asking for very specific formatting, which is typically done better at the application level. That said, you can get close to what you want with something like this:
select user, times
from ((SELECT 3 as ord, cast(user_id as char(20)) as user, COUNT(*) as times
FROM mytable
GROUP BY user_id
)
union all
(select 1, 'Total Rows', count(*)
from mytable
)
union all
(select 2, 'Total Users', count(distinct user_id)
from mytable
)
) t
order by ord;
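The UNION ALL layout can be checked against the sample rows with a short harness. This is a sketch under assumptions: it runs on an in-memory SQLite database instead of MySQL, so the branches are written without surrounding parentheses and the cast uses TEXT. The labels and the function name `count_report` are my own choices for the example.

```python
import sqlite3

EARNINGS = [(111, 10), (111, 6), (111, 40), (222, 45), (222, 1), (333, 5), (333, 19)]


def make_mytable_db():
    """Load the question's sample rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE mytable (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id INTEGER NOT NULL,
            money_earned INTEGER NOT NULL
        )
        """
    )
    conn.executemany(
        "INSERT INTO mytable (user_id, money_earned) VALUES (?, ?)", EARNINGS
    )
    return conn


def count_report(conn):
    """Return (label, count) rows: totals first, then one row per user."""
    sql = """
        SELECT label, times
        FROM (
            SELECT 3 AS ord, CAST(user_id AS TEXT) AS label, COUNT(*) AS times
            FROM mytable
            GROUP BY user_id
            UNION ALL
            SELECT 1, 'Total Rows', COUNT(*) FROM mytable
            UNION ALL
            SELECT 2, 'Total Users', COUNT(DISTINCT user_id) FROM mytable
        ) t
        ORDER BY ord, label  -- ord fixes the section order, label sorts users
    """
    return conn.execute(sql).fetchall()
```

The `ord` column exists only to pin the order of the three sections and is dropped from the final output.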
I think this could be a typo. Anyway, if you are trying to sum rather than COUNT() the times, simply use money_earned instead:
SELECT user_id,
COUNT(*) AS 'times',
SUM(money_earned) AS 'sum_money'
FROM mytable GROUP BY user_id;

Select data based on date of id in table

I am currently showing the last 5 events in my database where eventdate < CURDATE(),
e.g.
CREATE TABLE venues (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
venue VARCHAR(255)
) DEFAULT CHARACTER SET utf8 ENGINE=InnoDB;
CREATE TABLE categories (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
category VARCHAR(255)
) DEFAULT CHARACTER SET utf8 ENGINE=InnoDB;
CREATE TABLE events (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
eventdate DATE NOT NULL,
title VARCHAR(255),
venueid INT,
categoryid INT
) DEFAULT CHARACTER SET utf8 ENGINE=InnoDB;
INSERT INTO venues (id, venue) VALUES
(1, 'USA'),
(2, 'UK'),
(3, 'Japan');
INSERT INTO categories (id, category) VALUES
(1, 'Jazz'),
(2, 'Rock'),
(3, 'Pop');
INSERT INTO events (id, eventdate, title, venueid, categoryid) VALUES
(1,20121003,'Title number 1',1,3),
(2,20121010,'Title number 2',2,1),
(3,20121015,'Title number 3',3,2),
(4,20121020,'Title number 4',1,3),
(5,20121022,'Title number 5',2,1),
(6,20121025,'Title number 6',3,2),
(7,20121030,'Title number 7',1,3),
(8,20121130,'Title number 8',1,1),
(9,20121230,'Title number 9',1,2),
(10,20130130,'Title number 10',1,3);
SELECT DATE_FORMAT(events.eventdate,'%M %d %Y') AS DATE, title,
cats.category AS CATEGORY, loc.venue AS LOCATION
FROM events
INNER JOIN categories as cats ON events.categoryid=cats.id
INNER JOIN venues as loc ON events.venueid=loc.id
WHERE eventdate < CURDATE()
ORDER BY eventdate DESC
LIMIT 0 , 5
See fiddle below.
http://sqlfiddle.com/#!2/21ad85/14
I want to show the last 5 events in my database where the eventdate < (events.eventdate WHERE events.id = 10),
so where id = 10 you should see event ids 9, 8, 7, 6, 5; where id = 9 you should see 8, 7, 6, 5, 4; etc.
But I am not quite sure how to write it in SQL. I think it should be along the lines of
WHERE eventdate < (events.eventdate WHERE events.id =10)
but this doesn't work.
Maybe you need this?
WHERE eventdate < (SELECT eventdate FROM events WHERE events.id =10)
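To see the scalar subquery in action, here is a small sketch that runs it against the ten sample events. It assumes an in-memory SQLite database rather than MySQL, stores the dates as ISO text, and leaves out the joins to venues and categories. The function name `previous_events` is hypothetical.

```python
import sqlite3

EVENT_DATES = [
    (1, "2012-10-03"), (2, "2012-10-10"), (3, "2012-10-15"), (4, "2012-10-20"),
    (5, "2012-10-22"), (6, "2012-10-25"), (7, "2012-10-30"), (8, "2012-11-30"),
    (9, "2012-12-30"), (10, "2013-01-30"),
]


def make_events_db():
    """Load the sample events (id and date only) into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, eventdate TEXT NOT NULL)")
    conn.executemany("INSERT INTO events (id, eventdate) VALUES (?, ?)", EVENT_DATES)
    return conn


def previous_events(conn, event_id, limit=5):
    """Ids of the latest `limit` events dated before the given event."""
    sql = """
        SELECT id
        FROM events
        WHERE eventdate < (SELECT eventdate FROM events WHERE id = ?)
        ORDER BY eventdate DESC
        LIMIT ?
    """
    return [row[0] for row in conn.execute(sql, (event_id, limit))]
```

Comparing on eventdate rather than on id keeps the result correct even if ids are not assigned in date order.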
Can you try this?
WHERE eventdate < curdate() and events.id < 10
updated for the typo: `events.eventdate to curdate()`