multicolumn unique key mysql insert - mysql

I have a mysql database, and a table structure like this:
CREATE TABLE `user_session_log` (
`stat_id` int(8) NOT NULL AUTO_INCREMENT,
`metric` tinyint(1) NOT NULL DEFAULT '0',
`platform` tinyint(1) NOT NULL DEFAULT '0',
`page_id` varchar(128) DEFAULT '_empty_',
`target_date` date DEFAULT NULL,
`country` varchar(2) DEFAULT NULL COMMENT 'ISO 3166 country code (2 symbols)',
`amount` int(100) NOT NULL DEFAULT '0' COMMENT 'counter or amount',
`unique_key` varchar(180) DEFAULT NULL COMMENT 'Optional unique identifier',
`created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`modified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`stat_id`),
UNIQUE KEY `unique_key` (`unique_key`) USING BTREE,
KEY `target_date` (`target_date`)
) ENGINE=InnoDB AUTO_INCREMENT=21657473 DEFAULT CHARSET=utf8
What I'm trying to achieve is to log the active sessions / unique users based on date, page_id, and country. Currently I'm able to achieve this by generating multiple insert statements with a unique_key, by adding the page_id and date into the unique key, but I want something a little bit different.
The logic should be: insert new row of unique_key (semi-unique user id), where country = this, date = this, page_id = this. If there is already a row with such information (same page_id, unique_key, and date + country) - update the amount = (amount) + 1; (session).
So I could do lookups like:
SELECT SUM(amount) FROM user_session_log WHERE page_id = "something" AND target_date = "2018-12-21"
This would give me the number of sessions. Or:
SELECT COUNT(*) FROM user_session_log WHERE page_id = "something" AND target_date = "2018-12-21"
This would give me the number of active users on that page_id on that day.
Or:
SELECT COUNT(*) FROM user_session_log WHERE target_date = "2018-12-21"
Which would give me the total number of users on that day.
I know about unique indexes, but would one give me the result I'm looking for?
Edit, a sample insert:
INSERT INTO `user_session_log` (`platform`,`page_id`,`target_date`,`country`,`amount`,`unique_key`,`created`,`modified`) VALUES ('1','page_id_54','2018-10-08','US',1,'ea3d0ce0406a838d9fd31df2e2ec8085',NOW(),NOW()) ON DUPLICATE KEY UPDATE `amount` = (amount) +1, `modified` = NOW();
and the table should detect a duplicate based on whether there's a row with the same unique_key + date + country + platform + page_id; otherwise it should just insert a new row.
Right now I'm doing this differently: I have different metrics and a unique_key that already contains the date + page_id, hashed. That way it's unique, meaning I can count the distinct users per day, but I can't count the number of sessions each unique user had, or how long they use the software, and so on.

First, create a unique index on all the columns that need to be unique together, and drop the existing single-column `unique_key` index (otherwise the same unique_key could never appear on two different days):
ALTER TABLE user_session_log
  DROP INDEX `unique_key`,
  ADD UNIQUE INDEX idx_multi_column (unique_key, target_date, country, platform, page_id);
Watch the index length, though: with utf8, varchar(180) plus varchar(128) alone exceed the 767-byte key limit of older InnoDB row formats, so you may need prefix lengths or shorter columns. Then you can use an INSERT ... ON DUPLICATE KEY UPDATE query, like the sample insert in the question, to insert or update in one statement.
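To make the mechanics concrete, here is a minimal sketch of the same pattern using Python's built-in sqlite3. SQLite (3.24+) spells MySQL's ON DUPLICATE KEY UPDATE as ON CONFLICT ... DO UPDATE; the trimmed-down schema and sample values are illustrative, not the real table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_session_log (
        stat_id     INTEGER PRIMARY KEY AUTOINCREMENT,
        platform    INTEGER NOT NULL DEFAULT 0,
        page_id     TEXT,
        target_date TEXT,
        country     TEXT,
        amount      INTEGER NOT NULL DEFAULT 0,
        unique_key  TEXT,
        UNIQUE (unique_key, target_date, country, platform, page_id)
    )
""")

def log_session(row):
    # SQLite's equivalent of MySQL's INSERT ... ON DUPLICATE KEY UPDATE
    conn.execute("""
        INSERT INTO user_session_log
            (platform, page_id, target_date, country, amount, unique_key)
        VALUES (?, ?, ?, ?, 1, ?)
        ON CONFLICT (unique_key, target_date, country, platform, page_id)
        DO UPDATE SET amount = amount + 1
    """, row)

# The same user hits the same page twice on the same day -> one row, amount = 2
log_session((1, "page_id_54", "2018-10-08", "US", "ea3d0ce0"))
log_session((1, "page_id_54", "2018-10-08", "US", "ea3d0ce0"))
# A different page -> a second row
log_session((1, "page_id_55", "2018-10-08", "US", "ea3d0ce0"))

rows = conn.execute(
    "SELECT page_id, amount FROM user_session_log ORDER BY page_id").fetchall()
print(rows)  # [('page_id_54', 2), ('page_id_55', 1)]
```

With rows shaped like this, SUM(amount) counts sessions and COUNT(*) counts distinct users, exactly as in the lookups above.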

Related

How to get auto-increment PK on a multi-row insert in MySql

I need to get back a list of "affected" ids when inserting multiple rows at once. Some rows might already be there, resulting in an update instead.
DDL:
CREATE TABLE `users` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT PRIMARY KEY,
`email` varchar(100) NOT NULL,
`is_active` tinyint(1) NOT NULL DEFAULT '1',
`update_time` timestamp(3) NOT NULL DEFAULT CURRENT_TIMESTAMP(3) ON UPDATE CURRENT_TIMESTAMP(3),
UNIQUE KEY `email` (`email`)
)
Query:
INSERT INTO users (id, email, is_active)
VALUES (NULL, "joe@mail.org", true),
(NULL, "jack@mail.org", false),
(NULL, "dave@mail.org", true)
ON DUPLICATE KEY UPDATE
is_active = VALUES(is_active)
There is a UNIQUE constraint on email.
From what I gathered, LAST_INSERT_ID() would only give me the first generated id of the batch. But I wouldn't know how many inserts/updates really took place.
The only way I could come up with is to follow with a second SELECT statement:
SELECT id
FROM users
WHERE email IN ("joe@mail.org", "jack@mail.org", "dave@mail.org")
Is there a better way?

SQL: How to store and select person who available on the same date range

From the image below, I want to:
1. Find a way to store each user's unavailable dates.
2. Select the users who are available in a specified date range.
Red = Unavailable, White = Available.
Example: I will have a training course on days 3 to 4, so I should get Mr.A and Mr.C as my query result.
Well, you can have a table like this:
Create table course_calendar (
date date not null,
course_id int not null,
tutor_id int default null,
primary key (date, course_id)
) engine = innodb;
create table tutor_calendar (
date date not null,
tutor_id int not null,
available enum('Y','N') default 'N',
Primary key (date, tutor_id)
) engine = innodb;
Then you would join the two tables: find the dates where course_calendar has tutor_id IS NULL (an unstaffed course),
and where tutor_calendar has a tutor with available = 'Y' on those same dates.
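A sketch of that availability check against tutor_calendar, using Python's sqlite3 (the sample data is invented to mirror the Mr.A / Mr.C example; a tutor qualifies only if no day in the range is marked 'N'):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tutor_calendar (
        date      TEXT NOT NULL,
        tutor_id  INTEGER NOT NULL,
        available TEXT NOT NULL DEFAULT 'N',
        PRIMARY KEY (date, tutor_id)
    );
    -- Tutors 1 (Mr.A) and 3 (Mr.C) are free on days 3-4; tutor 2 is booked on day 3
    INSERT INTO tutor_calendar VALUES
        ('2024-06-03', 1, 'Y'), ('2024-06-04', 1, 'Y'),
        ('2024-06-03', 2, 'N'), ('2024-06-04', 2, 'Y'),
        ('2024-06-03', 3, 'Y'), ('2024-06-04', 3, 'Y');
""")

# A tutor is available for the range only if every day in it is 'Y'
available = [r[0] for r in conn.execute("""
    SELECT tutor_id
    FROM tutor_calendar
    WHERE date BETWEEN '2024-06-03' AND '2024-06-04'
    GROUP BY tutor_id
    HAVING SUM(available = 'N') = 0
    ORDER BY tutor_id
""")]
print(available)  # [1, 3]
```

This assumes the calendar holds a row per tutor per day; if missing days should also mean "available" (or "unavailable"), you would add a COUNT(*) check against the number of days in the range.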

mysql query to find contiguous datetime records with start and stop fields

In the transportation industry, regulations limit hours-of-service.
I have a mysql database that includes a table with an auto_increment field called Task_ID, an integer field with a foreign key called User_ID, and two datetime fields, Start_DT, and End_DT.
Table:
Task_ID | User_ID | Start_DT | End_DT
An employee's shift can include several tasks, each creating a record.
I already have a query that identifies the most recent endtime per employee (User_ID). (The HTML user interface prevents data entry where the starttime would be later than the endtime.)
I need to create a query that would return all the records contiguous to the most recent endtime (by employee). In other words, all the records in which the starttime of the current record is equal to the end time of the previous record (by employee). The number of tasks (records) in the series varies.
Should I nest enough subqueries to be confident that no contiguous series of tasks will exceed that number, or is there a way to look for a gap and return all records later than the gap?
There's lots of advice for finding gaps in a single field, but my searching hasn't found much appropriate to this question.
(Responding to Strawberry's comments below:)
1)
CREATE TABLE `hoursofservice` (
`Task_ID` MEDIUMINT(8) UNSIGNED NOT NULL AUTO_INCREMENT,
`User_ID` SMALLINT(5) UNSIGNED NOT NULL,
`Location` VARCHAR(20) NULL DEFAULT NULL,
`Task` VARCHAR(30) NULL DEFAULT NULL,
`Start_DT` DATETIME NOT NULL,
`End_DT` DATETIME NULL DEFAULT NULL,
`Comment_1` VARCHAR(20) NULL DEFAULT NULL,
`Comment_2` VARCHAR(256) NULL DEFAULT NULL,
`Bad_Data` BIT(1) NOT NULL DEFAULT b'0',
PRIMARY KEY (`Task_ID`),
UNIQUE INDEX `Task_ID` (`Task_ID`),
INDEX `FK_hoursofservice_employee_id` (`User_ID`),
CONSTRAINT `FK_hoursofservice_employee_id` FOREIGN KEY (`User_ID`) REFERENCES `employee_id` (`User_ID`) ON UPDATE CASCADE
);
and ...
INSERT INTO hoursofservice (User_ID, Location, Task, Start_DT, End_DT, Comment_1, Comment_2)
SELECT User_ID, Location, Task, Start_DT, End_DT, Comment_1, Comment_2 FROM read_text_file;
2) Result set would be records selected from the table such that the start time from the most recent record would equal the end time from the previous, and likewise until there was no record that met the condition. (These would be ordered by User_ID.)
Something like this might work for you:
SELECT islands.[all relevant fields from "theTable"]
FROM (
SELECT [all relevant fields from "theTable"]
, #i := CASE
WHEN #prevUserId <> User_ID THEN 1 -- Reset "island" counter for each user
WHEN #prevStart <> End_DT THEN #i + 1 -- Increment "island" counter when gaps found
ELSE #i -- Do not change "island" counter
END AS island
, #prevStart := Start_DT -- Remember this Start_DT value for the next row
, #prevUserId := User_ID -- Remember this User_ID value for the next row
FROM (
SELECT t.*
FROM theTable AS t
WHERE End_DT < [known or "ceiling" End_DT]
AND [limiting condition (for speed),
something like Start_DT > [ceiling] - INTERVAL 3 DAY
]
ORDER BY User_ID, End_DT DESC
) AS candidates -- Ensures rows are in the appropriate order for use with session variables.
-- Allows us to treat the enclosing query somewhat like a loop.
, (SELECT #i := 0
, #prevStart := '9999-12-31 23:59:59'
, #prevUserId := -1
) AS init -- Initializing session variables;
-- can actually be done in preceding SET statements
-- if the connection is preserved between the SETs and the query.
ORDER BY User_ID, Start_DT
) AS islands -- Data from "theTable" should now have appropriate "islands" calculated
WHERE islands.island = 1 -- Only keep the first "island" for each user
ORDER BY islands.User_ID, islands.Start_DT
;
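The session-variable bookkeeping in that query can be hard to follow, so here is the same backward walk written out in plain Python over hypothetical task rows (invented data; the datetimes compare as strings): sort a user's tasks newest-first by End_DT, and keep taking rows while each row's End_DT equals the previous row's Start_DT.

```python
# Hypothetical task rows: (Task_ID, User_ID, Start_DT, End_DT)
tasks = [
    (1, 7, "2024-01-01 08:00", "2024-01-01 10:00"),
    (2, 7, "2024-01-01 10:00", "2024-01-01 12:00"),  # contiguous with task 1
    (3, 7, "2024-01-01 13:00", "2024-01-01 15:00"),  # gap before this one
    (4, 7, "2024-01-01 15:00", "2024-01-01 17:00"),  # contiguous with task 3
]

def latest_contiguous_run(rows, user_id):
    """Walk backward from the most recent End_DT, keeping rows while each
    row's End_DT equals the previous (later) row's Start_DT."""
    mine = sorted((r for r in rows if r[1] == user_id),
                  key=lambda r: r[3], reverse=True)  # newest End_DT first
    run = []
    prev_start = None
    for row in mine:
        if prev_start is not None and row[3] != prev_start:
            break  # gap found: the latest "island" ends here
        run.append(row)
        prev_start = row[2]
    return list(reversed(run))

print([r[0] for r in latest_contiguous_run(tasks, 7)])  # [3, 4]
```

This is exactly what the `#i`/`#prevStart` counters compute row by row inside the SQL.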

How to aggregate data without group by

I am having a little bit of a situation here.
The environment
I have a database for series here.
One table for the series itself, one for the season connected to the series table, one for the episodes connected to the seasons table.
Since there are air dates for different countries, I have another table, `episode_data`, which looks like the following:
CREATE TABLE IF NOT EXISTS `episode_data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`episode_id` int(11) NOT NULL,
`country` char(3) NOT NULL,
`title` varchar(255) NOT NULL,
`date` date NOT NULL,
`tba` tinyint(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `episode_id` (`episode_id`),
KEY `date` (`date`),
KEY `country` (`country`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Now I am trying to collect the last aired episodes from each series in the database using the following query:
SELECT
*
FROM
`episode_data` ed
WHERE
`ed`.`date` < CURDATE( ) &&
`ed`.`date` != '1970-01-01' &&
`ed`.`series_id` = 1
GROUP BY
`ed`.`country` DESC
ORDER BY
`ed`.`date` DESC
Since I have everything normalized, I replaced 'episode_id' with 'series_id' to make the query less complicated.
What I am trying to accomplish
I want to have the last aired episodes for each country which are actually announced (ed.date != '1970-01-01') as the returning result of one query.
What's the problem
I know now (after searching Google and finding answers here that didn't work for me) that the ordering takes place AFTER the grouping, so my "date" ordering is completely useless.
The other problem is that the query above works, but it always returns the entries with the lowest id matching my conditions, because those come first in the table's index.
What is the question?
How may I accomplish the above? I don't know if grouping is the right way to do it. If there is no "one-liner", I think the only way is a subquery, which I'd like to avoid since, as far as I know, it is slower than a one-liner with the right indexes set.
Hope in here is everything you need :)
Example data
CREATE TABLE IF NOT EXISTS `episode_data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`episode_id` int(11) NOT NULL,
`country` char(3) NOT NULL,
`title` varchar(255) NOT NULL,
`date` date NOT NULL,
`tba` tinyint(1) NOT NULL,
PRIMARY KEY (`id`),
KEY `episode_id` (`episode_id`),
KEY `date` (`date`),
KEY `country` (`country`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `episode_data` (`id`, `episode_id`, `country`, `title`, `date`, `tba`) VALUES
(4942, 2471, 'de', 'Väter und Töchter', '2013-08-06', 0),
(4944, 2472, 'de', 'Neue Perspektiven', '2013-08-13', 0),
(5013, 2507, 'us', 'Into the Deep', '2013-08-06', 0),
(5015, 2508, 'us', 'The Mirror Has Three Faces', '2013-08-13', 0);
Attention!
This is the original table data with "EPISODE_ID" not "SERIES_ID".
The data I want are the rows with the dates closest to today, which here are ids 4944 and 5015.
If you want the last aired date for each country, then use this aggregation:
SELECT country, max(date) as lastdate
FROM `episode_data` ed
WHERE `ed`.`date` < CURDATE( ) AND
`ed`.`date` != '1970-01-01' AND
`ed`.`series_id` = 1
GROUP BY `ed`.`country`;
If you are trying to get the episode_id and title as well, you can use group_concat() and substring_index():
SELECT country, max(date) as lastdate,
substring_index(group_concat(episode_id order by date desc), ',', 1
) as episode_id,
substring_index(group_concat(title order by date desc separator '|'), '|', 1
) as title
FROM `episode_data` ed
WHERE `ed`.`date` < CURDATE( ) AND
`ed`.`date` != '1970-01-01' AND
`ed`.`series_id` = 1
GROUP BY `ed`.`country`;
Note that this uses a different separator for the title, under the assumption that it might have a comma.
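If a subquery turns out to be acceptable after all, the classic greatest-n-per-group alternative returns whole rows directly, without the group_concat() trick. A sketch against the question's sample data with Python's sqlite3 (the correlated MAX(date) subquery picks the latest announced row per country):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE episode_data (
        id INTEGER PRIMARY KEY, episode_id INTEGER, country TEXT,
        title TEXT, date TEXT, tba INTEGER
    );
    INSERT INTO episode_data VALUES
        (4942, 2471, 'de', 'Väter und Töchter', '2013-08-06', 0),
        (4944, 2472, 'de', 'Neue Perspektiven', '2013-08-13', 0),
        (5013, 2507, 'us', 'Into the Deep', '2013-08-06', 0),
        (5015, 2508, 'us', 'The Mirror Has Three Faces', '2013-08-13', 0);
""")

# Keep the row whose date is the maximum announced date for its country
rows = conn.execute("""
    SELECT id, country, title, date
    FROM episode_data ed
    WHERE ed.date != '1970-01-01'
      AND ed.date = (SELECT MAX(date) FROM episode_data e2
                     WHERE e2.country = ed.country
                       AND e2.date != '1970-01-01')
    ORDER BY country
""").fetchall()
print([r[0] for r in rows])  # [4944, 5015]
```

This matches the rows the question expects (4944 and 5015); in MySQL you would add back the `date < CURDATE()` condition in both the outer query and the subquery.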

MySQL: update a record, on duplicate key leave NEWER one, and delete OLDER one? See inside

There is a table:
CREATE TABLE `mytable` (
`user_id` INT(10) UNSIGNED NOT NULL,
`thing_id` VARCHAR(100) NOT NULL DEFAULT '',
`lock_date` DATETIME NOT NULL,
`lock_id` VARCHAR(36) NOT NULL,
PRIMARY KEY (`user_id`,`thing_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;
and some values there:
INSERT INTO mytable(user_id,thing_id,lock_date,lock_id)
VALUES
(51082,'299ac9ff-2b2b-102d-8ff6-f64c971398c3','2012-03-16 00:39:12','ec7b2008-6ede-11e1-aac2-5924aae99221'),
(108325,'299ac9ff-2b2b-102d-8ff6-f64c971398c3','2013-02-05 19:30:03','7c6de986-6edd-11e1-aac2-5924aae99221'),
(108325,'d90b354d-4b5f-11e0-9959-47117d41cf4b','2012-03-16 00:47:41','1c243032-6ee0-11e1-aac2-5924aae99221');
I want to delegate all records of user_id = 108325 to user_id = 51082, and if both users have an equal thing_id field, leave the newer one only (lock_date1 > lock_date2), so that I have following result:
51082,'299ac9ff-2b2b-102d-8ff6-f64c971398c3','2013-02-05 19:30:03','7c6de986-6edd-11e1-aac2-5924aae99221'
108325,'d90b354d-4b5f-11e0-9959-47117d41cf4b','2012-03-16 00:47:41','1c243032-6ee0-11e1-aac2-5924aae99221'
Note that 51082 now has a newer record: lock_date = '2013-02-05 19:30:03' instead of '2012-03-16 00:39:12'.
So, how can I update a row, and on duplicate key leave the newer one (by some particular field)?
Thanks!
INSERT INTO
mytable(user_id,thing_id,lock_date,lock_id)
VALUES
(51082,'299ac9ff-2b2b-102d-8ff6-f64c971398c3','2012-03-16 00:39:12','ec7b2008-6ede-11e1-aac2-5924aae99221'),
(108325,'299ac9ff-2b2b-102d-8ff6-f64c971398c3','2013-02-05 19:30:03','7c6de986-6edd-11e1-aac2-5924aae99221'),
(108325,'d90b354d-4b5f-11e0-9959-47117d41cf4b','2012-03-16 00:47:41','1c243032-6ee0-11e1-aac2-5924aae99221')
ON DUPLICATE KEY UPDATE
user_id = VALUES(user_id),
lock_date = VALUES(lock_date),
lock_id = VALUES(lock_id);
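Note that an unconditional ON DUPLICATE KEY UPDATE overwrites regardless of age; to honor "leave the newer one", the update has to compare lock_date. A sketch of the full delegation with Python's sqlite3 (3.24+), where ON CONFLICT ... DO UPDATE ... WHERE stands in for a MySQL IF()-guarded update; the thing/lock ids are shortened placeholders:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mytable (
        user_id   INTEGER NOT NULL,
        thing_id  TEXT NOT NULL,
        lock_date TEXT NOT NULL,
        lock_id   TEXT NOT NULL,
        PRIMARY KEY (user_id, thing_id)
    )
""")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", [
    (51082,  'thing-A', '2012-03-16 00:39:12', 'lock-1'),
    (108325, 'thing-A', '2013-02-05 19:30:03', 'lock-2'),
    (108325, 'thing-B', '2012-03-16 00:47:41', 'lock-3'),
])

# Re-insert the old user's colliding rows under the new user_id; on a
# primary-key collision keep whichever lock_date is newer
conn.execute("""
    INSERT INTO mytable (user_id, thing_id, lock_date, lock_id)
    SELECT 51082, thing_id, lock_date, lock_id
    FROM mytable
    WHERE user_id = 108325
      AND thing_id IN (SELECT thing_id FROM mytable WHERE user_id = 51082)
    ON CONFLICT (user_id, thing_id) DO UPDATE SET
        lock_date = excluded.lock_date,
        lock_id   = excluded.lock_id
    WHERE excluded.lock_date > mytable.lock_date
""")
# Drop the old user's copies of the rows the new user now owns
conn.execute("""
    DELETE FROM mytable
    WHERE user_id = 108325
      AND thing_id IN (SELECT thing_id FROM mytable WHERE user_id = 51082)
""")

final = conn.execute(
    "SELECT user_id, thing_id, lock_date FROM mytable ORDER BY user_id").fetchall()
print(final)
```

This reproduces the two rows the question expects (51082 keeps thing-A with the newer 2013 lock_date; 108325 keeps its non-colliding thing-B); to move the non-colliding rows over as well, drop the IN filter from the first statement.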