SQL join category from timeline into timestamp table - mysql

Consider the following structure:
create table timestamps(id int, stamp timestamp);
insert into timestamps values
(1,'2017-10-01 10:05:01'),
(2,'2017-10-01 11:05:01'),
(3,'2017-10-01 12:05:01'),
(4,'2017-10-01 13:05:01');
create table category_timeline(begin timestamp,end timestamp, category varchar(100));
insert into category_timeline values
('2017-10-01 10:01:03','2017-10-01 12:01:03','Cat1'),
('2017-10-01 12:01:03','2017-10-01 12:42:43','Cat3'),
('2017-10-01 12:42:43','2017-10-01 14:01:03','Cat2');
I have two tables: timestamps, which contains timestamps, and category_timeline, which contains a timeline of categories. We assume the records in category_timeline form a continuous, non-overlapping timeline that assigns a category to each time period.
I want to assign the categories to the timestamps table, resulting in:
| id | stamp | category |
|----|----------------------|----------|
| 1 | 2017-10-01T10:05:01Z | Cat1 |
| 2 | 2017-10-01T11:05:01Z | Cat1 |
| 3 | 2017-10-01T12:05:01Z | Cat3 |
| 4 | 2017-10-01T13:05:01Z | Cat2 |
which is the result of the following query:
SELECT id, stamp, category FROM timestamps ts
LEFT JOIN category_timeline tl
ON ts.stamp >= tl.begin
AND ts.stamp < tl.end
However, as soon as the tables get bigger, this operation seems to get exponentially slower. Is there a better way to do this, using the assumption that each timestamp falls within exactly one period in the other table?

I would suggest this approach:
SELECT ts.id, ts.stamp,
(SELECT tl.category
FROM category_timeline tl
WHERE tl.end > ts.stamp
ORDER BY tl.end ASC
LIMIT 1
) as category
FROM timestamps ts ;
Be sure you have an index on category_timeline(end, category).
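As a rough check of the correlated-subquery approach, here is a minimal sketch that runs it against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the index name idx_timeline_end is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE timestamps (id INTEGER, stamp TEXT);
INSERT INTO timestamps VALUES
  (1, '2017-10-01 10:05:01'),
  (2, '2017-10-01 11:05:01'),
  (3, '2017-10-01 12:05:01'),
  (4, '2017-10-01 13:05:01');
-- "begin" and "end" are keywords in SQLite, so they are quoted here.
CREATE TABLE category_timeline ("begin" TEXT, "end" TEXT, category TEXT);
INSERT INTO category_timeline VALUES
  ('2017-10-01 10:01:03', '2017-10-01 12:01:03', 'Cat1'),
  ('2017-10-01 12:01:03', '2017-10-01 12:42:43', 'Cat3'),
  ('2017-10-01 12:42:43', '2017-10-01 14:01:03', 'Cat2');
-- Hypothetical index name; mirrors the (end, category) index suggested above.
CREATE INDEX idx_timeline_end ON category_timeline ("end", category);
""")

# For each stamp, pick the period with the smallest end that is still
# later than the stamp. ISO-formatted text compares correctly as strings.
rows = conn.execute("""
SELECT ts.id, ts.stamp,
       (SELECT tl.category
          FROM category_timeline tl
         WHERE tl."end" > ts.stamp
         ORDER BY tl."end" ASC
         LIMIT 1) AS category
  FROM timestamps ts
 ORDER BY ts.id
""").fetchall()

for row in rows:
    print(row)
```

Note that the subquery never looks at begin, so it relies on the timeline being continuous: a stamp earlier than the first period would still be assigned the first category.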

Related

Reduce number of joins in mysql

I have 12 fixed tables (group, local, element, sub_element, service, ...), each table with different numbers of rows.
The 'id_' column in each table is the primary key (int). The other columns are of datatype varchar(20). No table has more than 300 rows.
Each table was created in this way:
CREATE TABLE group
(
id_G int NOT NULL,
name_group varchar(20) NOT NULL,
PRIMARY KEY (id_G)
);
|........GROUP......| |.......LOCAL.......| |.......SERVICE.......|
| id_G | name_group | | id_L | name_local | | id_S | name_service |
+------+------------+ +------+------------+ +------+--------------+
| 1 | group1 | | 1 | local1 | | 1 | service1 |
| 2 | group2 | | 2 | local2 | | 2 | service2 |
And I have one table that combines all these tables depending on what the user selects.
The 'id_' values from the fixed tables selected by the user are recorded in this table.
This table was created in this way:
CREATE TABLE event
(
id_E int NOT NULL,
event_name varchar(20) NOT NULL,
id_G int NOT NULL,
id_L int NOT NULL,
...
PRIMARY KEY (id_E)
);
The table (event) looks like this:
|....................EVENT.....................|
| id_E | event_name | id_G | id_L | ... |id_S |
+------+-------------+------+------+-----+-----+
| 1 | master1 | 1 | 1 | ... | 3 |
| 2 | master2 | 2 | 2 | ... | 6 |
This table grows every day, and it now has thousands of rows.
Column id_E is the primary key (int), event_name is varchar(20).
In addition to the id_E and event_name columns, this table has 12 other columns that come from the fixed tables.
Every time I need to retrieve information from the event table in a readable form, I need to do about 12 joins.
My query looks like this when I need to retrieve all columns from the event table:
SELECT event_name, name_group, name_local ..., name_service
FROM event
INNER JOIN group on event.id_G = group.id_G
INNER JOIN local on event.id_L = local.id_L
...
INNER JOIN service on event.id_S = service.id_S
WHERE event.id_S = 7 (for example)
This slows down my system. Is there a way to reduce the number of joins? I've heard about using natural keys, but I don't think this is a good idea in my case, considering future maintenance.
My queries take about 7 seconds and I need to reduce this time.
I changed the WHERE clause and this had no effect, so I am sure the problem is that the query has so many joins.
Could someone help? Thanks a lot.
MySQL has a keyword, STRAIGHT_JOIN, that might be what you are looking for. First, I assume each of your lookup tables (id/description) already has an index on the ID column, since that is the primary key.
Your event table is the primary basis of the query, joined to the lookups by their respective IDs. As long as the WHERE clause on the EVENT table is optimized, such as filtering on the ID you are looking for, it SHOULD be virtually instantaneous.
If it is not, MySQL might be trying to think for you, taking one of the secondary lookup tables as the primary basis of the query for whatever reason, such as a much lower record count. In that case, add the keyword and try it:
SELECT STRAIGHT_JOIN ... rest of your query
This tells MySQL to run the query in the order you gave it, thus the Event table first with its WHERE clause on the ID. It should find the matching rows, then grab all the corresponding lookup descriptions from the other tables.
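To illustrate the shape of query this answer describes (filter the event table first, then resolve each lookup by primary key), here is a minimal sketch with two of the twelve lookup tables. It uses Python's sqlite3 as a stand-in for MySQL, so STRAIGHT_JOIN itself is not used. The index idx_event_id_s is an assumption about what "optimized" would mean for the WHERE clause, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- "group" is a reserved word, so it is quoted.
CREATE TABLE "group" (id_G INTEGER PRIMARY KEY, name_group TEXT NOT NULL);
CREATE TABLE service (id_S INTEGER PRIMARY KEY, name_service TEXT NOT NULL);
CREATE TABLE event (
  id_E INTEGER PRIMARY KEY,
  event_name TEXT NOT NULL,
  id_G INTEGER NOT NULL,
  id_S INTEGER NOT NULL
);
-- Assumed index (not in the question): supports the filter on event.id_S.
CREATE INDEX idx_event_id_s ON event (id_S);

INSERT INTO "group" VALUES (1, 'group1'), (2, 'group2');
INSERT INTO service VALUES (7, 'service7'), (8, 'service8');
INSERT INTO event VALUES
  (1, 'master1', 1, 7),
  (2, 'master2', 2, 8),
  (3, 'master3', 2, 7);
""")

# event is listed first and filtered; each lookup is a primary-key join.
rows = conn.execute("""
SELECT e.event_name, g.name_group, s.name_service
  FROM event e
  JOIN "group" g ON e.id_G = g.id_G
  JOIN service s ON e.id_S = s.id_S
 WHERE e.id_S = ?
 ORDER BY e.id_E
""", (7,)).fetchall()

for row in rows:
    print(row)
```

In MySQL, running EXPLAIN on the real query would show whether the optimizer starts from event or from one of the lookup tables, which is the situation STRAIGHT_JOIN is meant to address.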
Create indexes, specifically compound indexes. For instance, start with a compound index for event and group:
on the event table, create one on (event id, group id);
then, on the group table, create another one for the next relation (group id, local id);
on local, do the same with service, and so on.

Find latest value in a comparison of data between 2 tables

I have 2 tables in my DB and I want to compare the values of 2 select queries I've made, one on each.
Table 1: click_log
Query table 1:
SELECT *
FROM click_log
Table 2: km_articles
Query table 2:
SELECT km_article_no
FROM km_articles
WHERE km_article_date <= "2017-10-31" AND km_article_status = "Published" AND km_article_view_count <= "5"
The columns I want to compare are link_clicked in table 1 and km_article_no in table 2. I know I will find repeated matches; from those, I want only the latest one, determined by another column in table 1 called "when_clicked", which contains date information. I'm not sure how to put these two queries together and then narrow the results down.
This is what the tables look like:
Table 1:
link_clicked | when_clicked
KB00001 | 2017-08-02
KB00001 | 2017-12-02
KB00002 | 2017-08-02
KB00002 | 2017-09-02
KB00003 | 2017-09-02
KB00003 | 2017-09-02
Table 2:
km_article_no|km_article_ti|km_article_status|km_article_view_count|km_article_date
KB00001 |outlook IOS | Published | 5 | 2017-01-02
KB00002 |outlook CSS | Published | 4 | 2017-01-05
KB00003 |outlook ZTE | Retired | 3 | 2017-01-09
If I understand correctly, you want to show all km_articles rows, each with the latest related click_log.when_clicked date. So aggregate your click_log per link_clicked and find the maximum when_clicked. Then join this to km_articles.
select kma.*, cl.last_clicked
from km_articles kma
join
(
select link_clicked, max(when_clicked) as last_clicked
from click_log
group by link_clicked
) cl on cl.link_clicked = kma.km_article_no
where kma.km_article_date <= date '2017-10-31'
and kma.km_article_status = 'Published'
and kma.km_article_view_count <= 5;
(If you also want to show km_articles rows that have no match in click_log, then change join to left join.)
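The aggregate-then-join query can be tried against the sample rows from the question. This is a minimal sketch using Python's sqlite3 as a stand-in for MySQL; because SQLite has no date '...' literal, the dates are compared as ISO-formatted strings, which is an adaptation and not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE click_log (link_clicked TEXT, when_clicked TEXT);
INSERT INTO click_log VALUES
  ('KB00001', '2017-08-02'), ('KB00001', '2017-12-02'),
  ('KB00002', '2017-08-02'), ('KB00002', '2017-09-02'),
  ('KB00003', '2017-09-02'), ('KB00003', '2017-09-02');
CREATE TABLE km_articles (
  km_article_no TEXT, km_article_ti TEXT, km_article_status TEXT,
  km_article_view_count INTEGER, km_article_date TEXT
);
INSERT INTO km_articles VALUES
  ('KB00001', 'outlook IOS', 'Published', 5, '2017-01-02'),
  ('KB00002', 'outlook CSS', 'Published', 4, '2017-01-05'),
  ('KB00003', 'outlook ZTE', 'Retired',   3, '2017-01-09');
""")

# Collapse click_log to one row per article (latest click), then join.
rows = conn.execute("""
SELECT kma.km_article_no, cl.last_clicked
  FROM km_articles kma
  JOIN (SELECT link_clicked, MAX(when_clicked) AS last_clicked
          FROM click_log
         GROUP BY link_clicked) cl
    ON cl.link_clicked = kma.km_article_no
 WHERE kma.km_article_date <= '2017-10-31'
   AND kma.km_article_status = 'Published'
   AND kma.km_article_view_count <= 5
 ORDER BY kma.km_article_no
""").fetchall()

for row in rows:
    print(row)
```

KB00003 is filtered out because its status is Retired, so only the two published articles remain, each with its most recent click date.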

Get most unique combinations of 2 pictures

I have a MySQL table that holds n pictures.
+------------+--------------+
| picture_id | picture_name |
+------------+--------------+
| 1 | ben.jpg |
| 2 | nick.jpg |
| 3 | mark.jpg |
| 4 | james.jpg |
| .. | ... |
| n | abraham.jpg |
+------------+--------------+
For a web application, I need to display 2 pictures simultaneously where the user can vote for one picture or the other. After voting, the user gets a new set of two pictures.
(application user interface)
+---------------------+--------------------+
| Vote for picture 1  | Vote for picture 2 |
+---------------------+--------------------+
I would like to avoid displaying the same combinations as much as possible. I can create a helper table that will hold all possible combinations.
+----------------+--------------+--------------+
| combination_id | picture_id_1 | picture_id_2 |
+----------------+--------------+--------------+
| 1 | 1 | 2 |
| 2 | 1 | 3 |
| 3 | 1 | 4 |
| 4 | 1 | 5 |
| .. | .. | .. |
| (n^2-n)/2 | .. | .. |
+----------------+--------------+--------------+
but for 100 pictures, that would be (100^2 - 100)/2 = 4950 (edit) rows, and with every added picture the table would grow quadratically (which is not a big issue in today's computing, I suppose).
But how do I query this table so that the user always sees as few duplicates as possible?
Expected result:
run 1: picture_id's = 4,5 (any numbers between 1 and n)
run 2: picture_id's = 2,7
run 3: picture_id's = 5 and 20
...
DEMO: http://rextester.com/VNWIOA4679 (added 100 sample pictures); the query takes 2 seconds for 1 user without any indexes.
I see no need for a helper table, as the data can easily be constructed on the fly with proper indexes. At 1000 pictures you're looking at 499,500 combinations a user could vote on, which is still easily managed within a database because we operate at the set level, not the record level.
Here's one way, assuming my own table structures. I can't think of a more efficient way to store/process the data.
With this approach, as new pictures are added the query generates a larger and larger combination set but always excludes the pairs on which a user has already voted. There are no code changes for new pictures and no regenerating of sets; each run just processes the pairs the user hasn't yet made a selection on.
Create table SO46205797_Pics(
PICID int);
Insert into SO46205797_Pics values (1);
Insert into SO46205797_Pics values (2);
Insert into SO46205797_Pics values (3);
Insert into SO46205797_Pics values (4);
Insert into SO46205797_Pics values (5);
Insert into SO46205797_Pics values (6);
Insert into SO46205797_Pics values (7);
Create table SO46205797_UserPicResults (
USERID int,
PICID int,
PICID2 int,
PICChoiceID int);
Insert into SO46205797_UserPicResults values (1,1,2,1);
Insert into SO46205797_UserPicResults values (1,1,3,1);
Insert into SO46205797_UserPicResults values (1,1,4,4);
The magic happens here; the above was just data setup.
SELECT A.PICID, B.PICID, C.PICChoiceID
FROM SO46205797_Pics A
INNER JOIN SO46205797_Pics B
on A.PICID < B.PICID
LEFT JOIN SO46205797_UserPicResults C
on A.PICID = C.PicID
and B.PICID = C.PICID2
and C.USERID = 1
WHERE C.userID is null;
Note that if we eliminate the C.userID is null part, we see all of the possible combinations of 2 photos (for user 1) and which ones the user has selected (note I treat IDs 1, 2 the same as IDs 2, 1, which I think you want). Since we don't want to display a pair again, we use C.userID is null to exclude combinations the user has already made a choice for.
Also, when saving data to SO46205797_UserPicResults, you need to ensure PICID is always less than PICID2.
A different way to do this is using NOT EXISTS, which may be slightly faster.
Indexes on USERID, PICID, PICID2, in that order, would be beneficial for SO46205797_UserPicResults (I'd probably make that a combined PK), as would an index on PICID for SO46205797_Pics as the PK.
SELECT A.PICID, B.PICID
FROM SO46205797_Pics A
INNER JOIN SO46205797_Pics B
on A.PICID < B.PICID
WHERE not exists (SELECT *
FROM SO46205797_UserPicResults C
WHERE A.PICID = C.PicID
and B.PICID = C.PICID2
and C.USERID = 1);
I considered maintaining a parent/child relationship for each image for each user; but this approach doesn't store the choices for all combinations.
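As a sketch of the NOT EXISTS approach from this answer, the following uses Python's sqlite3 as a stand-in for MySQL, with shortened, hypothetical table names (pics, user_pic_results) and a hypothetical helper record_vote that stores each pair with the smaller ID first, as the answer requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pics (picid INTEGER PRIMARY KEY);
CREATE TABLE user_pic_results (
  userid INTEGER, picid INTEGER, picid2 INTEGER, picchoiceid INTEGER,
  PRIMARY KEY (userid, picid, picid2)  -- combined PK, as suggested above
);
""")
conn.executemany("INSERT INTO pics VALUES (?)", [(i,) for i in range(1, 8)])


def record_vote(user_id, pic_a, pic_b, choice):
    """Store a vote with the smaller picture ID in picid."""
    low, high = sorted((pic_a, pic_b))
    conn.execute(
        "INSERT INTO user_pic_results VALUES (?, ?, ?, ?)",
        (user_id, low, high, choice),
    )


def unseen_pairs(user_id):
    """All pairs (a < b) the given user has not voted on yet."""
    return conn.execute("""
        SELECT a.picid, b.picid
          FROM pics a
          JOIN pics b ON a.picid < b.picid
         WHERE NOT EXISTS (SELECT 1
                             FROM user_pic_results c
                            WHERE c.picid = a.picid
                              AND c.picid2 = b.picid
                              AND c.userid = ?)
         ORDER BY a.picid, b.picid
    """, (user_id,)).fetchall()


# Same three votes as the sample data; argument order does not matter.
record_vote(1, 1, 2, 1)
record_vote(1, 3, 1, 1)
record_vote(1, 4, 1, 4)

pairs = unseen_pairs(1)
print(len(pairs), pairs[:3])
```

With 7 pictures there are 21 possible pairs, so user 1 is left with 18 after three votes, while a user with no votes still sees all 21. To show a single pair, the application could pick one row at random from this result.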
The goal of this application is to let people vote for one picture against another, right? Then you need to have some kind of vote results table:
vote_results:
| vote_id | user_id | vote_up_picture_id | vote_down_picture_id | ...
Then, based on data from this table, you can easily show a user picture pairs that they haven't seen yet:
select first.picture_id, second.picture_id
from pictures as first, pictures as second
where not exists(
select * from vote_results v
where v.user_id = 1 -- example value: id of the current user
and ((v.vote_up_picture_id = first.picture_id and v.vote_down_picture_id = second.picture_id)
or (v.vote_up_picture_id = second.picture_id and v.vote_down_picture_id = first.picture_id))
) and first.picture_id != second.picture_id
order by rand()
limit 1
PS. As you can see, there is no need for a helper table with combination_id.

mysql update multiple columns vs compare values in multiple rows

Okay, we're talking millions of rows here.
The structure is like this:
EXAMPLE 1
some_data_before this| x_counter_total | y_counter_total | x_counter_week | y_counter_week | x_counter_year | y_counter_year
--------------------------------------------------------------------------------------------------------------------------------------
some_data_here... | 42142142....... | 241242142..... | 23214124...... | .............. | .............. | ..............
Every X and Y event increments these columns, versus this:
EXAMPLE 2
table A
some_data_before this| x_counter_total | y_counter_total |
----------------------------------------------------------
some_data_here...... | 42142142....... | 241242142..... |
table B
key_connected_with_table_A | x_event | y_event | occured_timestamp
-------------------------------------------------------------------
id 21...................... | true | false | current_timestamp
This is what I need: the number of X and Y events over some period, such as the past day/week/month/year.
My question: is it better to update (increment) multiple columns describing the time periods I need, as in EXAMPLE 1, or is it better to add a row on each event, as in EXAMPLE 2, and then count the rows with the same ID WHERE current_timestamp - occured_timestamp < TIMESTAMP_OF_A_WEEK, for example? Which one is more efficient? We're talking millions of records and thousands of requests per minute.
No, I would keep them in a single table, since then I would need to fire only one UPDATE statement. If you separate them into 2 tables, you will need to either execute 2 UPDATE statements, create an AFTER UPDATE trigger to insert into the other table, or do an update join to update the respective values in both tables, which to me looks like more of a performance hit than having all the columns in a single table.
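The single-UPDATE idea from this answer might look like the following minimal sketch, which uses Python's sqlite3 as a stand-in for MySQL. The table name counters, the key column item_id, and the helper record_x_event are all hypothetical; only the x_counter_* column names come from EXAMPLE 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE counters (
  item_id INTEGER PRIMARY KEY,
  x_counter_total INTEGER NOT NULL DEFAULT 0,
  x_counter_week  INTEGER NOT NULL DEFAULT 0,
  x_counter_year  INTEGER NOT NULL DEFAULT 0
);
INSERT INTO counters (item_id) VALUES (21);
""")


def record_x_event(item_id):
    """Bump every X counter for one row in a single UPDATE statement."""
    conn.execute("""
        UPDATE counters
           SET x_counter_total = x_counter_total + 1,
               x_counter_week  = x_counter_week + 1,
               x_counter_year  = x_counter_year + 1
         WHERE item_id = ?
    """, (item_id,))


for _ in range(3):
    record_x_event(21)

row = conn.execute(
    "SELECT x_counter_total, x_counter_week, x_counter_year "
    "FROM counters WHERE item_id = 21"
).fetchone()
print(row)
```

The sketch does not cover how the per-period counters are reset when a week or year ends; that would need a separate scheduled job, and it is one of the trade-offs against the event-row design in EXAMPLE 2.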

Reorder rows in a MySQL table

I have a table:
+--------+-------------------+-----------+
| ID | Name | Order |
+--------+-------------------+-----------+
| 1 | John | 1 |
| 2 | Mike | 3 |
| 3 | Daniel | 4 |
| 4 | Lisa | 2 |
| 5 | Joe | 5 |
+--------+-------------------+-----------+
The order can be changed by an admin, hence the Order column. On the admin side I have a form with a select box, Insert After:, for adding entries to the database. What query should I use to increment the order of every row after the inserted one?
I want to do this in a way that keeps server load to a minimum, because this table has 1200 rows at present. Is this the correct way to save the order of a table, or is there a better way?
Any help appreciated.
EDIT:
Here's what I want to do, thanks to itsmatt:
If you want to move row number 1 to after row 1100, you leave 2-1100 the same, then modify 1 to be 1101 and increment 1101-1200.
You need to do this in steps. MySQL does not allow an UPDATE to select from the table it is updating in a subquery (error 1093), so first read the order number of the row you're inserting after:
SELECT `Order` INTO @after_order
FROM MyTable
WHERE ID = <insert-after-id>;
Then:
UPDATE MyTable
SET `Order` = `Order` + 1
WHERE `Order` > @after_order;
...which will shift the order number of every row further down the list than the person you're inserting after.
Then:
INSERT INTO MyTable (Name, `Order`)
VALUES ('<name>', @after_order + 1);
...to insert the new row (assuming ID is auto increment), with an order number of one more than the person you're inserting after.
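The shift-then-insert steps above can be sketched end to end with the sample table from the question. This uses Python's sqlite3 as a stand-in for MySQL; the helper insert_after and the new name 'Anna' are hypothetical, and the order number is read into a Python variable first so the UPDATE does not need a subquery on its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (
  ID INTEGER PRIMARY KEY AUTOINCREMENT,
  Name TEXT NOT NULL,
  "Order" INTEGER NOT NULL
);
INSERT INTO MyTable (Name, "Order") VALUES
  ('John', 1), ('Mike', 3), ('Daniel', 4), ('Lisa', 2), ('Joe', 5);
""")


def insert_after(name, after_id):
    """Insert a row directly after the row with ID after_id."""
    with conn:  # run all three statements in one transaction
        (after_order,) = conn.execute(
            'SELECT "Order" FROM MyTable WHERE ID = ?', (after_id,)
        ).fetchone()
        # Make room: shift every row that sorts after the anchor row.
        conn.execute(
            'UPDATE MyTable SET "Order" = "Order" + 1 WHERE "Order" > ?',
            (after_order,),
        )
        conn.execute(
            'INSERT INTO MyTable (Name, "Order") VALUES (?, ?)',
            (name, after_order + 1),
        )


insert_after("Anna", 4)  # ID 4 is Lisa, whose Order is 2

names = [r[0] for r in conn.execute('SELECT Name FROM MyTable ORDER BY "Order"')]
print(names)
```

Wrapping the statements in one transaction keeps other readers from seeing the table between the shift and the insert; the display order always comes from ORDER BY, not from the physical row order.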
Just add the new row in any normal way and let a later SELECT use ORDER BY to sort. 1200 rows is infinitesimally small by MySQL standards. You really don't have to (and don't want to) keep the physical table sorted. Instead, use keys and indexes to access the table in a way that will give you what you want.
You can use:
insert into tablename (name, `order`)
select 'name', `order` + 1 from tablename where name = 'name';
You can also use id = id_val in the select.
Hopefully this is what you're after; the question isn't altogether clear.