Adding a constraint to both tables independently in join - mysql

I am trying to create a SELECT statement from two different tables, which I use to send messages. I don't want to send a message to someone for 1 week after they join; however, if I have already sent them a message, I want to wait 10 days before sending the next one.
Here's what I have:
SELECT c.*, g.resToMeeting, g.replied, g.conlevel, g.coffee, g.lastMessageSent
FROM connections c
INNER JOIN groupmembers g
ON c.Id=g.Id
AND g.groupN=244 # this decides the client I'm pulling for...(or group they own)
AND g.client=c.client
AND g.conlevel=1
AND (datediff(curdate(), g.lastMessageSent) > 10 OR datediff(curdate(), c.dateConnected) > 7)
AND c.validated=1
AND c.process_rank=0
ORDER BY c.dateAdded ASC
LIMIT 0, 200
The trouble I'm having is that it shows EITHER people who haven't joined within the last week OR people who haven't received a message within the last 10 days. It seems that it isn't working:
I received a record that had lastMessageSent as 2015-04-29 (which isn't 10 days ago), but the dateConnected was 2015-04-15, which was over 7 days ago. How can I enforce both rules "together", not either/or? Sometimes there is no data in lastMessageSent or dateConnected, and that should be OK.

Generally speaking, the ON clause should say how the tables are connected.
After that, you have a WHERE clause that lists whatever filtering you need for any of the tables.
I suspect you should have had:
... INNER JOIN groupmembers g
ON c.Id=g.Id
WHERE g.groupN=244 ...
Edit
I think you want this, not XOR:
AND g.lastMessageSent < NOW() - INTERVAL 10 DAY -- Avoid frequent spamming
AND c.dateConnected < NOW() - INTERVAL 7 DAY -- Wait a while before first message
Note that I reformulated the comparisons: it is always good to have the bare column on one side and a constant expression on the other. This allows for the possibility of using an INDEX; hiding the column inside DATEDIFF() does not.
CURDATE counts back from midnight this morning; NOW counts back from this second. (You pick; I am merely suggesting an alternative.)
Caution: If "no message ever sent" is stored as NULL in g.lastMessageSent, then the comparison above evaluates to NULL, which is not true, so the row is dropped. So perhaps you need
AND ( g.lastMessageSent IS NULL
OR g.lastMessageSent < CURDATE() - INTERVAL 10 DAY ) -- Avoid frequent spamming
AND c.dateConnected < CURDATE() - INTERVAL 7 DAY -- Wait a while before first message
"Either..or", in my understanding, is OR. "Either, but not both" is XOR.
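To make the NULL-aware rule concrete, here is a minimal Python sketch of the same predicate (not MySQL itself). `None` stands in for SQL NULL, the function name `eligible` is made up for illustration, and the "current date" of 2015-05-01 is an assumption chosen so that the record from the question is 2 days past its last message and 16 days past its connection date.

```python
from datetime import date, timedelta

def eligible(today, last_message_sent, date_connected):
    """Mirror of the suggested WHERE clause; None stands in for SQL NULL."""
    # No message yet, or the last one is more than 10 days old.
    message_ok = (last_message_sent is None
                  or last_message_sent < today - timedelta(days=10))
    # Connected more than 7 days ago. A NULL dateConnected fails here,
    # just as the SQL comparison would not evaluate to true.
    connected_ok = (date_connected is not None
                    and date_connected < today - timedelta(days=7))
    return message_ok and connected_ok

today = date(2015, 5, 1)  # assumed "current date" for the example

# The record from the question: messaged 2 days ago, connected 16 days ago.
print(eligible(today, date(2015, 4, 29), date(2015, 4, 15)))  # False
# Never messaged, connected 16 days ago.
print(eligible(today, None, date(2015, 4, 15)))               # True
# Never messaged, connected 3 days ago.
print(eligible(today, None, date(2015, 4, 28)))               # False
```

If a missing dateConnected should also be allowed through, the same IS NULL pattern would have to be applied to that column as well.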

Why can't you just change:
AND (datediff(curdate(), g.lastMessageSent) > 10 OR datediff(curdate(), c.dateConnected) > 7)
to:
AND datediff(curdate(), g.lastMessageSent) > 10
AND datediff(curdate(), c.dateConnected) > 7

"EITHER" is an XOR. i.e. (A and not B) or (not A and B), so change this condition
AND (datediff(curdate(), g.lastMessageSent) > 10 OR datediff(curdate(), c.dateConnected) > 7)
To
AND ( ((datediff(curdate(), g.lastMessageSent) > 10 AND datediff(curdate(), c.dateConnected) <= 7)) OR ((datediff(curdate(), g.lastMessageSent) <= 10 AND datediff(curdate(), c.dateConnected) > 7)) )
Note that instead of using NOT, I inverted the condition.
EDIT
Since MySQL has XOR operator, it is possible to rewrite the line as:
AND (datediff(curdate(), g.lastMessageSent) > 10 XOR datediff(curdate(), c.dateConnected) > 7)
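For comparing the candidate operators, a small truth table may help. This is a plain Python sketch that ignores NULLs; A stands for "last message is more than 10 days old" and B for "connected more than 7 days ago" (labels chosen here for illustration).

```python
# A: last message is more than 10 days old; B: connected more than 7 days ago.
rows = []
for a in (False, True):
    for b in (False, True):
        # Columns: A, B, A OR B, A XOR B, A AND B
        rows.append((a, b, a or b, a != b, a and b))

print("A      B      OR     XOR    AND")
for row in rows:
    print("  ".join(f"{str(v):5}" for v in row))
```

The record described in the question corresponds to A false, B true. In that row both OR and XOR come out true, and only AND comes out false, so if the goal is to exclude that record, AND appears to be the operator that does it.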

Related

performance of big INSERT with data from huge select

We are generating some crosssell-data for a shop. We want to display stuff like "customers who looked at this product, also looked at these products".
To generate this data, we do this query on a daily routine from session-based product-viewed-data.
INSERT INTO
product_viewed_together
(
product,
product_associate,
viewed
)
SELECT
v.product,
v2.product,
COUNT(*)
FROM
product_view v
INNER JOIN
product_view v2
ON
v2.session = v.session
AND v2.product != v.product
AND DATE_ADD(v2.created, INTERVAL %d DAY) > NOW()
WHERE
DATE_ADD(v.created, INTERVAL %d DAY) > NOW()
GROUP BY
v.product,
v2.product;
Table product_view is joined to itself. As this table is quite big (circa 26 million rows), the result is even bigger. The query consumes a huge amount of resources and time.
I am not sure we chose a layout that fits the problem well. Is there a better way to store and generate this data?
Make the date tests sargable:
DATE_ADD(v.created, INTERVAL %d DAY) > NOW()
-->
v.created > NOW() - INTERVAL %d DAY
Is product_view a VIEW? Or a TABLE? If a table, provide two "covering" indexes:
INDEX(created, session, product) -- (for v)
INDEX(session, created, product) -- (for v2)
Perhaps all the counts you get are even? This bug can be fixed in about 3 ways, each will double the speed. I think the optimal one is to change one line in the ON to
DATE_ADD(v2.created, INTERVAL %d DAY) > NOW()
-->
v2.created > v.created
I think that will double the speed.
However, the counts may not be exactly correct if you can have two different products with the same created.
Another issue: You will end up with
prod assoc CT
123 234 43
234 123 76 -- same pair, opposite order
My revised test says that 234 came before 123 more often than the other way.
Give those things a try. But if you still need more, I have another, more invasive, thought.
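The effect of replacing the second date test with v2.created > v.created can be sketched in Python on a few made-up rows. The data, the function name `count_pairs`, and the use of small integers for created are all assumptions for illustration; the nested loop over each session plays the role of the self-join.

```python
from collections import Counter
from itertools import permutations

# Hypothetical views: (session, product, created) with distinct created values.
views = [
    ("s1", 123, 1), ("s1", 234, 2), ("s1", 345, 3),
    ("s2", 234, 1), ("s2", 123, 2),
]

def count_pairs(views, ordered):
    counts = Counter()
    by_session = {}
    for session, product, created in views:
        by_session.setdefault(session, []).append((product, created))
    for rows in by_session.values():
        # permutations(rows, 2) plays the role of the self-join on session.
        for (p1, c1), (p2, c2) in permutations(rows, 2):
            if p1 == p2:
                continue  # v2.product != v.product
            if ordered and not c2 > c1:
                continue  # v2.created > v.created
            counts[(p1, p2)] += 1
    return counts

both_ways = count_pairs(views, ordered=False)
one_way = count_pairs(views, ordered=True)
print(sum(both_ways.values()), sum(one_way.values()))  # 8 4
print(one_way[(123, 234)], one_way[(234, 123)])        # 1 1
```

With the ordering condition, each pair of views in a session is visited once instead of twice, which is where the halving comes from. The last line shows the side effect mentioned above: the same two products can still appear in both orders, depending on which was viewed first.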

Using DateDiff to find examples where table_1.column A and table_2.column B have <90 mins difference

I am trying to find a way to filter records where the difference in two date/time fields is less than 90 minutes.
Example:
orders.created_at = 2015-08-09 20:30:20
table2.created_at = 2015-08-09 20:09:30
I have tried using TimeDiff, although I don't understand how the syntax would apply to this example.
Data comes from separate tables, both linking to the same order information. The aim is to find examples where an order has been placed within 90 minutes, but a third field has not been updated. I would be using an AND condition to include only results where a third field is NULL.
In MySQL, you need to use TIMEDIFF() or UNIX_TIMESTAMP() for this. I prefer the UNIX_TIMESTAMP() solution because it's simpler:
WHERE
thirdfield IS NULL
AND UNIX_TIMESTAMP(orders.created_at) - UNIX_TIMESTAMP(table2.created_at) < 5400
Does that work both ways, or do you have to expand to "(A-B < 5400 AND A-B >0) OR (B-A <5400 AND B-A > 0)"?
((UNIX_TIMESTAMP(orders.created_at) - UNIX_TIMESTAMP(table2.created_at) < 5400)
AND (UNIX_TIMESTAMP(orders.created_at) - UNIX_TIMESTAMP(table2.created_at) > 0))
OR ((UNIX_TIMESTAMP(table2.created_at) - UNIX_TIMESTAMP(orders.created_at) < 5400)
AND (UNIX_TIMESTAMP(table2.created_at) - UNIX_TIMESTAMP(orders.created_at) > 0))
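On the "both ways" question: a one-sided test such as A - B < 5400 is also satisfied by any negative difference, however large. A shorter symmetric form takes the absolute value of the difference, which in MySQL would presumably be ABS(UNIX_TIMESTAMP(orders.created_at) - UNIX_TIMESTAMP(table2.created_at)) < 5400. Here is a minimal Python sketch of that idea using the timestamps from the question; the helper name `within_90_minutes` is made up.

```python
from datetime import datetime

def within_90_minutes(a, b):
    """True when the two timestamps are less than 5400 seconds apart, in either order."""
    return abs((a - b).total_seconds()) < 5400

order_created = datetime(2015, 8, 9, 20, 30, 20)
table2_created = datetime(2015, 8, 9, 20, 9, 30)

print((order_created - table2_created).total_seconds())   # 1250.0
print(within_90_minutes(order_created, table2_created))   # True
print(within_90_minutes(table2_created, order_created))   # True
```

One difference from the expanded version: identical timestamps (a difference of exactly 0) count as a match here, whereas the `> 0` conditions exclude them.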

pulling records from mysql where the difference between 2 dates is greater than 8 days

As I understand this documentation, using INTERVAL 8 DAY will return any records older than 8 days.
In my statement, $moztimestampnow is the current date in the format 2015-05-21, and moztimestamp is the column in the DB that contains the earlier date I need to compare against.
I am not sure whether I can use moztimestamp as the column name in this statement, and it is not working.
How do I get the difference in days?
$moztimestampnow = date('Y-m-d');
SELECT *,DATEDIFF('$moztimestampnow',moztimestamp) INTERVAL 8 DAYS FROM backlinks WHERE user_id = '$user_id' LIMIT 10
First, you are misinterpreting the documentation. The INTERVAL keyword is for adding values to dates. If you want to filter data, you need to use the WHERE clause.
In your case, the best where clause looks like this:
SELECT bl.*, DATEDIFF('$moztimestampnow', moztimestamp)
FROM backlinks bl
WHERE user_id = '$user_id' and
moztimestamp <= DATE_SUB(CURDATE(), INTERVAL 8 DAY)
LIMIT 10
This can take advantage of an index on backlinks(user_id, moztimestamp). In addition, you probably should have an ORDER BY clause. That is expected when using LIMIT.
Your syntax doesn't make sense. Try something like:
SELECT *
FROM backlinks
WHERE DATE_SUB(moztimestamp, INTERVAL 8 DAY) > '$moztimestampnow'
AND user_id = '$user_id'
LIMIT 10
I don't follow your objective, so you may have to change the order and direction in the first WHERE clause.

MYSQL First and last datetime within a day

I have a table with 3 days of data (about 4000 rows). The 3 sets of data are all from a 30 minutes session. I want to have the start and ending time of each session.
I currently use this SQL, but it's quite slow (even with only 4000 records). The datetime column is indexed, but I think the index is not used properly because of the conversion from datetime to date.
The table layout is fixed, so I cannot change any part of it. The query takes about 20 seconds to run (and longer every day). Does anyone have some good tips to make it faster?
select distinct
date(a.datetime) datetime,
(select max(b.datetime) from bike b where date(b.datetime) = date(a.datetime)),
(select min(c.datetime) from bike c where date(c.datetime) = date(a.datetime))
from bike a
Maybe I'm missing something, but...
Isn't the result returned by the OP query equivalent to the result from this query:
SELECT DATE(a.datetime) AS datetime
, MAX(a.datetime) AS max_datetime
, MIN(a.datetime) AS min_datetime
FROM bike a
GROUP BY DATE(a.datetime)
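One way to check the claimed equivalence is to run both queries on a tiny data set. The sketch below uses Python's built-in sqlite3 module with an in-memory database, so it runs on SQLite rather than MySQL, and the five sample rows are invented. The first alias is renamed to `day` purely for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bike (datetime TEXT)")
conn.executemany(
    "INSERT INTO bike (datetime) VALUES (?)",
    [
        ("2015-06-01 10:00:00",), ("2015-06-01 10:15:00",), ("2015-06-01 10:30:00",),
        ("2015-06-02 18:05:00",), ("2015-06-02 18:35:00",),
    ],
)

# The GROUP BY form: one pass over the table, one output row per day.
grouped = conn.execute(
    """
    SELECT DATE(a.datetime) AS day,
           MAX(a.datetime)  AS max_datetime,
           MIN(a.datetime)  AS min_datetime
    FROM bike a
    GROUP BY DATE(a.datetime)
    ORDER BY day
    """
).fetchall()

# The original form: two correlated subqueries evaluated per row, then DISTINCT.
correlated = conn.execute(
    """
    SELECT DISTINCT
           DATE(a.datetime) AS day,
           (SELECT MAX(b.datetime) FROM bike b WHERE DATE(b.datetime) = DATE(a.datetime)),
           (SELECT MIN(c.datetime) FROM bike c WHERE DATE(c.datetime) = DATE(a.datetime))
    FROM bike a
    ORDER BY day
    """
).fetchall()

print(grouped)
print(grouped == correlated)  # True
```

On this sample both forms return the same two rows, one per day, with the last and first timestamp of each session.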
Alex, warning, this is typed "freehand" so may have some syntax problems. But it kind of shows what I was trying to convey.
select distinct
date(a.datetime) datetime,
(select max(b.datetime) from bike b where b.datetime between date(a.datetime) and (date(a.datetime) + interval 1 day - interval 1 second)),
(select min(c.datetime) from bike c where c.datetime between date(a.datetime) and (date(a.datetime) + interval 1 day - interval 1 second))
from bike a
Instead of comparing date(b.datetime), it allows comparing the actual b.datetime against a range calculated from the a.datetime. Hopefully this helps you out and does not make things murkier.

Mysql Date manipulation and divisible by

I have a cron that runs some php with some mysql just after midnight everyday. I want to take all registered users (to my website) and send them a reminder and copy of the newsletter. However I want to do this every 30 days from their registration.
I have thought as far as this:
SELECT * FROM users
WHERE DATE(DT_stamp) = DATE(NOW() - INTERVAL 30 DAY)
But this will only work for 30 days after they have registered, not 60 and 90.
Effectively I want:
Where days since registration is divisible by 30
That way every 30 days that user will get picked up in the sql.
Can someone help me formulate this WHERE clause? I am struggling with MySQL: where day(date1-date2) is divisible by 30.
The DATEDIFF function returns the difference between two dates in days, ignoring the time:
SELECT * FROM users
WHERE DATEDIFF(DT_stamp, NOW()) % 30 = 0
or the other way round...
SELECT * FROM users WHERE MOD(DATEDIFF(NOW(),registration_date),30) = 0;
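Use the SQL modulo function MOD() on the number of days between the two dates. Subtracting two DATE values directly does not yield days in MySQL (the dates are treated as numbers like 20150521), so convert them with TO_DAYS() first:
SELECT * FROM users
WHERE MOD(TO_DAYS(NOW()) - TO_DAYS(DT_stamp), 30) = 0
In MySQL, you can also use the % operator, which does the same thing:
SELECT * FROM users
WHERE (TO_DAYS(NOW()) - TO_DAYS(DT_stamp)) % 30 = 0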
Just an addition (not that nine years had passed :)
If you want to skip today's date you should add
AND DATEDIFF(NOW(), DT_stamp) != 0;
making it
SELECT * FROM users WHERE MOD(DATEDIFF(NOW(), DT_stamp), 30) = 0 AND DATEDIFF(NOW(), DT_stamp) != 0;
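As a quick sanity check of the "divisible by 30, but not day 0" rule, here is a small Python sketch of the same arithmetic. The function name `due_for_reminder` and the registration date are made up for the example; `(today - registered).days` plays the role of DATEDIFF(NOW(), DT_stamp).

```python
from datetime import date

def due_for_reminder(registered, today):
    days = (today - registered).days  # like DATEDIFF(NOW(), DT_stamp)
    return days != 0 and days % 30 == 0

registered = date(2015, 1, 1)
print(due_for_reminder(registered, date(2015, 1, 1)))   # False (day 0)
print(due_for_reminder(registered, date(2015, 1, 31)))  # True  (day 30)
print(due_for_reminder(registered, date(2015, 3, 2)))   # True  (day 60)
print(due_for_reminder(registered, date(2015, 2, 1)))   # False (day 31)
```

One thing to keep in mind with this approach: if the cron job fails to run on a given day, the users who were due that day are not picked up again until 30 days later.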