Is this a case for denormalisation? - mysql

I have a site with about 30,000 members to which I'm adding functionality that involves sending a random message from a pool of 40 possible messages. Members can never receive the same message twice.
One table contains the 40 messages and another table maps the many-to-many relationship between messages and members.
A cron script runs daily, selects a member from the 30,000, selects a message from the 40 and then checks to see if this message has been sent to this user before. If not, it sends the message. If yes, it runs the query again until it finds a message that has not yet been received by this member.
What I'm worried about now is that this m-m table will become very big: at 30,000 members and 40 messages we already have 1.2 million rows through which we have to search to find a message that has not yet been sent.
Is this a case for denormalisation? In the members table I could add 40 columns (message_1, message_2 ... message_40) in which a 1 flag is added each time a message is sent. If I'm not mistaken, this would make the queries in the cron script run much faster?

I know that doesn't answer your original question, but wouldn't it be way faster if you selected all the messages that weren't yet sent to a user and then selected one of those randomly?
See this pseudo-MySQL:
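SELECT
  GROUP_CONCAT(messages.id) AS unsent_messages, -- comma-separated ids not yet sent
  user.id AS user
FROM
  messages,
  user
WHERE
  messages.id NOT IN (
    SELECT
      sent_messages.message -- assumes sent_messages has (user, message) columns
    FROM
      sent_messages
    WHERE
      sent_messages.user = user.id
  )
GROUP BY user.id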
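The idea of selecting only the unsent messages and then choosing one at random could be sketched as follows. This is a minimal illustration using Python's sqlite3 module rather than MySQL; the table and column names (messages.id, sent_messages.user, sent_messages.message) and the function name pick_unsent_message are assumptions made for the example, not the asker's real schema.

```python
import random
import sqlite3


def pick_unsent_message(conn, user_id, rng=random):
    """Return a random message id not yet sent to user_id, or None if all were sent."""
    rows = conn.execute(
        "SELECT id FROM messages "
        "WHERE id NOT IN (SELECT message FROM sent_messages WHERE user = ?) "
        "ORDER BY id",
        (user_id,),
    ).fetchall()
    if not rows:
        return None
    message_id = rng.choice(rows)[0]
    # Record the delivery so the same message is never picked again for this user.
    conn.execute(
        "INSERT INTO sent_messages (user, message) VALUES (?, ?)",
        (user_id, message_id),
    )
    return message_id


# Hypothetical schema: 40 messages and an initially empty many-to-many table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE sent_messages ("
    "user INTEGER, message INTEGER, PRIMARY KEY (user, message))"
)
conn.executemany("INSERT INTO messages (id) VALUES (?)", [(i,) for i in range(1, 41)])
```

With the composite primary key on (user, message), the subquery only has to look at one member's rows (at most 40), so the cron job never needs to retry in a loop.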

You could also append the id of the sent messages to a varchar-field in the members-table.
Although this goes against good practice, it would make it easy to use a single statement to get a message that has not yet been sent to a specific member.
For example (if you surround each id with '-'):
SELECT message.id
FROM member, message
WHERE member.id = 2321
AND member.sentmessages NOT LIKE CONCAT('%-', message.id, '-%')

1.2 M rows @ 8 bytes (+ overhead) per row is not a lot. It's so small I wouldn't even bet it needs indexing (but of course you should do it).

Normalization reduces redundancy, and it is what you want when you have a large amount of data, which seems to be your case. You need not denormalize. Keep an M-to-M table between members and messages.
You can archive the old data as your M-to-M data increases. I don't even see any conflicts, because your cron job runs daily for this task and accounts only for the data for the current day. So you can archive M-to-M table data every week.
I believe there will be maintenance issues if you denormalize by adding additional columns to the members table, so I don't recommend it. Archiving old data can save you from trouble.

You could store only available (unsent) messages. This implies extra maintenance when you add or remove members or message types (nothing that can't be automated with foreign keys and triggers) but simplifies delivery: pick a random row for each user, send the message and remove the row. Also, your database will get smaller as messages get sent ;-)
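A minimal sketch of this "store only the unsent messages" idea, again using Python's sqlite3 module as a stand-in for MySQL. The table name unsent, its columns, and the helper names enqueue_all and pop_random_message are hypothetical; SQLite's RANDOM() plays the role of MySQL's RAND().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per (member, message) pair that is still unsent.
conn.execute(
    "CREATE TABLE unsent ("
    "member_id INTEGER, message_id INTEGER, PRIMARY KEY (member_id, message_id))"
)


def enqueue_all(conn, member_ids, message_ids):
    """Pre-fill the table with every message for every member."""
    message_ids = list(message_ids)
    conn.executemany(
        "INSERT INTO unsent (member_id, message_id) VALUES (?, ?)",
        [(member, message) for member in member_ids for message in message_ids],
    )


def pop_random_message(conn, member_id):
    """Pick a random unsent message for the member and remove its row."""
    row = conn.execute(
        "SELECT message_id FROM unsent WHERE member_id = ? ORDER BY RANDOM() LIMIT 1",
        (member_id,),
    ).fetchone()
    if row is None:
        return None  # nothing left to send to this member
    conn.execute(
        "DELETE FROM unsent WHERE member_id = ? AND message_id = ?",
        (member_id, row[0]),
    )
    return row[0]


enqueue_all(conn, [1, 2], range(1, 41))
```

The table starts at its maximum size (members times messages) and only shrinks, which is the trade-off the answer describes.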

You can achieve the effect of sending random messages by preallocating a random string of message ids in your m-m table, together with a pointer to the offset of the last message sent.
In more detail, create a table MemberMessages with columns
memberId,
messageIdList char(80) or varchar ,
lastMessage int,
primary key is memberId.
Pseudo-code for the cron job then looks like this...
ONE. Select the next message for a member. If no row exists in MemberMessages for this member, go to step TWO. The SQL to select the next message looks like
select substr(messageIdList, 2*lastMessage + 1, 2) as nextMessageId
from MemberMessages
where memberId = ?
Send the message identified by nextMessageId,
then update lastMessage, incrementing it by 1, unless it has reached 39, in which case reset it to zero.
update MemberMessages
set lastMessage = MOD(lastMessage + 1, 40)
where memberId = ?
TWO. Create a random list of messageIds as a string of two-digit couplets like 2117390740... This is your random list of message IDs as an 80-char string. Insert a row into MemberMessages for your memberId, setting messageIdList to your 80-char string and lastMessage to 1.
Send the message identified by the first couplet from the list to the member.
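The string handling in the two steps above can be sketched in Python as follows. It assumes message ids run from 1 to 40 and are stored as zero-padded two-digit couplets; the function names are made up for illustration, and the database reads and writes are left out.

```python
import random

MESSAGE_COUNT = 40


def make_message_id_list(rng=random):
    """Build the 80-character string: ids 1..40 in random order, two digits each."""
    ids = list(range(1, MESSAGE_COUNT + 1))
    rng.shuffle(ids)
    return "".join(f"{message_id:02d}" for message_id in ids)


def next_message_id(message_id_list, last_message):
    """Mirror substr(messageIdList, 2*lastMessage + 1, 2) with 0-based slicing."""
    start = 2 * last_message
    return int(message_id_list[start:start + 2])


def advance(last_message):
    """Mirror MOD(lastMessage + 1, 40): move the pointer, wrapping after 39."""
    return (last_message + 1) % MESSAGE_COUNT
```

Note that once the pointer wraps back to zero the member starts receiving the same list again, so a job that must never repeat a message would have to stop at that point instead of wrapping.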

You can create a kind of queue / heap.
ReceivedMessages
UserId
MessageId
then:
Pick up a member and select message to send:
SELECT * FROM Messages WHERE MessageId NOT IN (SELECT MessageId FROM ReceivedMessages WHERE UserId = @UserId) LIMIT 1
then insert MessageId and UserId to ReceivedMessages
and do send logic here
I hope that helps.

There are potentially easier ways to do this, depending on how random you want "random" to be.
Consider that at the beginning of the day you shuffle an array A, [0..39] which describes the order of the messages to be sent to users today.
Also, consider that you have at most 40 cron jobs, which are used to send messages to the users. Given the Nth cron job and the numeric ID of the selected user, you can choose M, the index of the message to send:
M = (A[N] + ID) % 40.
This way, a given ID would not receive the same message twice in the same day (because A[N] would be different), and two randomly selected users have a 1/40 chance of receiving the same message. If you want more "randomness" you can potentially use multiple arrays.
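A small sketch of this scheme, assuming 40 messages indexed 0..39 and job numbers 0..39. The function names daily_order and message_index are invented for the example.

```python
import random

MESSAGE_COUNT = 40


def daily_order(rng=random):
    """Shuffle the indexes 0..39 once at the start of the day (the array A)."""
    order = list(range(MESSAGE_COUNT))
    rng.shuffle(order)
    return order


def message_index(order, job_number, user_id):
    """M = (A[N] + ID) % 40 for the Nth cron job and a numeric user ID."""
    return (order[job_number] + user_id) % MESSAGE_COUNT
```

Because A is a permutation, the 40 values of A[N] are all different, so for a fixed user ID the 40 jobs of one day produce 40 different message indexes. The scheme by itself says nothing about repeats across different days.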

Related

How to create a "Read Message" query from table messages and messages viewed?

I would like to get the messages that someone hasn't read. It could be a count or just a "1" if there are pending messages to read.
The trick is that there are many users sharing the same system. So if user "A" reads a message from the table, the notification won't appear anymore for A, but for "B" there should still be a notification of pending messages. They are sharing the same message, so to speak.
I created a query that somewhat works, but I know it is not 100% right.
I did review
Querying conversations from messages table
sql messages table query
The example below shows the situation.
The last view by "A" for docid 93 was on 2019-01-28 10:02:15. Then user B sent a new message BUT never read the message sent by "A". So in my query, "A" will never be able to see that there was a new message, since he was the last to view the document, and I am not using the message table, only Messages_View. I know this is the wrong part, but I'm just stating how I had it.
SELECT B.*
FROM Comments_Viewed_Tbl B LEFT JOIN Comments_Viewed_Tbl C
ON (B.DOCID =C.DOCID and B.Date_Viewed < C.Date_Viewed)
WHERE C.Date_Viewed IS NULL and B.viewedby <>'A' and
B.RPDOC = 93 and B.Country ='USA'
*Sorry for the image; I did try to put it as text, but the system formatted it badly.
What would be the best approach for the query?
In this scenario A should have an alert or counter for the new message, and so should B, since he/she didn't check A's message and just sent a new one.
So adding a comment is the same as sending a message?
From my point of view, you need to add the CommentID column to the Comments_Viewed_Tbl, otherwise you will never be able to see the read status of each specific comment, only for the whole document.
Otherwise you will need to assume that the last person to add a comment to the document has read all previous comments.
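One way the CommentID suggestion could look, sketched with Python's sqlite3 module. The comments table, its columns, and the unread_count helper are assumptions for illustration only; the real Comments_Viewed_Tbl has other columns (RPDOC, Country, Date_Viewed) that are left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical tables: one row per comment, one row per (comment, reader).
    CREATE TABLE comments (
        comment_id INTEGER PRIMARY KEY,
        doc_id INTEGER,
        author TEXT
    );
    CREATE TABLE comments_viewed (
        comment_id INTEGER,
        viewedby TEXT,
        PRIMARY KEY (comment_id, viewedby)
    );
    """
)


def unread_count(conn, doc_id, user):
    """Count comments on the document written by others that the user has not viewed."""
    return conn.execute(
        "SELECT COUNT(*) FROM comments c "
        "LEFT JOIN comments_viewed v "
        "  ON v.comment_id = c.comment_id AND v.viewedby = ? "
        "WHERE c.doc_id = ? AND c.author <> ? AND v.comment_id IS NULL",
        (user, doc_id, user),
    ).fetchone()[0]


# The scenario from the question: A comments, then B comments without reading A's.
conn.execute("INSERT INTO comments VALUES (1, 93, 'A')")
conn.execute("INSERT INTO comments VALUES (2, 93, 'B')")
```

With the read status stored per comment, both A and B get a pending count of 1 in this scenario, and A's count drops to 0 only when a row (2, 'A') is added to comments_viewed.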

MySQL finding data if any 4 of 5 columns are found in a row

I have an imported table of several thousand customers. The development I am working on runs on the basis of anonymity for purchase checkouts (customers do not need to log in to check out), but if enough of their details match a database record, it should do a soft match, email the (probably new) email address and eventually associate the anonymous checkout with the account record on file.
It is rolling out this way due to the age of the records: many people have the same postal address or names but not the same email address, and likewise some people will have moved house and some will have changed name (marriage etc).
What I think I am looking for is a MySQL CASE system, however the CASE questions on Stack Overflow I've found don't appear to cover what I'm trying to get from this query.
The query should work something like this:
$input[0] = postcode (zip code)
$input[1] = postal address
$input[2] = phone number
$input[3] = surname
$input[4] = forename
SELECT account_id FROM account WHERE <4 or more of the variables listed match the same row>
The only way I KNOW I can do this is with a massive bunch of OR statements but that's excessive and I'm sure there's a cleaner more concise method.
I also apologise in advance if this is relatively easy but I don't [think I] know the keyword to research constructing this. As I say, CASE is my best guess.
I'm having trouble working out how to manipulate CASE to fit what I'm trying to do. I do not need to return the values only the account_id from the valid row (only) that matches 4 or 5 of the given inputs.
I imagine that I could construct a layout that does this:
SELECT account_id CASE <if postcode_column=postcode_var> X=X+1
CASE <if surname_column=surname_var> X=X+1
...
...
WHERE X > 3
Is CASE the right idea?
If not, What is the process I need to use to achieve the desired results?
What is [another] MySQL keyword / syntax I need to research, if not CASE.
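Here is your pseudo query. In MySQL each comparison evaluates to 1 (true) or 0 (false), so the results can simply be added up: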
SELECT account_id
FROM account
WHERE (postcode = 'pc')+
(postal_address = 'pa')+
(phone_number = '12345678901')+
(surname = 'sn')+
(forename= 'fn') > 3
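The same "add up the comparisons" trick can be tried out with Python's sqlite3 module, since SQLite also evaluates a comparison to 1 or 0. The sample rows and the soft_match helper are invented for this sketch; the column names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "account_id INTEGER PRIMARY KEY, postcode TEXT, postal_address TEXT, "
    "phone_number TEXT, surname TEXT, forename TEXT)"
)
# Made-up sample data for the example.
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "AB1 2CD", "1 High St", "0123", "Smith", "Ann"),
        (2, "AB1 2CD", "1 High St", "0999", "Jones", "Bob"),
    ],
)


def soft_match(conn, postcode, postal_address, phone_number, surname, forename, minimum=4):
    """Return account ids where at least `minimum` of the five fields match."""
    rows = conn.execute(
        "SELECT account_id FROM account WHERE "
        "(postcode = ?) + (postal_address = ?) + (phone_number = ?) + "
        "(surname = ?) + (forename = ?) >= ? "
        "ORDER BY account_id",
        (postcode, postal_address, phone_number, surname, forename, minimum),
    ).fetchall()
    return [row[0] for row in rows]
```

One caveat: if a column is NULL, its comparison yields NULL and so does the whole sum, so such a row never matches unless the column is wrapped in something like COALESCE(column, '').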

Table setup for Messaging Functionality (email)

We've created an application whereby members of the public can search for each other and get in contact (think of it as a dating site) if they so desire. I'm currently building the messaging functionality, but I'm unsure how to go about creating the table(s) in the database.
The current flow of the application is as follows:
User1 clicks on User2 to view his/her profile, scrolls down to the bottom of the profile, types some text into a textarea and clicks send. At this point I pass the data to the database and then send User2 an email saying "you have mail", etc.
Taking that into consideration, I would have assumed my Email table within SQL Server would look something like this:
Id (PK) (Increments by 1)
ToUserId (FK) // User who they're getting in contact with
FromUserId (FK) // User who sent the message
Content (nvarchar(3000))
Status (int) // read , new , deleted , sent
EmailDate (datetime)
EmailDeleted (datetime)
But the problem with this setup is that both users may be sending / replying to each other, so I would have multiple entries / statuses in one table, which may become a nightmare to manage / control (unless I'm overthinking it).
I've spent a good few hours browsing the web trying to gain knowledge about building messaging functionality, yet it comes back very shy of results. Has anyone built such functionality who wouldn't mind sharing knowledge with me?
You can break it into two tables, something like this:
CREATE TABLE TblMessage
(
Message_Id int identity(1,1) primary key,
Message_SentDate datetime not null default(getDate()),
Message_Title varchar(100),
Message_Content varchar(max),
Message_SenderId int, -- (fk to users)
Message_IsDraft bit not null default(0), -- when 1 it's saved as draft.
Message_IsDeletedFromOutbox bit not null default(0) -- when 1 don't show on sender outbox
);
CREATE TABLE TblMessageToRecipient
(
MTR_UserId int, -- (fk to users)
MTR_Message_Id int, -- (fk to message)
MTR_ReadDate datetime null, -- if null then status is new
MTR_DeleteDate datetime null, -- if not null then status is deleted
PRIMARY KEY (MTR_UserId, MTR_Message_Id)
);
This way you can give the recipient an option to "delete forever" a message by simply deleting the relevant record from TblMessageToRecipient.
Also, you can delete the message completely from TblMessage if it has no references in TblMessageToRecipient and Message_IsDeletedFromOutbox = 1
(this can be done by a scheduled SQL Agent job to prevent TblMessage from getting too big)
update:
I hope this will answer your question in the comment:
The recipient has several possible statuses:
New - when MTR_ReadDate is null.
Read -when MTR_ReadDate is not null, and MTR_DeleteDate is null.
Recycled - when MTR_DeleteDate is not null.
Deleted - when the record is deleted from TblMessageToRecipient.
The sender has only 3 possible statuses:
Draft.
I've added a bit column to TblMessage called Message_IsDraft. Note that drafts should also save recipient information, so a draft should be saved in both tables; just show it to the sender in the drafts box, and don't show it to the recipient. Note that when the sender discards the draft you should delete the message from both tables.
Sent.
Once the message is in both tables, and Message_IsDraft = 0, and Message_IsDeletedFromOutbox = 0, it means that the message was sent. In this case, show it to the sender in the sent messages box, and show it to the recipient.
Deleted from outbox.
When Message_IsDeletedFromOutbox = 1 you simply don't show the message to the sender.
If the message record does not have any references in TblMessageToRecipient, you can delete the record from TblMessage, since it was deleted by the sender and by all of its recipients.
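The status rules above can be written down as two small functions. This is only a sketch: the function names are made up, and the order of the checks (a set delete date wins over the read date) is an assumption, since the answer does not say which status applies when a message is recycled without having been read.

```python
from datetime import datetime
from typing import Optional


def recipient_status(
    row_exists: bool,
    read_date: Optional[datetime],
    delete_date: Optional[datetime],
) -> str:
    """Derive the recipient's status from the TblMessageToRecipient row."""
    if not row_exists:
        return "Deleted"  # the row was removed ("delete forever")
    if delete_date is not None:
        return "Recycled"  # assumption: recycled takes priority over new/read
    if read_date is None:
        return "New"
    return "Read"


def sender_status(is_draft: bool, is_deleted_from_outbox: bool) -> str:
    """Derive the sender's status from the two bit columns on TblMessage."""
    if is_draft:
        return "Draft"
    if is_deleted_from_outbox:
        return "Deleted from outbox"
    return "Sent"
```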
Update 2:
To summarize our conversation in the comments, there are 2 major ways to keep a conversation structure (meaning the link between a message and a reply to it [and its reply and so on...])
One way is to keep a nullable Message_ParentId column in TblMessage.
This column will contain null for any message that is not a reply to an older message, but for replies it will contain the message id of the message it was a reply to.
The second way is to keep a Message_ConversationId column that will always contain a value. When a message is a reply to an older message, its Message_ConversationId should be the same as its parent message's. When it's not a reply, its Message_ConversationId should be generated. Since we are talking about SQL Server 2008, the easiest way to generate a new conversation id every time is to add a new table called TblConversation. This table can keep a single column, Conversation_Id, that is an int identity column, and to get a new conversation id you do something like this:
DECLARE @ConversationId int
INSERT INTO TblConversation DEFAULT VALUES
SELECT @ConversationId = SCOPE_IDENTITY()
and then use the @ConversationId when inserting a new root message.
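The conversation-id approach can be tried out with Python's sqlite3 module, where cursor.lastrowid plays the role that SCOPE_IDENTITY() plays in SQL Server. The trimmed-down TblMessage and the helper names send_root_message and send_reply are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE TblConversation (
        Conversation_Id INTEGER PRIMARY KEY AUTOINCREMENT
    );
    -- Trimmed-down message table: only the columns needed for the example.
    CREATE TABLE TblMessage (
        Message_Id INTEGER PRIMARY KEY AUTOINCREMENT,
        Message_Content TEXT,
        Message_ConversationId INTEGER NOT NULL
            REFERENCES TblConversation (Conversation_Id)
    );
    """
)


def send_root_message(conn, content):
    """Start a new conversation and insert its first message."""
    conversation_id = conn.execute(
        "INSERT INTO TblConversation DEFAULT VALUES"
    ).lastrowid
    return conn.execute(
        "INSERT INTO TblMessage (Message_Content, Message_ConversationId) VALUES (?, ?)",
        (content, conversation_id),
    ).lastrowid


def send_reply(conn, parent_message_id, content):
    """Insert a reply that inherits the conversation id of its parent."""
    (conversation_id,) = conn.execute(
        "SELECT Message_ConversationId FROM TblMessage WHERE Message_Id = ?",
        (parent_message_id,),
    ).fetchone()
    return conn.execute(
        "INSERT INTO TblMessage (Message_Content, Message_ConversationId) VALUES (?, ?)",
        (content, conversation_id),
    ).lastrowid
```

Fetching a whole thread is then a single filter on Message_ConversationId, whereas the Message_ParentId variant needs a recursive query to walk the reply chain.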

How to select a row just once without checking any fields in SQL

I have a table named tblMessage that has a field named Type. Type can be 1 for user messages or 2 for general messages.
To select user messages I wrote this SQL query:
SELECT * FROM tblMessage WHERE USERID=$Userid and Type=1 and ISseen=0;
Then, in another SQL statement, I update that row and set ISseen=1 for the specific user:
UPDATE tblMessage SET ISseen=1 Where USERID=$userid;
But when selecting general messages I have a little problem. I want to select each general message for every user and show it only once. As I said, I can update tblMessage and set ISseen=1, and once ISseen is 1 the user message is not selected again. But for general messages I can't do this with WHERE USERID=$userid.
I also don't want to insert the general message into the table for every user.
How can I select general messages ONCE per user with my table structure?
*****////******
Edit number 1
My tblMessage has 20,000 records and just 10 general messages. On my website, when a user logs in to his/her account and goes to his/her messages, he/she can see Personal Messages and General Messages. When the user reaches that page, I update tblMessage and set ISseen=1.
Then when that user logs out, comes back to the website and goes to his/her user page, the personal messages are not shown again, because ISseen=1. The general messages, however, are shown again and again, because I don't have any ISseen field for general messages.
My question is: given this table structure, how can I do the same thing with general messages?
Create another table, messageTable, that looks like this:
ID -- Message
1 -- User message
2 -- General message
Now you can use a join to get messages by type.
Your query will look like this:
SELECT messageTable.*,tblMessage.* FROM tblMessage,messageTable WHERE tblMessage.USERID=$Userid and tblMessage.Type=1 and tblMessage.ISseen=0 and tblMessage.Type=messageTable.ID;

delete message for one user but not for the other

Hello, I have a database with 3 tables.
USERS('user_id','name','surname')
MESSAGE_GROUP('user_one','user_two', 'hash')
MESSAGES('from_id','group_hash', 'messages')
My PHP code enables me to send messages between users. My question is how to enable a user to delete a message from their mailbox while the other user can still see it. A message must be fully deleted only if both users delete it. I am not interested in the code; I am interested only in the logic behind this. Any proposals that include MySQL code are welcome. Thanks.
I think you should follow this. :)
You can keep an extra field in the message_group table, something like 'deleted_from', which will initially be 0.
If user one deletes it, set 'deleted_from' = 1; if user two deletes it, set 'deleted_from' = 2.
When you go to delete the message for a user and you find a 'deleted_from' value other than 0, delete the message completely; otherwise set 'deleted_from' to '1' or '2'.
You will need to do one of the following:
Make a new table that specifies the mailboxes the message resides in, so that you can connect it to both users' mailboxes
Or duplicate the message so that each user has their own copy that can be deleted
Or add flags to the message table (not recommended) indicating whether the sender or recipient has deleted it. I would avoid this, as it will not scale well if you have (or intend to add) group messaging.
Add a status field to MESSAGE_GROUP with these values:
0 no owner and should be deleted
1 only the sender owns the message
2 only the receiver owns the message
3 both sender and receiver own it
I would change the fields of the in-between table like this (in this example every user can only send a message to one person at a time):
USERS('user_id','name','surname')
MESSAGE_GROUP('user_id','message_id')
MESSAGES('from_id', 'to_id', 'messages')
So every user that has a message will have a row in MESSAGE_GROUP. When a user deletes the message, delete that user's row in MESSAGE_GROUP.
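A minimal sketch of this last layout using Python's sqlite3 module. The message_id key on the messages table and the helper names send and delete_for_user are assumptions added for the example; the idea is that the message body is removed only when no mailbox row refers to it any more.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- message_id is an assumed surrogate key; the original MESSAGES table has none.
    CREATE TABLE messages (
        message_id INTEGER PRIMARY KEY,
        from_id INTEGER,
        to_id INTEGER,
        message TEXT
    );
    CREATE TABLE message_group (
        user_id INTEGER,
        message_id INTEGER,
        PRIMARY KEY (user_id, message_id)
    );
    """
)


def send(conn, from_id, to_id, text):
    """Store the message once and give sender and recipient a mailbox row each."""
    message_id = conn.execute(
        "INSERT INTO messages (from_id, to_id, message) VALUES (?, ?, ?)",
        (from_id, to_id, text),
    ).lastrowid
    conn.executemany(
        "INSERT INTO message_group (user_id, message_id) VALUES (?, ?)",
        [(user_id, message_id) for user_id in {from_id, to_id}],
    )
    return message_id


def delete_for_user(conn, user_id, message_id):
    """Remove the message from one mailbox; drop it entirely once nobody holds it."""
    conn.execute(
        "DELETE FROM message_group WHERE user_id = ? AND message_id = ?",
        (user_id, message_id),
    )
    remaining = conn.execute(
        "SELECT COUNT(*) FROM message_group WHERE message_id = ?",
        (message_id,),
    ).fetchone()[0]
    if remaining == 0:
        conn.execute("DELETE FROM messages WHERE message_id = ?", (message_id,))
```

Because each mailbox is just a row, the same functions keep working if a message later gets more than two holders, which is the scaling concern raised against the flag-based variants.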