When you want to alert a user of something once (one time notes about new features, upcoming events, special offers, etc.), what's the best way to do it?
I'm mainly concerned with the data representation, but if there are more issues to think about please point them out. This is my first time approaching this particular problem.
So my thoughts so far ...
You could have a users, a messages, and a seen/acknowledged messages table. When the user acknowledges the messages, we have a new entry in the seen table with a user id & message id pair.
However, the seen table will grow rapidly with the number of users and messages. At some point, this would become unwieldy (any insight when that would be for a single mysql db on a single server?).
Would it be better to just create 1 seen table per message and maybe end up with 20-30 such additional tables to start? Not really a problem. It just comes with the added nuisance of having to create a new table every time there is a new message (of course, that would be automated in the code - still a little more coding).
This is for a project that has 2-3K current users, but the hopes are to grow that to 10K over the next year, and of course, we're looking beyond that, too ...
Edit:
I'm not enthusiastic about the currently top-voted method at all. The proposal seems to be to prepopulate a messages table and delete messages as they are seen. This seems to be a lot more work. Not only do you have to add your entire user list each time you add a new message, you also have to add all the messages for a new user each time you add a new user, which is separate logic.
On top of that, the record of a message being "seen" is actually the absence of a record. That does not seem right. Plus, if you later decide to track when messages were seen with a simple timestamp, you'd have to rewrite a lot of code, and other code becomes unusable.
Lastly, could someone tell me why it's so absolutely horrible to add new tables to the database? Doesn't this happen all the time when a new feature is added? Take any CMS: Joomla or Wordpress for example. When you add new plugins, you are creating tables dynamically. So it has to be more nuanced and contextual than "don't do it". What are the pitfalls and what are the circumstances under which you don't do it or it's okay to do?
I can see that you might say: be careful about creating new tables on a production server, and make sure it's been well tested. But ultimately, you're just adding an empty table.
This may require an extended answer, so if anyone knows of any articles, please post them.
Edit: Gabriel Sosa gave a nicely fleshed-out example of his messages table, and I'll simply create a seen table similar to what I originally posted, although with a timestamp column too. Thanks!
You could have the unseen messages listed in a table, and once the message is displayed you delete that row from the table. And you could also delete rows after X weeks, perhaps, whether or not the users see those messages. That would keep the table from growing unbounded. I'm imagining tables like this:
messages
--------
type PRIMARY KEY
text TEXT
unseenMessages
--------------
id PRIMARY KEY
messageType FOREIGN KEY
user FOREIGN KEY
expirationDate DATE
This unseenMessages table would hold all of the messages in your system, once per message per user. When a user loads a page you check if they have any entries in this table. If so you display those messages and then delete them from the table. Think of it like a message "inbox".
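As a rough sketch of the "inbox" idea above, the following uses Python's built-in sqlite3 module rather than MySQL so that it runs anywhere. The table and column names follow the outline above; the column types, the helper names (make_inbox_db, pop_unseen, purge_expired) and the ISO date strings are assumptions made for illustration.

```python
import sqlite3


def make_inbox_db():
    """Create an in-memory database with the two tables outlined above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE messages (
            type TEXT PRIMARY KEY,
            text TEXT NOT NULL
        );
        CREATE TABLE unseenMessages (
            id INTEGER PRIMARY KEY,
            messageType TEXT NOT NULL REFERENCES messages(type),
            user INTEGER NOT NULL,
            expirationDate TEXT NOT NULL  -- ISO date, e.g. '2030-01-01'
        );
    """)
    return conn


def pop_unseen(conn, user_id, today):
    """Return the unexpired message texts queued for user_id, then delete them."""
    rows = conn.execute(
        """SELECT u.id, m.text
           FROM unseenMessages u
           JOIN messages m ON m.type = u.messageType
           WHERE u.user = ? AND u.expirationDate >= ?
           ORDER BY u.id""",
        (user_id, today),
    ).fetchall()
    with conn:  # commit the deletes together
        conn.executemany(
            "DELETE FROM unseenMessages WHERE id = ?",
            [(row_id,) for row_id, _ in rows],
        )
    return [text for _, text in rows]


def purge_expired(conn, today):
    """Delete rows past their expiration date; returns how many were removed."""
    with conn:
        cur = conn.execute(
            "DELETE FROM unseenMessages WHERE expirationDate < ?", (today,)
        )
    return cur.rowcount
```

Calling pop_unseen on page load shows each message once; purge_expired would run from a scheduled job so that unread rows do not pile up.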
Also, I would not do anything that involves dynamic table creation. You should never, ever* create tables on the fly. Ever.
* Except temporary tables, of course.
All of your messages should be stored in one table, or at least a fixed number of predefined tables. It is a cardinal database sin to create tables on the fly. The same goes for adding and removing columns on the fly. You just don't change the database schema dynamically. You don't do that in polite society. If you think you have to, you haven't designed the database correctly.
The programming analogue is the eval() function: it's just one of those things that's almost never a good idea. And in all fairness, eval() is okay in certain situations. Creating tables on the fly never is.
Your volume is not intimidating for a modern RDBMS. Keep in mind that there are many 100's of MILLIONS of records sitting in Twitter's MySQL database and other SQL Server and Oracle databases.
I can see two ways to actually solve this:
1. Set a distinct cookie for the particular message, which is good if these messages are a rarity.
2. Create a couple of tables to hold the message details:
Messages: holds the message definition, including the message text (or HTML in your case?), as well as a message status of active/inactive.
User Messages: a cross-reference table that includes a row if a user has viewed the message. When the user sees and acknowledges a message, you insert a row into this table.
To determine whether a user should see a message, you would query this table together with the active messages using the user's ID. If a result is returned, you bypass the message; otherwise you display it.
I think this gives you room to scale well into the future, since the "User Messages" table would only hold integer association keys between the User/Account table and the Message table. You could also log the user's disposition in the User Messages table (acknowledged, viewed, bypassed, etc.).
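Here is a minimal sketch of that cross-reference design, again using Python's sqlite3 so it can be run as is. The table and column names (messages, user_messages, disposition, seen_on) and the helper functions are illustrative assumptions, not a prescribed schema. Note that INSERT OR IGNORE is SQLite syntax; the MySQL equivalent is INSERT IGNORE.

```python
import sqlite3


def make_messages_db():
    """In-memory database with a messages table and a user/message cross reference."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE messages (
            id INTEGER PRIMARY KEY,
            body TEXT NOT NULL,
            active INTEGER NOT NULL DEFAULT 1
        );
        CREATE TABLE user_messages (
            user_id INTEGER NOT NULL,
            message_id INTEGER NOT NULL REFERENCES messages(id),
            disposition TEXT NOT NULL DEFAULT 'acknowledged',
            seen_on TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            PRIMARY KEY (user_id, message_id)  -- one row per user per message
        );
    """)
    return conn


def pending_messages(conn, user_id):
    """Active messages that have no cross-reference row for this user."""
    return conn.execute(
        """SELECT m.id, m.body
           FROM messages m
           LEFT JOIN user_messages um
                  ON um.message_id = m.id AND um.user_id = ?
           WHERE m.active = 1 AND um.message_id IS NULL
           ORDER BY m.id""",
        (user_id,),
    ).fetchall()


def acknowledge(conn, user_id, message_id, disposition="acknowledged"):
    """Record that the user has dealt with the message; repeat calls are no-ops."""
    with conn:
        conn.execute(
            """INSERT OR IGNORE INTO user_messages (user_id, message_id, disposition)
               VALUES (?, ?, ?)""",
            (user_id, message_id, disposition),
        )
```

The composite primary key doubles as the index the lookup needs, so the table stays a narrow pair of integers plus whatever tracking columns you choose to add.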
Let me know if this isn't clear, and I can try to explain better or provide a diagram. I'm sure there are other patterns for doing this as well. Bank of America flashes these things after I log into my online accounts about once every month or so.
This was my approach:
CREATE TABLE IF NOT EXISTS `system_user_messages` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`section` enum('home','account','all') NOT NULL DEFAULT 'home',
`message` varchar(250) NOT NULL,
`message_type` varchar(25) NOT NULL,
`show` tinyint(4) NOT NULL DEFAULT '1',
`allow_dismiss` tinyint(4) NOT NULL DEFAULT '1',
`created_on` datetime NOT NULL,
`dismissed_on` datetime DEFAULT NULL,
`show_order` int(11) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `idx-user_id` (`user_id`),
KEY `idx-section` (`section`),
KEY `message_type` (`message_type`)
);
I added allow_dismiss because you may not want to allow the user to dismiss a given message. In my case, when a user's credit card (CC) is about to expire, we don't allow the message to be dismissed, and the system removes it once the user updates the CC information. On the other hand, you may also want to show a message only in certain areas of your site (hence the section column).
I posted the SQL because I think it is clearer this way. I know there are a lot of improvements to make to this schema, but maybe it gives you an idea.
You could have a users, a messages, and a seen/acknowledged messages table. When the user acknowledges the messages, we have a new entry in the seen table with a user id & message id pair.
That seems pretty reasonable.
However, the seen table will grow rapidly with the number of users and messages. At some point, this would become unwieldy (any insight when that would be for a single mysql db on a single server?).
Based on the number of users you're talking about, I wouldn't bet on that being a concern unless you send a lot of messages often. If it makes sense to retire messages, you could have a disabled flag on messages and a background job to remove rows from the seen table (warehouse them!) that correspond to disabled messages.
Performance-wise, a lot of it depends on your server's specs and what else it is doing. You can get a lot of performance out of a well-indexed, simple table even with lots of rows (hundreds of thousands or more) and some configuration tweaks.
John Kugelman's solution (1 entry per user per message) makes more sense if you want to control who sees messages on a person/group level.
The techniques aren't mutually exclusive either.
Update:
Agree with no dynamic table creation. :)
How important are the messages? If they are truly just a view-once "hey, here's what's new", does it matter whether you know that everyone has seen them? (But maybe you want to measure effectiveness?) Why not just display them for a week or a month, or whatever timeframe your average user logs in within? Any key site updates can be put on a separate change log page for those who are really interested. If there are special offers, then they're time limited anyway, and why wouldn't users want to be reminded of them each time they log in? Or you could have part of the home page (or a link) dedicated to what's new on the site, record in a cookie whether the user has clicked it since it was last updated, and if not, highlight it for that user. Sorry for the rambling response.
Related
Currently I am having a hard time deciding/weighing the pros/cons of tracking login information for a member website.
Currently
I have two tables, login_i and login_d.
login_i contains the member's id, password, last login datetime, and total count of logins. (member id is primary key and obviously unique so one row per member)
login_d contains the full login history, tracking each and every time a login occurs. It contains the member's id, the datetime of the login, and the ip_address of the login. This table's primary key is simply an auto-incremented INT field, which serves no real purpose, but the table needs a primary key and this is the only unique single field (an index, on the other hand, is a different matter, but that's not a concern here).
In many ways I see these tables as being very similar but the benefit of having the latter is to view exactly when a member logged in, how many times, and which IP it came from. All of the information in login_i (last login and count) truthfully exists in login_d but in a more concise form without ever needing to calculate a COUNT(*) on the latter table.
Does anybody have advice on which method is preferred? Two tables will exist regardless but should I keep record of last_login and count in login_i at all if login_d exists?
added thought/question
A good comment was made below: what about also tracking login attempts based on username/email/IP? Should this ALSO be stored in a table (a third table, I assume)?
This is called denormalization.
Ideally, you would never denormalize.
It is sometimes done anyway to avoid recomputing expensive results, possibly like your total login count.
The downside is that you may at some point end up in a situation where the value in one table does not match the values in the other table(s). Of course you will try your best to keep them properly up to date, but sometimes things happen. In that case, you may introduce bugs in application logic that receives an incorrect value from one of the sources.
In this specific case, a count of logins is probably not that critical to the successful running of the app, so it's not a big risk, although you will still have the overhead of maintaining the value.
Do you often need the last login and count? If yes, then you should store them in login_i as well. If they're rarely used, then you can take the time to run the query against the giant table of all logins instead of storing duplicated data.
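For reference, deriving both values from the history table is a single aggregate query. The sketch below uses Python's sqlite3 with an assumed layout for login_d (the column names member_id, login_at and ip_address and the index name are guesses based on the description in the question); with an index on (member_id, login_at) this stays cheap for a single member even when the table is large.

```python
import sqlite3


def make_login_db():
    """In-memory stand-in for the login_d history table described in the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE login_d (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               member_id INTEGER NOT NULL,
               login_at TEXT NOT NULL,      -- 'YYYY-MM-DD HH:MM:SS'
               ip_address TEXT NOT NULL
           )"""
    )
    # Lets the per-member aggregate below avoid a full table scan.
    conn.execute("CREATE INDEX idx_login_d_member ON login_d (member_id, login_at)")
    return conn


def login_summary(conn, member_id):
    """Return (last_login, login_count) computed from the history alone."""
    return conn.execute(
        "SELECT MAX(login_at), COUNT(*) FROM login_d WHERE member_id = ?",
        (member_id,),
    ).fetchone()
```

If this query turns out to be fast enough for your pages, the duplicated columns in login_i (and the risk of them drifting out of sync) can be dropped.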
I'm designing a web site for a friend and I'm not sure what the best way to go is with regard to one of my database tables.
To give you an idea, this is roughly what I have:
Table: member_profile
`UserID`
`PlanID`
`Company`
`FirstName`
`LastName`
`DOB`
`Phone`
`AddressID`
`website`
`AllowNonUserComments`
`AllowNonUserBlogComments`
`RequireCaptchaForNonUserComments`
`DisplayMyLocation`
the last four
AllowNonUserComments
AllowNonUserBlogComments
RequireCaptchaForNonUserComments
DisplayMyLocation
(and possibly more such boolean fields to be added in the future) will control certain website functionality based on user preference.
Basically I'm not sure if I should move those fields to a
new table : member_profile_settings
`UserID`
`AllowNonUserComments`
`AllowNonUserBlogComments`
`RequireCaptchaForNonUserComments`
`DisplayMyLocation`
or if I should just leave them as part of the member_profile table, since every member is going to have their own settings.
The target is roughly 100,000 members in the long run and 10k to 20k in the short run. My main concern is database performance.
And while I'm at it, question #2: would it make sense to move the member's contact information (street address, city, state, zip, phone, etc.) into the member_profile table, instead of having an address table and an AddressID like I currently have?
Thank you
I would say "no" and "yes, but" as the answers to 1) and 2). For #1, your queries are going to be a lot easier to manage if you create columns for each preference. The best systems I've worked with were done that way. Moving the preferences into a separate table with "user, preference, value" triples leads to complex queries that join multiple tables just to check a setting.
For #2: there's no reason to put the address in another table, because the single "AddressID" column means there's just one address per member, anyway, and again, it's just going to complicate the queries. If you turn it around backwards and have an address table that embeds userids then that might make sense; it makes even more sense to do phone numbers that way, since people often have multiple phone numbers.
If each member in the database has exactly ONE value for each of the attributes you have listed, then your database is already normalized and thus in a quite convenient form. So, to answer #1, moving these fields to a different table would improve nothing and just make querying more difficult.
As for #2, if you wanted to contemplate the possibility of a member having multiple addresses or phone numbers, you should definitely put those in different tables, allowing many-to-one relationships. This might also make sense if you expect that a number of users will share the same address; this way, you will not be duplicating information by having to store all the same address information for multiple users, you would just reference an addresses table that would have the relevant information one time per address.
However, if you need neither multiple addresses per member nor multiple members per address, then putting the addresses information in another table is just unnecessary complexity. Which solution is more convenient depends on the needs of your specific application.
Since each member has exactly one value in this table, it's already normalized. However, for query efficiency, denormalization is sometimes worth considering.
Apart from the ID field, the others could be separated into two groups: a profile group and a settings group. If your website usually uses these two groups of data separately, you should consider having a new table for each usage.
For example, if the profile fields only show up on the profile page and the settings fields are used across the whole site, it's not necessary to look up the profile fields all the time.
I'm about to implement a list of topics in my forum, and I'd like to add a "read/not read yet" flag for each message, per user of my website.
I'm thinking of something like this: a table watched_topics with id (INT), user (VARCHAR) and topic_id (INT). When a user views the page, I'll insert this information (if the row doesn't already exist).
When another user posts a new message in a topic, I'll delete all rows with that topic_id from watched_topics.
That could cause trouble: think of 9000 topics and 9000 users who have watched all topics; the table would be huge (9000 x 9000 = 81,000,000 rows).
So I think this is not the best strategy for implementing this kind of thing! Any suggestions would be appreciated :)
Cheers
May I suggest a different approach?
Make use of web browser history mechanism.
Every topic can get a new, unique URL every time a new message is added there. It could include the number of messages, last modified time or a combination of both.
If the user did see the topic, he must have visited it, so a properly set up CSS can help identifying the read ones. You can even use some client-side scripts to modify the behaviour of the page based on that.
Another way to do that would be to keep the watched topics table the way you want to do it, but also store last visit time in user's profile and show all topics as read that haven't changed since that time.
However it's pretty safe to assume that all users reading all topics is very unlikely.
Your suggestion sounds good. I would also make the user field a foreign key (a user_id); it gives you a bit more flexibility.
Are you sure all 9000 topics will be read by all 9000 users? I mean, is this realistic? As you said, a topic's entries are deleted when a new message is added, so whenever that happens, another 9000 entries are deleted :)
I would index the table and go with your suggestion (with the user_id change). If the table size gets in your way, you can always change the implementation later. Most likely it will never be an issue anyway.
For the deletion: you could store the ID of the latest message the user saw. That way you don't have to perform a lot of deletes every time a message is posted in a much-viewed topic.
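A minimal sketch of that "latest message ID seen" idea, using Python's sqlite3 so it runs without a server. The table names (topics, topic_reads), the last_message_id column on topics and the helper functions are assumptions for illustration; the point is that posting a new message only updates one row in topics, and a topic counts as unread whenever the user's stored ID is behind it.

```python
import sqlite3


def make_forum_db():
    """In-memory tables: topics track their newest message, topic_reads tracks users."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE topics (
            id INTEGER PRIMARY KEY,
            last_message_id INTEGER NOT NULL
        );
        CREATE TABLE topic_reads (
            user_id INTEGER NOT NULL,
            topic_id INTEGER NOT NULL REFERENCES topics(id),
            last_seen_message_id INTEGER NOT NULL,
            PRIMARY KEY (user_id, topic_id)
        );
    """)
    return conn


def mark_read(conn, user_id, topic_id):
    """Remember that the user has seen the topic up to its current newest message."""
    with conn:
        conn.execute(
            """INSERT OR REPLACE INTO topic_reads
                   (user_id, topic_id, last_seen_message_id)
               SELECT ?, id, last_message_id FROM topics WHERE id = ?""",
            (user_id, topic_id),
        )


def unread_topic_ids(conn, user_id):
    """Topics the user never opened, or that gained messages since the last visit."""
    rows = conn.execute(
        """SELECT t.id
           FROM topics t
           LEFT JOIN topic_reads r
                  ON r.topic_id = t.id AND r.user_id = ?
           WHERE r.last_seen_message_id IS NULL
              OR r.last_seen_message_id < t.last_message_id
           ORDER BY t.id""",
        (user_id,),
    ).fetchall()
    return [topic_id for (topic_id,) in rows]
```

Rows are only ever written for topics a user actually opens, so the table grows with real reading activity rather than with users times topics.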
I have been scratching my head for hours now trying to solve the following situation:
Several HTML forms on a web page are identified by an id. Users can create forms on the client side themselves and fill in data. How can I guarantee that the id of the form the user generates is unique, and that no collision occurs when saving because the same id was generated by someone else's client?
The problems/questions:
A random function on the client side could return identical ids on two clients.
Looking up a free id in the SQL table wouldn't solve the problem.
Auto-incrementing a new id would complicate the whole process, because the DOM id and the SQL id would differ, which brings us to the next point:
A "left join" to combine dom_id and user_id to identify the forms in the database looks like a performance killer, because I expect these tables to be huge.
The question (put as simply as I can):
Is there a way that the client can create/fetch a unique id which will later be used as the primary key for a database entry without any collisions? What's the best practice?
My current solution (bad):
No unique ids at all to identify the forms. Always a combination through a left join to identify the forms generated by the specific user. But what happens if the user says: delete my account (and my user_id) but leave the data on the server? I would lose the user id and this query wouldn't work anymore...
I am really sorry that I couldn't explain it any other way. But I hope someone understands what I am facing and can give me at least a hint.
THANK YOU VERY MUCH!
GUIDs (Globally Unique IDentifiers) might help. See http://en.wikipedia.org/wiki/GUID
For each form the client could generate a new GUID. Theoretically it should be unique.
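To illustrate, here is a small sketch in Python. In the browser the ID would be generated by client-side code instead; new_form_id stands in for that step. The forms table, its columns and the helper names are assumptions. The server still validates the ID and lets the primary key reject the (extremely unlikely) duplicate rather than trusting the client blindly.

```python
import sqlite3
import uuid


def new_form_id():
    """Generate a random (version 4) UUID as a 36-character string."""
    return str(uuid.uuid4())


def make_forms_db():
    """In-memory table keyed directly by the client-supplied UUID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE forms (id TEXT PRIMARY KEY, payload TEXT NOT NULL)")
    return conn


def save_form(conn, form_id, payload):
    """Insert the form; returns False if that ID already exists.

    Raises ValueError if form_id is not a well-formed UUID.
    """
    canonical = str(uuid.UUID(form_id))  # validates and normalizes the text form
    try:
        with conn:
            conn.execute(
                "INSERT INTO forms (id, payload) VALUES (?, ?)",
                (canonical, payload),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Because the same value serves as both the DOM id and the primary key, no join is needed to map one to the other.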
I just don't show IDs to the user until they've submitted something, at which point they get to see the generated auto-increment id. It keeps things simple. If you however really need it, you could use a sequence table, but it has some caveats which make me advise against it:
CREATE TABLE sequence (id integer default 0, sequencename varchar(32));
Incrementing:
UPDATE sequence
SET id = @generated := id + 1
WHERE sequencename = 'yoursequencename';
Getting:
SELECT @generated;
I'm currently working on a game, and a while ago I got started on loading and saving.
I've been thinking it over, but I really can't decide, since I'm not sure which would be more efficient.
My first option:
When a user registers, only one record is inserted (into the 'characters' table). When the user logs in successfully, the server tries to load all of the user's information (which is spread across multiple tables and combined via MySQL 'LEFT JOIN'). It runs through all the information it has and applies it to the entity instance; if it runs into a NULL (which means the information isn't in the database yet), it automatically uses a default value.
At saving, it'll insert or update, so that any defaults that have been generated at loading will be saved now.
My second option:
Simply insert all the required rows at registration (the rows are inserted from the website when registration is finished).
Downsides to first option: useless checks once the user has logged in at least once, since all the rows will have been created after the first login.
Upsides to first option: if any records are deleted from the tables, it would insert default data instead of kicking the player off, saying their character information is damaged/lost.
Downsides to second option: it could waste a bit of memory, since all rows are inserted at registration, and there could be spam bots and people who never even manage to get online.
Upsides to second option: we don't have to check for anything in the server.
I also noted that the first option may break any search systems (via the admin CP, if we try looking up specific users).
I would go with the second option: add default rows for your user account, and flag the main user table record as incomplete. This will maintain data integrity across your database, since every user record is complete in its entirety. If you need to remove the record, you can simply add a cascading delete to clean house.
Also, I wouldn't design your data schema around malicious bots creating accounts. If you are concerned about the integrity of your user accounts, add some sort of data validation to your solution, or an automated clean-up script to clear out incomplete accounts once they meet certain criteria, e.g. the creation date passing a certain threshold.
You mention that there's multiple tables of data for each user, with some that can have a default value if none exist in the table. I'm guessing this is set up something like a main "characters" table, with username, password, and email, and a separate table for something like "favorite shortcuts around the site", and if they haven't specified personal preferences, it defaults to a basic list of "profile, games list, games by category" etc.
Then the question becomes when registering, should an explicit copy of the favorite shortcuts default be added for that user, or have the null value default to a default list?
I'd suggest that it depends on the nature of the auxiliary data tables, specifically the default values for those tables. How often would the defaults change? If the default changes often, a setup like your first option means users with only a 'basic' entry would automatically get the new auxiliary data, while those who specified their own entries would keep their preferences. With your second option, if the default changed, a search/replace would have to be done to change entries that held the old default to the new default in order to keep users updated.
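To make the first option concrete, here is a small sketch of loading with defaults applied at query time, using Python's sqlite3. The character_stats table, its health and gold columns and the default values are invented for illustration; only the 'characters' table and the LEFT JOIN come from the question. COALESCE behaves the same way in MySQL.

```python
import sqlite3

# Hypothetical defaults; changing them affects every character without a stats row.
DEFAULT_HEALTH = 100
DEFAULT_GOLD = 0


def make_game_db():
    """In-memory database with a main table and one optional auxiliary table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE characters (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE character_stats (
            character_id INTEGER PRIMARY KEY REFERENCES characters(id),
            health INTEGER,
            gold INTEGER
        );
    """)
    return conn


def load_character(conn, character_id,
                   default_health=DEFAULT_HEALTH, default_gold=DEFAULT_GOLD):
    """Return (name, health, gold), substituting defaults for missing values.

    Returns None if the character itself does not exist.
    """
    return conn.execute(
        """SELECT c.name,
                  COALESCE(s.health, ?),
                  COALESCE(s.gold, ?)
           FROM characters c
           LEFT JOIN character_stats s ON s.character_id = c.id
           WHERE c.id = ?""",
        (default_health, default_gold, character_id),
    ).fetchone()
```

Characters with no stats row pick up whatever the current defaults are, which is the behaviour described above for frequently changing defaults; with the second option the same values would be copied into a row at registration.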
The other suggestion is to take another look at your database structure. You don't mention that your current table layout is set in stone; is there a way to not have all the LEFT JOIN tables, and have just one 'characters' table?