Should inbox messages be stored in a database, or are there more efficient alternatives? - mysql

I am working on a chat website where users can create rooms, invite others and chat together. I have a lot of the core infrastructure for the website in place, including most of the server and half of the website itself, but I have come across a decision I have to make.
Each user can receive messages in their inbox: for example, an invite request to join another user's room, or a more general alert relating to their account.
Question: should I store these inbox messages in a database, and what are the alternatives?
Inbox messages typically last for a couple of days, so they are quite ephemeral pieces of data. If I were to store them in a database, this is a rough idea of how the entity would look:
| accountId | message | type |
|-----------|---------------------------------------------------------|-----------------|
| 59 | user3 requested you to join 'hangouts' | invite_request |
| 24 | dialto accepted your request to join 'study group' | invite_response |
| 67 | please confirm your email address to claim your account | account_alert |
On the website, I would create an interface where users can view their inbox messages and discard them if they want. If a user discards an inbox message, it is deleted from the database.
Is this the best solution for this problem in terms of efficiency? Are there alternatives?
I don't know if this will help but here is my tech stack for this application:
Database: MySQL
Backend: NodeJS | GraphQL
Frontend: React | GraphQL
Thanks a bunch.
[ I might take around 6-8 hours to respond as I am about to leave for school (7:48 AM), sorry :) ]

For persisting structured data, there is no good alternative to a database.
There are many different databases optimised for different purposes; no one size fits all.
As a rule of thumb, when you start a project, keep it simple and avoid complexity at all costs.
When you get some real scale, you can start optimizing: looking at access patterns, horizontal scaling/partitioning (distributed systems), in-memory stores, etc.
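To make that concrete for the inbox case, here is a minimal sketch of the simple first version in MySQL, assuming the columns from the question plus an id and a created_at timestamp so stale messages can be swept out (all names here are illustrative, not a prescribed schema):

CREATE TABLE inbox_message (
    id         BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    account_id INT UNSIGNED NOT NULL,
    message    VARCHAR(255) NOT NULL,
    type       ENUM('invite_request', 'invite_response', 'account_alert') NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    KEY idx_account (account_id)
);

-- Discarding a message from the inbox UI is a plain delete:
DELETE FROM inbox_message WHERE id = ? AND account_id = ?;

-- Since messages only live for a couple of days, a scheduled job (cron or a MySQL EVENT)
-- can periodically remove anything older than, say, 7 days:
DELETE FROM inbox_message WHERE created_at < NOW() - INTERVAL 7 DAY;

At the scale of a new chat site a plain table like this will handle inbox traffic comfortably; in-memory stores or other databases are optimizations to revisit only if the access patterns later demand it.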

Related

Create a new record for every user in a database based on a column

I have a MySQL database with a user table. After a new requirement I had to create a new table called social_media, and from now on every new user will be created with a social_media_id column that holds a reference to their social media.
+====================+    +==============+
| user               |    | social_media |
+====================+    +==============+
| PK id              |    | PK id        |
| FK social_media_id |    | instagram    |
| first_name         |    | facebook     |
| last_name          |    | twitter      |
| email              |    +==============+
+====================+
I want to update my database so that every user that didn't have a social media reference before gets one (even if the values inside are null), so they can update them if they wish. Is there something I can do to make a new social_media record for every user that doesn't have one, and add the correct social_media_id foreign key for that user?
Ok #Jorche, this is too long to be a comment, but I do want to help.
First off, this is probably what your data structure should look like:
Second, telling you how to enter these records is very difficult for me at this moment because I have absolutely ZERO requirements or any other business logic that would help me pinpoint the best approach. Odds are, you would have to work hand in hand with application developers or ETL developers (that might even be you) to figure out what that approach is. Maybe it's a stored procedure that gets called, maybe it's a trigger, hard to say for sure without additional context, ya know?
All we know at this point is that users exist and sometimes they have relational data related to social media entities. Your job is literally to understand that process flow and make the appropriate decisions on how to log that data in a way that makes sense from both an operational perspective and a database design perspective.
Hate to say it hombre, but the questions you have now are all entirely dependent on details you haven't provided.
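Purely to illustrate the mechanics against the structure exactly as described in the question (not a recommendation, since the requirements above are still unknown), a one-off backfill could look something like this sketch; tmp_user_id is a temporary helper column introduced only for the migration:

-- Add a temporary helper column so each new social_media row remembers which user it was created for.
ALTER TABLE social_media ADD COLUMN tmp_user_id INT NULL;

-- Insert an empty social_media row for every user that doesn't have one yet.
INSERT INTO social_media (tmp_user_id, instagram, facebook, twitter)
SELECT u.id, NULL, NULL, NULL
FROM user u
WHERE u.social_media_id IS NULL;

-- Point each of those users at their newly created row.
UPDATE user u
JOIN social_media s ON s.tmp_user_id = u.id
SET u.social_media_id = s.id
WHERE u.social_media_id IS NULL;

-- Drop the helper column once the foreign keys are in place.
ALTER TABLE social_media DROP COLUMN tmp_user_id;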

How to filter DB results with permissions stored in another DB

I'm currently searching for a good approach to filter DB results based on permissions which are stored in another service's DB.
Let me first show the current state:
There's one Document-Service with 2 tables (permission, document) in its MySQL DB. When documents for a user are requested, a paginated result should be returned. For brevity let's ignore the pagination for now.
Permission table:
user_id | document_id
--------|------------
   1    | A
   2    | A
   2    | B
   2    | C

Document table:
document_id | more columns
------------|-------------
     A      | ...
     B      | ...
     C      | ...
The following request "GET /documents/{userId}" will result in the following query against the DB:
SELECT d.* FROM document d JOIN permission p ON p.document_id = d.document_id WHERE p.user_id = '{userId}';
That's the current implementation, and now I am asked to move the permission table into its own service. I know, one would say that's not a good idea, but this question is just a broken-down example and in the real scenario it's a more meaningful change than it looks. So let's take it as a "must-do".
Now my problem: after I move the table into another DB, I cannot use it in the SQL query of the Document-Service anymore to filter results.
I also cannot query everything and filter in code, because there will be too much data AND I must use pagination, which is currently implemented by LIMIT/OFFSET in the query (ignored in this example for brevity).
I am not allowed to access a DB from any other application except its service.
My question is: Is there any best practice or suggested approach for this kind of situation?
I already had 2 ideas which I would like to list here, even though I'm not really happy with either of them:
Query all document_ids of a user from the new Permission-Service and change the SQL to "SELECT * FROM document WHERE document_id IN {doc_id_array_from_permission_service}". The array could get pretty big and the statement slow; not happy about that.
Replicate the permission table into the Document-Service DB on startup and keep the query as it is. But then I need to implement logic/an endpoint to update the table in the Document-Service whenever it changes in the Permission-Service, otherwise it gets out of sync. This feels like I'm duplicating so much logic in both services.
For the sake of this answer, I'm going to assume that it is logical for Permissions to exist completely independently of Documents. That is to say - if the ONLY place a Permission is relevant is with respect to a DocumentID, it probably does not make sense to split them up.
That being the case, either of the two options you laid out could work okay; both have their caveats.
Option 1: Request Documents with ID Array
This could work, and in your simplified example you could handle pagination prior to making the request to the Documents service. But, this requires a coordinating service (or an API gateway) that understands the logic of the intended actions here. It's doable, but it's not terribly portable and might be tough to make performant. It also leaves you the challenge of now maintaining a full, current list of DocumentIDs in your Permissions service which feels upside-down. Not to mention the fact that if you have Permissions related to other entities, those all have to get propagated as well. Suddenly your Permissions service is dabbling in lots of areas not specifically related to permissions.
Option 2: Eventual Consistency
This is the approach I would take. Are you using a Messaging Plane in your Microservices architecture? If so, this is where it shines! If not, you should look into it.
So, the way this would work is any time you make a change to Permissions, your Permissions Service generates a permissionUpdatedForDocument event containing the relevant new/changed Permissions info. Your Documents service (and any other service that cares about permissions) subscribes to these events and stores its own local copy of relevant information. This lets you keep your join, pagination, and well-bounded functionality within the Documents service.
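As a rough sketch of what that local copy could look like on the Documents side (table and column names here are assumptions, and the event consumption itself lives in application code): each consumed permissionUpdatedForDocument event is applied as an upsert, and the existing join keeps working locally.

-- Assumes local_permission has a unique key on (user_id, document_id) so the upsert works.
INSERT INTO local_permission (user_id, document_id, permission_type)
VALUES (?, ?, ?)
ON DUPLICATE KEY UPDATE permission_type = VALUES(permission_type);

-- The original filtered, paginated query then stays entirely inside the Documents service:
SELECT d.*
FROM document d
JOIN local_permission p ON p.document_id = d.document_id
WHERE p.user_id = ?
ORDER BY d.document_id
LIMIT 20 OFFSET 0;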
There are still some challenges. I'd try to keep your Permissions service away from holding a list of all the DocumentID values. That may or may not be possible. Are your permissions Role or Group-based? Or, are they document-specific? What Permissions does the Documents service understand?
If permissions are indeed tied explicitly to individual documents, and especially if there are different levels of permission (instead of just binary yes/no access), then you'll have to rethink the structure in your Permissions service a bit. Something like this:
Permission table:
user_id | entity_type | entity_id | permission_type
--------|-------------|-----------|----------------
   1    | document    |     A     | rwcd
   2    | document    |     A     | r
   2    | document    |     B     | rw
   2    | document    |     C     | rw
   1    | other       |     R     | rw
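In MySQL terms that generalized table might be sketched roughly like this (types and lengths are guesses, not requirements):

CREATE TABLE permission (
    user_id         INT UNSIGNED NOT NULL,
    entity_type     VARCHAR(32) NOT NULL,   -- e.g. 'document', 'other'
    entity_id       VARCHAR(64) NOT NULL,
    permission_type VARCHAR(8)  NOT NULL,   -- e.g. 'r', 'rw', 'rwcd'
    PRIMARY KEY (user_id, entity_type, entity_id)
);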
Then, you'll need to publish serviceXPermissionUpdate events from any Service that understands permissions for its entities whenever those permissions change. Your Permissions service will subscribe to those and update its own data. When it does, it will generate its own event and your Documents service will see confirmation that its change has been processed and accepted.
This sounds like a lot of complication, but it's easy to implement, performant, and does a nice job of keeping each service pretty well contained. The Messaging plane is where they interact with each other, and only via well-defined contracts (message names, formats, etc.).
Good luck!

User behavior monitoring database

I've developed an iPhone app that allows users to send/receive files to/from a server.
At this point I wish to add a database to my server-side application (sockets in C#) in order to keep track of the following:
a personal card (name, age, email, etc.) - a user can (but isn't obligated to) fill one out
the number of files a user sent and received so far
app stats, which are sent every once in a while and contain info such as the number of errors in the app, his OS version, etc.
the number of total files sent/received in the past hour (for all users)
each user has a unique 36 digit hex number "AF41-FB11-.....-FFFF"
The DB should provide answers to the following: which users are "heavy users", how many files were exchanged in the past day/hour/month, and whether there is a correlation between OS and number of errors.
Unfortunately I'm not very familiar with DB design, so I thought about the following:
a users table which will contain:
uniqueUserID | Personal Card | Files Sent | Files Received
an App stats table (each user can have many records)
uniqueUserID | number_of_errors | OS_version | .... | submission_date_time
a general stats table (new record added every hour)
total_files_received_in_the_last_hour | total_files_sent_in_the_last_hour | submission_date_time
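Translated into rough MySQL DDL, the three tables as described might look like the sketch below (column types and lengths are assumptions; the 36-character hex ID from the question is used directly as the key here, though that choice is itself one of the questions below):

CREATE TABLE users (
    uniqueUserID   CHAR(36) PRIMARY KEY,            -- the existing 36-char hex identifier
    personal_card  TEXT NULL,                       -- optional name/age/email details
    files_sent     INT UNSIGNED NOT NULL DEFAULT 0,
    files_received INT UNSIGNED NOT NULL DEFAULT 0
) ENGINE=InnoDB;

CREATE TABLE app_stats (
    uniqueUserID         CHAR(36) NOT NULL,
    number_of_errors     INT UNSIGNED NOT NULL,
    os_version           VARCHAR(32) NOT NULL,
    submission_date_time DATETIME NOT NULL,
    KEY idx_user (uniqueUserID),
    FOREIGN KEY (uniqueUserID) REFERENCES users (uniqueUserID)
) ENGINE=InnoDB;

CREATE TABLE general_stats (
    total_files_received_in_the_last_hour INT UNSIGNED NOT NULL,
    total_files_sent_in_the_last_hour     INT UNSIGNED NOT NULL,
    submission_date_time                  DATETIME NOT NULL
) ENGINE=InnoDB;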
My questions are:
Performance-wise, does it make sense to collect and store data per user inside the server-side application and toss it all into the DB once an hour (e.g. open a connection, UPDATE/INSERT fields, close the connection)? Or should I simply record each transaction (send/receive file) every time a user performs it?
Should I create a different primary key, other than the 36-digit id?
Does this design make sense??
I'm using MySQL 5.1 with InnoDB; the DBMS is on the same machine as the server-side app.
Any insights will be helpful!

Database Schema allowing for multiple login opportunities (Facebook-Connect, Oauth, OpenID, etc.) for the same account

I want to accomplish nearly the same thing as this question, which is to store authentication data from multiple sources (Facebook, Twitter, LinkedIn, openID, my own site, etc.) so that one person can log in to their account from any/all of the mentioned providers.
The only caveat being that all user data needs to be stored in a single table.
Any suggestions?
If there is no clean way to accomplish this, is it better to have tons of columns for all of the possible login methods, or to create additional tables to handle the login methods with foreign keys relating back to the user table (as described in this answer)?
Perhaps you want to create a table dedicated to the account types, along with a table for the actual user.
Say you have a users table with an auto_increment unique ID for each user. Then you create another table, for example user_accounts, with its own auto_increment ID, another column for a relational ID (to the users table), and a 3rd (and/or 4th) column for account type / extra data for authentication if needed.
You then insert a record for each account type for each user. Basically it may look like this:
user_accounts
| ID | User_ID | account_type | authentication              |
|----|---------|--------------|-----------------------------|
| 1  | 1       | facebook     | iamthecoolestfacebookerever |
| 2  | 1       | google       | mygoogleaccount             |
That's it in its most simplistic form. You will probably be storing quite different data than that, but hopefully you get the point.
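As a minimal sketch of that layout in MySQL (names follow the example above; the authentication column is just a stand-in for whatever token or identifier each provider gives you):

CREATE TABLE users (
    id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY
    -- profile columns (name, email, ...) go here
);

CREATE TABLE user_accounts (
    id             INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
    user_id        INT UNSIGNED NOT NULL,
    account_type   VARCHAR(32)  NOT NULL,   -- 'facebook', 'google', 'openid', ...
    authentication VARCHAR(255) NOT NULL,   -- provider-specific identifier/token
    UNIQUE KEY uq_provider_identity (account_type, authentication),
    FOREIGN KEY (user_id) REFERENCES users (id)
);

Logging in then means looking up (account_type, authentication) in user_accounts and following user_id back to the single users row, so one person can have any number of providers attached.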

How can website hits statistics be helpful to improve usability?

Have you noticed that almost every link on Facebook has a ref query string?
I believe that, with that ref, Facebook somehow tracks and studies their users' behaviour. This could be their secret recipe for better usability.
So, I am trying out the same thing: changing http://a.com/b.aspx to http://a.com/b.aspx?ref=c and logging every hit into a table.
======================================================================
userid | page         | ref      | response_time | dtmTime
======================================================================
54321  | profile.aspx | birthday | 123           | 2009-12-23 11:05:00
12345  | compose.aspx | search   | 456           | 2009-12-23 11:05:02
54321  | payment.aspx | gift     | 234           | 2009-12-23 11:05:01
12345  | chat.aspx    | search   | 567           | 2009-12-23 11:05:03
...    | ...          | ...      | ...           | ...
I think it's a good start. I just don't know what to do with this information.
Is there any appropriate methodology to process this information?
Research has shown that fast responses improve not only the usability of a website but also conversion rates and site usage in general.
Tests at Amazon revealed that every 100 ms increase in load time of Amazon.com decreased sales by 1%
Experiments at Microsoft on Live Search showed that when search results pages were slowed by 1 second: a) Queries per user declined by 1.0%, and b) Ad clicks per user declined by 1.5%
People simply don't want to wait. Therefore, we track response time percentiles for our sites. Additionally, nice visualization of this data helps with measuring performance optimization efforts and monitoring server health.
Here is an example generated using Google Charts:
That looks bad! Response times of > 4000 ms certainly indicate performance problems that have a considerable impact on usability. At times the 800 ms percentile (which we consider a good indicator for our apps) was as low as 77%. We typically try to get the 800 ms percentile at 95%. So this looks like there's some serious work ahead ... but the image is nice, isn't it? ;)
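If you are logging hits into a table like the one in the question, that kind of tracking can start as a simple aggregate. The sketch below (the table name hits is an assumption) reports, per page, what share of requests completed within 800 ms:

SELECT page,
       COUNT(*) AS total_hits,
       ROUND(100 * SUM(response_time <= 800) / COUNT(*), 1) AS pct_within_800ms
FROM hits
GROUP BY page
ORDER BY pct_within_800ms ASC;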
Here's a second answer as the former was only about response time statistics.
The ref query string allows you to identify the sources, especially of people entering a conversion funnel. So you might make statements like "N $ of revenue comes from users clicking link X on page Y". Now you could try modifying link X to X1 and see if it increases revenue from this page. That would be your first step into A/B testing and multivariate analysis. Google Website Optimizer is a tool for exactly this purpose.
Well, Facebook uses them for user interface usage observation (I believe), so they can see where people click more (logo or profile link) and consider changing the UI accordingly to make interaction better.
You might also be able to use it to see common patterns in usage. For instance, if people follow a certain chain (profile -> birthday -> present -> send), you might consider adding a "send present" feature on a profile when it's that person's birthday. Just a thought.
To make the best use of your website statistics you need to think about what your users are trying to achieve and what you want them to achieve. These are your site's goals.
For an ecommerce site this is fairly easy. Typical goals might be:
Search for a product and find information about it.
Buy a product.
Contact someone for help.
You can then use your stats to see if people are completing the site's goals. To do this you need to tie a visitor's information together so you can see all the pages they have been to.
Once you can look at all the pages a user has visited, and the sequence they visited them in, you can see what they have been doing. You can look for drop-out points where they were about to buy something and then didn't. You can identify product searches that were unsuccessful. You can do all sorts of things. You can then try to fix these issues and watch the stats to see if it has helped.
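Against a hits table like the one in the question, a drop-out check can be a fairly small query. In this hedged sketch, hits is an assumed table name, payment.aspx is taken from the sample data as the step a user reached, and confirmation.aspx is a hypothetical stand-in for the step they never completed:

SELECT h.userid
FROM hits h
WHERE h.page = 'payment.aspx'
  AND NOT EXISTS (
        SELECT 1
        FROM hits later
        WHERE later.userid = h.userid
          AND later.page = 'confirmation.aspx'
          AND later.dtmTime > h.dtmTime
  )
GROUP BY h.userid;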
The stats you're collecting are a good start, but collecting good stats and collating them is complicated. I'd suggest using an existing stats package; I personally use Google Analytics, but there are others available.