Saving all successful logins to MySQL database... Is it wise?

So I'm looking for some clarification, or advice as it were. I've set up my login system to save information about a user every time they sign in: the IP address, the ID of the account they've signed into, the timestamp, etc.
Now, my question is... is this wise? The reason I've done it is for future reference, in case I ever need to look into someone's account. This way I'll be able to pull back the information for every single time their user account has been accessed, which will let me see if someone who shouldn't have been accessing their account has been doing so.
However, as the population of the database grows, the table of all logins is going to become massive, because a row is inserted every single time someone successfully signs into an account.
So, will this have any effect on my database further down the line, or is what I've done here perfectly fine?
Thank you for any advice you can give.

Unfortunately the answer is "it depends". The factors affecting the decision include, but aren't limited to, your server hardware and the number of active users.
In general I don't see a problem with saving this login data to a database table. If you don't need historical data for logins after a certain amount of time you can delete old login data to manage the table size.
I would try it and monitor the table size and system performance over time. If it grows too fast then reevaluate its importance versus strategies to optimize it and keep the functionality. It all depends on your specific situation.
I wouldn't skip capturing data I think is important because it might cause problems. I would test and evaluate it over time, then use real metrics to guide my decision.
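A minimal sketch of such a login-history table, with a pruning query for the old-data cleanup mentioned above (table and column names here are assumptions, not from the question):

CREATE TABLE login_history (
    id         BIGINT AUTO_INCREMENT PRIMARY KEY,
    user_id    INT NOT NULL,
    ip_address VARBINARY(16) NOT NULL,  -- fits both IPv4 and IPv6
    logged_at  DATETIME NOT NULL,
    INDEX idx_user_time (user_id, logged_at),
    INDEX idx_time (logged_at)
);

-- Periodic pruning if history is only needed for, say, a year;
-- idx_time keeps this from scanning the whole table.
DELETE FROM login_history
WHERE logged_at < NOW() - INTERVAL 1 YEAR;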

Related

Is it a good practice to store auth session in the database?

I created a login system that in addition to being used on a website, will also be used in mobile applications.
Since on mobile I want to keep the user logged in until they choose to log out, I didn't use PHP's session-based authentication.
So I thought it would be better to store the login sessions in the database and, on each user request, verify whether the authentication token is still valid.
But I don't know if this is good practice, since every time the user refreshes the page in the browser, or sends any request to the system, it will run one query to verify that the login is still active and then another query to fetch what the user requested.
My concern is whether this will become too slow for a system that could have between 900 million and 1.5 billion users, since the database will receive many verification queries on top of the normal queries users make.
Below is the current structure of my database. I would also appreciate tips if my structure is badly wrong.
Yes, it's a good practice to store session information in an application's main transactional database. A great many web applications work this way at large scale.
If you have the skills to do so, you might consider setting things up so session information is stored in a separate database that's not dependent on data in your transactional database. This separate database needs just one table:
login_token PK
key PK
value
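A minimal MySQL sketch of that table (the exact types and sizes are assumptions):

CREATE TABLE session (
    login_token CHAR(32)    NOT NULL,  -- random value from the session cookie
    `key`       VARCHAR(32) NOT NULL,
    `value`     VARCHAR(255),
    PRIMARY KEY (login_token, `key`)
);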
The login_token is the value of the login_token session cookie, a large, hard-to-guess random value your web app sends to each logged-in user's browser. For example, if my user id were 10054, the session table might contain these rows for me.
2EwZzPJdigVlrwtkFC5qoe97YE0EBddJ user_id 10054
2EwZzPJdigVlrwtkFC5qoe97YE0EBddJ user_name ojones
Why use this key/value design? It is easily ported to a high-performance key/value storage system like Redis. It's simple. And, to log me off and kill my session all you need is
DELETE FROM session WHERE login_token = '2EwZzPJdigVlrwtkFC5qoe97YE0EBddJ'
(You asked for feedback on your table design. Here is mine: use INT or BIGINT values for primary keys in tables you expect to become large. VARCHAR values are a poor choice for primary keys because index lookups and row insertions are substantially slower. CHAR(n) values are a slightly better choice, but still slower than integers. Note that the session table only covers presently logged-in users.)
And, I'll repeat my comment. Don't waste too much time today on designing your new system so it can run at the scale of Twitter or Facebook (~ 10**9 users). At this stage of your project, you cannot know where your performance bottlenecks will lie when you run at that scale. And it will take you a decade, at the very least, to get that many users. By then you'll have hundreds of developers working on your system. If you hire them wisely, most of them will be smarter than you.
How do I know these things? Experience, wasted time, and systems that did not scale up even when I designed them to do that.

Get all database actions (insert, update, delete, alter, ...) information

In SQL Server 2008, is there a way to get the user who inserted, updated, or deleted some rows, or dropped or altered some tables?
Can we also get the date this occurred?
And is there a way to know whether the data was inserted from the same machine or from another machine?
Edit: if this is really hard, then maybe a way to achieve it is to use triggers.
But is there a way to catch every action that happens on the DB so I can log them all?
Something like "on insert on any table".
I want everything to be done on the DB so that, no matter what business app I use, it will be logged.
Unless you already had something set up in advance - a CDC mechanism of some kind - it is going to be incredibly difficult to extract that information from the logs. It is possible given enough time, but it is a highly skilled forensic activity that is extremely time-consuming to perform (and relies on full logs being available). There are third-party log readers that can help with this, but it will still be a huge effort.
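For logging actions going forward, a per-table DML trigger is one way to do what the edit suggests. SQL Server has no single DML trigger that fires for every table, so a trigger like this must be created (or script-generated) per table; the Orders table and the audit columns here are assumptions, just a sketch:

CREATE TABLE AuditLog (
    AuditId    INT IDENTITY PRIMARY KEY,
    TableName  SYSNAME,
    Action     VARCHAR(10),
    LoginName  SYSNAME,
    HostName   NVARCHAR(128),
    OccurredAt DATETIME DEFAULT GETDATE()
);

CREATE TRIGGER trg_Orders_Audit
ON Orders
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    IF NOT EXISTS (SELECT * FROM inserted) AND NOT EXISTS (SELECT * FROM deleted)
        RETURN;  -- statement affected no rows
    INSERT INTO AuditLog (TableName, Action, LoginName, HostName)
    SELECT 'Orders',
           CASE WHEN EXISTS (SELECT * FROM inserted)
                 AND EXISTS (SELECT * FROM deleted) THEN 'UPDATE'
                WHEN EXISTS (SELECT * FROM inserted) THEN 'INSERT'
                ELSE 'DELETE' END,
           SUSER_SNAME(),  -- login that ran the statement
           HOST_NAME();    -- client machine name, for the "which machine" question
END;

DDL statements (ALTER, DROP) are not caught by DML triggers; those need a separate database-level DDL trigger.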

When is the duplication of database data okay?

When is it okay to have duplication of data in your database?
I'm working on this application that is supposed to track the number of user downloads. From my layman's point of view I can
simply have a column in the user table and increment the counter every time the user downloads something, or
have a counter table that has two columns, one for the user and one for the downloaded file.
As I see it, both options enable me to track how many downloads each user has. However, if this application sees the light of day and has tons of users, then querying the database to look through the whole counter table could be quite expensive.
I guess my question is which do you all recommend?
There's no data duplication in the second option, just more data.
If you're not interested in knowing which files are downloaded, I'd go for the first option (it takes the least space). If you are, go for the second.
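A rough sketch of the second option (names are assumptions): one row per download event, with counts derived on demand.

CREATE TABLE download (
    user_id       INT NOT NULL,
    file_id       INT NOT NULL,
    downloaded_at DATETIME NOT NULL,
    KEY idx_user (user_id)
);

-- Per-user count: with idx_user this reads only that user's rows,
-- not the whole table.
SELECT COUNT(*) AS downloads
FROM download
WHERE user_id = 42;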
At some point, though, you might also be interested to see the download trend over time :) have you considered logging downloads using Google Analytics? They're probably a lot better at this game than you :)

Design Opinions Needed: Template Databases/Tables for users

I need professional programmers/DBAs to bounce my idea off of and to know if it would/could even work. Please read below and give me any information that may break this theory. Thanks.
Overview of Website Idea:
The website will be used by sports card collectors to chat, answer questions on forums, showcase their cards/box breaks, trade/sell to/with other users, and keep a collection of their cards.
Design Issue:
A user can have an unlimited number of cards. This could make for some very large tables.
Design Question:
I do not want to limit the users on how many cards they can have in their collection on the site. If they have 5 copies of one card, and would rather have 5 records, one for each card, then that is their prerogative. This may also be necessary as each of the cards may be in a different condition. However, by allowing this to happen, this means that having only one table to store all records for all users is not even close to an option. I know sports card collectors with over 1,000,000 cards.
I was thinking that by either creating a table or a database for each user, it would allow for faster queries. All databases would be on the same server (I don't know who my host will be yet, only in design phase currently). There would be a main database with data that everyone would need (the base item while the user table/database would have a reference to the base item). I do see that it is possible for a field to be a foreign key from another database, so I know my idea in that aspect is possible, but overall I'm not sure what the best idea is.
I see most hosts say "unlimited number of databases" which is what got me to thinking about a database for each user. I could use this for that users posts on threads, their collection items, their preferences, and other information. Also, by having each user have a different table/database, if someone's table needed to be reindexed for whatever reason, it wouldn't affect the other users.
However, my biggest concern in either fashion would be additions/deletions to the structure of the tables/databases. I'm pretty sure a script could be written to make the necessary changes, but it seems like a pretty high risk. For instance, I'm pretty sure that I could write a script to add a field to a specific table in each database, or all of the like tables, but then to verify them it could prove difficult.
Any ideas you can throw out there for me would be greatly appreciated. I've been trying to work on this site for over a year now and keep getting stuck on the database design because of my worry of too-large tables, slow response time, and, if the number of users grows, breaking some constraints set by phpMyAdmin/MySQL. I also don't want to get halfway through the database building and then think that there's a better way to do it. I know there may be multiple ways to do it, but what is the most common practice for it? Thank you all very much.
I was thinking that by either creating a table or a database for each user, it would allow for faster queries.
That's false. A single database will be faster.
1,000,000 cards per user isn't really a very large number unless you have 1,000,000 users.
Multiple databases is an administration nightmare. A single database is always preferred.
my worry of too-large tables, slow response time, and, if the number of users grows, breaking some constraints set by phpMyAdmin/MySQL
You'll be hard-pressed to exceed MySQL limits.
Slow response is a function of your application and the details of your SQL queries more than anything else.
Finally. And Most Important.
All technology goes out of date. Eventually, you must replace something. In order to get to the point where you're forced to upgrade, you must first get something running.
Don't worry about "large database" until you have numbers of rows in the billions.
Don't worry about "long-term" solutions because all software technology expires. Quickly.
Regarding number of users.
Much of web interaction is time spent interacting with the browser through JavaScript. Or reading a page. Clicks are actually sort of rare. MySQL on a reasonably large server should handle 30 or more nearly concurrent queries with sub-second response. Your application will probably take very little time to format and start sending an HTML page. Things can rip along at a very, very good clip on a typical server.
If your database design avoids the dreaded full-table scan.
You must have proper indexes for the most common queries.
Now. What are the odds of 30 nearly concurrent requests? If a user only clicks once every 10 seconds (they have to read the page, fill in the form, re-read the page, think, drink their beer) then the odds of 30 clicks in a single second means you have to have 300 concurrent users. Considering that people have other things to do in their lives, that means you must have 50,000 or so users (figuring they're spending 1 hour each week on your site.)
I wouldn't go down the path of creating a database for every user... that will create countless headaches for you: data integrity issues, referential integrity issues, administrative issues...
As long as your table is well normalized and indexed, I don't think a table with hundreds of millions of rows is prohibitively large.
Instead, I would just start with a simple table design. If your site is wildly successful, it wouldn't be any extra effort to implement partitioning or sharding in MySQL down the road, as opposed to scaling out right off the bat.
If I were in your shoes, I would start with one database and one table and not worry too much about the possible size of the table. If you ever get so successful that you reach the size you imagine, you will probably have a lot more resources and knowledge of your domain to make a better-informed decision. Once that happens, you can also consider NoSQL solutions such as HBase, MongoDB, and others that allow for horizontal scaling (unlimited size), with some limitations that businesses dealing with big data are bound to face. You can also use MySQL partitions or other sharding solutions. So, go build your product with one table and don't sweat this problem until you absolutely need to. Good luck!
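As a rough sketch of that single-table starting point (column names are assumptions), with the partitioning mentioned above left as an optional later step:

CREATE TABLE user_card (
    card_id        BIGINT NOT NULL AUTO_INCREMENT,
    user_id        BIGINT NOT NULL,
    base_item_id   BIGINT NOT NULL,  -- reference to the shared base-item data
    card_condition VARCHAR(20),
    PRIMARY KEY (card_id, user_id),  -- partition key must appear in every unique key
    KEY idx_owner (user_id)
);

-- Optional later step: spread rows across partitions by owner so
-- per-user queries touch only one partition.
-- ALTER TABLE user_card PARTITION BY HASH (user_id) PARTITIONS 16;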

General question about database

I have kind of a silly question. I have a small community website. I'm thinking of making specific pages which can be viewed only by members who have permission. So I suppose I will add each member's ID to the database, and when a member tries to access the page I will first check if the member is logged in, and then check whether the user ID exists in the table of users who have permission to view that content. Now I'm just wondering: as the database grows, won't it take a long time to check everything before loading the page?
Premature optimization is the root of all evil (Donald Knuth)
You can easily handle several millions of users with a single database, so that won't be a problem until your community is huge. When you reach that step, you can switch to more scalable DB solutions like Cassandra.
That said, take Brad Christie's comment into account, and use reasonable identity management that won't thrash your database unnecessarily.
"a long time" is subjective and depends on many factors. For a small community website, you will likely not run into any issues with the method you've described. Still, it is considered best practice, and will speed up queries significantly, if you make use of proper indexes. Columns that will be queried against, such as the user ID, should be indexed. Not using an index means that MySQL has to read every record in your table and check to see if it matches your criteria.
This article may be of use to you:
http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm
Also, if you are concerned about how your site will perform when your dataset grows, consider populating it with a bunch of dummy data and running a few tests. This site will help you generate a bunch of data to put in your database.
http://www.generatedata.com/#about
Lastly, if pages are not specific to a particular person or small group of people, consider using more general buckets for access control. For example, if only admins can view a page, tie that page to an "admin" permission and note which users are admins. Then, you can do a quick check to see what type or types of user a particular person is, and decide whether to show them the page. This type of system is typically referred to as an Access Control List (ACL).
http://en.wikipedia.org/wiki/Access_control_list
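A sketch of that role-bucket idea (names are assumptions): instead of listing every user per page, tie pages to a role and users to roles.

CREATE TABLE user_role (
    user_id INT NOT NULL,
    role    VARCHAR(20) NOT NULL,
    PRIMARY KEY (user_id, role)
);

-- Can user 123 see an admin-only page?
SELECT 1
FROM user_role
WHERE user_id = 123 AND role = 'admin';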