Table structure of active members for 4 years in sql - mysql

I am working on an achievement system and the users can unlock a badge when they have been active on the site for 4 years.
I tried storing every time a user logged in, but it's not really a good idea.
So my question is: what should the table structure be if I want to know whether the user was active for 4 years?

It depends on what you mean by active, but if you mean that four years have elapsed since they first logged on, registered, etc., then you just need a date field to store the account setup / first logon date. Then when they log on you can test whether four years have passed since that date and insert a badge accordingly.

You can save the last consecutive log in and the first log in. That's just two columns. Every time a user logs in, you can check if he logged in the day before. If yes, then update that column. If not then update the first log in because that's when the 4 year period will begin again. Let me know if that makes sense or if you need examples to help with understanding the thought process.
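Here is a minimal sketch of that two-column idea in Python. The function names and the choice to treat "four years" as four calendar years from the streak start are my assumptions, not part of the answer above; the two values would live in two date columns on the user row.

```python
from datetime import date, timedelta

def record_login(streak_start, last_login, today):
    """Return the updated (streak_start, last_login) pair after a login on `today`."""
    if last_login is None or (today - last_login) > timedelta(days=1):
        # First login ever, or at least one day was skipped: the streak restarts today.
        streak_start = today
    return streak_start, today

def add_years(d, years):
    """Same calendar day `years` later (Feb 29 falls back to Feb 28 if needed)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, month=2, day=28)

def has_four_year_streak(streak_start, last_login):
    """True once the unbroken streak covers at least four calendar years."""
    return last_login >= add_years(streak_start, 4)
```

Call record_login on every login, store both returned values, and award the badge the first time has_four_year_streak returns True.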

That is too generic a question to provide a full solution. By active for 4 years it sounds like you will need to track their actions. So perhaps some sort of transaction or history table that links their userid with a datetime plus actions performed.
Then you just have to define what "active" means such as having performed specific actions at least once per week, etc.
EDIT
First off I am a SQL Server developer, but have attempted to convert this to MySQL syntax for you.
~ = primary keys to ensure uniqueness & good performance
User table
~ UserID unique user id - could be an Identity, GUID or similar field
UserName unique user name
anything else you want to track such as First Name, Last Name, etc
UserAction table
~ UserID link to User table
~ ActionType number indicating what the user performed
1 = login
add in other types in the future if you ever want to track anything else
~ ActionDate datetime of the action
anything else that you may want to track such as the duration or end date (in this case that would be them logging out)
Database systems are designed to hold lots of data so you shouldn't have to worry too much about that unless space is a factor. You can always delete any data that is older than 4 or 5 years if you like.
Normally I would just use a CTE (common table expression) but apparently they won't be available until MySQL 8.0 [https://www.mysql.com/why-mysql/presentations/mysql-80-common-table-expressions/]
So instead here is another method. It assumes that you want a login within each given year. i.e. at least one login from 2017, one from 2016, 2015 & 2014. If instead you wanted at least one login going back 1 year from today and so on then we will need to modify this query.
-- this will return the number of years with at least one login, going as far back as 4 years
SELECT COUNT(1) AS NumberOfYearsWithLogin
FROM (
    SELECT YEAR(CURDATE()) - YEAR(ActionDate) AS NumberOfYearsAgo
    FROM UserAction
    WHERE UserID = 123 -- put user of interest here
      AND ActionType = 1 -- login
      AND YEAR(CURDATE()) - YEAR(ActionDate) < 4 -- check the last 4 years
    GROUP BY YEAR(CURDATE()) - YEAR(ActionDate) -- group by year number with today's year as number 0
) A
;
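If you want to try the idea without a MySQL server, here is a rough sketch using Python's built-in sqlite3 module. It is not the MySQL query itself: SQLite has no YEAR(), so strftime('%Y', ...) stands in for it, the current year is passed as a parameter to keep the result reproducible, and the sample rows are made up. The badge would be awarded when the function returns 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UserAction (
        UserID     INTEGER NOT NULL,
        ActionType INTEGER NOT NULL,   -- 1 = login
        ActionDate TEXT    NOT NULL,   -- ISO-8601 datetime
        PRIMARY KEY (UserID, ActionType, ActionDate)
    )
""")
conn.executemany(
    "INSERT INTO UserAction VALUES (?, ?, ?)",
    [   # hypothetical sample data
        (123, 1, "2014-03-02 09:00:00"),
        (123, 1, "2015-07-19 18:30:00"),
        (123, 1, "2016-01-05 08:15:00"),
        (123, 1, "2017-11-23 21:45:00"),
        (456, 1, "2016-05-01 10:00:00"),
        (456, 1, "2017-05-01 10:00:00"),
    ],
)

def years_with_login(conn, user_id, current_year):
    """Count distinct calendar years (current year and the 3 before) with a login."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT CAST(strftime('%Y', ActionDate) AS INTEGER))
        FROM UserAction
        WHERE UserID = ?
          AND ActionType = 1
          AND ? - CAST(strftime('%Y', ActionDate) AS INTEGER) BETWEEN 0 AND 3
        """,
        (user_id, current_year),
    ).fetchone()
    return row[0]
```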

Related

Database design for hourly, weekly, ranking?

I've been trying to figure out a good way to handle a ranking system of this sort. As a rough example, I would like to query a facebook page and grab the likes and comments of each post. Then, there would be three rankings based on a time interval. To give a simplified example:
Hourly
- I pull all the posts updated within the last hour, and compare the # of likes/comments compared to my previous entry (the last pull being an hour prior).
Daily
- I pull down all posts within a 24 hours date range. I compare the # of likes/comments compared to the previous entry. "Post X had 12 more likes and 40 more comments today compared to yesterday"
Weekly
- I pull down all posts within a week's range and do the same as above. "Post X had no new likes, but 10 more comments added this week compared to last week"
In terms of the DB tables, what would be a good way to handle this? Would it make sense to have one giant table with the posts (title, comments_previous, comments_current, likes_previous, likes_current, etc)?
Thank you!
Columns: (PK)timestamp, (index)pageid, count. Set a new timestamp every hour on the hour for pages that are liked. Timestamp is the PK so that you don't get horrible fragmentation from your clustered index / page layout in the database.
If you feel for performance reasons that you need to de-normalize, you can make additional daily and monthly tables that are rolled-up summations. Likely, you'll be able to efficiently generate what you need without the rollup tables by using where clauses on the time / pageid combination, thereby giving you what you need with just one table.
Purge old data as you see fit, or keep it.
Clarification
When a comment receives a like, do the following:
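-- assumes a unique key on (`timestamp`, commentid) and that @commentid holds the liked comment's id
insert into likeRanking (`timestamp`, commentid, score)
values (concat(left(now(), 13), ':00:00'), @commentid, 1)
on duplicate key update score = score + 1;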
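To make the hourly-bucket idea concrete, here is a small sketch with Python's sqlite3 module rather than MySQL. The column name hour_start, the composite primary key, and the sample likes are my assumptions. Since ON DUPLICATE KEY UPDATE is MySQL-specific, the sketch gets the same effect with INSERT OR IGNORE followed by an UPDATE.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE likeRanking (
        hour_start TEXT    NOT NULL,            -- timestamp truncated to the hour
        commentid  INTEGER NOT NULL,
        score      INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (hour_start, commentid)
    )
""")

def record_like(conn, commentid, when):
    """Add one like to the bucket for the hour that contains `when`."""
    bucket = when.replace(minute=0, second=0, microsecond=0).strftime("%Y-%m-%d %H:%M:%S")
    # Create the bucket row if it does not exist yet, then increment it.
    conn.execute(
        "INSERT OR IGNORE INTO likeRanking (hour_start, commentid, score) VALUES (?, ?, 0)",
        (bucket, commentid),
    )
    conn.execute(
        "UPDATE likeRanking SET score = score + 1 WHERE hour_start = ? AND commentid = ?",
        (bucket, commentid),
    )

record_like(conn, 7, datetime(2017, 5, 1, 14, 5))
record_like(conn, 7, datetime(2017, 5, 1, 14, 59))
record_like(conn, 7, datetime(2017, 5, 1, 15, 1))

rows = conn.execute(
    "SELECT hour_start, score FROM likeRanking WHERE commentid = 7 ORDER BY hour_start"
).fetchall()
```

Daily and weekly figures can then be computed by summing score over a range of hour_start values, so no separate "previous" and "current" columns are needed.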
I would do this as follows:
Create a table that gets the time now, comments now, and likes now.
Then, an hour later, create another table that gets the time now, comments now and likes now, and subtract the previously created table's values from it. Then drop the older table and insert the new values of the new table. An hour after that, create another table.
Same with monthly and yearly.
Let me know if you need anything else.

better schema for a database

I have an application in which a user can lock a certain commodity, and once he locks it he has to pick it up within 2 hours. If he doesn't pick it up within 2 hours, the item is unlocked and the user loses 1 locking chance. The user has 3 locking chances initially, and if he loses all 3 within 2 months he is banned for 2 months. I have prepared a schema for this, but I feel it's not optimal. The schema looks like this:
user table
--locking_chances_left //initially 3
--first_chance_miss_date //date on which the user loses its first locking chance
--second_chance_miss_date //date on which the user loses its second locking chance
--banned // boolean field to indicate whether the user is banned
locked_items table
--item_no
--user_id
--locking_time
banned_users table
--user_id
--ban_date //date on which the user is banned i.e lost the last chance
I have an event scheduled to run every minute that checks whether any item in the locked_items table has been locked for more than 2 hours. If it finds one, it removes it from the table, which unlocks the item, and then decreases locking_chances_left by 1 in the users table. I also have to keep track of whether a user loses all his chances within a period of 2 months, in order to ban him. So I keep first_chance_miss_date to store the date when his chances decrease from 3 to 2, and second_chance_miss_date to store the date when his chances decrease from 2 to 1. An AFTER UPDATE trigger on the users table checks when the value of locking_chances_left changes and updates first_chance_miss_date and second_chance_miss_date accordingly. Is there a better way that uses just one field instead of these 2 miss-date fields?
thanks for bearing this
I'd probably do this with a "user_missed_date" table with user_id and missed_date as fields. You can then run:
select user_id, count(*) as misses from user_missed_date where missed_date > now() - interval 2 month group by user_id
Or use that as the basis for a subquery.
You would probably want two indexes: one on (user_id, missed_date) and one on (missed_date, user_id).
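A rough sketch of that table and the ban check, using Python's sqlite3 module in place of MySQL. The HAVING threshold of 3 comes from the "3 locking chances" in the question; the sample rows, the index name, and passing today's date as a parameter are my assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_missed_date (
        user_id     INTEGER NOT NULL,
        missed_date TEXT    NOT NULL   -- ISO-8601 date of the missed pickup
    )
""")
conn.execute("CREATE INDEX idx_user_missed ON user_missed_date (user_id, missed_date)")
conn.executemany(
    "INSERT INTO user_missed_date VALUES (?, ?)",
    [   # hypothetical sample data
        (1, "2017-05-01"), (1, "2017-05-20"), (1, "2017-06-10"),
        (2, "2017-01-15"), (2, "2017-06-01"),
    ],
)

def users_to_ban(conn, today):
    """Users with 3 or more misses in the two months before `today`."""
    rows = conn.execute(
        """
        SELECT user_id
        FROM user_missed_date
        WHERE missed_date > date(?, '-2 months')
        GROUP BY user_id
        HAVING COUNT(*) >= 3
        ORDER BY user_id
        """,
        (today,),
    ).fetchall()
    return [r[0] for r in rows]
```

With one row per miss, the first_chance_miss_date and second_chance_miss_date columns are no longer needed, and the chances left are simply 3 minus the recent miss count.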
I don't think this is a better solution, but I'll throw it out there:
You could have a table of lock_events, instead of locked_items. Every time an item gets locked, it goes in the event table. If an item gets picked up, you could either delete it, or you could add an additional event saying it was picked up. If you select items that are older than 2 hours, you get a list of expired locked items.
This way you have a history of all the events in the system. It's simple to calculate chances_left and also simple to see if the user burnt all his chances in a 2 month period. You end up doing more CPU cycles here, but you also get a nice record of all the transactions on your site!

Need help with a database design for Top 10

I am trying to come up with a database design to hold the "Top 10" results for some calculations that are being done. Basically, when all is said and done, there will be 3 "Top 10" categories, which I am fine with all being separate tables. However, I need to be able to go back later and pull historical data about what was in the Top 10 at certain times, hence the need for a database. A flat file would work, but this has the potential to hold years' worth of data.
Now, it's been awhile since I have done anything serious with a database, other than something that had a couple of simple tables, so I am having some issues thinking through this design. If someone could help me with the design of it, I know enough MySQL to get the rest done.
So, in essence, I need to store: A group of 10 names, a % of the total points each name had, the rank they held in the Top 10 and a time associated with that Top 10 (So I can later query for that time)
I would think I need a table for the Top 10 with 11 columns: one for the ID and 10 for foreign keys into the 'Names' table, which holds every name ever used with a PK, Name, %, and Rank. This seems clunky to me. Does anyone else have a suggestion?
edit:The 'Top 10' is associated with a specific set of data for 5-minute intervals, and each interval is completely independent from the previous or future intervals.
I don't recommend your solution, because then if you want to ask the database "How often has Joe been in the top 10," you have to write 10 queries of the form
SELECT Date FROM Top10 WHERE FirstPlace = 'joe'
SELECT Date FROM Top10 WHERE SecondPlace = 'joe'
...
Instead, how about a Rankings table, with fields:
id
Date
Person
Rank
Then if you want the Top 10 list for a certain date, the query is
SELECT * FROM Rankings WHERE Date = ...
and if you want to know someone's historical ranking, the query is
SELECT * FROM Rankings WHERE Person = ...
and if you want to know all the historical leaders, the query is
SELECT * FROM Rankings WHERE Rank = 1
The downside to this is that you might accidentally make two different people 8th place, and your database would allow the anomaly. But I have good news for you -- people might actually tie for 8th place, so you might actually want that to be possible!
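For what it's worth, here is a small sketch of the Rankings table in Python with sqlite3 (not MySQL), showing that "how often has Joe been in the top 10" becomes a single query. The sample rows are invented. One thing to watch: RANK is a reserved word in MySQL 8.0, so there the column would need backticks or a different name; the sketch quotes it for the same reason.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Rankings (
        id     INTEGER PRIMARY KEY,
        Date   TEXT    NOT NULL,   -- time of the snapshot
        Person TEXT    NOT NULL,
        "Rank" INTEGER NOT NULL
    )
""")
conn.executemany(
    'INSERT INTO Rankings (Date, Person, "Rank") VALUES (?, ?, ?)',
    [   # hypothetical snapshots, two places each for brevity
        ("2017-05-01 12:00", "joe", 1), ("2017-05-01 12:00", "ann", 2),
        ("2017-05-01 12:05", "ann", 1), ("2017-05-01 12:05", "joe", 2),
        ("2017-05-01 12:10", "ann", 1), ("2017-05-01 12:10", "bob", 2),
    ],
)

# One query instead of ten: how many snapshots had joe in the top 10?
joe_appearances = conn.execute(
    'SELECT COUNT(*) FROM Rankings WHERE Person = ? AND "Rank" <= 10', ("joe",)
).fetchone()[0]

# All historical leaders, oldest first.
leaders = conn.execute(
    'SELECT Date, Person FROM Rankings WHERE "Rank" = 1 ORDER BY Date'
).fetchall()
```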
I assume that your "Top 10" is a snapshot of the data at a certain time, and that your business logic takes one "every 5 minutes", so the time is the parent entity in the table design:
top_10_history
th_id - the primary key
th_time - the time point when taking the snapshot data of "Top 10"
top_10_detail
td_th_id - the FK to top_10_history
td_name_id - the FK to name
td_percentage - the "%"
td_rank - the rank
If the sequence of "Top 10" could be calculated from columns in "top_10_detail", you don't need a column to keep the sequence of it. Otherwise, you need a column to persist the sequence for it.
If you need a more complicated query such as "the top 10 at 12:00 AM over the last 30 days", using individual columns for "day", "hour", and "minute" would be a better idea for performance (with suitable indexes).

Problem with an agenda/availability query

I have a mysql table with users and their weekly calendar.
Every user can set his own availability for the week (morning, afternoon, night / MON thru SAT), and that is not going to change often, almost never.
Imagine those users are personal trainers in a gym, or tennis courts you can book...
My problem is finding the right query (or maybe even rethinking the way I'm storing that data in MySQL) so that an external web user can check their availability using 3 check buttons: [o]morning, [o]afternoon, and [o]night.
So I want my web user to go to my website and check/uncheck those buttons to see which ones (personal trainers, or whatever) are available.
So if I check Morning, I see only the people who are available in the morning, though not necessarily only in the morning (because a personal trainer can be available during the morning but also in the afternoon, etc.).
It may sound like an easy problem, but I'm having a hard time...
any help is appreciated
Thanks!
This isn't really an algorithm question, this is more of a DBA question. You'd most likely have a user table and an availability table.
user:
userid
...
availability:
userid
day
timeofday
When given a query such as Monday Wednesday Morning Afternoon (assuming the relationship is (Monday OR Wednesday) AND (Morning OR Afternoon)) you can do a query such as.
SELECT userid FROM availability WHERE (day='wednesday' OR day='monday') AND (timeofday='morning' OR timeofday='afternoon')
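As a sketch of how the checkbox selections could drive that query, here is a Python example using sqlite3 instead of MySQL. The function name and sample rows are made up, and IN lists are used as a shorthand for the parenthesised ORs. It assumes at least one day and one time of day are selected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE availability (
        userid    INTEGER NOT NULL,
        day       TEXT    NOT NULL,
        timeofday TEXT    NOT NULL,
        PRIMARY KEY (userid, day, timeofday)
    )
""")
conn.executemany(
    "INSERT INTO availability VALUES (?, ?, ?)",
    [   # hypothetical sample data
        (1, "monday", "morning"),
        (1, "wednesday", "night"),
        (2, "wednesday", "afternoon"),
        (3, "friday", "morning"),
    ],
)

def available_users(conn, days, times):
    """Users free on any of `days` AND at any of `times` (both non-empty lists)."""
    day_marks = ",".join("?" * len(days))
    time_marks = ",".join("?" * len(times))
    sql = (
        "SELECT DISTINCT userid FROM availability "
        f"WHERE day IN ({day_marks}) AND timeofday IN ({time_marks}) "
        "ORDER BY userid"
    )
    return [row[0] for row in conn.execute(sql, [*days, *times])]
```

Only the placeholders are built dynamically; the checkbox values themselves are always passed as bound parameters.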
The answer to this question will depend on your DB structure. If you are storing the available times as 1, 2, 3 or any combination such as 12, 13, 123, 23, then you can simply use a MySQL regular expression to limit your results based on the input checkbox criteria.
I would suggest something like:
SELECT trainer FROM trainer_table WHERE availability REGEXP '[Limiting Criteria]'
In the above code, replace trainer with the names of the fields you wish to return, replace trainer_table with the name of your table, and replace Limiting Criteria with your limiting text, be it 1 or 2 or 3 or any combination.
If you want more specific help, an example of your own table structure would be helpful.

How would you design this DB?

We are launching a website (paid subscription) and the sign up process includes entering an activation code. Activation codes are printed on scratch cards and sold via offline channels. Some of these cards are for 1 month access. Others are for 3 months and 1 year. Activation codes are unique 10-digit random numbers.
When the access expires, users can buy another activation card and extend the subscription by entering the new activation code. Additionally, we should be able to extend their subscription manually if they request it, for example until a certain date (e.g. 1 additional week).
Considering the above information, how would you design the DB for the user-activation_code relationship? Do you think this design is good?
tbl_user
----------------
id
name
status_id
tbl_user_status
----------------
id
description
tbl_activation_code
----------------
activation_code
activation_code_type_id
activation_code_status_id
user_id
activated_date
expiry_date
tbl_activation_code_type
----------------
id
description
tbl_activation_code_status
----------------
id
description
Update: Activation codes will be required only:
1) Upon initial sign up
2) Closer to the access expiry date (say, 7 days) when the system displays a notification with a link to page to enter the activation code
3) After expiry, when a user tries to login, she will be asked for the activation code
Therefore, a user is not expected to key in the activation code as and when wanted.
It's not bad. However, I would suggest that you add two fields to tbl_user:
tbl_user
----------------
id
name
status_id
activated_date
expiry_date
Of course, activated_date holds the date they were first activated, while expiry_date holds the date when they will expire. You also need a procedure to update this expiry_date, whenever they buy a new card. This procedure should handle two cards with overlapping dates, so the user doesn't double-up payment for a particular period. For example:
Card 1 - Sep 1 to Sep 30
Card 2 - Sep 16 to Oct 15
There are fifteen days of overlap there, so the user's activated_date should be Sep 1, while their expiry_date should be Oct 30 (Oct 15 + 15 days).
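The overlap rule can be sketched as a small Python function. This is only my reading of the example above: dates are treated as inclusive (a 30-day card activated on Sep 1 runs through Sep 30), and the function name, parameters, and the year 2023 are assumptions for illustration.

```python
from datetime import date, timedelta

def extend_expiry(current_expiry, activated_on, access_days):
    """Return the user's new expiry date after a card worth `access_days` is activated."""
    if current_expiry is not None and current_expiry >= activated_on:
        # Still subscribed: add the new days after the current expiry,
        # so the overlapping days are not paid for twice.
        return current_expiry + timedelta(days=access_days)
    # First activation or lapsed subscription: the period starts on the
    # activation day itself (inclusive).
    return activated_on + timedelta(days=access_days - 1)

# Card 1: 30 days activated on Sep 1.
expiry = extend_expiry(None, date(2023, 9, 1), 30)
# Card 2: 30 days activated on Sep 16, while card 1 is still running.
expiry_after_second = extend_expiry(expiry, date(2023, 9, 16), 30)
```

Here expiry is Sep 30 and expiry_after_second is Oct 30, which matches the dates in the example.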
Considering this, I would change tbl_activation_code, as expiry_date becomes a bit misleading. Instead, create a column called access_days that will be used to calculate the user's expiry_date.
Also, if you want to remember cards that were issued, even if not activated, then I would split tbl_activation_code into two tables:
tbl_activation_code
----------------
activation_code
activation_code_type_id
activation_code_status_id
access_days
tbl_activation
----------------
activation_code_id
user_id
activated_date
I would consider a bit of denormalisation - at the moment, to determine whether a user currently has access or not you have to look through potentially multiple records for that user in tbl_activation_code to see if there is an active record for that user.
So it might be worth adding a surrogate IDENTITY/autonumber field in tbl_activation_code, and adding a foreign key to that in tbl_user - this would point to the user's current activation code record, simplifying the scenarios where you need to find the current state of a user's access. This way, a user record will always reference directly their current activation code, plus you still have the full history of their previous codes.