I am planning to make a small game where everybody has a bank account. To see their management skills, I want to log their amount of money every hour or day and display it as a graph.
Now my question is: how can/should I log this with MySQL?
I think it's not very practical to do this:
id user currentMoney 2014.08.22-04 2014.08.22-03 2014.08.22-02 2014.08.22-01
(after currentMoney, these are columns for every hour) so that every hour one column gets created with the currentMoney. I think that's not the right way; there must be a better one. Ideally, after one month it would start from the beginning again and overwrite the old entries, but that's only optional.
My second question: is there a jQuery library that can create graphs out of the database? Or how else can I do this?
Thanks for helping, and sorry for my English.
Populating a database is done by adding rows, not columns.
Adding columns is a structural change, and should happen rarely. A change in the structure typically means a change in the application, which implies a new version of the application.
Add rows. Your log table must look like this:
balance_history
===============
* user_id
* balance_date
current_balance
Sample contents:
user_id | balance_date        | current_balance
--------+---------------------+----------------
      1 | 2014-08-22 04:00:00 |            1.00
    ... | ...                 |             ...
      1 | 2014-08-23 12:00:00 |           99.99
      2 | 2014-08-22 04:00:00 |            1.00
    ... | ...                 |             ...
      2 | 2014-08-23 12:00:00 |            1.23
To purge old data, all you need to do is DELETE FROM balance_history WHERE balance_date < [date_of_your_choice].
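If it helps, a rough sketch of that table plus an hourly snapshot job could look like this (the event name and the `users` table with its `id` and `money` columns are assumptions about your schema, and the event scheduler has to be enabled):
CREATE TABLE balance_history (
    user_id         INT NOT NULL,
    balance_date    DATETIME NOT NULL,
    current_balance DECIMAL(10,2) NOT NULL,
    PRIMARY KEY (user_id, balance_date)
);

-- Take a snapshot of every account once per hour
-- (assumes a `users` table with `id` and `money` columns).
CREATE EVENT log_balances
ON SCHEDULE EVERY 1 HOUR
DO
    INSERT INTO balance_history (user_id, balance_date, current_balance)
    SELECT id, NOW(), money FROM users;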
Related
I have a table which contains a task list of persons. The following are its columns:
+---------+-----------+-------------------+------------+---------------------+
| task_id | person_id | task_name | status | due_date_time |
+---------+-----------+-------------------+------------+---------------------+
| 1 | 111 | walk 20 min daily | INCOMPLETE | 2017-04-13 17:20:23 |
| 2 | 111 | brisk walk 30 min | COMPLETE | 2017-03-14 20:20:54 |
| 3 | 111 | take medication | COMPLETE | 2017-04-20 15:15:23 |
| 4 | 222 | sport | COMPLETE | 2017-03-18 14:45:10 |
+---------+-----------+-------------------+------------+---------------------+
I want to find the monthly compliance percentage (completed tasks / total tasks * 100) of each person, like:
+---------------+-----------+------------+------------+
| compliance_id | person_id | compliance | month |
+---------------+-----------+------------+------------+
| 1 | 111 | 100 | 2017-03-01 |
| 2 | 111 | 50 | 2017-04-01 |
| 3 | 222 | 100 | 2017-03-01 |
+---------------+-----------+------------+------------+
Here person_id 111 has 1 task in March 2017 whose status is COMPLETE; as 1 out of 1 tasks was completed in March, the compliance is 100%.
Currently, I am using a separate table which stores this compliance, but I have to recalculate the compliance and update that table every time a task's status changes.
I have also tried creating a view, but it takes too much time to execute: almost 0.5 seconds for 1 million records.
CREATE VIEW `person_compliance_view` AS
    SELECT
        `t`.`person_id`,
        CAST((`t`.`due_date_time` - INTERVAL (DAYOFMONTH(`t`.`due_date_time`) - 1) DAY)
            AS DATE) AS `month`,
        COUNT(`t`.`status`) AS `total_count`,
        COUNT((CASE
            WHEN (`t`.`status` = 'COMPLETE') THEN 1
        END)) AS `completed_count`,
        CAST(((COUNT((CASE
            WHEN (`t`.`status` = 'COMPLETE') THEN 1
        END)) / COUNT(`t`.`status`)) * 100)
            AS DECIMAL (10 , 2 )) AS `compliance`
    FROM
        `task` `t`
    WHERE
        (`t`.`isDeleted` = 0)
            AND (`t`.`due_date_time` < NOW())
    GROUP BY `t`.`person_id` , EXTRACT(YEAR_MONTH FROM `t`.`due_date_time`);
Is there any optimized way to do it?
The first question to consider is whether the view can be optimized to give the required performance. This may mean making some changes to the underlying tables and data structure. For example, you might want indexes and you should check query plans to see where they would be most effective.
Other possible changes which would improve efficiency include adding an extra column "year_month" to the base table, which you could populate via a trigger. Another possibility would be to move all the deleted tasks to an 'archive' table to give the view less data to search through.
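As a rough sketch of those ideas (the index column list is only a guess, and the stored generated column is an alternative to the trigger-maintained column that needs MySQL 5.7+):
-- Index covering the view's WHERE and GROUP BY columns (column order is a guess).
CREATE INDEX idx_task_person_due ON task (person_id, due_date_time, isDeleted, status);

-- Alternative to a trigger-maintained column: a stored generated column,
-- which can then be indexed together with person_id.
ALTER TABLE task
    ADD COLUMN year_month_num INT AS (EXTRACT(YEAR_MONTH FROM due_date_time)) STORED;
CREATE INDEX idx_task_person_ym ON task (person_id, year_month_num);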
Whatever you do, a view will always perform worse than a table (assuming the table has relevant indexes). So depending on your needs you may find you need to use a table. That doesn't mean you should junk your view entirely. For example, if a daily refresh of your table is sufficient, you could use your view to help:
truncate table compliance;
insert into compliance select * from compliance_view;
Truncate is more efficient than delete, but you can't use a rollback, so you might prefer to use delete and top-and-tail with START TRANSACTION; ... COMMIT;. I've never created scheduled jobs in MySQL, but if you need help, this looks like a good starting point: here
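For what it's worth, the transactional delete-and-reload can be scheduled from inside MySQL itself with the event scheduler, something like this (the event name and schedule are just examples, the compliance table is assumed to have the same columns as the view, and event_scheduler must be ON):
DELIMITER //
CREATE EVENT refresh_compliance
ON SCHEDULE EVERY 1 DAY
DO
BEGIN
    START TRANSACTION;
        DELETE FROM compliance;
        INSERT INTO compliance SELECT * FROM person_compliance_view;
    COMMIT;
END //
DELIMITER ;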
If daily isn't often enough, you could schedule this to run more frequently, but better options would be triggers and/or "partial refreshes" (my term; I've no idea if there is a technical term for the idea).
A perfectly written trigger would spot any relevant insert/update/delete and then insert/update/delete the related records in the compliance table. The logic is a little daunting, and I won't attempt it here. An easier option would be a "partial refresh" called within a trigger. The trigger would spot the user targeted by the change, delete only the records from compliance which relate to that user, and then insert from your compliance_view the records relating to that user. You should be able to put that into a stored procedure which is called by the trigger.
Update expanding on the options (if a view just won't do):
Option 1: Daily full (or more frequent) refresh via a schedule
You'd want code like this executed (at least) daily.
truncate table compliance;
insert into compliance select * from compliance_view;
Option 2: Partial refresh via trigger
I don't work with triggers often, so can't recall syntax, but the logic should be as follows (not actual code, just pseudo-code)
AFTER INSERT -- you may need one for each of INSERT / UPDATE / DELETE
FOR EACH ROW -- or if there are multiple rows and you can trigger only on the last one to be changed, that would be better
DELETE FROM compliance
WHERE person_id = INSERTED.person_id
INSERT INTO compliance select * from compliance_view where person_id = INSERTED.person_id
END
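In actual MySQL syntax that pseudo-code might come out roughly like this (MySQL uses NEW rather than INSERTED; the trigger name is made up, the compliance table is assumed to have the same columns as the view, and you'd want similar triggers for UPDATE and DELETE using OLD.person_id where appropriate):
DELIMITER //
CREATE TRIGGER task_after_insert
AFTER INSERT ON task
FOR EACH ROW
BEGIN
    -- Partial refresh: rebuild only the changed person's rows.
    DELETE FROM compliance WHERE person_id = NEW.person_id;
    INSERT INTO compliance
        SELECT * FROM person_compliance_view WHERE person_id = NEW.person_id;
END //
DELIMITER ;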
Option 3: Smart update via trigger
This would be similar to option 2, but instead of deleting all the rows from compliance that relate to the relevant person_id and creating them from scratch, you'd work out which ones to update, update them, and decide whether any should be added or deleted. The logic is a little involved, and I'm not going to attempt it here.
Personally, I'd be most tempted by Option 2, but you'd need to combine it with option 1, since the data goes stale due to the use of now().
Here's a similar way of writing the same thing...
Views are of very limited benefit in MySQL, and I think should generally be avoided.
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(task_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,person_id INT NOT NULL
,task_name VARCHAR(30) NOT NULL
,status ENUM('INCOMPLETE','COMPLETE') NOT NULL
,due_date_time DATETIME NOT NULL
);
INSERT INTO my_table VALUES
(1,111,'walk 20 min daily','INCOMPLETE','2017-04-13 17:20:23'),
(2,111,'brisk walk 30 min','COMPLETE','2017-03-14 20:20:54'),
(3,111,'take medication','COMPLETE','2017-04-20 15:15:23'),
(4,222,'sport','COMPLETE','2017-03-18 14:45:10');
SELECT person_id
, DATE_FORMAT(due_date_time,'%Y-%m') yearmonth
, SUM(status = 'complete')/COUNT(*) x
FROM my_table
GROUP
BY person_id
, yearmonth;
person_id yearmonth x
111 2017-03 1.0
111 2017-04 0.5
222 2017-03 1.0
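If you want it in the percentage format from the question, the same aggregate can simply be scaled and cast, e.g.:
SELECT person_id
     , DATE_FORMAT(due_date_time,'%Y-%m') yearmonth
     , CAST(SUM(status = 'COMPLETE') / COUNT(*) * 100 AS DECIMAL(10,2)) compliance
  FROM my_table
 GROUP
    BY person_id
     , yearmonth;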
I have a tbl_remit where I need to get the last remittance.
I'm developing a system wherein I need to get the potential collection of each Employer using the Employer's last remittance x 12. Ideally, Employers should remit once every month, but there are cases where an Employer remits again for the same month for an additional employee who is newly hired. The MySQL statement that I used was this:
SELECT Employer, MAX(AP_From) as AP_From,
MAX(AP_To) as AP_To,
MAX(Amount) as Last_Remittance,
(MAX(Amount) *12) AS LastRemit_x12
FROM view_remit
GROUP BY PEN
Result
| RemitNo. | Employer | ap_from    | ap_to      | amount |
|        1 |        1 | 2016-01-01 | 2016-01-31 |   2000 |
|        2 |        1 | 2016-02-01 | 2016-02-28 |   2000 |
|        3 |        1 | 2016-03-01 | 2016-03-31 |   2000 |
|        4 |        1 | 2016-03-01 | 2016-03-31 |    400 |
By using that statement, I ended up getting the wrong potential collection.
What I've got:
400 - Last_Remittance
4800 - LastRemit_x12 (potential collection)
What I need to get:
2400 - Last_Remittance
28800 - LastRemit_x12 (potential collection)
Any help is greatly appreciated. I don't have a team on this project; this may be a novice question to some, but to me it's really a complex puzzle. Thank you in advance.
You want to filter the data for the last time period. So, think WHERE rather than GROUP BY. Then, you want to aggregate by employer.
Here is one method:
SELECT Employer, MAX(AP_From) as AP_From, MAX(AP_To) as AP_To,
SUM(Amount) as Last_Remittance,
(SUM(Amount) * 12) AS LastRemit_x12
FROM view_remit vr
WHERE vr.ap_from = (SELECT MAX(vr2.ap_from)
FROM view_remit vr2
WHERE vr2.Employer = vr.Employer
)
GROUP BY Employer;
EDIT:
For performance, you want an index on view_remit(Employer, ap_from). Of course, that assumes that view_remit is really a table . . . which may be unlikely.
If you want to improve performance, you'll need to understand the view.
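If view_remit is (or gets replaced by) a base table, the suggested index would be something like this (the index name is arbitrary):
CREATE INDEX idx_remit_employer_from ON view_remit (Employer, ap_from);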
I'm struggling to design an efficient automated task to clean up a reputation points table, similar to SO I suppose.
If a user reads an article, comments on an article and/or shares an article, I give my members some reputation points. If my member does all three of these for example, there would be three separate rows in that DB table. When showing the members points, I simply use a SUM query to count all points for that member.
Now, with a million active members with high reputation, there are many, many rows in my table, and I would somehow like to clean them up. Using a cron job, I would like to merge all reputation rows for each member, older than 3 months, into one row. For example:
user | repTask | repPoints | repDate
-----------+-------------------------------+--------------+-----------------------
10001 + Commented on article | 5 | 2012-11-12 08:40:32
10001 + Read an article | 2 | 2012-06-12 12:32:01
10001 + Shared an article | 10 | 2012-06-04 17:39:44
10001 + Read an article | 2 | 2012-05-19 01:04:11
Would become:
user | repTask | repPoints | repDate
-----------+-------------------------------+--------------+-----------------------
10001 + Commented on article | 5 | 2012-11-12 08:40:32
10001 + (merged points) | 14 | Now()
Or (merging months):
user | repTask | repPoints | repDate
-----------+-------------------------------+--------------+-----------------------
10001 + Commented on article | 5 | 2012-11-12 08:40:32
10001 + (Merged for 06/2012) | 12 | Now()
10001 + (Merged for 05/2012) | 2 | Now()
Anything older than 3 months is considered legitimate; anything more recent may need to be revoked in case of cheating, hence why I say 3 months.
First of all, is this a good idea? I'm trying to avoid, say in 3 years' time, having hundreds of millions of rows. If it's not a good idea to merge points, is there a better way to store the data as it's entered? I obviously cannot change what's already been entered, but I could make it better for the future.
If this is a good idea, I'm struggling to come up with an efficient query to modify the data. I'm not looking for exact code, but if somebody could describe a suitable query that could merge all points older than 3 months for each user, or merge all points older than 3 months into separate months for each user, it would be extremely helpful.
You can do it that way, with cron jobs, but how about this:
Create a trigger or procedure so that anytime a point is added, it updates a total column in the users table, and anytime a point is revoked the total column is subtracted from?
This way, no matter how many millions or billions of rows in the points table, you don't have to query those to get the total points results. You could even have separate columns for months or years. Also, since you're not deleting any rows you can go back and retroactively revoke a point from, say, a year ago if needed.
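A rough sketch of the insert side, assuming the points live in a `reputation` table shaped like your example and the users table has a `total_points` column (all of these names are guesses; an AFTER DELETE trigger would do the same with a minus sign):
DELIMITER //
CREATE TRIGGER reputation_after_insert
AFTER INSERT ON reputation
FOR EACH ROW
BEGIN
    -- Keep the running total on the user in sync with the points rows.
    UPDATE users
       SET total_points = total_points + NEW.repPoints
     WHERE id = NEW.user;
END //
DELIMITER ;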
I have a number of files on my website that are private and pushed through php. I keep track of the downloads using a mysql database. Currently I just use a column for each file and insert a new row for every day, which is fine because I don't have many files.
However, I am going to be starting to add and remove files fairly often, and the number of files will be getting very large. As I see it I have two options:
The first is to add and remove columns for each file as they are added and removed. This would quickly lead to the table having very many columns. I am self-taught so I'm not sure, but I think that's probably a very bad thing. Adding and removing columns once there are a lot of rows sounds like a very expensive operation.
I could also create a new database with a generic 'fileID' field, and then add a new row every day for each file, but this would lead to a lot of rows. Also, it would be a lot of row insert operations to create tracking for the next day.
Which would be better? Or is there a third solution that I'm missing? Should I be using something other than mysql? I want something that can be queried so I can display the stats as graphs on the site.
Thank you very much for your help, and for taking the time to read.
I could also create a new database with a generic 'fileID' field, and then add a new row every day for each file, but this would lead to a lot of rows.
Yes, this is what you need to do — but you mean "a new table", not "a new database".
Basically you'll want a file table, which might look like this:
id | name | created_date | [other fields ...]
----+-----------+--------------+--------------------
1 | foo.txt | 2012-01-26 | ...
2 | bar.txt | 2012-01-27 | ...
and your downloads_by_day table will refer to it:
id | file_id | `date` | download_count
----+---------+------------+----------------
1 | 1 | 2012-01-27 | 17
2 | 2 | 2012-01-27 | 23
3 | 1 | 2012-01-28 | 6
4 | 2 | 2012-01-28 | 195
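A minimal sketch of those two tables and the daily counting (the unique key and the ON DUPLICATE KEY UPDATE upsert are my additions; `?` stands for the file id your PHP passes in):
CREATE TABLE file (
    id           INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name         VARCHAR(255) NOT NULL,
    created_date DATE NOT NULL
);

CREATE TABLE downloads_by_day (
    id             INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    file_id        INT NOT NULL,
    `date`         DATE NOT NULL,
    download_count INT NOT NULL DEFAULT 0,
    UNIQUE KEY uq_file_date (file_id, `date`)
);

-- Each time a file is served, bump that day's counter (creates the row if needed).
INSERT INTO downloads_by_day (file_id, `date`, download_count)
VALUES (?, CURDATE(), 1)
ON DUPLICATE KEY UPDATE download_count = download_count + 1;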
I am having trouble developing some queries on the fly for our clients, and sometimes find myself asking, "Would it be better to start with a subset of the data I know I'm looking for, then just import it into a program like Excel and process it using functions such as Pivot Tables?"
One instance in particular I am struggling with is the following example:
I have an online member enrollment system. For simplicity's sake, let's assume the data captured is: Member ID, sign-up date, their referral code, and their state.
A sample member table may look like the following:
MemberID | Date | Ref | USState
=====================================
1 | 2011-01-01 | abc | AL
2 | 2011-01-02 | bcd | AR
3 | 2011-01-03 | cde | CA
4 | 2011-02-01 | abc | TX
and so on....
Ultimately, the types of queries I want to build and run with this data set can extend to:
"Show me a list of all referral codes and the number of sign ups they had by each month in a single result set".
For example:
Ref | 2011-01 | 2011-02 | 2011-03 | 2011-04
==============================================
abc | 1 | 1 | 0 | 0
bcd | 1 | 0 | 0 | 0
cde | 1 | 0 | 0 | 0
I have no idea how to build this type of query in MySQL, to be honest (I imagine that if it can be done, it would require a LOT of code: joins, subqueries, and unions).
Similarly, another sample query may be how many members signed up in each state, by month:
USState | 2011-01 | 2011-02 | 2011-03 | 2011-04
==============================================
AL | 1 | 0 | 0 | 0
AR | 1 | 0 | 0 | 0
CA | 1 | 0 | 0 | 0
TX | 0 | 1 | 0 | 0
I suppose my question is two fold:
1) Is it in fact best to just try to build these out with the necessary data from within a MySQL GUI such as Navicat, or to just import the entire subset of data into Excel and work from there?
2) If I were to use the MySQL route, what is the proper way to build the subsets of data in the examples mentioned above? (Note that the queries could become far more complex, such as "Show how many sign-ups came in for each particular month, by each state, and grouped by each agent as well (each agent has 50 possible rows)".)
Thank you so much for your assistance ahead of time.
I am a proponent of doing this kind of querying on the server side, at least to get just the data you need.
You should create a time-periods table. It can get as complex as you desire, going down to days even.
id year month monthstart monthend
1 2011 1 1/1/2011 1/31/2011
...
This gives you almost limitless ability to group and query data in all sorts of interesting ways.
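A minimal version might look like this (names match the sample rows above; in practice you'd generate one row per month for whatever date range you need):
CREATE TABLE months (
    id         INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    year       INT NOT NULL,
    month      INT NOT NULL,
    monthstart DATE NOT NULL,
    monthend   DATE NOT NULL
);

INSERT INTO months (year, month, monthstart, monthend) VALUES
(2011, 1, '2011-01-01', '2011-01-31'),
(2011, 2, '2011-02-01', '2011-02-28'),
(2011, 3, '2011-03-01', '2011-03-31');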
Getting the data for the original referral counts by month query you mentioned would be quite simple...
select a.Ref, b.year, b.month, count(*) as referralcount
from myTable a
join months b on a.Date between b.monthstart and b.monthend
group by a.Ref, b.year, b.month
order by a.Ref, b.year, b.month
The result set would be in rows like ref = abc, year = 2011, month = 1, referralcount = 1, as opposed to a column for every month. I am assuming that, since getting a larger set of data and manipulating it in Excel was an option, changing the layout of this data wouldn't be difficult.
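If you really do want the one-column-per-month layout from your question, conditional aggregation can produce it, at the cost of hard-coding the month list (which is the usual trade-off with pivoting in MySQL):
select a.Ref
     , sum(date_format(a.Date, '%Y-%m') = '2011-01') as `2011-01`
     , sum(date_format(a.Date, '%Y-%m') = '2011-02') as `2011-02`
     , sum(date_format(a.Date, '%Y-%m') = '2011-03') as `2011-03`
     , sum(date_format(a.Date, '%Y-%m') = '2011-04') as `2011-04`
from myTable a
group by a.Ref
order by a.Ref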
Check out this previous answer that goes into a little more detail about the concept with different examples: SQL query for Figuring counts by month
I work on an Excel-based application that deals with multi-dimensional time-series data, and have recently been working on implementing predefined pivot table spreadsheets, so I know exactly what you're thinking. I'm a big proponent of giving users tools rather than writing up individual reports or a whole query language for them to use. You can create pivot tables on the fly that connect to the database, and it's not that hard. Andrew Whitechapel has a great example here. But you will also need to launch that in Excel or set up a basic Excel VSTO program, which is fairly easy to do in Visual Studio 2010. (microsoft.com/vsto)
Another thing, don't feel like you have to create ridiculously complex queries. Every join that you have will slow down any relational database. I discovered years ago that doing multi-step queries into temp tables in most cases will be much clearer, faster, and easier to write and support.