Database Size Management - mysql

I am in the final stages of building my website and I am getting a little nervous that I am doing something wrong with my database.
I am building a Laravel/MySQL site that allows users to add events, so I have an Event table. Users can choose dates for their event over the next six months, which means one event can have around 180 event dates. I save those to a separate table called Shows. Each show date can have tickets with their prices (VIP, general, etc.) in a Ticket table. That means a single event could create a huge number of entries.
event (1) -> a show on every day (180) -> 5 ticket types (900 entries in my Ticket table)
As far as I can tell this is the correct way to do it, but it seems like my Ticket table is going to get massive quickly. I will be using Elasticsearch to filter through the data.

Related

Firestore Cloud Function Trigger only when path updates with new entries

I have a firestore database that looks like this
/entries/ ....
/users/{userid}...
A bunch of documents are being sent into ... of entries, and userid contains only 8 docs of user profile information.
My problem is that each entries doc contains the field hours and has no relation to the user doc, which contains the field weekly_capacity.
I need to combine the two fields, hours and weekly_capacity, into a full-time equivalency (FTE) variable.
But the FTE needs to be accurate, and at this company FTE can change, so it would need to be calculated over various dates even if the user changed their FTE status any number of times.
The current app only fetches the entries when the user logs into the app, which can be whenever.
None of the API requests that I am using will give me JSON that holds both weekly_capacity and hours in the same fetch. If every time a user logs into the app Firestore calls the HTTP endpoint to fetch all entries, how can I compare the hours field on the entries collection to the weekly_capacity field?
Just a little context: FTE = full-time equivalency, a standard measure of how an employee's hours compare to the core commit hours they signed up for, which is 40. So if I agreed to work 40 hours and I actually work 40 hours, then I would be 1 whole FTE. If I worked 20 and I was supposed to work 40, I am 0.5 FTE. The math is really simple; it's just that in my situation the FTE variable can change at any time, and the app will allow the user to enter a range of dates, fetching the total actual hours they worked and their FTE, letting them know how many hours they were supposed to work vs. how many hours they actually worked. Since the variable changes, I need some way in Firestore to track the change and aggregate correctly against the hours actually worked. To give an example: let's say I changed my FTE from 1 to 0.7 on March 20th, and I then want to generate a report for March 1 to March 30th stating my hours worked and FTE status, meaning whether I reached my goal. The kicker is that I can't fetch or merge the entries, which hold the variable hours, and /users/, which holds the variable weekly_capacity.
I don't even think a cloud function would solve the problem, since entries are only fetched when the user logs in, right?
I'm assuming the following for answering your question.
Requirement: To calculate FTE for a user when the user's weekly_capacity is updated or the user logs in.
Problems:
Some way in firestore to track the change.
Calculate FTE correctly according to the change.
Here's what I think will solve the problems.
Google Cloud Firestore supports listeners on the collections in which you store the data, so you can listen for any change in the users collection and the entries collection. This is how you can track the change.
To calculate FTE, when a change is made to the weekly_capacity of a user document or a new entry is added to the entries collection, you need to query both collections separately to get the records corresponding to the affected user. You can also use a collection group query for this purpose, but that depends on your database design.
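As a rough illustration of the calculation step, here is a minimal sketch in Python. It assumes (this is not part of the setup described in the question) that every change to weekly_capacity is stored together with the date it took effect, for example by the listener mentioned above. The names capacity_on, expected_hours and fte_report are hypothetical, weekly_capacity is taken to be in hours, and expected hours are spread evenly over seven days a week to keep the example short.

```python
from datetime import date, timedelta


def capacity_on(day, capacity_history):
    """Return the weekly capacity in effect on `day`.

    capacity_history is a list of (effective_date, weekly_capacity) tuples,
    sorted by effective_date (hypothetical structure).
    """
    current = None
    for effective, capacity in capacity_history:
        if effective <= day:
            current = capacity
        else:
            break
    if current is None:
        raise ValueError(f"no capacity recorded on or before {day}")
    return current


def expected_hours(start, end, capacity_history):
    """Sum the expected hours for every day in [start, end], inclusive."""
    total = 0.0
    day = start
    while day <= end:
        # Simplification: weekly capacity spread evenly over 7 days.
        total += capacity_on(day, capacity_history) / 7
        day += timedelta(days=1)
    return total


def fte_report(start, end, entries, capacity_history):
    """entries is a list of (entry_date, hours) pairs read from the entries collection."""
    worked = sum(hours for day, hours in entries if start <= day <= end)
    expected = expected_hours(start, end, capacity_history)
    ratio = worked / expected if expected else None
    return {"worked": worked, "expected": expected, "ratio": ratio}


# The example from the question: FTE drops from 1 (40 h) to 0.7 (28 h) on March 20th.
# The year is arbitrary.
history = [(date(2024, 1, 1), 40), (date(2024, 3, 20), 28)]
entries = [(date(2024, 3, 4), 8), (date(2024, 3, 21), 6)]
report = fte_report(date(2024, 3, 1), date(2024, 3, 30), entries, history)
```

Because the capacity is looked up per day, a report range that spans one or more capacity changes is handled without any special casing.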
Hope that helps.

Summing a ledger over long period of time. (reconciliation, snapshots, rolling sum?)

We are building a warehouse stock management system and have a stock movements table that records stock into, through and out of the system, for each product and each location in which it is stored, e.g.:
10 units of Product A are received into Location A.
10 units of Product A are moved to Location B and removed from Location A.
1 unit is removed (sold) from Location B.
... and so on.
This means that to work out how much of each product is stored in each location we would run:
"SELECT location, product, SUM(qty) FROM stock_movements GROUP BY location, product"
(we actually use Eloquent but I have used SQL for an example)
Over time, this will mean our stock movements table will grow to millions of rows, and I am wondering how best to manage this. The options I can think of:
Sum the rows as grouped above and accept it may get slow over time. I'm not sure how many rows it will take before it actually starts to cause any performance issues. When requesting a whole inventory log via our API, the rows would have to be summed for every product, so this will amount to a fairly large calculation.
Create a snapshot of the summed rows every day/week/month etc. on a cron and then just add the sum of the most recent rows on the fly.
Create a separate table with a live stock level which is added to and subtracted with every stock movement. The stock movements table shows an entire history of all movements while the new table just shows the live amounts. We would use database transactions here to ensure they keep in sync.
Is there a defined and best practice way to handle this kind of thing already? Would love to hear your thoughts!
The good news is that your system is already where a lot of people say the database world should be moving: event sourcing. ES just stores every event against an object, in this case your location, and in order to get the current state you have to start with an empty object and replay all of that object's events.
Of course, this can be time-consuming, and your last two bullet points are the standard ways of dealing with it. First, you can create regular snapshots with the current-as-of-then totals for that location, and then when someone asks for the current-as-of-now totals you only need to replay events since the last snapshot. Second, you can have a separate table of current values, and whenever you insert a record into your event store you also update the current value. If they ever get out-of-sync, you can always start fresh and replay the entire event series again.
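The second approach, including the "start fresh and replay" fallback, could look roughly like the following Python sketch. It uses an in-memory SQLite database purely so the example is self-contained; the table layout and the names record_movement, rebuild_levels and levels are assumptions for illustration, not your actual schema or Eloquent models.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a movement log and a live totals table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE stock_movements (
            id INTEGER PRIMARY KEY,
            product TEXT NOT NULL,
            location TEXT NOT NULL,
            qty INTEGER NOT NULL
        );
        CREATE TABLE stock_levels (
            product TEXT NOT NULL,
            location TEXT NOT NULL,
            qty INTEGER NOT NULL,
            PRIMARY KEY (product, location)
        );
    """)
    return conn


def record_movement(conn, product, location, qty):
    """Append the movement and adjust the live total in one transaction."""
    with conn:  # commits both statements together, rolls back on error
        conn.execute(
            "INSERT INTO stock_movements (product, location, qty) VALUES (?, ?, ?)",
            (product, location, qty),
        )
        cur = conn.execute(
            "UPDATE stock_levels SET qty = qty + ? WHERE product = ? AND location = ?",
            (qty, product, location),
        )
        if cur.rowcount == 0:  # first movement for this product/location pair
            conn.execute(
                "INSERT INTO stock_levels (product, location, qty) VALUES (?, ?, ?)",
                (product, location, qty),
            )


def rebuild_levels(conn):
    """Throw away the live totals and replay the full movement history."""
    with conn:
        conn.execute("DELETE FROM stock_levels")
        conn.execute(
            "INSERT INTO stock_levels (product, location, qty) "
            "SELECT product, location, SUM(qty) FROM stock_movements "
            "GROUP BY product, location"
        )


def levels(conn):
    """Return the live totals as {(product, location): qty}."""
    rows = conn.execute("SELECT product, location, qty FROM stock_levels")
    return {(product, location): qty for product, location, qty in rows}


conn = open_db()
record_movement(conn, "Product A", "Location A", 10)   # received
record_movement(conn, "Product A", "Location A", -10)  # moved out
record_movement(conn, "Product A", "Location B", 10)   # moved in
record_movement(conn, "Product A", "Location B", -1)   # sold
```

Reading the current stock is then a primary-key lookup on stock_levels, while stock_movements remains the full history.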
Both of these scenarios are typically managed through an intermediary queue service, like SQL Server's Service Broker, RabbitMQ, or Amazon's SQS: instead of inserting an event directly into your event store, you send the change into a queue, and the code that processes the queue will update your snapshot.
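To show the shape of the queue-based flow, here is a small Python sketch in which an in-process queue.Queue stands in for a real broker such as RabbitMQ or SQS. Everything here (publish, worker, the in-memory event_store list) is a hypothetical stand-in; a real setup would use the broker's own client library and a durable store.

```python
import queue
import threading
from collections import defaultdict

events = queue.Queue()              # stand-in for the external queue service
event_store = []                    # append-only history of movements
current_levels = defaultdict(int)   # running totals per (product, location)
_STOP = object()                    # sentinel telling the worker to exit


def publish(product, location, qty):
    """Producers only enqueue the change; they never touch the totals."""
    events.put((product, location, qty))


def worker():
    """Single consumer: stores each event and updates the running total."""
    while True:
        event = events.get()
        if event is _STOP:
            break
        product, location, qty = event
        event_store.append(event)
        current_levels[(product, location)] += qty


consumer = threading.Thread(target=worker)
consumer.start()
publish("Product A", "Location A", 10)
publish("Product A", "Location A", -10)
publish("Product A", "Location B", 10)
publish("Product A", "Location B", -1)
events.put(_STOP)
consumer.join()
```

With a single consumer, events are applied in the order they were queued, so the totals are only ever modified in one place.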
Good luck!

How to handle product delivery times in mysql?

I have a WinForms application and a MySQL server saving the data.
I created a form from which users can save/update/delete records, pretty normal stuff.
The problem came when I added the delivery time functionality, where the product has an expected, standard time in which it will be delivered. Say it has a six-day expected time: the form would calculate the expected delivery date and show it back to the user. Straightforward.
I've tried using a DATETIME column, but after reading the MySQL documentation I found it has upper and lower limits, so I can't save, say, '0000-00-01 12:30:00'. So how do I save that kind of data into MySQL? What's standard practice in this topic?
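One possible way to model this, sketched in Python and not taken from the question: treat the delivery time as a duration rather than a point in time, store it as a plain integer number of minutes (for example in an INT column), and add it to the order timestamp when the expected date has to be shown. The function and variable names are hypothetical.

```python
from datetime import datetime, timedelta


def expected_delivery(ordered_at, lead_time_minutes):
    """Add a stored lead time (whole minutes) to the order timestamp."""
    return ordered_at + timedelta(minutes=lead_time_minutes)


# '0000-00-01 12:30:00' read as a duration: 1 day, 12 hours, 30 minutes.
lead_time_minutes = 1 * 24 * 60 + 12 * 60 + 30

# The order timestamp below is arbitrary.
due = expected_delivery(datetime(2024, 1, 1, 9, 0), lead_time_minutes)
```

An integer column has no calendar limits, and the same value can be turned back into days and hours for display.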

Merging table data in mysql?

I have an LDAP CSV file that is imported nightly and dumped into my MySQL database. It has about 70,000 employee records.
Included in that is empl#, email, group, supervisor, etc.
I have reports that are being generated from various web sites. We are dumping these reports in the database once a month. These reports usually have empl#, email, hits, logins, whatever...
My goal is to combine the report data and add in things like group, supervisor, etc based on empl#... Speed is a big concern because of the size of the database and number of users.
At first I thought of making a simple left join (with the report data on the left, since not everyone in the report may be an employee). However, the problem with that is that it does not take a snapshot in time. If report data from 6 months ago is viewed, I don't want it mixed with current employee data; I want it to stay a snapshot in time.
What is the best way to handle this?
You will need a date column of some kind in both sets of data on which to join. Once you have that, you can simply put a condition in the WHERE clause that establishes the snapshot and limits the selection.
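A minimal sketch of that idea in Python, using an in-memory SQLite database so it runs on its own. It assumes a dated copy of the employee data is kept per month instead of being overwritten; the table names, column names and sample rows are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_snapshots (
        empl_no INTEGER, snapshot_month TEXT, grp TEXT, supervisor TEXT
    );
    CREATE TABLE report_rows (
        empl_no INTEGER, report_month TEXT, hits INTEGER
    );
    INSERT INTO employee_snapshots VALUES
        (1, '2024-01', 'Sales', 'Ann'),
        (1, '2024-07', 'Marketing', 'Bob');
    INSERT INTO report_rows VALUES
        (1, '2024-01', 12),
        (2, '2024-01', 3),
        (1, '2024-07', 20);
""")


def report_for(conn, month):
    """Join report rows to the employee data as it was in the same month."""
    return conn.execute(
        """
        SELECT r.empl_no, r.hits, e.grp, e.supervisor
        FROM report_rows AS r
        LEFT JOIN employee_snapshots AS e
               ON e.empl_no = r.empl_no
              AND e.snapshot_month = r.report_month
        WHERE r.report_month = ?
        ORDER BY r.empl_no
        """,
        (month,),
    ).fetchall()


january = report_for(conn, "2024-01")
```

The LEFT JOIN keeps report rows for people who are not in the employee data (their group and supervisor come back as NULL), and the date condition in the join keeps old reports tied to the employee data of their own period.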

Query, Display, and Filter Large Database Lists

I am trying to determine the best method of collecting a large list from a database and then displaying and filtering the results on the client side. Let me give a quick example:
Example: I've got a database with customer data, and currently it contains around 2000 records. This number is constantly increasing. On my website I have a page from which I want to query that database based on information such as name, email, phone number etc. and of course display the results (when a user types in Smith, it returns all records containing the name Smith). I am planning on using AJAX so that I can query the database and display the results on the fly, similar to how Google does it. When a user begins searching, results will start showing up on the page as they are found.
Possible Solutions:
Unfortunately I am stumped on how to go about implementing something like this. I am considering using a ValueList pattern. When the user first loads the page, should I be querying the database, storing every record in a collection, and then searching that collection and displaying the results on my JSP page? Essentially creating a Java database. The thing I like about the ValueList pattern is that I take one huge hit on page load and dump the entire database into objects stored in a list. What if the database is larger though, say 2,000,000 records?
Or should I be using a simple DAO pattern without the ValueList and query the database for each individual search? This would result in a LOT of database queries, especially considering that I plan on returning results as the user types in the search box.
Edit: The more I think about this, the more it is an AJAX question. My biggest concern should be how to query my database while the user is typing. Do I set some sort of listener to listen for the user to stop typing and then perform the query?
I would use Solr for this type of task.
The fields you are going to search on should be indexed in Solr.
Then you do an AJAX query to Solr and get the result. You can set the order and the number of items per page, and show results only for the current page.
Solr has a lot of other features that can be useful for you.
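As a small illustration of the ordering and paging part, this Python sketch only builds the URL for a request to Solr's select handler. The q, start, rows, sort and wt parameters are standard Solr query parameters; the host, the core name customers, the field name and the helper solr_select_url are assumptions for the example. The search term is inserted as-is, so real code would need to escape Solr query syntax characters.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, term, page=0, page_size=10, sort="name asc"):
    """Build a paged, sorted prefix query against a Solr core."""
    params = {
        "q": f"name:{term}*",       # prefix match on a hypothetical `name` field
        "start": page * page_size,  # offset of the first result on this page
        "rows": page_size,          # number of results per page
        "sort": sort,
        "wt": "json",               # ask Solr for a JSON response
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Third page (page index 2) of matches for "Smith", 10 per page.
url = solr_select_url("http://localhost:8983", "customers", "Smith", page=2)
```

The page would call a URL like this (usually through a server-side endpoint rather than exposing Solr directly) each time the search box changes.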