I have a website written in PHP with a search form.
This site has lots of sections and MySQL tables.
The client was complaining about the results because they were not grouped, and wanted them grouped according to the site section, like below:
Results in "About" Page:
<the results>
Results in "Blog":
<the results>
... and so on.
I had to implement a quick solution for it, so I made several queries and ran them separately... and used a foreach to iterate over the results and print them.
Well, it works, but I'm not happy with it, because it is quite slow and I wonder if I'll have performance issues in the future.
I'm not a MySQL genius and I've just started my career as a backend programmer, so I wanted someone to give me an idea of how I could handle this in a more professional way.
I was thinking of using a join, but I don't know how I can group the results using this approach.
Any help would be much appreciated.
I really doubt a join would help you in any way. Since you said that each section queries a different table, there is no way you can join them while still making any sense. The best you can do is combine these queries into one call and fetch all the needed information in one go, saving the round-trip time between PHP and MySQL, since you will execute once and return once. Take a look here to see how you can do that. I really don't think there is anything better you could improve :)
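For illustration, here is a rough sketch of that idea using mysqli's multi_query (the table and column names are made up; adapt them to your schema):

$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');
$term   = $mysqli->real_escape_string($_GET['q']);

// One statement per site section, sent in a single round trip.
$sql  = "SELECT id, title FROM about_pages WHERE body LIKE '%$term%';";
$sql .= "SELECT id, title FROM blog_posts  WHERE body LIKE '%$term%';";

$sections = array('About', 'Blog');
if ($mysqli->multi_query($sql)) {
    $i = 0;
    do {
        if ($result = $mysqli->store_result()) {
            echo '<h3>Results in "' . $sections[$i] . '"</h3>';
            while ($row = $result->fetch_assoc()) {
                echo '<p>' . htmlspecialchars($row['title']) . '</p>';
            }
            $result->free();
        }
        $i++;
    } while ($mysqli->next_result());
}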
Response: Clearly, the more requests you make, the longer your script will take. Have you ever tried running ping google.com in cmd? Even though you send a very small amount of data, you cannot get a response faster than 30ms or so. That is the price you pay for any request. Plus, the amount of data sent also adds to the time, so executing the queries one by one makes many unnecessary calls. You can always try it yourself; it is not difficult to write it either way. Output the time spent on the task and repeat a few times. If the time spent sending data is insignificant, you can just leave it the easy way. But keep in mind that if your application grows bigger, every lost millisecond will add up: multiply it by a thousand requests and you could lose a minute of your time or even more. Anyway, definitely don't go the UNION route, because you will most likely lose the most time untangling the data you receive.
And about that function: I myself have never needed it, so whether it's me or you reading about it, we would both be reading it from scratch. The only difference is that I knew such a function existed :) Weirdly enough, there is very little information on this. C# has DataSets, which are very good and make it easy to handle data; PHP is lagging behind in my opinion :/ And I am all in on C# now, so I tend to know less and less PHP. If I were you, I would just copy-paste the example from the link I gave and turn it into a reusable class for later. Hope I helped at least a little bit :)
This is more of a philosophical question than a technical one. I'm an Access noob and am running into a philosophical conundrum.
I've built some queries from my base tables. I have them pretty much how I want them and don't really foresee making additional changes.
So, my question is this: is there an advantage to keeping my data in queries? If they're going to be static queries, should I just make them into tables? Have I already made too many tables/is there such a thing?
I'm working with computer scans. The scans are looking at different things on the computer--these are 2 tables. I also have a master list of stuff that I put together. And then I have a list of printers.
Then, I have like 7 queries. They're things like looking for intersects between the different scans, comparing the results of scans to lists of printers, etc. etc.
So, yeah. Do I keep them as queries, does it not matter, or should I make them into tables if they're just going to be static?
Welcome to Stack Overflow.
You are going to need to provide much more detail if you want specific help in these forums. Showing us details of what you've already accomplished helps a lot, and so does showing more detail on where you are trying to get to. You might want to read the site rules for how to post a well-formed question.
As for databases... there are many ways to construct a database, depending on the number of data fields as well as the amount of data in those fields.
If you have a relatively small database that doesn't change, you can break with convention and clump data into rough tables. But it's very advisable NOT to do that, because if you ever need to change the tables or add new data, it starts to become a nightmare fairly quickly.
Which brings up the question: what are well formed tables in a database and how are they connected?
The answer usually requires taking some database classes, but you can start by looking up what Third Normal Form is. This will give you an idea of how to break data down into tables that are manageable and easy to expand upon (for example, store each printer once in its own table and reference it by id from the scan tables, rather than repeating the printer details in every row).
But Third Normal Form is not always the best way to store data. Sometimes for reporting purposes, it's better to have tables in second normal form or lower to give you more speed on retrieval. (Mind you, this is usually for databases with massive amounts of data.)
Anyway, it's worth looking up articles on database design or taking a class. The more you understand about how data is retrieved and stored, the better you will be at deciding what the best structure is for you.
If you post more details, I'm sure people from stack overflow will help give you more pointers.
Best of luck! :)
I am working on a project that will be used by around 500 employees in my organization. Currently it's still in the development phase, and very few people (around 10) are using it. I'm using MySQL. I just want to know what happens if many users make front-end edits and then save at the same point in time. Some SELECT queries that I've written take as long as 6 seconds to execute. Since only one query can be executed at any point in time, if a query is already in progress and another hits the database, will it create a problem? If this is a common situation in large-scale projects, please let me know how I can handle it. I'm not sure if I've made myself clear :). Any advice or links will be very helpful.
From a technical standpoint, no: nothing bad will happen. The database won't go ballistic and die on you; databases are made for exactly this kind of concurrent access.
From a logical point of view, something bad can happen. If two people edit the same thing at the same time and then save at the same time, the updates get written one after another. The last one to save wins, and the first person effectively loses their changes.
You can approach this problem from several angles. Some projects introduce the concept of locking (not table locking, but in-app locking). It revolves around marking a record as locked, for example with a boolean column, so that if anyone else tries to open that record for updating, the software says someone else is editing it. It's really difficult to implement well, and most of the time it doesn't work as expected (I vaguely remember Joomla! using something like that; it was one of the most annoying features ever).
The other option is to save each update as a revision. That way you can keep track of who updated what and when, and you never lose any data to overwrites. I believe SO and Wikipedia use that approach, and it works really well because you can inspect what two or more people have done and merge their contributions.
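A bare-bones sketch of the revision idea (table and column names are invented): every save is an INSERT rather than an UPDATE, so nothing is ever overwritten.

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->prepare('INSERT INTO article_revisions (article_id, body, edited_by, edited_at)
               VALUES (:id, :body, :user, NOW())')
    ->execute(array(':id' => $articleId, ':body' => $newBody, ':user' => $userId));

// The "current" version is simply the newest revision:
// SELECT body FROM article_revisions WHERE article_id = ? ORDER BY edited_at DESC LIMIT 1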
Optimistic Concurrency Control
http://en.wikipedia.org/wiki/Optimistic_concurrency_control
Make sure each record contains metadata on when it was last changed/modified, and load that as part of your data object. Then, when attempting to commit the row to the database, check the last_modified time in the table to ensure it is the SAME as the one stored in memory for your object. If it matches, commit; otherwise, throw an exception.
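In PHP/PDO terms, that commit step could look roughly like this (table and column names are placeholders):

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$stmt = $pdo->prepare(
    'UPDATE articles
        SET body = :body, last_modified = NOW()
      WHERE id = :id AND last_modified = :loaded_at'
);
$stmt->execute(array(
    ':body'      => $newBody,
    ':id'        => $articleId,
    ':loaded_at' => $loadedAt,   // the last_modified value read when the record was loaded
));

if ($stmt->rowCount() === 0) {
    // Someone else saved in the meantime (or the row is gone): refuse the write.
    throw new Exception('The record was changed by another user. Please reload and try again.');
}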
I am sure there are lots of tutorials on this kind of topic, but I can't find what I want because I don't know the jargon for it, so I'm asking Stack Overflow.
Here's the example:
People can Like or Dislike videos on YouTube, and the database should update the Like/Dislike counts. However, it's impractical, especially for a site like YouTube, to update the database every time a user clicks the Like/Dislike button.
How can we cache the queries / count updates for a time interval and, when the interval expires, send all the queries / update the database in one go? Or is there a similar technique for this kind of situation?
So what you're observing is the time delay between something happening and being able to view the results of what happened.
And you're on the right path to only update periodically.
But you're on the wrong path as far as where to do the periodic updates.
The thing is, you WANT to update the "database" ASAP every time (namely the database(s) responsible for writing - choose your missing corner of the CAP triangle) so you capture everything quickly, but for your visitors/viewers you give them a slightly-behind view (a few seconds to maybe a day, depending on the situation) of the write database(s).
You do NOT want to store this on the browser and potentially lose what the user did should the request fail, the internet go down, etc.
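To make that split concrete, here is a rough PHP sketch (table, key, and host names are invented): the click is written to the write-side database immediately, while viewers read a count that is allowed to lag slightly behind.

$pdo   = new PDO('mysql:host=write-db;dbname=videos', 'user', 'pass');
$cache = new Memcached();
$cache->addServer('localhost', 11211);

// Write side: record the click right away.
$pdo->prepare('UPDATE videos SET likes = likes + 1 WHERE id = :id')
    ->execute(array(':id' => $videoId));

// Read side: serve a slightly stale count, refreshed every 30 seconds.
$key   = 'video:' . $videoId . ':likes';
$likes = $cache->get($key);
if ($likes === false) {
    $stmt = $pdo->prepare('SELECT likes FROM videos WHERE id = :id');
    $stmt->execute(array(':id' => $videoId));
    $likes = (int) $stmt->fetchColumn();
    $cache->set($key, $likes, 30);   // 30-second TTL
}
echo $likes;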
Slightly off topic - you typically should not try to "prematurely optimize" without data on how much you're going to save by caching, buffering, etc. Optimizations like that add complexity, and you will stay sane longer if you keep things simple for as long as possible. Keep your design simple and optimize your bottlenecks once you know what they are.
Slightly more off topic - I'd recommend reading up on distributed computing, specifically as it pertains to databases, and then some design. You'll realize these highly focused abstract problems all have "solutions" with various advantages and disadvantages.
I have a query that takes about a minute to complete since it deals with a lot of data, but I also want to put the results on a website. The obvious conclusion is to cache it (right?) but the data changes as time goes by and I need a way to automatically remake the cached page maybe every 24 hours.
Can someone point me to how to do this?
Edit: I want to make a "top 10" type of thing, so it's not displaying the page that is the problem but the amount of time it takes for the query to run.
Caching the results of the query with a 24hr TTL (expiry) would probably work fine. Use a fragment cache assuming this is a chunk of the page.
You can set up Memcached or Redis, as stated, to store the cache. Another thing you can do is set up a job that warms the cache every 24 hrs (or as desired) so that an unlucky user doesn't have to generate the cache for you.
If you know when the cache should expire based on a state change in your database, you can expire the cache based on that. A lot of the time I use the created_at or updated_at fields as part of the cache key to assist with this.
There is some good stuff in the Scaling Rails screencasts by Envy Labs and New Relic: http://railslab.newrelic.com/scaling-rails. It's a little out of date, but the principles are still the same.
Also, checkout the caching rails guides. http://guides.rubyonrails.org/caching_with_rails.html
Finally, make sure your indexes are set up properly; see Thoughtbot's post here: http://robots.thoughtbot.com/post/163627511/a-grand-piano-for-your-violin
Typed on my phone so apologies for typos.
Think a little beyond the query. If your goal is to allow the user to view a lot of data, then grab that data as they want it rather than fighting with a monstrous query that's going to overwhelm your UI. The result not only looks better, but is much, much quicker.
My personal trick for this pattern is DataTables. It's a grid that lets you use Ajax queries (built in) to fetch the data from your query a "chunk" at a time, as the user wants to see it. It can sort, page, filter, limit, and even search with some simple additions to the code. It even has plug-ins to export results to Excel, PDF, etc.
The biggest thing DataTables has that others don't is a concept called "pipelining", which lets you fetch the amount to show (say 20) plus an additional amount forwards and/or backwards. This allows you to still run manageable queries but not have to hit the database every time the user hits "next page".
I've got an app dealing with millions of records. One query for all the data would be impossible... it would just take too long. Grabbing 25 at a time, however, is lightning fast, no tricks required. Once the DataTable was up, I just performance-tuned my query, did some indexing where needed, and voila... a great, responsive app.
Here's a simple example:
<table id="example"></table>
$('#example').dataTable( {
    "bProcessing": true,                    // show the "Processing..." indicator while loading
    "bServerSide": true,                    // fetch each page of rows from the server via Ajax
    "sAjaxSource": "/processing/file.php"   // server-side script that runs the chunked query
} );
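And, as a guess at the other side, something roughly like this could sit at /processing/file.php for the legacy DataTables server-side protocol (parameter names per DataTables 1.9; table and column names are made up):

$pdo    = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$start  = isset($_GET['iDisplayStart'])  ? (int) $_GET['iDisplayStart']  : 0;
$length = isset($_GET['iDisplayLength']) ? (int) $_GET['iDisplayLength'] : 25;

$total = (int) $pdo->query('SELECT COUNT(*) FROM records')->fetchColumn();

// Only the requested "chunk" is read, so the query stays fast even on millions of rows.
$stmt = $pdo->prepare('SELECT name, created_at FROM records ORDER BY created_at DESC LIMIT :start, :length');
$stmt->bindValue(':start',  $start,  PDO::PARAM_INT);
$stmt->bindValue(':length', $length, PDO::PARAM_INT);
$stmt->execute();

header('Content-Type: application/json');
echo json_encode(array(
    'sEcho'                => isset($_GET['sEcho']) ? (int) $_GET['sEcho'] : 0,
    'iTotalRecords'        => $total,
    'iTotalDisplayRecords' => $total,   // no filtering applied in this sketch
    'aaData'               => $stmt->fetchAll(PDO::FETCH_NUM),
));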
Use a cache store that allows auto-expiration after a certain length of time.
Memcached does it; Redis does too, I guess!
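For example, with PHP's Memcached extension it's only a few lines (the key name is arbitrary, and run_slow_top10_query() is a stand-in for your minute-long query):

$cache = new Memcached();
$cache->addServer('localhost', 11211);

$top10 = $cache->get('top10');
if ($top10 === false) {
    // Cache miss: run the slow query once, then keep the result for 24 hours.
    $top10 = run_slow_top10_query();       // hypothetical helper: your minute-long query goes here
    $cache->set('top10', $top10, 86400);   // 86400 seconds = 24h TTL
}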
I'm developing a system that will collect samples of user activity (opened a window, scrolled, entered a page, left a page, etc.), and I'm looking for the best way to store these samples and query them.
I'd prefer something smart where I can execute SQL-like GROUP BY queries (for example, give me all the window-open events grouped by date and hour), and of course something flexible enough in case I need to add columns in the future.
I'm trying to avoid having to think up front about every query I might need, and also to avoid just saving an aggregated version of the data by time, since I'd like to do drill-downs (for example, count all the window-open events by date and time, then see all the events in each time frame, or change it to group by unique userId).
Thanks.
PS: I currently use MySQL for this task, but the data is expected to grow rapidly. I've experimented with MongoDB as well.
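To make the kind of query concrete, here is roughly what I mean in MySQL terms (PHP/PDO and the column names are just for illustration):

$pdo  = new PDO('mysql:host=localhost;dbname=tracking', 'user', 'pass');
$stmt = $pdo->query(
    "SELECT DATE(created_at) AS day, HOUR(created_at) AS hour, COUNT(*) AS events
       FROM activity_samples
      WHERE event_type = 'window_open'
      GROUP BY day, hour"
);
foreach ($stmt as $row) {
    echo $row['day'] . ' ' . $row['hour'] . ":00 - " . $row['events'] . " events\n";
}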
I believe MongoDB can be a good solution. First of all, it's designed to hold big data, and it's really easy to use and scale (replica sets or sharding). The expression language is also solid; it's not as powerful as SQL, but still good enough. Here is a good link about mapping SQL commands to MongoDB.
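As a hedged illustration of that mapping, the GROUP BY from the question could become an aggregation pipeline. This sketch uses the old MongoClient driver for PHP and assumes the timestamp is stored as a BSON date; collection and field names are invented (the newer mongodb library has a similar aggregate() call):

$m      = new MongoClient();
$events = $m->selectDB('tracking')->selectCollection('activity_samples');

$result = $events->aggregate(array(
    array('$match' => array('event_type' => 'window_open')),
    array('$group' => array(
        '_id' => array(
            'year'  => array('$year'       => '$created_at'),
            'month' => array('$month'      => '$created_at'),
            'day'   => array('$dayOfMonth' => '$created_at'),
            'hour'  => array('$hour'       => '$created_at'),
        ),
        'events' => array('$sum' => 1),
    )),
));

foreach ($result['result'] as $row) {
    // $row['_id'] identifies the time bucket, $row['events'] is the count for that bucket.
}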
There are other alternatives, but I think they are either too complex or their expression language is not powerful enough.
Have a look at this link too, which can help you find the right solution for you.