Is it possible to fetch data from Redis from within MySQL (using a native function or some other mechanism)? I would like to be able to use this information in ORDER BY statements for paging with LIMIT. Otherwise I will have to fetch all the data from MySQL, fetch additional data for each row from Redis, sort in my application and keep the page I need.
It would be much more efficient if MySQL could, say, call a function for every row to get the data from Redis, do the sorting, and only send me the page I need.
Even if this is possible (with open source, everything is technically possible), it's unlikely to improve performance much over the cleaner approach of sorting within your app. If your data set is small, returning everything is not a problem. If your data set is large, you probably need to be sorting by an indexed column to get decent performance out of SQL, and you can't index a function.
Also, if the result set isn't huge, the dominant performance issue is usually latency rather than processing or data transfer. One query from SQL and one MGET from Redis should be reasonably quick.
If you really want just one query when a page is viewed, you will need to have both the record data and the sorting data in one place - either add the data from Redis as a column in SQL or cache your queries in Redis lists.
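As a rough illustration of the "one query from SQL and one MGET from Redis" idea, here is a minimal Python sketch; the table, column and key names are hypothetical, and the sorting and paging happen entirely in the application:

import redis
import mysql.connector

db = mysql.connector.connect(host="localhost", user="app", password="...", database="app")
r = redis.Redis(host="localhost", port=6379)

cur = db.cursor(dictionary=True)
cur.execute("SELECT id, title FROM items WHERE category_id = %s", (42,))
rows = cur.fetchall()

# One MGET fetches every per-row score in a single round trip (key layout is assumed).
scores = r.mget(["item:%d:score" % row["id"] for row in rows])
for row, score in zip(rows, scores):
    row["score"] = float(score) if score is not None else 0.0

# Sort in the application and keep only the requested page.
page, per_page = 2, 50
rows.sort(key=lambda row: row["score"], reverse=True)
page_rows = rows[(page - 1) * per_page : page * per_page]

If the two round trips ever become the bottleneck, the alternative above (mirroring the Redis value as an indexed column in MySQL) collapses the whole thing back into a single ORDER BY ... LIMIT query.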
Related
I am deciding between MongoDB and MySQL for my next application. I'll use Elasticsearch for search queries, so search performance isn't a concern.
For these three operations - insert, update and delete - which one is faster, MongoDB or MySQL?
Thanks.
If you are inserting/updating/deleting an object, like a customer or an order, then I would assume it would be slower via SQL, because that data would typically be normalized, so you would need to split the object into its normalized form to insert it.
The real answer is to load test the different tools against your needs and see which works out for you, but I suspect either one will work well unless you're dealing with big, big data.
I have a Laravel web app that's using a VueJS front-end and MySQL as the RDBMS. I currently have a table that is 23.8 GB, contains over 8 million rows, and is growing every second.
When querying this data, I'm joining it to 4 other tables so the entire dataset is humongous.
I'm currently only pulling and displaying 1,000 rows, as I don't need any more than that. VueJS shows the data in a table, and there are 13 filter options the user can select from to filter the data: date, name, status, and so on.
Using Eloquent and having MySQL indexes in place, I've managed to get the query time down to a respectable time but I need this section of the app to be as responsive as possible.
Some of the WHERE clauses triggered by the filters are taking 13 seconds to execute, which I feel is too long.
I've been doing some reading and I'm thinking MongoDB or Redis may be an option, but I have very little experience with either.
For this particular scenario, what do you think would be the best option to maximise read performance?
If I were to use MongoDB, I wouldn't migrate the current data... I'd basically have a second database that contains all the new data. This app hasn't gone into production yet and in most use cases, only the last 30 days worth of data will be required but the option to query old data is still required hence keeping both MySQL and MongoDB.
Any feedback will be appreciated.
Try using Elasticsearch. It will speed up the read process.
Try converting the query into a stored procedure. You can execute the stored procedure like this:
DB::select('exec stored_procedure("Param1", "param2",..)');
or
DB::select('exec stored_procedure(?,?,..)',array($Param1,$param2));
For a call without parameters, try this:
DB::select('EXEC stored_procedure');
Try using EXPLAIN to optimise the performance.
How to optimise MySQL queries based on EXPLAIN plan
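If it helps, here is a small sketch (Python with mysql-connector; the query and column names are placeholders) of what using EXPLAIN looks like in practice - run EXPLAIN on the exact SQL your filters generate and check whether an index is actually used:

import mysql.connector

db = mysql.connector.connect(host="localhost", user="app", password="...", database="app")
cur = db.cursor(dictionary=True)

sql = """SELECT *
         FROM readings
         WHERE status = %s AND created_at >= %s
         ORDER BY created_at DESC
         LIMIT 1000"""
cur.execute("EXPLAIN " + sql, ("active", "2019-01-01"))
for row in cur.fetchall():
    # type = 'ALL' means a full table scan, key shows the index chosen (if any),
    # and rows is the optimiser's estimate of how many rows will be examined.
    print(row["table"], row["type"], row["key"], row["rows"])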
I've got a Ruby script which imports XML files into a MySQL database. It does this by looping through the elements in the XML file and finally calling
table.where(
value: e['value'],
...
).first_or_create
The script has to process a lot of data, most of which is already in the database. Because of this, it runs really slowly, since first_or_create obviously triggers a lot of SELECT queries.
Is there any way to handle this more rapidly? Is it related to connection management?
Thanks
first_or_create is of course a convenience method, which doesn't care much about performance on a bigger data set.
Ensure all your indices are in place.
The first obvious way to increase performance: every create statement is wrapped in its own BEGIN/COMMIT transaction block, so that's three queries for one insert.
You can place your whole loop inside a single transaction block - that will gain you some time, as BEGIN and COMMIT will only be executed once.
Remember that the round trip to/from the database takes a considerable amount of time, so an obvious performance boost is to combine multiple statements into one. Try issuing one SELECT query that checks a batch of, say, 1,000 records. The DB will tell you which ones already exist - if, say, 200 don't, you can go ahead and build one INSERT statement for those 200 records.
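To make the batching idea concrete, here is a rough sketch in Python (the original script is Ruby/ActiveRecord, and the table and column names here are hypothetical): one SELECT checks a whole batch, one multi-row INSERT creates only the missing records, and each batch is committed in a single transaction.

import mysql.connector

db = mysql.connector.connect(host="localhost", user="app", password="...", database="app")
cur = db.cursor()

def import_batch(records):
    # records: list of dicts parsed from the XML, each with a unique "value".
    if not records:
        return
    values = [r["value"] for r in records]
    placeholders = ", ".join(["%s"] * len(values))
    cur.execute("SELECT value FROM items WHERE value IN (%s)" % placeholders, values)
    existing = {row[0] for row in cur.fetchall()}

    missing = [r for r in records if r["value"] not in existing]
    if missing:
        # One multi-row INSERT instead of one INSERT (plus BEGIN/COMMIT) per record.
        cur.executemany(
            "INSERT INTO items (value, name) VALUES (%s, %s)",
            [(r["value"], r["name"]) for r in missing],
        )
    db.commit()  # a single commit per batch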
Always measure, and always try to formulate what level of performance you are trying to achieve, so you don't make the code more complex than it needs to be.
It's better to first work out which records actually need to be created: check whether each record is already in the DB, and only create the ones that aren't.
I was wondering: when we call the Perl DBI APIs to query a database, are all the results returned at once? Or do we get the result set partially, retrieving more and more rows from the database as we iterate?
The reason I am asking is that I noticed the following in a Perl script.
I did a query against a database which returns a really large number of records. After getting these records, I did a for loop over the results and created a hash from the data.
What I noticed is that the actual query to the database returned in a reasonable amount of time (the result set was large), but the big delay was in looping over the data to create the hash.
I don't understand this. I would expect the query to be the slow part, since the for loop and the construction of the hash happen in memory and should be cheap.
Any explanation/idea why this happens? Am I misunderstanding something basic here?
Update
I understand that MySQL caches data, so when I run the same query multiple times it will be faster from the second time on. But I still would not expect the for loop over the in-memory data set to take as long as (or longer than) the query to the MySQL DB.
Assuming you are using DBD::mysql, the default is to pull all the results from the server at once and store them in memory. This avoids tying up the server's resources and works fine for the majority of result sets as RAM is usually plentiful.
That answers your original question, but if you would like more assistance, I suggest pasting code - it's possible your hash building code is doing something wrong, or unnecessary queries are being made. See also Speeding up the DBI for tips on efficient use of the DBI API, and how to profile what DBI is doing.
I have read and followed the instructions here:
What is an efficient method of paging through very large result sets in SQL Server 2005? What has become clear is that I'm ordering by a non-indexed field - this is because it's a field generated from calculations, so it does not exist in the database.
I'm using the row_number() technique and it works pretty well. My problem is that my stored procedure does some pretty big joins on a fair bit of data and I'm ordering by the results of these joins. I realise that each time I page it has to call the entire query again (to ensure correct ordering).
What I would like (without pulling the entire result set into the client code and paging there) is for SQL Server, once it has the whole result set, to be able to page through it.
Is there any built-in way to achieve that? I thought views might do this, but I can't find any information on it.
EDIT: Indexed views will not work for me as I need to pass in parameters. Has anyone got any more ideas? I think I either have to use memcached or have a service that builds indexes in the background. I just wish there were a way for SQL Server to get that table and hold onto it while it is being paged...
I am not very familiar with paging, and without knowing the logic behind your procedure, I can only guess you'd benefit from indexed views or #temporary tables with indexes.
You mentioned you were ordering by a non-indexed, generated field; that, combined with the fact that your procedure runs the entire query every time, leads me to believe you could turn that query into an indexed view. You'd get better performance from accessing it multiple times, and it would also let you add an index on the field you're ordering by.
You could also use a #temporary table if it somehow stays alive across your paging requests... Insert the dataset you are working with into a #temporary table; you can then create an index on the generated column with T-SQL.
Indexed Views for SQL Server 2005: http://technet.microsoft.com/en-us/library/cc917715.aspx
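To make the #temporary table idea a bit more concrete, here is a rough sketch (Python with pyodbc; the tables, the computed column and the connection string are all hypothetical). The expensive joined/calculated result set is materialised once, the generated column is indexed, and pages are then served with ROW_NUMBER() (SQL Server 2005 has no OFFSET/FETCH). Bear in mind a #temp table only lives as long as its connection, so a real implementation might need a persistent work table instead.

import pyodbc

conn = pyodbc.connect("DRIVER={SQL Server};SERVER=localhost;DATABASE=app;Trusted_Connection=yes",
                      autocommit=True)
cur = conn.cursor()

# Materialise the expensive joins and the calculated field once
# (dbo.ComputeScore stands in for whatever generates the ordering value).
cur.execute("""
    SELECT o.OrderId, c.Name, dbo.ComputeScore(o.OrderId) AS Score
    INTO #PagedResults
    FROM Orders o
    JOIN Customers c ON c.CustomerId = o.CustomerId
""")
# Index the generated column so the ORDER BY used for paging is cheap.
cur.execute("CREATE INDEX IX_PagedResults_Score ON #PagedResults (Score)")

def fetch_page(page, page_size=50):
    first = (page - 1) * page_size + 1
    last = page * page_size
    cur.execute("""
        SELECT OrderId, Name, Score
        FROM (SELECT OrderId, Name, Score,
                     ROW_NUMBER() OVER (ORDER BY Score DESC) AS rn
              FROM #PagedResults) AS numbered
        WHERE numbered.rn BETWEEN ? AND ?
        ORDER BY numbered.rn
    """, first, last)
    return cur.fetchall()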