I am developing a heavily used financial MySQL database (Django + SQLAlchemy) which is queried and manipulated constantly. My DB contains a lot of date-value pairs. I keep loading more and more data as time progresses, but historical values never change, which is why I think caching could really improve performance for me.
Is Beaker really my best option, or should I implement my own caching layer over Redis? I would love to hear some ideas for caching architectures - thanks!
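Something like the following is what I have in mind: since historical rows never change, a read-through cache never needs invalidation for old dates. This is just a minimal sketch assuming redis-py and a plain DB-API connection; the quotes table and the quote:<day> key scheme are made up, not my real schema.

import json
import redis  # assumes redis-py and a local Redis instance

r = redis.Redis()

def get_value(conn, day):
    """Return the stored value for a given day, caching immutable history."""
    key = "quote:%s" % day          # illustrative key scheme
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    cur = conn.cursor()
    cur.execute("SELECT value FROM quotes WHERE day = %s", (day,))
    row = cur.fetchone()
    if row is None:
        return None
    # Historical rows never change, so no TTL or invalidation is needed.
    r.set(key, json.dumps(row[0]))
    return row[0]

Because old rows are immutable, the only coherency question is the current day's value, which could simply be excluded from caching.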
The Mysql cache stores the text of a SELECT statement together with the corresponding result that was sent to the client.
If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again. The query cache is shared among sessions, so a result set generated by one client can be sent in response to the same query issued by another client.
Related problem
My question is this: I am developing a system and have settled on query caching for fast response times. Now I want to find out which kinds of web application traffic are a good fit for query caching and which are not, and what the downsides of query caching are.
Whether the Query Cache is good for you depends on:
which MySQL version you are using
the scale of your application
the kinds of queries you want to cache
How it works
If the MySQL Query Cache is enabled, MySQL won't go to the trouble of parsing and executing every query it receives. Whenever a query arrives, MySQL looks for an identical statement in the query cache; if it finds one, it skips parsing and executing the statement and returns the cached result set to the client directly.
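For example, on MySQL 5.7 or earlier with query_cache_type enabled, you can watch the cache work. This is a hypothetical snippet using mysql-connector-python; the prices table and credentials are made up, but Qcache_hits is a real server status variable:

import mysql.connector  # assumes mysql-connector-python

conn = mysql.connector.connect(user="app", password="secret",
                               database="mydb")   # placeholder credentials
cur = conn.cursor()

cur.execute("SELECT value FROM prices WHERE day = '2020-01-02'")
cur.fetchall()   # first run: parsed, executed, and the result set cached
cur.execute("SELECT value FROM prices WHERE day = '2020-01-02'")
cur.fetchall()   # byte-identical text: served straight from the query cache

cur.execute("SHOW STATUS LIKE 'Qcache_hits'")
print(cur.fetchall())   # the hit counter should have increased by one

Note that the statement must be byte-identical; even a changed literal or extra whitespace produces a cache miss.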
Issues & Limitations
Please do remember that the cache stores the text of the query together with its complete result set, so you will not receive old/stale data from a cached query. The point to be made here is that if the underlying tables of a cached query undergo any change, every cached entry that uses those tables is invalidated.
Among other things, there are serious limitations to the Query Cache. It is not used for queries that run inside stored procedures, functions, or triggers, nor for subqueries of an outer query.
It was once considered a great tool for speeding up queries, but the MySQL development team has since decided to retire the feature because of scalability issues they found with the query cache.
Do read this article on the MySQL Server Team's blog about retiring the Query Cache in MySQL 8.0.
I have a few sets of APIs written in CakePHP which we want to migrate to Amazon AWS.
Following is the current situation:
Website is hosted on GoDaddy as shared hosting with domain, for example: democompany.com
Backend database is MySQL which we access via PhpMyAdmin. It has several tables e.g. users, plans, purchases etc.
All API's are written in CakePHP which we access via base URL:
democompany.com/cake
For example, for adding an entry to the users table, we create a JSON payload and send it via the REST API.
Now, since our user base is growing, our API response times have slowed; POST and GET requests take a long time to return a response.
We were thinking of migrating our APIs and database to Amazon AWS or some other solution. I am not very familiar with AWS, so I don't know which product would be best.
Which solution would be best, offer fast responses, and be cost-effective?
A slow MySQL database behind a PHP backend can have many causes. Try these:
One of the most important things is to think about your indexes. You probably have a primary index on ID with auto_increment, but if you query a lot on another column (like SELECT * FROM users WHERE email = 'john@example.com') it is important to also set an index on the email column. Note that a pattern with a leading wildcard, such as LIKE '%john%', cannot use an index at all. How indexes work is vital knowledge if you want high-performing databases; see this post for a start: How do MySQL indexes work? (There is a sketch of adding and verifying an index after this list.)
Another thing is the amount and complexity of your queries. Do you use many queries in one page load or only a few? Try to get as much information as possible out of one query.
Sorting data can be extremely expensive as well. Does removing ORDER BY whatever speed things up a lot? Check this out: MYSQL, very slow order by
If you have looked at all of this and are sure that all your queries are running smoothly, you can look at persistent connections (re-using connections within one page load, for example), bigger servers, etc.
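As promised in the first point, here is a rough sketch of adding an index and verifying that MySQL actually uses it, assuming mysql-connector-python; the table, column, and index names are illustrative:

import mysql.connector  # assumes mysql-connector-python

conn = mysql.connector.connect(user="app", password="secret", database="mydb")
cur = conn.cursor()

# One-time: add an index on the column you filter by most.
cur.execute("CREATE INDEX idx_users_email ON users (email)")

# Verify the index is used for exact-match lookups; a leading-wildcard
# LIKE '%john%' would still show a full table scan here.
cur.execute("EXPLAIN SELECT * FROM users WHERE email = %s",
            ("john@example.com",))
for row in cur.fetchall():
    print(row)   # the 'key' column should read idx_users_email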
So I have a small game in node.js (only the server, of course) which has map data and player accounts stored in a MySQL database. Right now I have constructed it in a way that minimizes the number of queries made, by loading data from the database, keeping it in JavaScript objects/arrays or whatever seems appropriate, and only writing to the database when needed.
Now I was thinking: is this really worth it? In many cases it would be a lot better (the data would be safer and WAY more up-to-date) to keep hardly any data in the server and just load it from the database when needed (and write it when it needs to be changed).
My question is: is it efficient/safe/recommendable to have the server read/write from the database often, rather than keeping data from the database in JavaScript variables in the server?
Additional info:
-The nodejs server and my mysql server are on the same machine and a query usually takes less than 1ms or maybe 3ms for big queries like loading room data.
-I am using a module simply called mysql.
-If needed I will include extra info, just ask in a comment.
Really depends on your use case. Generally speaking, I would not add another layer of caching in node.js, but handle that in your DB with a bigger cache and optimized queries.
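For example, most of MySQL's "bigger cache" for InnoDB tables is the buffer pool. A hypothetical check and resize using mysql-connector-python; innodb_buffer_pool_size is a real server variable, but the credentials and the 1 GiB figure are placeholders:

import mysql.connector  # assumes mysql-connector-python

conn = mysql.connector.connect(user="root", password="secret")  # placeholders
cur = conn.cursor()

cur.execute("SHOW VARIABLES LIKE 'innodb_buffer_pool_size'")
print(cur.fetchall())   # RAM InnoDB may use to keep table/index pages hot

# MySQL 5.7+ can resize the pool at runtime (here: 1 GiB); on older
# versions, set innodb_buffer_pool_size in my.cnf and restart instead.
cur.execute("SET GLOBAL innodb_buffer_pool_size = %d" % (1 << 30))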
I would like to change my stats tracking system so that it no longer writes to the database directly, as we're hitting bottlenecks.
We're currently using memcached for certain aspects of the site, and I wanted to use it for storing stats and committing them to mysql DB periodically.
The issue, however, lies in the number of items (in the millions) for which stats could potentially be collected between the cron job runs that commit them to the database. Other than running a SELECT * FROM data, checking for the existence of every single memcache key, and then updating the table... is there any other way to do this?
(I'm not saying below is gospel, this is just my gut feeling. As said later on, I don't have the specifics of your system :) And obviously no offence meant etc :) )
I would advise against using memcached for this. Memcached is built to quickly retrieve values that you've fetched before, not to store values. The big difference is that if your cache gets full, you'll lose your data.
Normally you'd simply have no data in your cache and would recollect it from the source, which is impossible in this case. That alone would be a reason for me to try and dissuade you from this.
Now, you say the major problem is the MySQL connection limit you are hitting. If you do simple stuff (like what we talked about in the comments: INSERT DELAYED), it may just be a case of increasing the limit. You should probably have enough capacity for your scripts/users to go to the database once, say "this should eventually be added", and then go away. If your users can't even open one connection for that, there's a serious resource problem you probably won't fix by adding extra layers of cache.
Obviously this is hard to say without any specs of the system, software and hardware, but my suggestion would be to see if you can just let them open their connections by increasing the limit, and fiddle with the server variables a bit, instead of monkey-patching your system by using memcached as an in-between layer.
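If single inserts do turn out to be too chatty even after raising the limit, a middle ground is to buffer counters in the application and flush them in one multi-row statement, rather than going through memcached. A rough sketch, assuming mysql-connector-python and an illustrative stats table with a unique key on item_id:

import mysql.connector  # assumes mysql-connector-python

def flush_counts(conn, counts):
    """counts maps item_id -> hits accumulated in this process since the
    last flush; a single round trip writes them all."""
    if not counts:
        return
    rows = list(counts.items())
    placeholders = ", ".join(["(%s, %s)"] * len(rows))
    params = [value for pair in rows for value in pair]
    cur = conn.cursor()
    # Upsert: add the buffered hits onto whatever is already stored.
    cur.execute(
        "INSERT INTO stats (item_id, hits) VALUES " + placeholders +
        " ON DUPLICATE KEY UPDATE hits = hits + VALUES(hits)",
        params,
    )
    conn.commit()
    counts.clear()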
I had a similar issue with statistics data, but please don't use memcached for it. You can't be sure that ALL your items will be moved to the DB; you can lose data and/or double-process data.
You should analyse your bottleneck in terms of how much data you are writing/reading and how many connections you need, and then switch to something scalable like Hadoop, Cassandra, Scribe, or similar systems.
You need to provide additional information on the platform that you are running: O/S, database (version), storage engine, RAM, CPU (if possible).
Are you inserting into a single table or more than one table?
Can you disable the indexes on the tables you are inserting into? Index maintenance slows down inserts (see the sketch after these questions).
Are you running any triggers or stored procedures to compute values as you insert the raw data?
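To illustrate the index question above, here is a hypothetical bulk-load pattern; the raw_stats table is made up, and note that DISABLE KEYS only affects non-unique MyISAM indexes (InnoDB ignores it):

import mysql.connector  # assumes mysql-connector-python

conn = mysql.connector.connect(user="app", password="secret", database="mydb")
cur = conn.cursor()

# Skip non-unique index maintenance during the bulk load.
cur.execute("ALTER TABLE raw_stats DISABLE KEYS")

# ... run the bulk INSERTs here ...

# Rebuild the indexes once, which is far cheaper than per-row updates.
cur.execute("ALTER TABLE raw_stats ENABLE KEYS")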
OK so I'm kinda new to databases in general. I understand the basic theory behind them and have knocked up the odd Access DB here and there.
One thing I'm struggling to learn about is the specifics of how e.g. an SQL query accesses a database.
So say you have a scenario where there's a database on a LAN server (let's say it's MS Access for argument's sake). You run some SQL query or other on it from a client machine. Does the client machine have to download the entire database to run said query (even if the result of the query is just one line)? Or does it somehow manage to get just the data it wants to come down the ol' CAT5? Does the server have to be running anything to do that? I can't quite understand how the client could get JUST the query results without the server having to do some of the work...
I'm seeing two conflicting stories on this matter when googling stuff.
And so this follows on the next question (which may already be answered): if you CAN query a DB without having to get the whole damn thing, and without the server running any other software, can the same be done with a CSV? If not, why not?
Reason I ask is I'm developing an app for a mobile device that needs to talk to a db or CSV file of some kind, and it'll be updating records at a pretty high rate (barcode scanning), so don't want the network to grind to a halt (it's a slow bag of [insert relevant insult] as it is). The less data travelling from device to server, the better.
Thanks in advance
The various SQL servers are just that: servers. A database server is a program that listens for client queries and sends back responses. It is more than just its data.
A CSV file, or "flat file" is just data. There is no way for it to respond to a query by itself.
So, when you are on a network, your query is sent to the server, which does the work of finding the appropriate results. When you open a flat file, you're using the network and/or file system to read/write the entire file.
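To make the contrast concrete, here is an illustrative Python sketch (the file path and connection details are made up): with a flat file the client pulls every byte over the network and filters locally, while a database server filters on its side and returns only the matching rows.

import csv
import mysql.connector  # assumes mysql-connector-python

# Flat file on a network share: the client must read the WHOLE file
# across the wire and do the filtering itself.
with open(r"\\server\share\barcodes.csv", newline="") as f:
    hits = [row for row in csv.reader(f) if row[0] == "1234567890"]

# Database server: only the matching rows travel over the network.
conn = mysql.connector.connect(host="server", user="app",
                               password="secret", database="warehouse")
cur = conn.cursor()
cur.execute("SELECT code, scan_date FROM barcode_table WHERE code = %s",
            ("1234567890",))
hits = cur.fetchall()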
Edit to add a note about your specific usage. You'll probably want to use a database engine, as the queries are going to be the least amount of network traffic. For example, when you scan a barcode, your query may be as simple as the following text:
INSERT INTO barcode_table (code, scan_date, user) VALUES ('1234567890', '2011-01-24 12:00:00', '1');
The above string is handled by the database engine and the code (along with whatever relevant support data) is stored. No need for your application to open a file, append data to it, and close it. The latter becomes very slow once files get to a large size, and concurrency can become a problem with many users accessing it.
If your application needs to display some data to your user, it would request specific information the same way, and the server would generate the relevant results. So, imagine a scenario in which the user wants a list of products that match some filter. If your products were books, suppose the user requested a list by a specific author:
SELECT products.title, barcode_table.code
FROM products
JOIN barcode_table ON barcode_table.product_id = products.id  -- hypothetical linking column
WHERE products.author = 'Anders Hejlsberg'
ORDER BY products.title ASC;
In this example, only those product titles and their barcodes are sent from the server to the mobile application.
Hopefully these examples help make a case for using a structured database engine of some kind, rather than a flat file. The specific flavor and implementation of database, however, is another question unto itself.
Generally speaking, relational databases are stored on a remote server and you access them via a client interface. Each database vendor provides client software that you install on your own machine to access the database on the server. The entire DB is not sent back to the client when a query is executed, although the server can send very large result sets if you are not careful about how you structure your query. The flow generally looks like this:
A database server listens for clients to connect
A client connects and issues a SQL command to the database
The database builds a query plan to figure out how to get the result
The plan is executed and the results are sent back to the client.
CSV is simply a file format, not a fully functional platform like a relational database.