I could not find an answer to this while going through the documentation. If I run a query on the Query node of Couchbase, does Couchbase automatically split the query into multiple parts, based on the cores available on the query node, and run each part of the SQL in parallel? For example, if I have a query such as the one below:
SELECT * FROM bucket1;
And the query node has 8 cores, does the above query get split into 8 parts?
Example query:
SELECT *
FROM bucket1
WHERE ...;
The above query first needs to be parsed, prepared, and planned. These steps all run serially. The plan generates various operators (Authorize, IndexScan, Fetch, Filter, Projection, ...).
During execution, each of those operators runs independently, and they work in parallel on different documents, assuming there is no blocking operation (such as grouping, aggregation, or sorting). Within the Fetch, it uses as many cores as possible to get the data.
Check the EXPLAIN output: it contains Parallel operators. By default the parallelism is 1; you can change it by setting the query parameter max_parallelism, which changes how many copies of those operators run in parallel (each working on different documents). Setting a higher value can hurt performance because of context switching.
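As a minimal sketch of passing max_parallelism with a request, the Python below builds (but does not send) a POST for the query service REST endpoint. The host, port 8093, and the value 4 are assumptions for illustration; a real cluster would also need credentials, which are left out here.

```python
import json
import urllib.request


def build_query_request(statement, max_parallelism, base_url="http://localhost:8093"):
    """Build, without sending, a POST request for the query service.

    base_url is an assumed local query node; adjust it for a real cluster.
    """
    body = json.dumps(
        {"statement": statement, "max_parallelism": max_parallelism}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/query/service",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example: ask for 4 copies of the parallelizable operators (assumed value).
req = build_query_request("SELECT * FROM bucket1", max_parallelism=4)
```

Sending it would be a matter of adding authentication and calling urllib.request.urlopen(req). Whether a higher value helps should be measured, given the context-switching cost mentioned above.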
The following two links have diagrams that explain this in more detail.
https://docs.couchbase.com/server/5.0/architecture/querying-data-with-n1ql.html
https://dzone.com/articles/new-performance-tricks-with-n1ql
NOTE: The Community Edition (CE) of the query service uses only 4 cores.
We are currently doing a lot of small queries. We execute a query, read the results, and then execute the next one. Since network requests cost a lot of time, this ping-ponging gets slow very fast.
This is why we want to do multiple queries at once, sending all data that the SQL server must know to it, and only retrieving one result (consisting of multiple result sets).
We found that Qt 5.14.1's QSqlQuery has the nextResult() function, but the documentation says:
Some databases may execute all statements at once while others may delay the execution until the result set is actually accessed, [...].
MY QUESTION:
So, does MySQL Server 8.0 delay the execution until the result set is actually accessed? If this is the case, then we still have a ping-pong for every query, right? That would still be very slow.
P.S. Our current solution to have just 1 ping-pong is to union the different result sets (resulting in a kind of block diagonal matrix with lots and lots of null values), and this question is meant to find a better way to do this.
I am using SnappyData and SQL to run some analysis; however, the job is slow and involves join operations on very large input data.
I am considering partitioning the input data first, then running the jobs on different partitions at the same time to speed up the process. But in the embedded mode I am using, my code gets the SnappySession passed in, and I can use bin/snappy-sql to query the tables, so I assume all SnappyData jobs would share the same SnappySession (or the same table namespace, like the same database in PostgreSQL, in my understanding).
So I assume that if I submit my job using the same jar with different input arguments, the table namespace would be the same for the different jobs, thus causing errors.
So my question is: is it possible to have multiple SnappySessions (or multiple namespaces, like database names) that run a series of operations independently, preferably in one SnappyData job, to avoid managing many jobs at the same time?
I am not sure I follow the question. Maybe this will help:
When queries are submitted using snappy-sql, the shell uses JDBC to connect and run the query. Internally, SnappyData will start a job and run concurrent tasks on each partition, depending on the query. And, yes, this SQL session is internally associated with a unique SnappySession (Spark session).
Or, maybe, you are trying to partition the data across many tables and start processing these tables independently but in parallel?
In Oracle we can create a table, insert data, and select it with the parallel option.
Is there any similar option in MySQL? I am migrating from Oracle to MySQL, and my system has more selects and fewer data changes, so any option to select in parallel is what I am looking for.
E.g.: let's consider that my table has 1 million rows. If I use the parallel(5) option, then five threads run the same query with a limit, each fetching approximately 200K rows, and as the final result I get 1 million records in 1/5th of the usual time.
In short, the answer is no.
The MySQL server is designed to execute concurrent user sessions in parallel, but not to execute one given user session in several parts in parallel.
This is a personal opinion, but I would refrain from wanting to apply optimizations up front, making assumptions about how the RDBMS works. Better measure the query first, and see if the response time is a real concern or not, and only then investigate possible optimizations.
"Premature optimization is the root of all evil." (Donald Knuth)
MySQL always runs queries from different connections in parallel. If you want to run different queries simultaneously from your program, however, you need to open separate connections, for example through workers that your program accesses asynchronously.
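A rough sketch of the "one connection per worker" idea, combined with the range split from the question (1 million rows in 5 parts is about 200K each). To stay self-contained it uses Python's built-in sqlite3 as a stand-in for MySQL; the table t, its columns, and the helper names are all hypothetical. With MySQL you would open one driver connection per worker instead, and any speed-up is not guaranteed and should be measured.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_ranges(total_rows, parts):
    """Split ids 1..total_rows into `parts` contiguous (low, high) ranges."""
    size, extra = divmod(total_rows, parts)
    ranges, low = [], 1
    for i in range(parts):
        # The first `extra` ranges take one additional row each.
        high = low + size - 1 + (1 if i < extra else 0)
        ranges.append((low, high))
        low = high + 1
    return ranges


def fetch_range(db_path, low, high):
    """Run one chunk query on its own connection (one connection per worker)."""
    conn = sqlite3.connect(db_path)
    try:
        sql = "SELECT id, val FROM t WHERE id BETWEEN ? AND ? ORDER BY id"
        return conn.execute(sql, (low, high)).fetchall()
    finally:
        conn.close()


def parallel_select(db_path, total_rows, parts):
    """Fetch all rows by running `parts` range queries concurrently."""
    ranges = make_ranges(total_rows, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        chunks = pool.map(lambda r: fetch_range(db_path, *r), ranges)
        # pool.map yields results in range order, so rows stay sorted by id.
        return [row for chunk in chunks for row in chunk]


def demo():
    """Create a small throwaway table and read it back in 5 parallel chunks."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
        conn.executemany(
            "INSERT INTO t VALUES (?, ?)",
            [(i, f"row{i}") for i in range(1, 1001)],
        )
        conn.commit()
        conn.close()
        return parallel_select(path, 1000, 5)
```

Splitting on a key range rather than on LIMIT/OFFSET keeps each chunk query able to use the primary key index. It assumes the ids are dense, which may not hold for a real table.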
You could also run tasks through creating events or using delayed inserts, however I don't think that applies very well here. Something else to consider:
Generally, some operations are guarded between individual query
sessions (called transactions). These are supported by InnoDB
backends, but not MyISAM tables (but it supports a concept called
atomic operations). There are various levels of isolation which differ
in which operations are guarded from each other (and thus how
operations in one parallel transaction affect another) and in their
performance impact. - Holger Just
He also mentions the MySQL transactions page, which briefly goes over the different engine types available to MySQL (MyISAM being faster, but not as reliable):
MySQL Transactions
While working with MySQL and some really "performance greedy queries", I noticed that such a greedy query can take 2 or 3 minutes to compute. But if I retry the query immediately after it has finished the first time, it takes only a few seconds. Does MySQL store something like "the last x queries"?
The short answer is yes: there is a Query Cache.
The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again. The query cache is shared among sessions, so a result set generated by one client can be sent in response to the same query issued by another client.
(from the MySQL Reference Manual)
The execution plan for the query will be calculated and re-used. The data can be cached, so subsequent executions will be faster.
Yes, depending on how the MySQL Server is configured, it may be using the query cache. This stores the results of identical queries until a certain limit (which you can set if you control the server) has been reached. Read http://dev.mysql.com/doc/refman/5.1/en/query-cache.html to find out more about how to tune your query cache to speed up your application if it issues many identical queries.
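To make the idea concrete, here is a toy sketch of a result cache keyed by the exact statement text. It is not how the MySQL server implements its query cache: it uses Python's built-in sqlite3 as a stand-in database, the class and table names are hypothetical, and it simply drops every cached result on any write, which is cruder than per-table invalidation.

```python
import sqlite3


class ToyQueryCache:
    """Cache SELECT results by statement text; clear the cache on writes."""

    def __init__(self, conn):
        self.conn = conn
        self.results = {}  # statement text -> list of result rows
        self.hits = 0

    def select(self, sql):
        # Only a byte-for-byte identical statement counts as a hit.
        if sql in self.results:
            self.hits += 1
            return self.results[sql]
        rows = self.conn.execute(sql).fetchall()
        self.results[sql] = rows
        return rows

    def write(self, sql, params=()):
        self.conn.execute(sql, params)
        self.conn.commit()
        # Simplification: any write invalidates every cached result.
        self.results.clear()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
cache = ToyQueryCache(conn)
cache.write("INSERT INTO t VALUES (?, ?)", (1, "a"))

first = cache.select("SELECT val FROM t ORDER BY id")   # executed
second = cache.select("SELECT val FROM t ORDER BY id")  # served from cache
```

The second call never touches the database, which is the effect described above for a repeated identical query. A statement that differs even in whitespace would miss this toy cache.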
I have the following scenario:
I have a database with a particular MyISAM table of about 4 million rows. I use stored procedures (MySQL version 5.1), and one in particular to search through these rows on various criteria. This table has several indexes on it, and the queries through this stored procedure are normally very fast (<1s). Basically I use a prepared statement and create and execute some dynamic SQL in this search stored procedure. After executing the prepared statement, I perform "DEALLOCATE PREPARE stmt;"
Most of the queries run in under a second (I use LIMIT to get just 15 rows at any time). However, there are some rare queries which take longer to run (say 2-3s). I have optimized the searched table as far as I can.
I have developed a web application and I can run and see the results of the fast queries in under a second on my development machine.
However, if I open two browser instances and do a simultaneous search (against the development machine), one with the longer running query, and the other with the faster query, the results are returned at the same time, i.e. it seems as if the fast query waits for the slower query to finish before returning the results. i.e. both queries will take 2-3 seconds...
Is there a reason for this? Because I thought that MyISAM handles SELECTS irrespective of one another and currently this is not the behaviour I am experiencing...
Thanks in advance!
Tim
This is just because you are doing it from the same machine; if the searches were coming from two different machines, they would run at the same time. Would you really like one person to be able to bog down your MySQL server just by opening a bunch of browser windows and hitting refresh?
That is right. Each SELECT query on a MyISAM table locks the entire table until it is finished. Their excuse is that this achieves "a very high read throughput". Switching to InnoDB will allow concurrent reads.