We want to create MySQL databases for Cadence. Assuming we want 10 shards for Cadence, should we create a set of MySQL Cadence tables for each shard? And if we want to use 5 machines to host the MySQL databases for those 10 shards, how should we do that?
Assuming we want 10 shards for Cadence, should we create a set of MySQL Cadence tables for each shard?
No. You will only need one set of tables for the whole Cadence cluster.
The sharding mechanism is implemented within the Cadence server. Unless you have a sharded MySQL solution, you don't need to worry about sharding at all when setting up the database schema.
If you do have a sharded MySQL, just make sure to use shardID as the partition (sharding) key for the tables (see the sketch after this answer).
Sharding in Cadence is only needed for the History service (that's why it's called numHistoryShards in the config).
More reading about sharding:
https://cadenceworkflow.io/docs/operation-guide/setup/#static-configuration
Typically you will need 2K shards in production if MySQL is the database.
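For the sharded-MySQL case above, here is a minimal sketch of what keying a table by shardID could look like, using MySQL's built-in HASH partitioning. The table and columns are illustrative only, not Cadence's actual schema; the point is that shard_id leads every unique key so a partitioned or sharded MySQL can route on it.

    -- Illustrative sketch only; Cadence ships its own schema files.
    CREATE TABLE executions_sketch (
      shard_id    INT          NOT NULL,
      domain_id   BINARY(16)   NOT NULL,
      workflow_id VARCHAR(255) NOT NULL,
      run_id      BINARY(16)   NOT NULL,
      data        MEDIUMBLOB   NOT NULL,
      PRIMARY KEY (shard_id, domain_id, workflow_id, run_id)  -- shard_id first
    )
    PARTITION BY HASH (shard_id) PARTITIONS 10;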
Here are some references that might help you for MySQL and PostgreSQL.
I have some questions before implementing the following scenario:
I have Database A (it contains multiple tables with lots of data and is queried by multiple clients).
This database contains a users table on which I need to create some triggers, but the database is managed by a partner and we don't have permission to create triggers on it.
Database B is managed by me. It is much lighter and queried from only one source, and I need access to the users table data from Database A so I can create triggers and take action on every update, insert, or delete in Database A's users table.
My main concern is: how could this federated table impact performance on Database A? Database B is not the problem.
Both databases are in the same geographic location, just on different servers.
My goal is to make it possible to take action on every transaction in Database A's users table.
Queries that read federated tables definitely have performance issues.
https://dev.mysql.com/doc/refman/8.0/en/federated-usagenotes.html says:
A FEDERATED table does not support indexes in the usual sense; because access to the table data is handled remotely, it is actually the remote table that makes use of indexes. This means that, for a query that cannot use any indexes and so requires a full table scan, the server fetches all rows from the remote table and filters them locally. This occurs regardless of any WHERE or LIMIT used with this SELECT statement; these clauses are applied locally to the returned rows.
Queries that fail to use indexes can thus cause poor performance and network overload. In addition, since returned rows must be stored in memory, such a query can also lead to the local server swapping, or even hanging.
(emphasis mine)
The reason the federated engine was created was to support applications that need to write to tables at a rate greater than a single server can support. If you are inserting to a table and overwhelming the I/O of that server, you can use a federated table so you can write to a table on a different server.
Reading from federated tables is likely to be worse than reading local tables, and cannot be optimized with indexes.
If you need good performance, you should use replication or a CDC tool, to maintain a real table on server B that you can query as a local table, not a federated table.
Another solution would be to cache the users table in the client application, so you don't have to read it on every query.
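For reference, this is roughly what the federated table in the question would look like on server B. The column list must mirror the remote users table, and every name and credential below is a placeholder; note also that the FEDERATED engine is disabled by default and must be enabled at server startup.

    -- On Database B: a FEDERATED table that forwards all access to A's users table.
    -- No data is stored locally; every read/write goes over the network to server A.
    CREATE TABLE users_federated (
      id    INT          NOT NULL,
      name  VARCHAR(255) NOT NULL,
      email VARCHAR(255) NOT NULL,
      PRIMARY KEY (id)
    )
    ENGINE=FEDERATED
    CONNECTION='mysql://fed_user:fed_pass@server-a.example.com:3306/database_a/users';

Also note that a trigger created on this table on B fires only for statements run against B; it will not fire when clients change the users table directly on A, which is one more reason replication or CDC fits the stated goal better.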
I need to drop a column from a big table (about 20 GB) in a production MySQL database,
but I don't want the MySQL server to hang or to put the production database at risk.
This table is using the InnoDB engine and it contains around 10,000,000 records.
The best possible way, as far as I know, is to use a MASTER-MASTER setup in MySQL.
You can modify MASTER1 first while production traffic uses only MASTER2, then switch over and do the same on MASTER2.
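As a sketch of the per-node step (the table and column names below are made up): run the DDL on the master that is currently out of the production rotation, with binary logging disabled for the session so the ALTER does not replicate to the node still serving traffic.

    -- Run on the idle master only.
    SET SESSION sql_log_bin = 0;  -- keep this DDL out of the binlog/replication stream
    ALTER TABLE big_table
      DROP COLUMN obsolete_col,
      ALGORITHM=INPLACE, LOCK=NONE;  -- InnoDB online DDL: permits concurrent DML
    SET SESSION sql_log_bin = 1;

Once traffic has been switched, repeat the same statements on the other master.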
We have a MySQL database based on InnoDB. We are looking to build an analytics system for this data. We are thinking of creating a cloned database that denormalizes the data to avoid joins and uses MyISAM for faster querying. This second database would also avoid putting extra load on the main database, to which the data is written.
Apart from this, we are also creating some extra tables that will store aggregated numbers to avoid recalculation.
I am wondering how I can sync these tables once a day to keep them updated. It looks similar to a master-slave config of MySQL, which uses the binary log. But in our case, the second database is not an exact slave. Are there any reliable open-source tools, or any other ideas, that I can use to write an 'update mechanism'?
Thanks in advance.
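To make the aggregate tables concrete, a daily refresh of one such rollup might look like the hypothetical sketch below (all table and column names are invented; the day column is assumed to be the primary key, so re-runs are idempotent).

    -- Hypothetical nightly rollup on the analytics copy.
    INSERT INTO daily_order_totals (day, total_orders, total_amount)
    SELECT DATE(created_at), COUNT(*), SUM(amount)
    FROM orders
    WHERE created_at >= CURRENT_DATE - INTERVAL 1 DAY
      AND created_at <  CURRENT_DATE
    GROUP BY DATE(created_at)
    ON DUPLICATE KEY UPDATE
      total_orders = VALUES(total_orders),
      total_amount = VALUES(total_amount);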
What is the best strategy to make a clustered MySQL deployment in which some tables of the DB are placed on one node and some other tables are placed on another node while acting as a single coherent DB from the application's perspective?
Let's say if I have 2 data nodes A and B, and a database with 5 tables, I want tables 1, 2, and 3 to be placed on node A and tables 4 and 5 to be placed on node B.
Do we need this deployment to be a clustered deployment, or would a typical MySQL deployment handle this? If yes, how so?
How about having table 4 replicated on both A and B?
MySQL will allow for transparent access to tables stored on other instances using the federated engine (this has been available for a long time).
MySQL does provide a feature called partitioning - which is applied to tables to distribute the data across different filesystems - but this is something very different.
How about having table 4 replicated on both A and B?
You can set up MySQL replication to only copy specific tables (see replicate-wild-do-table); an example follows below. However, mixing federation and replication is going to get very confusing very quickly: get it wrong and you will trash your data. Use one or the other, not both.
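For illustration, the filter mentioned above can go in the replica's my.cnf as replicate-wild-do-table=mydb.table4, or be set dynamically on MySQL 5.7+ as below. The database and table names are placeholders.

    -- On node B, acting as a replica of node A (MySQL 5.7+).
    STOP SLAVE SQL_THREAD;  -- the filter can only change while the SQL thread is stopped
    CHANGE REPLICATION FILTER REPLICATE_WILD_DO_TABLE = ('mydb.table4');
    START SLAVE SQL_THREAD;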
I have a relatively light query that needs information from a local MySQL table along with another MySQL table that is stored on a physically separate machine (on the same network). I'm keen to avoid setting up replication just to facilitate this light query, which only needs to be executed once a day.
Is there any way that I can join with a table on a remote machine using one query? Or run a SELECT INTO into a local table.
Notes
I'm using C# & .NET 4.
This can be done by using the FEDERATED storage engine for the remote table. See https://dev.mysql.com/doc/refman/8.0/en/federated-storage-engine.html for details.
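A minimal sketch of that approach, with all names, columns, and connection details as placeholders:

    -- Local definition mirroring the remote table; the data stays on the remote machine.
    -- Requires the FEDERATED engine to be enabled on the local server.
    CREATE TABLE remote_orders (
      id      INT           NOT NULL,
      user_id INT           NOT NULL,
      amount  DECIMAL(10,2) NOT NULL,
      PRIMARY KEY (id)
    )
    ENGINE=FEDERATED
    CONNECTION='mysql://app_user:app_pass@remote-host:3306/remote_db/orders';

    -- Join it with a local table in a single query...
    SELECT u.name, SUM(o.amount)
    FROM local_users AS u
    JOIN remote_orders AS o ON o.user_id = u.id
    GROUP BY u.name;

    -- ...or pull a snapshot into a pre-created local table (the "SELECT INTO" idea).
    INSERT INTO orders_snapshot SELECT * FROM remote_orders;

Since the query only runs once a day, the federated read cost discussed earlier on this page is usually tolerable.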