I have two databases with the same schema (dev/prod) hosted on different machines (and different hosts).
Is there any mechanism or tool whereby I can do a select against specific rows in one db and insert them into the other?
You can use MySQL's FEDERATED storage engine:
The FEDERATED storage engine lets you access data from a remote MySQL database without using replication or cluster technology. Querying a local FEDERATED table automatically pulls the data from the remote (federated) tables. No data is stored on the local tables.
So, to create a connection:
CREATE TABLE federated_table (
id INT(20) NOT NULL AUTO_INCREMENT,
name VARCHAR(32) NOT NULL DEFAULT '',
other INT(20) NOT NULL DEFAULT '0',
PRIMARY KEY (id),
INDEX name (name),
INDEX other_key (other)
)
ENGINE=FEDERATED
DEFAULT CHARSET=latin1
CONNECTION='mysql://fed_user@remote_host:9306/federated/test_table';
Having defined such a table, you could then perform INSERT ... SELECT as you see fit:
INSERT INTO federated_table SELECT * FROM local_table WHERE ...
Or
INSERT INTO local_table SELECT * FROM federated_table WHERE ...
If you are federating multiple tables from the same server, you may wish to use CREATE SERVER instead.
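For reference, a minimal CREATE SERVER sketch (host, port and credentials are placeholders matching the connection string above); the FEDERATED table then refers to the server by name instead of embedding credentials:

```sql
CREATE SERVER fedlink
  FOREIGN DATA WRAPPER mysql
  OPTIONS (USER 'fed_user', HOST 'remote_host', PORT 9306, DATABASE 'federated');

-- The CONNECTION string becomes 'server_name/remote_table':
CREATE TABLE federated_table (
  id INT(20) NOT NULL AUTO_INCREMENT,
  name VARCHAR(32) NOT NULL DEFAULT '',
  other INT(20) NOT NULL DEFAULT '0',
  PRIMARY KEY (id)
)
ENGINE=FEDERATED
CONNECTION='fedlink/test_table';
```

Changing the remote credentials then only requires an ALTER SERVER, rather than touching every federated table.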
more information added 2022-09-05.
I have a problem with an SQL query in a MariaDB environment. I have a query that runs a number of tests on the database to identify unprocessed/unverified records, or records with contradicting data. The record identifiers that need to be checked are collected in one field (named fam_list) using a group_concat. The group_concat properly collects all record identifiers as long as no group by is used; the issue appears when group by is added to the query. I want to group the record identifiers by responsible person and project code. After adding the group by clause, I retrieve several rows, but the field that should collect all record identifiers is empty in every row.
Additionally surprising is that the column counting the number of family identifiers (named fam_count), which is also an aggregate function, works correctly in all attempts.
The query works fine on my development machine, but fails on my production machine.
extra information 2022-09-05
Removing parts of the query suggests that the origin of the problem lies in the combination of the table fam_proj_link and the view Team_curr_members.
Building up the query from left to right continues to produce the expected result until Team_curr_members is added to the join; from that moment on, the group_concat column is empty. The table fam_proj_link can be joined with Team_curr_members directly, and even a query on fam_proj_link joined only to the view Team_curr_members still produces the empty group_concat column.
The simplified query that still fails is:
select 'uninsp' as checkname, l.proj_id, count(l.pb_fam_id), group_concat(l.pb_fam_id)
from PLS.fam_proj_link l left join Projects_2012.Team_curr_members t on(l.proj_id=t.proj_id and t.func_id=1)
where rank_id=-1
group by l.proj_id
I have tried modifying the join condition by doing a straight join, but the group_concat field is still empty. I have also tried redefining the join to use using(proj_id), but still an empty group_concat field.
In none of these cases does the mariadb server produce any error information and all the queries run as expected in the development server.
I checked the site for other uses of the group_concat function and found irregular behaviour on more pages, including pages that used to work correctly until shortly before my holidays.
I have run OPTIMIZE TABLE on the entire database and restarted the mariadb server package, but without improvement.
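One low-risk check, given the large version gap between the two servers (10.1 vs 10.8), is to compare the variables that influence GROUP_CONCAT and join execution on both machines; group_concat_max_len in particular truncates silently when exceeded, and its default differs between MariaDB versions:

```sql
-- Run on both servers and diff the output.
SHOW VARIABLES WHERE Variable_name IN
  ('version', 'sql_mode', 'group_concat_max_len', 'optimizer_switch');
```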
Could it be that there are damages in some database related files and if so, where do I have to look and what do I have to look for?
end of extra added information
Does anybody have any idea what could be the reason or what I can try to find out what goes wrong?
Query (for one of the tests):
This works in development and production:
select 'uninsp' as checkname, coalesce(user_id,0) as user_id, coalesce(firstsur,' none') as firstsur, l.proj_id, label, stat_name, stat_class, count(distinct pb_fam_id) as fam_count, group_concat(distinct pb_fam_id) as fam_list
from PLS.fam_proj_link l join Projects_2012.Proj_list_2018 p using(proj_id) join Projects_2012.Proj_status_reflist using(stat_id)
left join Projects_2012.Team_curr_members t on(p.proj_id=t.proj_id and t.func_id=1) left join UserData.people_view using(user_id)
where ref_code and stat_name!='cancelled' and rank_id=-1;
This works in development only, not in production:
select 'uninsp' as checkname, coalesce(user_id,0) as user_id, coalesce(firstsur,' none') as firstsur, l.proj_id, label, stat_name, stat_class, count(distinct pb_fam_id) as fam_count, group_concat(distinct pb_fam_id) as fam_list
from PLS.fam_proj_link l join Projects_2012.Proj_list_2018 p using(proj_id) join Projects_2012.Proj_status_reflist using(stat_id)
left join Projects_2012.Team_curr_members t on(p.proj_id=t.proj_id and t.func_id=1) left join UserData.people_view using(user_id)
where ref_code and stat_name!='cancelled' and rank_id=-1
group by user_id, proj_id;
Production machine:
Debian Linux running MariaDB 10.1.41
Development machine:
Manjaro Linux running MariaDB 10.8.3
Of course, I can work around the issue by abandoning group_concat in MariaDB, collecting the record identifiers in PHP arrays, and imploding each array before generating the output, but I assume that concatenation in MariaDB is faster than concatenation via arrays in PHP.
Relevant table definitions:
fam_proj_link
--
-- Table structure for table `fam_proj_link`
--
CREATE TABLE `fam_proj_link` (
`proj_id` smallint(5) UNSIGNED NOT NULL,
`pb_fam_id` int(10) UNSIGNED NOT NULL,
`originator` tinyint(1) NOT NULL DEFAULT '0',
`rank_id` tinyint(3) NOT NULL DEFAULT '-1',
`factsheet` tinyint(1) NOT NULL DEFAULT '0',
`cust_title` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- RELATIONS FOR TABLE `fam_proj_link`:
-- `rank_id`
-- `def_ranks` -> `rank_id`
-- `pb_fam_id`
-- `pb_fams` -> `pb_fam_id`
-- `proj_id`
-- `Projects` -> `proj_id`
--
--
-- Indexes for dumped tables
--
--
-- Indexes for table `fam_proj_link`
--
ALTER TABLE `fam_proj_link`
ADD PRIMARY KEY (`proj_id`,`pb_fam_id`) USING BTREE,
ADD KEY `originator` (`originator`),
ADD KEY `rank_id` (`rank_id`),
ADD KEY `pb_fam_id` (`pb_fam_id`),
ADD KEY `factsheet` (`factsheet`);
--
-- Constraints for dumped tables
--
--
-- Constraints for table `fam_proj_link`
--
ALTER TABLE `fam_proj_link`
ADD CONSTRAINT `fam_proj_link_ibfk_3` FOREIGN KEY (`rank_id`) REFERENCES `def_ranks` (`rank_id`) ON DELETE CASCADE ON UPDATE CASCADE,
ADD CONSTRAINT `fam_proj_link_ibfk_4` FOREIGN KEY (`pb_fam_id`) REFERENCES `pb_fams` (`pb_fam_id`) ON DELETE CASCADE ON UPDATE CASCADE,
ADD CONSTRAINT `fam_proj_link_ibfk_5` FOREIGN KEY (`proj_id`) REFERENCES `Projects_2012`.`Projects` (`proj_id`) ON DELETE CASCADE ON UPDATE CASCADE;
Proj_list_2018 is a view with the below stand-in structure:
All fields are indexed in the parent tables, except for label, which is a combination of a group_concat and a normal concat over fields in three different tables.
--
-- Stand-in structure for view `Proj_list_2018`
-- (See below for the actual view)
--
CREATE TABLE `Proj_list_2018` (
`proj_id` smallint(6) unsigned
,`priority` tinyint(4)
,`stat_id` tinyint(4) unsigned
,`ref_code` bigint(20)
,`label` mediumtext
);
Proj_status_reflist:
--
-- Table structure for table `Proj_status_reflist`
--
CREATE TABLE `Proj_status_reflist` (
`stat_id` tinyint(3) UNSIGNED NOT NULL,
`stat_group_id` tinyint(3) NOT NULL DEFAULT '1',
`stat_name` varchar(20) NOT NULL DEFAULT '-',
`stat_class` varchar(25) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- RELATIONS FOR TABLE `Proj_status_reflist`:
-- `stat_group_id`
-- `Proj_status_groups` -> `stat_group_id`
--
--
-- Indexes for dumped tables
--
--
-- Indexes for table `Proj_status_reflist`
--
ALTER TABLE `Proj_status_reflist`
ADD PRIMARY KEY (`stat_id`),
ADD UNIQUE KEY `stat_name_UNIQUE` (`stat_name`),
ADD KEY `stat_group_id` (`stat_group_id`);
--
-- AUTO_INCREMENT for dumped tables
--
--
-- AUTO_INCREMENT for table `Proj_status_reflist`
--
ALTER TABLE `Proj_status_reflist`
MODIFY `stat_id` tinyint(3) UNSIGNED NOT NULL AUTO_INCREMENT;
--
-- Constraints for dumped tables
--
--
-- Constraints for table `Proj_status_reflist`
--
ALTER TABLE `Proj_status_reflist`
ADD CONSTRAINT `Proj_status_reflist_ibfk_1` FOREIGN KEY (`stat_group_id`) REFERENCES `Proj_status_groups` (`stat_group_id`) ON UPDATE CASCADE;
Team_curr_members is a view with the below stand-in structure:
All fields are indexed in the parent table.
--
-- Stand-in structure for view `Team_curr_members`
-- (See below for the actual view)
--
CREATE TABLE `Team_curr_members` (
`proj_id` smallint(5) unsigned
,`func_id` tinyint(3) unsigned
,`user_id` smallint(5) unsigned
);
people_view is a view with the below stand-in structure:
--
-- Stand-in structure for view `people_view`
-- (See below for the actual view)
--
CREATE TABLE `people_view` (
`user_id` smallint(5) unsigned
,`synth_empl` int(1)
,`firstsur` varchar(77)
,`surfirst` varchar(78)
);
After trying many things, I noticed that the problem appeared in more group_concat queries, many of which had been running without problems for several years. Rebuilding the indexes of the database tables did not solve the issue. Since the software had not been altered during the last three years, I concluded that the database engine might be damaged.
I asked IT to create a new virtual server. After building a complete new server environment and migration of the databases and web interface to the new server all problems were solved.
Apparently the problem was not in the queries but in the server environment. I have not discovered which part of the old server environment caused the actual issue, but possibly one or more mariadb/php related files were damaged.
Thanks to those that took the effort to come with suggestions.
I have a question to know what is the best solution I should choose.
I have two MariaDB databases on different machines on the same gigabit network, both running MariaDB 10.1.8 on CentOS 7.
One is a Web database and the other is a FreeRadius database.
The Web database is around 8GB, and in Workbench I see around 18 InnoDB writes per second.
The Web database machine has a 50GB disk, 8GB RAM and 4 CPUs.
On PRTG, the Web database select sensor reports delays of around 140ms-203ms.
The Radius database is around 20GB, and in Workbench I see around 28 InnoDB writes per second.
The Radius database machine has a 100GB disk, 16GB RAM and 6 CPUs.
On PRTG, the Radius database select sensor reports delays of around 140ms-300ms.
I think the Radius database is generally busier than the Web database.
Now to the question: I need to create a table of user visits by day.
This table must be shared by the two databases, and both machines must be able to insert data into it.
Sometimes the Web server will insert first and sometimes the Radius server will; the Radius server will do most of the inserts on that table.
This is the table that I need to populate
CREATE TABLE `visitas` (
`idUsuario` int(10) unsigned NOT NULL,
`idInstalacion` int(10) unsigned NOT NULL,
`fVisita` date DEFAULT NULL,
`tAcceso` varchar(25) DEFAULT NULL,
`nUpdates` int(10) unsigned NOT NULL,
UNIQUE KEY `Visitas` (`idUsuario`,`idInstalacion`,`fVisita`),
KEY `idUsuario` (`idUsuario`),
KEY `idInstalacion` (`idInstalacion`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
So what I have created is a TRIGGER on the Radius server that inserts into visitas (with ON DUPLICATE KEY UPDATE). That works perfectly.
DELIMITER $$
create TRIGGER UpdateVisitas AFTER INSERT ON radacct
FOR EACH ROW
BEGIN
INSERT into visitas (
select u.department as idUsuario,u.company as idInstalacion,DATE(r.acctstarttime) as fVisita,'Radius' as tAcceso,0 as nUpdates from radacct r,userinfo u where r.username=u.username and r.radacctid=NEW.radacctid
) ON DUPLICATE KEY UPDATE visitas.tAcceso='Radius',nUpdates=(1+nUpdates) ;
END$$
Now the question: I need to FederatedX this table so that it can be inserted into from both the Radius server and the Web server, and joined with other tables on the Web server.
Inserts will be done mostly from Radius, and the SELECTs with joins for statistics will be done mostly from Web.
So, given all this information, is it better to create the visitas table on the Radius server and federate it on the Web server, or the other way around?
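A hedged sketch of the layout that usually performs better given that most inserts come from the Radius side: keep the real InnoDB table on the Radius server (so the trigger's inserts stay local) and define a FederatedX shadow of it on the Web server. The host, schema name and credentials below are placeholders:

```sql
-- On the Web server, pointing at the real visitas table on the Radius server:
CREATE TABLE `visitas` (
  `idUsuario` int(10) unsigned NOT NULL,
  `idInstalacion` int(10) unsigned NOT NULL,
  `fVisita` date DEFAULT NULL,
  `tAcceso` varchar(25) DEFAULT NULL,
  `nUpdates` int(10) unsigned NOT NULL,
  UNIQUE KEY `Visitas` (`idUsuario`,`idInstalacion`,`fVisita`),
  KEY `idUsuario` (`idUsuario`),
  KEY `idInstalacion` (`idInstalacion`)
) ENGINE=FEDERATED
  CONNECTION='mysql://web_user:secret@radius-host:3306/radius_db/visitas';
```

The trade-off is that the Web server's statistics joins will then pull visitas rows over the network on every query.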
Let me know if you need more information.
Thanks a lot!
When we try to create a table on a master-master MySQL MIXED replication setup, with a composite key containing an AUTO_INCREMENT column, the table is created on the master but not on the slave.
Here is the error we got on the slave side:
Error 'Incorrect table definition; there can be only one auto column and it must be defined as a key' on query. Default database: 'total_chamilo'. Query: 'CREATE TABLE `c_attendance_result` (
c_id INT NOT NULL,
id int NOT NULL auto_increment,
user_id int NOT NULL,
attendance_id int NOT NULL,
score int NOT NULL DEFAULT 0,
PRIMARY KEY (c_id, id)
) DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci'
The database is MyISAM.
The MySQL version is 5.5.40-0+wheezy1-log.
Surprisingly, we have tables matching the same schema working on the same servers, but created under another replication mode (statement) and/or an earlier MySQL version.
Does anyone know a way to fix this, if possible without changing the original query, since it is part of a large dump full of this kind of statement...
thanks,
A.
That looks very much like the slave is not properly configured and is using InnoDB instead of MyISAM. An InnoDB table with an AUTO_INCREMENT column requires at least one key in which the auto-increment column is the only or leftmost column; see the MySQL 5.5 reference manual. In your case the auto-increment column is the second column of the primary key.
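To make the difference concrete (a sketch, not part of the original dump): the composite-key form is legal for MyISAM but rejected by InnoDB unless the AUTO_INCREMENT column is the leftmost column of some key, which a single extra key satisfies:

```sql
-- Accepted by MyISAM only: auto-increment as the second column of the PK.
CREATE TABLE c_attendance_myisam (
  c_id INT NOT NULL,
  id INT NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (c_id, id)
) ENGINE=MyISAM;

-- InnoDB-compatible equivalent: give `id` its own key as well.
CREATE TABLE c_attendance_innodb (
  c_id INT NOT NULL,
  id INT NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (c_id, id),
  KEY (id)
) ENGINE=InnoDB;
```

If the dump cannot be edited, setting default_storage_engine=MyISAM on the slave, so that the effective engine matches the master, may be the least invasive fix.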
Is there anything I can change in the my.ini file to speed up "LOAD DATA INFILE"?
I have two MySQL 5.5 instances each of which has one identical table structured as follows:
CREATE TABLE `log_access` (
`_id` bigint(20) NOT NULL AUTO_INCREMENT,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`type_id` int(11) NOT NULL,
`building_id` int(11) NOT NULL,
`card_id` varchar(15) NOT NULL,
`user_key` varchar(35) DEFAULT NULL,
`user_name` varchar(25) DEFAULT NULL,
`user_validation` varchar(10) DEFAULT NULL,
PRIMARY KEY (`_id`),
KEY `log_access__user_key_timestamp` (`user_key`,`timestamp`),
KEY `log_access__timestamp` (`timestamp`)
) ENGINE=MyISAM
On a daily basis I need to move the previous day's data from instance A to instance B, which amounts to roughly 25 million records. At the moment I am doing the following:
On instance A, generate an OUTFILE with "WHERE timestamp BETWEEN '2014-09-23 00:00:00' AND '2014-09-23 23:59:59'". This usually takes less than 2 minutes.
On instance B, execute "LOAD DATA INFILE". This is the problem area, as it takes about 13 hours.
On instance A, delete the records from the previous day. This will probably be another
On instance B, run stats.
On instance B, truncate the table.
I have also considered partitioning the tables and just exchanging the partitions. EXCHANGE PARTITION is supported as of 5.6, and I am willing to upgrade MySQL; however, all the documentation discusses exchanging between tables, and I haven't been able to confirm whether I could do that between DB instances.
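For what it's worth, EXCHANGE PARTITION (MySQL 5.6+) swaps a partition with a standalone table on the same instance only, so across instances you would still need a separate transfer step for the standalone table. A sketch with hypothetical table and partition names:

```sql
-- On instance A: detach yesterday's rows into a plain table.
-- log_access_staging must have an identical structure and be empty beforehand.
ALTER TABLE log_access_partitioned
  EXCHANGE PARTITION p20140923 WITH TABLE log_access_staging;

-- log_access_staging now holds the day's rows and can be dumped or copied
-- to instance B, where the reverse EXCHANGE attaches it to a partition there.
```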
I have also considered replication between the instances, but as I have not tinkered with replication before and this is a time-sensitive assignment, I am somewhat reluctant to tread into new waters.
Any words of wisdom much appreciated.
CREATE the table without the PRIMARY KEY and the _id column, and add them after the LOAD DATA INFILE is complete. MySQL checks PRIMARY KEY integrity on each INSERT, so I think you can gain a lot of performance here. With MariaDB you can disable keys, but I think this won't work with some storage engines (see here).
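A sketch of that approach against the log_access table from the question (the file path is hypothetical):

```sql
-- A keyless import table: no _id, no PRIMARY KEY, no secondary keys.
CREATE TABLE log_access_import (
  `timestamp` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `type_id` INT(11) NOT NULL,
  `building_id` INT(11) NOT NULL,
  `card_id` VARCHAR(15) NOT NULL,
  `user_key` VARCHAR(35) DEFAULT NULL,
  `user_name` VARCHAR(25) DEFAULT NULL,
  `user_validation` VARCHAR(10) DEFAULT NULL
) ENGINE=MyISAM;

LOAD DATA INFILE '/tmp/log_access_20140923.tsv' INTO TABLE log_access_import;

-- Add the auto-increment column and all keys in one pass at the end.
ALTER TABLE log_access_import
  ADD COLUMN `_id` BIGINT(20) NOT NULL AUTO_INCREMENT FIRST,
  ADD PRIMARY KEY (`_id`),
  ADD KEY `log_access__user_key_timestamp` (`user_key`, `timestamp`),
  ADD KEY `log_access__timestamp` (`timestamp`);
```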
A not-very-nice alternative:
I found it very easy to move a MyISAM database by just copying/moving the files on disk. If you cut/paste the files and run REPAIR TABLE on the target machine, you can do this without restarting the server. Just make sure you copy all three files (.frm, .myd, .myi).
LOAD DATA INFILE in perfect PK-order, INTO a table that only has the PK-definition, so no secondary indexes yet. After import, add all secondary indexes at once, with 'ALTER TABLE mytable ALGORITHM=INPLACE, LOCK=NONE, ADD KEY ...'.
Consider adding back the secondary indexes on each involved box separately, so not via replication (sql_log_bin=0), to prevent replication lag.
Consider using a partitioned table, as then you can run a 'LOAD DATA INFILE' per partition, in parallel. (applies to RANGE and HASH partitioning, as the separate tsv-files (one or more per partition) are easy to prepare for those)
MariaDB doesn't have the 'INTO mytable PARTITION (p000)' variant yet.
You can load into a separate table first, and then exchange partitions, but MariaDB also doesn't have 'WITHOUT VALIDATION' yet.
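A minimal sketch of the per-partition parallel idea above, assuming RANGE partitioning by day (table, partition and file names hypothetical). Because rows route themselves to their partition, one LOAD DATA per pre-split file can run from parallel client sessions even without the PARTITION clause:

```sql
-- Partitioning key must appear in every unique key, hence the composite PK.
CREATE TABLE log_access_part (
  `_id` BIGINT(20) NOT NULL AUTO_INCREMENT,
  `timestamp` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `type_id` INT NOT NULL,
  PRIMARY KEY (`_id`, `timestamp`)
) ENGINE=InnoDB
PARTITION BY RANGE (UNIX_TIMESTAMP(`timestamp`)) (
  PARTITION p20140923 VALUES LESS THAN (UNIX_TIMESTAMP('2014-09-24 00:00:00')),
  PARTITION pmax      VALUES LESS THAN MAXVALUE
);

-- Run each of these from its own client session, in parallel:
LOAD DATA INFILE '/tmp/part_a.tsv' INTO TABLE log_access_part;
LOAD DATA INFILE '/tmp/part_b.tsv' INTO TABLE log_access_part;
```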
Say you are running two MySQL servers: one a master, the other a slave. The master has triggers set up that update columns with the COUNT of the number of rows in other tables. For instance, you have a news table and a comments table. News contains an INT column called "total_comments" which is incremented via trigger every time a new row is put into "comments". Does the slave need this trigger as well (to keep "news.total_comments" up to date), or will it be told to update the appropriate "news.total_comments" directly?
From the docs http://dev.mysql.com/doc/refman/5.0/en/faqs-triggers.html:
22.5.4: How are actions carried out through triggers on a master replicated to a slave?
First, the triggers that exist on a master must be re-created on the slave server. Once this is done, the replication flow works as any other standard DML statement that participates in replication. For example, consider a table EMP that has an AFTER INSERT trigger, which exists on a master MySQL server. The same EMP table and AFTER INSERT trigger exist on the slave server as well. The replication flow would be:
An INSERT statement is made to EMP.
The AFTER trigger on EMP activates.
The INSERT statement is written to the binary log.
The replication slave picks up the INSERT statement to EMP and executes it.
The AFTER trigger on EMP that exists on the slave activates.
And
22.5.4: Actions carried out through triggers on a master are not replicated to a slave server.
Thus, you DO need the triggers on the slave.
It depends on the replication you're using. If you use statement based replication, then you must use matching triggers in the master and the slave. If you use row-based replication, then you must not include the triggers on the slave.
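Since the correct setup hinges on the binary log format, it is worth checking what the master actually uses before deciding whether to re-create the triggers on the slave (mydb below is a placeholder):

```sql
SHOW VARIABLES LIKE 'binlog_format';  -- STATEMENT, ROW, or MIXED
SHOW TRIGGERS FROM mydb;              -- list the triggers present on each server
```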
With FEDERATED tables (MySQL 5), you can get the statements issued by triggers into the binary log by adding the same table again through a local connection.
---------------------------------------------------------------------------------------
-- EXAMPLE:
-- We want to install replication of the table test_table, which will be managed by the Trg_Update trigger.
---------------------------------------------------------------------------------------
CREATE DATABASE TEST;
USE TEST;
CREATE TABLE test_trigger (
id INT(20) NOT NULL AUTO_INCREMENT,
name VARCHAR(32) NOT NULL DEFAULT '',
PRIMARY KEY (id),
INDEX name (name)
);
DELIMITER |
CREATE TRIGGER Trg_Update AFTER INSERT ON test_trigger
FOR EACH ROW BEGIN
INSERT INTO federated_table (name, other) VALUES (NEW.name, 'test trigger on federated table -> OK');
END|
DELIMITER ;
CREATE TABLE test_table (
id INT(20) NOT NULL AUTO_INCREMENT,
name VARCHAR(32) NOT NULL DEFAULT '',
other VARCHAR(32) NOT NULL DEFAULT '',
PRIMARY KEY (id),
INDEX name (name),
INDEX other_key (other)
) ;
CREATE TABLE federated_table (
id INT(20) NOT NULL AUTO_INCREMENT,
name VARCHAR(32) NOT NULL DEFAULT '',
other VARCHAR(32) NOT NULL DEFAULT '',
PRIMARY KEY (id),
INDEX name (name),
INDEX other_key (other)
)
ENGINE=FEDERATED
CONNECTION='mysql://root@localhost/TEST/test_table';
---------------------------------------------------------------------------------------