Sync data from RDS MySQL to Amazon Redshift

I'm trying to sync data from RDS MySQL to Amazon Redshift. For that, I created a Data Pipeline scheduled to run once. I synced one table, then tried another table named 'roles', but it failed with the following error message: "output table named 'public.roles' doesn't exist and no createTableSql was provided". The actual result of the pipeline is as follows:
RedshiftTableCreateActivity - Finished
RDSToS3CopyActivity - Finished
S3ToRedshiftCopyActivity - FAILED ("output table named 'public.roles' doesn't exist and no createTableSql was provided")
S3StagingCleanupActivity - CASCADE_FAILED
For the pipeline, I tried both the TRUNCATE and OVERWRITE_EXISTING insert modes.
Can anyone help me with this?

It seems that your Redshift table "roles" does not exist.
You can also specify createTableSql as "create table if not exists roles (your table definition)" so the pipeline creates the table for you.
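For example, a minimal createTableSql sketch (the column list here is hypothetical; replace it with the actual definition of your MySQL roles table):
create table if not exists public.roles (
    id integer not null,      -- hypothetical columns, for illustration only
    name varchar(255),
    created_at timestamp,
    primary key (id)
);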

Related

Redshift Alter table command returns `Target table and source table attributes don't match.`

I have an Airflow pipeline which creates a staging table from an existing table, loads data into it from a CSV, and then executes the following ALTER command:
ALTER TABLE "schema"."table"
APPEND FROM "schema"."table_staging"
FILLTARGET
I observed that the following error is returned occasionally. I simply dropped the original table and re-created it, and it would work fine for a few days, and then out of nowhere I'd encounter this again.
DETAIL:
-----------------------------------------------
error: Target table and source table attributes don't match.
code: 8001
context: The source table and the target table have different sort keys. Both tables must use the same sort keys and sort style.
query: 0
location: tbl_perm.cpp:2823
process: padbmaster [pid=19083]
I'm unable to figure out why it starts breaking all of a sudden.
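For context, a staging table created with an explicit column list can end up with different sort keys than the target, whereas CREATE TABLE ... (LIKE ...) inherits the parent table's column attributes, sort keys, and sort style. A minimal sketch, using the placeholder names from the error above:
-- recreate the staging table so its sort keys and sort style match the target
DROP TABLE IF EXISTS "schema"."table_staging";
CREATE TABLE "schema"."table_staging" (LIKE "schema"."table");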

Update a table (that has relationships) using another table in SSIS

I want to be able to update a specific column of a table using data from another table. Here's what the two tables look like, along with the DB type and the SSIS components used to get each table's data (by the way, both ID and Code are unique).
Table1(ID, Code, Description) [T-SQL DB accessed using ADO NET Source component]
Table2(..., Code, Description,...) [MySQL DB accessed using ODBC Source component]
I want to update the column Table1.Description using the Table2.Description by matching them with the right Code first (because Table1.Code is the same as Table2.Code).
What I tried:
Doing a Merge Join transformation using the Code column, but I couldn't figure out how to reinsert the table, because Table1 has relationships and I can't simply drop it and replace it with the new one.
Using a Lookup transformation, but since the two tables aren't in the same type of database, it didn't let me create the lookup table's connection manager (which in my case would be MySQL).
I'm still new to SSIS, but any ideas or help would be greatly appreciated.
My solution is based on #Akina's comments. Although using a linked server would definitely have fit, my requirement is to make an SSIS package to take care of migrating some old data.
The first and last are SQL tasks, while Migrate ICDDx is the DFT that transfers the data to a staging table created during the first SQL task.
Here are the SQL commands that get executed during Create Staging Table:
DROP TABLE IF EXISTS [tempdb].[##stagedICDDx];
CREATE TABLE ##stagedICDDx (
ID INT NOT NULL,
Code VARCHAR(15) NOT NULL,
Description NVARCHAR(500) NOT NULL,
........
);
and here's the SQL command (based on #Akina's comment) for transferring from staged to final (inside Transfer Staged):
UPDATE [MyDB].[dbo].[ICDDx]
SET [ICDDx].[Description] = [##stagedICDDx].[Description]
FROM [dbo].[##stagedICDDx]
WHERE [ICDDx].[Code]=[##stagedICDDx].[Code]
GO
Here's the DFT used. Both the T-SQL and MySQL sources return sorted output using ORDER BY Code, so I didn't have to insert Sort components before the Merge Join.
Note: you have to set up the connection manager to retain/reuse the same connection so that the temporary table doesn't get deleted before we transfer data to it. If all goes well, then after the Transfer Staged SQL task the connection is closed and the global temporary table is deleted.
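For illustration, the two source queries behind the Merge Join look roughly like this (column lists and the MySQL table name are assumptions, not taken from the actual package):
-- ADO NET Source (T-SQL side), sorted on the join key
SELECT ID, Code, Description
FROM [MyDB].[dbo].[ICDDx]
ORDER BY Code;

-- ODBC Source (MySQL side), table name is a placeholder
SELECT Code, Description
FROM icddx_source
ORDER BY Code;
Since the sorting happens in the queries rather than in Sort components, each source output also has to be marked as sorted (IsSorted = True and SortKeyPosition on the Code column in the Advanced Editor) so the Merge Join accepts the inputs.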

AWS DMS issues after the migration

I am using AWS DMS to migrate 350 GB of data.
The migration has completed, but the status shows an error. I checked the CloudWatch logs and found the following errors:
E: RetCode: SQL_ERROR SqlState: HY000 NativeError: 1280 Message: [MySQL][ODBC 5.3(w) Driver][mysqld-5.5.5-10.2.12-MariaDB-log]Incorrect index name 'PRIMARY' [1022502] (ar_odbc_stmt.c:4428)
[TARGET_LOAD ]E: execute create primery key failed, statement ALTER TABLE <databaseName>.<table> ADD CONSTRAINT PRIMARY PRIMARY KEY ( id ) [1022502] (odbc_endpoint_imp.c:3938)
I have compared the DBs on the source and target and found some variations in table size; also, the Key field is empty on the target RDS (compared using describe), so I suspect that the keys were not migrated to my target RDS. The DMS documentation mentions that keys will be migrated.
Is there any way to fix this issue?
Please let me know if anyone has faced these issues while using AWS RDS.
It looks like DMS is attempting to apply an index that already exists in the target.
From another issue, the 'Incorrect index name' message relates to attempting to create an index that already exists.
Consider running the Schema Conversion Tool to create the target schema and running the DMS task with a target table preparation mode of 'do nothing'. This way you can troubleshoot creation of the schema separately from migrating the data.
Also consider creating a task for just this table, with an otherwise identical task configuration, using source table filters; this will give you a complete, targeted end-to-end log.
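If you want to confirm what already exists on the target before re-running the task, a quick check against the MariaDB/MySQL target (schema and table names are placeholders):
-- a row with Key_name = 'PRIMARY' means the ALTER TABLE ... ADD CONSTRAINT
-- PRIMARY KEY issued by DMS will collide, as in the log above
SHOW INDEX FROM `databaseName`.`tableName`;

-- or via information_schema
SELECT index_name, seq_in_index, column_name
FROM information_schema.statistics
WHERE table_schema = 'databaseName'
  AND table_name = 'tableName';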
For reference, AWS has written a very detailed blog series on troubleshooting DMS:
Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong (Part 1)
Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong (Part 2)
Debugging Your AWS DMS Migrations: What to Do When Things Go Wrong? (Part 3)

MySQL Replication fails intermittently with HA_ERR_KEY_NOT_FOUND when inserting from a tmp table

I'm running MySQL 5.6 on AWS RDS. I have a MySQL slave via a read replica in RDS as well. Replication to the slave gets an error when running a stored proc that inserts from a temporary table into a non-temporary table.
My reading of the MySQL documentation is that this case is handled as long as we use mixed-mode binlogging (which we do). http://dev.mysql.com/doc/refman/5.6/en/binary-log-mixed.html
Is this a MySQL bug, or am I missing something? Is this approach simply not supported when using MySQL slaves?
The stored proc that is causing trouble does something like this, where MySummaryTable is the non-temp table and tmp_locations_table is the temp one:
CREATE TEMPORARY TABLE tmp_locations_table ( ... );
INSERT INTO MySummaryTable (
accountID,
locationID
)
SELECT
row_data.accountID,
row_data.locationID
FROM tmp_locations_table row_data
GROUP BY
row_data.accountID,
row_data.locationID
ON DUPLICATE KEY UPDATE
locationID=123
The exact mysql error I'm seeing isn't particularly helpful: Could not execute Update_rows event on table myschema.MySummaryTable; Can't find record in 'MySummaryTable', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log blahblahblah
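As a side note, it's worth confirming the binlog format actually in effect on the master (on RDS this is controlled by the binlog_format parameter in the DB parameter group); a quick check:
-- should report MIXED if mixed-mode binlogging is really being used
SHOW GLOBAL VARIABLES LIKE 'binlog_format';
SHOW SESSION VARIABLES LIKE 'binlog_format';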

Error replicating database due to cross-db reference - table doesn't exist

We have mysql v5.0.77 running on a server collecting some measurement data.
On the mysql server, we have the following databases:
raw_data_db
config_tables_db
processed_data_db
We ONLY want to replicate 'processed_data_db', which is constructed using information from 'raw_data_db' and 'config_tables_db'.
We keep getting errors on our slave server when it tries to replicate the statements that construct the processed data.
Example:
[ERROR] Slave: Error 'Table 'raw_data_db.s253' doesn't exist' on query. Default database: 'data'. Query: 'CREATE TEMPORARY TABLE temp SELECT * FROM raw_data_db.s253 WHERE DateTimeVal>='2011/04/21 17:00:00' AND DateTimeVal<='2011/04/21 17:10:00'', Error_code: 1146
What I assume is happening is that the cross-db SELECTs can't find the raw database because we aren't replicating it, and the data doesn't exist on the slave... or something along those lines?
So I tried using ignores, but we're still getting the errors:
replicate-wild-ignore-table = raw_data_db.*
replicate-wild-ignore-table = data.temp*
Other configuration information:
replicate-rewrite-db = processed_data_db->data
replicate-do-db = data
Is it possible to replicate just the one database if all of its tables are created from references to other databases? Any ideas on how to get around this error?
I looked into row-based replication, which seemed like it might do the trick, but it's only available in v5.1 or greater... Is there anything similar in earlier versions?
I fixed the ignore-table statements to "data.%temp%", and it seems to be ignoring just fine, but I still can't replicate the tables I want, because the insert statement now references a table that doesn't exist.
ex.
Error 'Table 'data.temp' doesn't exist' on query. Default database: 'data'. Query: 'INSERT INTO abc SELECT FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(DateTimeVal))), ROUND(AVG(Difference),3), ROUND(STDDEV(Difference),3), ROUND(AVG(Frequency),0), ROUND(AVG(SignalPower),1) FROM temp WHERE ABS(Difference)<'10000.0' AND Difference!='0''
The processing creates temporary tables from the raw database, averages all the values in each temporary table, and inserts the result into processed_data_db. Since I'm ignoring the CREATE statements, the slave doesn't have access to those tables; but the reason I'm ignoring them in the first place is that they reference tables outside of what I want to replicate. So I'm not sure how I should approach this... any suggestions would be greatly appreciated.
Temporary tables and replication options: by default, all temporary tables are replicated; this happens whether or not there are any matching --replicate-do-db, --replicate-do-table, or --replicate-wild-do-table options in effect. However, the --replicate-ignore-table and --replicate-wild-ignore-table options are honored for temporary tables.
http://dev.mysql.com/doc/refman/5.0/en/replication-features-temptables.html
edit:
Either replicate the raw_data_db and config_tables_db tables that are used in your INSERT query,
or use the DRBD protocol:
http://www.mysql.com/why-mysql/drbd/
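Putting the options from this thread together, the slave configuration under discussion would look roughly like this (a sketch based only on the settings already mentioned above, using the %-style wildcards that replicate-wild-ignore-table expects):
# my.cnf on the slave
replicate-rewrite-db        = processed_data_db->data
replicate-do-db             = data
# honored even for temporary tables:
replicate-wild-ignore-table = raw_data_db.%
replicate-wild-ignore-table = data.%temp%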