SQL Server 2014 SWITCH statement issue: violation of constraint or partition function

I'm attempting to correct a partitioning mistake in SQL Server 2014. We have a data warehouse where all fact-like tables are partitioned by year. When I originally set up the partition function I neglected to leave an empty partition at the end. These tables also have clustered columnstore indexes (you apparently can't split non-empty partitions when a columnstore index is present, so I have to drop the indexes, as you'll see below).
I found SQLCat's managepartition tool, which helped me create a temporary table for each table tied to the function. I then switch the last partition of each table out to its temporary table, so that the last partition is empty across all tables tied to the function and the split can take place instantaneously (as opposed to the 6 hours it took in testing on the dev box, when I just dropped the columnstore indexes and did the split with the last partition not empty). The problem is that I'm having trouble switching the data back in:
ALTER TABLE SWITCH statement failed. Check constraints or partition function of source table 'DW_Dev.DDS.FactTable1TempSplit' allows values that are not allowed by check constraints or partition function on target table 'DW_Dev.DDS.FactTable1'.
Here's the thing: the SQLCat tool creates a constraint on each of the new temporary staging tables that should line up with the correct partition number. It looks correct, so I'm a little stumped. The partition function is based on a date key (Column1, an int), and I've verified that only year 2015 (the last partition, 3) is present in the temporary table.
SELECT $PARTITION.pfYearlyGrain(20150101)
comes back with 3.
SELECT [Column1]
FROM [DW_Dev].[DDS].[FactTable1TempSplit]
WHERE [Column1] < 20150101
returns 0 records, and the same with > 20151231 (just in case; if such values existed we'd have other problems, but I thought I'd check all the same).
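A further sanity check that can be run here, counting rows per target partition (a sketch using the names above):
-- Expect a single row with partition_number = 3.
SELECT $PARTITION.pfYearlyGrain([Column1]) AS partition_number,
       COUNT(*) AS row_count
FROM [DW_Dev].[DDS].[FactTable1TempSplit]
GROUP BY $PARTITION.pfYearlyGrain([Column1]);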
I also threw an "is not null" into the constraint as an addition, without resolving the issue.
The partition function and scheme are as follows:
CREATE PARTITION FUNCTION [pfYearlyGrain](int) AS RANGE LEFT FOR
VALUES (20131231, 20141231, 20151231, 20161231)
GO
CREATE PARTITION SCHEME [psYearlyGrain] AS PARTITION [pfYearlyGrain]
TO ([SECONDARY], [SECONDARY], [SECONDARY], [SECONDARY], [SECONDARY])
GO
Complete code/commands below for one of the table combos:
DROP INDEX CI_FactTable1
ON DDS.FactTable1
GO
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
CREATE TABLE [DDS].[FactTable1TempSplit](
[Column1] [int] NOT NULL,
[Column2] [int] NOT NULL,
[Column3] [smallint] NOT NULL,
[Column4] [smallint] NOT NULL,
[Column5] [smallint] NOT NULL,
[Column6] [smallint] NOT NULL,
[Column7] [smallint] NOT NULL,
[Column8] [smallint] NOT NULL,
[Column9] [int] NOT NULL
) ON [SECONDARY]
ALTER TABLE [DDS].[FactTable1TempSplit]
ADD CONSTRAINT [chk_FactTable1TempSplit_partition_3]
CHECK (([Column1]>20141231) AND [Column1] IS NOT NULL)
ALTER TABLE [DDS].[FactTable1TempSplit]
CHECK CONSTRAINT [chk_FactTable1TempSplit_partition_3]
ALTER TABLE [DDS].[FactTable1] SWITCH PARTITION 3
TO [DDS].[FactTable1TempSplit];
ALTER TABLE [DDS].[FactTable1TempSplit] SWITCH
TO [DDS].[FactTable1] PARTITION 3;
Also worth noting: the original fact table has no constraints, keys, or indexes at this point. So the only thing I can think of is that for some reason the check constraint on the temporary table doesn't match the partition number/function, but I'm not seeing why. Any thoughts (other than that I really, really should have left an empty partition)?
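One idea worth checking: after the split, the function has boundaries above 20151231, so partition 3 is no longer the rightmost partition, and a switch-in constraint for a middle partition of a RANGE LEFT function generally has to bound the key on both sides. An untested sketch, reusing the constraint name from above:
ALTER TABLE [DDS].[FactTable1TempSplit]
DROP CONSTRAINT [chk_FactTable1TempSplit_partition_3];
-- Partition 3 of the RANGE LEFT function covers 20141231 < Column1 <= 20151231.
ALTER TABLE [DDS].[FactTable1TempSplit]
ADD CONSTRAINT [chk_FactTable1TempSplit_partition_3]
CHECK ([Column1] > 20141231 AND [Column1] <= 20151231 AND [Column1] IS NOT NULL);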

Related

MySQL seems to be very slow for updates

MySQL seems to be very slow for updates: a simple UPDATE statement takes more time than the same update call does in MS SQL. For example:
UPDATE ValuesTbl SET value1 = #value1,
value2 = #value2
WHERE co_id = #co_id
AND sel_date = #sel_date
I have changed some config settings, as below:
innodb_flush_log_at_trx_commit=2
innodb_buffer_pool_size=10G
innodb_log_file_size=2G
log-bin="foo-bin"
skip-log-bin
This is the CREATE TABLE statement:
CREATE TABLE `valuestbl` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`sel_date` datetime NOT NULL,
`co_id` int(11) NOT NULL,
`value1` decimal(10,2) NOT NULL,
`value2` decimal(10,2) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=21621889 DEFAULT CHARSET=latin1;
MySQL version: 8.0 on Windows
The update query takes longer than it does in MS SQL; is there anything else I need to do to make it faster?
There are no indexes; the ValuesTbl table has a PK, but it's not used for anything. The id column is a primary key from another table, sel_date is a date field, and value1/value2 are the two decimal columns.
If there are no indexes on ValuesTbl, then the update has to scan the entire table, which will be slow if the table is large. No amount of server tuning will fix this.
A simple update statement is taking more time than MS SQL for same update call.
The MS SQL server probably has an index on either co_id or sel_date. Or it has fewer rows in the table.
You need to add indexes, like the index of a book, so the database doesn't have to search the whole table. At a minimum, an index on co_id will vastly help performance. If there are many rows with different sel_date values per co_id, a compound index on (co_id, sel_date) would help further; see the sketch below.
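A minimal sketch of that suggestion, using the column names from the question (the index name is made up):
-- Compound index matching the UPDATE's WHERE clause (co_id = ? AND sel_date = ?).
ALTER TABLE ValuesTbl ADD INDEX idx_co_sel_date (co_id, sel_date);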
See Use The Index, Luke for an extensive tutorial on indexes.

MySQL Partitioning a VARCHAR(60)

I have a very large table (500 million rows) with the following columns:
id - Bigint - Autoincrementing primary index.
date - Datetime - Approximately 1.5 million rows per date; data older than 1 year is deleted.
uid - VARCHAR(60) - A user ID
sessionNumber - INT
start - INT - epoch of start time.
end - INT - epoch of end time.
More columns not relevant for this query.
The combination of uid and sessionNumber forms a unique index. I also have an index on date.
Due to the sheer size, I'd like to partition the table.
Most of my accesses would be by date, so partitioning by date ranges seems intuitive, but as the date is not part of the unique index, this is not an option.
Option 1: RANGE PARTITION on Date and BEFORE INSERT TRIGGER
I don't really have a recurring problem with the uid and sessionNumber uniqueness being violated. The source data is consistent, but sessions that span two days may be inserted on two consecutive days, with midnight being the end time of the first and the start time of the second.
I'm trying to understand whether I could remove the unique key and instead use a trigger that would:
Check if there is a session with the same identifiers on the previous day and, if so,
update that session's end date, and
cancel the actual insert.
However, I am not sure if I can (1) trigger an update on the same table, or (2) prevent the actual insert.
Option 2: LINEAR HASH PARTITION on UID
My second option is to use a linear hash partition on the UID. However, I cannot find any example that converts a VARCHAR to an INTEGER in a way that is permitted for HASH partitioning. For example,
ALTER TABLE mytable
PARTITION BY HASH (CAST(md5(uid) AS UNSIGNED integer))
PARTITIONS 20
returns that the partition function is not allowed.
HASH partitioning must work with a 32-bit integer, but you can't convert an MD5 string to an integer simply with CAST().
Instead of MD5, CRC32() can take an arbitrary string and convert it to a 32-bit integer. But this is also not a valid function for partitioning:
mysql> alter table v partition by hash(crc32(uid));
ERROR 1564 (HY000): This partition function is not allowed
You could partition by the string using KEY partitioning instead of HASH partitioning. KEY partitioning accepts strings: it passes the input string through MySQL's built-in PASSWORD() function, which is basically related to SHA1.
However, this leads to another problem with your partitioning strategy:
mysql> alter table v partition by key(uid);
ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function
Your table's primary key id does not include the column uid that you want to partition by. This is a restriction of MySQL's partitioning: every unique key on the table must include every column in the table's partitioning expression.
Here's the table I'm testing with (it would have been a good idea for you to include this in your question):
CREATE TABLE `v` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`date` datetime NOT NULL,
`uid` varchar(60) NOT NULL,
`sessionNumber` int(11) NOT NULL,
`start` int(11) NOT NULL,
`end` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uid` (`uid`,`sessionNumber`),
KEY `date` (`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
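For illustration only (it changes the primary key, which may not be acceptable for the real table): once every unique key includes uid, KEY partitioning is accepted. A sketch:
-- Widen the PK so that both unique keys contain the partitioning column uid.
ALTER TABLE v DROP PRIMARY KEY, ADD PRIMARY KEY (id, uid);
-- Now PRIMARY KEY (id, uid) and UNIQUE KEY (uid, sessionNumber) both include uid.
ALTER TABLE v PARTITION BY KEY (uid) PARTITIONS 20;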
Before going any further, I have to wonder why you want to use partitioning anyway? "Sheer size" is not a reason to partition a table.
Partitioning, like any optimization, is done for the sake of specific queries you want to optimize for. Any optimization improves one query at the expense of other queries. Optimization has nothing to do with the table. The table is happy to sit there with 5 billion rows, and it doesn't care. Optimization is for the queries.
So you need to know which queries you want to optimize for. Then decide on a strategy. Partitioning might not be the best strategy for the set of queries you need to optimize!
I'll assume your 'uid' is a 128-bit UUID kind of value, which can be stored as a BINARY(16), because that is generally worth the trouble.
Next, stay away from the datetime type: it is stored like a packed string and doesn't hold any timezone information. Store date-time values either as pure numerical values (the number of seconds since the UNIX epoch), or let MySQL do that for you and use the timestamp(N) type.
Also, don't call a column date, not just because that is a reserved word, but also because the value contains time details too.
Next, stay away from using anything other than latin1 as the CHARSET of (all) your tables; only ever apply UTF-8 at the column level. This is to prevent unnecessarily byte-wide columns and indexes creeping in over time. Adopt this habit and you'll happily look back on it after some years, I promise.
This makes the table look like:
CREATE TABLE `v` (
`uuid` binary(16) NOT NULL,
`mysql_created_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`visitor_uuid` BINARY(16) NOT NULL,
`sessionNumber` int NOT NULL,
`start` int NOT NULL,
`end` int NOT NULL,
PRIMARY KEY (`uuid`),
UNIQUE KEY (`visitor_uuid`,`sessionNumber`),
KEY (`mysql_created_at`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
PARTITION BY RANGE COLUMNS (`uuid`)
( PARTITION `p_0` VALUES LESS THAN (X'10')
, PARTITION `p_1` VALUES LESS THAN (X'20')
...
, PARTITION `p_9` VALUES LESS THAN (X'A0')
, PARTITION `p_A` VALUES LESS THAN (X'B0')
...
, PARTITION `p_F` VALUES LESS THAN (MAXVALUE)
);
To make the KEY (mysql_created_at) cover only the date part, you need a generated column; such a column can be added in place, and an index on it is also light to add, so I'll leave that as homework (a sketch follows).
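A possible sketch of that homework, assuming MySQL 5.7+ generated-column syntax and that the server allows indexing a VIRTUAL column on this table (otherwise a STORED column works, at the cost of a rebuild); the column name created_date is made up:
-- Virtual column holding just the date part, plus an index on it.
ALTER TABLE `v`
  ADD COLUMN `created_date` date GENERATED ALWAYS AS (DATE(`mysql_created_at`)) VIRTUAL,
  ADD KEY (`created_date`);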

Partition exchange column type or size mismatch (ORA-14097)

I'm trying to do an exchange partition on a database and I'm getting the following error: ORA-14097: column type or size mismatch in ALTER TABLE EXCHANGE PARTITION.
The script that does this already existed and was running as expected on an Oracle 11g database. As soon as I upgraded to 12c, I got this problem. This is how I'm doing the partition exchange:
-- The new partitioned table.
CREATE TABLE NEW_TABLE
(
id NUMBER(18) NOT NULL,
message VARCHAR2(4000) NOT NULL,
details VARCHAR2(4000),
partition_time TIMESTAMP(6) DEFAULT to_timestamp('01-01-2016','dd-mm-yyyy HH24:MI') NULL
) NOCOMPRESS LOGGING
PARTITION BY RANGE (partition_time) INTERVAL (NUMTODSINTERVAL(1,'HOUR'))
(PARTITION initial VALUES LESS THAN (to_timestamp('01-01-2016','dd-mm-yyyy HH24:MI')));
-- The old table.
CREATE TABLE OLD_TABLE
(
id NUMBER(18,0) NOT NULL,
message VARCHAR2(4000 byte) NOT NULL,
details VARCHAR2(4000)
);
-- Add the column that does not exist on the old table (keep the same columns).
ALTER TABLE OLD_TABLE ADD partition_time TIMESTAMP(6) DEFAULT to_timestamp('01-01-2016','dd-mm-yyyy HH24:MI') NULL;
ALTER TABLE NEW_TABLE
EXCHANGE PARTITION INITIAL
WITH TABLE OLD_TABLE
WITHOUT VALIDATION;
(...)
Now, once again, on Oracle 11g this was working perfectly; on Oracle 12c I get the error explained above. I've done some research and seen people talk about INVISIBLE columns. Well, I've recreated the OLD_TABLE, so I think there should be no invisible columns.
EDIT:
I've realized that on Oracle 12c, when I alter the table to create a new column, another invisible column is created (named SYS_NC00011$). This is why the partition exchange is not working. My question now is: why is this happening, and what is the best way to "remove" this column? I already tried to drop unused columns, with no success.
Thank you guys!
We ran into the same error recently. Similar to your case, the error was triggered by a hidden column (and it wasn't even Easter ;-). In our case the hidden column was caused by an ALTER TABLE xxx DROP COLUMN yyy on a compressed table.
In your case, it seems very likely that the hidden column is created by the ALTER TABLE xxx ADD COLUMN yyy NULL. As the article DDL Optimization in Oracle Database 12c and this answer explain, adding a NULL column with a DEFAULT does some data dictionary magic and adds a hidden column to track, for each row, whether the new column has been written to.
CREATE TABLE old_table (
id NUMBER(18,0) NOT NULL,
message VARCHAR2(4000 BYTE) NOT NULL,
details VARCHAR2(4000)
);
ALTER TABLE old_table ADD partition_time TIMESTAMP(6)
DEFAULT to_timestamp('01-01-2016','dd-mm-yyyy HH24:MI') NULL;
SELECT * FROM user_tab_cols WHERE table_name='OLD_TABLE';
ID NUMBER
MESSAGE VARCHAR2
DETAILS VARCHAR2
SYS_NC00004$ RAW
PARTITION_TIME TIMESTAMP(6)
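To see explicitly which of these columns are hidden, the *_TAB_COLS views in 12c expose a couple of useful flags (a sketch):
-- SYS_NC00004$ should show HIDDEN_COLUMN='YES' and USER_GENERATED='NO'.
SELECT column_name, data_type, hidden_column, user_generated
FROM user_tab_cols
WHERE table_name = 'OLD_TABLE'
ORDER BY internal_column_id;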
So, to fix your case, either recreate the table including the column partition_time:
CREATE TABLE old_table (
id NUMBER(18,0) NOT NULL,
message VARCHAR2(4000 BYTE) NOT NULL,
details VARCHAR2(4000),
partition_time TIMESTAMP(6) DEFAULT DATE '2016-01-01'
);
or add the column without a DEFAULT:
ALTER TABLE OLD_TABLE ADD partition_time TIMESTAMP(6) NULL;
or disable the new feature (Doc Id 2277937.1):
ALTER SESSION SET "_add_col_optim_enabled"=FALSE ;
ALTER TABLE old_table ADD partition_time TIMESTAMP(6)
DEFAULT to_timestamp('01-01-2016','dd-mm-yyyy HH24:MI') NULL;
SELECT * FROM user_tab_cols WHERE table_name='OLD_TABLE';
ID NUMBER
MESSAGE VARCHAR2
DETAILS VARCHAR2
PARTITION_TIME TIMESTAMP(6)
I haven't yet found a way to rebuild the table in place to get rid of the hidden column: ALTER TABLE MOVE does not help; only CREATE TABLE AS SELECT does.
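A sketch of such a CTAS rebuild (note that a plain CTAS copies neither constraints nor indexes, so those have to be re-created afterwards; the _new suffix is made up):
CREATE TABLE old_table_new AS SELECT * FROM old_table;
-- Re-create constraints and indexes on old_table_new here.
DROP TABLE old_table;
RENAME old_table_new TO old_table;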
Another reliable solution, without compromising anything, is to create/rebuild OLD_TABLE (non-partitioned) using the FOR EXCHANGE clause. It's available only from Oracle 12.2 onwards.
CREATE TABLE OLD_TABLE FOR EXCHANGE WITH TABLE NEW_TABLE;
It's not clear from your description whether OLD_TABLE is empty or has data in your case. If you have data, you can populate the table using
INSERT INTO OLD_TABLE SELECT * FROM <old backup table>;
This avoids ORA-14097 (or ORA-00932 in some cases) during the exchange partition and gets the job done seamlessly.
Oracle evidently noticed issues with exchange partition soon after introducing the DDL optimization related to the DEFAULT column attribute, and hence introduced the FOR EXCHANGE variant of CTAS from 12.2 onwards.
Thanks to wolφi for highlighting the hidden columns and pointing me in the right direction.
I confirmed the hidden columns with the query below:
SELECT * FROM SYS.dba_tab_cols
I then recreated my staging table, including the system-generated column names, with matching types and in the same order according to INTERNAL_COLUMN_ID.
The partition exchange still failed, because the new columns were showing as USER_GENERATED='YES'.
The final fix was to mark the columns as unused:
ALTER TABLE STAGING_TABLE
set unused ("SYS_C00006_16092719:09:49$"
,"SYS_C00007_16092719:10:34$"
,"SYS_C00008_16092719:06:48$"
,"SYS_C00009_16092719:07:00$"
,"SYS_C00010_16092719:07:10$"
,"SYS_C00011_16092719:08:15$"
,"SYS_C00012_16092719:08:59$" );
After this the partition exchange worked.
The most obvious mismatch is that NEW_TABLE has a PARTITION_TIME column, while OLD_TABLE does not.
Other things to check that might be an issue:
NEW_TABLE.ID is NUMBER(18,0), while OLD_TABLE.ID is NUMBER(18).
OLD_TABLE.MESSAGE is VARCHAR2(4000 byte); you should check your length semantics, since if they are defined as CHAR, then NEW_TABLE.MESSAGE would be VARCHAR2(4000 char).

MySQL LOAD DATA INFILE Taking 13 Hours

Is there anything I can change in the my.ini file to speed up "LOAD DATA INFILE"?
I have two MySQL 5.5 instances each of which has one identical table structured as follows:
CREATE TABLE `log_access` (
`_id` bigint(20) NOT NULL AUTO_INCREMENT,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`type_id` int(11) NOT NULL,
`building_id` int(11) NOT NULL,
`card_id` varchar(15) NOT NULL,
`user_key` varchar(35) DEFAULT NULL,
`user_name` varchar(25) DEFAULT NULL,
`user_validation` varchar(10) DEFAULT NULL,
PRIMARY KEY (`_id`),
KEY `log_access__user_key_timestamp` (`user_key`,`timestamp`),
KEY `log_access__timestamp` (`timestamp`)
) ENGINE=MyISAM
On a daily basis I need to move the previous day's data, roughly 25 million records, from instance A to instance B. At the moment I am doing the following:
1. On instance A, generate an OUTFILE with WHERE timestamp BETWEEN '2014-09-23 00:00:00' AND '2014-09-23 23:59:59'. This usually takes less than 2 minutes.
2. On instance B, execute LOAD DATA INFILE. This is the problem area, as it takes about 13 hours.
3. On instance A, delete the records from the previous day. This will probably be another
4. On instance B, run stats.
5. On instance B, truncate the table.
I have also considered partitioning the tables and just exchanging the partitions. EXCHANGE PARTITION is supported as of 5.6 and I am willing to upgrade MySQL; however, all the documentation discusses exchanging between tables, and I haven't been able to confirm that I could do that between DB instances.
I have also considered replication between the instances, but as I have not tinkered with replication in the past and this is a time-sensitive assignment, I am somewhat reluctant to tread into new waters.
Any words of wisdom much appreciated.
CREATE the table without the PRIMARY KEY and _id column, and add these after LOAD DATA INFILE is complete. MySQL checks PRIMARY KEY integrity with each INSERT, so I think you can gain a lot of performance here. With MariaDB you can disable keys, but I think this won't work with some storage engines. A sketch follows.
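A minimal sketch of that suggestion, assuming the log_access table from the question, a dump file that does not contain the _id column, and made-up staging/file names:
-- Staging copy; dropping _id also drops the single-column primary key.
CREATE TABLE log_access_load LIKE log_access;
ALTER TABLE log_access_load DROP COLUMN _id;
-- Bulk load without PK maintenance.
LOAD DATA INFILE '/tmp/log_access_2014-09-23.tsv' INTO TABLE log_access_load;
-- Add the auto-increment PK back in a single pass.
ALTER TABLE log_access_load
  ADD COLUMN _id bigint(20) NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;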
A not-very-nice alternative:
I found it very easy to move a MyISAM database by just copying/moving the files on disk. If you cut/paste the files and run a REPAIR TABLE on your target machine, you can do this without restarting the server. Just make sure you copy all 3 files (.frm, .myd, .myi).
LOAD DATA INFILE in perfect PK order, into a table that has only the PK definition, so no secondary indexes yet. After the import, add all secondary indexes at once, with 'ALTER TABLE mytable ALGORITHM=INPLACE, LOCK=NONE, ADD KEY ...'.
Consider adding the secondary indexes back on each involved box separately, i.e. not via replication (sql_log_bin=0), to prevent replication lag.
Consider using a partitioned table, as you can then run one LOAD DATA INFILE per partition, in parallel. (This applies to RANGE and HASH partitioning, since the separate tsv files, one or more per partition, are easy to prepare for those.)
MariaDB doesn't have the 'INTO mytable PARTITION (p000)' variant yet.
You can load into a separate table first and then exchange partitions, but MariaDB also doesn't have 'WITHOUT VALIDATION' yet; see the sketch below.
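In MySQL (EXCHANGE PARTITION as of 5.6, WITHOUT VALIDATION as of 5.7.5), the load-then-exchange approach might look like this sketch; it assumes log_access is RANGE-partitioned by day with a partition p20140923, and the staging/file names are made up:
-- Staging table with identical columns but no partitioning.
CREATE TABLE log_access_stage LIKE log_access;
ALTER TABLE log_access_stage REMOVE PARTITIONING;
-- Bulk load, then swap the loaded rows in as one partition.
LOAD DATA INFILE '/tmp/log_access_2014-09-23.tsv' INTO TABLE log_access_stage;
ALTER TABLE log_access EXCHANGE PARTITION p20140923
  WITH TABLE log_access_stage WITHOUT VALIDATION;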

MySQL: Each index on InnoDB table takes longer to create than the last

I've got a MySQL table that looks like this:
CREATE TABLE my_facts (
`id` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
`account_id` int(11) NOT NULL,
`asked_on` date NOT NULL,
`foo_id` int(11) NOT NULL,
`bar_id` int(11) NOT NULL,
`baz_id` int(11) NOT NULL,
`corge_id` int(11) NOT NULL,
`grault_id` int(11) NOT NULL,
`flob_id` int(11) NOT NULL,
`tag_id` int(11) NOT NULL)
ENGINE=InnoDB;
and has 450k rows. But I want to add several indexes to it:
CREATE INDEX `k_account_foo_id` ON `my_facts` (`account_id`, `asked_on`, `foo_id`, `tag_id`);
CREATE INDEX `k_account_bar_id` ON `my_facts` (`account_id`, `asked_on`, `bar_id`, `tag_id`);
CREATE INDEX `k_account_baz_id` ON `my_facts` (`account_id`, `asked_on`, `baz_id`, `tag_id`);
CREATE INDEX `k_account_corge_id` ON `my_facts` (`account_id`, `asked_on`, `corge_id`, `tag_id`);
CREATE INDEX `k_account_grault_id` ON `my_facts` (`account_id`, `asked_on`, `grault_id`, `tag_id`);
My problem is that each index takes longer to create than the last, and the growth seems geometric. In order, the indexes take 11.6s, 28.8s, 44.4s, 76s, and 128s to create. And I'd like to add a few more indexes.
When I create the table as MyISAM, not only is the whole process a whole lot faster, but each subsequent index takes maybe only a second longer to create than the previous one.
What gives? Is this behavior expected? Am I doing something funny in my index creation?
For what it's worth, I'm using MySQL 5.1.48/OS X 10.6.8 in this test.
This is expected behavior based on how index creation happens in InnoDB.
In MySQL versions up to 5.0, adding or dropping an index on a table with existing data can be very slow if the table has many rows. The CREATE INDEX and DROP INDEX commands work by creating a new, empty table defined with the requested set of indexes. It then copies the existing rows to the new table one-by-one, updating the indexes as it goes. Inserting entries into the indexes in this fashion, where the key values are not sorted, requires random access to the index nodes, and is far from optimal. After all rows from the original table are copied, the old table is dropped and the copy is renamed with the name of the original table.
Beginning with version 5.1, MySQL allows a storage engine to create or drop indexes without copying the contents of the entire table. The standard built-in InnoDB in MySQL version 5.1, however, does not take advantage of this capability.
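A common way to mitigate this on 5.1's built-in InnoDB is to add all the secondary indexes in a single ALTER TABLE, so the table copy happens once rather than once per CREATE INDEX. A sketch using the indexes from the question:
-- One table rebuild for all five indexes instead of five rebuilds.
ALTER TABLE my_facts
  ADD INDEX `k_account_foo_id` (`account_id`, `asked_on`, `foo_id`, `tag_id`),
  ADD INDEX `k_account_bar_id` (`account_id`, `asked_on`, `bar_id`, `tag_id`),
  ADD INDEX `k_account_baz_id` (`account_id`, `asked_on`, `baz_id`, `tag_id`),
  ADD INDEX `k_account_corge_id` (`account_id`, `asked_on`, `corge_id`, `tag_id`),
  ADD INDEX `k_account_grault_id` (`account_id`, `asked_on`, `grault_id`, `tag_id`);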