Liquibase "Load Data" throws Error 1218 with this file on DB2 - csv

I want to load some initial data into my DB2 database with the Liquibase loadData tag. It works fine with the 70 other files, but it throws this error with one specific file:
SQLCODE=-1218, SQLSTATE=57011, SQLERRMC=4096
Already attempted fixes
I already tried all the fixes suggested in the DB2 documentation, but the problem persisted (the kind of statements involved is sketched after the list):
1. increase the bufferpool size
2. decrease the maximum number of database agents and/or connections
3. decrease the maximum degree of parallelism
4. decrease the prefetch size for table spaces that are in this bufferpool
5. move some table spaces into other bufferpools.
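For reference, the statements behind fixes 1 and 4 look roughly like this (a sketch using the buffer pool and tablespace names from my setup below, not an exact transcript of what I ran):

ALTER BUFFERPOOL BPDAT IMMEDIATE SIZE 100000;             -- fix 1: grow the buffer pool (if memory allows)
ALTER TABLESPACE TSBLCORE PREFETCHSIZE 16;                -- fix 4: lower the prefetch size for this tablespace
SELECT BPNAME, NPAGES, PAGESIZE FROM SYSCAT.BUFFERPOOLS;  -- verify the resulting definitions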
The original file was over 4,000 lines long, but even after shrinking it, the error still occurs whenever one of the following lines is in the CSV (these are examples; there are more such lines):
"0141.651 ";"004";"default ";"- Valor neto en % "
"0144.654 ";"002";"default ";"- Net Value as % "
"0311.000-TC ";"002";"default ";"Raw Materials, Supplies "
Code
Below are the definitions of the buffer pool, the tablespace, and the table, followed by the Liquibase changeset:
CREATE BUFFERPOOL BPDAT SIZE 5000 PAGESIZE 4 K;
CREATE TABLESPACE TSBLCORE MANAGED BY DATABASE USING (FILE 'tsblcore.001' 200000) BUFFERPOOL BPDAT;
create table BLPDET
(
SYMBOLICPOSITIONNO CHAR(15) not null,
LANGUAGEKEY CHAR(3) not null,
REPORTGROUPKEY CHAR(24) not null,
DESCRIPTION CHAR(100) not null
) in TSBLCORE;
<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
        xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:pro="http://www.liquibase.org/xml/ns/pro"
        xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
            http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.1.xsd
            http://www.liquibase.org/xml/ns/pro
            http://www.liquibase.org/xml/ns/pro/liquibase-pro-4.1.xsd">
    <changeSet author="Jonas" id="4.31">
        <sql>COMMIT;</sql>
        <loadData encoding="UTF-8"
                  file="build/Server/resources/changelog/Data/BL_POSDESCRIPTION.csv"
                  commentLineStartsWith=""
                  quotchar="&quot;"
                  relativeToChangelogFile="false"
                  schemaName="DB5"
                  separator=";"
                  tableName="BLPDET"
                  usePreparedStatements="false">
        </loadData>
    </changeSet>
</databaseChangeLog>
Error
This is the error shown in the logs:
2022-08-18-10.58.09.909950+000 I424271E825 LEVEL: Severe
PID : 12733 TID : 140309988108032 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : DB2LOCAL
APPHDL : 0-503 APPID: 172.17.0.1.55526.220818105144
UOWID : 366 ACTID: 142
AUTHID : DB2INST1 HOSTNAME: d4fac55a3d2e
EDUID : 48 EDUNAME: db2agent (DB2LOCAL) 0
FUNCTION: DB2 UDB, buffer pool services, sqlbIsExtentAllocated, probe:4792
MESSAGE : ZRC=0x8502002C=-2063466452=SQLB_BPFULL
"no available buffer pool pages"
DATA #1 : Pointer, 8 bytes
0x00007f9c366bc3c8
DATA #2 : unsigned integer, 4 bytes
4
DATA #3 : unsigned integer, 4 bytes
0
DATA #4 : Pointer, 8 bytes
0x00007f9c76ffc0d0
DATA #5 : Pointer, 8 bytes
0x00007f9c64d77440
I tried raising the number of buffer pool pages to 100,000, but the error persisted.
db2diag.log:
ADM6019E All pages in buffer pool "IBMSYSTEMBP4K" (ID "4096") are in use. Refer to the documentation for SQLCODE -1218
ADM6073W The table space "TSBLCORE" (ID "8") is configured to use buffer pool ID "3", but this buffer pool is not active at this time. In the interim the table space will use buffer pool ID "4096". The inactive buffer pool should become available at next database startup provided that the required memory is available.
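Given the ADM6073W message above, a catalog query along these lines (a sketch using the standard SYSCAT views) should confirm which buffer pool the tablespace is actually bound to:

SELECT T.TBSPACE, T.BUFFERPOOLID, B.BPNAME, B.NPAGES, B.PAGESIZE
FROM SYSCAT.TABLESPACES T
JOIN SYSCAT.BUFFERPOOLS B ON B.BUFFERPOOLID = T.BUFFERPOOLID
WHERE T.TBSPACE = 'TSBLCORE';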
Environment
Database Product: DB2/LINUXX8664
Database Version: SQL11055
Driver Version: 4.8
Liquibase Version: 4.14.0
Is there any possibility that something in the example datasets causes this behaviour?
PS: This is my first Stack Overflow question, so please excuse me if I forgot to include something.

Related

MySQL crash after enormous row locks

I'm using MySQL 5.7.14 x64 on Windows Server 2008 R2
Sometimes (at random times during the day) mysqld crashes with this stack trace:
11:44:40 UTC - mysqld got exception 0x80000003 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
key_buffer_size=8388608
read_buffer_size=65536
max_used_connections=369
max_threads=2800
thread_count=263
connection_count=263
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 3195125 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x2ee2b72b0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
13fe1bad2 mysqld.exe!my_sigabrt_handler()[my_thr_init.c:449]
1401c7979 mysqld.exe!raise()[winsig.c:587]
1401c6870 mysqld.exe!abort()[abort.c:82]
13ff1dd38 mysqld.exe!ut_dbg_assertion_failed()[ut0dbg.cc:67]
13ff1df51 mysqld.exe!ib::fatal::~fatal()[ut0ut.cc:916]
13ff0e008 mysqld.exe!buf_LRU_check_size_of_non_data_objects()[buf0lru.cc:1219]
13ff0f4ab mysqld.exe!buf_LRU_get_free_block()[buf0lru.cc:1303]
1400305cb mysqld.exe!buf_block_alloc()[buf0buf.cc:557]
13ff3767e mysqld.exe!mem_heap_create_block_func()[mem0mem.cc:319]
13ff37499 mysqld.exe!mem_heap_add_block()[mem0mem.cc:408]
13ffd87f4 mysqld.exe!RecLock::lock_alloc()[lock0lock.cc:1441]
13ffd795c mysqld.exe!RecLock::create()[lock0lock.cc:1534]
13ffd73a6 mysqld.exe!RecLock::add_to_waitq()[lock0lock.cc:1735]
13ffdcaaa mysqld.exe!lock_rec_lock_slow()[lock0lock.cc:2007]
13ffdc6ce mysqld.exe!lock_rec_lock()[lock0lock.cc:2081]
13ffd8cc7 mysqld.exe!lock_clust_rec_read_check_and_lock()[lock0lock.cc:6307]
140076fe3 mysqld.exe!row_ins_set_shared_rec_lock()[row0ins.cc:1502]
140072927 mysqld.exe!row_ins_check_foreign_constraint()[row0ins.cc:1739]
140072de8 mysqld.exe!row_ins_check_foreign_constraints()[row0ins.cc:1932]
140075d69 mysqld.exe!row_ins_sec_index_entry()[row0ins.cc:3356]
1400758a6 mysqld.exe!row_ins_index_entry_step()[row0ins.cc:3583]
140071b30 mysqld.exe!row_ins()[row0ins.cc:3721]
14007755a mysqld.exe!row_ins_step()[row0ins.cc:3907]
13ffaad50 mysqld.exe!row_insert_for_mysql_using_ins_graph()[row0mysql.cc:1735]
13fe7a7d3 mysqld.exe!ha_innobase::write_row()[ha_innodb.cc:7489]
13f6e5531 mysqld.exe!handler::ha_write_row()[handler.cc:7891]
13f8e54de mysqld.exe!write_record()[sql_insert.cc:1860]
13f8e916a mysqld.exe!read_sep_field()[sql_load.cc:1222]
13f8e7af4 mysqld.exe!mysql_load()[sql_load.cc:563]
13f716e86 mysqld.exe!mysql_execute_command()[sql_parse.cc:3649]
13f7194b3 mysqld.exe!mysql_parse()[sql_parse.cc:5565]
13f71267d mysqld.exe!dispatch_command()[sql_parse.cc:1430]
13f71368a mysqld.exe!do_command()[sql_parse.cc:997]
13f6d82bc mysqld.exe!handle_connection()[connection_handler_per_thread.cc:300]
140105122 mysqld.exe!pfs_spawn_thread()[pfs.cc:2191]
13fe1b93b mysqld.exe!win_thread_start()[my_thread.c:38]
1401c73ef mysqld.exe!_callthreadstartex()[threadex.c:376]
1401c763a mysqld.exe!_threadstartex()[threadex.c:354]
772859bd kernel32.dll!BaseThreadInitThunk()
773ba2e1 ntdll.dll!RtlUserThreadStart()
At the time of the crash only 2 transactions were active:
---TRANSACTION 1111758443, ACTIVE 565 sec
mysql tables in use 7, locked 7
7527 lock struct(s), heap size 876752, 721803 row lock(s), undo log entries 379321
MySQL thread id 166068, OS thread handle 1508, query id 112695582 localhost converter Waiting for table level lock
delete from pl
using
import_k2b_product_links ipl inner join k2b_products pSource on ipl.src_product = pSource.article and pSource.account_id = 22
inner join k2b_products pDest on ipl.dst_product = pDest.article and pDest.account_id = 22
inner join k2b_product_links pl on pl.src_product_id = pSource.id and pl.dst_product = pDest.id
where ipl.action = 1
---TRANSACTION 1111759716, ACTIVE 496 sec inserting, thread declared inside InnoDB 1
mysql tables in use 4, locked 4
7 lock struct(s), heap size 1304535248, 102060778 row lock(s), undo log entries 1
MySQL thread id 19436, OS thread handle 11664, query id 112301161 localhost exchange_central
LOAD DATA INFILE 'd:/kdm/temp/webCentral/ufrd1uwx.v2r'
INTO TABLE k2b_orders
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
(id_status, dt, account_id, sms_sended, params, update_ts, exported, id_editor, dt_offset, device_id, gen, changer_device_id, total, creator_device_id, id, dt_server, device_category_id, original_params, order_num, sended, editor_comment, admin_comment)
I don't understand why transaction 1111758443 is waiting for a table-level lock.
And why does transaction 1111759716 hold 102,060,778 row locks while it loads just one row from the external file (its undo log shows only 1 entry)?
What should I investigate to find the reason for these enormous locks and the crash?
Thanks!
Two things make me think that the crash is not the 'real' problem.
Both queries in the log show 'huge' times, such as ACTIVE 565 sec.
And these are all quite large:
max_used_connections=369
max_threads=2800
thread_count=263
connection_count=263
When there are hundreds of threads simultaneously active, InnoDB stumbles over itself. Throughput stalls, and latency goes through the roof.
One cure is to avoid so many connections. This is sometimes best done at the client. What is the client? For example, Apache has MaxClients. A dozen Apaches, each with MaxClients = 50, would be trying to open 600 connections. Probably one Apache cannot effectively handle 50 threads at once. Lower that number.
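On the MySQL side, the current ceiling and high-water mark can be checked, and the limit capped, with something like this (the 200 is a hypothetical value; pick whatever your clients actually need):

SHOW GLOBAL VARIABLES LIKE 'max_connections';
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
-- cap concurrent connections at a lower value
SET GLOBAL max_connections = 200;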
Are there any VIEWs deceiving us?
Another thing to do is to pursue the table-level lock. Let's see SHOW CREATE TABLE for the tables involved. Check for appropriate indexes (a DDL sketch follows below):
import_k2b_product_links: INDEX(action, ...)
k2b_products: INDEX(account_id, src_product) -- in either order
k2b_products: INDEX(account_id, dest_product) -- in either order
k2b_product_links: INDEX(src_product_id, dest_product_id) -- or PK, see below
Is k2b_product_links a many:many mapping table? If so, get rid of the id auto_increment as discussed here.
The index suggestions, if useful, could speed up the DELETE, thereby cutting down on possible contention.
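If those indexes are missing, the DDL would look roughly like the following; the column lists are inferred from the join conditions in the DELETE above, so adjust them to the real schema:

ALTER TABLE import_k2b_product_links ADD INDEX idx_action_products (action, src_product, dst_product);
ALTER TABLE k2b_products ADD INDEX idx_account_article (account_id, article);
ALTER TABLE k2b_product_links ADD INDEX idx_src_dst (src_product_id, dst_product);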

Hive query does not begin MapReduce process after starting job and generating Tracking URL

I'm using Apache Hive.
I created a table in Hive (similar to an external table) and loaded data into it using the LOAD DATA LOCAL INPATH './Desktop/loc1/kv1.csv' OVERWRITE INTO TABLE adih; command.
While I am able to retrieve simple data from the hive table adih (e.g. select * from adih, select c_code from adih limit 1000, etc), Hive gives me errors when I ask for data involving slight computations (e.g. select count(*) from adih, select distinct(c_code) from adih).
The Hive CLI output is shown below:
hive> select distinct add_user from adih;
Query ID = latize_20161031155801_8922630f-0455-426b-aa3a-6507aa0014c6
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1477889812097_0006, Tracking URL = http://latize-data1:20005/proxy/application_1477889812097_0006/
Kill Command = /opt/hadoop-2.7.1/bin/hadoop job -kill job_1477889812097_0006
[6]+ Stopped $HIVE_HOME/bin/hive
Hive stops displaying any further logs / actions beyond the last line of "Kill Command"
Not sure where I have gone wrong; many answers on Stack Overflow tend to point back to YARN configs (my environment config is detailed below).
I have the full log as well, but it contains more than 30,000 characters (the Stack Overflow limit).
My Hadoop environment is configured as follows:
1 Name Node and 1 Data Node, each with 20 GB of RAM and sufficient disk space. I have allocated 13 GB each to yarn.scheduler.maximum-allocation-mb and yarn.nodemanager.resource.memory-mb, with mapreduce.map.memory.mb set to 4 GB and mapreduce.reduce.memory.mb set to 12 GB. The number of reducers is currently left at the default (-1). Also, Hive is configured to run with a MySQL DB (rather than Derby).
You should set appropriate values for the properties shown in your trace,
e.g. edit the properties in hive-site.xml:
<property>
    <name>hive.exec.reducers.bytes.per.reducer</name>
    <value>67108864</value>
</property>
Looks like you have set mapred.reduce.tasks = -1, which makes Hive refer to its config to decide the number of reduce tasks.
You are getting an error as the number of reducers is missing in Hive config.
Try setting it using the command below:
Hive> SET mapreduce.job.reduces=XX
As per the official documentation: the right number of reduces seems to be 0.95 or 1.75 multiplied by (<no. of nodes> * <no. of maximum containers per node>).
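As a worked example of that formula: with 1 data node and, hypothetically, 4 maximum containers per node, 0.95 * 1 * 4 is roughly 4, so you would run:

-- hypothetical cluster: 1 node, 4 max containers per node => 0.95 * 1 * 4 ≈ 4 reducers
SET mapreduce.job.reduces=4;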
I managed to get Hive and MR to work - increased the memory configurations for all the processes involved:
Increased the RAM allocated to YARN Scheduler and maximum RAM allocated to the YARN Nodemanager (in yarn-site.xml), alongside increasing the RAM allocated to the Mapper and Reducer (in mapred-site.xml).
Also incorporated parts of the answers by #Sathiyan S and #vmorusu - set the hive.exec.reducers.bytes.per.reducer property to 1 GB of data, which directly affects the number of reducers that Hive uses (through application of its heuristic techniques).
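For reference, the 1 GB figure mentioned above corresponds to the following setting (either run it in the Hive session or put the equivalent property in hive-site.xml):

-- 1 GB = 1073741824 bytes per reducer
SET hive.exec.reducers.bytes.per.reducer=1073741824;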

Cassandra .csv import error:batch too large

I'm trying to import data from a .csv file into Cassandra 3.2.1 via the COPY command. The file contains only 299 rows with 14 columns. I get this error:
Failed to import 299 rows: InvalidRequest - code=2200 [Invalid query] message="Batch too large"
I used the following COPY command and tried to increase the batch size:
copy table (Col1,Col2,...) from 'file.csv'
with delimiter =';' and header = true and MAXBATCHSIZE = 5000;
I think 299 rows are not too many to import into Cassandra, or am I wrong?
Adding the CHUNKSIZE keyword resolved the problem for me.
e.g.
copy event_stats_user from '/home/kiren/dumps/event_stats_user.csv ' with CHUNKSIZE=1 ;
The error you're encountering is a server-side error message, saying that the size (in terms of byte count) of your batch insert is too large.
This batch size is defined in the cassandra.yaml file:
# Log WARN on any batch size exceeding this value. 5kb per batch by default.
# Caution should be taken on increasing the size of this threshold as it can lead to node instability.
batch_size_warn_threshold_in_kb: 5
# Fail any batch exceeding this value. 50kb (10x warn threshold) by default.
batch_size_fail_threshold_in_kb: 50
If you insert a lot of big columns (in size) you may quickly reach this threshold. Try reducing MAXBATCHSIZE to 200.
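For example, something along these lines (keeping the delimiter and header options from the question, with hypothetical batch and chunk sizes):

copy table (Col1,Col2,...) from 'file.csv'
with delimiter = ';' and header = true and MAXBATCHSIZE = 200 and CHUNKSIZE = 50;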
More info on COPY options here

Overhead of data transfer from a MySQL database using PreparedStatement

This is not a question about query optimization. Rather, a sanity check about what to expect of data transfer rates from MySQL 5.5.27 (Amazon RDS).
When running a particularly heavy query, MySQL Workbench is showing data transfer rate of about 1MB/s and the query runs for about 420 seconds. This adds up to about 420M bytes of data being transferred.
If this data is saved into a simple text file, the size of the file ends up being less than 7M bytes. I certainly expected to see some overhead due to metadata of the ResultSet, JDBC driver mechanisms, etc. But 420M vs. 7M seems like an extraordinarily terrible ratio to me. Or, is this normal?
Any feedback is much appreciated.
Much thanks!
PS. More details:
- the JDBC driver is mysql-connector-java-5.1.13
- the data is transferred between Amazon RDS and an EC2 instance
- a Java 1.6 PreparedStatement is used to execute the query
Wireshark is a wonderful free and open-source (GPL) network analysis tool that can be used to great effect in cases like this. I ran the following test to see how much traffic a "typical" JDBC connection to a "normal" MySQL server might generate.
I created a table named jdbctest in MySQL (5.5.29-0ubuntu0.12.04.2) on my test server.
CREATE TABLE `jdbctest` (
`id` int(11) DEFAULT NULL,
`textcol` varchar(6) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I populated it with 100,000 rows of the form
    id    textcol
------    -------
     1    ABCDEF
     2    ABCDEF
     3    ABCDEF
   ...
100000    ABCDEF
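For completeness, one way such a table could be populated (a sketch, not necessarily how it was done here) is a simple stored-procedure loop:

DELIMITER //
CREATE PROCEDURE fill_jdbctest()
BEGIN
  DECLARE i INT DEFAULT 1;
  WHILE i <= 100000 DO
    INSERT INTO jdbctest (id, textcol) VALUES (i, 'ABCDEF');
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;

CALL fill_jdbctest();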
At 4 bytes per id value and 6 bytes per textcol value, retrieving all 100,000 rows should represent somewhere on the order of 1 MB of data.
I fired up Wireshark, started a trace, and ran the following Java code which uses mysql-connector-java-5.1.26:
import java.sql.*;

public class mysqlTestMain {
    static Connection dbConnection = null;

    public static void main(String[] args) {
        try {
            String myConnectionString = "";
            myConnectionString =
                    "jdbc:mysql://192.168.1.3:3306/mytestdb";
            dbConnection = DriverManager.getConnection(myConnectionString, "root", "whatever");
            PreparedStatement stmt = dbConnection.prepareStatement("SELECT * FROM jdbctest");
            ResultSet rs = stmt.executeQuery();
            int i = 0;
            int j = 0;
            String s = "";
            while (rs.next()) {
                i++;
                j = rs.getInt("id");
                s = rs.getString("textcol");
            }
            System.out.println(String.format("Finished reading %d rows.", i));
            rs.close();
            stmt.close();
            dbConnection.close();
        } catch (SQLException ex) {
            ex.printStackTrace();
        }
    }
}
The console output confirmed that I had retrieved all 100,000 rows.
Looking at the summary of the Wireshark trace, I found:
Packets captured: 1811
Avg. packet size: 992.708 bytes
Bytes: 1797795
The breakdown by direction was
                    packets      bytes
                    -------    -------
from me to server       636      36519
from server to me      1175    1761276
So it appears that to retrieve my ~1 MB of data I received 1.72 MB of total network traffic from the MySQL server. That ~72% overhead on the download (or ~76% including traffic in both directions) is certainly nowhere near the ~5900% overhead suggested by your (rate * time) calculation.
I strongly suspect that the ~1 MB/s rate being reported by MySQL Workbench is not the overall average transfer rate over the entire time. The best way to determine the overhead in your particular circumstance would be to use a tool like Wireshark and measure it yourself.

Only one node owns data in a Cassandra cluster

I am new to Cassandra and have just set up a cluster (version 1.2.8) with 5 nodes, and I have created several keyspaces and tables there. However, I found that all data is stored on one node (in the output below, I have replaced IP addresses with node numbers manually):
Datacenter: 105
==========
Address    Rack   Status   State    Load        Owns      Token
                                                          4
node-1     155    Up       Normal   249.89 KB   100.00%   0
node-2     155    Up       Normal   265.39 KB   0.00%     1
node-3     155    Up       Normal   262.31 KB   0.00%     2
node-4     155    Up       Normal   98.35 KB    0.00%     3
node-5     155    Up       Normal   113.58 KB   0.00%     4
In their cassandra.yaml files I use all default settings except cluster_name, initial_token, endpoint_snitch, listen_address, rpc_address, seeds, and internode_compression. Below I list the non-IP-address fields I modified:
endpoint_snitch: RackInferringSnitch
rpc_address: 0.0.0.0
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "node-1, node-2"
internode_compression: none
All nodes use the same seeds.
Can you tell me where I might have gone wrong in the config? Please feel free to let me know if any additional information is needed to figure out the problem.
Thank you!
If you are starting with Cassandra 1.2.8 you should try using the vnodes feature. Instead of setting the initial_token, uncomment # num_tokens: 256 in the cassandra.yaml, and leave initial_token blank, or comment it out. Then you don't have to calculate token positions. Each node will randomly assign itself 256 tokens, and your cluster will be mostly balanced (within a few %). Using vnodes will also mean that you don't have to "rebalance" your cluster every time you add or remove nodes.
See this blog post for a full description of vnodes and how they work:
http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2
Your token assignment is the problem here. The assigned token determines the node's position in the ring and the range of data it stores. When you generate tokens the aim is to use up the entire range from 0 to (2^127 - 1). Tokens aren't IDs like in a MySQL cluster where you have to increment them sequentially.
There is a tool on git that can help you calculate the tokens based on the size of your cluster.
Read this article to gain a deeper understanding of the tokens. And if you want to understand the meaning of the numbers that are generated check this article out.
You should provide a replication_factor when creating a keyspace:
CREATE KEYSPACE demodb
WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor': 3};
If you use DESCRIBE KEYSPACE x in cqlsh you'll see what replication_factor is currently set for your keyspace (I assume the answer is 1).
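To check and, if needed, change it (reusing the example keyspace name from above; note that after raising the replication factor on a live cluster you should also run a repair):

DESCRIBE KEYSPACE demodb;

ALTER KEYSPACE demodb
WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor': 3};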
More details here