Big Cassandra query causes daemon to crash - configuration

I need to read ~2.5 GB of records from a Cassandra 1.1.6 database running on a CentOS release 6.3 virtual machine. When the daemon with its default out-of-the-box configuration is queried, I get this error:
INFO [Thread-2] 2012-10-30 20:05:13,345 CassandraDaemon.java (line 212) Listening for thrift clients...
WARN [ScheduledTasks:1] 2012-10-30 20:06:27,076 GCInspector.java (line 145) Heap is 0.8434091049049706 full. You may need to reduce memtable and/or
WARN [ScheduledTasks:1] 2012-10-30 20:06:27,077 StorageService.java (line 2855) Flushing CFS(Keyspace='system', ColumnFamily='Versions') to relieve m
INFO [ScheduledTasks:1] 2012-10-30 20:06:27,077 ColumnFamilyStore.java (line 659) Enqueuing flush of Memtable-Versions#1970754472(83/103 serialized/l
INFO [FlushWriter:2] 2012-10-30 20:06:27,078 Memtable.java (line 264) Writing Memtable-Versions#1970754472(83/103 serialized/live bytes, 3 ops)
INFO [FlushWriter:2] 2012-10-30 20:06:27,096 Memtable.java (line 305) Completed flushing /var/lib/cassandra/data/system/Versions/system-Versions-hf-1
WARN [ScheduledTasks:1] 2012-10-30 20:06:28,793 GCInspector.java (line 139) Heap is 0.9390217139392345 full. You may need to reduce memtable and/or
WARN [ScheduledTasks:1] 2012-10-30 20:06:28,794 AutoSavingCache.java (line 156) Reducing KeyCache capacity from 2075306 to 12 to reduce memory pressu
WARN [ScheduledTasks:1] 2012-10-30 20:06:28,794 GCInspector.java (line 145) Heap is 0.9390217139392345 full. You may need to reduce memtable and/or
INFO [ScheduledTasks:1] 2012-10-30 20:06:28,795 StorageService.java (line 2851) Unable to reduce heap usage since there are no dirty column families
WARN [ScheduledTasks:1] 2012-10-30 20:06:30,181 GCInspector.java (line 145) Heap is 0.9984246325381808 full. You may need to reduce memtable and/or
INFO [ScheduledTasks:1] 2012-10-30 20:06:30,182 StorageService.java (line 2851) Unable to reduce heap usage since there are no dirty column families
WARN [ScheduledTasks:1] 2012-10-30 20:06:34,740 GCInspector.java (line 145) Heap is 0.9983338780063149 full. You may need to reduce memtable and/or
INFO [ScheduledTasks:1] 2012-10-30 20:06:34,741 StorageService.java (line 2851) Unable to reduce heap usage since there are no dirty column families
ERROR [ReadStage:33] 2012-10-30 20:06:34,843 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ReadStage:33,5,main]
java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:323)
    at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:398)
    at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:363)
    at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:120)
    at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:255)
    at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:275)
    at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:232)
    at edu.stanford.ppl.concurrent.SnapTreeMap.<init>(SnapTreeMap.java:453)
    at org.apache.cassandra.db.AtomicSortedColumns$Holder.<init>(AtomicSortedColumns.java:311)
    at org.apache.cassandra.db.AtomicSortedColumns.<init>(AtomicSortedColumns.java:77)
    at org.apache.cassandra.db.AtomicSortedColumns.<init>(AtomicSortedColumns.java:48)
    at org.apache.cassandra.db.AtomicSortedColumns$1.fromSorted(AtomicSortedColumns.java:61)
    at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:399)
    at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:382)
    at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:377)
    at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:339)
    at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:79)
    at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:39)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:116)
    at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
I don't have much time to calculate heap/memtable sizes and the like, so I added
JVM_OPTS="-Xms4g -Xmx4g"
to the daemon config (the test server has 8 GB of RAM). The query fails again with
ERROR [ReadStage:1] 2012-10-30 20:46:22,417 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ReadStage:1,5,main]
java.lang.RuntimeException: java.lang.IllegalArgumentException
at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:71)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalArgumentException
at org.apache.cassandra.io.util.FastByteArrayOutputStream.<init>(FastByteArrayOutputStream.java:78)
at org.apache.cassandra.io.util.DataOutputBuffer.<init>(DataOutputBuffer.java:40)
at org.apache.cassandra.db.RangeSliceReply.getReply(RangeSliceReply.java:48)
at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:64)
... 4 more
and I can't debug further. Does anyone know how I can tweak Cassandra so that I can run this query? The database has ~500 super columns of ~7 MB each. I need to read them all and at some point hold them in memory for further processing (the client machine has 40 GB of RAM, so this is definitely not a lack of resources on the client side). The query result never makes it back to the API at all.

You should generally use 8 GB or even 16 GB of RAM for Cassandra, per the recommendations.
What query are you running? I know PlayOrm uses a cursor, so it hands you results in pieces and won't run out of memory. We have easily retrieved 100 GB of data from Cassandra into our client with PlayOrm, though we discard it as it streams back, and we have had no issues doing that.
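For reference, here is a minimal sketch of that cursor-style read using the raw Thrift API that Cassandra 1.1 exposes (this is not how PlayOrm does it internally, just the same idea by hand). The keyspace, column family, and page sizes below are made-up placeholders; the point is to pull a bounded number of rows per call and carry the last key forward instead of asking for everything at once.

import java.nio.ByteBuffer;
import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.KeyRange;
import org.apache.cassandra.thrift.KeySlice;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;

public class PagedSuperColumnReader {

    private static final int ROWS_PER_PAGE = 50;      // tune to what the heap tolerates
    private static final int COLUMNS_PER_ROW = 1000;  // cap on columns fetched per call

    // Page through a (hypothetical) super column family in bounded batches
    // instead of asking the server to materialize the whole 2.5 GB at once.
    public static void readAll(Cassandra.Client client) throws Exception {
        client.set_keyspace("MyKeyspace");                    // placeholder keyspace
        ColumnParent parent = new ColumnParent("MySuperCF");  // placeholder column family

        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(new SliceRange(
                ByteBuffer.wrap(new byte[0]),   // start of row
                ByteBuffer.wrap(new byte[0]),   // end of row
                false,                          // not reversed
                COLUMNS_PER_ROW));

        KeyRange range = new KeyRange(ROWS_PER_PAGE);
        range.setStart_key(new byte[0]);
        range.setEnd_key(new byte[0]);

        ByteBuffer lastKey = null;
        while (true) {
            List<KeySlice> page =
                    client.get_range_slices(parent, predicate, range, ConsistencyLevel.ONE);
            if (page.isEmpty()) {
                break;
            }
            for (KeySlice row : page) {
                // the first row of every page after the first repeats the last key
                if (lastKey != null && row.bufferForKey().equals(lastKey)) {
                    continue;
                }
                process(row);                   // hand off for further processing
            }
            if (page.size() < ROWS_PER_PAGE) {
                break;                          // short page: no more rows
            }
            lastKey = page.get(page.size() - 1).bufferForKey();
            range.setStart_key(lastKey);        // next page starts where this one ended
        }
    }

    private static void process(KeySlice row) {
        // application-specific processing
    }
}

With ~500 super columns of ~7 MB each, even 50 rows per page can be several hundred megabytes on the wire, so both constants are only starting points; keep the server's thrift_framed_transport_size_in_mb in mind as well. And if those 500 super columns all live under a single row key, the same idea applies within the row: advance the SliceRange start past the last super column read instead of paging by key.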
later,
Dean

Related

InnoDB: File (unknown): 'read' returned OS error 403. Cannot continue operation ; [ERROR] mysqld got exception 0x80000003 ;

I have been facing the following error recently. MySQL is hosted locally by XAMPP, and the error keeps recurring.
2020-04-28 7:28:12 260 [Warning] InnoDB: Retry attempts for reading partial data failed.
2020-04-28 7:28:12 260 [ERROR] InnoDB: Tried to read 16384 bytes at offset 49152, but was only able to read 0
2020-04-28 7:28:12 260 [ERROR] InnoDB: Operating system error number 203 in a file operation.
2020-04-28 7:28:12 260 [Note] InnoDB: Some operating system error numbers are described at https://mariadb.com/kb/en/library/operating-system-error-codes/
2020-04-28 7:28:12 260 [ERROR] InnoDB: File (unknown): 'read' returned OS error 403. Cannot continue operation
200428 7:28:17 [ERROR] mysqld got exception 0x80000003 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Server version: 10.3.16-MariaDB
key_buffer_size=33554432
read_buffer_size=262144
max_used_connections=11
max_threads=65537
thread_count=9
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 36501 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x2b34eecdd28
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
mysqld.exe!my_parameter_handler()
mysqld.exe!strxnmov()
mysqld.exe!strxnmov()
mysqld.exe!parse_user()
mysqld.exe!parse_user()
mysqld.exe!parse_user()
mysqld.exe!parse_user()
mysqld.exe!pthread_dummy()
mysqld.exe!pthread_dummy()
mysqld.exe!parse_user()
mysqld.exe!pthread_dummy()
mysqld.exe!pthread_dummy()
mysqld.exe!parse_user()
mysqld.exe!parse_user()
mysqld.exe!parse_user()
mysqld.exe!?ha_rnd_next#handler##QEAAHPEAE#Z()
mysqld.exe!?rr_sequential##YAHPEAUREAD_RECORD###Z()
mysqld.exe!?sub_select##YA?AW4enum_nested_loop_state##PEAVJOIN##PEAUst_join_table##_N#Z()
mysqld.exe!?disjoin#?$List#VItem####QEAAXPEAV1##Z()
mysqld.exe!?exec_inner#JOIN##QEAAXXZ()
mysqld.exe!?exec#JOIN##QEAAXXZ()
mysqld.exe!?mysql_select##YA_NPEAVTHD##PEAUTABLE_LIST##IAEAV?$List#VItem####PEAVItem##IPEAUst_order##434_KPEAVselect_result##PEAVst_select_lex_unit##PEAVst_select_lex###Z()
mysqld.exe!?handle_select##YA_NPEAVTHD##PEAULEX##PEAVselect_result##K#Z()
mysqld.exe!?execute_init_command##YAXPEAVTHD##PEAUst_mysql_lex_string##PEAUst_mysql_rwlock###Z()
mysqld.exe!?mysql_execute_command##YAHPEAVTHD###Z()
mysqld.exe!?mysql_parse##YAXPEAVTHD##PEADIPEAVParser_state##_N3#Z()
mysqld.exe!?dispatch_command##YA_NW4enum_server_command##PEAVTHD##PEADI_N3#Z()
mysqld.exe!?do_command##YA_NPEAVTHD###Z()
mysqld.exe!?pool_of_threads_scheduler##YAXPEAUscheduler_functions##PEAKPEAI#Z()
mysqld.exe!?tp_callback##YAXPEAUTP_connection###Z()
ntdll.dll!TpReleaseWait()
ntdll.dll!RtlInitializeResource()
KERNEL32.DLL!BaseThreadInitThunk()
ntdll.dll!RtlUserThreadStart()
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x2b34ef77140): SELECT `JCY_PROGRAM_BOM_ID`, `JCY_PROGRAM_BOM_MATERIAL_ID`, `JCY_PROGRAM_BOM_PROGRAM_ID`, `JCY_PROGRAM_BOM_CREATED_BY`, `JCY_PROGRAM_BOM_CREATED_DATETIME` FROM `jcy_program_bom`
where jcy_program_bom_material_id = 45 AND JCY_PROGRAM_BOM_PROGRAM_ID = 18
Connection ID (thread ID): 260
Status: NOT_KILLED
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
Once this happens, XAMPP's MySQL service is force-stopped and I have to restart it again.
Does anyone know how to fix or work around this issue? Any guide or comment would be much appreciated.

FreeBSD MySQL table_open_cache error

2019-03-19 01:43:26 22929 [Warning] Buffered warning: Could not increase number of max_open_files to more than 79992 (request: 4294967295)
2019-03-19 01:43:26 22929 [Warning] Buffered warning: Changed limits: table_open_cache: 39915 (requested 524288)
This is my problem when starting the MySQL service; the lines above are from the log. I have not yet been able to google a solution, because this is on FreeBSD.
It seems like you have hit this FreeBSD bug, which was reported for the package mysql57-server-5.7.12.
A workaround is to change the permissions of the file /usr/local/etc/mysql/my.cnf to something like 646 (the key point is that others need write permission).
You can also see this FreeBSD forum thread for more information.

Cassandra down after high traffic

I have a problem with Cassandra when traffic goes high: Cassandra crashes.
Here's what I got in system.log:
WARN 11:54:35 JNA link failure, one or more native method will be unavailable.
WARN 11:54:35 jemalloc shared library could not be preloaded to speed up memory allocations
WARN 11:54:35 JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
INFO 11:54:35 Initializing SIGAR library
INFO 11:54:35 Checked OS settings and found them configured for optimal performance.
INFO 11:54:35 Initializing system.schema_triggers
ERROR 11:54:36 Exiting due to error while processing commit log during initialization.
java.lang.InternalError: a fault occurred in a recent unsafe memory access operation in compiled Java code
    at org.apache.cassandra.db.commitlog.CommitLogDescriptor.writeHeader(CommitLogDescriptor.java:87) ~[apache-cassandra-2.2.4.jar:2.2.4]
    at org.apache.cassandra.db.commitlog.CommitLogSegment.<init>(CommitLogSegment.java:153) ~[apache-cassandra-2.2.4.jar:2.2.4]
    at org.apache.cassandra.db.commitlog.MemoryMappedSegment.<init>(MemoryMappedSegment.java:47) ~[apache-cassandra-2.2.4.jar:2.2.4]
    at org.apache.cassandra.db.commitlog.CommitLogSegment.createSegment(CommitLogSegment.java:121) ~[apache-cassandra-2.2.4.jar:2.2.4]
    at org.apache.cassandra.db.commitlog.CommitLogSegmentManager$1.runMayThrow(CommitLogSegmentManager.java:122) ~[apache-cassandra-2.2.4.jar:2.2.4]
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) [apache-cassandra-2.2.4.jar:2.2.4]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
and in debug.log
DEBUG [SharedPool-Worker-1] 2017-05-25 12:54:18,586 SliceQueryPager.java:92 - Querying next page of slice query; new filter: SliceQueryFilter [reversed=false, slices=[[, ]], count=5000, toGroup = 2]
WARN [SharedPool-Worker-2] 2017-05-25 12:54:18,658 SliceQueryFilter.java:307 - Read 2129 live and 27677 tombstone cells in RestCommSMSC.SLOT_MESSAGES_TABLE_2017_05_25 for key: 549031460 (see tombstone_warn_threshold). 5000 columns were requested, slices=[-]
DEBUG [SharedPool-Worker-1] 2017-05-25 12:54:18,808 AbstractQueryPager.java:95 - Fetched 2129 live rows
DEBUG [SharedPool-Worker-1] 2017-05-25 12:54:18,808 AbstractQueryPager.java:112 - Got result (2129) smaller than page size (5000), considering pager exhausted
DEBUG [SharedPool-Worker-1] 2017-05-25 12:54:18,808 AbstractQueryPager.java:133 - Remaining rows to page: 2147481518
INFO [main] 2017-05-25 12:54:34,826 YamlConfigurationLoader.java:92 - Loading settings from file:/opt/SMGS/apache-cassandra-2.2.4/conf/cassandra.yaml
INFO [main] 2017-05-25 12:54:34,923 YamlConfigurationLoader.java:135 - Node configuration
[authenticator=AllowAllAuthenticator; authorizer=AllowAllAuthorizer;
auto_snapshot=true; batch_size_fail_threshold_in_kb=50;
batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; cas_contention_timeout_in_ms=1000; client_encryption_options=<REDACTED>; cluster_name=Test Cluster; column_index_size_in_kb=64; commit_failure_policy=stop; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_period_in_ms=10000; compaction_large_partition_warning_threshold_mb=100; compaction_throughput_mb_per_sec=16; concurrent_counter_writes=32; concurrent_reads=32; concurrent_writes=32; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; cross_node_timeout=false; disk_failure_policy=stop; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=600000; dynamic_snitch_update_interval_in_ms=100; enable_user_defined_functions=false; endpoint_snitch=SimpleSnitch; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; incremental_backups=false; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; inter_dc_tcp_nodelay=false; internode_compression=all; key_cache_save_period=14400; key_cache_size_in_mb=null; listen_address=localhost; max_hint_window_in_ms=10800000; max_hints_delivery_threads=2; memtable_allocation_type=heap_buffers; native_transport_port=9042; num_tokens=256; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; permissions_validity_in_ms=2000; range_request_timeout_in_ms=50000; read_request_timeout_in_ms=10000; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_timeout_in_ms=50000; role_manager=CassandraRoleManager; roles_validity_in_ms=2000; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=localhost; rpc_keepalive=true; rpc_port=9160; rpc_server_type=sync; seed_provider=[{class_name=org.apache.cassandra.locator.SimpleSeedProvider, parameters=[{seeds=127.0.0.1}]}]; server_encryption_options<REDACTED>;snapshot_before_compaction=false; ssl_storage_port=7001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=true; storage_port=7000; thrift_framed_transport_size_in_mb=15; tombstone_failure_threshold=100000; tombstone_warn_threshold=5000; tracetype_query_ttl=86400; tracetype_repair_ttl=604800; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=60000; windows_timer_interval=1; write_request_timeout_in_ms=2000]
DEBUG [main] 2017-05-25 12:54:34,958 DatabaseDescriptor.java:296 - Syncing log with a period of 10000
INFO [main] 2017-05-25 12:54:34,958 DatabaseDescriptor.java:304 - DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
INFO [main] 2017-05-25 12:54:35,110 DatabaseDescriptor.java:409 - Global memtable on-heap threshold is enabled at 1991MB
INFO [main] 2017-05-25 12:54:35,110 DatabaseDescriptor.java:413 - Global memtable off-heap threshold is enabled at 1991MB
I don't know whether this problem is related to commit logs or not. In any case, in cassandra.yaml I am setting:
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
You can start your Cassandra with this command:
cd /ASMSC02/apache-cassandra-2.0.11/
nohup bin/cassandra
regards,
Hafiz

Using DBOutputFormat to write data to MySQL causes IOException

Recently I have been learning MapReduce and using it to write data to a MySQL database. There are two ways to do so: DBOutputFormat and Sqoop. I tried the first one (refer to here), but encountered a problem. The following is the error:
...
16/05/25 09:36:53 INFO mapred.LocalJobRunner: 3 / 3 copied.
16/05/25 09:36:53 INFO mapred.LocalJobRunner: reduce task executor complete.
16/05/25 09:36:53 WARN output.FileOutputCommitter: Output Path is null in cleanupJob()
16/05/25 09:36:53 WARN mapred.LocalJobRunner: job_local1404930626_0001
java.lang.Exception: java.io.IOException
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.io.IOException
at org.apache.hadoop.mapreduce.lib.db.DBOutputFormat.getRecordWriter(DBOutputFormat.java:185)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:540)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
16/05/25 09:36:54 INFO mapreduce.Job: Job job_local1404930626_0001 failed with state FAILED due to: NA
16/05/25 09:36:54 INFO mapreduce.Job: Counters: 38
File System Counters
FILE: Number of bytes read=32583
FILE: Number of bytes written=796446
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=402
HDFS: Number of bytes written=0
HDFS: Number of read operations=18
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
...
When I manually use JDBC to connect and insert data, it succeeds. I also notice that the map/reduce task executors complete, yet the job still hits the IOException, so I guess the problem is database-related.
My code is here. I would appreciate it if someone could help me figure out what the problem is.
Thanks in advance!
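Because the IOException is thrown from DBOutputFormat.getRecordWriter, which is where the JDBC connection is opened (and which wraps only the message of the underlying exception), it commonly hides a missing JDBC driver on the task classpath or a failed connection rather than anything in the map/reduce logic. Below is a minimal sketch of the driver-side setup to compare against; the URL, credentials, table, and column names are placeholders, not taken from the question.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;

public class MysqlExportJob {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Register the JDBC driver and connection details before creating the job.
        // If the mysql-connector jar is not available to the tasks (e.g. via -libjars
        // or job.addFileToClassPath), getRecordWriter fails with a bare IOException.
        DBConfiguration.configureDB(conf,
                "com.mysql.jdbc.Driver",            // driver class
                "jdbc:mysql://localhost:3306/mydb", // placeholder URL
                "user",                             // placeholder user
                "password");                        // placeholder password

        Job job = Job.getInstance(conf, "export-to-mysql");
        job.setJarByClass(MysqlExportJob.class);

        // Write reducer output into table my_table; the field list must match the
        // number of columns the DBWritable value class binds in write(PreparedStatement).
        job.setOutputFormatClass(DBOutputFormat.class);
        DBOutputFormat.setOutput(job, "my_table", "id", "name", "value");

        // Mapper/reducer and key/value classes omitted: they are whatever the
        // original job already uses, with the output value class implementing DBWritable.

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

If the driver does load, the next things to check are that the column count in setOutput matches what the value class writes, and that the MySQL user can actually connect from wherever the tasks run.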

Fatal exception in thread CompactionExecutor during Cassandra compaction

I have a 12-node Cassandra cluster on EC2 running cassandra-0.8.2.
During compaction I got the following exception, which caused a seed node to go down.
Below is the exception stack trace.
ERROR [CompactionExecutor:31] 2011-12-16 08:06:02,308 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[CompactionExecutor:31,1,main]
java.io.IOError: java.io.EOFException: EOF after 430959023 bytes out of 778986868
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:149)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:90)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:74)
at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:179)
at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144)
at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136)
at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39)
at org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
at org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
at org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
at org.apache.cassandra.db.compaction.CompactionManager.doCompactionWithoutSizeEstimation(CompactionManager.java:569)
at org.apache.cassandra.db.compaction.CompactionManager.doCompaction(CompactionManager.java:506)
at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:141)
at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:107)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException: EOF after 430959023 bytes out of 778986868
at org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229)
at org.apache.cassandra.io.sstable.IndexHelper.skipIndex(IndexHelper.java:63)
at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:141)
... 23 more
It says it is caused by java.io.EOFException.
Is this because of corrupt SSTables?
If so, how do I remove or repair those SSTables?
It looks like this is indeed caused by corrupt sstables (which may indicate a hardware problem). My recommendations:
Upgrade to the latest stable 0.8.x version of Cassandra. This will be a drop-in replacement for 0.8.2.
Run "nodetool scrub" on the machine having problems
Review http://www.datastax.com/docs/1.0/install/cluster_init -- I recommend two seed nodes per data center, but remember that a seed node is only consulted when restarting nodes, so it's not a big deal to have one down during normal operation