Kafka MySQL connector configuration file does not work when query is not provided - mysql

I have a configuration file for the Kafka JDBC connector which reads data from a MySQL database perfectly fine:
name=local-jbdc
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:mysql://localhost:3306/book
connection.user=root
connection.password=newpass
topic.prefix=quickstart-events
mode=incrementing
incrementing.column.name=__id
query=select * from book_table
offset.flush.timeout.ms=5000
buffer.memory=200
poll.interval.ms=10000
tasks.max=1
Now when I take out the query and provide table.whitelist instead, it does not read anything - not even an error.
The configuration is shown below:
name=local-jbdc
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:mysql://localhost:3306/book
connection.user=root
connection.password=newpass
topic.prefix=quickstart-events
mode=incrementing
incrementing.column.name=__id
table.whitelist=book_table
offset.flush.timeout.ms=5000
buffer.memory=200
poll.interval.ms=10000
tasks.max=1
Can someone help me understand the root cause of this problem? Also, how will I be able to use incrementing mode for multiple tables?
Edit:
When I stop Kafka with Ctrl+C on the keyboard, a log like this comes up:
[2020-11-30 12:35:38,057] INFO [ReplicaManager broker=0] Shut down completely (kafka.server.ReplicaManager)
[2020-11-30 12:35:38,058] INFO Shutting down. (kafka.log.LogManager)
[2020-11-30 12:35:38,106] INFO [ProducerStateManager partition=connect-status-4] Writing producer snapshot at offset 394 (kafka.log.ProducerStateManager)
[2020-11-30 12:35:38,158] INFO [ProducerStateManager partition=__consumer_offsets-18] Writing producer snapshot at offset 1 (kafka.log.ProducerStateManager)
[2020-11-30 12:35:38,219] INFO [ProducerStateManager partition=quickstart-eventsbook_table-0] Writing producer snapshot at offset 19645 (kafka.log.ProducerStateManager)
[2020-11-30 12:35:38,239] INFO [ProducerStateManager partition=quickstart-book_table-0] Writing producer snapshot at offset 2652 (kafka.log.ProducerStateManager)

The problem was pretty simple. When table.whitelist is provided, the connector creates a topic per table, named topic.prefix followed by the table name. In my case it created a new topic named quickstart-eventsbook_table. When query is provided, topic.prefix is treated as the single topic to send data to.
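Based on that, here is a sketch of a whitelist-based config (the second table name and the trailing hyphen on the prefix are illustrative; with the Confluent JDBC source connector, every whitelisted table needs to contain the configured incrementing column):
name=local-jdbc
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:mysql://localhost:3306/book
connection.user=root
connection.password=newpass
mode=incrementing
incrementing.column.name=__id
table.whitelist=book_table,another_table
topic.prefix=quickstart-events-
tasks.max=1
With this, rows from book_table land in quickstart-events-book_table and rows from another_table in quickstart-events-another_table.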

Related

SpringApplication.run() loads entire database

I am working on a Spring Boot, MySQL, JavaFX client-server application - no web - and had a surprising effect: although I didn't alter any entity from the UI, I got an ObjectOptimisticLockingFailureException saying "Row was updated or deleted by another transaction". So I was wondering what - if not me - is updating this entity, and started to debug by switching on
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.use_sql_comments=true
spring.jpa.properties.hibernate.format_sql=true
logging.level.org.hibernate.type=TRACE
in my property file to see what is going on between the application and the database. What I found is something that I don't understand at all:
When the application starts, right at the beginning, before any of my code is called as far as I can tell, SpringApplication.run(..) is called:
@Override
public void init() throws Exception {
    springContext = SpringApplication.run(ProdMgrApp.class);
    ..
}
When I execute this command in the debugger - but also when I don't run the application from the debugger - the application generates 563 thousand (!) lines of SQL code, basically querying the entire database: a couple of thousand selects, over 100 updates, and about 400 inserts. Interestingly, despite the insert statements, the database content is not doubled or extended in any way. But the integer version information for optimistic locking (@Version) is increasing. In a way it doesn't harm, but it takes a while - even without the debugging statements going to the console - and once the database grows, this is a no-go.
What am I doing wrong?
Although I have been working with Spring Boot for a while now, and in particular with the JPA part, I am still far away from being an expert. Let me know should you need more information.
EDIT:
I debugged a bit and realized that because I am combining JavaFX and Spring Boot, the startup of the application differs from the "normal" setup. In a non-JavaFX application the SpringApplication.run() call is located in main(). In a JavaFX application the call is located in init() - see also https://better-coding.com/javafx-spring-boot-gradle-project-setup-guide-and-test/ - and as a result, within SpringApplication, deduceMainApplicationClass() will return null. Could that be the root cause?
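For context, here is a minimal sketch of that JavaFX + Spring Boot wiring, following the pattern from the linked article (only ProdMgrApp and the init() body are taken from the question; everything else is illustrative):
import javafx.application.Application;
import javafx.stage.Stage;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.ConfigurableApplicationContext;

@SpringBootApplication
public class ProdMgrApp extends Application {

    private ConfigurableApplicationContext springContext;

    @Override
    public void init() throws Exception {
        // The Spring context is bootstrapped here instead of in main(),
        // which is the difference from a "normal" Spring Boot application.
        springContext = SpringApplication.run(ProdMgrApp.class);
    }

    @Override
    public void start(Stage primaryStage) {
        // Build the JavaFX UI here, pulling beans from springContext as needed.
    }

    @Override
    public void stop() {
        springContext.close();
    }

    public static void main(String[] args) {
        launch(args);
    }
}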
The trace looks like this:
INFO 17:00 o.s.b.StartupInfoLogger.logStarting:50: Starting application on ThinkPad with PID 6664 (started by Alexander in C:\Users\Alexander\Documents\Codebase\agiletunes-codespace\agiletunes-productmanager)
INFO 17:00 o.s.b.SpringApplication.logStartupProfileInfo:646: No active profile set, falling back to default profiles: default
INFO 17:00 o.s.d.r.c.RepositoryConfigurationDelegate.registerRepositoriesIn:126: Bootstrapping Spring Data repositories in DEFAULT mode.
INFO 17:00 o.s.d.r.c.RepositoryConfigurationDelegate.registerRepositoriesIn:182: Finished Spring Data repository scanning in 1151ms. Found 39 repository interfaces.
INFO 17:00 c.z.h.HikariDataSource.getConnection:110: HikariPool-1 - Starting...
INFO 17:00 c.z.h.HikariDataSource.getConnection:123: HikariPool-1 - Start completed.
INFO 17:00 o.h.j.i.u.LogHelper.logPersistenceUnitInformation:31: HHH000204: Processing PersistenceUnitInfo [
name: default
...]
INFO 17:00 o.h.Version.logVersion:46: HHH000412: Hibernate Core {5.3.10.Final}
INFO 17:00 o.h.c.Environment.<clinit>:213: HHH000206: hibernate.properties not found
INFO 17:00 o.h.a.c.r.j.JavaReflectionManager.<clinit>:49: HCANN000001: Hibernate Commons Annotations {5.0.4.Final}
DEBUG 17:00 o.h.t.BasicTypeRegistry.register:156: Adding type registration boolean -> org.hibernate.type.BooleanType#5ffacf79
DEBUG 17:00 o.h.t.BasicTypeRegistry.register:156: Adding type registration boolean -> org.hibernate.type.BooleanType#5ffacf79
DEBUG 17:00 o.h.t.BasicTypeRegistry.register:156: Adding type registration java.lang.Boolean -> org.hibernate.type.BooleanType#5ffacf79
DEBUG 17:00 o.h.t.BasicTypeRegistry.register:156: Adding type registration numeric_boolean -> org.hibernate.type.NumericBooleanType#4058800
DEBUG 17:00 o.h.t.BasicTypeRegistry.register:156: Adding type registration true_false -> org.hibernate.type.TrueFalseType#46bb075a
DEBUG 17:00 o.h.t.BasicTypeRegistry.register:156: Adding type registration yes_no -> org.hibernate.type.YesNoType#7d390456
.. more lines of registrations and ParameterValues ..
DEBUG 17:01 o.h.t.EnumType.setParameterValues:126: Using NAMED-based conversion for Enum com.agiletunes.shared.domain.risk.Risk$Severity
DEBUG 17:01 o.h.t.EnumType.setParameterValues:126: Using NAMED-based conversion for Enum com.agiletunes.shared.domain.risk.Risk$Type
DEBUG 17:01 o.h.t.EnumType.setParameterValues:126: Using NAMED-based conversion for Enum com.agiletunes.shared.domain.risk.Risk$Type
TRACE 17:01 o.h.t.s.TypeConfiguration.sessionFactoryCreated:195: Handling #sessionFactoryCreated from [org.hibernate.internal.SessionFactoryImpl#25b9be87] for TypeConfiguration
INFO 17:01 o.s.o.j.AbstractEntityManagerFactoryBean.buildNativeEntityManagerFactory:415: Initialized JPA EntityManagerFactory for persistence unit 'default'
INFO 17:01 o.s.b.StartupInfoLogger.logStarted:59: Started application in 20.227 seconds (JVM running for 22.033)
INFO 17:01 o.h.h.i.QueryTranslatorFactoryInitiator.initiateService:47: HHH000397: Using ASTQueryTranslatorFactory
Hibernate:
/* select
generatedAlias0
from
Product as generatedAlias0 */ select
product0_.id as id2_59_,
product0_.goal as goal3_59_,
product0_.identifier as identifi4_59_,
product0_.level as level5_59_,
product0_.parent_id as parent_25_59_,
product0_.plannedBegin as plannedB6_59_,
product0_.plannedEnd as plannedE7_59_,
followed by thousands of lines of SQL
This is my property file:
#No JMX needed - disabling it allows for faster startup
spring.jmx.enabled=false
spring.main.banner-mode=off
#no web server needed
spring.main.web-application-type=none
# Properties can be queried in the code e.g. @Value(value = "${spring.datasource.driver-class-name}") private String message;
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.url=jdbc:mysql://127.0.0.1/agiletunesdb?useSSL=false&serverTimezone=Europe/Berlin&useUnicode=true&characterEncoding=utf-8&characterSetResults=utf-8
spring.datasource.username=YYYYYY
spring.datasource.password=XXXXXX
# create db schema
#spring.jpa.hibernate.ddl-auto=create
#spring.jpa.hibernate.ddl-auto=update
#---- Naming strategy: Use underscore instead of camel case
spring.jpa.hibernate.naming.physical-strategy=org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl
#---- Prevent use of deprecated [org.hibernate.id.MultipleHiLoPerTableGenerator] table-based id generator
spring.jpa.hibernate.use-new-id-generator-mappings=true
# The SQL dialect makes Hibernate generate better SQL for the chosen database
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQL5Dialect
#---- Show sql queries send to db
spring.jpa.show-sql=true
spring.jpa.properties.hibernate.use_sql_comments=true
#---- Print SQL statements spread over multiple lines for easier readability
spring.jpa.properties.hibernate.format_sql=true
#---- show parameter values in sql statements in addition to the "?" placeholders
logging.level.org.hibernate.type=TRACE
#---- Switch on colors
spring.output.ansi.enabled=ALWAYS
logging.pattern.console=%highlight(%5p) %d{HH:mm} %C{3}.%method:%L: %msg %n
The issue is that I have a CommandLineRunner which is supposed to create test data when the database is empty. This CommandLineRunner tries to read all entities from a certain table and, if nothing is returned, populates all tables. The trouble is that something is wrong with the lazy loading of the related entities. I have to investigate this, and if needed I'll open a new question.
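For illustration, here is a stripped-down version of that kind of seeder (the repository and entity names are hypothetical, not from the project above); checking count() avoids loading whole entity graphs just to decide whether seeding is needed:
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

@Component
public class TestDataSeeder implements CommandLineRunner {

    // hypothetical Spring Data repository for one of the entities
    private final ProductRepository productRepository;

    public TestDataSeeder(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    @Override
    public void run(String... args) {
        // count() issues a single "select count(*)" instead of fetching
        // (and potentially re-saving) entire entity graphs at startup.
        if (productRepository.count() == 0) {
            // populate test data here
        }
    }
}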

Writing from cascalog to MySQL does not work. How to debug this?

I'm trying to write the result of a Cascalog query into a MySQL database. For this, I'm using cascading-jdbc and following an example I found here. I'm using cascading-jdbc-core and cascading-jdbc-mysql in version 3.0.0.
I'm executing precisely this code from my REPL:
(let [data [["foo1" "bar1"]
            ["foo2" "bar2"]]
      query-params (into-array String ["?col1" "?col2"])
      column-names (into-array String ["col1" "col2"])
      update-params (into-array String ["?col1"])
      update-column-names (into-array String ["col1"])
      jdbc-tap (fn []
                 (let [scheme (JDBCScheme.
                                (Fields. query-params)
                                column-names
                                nil
                                (Fields. update-params)
                                update-column-names)
                       table-desc (TableDesc.
                                    "test_table"
                                    query-params
                                    column-names
                                    (into-array String []))
                       tap (JDBCTap.
                             "jdbc:mysql://192.168.99.101:3306/test_db?user=root&password=my-secret-pw"
                             "com.mysql.jdbc.Driver"
                             table-desc
                             scheme)]
                   tap))]
  (?<- (jdbc-tap)
       [?col1 ?col2]
       (data ?col1 ?col2)))
When I run the code, I see these logs inside the REPL:
15/12/11 11:08:44 INFO hadoop.FlowMapper: sinking to: JDBCTap{connectionUrl='jdbc:mysql://192.168.99.101:3306/test_db?user=root&password=my-secret-pw', driverClassName='com.mysql.jdbc.Driver', tableDesc=TableDesc{tableName='test_table', columnNames=[?col1, ?col2], columnDefs=[col1, col2], primaryKeys=[]}}
15/12/11 11:08:44 INFO mapred.Task: Task:attempt_local1324562503_0006_m_000000_0 is done. And is in the process of commiting
15/12/11 11:08:44 INFO mapred.LocalJobRunner:
15/12/11 11:08:44 INFO mapred.Task: Task 'attempt_local1324562503_0006_m_000000_0' done.
15/12/11 11:08:44 INFO mapred.LocalJobRunner: Finishing task: attempt_local1324562503_0006_m_000000_0
15/12/11 11:08:44 INFO mapred.LocalJobRunner: Map task executor complete.
Everything looks fine. However, no data is written. I checked with tcpdump that not even a connection to my local MySQL database is being established. Also, when I change the JDBC connection string to obviously wrong values (user names that do not exist, a non-existing DB name and even a non-existing IP for the DB server), I get the same logs, which do not complain about anything.
Also, changing the jdbc-tap to stdout produces the expected values.
I do not know how to debug this at all. Is there a way to produce error output? Right now, I have no clue what is going wrong.
As it turns out, I was using the wrong version of cascading-jdbc. Cascalog 2.1.1 uses Cascading 2.5.3, so switching cascading-jdbc to a 2.5 version fixed the problem.
I was not able to see this from the error messages though (as there were none). One of the developers of cascading-jdbc was kind enough to point this out to me.
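For illustration, the fix amounts to pinning cascading-jdbc to a 2.5.x release in project.clj; the group ID and the placeholder version below are assumptions, so check the artifact repository for the release that matches Cascading 2.5.3:
;; project.clj (sketch) - coordinates are assumed, version is a placeholder
:dependencies [[cascalog "2.1.1"]
               [cascading/cascading-jdbc-core "2.5.x"]
               [cascading/cascading-jdbc-mysql "2.5.x"]]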

How to test whether log compaction is working or not in Kafka?

I have made changes to the server.properties file in Kafka 0.8.1.1, i.e. added log.cleaner.enable=true, and also enabled cleanup.policy=compact while creating the topic.
Now, while testing it, I pushed the following messages to the topic as (Key, Message) pairs:
Offset: 1 - (123, abc);
Offset: 2 - (234, def);
Offset: 3 - (345, ghi);
Offset: 4 - (123, changed)
I pushed the 4th message with the same key as an earlier input but a changed message. Here log compaction should come into the picture. Using the Kafka tool, I can see all 4 offsets in the topic. How can I tell whether log compaction is working or not? Should the earlier message be deleted, or is log compaction working fine since the new message has been pushed?
Does it have anything to do with the log.retention.hours, topic.log.retention.hours or log.retention.size configurations? What is the role of these configs in log compaction?
P.S. - I have thoroughly gone through the Apache Documentation, but still it is not clear.
Even though this question is a few months old, I just came across it while doing research for my own question. I had created a minimal example to see how compaction works with Java; maybe it is helpful for you too:
https://gist.github.com/anonymous/f78184eaeec3ee82b15182aec24a432a
Furthermore, consulting the documentation, I used the following configuration on a topic level for compaction to kick in as quickly as possible:
min.cleanable.dirty.ratio=0.01
cleanup.policy=compact
segment.ms=100
delete.retention.ms=100
When run, this class shows that compaction works - there is only ever one message with the same key on the topic.
With the appropriate settings, this is also reproducible on the command line, for example:
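Here is a sketch of creating such a compacted test topic with the stock CLI tools (the topic name is illustrative, and older Kafka CLIs take --zookeeper where newer ones use --bootstrap-server):
kafka-topics.sh --create --zookeeper localhost:2181 \
  --topic compaction-test --partitions 1 --replication-factor 1 \
  --config cleanup.policy=compact \
  --config min.cleanable.dirty.ratio=0.01 \
  --config segment.ms=100 \
  --config delete.retention.ms=100
Producing several messages with the same key and then consuming the topic from the beginning should eventually show only the latest value per key.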
Actually, log compaction is only visible when the number of messages reaches a very high count, e.g. 1 million. So, if you have that much data, it's good. Otherwise, using configuration changes, you can reduce this limit to, say, 100 messages, and then you can see that of the messages with the same key, only the latest message remains and the previous one is deleted. It is better to use log compaction if you have a full snapshot of your data every time; otherwise you may lose the previous logs with the same associated key, which might still be useful.
To check a topic's properties from the CLI you can use the kafka-topics command:
https://grokbase.com/t/kafka/users/14aev0snbd/command-line-tool-for-topic-metadata
It is also worth taking a look at log.roll.hours, which by default is 168 hours. In simple words: even if your topic is not very active and you cannot fill the maximum segment size (by default 1G for normal topics and 100M for the offsets topic) within a week, you will still get a closed segment with a size below log.segment.bytes. This segment can be compacted on the next run.
You can do it with the kafka-topics CLI.
I'm running it from Docker (confluentinc/cp-enterprise-kafka:6.0.0).
$ docker-compose exec kafka kafka-topics --zookeeper zookeeper:32181 --describe --topic count-colors-output
Topic: count-colors-output PartitionCount: 1 ReplicationFactor: 1 Configs: cleanup.policy=compact,segment.ms=100,min.cleanable.dirty.ratio=0.01,delete.retention.ms=100
Topic: count-colors-output Partition: 0 Leader: 1 Replicas: 1 Isr: 1
But don't get confused if you don't see anything in the Configs field; that happens when the default values are used. So, unless you see cleanup.policy=compact in the output, the topic is not compacted.

How can I create a parquet file bigger than node's assigned memory?

I'm trying to create a Parquet file from a table stored in MySQL. The source contains millions of rows, and I get a GC overhead limit exception after a couple of minutes.
Can Apache Drill be configured in a way that allows operations to spill to disk temporarily when there is no more RAM available?
These were my steps before getting the error:
Put the mysql jdbc connector inside jars/3rdparty
Execute sqlline.bat -u "jdbc:drill:zk=local"
Navigate to http://localhost:8047/storage
Configure a new storage pluggin to connect to mysql
Navigate to http://localhost:8047/query and execute the following queries
ALTER SESSION SET `store.format` = 'parquet';
ALTER SESSION SET `store.parquet.compression` = 'snappy';
create table dfs.tmp.`bigtable.parquet` as (select * from mysql.schema.bigtable)
Then I get the error and the application exits:
Node ran out of Heap memory, exiting.
java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:2149)
at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1956)
at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:3308)
at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:463)
at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3032)
at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2280)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2673)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2546)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2504)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1370)
at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
at org.apache.drill.exec.store.jdbc.JdbcRecordReader.setup(JdbcRecordReader.java:177)
at org.apache.drill.exec.physical.impl.ScanBatch.(ScanBatch.java:101)
at org.apache.drill.exec.physical.impl.ScanBatch.(ScanBatch.java:128)
at org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch(JdbcBatchCreator.java:40)
at org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch(JdbcBatchCreator.java:33)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:151)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
at org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131)
at org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174)
at org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:105)
at org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79)
at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230)
at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Check drill-env.sh located in <drill_installation_directory>/conf
By default, the values are:
DRILL_MAX_DIRECT_MEMORY="8G"
DRILL_HEAP="4G"
The default memory for a Drillbit is 8G, but Drill prefers 16G or more depending on the workload.
If you have sufficient RAM, you can configure it to 16G.
You can read about this in detail in Drill's documentation.
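For example, an edited drill-env.sh might look like this (the values are illustrative; size them to the RAM actually available on the node):
DRILL_MAX_DIRECT_MEMORY="16G"
DRILL_HEAP="8G"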

Neo4j server hangs every 2 hours consistently. Please help me understand if something is wrong with the configuration

We have a Neo4j graph database with around 60 million nodes and an equivalent number of relationships.
We have been facing consistent packet drops, delays in processing, and a completely hung server after 2 hours. We have to shut down and restart our servers every time this happens, and we are having trouble understanding where we went wrong with our configuration.
We are seeing the following kind of exceptions in the console.log file -
java.lang.IllegalStateException: s=DISPATCHED i=true a=null o.e.jetty.server.HttpConnection - HttpConnection#609c1158{FILLING}
java.lang.IllegalStateException: s=DISPATCHED i=true a=null o.e.j.util.thread.QueuedThreadPool
java.lang.IllegalStateException: org.eclipse.jetty.util.SharedBlockingCallback$BlockerTimeoutException
o.e.j.util.thread.QueuedThreadPool - Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$3#59d5a975 in
qtp1667455214{STARTED,14<=21<=21,i=0,q=58}
org.eclipse.jetty.server.Response - Committed before 500 org.neo4j.server.rest.repr.OutputFormat$1#39beaadf
o.e.jetty.servlet.ServletHandler - /db/data/cypher java.lang.IllegalStateException: Committed at
org.eclipse.jetty.server.Response.resetBuffer(Response.java:1253)
~[jetty-server-9.2.
org.eclipse.jetty.server.HttpChannel - /db/data/cypher java.lang.IllegalStateException: Committed at
org.eclipse.jetty.server.Response.resetBuffer(Response.java:1253)
~[jetty-server-9.2.
org.eclipse.jetty.server.HttpChannel - Could not send response error 500: java.lang.IllegalStateException: Committed
o.e.jetty.server.ServerConnector - Stopped
o.e.jetty.servlet.ServletHandler - /db/data/cypher org.neo4j.graphdb.TransactionFailureException: Transaction was marked
as successful, but unable to commit transaction so rolled back.
We are using Neo4j Enterprise Edition 2.2.5 server in SINGLE/NON-CLUSTER mode on an Azure D-series machine (8-core CPU, 56 GB RAM, Ubuntu 14.04 LTS) with an attached 500 GB data disk.
Here is a snapshot of the sizes of neostore files
8.5G Oct 2 15:48 neostore.propertystore.db
15G Oct 2 15:48 neostore.relationshipstore.db
2.5G Oct 2 15:48 neostore.nodestore.db
6.9M Oct 2 15:48 neostore.relationshipgroupstore.db
3.7K Oct 2 15:07 neostore.schemastore.db
145 Oct 2 15:07 neostore.labeltokenstore.db
170 Oct 2 15:07 neostore.relationshiptypestore.db
The Neo4j configuration is as follows -
Allocated 30GB to file buffer cache (dbms.pagecache.memory=30G)
Allocated 20GB to JVM heap memory (wrapper.java.initmemory=20480, wrapper.java.maxmemory=20480)
Using the default hpc(High performance) type cache.
Forcing the RULE planner by default (dbms.cypher.planner=RULE)
Maximum threads processing queries is 16 (twice the number of cores) - org.neo4j.server.webserver.maxthreads=16
Transaction timeout of 60 seconds - org.neo4j.server.transaction.timeout=60
Guard Timeout if query execution time is greater than 10 seconds - org.neo4j.server.webserver.limit.executiontime=10000
Rest of the settings are default
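For reference, a sketch of how these settings map onto the standard 2.2.x configuration files (file layout per a default install, values as listed above; the cache type is left at its default and needs no explicit entry):
# conf/neo4j.properties
dbms.pagecache.memory=30G
dbms.cypher.planner=RULE

# conf/neo4j-wrapper.conf
wrapper.java.initmemory=20480
wrapper.java.maxmemory=20480

# conf/neo4j-server.properties
org.neo4j.server.webserver.maxthreads=16
org.neo4j.server.transaction.timeout=60
org.neo4j.server.webserver.limit.executiontime=10000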
We actually want to set up a cluster of 3 nodes, but before that we want to be sure that our basic configuration is correct. Please help us.
--------------------------------------------------------------------------
EDITED to ADD Query Sample
Typically our Cypher query frequency is 18K queries an hour, with an average of roughly 5-6 queries a second. There are also times when there are about 80 queries per second.
Our Typical Queries look like the ones below
match (a:TypeA {param:{param}})-[:RELA]->(d:TypeD) with distinct d,a skip {skip} limit 100 optional match (d)-[:RELF]->(c:TypeC)<-[:RELF]-(b:TypeB)<-[:RELB]-(a) with distinct d,a,collect(distinct b.bid) as bids,collect(distinct c.param3) as param3Coll optional match (d)-[:RELE]->(p:TypeE)<-[:RELE]-(b1:TypeB)<-[:RELB]-(a) with distinct d as distD,bids+collect(distinct b1.bid) as tbids,param3Coll,collect(distinct p.param4) as param4Coll optional match (distD)-[:RELC]->(f:TypeF) return id(distD),distD.param5,exists((distD)<-[:RELG]-()) as param6, tbids,param3Coll,param4Coll,collect(distinct id(f)) as fids
match (a:TypeA {param:{param}})-[:RELB]->(b) return count(distinct b)
MATCH (a:TypeA{param:{param}})-[r:RELD]->(a1)-[:RELH]->(h) where r.param1=true with a,a1,h match (h)-[:RELL]->(d:TypeI) where (d.param2/2)%2=1 optional match (a)-[:RELB]-(b)-[:RELM {param3:true}]->(c) return a1.param,id(a1),collect(b.bid),c.param5
match (a:TypeA {param:{param}}) match (a)-[:RELB]->(b) with distinct b,a skip {skip} limit 100 match (a)-[:RELH]->(h1:TypeH) match (b)-[:RELF|RELE]->(x)<-[:RELF|RELE]-(h2:TypeH)<-[:RELH]-(a1) optional match (a1)<-[rd:RELD]-(a) with distinct a1,a,h1,b,h2,rd.param1 as param2,collect(distinct x.param3) as param3s,collect(distinct x.param4) as param4s optional match (a1)-[:RELB]->(b1) where b1.param7 in [0,1] and exists((b1)-[:RELF|RELE]->()<-[:RELF|RELE]-(h1)) with distinct a1,a,b,h2,param2,param3s,param4s,b1,case when param2 then false else case when ((a1.param5 in [2,3] or length(param3s)>0) or (a1.param5 in [1,3] or length(param4s)>0)) then case when b1.param7=0 then false else true end else false end end as param8 MERGE (a)-[r2:RELD]->(a1) on create set r2.param6=true on match set r2.param6=case when param8=true and r2.param9=false then true else false end MERGE (b)-[r3:RELM]->(h2) SET r2.param9=param8, r3.param9=param8
MATCH (a:TypeA {param:{param}})-[:RELI]->(g:TypeG {type:'type1'}) match (g)<-[r:RELI]-(a1:TypeA)-[:RELJ]->(j)-[:RELK]->(g) return distinct g, collect(j.displayName), collect(r.param1), g.gid, collect(a1.param),collect(id(a1))
match (a:TypeA {param:{param}})-[r:RELD {param2:true}]->(a1:TypeA)-[:RELH]->(b:TypeE) remove r.param2 return id(a1),b.displayName, b.firstName,b.lastName
match (a:TypeA {param:{param}})-[:RELA]->(b:TypeB) return a.param1,count(distinct id(b))
MATCH (a:TypeA {param:{param}}) set a.param1=true;
match (a:TypeE)<-[r:RELE]-(b:TypeB) where a.param4 in {param4s} delete r return count(b);
MATCH (a:TypeA {param:{param}}) return id(a);
A few more strange things I have been noticing:
I have stopped all my web servers, so currently there are no incoming requests to Neo4j. However, I see that there are about 40K open file handles in TCP CLOSE_WAIT state, implying the clients have closed their connections because of timeouts and Neo4j has not processed and responded to those requests. I also see (from messages.log) that the Neo4j server is still processing queries, and as it does this, the number of open file handles is slowly decreasing. By the time I write this post there are about 27K open file handles in TCP CLOSE_WAIT state.
I also see that the queries are not processed continuously. Every once in a while I see a pause in messages.log, along with these messages about log rotation because of some out-of-order sequence, as below:
Rotating log version:5630
2015-10-04 05:10:42.712+0000 INFO [o.n.k.LogRotationImpl]: Log Rotation [5630]: Awaiting all transactions closed...
2015-10-04 05:10:42.712+0000 INFO [o.n.k.i.s.StoreFactory]: Waiting for all transactions to close... committed: out-of-order-sequence:95494483 [95494476] committing: 95494483 closed: out-of-order-sequence:95494480 [95494246]
2015-10-04 05:10:43.293+0000 INFO [o.n.k.LogRotationImpl]: Log Rotation [5630]: Starting store flush...
2015-10-04 05:10:44.941+0000 INFO [o.n.k.i.s.StoreFactory]: About to rotate counts store at transaction 95494483 to [/datadrive/graph.db/neostore.counts.db.b], from [/datadrive/graph.db/neostore.counts.db.a].
2015-10-04 05:10:44.944+0000 INFO [o.n.k.i.s.StoreFactory]: Successfully rotated counts store at transaction 95494483 to [/datadrive/graph.db/neostore.counts.db.b], from [/datadrive/graph.db/neostore.counts.db.a].
I also see these messages once in a while
2015-10-04 04:59:59.731+0000 DEBUG [o.n.k.EmbeddedGraphDatabase]: NodeCache array:66890956 purge:93 size:1.3485746GiB misses:0.80978173% collisions:1.9829895% (345785) av.purge waits:13 purge waits:0 avg. purge time:110ms
or
2015-10-04 05:10:20.768+0000 DEBUG [o.n.k.EmbeddedGraphDatabase]: RelationshipCache array:66890956 purge:0 size:257.883MiB misses:10.522135% collisions:11.121769% (5442101) av.purge waits:0 purge waits:0 avg. purge time:N/A
All of this is happening when there are no incoming requests and Neo4j is processing the old pending 40K requests, as I mentioned above.
Since it is a dedicated server, shouldn't it be processing the queries continuously without building up such a large pending queue? Am I missing something here? Please help me.
I didn't go over your queries completely. You should examine each of the queries you send often by prefixing them with PROFILE or EXPLAIN to see the query plan and get an idea of how many database accesses they cause.
E.g. the second MATCH in the following query looks expensive, since its two patterns are not connected with each other:
MATCH (a:TypeA{param:{param}})-[r:RELD]->(a1)-[:RELH]->(h) where r.param1=true with a,a1,h match (m)-[:RELL]->(d:TypeI) where (d.param2/2)%2=1 optional match (a)-[:RELB]-(b)-[:RELM {param3:true}]->(c) return a1.param,id(a1),collect(b.bid),c.bPhoto
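As a concrete illustration, any of the queries listed above can be profiled by simply prefixing it, e.g. (the parameter is still supplied at execution time):
PROFILE MATCH (a:TypeA {param:{param}})-[:RELB]->(b) RETURN count(distinct b)
The resulting plan shows the operators along with rows and db hits, which makes expensive scans or cross products easy to spot.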
Also enable garbage collection logging in neo4j-wrapper.conf and check whether you're suffering from long pauses. If so, consider reducing the heap size.
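GC logging can typically be switched on with JVM flags in conf/neo4j-wrapper.conf along these lines (a sketch; the log path is illustrative and the exact flags depend on your JVM):
wrapper.java.additional=-Xloggc:data/log/neo4j-gc.log
wrapper.java.additional=-XX:+PrintGCDetails
wrapper.java.additional=-XX:+PrintGCDateStamps
wrapper.java.additional=-XX:+PrintGCApplicationStoppedTime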
It looks like this issue requires more research on your side, but here are some things from my experience.
TL;DR - I had a similar issue with my own unmanaged extension, where transactions were not properly handled.
Language/connector
What language/connector is used in your application?
You should verify that:
If a popular open-source library is used - that your application is using the latest version. There might be a bug in your connector.
If you have your own hand-written solution that works with the REST API - verify that ALL HTTP requests are closed on the client side.
Extension/plugins
It's quite easy to mess things up if custom-written extensions/plugins are used.
What should be checked:
All transactions are always closed (try-with-resources is used)
Neo4j settings
Verify your server configuration. For example, if you have a large value for org.neo4j.server.transaction.timeout and you don't handle transactions properly on the client side, you can end up with a lot of running transactions.
Monitoring
You are using the Enterprise version. That means you have access to JMX. It's a good idea to check information about active Locks & Transactions.
Another Neo4j version
Maybe you can try another Neo4j version. For example 2.3.0-M03.
This will give answers to questions like:
Is this Neo4j 2.2.5 bug?
Is this existing Neo4j installation misconfiguration?
Linux configuration
Check your Linux configuration.
What is in your /etc/sysctl.conf? Are there any invalid/unrelated settings?
Another server
You can try to spin up another server (e.g. a VM at DigitalOcean), deploy the database there, and load it with Gatling.
Maybe your server has some invalid configuration?
Try to get rid of everything that could be a cause of the problem, to make it easier to find the problem.