Response from uWSGI lost - MySQL

I have a Django app hosted via Nginx and uWSGI, with a remote MySQL DB. On a certain very simple request,
I randomly get a 502 with the error below in uwsgi.log. Sometimes it works, sometimes it doesn't, and I can't find any pattern that explains the failures. Can anybody help me understand what's going on and how to resolve it?
*** HARAKIRI ON WORKER 1 (pid: 26789, try: 1) ***
HARAKIRI: -- wchan> 0
*** backtrace of 26789 ***
uwsgi(uwsgi_backtrace+0x29) [0x451c09]
uwsgi(what_i_am_doing+0x19) [0x452069]
/lib64/libc.so.6 [0x399da302d0]
/lib64/libpthread.so.0(read+0x4b) [0x399e20daab]
/usr/lib64/mysql/libmysqlclient_r.so.15(vio_read+0x38) [0x2b03e853ab98]
/usr/lib64/mysql/libmysqlclient_r.so.15(vio_read_buff+0x43) [0x2b03e853abf3]
/usr/lib64/mysql/libmysqlclient_r.so.15 [0x2b03e853bdf8]
/usr/lib64/mysql/libmysqlclient_r.so.15(my_net_read+0x199) [0x2b03e853c1f9]
/usr/lib64/mysql/libmysqlclient_r.so.15(cli_safe_read+0x6f) [0x2b03e8535d5f]
/usr/lib64/mysql/libmysqlclient_r.so.15 [0x2b03e8536bc9]
/usr/lib64/mysql/libmysqlclient_r.so.15(mysql_real_query+0x1e) [0x2b03e853553e]
/array/purato/python/lib/python2.6/site-packages/MySQL_python-1.2.3-py2.6-linux-x86_64.egg/_mysql.so [0x2b03e82d27e9]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x6185) [0x2b03e18e36b5]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x679a) [0x2b03e18e3cca]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x679a) [0x2b03e18e3cca]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x8cf) [0x2b03e18e4c7f]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x56b3) [0x2b03e18e2be3]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x8cf) [0x2b03e18e4c7f]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x56b3) [0x2b03e18e2be3]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x8cf) [0x2b03e18e4c7f]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x56b3) [0x2b03e18e2be3]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x8cf) [0x2b03e18e4c7f]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x56b3) [0x2b03e18e2be3]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x679a) [0x2b03e18e3cca]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x8cf) [0x2b03e18e4c7f]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x56b3) [0x2b03e18e2be3]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x8cf) [0x2b03e18e4c7f]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x56b3) [0x2b03e18e2be3]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x679a) [0x2b03e18e3cca]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x8cf) [0x2b03e18e4c7f]
/array/purato/python/lib/libpython2.6.so.1.0 [0x2b03e187359c]
/array/purato/python/lib/libpython2.6.so.1.0(PyObject_Call+0x68) [0x2b03e1848548]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0xddd) [0x2b03e18de30d]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x8cf) [0x2b03e18e4c7f]
/array/purato/python/lib/libpython2.6.so.1.0 [0x2b03e187359c]
/array/purato/python/lib/libpython2.6.so.1.0(PyObject_Call+0x68) [0x2b03e1848548]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0xddd) [0x2b03e18de30d]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x679a) [0x2b03e18e3cca]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x8cf) [0x2b03e18e4c7f]
/array/purato/python/lib/libpython2.6.so.1.0 [0x2b03e187349d]
/array/purato/python/lib/libpython2.6.so.1.0(PyObject_Call+0x68) [0x2b03e1848548]
/array/purato/python/lib/libpython2.6.so.1.0 [0x2b03e1857f9f]
/array/purato/python/lib/libpython2.6.so.1.0(PyObject_Call+0x68) [0x2b03e1848548]
/array/purato/python/lib/libpython2.6.so.1.0 [0x2b03e18a2f4a]
/array/purato/python/lib/libpython2.6.so.1.0(PyObject_Call+0x68) [0x2b03e1848548]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalFrameEx+0x1127) [0x2b03e18de657]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_EvalCodeEx+0x8cf) [0x2b03e18e4c7f]
/array/purato/python/lib/libpython2.6.so.1.0 [0x2b03e187349d]
/array/purato/python/lib/libpython2.6.so.1.0(PyObject_Call+0x68) [0x2b03e1848548]
/array/purato/python/lib/libpython2.6.so.1.0(PyEval_CallObjectWithKeywords+0x56) [0x2b03e18dc906]
uwsgi(python_call+0x20) [0x45f240]
uwsgi(uwsgi_request_wsgi+0x11c) [0x4619ec]
uwsgi(wsgi_req_recv+0x8f) [0x41ef7f]
uwsgi(simple_loop_run+0xc5) [0x44d3c5]
uwsgi(uwsgi_ignition+0x132) [0x44ffc2]
uwsgi(uwsgi_worker_run+0x252) [0x450262]
uwsgi(uwsgi_start+0x13ad) [0x45169d]
uwsgi(main+0x1be6) [0x454f36]
/lib64/libc.so.6(__libc_start_main+0xf4) [0x399da1d9c4]
uwsgi [0x419fe9]
*** end of backtrace ***
HARAKIRI: --- uWSGI worker 1 (pid: 26789) WAS managing request /brizer/ since Tue Oct 13 12:01:51 2014 ---
*** HARAKIRI ON WORKER 1 (pid: 26789, try: 2) ***
DAMN ! worker 1 (pid: 26789) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 1 (new pid: 27845)
Does anybody know anything about this?

You have harakiri enabled in uWSGI. Try running without harakiri.

You have a MySQL query that takes more than 10 seconds to return its response. After 10 seconds harakiri is triggered and kills your worker process. You can of course increase the harakiri timeout or remove it, but that only masks a serious performance problem. You should fix the query generated by /v2/cost/. If you have difficulty identifying it, enable the MySQL slow query log.
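If the slow query log is not an option, the timing can also be measured on the application side. The following is only a minimal sketch, assuming a current Python 3 (the stack in the question is Python 2.6); the helper name `timed`, the logger name, and the 5-second threshold are hypothetical choices, picked to sit below the 10-second harakiri limit mentioned above.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("slow_db")  # hypothetical logger name


@contextmanager
def timed(label, warn_after=5.0, clock=time.monotonic):
    """Time the wrapped block and log a warning if it ran too long.

    warn_after is an arbitrary threshold; keep it below the harakiri
    timeout, because a worker killed by harakiri never reaches the
    logging code in the finally clause.
    """
    result = {"label": label, "elapsed": None, "slow": False}
    start = clock()
    try:
        yield result
    finally:
        result["elapsed"] = clock() - start
        result["slow"] = result["elapsed"] >= warn_after
        if result["slow"]:
            logger.warning("%s took %.2fs", label, result["elapsed"])
```

Wrapping the database call of the affected view in `with timed("cost view"):` would then record every run that is slow but still finishes, which narrows down which query to look at.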


Google Cloud Functions app doesn't write custom logs

Recently I've set up an application in Google Cloud Functions. It's a pet project, I just want to get a grasp on how things work on this platform. I'm trying to write custom application logs. I run the exact same code as given in an example here:
function entryPoint(ServerRequestInterface $request): string
{
    $log = fopen('php://stderr', 'wb');
    fwrite($log, "Log entry from fwrite().\n");
    fwrite($log, json_encode([
        'message' => 'Structured log with error severity',
        'severity' => 'error'
    ]) . PHP_EOL);
    fwrite(
        log,
        json_encode([
            'logName' => 'projects/lyrical-bolt-XXX/logs/cloudfunctions.googleapis.com%2Fcloud-functions',
            'resource' => [
                'type' => 'cloud_function',
                'labels' => [
                    'project_id' => 'lyrical-bolt-XXX',
                    'function_name' => 'index',
                    'region' => 'europe-central2'
                ]
            ],
            'textPayload' => 'Hello, Vasya! How are you??'
        ]) . PHP_EOL
    );
    return '';
}
Then I open Logs Explorer, but my logs are just not there.
Am I missing anything? Are there permissions that I need to grant? If so, where exactly can I do that?
Your code works for me. I replaced your 3rd log with a debug:
gcloud functions logs read test \
--region=${REGION} \
--project=${PROJECT}
Yields:
LEVEL NAME EXECUTION_ID TIME_UTC LOG
D test ppg5pq2nle4b 2021-07-07 17:43:46.364 Function execution took 9 ms, finished with status code: 200
test ppg5pq2nle4b 2021-07-07 17:43:46.363 [07-Jul-2021 17:43:46] WARNING: [pool app] child 19 said into stderr: "{"message":"Structured log with debug severity","severity":"debug"}"
test ppg5pq2nle4b 2021-07-07 17:43:46.363 [07-Jul-2021 17:43:46] WARNING: [pool app] child 19 said into stderr: "{"message":"Structured log with error severity","severity":"error"}"
test ppg5pq2nle4b 2021-07-07 17:43:46.363 [07-Jul-2021 17:43:46] WARNING: [pool app] child 19 said into stderr: "Log entry from fwrite()."
D test ppg5pq2nle4b 2021-07-07 17:43:46.355 Function execution started
D test ppg579syhkvj 2021-07-07 17:43:45.126 Function execution took 3 ms, finished with status code: 200
test ppg579syhkvj 2021-07-07 17:43:45.126 [07-Jul-2021 17:43:45] WARNING: [pool app] child 17 said into stderr: "{"message":"Structured log with error severity","severity":"error"}"
test ppg579syhkvj 2021-07-07 17:43:45.126 [07-Jul-2021 17:43:45] WARNING: [pool app] child 17 said into stderr: "Log entry from fwrite()."
test ppg579syhkvj 2021-07-07 17:43:45.126 [07-Jul-2021 17:43:45] WARNING: [pool app] child 17 said into stderr: "{"message":"Structured log with debug severity","severity":"debug"}"
D test ppg579syhkvj 2021-07-07 17:43:45.123 Function execution started
D test ppg56xpfu1ym 2021-07-07 17:43:43.859 Function execution took 118 ms, finished with status code: 200
test ppg56xpfu1ym 2021-07-07 17:43:43.856 [07-Jul-2021 17:43:43] WARNING: [pool app] child 17 said into stderr: "{"message":"Structured log with debug severity","severity":"debug"}"
test ppg56xpfu1ym 2021-07-07 17:43:43.856 [07-Jul-2021 17:43:43] WARNING: [pool app] child 17 said into stderr: "{"message":"Structured log with error severity","severity":"error"}"
test ppg56xpfu1ym 2021-07-07 17:43:43.856 [07-Jul-2021 17:43:43] WARNING: [pool app] child 17 said into stderr: "Log entry from fwrite()."
D test ppg56xpfu1ym 2021-07-07 17:43:43.742 Function execution started
I test 2021-07-07 17:43:13.164 [pid1-nginx] Starting nginx (pid 18): /usr/sbin/nginx -c /tmp/nginxconf-466886520/nginx.conf
I test 2021-07-07 17:43:13.158 [pid1-nginx] Successfully connected to /tmp/google-config/app.sock after 251.013037ms
test 2021-07-07 17:43:12.951 [serve] Running /bin/sh -c exec php-fpm -R --nodaemonize --fpm-config /tmp/serve-php-051025325/php-fpm.conf
test 2021-07-07 17:43:12.950 [serve] workersFromArgs: memoryMB:256 flagAppWorkers:0 workers:2
test 2021-07-07 17:43:12.945 [serve] Could not parse memory limit; defaulting to 256MiB.
NOTE: You do not need to include the Google Logging logName, resource, etc. properties. This is metadata that wraps what you write to stdout and stderr.
NOTE: You have a typo in the third fwrite call: fwrite(log, should be fwrite($log,
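For comparison, the same pattern can be sketched in Python: each entry is a single JSON object on one line of stderr, carrying only `message` and `severity`, and the wrapping metadata is left to the platform. This is a minimal illustration rather than an official client; the helper names `format_entry` and `log` are hypothetical.

```python
import json
import sys


def format_entry(message, severity="INFO", **fields):
    """Build one structured log line: a single JSON object on one line."""
    entry = {"message": message, "severity": severity}
    entry.update(fields)  # optional extra fields supplied by the caller
    return json.dumps(entry)


def log(message, severity="INFO", stream=sys.stderr, **fields):
    """Write the entry plus a newline, then flush so it is not held in a buffer."""
    stream.write(format_entry(message, severity, **fields) + "\n")
    stream.flush()


if __name__ == "__main__":
    log("Structured log with error severity", severity="ERROR")
```

Keeping the formatting in its own function makes it easy to check the exact line that is written without touching stderr.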

lein cljsbuild fails with untraceable error. How do you troubleshoot cljsbuild errors?

I do not see any log file for the compilation, and the error in the terminal is insufficient for me to troubleshoot further.
How do I get more verbose error logging, or how should I troubleshoot this issue?
The first few lines of the stack trace are below:
Compiling ClojureScript...
Compiling ["resources/public/js/app.js"] from ["src/cljs"]...
Compiling ["resources/public/js/app.js"] failed.
clojure.lang.ExceptionInfo: failed compiling file:resources\public\js\out\cljs\core.cljs {:file #object[java.io.File 0x7c5d1d25 "resources\\public\\js\\out\\cljs\\core.cljs"], :clojure.error/phase :compilation}
at cljs.compiler$compile_file$fn__3901.invoke(compiler.cljc:1706)
at cljs.compiler$compile_file.invokeStatic(compiler.cljc:1666)
I have a simple cljs file with the following contents
(ns moose.core)
(defn run []
  (.write js/document "This is not the end!"))
My project.clj has the following config for cljsbuild
:cljsbuild
{:builds [{:id "dev"
           :source-paths ["src/cljs"]
           :figwheel {:on-jsload "moose.core/run"
                      :open-urls ["http://localhost:3449/index.html"]}
           :jar true
           :compiler {:main moose.core
                      :warnings true
                      :output-dir "resources/public/js/out"
                      :asset-path "js/out"
                      :output-to "resources/public/js/app.js"}}]}
:clean-targets ^{:protect false} [:target-path :compile-path "resources/public/js" "dev-target"]
Update 1
Following Alan's advice below, I created a new project from the template and narrowed the cause down to adding a fairly old library for interacting with CouchDB:
[com.ashafa/clutch "0.4.0"]
The question remains: how do I get detailed/complete logs for cljsbuild?
Update 2
It turns out that the position of the library in the list of dependencies has an impact.
If it appears before [com.cognitect/transit-clj "0.8.313"], compilation fails; otherwise it works.
The configuration options in ClojureScript are not well documented. It is easiest to clone an existing (working) project and go from there. I would suggest starting from the cljs-template project as follows (see the README):
git clone https://github.com/cloojure/cljs-template.git demo-0212 ; temp
> cd demo-0212
~/expr/demo-0212 > ls -ldF *
-rwxrwxr-x 1 alan alan 222 Feb 12 16:04 npm-install.bash*
-rwxrwxr-x 1 alan alan 4216 Feb 12 16:04 project.clj*
-rw-rw-r-- 1 alan alan 1576 Feb 12 16:04 README.adoc
drwxrwxr-x 3 alan alan 4096 Feb 12 16:04 resources/
drwxrwxr-x 5 alan alan 4096 Feb 12 16:04 src/
drwxrwxr-x 4 alan alan 4096 Feb 12 16:04 test/
~/expr/demo-0212 > ./npm-install.bash
...<snip>... lots of stuff
At this point your project has the npm stuff needed for the unit tests.
> lein clean
> lein doo phantom test once
;; ======================================================================
;; Testing with Phantom:
doorunner - beginning
doorunner - end
Testing tst.flintstones.dino
test once - enter
globalObject: #js {:a 1, :b 2, :c 3}
(-> % .-b (+ 5) => 7
(js/makeDino) => #js {:desc blue dino-dog, :says #object[Function]}
dino.desc => blue dino-dog
dino.says(5) => Ruff-Ruff-Ruff-Ruff-Ruff!
:keep-words ("am" "having" "today")
:re-seq ("am" "having" "today")
test once - leave
Testing tst.flintstones.wilma
test each - enter
test each - leave
test each - enter
wilmaPhony/stats: #js {:lipstick red, :height 5.5}
wilma => #js {:desc patient housewife, :says #object[Function]}
test each - leave
Testing tst.flintstones.pebbles
test once - enter
test once - leave
Testing tst.flintstones.slate
logr-slate-enter
logr-slate-leave 3
Testing tst.flintstones.bambam
test each - enter
test each - leave
test each - enter
logr-bambam-enter
logr-bambam-leave 3
test each - leave
Ran 9 tests containing 22 assertions.
0 failures, 0 errors.
lein doo phantom test once 38.73s user 1.05s system 313% cpu 12.701 total
You can also fire off figwheel to see results in the browser:
> lein clean
> lein figwheel
see new webpage (30-60 sec delay)
------------------------
Figwheel template
Checkout your developer console.
I am a component!
I have bold and red text.
...etc...
------------------------

openshift installation error on centos 7

I am installing OpenShift on CentOS 7.
I installed the prerequisites and am now installing OpenShift with this command:
atomic-openshift-installer install
I am getting the error below. Please advise how to solve it.
[WARNING]: Could not match supplied host pattern, ignoring: oo_lb_to_config
There was a problem fetching the required information. Please see /tmp/ansible.log for details.
tail -f /tmp/ansible.log
2018-07-21 12:36:47,139 p=23956 u=root | skipping: [10.142.0.2]
2018-07-21 12:36:47,160 p=23956 u=root | TASK [openshift_version : Set openshift_version for rpm installation] ************************************************************************************
2018-07-21 12:36:47,209 p=23956 u=root | included: /usr/share/ansible/openshift-ansible/roles/openshift_version/tasks/check_available_rpms.yml for 10.142.0.2
2018-07-21 12:36:47,233 p=23956 u=root | TASK [openshift_version : Get available origin version] **************************************************************************************************
2018-07-21 12:36:47,767 p=23956 u=root | fatal: [10.142.0.2]: FAILED! => {"changed": false, "module_stderr": "Shared connection to 10.142.0.2 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_aWcbKG/ansible_module_repoquery.py\", line 642, in \r\n main()\r\n File \"/tmp/ansible_aWcbKG/ansible_module_repoquery.py\", line 632, in main\r\n rval = Repoquery.run_ansible(module.params, module.check_mode)\r\n File \"/tmp/ansible_aWcbKG/ansible_module_repoquery.py\", line 588, in run_ansible\r\n results = repoquery.repoquery()\r\n File \"/tmp/ansible_aWcbKG/ansible_module_repoquery.py\", line 547, in repoquery\r\n rval = self._repoquery_cmd(repoquery_cmd, True, 'raw')\r\n File \"/tmp/ansible_aWcbKG/ansible_module_repoquery.py\", line 385, in _repoquery_cmd\r\n returncode, stdout, stderr = _run(cmds)\r\n File \"/tmp/ansible_aWcbKG/ansible_module_repoquery.py\", line 356, in _run\r\n stderr=subprocess.PIPE)\r\n File \"/usr/lib64/python2.7/subprocess.py\", line 711, in init\r\n errread, errwrite)\r\n File \"/usr/lib64/python2.7/subprocess.py\", line 1327, in _execute_child\r\n raise child_exception\r\nOSError: [Errno 2] No such file or directory\r\n", "msg": "MODULE FAILURE", "rc": 1}
2018-07-21 12:36:47,770 p=23956 u=root | PLAY RECAP ***********************************************************************************************************************************************
2018-07-21 12:36:47,770 p=23956 u=root | 10.142.0.2 : ok=24 changed=2 unreachable=0 failed=1
2018-07-21 12:36:47,770 p=23956 u=root | localhost : ok=12 changed=0 unreachable=0 failed=0
2018-07-21 12:36:47,770 p=23956 u=root | INSTALLER STATUS *****************************************************************************************************************************************
You can ignore this warning:
[WARNING]: Could not match supplied host pattern, ignoring: oo_lb_to_config
There was a problem fetching the required information. Please see
/tmp/ansible.log for details.
All this says is that you have not defined a load balancer (HAProxy). So either your DNS needs to point to the masters, or you need to install HAProxy manually.
tail -f /tmp/ansible.log
2018-07-21 12:36:47,139 p=23956 u=root | skipping: [10.142.0.2]
2018-07-21 12:36:47,160 p=23956 u=root | TASK [openshift_version : Set openshift_version for rpm installation] ************************************************************************************
2018-07-21 12:36:47,209 p=23956 u=root | included: /usr/share/ansible/openshift-ansible/roles/openshift_version/tasks/check_available_rpms.yml for 10.142.0.2
2018-07-21 12:36:47,233 p=23956 u=root | TASK [openshift_version : Get available origin version] **************************************************************************************************
I had a similar problem. It was due to an inconsistent setup across the nodes, which had different versions of the RPMs. But I have the feeling that this isn't what is going on here.
Try running the ansible-playbooks with the -vvvv option (the advanced install). This may help you find what the issue is.
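One detail in the log above may be worth checking first: the traceback ends in `OSError: [Errno 2] No such file or directory`, raised from inside `subprocess`. In Python this is the error `subprocess` raises when the executable it was asked to run cannot be found. Below is a minimal sketch for checking this on the target host. It assumes, based only on the module file name `ansible_module_repoquery.py`, that the command involved is `repoquery`; the helper name `missing_commands` is hypothetical.

```python
import shutil


def missing_commands(names):
    """Return the subset of command names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "repoquery" is an assumption inferred from the failing module's file name.
    missing = missing_commands(["repoquery"])
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("all commands found")
```

If the command turns out to be missing on the host, installing the package that provides it and rerunning the installer would be the next thing to try.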

WebSphere Liberty Profile blocking on Datasource lookup

I'm trying to configure a datasource in IBM WebSphere Liberty Profile (16.0.0.3) and this is what I've done so far:
server.xml
<authData id="dbuser" password="{xor}blablabla" user="MY_USER"/>
<dataSource id="Oracle" isolationLevel="TRANSACTION_READ_COMMITTED"
            jdbcDriverRef="OracleDriver"
            jndiName="EPMS_DS"
            recoveryAuthDataRef="dbuser"
            type="javax.sql.ConnectionPoolDataSource">
    <properties.oracle databaseName="DBNAME" portNumber="1521" serverName="SERVERNAME"/>
</dataSource>
<jdbcDriver id="OracleDriver"
            javax.sql.ConnectionPoolDataSource="oracle.jdbc.pool.OracleConnectionPoolDataSource"
            libraryRef="shared-library"/>
web.xml
<resource-env-ref>
    <description>The Oracle DS</description>
    <resource-env-ref-name>jdbc/OracleDS</resource-env-ref-name>
    <resource-env-ref-type>javax.sql.DataSource</resource-env-ref-type>
</resource-env-ref>
ibm-web-bnd.xml
<resource-ref name="jdbc/OracleDS" binding-name="EPMS_DS">
    <authentication-alias name="dbuser" />
</resource-ref>
However, besides the application server taking more than 2 minutes to start up, my application seems to freeze on the following instruction:
ctx = new InitialContext();
ctx.lookup("java:comp/env/jdbc/OracleDS");
The log doesn't show any errors; the last line is an application debug message indicating that it is about to do a JNDI lookup.
I've also tried different configurations in server.xml, without <authData> and explicitly defining user and password on the datasource, but with identical results:
<dataSource id="Oracle" isolationLevel="TRANSACTION_READ_COMMITTED" jdbcDriverRef="OracleDriver" jndiName="EPMS_DS" type="javax.sql.ConnectionPoolDataSource">
    <properties.oracle URL="jdbc:oracle:thin:@SERVERNAME:1521:DBNAME" password="{xor}blablabla" user="MY_USER"/>
</dataSource>
Sadly, Liberty Profile doesn't seem to provide a way to test the DB connection, but everything seems correctly configured (I can assure the credentials are correct, as well as the server name and port). What am I missing here?
EDIT #1
Following njr's suggestion, I've performed a thread dump and here is a summary:
- waiting on com.ibm.tx.jta.impl.EventSemaphore#737eaefc
- waiting on com.ibm.ws.objectManager.FileLogOutput$FlushHelper#19d51071
- waiting on com.ibm.ws.objectManager.FileLogOutput$NotifyHelper#2fa0da91
- waiting on com.ibm.ws.objectManager.ObjectManagerState$CheckpointHelper#5b0919fc
- waiting on com.ibm.ws.sib.msgstore.persistence.dispatcher.SpillDispatcher$DispatchingLock#1620db94
(8 occurrences, but different instances)
...
- waiting on com.ibm.ws.threading.internal.BoundedBuffer$GetQueueLock#c8a05b6
(56 occurrences, but different instances)
...
- waiting on java.lang.Object#4c1d5897
- waiting on java.lang.ref.Reference$Lock#5448da4c
- waiting on java.lang.ref.ReferenceQueue$Lock#f91b025
- waiting on java.util.LinkedList#5b213416
- waiting on java.util.LinkedList#6cb46e1f
- waiting on java.util.TaskQueue#f50561c
(14 occurrences, but different instances)
...
- waiting on java.util.concurrent.atomic.AtomicReference#5476d077
- waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#4da17c93
- waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#513339c6
- waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#5dc2ae0f
- waiting on org.eclipse.osgi.framework.eventmgr.EventManager$EventThread#236970be
- waiting on org.eclipse.osgi.framework.eventmgr.EventManager$EventThread#6dfdd5
- waiting on org.eclipse.osgi.framework.eventmgr.EventManager$EventThread#72ce4e1c
- blocked on com.ibm.tx.jta.embeddable.impl.EmbeddableTMHelper#5748c911
- blocked on com.ibm.tx.jta.embeddable.impl.EmbeddableTMHelper#5748c911
Can someone help me to interpret this?
EDIT #2
Here's the complete stack trace of the blocked threads:
LargeThreadPool-thread-148 [217] (BLOCKED)
com.ibm.tx.jta.embeddable.impl.EmbeddableTMHelper.start line: 63
com.ibm.tx.jta.util.TxTMHelper.start line: 461
com.ibm.tx.util.TMHelper.start line: 74
com.ibm.tx.jta.util.TxTMHelper.checkTMState line: 500
com.ibm.tx.util.TMHelper.checkTMState line: 116
com.ibm.tx.jta.impl.TranManagerSet.registerResourceInfo line: 270
com.ibm.ws.transaction.services.TransactionManagerService.registerResourceInfo line: 260
com.ibm.ejs.j2c.ConnectionManager.registerXAResourceInfo line: 2537
com.ibm.ejs.j2c.ConnectionManager.<init> line: 509
com.ibm.ejs.j2c.ConnectionManagerServiceImpl.getConnectionManager line: 407
com.ibm.ejs.j2c.ConnectionManagerServiceImpl.getConnectionManager line: 54
com.ibm.ws.jca.cm.AbstractConnectionFactoryService.createResource line: 146
com.ibm.ws.injectionengine.osgi.internal.IndirectJndiLookupObjectFactory.createResourceWithFilter line: 346
com.ibm.ws.injectionengine.osgi.internal.IndirectJndiLookupObjectFactory.createResource line: 319
com.ibm.ws.injectionengine.osgi.internal.IndirectJndiLookupObjectFactory.getObjectInstance line: 133
com.ibm.ws.injectionengine.osgi.internal.IndirectJndiLookupObjectFactory.getObjectInstance line: 99
com.ibm.wsspi.injectionengine.InjectionBinding.getInjectionObjectInstance line: 1556
com.ibm.wsspi.injectionengine.InjectionBinding.getInjectionObject line: 1433
com.ibm.wsspi.injectionengine.InjectionBinding.getInjectionObject line: 1389
com.ibm.ws.injectionengine.osgi.internal.naming.InjectionJavaColonHelper.getObjectInstance line: 116
com.ibm.ws.jndi.url.contexts.javacolon.internal.JavaURLContext.lookup line: 333
com.ibm.ws.jndi.url.contexts.javacolon.internal.JavaURLContext.lookup line: 371
org.apache.aries.jndi.DelegateContext.lookup line: 161
javax.naming.InitialContext.lookup line: 417
pt.sibs.epms.persistence.utils.EntityManagerFactoryController.jndiLookupUsed line: 264
pt.sibs.epms.persistence.utils.EntityManagerFactoryController.checkConfiguration line: 115
pt.sibs.epms.persistence.utils.EntityManagerFactoryController.<init> line: 95
pt.sibs.epms.persistence.utils.EntityManagerFactoryController.<init> line: 51
pt.sibs.epms.persistence.utils.EntityManagerFactoryController$SingletonHolder.<clinit> line: 81
pt.sibs.epms.persistence.utils.EntityManagerFactoryController.getInstance line: 88
pt.sibs.epms.util.logging.LoggerConfiguration.<clinit> line: 33
pt.sibs.epms.ecc.renderer.HtmlFormRenderer.<clinit> line: 25
java.lang.Class.forName0 line: not available [native method]
java.lang.Class.forName line: 348
com.ibm.ws.webcontainer.osgi.webapp.WebApp.addClassToHandlesTypesStartupSet line: 1104
com.ibm.ws.webcontainer.osgi.webapp.WebApp.scanForHandlesTypesClasses line: 1038
com.ibm.ws.webcontainer.webapp.WebApp.initializeServletContainerInitializers line: 2493
com.ibm.ws.webcontainer.webapp.WebApp.initialize line: 1037
com.ibm.ws.webcontainer.webapp.WebApp.initialize line: 6545
com.ibm.ws.webcontainer.osgi.DynamicVirtualHost.startWebApp line: 466
com.ibm.ws.webcontainer.osgi.DynamicVirtualHost.createRunnableHandler line: 264
com.ibm.ws.webcontainer.osgi.DynamicVirtualHost.createRunnableHandler line: 329
com.ibm.ws.http.internal.VirtualHostImpl.discriminate line: 251
com.ibm.ws.http.dispatcher.internal.channel.HttpDispatcherLink.ready line: 301
com.ibm.ws.http.channel.internal.inbound.HttpInboundLink.handleDiscrimination line: 471
com.ibm.ws.http.channel.internal.inbound.HttpInboundLink.handleNewRequest line: 405
com.ibm.ws.http.channel.internal.inbound.HttpInboundLink.processRequest line: 285
com.ibm.ws.http.channel.internal.inbound.HttpICLReadCallback.complete line: 66
com.ibm.ws.channel.ssl.internal.SSLReadServiceContext$SSLReadCompletedCallback.complete line: 1777
com.ibm.ws.tcpchannel.internal.WorkQueueManager.requestComplete line: 504
com.ibm.ws.tcpchannel.internal.WorkQueueManager.attemptIO line: 574
com.ibm.ws.tcpchannel.internal.WorkQueueManager.workerRun line: 929
com.ibm.ws.tcpchannel.internal.WorkQueueManager$Worker.run line: 1018
java.util.concurrent.ThreadPoolExecutor.runWorker line: 1142
java.util.concurrent.ThreadPoolExecutor$Worker.run line: 617
java.lang.Thread.run line: 745
And the second thread:
LargeThreadPool-thread-3 [33] (BLOCKED)
com.ibm.tx.jta.embeddable.impl.EmbeddableTMHelper.start line: 63
com.ibm.tx.jta.util.TxTMHelper.start line: 461
com.ibm.tx.util.TMHelper.start line: 74
com.ibm.tx.jta.util.TxTMHelper.checkTMState line: 500
com.ibm.tx.util.TMHelper.checkTMState line: 116
com.ibm.tx.jta.impl.TranManagerSet.begin line: 167
com.ibm.ejs.csi.TranStrategy.beginGlobalTx line: 593
com.ibm.ejs.csi.Required.preInvoke line: 56
com.ibm.ejs.csi.TransactionControlImpl.preInvoke line: 222
com.ibm.ejs.container.EJSContainer.preInvokeActivate line: 3176
com.ibm.ejs.container.EJSContainer.EjbPreInvoke line: 2576
com.ibm.ejs.container.TimedObjectWrapper.invokeCallback line: 84
com.ibm.ejs.container.TimerNpRunnable.doWork line: 196
com.ibm.ejs.container.TimerNpRunnable.run line: 103
java.util.concurrent.Executors$RunnableAdapter.call line: 511
java.util.concurrent.FutureTask.run line: 266
java.util.concurrent.ThreadPoolExecutor.runWorker line: 1142
java.util.concurrent.ThreadPoolExecutor$Worker.run line: 617
java.lang.Thread.run line: 745
EDIT #3
The thread that is WAITING and, apparently, blocking the other two threads:
LargeThreadPool-thread-38 [103] (WAITING)
java.lang.Object.wait line: not available [native method]
java.lang.Object.wait line: 502
com.ibm.tx.jta.impl.EventSemaphore.waitEvent line: 71
com.ibm.tx.jta.impl.RecoveryManager.waitForReplayCompletion line: 1273
com.ibm.tx.jta.impl.TxRecoveryAgentImpl.initiateRecovery line: 413
com.ibm.ws.recoverylog.spi.RecoveryDirectorImpl.directInitialization line: 751
com.ibm.ws.recoverylog.spi.RecoveryDirectorImpl.driveLocalRecovery line: 1240
com.ibm.ws.recoverylog.spi.RecLogServiceImpl.start line: 125
com.ibm.tx.jta.embeddable.impl.EmbeddableTMHelper.start line: 130
com.ibm.tx.jta.util.TxTMHelper.start line: 461
com.ibm.tx.util.TMHelper.start line: 74
com.ibm.tx.jta.util.TxTMHelper.checkTMState line: 500
com.ibm.tx.util.TMHelper.checkTMState line: 116
com.ibm.tx.jta.impl.TranManagerSet.begin line: 167
com.ibm.ws.transaction.services.TransactionManagerService.begin line: 281
com.ibm.ws.concurrent.persistent.internal.PersistentExecutorImpl$PollingTask.run line: 2239
Not sure if it is related, but FFDC is showing the following exception:
------Start of DE processing------ = [09-11-2016 14:41:09:006 GMT]
Exception = com.ibm.ws.recoverylog.spi.LogIncompatibleException
Source = com.ibm.ws.recoverylog.spi.LogHandle
probeid = 326
Stack Dump = com.ibm.ws.recoverylog.spi.LogIncompatibleException
at com.ibm.ws.recoverylog.spi.LogFileHandle.fileOpen(LogFileHandle.java:522)
at com.ibm.ws.recoverylog.spi.LogHandle.openLog(LogHandle.java:324)
at com.ibm.ws.recoverylog.spi.MultiScopeRecoveryLog.openLog(MultiScopeRecoveryLog.java:602)
at com.ibm.ws.recoverylog.spi.RecoveryLogImpl.openLog(RecoveryLogImpl.java:77)
at com.ibm.tx.jta.impl.RecoveryManager.run(RecoveryManager.java:1835)
at java.lang.Thread.run(Thread.java:745)
In your Edit #3, the thread that is waiting in
com.ibm.tx.jta.impl.RecoveryManager.waitForReplayCompletion
will have spawned another RecoveryManager thread whose role is to access the transaction log files in your filesystem. That other thread should do the minimal amount of file processing necessary before signalling to the waiting thread that it may continue. Can you see another thread with a stack containing
com.ibm.tx.jta.impl.RecoveryManager.run ?
I am concerned about the LogIncompatibleException. It suggests that the transaction log files on your filesystem are
corrupt. This should not cause the server to hang and I believe you've hit a product defect.
If you need to make progress quickly, it may be appropriate in your scenario to delete the transaction log files.
Please note that this is something we only suggest to customers with extreme care as the transaction logs ensure
the integrity of distributed transactions. In a production environment we'd generally recommend that such action
is only taken under the guidance of IBM Level 3 Service. But in a test/evaluation scenario it can be applicable.
The Liberty transaction log info is stored in the /wlp/usr/servers//tranlog
directory. If appropriate the tranlog and partnerlog subdirectories may be deleted and the server restarted.
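As a more cautious variant of the deletion described above, the two subdirectories can be moved aside instead, so they can be put back if needed. This swaps deletion for a rename and is only a minimal sketch: it assumes the server is stopped first, the function name `set_aside_tran_logs` and the backup naming scheme are hypothetical, and the transaction log directory is passed in by the caller because its location depends on your installation.

```python
import shutil
import time
from pathlib import Path


def set_aside_tran_logs(tranlog_dir, names=("tranlog", "partnerlog")):
    """Rename the given subdirectories of tranlog_dir to timestamped backups.

    Returns the list of backup paths that were created. Subdirectories that
    do not exist are skipped. Run this only while the server is stopped.
    """
    tranlog_dir = Path(tranlog_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    moved = []
    for name in names:
        source = tranlog_dir / name
        if source.is_dir():
            target = tranlog_dir / f"{name}.bak-{stamp}"
            shutil.move(str(source), str(target))
            moved.append(target)
    return moved
```

The same caution as above applies: in a production environment the logs protect the integrity of distributed transactions, so this is meant for test or evaluation setups only.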

ejabberd odbc error + unable to figure out exact source

My ejabberd server is constantly crashing. It seems to be related to the ODBC module, but I am not able to understand the issue. Below are the logs. Can anyone help me interpret them?
I have copied a few of the messages below.
=ERROR REPORT==== 14-Oct-2015::00:27:51 ===
** State machine <0.27422.5> terminating
** Last message in was {'$gen_sync_event', {<0.27896.5>,#Ref<0.0.10.246367>}, {sql_cmd, {sql_query,<<"SELECT 1;">>}, {1444,782471,512104}}}
** When State == session_established
** Data == {state,<0.27423.5>,odbc,30000,<<"abchost.com">>,1000, {0,{[],[]}}}
** Reason for termination =
** {function_clause,[{odbc,sql_query, [<0.27423.5>,<<"SELECT 1;">>,59000], [{file,"odbc.erl"},{line,183}]}, {ejabberd_odbc,sql_query_internal,1, [{file,"src/ejabberd_odbc.erl"}, {line,468}]}, {ejabberd_odbc,run_sql_cmd,4, [{file,"src/ejabberd_odbc.erl"}, {line,374}]}, {p1_fsm,handle_msg,10, [{file,"src/p1_fsm.erl"},{line,582}]}, {proc_lib,init_p_do_apply,3, [{file,"proc_lib.erl"},{line,237}]}]}
and
00:27:51.573 [error] CRASH REPORT Process <0.27434.5> with 0 neighbours exited with reason: no function clause matching odbc:sql_query(<0.27435.5>, <<"SELECT 1;">>, 59000) line 183 in p1_fsm:terminate/8 line 760
and
00:27:53.965 [error] gen_fsm <0.27439.5> in state session_established terminated with reason: no function clause matching odbc:sql_query(<0.27442.5>, <<"SELECT 1;">>, 59000) line 183
and
=ERROR REPORT==== 14-Oct-2015::00:27:51 ===
** Generic server <0.27435.5> terminating
** Last message in was {'DOWN',#Ref<0.0.10.239386>,process,<0.27434.5>, {function_clause, [{odbc,sql_query, [<0.27435.5>,<<"SELECT 1;">>,59000], [{file,"odbc.erl"},{line,183}]}, {ejabberd_odbc,sql_query_internal,1, [{file,"src/ejabberd_odbc.erl"}, {line,468}]}, {ejabberd_odbc,run_sql_cmd,4, [{file,"src/ejabberd_odbc.erl"}, {line,374}]}, {p1_fsm,handle_msg,10, [{file,"src/p1_fsm.erl"},{line,582}]}, {proc_lib,init_p_do_apply,3, [{file,"proc_lib.erl"},{line,237}]}]}}
** When Server state == {state,#Port<0.2314388>,undefined,<0.27434.5>, undefined,on,false,false,off,connected, undefined,0, [#Port<0.2314379>,#Port<0.2314376>], #Port<0.2314386>,#Port<0.2314366>}
** Reason for termination ==
** {stopped, {'EXIT',<0.27434.5>, {function_clause, [{odbc,sql_query, [<0.27435.5>,<<"SELECT 1;">>,59000], [{file,"odbc.erl"},{line,183}]}, {ejabberd_odbc,sql_query_internal,1, [{file,"src/ejabberd_odbc.erl"},{line,468}]}, {ejabberd_odbc,run_sql_cmd,4, [{file,"src/ejabberd_odbc.erl"},{line,374}]}, {p1_fsm,handle_msg,10,[{file,"src/p1_fsm.erl"},{line,582}]}, {proc_lib,init_p_do_apply,3, [{file,"proc_lib.erl"},{line,237}]}]}}}
and
00:27:51.552 [error] Supervisor odbc_sup had child [] started with {odbc,start_link_sup,undefined} at <0.27432.5> exit with reason {stopped,{'EXIT',<0.27429.5>,{function_clause,[{odbc,sql_query,[<0.27432.5>,<<"SELECT 1;">>,59000],[{file,"odbc.erl"},{line,183}]},{ejabberd_odbc,sql_query_internal,1,[{file,"src/ejabberd_odbc.erl"},{line,468}]},{ejabberd_odbc,run_sql_cmd,4,[{file,"src/ejabberd_odbc.erl"},{line,374}]},{p1_fsm,handle_msg,10,[{file,"src/p1_fsm.erl"},{line,582}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,237}]}]}}} in context child_terminated
I think you are hitting a bug that has already been fixed in the ejabberd master branch: https://github.com/processone/ejabberd/commit/7d99484859df7c33a73da92d84b5cb5bd27a244e