Fiware CEP server stops responding - fiware

While developing on Fi-Cloud's CEP I've been running into an issue repeatedly. As I'm trying to develop a definition to perform a task, the CEP server and the Authoring Tool stop responding, although ssh is still responsive.
This issue happens as I develop. I'm using the Authoring Tool to alter the definition bit by bit, and then I re-upload it to the server through the Authoring Tool's export feature.
To re-initialize Proton with the new definition each time I alter it, I use Postman with this single operation:
PUT http://{ip}:8080/ProtonOnWebServerAdmin/resources/instances/ProtonOnWebServer
Header: 'Content-Type': 'application/json'
Body: {"action": "ChangeDefinitions", "definitions-url": "/ProtonOnWebServerAdmin/resources/definitions/Definition_Name"}
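For reference, the same call scripted with Python's requests library (a minimal sketch; the server IP is a placeholder):
import requests

# Minimal sketch of the same PUT; <ip> is a placeholder for the CEP server's address.
url = "http://<ip>:8080/ProtonOnWebServerAdmin/resources/instances/ProtonOnWebServer"
resp = requests.put(
    url,
    headers={"Content-Type": "application/json"},
    json={
        "action": "ChangeDefinitions",
        "definitions-url": "/ProtonOnWebServerAdmin/resources/definitions/Definition_Name",
    },
)
print(resp.status_code, resp.text)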
At the same time, I'm logged in with three ssh instances: one to monitor the files being created in /opt/tomcat10/sample/ (and other things), and the other two to 'tail -f' the log files the definition writes to as events are processed: one log for events received and another for events detected by the EPAgent.
I iterate through this procedure over and over as I develop, and eventually the CEP server and the Authoring Tool stop responding.
By "tailing" tomcat's log file (# tail -f /opt/tomcat10/logs/catalina.out) I can see that, when under these circumstances, if I attemp a:
-GET (url: http://{ip}:8080/ProtonOnWebServerAdmin/resources/instances/ProtonOnWebServer)
I get no response back and tomcat logs the following response:
11452100 [http-bio-8080-exec-167] ERROR org.apache.wink.server.internal.RequestProcessor - An unhandled exception occurred which will be propagated to the container.
java.lang.OutOfMemoryError: PermGen space
Exception in thread "http-bio-8080-exec-167" java.lang.OutOfMemoryError: PermGen space
SSH is still responsive and I can look at Tomcat's log this way.
To get past this and continue, I close the ssh connections and restart the CEP instance in the Fi-Cloud.
Is the procedure I'm using to re-upload and re-run the definition inappropriate? Should I take a different approach to developing?

When you update a definition that the CEP is already working with, and you want the CEP engine to work with the updated definition, you need to:
1. Export the definition using the Authoring Tool's export feature (as you did).
2. Stop the engine run, using a REST PUT:
PUT http://{host}:8080/ProtonOnWebServerAdmin/resources/instances/ProtonOnWebServer
{"action":"ChangeState","state":"stop"}
3. Start the engine, using a REST PUT:
PUT http://{host}:8080/ProtonOnWebServerAdmin/resources/instances/ProtonOnWebServer
{"action":"ChangeState","state":"start"}
You don't need to invoke the "ChangeDefinitions" action, since it is the same definition name that the engine is already working with.
Invoking the "ChangeDefinitions" action only influences the next run of the CEP, and has no influence on the current run.
This answers your question about how you should update a CEP definition.
Hope it solves your issue.

Related

Why does my Soffid JSON REST Web Services Connector not update an object in the target system?

I am trying to connect my Soffid 3 server with our custom web application named Schrift. I am using a JSON REST Web Services Connector for this purpose. I added the REST Web service plugin and then configured an agent with the JSON/XML/SOAP REST web service type.
Loading of objects works fine. My REST connector connects to the web service successfully and retrieves the account data.
The problem is that when I try to update some data (for example, when I try to lock an account), nothing happens. And unfortunately I don't know what should be happening: when should the REST connector send updated data to the managed system, and in which way? I didn't find any log entries saying that the REST connector was trying to update an object on the managed system. Maybe I did something wrong or missed something.
I would appreciate any help. I can post any config or log details if you need them.
Update#1
(I did some investigation after the first answer)
I checked the agent settings: Read only and Manual account creation are set to no.
The account was set to the unmanaged type, but I succeeded in changing its type to shared and then to single without getting an error. Now it is set to single.
The task queue is empty.
Also, I've checked that the update method is present and the update properties are set correctly. updateParams is not set (which means that all attributes should be sent to the managed system).
But when I change the status of the account (from Enable to Disable), nothing happens.
In the console log I can see only this line:
14-Sep-2021 13:26:29.708 INFO [BPM-Scheduler:192.168.7.121:1] com.soffid.iam.bpm.job.JobExecutorThread.run No job to execute
When I manually run the task "Analize impact for changes on Schrift", the Execution log shows:
Changes detected for accounts
=============================
NO CHANGE DETECTED
Changes detected for roles
=============================
NO CHANGE DETECTED
Update#2
After many attempts I made some progress. Now when I make some changes to the account, a task named UpdateAccount baklykov#irf.com.ua#Schrift appears, but it runs with an error.
At first it was a 415 Unsupported Media Type error, as I wrote in the comments, but now it looks a little different:
Throws exception updating object : Extensible object [type = account]
EmployeeEmail: baklykov#irf.com.ua
IsLockedOut: true (log truncated) ...
caused by Unexpected response, Content-Type: null
Update#3
I found out that Soffid's request for updating the object was in an improper format (all the parameters were passed as HTTP request parameters instead of being put in the JSON body).
After some research I found a method property called Encoding and set it to the value application/json.
Now the parameters are passed in the JSON body (which is what I need), but the problem is that Soffid puts all the parameters in the JSON body, including the key parameter by which the object to update should be identified. My guess is that this is the reason why the object in the target system is still not updated.
In other words, my application expects a request like this:
https://myapp.mysite.com/api/v1/Soffid/Employees?EmployeeEmail=baklykov%40irf.com.ua :
{"EmployeeLastName":"Baklykov","EmployeeFirstName":"Ivan"}
but Soffid sends this:
https://myapp.mysite.com/api/v1/Soffid/Employees:
{"EmployeeLastName":"Baklykov","EmployeeFirstName":"Ivan","EmployeeEmail":"baklykov#irf.com.ua"}
The system should have created an UpdateAccount task in the task queue. Please verify:
The task engine is in automatic mode. In read-only or manual mode, no task will be created.
If you are updating an account, check that the account is not set as unmanaged. In that case, no task is created.
Finally, verify that the task queue has not held the task up.
Have you checked the engine mode? Look at Main Menu > Administration > Configure Soffid > Integration engine > Smart engine settings. It should be set to automatic.

Exception handling with Realbrowserlocusts

When using the realbrowserlocusts class it appears that I'm limited in my exception handling.
The only reference that partially works is: self.client.wait.until(EC.visibility_of_element_located ....
In a failed condition where the element is not found, the script simply starts over again. With the script I'm working with I need to maintain a solid session state; I need to throw an exception (report an error), log the user out, and then let the script start over again. I've been testing out the behavior with the locust.py script that Nick B. created, using several approaches to "try, except"; they work when running without realbrowserlocusts (Selenium only), but with it the execution just stops.
Any examples would be greatly appreciated.
In its current form I've been able to run 3x the browser-based load per agent/slave compared with our commercial tool. My goal is to replace it with a locust/selenium approach.
locust-plugins' WebdriverUser has a little bit better exception handling, I think. A failure to find an element will log a failed request, and if you use RescheduleTaskOnFail (as in the example) it will restart the task when that happens.
https://github.com/SvenskaSpel/locust-plugins/blob/master/examples/webdriver_ex.py
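If you stay with realbrowserlocusts, a rough sketch of the try/except pattern might look like this (a sketch only: the class name, the element locator, and the logout step are assumptions/placeholders, not the package's documented API):
import logging

from locust import TaskSet, task
from realbrowserlocusts import HeadlessChromeLocust  # assumed class name
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC


class UserBehaviour(TaskSet):
    @task
    def do_work(self):
        try:
            # The same wait call the question already uses, wrapped so a missing
            # element no longer restarts the script silently.
            self.client.wait.until(
                EC.visibility_of_element_located((By.ID, "some-element-id"))
            )
        except TimeoutException as exc:
            # Report the error first...
            logging.error("Element not visible, logging out and failing the task: %s", exc)
            # ...then run your application's logout flow here (placeholder) so the
            # next iteration starts from a clean session state.
            raise  # re-raise so the failure is logged and the task starts over


class WebsiteUser(HeadlessChromeLocust):
    task_set = UserBehaviour
    min_wait = 1000
    max_wait = 3000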

Agile PLM Unable to extract {0}, see the log for details

I am new to Agile PLM.
I am getting an error like "Unable to extract {0}, see the log for details" in ATOs.
Can anyone help me resolve this and identify the root cause of this issue?
Log in to the Java client and try to do a Destination reset.
Please check for space on your managed servers. Generally this error comes when occupancy on the server is more than 80%.
A third approach you can try:
Disable all the Subscribers.
Test whether the destination works or not.
A fourth approach:
If you have a clustered environment, you can check the ACS configuration:
if ACS.Skips = True on all the servers, then it's not an ideal scenario. It should be true on only one.
Apart from the above suggestions, if your issue is still not resolved, you can check whether any exception occurs during the extract. If you are using a clustered environment, ACS should be enabled on only one managed server. Assuming that, check the STDOUT log of the particular managed server (if using WLS) where ACS is enabled. If the WLS managed servers are installed as a Windows service, you need to edit the registry entries for each of them and modify the server start-up parameters to set ACS.Skipserver=true on all but one.
Now, to print the logs, you need to log in to the web client, go to Tools & Settings -> Administration -> Logging Configuration, and set the following entries to DEBUG:
com.agile.acs.PCExtractTask
com.agile.acs.ScheduleMaster
com.agile.acs.ScheduledEventTask
com.agile.extract.server
com.agile.extract.server.ExtractService

My Node.js script is not exiting on its own after successful execution

I have written a script to update my DB table after reading data from DB tables and Solr. I am using the async.waterfall module. The problem is that the script does not exit after all operations complete successfully. I have also used a DB connection pool, and I suspect that may be what is causing the script to wait indefinitely.
I want to put this script in crontab, and if it does not exit properly it will create a hell of a lot of instances unnecessarily.
I just went through this issue.
The problem with just using process.exit() is that the program I am working on was creating handles, but never destroying them.
It was processing a directory and putting data into OrientDB.
Some of the things that I have come to learn are that database connections need to be closed before getting rid of the reference, and that process.exit() does not solve all cases.
When my project processed 2,000 files, it would get down to about 500 left, and the extra handles would have filled up the available working memory, which meant it could not continue and therefore never reached the process.exit() at the end.
On the other hand, if you close the items that are requesting the app to stay open, you can solve the problem at its source.
The two "Undocumented Functions" that I was able to use, were
process._getActiveHandles();
process._getActiveRequests();
I am not sure what other functions will help with debugging these types of issues, but these ones were amazing.
They return an array, and you can determine a lot about what is going on in your process by using these methods.
You have to tell it when you're done, by calling
process.exit();
More specifically, you'll want to call this in the callback from async.waterfall() (the second argument to that function). At that point, all your asynchronous code has executed, and your script should be ready to exit.
EDIT: As pointed out by @Aaron below, this likely has to do with something like a database connection being active and not allowing the Node process to end.
You can use the node module why-is-node-running:
Run npm install -D why-is-node-running
Add import * as log from 'why-is-node-running'; in your code
When you expect your program to exit, add a log statement:
afterAll(async () => {
  await app.close();
  log();
})
This will print a list of open handles with a stacktrace to find out where they originated:
There are 5 handle(s) keeping the process running
# Timeout
/home/maf/dev/node_modules/why-is-node-running/example.js:6 - setInterval(function () {}, 1000)
/home/maf/dev/node_modules/why-is-node-running/example.js:10 - createServer()
# TCPSERVERWRAP
/home/maf/dev/node_modules/why-is-node-running/example.js:7 - server.listen(0)
/home/maf/dev/node_modules/why-is-node-running/example.js:10 - createServer()
We can quit the execution by using:
connection.destroy();
If you use Visual Studio code, you can attach to an already running Node script directly from it.
First, run the Debug: Attach to Node Process command.
When you invoke the command, VS Code will prompt you to pick which Node.js process to attach to.
Your terminal should display this message:
Debugger listening on ws://127.0.0.1:9229/<...>
For help, see: https://nodejs.org/en/docs/inspector
Debugger attached.
Then, inside your debug console, you can use the code from The Lazy Coder’s answer:
process._getActiveHandles();
process._getActiveRequests();

How do BundleActivator, ManagedService, and my application interact on start/stop?

I had a non-OSGi application. To convert it to OSGi, I first bundled it up and gave it a simple BundleActivator. The activator's start() started up a thread of what used to be the main() of my app (and is now a Runnable), and remembered that thread. The activator's stop() interrupted that thread, and waited for it to end (via join()), then returned. This all seemed to be working fine.
As a next step in the OSGiification process, I am now trying to use OSGi configuration management instead of the Properties-based configuration that the application used to use. So I am adding in a ManagedService in addition to the Activator.
But it's no longer clear to me how I am supposed to start and stop my application; examples that I've seen are only serving to confuse me. Specifically, here:
http://felix.apache.org/site/apache-felix-config-admin.html
They no longer seem to do any real starting of the application in BundleActivator.start(). Instead, they just register a ManagedService to receive configuration. So I'm guessing maybe I should start up the app's main thread when I receive configuration, in the ManagedService? They don't show it; the ManagedService's updated() just has vague comments saying to "apply configuration from config admin" when it is passed a non-null Dictionary.
So then I look here:
http://blog.osgi.org/2010/06/how-to-use-config-admin.html
In there, it seems like maybe they're doing what I guessed. They seem to have moved the actual app from BundleActivator to ManagedService, and are dealing with starting it when updated() receives non-null configuration, stopping it first if it's already started.
But now what about when the BundleActivator's stop() gets called?
Back on the first example page that I mentioned above, they unregister the ManagedService. On the second example page, they don't show what they do.
So I'm guessing maybe unregistering the ManagedService will cause a null configuration to be sent to ManagedService.updated(), at which point I can interrupt the app thread, wait for it to end, and then return?
I suspect that I'm thoroughly incorrect, but I don't know what the "real" way to do this is. Thanks in advance for any help.
BundleActivator (BA) and ManagedService (MS) are callbacks into your bundle. BundleActivator is for the active state of your bundle: BA.start is called when your bundle is being started and BA.stop when it is being stopped. MS is called to provide your bundle with a configuration, if there is one, or to notify you that there is no configuration.
So in BA.start, you register your MS service and return. When MS is called (on some other thread), you will either receive your configuration or be told there is no configuration, and you can act accordingly (start the app, etc.).
Your MS can also be called at any time to advise you of the modification or deletion of your configuration, and you should act accordingly (i.e. adjust your app's behavior).
When you are called at BA.stop, you need to stop your app. You can unregister the MS or let the framework do it for you as part of normal bundle stop processing.