Parse Jolokia output with jq

I have an Apache Artemis broker from which I can get some management information through Jolokia. The response is in JSON format; I also have jq to do "JSON stuff" with it.
curl -s -X GET --url 'http://localhost:8161/console/jolokia/read/org.apache.activemq.artemis:*'
This works and returns a JSON response.
I want to make a kind of generic script to check some values from this response; hence a few questions:
(For ease of testing I stored the response in a file, broker.json; normally I would pipe the output from curl straight to jq or store it in a variable, depending on how often jq has to be called.)
One of the keys I want to query I can get like this:
jq '."value"."org.apache.activemq.artemis:broker=\"broker1\""' broker.json
However, in a more generic script, I won't know the name of the broker (which is "broker1" here); is there some way I can wildcard the key, like "org.apache.activemq.artemis:broker=\"*\""? My attempts so far have not given me anything.
The second question is a bit harder I think.
In the response there is a field that can be found by querying .request.timestamp; the value is in seconds since the epoch.
On the broker are queues, and some of them might have messages; I want to find those that have messages older than, say, 5 minutes.
I can find one such object with this key:
jq '."value"."org.apache.activemq.artemis:address=\"my.queue\",broker=\"broker1\",component=addresses,queue=\"my.queue\",routing-type=\"anycast\",subcomponent=queues"' broker.json
This object contains two keys I can use for this purpose:
- FirstMessageAge : age in ms
- FirstMessageTimestamp: timestamp in milliseconds since the epoch.
How would I query for this? Ideally I'd like to get an answer like "my.queue has messages older than X", where my.queue can also be obtained from the key "Address" or "Name".
Artemis uses Addresses and Queues as separate entities; for all practical purposes here, both have the same name.
I am trying to make a (simple) script that can periodically monitor the broker's health (not too many messages sitting on queues for too long, queues having consumers, and so on), all of which can be obtained from this single REST call. I think that with the answers to the above questions I should be able to figure out the rest.

is there some way I can wildcard the key like this:
"org.apache.activemq.artemis:broker=\"*\""
The best way to match wildcards on key names is by using with_entries or to_entries. Since you have not provided an example in accordance with the MCVE guidelines, it's not clear exactly how you'd do so, but by analogy with the example you give, you could start with:
.value
| to_entries[]
| select(.key | test("^org.apache.activemq.artemis:broker=\".*\""))
| .value
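For the second question, here is a minimal sketch along the same lines, assuming the matched queue objects carry the Name and FirstMessageAge fields mentioned in the question (300000 ms = 5 minutes):

# Report queues whose oldest message is older than 5 minutes (300000 ms);
# relies only on the Name and FirstMessageAge fields from the question.
jq -r --argjson max_ms 300000 '
  .value
  | to_entries[]
  | select(.key | test("subcomponent=queues"))
  | .value
  | select((.FirstMessageAge // 0) > $max_ms)
  | "\(.Name) has messages older than \($max_ms / 1000) seconds"
' broker.json

If you would rather compare FirstMessageTimestamp against .request.timestamp, remember that the former is in milliseconds and the latter in seconds.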


How do I best construct complex NiFi routing

I'm a total noob when it comes to NiFi - so please feel free to highlight any stupidity/ignorance.
I'm reading messages from a Kafka topic using NiFi.
Each message contains JSON that contains a field called Function and then a whole bunch of different fields, based on the Function. For example, if Function ="Login", you can expect a username and password field, but if Function = "Pay", you can expect "From", "To" and "Amount" fields.
I need to process each type of Function differently. So, basically, I want to read the message from Kafka, determine the function and then route the message, based on the function to the appropriate set of rules.
It sounds like this should be simple - but for one small complication. I have about 500 different types of Functions. So, I don't want to add a RouteOnAttribute node for each function.
Is there a better way to do this? If this was "real code", I suppose I'm looking for the difference between an "if" statement and some sort of "switch/case" statement....
You would first use EvaluateJsonPath to extract the function into a flow file attribute, then use RouteOnAttribute, which would need 500 conditions added to it, and then connect each of those 500 conditions to whatever follow-on processing is required. The only other thing you could do is implement a custom processor that handles the 500 conditions internally.

Filtering with regex vs json

When filtering logs, Logstash may use grok to parse the received log file (let's say it is Nginx logs). Parsing with grok requires you to properly set the field type - e.g., %{HTTPDATE:timestamp}.
However, if Nginx starts logging in JSON format then Logstash does very little processing. It simply creates the index and outputs to Elasticsearch. This leads me to believe that only Elasticsearch benefits from the "way" it receives the index.
Is there any advantage for Elasticsearch in having index data that was processed with regex vs. JSON? E.g., does it impact query time?
For Elasticsearch it doesn't matter how you parse the messages; it has no information about that. You only need to send a JSON document with the fields that you want to store and search on, according to your index mapping.
However, how you parse the message does matter for Logstash, since it directly impacts performance.
For example, consider the following message:
2020-04-17 08:10:50,123 [26] INFO ApplicationName - LogMessage From The Application
If you want to be able to search and apply filters on each part of this message, you will need to parse it into fields.
timestamp: 2020-04-17 08:10:50,123
thread: 26
loglevel: INFO
application: ApplicationName
logmessage: LogMessage From The Application
To parse this message you can use different filters. One of them is grok, which uses regex; but if your message always has the same format, you can use another filter, like dissect. In this case both will achieve the same thing, but while grok uses regex to match the fields, dissect is purely positional. This makes a huge difference in CPU use when you have a high number of events per second.
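For illustration, a rough sketch of both approaches run against the sample line from the command line; the grok pattern and field names are assumptions based on the sample, and it assumes logstash -e with no input/output block defaults to stdin/stdout:

# grok: regex-based; the pattern is an illustrative guess and may need adjusting
echo '2020-04-17 08:10:50,123 [26] INFO ApplicationName - LogMessage From The Application' | \
bin/logstash -e 'filter { grok { match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} \[%{NUMBER:thread}\] %{LOGLEVEL:loglevel} %{DATA:application} - %{GREEDYDATA:logmessage}" } } }'

# dissect: purely positional, no regex; the timestamp spans two tokens and is
# rejoined with the append (+) modifier
echo '2020-04-17 08:10:50,123 [26] INFO ApplicationName - LogMessage From The Application' | \
bin/logstash -e 'filter { dissect { mapping => { "message" => "%{timestamp} %{+timestamp} [%{thread}] %{loglevel} %{application} - %{logmessage}" } } }'

Both should produce the same five fields; the dissect version simply avoids regex matching.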
Consider now that you have the same message, but in a JSON format.
{ "timestamp":"2020-04-17 08:10:50,123", "thread":26, "loglevel":"INFO", "application":"ApplicationName","logmessage":"LogMessage From The Application" }
It is easier and faster for Logstash to parse this message; you can do it in your input using the json codec, or you can use the json filter in your filter block.
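A similar sketch for the JSON version, using the json filter (same command-line assumptions as in the previous sketch):

# The json filter expands the JSON in the message field into top-level event fields
echo '{ "timestamp":"2020-04-17 08:10:50,123", "thread":26, "loglevel":"INFO", "application":"ApplicationName","logmessage":"LogMessage From The Application" }' | \
bin/logstash -e 'filter { json { source => "message" } }'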
If you have control over how your log messages are created, choose a format that means you do not need to use grok.

Gemfire pdxInstance datatype

I am writing pdxInstances to GemFire using the sequence: rabbitmq => springxd => gemfire.
If I put this JSON into rabbitmq {'ID':11,'value':5}, value appears as a byte value in GemFire. If I put {'ID':11,'value':500}, value appears as a word and if I put {'ID':11,'value':50000} it appears as an Integer.
A problem arises when I query data from GemFire and order them. For example, if I use a query such as select * from /my_region order by value it fails, saying it cannot compare a byte with a word (or byte with an integer).
Is there any way to declare the data type in JSON? Or any other method to get rid of this problem?
To add a bit of insight into this problem... in reviewing GemFire/Geode source code, it would seem it is not possible to configure the desired value type and override GemFire/Geode's default behavior, which can be seen in JSONFormatter.setNumberField(..).
I will not explain how GemFire/Geode involves the JSONFormatter during a Region.put(key, value) operation as it is rather involved and beyond the scope of this discussion.
However, one could argue that the problem is not necessarily with the JSONFormatter class, since storing a numeric value in a byte is more efficient than storing the value in an integer, especially when the value would indeed fit into a byte. Therefore, the problem is really that the Comparator used in the Query processor should be able to compare numeric values in the same type family (byte, short, int, long), upcasting where appropriate.
If you feel so inclined, feel free to file a JIRA ticket in the Apache Geode JIRA repository at https://issues.apache.org/jira/browse/GEODE-72?jql=project%20%3D%20GEODE
Note, Apache Geode is the open source "core" of Pivotal GemFire now. See the Apache Geode website for more details.
Cheers!
Your best bet would be to take care of this with a custom module or a Groovy script. You can either write a custom module in Java to do the conversion, upload it into Spring XD, and then reference it like any other processor; or you can write a script in Groovy and pass the incoming data through a transform processor.
http://docs.spring.io/spring-xd/docs/current/reference/html/#processors
The actual conversion probably won't be too tricky, but will vary depending on which method you use. The stream creation would look something like this when you're done.
stream create --name myRabbitStream --definition "rabbit | my-custom-module | gemfire-json-server etc....."
stream create --name myRabbitStream --definition "rabbit | transform --script=file:/transform.groovy | gemfire-json-server etc...."
It seems like you have your source and sink modules set up just fine, so all you need to do is get your processor module setup to do the conversion and you should be all set.

ElasticSearch Get Index Names and Store Size

I am attempting to capture a list of all the indexes and their sizes in a way that I could retrieve the information using Angular's $http service and then iterate through it using ng-repeat, preferably with something like:
<ul ng-repeat="elsindex in elsIndexHttpResponse">
<li>{{elsindex.name}}:{{elsindex.size}}</li>
</ul>
The closest thing I have found is this:
http://localhost:9200/_cat/indices?h=index,store.size
Except:
a. its responses are not in JSON, so easily referencing them using the ng-repeat <li> elements isn't going to work; and
b. I would like, if possible, to get the size output to reflect the same unit size (like bytes).
If this involves something complicated then I'd be grateful for pointers on where I should focus.
I am using elasticsearch v1.4.4
Many thanks
I realize this question is dated already, but I wanted to add my 2 cents.
http://localhost:9200/_cat/indices?h=index,store.size&bytes=kb&format=json
Would actually get you exactly what you requested:
format=json -> formats the output to json
bytes=kb -> outputs the size in kilobytes
Information regarding the size unit was retrieved from the cat APIs doc (possible values for the bytes argument).
Information regarding the format came from an attempt in Sense, which has some auto-completion features that are quite useful for discovering such options.
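To tie this back to the ng-repeat in the question, a small sketch: the index and store.size field names come from the h= parameter, while the {name, size} shape is just an assumption to match the template.

# One object per index; bytes=b keeps sizes in plain bytes
curl -s 'http://localhost:9200/_cat/indices?h=index,store.size&bytes=b&format=json' \
  | jq 'map({name: .index, size: ."store.size"})'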
Cheers.
Index size in bytes is included with an indices stats API call:
curl http://localhost:9200/_stats/indexing,store
For nicely formatted JSON output, append ?pretty to the end of the URL:
curl http://localhost:9200/_stats/indexing,store?pretty
See the Indices stats API documentation for additional details and related information.
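A hedged sketch of reshaping the stats output into the same kind of list with jq, assuming the usual .indices.<name>.total.store.size_in_bytes layout of the stats response:

# Store size in bytes per index, reshaped for easy iteration
curl -s 'http://localhost:9200/_stats/store' \
  | jq '.indices | to_entries | map({name: .key, size: .value.total.store.size_in_bytes})'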
Just a slight modification of the above answer.
curl -X GET "localhost:9200/_cat/indices?h=index,store.size&bytes=gb&pretty"
In case you want the size of a particular index, the below API works fine on Elasticsearch 7.14.
curl http://10.29.61.105:9200/employee/_stats where employee is the desired index name.

Edit json object by lua script in redis

I want to edit my JSON object before it comes back from the Redis server.
In my Redis server I have 4 keys:
user:1 {"Id":"1","Name":"Gholzam","Posts":"[]"}
user:1:post:1 {"PostId":"1","Content":"Test content"}
user:1:post:2 {"PostId":"2","Content":"Test content"}
user:1:post:3 {"PostId":"3","Content":"Test content"}
I want to get the following result with a Lua script. How?
{"Id":"1","Name":"Gholzam","Posts":[{"PostId":"1","Content":"Test content"},{"PostId":"2","Content":"Test content"},{"PostId":"3","Content":"Test content"}]}
The choice of client here is largely irrelevant; the important thing to do is figure out the data storage. You say you have 4 keys, but it is not obvious to me how we know, given user:1, what the posts are. Common approaches there include:
have a set called user:1:posts (or something similar) which contains either the full keys (user:1:post:1, etc) or the relative keys (1, etc)
have a hash called user:1:posts (or something similar) which contains the posts keyed by their id
I'd be tempted to use the latter approach, as it is more direct - so I might have:
user:1, a string with contents {"Id":"1","Name":"Gholzam","Posts":"[]"}
user:1:posts, a hash with 3 pairs:
key 1 with value {"PostId":"1","Content":"Test content"}
key 2 with value {"PostId":"2","Content":"Test content"}
key 3 with value {"PostId":"3","Content":"Test content"}
Then you can use hgetall or hvals to get the posts easily.
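For example, a minimal redis-cli sketch of that layout, using the sample data from the question (key and field names follow the answer above):

# String for the user, hash for the posts keyed by post id
redis-cli SET user:1 '{"Id":"1","Name":"Gholzam","Posts":"[]"}'
redis-cli HSET user:1:posts 1 '{"PostId":"1","Content":"Test content"}'
redis-cli HSET user:1:posts 2 '{"PostId":"2","Content":"Test content"}'
redis-cli HSET user:1:posts 3 '{"PostId":"3","Content":"Test content"}'

# All posts for user 1, one JSON document per value
redis-cli HVALS user:1:posts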
The second part is how to manipulate JSON at the server. The good news here is that Redis provides access to JSON tools inside Lua via cjson.
I am an expert in neither cjson nor lua; however, frankly my advice is: don't do this. IMO, redis works best if you let it focus on what it is great at: storage and retrieval. You probably can bend it to your whim, but I would be very tempted to do any json manipulation outside of redis.