multiple pm2 logs (json + standard) - pm2

Is there any way to write pm2 logs as both json and the standard format, or is it one or the other?
I would like to keep logs in standard format for troubleshooting but I also need logs in json format for our logging software.
I'm currently using an ecosystem file with log_type: json. Based on this page: https://pm2.io/docs/runtime/guide/log-management/ it seems to be one or the other, but perhaps there is a clever way to do this?
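For reference, a minimal process file that sets the option in question looks roughly like this; the app name and script path are placeholders, not taken from the original setup:

    {
      "apps": [{
        "name": "my-app",
        "script": "./index.js",
        "log_type": "json"
      }]
    }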

Related

Backup core data, one entity only

My application requires some kind of data backup and some kind of data exchange between users, so what I want to achieve is the ability to export an entity but not the entire database.
I have found some help, but only for the full database, like this post:
Backup core data locally, and restore from backup - Swift
This applies to the entire database.
I tried exporting a JSON file; this might work, except that the entity I'm trying to export contains images as binary data.
So I'm stuck.
Any help with exporting just one entity rather than the full database, or with writing JSON that includes binary data, would be appreciated.
Take a look at protobuf. Apple has an official Swift library for it:
https://github.com/apple/swift-protobuf
Protobuf is an alternate encoding to JSON that has direct support for serializing binary data. There are client libraries for any language you might need to read the data in, or command-line tools if you want to examine the files manually.
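To make the binary-data point concrete, a hypothetical .proto schema for such an export might look like the following; the message and field names are invented, not taken from the question:

    syntax = "proto3";

    // One exported object; the image travels as a plain bytes field.
    message ExportedItem {
      string title = 1;
      bytes imageData = 2;
    }

    // An entity export is just a repeated list of items in one file.
    message ExportedEntity {
      repeated ExportedItem items = 1;
    }

Running protoc with the Swift plugin over a schema like this generates Swift types, and serializedData() on an ExportedEntity instance gives you a single blob you can write to disk or share.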

Nifi, how to produce via Kafka avro files with multiple records each file

I created a pipeline that handles a single JSON file (a vector of 5890 elements, each one a record) and sends it via Kafka in Avro format. The producer works fine, but when I read the topic with a consumer I get one FlowFile (one Avro file) per record: 5890 Avro files. How can I merge multiple records into a single Avro file?
I simply use PublishKafkaRecord_0_10 1.5.0 (JsonTreeReader 1.5.0 and AvroRecordSetWriter 1.5.0) and ConsumeKafka_0_10 1.5.0.
Firstly, NiFi 1.5.0 is from January 2018. Please consider upgrading as this is terribly out of date. NiFi 1.15.3 is the latest as of today.
Secondly, the *Kafka_0_10 processors are geared at very old versions of Kafka - are you really using v0.10 of Kafka? You have the following processors for later Kafka versions:
*Kafka_1.0 for Kafka 1.0+
*Kafka_2.0 for Kafka 2.0+
*Kafka_2.6 for Kafka 2.6+.
It would be useful if you provide examples of your input and desired output and what you are actually trying to achieve.
If you are looking to consume those messages in NiFi and you want a single FlowFile with many messages, you should use ConsumeKafkaRecord rather than ConsumeKafka. This will let you control how many records you'd like to see per 'file'.
If your consumer is not NiFi, then either they need to merge on their end, or you need to bundle all your records into one larger message when producing. However, this is not really the point of Kafka as it's not geared towards large messages/files.
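If the bundling does end up happening in custom producer code rather than in NiFi, a rough Scala sketch using the Avro Java API shows what a single Avro container holding many records looks like; the schema and field names are invented for illustration:

    import java.io.ByteArrayOutputStream
    import org.apache.avro.Schema
    import org.apache.avro.file.DataFileWriter
    import org.apache.avro.generic.{GenericData, GenericDatumWriter, GenericRecord}

    // Invented schema: each element of the JSON vector becomes one record.
    val schema: Schema = new Schema.Parser().parse(
      """{"type":"record","name":"Element","fields":[
        |  {"name":"id","type":"long"},
        |  {"name":"value","type":"string"}
        |]}""".stripMargin
    )

    val out = new ByteArrayOutputStream()
    val writer = new DataFileWriter[GenericRecord](new GenericDatumWriter[GenericRecord](schema))
    writer.create(schema, out)

    // Append every record into ONE Avro container instead of one file per record.
    (1L to 5890L).foreach { i =>
      val rec = new GenericData.Record(schema)
      rec.put("id", i)
      rec.put("value", s"element-$i")
      writer.append(rec)
    }
    writer.close()

    val singlePayload: Array[Byte] = out.toByteArray // one Kafka message or one file

Keep in mind the caveat above: a payload bundling thousands of records may exceed Kafka's default message size limits.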

How to parse json data from kafka server with spark streaming?

I managed to connect Spark Streaming to my Kafka server, in which I have data in JSON format. I want to parse these data in order to use the groupBy function as explained here: Can Apache Spark merge several similar lines into one line?
In fact, in that link the JSON data is imported from a file, which is clearly easier to handle. I didn't find something similar for a Kafka server.
Do you have any idea about this?
Thanks and regards
It's really hard to understand what you're asking because we can't see where you are now without code. Maybe this general guidance is what you need.
Your StreamingContext can be given a foreachRDD block in which you'll get an RDD. Then you can call sqlContext.read.json(inputRDD) and you will have a DataFrame which you can process however you like.
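A rough Scala sketch of that idea, assuming the old receiver-based Kafka connector and a Spark 1.x-style SQLContext (broker, topic, group, and column names are placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf().setAppName("kafka-json-groupby")
    val ssc = new StreamingContext(conf, Seconds(10))

    // (topic -> number of receiver threads for that topic)
    val stream = KafkaUtils.createStream(ssc, "zk-host:2181", "my-group", Map("my-topic" -> 1))

    stream.map(_._2).foreachRDD { rdd =>        // rdd is an RDD[String] of JSON messages
      val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
      val df = sqlContext.read.json(rdd)        // parse the JSON strings into a DataFrame
      df.groupBy("someColumn").count().show()   // "someColumn" is a placeholder
    }

    ssc.start()
    ssc.awaitTermination()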

Apache Spark - Get All Field Names From Nested Arbitrary JSON Files

I have run into a somewhat perplexing issue that has plagued me for several months now. I am trying to create an Avro Schema (schema-enforced format for serializing arbitrary data, basically, as I understand it) to convert some complex JSON files (arbitrary and nested) eventually to Parquet in a pipeline.
I am wondering if there is a reasonable way to get the superset of field names I need for this use case while staying in Apache Spark, instead of dropping to Hadoop MapReduce.
I think Apache Arrow, which is under development, might eventually help avoid this by treating JSON as a first-class citizen, but it is still a ways off.
Any guidance would be sincerely appreciated!
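One possible direction, sketched here in spark-shell style Scala and not taken from the thread itself: Spark's JSON reader infers a schema that is the union of the fields it sees across all input records, and the resulting StructType can be walked recursively to collect every (dotted) field name. The input path is a placeholder:

    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.types.{ArrayType, DataType, StructType}

    val sqlContext = new SQLContext(sc)   // sc is the usual SparkContext

    // Schema inference merges the fields seen across all input JSON records.
    val df = sqlContext.read.json("hdfs:///data/raw/*.json")

    // Recursively collect dotted field names from the inferred schema.
    def fieldNames(dt: DataType, prefix: String = ""): Seq[String] = dt match {
      case st: StructType =>
        st.fields.toSeq.flatMap { f =>
          val name = if (prefix.isEmpty) f.name else s"$prefix.${f.name}"
          name +: fieldNames(f.dataType, name)
        }
      case ArrayType(elementType, _) => fieldNames(elementType, prefix)
      case _ => Seq.empty
    }

    fieldNames(df.schema).distinct.foreach(println)

The resulting list of names is the superset that an Avro (and later Parquet) schema would need to cover, though reconciling types across files is a separate problem.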

Using Parquet format with Cygnus

I would like to store the event data in Parquet format (e.g., on HDFS). Do I need to modify the code of the corresponding sinks, or is there a way around it, e.g., using a Flume interceptor? Thanks.
On the one hand, there was an issue regarding Cygnus about modifying the code with the goal of supporting multiple output formats when writing to HDFS. The modification was done, but only our custom JSON and CSV formats were coded. This means the code is ready to be modified in order to add a third format. I've added a new issue regarding specific Parquet support in OrionHDFSSink; if you finally decide to do the modification, I can assign you the issue :)
On the other hand, you can always use the native HDFS sink (which persists the whole notified body) and, indeed, program a custom interceptor.
As you can see, in both cases you will have to code the Parquet part (or wait until we have room for implementing it).
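For the interceptor route, a bare-bones skeleton (sketched in Scala; the class name is invented, and the actual body transformation, i.e. the Parquet-related part you would have to write, is only marked with a comment) could look like this:

    import java.util.{List => JList}
    import scala.collection.JavaConverters._
    import org.apache.flume.{Context, Event}
    import org.apache.flume.interceptor.Interceptor

    // Invented name; it is wired into the Flume agent via its Builder class.
    class BodyTransformInterceptor extends Interceptor {
      override def initialize(): Unit = ()

      override def intercept(event: Event): Event = {
        // Transform event.getBody (the notified payload) here before the
        // HDFS sink persists it.
        event
      }

      override def intercept(events: JList[Event]): JList[Event] =
        events.asScala.map(e => intercept(e)).asJava

      override def close(): Unit = ()
    }

    object BodyTransformInterceptor {
      class Builder extends Interceptor.Builder {
        override def build(): Interceptor = new BodyTransformInterceptor
        override def configure(context: Context): Unit = ()
      }
    }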