I am trying to figure out the best way to write all of our logs as a single line of JSON using Log4j 2. Can anyone suggest an appender or layout to achieve this? Any help would be appreciated. Currently I am converting the data into JSON myself and logging it at particular levels, but I want this to happen automatically.
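For reference, a minimal sketch of a log4j2.xml, assuming the built-in JsonLayout (which needs Jackson on the classpath); with compact="true" and eventEol="true" every event is written as one JSON object per line. The appender and logger names here are just placeholders.

    <?xml version="1.0" encoding="UTF-8"?>
    <Configuration status="WARN">
      <Appenders>
        <!-- Console appender; a File/RollingFile appender works the same way -->
        <Console name="JsonConsole" target="SYSTEM_OUT">
          <!-- compact="true" removes pretty-printing, eventEol="true" adds a newline
               after each event, so every log event becomes exactly one JSON line -->
          <JsonLayout compact="true" eventEol="true" properties="true"/>
        </Console>
      </Appenders>
      <Loggers>
        <Root level="info">
          <AppenderRef ref="JsonConsole"/>
        </Root>
      </Loggers>
    </Configuration>

The newer JsonTemplateLayout (a separate log4j-layout-template-json module) is another option if you need more control over the field names.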
I'm looking for ideas for an open-source ETL or data-processing tool that can monitor a folder for CSV files, then open and parse each CSV.
For each CSV row the software will transform the CSV into a JSON format and make an API call to start a Camunda BPM process, passing the cell data as variables into the process.
Looking for ideas,
Thanks
You can use a Java WatchService or Spring FileSystemWatcher as discussed here with examples:
How to monitor folder/directory in spring?
which also references:
https://www.baeldung.com/java-nio2-watchservice
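As a minimal sketch of the plain-Java WatchService approach (the folder path and the processCsv hand-off are placeholders you would replace with your own logic):

    import java.nio.file.*;

    public class CsvFolderWatcher {
        public static void main(String[] args) throws Exception {
            Path dir = Paths.get("/data/csv-inbox"); // placeholder folder
            try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
                dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
                while (true) {
                    WatchKey key = watcher.take(); // blocks until an event arrives
                    for (WatchEvent<?> event : key.pollEvents()) {
                        Path file = dir.resolve((Path) event.context());
                        if (file.toString().endsWith(".csv")) {
                            processCsv(file); // hypothetical hand-off: parse rows and start a process per row
                        }
                    }
                    if (!key.reset()) break; // directory no longer accessible
                }
            }
        }

        private static void processCsv(Path file) {
            // placeholder: parse the CSV and call the Camunda REST API for each row
        }
    }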
Once you have picked up the CSV, you can use my example here as inspiration or extend it: https://github.com/rob2universe/csv-process-starter, specifically
https://github.com/rob2universe/csv-process-starter/blob/main/src/main/java/com/camunda/example/service/CsvConverter.java#L48
The example starts a configurable process for every row in the CSV and includes the content of the row as JSON process data.
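To illustrate the idea (not the repo's exact code), a hedged sketch of starting a process instance per row via Camunda's REST API; the base URL, process key, and variable names are placeholders and depend on your setup:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class StartProcessPerRow {
        public static void main(String[] args) throws Exception {
            // one CSV row packed into Camunda's REST variable format (values are placeholders)
            String body = """
                {"variables": {
                   "firstName": {"value": "alice", "type": "String"},
                   "age":       {"value": 30,      "type": "Integer"}
                 }}""";

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:8080/engine-rest/process-definition/key/csvProcess/start"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }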
I wanted to limit the dependencies of this example, so the CSV parsing logic applied is very simple. Commas inside field values may break the example, and special characters may not be handled correctly. A more robust implementation could replace the simple Java String.split(",") with an existing CSV parser library such as OpenCSV.
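As a rough sketch of what that swap could look like with OpenCSV (class and method names as in recent OpenCSV versions; the file name is a placeholder):

    import com.opencsv.CSVReader;
    import java.io.FileReader;

    public class OpenCsvExample {
        public static void main(String[] args) throws Exception {
            try (CSVReader reader = new CSVReader(new FileReader("input.csv"))) {
                String[] row;
                // readNext() honours quoted fields, so commas inside values are safe
                while ((row = reader.readNext()) != null) {
                    // placeholder: map the columns to process variables here
                    System.out.println(String.join(" | ", row));
                }
            }
        }
    }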
The file watcher would actually be a nice extension to the example. I may add it when I get around to it, but would also accept a pull request in case you fork my project.
I need some guidance on how to proceed with a problem.
Our integration team receives XML files which are converted to JSON and sent to Pub/Sub. We then ingest the JSON files (or are supposed to) into BigQuery.
The problem is that the XML files do not include all possible objects or values all the time, so I can't create a correct schema in BigQuery to receive the JSON files. I have the XSD file, plus an extension file, which gives me all possible objects, but I don't know how to convert this into a correct BigQuery schema.
Do you have any suggestions on how to create a BigQuery schema from XSD files? I was thinking that if I create an XML file with dummy data (including all objects, and more than one instance of any repeated objects) with the help of the XSD, maybe that XML file could be converted to JSON and then run through BigQuery's schema auto-detection.
Any suggestions?
Thanks,
Cris
If you have the XSD schema files, you can convert these to a valid JSON schema. There are a few tools that can help you accomplish this.
Keep in mind that these tools are general-purpose and not specific to BigQuery, so you'll have to tune the result to get a valid BigQuery schema. For this, check the components of a BigQuery schema and, for quick reference, the sample provided in the documentation.
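For illustration, a hedged sketch of what the tuned result could look like as a BigQuery table schema file, assuming an invented order/items structure; nested objects become RECORD fields and repeating elements get mode REPEATED:

    [
      { "name": "orderId",   "type": "STRING",  "mode": "REQUIRED" },
      { "name": "orderDate", "type": "DATE",    "mode": "NULLABLE" },
      { "name": "items",     "type": "RECORD",  "mode": "REPEATED",
        "fields": [
          { "name": "sku",      "type": "STRING",  "mode": "NULLABLE" },
          { "name": "quantity", "type": "INTEGER", "mode": "NULLABLE" }
        ]
      }
    ]

Optional XSD elements would map to NULLABLE fields, and elements with maxOccurs greater than one to REPEATED.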
I am wondering if it is possible to extract the metadata of a flat file into a CSV using Pentaho Spoon. What I mean by that is, for example, take a CSV file input step, choose the file you want to read, and then somehow get access to the metadata of that file and export it into a CSV.
I found in the documentation a step called Metadata Structured that was introduced in 3.1.0, but I can't find it in the latest version of Spoon; maybe it has been removed by now.
Update: I found the "Metadata structure of stream" step, which almost does what I need. Right now my transformation looks like this: CSV file input -> Metadata structure of stream -> Text file output. The problem is that it doesn't extract all the metadata: it misses Format, Decimal and Group. It also gives me an Origin column that I don't really need and have to get rid of.
Update 2: I keep trying to get at the missing columns, but the problem is that the Metadata structure of stream step only outputs the columns "Position, Fieldname, Comments, Type, Length, Precision, Origin", so I cannot access, for example, the Format column that is an input for the step :( I can't find a workaround for this.
We're constructing a network of data, and part of that includes modifying a search query from a public website to pull all of the data we want. That data, however, when pulled, is stored in a JSON text file.
Ultimately we want this data to be stored in an Access database, so the next step, we thought, was to convert it to XML so we can have an Excel sheet to import. We found a formatting tool (http://jsonformatter.org). When running the tool we received the following error:
“Microsoft Access has encountered an error processing the XML schema in file ‘Data.xml’,
A document must contain exactly one root element”
I've no idea what this entails or where to start debugging. Are there alternatives we might consider?
The error says that there is more than one root element. Have you validated the generated XML? I looked at the website. I tried to ask via comment but I don't have enough rep; you should post some of your JSON and XML.
If I am reading your issue correctly, you are converting JSON to XML format and then to Excel?
I would suggest writing some code to consume the JSON and export XML files for the import.
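If you go the code route, a rough sketch (assuming the Jackson and jackson-dataformat-xml libraries; Data.txt, Data.xml and the root name are placeholders) that wraps everything under a single root element, which is what the "exactly one root element" rule requires:

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.dataformat.xml.XmlMapper;
    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class JsonToXml {
        public static void main(String[] args) throws Exception {
            // read the pulled JSON (often a top-level array of records)
            JsonNode data = new ObjectMapper().readTree(new File("Data.txt"));

            // serialize it as XML under a single wrapping root element
            String xml = new XmlMapper().writer()
                    .withRootName("Records")   // placeholder root name
                    .writeValueAsString(data);

            Files.writeString(Path.of("Data.xml"), xml);
        }
    }

The element names Jackson picks for array entries may need tuning, but the important part is the single wrapping root.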
I'm using Mule CE and need to remove the first row (containing headers) of an incoming CSV file so that I can pass it to a transformer to be converted to maps. How would I achieve this? My guess would be some kind of splitter, but I've not had any luck so far.
The jdbc:csv-to-maps-transformer contains a property ignoreFirstRecord, but if your CSV transformations are done with a custom transformer, you may have to do this with Java (or one of the scripting languages). I don't think there is any standard Mule component that does this out of the box.
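As a rough sketch of the Java route (plain Java that you could call from a custom transformer or a scripting component; no Mule-specific API is assumed here), dropping the first line of the CSV payload could look like:

    public class CsvHeaderStripper {

        // Returns the CSV content without its first (header) line.
        public static String stripHeader(String csv) {
            int firstBreak = csv.indexOf('\n');
            // no line break found: the file is header-only, so return an empty body
            return firstBreak < 0 ? "" : csv.substring(firstBreak + 1);
        }

        public static void main(String[] args) {
            String csv = "name,age\nalice,30\nbob,25\n";
            System.out.println(stripHeader(csv)); // prints only the two data rows
        }
    }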