How to create an Elasticsearch index document from an S3 JSON file object - json

S3 to Elasticsearch
We have JSON files in S3 and want to ship each S3 JSON object to Elasticsearch as a document (one JSON file in S3 becomes one document in Elasticsearch).
Push S3 JSON objects (the S3 bucket contains JSON files); each object should become one document.
Here is my pipeline:
input {
  s3 {
    access_key_id => "MY_KEY"
    secret_access_key => "MY_SECRET"
    bucket => "sthreetoes"
    region => "ap-south-1"
    codec => "json"
  }
}
output {
  elasticsearch {
    hosts => "http://elasticsearch:9200"
    index => "test-data"
  }
}
When I run the pipeline, Logstash ships to the index treating each key-value pair as a separate document.
JSON object in S3:
{
  "id": 246,
  "first_name": "Lell",
  "last_name": "Bsel",
  "email": "lbros6t#sakura.jp",
  "gender": "Female",
  "ip_address": "12.12.12.8"
}
When we run the Logstash pipeline, the index is created with 6 documents, treating each key-value pair in the JSON as a separate message.
Is there any solution to save the JSON object as a single document?
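One possible workaround, sketched below rather than a confirmed fix, assumes each S3 file is a pretty-printed multi-line JSON object ending with "}" at the start of a line: read the whole file as a single event with a multiline codec (the pattern shown is illustrative) and then parse it with the json filter.
input {
  s3 {
    access_key_id => "MY_KEY"
    secret_access_key => "MY_SECRET"
    bucket => "sthreetoes"
    region => "ap-south-1"
    # Fold every line that is not the closing brace into the following line,
    # so one whole file becomes one event instead of one event per line.
    codec => multiline {
      pattern => "^}"
      negate => true
      what => "next"
    }
  }
}
filter {
  # Parse the accumulated text as a single JSON object and drop the raw text.
  json {
    source => "message"
    remove_field => ["message"]
  }
}
output {
  elasticsearch {
    hosts => "http://elasticsearch:9200"
    index => "test-data"
  }
}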

Related

Azure Data Factory - convert Json Array to Json Object

I retrieve data using Azure Data Factory from an on-premises database and the output I get is as follows:
{
  "value": [
    {
      "JSON_F52E2B61-18A1-11d1-B105-XXXXXXX": "{\"productUsages\":[{\"customerId\":3552,\"productId\":120,\"productionDate\":\"2015-02-10\",\"quantity\":1,\"userName\":\"XXXXXXXX\",\"productUsageId\":XXXXXX},{\"customerId\":5098,\"productId\":120,\"productionDate\":\"2015-04-07\",\"quantity\":1,\"userName\":\"ZZZZZZZ\",\"productUsageId\":ZZZZZZ}]}"
    }
  ]
}
The entire value array is being serialized into JSON and I end up with:
[
  {
    "productUsages": [
      {
        "customerId": 3552,
        "productId": 120,
        "productionDate": "2015-02-10",
        "quantity": 1,
        "userName": "XXXXXXXX",
        "productUsageId": XXXXXX
      },
      {
        "customerId": 5098,
        "productId": 120,
        "productionDate": "2015-04-07",
        "quantity": 1,
        "userName": "ZZZZZZZ",
        "productUsageId": ZZZZZZZ
      }
    ]
  }
]
I need to have a Json Object at a root level, not Json Array ([] replaced with {}). What's the easiest way to achieve that in Azure Data Factory?
Thanks
In ADF, when you read any JSON file, it is read as an array of objects by default.
Sample data while reading the JSON data (data preview screenshot).
But when you want to move the data to the sink in JSON format, there is an option called 'Set of objects' that you need to select.
Sample data while storing it in the sink as JSON (output screenshot).
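For illustration only (trimmed to a single hypothetical record based on the sample above): with the default 'Array of objects' file pattern the sink wraps the rows in a top-level array, while 'Set of objects' writes each row as its own root-level JSON object, which is what the question asks for.
Array of objects:
[
  { "productUsages": [ { "customerId": 3552, "productId": 120 } ] }
]
Set of objects:
{ "productUsages": [ { "customerId": 3552, "productId": 120 } ] }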

Trying to generate a json asset file in Flutter

I have created a Flutter web project and I am using the flutter_azure_b2c package, which uses JSON asset files; everything works fine when I run it.
It's shown like this (screenshot).
Now I want to generate these JSON config files automatically from environment variables when I build the app.
Here is my json file format:
{
  "client_id": "",
  "redirect_uri": "",
  "cache_location": "localStorage",
  "interaction_mode": "redirect",
  "authorities": [
    {
      "type": "B2C",
      "authority_url": ""
    },
    {
      "type": "B2C",
      "authority_url": ""
    }
  ],
  "default_scopes": []
}
How can it be done?
I tried to write into the file when executing main.dart using this package: https://pub.dev/packages/global_configuration, but the JSON values are not updated when I pass them to the flutter_azure_b2c method.

Parse complex json file in Azure Data Factory

I would like to parse a complex JSON file in Azure Data Factory. The structure, shown below, contains nested objects and arrays. From my understanding ADF can parse arrays, but what should we do in order to parse more complex files?
The structure of the file is the following:
{
"productA": {
"subcategory 1" : [
{
"name":"x",
"latest buy": "22-12-21"
"total buys": 4
"other comments": "xyzzy"
"history data": [
{
"name":"x",
"latest buy": "22-12-21"
"total buys": 4
"other comments": {"John":"Very nice","Nick":"Not nice"}
}
]
}
}
}
There seems to be an error in the JSON structure you posted; it is not valid JSON. It is missing commas (,) and braces.
When you have a valid JSON structure, you can use the flatten transformation in a Data flow to flatten the JSON.
Source:
{
  "productA": {
    "subcategory 1": [
      {
        "name": "x",
        "latest buy": "22-12-21",
        "total buys": 4,
        "other comments": "xyzzy",
        "history data": [
          {
            "name": "x",
            "latest buy": "22-12-21",
            "total buys": 4,
            "other comments": {"John": "Very nice", "Nick": "Not nice"}
          }
        ]
      }
    ]
  }
}
ADF Data flow:
Source transformation:
Connect the JSON dataset to the source transformation and, in Source options, under JSON settings, select 'Single document'.
Source preview:
Flatten transformation:
Here, select the array level that you want to unroll in 'Unroll by' and 'Unroll root', and add the mappings.
Preview of flatten:
Refer to the parse and flatten documentation for more details on parsing JSON documents in ADF.

Use Azure Data Factory to parse JSON table to csv format

I am new to data flows in ADF. I have an API response in JSON and I want to convert it to CSV format.
{ "headers": [ "SCENARIO", "LOT", "BID_SUBMISSION", "SHARE_AWARDED", "Lot ID", "Price (EUR)" ], "data-rows": [ [ "Low Cost Baseline", "Item 01", "Bidder 1", 1.0, "Item 01", 42.0 ], [ "Low Cost Baseline", "Item 02", "Bidder 2", 1.0, "Item 02", 265.0 ] ] }
You can convert JSON to CSV format using the flatten transformation in an ADF data flow.
Connect the source to the REST API dataset and create a linked service connection by providing the API details.
Select the document form as per the source JSON format.
Connect the source output to a flatten transformation to flatten the JSON. Refer to Flatten transformation in mapping data flow for the flatten settings.
Add a sink destination to store the output in CSV format by selecting a blob storage dataset with the CSV format.
Note: You can add multiple flatten/parse transformations between the source and sink to get the required output.
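For reference, a flattened CSV for the sample payload above would presumably look something like this (column order taken from the headers array; the exact shape depends on the flatten mappings and sink settings):
SCENARIO,LOT,BID_SUBMISSION,SHARE_AWARDED,Lot ID,Price (EUR)
Low Cost Baseline,Item 01,Bidder 1,1.0,Item 01,42.0
Low Cost Baseline,Item 02,Bidder 2,1.0,Item 02,265.0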

Logstash - won't parse JSON

I want to parse data into Elasticsearch using Logstash. So far this has worked great, but when I try to parse JSON files, Logstash just won't do... anything. I can start Logstash without any exception, but it won't parse anything.
Is there something wrong with my config? The path to the JSON file is correct.
my JSON:
{
  "stats": [
    {
      "liveStatistic": {
        "#scope": "21",
        "#scopeType": "foo",
        "#name": "minTime",
        "#interval": "60",
        "lastChange": "2011-01-11T15:19:53.259+02:00",
        "start": "2011-01-18T14:19:48.333+02:00",
        "unit": "s",
        "value": 10
      }
    },
    {
      "liveStatistic": {
        "#scope": "26",
        "#scopeType": "bar",
        "#name": "newCount",
        "#interval": "60",
        "lastChange": "2014-01-11T15:19:59.894+02:00",
        "start": "2014-01-12T14:19:48.333+02:00",
        "unit": 1,
        "value": 5
      }
    },
    ...
  ]
}
my Logstash agent config:
input {
  file {
    path => "/home/me/logstash-1.4.2/values/stats.json"
    codec => "json"
    start_position => "beginning"
  }
}
output {
  elasticsearch {
    host => localhost
    protocol => "http"
  }
  stdout {
    codec => rubydebug
  }
}
You should add the following line to your input:
start_position => "beginning"
Also, put the complete document on one line, and maybe add {} around your document to make it a valid JSON document.
Okay, two things:
First, the file input is by default set to start reading at the end of the file. If you want the file to start reading at the beginning, you will need to set start_position. Example:
file {
  path => "/mypath/myfile"
  codec => "json"
  start_position => "beginning"
}
Second, keep in mind that Logstash keeps a sincedb file which records how far into a file it has already read (so as not to parse the same information repeatedly!). This is usually a desirable feature, but for testing against a static file (which is what it looks like you're trying to do) you will want to work around this. There are two ways I know of.
One way is to make a new copy of the file every time you want to run Logstash, and remember to tell Logstash to read from that new file.
The other way is to go and delete the sincedb file, wherever it is located. You can tell Logstash where to write the sincedb file with the sincedb_path setting.
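A related trick, sketched here against the asker's own config (the path is theirs; the rest assumes the default file-input behaviour), is to point sincedb_path at /dev/null so that no read position is ever persisted:
input {
  file {
    path => "/home/me/logstash-1.4.2/values/stats.json"
    codec => "json"
    start_position => "beginning"
    # Nothing written to /dev/null survives, so Logstash forgets its read
    # position between runs and always re-reads the whole file.
    sincedb_path => "/dev/null"
  }
}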
I hope this all helped!