CSV to JSON and add title - json

I have a csv document:
{
"epsilon_id": 194029423,
"weather": "cloudy",
"temperature": 27
},
{
"epsilon_id": 932856192,
"weather": "sunny",
"temperature": 31
}
I was wondering if there was a tool to make it into valid json where the field epsilon_id is the title for the data.
ex:
{
194029423: {
"weather": "cloudy",
"temperature": 27
},
932856192: {
"weather": "sunny",
"temperature": 31
}
}
I would prefer it to be a program (in whatever language) that I can run because I have 1,000 entries in my test sample and I will have tens of thousands in my final copy.
Any help would be much appreciated!

You are looking at a JSON transformation, which can of course be achieved with custom programming. I can explain how to do this in Java, but functionally it will be the same in any programming language of your choice.
Your input JSON will look like this:
[{
"epsilon_id": 194029423,
"weather": "cloudy",
"temperature": 27
},
{
"epsilon_id": 932856192,
"weather": "sunny",
"temperature": 31
}]
When you parse it in Java using the popular Jackson library, you will get a list of objects of the class below:
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.annotation.JsonProperty.Access;

class Input
{
    // epsilon_id is read from the input JSON but omitted when the object is serialized back out
    @JsonProperty(access = Access.WRITE_ONLY)
    Integer epsilon_id;
    String weather;
    int temperature;
}
Then you create a Map&lt;Integer, Input&gt; and populate it like below:
Map<Integer, Input> map = new HashMap<>();
for (Input obj : listOfInputs) {
    map.put(obj.epsilon_id, obj);
}
Serialize your result map using Jackson again to get your desired output format:
{
194029423: {
"weather": "cloudy",
"temperature": 27
},
932856192: {
"weather": "sunny",
"temperature": 31
}
}
If you are not very familiar with Java & Jackson JSON parsing, I found a tutorial with a code sample which will give you a head start.
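For completeness, here is a minimal end-to-end sketch of the flow described above using Jackson's ObjectMapper. The file names (input.json, output.json), the Transform class name, and the field-visibility tweak are illustrative assumptions, not anything the answer above specifies.

import com.fasterxml.jackson.annotation.JsonAutoDetect;
import com.fasterxml.jackson.annotation.PropertyAccessor;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.File;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Transform {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        // Let Jackson see the package-private fields of the Input class above
        mapper.setVisibility(PropertyAccessor.FIELD, JsonAutoDetect.Visibility.ANY);

        // Parse the input array into a list of Input objects
        List<Input> listOfInputs = mapper.readValue(
                new File("input.json"), new TypeReference<List<Input>>() {});

        // Key each entry by its epsilon_id; the WRITE_ONLY access on the field
        // keeps epsilon_id out of the serialized values
        Map<Integer, Input> map = new HashMap<>();
        for (Input obj : listOfInputs) {
            map.put(obj.epsilon_id, obj);
        }

        // JSON object keys are always strings, so the ids come out quoted
        // ("194029423": {...}) rather than as bare numbers
        mapper.writerWithDefaultPrettyPrinter().writeValue(new File("output.json"), map);
    }
}

Note that this is still a single in-memory pass over the data, which should be fine for the tens of thousands of entries you mention.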

import csv
import json

# rename this file below, or read the path from sys.argv[1]
with open("./foo.csv") as f:
    output = {}
    for line in csv.DictReader(f):
        key = line.pop("epsilon_id")
        if key in output:
            print("Duplicate Id -> {}".format(key))
        output[key] = line

# either pipe this output into another file...
print(json.dumps(output, indent=2))

# ...or write a file directly
with open("/tmp/foo.json", 'w') as f:
    json.dump(output, f)
Python makes it pretty easy. One caveat: csv.DictReader reads every field as a string, so convert numeric fields (such as temperature) yourself if you need real numbers in the output.
demo: https://repl.it/#markboyle/UsableIcyFeed

Related

Create json object and append json into existing json file using Groovy

I'm pretty new to Groovy (and JSON) and have been playing around to get the expected solution, but I haven't been able to achieve it.
What I'm trying to do is parse an existing JSON file and then add/append additional entries, as in the example below:
Original Json File:
{
"stack-1": {
"name": "demo",
"createdAt": "11:00 PM",
"owner": "sarva",
"dbName": "DB"
}
}
New Json Content from Json Builder:
{
"stack-2": {
"name": "demo-2",
"createdAt": "15:00 PM",
"owner": "bhowma",
"dbName": "DB2"
}
}
Intended Json output after merge:
{
"stack-1": {
"name": "demo",
"createdAt": "11:00 PM",
"owner": "sarva",
"dbName": "DB"
},
"stack-2": {
"name": "demo-2",
"createdAt": "15:00 PM",
"owner": "Bhowma",
"dbName": "DB2"
}
}
I've tried many variations on the following code snippet but still not quite getting the right format for my intended output.
import groovy.json.*
def number = 2
def name = "demo-2"
def createdAt = "15:00 PM"
def owner = "bhowma"
def db_name = "DB2"
// here I am loading an old json file
def jsonSlurper = new JsonSlurper()
def json = jsonSlurper.parse(new File('/tmp/sarva.json'))
// Here I am building a new JSON with the above parameters.
def builder = new JsonBuilder()
def root = builder "stack-$number": [name: name, createdAt: createdAt, owner: owner, dbName: db_name]
def newJson = jsonSlurper.parseText(builder.toPrettyString())
println(json.getClass())
println(newJson.getClass())
print json
print builder
Currently I can see the output below from both json and builder.toPrettyString(), along with their classes, but I am not able to merge them as intended. I also want this merge to work for as many JSON objects as I pass.
The current output looks like this:
class groovy.json.internal.LazyMap
class groovy.json.internal.LazyMap
[stack-1:[createdAt:11:00 PM, dbName:DB, name:demo, owner:sarva]]
{"stack-2":{"name":"demo-2","createdAt":"15:00 PM","owner":"Bhowma","dbName":"DB-2"}}
Any help sorting this would be much appreciated.
Ignoring the fact that your example is incomplete: you've parsed the original JSON into a Map, so just add the new element to the map:
// Here I am building a new Map with the above parameters.
json += ["stack-$number": [name: name, createdAt: createdAt, owner: owner, dbName: db_name]]
And then print out the new json from this map
println new JsonBuilder(json).toPrettyString()

Handle JSON response that could be list or custom object with Moshi

I am new to Moshi and am looking for a way to set up a Moshi adapter with Dagger 2 to automatically create a custom object or list based on the JSON response.
The API can return either of the following 2 responses:
[{
"item_type": "xyz",
"items": [{
"name": "foo",
"age": 22
},
{
"name": "bar",
"age": 32
}
]
}]
or
[{
"name": "foo",
"age": 22
},
{
"name": "bar",
"age": 32
}]
I looked at "Moshi - Determine if JSON is array or single object" for reference, but in my use case the whole response JSON object has a different structure. Also, since I am using Dagger 2 for dependency injection, I am not sure how and where to add the adapter, as my network module which provides Moshi is pretty generic and I won't need this custom adapter for other APIs.
@Provides
@Singleton
fun provideMoshi(): Moshi = Moshi.Builder()
    .add(Date::class.java, Rfc3339DateJsonAdapter().nullSafe())
    .build()

Parsing json in json Groovy Katalon Studio

I have a JSON text which I need to parse, but for some reason I can't, because it has another array inside. My JSON looks like this:
{
"statementId": "1",
"movements": [
{
"id": 65,
"date": "2019-02-05",
"number": 32,
"balance": -4.62,
"purpose": "1"
},
{
"id": 1,
"date": "2019-02-05",
"number": 22,
"balance": -3,
"purpose": "23"
},
{
"id": 32,
"date": "2019-02-05",
"number": 12,
"balance": -11,
"purpose": "2"
}
],
"startPointer": "1122",
"endPointer": "3333"
}
I am using JsonSlurper. I want to know if it is possible to get all the data inside "movements". I have tried this script:
JsonSlurper slurper = new JsonSlurper()
Map parsedJson = slurper.parseText(bodyContent)
String parsed_movements = parsedJson["movements"]
I have no problem parsing single strings, like statementId or startPointer, but when I try to parse movements with my script the result is null. I have also tried parsedJson["movements"][0] to get the first movement, but it also gives me an error.
I have found a lot about JSON parsers on the internet and on Stack Overflow, but nothing that matches what I'm looking for. I really don't think this is a duplicate question.
EDIT: I also tried a for statement to put each object into an array, like this:
def movements_array = []
for (def i = 0; i < parsedJson.movements.size(); i++) {
    movements_array << parsedJson.movements[i].id
    println(movements_array)
}
But it gives me an error: Cannot invoke method size() on null object, because parsedJson.movements is null.
When you do:
String parsed_movements = parsedJson["movements"]
You're sticking a list of maps into a String, which isn't what you want.
Given the JSON in your question, you can just do:
def movementIds = new JsonSlurper().parseText(bodyContents).movements.id
to get the list [65, 1, 32].
If you're getting NPEs, I assume the JSON isn't what you show in the question.

Parsing and cleaning text file in Python?

I have a text file which contains raw data. I want to parse that data and clean it so that it can be used further. The following is the raw data:
"{\x0A \x22identifier\x22: {\x0A \x22company_code\x22: \x22TSC\x22,\x0A \x22product_type\x22: \x22airtime-ctg\x22,\x0A \x22host_type\x22: \x22android\x22\x0A },\x0A \x22id\x22: {\x0A \x22type\x22: \x22guest\x22,\x0A \x22group\x22: \x22guest\x22,\x0A \x22uuid\x22: \x221a0d4d6e-0c00-11e7-a16f-0242ac110002\x22,\x0A \x22device_id\x22: \x22423e49efa4b8b013\x22\x0A },\x0A \x22stats\x22: [\x0A {\x0A \x22timestamp\x22: \x222017-03-22T03:21:11+0000\x22,\x0A \x22software_id\x22: \x22A-ACTG\x22,\x0A \x22action_id\x22: \x22open_app\x22,\x0A \x22values\x22: {\x0A \x22device_id\x22: \x22423e49efa4b8b013\x22,\x0A \x22language\x22: \x22en\x22\x0A }\x0A }\x0A ]\x0A}"
I want to remove all the hexadecimal escape sequences. I tried parsing the data, storing it in a list, and cleaning it with re.sub(), but it gives back the same data.
for line in f:
    new_data = re.sub(r'[^\x00-\x7f],\x22', r'', line)
    data.append(new_data)
\x0A is the hex code for newline. After s = <your json string>, print(s) gives
>>> print(s)
{
"identifier": {
"company_code": "TSC",
"product_type": "airtime-ctg",
"host_type": "android"
},
"id": {
"type": "guest",
"group": "guest",
"uuid": "1a0d4d6e-0c00-11e7-a16f-0242ac110002",
"device_id": "423e49efa4b8b013"
},
"stats": [
{
"timestamp": "2017-03-22T03:21:11+0000",
"software_id": "A-ACTG",
"action_id": "open_app",
"values": {
"device_id": "423e49efa4b8b013",
"language": "en"
}
}
]
}
You should parse this with the json module's load (from a file) or loads (from a string) function. You will get a dict containing two dicts and a list holding one dict.

Streaming huge json with Akka Stream

I have a problem with a huge HTTP response containing a JSON slab, where only a portion is of interest.
I cannot change the response structure.
Here is an example:
{
"searchString": "search",
"redirectUrl": "",
"0": {
"numRecords": 123,
"refinementViewModelCollector": {},
// Lots of data here
"results": [
{
"productCode": "123",
"productShortDescription": "Desc",
"brand": "Brand",
"productReview": {
"reviewScore": 0
},
"priceView": {
"salePriceDisplayable": false,
},
"productImageUrl": "url",
"alternateImageUrls": [
"url1"
],
"largeProductImageUrl": "url4",
"videoUrl": ""
},
{
"productCode": "124",
"productShortDescription": "Desc",
"brand": "Brand",
"productReview": {
"reviewScore": 0
},
"priceView": {
"salePriceDisplayable": false,
},
"preOrder": false,
"productImageUrl": "url",
"alternateImageUrls": [
"url1"
],
"largeProductImageUrl": "url4",
"videoUrl": ""
}
]
//lots of data here
}
}
My point of interest is the entries in the results JSON array, but they are sitting in the middle of the JSON.
I created a small Play WS Client like this:
val wsClient: WSClient = ???
val ret = wsClient.url("url").stream()
ret.flatMap { response =>
  response.body.via(JsonFraming.objectScanner(1024))
    .map(_.utf8String)
    .runWith(Sink.foreach(println))
}
This will not work because it treats the whole JSON slab as a single JSON object. I need to skip data until the "results": entry appears in the stream, then start parsing entries and skip all the rest.
Any ideas how to do this?
Check out Alpakka's JSON module, which can stream specific parts of a nested JSON structure:
response
  .body
  .via(JsonReader.select("$.0.results[*]"))
  .map(_.utf8String)
  .runWith(Sink.foreach(println)) // or runForeach(println)
There are parsers that support parsing as a stream. For a good example check out this Circe example https://github.com/circe/circe/tree/master/examples/sf-city-lots
I'd love a better, Scala-specific answer to this question, but check out the "Mixed Reads Example" in the documentation for Google's GSON library:
https://sites.google.com/site/gson/streaming
Gson also supports mixed streaming & object model access. This lets your application have the best of both worlds: the productivity of object model access with the efficiency of streaming
...
This code reads a JSON document containing an array of messages. It steps through array elements as a stream to avoid loading the complete document into memory. It is concise because it uses Gson’s object-model to parse the individual messages
This should have great memory performance (the code reads from a Java InputStream, so the full structure is never in memory), but it may require some effort to get your results into Scala case classes.
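To make that concrete, below is a rough Java sketch of Gson's mixed streaming/object-model pattern applied to the response shape above. The Result class, its fields, and the streamResults method are hypothetical names covering only part of the payload; treat it as an illustration of the JsonReader API under those assumptions, not a drop-in solution, and you would still need to bridge it into your Akka/Scala code.

import com.google.gson.Gson;
import com.google.gson.stream.JsonReader;

import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical value class for one entry of the "results" array
class Result {
    String productCode;
    String productShortDescription;
    String brand;
}

public class ResultsStreamer {
    // Walks the document token by token and only materializes the objects
    // under "0" -> "results"; everything else is skipped without buffering.
    static void streamResults(InputStream in) throws Exception {
        Gson gson = new Gson();
        try (JsonReader reader = new JsonReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            reader.beginObject();                       // outermost { ... }
            while (reader.hasNext()) {
                if (!reader.nextName().equals("0")) {   // searchString, redirectUrl, ...
                    reader.skipValue();
                    continue;
                }
                reader.beginObject();
                while (reader.hasNext()) {
                    if (!reader.nextName().equals("results")) {
                        reader.skipValue();             // numRecords, refinementViewModelCollector, ...
                        continue;
                    }
                    reader.beginArray();
                    while (reader.hasNext()) {
                        // Object-model parsing of one array element at a time
                        Result r = gson.fromJson(reader, Result.class);
                        System.out.println(r.productCode + " " + r.brand);
                    }
                    reader.endArray();
                }
                reader.endObject();
            }
            reader.endObject();
        }
    }
}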