Merge several json arrays in circe - json

Let's say we have 2 json arrays. How to merge them into a single array with circe? Example:
Array 1:
[{"id": 1}, {"id": 2}, {"id": 3}]
Array 2:
[{"id": 4}, {"id": 5}, {"id": 6}]
Needed:
[{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}, {"id": 5}, {"id": 6}]
I've tried deepMerge, but it only keeps the contents of the argument, not of the calling object.

Suppose we've got the following set-up (I'm using circe-literal for convenience, but your Json values could come from anywhere):
import io.circe.Json, io.circe.literal._
val a1: Json = json"""[{"id": 1}, {"id": 2}, {"id": 3}]"""
val a2: Json = json"""[{"id": 4}, {"id": 5}, {"id": 6}]"""
Now we can combine them like this:
for { a1s <- a1.asArray; a2s <- a2.asArray } yield Json.fromValues(a1s ++ a2s)
Or:
import cats.std.option._, cats.syntax.cartesian._
(a1.asArray |@| a2.asArray).map(_ ++ _).map(Json.fromValues)
Both of these approaches give you an Option[Json] that will be None if either a1 or a2 doesn't represent a JSON array. It's up to you to decide what you want to happen in that situation: .getOrElse(a2) or .getOrElse(a1.deepMerge(a2)) might be reasonable choices, for example.
As a side note, the current contract of deepMerge says the following:
Null, Array, Boolean, String and Number are treated as values, and values from the argument JSON completely replace values from this JSON.
This isn't set in stone, though, and it might not be unreasonable to have deepMerge concatenate JSON arrays—if you want to open an issue we can do some more thinking about it.

Related

MySQL: Update specific values in JSON array of objects

I'm using MySQL 5.7.12, and have a JSON column with the following data:
[{"tpe": "I", "val": 1}, {"tpe": "C", "val": 2}, {"tpe": "A", "val": 3}]
I would like to UPDATE val from 2 into 20 WHERE tpe='C'.
Here is my attempt:
UPDATE user SET data = JSON_SET(data->"$[1]", '$.val', 20);
This does update the value, but it drops the other elements of the array and the column becomes a single JSON object. Here is how it looks after the update:
{"tpe": "C", "val": 20}
How can I get this right?
2nd question: is there a way to dynamically get the json object in the array so I don't have to hard code "$[1]" ? I tried to use JSON_SEARCH ??

Count elements in nested JSON with jq

I am trying to count all elements in a nested JSON document with jq.
Given the following JSON-document
{"a": true, "b": [1, 2], "c": {"a": {"aa":1, "bb": 2}, "b": "blue"}}
I want to calculate the result 6.
In order to do this, I tried the following:
echo '{"a": true, "b": [1, 2], "c": {"a": {"aa":1, "bb": 2}, "b": "blue"}}' \
| jq 'reduce (.. | if (type == "object" or type == "array")
then length else 0 end) as $counts
(1; . + $counts)'
# Actual output: 10
# Desired output: 6
However, this counts the encountered objects and arrays as well, and therefore yields 10 as opposed to the desired output of 6.
So, how can I count only the document's elements/leaf nodes?
Thanks in advance for your help!
Edit: What would be an efficient approach to count empty arrays and objects as well?
You can use the scalars filter to find leaf nodes. Scalars are all "simple" JSON values, i.e. null, true, false, numbers and strings. Alternatively you can compare the type of each item and use length to determine if an object or array has children.
I've expanded your input data a little to distinguish a few more corner cases:
Input:
{
  "a": true,
  "b": [1, 2],
  "c": {
    "a": {
      "aa": 1,
      "bb": 2
    },
    "b": "blue"
  },
  "d": [],
  "e": [[], []],
  "f": {}
}
This has 15 JSON entities:
5 of them are arrays or objects with children.
4 of them are empty arrays or objects.
6 of them are scalars.
Depending on what you're trying to do, you might consider only scalars to be "leaf nodes", or you might consider both scalars and empty arrays and objects to be leaf nodes.
Here's a filter that counts scalars:
[..|scalars]|length
Output:
6
And here's a filter that counts all entities which have no children. It just checks for all the scalar types explicitly (there are only six possible types for a JSON value) and if it's not one of those it must be an array or object, where we can check how many children it has with length.
[
  ..|
  select(
    (type|IN("boolean","number","string","null")) or
    length==0
  )
]|
length
Output:
10
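As a cross-check on the three counts above, here is a small Python sketch (not jq; the helper names walk and is_container are my own) that mimics the recursive descent of .. and tallies the same categories on the expanded input.

```python
import json

def walk(value):
    """Yield value and every value nested inside it, similar to jq's `..`."""
    yield value
    if isinstance(value, dict):
        for child in value.values():
            yield from walk(child)
    elif isinstance(value, list):
        for child in value:
            yield from walk(child)

def is_container(value):
    """True for JSON objects and arrays, False for scalars."""
    return isinstance(value, (dict, list))

doc = json.loads("""
{"a": true, "b": [1, 2],
 "c": {"a": {"aa": 1, "bb": 2}, "b": "blue"},
 "d": [], "e": [[], []], "f": {}}
""")

entities = list(walk(doc))
scalar_count = sum(1 for v in entities if not is_container(v))
empty_count = sum(1 for v in entities if is_container(v) and len(v) == 0)

# Total entities, scalars only, scalars plus empty containers.
print(len(entities), scalar_count, scalar_count + empty_count)  # 15 6 10
```

The last two numbers correspond to the two jq filters: 6 for scalars only, 10 once empty arrays and objects are counted as leaves too.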

Error while reading JSON file in chunksizes with python

I have a large json file, so I want to read the file in chunks while testing. I have implemented the code below:
if fpath.endswith('.json'):
    with open(fpath, 'r') as f:
        read_query = pd.read_json(f, lines=True, chunksize=100)
        for chunk in read_query:
            print(chunk)
I get the error:
  File "nameoffile.py", line 168, in read_queries_func
    for chunk in read_query:
  File "C:\Users\Me\Python38\lib\site-packages\pandas\io\json\_json.py", line 798, in __next__
    obj = self._get_object_parser(lines_json)
  File "C:\Users\Me\Python38\lib\site-packages\pandas\io\json\_json.py", line 770, in _get_object_parser
    obj = FrameParser(json, **kwargs).parse()
  File "C:\Users\Me\Python38\lib\site-packages\pandas\io\json\_json.py", line 885, in parse
    self._parse_no_numpy()
  File "C:\Users\Me\Python38\lib\site-packages\pandas\io\json\_json.py", line 1159, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None
ValueError: Expected object or value
Why am I getting an error?
The JSON file looks like this:
[
  {
    "a": "13",
    "b": "55"
  },
  {
    "a": "15",
    "b": "16"
  },
  {
    "a": "18",
    "b": "45"
  },
  {
    "a": "1650",
    "b": "26"
  },
  .
  .
  .
  {
    "a": "214",
    "b": "23"
  }
]
Also, is there a way to extract just the 'a' attribute's values while reading the file? Or can that only be done after I've read the file?
Your JSON file contains just one top-level value: a single array spread over many lines. As per the line-delimited json doc to which the doc of the chunksize argument points:
pandas is able to read and write line-delimited json files that are common in data processing pipelines using Hadoop or Spark.
For line-delimited json files, pandas can also return an iterator which reads in chunksize lines at a time. This can be useful for large files or to read from a stream.
The chunksize argument can only be used together with lines=True, and the doc for lines says:
Read the file as a json object per line.
This means that files like this work:
{"a": 1, "b": 2}
{"a": 3, "b": 4}
{"a": 5, "b": 6}
{"a": 7, "b": 8}
{"a": 9, "b": 10}
These don’t:
[
{"a": 1, "b": 2},
{"a": 3, "b": 4},
{"a": 5, "b": 6},
{"a": 7, "b": 8},
{"a": 9, "b": 10}
]
So you have to read the file in one go, or convert it beforehand so that it has one object per line.
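As a rough sketch of such a conversion, assuming the whole array fits in memory once: the helper below (the name array_to_json_lines and the paths passed to it are made up for illustration) loads the array with the standard json module and writes one object per line.

```python
import json

def array_to_json_lines(src_path, dst_path):
    """Rewrite a file holding one JSON array as one JSON object per line.

    Hypothetical helper, not part of pandas. Returns the number of records.
    """
    with open(src_path, "r", encoding="utf-8") as src:
        records = json.load(src)  # reads the whole array in one go
    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in records:
            # json.dumps emits no newlines by default, so each record
            # ends up on exactly one line.
            dst.write(json.dumps(record) + "\n")
    return len(records)
```

After a one-off conversion like this, pd.read_json(dst_path, lines=True, chunksize=100) should be able to iterate over the converted file in chunks.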

How to get a list of object keys in JMESPath

My Google search skills are failing me. How to get a list of all JSON object keys in JMESPath?
i.e. how to go from:
{"a": 1, "b": 2}
to:
["a", "b"]
JMESPath has the function keys. Therefore, the JMESPath expression is keys(@).
Example
echo '{"a": 1, "b": 2}' | jp "keys(@)"
returns
[
"a",
"b"
]
Tested with jp 0.1.3 on a Linux environment.

Replacing JSON file with CSV for d3js

http://bl.ocks.org/robschmuecker/7880033
I'm new to JavaScript and d3. The above example is a dendrogram. I can create my own. However, if I wanted to use it for something like employee data, it seems like it would be a pain to always have to edit the JSON by hand, unless I'm missing some easier trick.
A CSV edited in Excel, which I've used in other charts, seems like it would work well. Is it possible to replace the flare.json with a CSV containing the data? If so, how?
No, it's not possible directly. To know why, you'll have to understand the way the function d3.csv creates an array. Suppose you have this CSV:
foo,bar,baz
21,33,5
1,14,42
When parsed, it will generate a single array of objects, without nested arrays or nested objects. The first row defines the key names, and the other rows the values. This is the array generated for that CSV:
[
{"foo": 21, "bar": 33, "baz": 5},
{"foo": 1, "bar": 14, "baz": 42}
]
Or, if you don't change the type, with the numbers as strings:
[
{"foo": "21", "bar": "33", "baz": "5"},
{"foo": "1", "bar": "14", "baz": "42"}
]
You will not get anywhere close to what you want, which is an array of objects containing arrays containing objects containing arrays etc...
You can modify this array later to create the nested children you need (look at @torresomar's comment below), but it's way easier to simply edit your JSON.
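To give an idea of what that later modification could look like, here is a rough sketch, written in Python rather than JavaScript. It assumes a hypothetical CSV layout with name and parent columns (the question doesn't specify one) and nests the flat rows into the name/children shape that flare.json uses.

```python
import csv
import io

# Hypothetical CSV layout: one row per node, naming its parent
# (empty for the root). The column names are an assumption.
CSV_TEXT = """name,parent
CEO,
CTO,CEO
CFO,CEO
Engineer,CTO
"""

def rows_to_tree(rows):
    """Nest flat rows into {"name": ..., "children": [...]} objects."""
    nodes = {row["name"]: {"name": row["name"]} for row in rows}
    root = None
    for row in rows:
        node = nodes[row["name"]]
        if row["parent"]:
            # Attach this node to its parent's list of children.
            nodes[row["parent"]].setdefault("children", []).append(node)
        else:
            root = node  # the row without a parent is the root
    return root

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
tree = rows_to_tree(rows)
```

This only works if every row names a parent that exists in the file and exactly one row has no parent, which is part of why hand-editing the JSON can be the simpler option for small data sets.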