insert json array into postgres json[]

How to insert the array ["a","b","c"] into test?
create table test( f json [] );
I tried
insert into test (f) values('{\"a\",\"b\",\"c\"}');
but then the escape backslashes are displayed when I select it. Without escaping it does not work at all. (Token "a" is invalid)
select * from test;
f
----
{"\"a\"","\"b\"","\"c\""}

I suppose you just want to insert a json array (json) and not an array of json (json[]):
create table test (f json);
insert into test (f) values(to_json(array['a','b','c']));
select * from test;
f
---------------
["a","b","c"]
In case you want an array of json:
create table test (f json[]);
insert into test (f) values
(array[to_json('a'::text),to_json('b'::text),to_json('c'::text)]);
select * from test;
f
---------------------------
{"\"a\"","\"b\"","\"c\""}

Related

Extract and explode inner nested element as rows from string nested structure

I would like to explode a column into rows in a PySpark dataframe on Hive.
There are two columns in the dataframe.
The column "business_id" is a string.
The column "sports_info" is a struct type, each element value is an array of string.
Data:
business_id sports_info
"abc-123" {"sports_type":
["{sport_name:most_recent,
sport_events:[{sport_id:568, val:10.827},{id:171,score:8.61}]}"
]
}
I need to get a dataframe like:
business_id sport_id
"abc-123" 568
"abc-123" 171
I defined:
schema = StructType([ \
StructField("sports_type",ArrayType(),True)
])
df = spark.createDataFrame(data=data, schema=schema) # I am not sure how to create the df
df.printSchema()
df.show(truncate=False)
def get_ids(val):
    sports_type = 'sports_type'
    sport_events = 'sport_events'
    sport_id = 'sport_id'
    sport_ids_vals = eval(val.sports_type[0])['sport_events']
    ids = [s['sport_id'] for s in sport_ids_vals]
    return ids
df2 = df.withColumn('sport_new', F.udf(lambda x: get_ids(x),
ArrayType(ArrayType(StringType())))('sports_info'))
How could I create the df and extract/explode the inner nested elements?
df2 = df.withColumn('sport_new', expr("transform (sports_type, x -> regexp_extract( x, 'sport_id:([0-9]+)',1))")).show()
Explained:
expr( #use a SQL expression, only way to access transform (pre spark 3)
"transform ( # run a SQL function on an array
sports_type, # declare column to use
x # declare the name of the variable to use for each element in the array
-> # Start writing SQL code to run on each element in the array
regexp_extract( # use SQL regex functions to pull out from the string
x, #string to run regex on
'sport_id:([0-9]+)',1))" # find sport_id and capture the number following it.
)
This will likely run faster than a UDF as it can be vectorized.
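For completeness, here is a minimal, self-contained sketch of both steps (assumptions: Spark >= 2.4 for the SQL transform function, and sports_type modeled as a top-level array-of-strings column, as in the schema drafted in the question; note that regexp_extract only returns the first match per array element):
from pyspark.sql import SparkSession
from pyspark.sql.functions import expr
from pyspark.sql.types import ArrayType, StringType, StructField, StructType

spark = SparkSession.builder.getOrCreate()

# One row, with the raw string from the question kept verbatim.
data = [("abc-123",
         ["{sport_name:most_recent, sport_events:[{sport_id:568, val:10.827},{id:171,score:8.61}]}"])]
schema = StructType([
    StructField("business_id", StringType(), True),
    StructField("sports_type", ArrayType(StringType()), True),
])
df = spark.createDataFrame(data=data, schema=schema)

# transform() applies the regex to every array element; explode() then makes one row per element.
(df.withColumn("ids", expr("transform(sports_type, x -> regexp_extract(x, 'sport_id:([0-9]+)', 1))"))
   .selectExpr("business_id", "explode(ids) AS sport_id")
   .show(truncate=False))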

Postgres select value by key from json in a list

Given the following:
create table test (
id int,
status text
);
insert into test values
(1,'[]'),
(2,'[{"A":"d","B":"c"}]'),
(3,'[{"A":"g","B":"f"}]');
Is it possible to return the following?
id A B
1 null null
2 d c
3 g f
I am attempting something like this:
select id,
status::json ->> 0 #> "A" from test
Try this to address your specific example:
SELECT id, (status :: json)#>>'{0,A}' AS A, (status :: json)#>>'{0,B}' AS B
FROM test
See the manual:
jsonb #>> text[] → text
Extracts JSON sub-object at the specified path as text.
'{"a": {"b": ["foo","bar"]}}'::json #>> '{a,b,1}' → bar
This does it:
SELECT id,
(status::json -> 0) ->> 'A' as A,
(status::json -> 0) ->> 'B' as B
FROM test;
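For reference, a lateral variant built on json_array_elements also works here (a sketch, assuming PostgreSQL 9.3+; the LEFT JOIN keeps rows like id 1 whose array is empty, yielding NULLs):
SELECT t.id,
       e.value ->> 'A' AS a,
       e.value ->> 'B' AS b
FROM test t
LEFT JOIN LATERAL json_array_elements(t.status::json) AS e(value) ON true;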

PostgreSQL: compare jsons [duplicate]

This question already has an answer here: Operator does not exist: json = json (1 answer). Closed 3 years ago.
As is known, PostgreSQL currently has no built-in way to compare two json values: a comparison like json = json doesn't work. But what about casting the json to text first?
Then
select ('{"x":"a", "y":"b"}')::json::text =
('{"x":"a", "y":"b"}')::json::text
returns true
while
select ('{"x":"a", "y":"b"}')::json::text =
('{"x":"a", "y":"d"}')::json::text
returns false
I tried several variants with more complex objects and it works as expected.
Are there any gotchas in this solution?
UPDATE:
Compatibility with v9.3 is needed.
You can also use the containment operators. Let's say you have A and B, both JSONB objects; then A = B if:
A @> B AND A <@ B
Read more here: https://www.postgresql.org/docs/current/functions-json.html
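A quick sanity check of the idea (containment on jsonb objects ignores key order, so both directions hold for equal objects):
select '{"x":"a", "y":"b"}'::jsonb @> '{"y":"b", "x":"a"}'::jsonb
   and '{"x":"a", "y":"b"}'::jsonb <@ '{"y":"b", "x":"a"}'::jsonb;
-- returns true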
Yes, there are multiple problems with your approach (i.e. converting to text). Consider the following example:
select ('{"x":"a", "y":"b"}')::json::text = ('{"y":"b", "x":"a"}')::json::text;
This is like your first example, except that I flipped the order of the x and y keys in the second object, and now it returns false, even though the objects are equal.
Another issue is that json preserves white space, so
select ('{"x":"a", "y":"b"}')::json::text = ('{ "x":"a", "y":"b"}')::json::text;
returns false just because I added a space before the x in the second object.
A solution that works with v9.3 is to use the json_each_text function to expand the two JSON objects into tables, and then compare the two tables, e.g. like so:
SELECT NOT exists(
    SELECT 1
    FROM json_each_text(('{"x":"a", "y":"b"}')::json) t1
    FULL OUTER JOIN json_each_text(('{"y":"b", "x":"a"}')::json) t2 USING (key)
    WHERE t1.value <> t2.value OR t1.key IS NULL OR t2.key IS NULL
)
Note that this only works if the two JSON values are objects where for each key, the values are strings.
The key is the query inside the exists(...): in that query we match all keys from the first JSON object with the corresponding keys in the second JSON object. Then we keep only the rows that correspond to one of the following two cases:
a key exists in both JSON objects but the corresponding values are different
a key exists only in one of the two JSON objects and not the other
These are the only cases that "witness" the inequality of the two objects, hence we wrap everything with a NOT exists(...), i.e. the objects are equal if we didn't find any witnesses of inequality.
If you need to support other types of JSON values (e.g. arrays, nested objects, etc), you can write a plpgsql function based on the above idea.
Most notably, A @> B AND B @> A will signify TRUE if they are both equal JSONB objects.
However, be careful when assuming that it works for all kinds of JSONB values, as demonstrated with the following query:
select
old,
new,
NOT(old @> new AND new @> old) as changed
from (
values
(
'{"a":"1", "b":"2", "c": {"d": 3}}'::jsonb,
'{"b":"2", "a":"1", "c": {"d": 3, "e": 4}}'::jsonb
),
(
'{"a":"1", "b":"2", "c": {"d": 3, "e": 4}}'::jsonb,
'{"b":"2", "a":"1", "c": {"d": 3}}'::jsonb
),
(
'[1, 2, 3]'::jsonb,
'[3, 2, 1]'::jsonb
),
(
'{"a": 1, "b": 2}'::jsonb,
'{"b":2, "a":1}'::jsonb
),
(
'{"a":[1, 2, 3]}'::jsonb,
'{"b":[3, 2, 1]}'::jsonb
)
) as t (old, new)
The problem with this approach is that JSONB arrays are not compared correctly: in JSON, [1, 2, 3] != [3, 2, 1], but Postgres returns TRUE for the containment check nevertheless.
A correct solution will recursively iterate through the contents of the json, comparing arrays and objects differently. I have quickly built a set of functions that accomplishes just that.
Use them like SELECT jsonb_eql('[1, 2, 3]'::jsonb, '[3, 2, 1]'::jsonb) (the result is FALSE).
CREATE OR REPLACE FUNCTION jsonb_eql (a JSONB, b JSONB) RETURNS BOOLEAN AS $$
BEGIN
  -- Dispatch on the JSON type; values of different types are never equal.
  IF (jsonb_typeof(a) != jsonb_typeof(b)) THEN
    RETURN FALSE;
  ELSIF (jsonb_typeof(a) = 'object') THEN
    RETURN jsonb_object_eql(a, b);
  ELSIF (jsonb_typeof(a) = 'array') THEN
    RETURN jsonb_array_eql(a, b);
  ELSIF (COALESCE(jsonb_typeof(a), 'null') = 'null') THEN
    -- Treat SQL NULL and JSON null alike.
    RETURN COALESCE(a, 'null'::jsonb) = 'null'::jsonb AND COALESCE(b, 'null'::jsonb) = 'null'::jsonb;
  ELSE
    -- Scalars (string, number, boolean): built-in equality is sufficient.
    RETURN COALESCE(a = b, FALSE);
  END IF;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION jsonb_object_eql (a JSONB, b JSONB) RETURNS BOOLEAN AS $$
DECLARE
  _key_a text;
  _val_a jsonb;
  _key_b text;
  _val_b jsonb;
BEGIN
  IF (jsonb_typeof(a) != jsonb_typeof(b)) THEN
    RETURN FALSE;
  ELSIF (jsonb_typeof(a) != 'object') THEN
    RETURN jsonb_eql(a, b);
  ELSE
    -- Walk every key of a and look up the matching key in b.
    FOR _key_a, _val_a, _key_b, _val_b IN
      SELECT t1.key, t1.value, t2.key, t2.value
      FROM jsonb_each(a) t1
      LEFT OUTER JOIN jsonb_each(b) t2 ON (t1.key = t2.key)
    LOOP
      -- A key missing from b, or an unequal value, means the objects differ.
      IF (_key_b IS NULL OR NOT jsonb_eql(_val_a, _val_b)) THEN
        RETURN FALSE;
      END IF;
    END LOOP;
    -- Every key of a matched; the final built-in check catches keys that
    -- exist only in b.
    RETURN a = b;
  END IF;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION jsonb_array_eql (a JSONB, b JSONB) RETURNS BOOLEAN AS $$
DECLARE
  _val_a jsonb;
  _val_b jsonb;
BEGIN
  IF (jsonb_typeof(a) != jsonb_typeof(b)) THEN
    RETURN FALSE;
  ELSIF (jsonb_typeof(a) != 'array') THEN
    RETURN jsonb_eql(a, b);
  ELSIF (jsonb_array_length(a) != jsonb_array_length(b)) THEN
    -- Arrays of different lengths can never be equal; checking up front
    -- also keeps the two set-returning calls below in lockstep.
    RETURN FALSE;
  ELSE
    -- Compare the arrays element by element, order-sensitively.
    FOR _val_a, _val_b IN
      SELECT jsonb_array_elements(a), jsonb_array_elements(b)
    LOOP
      IF (NOT jsonb_eql(_val_a, _val_b)) THEN
        RETURN FALSE;
      END IF;
    END LOOP;
    RETURN TRUE;
  END IF;
END;
$$ LANGUAGE plpgsql;

How to encode tuple to JSON in elm

I have a tuple of (String, Bool) that needs to be encoded to a JSON Array in elm.
The link below is useful for the primitive types and the other list, array and object encoders. But I need to encode a tuple2.
Refer : http://package.elm-lang.org/packages/elm-lang/core/4.0.3/Json-Encode#Value
I tried a different approach, encoding the tuple with the toString function.
That does not give me a JSON Array; instead it produces a String like "(\"r\",False)".
Json.Decode expects the input parameter to decode as in the snippet below:
decodeString (tuple2 (,) float float) "[3,4]"
Refer : http://package.elm-lang.org/packages/elm-lang/core/4.0.3/Json-Decode
Q: When there is a decode function available for tuple2, why is the encode function missing?
You can build a generalized tuple size 2 encoder like this:
import Json.Encode exposing (..)
tuple2Encoder : (a -> Value) -> (b -> Value) -> (a, b) -> Value
tuple2Encoder enc1 enc2 (val1, val2) =
    list [ enc1 val1, enc2 val2 ]
Then you can call it like this, passing the types of encoders you want to use for each slot:
tuple2Encoder string bool ("r", False)
In Elm 0.19 (https://package.elm-lang.org/packages/elm/json/latest/Json-Encode) a generalized tuple2 encoder would be:
import Json.Encode exposing (list, Value)
tuple2Encoder : (a -> Value) -> (b -> Value) -> (a, b) -> Value
tuple2Encoder enc1 enc2 (val1, val2) =
    list identity [ enc1 val1, enc2 val2 ]
Usage:
encode 0 <| tuple2Encoder string int ("1",2)
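For reference, this should print the compact string ["1",2]: a genuine two-element JSON array, rather than the stringified tuple that the toString approach produced.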

SML - Creating dictionary that maps keys to values

I need to create a dictionary in SML, but I am having extreme difficulty with an insert function.
type dict = string -> int option
As an example, here is the empty dictionary:
val empty : dict = fn key => NONE
Here is my implementation of an insert function:
fun insert (key,value) d = fn d => fn key => value
But this is of the wrong type, what I need is insert : (string*int) -> dict -> dict.
I've searched everything from lazy functions to implementing dictionaries.
Any help or direction would be greatly appreciated!
If you are still confused about what I am trying to implement, here is what I would expect to get when calling a simple lookup function:
fun lookup k d = d k
- val d = insert ("foo",2) (insert ("bar",3) empty);
val d = fn : string -> int option
- lookup "foo" d;
val it = SOME 2 : int option
- lookup "bar" d;
val it = SOME 3 : int option
- lookup "baz" d;
val it = NONE : int option
You can reason from the signature of the function:
val insert = fn: (string * int) -> dict -> dict
When you supply a key, a value, and a dictionary d, you would like to get back a new dictionary d'. Since dict is string -> int option, d' is a function that takes a string and returns an int option.
Suppose you supply a string s to that function. There are two cases: either s is the same as key, in which case you return the associated value, or it is not, in which case you return whatever looking up s in d yields.
Here is a literal translation:
fun insert (key, value) d =
    fn s => if s = key then SOME value
            else d s
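A quick sanity check with the lookup from the question (a hypothetical REPL session; the printed values are what SML/NJ would show):
- val d = insert ("foo", 2) (insert ("bar", 3) empty);
val d = fn : string -> int option
- lookup "foo" d;
val it = SOME 2 : int option
- lookup "baz" d;
val it = NONE : int option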