I've read the docs and it appears that there's no discernible way to perform an ALTER TABLE ... ALTER COLUMN ... USING statement to directly convert a json type column to an hstore type. There's no function available (that I'm aware of) to perform the cast.
The next best alternative I have is to create a new column of type hstore, copy my JSON data to that new column using some external tool, drop the old json column and rename the new hstore column to the old column's name.
Is there a better way?
What I have so far is:
$ CREATE TABLE blah (unstructured_data JSON);
$ ALTER TABLE blah ALTER COLUMN unstructured_data
TYPE hstore USING CAST(unstructured_data AS hstore);
ERROR: cannot cast type json to hstore
Unfortunately, PostgreSQL doesn't allow all kinds of expressions within the USING clause of ALTER TABLE ... SET DATA TYPE ... (e.g. sub-queries are disallowed).
But you can write a function to overcome this; you just need to decide what to do with advanced types (arrays & objects in the values). Here is an example, which simply converts them to strings:
CREATE OR REPLACE FUNCTION my_json_to_hstore(json)
RETURNS hstore
IMMUTABLE
STRICT
LANGUAGE sql
AS $func$
SELECT hstore(array_agg(key), array_agg(value))
FROM json_each_text($1)
$func$;
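A quick sanity check with a made-up sample value (note how the nested array is kept as its JSON text):

SELECT my_json_to_hstore('{"a":1,"b":"x","c":[1,2]}');
-- "a"=>"1", "b"=>"x", "c"=>"[1,2]"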
After that, you can use this in your ALTER TABLE, like:
ALTER TABLE blah
ALTER COLUMN unstructured_data
SET DATA TYPE hstore USING my_json_to_hstore(unstructured_data);
There is "trap" for repeated keys - allowed by both json and hstore input, but unfortunately resolved differently (!). Consider this example value:
json '{"double_key":"key1","foo":null,"double_key":"key2"}'
In json, 'double_key' is effectively 'key2'. The manual:
Because the json type stores an exact copy of the input text, it will
preserve semantically-insignificant white space between tokens, as
well as the order of keys within JSON objects. Also, if a JSON object
within the value contains the same key more than once, all the
key/value pairs are kept. (The processing functions consider the last value as the operative one.)
Bold emphasis mine.
In hstore, however, for the same order of key/value pairs, 'double_key' might effectively be 'key1'. The manual:
Each key in an hstore is unique. If you declare an hstore with
duplicate keys, only one will be stored in the hstore and there is no guarantee as to which will be kept:
Typically the first instance of a key, but that's an implementation detail that might change.
A simple and fast option to always preserve the effective, operative value: cast to jsonb before the conversion. The manual again:
[...] jsonb does not preserve white space, does not preserve
the order of object keys, and does not keep duplicate object keys.
If duplicate keys are specified in the input, only the last value is kept.
Modifying @pozs's conversion function:
CREATE OR REPLACE FUNCTION json2hstore(json)
RETURNS hstore AS
$func$
SELECT hstore(array_agg(key), array_agg(value))
FROM jsonb_each_text($1::jsonb) -- !
$func$ LANGUAGE sql IMMUTABLE STRICT;
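A quick check of the duplicate-key behavior with the example value from above (hstore's display order of pairs may differ):

SELECT json2hstore('{"double_key":"key1","foo":null,"double_key":"key2"}');
-- "foo"=>NULL, "double_key"=>"key2"  -- the last duplicate wins, as with jsonb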
Requires Postgres 9.4 or later. Postgres 9.3 has the json type, but not jsonb yet. A no-op in PL/v8 might be an alternative there, like @jpmc mentioned.
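For illustration, a hedged sketch of such a PL/v8 no-op; json_normalize is a made-up name, and how plv8 hands over the argument depends on the plv8 version:

CREATE OR REPLACE FUNCTION json_normalize(data json)
  RETURNS json AS
$func$
  // depending on the plv8 version, data arrives as a string or a parsed object
  var obj = (typeof data === 'string') ? JSON.parse(data) : data;
  // JavaScript objects cannot hold duplicate keys; the last occurrence wins
  return JSON.stringify(obj);
$func$ LANGUAGE plv8 IMMUTABLE STRICT;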
Related
So I have three databases - an Oracle one, a SQL Server one, and a Postgres one. I have a table that has two columns, name and value, both text. The value is a stringified JSON object. I need to update the nested value.
This is what I currently have:
name: 'MobilePlatform',
value:
'{
"iosSupported":true,
"androidSupported":false,
}'
I want to add {"enableTwoFactorAuth": false} into it.
In PostgreSQL you should be able to do this:
UPDATE mytable
SET value = jsonb_set(value::jsonb, '{enableTwoFactorAuth}', 'false')
WHERE name = 'MobilePlatform';
In Postgres, the plain concatenation operator || for jsonb could do it:
UPDATE mytable
SET value = value::jsonb || '{"enableTwoFactorAuth":false}'::jsonb
WHERE name = 'MobilePlatform';
If a top-level key "enableTwoFactorAuth" already exists, it is replaced. So it's an "upsert" really.
Or use jsonb_set() for manipulating nested values.
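For instance, if the flag sat one level down (a hypothetical "security" sub-object; the last argument makes jsonb_set create the key if it is missing):

UPDATE mytable
SET value = jsonb_set(value::jsonb, '{security,enableTwoFactorAuth}', 'false', true)
WHERE name = 'MobilePlatform';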
The cast back to text works implicitly as an assignment cast. (The result is in standard format; any insignificant whitespace is effectively removed.)
If the content is valid JSON, the storage type should be json to begin with. In Postgres, jsonb would be preferable as it's easier to manipulate, but that's not directly portable to the other two RDBMS mentioned.
(Or, possibly, a normalized design without JSON altogether.)
For Oracle 21:
update mytable
set json_col = json_transform(
json_col,
INSERT '$.value.enableTwoFactorAuth' = 'false'
)
where json_exists(json_col, '$?(#.name == "MobilePlatform")')
;
With json_col being a JSON column, or a VARCHAR2|CLOB column with an IS JSON constraint.
(It must be JSON, though, if you want a multivalue index on the name values:
create multivalue index ix_json_col_name on mytable t ( t.json_col.name.string() );
)
Two of the databases you are using support a native JSON data type, so it doesn't make sense to keep the data as a stringified JSON object in a text column.
Oracle: https://docs.oracle.com/en/database/oracle/oracle-database/21/adjsn/json-in-oracle-database.html
PostgreSQL: https://www.postgresql.org/docs/current/datatype-json.html
Apart from these, MS SQL Server also provides methods to work with a JSON data type.
MS SQL Server: https://learn.microsoft.com/en-us/sql/relational-databases/json/json-data-sql-server?view=sql-server-ver16
Using a JSON type column in any of the above databases would enable you to use their JSON functions to perform the tasks that you are looking for.
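In SQL Server, for example, the same upsert could look like this (a sketch; a bit value becomes a proper JSON boolean, while a plain string would be inserted as a JSON string):

UPDATE mytable
SET value = JSON_MODIFY(value, '$.enableTwoFactorAuth', CAST(0 AS bit))
WHERE name = 'MobilePlatform';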
If you have to use text only, then you can use REPLACE to add the key/value pair at the end of your JSON (note that this replaces every '}', so it is only safe for flat JSON like the sample):
update dataTable set value = REPLACE(value, '}', ',"enableTwoFactorAuth": false}') where name = 'MobilePlatform'
Here dataTable is the name of the table.
A cleaner and less risky way would be to connect to the db from the application and use JSON methods such as JSON.parse in JavaScript or json.loads in Python. This would give you a JSON object (a dictionary in the case of Python) to work on. You can look for similar methods in other languages as well.
But I would suggest, if possible, using JSON columns instead of text to store JSON values.
I have created a custom type as following:
create type my_type as (camelCasedIdentifier uuid, ...);
I am using this custom type my_type to define the field names in a JSON body:
select row_to_json(row(my_table.id, ...)::my_type) from my_table;
The reason why I thought using a custom type is useful is that this way, I don't have to define the JSON field names in every query (they differ from the table column names in my case), as you would have to do with json_build_object().
The problem here however is that the field names are now all in lower case:
{"camelcasedidentifier":"d8f0a177-af13-4fa2-a2af-3bc8296d848e", ...}
I expected:
{"camelCasedIdentifier":"d8f0a177-af13-4fa2-a2af-3bc8296d848e", ...}
How can I fix this? I know this can be fixed by using select json_build_object('camelCasedIdentifier', my_table.id) from my_table, but I would rather not do that, as I will be forced to enumerate the JSON field names in every query.
In SQL, unquoted identifiers are not case sensitive, so your type was actually created with a field named camelcasedidentifier.
If you really need that, you have to use quoted identifiers:
create type my_type as ("camelCasedIdentifier" uuid, ...);
If you only use that type to do the JSON conversion, this is acceptable, but using those dreaded quoted identifiers everywhere is going to give you more problems in the long run than they are worth.
I'm trying to write a migration to convert an existing hstore column to JSON (not JSONB).
I tried different solutions, like ALTER COLUMN ... TYPE json USING CAST(hstore_column AS json) and some functions found on GitHub, but nothing really worked out.
The main issue is that there's no direct cast; the second is that even if I cast the column to text as an intermediate step, I need to change the column's default value to json as well.
Anyone already did this?
You can simply use
alter table my_table alter column h_store_column type json using hstore_to_json(h_store_column)
Of course you will need to drop any defaults set on the column that don't align with the json data type first.
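Putting it together, a sketch of the whole migration when a default is set (table, column and the new default are placeholders):

alter table my_table alter column h_store_column drop default;
alter table my_table alter column h_store_column type json using hstore_to_json(h_store_column);
alter table my_table alter column h_store_column set default '{}';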
I'm trying to use Talend to get JSON data that is stored in MySQL as a VARCHAR datatype and export it into a PostgreSQL 9.4 table of the following type:
CREATE TABLE myTable (myJSON JSONB);
When I try running the job I get the following error:
ERROR: column "json_string" is of type json but expression is of type character varying
Hint: You will need to rewrite or cast the expression. Position: 54
If I use Python or just plain SQL with a PostgreSQL INSERT, I can insert a string such as '{"Name":"blah"}' and it understands it.
INSERT INTO myTable(myJSON) VALUES ('{"Name":"blah"}');
Any ideas how this can be done in Talend?
You can add a type cast by opening the "Advanced settings" tab on your tPostgresqlOutput component. Consider the following example:
In this case, the input row to "tPostgresqlOutput_1" has one column, data. This column is of type String and is mapped to the database column data of type VARCHAR (as suggested by Talend by default):
Next, open the component settings for tPostgresqlOutput_1 and locate the "Advanced settings" tab:
On this tab, you can replace the existing data column by a new expression:
In the name column, specify the target column name.
In the SQL Expression column, do your type casting. In this case: "?::json". Note the usage of the placeholder character ?, which will be replaced with the original value.
In Position, specify Replace. This will replace the value proposed by Talend with your SQL expression (including the type cast).
As Reference Column use the source value.
This should do the trick.
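With that replacement in place, the prepared statement Talend generates should end up looking something like this (the table name is hypothetical; data is the column from the example):

INSERT INTO mytable (data) VALUES (?::json)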
Here is a sample schema where the input row r has question_json and choice_json columns, which are JSON strings. I know which keys I want to extract; the columns question_value and choice_value show how I do it. Hope this helps you.
I have a Postgres JSON column where some rows have data like:
{"value":90}
{"value":99.9}
...whereas other rows have data like:
{"value":"A"}
{"value":"B"}
The -> operator (i.e. fields->'value') returns the value as json, whereas the ->> operator (i.e. fields->>'value') returns it as text, as reported by pg_typeof. Is there a way to find the "actual" data type of a JSON field?
My current approach would be to use Regex to determine whether the occurrence of fields->>'value' in fields::text is surrounded by double quotes.
Is there a better way?
As @pozs mentioned in a comment, from version 9.4 the json_typeof(json) and jsonb_typeof(jsonb) functions are available:
Returns the type of the outermost JSON value as a text string. Possible types are object, array, string, number, boolean, and null.
https://www.postgresql.org/docs/current/functions-json.html
Applied to your case, here is an example of how this could be used:
SELECT
json_data.key,
jsonb_typeof(json_data.value) AS json_data_type,
COUNT(*) AS occurrences
FROM tablename, jsonb_each(tablename.columnname::jsonb) AS json_data  -- cast, since the column is json
GROUP BY 1, 2
ORDER BY 1, 2;
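For the single key from the question, the function can also be applied directly (assuming the json column is named fields, as in the question):

SELECT fields->'value' AS raw_value,
       json_typeof(fields->'value') AS value_type
FROM tablename;
-- e.g. 90 | number, "A" | string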
I ended up getting access to PLv8 in my environment, which made this easy:
CREATE FUNCTION value_type(fields JSON) RETURNS TEXT AS $$
return typeof fields.value;
$$ LANGUAGE plv8;
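Example calls; JavaScript's typeof reports 'number' and 'string' here:

SELECT value_type('{"value":90}');  -- number
SELECT value_type('{"value":"A"}'); -- string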
As mentioned in the comments, there will be a native function for this in 9.4.