I have a table test_tab with a column tst_col of type XMLTYPE. I can successfully convert tst_col from XML to JSON using:
select xml2json(t.tst_col).to_char() JSON_VAL FROM test_tab t;
But when tst_col contains CDATA, I get an error: whenever the converter encounters ]] it prematurely exits from an array.
How can I convert XML to JSON when the XML contains CDATA?
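For reference, a minimal sketch of the kind of value involved (the element names here are hypothetical): the text is wrapped in a CDATA section, so the serialized document contains the closing ]]> sequence.
-- Hypothetical element names; the CDATA wrapper means the raw XML contains "]]>".
select xmltype('<row><note><![CDATA[some <raw> text]]></note></row>') as tst_col
from dual;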
AFAIK XML2JSON is not a part of the Oracle database...
SQL> select xml2Json(xmltype('<Foo/>')) from dual;
select xml2Json(xmltype('<Foo/>')) from dual
*
ERROR at line 1:
ORA-00904: "XML2JSON": invalid identifier
SQL>
However, if I were creating such a function I would definitely consider a different name for it...
I'm new to Athena, though I have some brief experience with Hive.
I'm trying to create a table from JSON files that are exports from MongoDB. My problem is that MongoDB uses $oid, $numberInt, $numberDouble and others as internal references, but '$' is not accepted in a column name in Athena.
This is a one-line JSON file that I created for testing:
{"_id":{"$oid":"61f87ebdf655d153709c9e19"}}
and this is the table that refers to it:
CREATE EXTERNAL TABLE landing.json_table (
`_id` struct<`$oid`:string>
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 's3://bucket-name/test/';
When I run a simple SELECT * it returns this error:
HIVE_METASTORE_ERROR: Error: name expected at the position 7 of
'struct<$oid:string>' but '$' is found. (Service: null; Status Code:
0; Error Code: null; Request ID: null; Proxy: null)
This is related to the fact that the JSON field name contains the $.
Any idea how to handle this? My only workaround so far is a script that strips the unaccepted characters from the JSON files, but I would really prefer to handle it directly in Athena if possible.
If you switch to the OpenX SerDe, you can create a SerDe mapping for JSON fields with special characters like $ in the name.
See the AWS blog entry Create Tables in Amazon Athena from Nested JSON and Mappings Using JSONSerDe, section "Walkthrough: Handling forbidden characters with mappings".
A mapping that would work for your example:
CREATE EXTERNAL TABLE landing.json_table (
`_id` struct<`oid`:string>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
"mapping.oid"="$oid"
)
LOCATION 's3://bucket-name/test/';
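Once the mapping is in place, the remapped field can be queried with ordinary struct dot notation, for example (a sketch; the double quotes are needed because the column name starts with an underscore):
-- The SerDe mapping renames "$oid" to "oid", so the struct field is addressed without the $.
SELECT "_id".oid FROM landing.json_table;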
Athena (Trino SQL): parsing a JSON document (a table column called document1 in Athena) using fields (dot notation)
If the underlying JSON (the table column called document1 in Athena) is in the form of {a={b ...
I can parse it in Athena (Trino SQL) using
document1.a.b
However, if the JSON contains {a={"text": value1 ...
the quote marks will not parse correctly.
Is there a way to do JSON parsing of a 'field' with quotes?
If not, is there an elegant way of parsing the "text" and obtaining the string in value1?
I cannot change the quotes in the JSON or its Athena "table", so I need something that works in Trino SQL syntax.
The error message is in the form of: SQL Error [100071] [HY000]: [Simba][AthenaJDBC](100071) An error has been thrown from the AWS Athena client. SYNTAX_ERROR: Expression [redacted] is not of type ROW
NOTE: This is not a duplicate of Oracle Dot Notation Question
Dot notation works only for columns typed as struct<…>. You can do that for JSON data, but judging from the error and your description this does not seem to be the case here; I assume your column is of type string.
If you have JSON data in a string column, you can use JSON functions to parse it and extract parts of it with JSONPath expressions.
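For example, if document1 is a plain varchar column holding the JSON text, something along these lines should work (a sketch; the table name is a placeholder):
-- json_extract_scalar applies a JSONPath expression and returns the value as varchar,
-- or NULL if the path does not match.
SELECT json_extract_scalar(document1, '$.a.text') AS text_value
FROM my_table;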
I'm trying to create a row manually via DBeaver and I am entering the following in a jsonb column:
{"US":"0.880","PA":"0.028","KY":"0.025"}
I've checked that this is valid JSON on https://jsonformatter.curiousconcept.com/#
However, when I do, I get an error.
Any insight would be appreciated...
I even tried surrounding the object with single quotes like:
'{"US":"0.880","PA":"0.028","KY":"0.025"}'
But I got an error about the ' being an invalid token...
I was writing a Node.js script to insert a JSON-stringified object into the column, but I was getting the same error, so I decided to try it manually, and I can't even insert the above data...
I can insert this data into a jsonb column in DBeaver via the UI without a problem.
However, there is a known issue with entering data into a json (not jsonb) column in a table that has no primary key. Maybe this is your case: https://github.com/dbeaver/dbeaver/issues/11704 ?
Or can you show the table DDL?
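As a cross-check outside the grid editor, the same value can be inserted from DBeaver's SQL editor with a plain INSERT; the single quotes belong in the SQL literal only, not in the cell editor (a sketch with placeholder table and column names):
-- Placeholder names; the explicit cast is optional when the target column is already jsonb.
INSERT INTO my_table (my_jsonb_col)
VALUES ('{"US":"0.880","PA":"0.028","KY":"0.025"}'::jsonb);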
I used this query to convert the data to JSON:
SELECT * FROM tbl_subject FOR JSON AUTO
I am getting the response, but when I click the response it opens as an XML file.
How do I change this XML to the nvarchar data type?
First of all, in SQL Server, JSON is not a data type in itself (XML is), but just a string representation.
What you see is due to how SQL Server Management Studio handles JSON when it is returned as a result set. It is NOT XML; SSMS just slaps on an .xml file type and prettifies the result. If you were to change how results are returned (Tools | Options | Query Results | SQL Server | General), you'd see something like this:
JSON_F52E2B61-18A1-11d1-B105-00805F49916B
----------------------------------------------------------
[{"RowID":1,"UniversityID":1,"AcademicID":4,"CourseID":1}]
But this is just how SSMS presents the result. If you were to execute the statement from an application, the result would be a string data type.
You could also change how you execute the query to something like this:
DECLARE @nres nvarchar(max) = (SELECT * FROM dbo.tb_Subject FOR JSON AUTO);
SELECT @nres;
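If you want to confirm that what comes back really is character data holding JSON text, you can add a quick sanity check in the same batch (a sketch; ISJSON requires SQL Server 2016 or later):
-- Runs after the DECLARE above, in the same batch; ISJSON returns 1 for valid JSON.
SELECT ISJSON(@nres) AS is_valid_json, LEN(@nres) AS json_length;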
Hope this helps!
Valid JSON can naturally have the backslash character: \. When you insert data in a SQL statement like so:
sidharth=# create temp table foo(data json);
CREATE TABLE
sidharth=# insert into foo values( '{"foo":"bar", "bam": "{\"mary\": \"had a lamb\"}" }');
INSERT 0 1
sidharth=# select * from foo;
data
-----------------------------------------------------
{"foo":"bar", "bam": "{\"mary\": \"had a lamb\"}" }
(1 row)
Things work fine.
But if I copy the JSON to a file and run the copy command I get:
sidharth=# \copy foo from './tests/foo' (format text);
ERROR: invalid input syntax for type json
DETAIL: Token "mary" is invalid.
CONTEXT: JSON data, line 1: {"foo":"bar", "bam": "{"mary...
COPY foo, line 1, column data: "{"foo":"bar", "bam": "{"mary": "had a lamb"}" }"
It seems like Postgres is not processing the backslashes. I think that, because of http://www.postgresql.org/docs/8.3/interactive/sql-syntax-lexical.html,
I am forced to use double backslashes. And that works, i.e. when the file contents are:
{"foo":"bar", "bam": "{\\"mary\\": \\"had a lamb\\"}" }
The copy command works. But is it correct to expect this special treatment for the json data type? After all, the above is not valid JSON.
The approach described in http://adpgtech.blogspot.ru/2014/09/importing-json-data.html works: load the file with COPY's csv format, but pick quote and delimiter characters that cannot occur in the JSON, so each line is loaded verbatim:
copy the_table(jsonfield)
from '/path/to/jsondata'
csv quote e'\x01' delimiter e'\x02';
PostgreSQL's default bulk-load format, text, is a tab-separated format. It requires backslashes to be escaped because they have special meaning, for example in the \N null placeholder.
Observe what PostgreSQL generates:
regress=> COPY foo TO stdout;
{"foo":"bar", "bam": "{\\"mary\\": \\"had a lamb\\"}" }
This isn't a special case for json at all; it's true of any string. Consider, for example, that a string - including JSON - might contain embedded tabs. Those must be escaped to prevent them from being read as the start of another field.
You'll need to generate your input data properly escaped. Rather than trying to use the PostgreSQL-specific text format, it'll generally be easier to use format csv and a tool that writes correct CSV, with the escaping done for you on writing.
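For instance, a minimal sketch assuming the JSON documents have been rewritten as a single-column CSV file (here called ./tests/foo.csv), with quotes doubled per the CSV rules and the backslashes left untouched:
-- tests/foo.csv would contain one quoted field per line, e.g.:
--   "{""foo"":""bar"", ""bam"": ""{\""mary\"": \""had a lamb\""}"" }"
-- In csv format only the quote character needs escaping, so backslashes pass through as-is.
\copy foo (data) from './tests/foo.csv' (format csv)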