flatten JSON value stored as varchar in snowflake to columns - json

One of our tables in Snowflake contains the below kind of JSON value stored as a VARCHAR.
How do I get to this?
In this example I have achieved this by using SPLIT_PART.
Not all values have two keys inside the JSON:
In the above case I want it to look like this:
And effectively, how do I ensure that I capture the case where more keys and values come into the JSON? Two more columns? How can I write a query that automatically accounts for all the cases, or is that not recommended?
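The usual Snowflake approach, as a minimal sketch (the table and column names my_table and json_col are hypothetical): parse the VARCHAR with PARSE_JSON, then either pull out known keys as columns, or use LATERAL FLATTEN to get one row per key/value pair when the key set varies.

```sql
-- Known keys, one column each (hypothetical names my_table, json_col):
SELECT
  PARSE_JSON(json_col):key1::string AS key1,
  PARSE_JSON(json_col):key2::string AS key2   -- NULL when the key is absent
FROM my_table;

-- Unknown or variable keys: flatten to rows instead of columns,
-- since plain SQL cannot add columns while the query runs.
SELECT f.key, f.value::string AS value
FROM my_table t,
     LATERAL FLATTEN(input => PARSE_JSON(t.json_col)) f;
```

Pivoting the flattened rows back into a variable number of columns still requires building the column list in a separate step (or in application code).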

Related

MySQL - list out all the JSON properties as separate columns

We have unstructured data being stored as JSON in MySQL (in one of the tables, alongside structured data). We would like to extract the data, but we are not sure how, as the JSON could contain any property (there are no common properties).
Could you please help me extract all properties without specifying the property names?
SQL cannot dynamically append more columns to its result set after the query begins executing. The select-list is fixed at the time the query is parsed, before any rows are examined, so you must spell out the columns in the select-list. This means you must know the names of all properties in advance.
You could do a query to fetch all property names:
SELECT JSON_KEYS(mydata) FROM MyTable;
This returns arrays of keys per row. There will be a lot of duplication. In your client application, you would write code to parse the result, and form a list of distinct keys.
Then you could use that list to form a second SQL query, with one column in the select-list for each key you noted in the first step.
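For example, if the first pass turned up the keys name and color (hypothetical names), the generated second query could extract one column per key with MySQL's ->> operator:

```sql
SELECT
  mydata->>'$.name'  AS name,   -- NULL for rows whose JSON lacks the key
  mydata->>'$.color' AS color
FROM MyTable;
```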
The alternative is to forget about returning properties in separate columns. Just return the JSON documents from the database as-is. Then explode the JSON after you fetch it in the result set, and process it in application code that way.
One way or the other, you need to write application code, either before running your query or after running your query.
Welcome to "flexible" database design! :-)

create column names algorithmically for large datasets in SQL

I'm looking to import a massive dataset into a MySQL server. The issue is that the first 6 columns are fine to name; after that I have over 1000 columns of absorption values, and I'd rather not sit there typing 'absorp-x' for hours. Is there a way to specify the first few column names when I create a table and then say "use the following format for all remaining columns: absorp-x"?
Suggest you throw the 1000 absorption values into a JSON string and have a single column for that data. In that situation, you don't need to name the values but simply have a JSON array. (I assume they are numbered consecutively?)
That would probably fit into a TEXT CHARACTER SET ascii column. Or, if you have a newer version, a JSON column.
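A minimal sketch of that layout, with hypothetical names (samples, absorption):

```sql
CREATE TABLE samples (
  id         INT PRIMARY KEY,
  -- ...the six named columns...
  absorption JSON   -- or TEXT CHARACTER SET ascii on older MySQL versions
);

-- all absorption values go in as one ordered array; position is the index
INSERT INTO samples (id, absorption)
VALUES (1, '[0.12, 0.15, 0.19]');

-- and individual values come back out by index
SELECT absorption->>'$[0]' FROM samples WHERE id = 1;
```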

How to Guess schema in Mysqlinput on the fly in Talend

I've built a job that copies data from a MySQL table a to a MySQL table b.
The table columns are the same, except that sometimes a new column can be added to table a.
I want to retrieve all the columns from a to b, but only those that exist in table b. I was able to put in the query an explicit select statement listing the columns that exist in table b, like:
select column1, column2, column3 ... from table a
The issue is that if I add a new column to b that matches a, the Talend job schema in the MySQL input component has to be changed as well, because I work with built-in schema types.
Is there a way to force the schema columns while the job is running?
If you are using a subscription version of Talend, you can use the dynamic column type. You can define a single column for your input of type "Dynamic" and map it to a column of the same type in your output component. This will dynamically get columns from table a and map them to the same columns in table b. Here's an example.
If you are using Talend Open Studio, things get a little trickier as Talend expects a list of columns for the input and output components that need to be defined at design time.
Here's a solution I put together to work around this limitation.
The idea is to list all of table a's columns that are present in table b, then convert that to a comma-separated list of columns (in my example id,Theme,name) and store it in a global variable COLUMN_LIST. A second output of the tMap builds the same list of columns, but this time putting single quotes between columns (so that they can be used as parameters to the CONCAT function later), then adds single quotes at the beginning and end, like so: "'", id,"','",Theme,"','",name,"'", and stores it in a global variable CONCAT_LIST.
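The column-listing step can be sketched as a query against information_schema (assuming the hypothetical names table_a and table_b, both in the current database):

```sql
SELECT a.COLUMN_NAME
FROM information_schema.COLUMNS a
JOIN information_schema.COLUMNS b
  ON  b.COLUMN_NAME  = a.COLUMN_NAME
  AND b.TABLE_SCHEMA = a.TABLE_SCHEMA
WHERE a.TABLE_SCHEMA = DATABASE()
  AND a.TABLE_NAME   = 'table_a'
  AND b.TABLE_NAME   = 'table_b'
ORDER BY a.ORDINAL_POSITION;
```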
On the next subjob, I query table a using the CONCAT function, giving it the list of columns to be concatenated (CONCAT_LIST), thus retrieving each record as a single column, like so: 'value1','value2', etc.
Then at last I execute an INSERT query against table b, by specifying the list of columns given by the global variable COLUMN_LIST, and the values to be inserted as a single string resulting from the CONCAT function (row6.values).
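Sketched with the example columns id, Theme, name (the real queries are assembled from the global variables at run time, and the row values shown are made up for illustration):

```sql
-- subjob 2: each row of table_a comes back as one pre-quoted string
SELECT CONCAT("'", id, "','", Theme, "','", name, "'") AS vals
FROM table_a;

-- subjob 3: the generated INSERT, e.g. for one such row
INSERT INTO table_b (id, Theme, name)
VALUES ('1', 'dark', 'alpha');
```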
This solution is generic: if you replace the table names with context variables, you can use it to copy data from any MySQL table to another.

How to store the result of a SQL statement as a variable and use the result in an SSIS Expression?

I am using a SSIS Data Flow Task to transfer data from one table to another. Column A in Table A contains a number, the last 3 digits of which I want to store in Column B of Table B.
First I'm trying to grab all of the data in Column A and store in a variable via a simple SELECT statement SELECT COLUMN_A FROM TABLE_A. However, the variable stores the statement as a string when I want the result set of the query. I have set the EvaluateAsExpression property to False but to no avail.
Secondly I want to be able to use the result of this query in the Derived Column of my Data Flow to extract the last 3 digits and store the values in Column_B in the destination. The expression I have is:
(DT_STR,3,1252)RIGHT(#User::[VariableName],3)
I want to store this as a string hence the (DT_STR,3,1252) data type.
All I'm getting so far in Column_B of Table_B is the last 3 characters of the SELECT statement, "E_A". There is a lot of useful information on the web, including YouTube videos, for things like setting file paths and server names as parameters or variables, but I can't see much relevant to the specifics of my query.
I have used an Execute SQL Task to insert row counts from flat files but, in this example, I want to use the Derived Column tool instead.
What am I doing wrong? Any help is gratefully appreciated.
I prefer to do all the work in SQL if you aren't doing anything else with that number.
select right(cast(ColA as varchar(20)),3) from tableA
-- you can add another cast if you want it to be an int
Use that in an Execute SQL Task with the result set option set to "single row".
Map the result to a variable.
Then, in a Derived Column in the data flow, you can assign that variable to the new column.
Thanks KeithL, that's one solution I will use in future, but I found another.
I dropped the variable and, in the Expression box of the Transformation Editor, used:
(DT_STR,3,1252)RIGHT((DT_STR,3,1252)Column_A,3)
In my question, I had failed to cast Column_A from Table_A as a string. The first use of (DT_STR,3,1252) simply sets the destination column's type to string, so as not to inherit the data type of the source, which in my case was int.
It's the second use of (DT_STR,3,1252) that actually casts Column_A from int to a string.

MySQL datatype to store pair of values?

I have a table that stores a list of expected parameters that are compared against a real set of data to check if they are within range. There are a dozen or so labeled parameters, and each parameter is either an int, float, or varchar. Each parameter can have either a range like (3.5 - 6.7) or a list of possible values like ('localizer', 'local a') or (4,5,8)
This will be dynamic, so the user will be able to add to and update the list of possible values. I don't think I want to add lots and lots of rows of possible parameters unless I can't find a way to do this in the schema.
What is the best way to store these possible parameter values in a SQL table?
Have you tried using ENUM? E.g., gender ENUM('M', 'F') accepts only 'M' or 'F' as a value.
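A minimal sketch of an ENUM column (the table name people is hypothetical):

```sql
CREATE TABLE people (
  id     INT PRIMARY KEY,
  gender ENUM('M', 'F')   -- any other value is rejected in strict SQL mode
);
```

Note that ENUM only covers the fixed-list case; a numeric range like (3.5 - 6.7) would still need, for example, a pair of min/max columns.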