Parsing JSON Payload in Vertica - json

I am using Vertica DB (and DBeaver as SQL Editor) - I am new to both tools.
I have a view with multiple columns:
| someint (int) | xyz (int) | c (Varchar) | json (Long Varchar)               |
|---------------|-----------|-------------|-----------------------------------|
| 5             | 1542      | none        | {"range":23, "rm": 51, "spx": 30} |
| 5             | 1442      | none        | {"range":24, "rm": 50, "spx": 3}  |
| 3             | 1462      | none        | {"range":24, "rm": 50, "spx": 30} |
I want to create another view of the above (or, to begin with, just be able to query it properly), but with the "json" column split into the individual fields/columns "range", "rm" and "spx".
I imagine the output of the query / the new view to be something like the following:
| someint | xyz  | c    | range | rm | spx |
|---------|------|------|-------|----|-----|
| 5       | 1542 | none | 23    | 51 | 30  |
| 5       | 1442 | none | 24    | 50 | 3   |
....
So far I have not even been able to query "range", for example.
Hence my questions:
How can I separate the json column key-value structure into individual columns (in a query output)?
How can I transfer the desired output into a new view in Vertica?
I haven't found much help in the documentation, as the procedure described there is to load JSON text files from disk or to operate on tables, which I cannot do since I only have access to a view.

I have found a solution, so for anyone else encountering this problem:
SELECT someint, xyz, c,
       MAPLOOKUP(MapJSONExtractor(json), 'range') AS range,
       MAPLOOKUP(MapJSONExtractor(json), 'rm')    AS rm,
       MAPLOOKUP(MapJSONExtractor(json), 'spx')   AS spx
FROM test;
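For the second part of the question (turning that query into a view of its own), a minimal sketch, assuming you have CREATE VIEW privileges; the view name json_split_view is made up here:
CREATE VIEW json_split_view AS
SELECT someint, xyz, c,
       MAPLOOKUP(MapJSONExtractor(json), 'range') AS range,
       MAPLOOKUP(MapJSONExtractor(json), 'rm')    AS rm,
       MAPLOOKUP(MapJSONExtractor(json), 'spx')   AS spx
FROM test;   -- the source view from the query above
Note that MAPLOOKUP returns the values as strings; if you want true integer columns, an explicit cast (e.g. ...::INT) should work, though I have not verified this on every Vertica version.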

Related

Laravel get "unique" row from mysql

I have a special scenario where I need to fetch "unique" rows.
Let's say the database looks like this:
| id | userid | value | others |
|----|--------|-------|---------|
| 1 | 111 | 10 | string1 |
| 2 | 112 | 30 | string2 |
| 3 | 112 | 30 | string3 |
| 4 | 113 | 50 | string4 |
What I want to achieve is to fetch the unique rows based on "userid" so I'm able to sum all the values.
The expected output rows can be either ids 1, 2, 4 or ids 1, 3, 4 (both are acceptable in this special case because the same userid guarantees the same value; in general, take just one row from each group of rows with the same userid), so the sum will be 90.
Note: DB is extended from Eloquent\Model
My old approach was to call DB::unique('userid'), then for each userid call DB::where('userid', $id)->value('value') and add the result to the sum; I just believe there might be a better approach.
There is already Illuminate\Support\Facades\DB in Laravel, which returns the Query Builder, so it is not recommended to name a model DB.
Just pick another name.
By the way, for Eloquent\Model, you can use groupBy and sum too:
Model::groupBy('userid')->sum('value');
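For reference, here is a plain-SQL sketch of how the 90 total can be computed (the table name user_values is a placeholder for whatever table the model maps to; column names are taken from the question). Since rows with the same userid are guaranteed to carry the same value, it keeps one value per userid before summing:
SELECT SUM(v) AS total
FROM (
    SELECT MAX(value) AS v   -- any row works here, as equal userid implies equal value
    FROM user_values         -- placeholder table name
    GROUP BY userid
) AS per_user;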

getting the new row id from pySpark SQL write to remote mysql db (JDBC)

I am using pyspark-sql to create rows in a remote mysql db, using JDBC.
I have two tables, parent_table(id, value) and child_table(id, value, parent_id), so each row in parent_table may have as many rows in child_table associated with it as needed.
Now I want to create some new data and insert it into the database. I'm using the code guidelines here for the write operation, but I would like to be able to do something like:
parentDf = sc.parallelize([5, 6, 7]).toDF(('value',))
parentWithIdDf = parentDf.write.mode('append') \
    .format("jdbc") \
    .option("url", "jdbc:mysql://" + host_name + "/" + db_name) \
    .option("dbtable", table_name) \
    .option("user", user_name) \
    .option("password", password_str) \
    .save()
# The assignment at the previous line is wrong, as pyspark.sql.DataFrameWriter#save doesn't return anything.
I would like a way for the last line of code above to return a DataFrame with the new row ids for each row so I can do
childDf = parentWithIdDf.flatMap(lambda x: [[8, x[0]], [9, x[0]]])
childDf.write.mode('append')...
meaning that at the end I would have in my remote database
parent_table

| id | value |
|----|-------|
| 1  | 5     |
| 2  | 6     |
| 3  | 7     |

child_table

| id | value | parent_id |
|----|-------|-----------|
| 1  | 8     | 1         |
| 2  | 9     | 1         |
| 3  | 8     | 2         |
| 4  | 9     | 2         |
| 5  | 8     | 3         |
| 6  | 9     | 3         |
As noted in the comment on the first code snippet above, pyspark.sql.DataFrameWriter#save doesn't return anything (per its documentation), so how can I achieve this?
Am I doing something completely wrong? It looks like there is no way to get data back from a Spark action (which save is), while I would like to use this action as a transformation, which leads me to think I may be approaching all of this in the wrong way.
A simple answer is to use the timestamp plus an auto-increment number to create a unique ID. This only works if only one server is running at any instant of time.
:)
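A different, purely database-side sketch, not what the answer above proposes: assuming MySQL, the table and column names from the question, and that the value column identifies the freshly appended parents, you can avoid reading the generated parent ids back into Spark at all and attach the child rows with a second statement that joins on value:
-- Run after the Spark append of the parent rows (values 5, 6, 7).
-- Each new parent gets two children with values 8 and 9, wired up via the
-- auto-generated parent id.
INSERT INTO child_table (value, parent_id)
SELECT c.value, p.id
FROM parent_table AS p
CROSS JOIN (SELECT 8 AS value UNION ALL SELECT 9) AS c
WHERE p.value IN (5, 6, 7);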

pyqt4 - MySQL: how to print single/multiple row(s) of a table in the TableViewWidget

I've recently tried to create an executable with Python 2.7 which can read a MySQL database.
The database (named 'montre') contains two tables: patient and proto_1.
Here is the content of those tables:
mysql> select * from proto_1;
+----+------------+---------------------+-------------+-------------------+---------------+----------+
| id | Nom_Montre | Date_Heure          | Temperature | Pulsion_cardiaque | Taux_oxy_sang | Humidite |
+----+------------+---------------------+-------------+-------------------+---------------+----------+
|  1 | montre_1   | 2017-11-27 19:33:25 |       22.30 |              NULL |          NULL |     NULL |
|  2 | montre_1   | 2017-11-27 19:45:12 |       22.52 |              NULL |          NULL |     NULL |
+----+------------+---------------------+-------------+-------------------+---------------+----------+
mysql> select * from patient;
+----+-----------+--------+------+------+---------------------+------------+--------------+
| id | nom       | prenom | sexe | age  | date_naissance      | Nom_Montre | commentaires |
+----+-----------+--------+------+------+---------------------+------------+--------------+
|  2 | RICHEMONT | Robert | M    |   37 | 1980-04-05 23:43:00 | montre_3   | essaye2      |
|  3 | PIERRET   | Mandy  | F    |   22 | 1995-04-05 10:43:00 | montre_4   | essaye3      |
| 14 | PIEKARZ   | Allan  | M    |   22 | 1995-06-01 10:32:56 | montre_1   | Healthy man  |
+----+-----------+--------+------+------+---------------------+------------+--------------+
As I'm only used to coding in C (no OOP), I didn't create classes in the Python project (shame on me...). But I managed, in two files, to create something (with mysql.connector) which can print my database on the command line and execute subroutines like looking_for() etc.
Now I want to create a GUI for users with PyQt. Unfortunately, I saw that the structure is totally different, with classes etc. Still, I tried to work through this and I've created a GUI which can display the table "patient". But I didn't manage to find in the Qt documentation how to reuse the programs I've already created for display, nor how to display only certain rows of my table "patient" in a tableWidget, for example (using QtSql).
For example, if I want to display the whole "patient" table, I use this line (PyQt):
self.model.setTable("patient")
That works, but it bothers me that no MySQL code is involved in displaying the table, so I don't know how to restrict the output to only the rows we want to see and display those. If we only want to see, for example, the row with ID 2, how do we display only Robert in the tableWidget?
To recap, I want to know:
whether I can take the code I've already written and combine it with PyQt;
how to display (in a tableWidget) only rows which are selected by MySQL. Is that possible?
Please find my code at the URL below for a better understanding of my problem:
https://drive.google.com/file/d/1nxufjJfF17P5hN__CBEcvrbuHF-23aHN/view?usp=sharing
I hope I was clear, thank you all for your help!

Sum query for MySQL where a field contains certain values

I need help with a query. I have a table like this:
| ID | codehwos |
| --- | ----------- |
| 1 | 16,17,15,26 |
| 2 | 15,32,12,23 |
| 3 | 53,15,21,26 |
I need an output like this:
| codehwos | number_of_this_code |
| -------- | ---------------------- |
| 15 | 3 |
| 17 | 1 |
| 26 | 2 |
I want to count how many times each code is used across all rows.
Can anyone write a query that does this for all the codes in one go?
Thanks
You have a very poor data format. You should not store lists in strings and never store lists of numbers in strings. SQL has a great data structure for storing lists. Hint: it is called a "table" not a "string".
That said, sometimes one is stuck with other people's really poor design choices. We wouldn't make them ourselves, but we still need to get something done. Assuming you have a list of codes, you can do what you want with:
select c.code, count(*)
from codes c
join table t
  on find_in_set(c.code, t.codehwos) > 0
group by c.code;
If you have any influence over the data structure, then advocate for a junction table, the right way to store this data in a relational database.
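A sketch of that junction table and a one-time backfill (MySQL; yourtable stands in for the original table, and codes is the list-of-codes table the query above already assumes):
-- Junction table: one row per (row id, code) pair.
CREATE TABLE row_codes (
    row_id INT NOT NULL,   -- the ID column of the original table
    code   INT NOT NULL,
    PRIMARY KEY (row_id, code)
);

-- One-time backfill from the comma-separated codehwos column.
INSERT INTO row_codes (row_id, code)
SELECT t.ID, c.code
FROM yourtable t
JOIN codes c ON FIND_IN_SET(c.code, t.codehwos) > 0;

-- Afterwards the original question becomes a plain aggregate:
SELECT code, COUNT(*) AS number_of_this_code
FROM row_codes
GROUP BY code;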

How to split CSVs from one column to rows in a new table in MSSQL 2008 R2

Imagine the following (very bad) table design in MSSQL2008R2:
Table "Posts":
| Id (PK, int) | DatasourceId (PK, int) | QuotedPostIds (nvarchar(255)) | [...] |
|--------------|------------------------|-------------------------------|-------|
| 1            | 1                      |                               | [...] |
| 2            | 1                      | 1                             | [...] |
| 2            | 2                      | 1                             | [...] |
| [...]        |                        |                               |       |
| 102322       | 2                      | 123;45345;4356;76757          | [...] |
So, the column QuotedPostIds contains a semicolon-separated list of self-referencing post Ids (kids, don't do that at home!). Since this design is ugly as hell, I'd like to extract the values from the QuotedPostIds column into a new n:m relationship table like this:
Desired new table "QuotedPosts":
| QuotingPostId (int) | QuotedPostId (int) | DatasourceId (int) |
|---------------------|--------------------|--------------------|
| 2                   | 1                  | 1                  |
| 2                   | 1                  | 2                  |
| [...]               |                    |                    |
| 102322              | 123                | 2                  |
| 102322              | 45345              | 2                  |
| 102322              | 4356               | 2                  |
| 102322              | 76757              | 2                  |
The primary key for this table could either be a combination of QuotingPostId, QuotedPostId and DatasourceID or an additional artificial key generated by the database.
It is worth noticing that the current Posts table contains about 6,300,000 rows but only about 285,000 of those have a value set in the QuotedPostIds column. Therefore, it might be a good idea to pre-filter those rows. In any case, I'd like to perform the normalization using internal MSSQL functionality only, if possible.
I already read other posts on this topic, which mostly dealt with split functions, but I could not work out how exactly to create the new table while also copying the appropriate value from the DatasourceId column, nor how to filter which rows to touch.
Thank you!
Edit: I thought it through and finally solved the problem using an external C# program instead of internal MSSQL functionality. Since it seems it could have been done using Mikael Eriksson's suggestion, I will mark his post as the answer.
From the comments, you say you have a string split function that you don't know how to use with a table.
The answer is to use CROSS APPLY, something like this:
select P.Id,
       S.Value
from Posts as P
  cross apply dbo.Split(';', P.QuotedPostIds) as S;
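Extending that to the full normalization asked for in the question, a sketch that also copies DatasourceId, pre-filters the roughly 285,000 rows that actually quote something, and fills the new table (it assumes QuotedPosts has already been created and that dbo.Split takes the same arguments as above):
INSERT INTO QuotedPosts (QuotingPostId, QuotedPostId, DatasourceId)
SELECT P.Id,
       CAST(S.Value AS int),
       P.DatasourceId
FROM Posts AS P
  CROSS APPLY dbo.Split(';', P.QuotedPostIds) AS S
WHERE P.QuotedPostIds IS NOT NULL
  AND P.QuotedPostIds <> '';   -- skip the rows with no quoted posts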