I have a table with a column Info VARCHAR(MAX) constrained to always be valid JSON. What is the best way to query a key with JSON_QUERY/JSON_VALUE if I don't know ahead of time whether the value associated with the key is a scalar? Currently I return rows where either has a value, as shown below.
SELECT ID, JSON_VALUE(Info, '$.Key') as json_val, JSON_QUERY(Info, '$.Key') as json_query
FROM TABLE
WHERE (JSON_VALUE(Info, '$.Key') IS NOT NULL OR JSON_QUERY(Info, '$.Key') IS NOT NULL)
and relying on the fact that the results are mutually exclusive.
The problem is that the JSON_QUERY in the WHERE clause prevents any index on the virtual column vKey AS JSON_VALUE(Info, '$.Key') from being used.
As suggested by @MartinSmith, you can add a computed column like this:
ALTER TABLE YourTable
ADD YourColumn AS (ISNULL(JSON_VALUE(json, '$.key'), JSON_QUERY(json, '$.key')));
You can then index it, and the index will be used in queries automatically
CREATE INDEX IndexName ON YourTable (YourColumn) INCLUDE (OtherColumns)
Note that the index will normally be used even if you use the original expression rather than the new column name.
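To see why a single ISNULL(JSON_VALUE(...), JSON_QUERY(...)) expression covers both cases, here is a rough Python emulation of the mutual exclusivity the question relies on. It is only a sketch: json_value, json_query and computed_column are hypothetical helpers, they assume the document's top level is an object, and the exact text SQL Server returns (whitespace, number formatting) may differ from what json.dumps and str produce.

```python
import json


def json_value(doc, key):
    """Rough stand-in for JSON_VALUE(doc, '$.key') in lax mode: scalars only."""
    value = json.loads(doc).get(key)
    if value is None or isinstance(value, (dict, list)):
        return None  # missing key, JSON null, object or array
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def json_query(doc, key):
    """Rough stand-in for JSON_QUERY(doc, '$.key') in lax mode: objects/arrays only."""
    value = json.loads(doc).get(key)
    if isinstance(value, (dict, list)):
        return json.dumps(value)
    return None


def computed_column(doc, key):
    """Mirrors ISNULL(JSON_VALUE(...), JSON_QUERY(...)): at most one side is non-NULL."""
    scalar = json_value(doc, key)
    return scalar if scalar is not None else json_query(doc, key)


print(computed_column('{"Key": "abc"}', "Key"))   # scalar branch
print(computed_column('{"Key": [1, 2]}', "Key"))  # array branch
print(computed_column('{"Other": 1}', "Key"))     # key missing
```

Because only one of the two functions can return a value for a given key, a single IS NOT NULL test on the computed column replaces the OR in the original WHERE clause.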
Let's say I have a table like this
This table is the result of a query on another, larger table stored in my database.
All I want is to create a table like the one above, specify a custom format for each column, and store it in my database.
I know that I could do create table mytab as select ... etc.
However, I don't know how to specify the column formats that I want in MySQL.
Could you please help?
If you have the query SQL, you should be able to use CREATE TABLE ... AS SELECT to store the results in a table (MySQL does not create tables with SELECT ... INTO). Add a LIMIT clause to store just one row. You could then run SHOW CREATE TABLE tablename (from this SO answer) to get the SQL for creating the table. It would be up to you to figure out what your primary key should be.
Assuming that by column formats you mean data types: use CAST to cast to the desired data type. Note that MySQL's CAST accepts CHAR and SIGNED as target types rather than VARCHAR and INT.
create table new_table as
select
cast( a.metrique as char(100) ) as metrique,
cast( b.nombre_de_lignes as signed ) as cote_de_lignes, ...
from ...
You may specify the column properties completely or partially, exactly as you would when creating an empty table without a SELECT part.
I.e. like
CREATE TABLE table_name ({any definitions allowed in a table creation query:
columns specifications, indices, constraints, FKs, etc.})
SELECT ...
In this form, each output column of the SELECT must have an alias that matches the corresponding column name defined in the CREATE TABLE part. If an alias has no matching column in that definition, a column named after the alias is added to the table with automatically derived properties.
I have CSV data as below; it arrives every 10 minutes in the following format. I need to insert this data into Hive, mapping the source column names to different column names. (The columns don't come in a constant order; we have 10 columns in total, and sometimes many are missing, as in the example below.)
sample csv file :-
1 2 6 4
u f b h
a f r m
q r b c
now while inserting into hive i need to replace column names
for example
1 -> NBR
2 -> GMB
3 -> GSB
4 -> KTC
5 -> VRV
6 -> AMB
now I need to insert into hive table as below
NBR GMB GSB KTC VRV AMB
u f NULL h NULL b
a f NULL m NULL r
Can anyone help me with how to insert these values into Hive?
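Assuming you can get column headers in your source CSV, you will need to map them from source number to their column names. Restrict the substitution to the header line (address 1) and match whole numbers, so that digits in the data rows and the 1 in 10 are left alone (\b is a GNU sed extension):
sed -i '1{s/\b1\b/NBR/g; s/\b2\b/GMB/g; s/\b3\b/GSB/g; s/\b4\b/KTC/g; s/\b5\b/VRV/g; s/\b6\b/AMB/g;...;...;...;...}' input.csv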
Since you only get an unknown subset of the total columns in your hive table, you will need to translate your CSV from
NBR,GMB,AMB,KTC
u,f,b,h
a,f,r,m
q,r,b,c
to
NBR,GMB,GSB,KTC,VRV,AMB,...,...,...,...
u,f,null,h,null,b,null,null,null,null
a,f,null,m,null,r,null,null,null,null
q,r,null,c,null,b,null,null,null,null
in order to properly insert them into your table.
From the Apache Wiki:
Values must be provided for every column in the table. The standard SQL syntax that allows the user to insert values into only some columns is not yet supported. To mimic the standard SQL, nulls can be provided for columns the user does not wish to assign a value to.
Standard Syntax:
INSERT INTO TABLE tablename [PARTITION (partcol1[=val1], partcol2[=val2] ...)] VALUES values_row [, values_row ...]
Where values_row is:
( value [, value ...] )
where a value is either null or any valid SQL literal
Using LOAD DATA INPATH, even with the tblproperties("skip.header.line.count"="1") set, still requires a valid SQL literal for all columns in the table. This is why you're missing columns.
If you cannot get the producer of the CSV to create a file with columns 1,2,...9,10 in the same order as your table columns, and with either consecutive commas or a null character in the data, write a script that adds the missing column names in the order you need them, along with the required null values in the data.
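Such a script could look roughly like the following Python sketch. It assumes a comma-separated file whose first line holds the source numbers; HEADER_MAP, TARGET_COLUMNS, expand_csv and null_token are names made up for this illustration, and only the six mappings given in the question are listed (the remaining columns would have to be added the same way).

```python
import csv
import io

# Assumed mapping from source header numbers to Hive column names (from the question).
HEADER_MAP = {"1": "NBR", "2": "GMB", "3": "GSB", "4": "KTC", "5": "VRV", "6": "AMB"}
# Assumed column order of the target table; extend with the remaining columns.
TARGET_COLUMNS = ["NBR", "GMB", "GSB", "KTC", "VRV", "AMB"]


def expand_csv(src_text, null_token="null"):
    """Rename numeric headers and emit every target column, filling gaps with null_token."""
    reader = csv.reader(io.StringIO(src_text))
    header = [HEADER_MAP[h.strip()] for h in next(reader)]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(TARGET_COLUMNS)
    for row in reader:
        if not row:
            continue  # skip blank lines
        values = dict(zip(header, row))
        # Columns absent from this file get the null token, in table order.
        writer.writerow([values.get(col, null_token) for col in TARGET_COLUMNS])
    return out.getvalue()


sample = "1,2,6,4\nu,f,b,h\na,f,r,m\nq,r,b,c\n"
result = expand_csv(sample)
print(result)
```

Because each row is looked up by column name rather than by position, the script keeps working when the source columns arrive in a different order. Pass whatever null_token your table's null format expects.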
If your CSV has a header like 1,2,3,4 (as you wrote in the comment), you could use the following syntax:
insert into table (columns where you want to insert) select 1,2,3,4 (columns) from csv_table;
So, if you know the order of the CSV columns, you can easily write the insert, naming only the columns you need to populate, regardless of their order in the target table.
Before you can run the above insert, you must create a table that reads from the CSV.
I have a table full of traffic accident data with column headers such as 'Vehicle_Manoeuvre', which contains integers; for example, 13 means the vehicle manoeuvre that caused the accident was 'overtaking moving vehicle'.
I know the mappings from integers to text, as I have a (quite large) Excel file with this data.
An example of what I want to know is the percentage of accidents that involved this type of manoeuvre, but I don't want to have to open the Excel file and look up the integer-to-text mappings every time I write a query.
I could manually change the integers in all the columns (write a query with all the possible mappings for each column, add them as new columns, then delete the original columns), but this would take a long time.
Is it possible to create some type of variable (like an array with the integers in the first column and the mapped text in the second) that SQL could use to understand how the text relates to the integers, allowing me to write the query below:
SELECT COUNT(Vehicle_Manoeuvre) FROM traffictable WHERE Vehicle_Manoeuvre='overtaking moving vehicle';
rather than:
SELECT COUNT(Vehicle_Manoeuvre) FROM traffictable WHERE Vehicle_Manoeuvre=13;
even though the data in the table is still in integer form?
You would do this with a Manoeuvres reference table:
create table Manoeuvres (
ManoeuvreId int primary key,
Name varchar(255) unique
);
insert into Manoeuvres(ManoeuvreId, Name)
values (13, 'Overtaking');
You might even have such a table already, if you know that 13 has a special meaning.
Then use a join:
SELECT COUNT(*)
FROM traffictable tt JOIN
Manoeuvres m
ON tt.Vehicle_Manoeuvre = m.ManoeuvreId
WHERE m.name = 'Overtaking';
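Since the question asks for a percentage, the same join can be extended with a conditional count. Below is a small sketch that runs the idea against an in-memory SQLite database from Python; SQLite is only a stand-in for your actual database, and the AccidentId column, the second manoeuvre (18, 'Going ahead') and the five accident rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Manoeuvres (ManoeuvreId INTEGER PRIMARY KEY, Name TEXT UNIQUE);
    CREATE TABLE traffictable (AccidentId INTEGER PRIMARY KEY, Vehicle_Manoeuvre INTEGER);
    -- Sample data, invented for this illustration only.
    INSERT INTO Manoeuvres VALUES (13, 'Overtaking'), (18, 'Going ahead');
    INSERT INTO traffictable (Vehicle_Manoeuvre) VALUES (13), (18), (18), (13), (18);
""")

# Count the matching accidents and express them as a share of all accidents.
count, pct = conn.execute(
    """
    SELECT SUM(CASE WHEN m.Name = ? THEN 1 ELSE 0 END),
           100.0 * SUM(CASE WHEN m.Name = ? THEN 1 ELSE 0 END) / COUNT(*)
    FROM traffictable tt
    JOIN Manoeuvres m ON tt.Vehicle_Manoeuvre = m.ManoeuvreId
    """,
    ("Overtaking", "Overtaking"),
).fetchone()

print(count, pct)
conn.close()
```

The integers stay untouched in traffictable; only the small reference table needs to be loaded once from the Excel file.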
I have a field that I want to compute based on a string and the ID generated when a record is inserted. Basically, when a record is saved with ID = 1, I need the computed field to read 'string_1', and so on. I am trying this as my formula: (('PV'+'_')+ID), where PV is the string and ID is the primary key field in the same row as the inserted data, but I'm getting a formula error. If I add quotes around ID then I just get PV_ID, which is wrong. Any idea how I can reference the ID field in my formula and fetch its value?
Here is my table row structure: (ID,Computedfield,data1,data2). I need Computedfield to hold the value of the ID field concatenated with a string. Any help is appreciated.
EDIT
Using SQL SERVER 2008 R2 Standard
Your question isn't totally clear on whether that prefix string is a string literal, or the contents of another column.
If it's a literal, you should be able to say:
ALTER TABLE dbo.YourTable
ADD ComputedColumn AS 'PV_' + CAST(ID AS VARCHAR(10)) PERSISTED
If it's a string contained in another column, you should be able to define it like this
ALTER TABLE dbo.YourTable
ADD ComputedColumn AS PV + '_' + CAST(ID AS VARCHAR(10)) PERSISTED
assuming PV is the column (of type VARCHAR) containing the prefix string.
The main point is: since you're mixing a literal string, and an INT value, you need to CAST the INT to a string first before being able to concatenate those two.
Use formula:
('PV_'+CAST(ID as varchar))
If you want to keep the resulting value stored, add PERSISTED at the end.