I have a Google BigQuery table with a column containing large JSON strings. Each row has a different number of keys and nested keys, which I would like to flatten into columns.
My table looks as follows:
id | payload
1  | {"key1":{"value":"1"},"key2":2,"key3":1,"key4":"abcde","version":10}
2  | {"key1":{"value":"2"},"key2":5,"key3":2,"key4":"defg","version":11}
I have managed to extract single columns by using the BigQuery functions JSON_EXTRACT and/or JSON_EXTRACT_SCALAR:
SELECT id, JSON_EXTRACT(payload, '$.key1') as key1
FROM `project.dataset.table`
and so on. However, I don't want to hand-code more than 100 keys that are nested in the JSON column. There has to be a better way!
I am grateful for any kind of support!
Consider the approach below:
create temp function extract_keys(input string) returns array<string> language js as """
return Object.keys(JSON.parse(input));
""";
create temp function extract_values(input string) returns array<string> language js as """
return Object.values(JSON.parse(input));
""";
create temp function extract_all_leaves(input string) returns string language js as '''
  // Recursively collect every leaf value under its dotted key path, e.g. key1.value
  function flattenObj(obj, parent = '', res = {}){
    for(let key in obj){
      let propName = parent ? parent + '.' + key : key;
      if(typeof obj[key] == 'object'){
        flattenObj(obj[key], propName, res);
      } else {
        res[propName] = obj[key];
      }
    }
    return JSON.stringify(res);
  }
  return flattenObj(JSON.parse(input));
''';
create temp table temp_table as (
  select offset, key, value, id
  from your_table t,
  unnest([struct(extract_all_leaves(payload) as leaves)]),
  unnest(extract_keys(leaves)) key with offset
  join unnest(extract_values(leaves)) value with offset
  using(offset)
);
execute immediate (select '''
select * from (select * except(offset) from temp_table)
pivot (any_value(value) for replace(key, '.', '__') in (''' || keys_list || '''
))'''
from (select string_agg('"' || replace(key, '.', '__') || '"', ',' order by offset) keys_list from (
select key, min(offset) as offset from temp_table group by key
))
);
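If applied to the sample data in your question, the output has one row per id and one column per leaf key:
id | key1__value | key2 | key3 | key4  | version
1  | 1           | 2    | 1    | abcde | 10
2  | 2           | 5    | 2    | defg  | 11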
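To check the flattening logic outside BigQuery, the leaf-collection step can be sketched in Python. This is only an illustration of the same idea as extract_all_leaves, not part of the query: the names flatten_leaves and to_column_names are hypothetical, and unlike the JavaScript version this sketch treats only nested objects (dicts) as containers, so arrays and nulls are kept as leaf values.

```python
import json


def flatten_leaves(obj, parent="", res=None):
    """Collect every leaf of a nested dict under its dotted key path."""
    if res is None:
        res = {}
    for key, value in obj.items():
        prop_name = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            # Descend into nested objects, extending the key path
            flatten_leaves(value, prop_name, res)
        else:
            # Assumption: anything that is not a dict counts as a leaf
            res[prop_name] = value
    return res


def to_column_names(flat):
    """Replace '.' with '__', as the pivot step does for column names."""
    return {key.replace(".", "__"): value for key, value in flat.items()}


payload = '{"key1":{"value":"1"},"key2":2,"key3":1,"key4":"abcde","version":10}'
row = to_column_names(flatten_leaves(json.loads(payload)))
print(row)
```

Running it on the first sample payload gives the keys key1__value, key2, key3, key4 and version, which are the column names the pivot produces.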
I have a distributed table in the single-value model with three columns: time, id and value. I want to pass in several ids and one timestamp as parameters. The value may be NULL at the given timestamp. If so, can I return the linear interpolation of the two nearest neighbors on the left and/or right side as the value at this timestamp? The search time span should also be configurable as a parameter.
The function takes two optional parameters:
type has three options: "prev", "next" and "linear". The default is type="prev".
search_range sets how far from the timestamp to search. The default is search_range=1d.
search_range can be 2d (days), 7w (weeks), 5M (months) or 1y (years); no quotation marks are needed.
Syntax:
geInterpolatedDataPoint(dbName, tblName, 2019.01.02T02:00:01.042, ids, "linear",2d)
geInterpolatedDataPoint(dbName, tblName, timestamp, ids)
geInterpolatedDataPoint(dbName, tblName, timestamp, ids, 5w)
and so on.
The example script is as follows:
def prev_next_method(rawDataTable, timestamp, ids,type,search_range=1d){
    if (type=="prev") {
        // If the data in the table is already in ascending time order, remove "csort time"; the query will be much faster
        original_table = select * from rawDataTable where id in ids, time between temporalAdd(timestamp,-1H) : timestamp context by id csort time limit -1
    }
    if (type=="next") {
        original_table = select * from rawDataTable where id in ids, time between timestamp : temporalAdd(timestamp,1H) context by id csort time limit 1
    }
    // ids with no data point within one hour are searched again over the full search_range
    exclude_id= (set(ids)-set(exec id from original_table)).keys()
    if (size(exclude_id) >0){
        if (type=="prev") {
            search=duration("-"+string(search_range))
            original_table_con = select * from rawDataTable where id in exclude_id , time between temporalAdd(timestamp,search) : timestamp context by id csort time limit -1
        }
        if (type=="next") {
            original_table_con = select * from rawDataTable where id in exclude_id , time between timestamp : temporalAdd(timestamp,search_range) context by id csort time limit 1
        }
        // ids still without data get a NULL value at the requested timestamp
        exclude_id_final = (set(exclude_id)-set(exec id from original_table_con)).keys()
        exclude_table = table(timestamp as time, exclude_id_final as id, take(double(NULL), size(exclude_id_final)) as value)
        return original_table.append!(original_table_con).append!(exclude_table)
    }
    else return original_table
}
def geInterpolatedDataPoint(dbName, tblName, timestamp, ids, type="prev",search_range=1d){
    tmp=table(take(timestamp, size(ids)) as time, ids as id)
    rawDataTable = loadTable(dbName,tblName)
    if (type=="prev") {
        res_tmp= prev_next_method(rawDataTable, timestamp, ids,"prev",search_range)
        update res_tmp set time = timestamp
        return res_tmp
    }
    if (type=="next") {
        res_tmp= prev_next_method(rawDataTable, timestamp, ids,"next",search_range)
        update res_tmp set time = timestamp
        return res_tmp
    }
    if (type=="linear"){
        pre = prev_next_method(rawDataTable, timestamp, ids,"prev",search_range)
        nex = prev_next_method(rawDataTable, timestamp, ids,"next",search_range)
        update tmp set value = double()
        join_res=(pre.append!(tmp).append!(nex)).sortBy!(`id,1)
        res_tmp=select time, id, nullFill((time-prev(time))\(next(time)-prev(time))*(next(value)-prev(value)),0)+prev(value) as value1 from join_res context by id
        res = select timestamp as time, id, sum(value1) from res_tmp group by id
        return res
    }
}
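The three modes are easier to follow on a small in-memory example. The Python sketch below shows the same idea for a single id; it is not DolphinDB code, and the function name interpolate_at, the list-of-tuples input and the choice to skip NULL (None) values when looking for neighbors are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def interpolate_at(points, ts, method="prev", search_range=timedelta(days=1)):
    """Return the value at ts for one id, or None if no neighbor is in range.

    points is a list of (time, value) pairs; value may be None.
    """
    lo, hi = ts - search_range, ts + search_range
    # Assumption: NULL values are ignored when searching for neighbors
    before = [(t, v) for t, v in points if lo <= t <= ts and v is not None]
    after = [(t, v) for t, v in points if ts <= t <= hi and v is not None]
    prev_pt = max(before, key=lambda p: p[0]) if before else None
    next_pt = min(after, key=lambda p: p[0]) if after else None

    if method == "prev":
        return prev_pt[1] if prev_pt else None
    if method == "next":
        return next_pt[1] if next_pt else None
    if method == "linear":
        if prev_pt is None or next_pt is None:
            return None
        (t0, v0), (t1, v1) = prev_pt, next_pt
        if t1 == t0:
            return v0  # an exact hit at ts needs no interpolation
        frac = (ts - t0) / (t1 - t0)  # timedelta / timedelta gives a float
        return v0 + frac * (v1 - v0)
    raise ValueError("method must be 'prev', 'next' or 'linear'")


points = [
    (datetime(2019, 1, 2, 2, 0), 10.0),
    (datetime(2019, 1, 2, 3, 0), None),  # NULL at the requested timestamp
    (datetime(2019, 1, 2, 4, 0), 20.0),
]
ts = datetime(2019, 1, 2, 3, 0)
print(interpolate_at(points, ts, "prev"))
print(interpolate_at(points, ts, "next"))
print(interpolate_at(points, ts, "linear"))
```

Here the requested timestamp lies midway between the two neighbors, so the linear result is the average of their values. Shrinking search_range below one hour leaves no neighbor in range and the function returns None, which corresponds to the NULL rows the script appends for ids without data.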
I have this line of code
int Id = Convert.ToInt32(GridViewADMIN_Panel.DataKeys[e.RowIndex].Values[0]);
and I am getting the error "Input string was not in a correct format."
The error means that the value you are trying to convert is not a valid integer string, for example because it is empty or contains non-numeric characters. First check that the value is not null, then use int.TryParse, which reports failure instead of throwing:
object rawValue = GridViewADMIN_Panel.DataKeys[e.RowIndex].Values[0];
// Only use the key if it is present and parses as an integer
if (rawValue != null && int.TryParse(rawValue.ToString(), out int Id))
{
    // Id now holds the parsed integer value
}
Make sure the value is an integer before you convert it.
I can't filter my results with an andHaving condition, because the AVG function returns string values.
$qb = $this->getDoctrine()->getRepository('AppBundle:Service')->createQueryBuilder('s');
$qb->join('s.ratingList', 'r')
->addSelect('AVG(r.rating) as avg_rating')
->addGroupBy('s.service')
->andHaving('avg_rating >= :rating')
->setParameter('rating', $rating)
;
In the results I see that avg_rating is a string, not a number, and that is why the filtering doesn't work. How can I get avg_rating as an integer?
I need to retrieve items based on a few different restrictions: one is to have a code of 234, the other is to have a calculated number of less than 10. However, I am not sure how to pass values to the sqlRestriction method.
I am using {alias}, but it resolves to Item rather than city.
List<Store> stores = (List<Store>) sessionFactory.getCurrentSession()
    .createCriteria(Item.class)
    .createAlias("Address", "address")
    .createAlias("address.myJoinTable.city", "city")
    .setProjection(pl)
    .add(Restrictions.eq("address.myJoinTable.city",
        session.load(City.class, myId)))
    .add(Restrictions.sqlRestriction(
        "SELECT ({alias}.id * {alias}.code * "
        + mynumber1 + " * " + mynumber2
        + ") as number from city HAVING number < 10"))
    .setResultTransformer(
        new AliasToBeanResultTransformer(Store.class))
    .list();
You can use public static Criterion sqlRestriction(String sql, Object value, Type type) if you only need to pass one value. Use public static Criterion sqlRestriction(String sql, Object[] values, Type[] types) for multiple values.
For example:
Type[] tipos = {IntegerType.INSTANCE, IntegerType.INSTANCE};
Integer[] values = {1, 2};
criteria.add(Restrictions.sqlRestriction("SELECT ({alias}.id * {alias}.code * ? * ?) AS number FROM city HAVING number < 10", values, tipos ));