access a dictionary key value inside a list - json

Hi, I have got a JSON response from an API with the following structure:
{'totalCount': 82,
 'items': [{'id': '81',
            'priority': 3,
            'updatedAt': '2021-07-28T01:30:53.101Z',
            'status': {'value': None, 'source': None},
            'ps': {'value': None, 'source': None},
            'lastUpdate': '2020-09-07T03:00:17.590Z'},
           ....
          ]}
So when I check the keys and values with Python:
for key, value in jsonResponse.items():
    print(key)
I am getting:
totalCount
items
Which are the keys of this dictionary.
When I loop over the values, I get the value of the items key, which is itself a list of dictionaries. How can I get the keys inside that list called items, and the values inside of it as well?

You only have to check the key, value pairs and nest some for blocks:
for key, value in jsonResponse.items():
    print(key)
    if key == "items":
        for items_key, items_value in value[0].items():
            # do whatever you want with the
            # dictionary items here
That way, you can access that specific dictionary.
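As a concrete illustration, here is a minimal runnable sketch of that nested loop, using a shortened version of the response above (the second item's values are made up to have something to print):

```python
# Shortened version of the API response from the question;
# the second item is invented for illustration.
jsonResponse = {
    'totalCount': 82,
    'items': [
        {'id': '81',
         'priority': 3,
         'status': {'value': None, 'source': None}},
        {'id': '82',
         'priority': 1,
         'status': {'value': 'open', 'source': 'api'}},
    ],
}

for key, value in jsonResponse.items():
    print(key)
    if key == "items":
        # value is a list of dictionaries, so loop over each one
        for item in value:
            for items_key, items_value in item.items():
                print(f"  {items_key}: {items_value}")
```

Looping over the whole list (rather than just value[0]) covers every dictionary inside items.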

Create single dictionary with jq

I have the following input:
[
  {
    constant_id: 5,
    object_id: 2,
    object_type: 'delimited_file',
    name: 'data_file_pattern',
    value: 'list_of_orders.csv',
    insert_date: 2021-11-23T10:24:16.568Z,
    update_date: null
  },
  {
    constant_id: 6,
    object_id: 2,
    object_type: 'delimited_file',
    name: 'header_count',
    value: '1',
    insert_date: 2021-11-23T10:24:16.568Z,
    update_date: null
  }
]
That I'd like to combine to get the following result:
{
  data_file_pattern: 'list_of_orders.csv',
  header_count: '1'
}
Basically creating a single dictionary with only the name and value keys from the input dictionaries. I believe I've done this before but for the life of me I can't figure it out again.
If you get your quoting right in the input JSON, it's as simple as calling the from_entries builtin. It converts an array of objects to a single object with given key/value pairs. It takes the field name from a field called key, Key, name or Name and the value from a field called value or Value (see Demo):
from_entries
{
  "data_file_pattern": "list_of_orders.csv",
  "header_count": "1"
}
Note: I believe the second field name should read header_count instead of delimited_file as you wanted to take its name from .name, not .object_type.
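For comparison, the same name/value collapse can be sketched in plain Python (assuming the input has first been fixed up into valid, quoted JSON; only the relevant fields are shown here):

```python
import json

# Properly quoted version of the input, trimmed to the fields that matter.
raw = '''
[
  {"constant_id": 5, "name": "data_file_pattern", "value": "list_of_orders.csv"},
  {"constant_id": 6, "name": "header_count", "value": "1"}
]
'''

entries = json.loads(raw)
# The equivalent of jq's from_entries: take each entry's name as the
# key and its value as the value, merging everything into one dict.
combined = {entry["name"]: entry["value"] for entry in entries}
print(combined)
# {'data_file_pattern': 'list_of_orders.csv', 'header_count': '1'}
```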

How to get map keys from Arrow dataset

What is the recommended approach to obtain a unique list of map keys from an Arrow dataset?
For a dataset with schema containing:
...
PARQUET:field_id: '19'
detail: map<string, struct<reported: bool, incidents_per_month: int32>>
...
Sample data:
"detail": {"a": {"reported": true, "incidents_per_month": 3}, "b": {"reported": true, "incidents_per_month": 3}},
"detail": {"c": {"reported": false, "incidents_per_month": 3}}
What is the right approach to obtaining a list of unique map keys for field detail? i.e. a,b,c
Current (slow) approach:
map_data = dataset.column('detail')
map_keys = list(set([key for chunk in map_data.iterchunks() for key in chunk.keys.unique().tolist()]))
You already found the .keys attribute of a MapArray. This gives an array of all keys, of which you can take the unique values.
But a dataset (Table) can consist of many chunks, and then accessing the data of a column gives a ChunkedArray which doesn't have this keys attribute. For that reason, you loop over the different chunks, and combine the unique values of all of those.
For now, looping over the chunks is still needed I think, but calculating the overall uniques can be done a bit more efficiently with pyarrow:
import pyarrow as pa

# set up a small example
map_type = pa.map_(pa.string(), pa.struct([('reported', pa.bool_()), ('incidents_per_month', pa.int32())]))
values = [
    [("a", {"reported": True, "incidents_per_month": 3}), ("b", {"reported": True, "incidents_per_month": 3})],
    [("c", {"reported": False, "incidents_per_month": 3})]
]
dataset = pa.table({'detail': pa.array(values, map_type)})

# then create a chunked array of keys
map_data = dataset.column('detail')
keys = pa.chunked_array([chunk.keys for chunk in map_data.iterchunks()])

# and take the unique values in one go:
>>> keys.unique()
<pyarrow.lib.StringArray object at 0x7fbc578af940>
[
"a",
"b",
"c"
]
For optimal efficiency, it would still be good to avoid the python loop of pa.chunked_array([chunk.keys for chunk in map_data.iterchunks()]), and for this I opened https://issues.apache.org/jira/browse/ARROW-12564 to track this enhancement.

SSRS Report set value of parameter based on other parameter

I want to set the default value of one of my parameters using the value selected in another parameter's dataset.
for example, the content of the dataset is something like
[{'name': 'alex', 'id': 1},
 {'name': 'bloom', 'id': 2},
 {'name': 'kelly', 'id': 3},
 {'name': 'david', 'id': 4},
 {'name': 'lyn', 'id': 5}]
Then if the user chooses name = alex for the first parameter, how do I set the next parameter's value to 1, i.e. the previous parameter's id?
Go to the parameter properties of the one you want to default > go to the Default Values tab > Specify values > add one value and use this expression:
=Parameters!YourParameterName.Value
Is there any reason you need two parameters essentially pointing to the same thing?
Typically you would point your parameter's Available Values property to your dataset, and set the parameter value to be id and the parameter label to be name. This way the user chooses "alex" from the list, but internally the parameter value is actually 1.

converting flattened json from list into dataframe in python

I have a nested json file which I managed to flatten, but as a result I got a list which looks like this:
[{'people_gender': 'Female',
'people_age_group': 'Young adult',
'people_distance': 91,
'time': 0.33},
{'people_gender': 'Male',
'people_age_group': 'Adult',
'people_distance': 88,
'time': 0.66}]
These are only the first two entries of the list, but there is of course no point in copying the whole thing. I would now like to convert it into a dataframe so that 'people_gender', 'people_age_group', 'people_distance' and 'time' are the columns and the rows hold the results for the respective time moments.
I simply tried:
df = pd.DataFrame(np.array(file))
but this just gives me a data frame with one column, where each row holds all the entries for a given time moment, and I don't know how to tackle it from there.
You can use json_normalize() to get this JSON into a DataFrame.
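A minimal sketch using the two records shown above: a list of flat dicts can be passed to pd.json_normalize() (or, since the dicts are already flat, directly to pd.DataFrame()):

```python
import pandas as pd

# The two records from the question.
records = [
    {'people_gender': 'Female', 'people_age_group': 'Young adult',
     'people_distance': 91, 'time': 0.33},
    {'people_gender': 'Male', 'people_age_group': 'Adult',
     'people_distance': 88, 'time': 0.66},
]

# Each dict becomes a row; the dict keys become the columns.
df = pd.json_normalize(records)   # or simply pd.DataFrame(records)
print(df)
```

Each key of the dicts ends up as a column, and each dict becomes one row, which is exactly the layout described in the question.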

'NoneType' object has no attribute 'replace' in Python list comprehension

I have the following dictionary...
mydict = {'columns': ['col1', 'col2', 'col3'],
'rows': [['col1', 'col2', 'col3'],
['testing data 1', 'testing data 2lk\nIdrjy9dyj', 'testing data 3'],
['testing data 2', 'testing data 3', 'testing data 4'],
['testing data 3', 'testing data 4', 'testing data 5']]}
And I am using the following loop with a list comprehension to replace each carriage return "\n" with "<br>". It works fine except when it gets passed a None value (from a null in the JSON file it is reading); then it throws the error 'NoneType' object has no attribute 'replace'. I just do not know how to put an "if item is not None" check inside the list comprehension. Any help greatly appreciated.
for i, items in enumerate(mydict['rows']):
    mydict['rows'][i] = [item.replace("\n", "<br>") for item in items]
You could just use a boolean expression here:
[item and item.replace("\n","<br>") for item in items]
This only calls item.replace() if item is considered true; None and an empty string are both considered false.
If you wanted to filter out any None items you could add a test to your list comprehension:
[item.replace("\n","<br>") for item in items if item is not None]
to remove None values or
[item.replace("\n","<br>") for item in items if item]
to only keep non-empty values.
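Putting it together, here is a minimal sketch of the boolean-expression approach on data shaped like the question's (a None and an empty string are added here to show the behavior):

```python
# Rows shaped like the question's data, with a None (from a JSON null)
# and an empty string mixed in for illustration.
mydict = {'rows': [['col1', 'col2\nwrapped'],
                   [None, 'x\ny'],
                   ['', 'plain']]}

# item and item.replace(...) only calls .replace() when item is truthy;
# None and '' are falsy, so they pass through unchanged.
mydict['rows'] = [
    [item and item.replace("\n", "<br>") for item in items]
    for items in mydict['rows']
]
print(mydict['rows'])
# [['col1', 'col2<br>wrapped'], [None, 'x<br>y'], ['', 'plain']]
```

Note that this keeps the None and empty-string entries in place; use one of the filtering comprehensions above if you want to drop them instead.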