I have a function that reads through a log file. It matches a regular expression that marks the start of a new log entry, then grabs all of the information from that point up to the next match (which indicates the next log entry).
For each log entry, some relevant information is placed into a dictionary (error number, error message, etc.).
At the end of my createGenerator function I yield mydict, because I don't want to store every log entry before passing them to my second function, generatorCheck().
What I want generatorCheck to do is check the key/value pairs that are yielded by createGenerator. I then want to put all the matching key/value pairs into a table. I'm not sure how to do this, though, as I haven't worked much with yield or generators.
def createGenerator():
    mydict = {
        'key1': 'value2',
        'key2': 'value3'
        ...
        ...
    }
    yield mydict
def generatorCheck():
    dict2 = {}
    createGenerator()
    for i in createGenerator():
        if 'key1' or 'key2' in createGenerator():
            # store key, value pair in dict2

generatorCheck()
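A minimal sketch of one way to wire these together, as a hedged example rather than the exact implementation: the log-parsing details are left as a comment, the key names are just the placeholders from the question, and createGenerator() is called only once, since every call creates a new, independent generator.
def createGenerator(logfile_path):
    """Yield one dict per log entry instead of collecting them all in memory."""
    with open(logfile_path) as logfile:
        for line in logfile:
            # ... match the entry-start regex, gather the lines that belong to
            # this entry, and build mydict (error number, error message, etc.) ...
            mydict = {'key1': 'value2', 'key2': 'value3'}
            yield mydict

def generatorCheck(logfile_path, wanted=('key1', 'key2')):
    dict2 = {}
    # Iterate the generator directly; entries are produced lazily, one at a time.
    for entry in createGenerator(logfile_path):
        for key, value in entry.items():
            # Check each key individually; `if 'key1' or 'key2' in ...` is always
            # True because a non-empty string is truthy on its own.
            if key in wanted:
                dict2.setdefault(key, []).append(value)
    return dict2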
Goal
I've got some complex JSON with nested data in it which I am retrieving from an API I'm working with. In order to pull out the specific values I care about, I've created a function that pulls out all the values for a key that I can specify. This works well for retrieving the values in a list; however, I need to return multiple values and associate them with one another so that I can write each result as a row in a CSV file. Currently the code just returns a separate array for each key. How would I go about associating them with each other? I've messed with the zip function in Python but can't seem to get it working properly. I sincerely appreciate any input you can give me.
Extract Function
def json_extract(obj, key):
    """Recursively fetch values from nested JSON."""
    arr = []

    def extract(obj, arr, key):
        """Recursively search for values of key in JSON tree."""
        if isinstance(obj, dict):
            for k, v in obj.items():
                if isinstance(v, (dict, list)):
                    extract(v, arr, key)
                elif k == key:
                    arr.append(v)
        elif isinstance(obj, list):
            for item in obj:
                extract(item, arr, key)
        return arr

    values = extract(obj, arr, key)
    return values
Main.py
res = requests.get(prod_url, headers=prod_headers, params=payload)
record_id = json_extract(res.json(), 'record_id')
status = json_extract(res.json(), 'status')
The solution was simple: just use the zip function, e.g. zip(record_id, status).
I had a syntax error that was preventing it from working before.
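For the CSV part of the question, a short sketch of how the zipped pairs could be written out; the file name and header row are assumptions, not part of the original code.
import csv

rows = zip(record_id, status)  # pairs values by position: (record_id[0], status[0]), ...

with open('output.csv', 'w', newline='') as f:  # 'output.csv' is a placeholder name
    writer = csv.writer(f)
    writer.writerow(['record_id', 'status'])    # assumed header row
    writer.writerows(rows)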
From a Django app, I am able to consume data from a separate RESTful API, but what about filtering? The view below returns all books and their data. But what if I want to grab only books by a particular author, date, etc.? I want to pass an author's-name parameter, e.g. .../authors-name or /?author=name, and return only those books in the JSON response. Is this possible?
views.py
def get_books(request):
    response = requests.get('http://localhost:8090/book/list/').json()
    return render(request, 'books.html', {'response': response})
So is there a way to filter like a model object?
I can think of three ways of doing this:
Python's filter could be used with a bit of additional code.
QueryableList, which is the closest to an ORM for lists I've seen.
query-filter, which takes a more functional approach.
1. Built-in filter function
You can write a factory that returns a predicate telling you whether a list element is a match, and then pass the generated predicate into filter.
def filter_pred_factory(**kwargs):
    def predicate(item):
        for key, value in kwargs.items():
            if key not in item or item[key] != value:
                return False
        return True
    return predicate

def get_books(request):
    books_data = requests.get('http://localhost:8090/book/list/').json()
    pred = filter_pred_factory(**request.GET)
    data_filter = filter(pred, books_data)
    # data_filter is cast to a list as a precaution
    # because it is a filter object,
    # which can only be iterated through once before it's exhausted.
    filtered_data = list(data_filter)
    return render(request, 'books.html', {'books': filtered_data})
2. QueryableList
QueryableList would achieve the same as the above, with some extra features. As well as /books?isbn=1933988673, you could use queries like /books?longDescription__icontains=linux. You can find the other supported operations in the QueryableList documentation.
from QueryableList import QueryableListDicts

def get_books(request):
    books_data = requests.get('http://localhost:8090/book/list/').json()
    queryable_books = QueryableListDicts(books_data)
    filtered_data = queryable_books.filter(**request.GET)
    return render(request, 'books.html', {'books': filtered_data})
3. query-filter
query-filter has similar features but doesn't copy the object-oriented approach of an ORM.
from query_filter import q_filter, q_items

def get_books(request):
    books_data = requests.get('http://localhost:8090/book/list/').json()
    data_filter = q_filter(books_data, q_items(**request.GET))
    # filtered_data is cast to a list as a precaution
    # because q_filter returns a filter object,
    # which can only be iterated through once before it's exhausted.
    filtered_data = list(data_filter)
    return render(request, 'books.html', {'books': filtered_data})
It's worth mentioning that I wrote query-filter.
I am trying to pass a couple of IDs through to a custom ransacker filter for Active Admin:
filter :hasnt_purchased_product_items_in, as: :select, collection: -> { ProductItem.all }, multiple: true
The ransacker code
ransacker :hasnt_purchased_product_items, formatter: proc { |product_id|
  users_who_purchased_ids = User.joins(:orders => :items).where("orders.status = 'success'").where("order_items.orderable_id IN (?)", product_id) # return only id-s of returned items.
  ids = User.where.not(id: users_who_purchased_ids).pluck(:id)
  ids.present? ? ids : nil # return ids OR nil!
} do |parent| # not sure why this is needed .. but it is :)
  parent.table[:id]
end
This works for a single query but not for multiple; the SQL runs two separate searches instead of one.
It's running the following two statements
order.items.orderable_id IN ('90')
and
order.items.orderable_id IN ('91')
in two separate SQL statements, instead of what I want, which is
order.items.orderable_id IN ('90','91')
The submitted info from the Active Admin page is
q[hasnt_purchased_product_items_in][]: 90
q[hasnt_purchased_product_items_in][]: 91
I think I need to do something to the incoming parameter at the ransacker stage, but I'm not sure how to deal with that.
Ransack shouldn't be splitting the values into separate queries like that. The parameter should arrive as |product_ids|, an array object inside the proc, which then gets used. Your ransacker should return an ActiveRecord association or an Arel query.
I would try each of these:
Use "_in" in the ransacker name, which bypasses the array logic.
ransacker :hasnt_purchased_product_items_in do |product_ids|
  users_who_purchased_ids = User.joins(orders: :items).where(orders: {status: 'success'}, order_items: {orderable_id: product_ids}) # return only id-s of returned items.
  User.where.not(id: users_who_purchased_ids)
end
Keep the original ransacker name, but use a block rather than a formatter.
ransacker :hasnt_purchased_product_items do |product_ids|
  users_who_purchased_ids = User.joins(orders: :items).where(orders: {status: 'success'}, order_items: {orderable_id: product_ids}) # return only id-s of returned items.
  User.where.not(id: users_who_purchased_ids)
end
I've had to guess the object to return because it's not clear which model this ransacker is on. If you can tell me the objective (e.g. filtering product recommendations for a user to exclude their purchases) then I can update the answer.
Create a scope and make it available to ransack.
User.rb
scope :hasnt_purchased_product_items_in, ->(product_ids) {
  users_who_purchased_ids = joins(orders: :items).where(orders: {status: 'success'}, order_items: {orderable_id: product_ids}) # return only id-s of returned items.
  where.not(id: users_who_purchased_ids)
}

def self.ransackable_scopes(auth_object = nil)
  %i(hasnt_purchased_product_items_in)
end
Then this works:
User.ransack({hasnt_purchased_product_items_in: [[1,2,3]]}).result
Note that ransack wants an array of args; here we want one arg, which is itself an array.
I have data loaded from JSON and am trying to extract arbitrary nested values using a list as input, where the list corresponds to the names of successive children. I want a function get_value(data,lookup) that returns the value from data by treating each entry in lookup as a nested child.
In the example below, when lookup=['alldata','TimeSeries','rates'], the return value should be [1.3241,1.3233].
json_data = {'alldata':{'name':'CAD/USD','TimeSeries':{'dates':['2018-01-01','2018-01-02'],'rates':[1.3241,1.3233]}}}
def get_value(data, lookup):
    res = data
    for item in lookup:
        res = res[item]
    return res

lookup = ['alldata','TimeSeries','rates']
get_value(json_data, lookup)
My example works, but there are two problems:
It's inefficient - in my for loop, I seem to copy the whole TimeSeries object to res, only to then replace it with the rates list. (As @Andrej Kesely explained, res is only a reference at each iteration, so the data isn't actually being copied.)
It's not concise - I was hoping to find a concise (e.g. one- or two-line) way of extracting the data using something like list-comprehension syntax.
If you want a one-liner and you are using Python 3.8+, you can use an assignment expression (the "walrus operator"):
json_data = {'alldata':{'name':'CAD/USD','TimeSeries':{'dates':['2018-01-01','2018-01-02'],'rates':[1.3241,1.3233]}}}
def get_value(data, lookup):
    return [data := data[item] for item in lookup][-1]

lookup = ['alldata','TimeSeries','rates']
print(get_value(json_data, lookup))
Prints:
[1.3241, 1.3233]
I don't think you can do it without a loop, but you could use a reducer here to increase readability.
functools.reduce(dict.get, lookup, json_data)
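A self-contained version of the reduce approach; the operator.getitem variant is an addition here, shown because it raises a KeyError on a missing key instead of failing one step later the way dict.get does.
import functools
import operator

json_data = {'alldata': {'name': 'CAD/USD',
                         'TimeSeries': {'dates': ['2018-01-01', '2018-01-02'],
                                        'rates': [1.3241, 1.3233]}}}
lookup = ['alldata', 'TimeSeries', 'rates']

# dict.get returns None for a missing key, so a typo in lookup only surfaces as an
# AttributeError on the following step; operator.getitem fails immediately.
print(functools.reduce(dict.get, lookup, json_data))          # [1.3241, 1.3233]
print(functools.reduce(operator.getitem, lookup, json_data))  # [1.3241, 1.3233]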
I have written a loop to retrieve data from an API (Limesurvey) based on a list of ids, filling a row of a dataframe with the result of each iteration.
I have a list with ids like this:
# list of ids
ids = ['1','427',... ,'847']
My code to query the API based on each item of the list looks as follows:
method = "get_participant_properties"
params = OrderedDict([
("sSessionKey", api.session_key),
("iSurveyID", 12345),
("aTokenQueryProperties", t),
])
# loop through API query with the 'aTokenQueryProperties' stored in the list 'tids'.
attributes = []
for t in ids:
attributes.append(api.query(method=method, params=params))
pd.DataFrame(attributes)
Unfortunately, the result is a dataframe with 158 rows, and each row is the same, i.e. the query result of the last id in my list (847).
You are not passing the t from the loop into the query. The t in the params definition is unrelated; if I were to run your code now I'd get a NameError exception, because t hasn't been set at that point. The t expression in the params mapping is not live: it is not updated on each loop iteration.
Set the 'aTokenQueryProperties' key in the loop:
method = "get_participant_properties"
params = OrderedDict([
("sSessionKey", api.session_key),
("iSurveyID", 12345),
("aTokenQueryProperties", None),
])
attributes = []
for t in ids:
params["aTokenQueryProperties"] = t
attributes.append(api.query(method=method, params=params))
Setting "aTokenQueryProperties" to None in the params OrderedDict object at the start is optional; you'd only need to do this if reserving its exact location in the params order is important and even then, because in your example it is the last element in the mapping, you'd end up with the same output anyway.