I have 2 lists like this:
Type = ['Homeless+Shelter','Food+Pantry','Seniors']
Where = ['55410','55414','54669']
And I would like to add them to a URL to create a search to use an API. Here is what I have:
for elem in Type:
    url = 'https://api.citygridmedia.com/content/places/v2/search/where?type=%s&where=55410&format=json&publisher=PUBLISHER_KEY&rpp=50' % (elem)
    urllib.urlretrieve(url, 'CityGrid_Search.json')
The URL goes to an API, and the response is saved as a JSON file. I am inserting each item of the Type list into the URL at 'type=%s'.
I would also like to insert the zip code from the Where list that corresponds to each word in the Type list at 'where='. This code works for iterating through Type, but I have to manually change the zip code in the URL to match the corresponding word. Is it possible to put list items in two different spots?
Try this code,
Type = ['Homeless+Shelter','Food+Pantry','Seniors']
Where = ['55410','55414','54669']
PUBLISHER_KEY = ""
for typeObj, whereObj in zip(Type, Where):
    url = 'https://api.citygridmedia.com/content/places/v2/search/where?type=%s&where=%s&format=json&publisher=%s&rpp=50' % (typeObj, whereObj, PUBLISHER_KEY)
    print(url)
I think it'll help you out.
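As a variation on the zip() approach, here is a minimal Python 3 sketch that lets urllib.parse.urlencode build the query string instead of formatting it by hand. The helper name build_search_urls and the 'PUBLISHER_KEY' placeholder are illustrative assumptions, not part of the CityGrid API. Note that the types are written with plain spaces here, because urlencode turns spaces into '+' itself.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.citygridmedia.com/content/places/v2/search/where'


def build_search_urls(types, zips, publisher_key):
    """Pair each search type with its zip code and build one URL per pair."""
    urls = []
    for search_type, zip_code in zip(types, zips):
        # urlencode escapes the values, e.g. 'Food Pantry' becomes 'Food+Pantry'
        query = urlencode({
            'type': search_type,
            'where': zip_code,
            'format': 'json',
            'publisher': publisher_key,  # placeholder, replace with a real key
            'rpp': 50,
        })
        urls.append(BASE_URL + '?' + query)
    return urls


urls = build_search_urls(
    ['Homeless Shelter', 'Food Pantry', 'Seniors'],
    ['55410', '55414', '54669'],
    'PUBLISHER_KEY',
)
for url in urls:
    print(url)
```

If each result should go to its own file, the zip code or type could be included in the output file name so that one download does not overwrite the previous one.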
I'm new to Python and am using Python 3 to display the data from my weather station.
It used to work perfectly until I got a replacement station.
I found the problem:
In the weather data sent there are 3 fields (not sure of the correct name):
lightning_strike_last_distance
lightning_strike_last_distance_msg
lightning_strike_last_epoch
In my new station these fields are completely missing, as there has been no lightning since I got the new one.
As a result, the station display fails to parse the weather data because those fields are not there.
How can I get the program to check whether those fields (or elements, or whatever the correct name is) are present, and if they are, parse them as usual,
but if they are not, skip them and move on to the next section?
This is the relevant section of code:
lightning_strike_last_distance = forecast_json["current_conditions"]["lightning_strike_last_distance"]
lightning1 = lightning_strike_last_distance*0.621371 # Convert km to miles
data.lightning_strike_last_distance = "{0:.2f} miles".format(lightning1)
lightning_strike_last_epoch = forecast_json["current_conditions"]["lightning_strike_last_epoch"]
data.lightning_strike_last_epoch = time.strftime("%d-%m-%Y %H:%M:%S", time.localtime(lightning_strike_last_epoch))
How can I fix it so the program skips those 3 elements/sections if they are missing?
Try the following pattern:
lightning_strike_last_distance = forecast_json["current_conditions"]["lightning_strike_last_distance"] if "lightning_strike_last_distance" in forecast_json["current_conditions"] else None
It will set lightning_strike_last_distance to the value if it is present, and set it to None if it is not present.
Repeat that pattern for all the other assignments.
To test it quickly, try:
data = {"a":{"b":1,},}
valueB = data["a"]["b"] if "b" in data["a"] else None
valueC = data["a"]["c"] if "c" in data["a"] else None
print (valueB)
print (valueC)
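Applied to the lightning fields, the same idea could look like the sketch below. It uses dict.get(), which returns None for a missing key, and only formats a field when it is actually present. The function name parse_lightning and the "air_temperature" key in the sample input are made up for illustration; only the two lightning keys come from the question.

```python
import time


def parse_lightning(current_conditions):
    """Return formatted lightning fields, leaving None for any that are absent."""
    result = {"distance": None, "epoch": None}

    # .get() returns None instead of raising KeyError when the key is missing
    distance_km = current_conditions.get("lightning_strike_last_distance")
    if distance_km is not None:
        miles = distance_km * 0.621371  # km to miles
        result["distance"] = "{0:.2f} miles".format(miles)

    epoch = current_conditions.get("lightning_strike_last_epoch")
    if epoch is not None:
        result["epoch"] = time.strftime("%d-%m-%Y %H:%M:%S", time.localtime(epoch))

    return result


# A station that has not reported lightning yet: both fields stay None
print(parse_lightning({"air_temperature": 12.3}))

# A station that has reported lightning: both fields are formatted
print(parse_lightning({"lightning_strike_last_distance": 10,
                       "lightning_strike_last_epoch": 0}))
```

The display code can then check for None and skip the lightning section instead of failing on the whole forecast.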
I have searched what I can and I don't seem to be finding the answer I need. Granted I may not be wording it properly. I have tried using .find or even .rindex to count backwards, but no such luck. The value I receive from the JSON looks something like this:
"AdditionalData":"<Data><Entry Key=\"utm_campaign\" Value=\"j2c\" />
<Entry Key=\"utm_medium\" Value=\"cpc\" /><Entry Key=\"utm_source\"
Value=\"j2c\" /><Entry Key=\"job_id\" Value=\"300_xxxx_10703\" /></Data>"
I need to be able to grab the value for the key "job_id", which here is "300_xxxx_10703". This value will change per object returned by the JSON response. Any help would be appreciated, and please let me know if this is already out there and I just missed it.
If the response format remains the same with every request, you could use a plain regular expression to fetch your data, even without parsing the JSON. Example:
response = "<Data><Entry Key=\"utm_campaign\" Value=\"j2c\" /><Entry Key=\"utm_medium\" Value=\"cpc\" /><Entry Key=\"utm_source\" Value=\"j2c\" /><Entry Key=\"job_id\" Value=\"300_xxxx_10703\" /></Data>"
match = response.match(%r{job_id\\?"\s+Value=\\?"([^"\\]+)\\?"}i)
match[1] if match # => "300_xxxx_10703"
If the response format can change (for example, if the order of the attributes of the Entry element can change), then you need to parse the JSON and use an HTML parser, such as Nokogiri, to fetch the required attribute. Code example:
parsed_response = JSON.parse(response)
doc = Nokogiri::HTML(parsed_response['AdditionalData'])
job_id = nil
doc.css('Entry').each do |el|
  if el['Key'] == 'job_id'
    job_id = el['Value']
    break
  end
end
My goal is to (1) import Twitter JSON, (2) extract data of interest, (3) create pandas data frame for the variables of interest. Here is my code:
import json
import pandas as pd
tweets = []
for line in open('00.json'):
    try:
        tweet = json.loads(line)
        tweets.append(tweet)
    except:
        continue
# Tweets often have missing data, therefore use -if- when extracting "keys"
tweet = tweets[0]
ids = [tweet['id_str'] for tweet in tweets if 'id_str' in tweet]
text = [tweet['text'] for tweet in tweets if 'text' in tweet]
lang = [tweet['lang'] for tweet in tweets if 'lang' in tweet]
geo = [tweet['geo'] for tweet in tweets if 'geo' in tweet]
place = [tweet['place'] for tweet in tweets if 'place' in tweet]
# Create a data frame (using pd.Index may be "incorrect", but I am a noob)
df=pd.DataFrame({'Ids':pd.Index(ids),
                 'Text':pd.Index(text),
                 'Lang':pd.Index(lang),
                 'Geo':pd.Index(geo),
                 'Place':pd.Index(place)})
# Create a data frame satisfying conditions:
df2 = df[(df['Lang']==('en')) & (df['Geo'].dropna())]
So far, everything seems to be working fine.
Now, the extracted values for Geo result in the following example:
df2.loc[1921,'Geo']
{'coordinates': [39.11890951, -84.48903638], 'type': 'Point'}
To get rid of everything except the coordinates inside the square brackets I tried using:
df2.Geo.str.replace("[({':]", "") ### results in NaN
# and also this:
df2['Geo'] = df2['Geo'].map(lambda x: x.lstrip('{'coordinates': [').rstrip('], 'type': 'Point'')) ### results in syntax error
Please advise on the correct way to obtain coordinates values only.
The following line from your question indicates that this is an issue with understanding the underlying data type of the returned object.
df2.loc[1921,'Geo']
{'coordinates': [39.11890951, -84.48903638], 'type': 'Point'}
You are returning a Python dictionary here -- not a string! If you want to return just the values of the coordinates, you should just use the 'coordinates' key to return those values, e.g.
df2.loc[1921,'Geo']['coordinates']
[39.11890951, -84.48903638]
The returned object in this case will be a Python list object containing the two coordinate values. If you want just one of the values, you can index into the list, e.g.
df2.loc[1921,'Geo']['coordinates'][0]
39.11890951
This workflow is much easier to deal with than casting the dictionary to a string, parsing the string, and recapturing the coordinate values as you are trying to do.
So let's say you want to create a new column called "geo_coord0" which contains all of the coordinates in the first position (as shown above). You could use something like the following:
df2["geo_coord0"] = [x['coordinates'][0] for x in df2['Geo']]
This uses a Python list comprehension to iterate over all entries in the df2['Geo'] column and for each entry it uses the same syntax we used above to return the first coordinate value. It then assigns these values to a new column in df2.
See the Python documentation on data structures for more details on the data structures discussed above.
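If some tweets have no usable 'geo' entry at all, the dictionary lookup can also be done before the data frame is built. The following is a minimal sketch using only plain Python, assuming tweets is a list of dictionaries as in the question. The function name extract_coordinates and the sample data are illustrative, and the (latitude, longitude) order is an assumption based on the example value shown above.

```python
def extract_coordinates(tweets):
    """Collect (id, first coordinate, second coordinate) for tweets with geo data."""
    rows = []
    for tweet in tweets:
        geo = tweet.get('geo')  # may be missing entirely or present but None
        if not geo or 'coordinates' not in geo:
            continue  # skip tweets without usable geo data
        first, second = geo['coordinates']
        rows.append((tweet.get('id_str'), first, second))
    return rows


# Hypothetical sample: one tweet with coordinates, one with geo=None, one without geo
sample = [
    {'id_str': '1', 'geo': {'coordinates': [39.11890951, -84.48903638], 'type': 'Point'}},
    {'id_str': '2', 'geo': None},
    {'id_str': '3'},
]
print(extract_coordinates(sample))
```

The resulting list of tuples can be passed to pd.DataFrame with a columns argument, which gives one column per coordinate without any string handling.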
I am using Dell Boomi to map data from one system to another. I can use groovy in the maps but have no experience with it. I tried to do this with the other Boomi tools, but have been told that I'll need to use groovy in a script. My inbound data is:
132265,Brown
132265,Gold
132265,Gray
132265,Green
I would like to output:
132265,"Brown,Gold,Gray,Green"
Hopefully this makes sense! Any ideas on the groovy code to make this work?
It can be elegantly solved with groupBy and the spread operator:
@Grapes(
    @Grab(group='org.apache.commons', module='commons-csv', version='1.2')
)
import org.apache.commons.csv.*
def csv = '''
132265,Brown
132265,Gold
132265,Gray
132265,Green
'''
def parsed = CSVParser.parse(csv, CSVFormat.DEFAULT.withHeader('code', 'color'))
parsed.records.groupBy({ it.code }).each { k,v -> println "$k,\"${v*.color.join(',')}\"" }
The above prints:
132265,"Brown,Gold,Gray,Green"
Well, I don't know how you are getting your data, but here is a general way to achieve your goal. You can use a library, such as the one below, to parse the CSV.
https://github.com/xlson/groovycsv
The example for your data would be:
@Grab('com.xlson.groovycsv:groovycsv:1.1')
import static com.xlson.groovycsv.CsvParser.parseCsv
def csv = '''
132265,Brown
132265,Gold
132265,Gray
132265,Green
'''
def data = parseCsv(csv)
I believe you want to associate the number with various values of colors. So for each line you can create a map of the number and the colors associated with that number, splitting the line by ",":
map = [:]
for(line in data) {
    number = line.split(',')[0]
    colour = line.split(',')[1]
    if(!map[number])
        map[number] = []
    map[number].add(colour)
}
println map
So map should contain:
[132265:["Brown","Gold","Gray","Green"]]
Well, if it is not what you want, you can extract the general idea.
Assuming your data is coming in as a comma separated string of data like this:
"132265,Brown 132265,Gold 132265,Gray 132265,Green 122222,Red 122222,White"
The following Groovy script code should do the trick.
def csvString = "132265,Brown 132265,Gold 132265,Gray 132265,Green 122222,Red 122222,White"
LinkedHashMap.metaClass.multiPut << { key, value ->
    delegate[key] = delegate[key] ?: []; delegate[key] += value
}
def map = [:]
def csv = csvString.split().collect{ entry -> entry.split(",") }
csv.each{ entry -> map.multiPut(entry[0], entry[1]) }
def result = map.collect{ k, v -> k + ',"' + v.join(",") + '"'}.join("\n")
println result
Would print:
132265,"Brown,Gold,Gray,Green"
122222,"Red,White"
Do you HAVE to use scripting for some reason? This can be easily accomplished with out-of-the-box Boomi functionality.
Create a map function that prepends the ID field to a string of your choice (i.e. 222_concat_fields). Then use that value to set a dynamic process prop with that value.
The value of the process prop will contain the result of concatenating the name fields. Simply adding this function to your map should take care of it. Then use the final value to populate your result.
Well, it depends on how the data is coming in.
If the data you posted in the question is coming in as a single document, then you can easily handle this in a map with Groovy scripting.
If the data you posted in the question is coming in as multiple documents, i.e.
doc1: 132265,Brown
doc2: 132265,Gold
doc3: 132265,Gray
doc4: 132265,Green
then it cannot be handled in a map. You will need to use a Data Process step with Custom Scripting.
The Groovy code you are asking for depends on the input profile in which you are getting the data. Please provide more information, i.e. input profile, fields, etc.
I'm trying to process the following with a JSON Input step:
{"address":[
{"AddressId":"1_1","Street":"A Street"},
{"AddressId":"1_101","Street":"Another Street"},
{"AddressId":"1_102","Street":"One more street", "Locality":"Buenos Aires"},
{"AddressId":"1_102","Locality":"New York"}
]}
However this seems not to be possible:
Json Input.0 - ERROR (version 4.2.1-stable, build 15952 from 2011-10-25 15.27.10 by buildguy) :
The data structure is not the same inside the resource!
We found 1 values for json path [$..Locality], which is different that the number retourned for path [$..Street] (3509 values).
We MUST have the same number of values for all paths.
The step provides an Ignore Missing Path flag, but it only works if all the rows miss the same path. In that case the step acts as expected and fills the missing values with null.
This limits the power of this step to read uneven data, which was really one of my priorities.
My step Fields are defined as follows:
Am I missing something? Is this the correct behavior?
What I have done is use JSON Input with $.address[*] to read the full map of each element into a jsonRow field, e.g.:
{"address":[
{"AddressId":"1_1","Street":"A Street"},
{"AddressId":"1_101","Street":"Another Street"},
{"AddressId":"1_102","Street":"One more street", "Locality":"Buenos Aires"},
{"AddressId":"1_102","Locality":"New York"}
]}
This results in 4 jsonRows, one for each element, e.g. jsonRow = {"AddressId":"1_101","Street":"Another Street"}. Then, using a JavaScript step, I map my values like this:
var AddressId = getFromMap('AddressId', jsonRow);
var Street = getFromMap('Street', jsonRow);
var Locality = getFromMap('Locality', jsonRow);
In a second script tab I inserted minified JSON parse code from https://github.com/douglascrockford/JSON-js and the getFromMap function:
function getFromMap(key,jsonRow){
    try{
        var map = JSON.parse(jsonRow);
    }
    catch(e){
        var message = "Unparsable JSON: "+jsonRow+" Desc: "+e.message;
        var nr_errors = 1;
        var field = "jsonRow";
        var errcode = "JSON_PARSE";
        _step_.putError(getInputRowMeta(), row, nr_errors, message, field, errcode);
        trans_Status = SKIP_TRANSFORMATION;
        return null;
    }
    if(map[key] == undefined){
        return null;
    }
    trans_Status = CONTINUE_TRANSFORMATION;
    return map[key]
}
You can solve this by changing the JSONPath and splitting up the steps in two JSON input steps. The following website explains a lot about JSONPath: http://goessner.net/articles/JsonPath/
$..AddressId
Does in fact return all the AddressIds in the address array. BUT since Pentaho uses grid rows for input and output [4 rows x 3 columns], it can't handle a missing value (a null) when you ask it to return all the Streets (3 rows) and all the Localities (2 rows), simply because there are no null values in the array itself. It is like trying to drive out of your garage with 3 wheels on your car instead of the usual 4.
I guess your script returns null values (shown as X) like:
A S X
A S X
A S L
A X L
The scripting step can be avoided by changing the Fields path of the first JSON Input step to:
$.address[*]
This retrieves all 4 address lines. Then create a second JSON Input step, using the new field that contains each address line as its source, to retrieve the address details per line:
$.AddressId
$.Street
$.Locality
This yields null values on the four address lines wherever an address detail is not available in an address line.