Check if partial string in JSON values in Python - json

I'm trying to check if a value contains a partial string in this sample API call result in JSON format. It should look through job title text to see if it contains the word "Virtual" or "Remote" anywhere in the string. If so, append 1 to new dictionary, and if not, append 0. I keep getting 1's for each value, but not sure why 0's don't show up if it can't find any match. Thanks! Here's the code:
import requests
remote_job_check = {'Remote':[]}
job_lists = {
"jobs": [
{
"country_code": "US",
"state": "AL",
"title": "Sr. Customer Solutions Manager - Virtual Opportunities",
},
{
"country_code": "US",
"state": "CA",
"title": "Data Engineer",
},
{
"country_code": "US",
"state": "CA",
"title": "Data Engineer - Remote",
},
{
"country_code": "US",
"state": "NV",
"title": "Remote Software Engineer",
},
{
"country_code": "US",
"state": "FL",
"title": "Product Manager",
}
]
}
job_data = requests.get(job_lists).json()['jobs']
for details in job_data:
if 'Virtual' or 'Remote' in details.get('title'):
remote_job_check['Remote'].append(1)
else:
remote_job_check['Remote'].append(0)

I was able to figure it out. Thanks for the suggestions though. I learned that my original code only searches for one of the strings. The 'or' operator wasn't even working in this case:
for details in job_data:
if 'Virtual' or 'Remote' in details.get('title'): # The 'or' operator not working
remote_job_check['Remote'].append(1)
else:
remote_job_check['Remote'].append(0)
Then I tried using just one string and it worked.
for details in job_data:
if 'Virtual' in details.get('title'): # This works
remote_job_check['Remote'].append(1)
else:
remote_job_check['Remote'].append(0)
So now I figured I need to create some sort of array to iterate through and found the 'any' function. Here's the working code:
remote_job_keywords = ['Virtual', 'Remote']
if any(keyword in details['title'] for keyword in remote_job_keywords):
job_data['Remote'].append(1)
else:
job_data['Remote'].append(0)
UPDATE: Thanks to #shriakhilc for the feedback on my answer. Like he said, the above works but it was easier to fix the or operator like so:
for details in job_data:
if 'Virtual' in details.get('title') or 'Remote' in details.get('title'):
remote_job_check['Remote'].append(1)
else:
remote_job_check['Remote'].append(0)

Related

Extract data from a JSON file using python

Say if I have JSON entry as follows(The JSON file generated by fetching data from a Firebase DB):
[{"goal_savings": 0.0, "social_id": "", "score": 0, "country": "BR", "photo": "http://graph.facebook", "id": "", "plates": 3, "rcu": null, "name": "", "email": ".", "provider": "facebook", "phone": "", "savings": [], "privacyPolicyAccepted": true, "currentRole": "RoleType.PERSONAL", "empty_lives_date": null, "userId": "", "authentication_token": "-------", "onboard_status": "ONBOARDING_WIZARD", "fcmToken": ----------", "level": 1, "dni": "", "social_token": "", "lives": 10, "bills": [{"date": "2020-12-10", "role": "RoleType.PERSONAL", "name": "Supermercado", "category": "feeding", "periodicity": "PeriodicityType.NONE", "value": 100.0"}], "payments": [], "goals": [], "goalTransactions": [], "incomes": [], "achievements": [{"created_at":", "name": ""}]}]
How do I extract the content corresponding to 'value' which is present inside column 'bills' . Any way to do this ?
My python code is as follows. With this I was only able to get data within bills column. But I need only the entry corresponding to 'value' which is present inside bills.
import json
filedata = open('firebase-dataset.json','r')
data = json.load(filedata)
listoffields = [] # To produce it into a list with fields
for dic in data:
try:
listoffields.append(dic['bills']) # only non-essential bill categories.
except KeyError:
pass
print(listoffields)
The JSON you posted contains misplaced quotes.
I think you are trying to extract the value of 'value' column within bills.
try this
print(listoffields[0][0]['value'])
which will print you 100.0 as str. use float() to use it in calculations.
---edit---
Say the JSON you having contains many JSON objects separated by commas as..
[{ first-entry },{ second-entry },{ third.. }, ....and so on]
..and you want to find the value of each bill in the each JSON obj..
may be the code below will work.-
bill_value_list = [] # to store 'value' of each bill
for bill_list in listoffields:
bill_value_list.append(float(bill_list[0]['value'])) # blill_list[0] will contain complete bill dictionary.
print(bill_value_list)
print(sum(bill_value_list)) # do something usefull
Paste it after the code you posted.(no changes to your code .. since it always works :-) )

Search parameter not working with the API request

I am new to json and APIs.
This particular dataset I am working with (https://api.cdnjs.com/libraries) is searchable by placing ?search=search_term after it (like: https://api.cdnjs.com/libraries?search=cloud). But when I use another json dataset (http://api.nobelprize.org/v1/prize.json for instance) it gives an error:
{ "error":"Using an unknown parameter" }
Does this have to do with how the datasets are set up? Methods perhaps? How do I make the other json searchable like the initial one (so I can use it in my search app)?
Navigating to the main domain of the provided url, you will find that the doc reside at https://nobelprize.readme.io/v1.0
To refine your results, as the docs suggest you should use the following parameters (when searching for prizes): year, yearTo, category and numberOfLaureates.
Here is an example of a refined search:
Request url: http://api.nobelprize.org/v1/prize.json?category=physics&year=2017
Response:
{
"prizes": [{
"year": "2017",
"category": "physics",
"laureates": [{
"id": "941",
"firstname": "Rainer",
"surname": "Weiss",
"motivation": "\"for decisive contributions to the LIGO detector and the observation of gravitational waves\"",
"share": "2"
}, {
"id": "942",
"firstname": "Barry C.",
"surname": "Barish",
"motivation": "\"for decisive contributions to the LIGO detector and the observation of gravitational waves\"",
"share": "4"
}, {
"id": "943",
"firstname": "Kip S.",
"surname": "Thorne",
"motivation": "\"for decisive contributions to the LIGO detector and the observation of gravitational waves\"",
"share": "4"
}]
}]
}
If it wasn't clear enough, I made a request to the API to retrieve data concerning prizes awarded in the physics category in the year 2017.

Parsing JSON in SQL Server 2017 (Clearbit API call)

I'm pulling some data into a database on my local server with API calls via Clearbit provider. Everything was OK regarding parsing the data with SQL Server 2017 until I hit a bump.
I will go straight on the example for easier understanding.
This is the example of an API call output in JSON
{
"id": "384dfe0d-5bba-445e-a390-2d946dc84a12",
"name": "Honeywell",
"legalName": "Honeywell International Inc",
"domain": "honeywell.com",
"domainAliases": [
"honeywell.at",
"honeywell.it",
"evohome.info",
"wifithermostat.com",
"emsaviation.com",
"mytotalconnect.com",
"honeywell.nl",
"honeywell.co.za",
"honeywell.com.au",
"honeywell.ca",
"alliedsignal.com",
"emsdss.com",
"primusepic.com",
"alarmnet-me.com",
"lebow.com",
"honeywell.ie",
"honeywell.jp",
"honeywell.com.br",
"trendcontrol.co.uk",
"honeywellforjaguar.co.uk",
"aviaso.com",
"skyforce.co.uk",
"newenglandinstruments.com",
"honeywell.fi",
"alarmnet.com",
"skyconnect.com",
"skyforceuk.com",
"securitex.com",
"missionready.com",
"honeywellaerospace.com",
"formation.com",
"aclon.com",
"electrocorp.com",
"ultrak.com",
"satcom1.com",
"hsmpats.com",
"myaerospace.com",
"emsglobaltracking.com",
"fascocontrols.com",
"honeywellnow.com",
"bendixbrakes.com",
"elmwoodsensors.com",
"ovationselect.com",
"honeywellbusinessaviation.com",
"iflyaspire.com",
"btrinc.com",
"honeywellspecialtymaterials.com",
"magneticsensors.com",
"activeye.com",
"egarrett.com",
"novar-eds.com",
"aviaso.co.uk",
"chadwick-helmuth.com",
"datainstruments.com",
"lebowproducts.com",
"honeywell-produktkatalog.de",
"honeywellforjaguar.com",
"hobbs-corp.com",
"emsgt.com",
"honeywellaes.com",
"honeywellbuildingsolutions.com",
"satcom1.aero",
"honeywell-building-solutions.de",
"lifesafetydistribution.com",
"godirect.com",
"garrettbulletin.com",
"yourhomeexpert.com",
"aerospacetrading.com",
"sensorsystems.com",
"wifithermostat.info",
"honeywell-fachseminare.de",
"hobbscorporation.com",
"kcl.hu",
"honeywell.sk",
"esser.info",
"inertialsensor.com",
"sensotec.com",
"notifier.com",
"honeywellgreer.com",
"smartact.de",
"honeywellfire.com",
"iris-systems.com",
"honeywell.ru",
"lxei.com",
"thermalswitch.com",
"hightempsolutions.com",
"aubetech.com",
"honeywell-haustechnik.de",
"careersathoneywell.com",
"garrettbyhoneywell.com",
"honeywell.in",
"honeywell.cn",
"honeywell.com.mx",
"kcp.com",
"satamatics.com",
"myflite.com"
],
"site": {
"title": "Honeywell",
"h1": null,
"metaDescription": " We are blending products with software solutions to link people and businesses to the information they need to be more efficient, safer and connected. ",
"metaAuthor": null,
"phoneNumbers": [
"+1 877-271-8620",
"+1 800-633-3991",
"+1 877-841-2840",
"+1 480-353-3020",
"+1 973-455-3388",
"+1 973-204-9621",
"+32 2 728 20 45",
"+32 476 20 90 19",
"+44 7794 007289",
"+86 21 2219 6509"
],
"emailAddresses": [
"domains#honeywell.com",
"HoneywellPrivacy#honeywell.com",
"rob.ferris#honeywell.com",
"ilse.schouteden#honeywell.com",
"chris.martin2#honeywell.com",
"Anahi.Espinosa#honeywell.com",
"lydia.lu#honeywell.com",
"madhavi.jha#Honeywell.com",
"Steven.Brecken#Honeywell.com",
"Steve.Brecken#Honeywell.com",
"Eugene.Tan#Honeywell.com"
]
},
"category": {
"sector": "Consumer Discretionary",
"industryGroup": "Automobiles & Components",
"industry": "Automotive",
"subIndustry": "Automotive",
"sicCode": "3714",
"naicsCode": null
},
"tags": [
"Automotive",
"Enterprise",
"B2B",
"Electrical"
],
"description": " We are blending products with software solutions to link people and businesses to the information they need to be more efficient, safer and connected. ",
"foundedYear": 1936,
"location": "115 Tabor Rd, Morris Plains, NJ 07950, USA",
"timeZone": "America/New_York",
"utcOffset": -4,
"geo": {
"streetNumber": "115",
"streetName": "Tabor Road",
"subPremise": null,
"city": "Morris Plains",
"postalCode": "07950",
"state": "New Jersey",
"stateCode": "NJ",
"country": "United States",
"countryCode": "US",
"lat": 40.8358456,
"lng": -74.4771042
},
"logo": "https://logo.clearbit.com/honeywell.com",
"facebook": {
"handle": "293855263965203",
"likes": null
},
"linkedin": {
"handle": "company/honeywell"
},
"twitter": {
"handle": "HoneywellNow",
"id": "257492733",
"bio": "Please visit us over at #Honeywell.",
"followers": 2322,
"following": 1,
"location": "Morris Plains, NJ",
"site": "https:",
"avatar":
},
"crunchbase": {
"handle": "organization/honeywell"
},
"emailProvider": false,
"type": "public",
"ticker": "HON",
"phone": "+1 973-455-2000",
"metrics": {
"alexaUsRank": 6045,
"alexaGlobalRank": 18053,
"googleRank": null,
"employees": 51779,
"employeesRange": "1000+",
"marketCap": 102920000000,
"raised": null,
"annualRevenue": 39302000000,
"fiscalYearEnd": 12
},
"indexedAt": "2017-07-11T23:00:41.115Z",
"tech": [
"crazy_egg",
"google_analytics",
"google_tag_manager",
"asp_net",
"mouseflow",
"marketo",
"go_squared",
"microsoft_exchange_online",
"outlook",
"recaptcha"
],
"parent": {
"domain": null
},
"similarDomains": [
"abb-livingspace.com",
"alerton.com",
"gereports.com",
"honeywellprocess.com",
"honeywelluk.com",
"johnsoncontrols.com",
"jpinstruments.com",
"lenel.com",
"maxitrol.com",
"nucalgon.com",
"schneider-electric.us",
"siemens.com"
]
}
If you look at the example up here you will see "domainAliases": [...]
and that is the part of the JSON I still need to parse.
This is the parse query for SQL that I already have:
SELECT *
, JSON_VALUE(JSONData,'$.name') AS CompanyName
, JSON_VALUE(JSONData,'$.category.sector') AS CategorySector
, JSON_VALUE(JSONData, '$.category.industryGroup') AS CategoryIndustryGroup
, JSON_VALUE(JSONData, '$.category.industry') AS CategoryIndustry
, JSON_VALUE(JSONData, '$.category.subIndustry') AS CategorySubIndustry
, JSON_VALUE(JSONData, '$.category.sicCode') AS CategorySicCode
, JSON_VALUE(JSONData, '$.category.naicsCode') AS CategoryNaicsCode
, JSON_VALUE(JSONData, '$.metrics.employees') AS EmployeesNumber
, JSON_VALUE(JSONData, '$.metrics.employeesRange') AS EmployeesRange
, JSON_VALUE(JSONData, '$.metrics.marketCap') AS MarketCap
, JSON_VALUE(JSONData, '$.metrics.annualRevenue') AS AnnualRevenue
, JSON_VALUE(JSONData, '$.similarDomains') AS SimilarDomains
FROM Domains;
I want this data ("domainAliases") to be stored in other table as the data in the upper query (I know that the parse query I already have is only a SELECT query but I also have an UPDATE version of the query).
Here is an example picture of how the finished product in a new table, same database should look. The left column is called Company Name, the 2nd column is called Domain Aliases:
Now WHERE is the JSON data stored? I have it stored in a Column called JSONData, tablename: Domains and all this is in a database called Domainbank. JSONData datatype is nvarchar(max).
I need the data to be grouped by the name of the company and next to the company name there should be aliases domain just like the picture example shows. Now keep in mind that I will run this query for 10k+ JSONDatas and the new table that is going to be created will be super huge but as long as it is all grouped by the company name with all the alias domains it should be good. Some of the JSONDatas did not return the API call in the correct format because they either didn't find the data or something else went wrong, so If the query doesnt find anyting under the "domainAliases": [...] or if it doesn't even find the "domainAliases": [...] then I don't need the company to appear on the new table.
So short recap: let's make a new table (Let's call it AliasDomains), find the data under "domainAliases": [...] also pull the company name out JSON_VALUE(JSONData,'$.name') AS CompanyName, Store the data in the new table as the picture example higher in the post and then group by CompanyName.
So, from your post I am not completely clear on what your question is, but I assume it is how to write some SQL statement to accomplish the above?
First of all, I'd say you should not care of the GROUP BY in the insert, do GROUP BY when retrieving data out of the table.
Having said that you can quite easily accomplish what you want with a SELECT from the Domains table together with a CROSS APPLY OPENJSON statement, like so:
INSERT INTO AliasDomains(CompanyName, DomainAliases)
SELECT JSON_VALUE(JSONData, '$.name'), value
FROM Domains
CROSS APPLY OPENJSON (JSONData, '$.domainAliases')
EDIT: Should probably add that value in the above statement is returned from OPENJSON, e.g. it references the values of the (in this case domainAliases) path you want.
Hope this helps?!
Niels

Which is the right approach to validate a mandatory field in front end or back end?

I am new in back end development. I have already write the API to update user info, whose request body like this -
{
"id": 26,
"email": "tom.richards#yahoo.com",
"firstName": "Tommy",
"lastName": "Richards",
"photoUrl": null,
"userAddress": [
{
"id": 8,
"type": "home",
"addressLine1": "DP Road",
"addressLine2": "Main Street",
"city": "Los Angel",
"state": "CA",
"country": "USA",
"postalCode":915890
},
{
"id": 25,
"type": "office",
"addressLine1": "Dr Red Road",
"addressLine2": null,
"city": "SA",
"state": "CA",
"country": "USA",
"postalCode":918950
}
]
}
Where should ideally validate the address type [in my case home or office] in front end[Web site or Phone] or back end [server side] or both side? Which is good approach to validate address type ? If we validate it on backend side, which will cause any performance issue ?
Note -
If the developer pass any string, the address type of like pass string will create in DB.
It's a good approach for the back end validate everything it gets from any web request, as you can't tell if its a good request coming from your web site or some other source (maybe even some malicious attack). The back-end must protect itself and validates everything it gets.
On top of it, on some (if not all) cases it's good to do some validations on the front end side too, one reason is that if you have a request that you know the back end is going to fail- save yourself the trouble and check it before you do an expensive request
Reasons
Front End Validation : Validation on the server, User has to wait for the response in case of the invalid data.
Backend Validation : To make sure that the data could not alter by the intruder.
Validate #database end : Like Mongoose provide the built in and custom validation.
As per the requirement we have to cross check the data before user profile update.

Access deeper elements of a JSON using postgresql 9.4

I want to be able to access deeper elements stored in a json in the field json, stored in a postgresql database. For example, I would like to be able to access the elements that traverse the path states->events->time from the json provided below. Here is the postgreSQL query I'm using:
SELECT
data#>> '{userId}' as user,
data#>> '{region}' as region,
data#>>'{priorTimeSpentInApp}' as priotTimeSpentInApp,
data#>>'{userAttributes, "Total Friends"}' as totalFriends
from game_json
WHERE game_name LIKE 'myNewGame'
LIMIT 1000
and here is an example record from the json field
{
"region": "oh",
"deviceModel": "inHouseDevice",
"states": [
{
"events": [
{
"time": 1430247045.176,
"name": "Session Start",
"value": 0,
"parameters": {
"Balance": "40"
},
"info": ""
},
{
"time": 1430247293.501,
"name": "Mission1",
"value": 1,
"parameters": {
"Result": "Win ",
"Replay": "no",
"Attempt Number": "1"
},
"info": ""
}
]
}
],
"priorTimeSpentInApp": 28989.41467999999,
"country": "CA",
"city": "vancouver",
"isDeveloper": true,
"time": 1430247044.414,
"duration": 411.53,
"timezone": "America/Cleveland",
"priorSessions": 47,
"experiments": [],
"systemVersion": "3.8.1",
"appVersion": "14312",
"userId": "ef617d7ad4c6982e2cb7f6902801eb8a",
"isSession": true,
"firstRun": 1429572011.15,
"priorEvents": 69,
"userAttributes": {
"Total Friends": "0",
"Device Type": "Tablet",
"Social Connection": "None",
"Item Slots Owned": "12",
"Total Levels Played": "0",
"Retention Cohort": "Day 0",
"Player Progression": "0",
"Characters Owned": "1"
},
"deviceId": "ef617d7ad4c6982e2cb7f6902801eb8a"
}
That SQL query works, except that it doesn't give me any return values for totalFriends (e.g. data#>>'{userAttributes, "Total Friends"}' as totalFriends). I assume that part of the problem is that events falls within a square bracket (I don't know what that indicates in the json format) as opposed to a curly brace, but I'm also unable to extract values from the userAttributes key.
I would appreciate it if anyone could help me.
I'm sorry if this question has been asked elsewhere. I'm so new to postgresql and even json that I'm having trouble coming up with the proper terminology to find the answers to this (and related) questions.
You should definitely familiarize yourself with the basics of json
and json functions and operators in Postgres.
In the second source pay attention to the operators -> and ->>.
General rule: use -> to get a json object, ->> to get a json value as text.
Using these operators you can rewrite your query in the way which returns correct value of 'Total Friends':
select
data->>'userId' as user,
data->>'region' as region,
data->>'priorTimeSpentInApp' as priotTimeSpentInApp,
data->'userAttributes'->>'Total Friends' as totalFriends
from game_json
where game_name like 'myNewGame';
Json objects in square brackets are elements of a json array.
Json arrays may have many elements.
The elements are accessed by an index.
Json arrays are indexed from 0 (the first element of an array has an index 0).
Example:
select
data->'states'->0->'events'->1->>'name'
from game_json
where game_name like 'myNewGame';
-- returns "Mission1"
select
data->'states'->0->'events'->1->>'name'
from game_json
where game_name like 'myNewGame';
This did help me