Create table from JSON file in PostgreSQL - json

I'm just learning how to use PostgreSQL and JSON. I came across this great tutorial, but the syntax was made for SQL Server. I am trying to take the following JSON file and begin parsing it into a table with columns for squad, name, age, and powers.
The JSON code is
CREATE TABLE heroes (
id serial NOT NULL PRIMARY KEY,
info json NOT NULL
);
insert into heroes (info)
values (('
{
"squadName": "Super hero squad",
"homeTown": "Metro City",
"formed": 2016,
"secretBase": "Super tower",
"active": true,
"members": [
{
"name": "Molecule Man",
"age": 29,
"secretIdentity": "Dan Jukes",
"powers": [
"Radiation resistance",
"Turning tiny",
"Radiation blast"
]
},
{
"name": "Madame Uppercut",
"age": 39,
"secretIdentity": "Jane Wilson",
"powers": [
"Million tonne punch",
"Damage resistance",
"Superhuman reflexes"
]
},
{
"name": "Eternal Flame",
"age": 1000000,
"secretIdentity": "Unknown",
"powers": [
"Immortality",
"Heat Immunity",
"Inferno",
"Teleportation",
"Interdimensional travel"
]
}
]
}
'::json));
I can access the first level info of the JSON with no issue, eg
SELECT info -> 'squadName' AS squad from heroes; or SELECT info -> 'active' AS active from heroes;
However, when trying to dig deeper into the JSON, I end up with a single row, the correct squad name and NULL for member names:
SELECT info -> 'squadName' AS Squad,
info ->'members' ->> 'name' AS Name
from heroes;
The tutorial uses CROSS APPLY OPENJSON(..) to handle this, but I am not sure of what to do in PostgreSQL.
Any help would be appreciated. I am using this as a learning exercise.

You can do a lateral cross join of json_array_elements().
SELECT h.info->>'squadName' AS squad,
m.m->>'name' AS name
FROM heroes h
CROSS JOIN LATERAL json_array_elements(h.info->'members') m
(m);
db<>fiddle
But as a side note: The schema of the JSON looks pretty static to me. You should consider not to abuse JSON but use relational means like (lookup and/or linking) tables and columns instead.

Related

How to query deep nested json array from couchbase?

How to query deep nested json array from couchbase? I have the following documents in the couchbase bucket. I need to query to list all apps who have Permissions "android.permission.BATTERY_STATS"
How to query to list all apps with permissions from nested json array?
My Json Documents,
Document:1
{
"data": {
"com.facebook.katana": {
"studioId": "Facebook",
"screenshotUrls": [
"https://lh3.googleusercontent.com/JcPdPqplBxgG6dEQuxvuhO4jvE64AzxOCGWe8w55dMMeXU4rZs2MwpfGQTWvv6QR-g",
"https://lh3.googleusercontent.com/w0kSYY7jlPjGDd3KEVgDTpzUf4k67G7rfELOf6qj1SSC7n6Ege44vp8QkeX57ZM6bFU"
],
"primaryCategoryName": "Social",
"studioName": "Facebook",
"description": "Keeping up with friends is faster and easier than ever. Share updates and photos, engage with friends and Pages, and stay connected to communities important to you"
"starRatings": {
"1": 9706642,
"2": 3384344,
"3": 7224416,
"4": 12323358,
"5": 49438051
},
"numDownloads": "1,000,000,000+ downloads",
"price": 0,
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.ACCESS_FINE_LOCATION",
"android.permission.ACCESS_NETWORK_STATE",
"android.permission.ACCESS_WIFI_STATE",
"android.permission.AUTHENTICATE_ACCOUNTS",
"android.permission.BATTERY_STATS",
"android.permission.BLUETOOTH",
"android.permission.READ_PHONE_STATE",
"android.permission.READ_PROFILE",
"android.permission.READ_SMS",
"android.permission.READ_SYNC_SETTINGS",
"com.nokia.pushnotifications.permission.RECEIVE",
"com.sec.android.provider.badge.permission.READ",
"com.sec.android.provider.badge.permission.WRITE",
"com.sonyericsson.home.permission.BROADCAST_BADGE"
],
"appId": "com.facebook.katana",
"userRatingCount": 82076811,
"currency": "USD",
"iconUrl": "https://lh3.googleusercontent.com/ZZPdzvlpK9r_Df9C3M7j1rNRi7hhHRvPhlklJ3lfi5jk86Jd1s0Y5wcQ1QgbVaAP5Q=w100",
"releaseDate": "Nov 14, 2018",
"appName": "Facebook",
"studioUrl": "https://www.facebook.com/facebook",
"hasInAppPurchases": 1,
"bundleId": "com.facebook.katana",
"version": "198.0.0.53.101",
"commentCount": 22211936,
"fileSizeBytes": 58044343,
"formattedPrice": "",
"categoryIds": [
"APPLICATION",
"SOCIAL"
],
"tagline": "Find friends, watch live videos, play games & save photos in your social network",
"averageUserRating": 4.0770621299744,
"primaryCategoryId": "SOCIAL",
"videoScreenUrl": "https://lh4.ggpht.com/3RG_Y8JPK0Hcyui9OcapiONP_aDWKTRZ50wqZW_wbyOF0FamAYEYZfMTW9Cs1OT1kA"
}
},
"response_msec": 11,
"status": 200
}
Document:2
{
"data": {
"com.whatsapp": {
"studioId": "WhatsApp Inc.",
"screenshotUrls": [
"https://lh3.googleusercontent.com/MMue08byixTw74ST_VkNQDUUJBgVEbjNHDYLhIuHmYhMIMJIp3KjVlnhhqZQOZUtNt8",
"https://lh3.googleusercontent.com/foFmwvVGIwWWXJIukN7png18lFjFgbw3K7BqIm8G-jsFgSTVtkCa-dDkFApUzbvzIvbe"
],
"primaryCategoryName": "Communication",
"studioName": "WhatsApp Inc.",
"description": "WhatsApp Messenger is a FREE messaging app available for Android and other smartphones.
"starRatings": {
"1": 4713598,
"2": 1917919,
"3": 4962745,
"4": 11307648,
"5": 55392894
},
"numDownloads": "1,000,000,000+ downloads",
"price": 0,
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.ACCESS_FINE_LOCATION",
"android.permission.ACCESS_NETWORK_STATE",
"android.permission.ACCESS_WIFI_STATE",
"android.permission.AUTHENTICATE_ACCOUNTS",
"android.permission.BLUETOOTH",
"android.permission.BROADCAST_STICKY",
"android.permission.CAMERA",
"android.permission.CHANGE_WIFI_STATE",
"android.permission.GET_ACCOUNTS",
"android.permission.GET_TASKS",
"android.permission.INSTALL_SHORTCUT",
"android.permission.INTERNET",
"android.permission.MANAGE_ACCOUNTS",
"com.whatsapp.permission.REGISTRATION",
"com.whatsapp.permission.VOIP_CALL",
"com.whatsapp.sticker.READ"
],
"appId": "com.whatsapp",
"userRatingCount": 78294804,
"currency": "USD",
"iconUrl": "https://lh6.ggpht.com/mp86vbELnqLi2FzvhiKdPX31_oiTRLNyeK8x4IIrbF5eD1D5RdnVwjQP0hwMNR_JdA=w100",
"releaseDate": "Nov 5, 2018",
"appName": "WhatsApp Messenger",
"studioUrl": "http://www.whatsapp.com/",
"bundleId": "com.whatsapp",
"version": "2.18.341",
"commentCount": 19763316,
"fileSizeBytes": 23857699,
"formattedPrice": "",
"categoryIds": [
"APPLICATION",
"COMMUNICATION"
],
"tagline": "Simple. Personal. Secure.",
"averageUserRating": 4.4145045280457,
"primaryCategoryId": "COMMUNICATION",
"videoScreenUrl": "https://lh3.ggpht.com/aZrXAunkovhf0630Ykz1A7h2rzFX_dErd6fRiB7fNKU_DkNtetTquEra1bjc3sR2kLs"
}
},
"response_msec": 15,
"status": 200
}
As I say in the comment, this is a tricky one. I'm going to try to simplify your docs first, and then give an answer that I came up with.
You have two docs, which contain a nested object with a permissions array. Each nested object has a (potentially) different name. So, let's assume we have two simple docs like this:
id: doc1
{
"foo": {
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.BATTERY_STATS"
]
}
}
id: doc2
{
"bar": {
"permissions": [
"android.permission.ACCESS_FINE_LOCATION"
]
}
}
The first one has a "foo" nested object, the second has a "bar" nested object, but both nested objects have a "permissions" array. You want to find all the documents that have a permission of "android.permission.BATTERY_STATS".
I checked out the N1QL docs for anything that might be helpful, and I especially checked out the Object Functions section. There's a function called OBJECT_UNWRAP that might do the trick. From the docs: "This function enables you to unwrap an object without knowing the name in the name-value pair."
So, if I simply unwrap the above documents, then I can basically discard the "foo" and the "bar" parts.
SELECT META(b).id, OBJECT_UNWRAP(b).permissions
FROM sstbucket b
You can put unwrap a deeper nested object if necessary, but I'm trying to keep this simple.
The results of that query would be:
[
{
"id": "doc1",
"permissions": [
"android.permission.ACCESS_COARSE_LOCATION",
"android.permission.BATTERY_STATS"
]
},
{
"id": "doc2",
"permissions": [
"android.permission.ACCESS_FINE_LOCATION"
]
}
]
And now, it's a simple ANY/SATISFIES statement to find the document:
SELECT META(b).id
FROM sstbucket b
WHERE ANY p IN OBJECT_UNWRAP(b).permissions SATISFIES p == 'android.permission.BATTERY_STATS' END;
Which would return
[
{
"id": "doc1"
}
]
So, that works. What I don't know for sure is how to create a proper index for this particular query. I created a primary index just to make it work (CREATE PRIMARY INDEX ON sstbucket), but that's not going to perform very well.
You can use OBJECT functions (https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/objectfun.html) and Array indexing.
If you need document ID or whole document.
CREATE INDEX ix1 ON default ( DISTINCT ARRAY (DISTINCT ARRAY permision
FOR permision IN app.permissions END)
FOR app IN OBJECT_VALUES(data) END);
SELECT META(d).id FROM default AS d
WHERE ANY app IN OBJECT_VALUES(d.data)
SATISFIES (ANY permision IN app.permissions
SATISFIES permision = "android.permission.BATTERY_STATS"
END)
END;
If you need only appId and see if it uses covering index.
CREATE INDEX ix2 ON default ( ALL ARRAY (ALL ARRAY [permision, app.appId]
FOR permision IN app.permissions END)
FOR app IN OBJECT_VALUES(data) END);
SELECT [permision, app.appId][1] AS appId FROM default AS d
UNNEST OBJECT_VALUES(d.data) AS app
UNNEST app.permissions AS permision
WWHERE [permision, app.appId] >= ["android.permission.BATTERY_STATS"] AND
[permision, app.appId] < [SUCCESSOR("android.permission.BATTERY_STATS")] ;

BQ: How to UNNEST into new table

I'm exporting billing data from Google Cloud Platform to BigQuery (BQ).
The task at hand is to build a query that UNNEST relevant data to a new 'flat' table
The structure of the data in BQ is this:
[{
"billing_account_id": "01234-1778EC-123456",
"service": {
"id": "2062-016F-44A2",
"description": "Maps"
},
"sku": {
"id": "5D8F-0D17-AAA2",
"description": "Google Maps"
},
"usage_start_time": "2018-11-05 14:45:00 UTC",
"usage_end_time": "2018-11-05 15:00:00 UTC",
"project": {
"id": null,
"name": null,
"labels": []
},
"labels": [],
"system_labels": [],
"location": null,
"export_time": "2018-11-05 21:54:09.779 UTC",
"cost": "5.0",
"currency": "EUR",
"currency_conversion_rate": "0.87860000000017424",
"usage": {
"amount": "900.0",
"unit": "seconds",
"amount_in_pricing_units": "0.00034674063800277393",
"pricing_unit": "month"
},
"credits": "-1.25",
"invoice": {
"month": "201811"
}
},
I wish to schedule a job that builds a new table every day with just this schema
billing_account_id, usage_start_time, usage_end_time, cost, credit_amount
So far I'm at this:
select billing_account_id, usage_start_time, usage_end_time, cost, credits AS CREDITS from clientBilling.gcp_billing_export_v1_XXXX , UNNEST(credits);
But in the results credits are still nested and not 'flat' as I need. Any input is welcome, thanks! :)
Result
credits is an array of structs (each struct being "name, amount") - a "repeated" record in BigQuery - so you have to first unnest the array and then reference the struct member you want.
Thus:
UNNEST the credits record
Alias the credits.amount struct member as credit_amount
SELECT
billing_account_id,
usage_start_time,
usage_end_time,
cost,
credit.amount as credit_amount
FROM
`optimum-rock-145719.billing_export.gcp_billing_export_v1*`,
UNNEST(credits) as credit
This will return a result table with just the credits.amount column as credits_amount. You were doing step 1, but not step 2, and ignoring the unnested fields in your SELECT clause.

Parsing JSON in SQL Server 2017 (Clearbit API call)

I'm pulling some data into a database on my local server with API calls via Clearbit provider. Everything was OK regarding parsing the data with SQL Server 2017 until I hit a bump.
I will go straight on the example for easier understanding.
This is the example of an API call output in JSON
{
"id": "384dfe0d-5bba-445e-a390-2d946dc84a12",
"name": "Honeywell",
"legalName": "Honeywell International Inc",
"domain": "honeywell.com",
"domainAliases": [
"honeywell.at",
"honeywell.it",
"evohome.info",
"wifithermostat.com",
"emsaviation.com",
"mytotalconnect.com",
"honeywell.nl",
"honeywell.co.za",
"honeywell.com.au",
"honeywell.ca",
"alliedsignal.com",
"emsdss.com",
"primusepic.com",
"alarmnet-me.com",
"lebow.com",
"honeywell.ie",
"honeywell.jp",
"honeywell.com.br",
"trendcontrol.co.uk",
"honeywellforjaguar.co.uk",
"aviaso.com",
"skyforce.co.uk",
"newenglandinstruments.com",
"honeywell.fi",
"alarmnet.com",
"skyconnect.com",
"skyforceuk.com",
"securitex.com",
"missionready.com",
"honeywellaerospace.com",
"formation.com",
"aclon.com",
"electrocorp.com",
"ultrak.com",
"satcom1.com",
"hsmpats.com",
"myaerospace.com",
"emsglobaltracking.com",
"fascocontrols.com",
"honeywellnow.com",
"bendixbrakes.com",
"elmwoodsensors.com",
"ovationselect.com",
"honeywellbusinessaviation.com",
"iflyaspire.com",
"btrinc.com",
"honeywellspecialtymaterials.com",
"magneticsensors.com",
"activeye.com",
"egarrett.com",
"novar-eds.com",
"aviaso.co.uk",
"chadwick-helmuth.com",
"datainstruments.com",
"lebowproducts.com",
"honeywell-produktkatalog.de",
"honeywellforjaguar.com",
"hobbs-corp.com",
"emsgt.com",
"honeywellaes.com",
"honeywellbuildingsolutions.com",
"satcom1.aero",
"honeywell-building-solutions.de",
"lifesafetydistribution.com",
"godirect.com",
"garrettbulletin.com",
"yourhomeexpert.com",
"aerospacetrading.com",
"sensorsystems.com",
"wifithermostat.info",
"honeywell-fachseminare.de",
"hobbscorporation.com",
"kcl.hu",
"honeywell.sk",
"esser.info",
"inertialsensor.com",
"sensotec.com",
"notifier.com",
"honeywellgreer.com",
"smartact.de",
"honeywellfire.com",
"iris-systems.com",
"honeywell.ru",
"lxei.com",
"thermalswitch.com",
"hightempsolutions.com",
"aubetech.com",
"honeywell-haustechnik.de",
"careersathoneywell.com",
"garrettbyhoneywell.com",
"honeywell.in",
"honeywell.cn",
"honeywell.com.mx",
"kcp.com",
"satamatics.com",
"myflite.com"
],
"site": {
"title": "Honeywell",
"h1": null,
"metaDescription": " We are blending products with software solutions to link people and businesses to the information they need to be more efficient, safer and connected. ",
"metaAuthor": null,
"phoneNumbers": [
"+1 877-271-8620",
"+1 800-633-3991",
"+1 877-841-2840",
"+1 480-353-3020",
"+1 973-455-3388",
"+1 973-204-9621",
"+32 2 728 20 45",
"+32 476 20 90 19",
"+44 7794 007289",
"+86 21 2219 6509"
],
"emailAddresses": [
"domains#honeywell.com",
"HoneywellPrivacy#honeywell.com",
"rob.ferris#honeywell.com",
"ilse.schouteden#honeywell.com",
"chris.martin2#honeywell.com",
"Anahi.Espinosa#honeywell.com",
"lydia.lu#honeywell.com",
"madhavi.jha#Honeywell.com",
"Steven.Brecken#Honeywell.com",
"Steve.Brecken#Honeywell.com",
"Eugene.Tan#Honeywell.com"
]
},
"category": {
"sector": "Consumer Discretionary",
"industryGroup": "Automobiles & Components",
"industry": "Automotive",
"subIndustry": "Automotive",
"sicCode": "3714",
"naicsCode": null
},
"tags": [
"Automotive",
"Enterprise",
"B2B",
"Electrical"
],
"description": " We are blending products with software solutions to link people and businesses to the information they need to be more efficient, safer and connected. ",
"foundedYear": 1936,
"location": "115 Tabor Rd, Morris Plains, NJ 07950, USA",
"timeZone": "America/New_York",
"utcOffset": -4,
"geo": {
"streetNumber": "115",
"streetName": "Tabor Road",
"subPremise": null,
"city": "Morris Plains",
"postalCode": "07950",
"state": "New Jersey",
"stateCode": "NJ",
"country": "United States",
"countryCode": "US",
"lat": 40.8358456,
"lng": -74.4771042
},
"logo": "https://logo.clearbit.com/honeywell.com",
"facebook": {
"handle": "293855263965203",
"likes": null
},
"linkedin": {
"handle": "company/honeywell"
},
"twitter": {
"handle": "HoneywellNow",
"id": "257492733",
"bio": "Please visit us over at #Honeywell.",
"followers": 2322,
"following": 1,
"location": "Morris Plains, NJ",
"site": "https:",
"avatar":
},
"crunchbase": {
"handle": "organization/honeywell"
},
"emailProvider": false,
"type": "public",
"ticker": "HON",
"phone": "+1 973-455-2000",
"metrics": {
"alexaUsRank": 6045,
"alexaGlobalRank": 18053,
"googleRank": null,
"employees": 51779,
"employeesRange": "1000+",
"marketCap": 102920000000,
"raised": null,
"annualRevenue": 39302000000,
"fiscalYearEnd": 12
},
"indexedAt": "2017-07-11T23:00:41.115Z",
"tech": [
"crazy_egg",
"google_analytics",
"google_tag_manager",
"asp_net",
"mouseflow",
"marketo",
"go_squared",
"microsoft_exchange_online",
"outlook",
"recaptcha"
],
"parent": {
"domain": null
},
"similarDomains": [
"abb-livingspace.com",
"alerton.com",
"gereports.com",
"honeywellprocess.com",
"honeywelluk.com",
"johnsoncontrols.com",
"jpinstruments.com",
"lenel.com",
"maxitrol.com",
"nucalgon.com",
"schneider-electric.us",
"siemens.com"
]
}
If you look at the example up here you will see "domainAliases": [...]
and that is the part of the JSON I still need to parse.
This is the parse query for SQL that I already have:
SELECT *
, JSON_VALUE(JSONData,'$.name') AS CompanyName
, JSON_VALUE(JSONData,'$.category.sector') AS CategorySector
, JSON_VALUE(JSONData, '$.category.industryGroup') AS CategoryIndustryGroup
, JSON_VALUE(JSONData, '$.category.industry') AS CategoryIndustry
, JSON_VALUE(JSONData, '$.category.subIndustry') AS CategorySubIndustry
, JSON_VALUE(JSONData, '$.category.sicCode') AS CategorySicCode
, JSON_VALUE(JSONData, '$.category.naicsCode') AS CategoryNaicsCode
, JSON_VALUE(JSONData, '$.metrics.employees') AS EmployeesNumber
, JSON_VALUE(JSONData, '$.metrics.employeesRange') AS EmployeesRange
, JSON_VALUE(JSONData, '$.metrics.marketCap') AS MarketCap
, JSON_VALUE(JSONData, '$.metrics.annualRevenue') AS AnnualRevenue
, JSON_VALUE(JSONData, '$.similarDomains') AS SimilarDomains
FROM Domains;
I want this data ("domainAliases") to be stored in other table as the data in the upper query (I know that the parse query I already have is only a SELECT query but I also have an UPDATE version of the query).
Here is an example picture of how the finished product in a new table, same database should look. The left column is called Company Name, the 2nd column is called Domain Aliases:
Now WHERE is the JSON data stored? I have it stored in a Column called JSONData, tablename: Domains and all this is in a database called Domainbank. JSONData datatype is nvarchar(max).
I need the data to be grouped by the name of the company and next to the company name there should be aliases domain just like the picture example shows. Now keep in mind that I will run this query for 10k+ JSONDatas and the new table that is going to be created will be super huge but as long as it is all grouped by the company name with all the alias domains it should be good. Some of the JSONDatas did not return the API call in the correct format because they either didn't find the data or something else went wrong, so If the query doesnt find anyting under the "domainAliases": [...] or if it doesn't even find the "domainAliases": [...] then I don't need the company to appear on the new table.
So short recap: let's make a new table (Let's call it AliasDomains), find the data under "domainAliases": [...] also pull the company name out JSON_VALUE(JSONData,'$.name') AS CompanyName, Store the data in the new table as the picture example higher in the post and then group by CompanyName.
So, from your post I am not completely clear on what your question is, but I assume it is how to write some SQL statement to accomplish the above?
First of all, I'd say you should not care of the GROUP BY in the insert, do GROUP BY when retrieving data out of the table.
Having said that you can quite easily accomplish what you want with a SELECT from the Domains table together with a CROSS APPLY OPENJSON statement, like so:
INSERT INTO AliasDomains(CompanyName, DomainAliases)
SELECT JSON_VALUE(JSONData, '$.name'), value
FROM Domains
CROSS APPLY OPENJSON (JSONData, '$.domainAliases')
EDIT: Should probably add that value in the above statement is returned from OPENJSON, e.g. it references the values of the (in this case domainAliases) path you want.
Hope this helps?!
Niels

Convert to dataframe from JSON in R

I am facing issues with conversion of JSON to dataframe. I tried using libraries: jsonlite, RJSONIO,rjson.
I keep getting 'invalid character in the string' or unclosed string.
I am getting this data from a standard API so should be able to parse this json. Also, JSON editors can parse this data just fine.
My question is:
Is there a standard way using which I can make sure that my dataframe gets created and ignore above errors?
My best guess was to convert this data to JSON format using toJSON function from either of the libraries but if I use
newdata <- fromJSON(toJSON(data))
it somehow never gets converted to dataframe. Why is that?
If I instead use
newdata <- fromJSON(data)
I get a valid dataframe but sometimes because of above errors, it doesn't work which is what I am trying to know. How do you deal with this?
I have tried using this too freshDeskTicketsToDF <-
jsonlite::fromJSON(paste(readLines(textConnection(freshDeskTickets)), collapse=""))
It seemed to solve the problem but somewhere I got unclosed string with this method which I otherwise did not.
Are there better ways to deal with this in R?
Also, why is it that using toJSON on data passed to fromJSON never gets converted to a dataframe?
If I decide to take off html tags from the values assisgned to keys in JSON data. How does that work? Can I do that?
Edit: It looks like I get this error when I have <html tags> in my "string data" but I have them all across my JSON data and I don't get it every time.
How to deal with problems like this?
Note: this issue not specific to the data that I have. What I am looking for is ways to deal with problems like these and not one specific solution to a single problem.
I just realized thattoJSON converts R objects to JSON and not JSON to valid JSON. Is there a way to do it instead?
Sample data:
[
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022205968,
"responder_id": 1018353725,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Order number-100403891",
"to_emails": [
"contact#stalkbuylove.com"
],
"product_id": null,
"id": 174093,
"type": "Order Status query",
"due_by": "2016-09-02T08:57:30Z",
"fr_due_by": "2016-09-02T02:57:30Z",
"is_escalated": true,
"description": "<div dir=\"ltr\">Hi Team,<div><br></div>\n<div>I have ordered an item from your website, order number-100403891. I had called on August 30 2016 to postpone the delivery date. The guy i spoke from your end had confirmed that he will hold and push the delivery date to September 5 or 6 or 7 2016. And he confirmed the same.</div>\n<div>However, the guy I spoke to<b> did not do it</b>. </div>\n<div>I got to know it from ABHINAV from your customer care team who I spoke to on August 1st at 13:10. Hence I have put a request again and he said he will talk to some guys and give me the desired dates for delivery which is 5,6,7 of September 2016. </div>\n<div>Please let me know the concern on this and hope for a quick turn around.</div>\n<div><br></div>\n<div>Thank you,</div>\n<div>Hari,</div>\n<div>+91-9538199699.</div>\n</div>\n",
"description_text": "Hi Team,\r\n\r\nI have ordered an item from your website, order number-100403891. I had\r\ncalled on August 30 2016 to postpone the delivery date. The guy i spoke\r\nfrom your end had confirmed that he will hold and push the delivery date to\r\nSeptember 5 or 6 or 7 2016. And he confirmed the same.\r\nHowever, the guy I spoke to* did not do it*.\r\nI got to know it from ABHINAV from your customer care team who I spoke to\r\non August 1st at 13:10. Hence I have put a request again and he said he\r\nwill talk to some guys and give me the desired dates for delivery which is\r\n5,6,7 of September 2016.\r\nPlease let me know the concern on this and hope for a quick turn around.\r\n\r\nThank you,\r\nHari,\r\n+91-9538199699.\n",
"custom_fields": {
},
"created_at": "2016-09-01T07:51:18Z",
"updated_at": "2016-09-11T11:00:33Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022148025,
"responder_id": 1021145209,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Defect in d piece",
"to_emails": [
"contact#stalkbuylove.com"
],
"product_id": null,
"id": 174092,
"type": "Return",
"due_by": "2016-09-01T15:51:00Z",
"fr_due_by": "2016-09-01T09:51:00Z",
"is_escalated": false,
"description": "<div><br></div>\n<div><br></div>\n<div><br></div>\n<div><div style=\"font-size:75%;color:#575757\">Sent from Samsung Mobile</div></div>",
"description_text": "\n\n\nSent from Samsung Mobile",
"custom_fields": {
},
"created_at": "2016-09-01T07:51:00Z",
"updated_at": "2016-09-06T09:00:14Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022205895,
"responder_id": 1018353725,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Re: StalkBuyLove Return Request for order: 100404435",
"to_emails": [
"StalkBuyLove <contact#stalkbuylove.com>"
],
"product_id": null,
"id": 174088,
"type": "Refund query",
"due_by": "2016-09-01T15:43:56Z",
"fr_due_by": "2016-09-01T09:43:56Z",
"is_escalated": true,
"description": "<div>Hi. Can u deposit the amount if i giv u my account number. Right away i cant choose any other product frim ur site. <br><br>Sent from my iPhone</div>\n<div>\n<br>On Sep 1, 2016, at 12:38 PM, StalkBuyLove <contact#stalkbuylove.com> wrote:<br><br>\n</div>\n<blockquote><div>\n<div><img title=\"StalkBuyLove\" alt=\"Stalkbuylove\" src=\"http://www.stalkbuylove.com/launcher_icons/Newlogo_Stalkbuylove_240x50.png\"></div>\n<div>Hello <b>Anamica Aggarwal</b>,</div>\n<div>We have initiated a return request for order: <b>100404435</b> with the following products:</div>\n<table style=\"width:80%\">\r\n <tbody>\n<tr style=\"background-color:#B0C4DE\">\r\n <th>Item Name</th>\r\n <th>Sku</th>\r\n </tr>\n<tr>\r\n <td style=\"text-align:center\">Articuno Top</td>\r\n <td style=\"text-align:center\">IN1627MTOTOPPCH-198-18</td>\r\n </tr>\n</tbody>\n</table>\n<div>Lots of love,</div>\n<div>Team SBL</div>\n<img src=\"http://mandrillapp.com/track/open.php?u=30069003&id=bff0a5daee4a47fe9c6b04d2680c3c39\" height=\"1\" width=\"1\">\r\n</div></blockquote>",
"description_text": "Hi. Can u deposit the amount if i giv u my account number. Right away i cant choose any other product frim ur site. \n\nSent from my iPhone\n\n> On Sep 1, 2016, at 12:38 PM, StalkBuyLove <contact#stalkbuylove.com> wrote:\n> \n> \n> Hello Anamica Aggarwal,\n> \n> We have initiated a return request for order: 100404435 with the following products:\n> \n> Item Name\tSku\n> Articuno Top\tIN1627MTOTOPPCH-198-18\n> Lots of love,\n> \n> Team SBL\n> \n",
"custom_fields": {
},
"created_at": "2016-09-01T07:43:56Z",
"updated_at": "2016-09-11T11:00:32Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022205881,
"responder_id": 1021145209,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Details for order",
"to_emails": [
"contact#stalkbuylove.com"
],
"product_id": null,
"id": 174086,
"type": "Order Status query",
"due_by": "2016-09-01T15:42:50Z",
"fr_due_by": "2016-09-01T09:42:50Z",
"is_escalated": false,
"description": "<div><span></span></div>\n<div>\n<span>Hey can i get details of my order </span><br><span>How much more time will it take to get delivered? </span><br><span>Order no-</span><h2 style=\"font-weight: normal; margin: 0px;\"><font><span style=\"background-color: rgba(255, 255, 255, 0);\">100403837</span></font></h2>\n<span></span><br><span>Sent from my iPhone</span><br>\n</div>",
"description_text": "Hey can i get details of my order \r\nHow much more time will it take to get delivered? \r\nOrder no-\r\n100403837\r\n\r\nSent from my iPhone\n",
"custom_fields": {
},
"created_at": "2016-09-01T07:42:50Z",
"updated_at": "2016-09-06T09:00:13Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": true,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022204690,
"responder_id": 1021145209,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Refund",
"to_emails": [
"contact#stalkbuylove.com"
],
"product_id": null,
"id": 174080,
"type": "Refund query",
"due_by": "2016-09-01T15:36:26Z",
"fr_due_by": "2016-09-01T09:36:26Z",
"is_escalated": true,
"description": "<div>\r<br>Bank statement as asked for refund! Please intiate the proccedings asap!<br>\n</div>",
"description_text": "\r\nBank statement as asked for refund! Please intiate the proccedings asap!\n",
"custom_fields": {
},
"created_at": "2016-09-01T07:36:26Z",
"updated_at": "2016-09-07T08:00:19Z"
}
]
library(jsonlite)
df <- stream_in(file("~/data/sample.json"))
This stream_in function directly convert into datafram

Access deeper elements of a JSON using postgresql 9.4

I want to be able to access deeper elements stored in a json in the field json, stored in a postgresql database. For example, I would like to be able to access the elements that traverse the path states->events->time from the json provided below. Here is the postgreSQL query I'm using:
SELECT
data#>> '{userId}' as user,
data#>> '{region}' as region,
data#>>'{priorTimeSpentInApp}' as priotTimeSpentInApp,
data#>>'{userAttributes, "Total Friends"}' as totalFriends
from game_json
WHERE game_name LIKE 'myNewGame'
LIMIT 1000
and here is an example record from the json field
{
"region": "oh",
"deviceModel": "inHouseDevice",
"states": [
{
"events": [
{
"time": 1430247045.176,
"name": "Session Start",
"value": 0,
"parameters": {
"Balance": "40"
},
"info": ""
},
{
"time": 1430247293.501,
"name": "Mission1",
"value": 1,
"parameters": {
"Result": "Win ",
"Replay": "no",
"Attempt Number": "1"
},
"info": ""
}
]
}
],
"priorTimeSpentInApp": 28989.41467999999,
"country": "CA",
"city": "vancouver",
"isDeveloper": true,
"time": 1430247044.414,
"duration": 411.53,
"timezone": "America/Cleveland",
"priorSessions": 47,
"experiments": [],
"systemVersion": "3.8.1",
"appVersion": "14312",
"userId": "ef617d7ad4c6982e2cb7f6902801eb8a",
"isSession": true,
"firstRun": 1429572011.15,
"priorEvents": 69,
"userAttributes": {
"Total Friends": "0",
"Device Type": "Tablet",
"Social Connection": "None",
"Item Slots Owned": "12",
"Total Levels Played": "0",
"Retention Cohort": "Day 0",
"Player Progression": "0",
"Characters Owned": "1"
},
"deviceId": "ef617d7ad4c6982e2cb7f6902801eb8a"
}
That SQL query works, except that it doesn't give me any return values for totalFriends (e.g. data#>>'{userAttributes, "Total Friends"}' as totalFriends). I assume that part of the problem is that events falls within a square bracket (I don't know what that indicates in the json format) as opposed to a curly brace, but I'm also unable to extract values from the userAttributes key.
I would appreciate it if anyone could help me.
I'm sorry if this question has been asked elsewhere. I'm so new to postgresql and even json that I'm having trouble coming up with the proper terminology to find the answers to this (and related) questions.
You should definitely familiarize yourself with the basics of json
and json functions and operators in Postgres.
In the second source pay attention to the operators -> and ->>.
General rule: use -> to get a json object, ->> to get a json value as text.
Using these operators you can rewrite your query in the way which returns correct value of 'Total Friends':
select
data->>'userId' as user,
data->>'region' as region,
data->>'priorTimeSpentInApp' as priotTimeSpentInApp,
data->'userAttributes'->>'Total Friends' as totalFriends
from game_json
where game_name like 'myNewGame';
Json objects in square brackets are elements of a json array.
Json arrays may have many elements.
The elements are accessed by an index.
Json arrays are indexed from 0 (the first element of an array has an index 0).
Example:
select
data->'states'->0->'events'->1->>'name'
from game_json
where game_name like 'myNewGame';
-- returns "Mission1"
select
data->'states'->0->'events'->1->>'name'
from game_json
where game_name like 'myNewGame';
This did help me