Read graph into NetworkX from JSON file

I have downloaded my Facebook data, which came as a set of JSON files.
Now I am trying to read these JSON files into NetworkX, but I can't find any function that reads a graph from a JSON file.
Another post explains how to read a graph from JSON, but there the JSON file had been created from NetworkX using json.dump().
In my case the data was downloaded from Facebook. Is there any function to read a graph from such a JSON file into NetworkX?

Unlike pandas tables or NumPy arrays, JSON files have no rigid structure, so nobody can write a function that converts an arbitrary JSON file into a NetworkX graph. If you want to construct a graph from JSON, you have to pick out the needed information yourself. Load the file with json.load (or json.loads on a string), extract the nodes and edges according to your own rules, and then put them into the graph with add_nodes_from and add_edges_from.
For example, for a Facebook JSON file you can write something like this:
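import json
import networkx as nx

# Load the downloaded Facebook export (the file name is an example).
with open('fbdata.json') as f:
    json_data = json.loads(f.read())

G = nx.DiGraph()
# Add one node per author name.
G.add_nodes_from(
    elem['from']['name']
    for elem in json_data['data']
)
# Add an edge from each author's id to the element's id.
# Note: the nodes above are keyed by name and these edges by id;
# pick a single key for your own data.
G.add_edges_from(
    (elem['from']['id'], elem['id'])
    for elem in json_data['data']
)
nx.draw(
    G,
    with_labels=True
)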
This draws the resulting graph with node labels.
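The extraction step can also be kept separate from NetworkX, which makes it easier to check the rules against a small sample first. The sketch below is only an illustration: the sample records, their field layout (`data`, `from`, `id`, `name`) and the helper name `extract_graph_parts` are assumptions modelled on the snippet above, not the actual layout of a Facebook export. It keys both nodes and edges by id and keeps the name as a node attribute.

```python
import json


def extract_graph_parts(json_data):
    """Return (nodes, edges) built from a dict shaped like the example above.

    nodes is a list of (node_id, attribute_dict) pairs,
    edges is a list of (source_id, target_id) pairs.
    """
    nodes = {}
    edges = []
    for elem in json_data["data"]:
        author = elem["from"]
        # Key the author node by id and keep the readable name as an attribute.
        nodes[author["id"]] = {"name": author["name"]}
        # The element itself (for example a post) becomes a node as well.
        nodes.setdefault(elem["id"], {})
        edges.append((author["id"], elem["id"]))
    return list(nodes.items()), edges


# Hypothetical sample data, only for illustration.
sample = json.loads("""
{"data": [
  {"id": "post_1", "from": {"id": "u1", "name": "Alice"}},
  {"id": "post_2", "from": {"id": "u1", "name": "Alice"}},
  {"id": "post_3", "from": {"id": "u2", "name": "Bob"}}
]}
""")

nodes, edges = extract_graph_parts(sample)
print(nodes)
print(edges)
```

The two lists can be passed straight to G.add_nodes_from(nodes) and G.add_edges_from(edges), since add_nodes_from accepts (node, attribute dict) pairs.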

Related

How to load JSON data (call from API) without key directly to S3 bucket using Python?

I am relatively new to AWS S3. I am calling an API and want to load the JSON data directly into an S3 bucket, from which Snowflake will read it. After some research I found that Boto3 can load data into S3 directly. The code will look something like the snippet below, but I am not sure what to put for the key, as no object has been created in my S3 bucket yet. Also, what is good practice for loading JSON data into S3? Do I need to encode the JSON data as 'UTF-8', as done here by SO user Uwe Bretschneider?
Thanks in advance!
Python code:
import json
import urllib.request

import boto3

data = urllib.request.urlopen("https://api.github.com/users?since=100").read()
output = json.loads(data)
print(output)  # Checking the data

s3 = boto3.client('s3')
s3.put_object(
    # Serialize the parsed data; json.dumps on the raw bytes in data would raise TypeError.
    Body=json.dumps(output),
    Bucket='I_HAVE_BUCKET_NAME',
    Key='your_key_here',
)
put_object creates a new object in the bucket, so the key does not need to exist beforehand.
The key is just like a file name in a file system. You can specify whatever name you like, such as my-data.json or some-dir/my-data.json. You can find out more at https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html.
As for the encoding, IMO it's always good to specify it explicitly, just to make sure your data is properly encoded too.
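Putting the two points together, a minimal sketch might look like the following. The helper names build_put_args and upload_json, the bucket name and the key are placeholders invented for illustration; only put_object and its Bucket, Key, Body and ContentType parameters come from Boto3. The upload helper imports boto3 lazily so the payload preparation can be tried without AWS credentials.

```python
import json


def build_put_args(payload, bucket, key):
    """Prepare keyword arguments for s3.put_object (hypothetical helper)."""
    # Encode explicitly so the stored bytes are known to be UTF-8.
    body = json.dumps(payload).encode("utf-8")
    return {
        "Bucket": bucket,
        "Key": key,  # any name you choose, e.g. "some-dir/my-data.json"
        "Body": body,
        "ContentType": "application/json",
    }


def upload_json(payload, bucket, key):
    """Upload payload as a new object; requires boto3 and AWS credentials."""
    import boto3  # imported here so the rest of the sketch runs without it

    s3 = boto3.client("s3")
    return s3.put_object(**build_put_args(payload, bucket, key))


# Placeholder data, bucket and key, only for illustration.
args = build_put_args([{"login": "example"}], "my-bucket", "github/users.json")
print(args["Key"], len(args["Body"]))
```

Calling upload_json(output, 'I_HAVE_BUCKET_NAME', 'github/users.json') would then create the object under that key.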

Open json.gz in python

I am trying to access a JSON object that is stored as a gzipped (.gz) file on a website. I would like to do this directly with urllib if possible.
This is what I have tried:
import gzip
import json
import urllib.request

# get the zip file
test = urllib.request.Request('http://files.tmdb.org/p/exports/movie_ids_01_27_2021.json.gz')
# unzip and read
with gzip.open(test, 'rt', encoding='UTF-8') as zipfile:
    my_object = json.loads(zipfile)
but this fails with:
TypeError: filename must be a str or bytes object, or a file
Is it possible to read the JSON directly like this (i.e. I don't want to download it locally)?
Thank you.
Use the requests library (pip install requests if you don't have it).
Then use the following code:
import requests
r = requests.get('http://files.tmdb.org/p/exports/movie_ids_01_27_2021.json.gz')
print(r.content)
r.content will be the binary content of the gzip file. It will consume 11352985 bytes of memory (10.8 MB), because the data needs to be kept somewhere.
Then you can use
gzip.decompress(r.content)
to decompress the gzip binary and get the data (after import gzip). The decompressed data will take up considerably more memory.
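The gzip.open call in the question fails because it is handed a Request object rather than a file name or a file object. A minimal sketch of reading the data without saving it locally follows, using only the standard library. It assumes (this is not confirmed in the thread) that the export holds one JSON object per line; iter_json_lines and fetch_json_lines are hypothetical helper names.

```python
import gzip
import io
import json
import urllib.request


def iter_json_lines(fileobj):
    """Yield one parsed object per non-empty line of a gzip-compressed stream."""
    # gzip.open accepts an existing binary file object as well as a file name.
    with gzip.open(fileobj, "rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:
                yield json.loads(line)


def fetch_json_lines(url):
    """Stream and decode a remote .json.gz file without writing it to disk."""
    with urllib.request.urlopen(url) as response:
        yield from iter_json_lines(response)


# Offline demonstration with in-memory bytes standing in for the download.
sample = gzip.compress(b'{"id": 1}\n{"id": 2}\n')
records = list(iter_json_lines(io.BytesIO(sample)))
print(records)
```

Because the records are yielded one at a time, the decompressed data never has to sit in memory all at once. If the file turns out to be a single JSON document instead, json.load on the opened text stream would replace the loop.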

Struggling to import NZ companies extract into R (json)

The NZ companies register offers a JSON file containing all publicly available business info. This file comes in at a whopping 40 GB, but there is also a smaller JSON file (~250 MB) containing data on unincorporated entities (sole traders etc.). As a warm-up exercise I thought I'd have a go at importing it into R to get an idea of size, scalability and computational requirements.
I'm having a lot of trouble importing the smaller JSON file into R. I've tried jsonlite, RJSONIO and rjson, but it appears that the file is written in an 'unorthodox' JSON format, hence the standard fromJSON commands are falling over. Below is a portion of the file (2 entities) which I've been trying to import into R: test.json
library(jsonlite)
json <- fromJSON("test.json", flatten=TRUE)
Error in parse_con(txt, bigint_as_char) :
parse error: invalid object key (must be a string)
zbn": [{ "entity": [{ { "australianBusinessNumbe
(right here) ------^
NB: JSONLint doesn't seem to think the file is valid JSON.
My thought is that I may need to use stream_in() or readLines(), but I am not very proficient with these functions. Any help or insight greatly appreciated. Cheers

Sample java spark program to read and load json file as a RDD

I am looking for a sample Java program that can read a local JSON file in Spark.
The example is part of the documentation at http://spark.apache.org/docs/latest/sql-programming-guide.html#json-datasets:
// sc is an existing JavaSparkContext.
SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);
// A JSON dataset is pointed to by path.
// The path can be either a single text file or a directory storing text files.
DataFrame people = sqlContext.read().json("examples/src/main/resources/people.json");
Alternatively, either define your own class for that specific JSON format and, after sc.textFile, map each line to an object of that class so that you get an RDD of those objects, or implement a JSON record reader that implements the RecordReader[Key, Value] interface.
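The first alternative (a class of your own plus a per-line mapping) can be illustrated in a few lines of plain Python. This is only a sketch of the idea, not Spark code: the Person class, the parse_person function and the sample lines are made up for illustration, and it assumes one JSON object per line. In PySpark the same function could be applied with sc.textFile(path).map(parse_person); the Java version would follow the same shape.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    """Hypothetical record class for one line of the JSON file."""
    name: str
    age: Optional[int] = None


def parse_person(line: str) -> Person:
    """Turn one JSON line into a Person, tolerating a missing age."""
    record = json.loads(line)
    return Person(name=record["name"], age=record.get("age"))


# Illustrative input, one JSON object per line.
lines = [
    '{"name": "Michael"}',
    '{"name": "Andy", "age": 30}',
    '{"name": "Justin", "age": 19}',
]
people = [parse_person(line) for line in lines]
print(people)
```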

How to properly configure json data to load into a Highstock multiple line graph

Looking for the correct JSON formatting for a Highstock multiple line graph:
Highstock is a great graphing API with lots of documentation. I just can't seem to find out how to format the JSON file. This is the graph that I am trying to set up:
http://www.highcharts.com/stock/demo/compare
This is the api document that discusses how to load data through json:
http://docs.highcharts.com/#preprocesssing-data-from-a-file
^--- Only problem here is that the json data example is set up for -Highcharts- not Highstock.
My json data follows this format. There should be six lines of data on the graph if it loads properly:
[timestamp, lineA, lineB, lineC, lineD, lineE, lineF]
Example:
[1366009207,-46.11,-19.71,-36.94,-20.21,-20.88,8.84]
[1366009217,-31.38,-21.74,-27.27,-24.64,-22.66,8.77]
etc...
Found the answer! Here is an example of their json data.
http://www.highcharts.com/samples/data/jsonp.php?filename=msft-c.json&callback=?
The key: each graph line must be loaded as a separate JSON file.
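Given rows in the combined format from the question, the per-line files can be produced with a small preprocessing step. The following is a rough sketch under stated assumptions: the function name split_series, the series names and the output file names are invented for illustration, and the timestamps are assumed to be Unix seconds, which are multiplied by 1000 because Highcharts datetime axes work in milliseconds.

```python
import json


def split_series(rows, names):
    """Split [timestamp, v1, v2, ...] rows into one [timestamp_ms, value] list per name."""
    series = {name: [] for name in names}
    for timestamp, *values in rows:
        for name, value in zip(names, values):
            # Assumption: input timestamps are in seconds; convert to milliseconds.
            series[name].append([timestamp * 1000, value])
    return series


# The two example rows from the question.
rows = [
    [1366009207, -46.11, -19.71, -36.94, -20.21, -20.88, 8.84],
    [1366009217, -31.38, -21.74, -27.27, -24.64, -22.66, 8.77],
]
names = ["lineA", "lineB", "lineC", "lineD", "lineE", "lineF"]

series = split_series(rows, names)
# One JSON document per graph line, keyed by a hypothetical file name.
payloads = {name + ".json": json.dumps(points) for name, points in series.items()}
print(payloads["lineA.json"])
```

Each entry in payloads can then be written to its own file or served from its own URL, one per graph line.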