json.load loads a string instead of json - json

I have a list of dictionaries written to a data.txt file. I expected json.load to give me back that list of dictionaries, but instead I seem to get a string.
For example, when I print(data[0]), I expected the first dictionary in the list, but I got "[" instead.
My code and the text file are below:
read_json.py
import json
with open('./data.txt', 'r') as json_file:
    data = json.load(json_file)
print(data[0])
data.txt
"[
{
"name": "Disney's Mulan (Mandarin) PG13 *",
"cast": [
"Jet Li",
"Donnie Yen",
"Yifei Liu"
],
"genre": [
"Action",
"Adventure",
"Drama"
],
"language": "Mandarin with no subtitles",
"rating": "PG13 - Some Violence",
"runtime": "115",
"open_date": "18 Sep 2020",
"description": "\u201cMulan\u201d is the epic adventure of a fearless young woman who masquerades as a man in order to fight Northern Invaders attacking China. The eldest daughter of an honored warrior, Hua Mulan is spirited, determined and quick on her feet. When the Emperor issues a decree that one man per family must serve in the Imperial Army, she steps in to take the place of her ailing father as Hua Jun, becoming one of China\u2019s greatest warriors ever."
},
{
"name": "The New Mutants M18",
"cast": [
"Maisie Williams",
"Henry Zaga",
"Anya Taylor-Joy",
"Charlie Heaton",
"Alice Braga",
"Blu Hunt"
],
"genre": [
"Action",
"Sci-Fi"
],
"language": "English",
"rating": "M18 - Some Mature Content",
"runtime": "94",
"open_date": "27 Aug 2020",
"description": "Five young mutants, just discovering their abilities while held in a secret facility against their will, fight to escape their past sins and save themselves."
}
]"
The above list is formatted properly for easy reading but the actual file is a single line and the different lines are denoted with "\n". Thanks for any help.

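Removing the double quotes that wrap the content of data.txt worked for me. With the quotes in place, the whole file is a single JSON string, so json.load returns a str and data[0] is its first character.
E.g. change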
"[{...},{...}]"
to
[{...},{...}]
Hope it helps!
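A file that is one long quoted line with literal \n sequences usually means the data was JSON-encoded twice before it was written. That is a guess about how data.txt was produced; the question does not confirm it. Under that assumption, a minimal sketch (with made-up sample data) of how the double encoding arises and how decoding twice recovers the list:

```python
import json

# Hypothetical sample standing in for the real movie list.
movies = [{"name": "The New Mutants M18", "runtime": "94"}]

# Encoding twice: the inner dumps() returns a str, and the outer dumps()
# serialises that str, wrapping it in quotes and escaping newlines as \n.
double_encoded = json.dumps(json.dumps(movies, indent=4))

first_pass = json.loads(double_encoded)  # a str, not a list
print(type(first_pass).__name__, first_pass[0])  # prints: str [

second_pass = json.loads(first_pass)  # the original list of dicts
print(second_pass[0]["name"])  # prints: The New Mutants M18
```

If you control the code that writes the file, the cleaner fix is to call json.dump(data, f) once on the list itself, so the file never contains the outer quotes.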

Related

How join/merge/update JSON dictionaries without overwriting data

I have a JSON list of dictionaries like so:
data = [{
"title": "Bullitt",
"release_year": "1968",
"locations": "1153-57 Taylor Street",
"fun_facts": "Embarcadero Freeway, which was featured in the film was demolished in 1989 because of structural damage from the 1989 Loma Prieta Earthquake)",
"production_company": "Warner Brothers / Seven Arts\nSeven Arts",
"distributor": "Warner Brothers",
"director": "Peter Yates",
"writer": "Alan R. Trustman",
"actor_1": "Steve McQueen",
"actor_2": "Jacqueline Bisset",
"actor_3": "Robert Vaughn",
"id": 498
},
{
"title": "Bullitt",
"release_year": "1968",
"locations": "John Muir Drive (Lake Merced)",
"production_company": "Warner Brothers / Seven Arts\nSeven Arts",
"distributor": "Warner Brothers",
"director": "Peter Yates",
"writer": "Alan R. Trustman",
"actor_1": "Steve McQueen",
"actor_2": "Jacqueline Bisset",
"actor_3": "Robert Vaughn",
"id": 499
}]
How do I combine these dictionaries without overwriting the data?
So, the final result which I am trying to get is:
data = {
"title": "Bullitt",
"release_year": "1968",
"locations": ["1153-57 Taylor Street", "John Muir Drive (Lake Merced)"],
"fun_facts": "Embarcadero Freeway, which was featured in the film was demolished in 1989 because of structural damage from the 1989 Loma Prieta Earthquake)",
"production_company": "Warner Brothers / Seven Arts\nSeven Arts",
"distributor": "Warner Brothers",
"director": "Peter Yates",
"writer": "Alan R. Trustman",
"actor_1": "Steve McQueen",
"actor_2": "Jacqueline Bisset",
"actor_3": "Robert Vaughn",
"id": [498, 499]
}
I looked into merging JSON objects but all I came across was overwriting data. I do not want to overwrite anything. Not really sure how to approach this problem.
Would I have to make an empty list for the locations field and search through the entire data set looking for titles that are the same and take their locations and append them to the empty list and then finally update the dictionary? Or is there a better way/best practice when it comes to something like this?
This is one approach using a simple iteration.
Ex:
result = {}
tolook = ('locations', 'id')
for d in data:
    if d['title'] not in result:
        result[d['title']] = {k: [v] if k in tolook else v for k, v in d.items()}
    else:
        for i in tolook:
            result[d['title']][i].append(d[i])
print(result)  # Or result.values()
Output:
{'Bullitt': {'actor_1': 'Steve McQueen',
             'actor_2': 'Jacqueline Bisset',
             'actor_3': 'Robert Vaughn',
             'director': 'Peter Yates',
             'distributor': 'Warner Brothers',
             'fun_facts': 'Embarcadero Freeway, which was featured in the film '
                          'was demolished in 1989 because of structural damage '
                          'from the 1989 Loma Prieta Earthquake)',
             'id': [498, 499],
             'locations': ['1153-57 Taylor Street',
                           'John Muir Drive (Lake Merced)'],
             'production_company': 'Warner Brothers / Seven Arts\nSeven Arts',
             'release_year': '1968',
             'title': 'Bullitt',
             'writer': 'Alan R. Trustman'}}
Python Dictionary
-----------------
Dictionaries store data values in key:value pairs. A dictionary is a collection that is changeable, does not allow duplicate keys, and (as of Python 3.7) preserves insertion order.
thisdict = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964
}
Python List
-----------
Lists are used to store multiple items in a single variable.
We can change, add, and remove items in a list after it has been created.
Since lists are indexed, lists can have items with the same value:
mylist = ["apple", "banana", "cherry"]
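Here's my logic, hope it helps.
-------------------------------
temp = {}
for dictionary in data:
    for key, value in dictionary.items():
        if key not in temp:
            # First time this key is seen: add the key/value pair.
            temp[key] = value
        elif isinstance(temp[key], list):
            # Already collecting several values: add only new ones.
            if value not in temp[key]:
                temp[key].append(value)
        elif temp[key] != value:
            # Second distinct value for this key: promote to a list.
            temp[key] = [temp[key], value]
        # Otherwise the value is already there: do nothing.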

Deserializing Nested JSON API response with Django

I'm pretty new to the DRF and serializing/deserializing. I'm slowly building a dashboard for my business during the coronavirus pandemic while learning to code. I am in a little deep, but after spending more than $10k on developers on Upwork and not really getting much of a result, I figured, what do I have to lose?
Our software provider has a full API for our needs https://developer.myvr.com/api/, but absolutely no dashboard to report statistics about our clients' reservation data.
The end result will be a synchronization of some of the data from their API to my database, which will be hosted on AWS. I chose to do it this way because I have to do some post-processing of the data from the API. For example, we need to calculate occupancy rates (which is not an endpoint), expenses from our accounting connection, and a few other small calculations for which the data is not already in the provided API. I originally wanted to use the data from the API solely, but I'm hesitant for the reasons above.
That's the backstory, here are the questions:
The API response is extremely complex and nested several levels deep. What is the best practice for replicating the structure of the data in my own database? Would I have to create models for each field manually?
Example response:
```
{
"uri": "https://api.myvr.com/v1/properties/b6b0f2fe278f612b/",
"id": "b6b0f2fe278f612b",
"key": "b6b0f2fe278f612b",
"accessDescription": null,
"accommodates": 11,
"active": false,
"addressOne": "11496 Zermatt Dr",
"addressTwo": null,
"allowTurns": true,
"amenities": "https://api.myvr.com/v1/property-amenities/?propertyId=b6b0f2fe278f612b",
"automaticallyApprove": false,
"baseNightlyRate": "395.00",
"baseRate": {
"uri": "https://api.myvr.com/v1/rates/660c299d4785c32e/",
"id": "660c299d4785c32e",
"key": "660c299d4785c32e",
"externalId": null,
"baseRate": true,
"changeoverDay": null,
"created": "2019-01-19T08:02:36Z",
"currency": "USD",
"endDate": "2020-01-18",
"minStay": 3,
"modified": "2019-01-19T08:02:36Z",
"monthly": 0,
"name": "Base Rate",
"weekNight": 39500,
"nightly": 39500,
"position": 0,
"property": {
"name": "API Demo Property",
"uri": "https://api.myvr.com/v1/properties/b6b0f2fe278f612b/",
"id": "b6b0f2fe278f612b",
"externalId": null,
"key": "b6b0f2fe278f612b",
"slug": "api-demo-property"
},
"ratePlan": {
"uri": "https://api.myvr.com/v1/rate-plans/862caa3f5267602d/",
"key": "862caa3f5267602d",
"name": "Default Rates for Property"
},
"repeat": true,
"startDate": "2020-01-18",
"weekend": 0,
"weekendNight": 0,
"weekly": 250000
},
"bathrooms": "4.0",
"bedrooms": 4,
"bookingUrl": "https://myvr.com/reservation/redirect/booking/b6b0f2fe278f612b/",
"checkInTime": "16:00:00",
"checkOutTime": "10:00:00",
"city": "Truckee",
"commissionStructure": null,
"countryCode": "US",
"created": "2016-01-19T00:01:48Z",
"currency": "USD",
"customFields": {},
"description": "Luxurious living, scenic mountain setting, entertainment galore. Located on a quiet street in Tahoe Donner, our well equipped modern home is nestled into the wilderness. A babbling creek greets visitors approaching the front step as it collects into a small pond with a cascading waterfall. <br/>\n<br/>\nInside, over 3,000 sqft of luxurious living space divides itself between two floors. On the first floor, a beautiful kitchen with granite counters, gas stove and stainless steel appliances opens to a large great room centered around a wood burning fireplace and featuring 30' soaring ceilings. A spacious loft overlooks the great room, showcasing a large poker/card table. Upstairs features a large entertainment room, complete with wet bar, shuffleboard table, and state-of-the-art television setup with surround sound. The scenic backyard is accessible from a large deck featuring a new hot tub with seating for 7.",
"externalId": null,
"feePlan": {
"uri": "https://api.myvr.com/v1/fee-plans/4d1c44383755051b/",
"key": "4d1c44383755051b",
"name": "Default Fees for Listing"
},
"headline": "Beautiful Four Bedroom Lake Front Property",
"houseRules": null,
"instantBookingsEnabled": false,
"lat": "39.3422523000",
"level": "unit",
"localAreaDescription": "Tahoe Donner is a year round activity resort. The amenities include private beach/boat launching facilities, pools, recreation center, tennis, horseback riding, golf, downhill skiing as well as cross country skiing. Truckee is a historical mining town-having a western feel but also has museums, theaters, fine dining plus 2 large supermarkets-all less than 3 miles from the house. Our home is also located within a 15 minute drive to 4 major ski resorts. Downtown Reno is a short 40 minute drive away for those seeking a night on the town or the thrill of a Nevada casino.",
"lon": "-120.2271947000",
"lowestNightlyRate": "395.00",
"manual": "",
"modified": "2019-10-18T17:18:43Z",
"name": "API Demo Property",
"owner": null,
"postalCode": "96161",
"ratePlan": {
"uri": "https://api.myvr.com/v1/rate-plans/862caa3f5267602d/",
"key": "862caa3f5267602d",
"name": "Default Rates for Property"
},
"ratePlanLocked": false,
"region": "CA",
"shortCode": "API",
"size": 3000,
"slug": "api-demo-property",
"suitableElderly": "yes",
"suitableEvents": "unknown",
"suitableGroups": "yes",
"suitableHandicap": "no",
"suitableInfants": "unknown",
"suitableKids": "yes",
"suitablePets": "no",
"suitableSmoking": "no",
"transitDescription": null,
"type": "house",
"weekendNights": [
5,
6
]
}
```
I think the best way to populate the database would be a custom management command that runs a one-off script. I've done this previously with another database, but I'm still stuck because I don't really want to write these models manually. I'm also concerned about what happens if a field is missing or the structure changes.
This project is definitely above my skills and extremely ambitious, but I would appreciate any feedback or advice anyone might have.
Thanks,
Darren
So I didn't really get any interest in this question, but I ended up working it out myself.
I hope someone who googles this finds it helpful.
import requests
from rest_framework.response import Response
from django.core.management.base import BaseCommand, CommandError
from reservation.models import Reservation
import time

MYVR_URL = 'https://api.myvr.com/'
MYVR_RESERVATION = 'v1/reservations/?limit=100'
headers = {
    'Authorization': 'Basic SOmeAPiCodeHeRe123=',
}


class Command(BaseCommand):
    help = 'Imports new properties and saves the objects in the database'

    def handle(self, *args, **options):
        url = MYVR_URL + MYVR_RESERVATION
        print("Populating Reservations")

        def looping_api(url, headers):
            # Follow the paginated API until there is no 'next' page.
            while url:
                r = requests.request("GET", url, headers=headers)
                payload = r.json()
                props_data = payload.get('results')
                start_time = time.time()
                for prop in props_data:
                    try:
                        # Look the row up by its MyVR key only. The other
                        # fields go in defaults, so a changed field updates
                        # the existing row instead of creating a duplicate.
                        Reservation.objects.update_or_create(
                            myvr_key=prop.get('key'),
                            defaults={
                                'adults': prop.get('adults'),
                                'children': prop.get('children'),
                                'checkin': prop.get('checkIn'),
                                'checkout': prop.get('checkOut'),
                                'checkinTime': prop.get('checkInTime'),
                                'checkoutTime': prop.get('checkOutTime'),
                                'guestFirstName': prop.get('firstName'),
                                'dateCreated': prop.get('created'),
                                'dateBooked': prop.get('dateBooked'),
                                'dateCancelled': prop.get('dateCanceled'),
                                'contact': prop.get('contact').get('name'),
                                'contact_key': prop.get('contact').get('key'),
                                'guest_type': prop.get('guestType'),
                                'property_name': prop.get('property').get('name'),
                                'property_key': prop.get('property').get('key'),
                                'source_code': prop.get('source').get('code'),
                                'source_name': prop.get('source').get('name'),
                                'total_due': prop.get('quote').get('totalDue'),
                                'total_refundables': prop.get(
                                    'quote').get('totalRefundableFees'),
                                'total_nonrefundables': prop.get(
                                    'quote').get('totalNonrefundableFees'),
                                'reference_id': prop.get('referenceId'),
                            },
                        )
                        print(f"Added obj {prop.get('key')}")
                    except AttributeError as error:
                        # A nested object (contact, quote, ...) was null.
                        print(f"{error} attribute is null or owner booking")
                url = payload.get('next')
                print(url)
                print(len(props_data))
                end_time = time.time()
                duration = (end_time - start_time)
                print(duration)

        looping_api(url, headers)
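One thing to watch in the command above: a chained lookup such as prop.get('contact').get('name') raises AttributeError when the API returns null for the nested object, and the except block then skips the whole reservation. A small helper can return None for the missing piece instead, so the rest of the row is still saved. This is only a sketch; the name dig and the sample record are made up for illustration and are not part of the MyVR API or of Django.

```python
def dig(mapping, *keys, default=None):
    """Walk nested dicts; return default if any level is missing or null."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict):
            return default
        current = current.get(key)
    return default if current is None else current


# Hypothetical record shaped like one item of the 'results' list.
reservation = {
    "key": "abc123",
    "contact": None,  # e.g. an owner booking with no contact
    "quote": {"totalDue": "395.00"},
}

print(dig(reservation, "quote", "totalDue"))  # prints: 395.00
print(dig(reservation, "contact", "name"))    # prints: None
```

In the command, dig(prop, 'contact', 'name') would then replace prop.get('contact').get('name'), provided the model field allows null values.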

Retrieving a specific object from JSON

This is a snippet of a JSON file that was returned by TMDB, and I'm trying to access the title of every object. I've tried the methods from this post: How to access specific value from a nested array within an object array.
"results": [
{
"vote_count": 2358,
"id": 283366,
"video": false,
"vote_average": 6.5,
"title": "Miss Peregrine's Home for Peculiar Children",
"popularity": 20.662756,
"poster_path": "/AvekzUdI8HZnImdQulmTTmAZXrC.jpg",
"original_language": "en",
"original_title": "Miss Peregrine's Home for Peculiar Children",
"genre_ids": [
18,
14,
12
],
"backdrop_path": "/9BVHn78oQcFCRd4M3u3NT7OrhTk.jpg",
"adult": false,
"overview": "A teenager finds himself transported to an island where he must help protect a group of orphans with special powers from creatures intent on destroying them.",
"release_date": "2016-09-28"
},
{
"vote_count": 3073,
"id": 381288,
"video": false,
"vote_average": 6.8,
"title": "Split",
"popularity": 17.488396,
"poster_path": "/rXMWOZiCt6eMX22jWuTOSdQ98bY.jpg",
"original_language": "en",
"original_title": "Split",
"genre_ids": [
27,
53
],
"backdrop_path": "/4G6FNNLSIVrwSRZyFs91hQ3lZtD.jpg",
"adult": false,
"overview": "Though Kevin has evidenced 23 personalities to his trusted psychiatrist, Dr. Fletcher, there remains one still submerged who is set to materialize and dominate all the others. Compelled to abduct three teenage girls led by the willful, observant Casey, Kevin reaches a war for survival among all of those contained within him \u2014 as well as everyone around him \u2014 as the walls between his compartments shatter apart.",
"release_date": "2016-11-15"
},
var titles = results.map(function extract(item){return item.title})
The map function iterates through the array and builds the resulting array by applying the extract function on each item.
for (var i = 0; i < obj.results.length; i++) {
    console.log(obj.results[i].title);
}
First, we get the results key, and then, iterate over it since it is an array.
Results is nothing more than an array of objects, there's nothing nested.
You could use .forEach()
results.forEach(function(item) {
    console.log(item.title);
});

Insert mongodb to my json file

"Title": "Anti-Mage",
"Url": "antimage",
"ID": 1,
"Lore": "The monks of Turstarkuri watched the rugged valleys below their mountain monastery as wave after wave of invaders swept through the lower kingdoms. Ascetic and pragmatic, in their remote monastic eyrie they remained aloof from mundane strife, wrapped in meditation that knew no gods or elements of magic. Then came the Legion of the Dead God, crusaders with a sinister mandate to replace all local worship with their Unliving Lord's poisonous nihilosophy. From a landscape that had known nothing but blood and battle for a thousand years, they tore the souls and bones of countless fallen legions and pitched them against Turstarkuri. The monastery stood scarcely a fortnight against the assault, and the few monks who bothered to surface from their meditations believed the invaders were but demonic visions sent to distract them from meditation. They died where they sat on their silken cushions. Only one youth survived - a pilgrim who had come as an acolyte, seeking wisdom, but had yet to be admitted to the monastery. He watched in horror as the monks to whom he had served tea and nettles were first slaughtered, then raised to join the ranks of the Dead God's priesthood. With nothing but a few of Turstarkuri's prized dogmatic scrolls, he crept away to the comparative safety of other lands, swearing to obliterate not only the Dead God's magic users - but to put an end to magic altogether. ",
"SuggestedRoleLevels": {
"Carry": 2,
"Escape": 3
},
"Enabled": true,
"Side": "Radiant",
"Aliases": [
"am"
],
"AttackCapabilities": "DOTA_UNIT_CAP_MELEE_ATTACK",
"PrimaryAttribute": "DOTA_ATTRIBUTE_AGILITY",
"Initial": {
"Strength": 22,
"StrengthGain": 1.2,
"Agility": 22,
"AgilityGain": 2.8,
"Intelligence": 15,
"IntelligenceGain": 1.8,
"Health": 568,
"HealthRegen": 0.91,
"Mana": 195,
"ManaRegen": 0.61,
"Armor": 2.08,
"MagicResistance": 0.25,
"MinDamage": 49,
"MaxDamage": 53,
"AvgDamage": 51,
"IncreasedAttackSpeed": 22,
"BaseAttackTime": 1.45,
"AttackTime": 1.19,
"AttacksPerSecond": 0.84,
"AttackAnimationPoint": 0.3,
"AttackAcquisitionRange": 600,
"AttackRange": 128,
"VisionDayRange": 1800,
"VisionNightRange": 800,
"ProjectileSpeed": 0,
"MovementSpeed": 315,
"TurnRate": 0.5
},
"Abilities": [
{
"Title": "Mana Break",
"Url": "mana_break",
"HeroAbilityUrl": "antimage_mana_break",
"ID": 5003,
"Description": "Burns an opponent's mana on each attack. Mana Break deals 60% of the mana burned as damage to the target. Mana Break is a Unique Attack Modifier, and does not stack with other Unique Attack Modifiers.",
"Lore": "A modified technique of the Turstarkuri monks' peaceful ways is to turn magical energies on their owner.",
"Notes": [
"Mana Burn is blocked by spell immunity.",
"You can lifeleech the damage dealt by this skill with a Lifesteal aura."
],
"Type": "DOTA_ABILITY_TYPE_BASIC",
"Behavior": [
"DOTA_ABILITY_BEHAVIOR_PASSIVE"
],
"DamageType": "DAMAGE_TYPE_PHYSICAL",
"AbilitySpecial": [
{
"Name": "Damage Per_burn",
"Url": "damage_per_burn",
"Value": [
0.6
]
},
{
"Name": "Mana Burned Per Hit",
"Url": "mana_per_hit",
"ValueType": "FIXED",
"Value": [
28,
40,
52,
64
]
}
]
That is a slice of my JSON file. I want to insert it into a MongoDB database called dota2.
I tried this command:
mongoimport --jsonArray -d dota2 -c docs --file heroes.json
and the output was:
2015-10-31T12:09:35.755-0200 connected to: localhost
2015-10-31T12:09:35.842-0200 imported 110 documents
But I can't see my data. The command db.dota.find()
returns nothing.
Can someone help me? Yes, I am a newbie with MongoDB. :/
As per your command, you are inserting the data into the collection named docs in the database dota2, but you are querying a collection named dota.
So you should query the docs collection instead:
use dota2;
db.docs.find();
OK, I understand: my data is in MongoDB, but in the docs collection :) It works!

Pig : result of json loader empty

I'm using the CDH5 QuickStart VM and I have a file like this (excerpt):
{"user_id": "kim95",
"type": "Book",
"title": "Modern Database Systems: The Object Model, Interoperability, and
Beyond.",
"year": "1995",
"publisher": "ACM Press and Addison-Wesley",
"authors": {},
"source": "DBLP"
}
{"user_id": "marshallo79",
"type": "Book",
"title": "Inequalities: Theory of Majorization and Its Application.",
"year": "1979",
"publisher": "Academic Press",
"authors": {("Albert W. Marshall"), ("Ingram Olkin")},
"source": "DBLP"
}
and I used this script:
books = load 'data/book-seded.json'
using JsonLoader('t1:tuple(user_id:
chararray,type:chararray,title:chararray,year:chararray,publisher:chararray,source:chararray,authors:bag{T:tuple(author:chararray)})');
STORE books INTO 'book-no-seded.tsv';
The script runs, but the generated file is empty. Do you have any idea why?
In the end, only the following input format worked. If I add or remove a single space relative to this layout, I get an error. (I also added a "name" key to the author tuples, wrote null where the authors list was empty, and swapped the order of authors and source, but even without those changes the original file would still fail.)
{"user_id": "kim95", "type": "Book","title": "Modern Database Systems: The Object Model, Interoperability, and Beyond.", "year": "1995", "publisher": "ACM Press and Addison-Wesley", "authors": [{"name":null}], "source": "DBLP"}
{"user_id": "marshallo79", "type": "Book", "title": "Inequalities: Theory of Majorization and Its Application.", "year": "1979", "publisher": "Academic Press", "authors": [{"name":"Albert W. Marshall"},{"name":"Ingram Olkin"}], "source": "DBLP"}
And the working script is this one :
books = load 'data/book-seded-workings-reduced.json'
using JsonLoader('user_id:chararray,type:chararray,title:chararray,year:chararray,publisher:chararray,authors:{(name:chararray)},source:chararray');
STORE books INTO 'book-table.csv'; -- whether .tsv or .csv
Try STORE books INTO 'book-no-seded.tsv' USING org.apache.pig.piggybank.storage.JsonStorage();
You need to be sure that the LOAD schema is correct. You can run DUMP books for a quick check.
We had to be careful with the input data and the schema when we used the Pig JsonLoader for this tutorial http://gethue.com/hadoop-tutorials-ii-1-prepare-the-data-for-analysis/.
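The input that finally worked above has each record on a single line, while the original file spread every record over several lines. Assuming the source file is otherwise valid JSON (the sample in the question is not quite, so it would need fixing first), a small pre-processing step can rewrite it as one object per line before Pig reads it. This is a sketch, not something taken from the Pig documentation; the function name iter_json_objects and the sample text are made up for illustration.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Hypothetical multi-line input, shortened from the records in the question.
pretty = '''{"user_id": "kim95",
 "year": "1995",
 "authors": []
}
{"user_id": "marshallo79",
 "year": "1979",
 "authors": [{"name": "Albert W. Marshall"}, {"name": "Ingram Olkin"}]
}'''

# json.dumps without indent never emits newlines, so each record is one line.
lines = [json.dumps(obj) for obj in iter_json_objects(pretty)]
print("\n".join(lines))
```

Writing lines to a file and pointing the load statement at that file gives JsonLoader one record per line; the schema string still has to match the field order in each record.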