I'm using SQLAlchemy to store binary tree data in the database:
class Distributor(Base):
    __tablename__ = "distributors"
    id = Column(Integer, primary_key=True)
    upline_id = Column(Integer, ForeignKey('distributors.id'))
    left_id = Column(Integer, ForeignKey('distributors.id'))
    right_id = Column(Integer, ForeignKey('distributors.id'))
How can I generate JSON "tree" data like the following from it:
{"id": 1, "children": [{"id": 2, "children": [{"id": 3}, {"id": 4}]}]}
I'm guessing you're asking how to store the data in a JSON format? Or are you trying to construct JSON from the standard relational data?
If the former, why don't you just create entries like:
{id: XX, parentId: XX, left: XX, right: XX, value: "foo"}
For each of the nodes, and then reconstruct the tree manually from the entries? Just start from the head (parentId == null) and then assemble the branches.
You could also add an additional identifier for the tree itself, in case you have multiple trees in the database. Then you would just query where the treeId was XXX, and then construct the tree from the entries.
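A minimal sketch of that manual reconstruction, assuming flat entries shaped like the example above (the build_tree helper and the sample nodes list are illustrative, not from the question):
def build_tree(nodes):
    """Rebuild the tree from flat {id, parentId, left, right, value} entries."""
    by_id = {n['id']: dict(n, children=[]) for n in nodes}
    root = None
    for node in by_id.values():
        if node['parentId'] is None:
            root = node  # the head of the tree
        else:
            by_id[node['parentId']]['children'].append(node)
    return root

nodes = [
    {'id': 1, 'parentId': None, 'left': 2, 'right': None, 'value': 'foo'},
    {'id': 2, 'parentId': 1, 'left': None, 'right': None, 'value': 'bar'},
]
print(build_tree(nodes))  # nested dict with node 2 under node 1's children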
I hesitate to provide this answer, because I'm not sure I really understand the problem you're trying to solve (a binary tree, JSON, SQLAlchemy: none of these is a problem in itself).
What you can do with this kind of structure is iterate over each row, adding edges as you go along. You'll start with what is basically a cache of objects, which will eventually become the tree you need.
import collections

idmap = collections.defaultdict(dict)
for distributor in session.query(Distributor):
    dist_dict = idmap[distributor.id]
    dist_dict['id'] = distributor.id
    dist_dict.setdefault('children', [])
    if distributor.left_id:
        dist_dict['children'].append(idmap[distributor.left_id])
    if distributor.right_id:
        dist_dict['children'].append(idmap[distributor.right_id])
So we've got a big collection of linked-up dicts that can represent the tree. We don't know which one is the root, though:
import json

root_dist = session.query(Distributor).filter(Distributor.upline_id == None).one()
json_data = json.dumps(idmap[root_dist.id])
I am new to Django and I am working on a project where I store a key-value dictionary as JSON in the database.
Later I want to show the list of packages on an HTML page, but I am not able to access those key-value pairs as a dictionary.
Here is my models.py:
class Packages(models.Model):
    package_ID = models.AutoField("Package ID", primary_key=True)
    package_Name = models.CharField("Package Name", max_length=30, null=False)
    attribute_values = models.CharField("Item Details JSON", max_length=500, null=False)
    package_Price = models.IntegerField("Package Price", null=False, default=0.00)
    quantity = models.IntegerField("Quantity", null=False, default=00)
    prod_ID = models.ForeignKey(Product, on_delete=models.CASCADE, verbose_name="Product ID (FK)")
Its data entries look something like this:
package_ID = 1
package_Name = "basic card"
attribute_values = {"sizes": "8.5 in. x 11 in.", "Colour": "Full-Color Front - Unprinted Back",}
package_Price = 200
quantity = 400
prod_ID = 1
package_ID = 2
package_Name = "best card"
attribute_values = {"sizes": "8.5 in. x 11 in.", "Colour": "Full-Color Front - Unprinted Back",}
package_Price = 200
quantity = 500
prod_ID = 1
The problem here is that the attribute_values field stores a JSON string, so I have to convert it to a dictionary first before I can access its values.
The steps I want to perform:
get all the packages having the same product key
get the attribute_values of those packages
convert that JSON into a dictionary
then display the attribute_values of each package separately in its own block in the HTML
I want output something like this: [expected-output screenshot omitted]
but I am getting this: [actual-output screenshot omitted]
With Django version >=3.1 one can simply use JSONField for storing JSON in databases.
It is supported for the following databases:
JSONField is supported on MariaDB 10.2.7+, MySQL 5.7.8+, Oracle,
PostgreSQL, and SQLite 3.9.0+ (with the JSON1 extension
enabled).
Using the JSONField should be pretty simple:
class Packages(models.Model):
    # other fields
    attribute_values = models.JSONField()
Let's say we have an instance of Packages; we can simply assign a dictionary (or any variable that is valid JSON) to attribute_values:
package.attribute_values = {"sizes": "8.5 in. x 11 in.", "Colour": "Full-Color Front - Unprinted Back",}
package.save()
This will be stored as JSON in the database.
When we access this in Python, it is automatically deserialized to its Python value:
package.attribute_values['new_key'] = 'new value'
package.save()
Considering your current problem with the text field, you simply need to convert your string to the equivalent Python objects. This can be done using json.loads:
import json
attribute_values_dictionary = json.loads(package.attribute_values)
You can also add a method to your model to do this for you:
import json

class Packages(models.Model):
    # Your fields

    def attribute_values_to_python(self):
        return json.loads(self.attribute_values)
Now in the view you can simply write:
attribute_values_dictionary = package.attribute_values_to_python()
In the template you can simply write:
{{ package.attribute_values_to_python }}
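To tie this back to the steps in the question, here is a minimal sketch of a view and template, assuming a hypothetical product_id URL parameter and a packages.html template (adapt the names to your project):
# views.py (sketch)
from django.shortcuts import render
from .models import Packages

def product_packages(request, product_id):
    packages = Packages.objects.filter(prod_ID=product_id)
    # Pair each package with its decoded attribute dictionary for the template.
    packages_with_attrs = [(p, p.attribute_values_to_python()) for p in packages]
    return render(request, 'packages.html', {'packages_with_attrs': packages_with_attrs})

{# packages.html (sketch) #}
{% for package, attrs in packages_with_attrs %}
  <h3>{{ package.package_Name }}</h3>
  {% for key, value in attrs.items %}
    <p>{{ key }}: {{ value }}</p>
  {% endfor %}
{% endfor %}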
It is possible to store json in postgres using the json data type. Check this tutorial for an introduction: http://www.postgresqltutorial.com/postgresql-json/
Consider I am storing the following json in such a field:
{
"address": {
"street1": "123 seasame st"
}
}
I want to store, separately, a reference to the street field. For example, I might have another object which uses data from this JSON structure and wants to store a reference to where it got the data. Maybe something like this:
class Product():
    __tablename__ = 'Address'
    street_1 = Column(String)
    data_source = ?
Now I could make data_source a string and just store namespaces like address.street, but if I did this, Postgres would have no idea what that means. Working with that in queries would mean parsing the string and other inefficient stuff. Does Postgres support referring to fields stored inside JSON data structures?
This question is related to JSON foreign keys in PostgreSQL , but in this case I don't necessarily want a fk relationship. I just want to create a reference, which is not necessarily enforced in the way a fk is.
update:
To be more clear, I want to reference the location of something in the json structure on another attribute and store that reference in a column. In the below code, Address.data_source is a reference to the location of the street data (for example address.street1 in this case)
class Address():
    __tablename__ = 'Address'
    street_1 = Column(String)
    sample_id = Column(Integer, ForeignKey('DataSample.uid'))
    data_source = ?

class DataSample():
    __tablename__ = 'DataSample'
    uid = Column(Integer, primary_key=True)
    data = Column(JSONB)
body = {
    "address": {
        "street1": "123 seasame st"
    }
}

datasample = DataSample(data=body)
address = Address(street_1=datasample.data['address']['street1'],
                  sample_id=datasample.uid,
                  data_source=?)
As clarified, the question is seeking a way to flexibly specify a path within a JSON object of a particular record. Keys are being handled in normal columns. Constraints on JSONB fields are not available, and there is no specific support for specifying paths within JSON objects.
I worked with the following in SQL Fiddle using PostgreSQL 9.6:
CREATE TABLE datasample (
    id integer PRIMARY KEY,
    data jsonb
);

CREATE TABLE address (
    id integer PRIMARY KEY,
    street_1 text,
    sample_id integer REFERENCES datasample (id),
    data_source text
);
INSERT INTO datasample(id, data)
VALUES (1, '{"address":{"street_1": "123 seasame st"}}');
INSERT INTO address(id, street_1, sample_id, data_source)
VALUES (1, '123 seasame st', 1, 'datasample.data->''address''->>''street_1''');
A typical lookup of the street address (needed to retrieve street_1) would resemble:
SELECT datasample.data->'address'->>'street_1'
FROM datasample
WHERE id=1;
There is no special Postgres type for identifying columns. Strings are the closest available, and you will need to retrieve the string (or array of strings, or object containing strings, if one of those simplifies parsing) and use it to build the query. In the first code block, I stored it as the (escaped) fragment of a query: 'datasample.data->''address''->>''street_1'''. Though longer, it requires only retrieval and unescaping to use in a new custom query. I did not find a way to use the string as a fragment within the same SQL statement, though it might be possible to combine it with other bits of text to form a full statement that could be run through EXECUTE.
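One variant worth sketching (not part of the original answer): if the path is stored as a Postgres text[] array rather than a quoted query fragment, the built-in #>> path-extraction operator can resolve it inside a single statement. The address2 table name below is hypothetical:
CREATE TABLE address2 (
    id integer PRIMARY KEY,
    street_1 text,
    sample_id integer REFERENCES datasample (id),
    data_source text[]
);

INSERT INTO address2 (id, street_1, sample_id, data_source)
VALUES (1, '123 seasame st', 1, ARRAY['address', 'street_1']);

-- Resolve the stored reference without building SQL dynamically:
SELECT d.data #>> a.data_source AS referenced_value
FROM address2 a
JOIN datasample d ON d.id = a.sample_id
WHERE a.id = 1;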
I have two MySQL tables: Owners & Pets.
Owner case class:
Owner(id: Int, name: String, age: Int)
Pet case class:
Pet(id: Int, ownerId: Int, type: String, name: String)
I want to create a list of OwnerAndPets out of those tables:
case class OwnerAndPets(ownerId: Int,
                        name: String,
                        age: String,
                        pets: List[Pet])
(It's for migration purposes; I want to move those tables into a MongoDB collection, whose documents would be OwnerAndPets objects.)
I have two issues:
When I use a join with Quill on Owner & Pet I get a list of tuples [(Owner, Pet)],
and if an owner has a few pets I will get:
[(Owner(1, "john", 30), Pet(3,1,"dog","max")),
(Owner(1, "john", 30), Pet(4,1,"cat","snow"))]
I need it as (Owner(1, "john", 30), [Pet(3,1,"dog","max"), Pet(4,1,"cat","snow")])
How can I make it like this?
When I use a join with Quill on Owner & Pet I will not get owners that don't have pets, and that's fine because that's what a join is supposed to do. But in my script, in that case, I would want to create an object like:
OwnerAndPets(Owner(2, "mark", 30), List[])
I would appreciate any help, thanks.
this is my join query:
query[Owner].join(query[Pet]).on((o, p) => o.id == p.o_id)
Your question highlights one of the major differences between FRM (Functional Relational Mapping) systems like Quill and Slick as opposed to ORMs like Hibernate. The purpose of FRM systems is not to build a particular domain-specific object hierarchy (e.g. OwnerAndPets) but rather to translate a single database query into some set of objects that can reasonably be pulled out of that single query's result set; this is typically a tuple. This means it is up to you to combine the (Owner, Pet) tuples into a single OwnerAndPets object in memory. Typically this can be done via the groupBy and map operators:
run(query[Owner].join(query[Pet]).on((o, p) => o.id == p.o_id))
  .groupBy(_._1)
  .map { case (owner, ownerPetList) =>
    OwnerAndPets(
      owner.id, owner.name, owner.age + "", // Not sure why you made 'age' a String in OwnerAndPets
      ownerPetList.map(_._2))
  }
That said, there are some database vendors (e.g. Postgres) that internally implement array types, so in some cases you can do this grouping at the database level, but this is not the case for MySQL and many others.
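For the second issue (owners without pets), here is a sketch using Quill's leftJoin, which yields (Owner, Option[Pet]) rows and keeps pet-less owners; it assumes the same field names as above:
run(query[Owner].leftJoin(query[Pet]).on((o, p) => o.id == p.o_id))
  .groupBy(_._1)
  .map { case (owner, rows) =>
    // flatMap drops the None pets, leaving an empty list for pet-less owners
    OwnerAndPets(owner.id, owner.name, owner.age + "", rows.flatMap(_._2))
  }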
I'm a new Spark user currently playing around with Spark and some big data, and I have a question related to Spark SQL, or more formally the SchemaRDD. I'm reading a JSON file containing data about some weather forecasts and I'm not really interested in all of the fields that I have ... I only want 10 fields out of the 50+ fields returned for each record. Is there a way (similar to filter) that I can use to specify the names of the fields that I want to remove in Spark?
Just a small descriptive example: consider I have the schema "Person" with 3 fields "Name", "Age", and "Gender", and I'm not interested in the "Age" field and would like to remove it. Can I use Spark somehow to do that? Thanks.
If you are using Spark 1.2, you can do the following (using Scala)...
If you already know what fields you want to use, you can construct the schema for these fields and apply this schema to the JSON dataset. Spark SQL will return a SchemaRDD. Then, you can register it and query it as a table. Here is a snippet...
// sc is an existing SparkContext.
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
// The schema is encoded in a string
val schemaString = "name gender"
// Import Spark SQL data types.
import org.apache.spark.sql._
// Generate the schema based on the string of schema
val schema =
  StructType(
    schemaString.split(" ").map(fieldName => StructField(fieldName, StringType, true)))
// Create the SchemaRDD for your JSON file "people" (every line of this file is a JSON object).
val peopleSchemaRDD = sqlContext.jsonFile("people.txt", schema)
// Check the schema of peopleSchemaRDD
peopleSchemaRDD.printSchema()
// Register peopleSchemaRDD as a table called "people"
peopleSchemaRDD.registerTempTable("people")
// Only values of name and gender fields will be in the results.
val results = sqlContext.sql("SELECT * FROM people")
When you look at the schema of peopleSchemaRDD (peopleSchemaRDD.printSchema()), you will only see the name and gender fields.
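The printed schema would look roughly like this (illustrative output):
root
 |-- name: string (nullable = true)
 |-- gender: string (nullable = true)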
Or, if you want to explore the dataset and determine what fields you want after you see all fields, you can ask Spark SQL to infer the schema for you. Then, you can register the SchemaRDD as a table and use projection to remove unneeded fields. Here is a snippet...
// Spark SQL will infer the schema of the given JSON file.
val peopleSchemaRDD = sqlContext.jsonFile("people.txt")
// Check the schema of peopleSchemaRDD
peopleSchemaRDD.printSchema()
// Register peopleSchemaRDD as a table called "people"
peopleSchemaRDD.registerTempTable("people")
// Project name and gender field.
sqlContext.sql("SELECT name, gender FROM people")
You can specify which fields you would like to have in the SchemaRDD. Below is an example: create a case class with only the fields that you need, read the data into an RDD, then pick out only the fields that you need (in the same order as you have specified them in the case class).
Sample Data: People.txt
foo,25,M
bar,24,F
Code:
case class Person(name: String, gender: String)

// In Spark 1.2, converting an RDD of case classes to a SchemaRDD implicitly
// requires this import (assuming sqlContext is in scope):
import sqlContext.createSchemaRDD

val people = sc.textFile("People.txt").map(_.split(",")).map(p => Person(p(0), p(2)))
people.registerTempTable("people")
I would like to generate a simple JSON file from a database.
I am not an expert in parsing JSON files with Python, nor in the NDB database engine or GQL.
What is the right query to search the data? see https://developers.google.com/appengine/docs/python/ndb/queries
How should I write the code to generate the JSON, using the same schema as the JSON described below?
Many thanks for your help
Model Class definition using NDB:
# coding=UTF-8
from google.appengine.ext import ndb
import logging

class Albums(ndb.Model):
    """Models an individual Event entry with content and date."""
    SingerName = ndb.StringProperty()
    albumName = ndb.StringProperty()
Expected output:
{
"Madonna": ["Madonna Album", "Like a Virgin", "True Blue", "Like a Prayer"],
"Lady Gaga": ["The Fame", "Born This Way"],
"Bruce Dickinson": ["Iron Maiden", "Killers", "The Number of the Beast", "Piece of Mind"]
}
For consistency, model names should be singular (Album, not Albums), and property names should be lowercase_with_underscores:
class Album(ndb.Model):
    singer_name = ndb.StringProperty()
    album_name = ndb.StringProperty()
To generate the JSON as described in your question:
1) Query the Album entities from the datastore:
albums = Album.query().fetch(100)
2) Iterate over them to form a Python data structure:
albums_dict = {}
for album in albums:
    if album.singer_name not in albums_dict:
        albums_dict[album.singer_name] = []
    albums_dict[album.singer_name].append(album.album_name)
3) Use the json.dumps() method to encode to JSON:
import json

albums_json = json.dumps(albums_dict)
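Putting the three steps together in a request handler, here is a minimal sketch assuming webapp2 (the handler name and route are hypothetical, and the Album model above is assumed to be in scope):
import json
import webapp2

class AlbumsJsonHandler(webapp2.RequestHandler):
    def get(self):
        albums = Album.query().fetch(100)
        albums_dict = {}
        for album in albums:
            # Group album names under each singer name.
            albums_dict.setdefault(album.singer_name, []).append(album.album_name)
        self.response.headers['Content-Type'] = 'application/json'
        self.response.write(json.dumps(albums_dict))

app = webapp2.WSGIApplication([('/albums.json', AlbumsJsonHandler)])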
Alternatively, you could use the built-in to_dict() method, though note that this produces a list of per-entity dicts rather than the grouped structure shown above:
albums = Album.query().fetch(100)
albums_json = json.dumps([a.to_dict() for a in albums])