I've managed to get this far:
import json
import pymongo
from bson import json_util
from pymongo import Connection

c = Connection()
db = c.test
collection = db.messages
for doc in collection.find({"mailbox": "bass-e"}, {"body": "true"}):
    print doc
But what comes out is a JSON object. What I want is just the data. What packages/methods do I need to use to get just the text in the body field?
import pymongo
from pymongo import Connection

c = Connection()
db = c.test
collection = db.messages

messages = []
for doc in collection.find({"mailbox": "bass-e"}, {"body": "true"}).limit(3):
    messages.append(doc["body"])

for x in messages:
    print x
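For reference, a minimal sketch of the same query against the current PyMongo API (Connection was removed in PyMongo 3 in favour of MongoClient; the database, collection, and filter are the ones from the question):

from pymongo import MongoClient

client = MongoClient()  # defaults to mongodb://localhost:27017
collection = client.test.messages

# project only the "body" field; _id is returned by default, so exclude it
for doc in collection.find({"mailbox": "bass-e"}, {"body": True, "_id": False}):
    print(doc["body"])  # just the text, not the wrapping document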
I am trying to add a JSON file to my Postgres database using SQLAlchemy and Flask, but I'm getting the error below. I've created the Farmers table in pgAdmin; now I'm trying to add the JSON data in f1.
Error:
line 20, in insert_data
f1 = Farmers(farmers={{"W":1000000,"Z":22758,"J1_I":0.66},{"W":3500000,"Z":21374,"J1_I":2.69},{"W":2500000,"Z":14321,"J1_I":0.76},{"W":2500000,"Z":14321,"J1_I":0.76}})
TypeError: unhashable type: 'dict'
The upload.py file is:
import os
import flask
from flask_sqlalchemy import SQLAlchemy
from flask import Flask, jsonify, send_from_directory
from sqlalchemy.dialects.postgresql import JSON

APP = Flask(__name__)
APP.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://postgres:admin@localhost:5432/flaskwebapp'
db = SQLAlchemy(APP)

class Farmers(db.Model):
    id = db.Column(db.Integer, primary_key=True, index=True)
    W = db.Column(db.Integer)
    Z = db.Column(db.Integer)
    J1_I = db.Column(db.Float)

#db.create_all()

def insert_data():
    f1 = Farmers(farmers={{"W":1000000,"Z":22758,"J1_I":0.66},{"W":3500000,"Z":21374,"J1_I":2.69},{"W":2500000,"Z":14321,"J1_I":0.76},{"W":2500000,"Z":14321,"J1_I":0.76}})
    db.session.add(f1)
    db.session.commit()
    print('Data inserted to DB!')

insert_data()
The JSON file with the array of objects:
{
    "farmers": [
        {
            "W": 1000000,
            "Z": 22758,
            "J1_I": 0.66
        },
        {
            "W": 3500000,
            "Z": 21374,
            "J1_I": 2.69
        },
        {
            "W": 2500000,
            "Z": 14321,
            "J1_I": 0.76
        },
        {
            "W": 2500000,
            "Z": 14321,
            "J1_I": 0.76
        }
    ]
}
Any ideas on how to fix this?
You are trying to store multiple objects in your database at once. To do that, you have to create one Farmers object for each element of the list; you can then store these objects in the database.
def insert_data():
    # The list of objects with their associated data.
    farmers = [
        {"W": 1000000, "Z": 22758, "J1_I": 0.66},
        {"W": 3500000, "Z": 21374, "J1_I": 2.69},
        {"W": 2500000, "Z": 14321, "J1_I": 0.76},
        {"W": 2500000, "Z": 14321, "J1_I": 0.76}
    ]
    # An object is created for each record and added to a list.
    farmer_objects = [Farmers(**data) for data in farmers]
    # The objects created are added to the session.
    db.session.add_all(farmer_objects)
    # The session is committed and the objects it contains are saved.
    db.session.commit()
By the way, the JSON file contains an object holding a list of objects with the data; the list sits under the key "farmers", so you have to extract that list from the loaded file first.
As for the error itself: {{...}, {...}} in your call is a Python set literal containing dicts, and dicts are not hashable, so the set cannot be built. Hence TypeError: unhashable type: 'dict'.
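Putting the two points together, a minimal sketch (the file name farmers.json is an assumption; this reuses Farmers and db from the upload.py above):

import json

# load the file and pull the list out from under the "farmers" key
with open("farmers.json") as fh:
    payload = json.load(fh)

# one Farmers row per dict in the list, then save them all in one commit
farmer_objects = [Farmers(**record) for record in payload["farmers"]]
db.session.add_all(farmer_objects)
db.session.commit()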
I'm working on a Spark Streaming app that continuously reads data from localhost:9098. Is there a way to change localhost into <users/folder/path> so that it reads data from a folder path or JSON files automatically?
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.log4j.Logger
import org.apache.log4j.Level

object StreamingApplication extends App {

  Logger.getLogger("Org").setLevel(Level.ERROR)

  // creating spark streaming context
  val sc = new SparkContext("local[*]", "wordCount")
  val ssc = new StreamingContext(sc, Seconds(5))

  // lines is a DStream
  val lines = ssc.socketTextStream("localhost", 9098)

  // words is a transformed DStream
  val words = lines.flatMap(x => x.split(" "))

  // bunch of transformations
  val pairs = words.map(x => (x, 1))
  val wordsCount = pairs.reduceByKey((x, y) => x + y)

  // print is an action
  wordsCount.print()

  // start the streaming context
  ssc.start()
  ssc.awaitTermination()
}
Basically, I need help modifying the code below:
val lines = ssc.socketTextStream("localhost", 9098)
to this:
val lines = ssc.socketTextStream("<folder path>")
FYI, I'm using IntelliJ IDEA to build this.
I'd recommend reading the Spark documentation, especially the Scaladoc.
There seems to be a fileStream method:
https://spark.apache.org/docs/2.4.0/api/java/org/apache/spark/streaming/StreamingContext.html
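In particular, textFileStream watches a directory and streams any file newly created in it, which matches what you describe; in your Scala code the change would be ssc.textFileStream("<folder path>"). A rough sketch of the same pipeline, shown here in PySpark (the directory path is a placeholder):

from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("local[*]", "wordCount")
ssc = StreamingContext(sc, 5)  # 5-second batches, as in the question

# textFileStream monitors the directory and reads each new file as text
lines = ssc.textFileStream("/users/folder/path")  # placeholder path

counts = (lines.flatMap(lambda line: line.split(" "))
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))
counts.pprint()

ssc.start()
ssc.awaitTermination()

Note that the directory is watched for new files only; files already present when the stream starts are ignored, so files should be moved into the directory atomically.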
I am trying to create a DataFrame column with JSON data that does not have a fixed schema. I am trying to write it in its original form as a map/object but am getting various errors.
I don't want to convert it to a string, as I need to write this data in its original form to the file.
Later this file is used for JSON processing, so the original structure should not be compromised.
Currently, when I try writing the data to a file, it contains all the escape characters and the entire JSON is treated as a string instead of a complex type, e.g.:
{"field1":"d1","field2":"app","value":"{\"data\":\"{\\\"app\\\":\\\"am\\\"}\"}"}
You could try to make up a schema for the JSON file.
I don't know what output you expect, but as a clue here is an example and two interesting links:
spark-read-json-with-schema
spark-schema-explained-with-examples
import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StringType, StructType}

object RareJson {

  val spark = SparkSession
    .builder()
    .appName("RareJson")
    .master("local[*]")
    .config("spark.sql.shuffle.partitions", "4") // Change to a more reasonable default number of partitions for our data
    .config("spark.app.id", "RareJson") // To silence Metrics warning
    .getOrCreate()

  val sc = spark.sparkContext
  val sqlContext = spark.sqlContext
  val input = "/home/cloudera/files/tests/rare.json"

  def main(args: Array[String]): Unit = {

    Logger.getRootLogger.setLevel(Level.ERROR)

    try {
      val structureSchema = new StructType()
        .add("field1", StringType)
        .add("field2", StringType)
        .add("value", StringType, true)

      val rareJson = sqlContext
        .read
        .option("allowBackslashEscapingAnyCharacter", true)
        .option("allowUnquotedFieldNames", true)
        .option("multiLine", true)
        .option("mode", "DROPMALFORMED")
        .schema(structureSchema)
        .json(input)

      rareJson.show(truncate = false)

      // To have the opportunity to view the web console of Spark: http://localhost:4041/
      println("Type whatever to the console to exit......")
      scala.io.StdIn.readLine()
    } finally {
      sc.stop()
      println("SparkContext stopped")
      spark.stop()
      println("SparkSession stopped")
    }
  }
}
Output:
+------+------+---------------------------+
|field1|field2|value |
+------+------+---------------------------+
|d1 |app |{"data":"{\"app\":\"am\"}"}|
+------+------+---------------------------+
You can try to parse the value column too, if it maintains the same format across all the rows.
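If it does, a rough PySpark sketch of that parsing step with from_json (the inner {"data": ...} schema is read off the single sample row above, so treat it as an assumption):

from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.appName("ParseValue").master("local[*]").getOrCreate()

# assumed inner schema, based on the sample row shown in the output above
value_schema = StructType([StructField("data", StringType(), True)])

df = spark.read.json("/home/cloudera/files/tests/rare.json")
parsed = df.withColumn("value", from_json(col("value"), value_schema))
parsed.select("field1", "field2", "value.data").show(truncate=False)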
I am quite new to Django. I am trying to convert SQL data fetched from a remote PostgreSQL database into JSON so as to use it in React, but while dumping the data it throws an error:
`AttributeError: 'str' object has no attribute 'get'`
I tried many ways of dumping the SQL data into JSON, like converting the data into a list and using RealDictCursor, but each of them throws a new error.
Views.py
from django.shortcuts import render, get_object_or_404
from django.http import JsonResponse
from django.http import HttpResponse
from .marketer import marketer

def marketer_list(request):
    return JsonResponse(marketer)
marketer.py (function to fetch data and establish the connection)
from django.shortcuts import render, get_object_or_404
import json
import psycopg2
from psycopg2.extras import RealDictCursor

def marketer(self):
    connection = psycopg2.connect(user="db-user",
                                  password="*****",
                                  host="18.23.42.2",
                                  port="5432",
                                  database="db-name")
    cursor = connection.cursor(cursor_factory=RealDictCursor)
    postgreSQL_select_Query = "select id from auth_permission"
    result = cursor.execute(postgreSQL_select_Query)
    #print("Selecting rows from mobile table using cursor.fetchall")
    #mobile = dictfetchall(result)
    #items = [dict(zip([key[0] for key in cursor.description], row)) for row in result]
    return json.dumps(cursor.fetchall(), indent=2)
Error on the URL page:
AttributeError: 'str' object has no attribute 'get'
or, with some of the other methods:
is not JSON serializable
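For what it's worth, a minimal sketch of the usual pattern, with three hedged changes to the code above: marketer() ends with return cursor.fetchall() instead of a json.dumps() string (RealDictCursor rows are dicts, so JsonResponse can serialize them), the stray self parameter is dropped (marketer is a plain function, not a method), and the view calls the function and passes safe=False:

from django.http import JsonResponse
from .marketer import marketer

def marketer_list(request):
    rows = marketer()  # call the function; the bare name hands JsonResponse
                       # a function object instead of data
    return JsonResponse(rows, safe=False)  # safe=False allows a top-level list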
I am trying to save data in the form of JSON (returned as the result of a POST request)
def get_data(...):
    ...
    try:
        _r = requests.post(
            _url_list,
            headers=_headers
        )
        return _r.json()
    except Exception as ee:
        print('Could not get data: {}'.format(ee))
        return None
into a table in an SQLite database as the backend.
def add_to_flight_data(_data):
    if _data:
        try:
            new_record = FlightData(data=_data)
            db.session.add(new_record)
            db.session.commit()
            print('Data inserted to DB!')
            return "Success"
        except Exception as e:
            print('Data NOT inserted to DB! {}'.format(e))
            pass
This is my simple Flask code:
import os
import time
import auth
import json
import requests
import datetime
from flask import Flask
from flask_marshmallow import Marshmallow
from flask_sqlalchemy import SQLAlchemy
# from safrs.safrs_types import JSONType

project_dir = os.path.dirname(os.path.abspath(__file__))
database_file = "sqlite:///{}".format(os.path.join(project_dir, "2w.sqlite"))

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = database_file
db = SQLAlchemy(app)
ma = Marshmallow(app)

class FlightData(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    created = db.Column(db.DateTime, server_default=db.func.now())
    json_data = db.Column(db.JSONType, default={})

    def __init__(self, data):
        self.data = data
It seems like there is perhaps no option to save JSON in SQLite:
json_data = db.Column(db.JSONType, default={})
Please ADVISE
Thanks.
I believe you should be using db.JSON, not db.JSONType, as there is no such column type in sqlalchemy.
Regardless of that, SQLite has no JSON data type, so sqlalchemy won't be able to map columns of type db.JSON onto anything. According to the documentation only Postgres and some MySQL are supported. There is support for JSON in SQLite with the JSON1 extension, but sqlalchemy will not be able to make use of it.
Your best bet then is to declare the column as db.Text and use json.dumps() to jsonify the data on write. Alternatively modify your get_data() function to check for a JSON response (check the Content-type header or try calling _r.json() and catching exceptions), and then return _r.content which will already be a JSON string.
Use json.loads() to read data back from the db.
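A minimal sketch of that Text-plus-json.dumps() approach, reusing the model from the question:

import json

class FlightData(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    created = db.Column(db.DateTime, server_default=db.func.now())
    json_data = db.Column(db.Text, default="{}")  # JSON stored as plain text

    def __init__(self, data):
        self.json_data = json.dumps(data)  # serialize on write

# reading it back later:
# record = FlightData.query.first()
# data = json.loads(record.json_data)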