How to store binary data in Dynamo with boto?

I can't determine from the docs/examples how to store/read binary data from DynamoDB using boto's dynamodb2. How is it done?
My guess was with an item value like { 'B': binary-data } but that causes an error in the JSON encoder.

boto provides the Binary class to do this automatically:
from boto.dynamodb2.table import Table
from boto.dynamodb.types import Binary
Table('mytable').put_item({'hashkey': Binary('\x01\x02')})
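Reading the value back is symmetric: boto decodes a B attribute into a Binary instance, which exposes the raw bytes via its value attribute. A minimal sketch, assuming the table and item above:
from boto.dynamodb2.table import Table
from boto.dynamodb.types import Binary

item = Table('mytable').get_item(hashkey=Binary('\x01\x02'))
raw = item['hashkey'].value  # the original bytes, '\x01\x02'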

With the low-level item format, the binary data needs to be base64-encoded into a string:
from base64 import b64encode
data = {'B': b64encode(binary_data)}
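For completeness, a round trip with the low-level item format looks like this (a sketch; the attribute name payload is illustrative):
from base64 import b64encode, b64decode

binary_data = b'\x01\x02'
item = {'payload': {'B': b64encode(binary_data)}}  # wire format for a binary attribute
restored = b64decode(item['payload']['B'])         # back to the original bytes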
There is a library that can do this for you: PynamoDB.
The code which handles serialization to and from binary for Python 2 and 3 can be found here.
Disclaimer: I am the author of PynamoDB.
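For illustration, a minimal model with a binary attribute might look like this in PynamoDB (a sketch; the table and attribute names are made up for the example):
from pynamodb.models import Model
from pynamodb.attributes import UnicodeAttribute, BinaryAttribute

class MyItem(Model):
    class Meta:
        table_name = 'mytable'  # hypothetical table
    hashkey = UnicodeAttribute(hash_key=True)
    payload = BinaryAttribute()  # base64 encoding/decoding is handled for you

MyItem(hashkey='abc', payload=b'\x01\x02').save()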

Related

sqlalchemy/psycopg2 deserializes my JSON as dicts but doesn't preserve order

I have a psql query that returns a JSON object. I fetch this query with results.fetchall() and I get the JSON properly as a dict.
However, as I'm on Python 3.4, not yet 3.6, the keys' order is not preserved in the dict. I saw there's a way to use OrderedDict to keep the order of the JSON, but I'm not sure how to tell sqlalchemy/psycopg2 to use it.
Can anybody help, please?
As indicated in the documentation, you must provide a custom deserializer when creating your engine:
from functools import partial
from sqlalchemy import create_engine
import json, collections

engine = create_engine(
    ...,
    json_deserializer=partial(
        json.loads,
        object_pairs_hook=collections.OrderedDict),
)
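You can check the behaviour of the deserializer without a database; a minimal sketch:
from functools import partial
import json, collections

deserializer = partial(json.loads, object_pairs_hook=collections.OrderedDict)
print(deserializer('{"b": 1, "a": 2}'))  # OrderedDict([('b', 1), ('a', 2)])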

Create JSON file from MongoDB document using Python

I am using MongoDB 3.4 and Python 2.7. I have retrieved a document from the database and I can print it and the structure indicates it is a Python dictionary. I would like to write out the content of this document as a JSON file. When I create a simple dictionary like d = {"one": 1, "two": 2} I can then write it to a file using json.dump(d, open("text.txt", 'w'))
However, if I replace d in the above code with the document I retrieve from MongoDB, I get the error
ObjectId is not JSON serializable
Suggestions?
As you have found out, the issue is that the value of _id is an ObjectId.
The ObjectId class is not understood by the default JSON encoder, so it cannot be serialised. You would get a similar error for ANY Python object that the default JSONEncoder does not understand.
One alternative is to write your own custom encoder to serialise ObjectId (see the sketch after the example below). However, you should avoid reinventing the wheel and use the provided PyMongo/bson utility module bson.json_util
For example:
from bson import json_util

with open("text.json", "w") as f:
    f.write(json_util.dumps(d))
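If you do want to roll your own, a minimal custom encoder sketch could look like this (the class name MongoEncoder is made up for the example):
import json
from bson import ObjectId

class MongoEncoder(json.JSONEncoder):
    def default(self, obj):
        # Convert ObjectId values to strings; defer everything else to the base class.
        if isinstance(obj, ObjectId):
            return str(obj)
        return super(MongoEncoder, self).default(obj)

with open("text.json", "w") as f:
    json.dump(d, f, cls=MongoEncoder)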
The issue is that '_id' is actually an object and not natively serializable. Replacing the _id with a string, as in mydocument['_id'] = '123', fixed the issue.

Scala code to insert JSON string to mongo DB using scala drivers/casbah

I found some code which can parse a JSON document, convert it to BSON, and then insert it. But this code is implemented using Java classes in casbah. I could not find a corresponding implementation in Scala.
Also casbah documentation says "In general, you should only need the org.mongodb.scala and org.bson namespaces in your code. You should not need to import from the com.mongodb namespace as there are equivalent type aliases and companion objects in the Scala driver. The main exception is for advanced configuration via the builders in MongoClientSettings which is considered to be for advanced users."
If you look at the code below and note the imports, they use com.mongodb classes. I can use this code in Scala and make it work, but I want to know if there is a Scala implementation out there to insert JSON into MongoDB.
import com.mongodb.DBObject
import com.mongodb.casbah.MongoClient
import com.mongodb.casbah.MongoClientURI
import com.mongodb.util.JSON
val jsonString = """{"card_id" : 75893645814809,"cust_id": 1008,"card_info": {"card_type" : "Travel Card","credit_limit": 126839},"card_dates" : [{"date":"1997-09-09" },{"date":"2007-09-07" }]}"""
val dbObject: DBObject = JSON.parse(jsonString).asInstanceOf[DBObject]
val mongo = MongoClient(MongoClientURI("mongodb://127.0.0.1:27017"))
val buffer = new java.util.ArrayList[DBObject]()
buffer.add(dbObject)
mongo.getDB("yourDBName").getCollection("yourCollectionName").insert(buffer)
buffer.clear()
Reference: Scala code to insert JSON string to mongo DB
I found a few links online which suggest using different JSON parser libraries, but none of them seems straightforward, even though the ~5 lines of code above can insert a JSON document in Java. I would like to achieve a similar thing in Scala.

Problems with using a .tsv file that contains JSON as a data feed file in Gatling

I am using Gatling to stress test a RESTful API. I will be posting data that is JSON to a particular URI. I want to use a feed file that is a .tsv where each line is a particular JSON element. However, I get errors and I just can't seem to find a pattern or system to add "" to my .tsv JSON so the feed will work. Attached is my code and tsv file.
package philSim

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class eventAPISimulation extends Simulation {
  object Query {
    val feeder = tsv("inputJSON.tsv").circular
    val query = forever {
      feed(feeder)
        .exec(
          http("event")
            .post("my/URI/here")
            .body(StringBody("${json}")).asJSON
        )
    }
  }

  val httpConf = http.baseURL("my.url.here:portnumber")
  val scn = scenario("event").exec(Query.query)

  setUp(scn.inject(rampUsers(100) over (30 seconds)))
    .throttle(reachRps(2000) in (30 seconds), holdFor(3 minutes))
    .protocols(httpConf)
}
Here is an example of my unedited .tsv with JSON:
json
{"userId":"234342234","secondaryIdType":"mobileProfileId","secondaryIdValue":"66666638","eventType":"push","eventTime":"2015-01-23T23:20:50.123Z","platform":"iPhoneApp","notificationId":"123456","pushType":1,"action":"sent","eventData":{}}
{"userId":"234342234","secondaryIdType":"mobileProfileId","secondaryIdValue":"66666638","eventType":"INVALID","eventTime":"2015-01-23T23:25:20.342Z","platform":"iPhoneApp","notificationId":"123456","pushType":1,"action":"received","eventData":{"osVersion":"7.1.2","productVersion":"5.9.2"}}
{"userId":"234342234","secondaryIdType":"mobileProfileId","secondaryIdValue":"66666638","eventType":"push","eventTime":"2015-01-23T23:27:30.342Z","platform":"iPhoneApp","notificationId":"123456","pushType":1,"action":"followedLink","eventData":{"deepLinkUrl":"URL.IS.HERE","osVersion":"7.1.2","productVersion":"5.9.2"}}
{"userId":"234342234","secondaryIdType":"mobileProfileId","secondaryIdValue":"66666638","eventType":"push","eventTime":"2015-01-23T23:27:30.342Z","platform":"AndroidApp","notificationId":"123456","pushType":1,"action":"followedLink","eventData":{"deepLinkUrl":"URL.IS.HERE"}}
{"userId":"234342234","secondaryIdType":"mobileProfileId","secondaryIdValue":"66666638","eventType":"push","eventTime":"2015-01-23T23:25:20.342Z","platform":"iPhoneApp","notificationId":"123456","pushType":1,"action":"error","eventData":{}}
I have seen this blog post, which talks about manipulating quotation marks (") to get the author's JSON and .tsv to work, but the author doesn't explain a systematic approach. I have tried various things and nothing I do really works. Some JSON will work when wrapped in quotation marks, similar to what the author does, but this doesn't work for everything. What are the best practices for dealing with JSON and Gatling? Thanks for your help!
Straight from Gatling's documentation: use rawSplit so that Gatling's TSV parser can handle your JSON entries:
tsv("inputJSON.tsv", rawSplit = true).circular

Comments in textual serialized protobuf? (not the schema definition)

I'm using textual protobuf files for system configuration.
One problem I have with this is that the serialized protobuf format does not support comments.
Is there any way around this?
I'm talking about the textual serialized data format, not the schema definition.
Was this problem solved somewhere by someone?
The textual protobuf format (serialized protobuf messages in text format) supports comments using the # syntax. I could not find a reference for this in any online documentation, but I have used it in projects in the past, so I put together a small example that one can test with:
Sample message description - [SampleProtoSchema.proto]
message SampleProtoSchema {
  optional int32 first_val = 1; // Note: this supports C/C++ style comments
  optional int32 second_val = 2;
}
Sample text message - [SampleTextualProto.prototxt]
# This is how textual protobuf format supports comments
first_val: 12 # can also be inline comments
# This is another comment
second_val: 23
Note though that these comments cannot be generated automatically at time of serialization. They can only be added manually afterwards.
Compile and test:
> protoc --python_out=. SampleProtoSchema.proto
>
> ipython
[1]: import SampleProtoSchema_pb2
[2]: sps = SampleProtoSchema_pb2.SampleProtoSchema()
[3]: from google.protobuf import text_format
[4]: with open('SampleTextualProto.prototxt', 'r') as f:
...:     text_format.Merge(f.read(), sps)
[5]: sps.first_val
[5]> 12
[6]: sps.second_val
[6]> 23
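Serializing back out shows why: text_format writes the fields only, so the # comments from the input file are not preserved. A sketch, continuing the session above:
[7]: print(text_format.MessageToString(sps))
first_val: 12
second_val: 23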
You may want to take a look at the Piqi project. It addresses this problem by introducing a new human-readable "Piq" data format and a command-line tool for converting data between Protobuf, Piq, JSON and XML formats.
The Piq data format was specially designed for human interaction. It supports comments, binary literals and verbatim text literals.