Python bytes to geojson Point - mysql

I have a MySQL database with Point type location data and a Django (Django REST Framework) backend from which I am trying to retrieve that data. If I fetch the location data from phpMyAdmin, the returned location looks like POINT(23.89826 90.267535). In my Django backend, however, I get bytes back as the location. The returned value is something like this:
b'\x00\x00\x00\x00\x01\x01\x00\x00\x00\x12N\x0b^\xf4\xe57@C\xe2\x1eK\x1f\x91V@'
The database uses utf8mb4_unicode_ci collation.
If I try to convert the returned bytes to a string with .decode('utf-8'), I get a UnicodeDecodeError:
>>> s = b'\x00\x00\x00\x00\x01\x01\x00\x00\x00\x12N\x0b^\xf4\xe57@C\xe2\x1eK\x1f\x91V@'
>>> s.decode('utf-8')
Traceback (most recent call last):
File "<console>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf4 in position 13: invalid continuation byte
I get the same bytes even if I perform a raw query from Django with the MySQL function ST_AsGeoJSON(location).
I then tried the geojson package. When I feed those bytes to geojson.Point() I get GeoJSON back, but instead of 2 floats the coordinates array consists of 25 integer values.
>>> s = b'\x00\x00\x00\x00\x01\x01\x00\x00\x00\x12N\x0b^\xf4\xe57@C\xe2\x1eK\x1f\x91V@'
>>> geojson.Point(s)
{"coordinates": [0, 0, 0, 0, 1, 1, 0, 0, 0, 18, 78, 11, 94, 244, 229, 55, 64, 67, 226, 30, 75, 31, 145, 86, 64], "type": "Point"}
How can I retrieve the Point data from the bytes or this geojson?
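For reference, those 25 bytes are MySQL's internal geometry storage: a 4-byte SRID followed by the WKB encoding of the point (a 1-byte byte-order flag, a 4-byte geometry type, then two little-endian 8-byte doubles). A minimal sketch that unpacks them directly with the standard library, assuming exactly that layout:

import struct

s = b'\x00\x00\x00\x00\x01\x01\x00\x00\x00\x12N\x0b^\xf4\xe57@C\xe2\x1eK\x1f\x91V@'

srid = struct.unpack('<I', s[:4])[0]   # 4-byte SRID prefix (0 here)
x, y = struct.unpack('<dd', s[9:25])   # skip the byte-order flag and geometry type
print(srid, x, y)                      # 0 23.89826 90.267535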

I had this problem because I was using plain Django, and Django models don't have a field type that deals with geo data. I was using a CharField with max_length=255 and then tried to parse whatever that CharField retrieved from the database. I solved the problem by using GeoDjango and Django REST Framework GIS. Django REST Framework GIS is not strictly necessary; I used it because I am using Django REST Framework and it outputs the geo data in a nice format.
The steps were to:
Install GDAL (Geospatial Data Abstraction Library):
sudo apt-get install gdal-bin
sudo apt-get install python3-gdal
Add django.contrib.gis and rest_framework_gis to settings.INSTALLED_APPS
Set GDAL_LIBRARY_PATH in settings; in my case it's GDAL_LIBRARY_PATH = os.getenv('GDAL_LIBRARY_PATH')
Update the model import from from django.db import models to from django.contrib.gis.db import models
Update the model to use a geo field, as sketched below. More: https://docs.djangoproject.com/en/2.1/ref/contrib/gis/model-api/
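A minimal sketch of such a model, assuming a hypothetical Place model with a location column (PointField is the GeoDjango field for Point data):

from django.contrib.gis.db import models

class Place(models.Model):
    name = models.CharField(max_length=255)
    # Stored as a Point geometry in MySQL; comes back as a GEOSGeometry
    # Point in Python instead of raw bytes
    location = models.PointField()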
Links
https://docs.djangoproject.com/en/2.1/ref/contrib/gis/
https://github.com/djangonauts/django-rest-framework-gis
https://github.com/domlysz/BlenderGIS/wiki/How-to-install-GDAL

Related

How to push the data from rds to kafka queue in json format

I use a Kafka topic to receive messages from a MySQL database. I need to write Python code to push the data in JSON format from MySQL to the Kafka topic. My requirement is to get the output in JSON format, not in raw strings.
Below is the Python code I use to dump the MySQL table data to the Kafka topic.
Code:
import json
from time import sleep

import mysql.connector
from kafka import KafkaProducer

connection = mysql.connector.connect(host='xyz.us-east-1.rds.amazonaws.com',
                                     database='testdb', user='stdnt', password='pssw123')
cursor = connection.cursor()
cursor.execute('SELECT * FROM patients_vital_info')
data = cursor.fetchall()

producer = KafkaProducer(bootstrap_servers=['localhost:9092'],
                         api_version=(0, 11, 5),
                         value_serializer=lambda x: json.dumps(x).encode('utf-8'))

for i in data:
    producer.send('test', i)
    sleep(1)
Output from kafka topic in raw string format:
[3, 69, 175]
[4, 68, 171]
[5, 72, 177]
[1, 78, 162]
[2, 66, 157]
[3, 72, 156]
The output should be pushed in JSON format when writing the message to the Kafka queue.
Expected output:
{"bp":140,"heartBeat":73,"Customerid":1}
cursor.fetchall() returns a list of plain row tuples, not dictionaries with column-to-value pairs, so each row is (correctly) serialized as a JSON array.
You'd need to build the JSON objects yourself if you want to include column names, or use a Kafka Connect JDBC source / Debezium rather than Python to do exactly what you're looking for. A sketch of the first option follows.
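A minimal sketch, assuming mysql.connector (whose cursor(dictionary=True) returns each row as a dict keyed by column name) and assuming the table's columns really are named Customerid, heartBeat and bp:

import json
from time import sleep

import mysql.connector
from kafka import KafkaProducer

connection = mysql.connector.connect(host='xyz.us-east-1.rds.amazonaws.com',
                                     database='testdb', user='stdnt', password='pssw123')
cursor = connection.cursor(dictionary=True)  # rows come back as dicts, not tuples
cursor.execute('SELECT * FROM patients_vital_info')

producer = KafkaProducer(bootstrap_servers=['localhost:9092'],
                         value_serializer=lambda x: json.dumps(x).encode('utf-8'))

for row in cursor.fetchall():
    # row is now e.g. {'Customerid': 1, 'heartBeat': 78, 'bp': 162},
    # which serializes to the expected JSON object
    producer.send('test', row)
    sleep(1)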

reading JSON from file and extract the keys returns attribute str has no keys

I am new to Python (and JSON), so apologies if this is obvious to you.
I pull some data from an API using the following code:
import requests
import json

headers = {'Content-Type': 'application/json', 'accept-encoding': 'identity'}
api_url = api_url_base + api_token + api_request  # variables removed for security
response = requests.get(api_url, headers=headers)
data = response.json()
keys = data.keys

if response.status_code == 200:
    print(data["message"], "saving to file...")
    print("Found the following keys:")
    print(keys)
    with open('vulns.json', 'w') as outfile:
        json.dump(response.content.decode('utf-8'), outfile)
    print("File Saved.")
else:
    print('The site returned a', response.status_code, 'error')
This works: I get some data returned and I am able to write the file.
I am trying to change what's returned from a short format to a long format, and to check it's working I need to see the keys. I was trying to do this offline using the written file (as practice for reading JSON from files).
I wrote these few lines (taken from https://www.kite.com/python/answers/how-to-print-the-keys-of-a-dictionary-in-python):
import json

with open('vulns.json') as json_file:
    data = json.load(json_file)
    print(data)
    keys = list(data.keys())
    print(keys)
Unfortunately, whenever I run this it returns this error:
Python 3.9.1 (tags/v3.9.1:1e5d33e, Dec 7 2020, 17:08:21) [MSC v.1927 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> print(keys)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'keys' is not defined
>>> & C:/Users/xxxx/AppData/Local/Microsoft/WindowsApps/python.exe c:/Temp/read-vulnfile.py
File "<stdin>", line 1
& C:/Users/xxxx/AppData/Local/Microsoft/WindowsApps/python.exe c:/Temp/read-vulnfile.py
^
SyntaxError: invalid syntax
>>> exit()
PS C:\Users\xxxx\Documents\scripts\Python> & C:/Users/xxx/AppData/Local/Microsoft/WindowsApps/python.exe c:/Temp/read-vulnfile.py
Traceback (most recent call last):
File "c:\Temp\read-vulnfile.py", line 6, in <module>
keys=list(data.keys)
AttributeError: 'str' object has no attribute 'keys'
The print(data) call returns what looks like JSON; this is the opening line:
{"count": 1000, "message": "Vulnerabilities found: 1000", "data":
[{"...
I can't show the content, it's sensitive.
Why is this looking at a str object rather than a dictionary?
How do I read JSON back into a dictionary, please?
You just have that content stored in the file as a string. If you open vulns.json in an editor, you will most likely see the whole document wrapped in quotes as a single escaped JSON string rather than starting with {"count": 1000, ....
json.load does open and parse it, but the parsed result is a string (see the conversion table in the json docs).
So take one step back and look at what happens during saving to file. You take some content from your response but dump the string-decoded value into the file. Instead, try
json.dump(response.json(), outfile)
(or just use the data variable you already have).
This should allow you to successfully dump and load the data as a dict.
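A minimal sketch of the corrected round trip, with a stand-in dict in place of the real response:

import json

data = {"count": 1000, "message": "Vulnerabilities found: 1000", "data": []}

with open('vulns.json', 'w') as outfile:
    json.dump(data, outfile)       # writes a JSON object, not a quoted string

with open('vulns.json') as json_file:
    loaded = json.load(json_file)  # parses back into a dict

print(list(loaded.keys()))         # ['count', 'message', 'data']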

Why does Pycryptodome MAC check fail when encrypting and decrypting JSON files?

I am trying to encrypt some JSON data with AES-256, using a password hashed with pbkdf2_sha256 as the key. I want to store the data in a file, be able to load it up, decrypt it, alter it, encrypt it, store it, and repeat.
I am using the passlib and pycryptodome libraries with Python 3.8. The following test runs inside a Docker container and throws an error I haven't been able to correct.
Does anyone have any clues on how I can improve my code (and knowledge)?
Test.py:
import os, json
from Crypto.PublicKey import RSA
from Crypto.Cipher import AES
from passlib.hash import pbkdf2_sha256

def setJsonData(jsonData, jsonFileName):
    with open(jsonFileName, 'wb') as jsonFile:
        password = 'd'
        key = pbkdf2_sha256.hash(password)[-16:]
        data = json.dumps(jsonData).encode("utf8")
        cipher = AES.new(key.encode("utf8"), AES.MODE_EAX)
        ciphertext, tag = cipher.encrypt_and_digest(data)
        [jsonFile.write(x) for x in (cipher.nonce, tag, ciphertext)]

def getJsonData(jsonFileName):
    with open(jsonFileName, 'rb') as jsonFile:
        password = 'd'
        key = pbkdf2_sha256.hash(password)[-16:]
        nonce, tag, ciphertext = [jsonFile.read(x) for x in (16, 16, -1)]
        cipher = AES.new(key.encode("utf8"), AES.MODE_EAX, nonce)
        data = cipher.decrypt_and_verify(ciphertext, tag)
        return json.loads(data)

dictTest = {}
dictTest['test'] = 1
print(str(dictTest))
setJsonData(dictTest, "test")
dictTest = getJsonData("test")
print(str(dictTest))
Output:
{'test': 1}
Traceback (most recent call last):
File "test.py", line 37, in <module>
dictTest = getJsonData("test")
File "test.py", line 24, in getJsonData
data = cipher.decrypt_and_verify(ciphertext, tag)
File "/usr/local/lib/python3.8/site-packages/Crypto/Cipher/_mode_eax.py", line 368, in decrypt_and_verify
self.verify(received_mac_tag)
File "/usr/local/lib/python3.8/site-packages/Crypto/Cipher/_mode_eax.py", line 309, in verify
raise ValueError("MAC check failed")
ValueError: MAC check failed
Research:
Looked into this answer, but I believe my verify() call is in the right place.
I noted that the Python docs say loads(dumps(x)) != x if x has non-string keys, but when I re-run the test with dictTest['test'] = 'a' I get the same error.
I suspected the problem was the JSON formatting, so I repeated the test with a plain string and no json.loads / json.dumps calls, but I got the same error.
The problem here is that key = pbkdf2_sha256.hash(password)[-16:] hashes the password with a new random salt on each call. Therefore, the keys used to encrypt and decrypt the ciphertext are different, and the MAC (integrity) check fails.
I changed my key derivation function to the following:
from Crypto.Hash import SHA3_256

h = SHA3_256.new()
h.update(password.encode("utf-8"))
key = h.digest()
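Note that a bare unsalted hash works but gives up the benefit of a salt. An alternative sketch, assuming PyCryptodome's PBKDF2 helper and a file layout of salt|nonce|tag|ciphertext (my assumption, not the original layout), stores the salt in the file so both sides derive the same key:

import json
from Crypto.Cipher import AES
from Crypto.Protocol.KDF import PBKDF2
from Crypto.Random import get_random_bytes

def setJsonData(jsonData, jsonFileName, password):
    salt = get_random_bytes(16)
    key = PBKDF2(password, salt, dkLen=32)   # 32-byte key -> AES-256
    cipher = AES.new(key, AES.MODE_EAX)
    ciphertext, tag = cipher.encrypt_and_digest(json.dumps(jsonData).encode("utf8"))
    with open(jsonFileName, 'wb') as jsonFile:
        for chunk in (salt, cipher.nonce, tag, ciphertext):
            jsonFile.write(chunk)

def getJsonData(jsonFileName, password):
    with open(jsonFileName, 'rb') as jsonFile:
        salt, nonce, tag, ciphertext = [jsonFile.read(n) for n in (16, 16, 16, -1)]
    key = PBKDF2(password, salt, dkLen=32)   # same stored salt -> same key
    cipher = AES.new(key, AES.MODE_EAX, nonce)
    return json.loads(cipher.decrypt_and_verify(ciphertext, tag))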

Convert a pipeline_pb2.TrainEvalPipelineConfig to JSON or YAML file for tensorflow object detection API

I want to convert a pipeline_pb2.TrainEvalPipelineConfig to JSON or YAML file format for the TensorFlow Object Detection API. I tried converting the protobuf file using:
import tensorflow as tf
from google.protobuf import text_format
import yaml
from object_detection.protos import pipeline_pb2

def get_configs_from_pipeline_file(pipeline_config_path, config_override=None):
    '''
    Read .config and convert it to a proto buffer object
    '''
    pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
    with tf.gfile.GFile(pipeline_config_path, "r") as f:
        proto_str = f.read()
        text_format.Merge(proto_str, pipeline_config)
    if config_override:
        text_format.Merge(config_override, pipeline_config)
    #print(pipeline_config)
    return pipeline_config

def create_configs_from_pipeline_proto(pipeline_config):
    '''
    Returns the configurations as a dictionary
    '''
    configs = {}
    configs["model"] = pipeline_config.model
    configs["train_config"] = pipeline_config.train_config
    configs["train_input_config"] = pipeline_config.train_input_reader
    configs["eval_config"] = pipeline_config.eval_config
    configs["eval_input_configs"] = pipeline_config.eval_input_reader
    # Keeps eval_input_config only for backwards compatibility. All clients should
    # read eval_input_configs instead.
    if configs["eval_input_configs"]:
        configs["eval_input_config"] = configs["eval_input_configs"][0]
    if pipeline_config.HasField("graph_rewriter"):
        configs["graph_rewriter_config"] = pipeline_config.graph_rewriter
    return configs

configs = get_configs_from_pipeline_file('pipeline.config')
config_as_dict = create_configs_from_pipeline_proto(configs)
But when I try converting this returned dictionary to YAML with yaml.dump(config_as_dict) it says
TypeError: can't pickle google.protobuf.pyext._message.RepeatedCompositeContainer objects
For json.dumps(config_as_dict) it says:
Traceback (most recent call last):
File "config_file_parsing.py", line 48, in <module>
config_as_json = json.dumps(config_as_dict)
File "/usr/lib/python3.5/json/__init__.py", line 230, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python3.5/json/encoder.py", line 198, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/usr/lib/python3.5/json/encoder.py", line 256, in iterencode
return _iterencode(o, 0)
File "/usr/lib/python3.5/json/encoder.py", line 179, in default
raise TypeError(repr(o) + " is not JSON serializable")
TypeError: label_map_path: "label_map.pbtxt"
shuffle: true
tf_record_input_reader {
  input_path: "dataset.record"
}
is not JSON serializable
Would appreciate some help here.
JSON can only dump a subset of the Python primitives plus the dict and list collections (with limitations on self-referencing).
YAML is more powerful and can be used to dump arbitrary Python objects, but only if those objects can be "investigated" during the representation phase of the dump, which essentially limits that to instances of pure Python classes. For objects created at the C level, one can write explicit dumpers, and if none is available Python will try to use the pickle protocol to dump the data to YAML.
Inspecting protobuf on PyPI shows that there are non-generic wheels available, which is always an indication of some C code optimization, and inspecting one of those files indeed shows a pre-compiled shared object.
Although you make a dict out of the config, this dict can of course only be dumped when all its keys and all its values can be dumped. Since your keys are strings (necessary for JSON), you need to look at each of the values, to find the one that doesn't dump, and convert that to a dumpable object structure (dict/list for JSON, pure Python class for YAML).
You might want to take a look at the google.protobuf.json_format module.
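A minimal sketch, assuming the get_configs_from_pipeline_file helper from the question; json_format.MessageToDict converts a protobuf message tree into plain dicts, lists and scalars that both json and yaml can serialize:

import json
import yaml
from google.protobuf import json_format

pipeline_config = get_configs_from_pipeline_file('pipeline.config')

config_as_dict = json_format.MessageToDict(pipeline_config)
print(json.dumps(config_as_dict, indent=2))
print(yaml.dump(config_as_dict))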

Php json array into Python3

I have a PHP script that outputs a JSON array that looks like this...
[{"year":"2016","Month":"Apr","the_days":"16, 29, 30"},
{"year":"2016","Month":"May","the_days":"13, 27"},
{"year":"2016","Month":"Jun","the_days":"10, 11, 24"},
{"year":"2016","Month":"Jul","the_days":"08, 22, 23"},
{"year":"2016","Month":"Aug","the_days":"06, 20"},
{"year":"2016","Month":"Sep","the_days":"02, 03, 16, 17, 30"},
{"year":"2016","Month":"Oct","the_days":"01, 14, 15, 29"},
{"year":"2016","Month":"Nov","the_days":"25"},
{"year":"2016","Month":"Dec","the_days":"09, 10, 23, 24"}]
I'm trying to put together some Python that will (eventually) output something like....
Apr: 16, 29, 30
May: 13, 27
//etc
...but I'm not having any luck pulling the array out.
This is the code I'm using in Python 3 (pulled together from other Stack Overflow questions I've searched for).
import urllib.request
import json
response = urllib.request.urlopen('http://www.captainobviousobviously.co.uk/private/Apijson.php')
content = response.read()
data = json.load(content.decode('utf-8'))
print(data)
This is the error that I'm getting...
Traceback (most recent call last):
File "/home/pi/Python/availableDates.py", line 6, in <module>
data = json.load(content.decode('utf-8'))
File "/usr/lib/python3.4/json/__init__.py", line 265, in load
return loads(fp.read(),
AttributeError: 'str' object has no attribute 'read'
I'm not really sure how to fix it.
Replace
data = json.load(content.decode('utf-8'))
with
data = json.loads(content.decode('utf-8'))
'load' is for files and 'loads' is for strings.
Refer to What is the difference between json.dumps and json.load?.
As for the code for your problem:
for i in data:
    print(str(i['Month']) + ":" + str(i['the_days']))
Use json.loads instead. load is for loading from a stream, such as a file, whereas loads loads from a string.
data = json.loads(content.decode('utf-8'))
From the Python documentation:
json.load
Deserialize fp (a .read()-supporting file-like object containing a JSON document) to a Python object using this conversion table.
A string isn't a "file-like object", which is why you get your error - the JSON is trying to call .read on the string, but that doesn't exist.
You need to use json.loads(<json str>). If you want you can do the following
content = response.read().decode()
data = json.loads(content)
for d in data:
    print(d["Month"], d["the_days"], sep=":")