I've connected a device that communicates with my Mosquitto MQTT server (RPi) and publishes to a specified topic. What I want to do now is store the messages published on that topic into a MySQL database. I know how MySQL works, but I don't know how to listen for these incoming publications. I'm looking for a lightweight solution that runs in the background. Any pointers or ideas on libraries to use are very welcome.
I've done something similar in the last few days:
collecting live weather-station data with pywws
publishing it with pywws.service.mqtt to the MQTT broker
a Python script on the NAS collecting the data and writing it to MariaDB
#!/usr/bin/python -u

import mysql.connector as mariadb
import paho.mqtt.client as mqtt
import ssl

mariadb_connection = mariadb.connect(user='USER', password='PW', database='MYDB')
cursor = mariadb_connection.cursor()

# MQTT Settings
MQTT_Broker = "192.XXX.XXX.XXX"
MQTT_Port = 8883
Keep_Alive_Interval = 60
MQTT_Topic = "/weather/pywws/#"

# Subscribe
def on_connect(client, userdata, flags, rc):
    mqttc.subscribe(MQTT_Topic, 0)

def on_message(mosq, obj, msg):
    # Prepare Data, separate columns and values
    msg_clear = msg.payload.translate(None, '{}""').split(", ")
    msg_dict = {}
    for i in range(0, len(msg_clear)):
        msg_dict[msg_clear[i].split(": ")[0]] = msg_clear[i].split(": ")[1]
    # Prepare dynamic sql-statement
    placeholders = ', '.join(['%s'] * len(msg_dict))
    columns = ', '.join(msg_dict.keys())
    sql = "INSERT INTO pws ( %s ) VALUES ( %s )" % (columns, placeholders)
    # Save Data into DB Table
    try:
        cursor.execute(sql, msg_dict.values())
    except mariadb.Error as error:
        print("Error: {}".format(error))
    mariadb_connection.commit()

def on_subscribe(mosq, obj, mid, granted_qos):
    pass

mqttc = mqtt.Client()

# Assign event callbacks
mqttc.on_message = on_message
mqttc.on_connect = on_connect
mqttc.on_subscribe = on_subscribe

# Connect
mqttc.tls_set(ca_certs="ca.crt", tls_version=ssl.PROTOCOL_TLSv1_2)
mqttc.connect(MQTT_Broker, int(MQTT_Port), int(Keep_Alive_Interval))

# Continue the network loop & close db-connection
mqttc.loop_forever()
mariadb_connection.close()
If you are familiar with Python, the Paho MQTT library is simple, light on resources, and interfaces well with Mosquitto. To use it, simply subscribe to the topic and set up a callback to pass the payload to MySQL using peewee, as shown in this answer. Run the script in the background and call it good!
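For illustration, a minimal sketch of that approach (the broker address, topic, credentials, and the Message table are placeholder assumptions; the peewee model in the linked answer may differ):

import datetime
import paho.mqtt.client as mqtt
from peewee import Model, MySQLDatabase, TextField, DateTimeField

# Hypothetical database and table; adjust credentials and schema to your setup.
db = MySQLDatabase('MYDB', user='USER', password='PW', host='localhost')

class Message(Model):
    topic = TextField()
    payload = TextField()
    created = DateTimeField(default=datetime.datetime.now)

    class Meta:
        database = db

def on_connect(client, userdata, flags, rc):
    client.subscribe("/weather/pywws/#")   # placeholder topic

def on_message(client, userdata, msg):
    # Store every received publication as one row.
    Message.create(topic=msg.topic, payload=msg.payload.decode('utf-8', 'replace'))

db.connect()
db.create_tables([Message])

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("192.0.2.1", 1883, 60)      # placeholder broker address
client.loop_forever()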
Related
Actually, I am trying to update one table with multiple processes via pymysql, and each process reads a CSV file split from a huge one in order to speed things up. But I get a Lock wait timeout exceeded; try restarting transaction exception when I run the script. After searching the posts on this site, I found one post that mentioned using the built-in LOAD DATA INFILE, but it gave no details. How can I do this with pymysql to reach my aim?
---------------------------first edit----------------------------------------
Here's the job method:
def importprogram(path, name):
    begin = time.time()
    print('begin to import program' + name + ' info.')
    # "c:\\sometest.csv"
    file = open(path, mode='rb')
    csvfile = csv.reader(codecs.iterdecode(file, 'utf-8'))
    connection = None
    try:
        connection = pymysql.connect(host='a host', user='someuser', password='somepsd', db='mydb',
                                     cursorclass=pymysql.cursors.DictCursor)
        count = 1
        with connection.cursor() as cursor:
            sql = '''update sometable set Acolumn='{guid}' where someid='{pid}';'''
            next(csvfile, None)
            for line in csvfile:
                try:
                    count = count + 1
                    if ''.join(line).strip():
                        command = sql.format(guid=line[2], pid=line[1])
                        cursor.execute(command)
                    if count % 1000 == 0:
                        print('program' + name + ' cursor execute', count)
                except csv.Error:
                    print('program csv.Error:', count)
                    continue
                except IndexError:
                    print('program IndexError:', count)
                    continue
                except StopIteration:
                    break
    except Exception as e:
        print('program' + name, str(e))
    finally:
        connection.commit()
        connection.close()
        file.close()
        print('program' + name + ' info done.time cost:', time.time()-begin)
And the multi-processing method:
import multiprocessing as mp

def multiproccess():
    pool = mp.Pool(3)
    results = []
    paths = ['C:\\testfile01.csv', 'C:\\testfile02.csv', 'C:\\testfile03.csv']
    name = 1
    for path in paths:
        results.append(pool.apply_async(importprogram, args=(path, str(name))))
        name = name + 1
    print(result.get() for result in results)
    pool.close()
    pool.join()
And the main method:
if __name__ == '__main__':
    multiproccess()
I am new to Python. Where does the code, or the approach itself, go wrong? Should I use just a single process to handle the data reading and importing?
Your issue is that you are exceeding the time allowed for a response to be fetched from the server, so the client is automatically timing out.
In my experience, adjusting the wait timeout to something like 6000 seconds, combining everything into one CSV, and just leaving the data to import works well. I would also recommend running the query directly from MySQL rather than from Python.
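For what it's worth, a rough sketch of what a single bulk load could look like when driven from pymysql (the connection details, file path, table, and column names are placeholders; local_infile must be enabled on both client and server, and since LOAD DATA inserts rows, update-style logic would need a staging table plus a join):

import pymysql

# Placeholder connection details for illustration only.
connection = pymysql.connect(host='a host', user='someuser', password='somepsd',
                             db='mydb', local_infile=True)
try:
    with connection.cursor() as cursor:
        # Load the whole CSV in one statement instead of issuing per-row queries.
        cursor.execute("""
            LOAD DATA LOCAL INFILE 'c:/sometest.csv'
            INTO TABLE sometable
            FIELDS TERMINATED BY ',' ENCLOSED BY '"'
            LINES TERMINATED BY '\\n'
            IGNORE 1 LINES
            (someid, Acolumn)
        """)
    connection.commit()
finally:
    connection.close()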
The way I usually import CSV data from Python to MySQL is through the INSERT ... VALUES ... method, and I only do so when some kind of manipulation of the data is required (i.e. inserting different rows into different tables).
I like your approach and understand your thinking, but in reality there is no need. The benefit of the INSERT ... VALUES ... method is that you won't run into any timeout issues.
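As a sketch of the INSERT ... VALUES ... approach with pymysql (table and column names are again made up to mirror the question): batching the rows through executemany keeps the whole import in a single short transaction, which helps avoid the lock wait timeout.

import csv
import pymysql

# Hypothetical table 'sometable' with columns (someid, Acolumn); adjust to the real schema.
connection = pymysql.connect(host='a host', user='someuser', password='somepsd', db='mydb')
try:
    with open('c:/sometest.csv', newline='', encoding='utf-8') as f:
        reader = csv.reader(f)
        next(reader, None)                      # skip the header row
        rows = [(line[1], line[2]) for line in reader if ''.join(line).strip()]
    with connection.cursor() as cursor:
        # executemany rewrites this into multi-row INSERT ... VALUES statements.
        cursor.executemany("INSERT INTO sometable (someid, Acolumn) VALUES (%s, %s)", rows)
    connection.commit()
finally:
    connection.close()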
I am trying to run the following code to populate a table in parallel for a certain application. First, the following function is defined, which is supposed to connect to my DB and execute the SQL command with the given values (to insert into the table).
def dbWriter(sql, rows):
    # load cnf file
    MYSQL_CNF = os.path.abspath('.') + '/mysql.cnf'
    conn = MySQLdb.connect(db='dedupe',
                           charset='utf8',
                           read_default_file=MYSQL_CNF)
    cursor = conn.cursor()
    cursor.executemany(sql, rows)
    conn.commit()
    cursor.close()
    conn.close()
And then there is this piece:
pool = dedupe.backport.Pool(processes=2)
done = False
while not done:
    chunks = (list(itertools.islice(b_data, step)) for step in
              [step_size]*100)
    results = []
    for chunk in chunks:
        print len(chunk)
        results.append(pool.apply_async(dbWriter,
                                        ("INSERT INTO blocking_map VALUES (%s, %s)",
                                         chunk)))
    for r in results:
        r.wait()
    if len(chunk) < step_size:
        done = True
pool.close()
Everything works and there are no errors. But at the end, my table is empty, meaning somehow the insertions were not successful. I have tried so many things to fix this (including adding column names to the insertion) after many Google searches and have not been successful. Any suggestions would be appreciated. (I'm running the code on Python 2.7, gcloud (Ubuntu); note that the indents may be a bit messed up after pasting here.)
Please also note that chunk follows exactly the required data format.
Note: this is part of this example.
Please note that the only thing I am changing in the linked example above is that I am separating the steps for creating the tables and inserting into them, since I am running my code on the gcloud platform and it enforces GTID standards.
The solution was changing the dbWriter function to:
conn = MySQLdb.connect(host = # host ip,
                       user = # username,
                       passwd = # password,
                       db = 'dedupe')
cursor = conn.cursor()
cursor.executemany(sql, rows)
cursor.close()
conn.commit()
conn.close()
In an Odoo 9 custom module, the attachment=True parameter was added to a binary field later on; since then, new records are stored in the filestore.
For the old records created before attachment=True was in use, no entry was created in the ir.attachment table and nothing was saved to the filestore.
I would like to know how to migrate the binary field values of those old records into the filestore. How do I create/insert rows in ir_attachment based on the old records' binary field values? Is any script available?
You have to include the PostgreSQL bin path in pg_path in your configuration file. This will restore the filestore that contains the binary fields:
pg_path = D:\fx\upsynth_Postgres\bin
I'm sure that you no longer need a solution to this as you asked 18 months ago, but I have just had the same issue (many gigabytes of binary data in the database) and this question came up on Google so I thought I would share my solution.
When you set attachment=True the binary column will remain in the database, but the system will look in the filestore instead for the data. This left me unable to access the data from the Odoo API so I needed to retrieve the binary data from the database directly, then re-write the binary data to the record using Odoo and then finally drop the column and vacuum the table.
Here is my script, which is inspired by this solution for migrating attachments, but this solution will work for any field in any model and reads the binary data from the database rather than from the Odoo API.
import xmlrpclib
import psycopg2

username = 'your_odoo_username'
pwd = 'your_odoo_password'
url = 'http://ip-address:8069'
dbname = 'database-name'
model = 'model.name'
field = 'field_name'
dbuser = 'postgres_user'
dbpwd = 'postgres_password'
dbhost = 'postgres_host'

conn = psycopg2.connect(database=dbname, user=dbuser, password=dbpwd, host=dbhost, port='5432')
cr = conn.cursor()

# Get the uid
sock_common = xmlrpclib.ServerProxy('%s/xmlrpc/common' % url)
uid = sock_common.login(dbname, username, pwd)
sock = xmlrpclib.ServerProxy('%s/xmlrpc/object' % url)

def migrate_attachment(res_id):
    # 1. get data
    cr.execute("SELECT %s from %s where id=%s" % (field, model.replace('.', '_'), res_id))
    data = cr.fetchall()[0][0]
    # Re-Write attachment
    if data:
        data = str(data)
        sock.execute(dbname, uid, pwd, model, 'write', [res_id], {field: str(data)})
        return True
    else:
        return False

# SELECT attachments:
records = sock.execute(dbname, uid, pwd, model, 'search', [])
cnt = len(records)
print cnt

i = 0
for res_id in records:
    att = sock.execute(dbname, uid, pwd, model, 'read', res_id, [field])
    status = migrate_attachment(res_id)
    print 'Migrated ID %s (attachment %s of %s) [Contained data: %s]' % (res_id, i, cnt, status)
    i += 1

cr.close()
print "done ..."
Afterwards, drop the column and vacuum the table in psql.
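For completeness, that final cleanup can also be scripted with psycopg2 instead of a psql session; this is only a sketch with placeholder names (the table name is the model name with dots replaced by underscores):

import psycopg2

# Placeholder connection details and names; VACUUM cannot run inside a transaction,
# so autocommit is enabled before issuing it.
conn = psycopg2.connect(database='database-name', user='postgres_user',
                        password='postgres_password', host='postgres_host', port='5432')
conn.autocommit = True
cur = conn.cursor()
cur.execute("ALTER TABLE model_name DROP COLUMN field_name")
cur.execute("VACUUM FULL model_name")
cur.close()
conn.close()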
import os, sys

AWS_DIRECTORY = '/home/jenkins/.aws'
certificates_folder = 'my_folder'
SUCCESS = 'success'

class AmazonKMS(object):
    def __init__(self):
        # making sure boto3 has the certificates and region files
        result = os.system('mkdir -p ' + AWS_DIRECTORY)
        self._check_os_result(result)
        result = os.system('cp ' + certificates_folder + 'kms_config ' + AWS_DIRECTORY + '/config')
        self._check_os_result(result)
        result = os.system('cp ' + certificates_folder + 'kms_credentials ' + AWS_DIRECTORY + '/credentials')
        self._check_os_result(result)
        # boto3 is the amazon client package
        import boto3
        self.kms_client = boto3.client('kms', region_name='us-east-1')
        self.global_key_alias = 'alias/global'
        self.global_key_id = None

    def _check_os_result(self, result):
        if result != 0 and raise_on_copy_error:
            raise FAILED_COPY

    def decrypt_text(self, encrypted_text):
        response = self.kms_client.decrypt(
            CiphertextBlob = encrypted_text
        )
        return response['Plaintext']
When using it:
amazon_kms = AmazonKMS()
amazon_kms.decrypt_text(blob_password)
I'm getting:
E ClientError: An error occurred (AccessDeniedException) when calling the Decrypt operation: The ciphertext refers to a customer master key that does not exist, does not exist in this region, or you are not allowed to access.
The stack trace is:
../keys_management/amazon_kms.py:77: in decrypt_text
CiphertextBlob = encrypted_text
/home/jenkins/.virtualenvs/global_tests/local/lib/python2.7/site-packages/botocore/client.py:253: in _api_call
return self._make_api_call(operation_name, kwargs)
/home/jenkins/.virtualenvs/global_tests/local/lib/python2.7/site-packages/botocore/client.py:557: in _make_api_call
raise error_class(parsed_response, operation_name)
This happens in a script that runs once an hour.
It only fails 2-3 times a day; after a retry it succeeds.
I tried upgrading from boto3 1.2.3 to 1.4.4.
What is the possible cause of this behavior?
My guess is that the issue is not in anything you described here.
Most likely the login tokens time out or something along those lines. To investigate this further, a closer look at the way the login works here would probably be helpful.
How does this code run? Is it running inside AWS, like on Lambda or EC2? Do you run it from your own server (it looks like it runs on Jenkins)? How is the login access established? What are those kms_credentials used for and what do they look like? Do you do something like assuming a role (which would probably work through access tokens that stop working after some time)?
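Since the report above says a retry succeeds, one pragmatic stopgap while you investigate the credential setup is to wrap the decrypt call in a small retry loop. This is only a sketch of that idea, not a root-cause fix:

import time
import boto3
from botocore.exceptions import ClientError

kms_client = boto3.client('kms', region_name='us-east-1')

def decrypt_with_retry(encrypted_text, attempts=3, delay=5):
    # Retry transient KMS failures a few times before giving up.
    for attempt in range(1, attempts + 1):
        try:
            return kms_client.decrypt(CiphertextBlob=encrypted_text)['Plaintext']
        except ClientError:
            if attempt == attempts:
                raise
            time.sleep(delay)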
The MySQL database powering our Django site has developed some integrity problems; e.g. foreign keys that refer to nonexistent rows. I won't go into how we got into this mess, but I'm now looking at how to fix it.
Basically, I'm looking for a script that scans all models in the Django site, and checks whether all foreign keys and other constraints are correct. Hopefully, the number of problems will be small enough so they can be fixed by hand.
I could code this up myself but I'm hoping that somebody here has a better idea.
I found django-check-constraints but it doesn't quite fit the bill: right now, I don't need something to prevent these problems, but to find them so they can be fixed manually before taking other steps.
Other constraints:
Django 1.1.1 and upgrading has been determined to break things
MySQL 5.0.51 (Debian Lenny), currently with MyISAM tables
Python 2.5, might be upgradable but I'd rather not right now
(Later, we will convert to InnoDB for proper transaction support, and maybe foreign key constraints on the database level, to prevent similar problems in the future. But that's not the topic of this question.)
I whipped up something myself. The management script below should be saved in myapp/management/commands/checkdb.py. Make sure that intermediate directories have an __init__.py file.
Usage: ./manage.py checkdb for a full check; use --exclude app.Model or -e app.Model to exclude the model Model in the app app.
from django.core.management.base import BaseCommand, CommandError
from django.core.management.base import NoArgsCommand
from django.core.exceptions import ObjectDoesNotExist
from django.db import models
from optparse import make_option

from lib.progress import with_progress_meter

def model_name(model):
    return '%s.%s' % (model._meta.app_label, model._meta.object_name)

class Command(BaseCommand):
    args = '[-e|--exclude app_name.ModelName]'
    help = 'Checks constraints in the database and reports violations on stdout'

    option_list = NoArgsCommand.option_list + (
        make_option('-e', '--exclude', action='append', type='string', dest='exclude'),
    )

    def handle(self, *args, **options):
        # TODO once we're on Django 1.2, write to self.stdout and self.stderr instead of plain print
        exclude = options.get('exclude', None) or []

        failed_instance_count = 0
        failed_model_count = 0
        for app in models.get_apps():
            for model in models.get_models(app):
                if model_name(model) in exclude:
                    print 'Skipping model %s' % model_name(model)
                    continue
                fail_count = self.check_model(app, model)
                if fail_count > 0:
                    failed_model_count += 1
                    failed_instance_count += fail_count
        print 'Detected %d errors in %d models' % (failed_instance_count, failed_model_count)

    def check_model(self, app, model):
        meta = model._meta
        if meta.proxy:
            print 'WARNING: proxy models not currently supported; ignored'
            return

        # Define all the checks we can do; they return True if they are ok,
        # False if not (and print a message to stdout)
        def check_foreign_key(model, field):
            foreign_model = field.related.parent_model
            def check_instance(instance):
                try:
                    # name: name of the attribute containing the model instance (e.g. 'user')
                    # attname: name of the attribute containing the id (e.g. 'user_id')
                    getattr(instance, field.name)
                    return True
                except ObjectDoesNotExist:
                    print '%s with pk %s refers via field %s to nonexistent %s with pk %s' % \
                        (model_name(model), str(instance.pk), field.name, model_name(foreign_model), getattr(instance, field.attname))
            return check_instance

        # Make a list of checks to run on each model instance
        checks = []
        for field in meta.local_fields + meta.local_many_to_many + meta.virtual_fields:
            if isinstance(field, models.ForeignKey):
                checks.append(check_foreign_key(model, field))

        # Run all checks
        fail_count = 0
        if checks:
            for instance in with_progress_meter(model.objects.all(), model.objects.count(), 'Checking model %s ...' % model_name(model)):
                for check in checks:
                    if not check(instance):
                        fail_count += 1
        return fail_count
I'm making this a community wiki because I welcome any and all improvements to my code!
Thomas' answer is great but is now a bit out of date.
I have updated it as a gist to support Django 1.8+.
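For anyone adapting it by hand instead of using the gist, the main API changes for Django 1.8+ are roughly these (a sketch only; the per-model check logic stays as in Thomas' answer): the optparse-based option_list is replaced by add_arguments, and models.get_apps()/get_models(app) is replaced by the app registry.

from django.apps import apps
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = 'Checks constraints in the database and reports violations on stdout'

    def add_arguments(self, parser):
        parser.add_argument('-e', '--exclude', action='append', default=[],
                            help='app_label.ModelName to skip')

    def handle(self, *args, **options):
        exclude = options['exclude']
        for model in apps.get_models():
            label = '%s.%s' % (model._meta.app_label, model._meta.object_name)
            if label in exclude:
                self.stdout.write('Skipping model %s' % label)
                continue
            # ... run the same foreign-key checks per instance as in the original answer ...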