OperationalError (1366, "Incorrect string value: '\\xE2\\x80\\x8ESen...'") with MySQL on Django

I develop a Django website on cPanel with a MySQL database. I have a function that pulls feeds from this website https://travelcommunication.net/feed/ and creates objects from them (web scraping using beautifulsoup4).
But when I try to grab the content section, the error appears. It only happens with certain items, not all of them.
I tried locally (with an SQLite database) and everything works fine. I have also tried Heroku (with a PostgreSQL database) and everything works fine there too.
Here is my code:
# views.py
import os

import requests
from bs4 import BeautifulSoup
from django.core import files
from django.http import HttpResponse
from django.shortcuts import redirect

from .models import Autoblogging, Category, Post

def pull_feeds(request, pk):
    if request.user.is_superuser:
        source = Autoblogging.objects.get(pk=pk)
        url = requests.get(source.url)
        soup = BeautifulSoup(url.content, "html.parser")
        length = source.items
        items = soup.find_all('item')[:length]
        contents = soup.find_all('content:encoded')[:length]
        for i in range(length - 1, -1, -1):
            content = contents[i].text
            title = items[i].title.text
            body = content[content.find('<p>'):]  # the problem is here .. when I comment this, everything works fine
            category = Category.objects.get(pk=source.category.id)
            if not Post.objects.filter(title=title).exists():
                post = Post(title=title,
                            body=body,  # the problem is here .. when I comment this, everything works fine
                            category=category)
                link = content[content.find('src=') + 5:content.find('alt') - 2]
                img_data = requests.get(link).content
                with open('temp_image.jpg', 'wb') as handler:
                    handler.write(img_data)
                with open('temp_image.jpg', 'rb') as handler:
                    file_name = link.split("/")[-1]
                    post.cover.save(file_name, files.File(handler))
                os.remove("temp_image.jpg")
        return redirect("news:autoblogging")
    else:
        return HttpResponse("Sorry you are not allowed to access this page")
Does anyone know how to fix this error? Thanks.
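For context on the error: \xE2\x80\x8E is the UTF-8 encoding of U+200E (LEFT-TO-RIGHT MARK). MySQL raises error 1366 when the target column's character set cannot store a character (for example a latin1 column, or a legacy utf8/utf8mb3 column faced with 4-byte characters), so the usual fix is to convert the table or column to utf8mb4. As a stopgap, the offending invisible characters can be stripped before saving; a minimal sketch (the function name is mine, not from the question):

```python
import unicodedata

def strip_format_chars(text):
    # Drop Unicode "format" (Cf) characters such as U+200E LEFT-TO-RIGHT MARK,
    # which is the \xE2\x80\x8E byte sequence from the error message.
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
```

In the view above this would be applied to body (and possibly title) before constructing the Post. Note this is a workaround; fixing the column charset addresses the root cause.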

Related

Storing images in MySql using python

I have used a function named InsertBlob to insert images into MySQL by letting users input the file path, but the error I often get is "it must be of type list, tuple or dict". Is there a way to overcome the error, or must I rewrite the code entirely to get what I want?
def InsertBlob(FilePath):
    with open(FilePath, "rb") as File:
        BinaryData = File.read()
    SQLStatement = "INSERT INTO Images (Photo) VALUES (%s)".format(BinaryData)
    print(SQLStatement)
    MyCursor.execute(SQLStatement, BinaryData)

print("1. Insert Image\n2. Read Image")
MenuInput = input()
if int(MenuInput) == 1:
    UserFilePath = input("Enter File Path: ")
    InsertBlob(UserFilePath)
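The "must be of type list, tuple or dict" error comes from passing the raw bytes directly as the second argument: MySQL drivers such as mysql-connector-python expect the query parameters as a sequence or mapping, so the value must be wrapped in a one-element tuple. (The stray .format(BinaryData) call is also a no-op, since the SQL string contains no {} placeholders.) A sketch of the fix, using sqlite3 as a stand-in driver so it runs anywhere; with a MySQL driver the placeholder would be %s instead of ?:

```python
import sqlite3  # stand-in for the MySQL driver; table/column names taken from the question

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Images (Photo BLOB)")

def insert_blob(cursor, file_bytes):
    # The parameters argument must be a sequence (tuple/list) or dict,
    # so wrap the raw bytes in a one-element tuple.
    cursor.execute("INSERT INTO Images (Photo) VALUES (?)", (file_bytes,))

insert_blob(cur, b"\xff\xd8fake-jpeg-bytes")
```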

Migrating MySQL blob (image) to FileMaker container using PowerShell

In searching I've found a number of other people that have tried, but none that have been successful.
Here's the problem. I want to take a bunch of images I have stored on my MySQL server in blobs and move them into FileMaker containers.
The best lead I've got is the PutAs function. It looks something like PutAs('$Image', 'JPEG').
My particular application is as follows. $DataSet.Image1 is a JPEG file stored as "0xFFD8....". The data being in this format may well be the issue, but I don't know what I'd need to convert it to first.
$cmd.CommandText = "update Checklists set Image1 = PutAs('$($DataSet.Image1)', 'JPEG')"
$cmd.ExecuteNonQuery();
All I keep getting is a syntax error. I've tried the syntax many different ways, but I can't get it to work no matter what I do.
I'd very much like to see someone having success with this to post their example. Any other ideas or workarounds are welcome as well.
Edit:
Here is some extra info. Greg Lane at Skeleton Key gives this example, but I'm not sure how to translate it to PowerShell.
import java.sql.*;
import java.io.*;
def url = "jdbc:filemaker://localhost/fmserver_sample";
def driver = "com.filemaker.jdbc.Driver";
def user = "admin";
def password = "";
System.setProperty("jdbc.drivers", driver);
connection = DriverManager.getConnection (url, user, password);
filename = "/Users/Greg/Pictures/vacation/DSC_0202.jpg";
file = new File (filename);
inputstream = new FileInputStream (filename);
sql = "INSERT INTO english_nature (ID, img) VALUES (-1, PutAs(?, 'JPEG'))";
pstatement = connection.prepareStatement ( sql );
pstatement.setBinaryStream (1, inputstream, (int)file.length ());
pstatement.execute ();
//cleanup
pstatement = null;
inputstream = null;
file = null;
connection.close();
I figured it out. For anyone in the future here is how you do it.
$cmd.CommandText = "update Checklists set Image1 = PutAs(?, 'JPEG') where serial = '$($DataSet.serial)' AND ChecklistNumber = 1"
$cmd.Parameters.Add('?', $DataSet.Image1)
$cmd.Prepare()
$cmd.ExecuteNonQuery();

Iterating through multiline input, and match to database items

I need help iterating through input to a webapp I'm writing. The users will be inputting several hundred (or thousands) of URLs pasted from Excel documents, each on a new line. Thus far I've created the input page, an output page, and written the code to query the database.
from flask import Flask, render_template, request
from flask_sqlalchemy import SQLAlchemy
from urllib.parse import urlparse
from sqlalchemy.ext.declarative import declarative_base

app = Flask(__name__)
app.config["DEBUG"] = True
app.config["SECRET_KEY"] = "secret_key_here"
db = SQLAlchemy(app)
SQLALCHEMY_DATABASE_URI = db.create_engine(connector_string_here)
app.config[SQLALCHEMY_DATABASE_URI] = SQLALCHEMY_DATABASE_URI
app.config["SQLALCHEMY_POOL_RECYCLE"] = 299
app.config["SQLALCHEMY_TRACK_MODIFICATIONS"] = False
db.Model = declarative_base()

class Scrapers(db.Model):
    __tablename__ = "Scrapers"
    id = db.Column(db.Integer, primary_key=True)
    scraper_dom = db.Column(db.String(255))
    scraper_id = db.Column(db.String(128))

db.Model.metadata.create_all(SQLALCHEMY_DATABASE_URI)
Session = db.sessionmaker()
Session.configure(bind=SQLALCHEMY_DATABASE_URI)
session = Session()
scrapers = session.query(Scrapers.scraper_dom, Scrapers.scraper_id).all()

@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "GET":
        return render_template("url_page.html")
    else:
        return render_template("url_page.html")

@app.route("/submit", methods=["GET", "POST"])
def submit():
    sites = [request.form["urls"]]
    for site in sites:
        que = urlparse(site).netloc
    return render_template("submit.html", que=que)
    # scrapers.filter(Scrapers.scraper_dom.in_(
    # next(x.scraper_id for x in scrapers if x.matches(self.fnetloc))
As is apparent, this is incomplete. I've omitted previous attempts at matching the input, as I realized I had issues iterating through it. At first I could only get it to print all of the input instead of iterating over it; now it just repeats the urlparse(site).netloc of the first line of input some random number of times. The parsing itself is correct and returns the value I will need later (for each urlparse(site).netloc, match scraper_dom and return the associated scraper_id). I have also tried using input(), but kept getting errors about [request.form["urls"]] not being iterable.
Please help, it'd be much appreciated.
The output is correct with:
que = [urlparse(site).netloc for site in request.form["urls"].split('\n')]
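Building on that list comprehension, the netloc of each pasted line can then be looked up against the Scrapers rows. A minimal sketch with a plain dict standing in for the database table (the mapping and function name are hypothetical, not from the question):

```python
from urllib.parse import urlparse

# Hypothetical mapping standing in for the Scrapers rows (scraper_dom -> scraper_id)
scrapers = {"example.com": "scraper-1", "news.example.org": "scraper-2"}

def match_scrapers(raw_urls, scraper_map):
    # Split the pasted textarea content line by line, parse each netloc,
    # and look up the associated scraper_id (None if unknown).
    results = []
    for line in raw_urls.splitlines():
        line = line.strip()
        if not line:
            continue
        netloc = urlparse(line).netloc
        results.append((netloc, scraper_map.get(netloc)))
    return results
```

In the Flask view, raw_urls would be request.form["urls"]; splitlines() also copes with trailing newlines and \r\n line endings from pasted Excel data.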

Pyramid Unit Test Sending Parameter

I have a Pyramid web-application that I am trying to unit-test.
In my tests file I have this snippet of code:
anyparam = {"isApple": "True"}

@parameterized.expand([
    ("ParamA", anyparam, 'success')])
def test_(self, name, params, expected):
    request = testing.DummyRequest(params=params)
    request.session['AI'] = ''
    response = dothejob(request)
    self.assertEqual(response['status'], expected,
                     "expected response['status']={0} but response={1}".format(expected, response))
Whereas in my views:
@view_config(route_name='function', renderer='json')
def dothejob(request):
    params = json.loads(request.body)
    value = params.get('isApple')  # true or false
However, when I'm trying to unit-test it, I am getting this error:
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
When I make the same request via POST from a web browser, however, it works perfectly fine.
By doing testing.DummyRequest(params=params) you are only populating request.params, not request.body.
You probably want to do something like:
request = testing.DummyRequest(json_body=params)
Also, you may want to use directly request.json_body in your code instead of json.loads(request.body).
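Putting the answer together, the flow can be sketched with a minimal stand-in that mirrors how pyramid.testing.DummyRequest turns keyword arguments into request attributes (the stand-in class and the view body here are illustrative, not the asker's real code):

```python
# Minimal stand-in mirroring pyramid.testing.DummyRequest, which stores
# keyword arguments as attributes on the request object.
class DummyRequest:
    def __init__(self, **kw):
        self.__dict__.update(kw)

def dothejob(request):
    # Reading request.json_body works both in tests (set via the kwarg)
    # and in real Pyramid requests, unlike json.loads(request.body).
    params = request.json_body
    return {"status": "success" if params.get("isApple") == "True" else "failure"}

request = DummyRequest(json_body={"isApple": "True"})
response = dothejob(request)
```

With the real pyramid.testing.DummyRequest the call is the same: testing.DummyRequest(json_body=params) populates request.json_body, which the view then reads directly.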

Replace many parts of a string python3

Via the modules urllib and re I am attempting to scrape a web page for its text content. I'm following a guide provided by "SentDex" on YouTube, found here ( https://www.youtube.com/watch?v=GEshegZzt3M ), and the documentation on the official Python site to cobble together a quick solution. The information that comes back has plenty of HTML markup and special characters that I am trying to remove. My end result is successful, but I feel that it is a hard-coded solution and is only useful for this one scenario.
The code is as follows :
import re
import urllib.parse
import urllib.request

from bs4 import BeautifulSoup

url = "http://someUrl.com/dir/doc.html"  # target URL
values = {'s': 'basics',
          'submit': 'search'}  # form fields to send with the request
data = urllib.parse.urlencode(values)  # encode the fields as a query string
data = data.encode('utf-8')  # urlopen needs bytes, not str
req = urllib.request.Request(url, data)  # build a POST request with that payload
resp = urllib.request.urlopen(req)  # fetch the document
respData = resp.read()  # read the content into a variable

# BS4 method
soup = BeautifulSoup(respData, 'html.parser')
text = soup.find_all("p")
# end BS4

# re method
text = re.findall(r"<p>(.*?)</p>", str(respData))  # get all paragraph tag contents
text = str(text)  # convert the list to a string
# end re

conds = ["<b>", "</b>", "<i>", "</i>", "\\", "[", "]", "\'"]  # things to remove from text
for case in conds:  # for each of those things
    text = text.replace(case, "")  # remove it (replace with nothing)
Is there a more effective way to strip all markup from a string than explicitly listing each tag and character to remove?
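One general alternative, using only the standard library, is to parse the markup properly and keep just the text nodes instead of enumerating tags to delete. A sketch with html.parser (class and function names are mine):

```python
from html.parser import HTMLParser

class TagStripper(HTMLParser):
    # Collects only the text nodes, discarding every tag it encounters.
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def strip_markup(markup):
    # Feed the document through the parser and join the collected text.
    stripper = TagStripper()
    stripper.feed(markup)
    return "".join(stripper.chunks)
```

Since the code above already builds a BeautifulSoup object, soup.get_text() achieves the same result in one call; either approach removes any tag rather than only the ones listed in conds.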