MySQL Screwed Up Output - mysql

I'm see some very strange outputs from MySQL, and I don't know whether it's my console or my data that's causing this. Here are some screenshots:
Any ideas?
edit:
mysql> describe transformed_step_a1_sfdc_lead_history;
+-------------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+--------------+------+-----+---------+-------+
| old_value | varchar(255) | YES | | NULL | |
| new_value | varchar(255) | YES | | NULL | |
+-------------------+--------------+------+-----+---------+-------+
Max

To verify if there is any control characters, you can use -s option, see http://dev.mysql.com/doc/refman/5.5/en/mysql-command-options.html#option_mysql_raw

It's impossible to tell exactly what the problem is from your screenshots, but the text in your database contains control characters. The usual culprit is CR, which moves the cursor back to the beginning of the line and starts overwriting text already there.
If you have programmatic access to your database then you will be able to dump the values with control characters expressed as pintables so that you can see what is actually in there.

Related

mariadb select * returns empty set

when I run SELECT * FROM urlcheck
it returns 'EMPTY Set (0.0 sec)'
According SHOW TABLE STATUS LIKE 'urlcheck'
The table has 3 rows.
Table structure is:
+-------------+---------------+------+-----+---------+----------------+<br>
| Field | Type | Null | Key | Default | Extra |<br>
+-------------+---------------+------+-----+---------+----------------+<br>
| id | int(11) | NO | PRI | NULL | auto_increment |<br>
| coursegroup | varchar(20) | YES | | NULL | |<br>
| url | varchar(2588) | YES | | NULL | |<br>
+-------------+---------------+------+-----+---------+----------------+<br>
I start by selecting the database with USE db
any ideas why this happened. I know this is similiar to Mysql select always returns empty setMysql select always returns empty set but that was apparently a corrupted database. I have truncated this database and add new rows and I still get the same problem. The code that adds records FWIW is
cur.execute('insert into urlcheck (coursegroup, url) values("'+coursegroup+'","'+url+'");')
db.commit
cur.close
The problem was a syntax error in my code.
should have been:
db.commit()
cur.close()
lack of parentheses caused the code not to write. I leave this here even as it redounds to my own humiliation in the hopes it helps someone else.

Pandas to_sql discarding rows when appending to mysql table

I'm working with articles scraped from online newspapers with a mysql database and python. I want to use pandas to_sql method on a dataframe for appending recently scraped articles to a mysql table. It works pretty well, but im having some problems with the following:
Since the articles are automatically scraped from news sites, about 1% of them have issues (encoding, or texts are too long or stuff like that) and dont fit on the mysql table fields. Pandas to_sql method for some reason IGNORES these errors and discards the rows that do not fit. For example I have the following mysql table:
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| title | varchar(255) | YES | | NULL | |
| description | text | YES | | NULL | |
| content | text | YES | | NULL | |
| link | varchar(300) | YES | | NULL | |
+--------------+--------------+------+-----+---------+----------------+
And I also have a Dataframe that contains 15 rows and 4 columns (title, description, content, link).
If 1 of those rows has a title larger than 255 characters, it wont fit in the mysql table. I expected an error when doing df.to_sql('press', con=con, index=False, if_exists='append'), that way I know i have a problem to fix; but the actual result was that 14 ROWS where appended instead of 15.
This could work for me, but i need to know which row was discarded so i can flag it for later revision. Is it possible to tell pandas to let me know which indexes are ignored?
Thanks!

PuTTY outputs weird stuff when selecting in MySQL

I've encountered a strange problem when I was using PuTTY to query the following MySQL command: select * from gts_camera
The output seems extremely weird:
As you can see, putty outputs loads of "PuTTYPuTTYPuTTY..."
Maybe it's because of the table attribute set:
mysql> describe gts_kamera;
+---------+----------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+----------+------+-----+-------------------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| datum | datetime | YES | | CURRENT_TIMESTAMP | |
| picture | longblob | YES | | NULL | |
+---------+----------+------+-----+-------------------+----------------+
This table stores some big pictures and their date of creation.
(The weird ASCII-characters you can see on top of the picture is the content.)
Does anybody know why PuTTY outputs such strange stuff, and how to solve/clean this?
Cause I can't type any other commands afterwards. I have to reopen the session again.
Sincerely,
Michael.
The reason this happens is because of the contents of the file (as you have a column defined with longblob). It may have some characters that Putty will not understand, therefore it will break as it is happening with you.
There is a configuration that may help though.
You can also not select every column in that table (at least not the *blob ones) as:
select id, datum from gts_camera;
Or If you still want to do it use the MySql funtion HEX:
select id, datum, HEX(picture) as pic from gts_camera;

SQLAlchemy/MySQL binary blob is being utf-8 encoded?

I'm using SQLAlchemy and MySQL, with a files table to store files. That table is defined as follows:
mysql> show full columns in files;
+---------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+---------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+
| id | varchar(32) | utf8_general_ci | NO | PRI | NULL | | select,insert,update,references | |
| created | datetime | NULL | YES | | NULL | | select,insert,update,references | |
| updated | datetime | NULL | YES | | NULL | | select,insert,update,references | |
| content | mediumblob | NULL | YES | | NULL | | select,insert,update,references | |
| name | varchar(500) | utf8_general_ci | YES | | NULL | | select,insert,update,references | |
+---------+--------------+-----------------+------+-----+---------+-------+---------------------------------+---------+
The content column of type MEDIUMBLOB is where the files are stored. In SQLAlchemy that column is declared as:
__maxsize__ = 12582912 # 12MiB
content = Column(LargeBinary(length=__maxsize__))
I am not quite sure about the difference between SQLAlchemy's BINARY type and LargeBinary type. Or the difference between MySQL's VARBINARY type and BLOB type. And I am not quite sure if that matters here.
Question: Whenever I store an actual binary file in that table, i.e. a Python bytes or b'' object , then I get the following warning
.../python3.4/site-packages/sqlalchemy/engine/default.py:451: Warning: Invalid utf8 character string: 'BCB121'
cursor.execute(statement, parameters)
I don't want to just ignore the warning, but it seems that the files are in tact. How do I handle this warning gracefully, how can I fix its cause?
Side note: This question seems to be related, and it seems to be a MySQL bug that it tries to convert all incoming data to UTF-8 (this answer).
Turns out that this was a driver issue. Apparently the default MySQL driver stumbles with Py3 and utf8 support. Installing cymysql into the virtual Python environment resolved this problem and the warnings disappear.
The fix: Find out if MySQL connects through socket or port (see here), and then modify the connection string accordingly. In my case using a socket connection:
mysql+cymysql://user:pwd#localhost/database?unix_socket=/var/run/mysqld/mysqld.sock
Use the port argument otherwise.
Edit: While the above fixed the encoding issue, it gave rise to another one: blob size. Due to a bug in CyMySQL blobs larger than 8M fail to commit. Switching to PyMySQL fixed that problem, although it seems to have a similar issue with large blobs.
Not sure, but your problem might have the same roots as the one I had several years ago in python 2.7: https://stackoverflow.com/a/9535736/68998. In short, Mysql's interface does not let you be certain if you are working with a true binary string or a text in a binary collation (used because of a lack of case-sensitive utf8 collation). Therefore, a Mysql binding has the following options:
return all string fields as binary strings, and leave the decoding to you
decode only the fields that do not have a binary flag (so much fun when some of the fields are unicode and other are str)
have an option to force decoding to unicode for all string fields, even true binary
My guess is that in your case, the third option is somewhere enabled in the underlying Mysql binding. And the first suspect is your connection string (connection params).

viewing mysql blob with putty

I am saving a serialized object to a mysql database blob.
After inserting some test objects and then trying to view the table, i am presented with lots of garbage and "PuTTYPuTTY" several times.
I believe this has something to do with character encoding and the blob containing strange characters.
I am just wanting to check and see if this is going to cause problems with my database, or if this is just a problem with putty showing the data?
Description of the QuizTable:
+-------------+-------------+-------------------+------+-----+---------+----------------+---------------------------------+-------------------------------------------------------------------------------------------------------------------+
| Field | Type | Collation | Null | Key | Default | Extra | Privileges | Comment |
+-------------+-------------+-------------------+------+-----+---------+----------------+---------------------------------+-------------------------------------------------------------------------------------------------------------------+
| classId | varchar(20) | latin1_swedish_ci | NO | | NULL | | select,insert,update,references | FK related to the ClassTable. This way each Class in the ClassTable is associated with its quiz in the QuizTable. |
| quizId | int(11) | NULL | NO | PRI | NULL | auto_increment | select,insert,update,references | This is the quiz number associated with the quiz. |
| quizObject | blob | NULL | NO | | NULL | | select,insert,update,references | This is the actual quiz object. |
| quizEnabled | tinyint(1) | NULL | NO | | NULL | | select,insert,update,references | |
+-------------+-------------+-------------------+------+-----+---------+----------------+---------------------------------+-------------------------------------------------------------------------------------------------------------------+
What i see when i try to view the table contents:
select * from QuizTable;
questionTextq ~ xp sq ~ w
t q1a1t q1a2xt 1t q1sq ~ sq ~ w
t q2a1t q2a2t q2a3xt 2t q2xt test3 | 1 |
+-------------+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
3 rows in set (0.00 sec)
I believe you can use the hex function on blobs as well as strings. You can run a query like this.
Select HEX(quizObject) From QuizTable Where....
Putty is reacting to what it thinks are terminal control character strings in your output stream. These strings allow the remote host to change something about the local terminal without redrawing the entire screen, such as setting the title, positioning the cursor, clearing the screen, etc..
It just so happens that when trying to 'display' something encoded like this, that a lot of binary data ends up sending these characters.
You'll get this reaction catting binary files as well.
blob will completely ignore any character encoding settings you have. It's really intended for storing binary objects like images or zip files.
If this field will only contain text, I'd suggest using a text field.