I want to store a MySQL regular expression in a MySQL database field. Specifically, I want to store a word-boundary expression in the database. For example:
[[:<:]]my expression here[[:>:]]
If I put this value directly into the database (for example using Sequel Pro) the value is stored correctly.
The problem occurs when I store this value through Ruby on Rails:
my_instance.sql_expression = "[[:<:]]my expression here[[:>:]]"
my_instance.save
=> true
But the value that is actually stored in the database looks like this:
my_instance.sql_expression
=> "[[::]]"
It seems that Rails ignores everything between "<" and ">" in the string, including the signs themselves.
The project is in Ruby 1.8.7 and Rails 2.3.5.
This sounds like you're using something like xss_terminate to filter your models before saving them. I'd look in your model definition for something which has a before_save or other hook that might be intrusively doing this.
This is not standard Rails behavior.
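To illustrate how a tag-stripping filter could produce exactly the value reported in the question, here is a minimal Python sketch. The regex and the strip_tags name are assumptions made for illustration; this is not xss_terminate's actual implementation.

```python
import re

# Hypothetical, naive tag stripper: removes "<", everything up to the
# next ">", and the ">" itself. Real sanitizers are more elaborate.
TAG_RE = re.compile(r"<[^>]*>")

def strip_tags(text):
    return TAG_RE.sub("", text)

value = "[[:<:]]my expression here[[:>:]]"
stripped = strip_tags(value)
print(stripped)  # prints [[::]]
```

The filter treats "<:]]my expression here[[:>" as one tag, which is why only "[[::]]" survives.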
Related
I have a model with a FileField, and the table holds millions of records.
class MyModel(models.Model):
    media = models.FileField(upload_to=my_function, db_index=True)
    ...
The media values are stored in the database like this:
media/some/folder/filename.jpg
media/filename.jpg
media/2021091240-10328.JPG
media/aXay-123.jpeg
media/some/another/folder/202110-12-12.jpeg
and so on. I need to find the records that do not have a nested path such as /some/folder/ or /some/another/folder/, using the Django ORM __iregex lookup.
So I tried something like:
MyModel.objects.filter(media__iregex=r"^media/[0-9a-zA-Z\-][.][a-zA-Z]")
but it does not match, and I do not understand how to write a proper regex [mysql regexp].
How can I filter with a MySQL regexp through the Django ORM to get only the records that follow the pattern media/filename.extension?
Your regex has no quantifier, and thus will pick exactly one character for the [0-9a-zA-Z\-] character group.
You can simply filter out elements that contain at least two slashes with:
MyModel.objects.filter(media__iregex=r'^media/[^/]+$')
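As a quick check of both patterns against the sample values from the question, here is a small Python sketch. It uses Python's re module rather than MySQL's regex engine, on the assumption that these simple constructs (anchors, a negated character class, and +) behave the same way in both.

```python
import re

paths = [
    "media/some/folder/filename.jpg",
    "media/filename.jpg",
    "media/2021091240-10328.JPG",
    "media/aXay-123.jpeg",
    "media/some/another/folder/202110-12-12.jpeg",
]

# Pattern from the question: the character group has no quantifier,
# so it allows exactly one character before the dot.
original = re.compile(r"^media/[0-9a-zA-Z\-][.][a-zA-Z]", re.IGNORECASE)

# Pattern from the answer: one or more non-slash characters up to the end.
suggested = re.compile(r"^media/[^/]+$", re.IGNORECASE)

original_matches = [p for p in paths if original.search(p)]
suggested_matches = [p for p in paths if suggested.search(p)]

print(original_matches)
print(suggested_matches)
```

The original pattern matches none of the samples, while the suggested one keeps only the three paths with no nested folder.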
I'm not sure how far I'm going to get with this, but I'm going through a database removing certain bits and pieces in preparation for a conversion to different software.
I'm struggling with the image tags, which currently look like this on the site:
[img:<string>]<image url>[/img:<string>]
Those strings are stored in another field called bbcode_uid.
The query I'm running to make the changes so far is
UPDATE phpbb_posts SET post_text = REPLACE(post_text, '[img:]', '');
So my actual question: is there any way of pulling in each string from bbcode_uid inside that SQL query, so that I don't have to run the same command 10,000+ times, changing the unique string every time?
Alternatively, could I add something inside [img:] that also matches the next 8 characters, whatever they may be? That is the length of the string that is used.
Hoping to save time with this, otherwise I might have to think of another way of doing it.
As requested.
The text I wish to replace would be
[img:1nynnywx]http://i.imgur.com/Tgfrd3x.jpg[/img:1nynnywx]
I want to end up with just
http://i.imgur.com/Tgfrd3x.jpg
Just removing the code around the URL, however each post_text has a different string which is contained inside bbcode_uid.
Method 1
LIB_MYSQLUDF_PREG
If you want more regular expression power in your database, you can consider using LIB_MYSQLUDF_PREG. This is an open source library of MySQL user functions that imports the PCRE library. LIB_MYSQLUDF_PREG is delivered in source code form only. To use it, you'll need to be able to compile it and install it into your MySQL server. Installing this library does not change MySQL's built-in regex support in any way. It merely makes the following additional functions available:
PREG_CAPTURE extracts a regex match from a string.
PREG_POSITION returns the position at which a regular expression matches a string.
PREG_REPLACE performs a search-and-replace on a string.
PREG_RLIKE tests whether a regex matches a string.
All these functions take a regular expression as their first parameter. This regular expression must be formatted like a Perl regular expression operator. E.g. to test if regex matches the subject case insensitively, you'd use the MySQL code PREG_RLIKE('/regex/i', subject). This is similar to PHP's preg functions, which also require the extra // delimiters for regular expressions inside the PHP string.
See: github.com/hholzgra/mysql-udf-regexp
Method 2
Use a PHP program: fetch the records one by one and apply PHP's preg_replace to each.
See: www.php.net/preg_replace
Reference: http://www.online-ebooks.info/article/MySql_Regular_Expression_Replace.html
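The regex that such a script would apply can be sketched as follows. This is written in Python rather than PHP, the strip_img_tags name is made up for the example, and it assumes the uid is always exactly 8 characters, as the question states. Fetching the rows and writing them back is left out.

```python
import re

# Matches "[img:XXXXXXXX]" and "[/img:XXXXXXXX]", where XXXXXXXX is any
# 8 characters other than "]" (assumption: uids are always 8 long).
IMG_TAG_RE = re.compile(r"\[/?img:[^\]]{8}\]")

def strip_img_tags(post_text):
    """Remove the opening and closing img tags, keep everything else."""
    return IMG_TAG_RE.sub("", post_text)

post_text = "[img:1nynnywx]http://i.imgur.com/Tgfrd3x.jpg[/img:1nynnywx]"
cleaned = strip_img_tags(post_text)
print(cleaned)  # prints http://i.imgur.com/Tgfrd3x.jpg
```

Because only the tags are removed, any text around the image in the same post is left in place.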
You might be able to do this with substring_index().
The following will work on your example:
select substring_index(substring_index(post_text, '[/img:', 1), ']', -1)
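To see what the nested substring_index() calls do, here is a small Python emulation of the function applied to the example from the question. The Python substring_index helper is a stand-in written for this sketch and covers only the behaviour needed here.

```python
def substring_index(text, delim, count):
    """Rough emulation of MySQL's SUBSTRING_INDEX(text, delim, count)."""
    if count == 0:
        return ""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    # Negative count: everything to the right of the count-th delimiter
    # counted from the end.
    return delim.join(parts[count:])

post_text = "[img:1nynnywx]http://i.imgur.com/Tgfrd3x.jpg[/img:1nynnywx]"

# Inner call: keep everything before the closing tag.
before_close = substring_index(post_text, "[/img:", 1)
# Outer call: keep everything after the last "]", i.e. after the opening tag.
url = substring_index(before_close, "]", -1)
print(url)  # prints http://i.imgur.com/Tgfrd3x.jpg
```

Note that this keeps only the URL: any other text before or after the image in the same post would be dropped, so it fits posts that contain nothing but the image tag.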
Situation:
I have a User model. Its "meta_data" attribute is a "text" type field in the database.
In the model it is serialized by a custom class ( serialize :meta_data, CustomJsonSerializer.new ).
This means that when I have a user instance, I can work with meta_data as if it were a Hash.
User.first.meta_data['username']
Problem:
I need to write a search function that finds users by a given string. I can build the search query manually in Rails, e.g. User.where("email LIKE '%#{string}%'")...
But what about meta_data? Should I search this field with a LIKE statement too? If I do, it will reduce the relevance of the records found.
For example:
I have 2 users. One of them has the username "patrick", the other one "sergio".
The meta data in the database will look like this:
1) {username: patrick}
2) {username: sergio}
I want to find sergio, so I enter the search string "ser", but I get 2 results instead of one. The meta_data string "{uSERname: patrick}" also contains "ser", so an irrelevant record is returned.
Do you have any idea how to solve it?
That's really the problem with serialized data. In theory, the serialization could be an algorithm that is very unsearchable. It could do a Huffman encoding, or other compression, and store the serialization in binary. You are relying on the assumption that the serialization uses JSON and your string will still be findable as a sub-string in the serialization.
Then the problem you are having is another issue. Other data in the serialization can mess up your results.
In general, if you serialize data, you are making a choice to not be searchable.
So a solution would be to add an additional field that you populate in a way that you control. Have a values field and store a pipe (|) delimited value that you can search. So if the data is {firstname: "Patrick", lastname: "Stern"}, your meta_values field might be "Patrick|Stern".
Also, don't use the where method with a string that expands input values with #{}. That makes it vulnerable to SQL injection attacks. Instead use:
where("meta_values LIKE :pattern", pattern: "%#{string}%")
I know that may not look very different, but this way ActiveRecord sanitizes the value. If someone puts a quote character in string, ActiveRecord escapes it, so the input cannot break out of the string literal in the search condition.
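The two suggestions above, a separate searchable column and a bound parameter instead of string interpolation, can be sketched together. This uses Python with an in-memory SQLite database instead of Rails and MySQL; the users table layout and the search helper are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, meta_data TEXT, meta_values TEXT)"
)
conn.executemany(
    "INSERT INTO users (meta_data, meta_values) VALUES (?, ?)",
    [
        ('{"username": "patrick"}', "patrick"),
        ('{"username": "sergio"}', "sergio"),
    ],
)

def search(column, term):
    # The column name comes from a fixed whitelist; only the search term
    # is user input, and it is passed as a bound parameter.
    if column not in ("meta_data", "meta_values"):
        raise ValueError(column)
    sql = f"SELECT meta_values FROM users WHERE {column} LIKE ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (f"%{term}%",))]

raw_hits = search("meta_data", "ser")      # also hits the key "username"
clean_hits = search("meta_values", "ser")  # only hits stored values
hostile_hits = search("meta_values", "' OR 1=1 --")

print(raw_hits, clean_hits, hostile_hits)
```

Searching the serialized column returns both users, searching the values column returns only sergio, and the hostile input is treated as plain text and matches nothing.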
I have a couple escaped characters in user-entered fields that I can't figure out.
I know they are the "smart" single and double quotes, but I don't know how to search for them in mysql.
When output from Ruby, the characters look like \222, \223, \224, etc.
irb> "\222".length => 1
So - do you know how to search for these in mysql? When I look in mysql, they look like '?'.
I'd like to find all records that have this character in the text field. I tried
mysql> select id from table where field LIKE '%\222%'
but that did not work.
Some more information - after doing a mysqldump, this is how one of the characters is represented - '\\xE2\\x80\\x99'. It's the smart single quote.
Ultimately, I'm building an RTF file and the characters are coming out completely wrong, so I'm trying to replace them with 'dumb' quotes for now. I was able to do a gsub(/\222/, "'").
Thanks.
I don't quite understand your problem, but here is some info for you:
First, there are no escaped characters in the database: every character is stored as is, with no escaping.
They don't "look like ?"; that is just a wrong terminal setting. A SET NAMES query should always be executed first, so that the connection matches the client encoding.
You have to determine the character set and use it at every stage: in the database, in the mysql client, and in Ruby.
You should distinguish Ruby's representation of a string from the character itself.
To enter the character in a mysql query, you can use the CHAR function, but only in the terminal. In Ruby, just use the character itself.
Smart quotes are multi-byte in Unicode encodings; the E2 80 99 in your dump is the three-byte UTF-8 form of the right single quote. You have to determine your encoding first.
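The difference between a byte representation and the character itself can be shown with a short Python sketch. It assumes the single bytes \222, \223 and \224 seen in Ruby are Windows-1252 smart quotes; that is a guess based on the octal values and not something the question confirms.

```python
# Octal \222, \223, \224 are bytes 0x92, 0x93, 0x94.
raw = b"It\x92s \x93quoted\x94"

# Assumption: the bytes are Windows-1252, where these values are smart quotes.
text = raw.decode("cp1252")

# The same right single quote, encoded as UTF-8, is the three bytes
# that appeared in the mysqldump output.
utf8_bytes = "\u2019".encode("utf-8")

# Map the four common smart quotes to plain ASCII quotes.
SMART_TO_DUMB = {
    "\u2018": "'",
    "\u2019": "'",
    "\u201c": '"',
    "\u201d": '"',
}
dumb = text.translate(str.maketrans(SMART_TO_DUMB))

print(utf8_bytes)  # prints b'\xe2\x80\x99'
print(dumb)        # prints It's "quoted"
```

The same character therefore appears as one byte or as three, depending on the encoding in use at each stage.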
Borland StarTeam seems to store its data as UTF-8 encoded data in VarChar fields. I have an ASP.NET MVC site that returns some custom HTML reports using the StarTeam database, and I would like to find a better solution for getting the correct data, for a rewrite with MVC2.
I tried a few things with Encoding GetBytes and GetString, but I couldn't get it to work (we use mostly Delphi at work); then I figured out a T-SQL function to return a NVarChar from UTF-8 stored in a VarChar, and created new SQL views which return the data as NVarChar, but it's slow.
The actual problem appears like this: “description†instead of “description”, in both SSMS and in a webpage when using Linq2SQL
Is there a way to get the proper data out of these fields using Entity Framework or Linq2SQL?
Well, once you get the data out, you could always try this:
Encoding.UTF8.GetString(Encoding.Default.GetBytes(item.Description))
assuming the field is encoded in the system ANSI page. You might have to create the right encoding with Encoding.GetEncoding() if for some reason it isn't (looked up from DB type, for example).
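The round trip in that one-liner can be sketched in Python to show why it works. Latin-1 stands in for the system ANSI page here, because Python's cp1252 codec leaves a few byte values unmapped (including 0x9D, which occurs in the closing quote) and would raise an error on them. The variable names are made up for the example.

```python
original = "\u201cdescription\u201d"

# What is stored in the VarChar field: UTF-8 bytes.
stored_bytes = original.encode("utf-8")

# What the client sees when those bytes are read as a single-byte
# code page: every byte becomes one (wrong) character.
misread = stored_bytes.decode("latin-1")

# The repair: turn the wrong characters back into the original bytes,
# then decode those bytes as UTF-8.
repaired = misread.encode("latin-1").decode("utf-8")

print(len(original), len(misread))
print(repaired == original)
```

The repair is only lossless when the single-byte encoding maps every byte value, which is the reason for the caveat about choosing the right encoding.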