Hive subquery in rlike clause - mysql

I want to remove robot entries from a log file. One way to identify crawlers is by the user agent field in the weblog. I've stored the raw logs in one folder and the tokens of popular crawlers in a crawler table. To remove the log entries whose user agent matches a token, I wrote this query:
CREATE TABLE temp
AS
SELECT host,time,method,url,protocol,status,size,referer,agent
FROM raw_logs
WHERE
agent NOT RLIKE (SELECT concat_ws("|",collect_set(concat("(.*",token,".*)"))) FROM crawler) ;
It gives me a ParseException: cannot recognize input near 'SELECT' 'concat_ws' '(' in expression specification
If I replace the subquery with its result manually, it works perfectly.
CREATE TABLE temp
AS
SELECT host,time,method,url,protocol,status,size,referer,agent
FROM raw_logs
WHERE agent NOT RLIKE '(.*Googlebot.*)|(.*bingbot.*)' ;
So is a subquery in an RLIKE clause not supported by Hive 1.0.1?
A similar query in MySQL works perfectly.
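Since the hard-coded pattern works, one possible workaround is to build the pattern in a separate step and substitute it into the statement before sending it to Hive. The Python sketch below illustrates that idea under stated assumptions: `build_crawler_pattern` and `build_ctas` are hypothetical helper names, the token list stands in for the contents of the crawler table, and tokens are assumed to contain only letters, digits, `_` and `-`, so no regex or SQL escaping is needed.

```python
import re

# Assumption: crawler tokens are plain identifiers, so they can be embedded
# in a regex and in a SQL string literal without escaping.
SAFE_TOKEN = re.compile(r"[A-Za-z0-9_-]+")


def build_crawler_pattern(tokens):
    """Join crawler tokens into one alternation, mirroring the concat_ws call."""
    for token in tokens:
        if not SAFE_TOKEN.fullmatch(token):
            raise ValueError("unsupported token: %r" % (token,))
    return "|".join("(.*{}.*)".format(token) for token in tokens)


def build_ctas(tokens):
    """Return the CREATE TABLE ... AS statement with the pattern inlined."""
    pattern = build_crawler_pattern(tokens)
    return (
        "CREATE TABLE temp AS\n"
        "SELECT host,time,method,url,protocol,status,size,referer,agent\n"
        "FROM raw_logs\n"
        "WHERE agent NOT RLIKE '{}'".format(pattern)
    )


# In practice the tokens would first be read from the crawler table.
pattern = build_crawler_pattern(["Googlebot", "bingbot"])
query = build_ctas(["Googlebot", "bingbot"])
print(query)
```

The generated statement is the same as the hand-written version that already works; only the step that produces the pattern moves out of the query.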

Rails - How to reference model's own column value during update statement?

Is it possible to achieve something like this?
Suppose name and plural_name are fields of Animal's table.
Suppose pluralise_animal is a helper function which takes a string and returns its plural literal.
I cannot loop over the animal records for technical reasons.
This is just an example
Animal.update_all("plural_name = ?", pluralise_animal("I WANT THE ANIMAL NAME HERE, the `name` column's value"))
I want something similar to how you can use functions in MySQL while modifying column values. Is this out-of-scope or possible?
UPDATE animals SET plural_name = CONCAT(name, 's') -- just an example to explain what I mean by referencing a column. I'm aware of the problems in this example.
Thanks in advance
"I cannot loop over the animal records for technical reasons."
Sorry, this cannot be done with this restriction.
If your pluralizing helper function is implemented in the client, then you have to fetch data values back to the client, pluralize them, and then post them back to the database.
If you want the UPDATE to run against a set of rows without fetching data values back to the client, then you must implement the pluralization logic in an SQL expression, or a stored function or something.
UPDATE statements run in the database engine. They cannot call functions in the client.
Use a Ruby script to generate a SQL script that inserts the plural values into a temp table:
File.open(filename, 'w') do |file|
  file.puts "CREATE TEMPORARY TABLE pluralised_animals(id INT, plural VARCHAR(50));"
  file.puts "INSERT INTO pluralised_animals(id, plural) VALUES"
  Animal.find_each do |animal|
    # Quote the string value so the generated SQL is valid
    plural = ActiveRecord::Base.connection.quote(pluralise_animal(animal.name))
    file.puts "(#{animal.id}, #{plural}),"
  end
end
Note: replace the trailing comma (,) on the last row with a semicolon (;).
Then run the generated SQL script in the database to populate the temp table.
Finally run a SQL update statement in the database that joins the temp table to the main table...
UPDATE animals a
INNER JOIN pluralised_animals pa
ON a.id = pa.id
SET a.plural_name = pa.plural;
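The same three steps (pluralise in the client, load a temp table, run one set-based UPDATE) can be sketched end to end in Python. This is only an illustration: it uses an in-memory SQLite database as a stand-in for MySQL, `pluralise_animal` is a hypothetical placeholder for the helper from the question, and the UPDATE uses a correlated subquery because SQLite does not accept MySQL's UPDATE ... JOIN syntax.

```python
import sqlite3


def pluralise_animal(name):
    # Hypothetical stand-in for the client-side helper from the question.
    return name + "s"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE animals (id INTEGER PRIMARY KEY, name TEXT, plural_name TEXT)"
)
conn.executemany(
    "INSERT INTO animals (id, name) VALUES (?, ?)", [(1, "cat"), (2, "dog")]
)

# Step 1: fetch the names and pluralise them in the client.
rows = conn.execute("SELECT id, name FROM animals").fetchall()
plurals = [(animal_id, pluralise_animal(name)) for animal_id, name in rows]

# Step 2: load the results into a temp table (parameters handle quoting).
conn.execute(
    "CREATE TEMP TABLE pluralised_animals (id INTEGER PRIMARY KEY, plural TEXT)"
)
conn.executemany(
    "INSERT INTO pluralised_animals (id, plural) VALUES (?, ?)", plurals
)

# Step 3: one UPDATE that copies the plurals across inside the database.
conn.execute(
    """
    UPDATE animals
    SET plural_name = (
        SELECT plural FROM pluralised_animals pa WHERE pa.id = animals.id
    )
    WHERE id IN (SELECT id FROM pluralised_animals)
    """
)

result = conn.execute(
    "SELECT name, plural_name FROM animals ORDER BY id"
).fetchall()
print(result)
```

Passing the values as parameters avoids the manual quoting and trailing-comma cleanup that a generated SQL script needs.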

Is it possible to execute an UPDATE inside a WHERE clause?

I'm learning about SQL injection, and I built a web application (PHP + MySQL 5.6) without any protection against SQL injection.
In brief, my web application uses
SELECT * FROM XXX.USER WHERE user_name='${USERNAME}' AND password='${PASSWORD}'
to handle login (if that SQL returns exactly 1 row, the login succeeds).
At first, I found that entering the USERNAME Sayakiss' -- turns my SQL into:
SELECT * FROM XXX.USER WHERE user_name='Sayakiss' -- ' AND password='${PASSWORD}'
That way, an attacker can log in as Sayakiss without a password.
Then I found something more interesting (a SELECT can appear inside the IF function): the attacker enters the USERNAME as
Sayakiss' and if((select ascii(mid(z,p,1)) from x.y limit n,1)=c,1,0) --
This checks whether the ASCII code of the character at position p of column z, in the n-th row of table x.y, equals c.
If the login succeeds, the attacker knows that the ASCII code of that character equals c.
So an attacker can read everything in my database by enumeration!
Now I wonder how (if it's possible) to execute an UPDATE query that writes to the database in a similar way.
I believe so: an attacker can run an update, but will probably need the names of the table and fields to do it correctly.
If the database API allows several statements in one call (stacked queries), I think the query would be something like
'Sayakiss'; UPDATE table_name SET field1=new-value1, field2=new-value2
WHERE user_name='Sayakiss'; --
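The usual defence against both the login bypass and any injected write is to pass user input as bound parameters instead of splicing it into the SQL text. The sketch below contrasts the two approaches; it is an assumption-laden illustration that uses Python's sqlite3 module and an in-memory table rather than the PHP + MySQL setup from the question, `login_unsafe` and `login_safe` are hypothetical names, and the plaintext password only mirrors the question's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (user_name TEXT, password TEXT)")
conn.execute("INSERT INTO user VALUES ('Sayakiss', 'secret')")


def login_unsafe(user_name, password):
    # Vulnerable: the input becomes part of the SQL text itself.
    sql = "SELECT * FROM user WHERE user_name='%s' AND password='%s'" % (
        user_name,
        password,
    )
    return len(conn.execute(sql).fetchall()) == 1


def login_safe(user_name, password):
    # Bound parameters are always treated as data, never as SQL.
    sql = "SELECT * FROM user WHERE user_name=? AND password=?"
    return len(conn.execute(sql, (user_name, password)).fetchall()) == 1


payload = "Sayakiss' --"
print(login_unsafe(payload, "wrong"))  # the comment marker hides the password check
print(login_safe(payload, "wrong"))    # the payload is compared as a literal name
```

With bound parameters the payload is just an unusual user name, so neither the comment trick nor an appended statement changes the query that runs.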

Django mysql count distinct gives different result to postgres

I'm trying to count distinct string values for a filtered set of results in a Django query against a MySQL database, compared with the same data in a Postgres database. However, I'm getting really confusing results.
In the code below, NewOrder represents queries against the data in the Postgres database, and OldOrder is the same data in a MySQL instance.
( In the old database, completed orders had status=1, in the new DB complete status = 'Complete'. In both the 'email' field is the same )
OldOrder.objects.filter(status=1).count()
6751
NewOrder.objects.filter(status='Complete').count()
6751
OldOrder.objects.filter(status=1).values('email').distinct().count()
3747
NewOrder.objects.filter(status='Complete').values('email').distinct().count()
3825
print NewOrder.objects.filter(status='Complete').values('email').distinct().query
SELECT DISTINCT "order_order"."email" FROM "order_order" WHERE "order_order"."status" = Complete
print OldOrder.objects.filter(status=1).values('email').distinct().query
SELECT DISTINCT "order_order"."email" FROM "order_order" WHERE "order_order"."status" = 1
And here is where it gets really bizarre
new_orders = NewOrder.objects.filter(status='Complete').values_list('email', flat=True)
len(set(new_orders))
3825
old_orders = OldOrder.objects.filter(status=1).values_list('email',flat=True)
len(set(old_orders))
3825
Can anyone explain this discrepancy? And possibly point me as to why results would be different between postgres and mysql? My only guess is a character encoding issue, but I'd expect the results of the python set() to also be different?
Sounds like you're probably using a case-insensitive collation in MySQL. There's no equivalent in PostgreSQL; the closest is the citext data type, but usually you just compare lower(...) of strings, or use ILIKE for pattern matching.
I don't know how to say it in Django, but I'd see if the count of the set of distinct lowercased email addresses is the same as the old DB.
According to the Django docs, something like this might work (with Lower imported from django.db.models.functions):
NewOrder.objects.filter(status='Complete').annotate(email_lower=Lower('email')).values('email_lower').distinct().count()
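The effect described in this answer can be reproduced without a database. The following sketch uses made-up addresses (not data from the question) to show why a case-sensitive count, like Python's set() or PostgreSQL's DISTINCT, can come out higher than a case-insensitive one. Using lower() here is only an approximation of what a MySQL case-insensitive collation does.

```python
# Made-up sample data: two addresses appear twice with different casing.
emails = [
    "Ann@example.com",
    "ann@example.com",
    "bob@example.com",
    "BOB@example.com",
    "carol@example.com",
]

# Case-sensitive distinct count: every spelling is its own value.
case_sensitive = len(set(emails))

# Case-insensitive distinct count: spellings that differ only by case collapse.
case_insensitive = len({email.lower() for email in emails})

print(case_sensitive, case_insensitive)
```

If the lowercased count on the new database matches the old MySQL count, the collation is the likely cause of the discrepancy.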

Rails & MySQL: SELECT Statement Single vs. Double Quotes

I have a CRON job which executes a SELECT statement to grab records. When the SELECT runs on my dev machine, it produces the following statement:
SELECT `users`.* FROM `users` WHERE `users`.`id` = 87 LIMIT 1
This is successful.
When the SELECT runs on my production (hosted) machine it produces the statement with double quotes:
SELECT "users".* FROM "users" WHERE "users"."id" = 87 LIMIT 1
This is not successful and I get a MySQL 1064 error,
#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '.* FROM "users" WHERE "users
The code is the same on both machines, but my dev MySQL is version 5.5.33, whereas production is 5.1.67 (I don't have control over this to set/update it)
Is there a way to force single quotes or another preferred method to handle this situation?
Thanks for your time and assistance.
--EDIT--
Here are the main code snippets that are invoked via my CRON job:
/lib/tasks/reports.rake
namespace :report do
  desc "Send Daily Report"
  task :daily => :environment do
    User.where(:report_daily => 1).find_each do |user|
      ReportsMailer.send_report(user, 'daily').deliver
    end
  end
end
/app/mailers/reports_mailer.rb
def send_report(user, date_increment)
  @user = user
  @date_increment = date_increment
  get_times(user)
  mail :to => user.email, :subject => "Report: #{@dates}"
end
--EDIT2--
So it looks like I need to use backticks (`) for this to work. How do I force my app or MySQL to use these instead of double quotes (")?
I don't know why it does this, but I do know that if you're referencing column names in MySQL, you need to use ``, whereas values / data should be wrapped in "", like this:
SELECT `users`.* FROM `users` WHERE `users`.`id` = "87" LIMIT 1
I learnt this the hard way back in the day, when I was learning how to write simple MySQL queries.
Here's some documentation from MySQL's site for you:
The identifier quote character is the backtick (“`”):
mysql> SELECT * FROM `select` WHERE `select`.id > 100;
Identifier quote characters can be included within an identifier if
you quote the identifier. If the character to be included within the
identifier is the same as that used to quote the identifier itself,
then you need to double the character. The following statement creates
a table named a`b that contains a column named c"d:
mysql> CREATE TABLE `a``b` (`c"d` INT);
Is there any reason you couldn't put some of your sql statement directly into your code like:
User.where("`report_daily`=1").find_each do |user|
After further inspection, and after working with my hosting company, it turns out that my query is timing out on their server. Thanks to all who responded.
Since you are not using any literals, the format of the generated SQL statements should be determined by the underlying adapter. Perhaps you have a different mysql adapter installed or configured on each machine. Check the installed version. For example:
bundle show mysql
and also check the adapter configuration for your project in database.yml. For example:
adapter: mysql
A comparison of the results of these checks between each machine should tell you if you are using different adapters on the two machines.

SQL returns no rows via myODB, but in SQL there are rows

I have an SQL table called "tbl_einheit". phpMyAdmin shows more than 14,000 rows in the table. When I access it from a web page, the result set is empty (EOF).
I minimized the SQL statement and deleted all WHERE and ORDER BY elements, so that simply
SELECT * FROM tbl_einheit
is the statement. But it still returns an empty result set. I also tried
SELECT E . * FROM tbl_einheit E, ( SELECT @a := NULL ) AS init LIMIT 0,30
but also empty.
Any suggestions?
The reason is that you have some data type in your MySQL database that the ADODB connector in ASP can't recognize, so ASP thinks it is at EOF.
Use CAST in MySQL to convert the data type to something ASP can understand, for example:
SELECT CAST(SUM(Entry_Data_1) as UNSIGNED) as score FROM contests_entries
Put a trace in your code to make sure you are executing the code you think you are.
Double-check your connection string.