func.max get unexpected result? - sqlalchemy

I've got a table (story_id, votes)
with data
[(1,3), (2,4)]
when I try to do a query...
session.query(table.c.story_id, func.max(table.c.votes)).first()
i'll get:
(1,4)
the result I expect is:
(2,4)
where is the misunderstanding?

You are not grouping by anything so the database can simply return any row with a query like this. If you would add a group_by=table.c.story_id than this would return the proper result.
Most databases would, because of this, block these queries by default since the returned result would be arbitrary. In PostgreSQL for example you would get an error that story_id is not part of an aggregate query and not in the group by clause so it wouldn't know what row to return.
Either way, try this: session.query(table.c.story_id, func.max(table.c.votes)).group_by(table.c.story_id).first()

Related

Limit Results from mysql using if statement

I've got if statement code that works for the where clause of mysql query, but not for the limit statement. Could someone please help me to get this to work or a similar alternative? It gives me an expected result Boolean error.
I'm trying to use the if statement to find out if the $limit variable has any value except OFF. If it does, then limit the results of the mysql query by whatever number, 20 in this case. I'm not sure if that helps.
I honestly don't know if this the correct way of doing this.
$limit='20'; or whatever number
IF('$limit'<>'OFF',LIMIT $limit,'$limit'=0)
you could avoid if using a ternary operator
$limit='20'; or whatever number
$limit_result = ($limit<>'OFF' ? $limit : 0);

MySQL 5.7 RAND() and IF() without LIMIT leads to unexpected results

I have the following query
SELECT t.res, IF(t.res=0, "zero", "more than zero")
FROM (
SELECT table.*, IF (RAND()<=0.2,1, IF (RAND()<=0.4,2, IF (RAND()<=0.6,3,0))) AS res
FROM table LIMIT 20) t
which returns something like this:
That's exactly what you would expect. However, as soon as I remove the LIMIT 20 I receive highly unexpected results (there are more rows returned than 20, I cut it off to make it easier to read):
SELECT t.res, IF(t.res=0, "zero", "more than zero")
FROM (
SELECT table.*, IF (RAND()<=0.2,1, IF (RAND()<=0.4,2, IF (RAND()<=0.6,3,0))) AS res
FROM table) t
Side notes:
I'm using MySQL 5.7.18-15-log and this is a highly abstracted example (real query is much more difficult).
I'm trying to understand what is happening. I do not need answers that offer work arounds without any explanations why the original version is not working. Thank you.
Update:
Instead of using LIMIT, GROUP BY id also works in the first case.
Update 2:
As requested by zerkms, I added t.res = 0 and t.res + 1 to the second example
The problem is caused by a change introduced in MySQL 5.7 on how derived tables in (sub)queries are treated.
Basically, in order to optimize performance, some subqueries are executed at different times and / or multiple times leading to unexpected results when your subquery returns non-deterministic results (like in my case with RAND()).
There are two easy (and likewise ugly) workarounds to get MySQL to "materialize" (aka return deterministic results) these subqueries: Use LIMIT <high number> or GROUP BY id both of which force MySQL to materialize the subquery and return the expected results.
The last option is turn off derived_merge in the optimizer_switch variable: derived_merge=off (make sure to leave all the other parameters as they are).
Further readings:
https://mysqlserverteam.com/derived-tables-in-mysql-5-7/
Subquery's rand() column re-evaluated for every repeated selection in MySQL 5.7/8.0 vs MySQL 5.6

SUM(IF(COND,EXPR,NULL)) and IF(COND, SUM(EXPR),NULL)

I'm working of generating sql request by parsing Excel-like formulas.
So for a given formula, I get this request :
SELECT IF(COL1='Y', SUM(EXPR),NULL)
FROM Table
I don't get the results I want. If I manually rewrite the request like this it works :
SELECT SUM(IF(COL1='Y', EXPR, NULL))
FROM Table
Also, the first request produces the right value if I add a GROUP BY statement, for COL1='Y' row :
SELECT IF(COL1='Y', SUM(EXPR),NULL)
FROM Table
GROUP BY COL1
Is there a way to keep the first syntax IF(COND, SUM(EXPR), NULL) and slightly edit it to make it works without a GROUP BY statement ?
You have to use GROUP BY since you are using SUM - otherwise SQL engine is not able to tell how do you want to summarize the column.
Alternatively you could summarize this column only:
SELECT SUM(EXPR)
FROM Table
WHERE COL1='Y'
But then you would have to run separate query for each such column, read: not recommended for performance reasons.

Explain COUNT query with ActiveRecord

I want to do something like the following:
Post.count.explain # doesn't work
This fails because EXPLAIN is a method on Relation and Post.count isn't a relation. It's just a regular integer that is the result of the query. So how could a count query be EXPLAINed?
Here's a form that generates the exact same SQL query, but returns a Relation to call explain on:
Post.select('count(*)').explain
Both generate the SQL
SELECT COUNT(*) FROM `posts`
...so the query plan should be the same.
From ActiveRecord::Relation#explain, we can make that method accept a block.
module ExplainBlock
def explain_block(&block)
exec_explain(collecting_queries_for_explain { instance_exec(&block) })
end
end
ActiveRecord::Relation.include(ExplainBlock)
Then Post.all.explain_block { count } .
The COUNT shouldn't affect the query plan, since the only difference it does is to tell the database to fetch the row data, but the rows need to be found anyway with/without the COUNT.

Why is my query wrong?

before i use alias for table i get the error:
: Integrity constraint violation: 1052 Column 'id' in field list is ambiguous
Then i used aliases and i get this error:
unknown index a
I am trying to get a list of category name ( dependant to a translation) and the associated category id which is unique. Since i need to put them in a select, i see that i should use the lists.
$categorie= DB::table('cat as a')
->join('campo_cat as c','c.id_cat','=','a.id')
->join('campo as d','d.id','=','c.id_campo')
->join('cat_nome as nome','nome.id_cat','=','a.id')
->join('lingua','nome.id_lingua','=','lingua.id')
->where('lingua.lingua','=','it-IT')
->groupby('nome.nome')
->lists('nome.nome','a.id');
The best way to debug your query is to look at the raw query Laravel generates and trying to run this raw query in your favorite SQL tool (Navicat, MySQL cli tool...), so you can dump it to log using:
DB::listen(function($sql, $bindings, $time) {
Log::info($sql);
Log::info($bindings);
});
Doing that with yours I could see at least one problem:
->where('lingua.lingua','=','it-IT')
Must be changed to
->where('lingua.lingua','=',"'it-IT'")
As #jmail said, you didn't really describe the problem very well, just what you ended up doing to get around (part of) it. However, if I read your question right you're saying that originally you did it without all the aliases you got the 'ambiguous' error.
So let me explain that first: this would happen, because there are many parts of that query that use id rather than a qualified table`.`id.
if you think about it, without aliases you query looks a bit like this: SELECT * FROM `cat` JOIN `campo_cat` ON `id_cat` = `id` JOIN `campo` ON `id` = `id_campo`; and suddenly, MySQL doesn't know to which table all these id columns refer. So to get around that all you need to do is namespace your fields (i.e. use ... JOIN `campo` ON `campo`.`id` = `campo_cat`.`id_campo`...). In your case you've gone one step further and aliased your tables. This certianly makes the query a little simpler, though you don't need to actually do it.
So on to your next issue - this will be a Laravel error. And presumably happening because your key column from lists($valueColumn, $keyColumn) isn't found in the results. This is because you're referring to the cat.id column (okay in your aliased case a.id) in part of the code that's no longer in MySQL - the lists() method is actually run in PHP after Laravel gets the results from the database. As such, there's no such column called a.id. It's likely it'll be called id, but because you don't request it specifically, you may find that the ambiguous issue is back. My suggestion would be to select it specifically and alias the column. Try something like the below:
$categories = DB::table('cat as a')
->join('campo_cat as c','c.id_cat','=','a.id')
->join('campo as d','d.id','=','c.id_campo')
->join('cat_nome as nome','nome.id_cat','=','a.id')
->join('lingua','nome.id_lingua','=','lingua.id')
->where('lingua.lingua','=','it-IT')
->groupby('nome.nome')
->select('nome.nome as nome_nome','a.id as a_id') // here we alias `.id as a_id
->lists('nome_nome','a_id'); // here we refer to the actual columns
It may not work perfectly (I don't use ->select() so don't know whether you pass an array or multiple parameters, also you may need DB::raw() wrapping each one in order to do the aliasing) but hopefully you get my meaning and can get it working.