Alternatives to using "having" clause for alias fields - mysql

I have a somewhat complex SQL query that works fine without the final where clause. I want to filter records on the column unreviewed_records, which is an alias.
The problem is that I get an error saying unreviewed_records cannot be found. I read that aliases cannot be used in where clauses, and I'm not sure of the best way to fix this. I considered using a computed column, but I'm not sure how that works yet, and I'm hoping there's an easier fix to the query.
I also found that switching to a "having" clause works for aliases, but I'll only resort to that if there's no better alternative, to avoid the performance hit.
Any pointers would be helpful :)
select
r_alias.serv_id, r_alias.node_id,
SUM(g_alias.total_records)- SUM(r_alias.reviewed_records) AS unreviewed_records,
SUM(r_alias.reviewed_records) AS reviewed_records,
SUM(g_alias.total_records) AS total_records
FROM (
SELECT prs.serv_id,
prs.node_id,
SUM(prs.reviewed_records) AS reviewed_records
FROM p_rev_server prs
WHERE
prs.area_id = 3
AND prs.subId = 3
AND prs.sId = 12
GROUP BY prs.serv_id, prs.node_id, prs.domain_name
) r_alias
INNER JOIN (SELECT
serv_id,
node_id,
SUM(pgs.total_records) AS total_records
FROM p_gen_serve pgs
WHERE pgs.area_id = 3
AND pgs.subId = 3
AND pgs.sId = 12
AND pgs.total_records > 0
GROUP BY pgs.serv_id, pgs.node_id, pgs.domain_name
) g_alias
ON g_alias.serv_id = r_alias.serv_id AND g_alias.node_id = r_alias.node_id
LEFT JOIN p_cust_columns cust_cols
ON cust_cols.node_id = r_alias.node_id AND cust_cols.serv_id = r_alias.serv_id
where (((NOT (unreviewed_records IS NULL)) AND (unreviewed_records = 5)))
group by r_alias.serv_id, r_alias.node_id
order by g_alias.node_id ASC
limit 25

The reason aliases are not allowed in a WHERE clause is that the expressions in the SELECT list are not evaluated until after the rows are filtered by the WHERE clause. So it's a chicken-and-egg problem.
The easiest and most common alternative is a derived table:
SELECT a, b, c
FROM (
SELECT a, b, a+b AS c
FROM mytable
WHERE b = 1234
) AS t
WHERE c = 42;
This example shows that you can put some filtering conditions inside the derived table subquery, so you can at least reduce the result set partially, before the result of the subquery is turned into a temporary table.
Then in the outer query, you can reference a column that was derived from an expression in the select-list of the subquery. In this example, it's the c column.
The CTE approach is basically the same: it creates a temporary table to store the result of the inner query (the CTE), and then you can apply conditions to that in the outer query.
WITH t AS (
SELECT a, b, a+b AS c
FROM mytable
WHERE b = 1234
)
SELECT a, b, c
FROM t
WHERE c = 42;
The CTE solution is not better than the derived-table approach, unless you need to reference the CTE multiple times in the outer query, for example in a self-join.
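As a rough illustration of both approaches applied to an aggregate alias like the one in the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server; the records table and its sample rows are made up for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (node_id INTEGER, total_records INTEGER, reviewed_records INTEGER);
    INSERT INTO records VALUES (1, 10, 5), (1, 4, 4), (2, 7, 7), (3, 8, 3);
""")

# Derived table: the alias is computed in the inner query and filtered in the outer one.
derived_rows = conn.execute("""
    SELECT node_id, unreviewed_records
    FROM (
        SELECT node_id,
               SUM(total_records) - SUM(reviewed_records) AS unreviewed_records
        FROM records
        GROUP BY node_id
    ) AS t
    WHERE unreviewed_records = 5
    ORDER BY node_id
""").fetchall()

# CTE: the same inner query, named up front.
cte_rows = conn.execute("""
    WITH t AS (
        SELECT node_id,
               SUM(total_records) - SUM(reviewed_records) AS unreviewed_records
        FROM records
        GROUP BY node_id
    )
    SELECT node_id, unreviewed_records
    FROM t
    WHERE unreviewed_records = 5
    ORDER BY node_id
""").fetchall()

print(derived_rows)
print(cte_rows)
```

Both queries return the same rows here; which one performs better on a real MySQL server depends on the version and the data, so it is worth checking with EXPLAIN.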

Yeah, you are kind of SOL: WHERE can't know what an alias will be. So, frankly, a CTE (common table expression) is probably your best bet here. It should work, though not every RDBMS supports them (MySQL, for example, only from version 8.0).

Related

I understand what not in does in sql but I'm confused what it does when there are two things

Does it compare buy_ccy to sell_ccy in the second select? Then the sell_ccy to the buy_ccy? Does the order of the second select statement matter? Thank you in advance! :)
select
customer,order_no,buy_ccy,sell_ccy
from
fxbook f1
where
(buy_ccy,sell_ccy) not in (select
sell_ccy,buy_ccy
from
fxbook
where
f1.customer <> customer)
A multi-column IN() comparison compares all the columns pairwise, in the order you list them. This is a good example of why you should qualify column references with table aliases:
where (f1.buy_ccy , f1.sell_ccy)
not in (select fxbook.sell_ccy, fxbook.buy_ccy from fxbook ...)
So f1.buy_ccy is compared to fxbook.sell_ccy
and f1.sell_ccy is compared to fxbook.buy_ccy.
Think of NOT as simply the inverse: a match becomes false, and no match becomes true. The rows for which the condition is true are the ones the query returns.
I would caution you against using not in with a subquery. If any value returned by the subquery is null, the query can end up returning no rows at all.
For this reason, I strongly recommend not exists with subqueries:
select f.*
from fxbook f
where not exists (select 1
from fxbook f2
where f2.sell_ccy = f.buy_ccy and -- columns swapped, as in the original not in
f2.buy_ccy = f.sell_ccy and
f2.customer <> f.customer
);
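To make the NULL pitfall concrete, here is a small sketch using Python's built-in sqlite3 module rather than MySQL. The wanted and excluded tables are hypothetical, and a single column is used to keep it short, but the three-valued logic is the same for multi-column comparisons.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wanted (ccy TEXT);
    CREATE TABLE excluded (ccy TEXT);
    INSERT INTO wanted VALUES ('EUR'), ('USD'), ('GBP');
    INSERT INTO excluded VALUES ('EUR'), (NULL);
""")

# NOT IN: the NULL in the subquery makes the test "unknown" for USD and GBP,
# so no row passes the filter.
not_in_rows = conn.execute("""
    SELECT w.ccy FROM wanted w
    WHERE w.ccy NOT IN (SELECT e.ccy FROM excluded e)
    ORDER BY w.ccy
""").fetchall()

# NOT EXISTS: the NULL row simply never matches, so it is ignored.
not_exists_rows = conn.execute("""
    SELECT w.ccy FROM wanted w
    WHERE NOT EXISTS (SELECT 1 FROM excluded e WHERE e.ccy = w.ccy)
    ORDER BY w.ccy
""").fetchall()

print(not_in_rows)
print(not_exists_rows)
```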

Set a list in a variable in subquery - MYSQL

My problem is the following: I want to store a list of IDs in a variable, then use this variable in a subquery. The problem is that Workbench (my GUI) returns the following error: "subquery returning multiple rows". But multiple rows are exactly what I want.
Please explain where I am going wrong.
This is my query:
set @listID := (select ID_VOIE as ID from voies
where ORIGINE = 'XXX'
group by CODE_INSEE, CODE_VOIE
having count(*) > 1);
select substring(v.CODE_INSEE,1,2), count(*) from voies v
where v.ID_VOIE in (@listID)
group by substring(v.CODE_INSEE,1,2);
The thing is, I'm stuck on the "group by": I want to do a group by after a first group by, which is why I can't (or at least I didn't find a way to) write the query with a single WHERE clause.
I know I could put the whole query directly in my subquery instead of using a variable, but:
A variable would let me reuse this trick in other queries that need the same behaviour (DRY concept!)
I'm not sure, but I think the subquery would be executed on each iteration of my loop, which would be very inefficient
So I'm looking for one of two things: a way to use a list stored in a variable in a subquery, OR a way to use "group by" twice in a single query.
Thank you in advance for your answers (and sorry for my English, it is not my native language).
Unless you need that variable for something else, you should be able to skip it entirely as follows:
SELECT
SUBSTRING(v.CODE_INSEE,1,2),
COUNT(*)
FROM
voies v
WHERE
v.ID_VOIE in
(SELECT
ID_VOIE as ID
FROM
voies
WHERE
ORIGINE = 'XXX'
GROUP BY
CODE_INSEE,
CODE_VOIE
HAVING COUNT(*) > 1)
GROUP BY
SUBSTRING(v.CODE_INSEE,1,2);
As you say, the subquery will be executed for all rows. To avoid that, a variable would be best, but MySQL doesn't support table variables. Instead, you can use a temporary table:
DROP TEMPORARY TABLE IF EXISTS myTempTable;
CREATE TEMPORARY TABLE myTempTable (ID_VOIE int); -- I don't know the datatype
INSERT INTO myTempTable (ID_VOIE)
SELECT DISTINCT -- using distinct so I can join instead of use IN.
ID_VOIE as ID from voies
WHERE
ORIGINE = 'XXX'
GROUP BY
CODE_INSEE, CODE_VOIE
HAVING COUNT(*) > 1;
And now you can do this:
SELECT
SUBSTRING(v.CODE_INSEE,1,2), COUNT(*)
FROM
voies v
JOIN myTempTable tt ON
v.ID_VOIE = tt.ID_VOIE
GROUP BY SUBSTRING(v.CODE_INSEE,1,2);
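For the "group by twice in a single query" part of the question, one possible shape is to join against the first grouping as a derived table. The sketch below runs on Python's built-in sqlite3 module instead of MySQL, with made-up sample rows. It also uses MIN(ID_VOIE) to pick one ID per duplicated group, which is an assumption on my part: the original query selects ID_VOIE without aggregating it, so which ID it returns per group is not well defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE voies (ID_VOIE INTEGER PRIMARY KEY, CODE_INSEE TEXT,
                        CODE_VOIE TEXT, ORIGINE TEXT);
    INSERT INTO voies VALUES
        (1, '75001', 'A', 'XXX'), (2, '75001', 'A', 'XXX'),
        (3, '75002', 'B', 'XXX'),
        (4, '13001', 'C', 'XXX'), (5, '13001', 'C', 'XXX'),
        (6, '13002', 'D', 'YYY'), (7, '13002', 'D', 'YYY'),
        (8, '75003', 'E', 'XXX'), (9, '75003', 'E', 'XXX');
""")

# Inner GROUP BY finds duplicated (CODE_INSEE, CODE_VOIE) pairs;
# outer GROUP BY counts them per two-character prefix of CODE_INSEE.
rows = conn.execute("""
    SELECT substr(v.CODE_INSEE, 1, 2) AS dept, COUNT(*) AS duplicated_groups
    FROM voies v
    JOIN (
        SELECT MIN(ID_VOIE) AS ID_VOIE   -- assumption: one representative ID per group
        FROM voies
        WHERE ORIGINE = 'XXX'
        GROUP BY CODE_INSEE, CODE_VOIE
        HAVING COUNT(*) > 1
    ) AS dup ON dup.ID_VOIE = v.ID_VOIE
    GROUP BY substr(v.CODE_INSEE, 1, 2)
    ORDER BY dept
""").fetchall()

print(rows)
```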

Is it possible in sqlalchemy to indicate a table to select from and not select any column from it?

Using sqlalchemy I would like to do something like:
q = session.query(a, b.id, func.count(a.id))
q = q.outerjoin(b, b.id == a.b_id)
q = q.group_by(b.id)
However, in most SQL implementations it is impossible to select fields that are not in the group by clause.
Can I tell sqlalchemy to select from table a without selecting any field directly from a? In this case I could just change the join order, but I've got some complex queries that aren't so easy to modify.
You can set the FROM clause explicitly with select_from:
session.query(b.id, func.count(a.id)).select_from(a).outerjoin(b, ...)...

DISTINCT with as clause

$query="SELECT a.pk_i_id,a.i_price,b.s_title,c.pk_i_id AS img_id,c.s_extension,d.s_city,d.s_city_area from zl_t_item a, zl_t_item_description b, zl_t_item_resource c, zl_t_item_location d where a.fk_i_category_id=$cat_id and a.pk_i_id=b.fk_i_item_id and a.pk_i_id=c.fk_i_item_id and a.pk_i_id=d.fk_i_item_id ORDER BY a.dt_pub_date DESC";
In the query above, I need to add DISTINCT before c.pk_i_id AS img_id.
It shows an error when I do it like below:
$query="SELECT a.pk_i_id,a.i_price,b.s_title,DISTINCT c.pk_i_id AS img_id,c.s_extension,d.s_city,d.s_city_area from zl_t_item a, zl_t_item_description b, zl_t_item_resource c, zl_t_item_location d where a.fk_i_category_id=$cat_id and a.pk_i_id=b.fk_i_item_id and a.pk_i_id=c.fk_i_item_id and a.pk_i_id=d.fk_i_item_id ORDER BY a.dt_pub_date DESC";
What is the problem with it?
That is an invalid use of the DISTINCT keyword. It applies to the whole set of selected columns; you cannot apply it to one specific column and skip the others.
DISTINCT must come right after SELECT, before the column or set of columns. You cannot use DISTINCT between the columns:
SELECT DISTINCT c.pk_i_id AS img_id,
a.pk_i_id,a.i_price,b.s_title,c.s_extension,d.s_city,d.s_city_area
from zl_t_item a, zl_t_item_description b, zl_t_item_resource c,
zl_t_item_location d where a.fk_i_category_id=$cat_id
and a.pk_i_id=b.fk_i_item_id and a.pk_i_id=c.fk_i_item_id
and a.pk_i_id=d.fk_i_item_id ORDER BY a.dt_pub_date DESC
In general, using DISTINCT is a performance killer.
DISTINCT is actually a filter that removes duplicate rows.
So, when selecting multiple columns, the DISTINCT clause applies to the complete set of columns rather than to a single one.
Hence you are seeing an error.
The query can be rewritten based on your requirements. If you want to filter out duplicates, you can either apply a row rank, or use group by with a having clause, to achieve the intended result.
DISTINCT always works on all columns, so you must put it directly after SELECT.
In MySQL there's an easy way to get only one row per img_id: add a GROUP BY img_id
SELECT
a.pk_i_id
,a.i_price
,b.s_title
,c.pk_i_id AS img_id
,c.s_extension
,d.s_city
,d.s_city_area
from
zl_t_item a
,zl_t_item_description b
,zl_t_item_resource c
,zl_t_item_location d
where
a.fk_i_category_id = $cat_id
and a.pk_i_id = b.fk_i_item_id
and a.pk_i_id = c.fk_i_item_id
and a.pk_i_id = d.fk_i_item_id
GROUP BY img_id
ORDER BY
a.dt_pub_date DESC
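Of course, this is proprietary MySQL syntax that breaks the rules of relational databases and will not work with most other RDBMSs. MySQL itself rejects it when the ONLY_FULL_GROUP_BY SQL mode is enabled, which is the default since MySQL 5.7.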
You can have either SELECT DISTINCT <columns> or SELECT <columns> (which actually defaults to SELECT ALL <columns>). You can't apply DISTINCT to a specific column.
So, the:
SELECT a.pk_i_id ,a.i_price, b.s_title, DISTINCT c.pk_i_id ...
is invalid SQL.
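A short sketch of the difference, run on Python's built-in sqlite3 module rather than MySQL and with a hypothetical resources table: DISTINCT deduplicates whole rows, so the same img_id can still appear more than once, while GROUP BY with an explicit aggregate on the other column gives one row per img_id in a portable way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resources (img_id INTEGER, s_extension TEXT);
    INSERT INTO resources VALUES (1, 'jpg'), (1, 'jpg'), (1, 'png'), (2, 'jpg');
""")

# DISTINCT applies to the (img_id, s_extension) pair as a whole.
distinct_rows = conn.execute("""
    SELECT DISTINCT img_id, s_extension
    FROM resources
    ORDER BY img_id, s_extension
""").fetchall()

# One row per img_id: group by it and aggregate every other selected column.
one_per_img = conn.execute("""
    SELECT img_id, MIN(s_extension) AS s_extension
    FROM resources
    GROUP BY img_id
    ORDER BY img_id
""").fetchall()

print(distinct_rows)
print(one_per_img)
```

Using MIN() here is only one way to choose a value per group; which aggregate makes sense depends on which row you actually want to keep.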

Finding even values in a table MySQL

In MySQL, I have a table with a column full of positive integers and I want to filter out all the odd integers. I could not find anything about this in the MySQL documentation. I tried the following query.
select kapsule.owner_name,
kapsule.owner_domain,
count(xform_action)
from kapsule, rec_xform
where rec_xform.g_conf_id=kapsule.g_conf_id
and (count(xform_action))%2=0
group by kapsule.owner_name;
I want to keep only those values where count(xform_action) is even. The table looks like this.
To filter the result set after GROUP BY, you need to use the HAVING clause.
The WHERE clause filters source rows before GROUP BY occurs.
Try
SELECT k.owner_name,
k.owner_domain,
COUNT(x.xform_action) cnt -- < you probably meant to use SUM() instead of COUNT() here
FROM kapsule k JOIN rec_xform x -- < use JOIN notation for clarity
ON x.g_conf_id = k.g_conf_id
GROUP BY k.owner_name
HAVING cnt % 2 = 0
You probably meant to use the SUM() aggregate (which sums the values of a column over all rows in a group) instead of COUNT() (which returns the number of rows in a group).
Here is a SQLFiddle demo (for both SUM() and COUNT())
To filter on aggregate functions like COUNT(*) together with GROUP BY, you need to use the HAVING clause:
select kapsule.owner_name, kapsule.owner_domain,
count(xform_action) from kapsule, rec_xform
where rec_xform.g_conf_id=kapsule.g_conf_id
group by kapsule.owner_name, kapsule.owner_domain
HAVING (count(xform_action))%2=0
or you could use an alias (i.e. AS) like:
select kapsule.owner_name, kapsule.owner_domain,
count(xform_action) count_form from kapsule, rec_xform
where rec_xform.g_conf_id=kapsule.g_conf_id
group by kapsule.owner_name, kapsule.owner_domain
HAVING count_form%2=0
You could also use JOIN, which is clearer than the old comma style of joining tables. And by the way,
if you have a GROUP BY, the non-aggregated fields in the select list should also be in the GROUP BY, like:
select A.owner_name, A.owner_domain,
count(B.xform_action) count_form from kapsule A
INNER JOIN rec_xform B
ON A.g_conf_id=B.g_conf_id
GROUP BY A.owner_name, A.owner_domain
HAVING count_form%2=0
See examples here
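For a quick check of the HAVING approach without a MySQL server, here is a minimal sketch on Python's built-in sqlite3 module. The table and column names follow the question, but the sample rows are invented, and the aggregate is repeated in HAVING instead of using the alias to keep the query portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE kapsule (g_conf_id INTEGER, owner_name TEXT);
    CREATE TABLE rec_xform (g_conf_id INTEGER, xform_action TEXT);
    INSERT INTO kapsule VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO rec_xform VALUES
        (1, 'a'), (1, 'b'),
        (2, 'a'), (2, 'b'), (2, 'c'),
        (3, 'a'), (3, 'b'), (3, 'c'), (3, 'd');
""")

# WHERE cannot see COUNT(); HAVING filters the groups after aggregation.
rows = conn.execute("""
    SELECT k.owner_name, COUNT(x.xform_action) AS cnt
    FROM kapsule k
    JOIN rec_xform x ON x.g_conf_id = k.g_conf_id
    GROUP BY k.owner_name
    HAVING COUNT(x.xform_action) % 2 = 0
    ORDER BY k.owner_name
""").fetchall()

print(rows)  # only owners with an even number of actions remain
```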