In MySQL, I have a table with a column full of positive integers and I want to filter out all the odd integers. I could not find anything about this in the MySQL documentation. I tried the following query.
select kapsule.owner_name,
kapsule.owner_domain,
count(xform_action)
from kapsule, rec_xform
where rec_xform.g_conf_id=kapsule.g_conf_id
and (count(xform_action))%2=0
group by kapsule.owner_name;
I want to keep only those values where count(xform_action) is even. The table looks like this.
To filter the result set after GROUP BY, you need to use a HAVING clause.
The WHERE clause filters source rows before GROUP BY is applied.
Try
SELECT k.owner_name,
k.owner_domain,
COUNT(x.xform_action) cnt -- < you probably meant to use SUM() instead of COUNT() here
FROM kapsule k JOIN rec_xform x -- < use JOIN notation for clarity
ON x.g_conf_id = k.g_conf_id
GROUP BY k.owner_name
HAVING cnt % 2 = 0
You probably meant to use the SUM() aggregate (which sums the values of a column across all rows in a group) instead of COUNT() (which returns the number of rows in a group).
Here is a SQLFiddle demo (for both SUM() and COUNT())
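As a rough illustration of the WHERE versus HAVING point, here is a minimal sketch that runs the grouped query from Python. It assumes SQLite (via the standard sqlite3 module) as a stand-in for MySQL, and the table contents are made up for the example; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical miniature versions of the kapsule and rec_xform tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE kapsule (g_conf_id INTEGER, owner_name TEXT, owner_domain TEXT);
    CREATE TABLE rec_xform (g_conf_id INTEGER, xform_action TEXT);
    INSERT INTO kapsule VALUES (1, 'alice', 'a.example'), (2, 'bob', 'b.example');
    INSERT INTO rec_xform VALUES (1, 'x'), (1, 'y'), (2, 'x'), (2, 'y'), (2, 'z');
""")

# The aggregate condition goes in HAVING, which runs after the groups are built.
even_counts = conn.execute("""
    SELECT k.owner_name, k.owner_domain, COUNT(x.xform_action) AS cnt
    FROM kapsule k
    JOIN rec_xform x ON x.g_conf_id = k.g_conf_id
    GROUP BY k.owner_name, k.owner_domain
    HAVING COUNT(x.xform_action) % 2 = 0
""").fetchall()

print(even_counts)
```

With this sample data, alice has two actions and bob has three, so only alice's group survives the HAVING filter.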
To filter on an aggregate function such as COUNT(*) in a query with GROUP BY, you need to use a HAVING clause:
select kapsule.owner_name, kapsule.owner_domain,
count(xform_action) from kapsule, rec_xform
where rec_xform.g_conf_id=kapsule.g_conf_id
group by kapsule.owner_name, kapsule.owner_domain
HAVING (count(xform_action))%2=0
Or you could use an alias (i.e. AS), like:
select kapsule.owner_name, kapsule.owner_domain,
count(xform_action) count_form from kapsule, rec_xform
where rec_xform.g_conf_id=kapsule.g_conf_id
group by kapsule.owner_name, kapsule.owner_domain
HAVING count_form%2=0
You could also use explicit JOIN syntax, which is clearer than the old comma-separated style of joining tables. Also note that
when you use GROUP BY, every selected column that is not inside an aggregate function should appear in the GROUP BY list, like:
select A.owner_name, A.owner_domain,
count(B.xform_action) count_form from kapsule A
INNER JOIN rec_xform B
ON A.g_conf_id=B.g_conf_id
GROUP BY A.owner_name, A.owner_domain
HAVING count_form%2=0
See examples here
Related
I have this somewhat complex SQL query that works OK without the final WHERE clause. I'm looking to filter some records using the column unreviewed_records, which is an alias.
The problem is that I get an error saying unreviewed_records cannot be found. I found some information saying that aliases are not permitted in WHERE clauses, and I'm not sure what the best way to fix this is. I considered using a computed column, but I'm not sure how that works yet and I'm hoping there's an easier fix to the query.
I also found that switching to a HAVING clause works for aliases, but I'll only resort to this if there's no better alternative, to avoid the performance hit.
Any pointers would be helpful :)
select
r_alias.serv_id, r_alias.node_id,
SUM(g_alias.total_records)- SUM(r_alias.reviewed_records) AS unreviewed_records,
SUM(r_alias.reviewed_records) AS reviewed_records,
SUM(g_alias.total_records) AS total_records
FROM (
SELECT prs.serv_id,
prs.node_id,
SUM(prs.reviewed_records) AS reviewed_records
FROM p_rev_server prs
WHERE
prs.area_id = 3
AND prs.subId = 3
AND prs.sId = 12
GROUP BY prs.serv_id, prs.node_id, prs.domain_name
) r_alias
INNER JOIN (SELECT
serv_id,
node_id,
SUM(pgs.total_records) AS total_records
FROM p_gen_serve pgs
WHERE pgs.area_id = 3
AND pgs.subId = 3
AND pgs.sId = 12
AND pgs.total_records > 0
GROUP BY pgs.serv_id, pgs.node_id, pgs.domain_name
) g_alias
ON g_alias.serv_id = r_alias.serv_id AND g_alias.node_id = r_alias.node_id
LEFT JOIN p_cust_columns cust_cols
ON cust_cols.node_id = r_alias.node_id AND cust_cols.serv_id = r_alias.serv_id
where (((NOT (unreviewed_records IS NULL)) AND (unreviewed_records = 5)))
group by r_alias.serv_id, r_alias.node_id
order by g_alias.node_id ASC
limit 25
The reason aliases are not allowed in a WHERE clause is that the expressions in the SELECT list are not evaluated until after the rows are filtered by the WHERE clause. So it's a chicken-and-egg problem.
The easiest and most common alternative is a derived table:
SELECT a, b, c
FROM (
SELECT a, b, a+b AS c
FROM mytable
WHERE b = 1234
) AS t
WHERE c = 42;
This example shows that you can put some filtering conditions inside the derived table subquery, so you can at least reduce the result set partially, before the result of the subquery is turned into a temporary table.
Then in the outer query, you can reference a column that was derived from an expression in the select-list of the subquery. In this example, it's the c column.
The CTE approach is basically the same: it creates a temporary table to store the result of the inner query (the CTE), and then you can apply conditions to that in the outer query.
WITH t AS (
SELECT a, b, a+b AS c
FROM mytable
WHERE b = 1234
)
SELECT a, b, c
FROM t
WHERE c = 42;
The CTE solution is not better than the derived-table approach, unless you need to reference the CTE multiple times in the outer query, i.e. doing a self-join.
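To make the equivalence concrete, here is a small sketch that runs both forms against the same data and compares the results. It assumes SQLite (through Python's sqlite3 module) rather than MySQL, and the rows in mytable are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (a INTEGER, b INTEGER);
    -- made-up rows: only the first one has b = 1234 and a + b = 42
    INSERT INTO mytable VALUES (-1192, 1234), (10, 1234), (-1192, 99);
""")

# Derived table: the alias c is defined in the inner query, filtered in the outer one.
derived = conn.execute("""
    SELECT a, b, c
    FROM (
        SELECT a, b, a + b AS c
        FROM mytable
        WHERE b = 1234
    ) AS t
    WHERE c = 42
""").fetchall()

# CTE: same inner query, just named up front.
cte = conn.execute("""
    WITH t AS (
        SELECT a, b, a + b AS c
        FROM mytable
        WHERE b = 1234
    )
    SELECT a, b, c
    FROM t
    WHERE c = 42
""").fetchall()

print(derived, cte)
```

Both queries return the same single row here, which matches the point that the choice between them is mostly about readability and reuse.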
Yeah, you are kind of SOL: WHERE can't know what an alias will be. So, frankly, a CTE (common table expression) is probably your best bet here. It should work, though not all RDBMSs really support them (MySQL, for example, only from version 8.0).
I'm trying to do a SELECT query on a table, followed by an inner join to link data from the owner to the cats.
The OWNERCAT table uses a foreign key on ID that links to the OWNERINFO ID.
USE CATTERY;
SELECT
OWNERINFO.ID, OWNERINFO.First_Name, OWNERINFO.Last_Name, OWNERINFO.Phone, OWNERINFO.AddrL1, OWNERINFO.AddrL2, OWNERINFO.AddrL3, OWNERINFO.PostCode,
GROUP_CONCAT(DISTINCT OWNERCAT.Chip_ID)
FROM OWNERINFO
INNER JOIN OWNERCAT ON OWNERINFO.ID = OWNERCAT.ID
WHERE ID = 1;
I get returned the following error:
Error Code: 1052. Column 'ID' in where clause is ambiguous 0.0014 sec
Removing the GROUP_CONCAT(DISTINCT ...) expression still produces the same error, and I'm not sure how to get around this issue.
You need to specify which table the ID in the WHERE clause comes from (you can use aliases). Secondly, as you are using GROUP_CONCAT, you should have a GROUP BY in the query:
SELECT
oi.ID,
oi.First_Name,
oi.Last_Name,
oi.Phone,
oi.AddrL1,
oi.AddrL2,
oi.AddrL3,
oi.PostCode,
GROUP_CONCAT(DISTINCT oc.Chip_ID)
FROM OWNERINFO oi
INNER JOIN OWNERCAT oc ON oc.ID=oi.ID
WHERE oi.ID = 1
GROUP BY oi.ID
The problem is in the WHERE clause: ID is ambiguous, because that column is available in both tables.
You may think that, since you are joining the tables on ID, the database is able to tell that it has the same value, but that's not actually the case.
So just qualify the column in the WHERE clause, i.e. change this:
WHERE ID = 1
To either:
WHERE OWNERINFO.ID = 1
Or the equivalent:
WHERE OWNERCAT.ID = 1
Also please note that your query uses GROUP_CONCAT(), which is an aggregate function. This implies that you need a GROUP BY clause that lists all non-aggregated columns (i.e. all columns other than the one inside GROUP_CONCAT()).
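The ambiguity and its fix can be reproduced in a few lines. This is only a sketch: it assumes SQLite (via Python's sqlite3) instead of MySQL, so the error text differs from MySQL's error 1052, and the tables are cut down to a couple of columns with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OWNERINFO (ID INTEGER PRIMARY KEY, First_Name TEXT);
    CREATE TABLE OWNERCAT (ID INTEGER, Chip_ID TEXT);
    INSERT INTO OWNERINFO VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO OWNERCAT VALUES (1, 'C100'), (1, 'C200'), (1, 'C100'), (2, 'C300');
""")

# Unqualified ID: both joined tables have an ID column, so the engine refuses to guess.
try:
    conn.execute("""
        SELECT OWNERINFO.ID
        FROM OWNERINFO
        INNER JOIN OWNERCAT ON OWNERINFO.ID = OWNERCAT.ID
        WHERE ID = 1
    """)
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualified column plus a GROUP BY covering every non-aggregated column.
row = conn.execute("""
    SELECT oi.ID, oi.First_Name, GROUP_CONCAT(DISTINCT oc.Chip_ID)
    FROM OWNERINFO oi
    INNER JOIN OWNERCAT oc ON oc.ID = oi.ID
    WHERE oi.ID = 1
    GROUP BY oi.ID, oi.First_Name
""").fetchone()

# The concatenation order is not guaranteed, so sort before comparing.
chips = sorted(row[2].split(","))
print(ambiguous_error, row[:2], chips)
```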
How can I optimize this SQL query?
CREATE TABLE table1 AS
SELECT * FROM temp
WHERE Birth_Place IN
(SELECT c.DES_COM
FROM tableCom AS c
WHERE c.COD_PROV IS NULL)
ORDER BY Cod, Birth_Date
I think that the problem is the IN clause.
First of all it's not quite valid SQL, since you are selecting and sorting by columns that are not part of the group. What you want to do is called "select top N in group", check out Select first row in each GROUP BY group?
Your query doesn't make sense, because you have SELECT * with GROUP BY. Ignoring that, I would recommend writing the query as:
SELECT t.*
FROM temp t
WHERE EXISTS (SELECT 1
FROM tableCom c
WHERE t.Birth_Place = c.DES_COM AND
c.COD_PROV IS NULL
)
ORDER BY Cod, Birth_Date;
For this, I recommend an index on tableCom(des_com, cod_prov). Your database might also be able to use an index on temp(cod, birth_date, birth_place).
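As a sanity check that the EXISTS rewrite returns the same rows as the original IN query, here is a small sketch. It assumes SQLite via Python's sqlite3 rather than the asker's database, the sample rows and the index name idx_tablecom are invented, and it says nothing about relative speed, which depends on the real engine and data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "temp" is quoted because TEMP is also a keyword in SQLite.
conn.executescript("""
    CREATE TABLE "temp" (Cod INTEGER, Birth_Date TEXT, Birth_Place TEXT);
    CREATE TABLE tableCom (DES_COM TEXT, COD_PROV TEXT);
    INSERT INTO "temp" VALUES
        (2, '1990-01-01', 'Roma'),
        (1, '1985-05-05', 'Atlantis'),
        (3, '1970-03-03', 'Atlantis');
    INSERT INTO tableCom VALUES ('Roma', 'RM'), ('Atlantis', NULL);
    -- hypothetical index name; the column list follows the recommendation above
    CREATE INDEX idx_tablecom ON tableCom (DES_COM, COD_PROV);
""")

in_rows = conn.execute("""
    SELECT * FROM "temp"
    WHERE Birth_Place IN (SELECT c.DES_COM FROM tableCom AS c WHERE c.COD_PROV IS NULL)
    ORDER BY Cod, Birth_Date
""").fetchall()

exists_rows = conn.execute("""
    SELECT t.*
    FROM "temp" t
    WHERE EXISTS (SELECT 1
                  FROM tableCom c
                  WHERE t.Birth_Place = c.DES_COM AND c.COD_PROV IS NULL)
    ORDER BY Cod, Birth_Date
""").fetchall()

print(in_rows == exists_rows, exists_rows)
```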
I find it really annoying to be not able to get the number of rows without having to use group by. I just need to get the "Total count" that my subquery returned.
Here is what my subquery looks like:
select sales_flat_order.increment_id, sales_flat_order.created_at, sales_flat_order.status, dispatch.dispatch_date,
DATEDIFF(TO_DATE(dispatch.dispatch_date), TO_DATE(sales_flat_order.created_at)) as delay
FROM
magentodb.sales_flat_order
LEFT OUTER JOIN erpdb.dispatch
ON
sales_flat_order.increment_id == dispatch.order_num
where
TO_DATE(created_at) >= DATE_SUB(current_date(),6)
AND
TO_DATE(created_at) <= DATE_SUB(current_date(), 3)
AND
sales_flat_order.status NOT IN ('canceled', 'exchange', 'rto', 'pending_auth', 'pending_payment' ,'partial_refund','refund', 'refund_cash', 'partial_refund_cash', 'holded')
)
AS TempFiltered
Now I add one extra WHERE clause in my outer query so that it returns fewer rows; let's call this number y.
I then need to take the percentage of x to y (i.e. the number of rows returned by the outer query relative to the subquery).
I do not want to repeat my subquery only to get the count of the rows. How do I get it?
This is what I have so far, but of course it is wrong. I cannot get the count of all my rows without excluding the selected columns or using them in GROUP BY. How do I resolve this?
SELECT tempfiltered.delay, count(*) as countOfOrders,(100*count(*))/tempfiltered.Total) over () as percentage
FROM
(
select count(*) as Total, sales_flat_order.increment_id, sales_flat_order.created_at, sales_flat_order.status, dispatch.dispatch_date,
DATEDIFF(TO_DATE(dispatch.dispatch_date), TO_DATE(sales_flat_order.created_at)) as delay
FROM
magentodb.sales_flat_order
LEFT OUTER JOIN erpdb.dispatch
ON
sales_flat_order.increment_id == dispatch.order_num
where
TO_DATE(created_at) >= DATE_SUB(current_date(),6)
AND
TO_DATE(created_at) <= DATE_SUB(current_date(), 3)
AND
sales_flat_order.status NOT IN ('canceled', 'exchange', 'rto', 'pending_auth', 'pending_payment' ,'partial_refund','refund', 'refund_cash', 'partial_refund_cash', 'holded')
)
AS TempFiltered
Where
DATEDIFF(TO_DATE(TempFiltered.dispatch_date), TO_DATE(TempFiltered.created_at)) > 1
GROUP BY tempfiltered.delay
ORDER BY tempfiltered.delay
You could change the subquery into a SELECT INTO query, and put the data in a temporary table, and use that in the main query, and separately just select count(*) of that temporary table. That should pretty much satisfy your requirement.
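A rough sketch of that temporary-table idea follows. It assumes SQLite via Python's sqlite3 (the original query looks like Hive, where the syntax would differ), and the orders table, its rows, and the name temp_filtered are all hypothetical stand-ins for the real filtered subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (increment_id INTEGER, delay INTEGER);
    INSERT INTO orders VALUES (1, 0), (2, 1), (3, 2), (4, 2), (5, 3);
    -- materialize the (normally expensive) subquery once
    CREATE TEMP TABLE temp_filtered AS
        SELECT increment_id, delay FROM orders;
""")

# Total row count of the materialized subquery, computed a single time.
total = conn.execute("SELECT COUNT(*) FROM temp_filtered").fetchone()[0]

# Grouped counts over the narrower filter, each expressed as a share of the total.
rows = conn.execute("""
    SELECT delay,
           COUNT(*) AS count_of_orders,
           100.0 * COUNT(*) / ? AS percentage
    FROM temp_filtered
    WHERE delay > 1
    GROUP BY delay
    ORDER BY delay
""", (total,)).fetchall()

print(total, rows)
```

The subquery runs once, and both the total and the grouped query read from the temporary table instead of repeating it.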
$query="SELECT a.pk_i_id,a.i_price,b.s_title,c.pk_i_id AS img_id,c.s_extension,d.s_city,d.s_city_area from zl_t_item a, zl_t_item_description b, zl_t_item_resource c, zl_t_item_location d where a.fk_i_category_id=$cat_id and a.pk_i_id=b.fk_i_item_id and a.pk_i_id=c.fk_i_item_id and a.pk_i_id=d.fk_i_item_id ORDER BY a.dt_pub_date DESC";
In the query above, I need to add DISTINCT before c.pk_i_id AS img_id.
It shows an error when I do it like below:
$query="SELECT a.pk_i_id,a.i_price,b.s_title,DISTINCT c.pk_i_id AS img_id,c.s_extension,d.s_city,d.s_city_area from zl_t_item a, zl_t_item_description b, zl_t_item_resource c, zl_t_item_location d where a.fk_i_category_id=$cat_id and a.pk_i_id=b.fk_i_item_id and a.pk_i_id=c.fk_i_item_id and a.pk_i_id=d.fk_i_item_id ORDER BY a.dt_pub_date DESC";
What is the problem with it?
This is an invalid use of the DISTINCT keyword. You can only apply it to the whole set of selected columns, not to one specific column while skipping the others.
DISTINCT must come right after SELECT and applies to the column or set of columns that follows; you cannot use DISTINCT between the columns.
SELECT DISTINCT c.pk_i_id AS img_id,
a.pk_i_id,a.i_price,b.s_title,c.s_extension,d.s_city,d.s_city_area
from zl_t_item a, zl_t_item_description b, zl_t_item_resource c,
zl_t_item_location d where a.fk_i_category_id=$cat_id
and a.pk_i_id=b.fk_i_item_id and a.pk_i_id=c.fk_i_item_id
and a.pk_i_id=d.fk_i_item_id ORDER BY a.dt_pub_date DESC
In general, using DISTINCT can hurt performance.
DISTINCT is actually a filter to remove duplicates.
So, while selecting multiple columns the DISTINCT clause should be applied to the complete set rather than a single column.
Hence you are seeing an error.
The query can be rewritten based on the requirements. If you want to filter out duplicates, you can either apply a row rank or use GROUP BY with a HAVING clause to achieve the intended results.
DISTINCT always works on all columns; you must put it directly after SELECT.
In MySQL there's an easy way to get only one row per img_id: add a GROUP BY img_id.
SELECT
a.pk_i_id
,a.i_price
,b.s_title
,c.pk_i_id AS img_id
,c.s_extension
,d.s_city
,d.s_city_area
from
zl_t_item a
,zl_t_item_description b
,zl_t_item_resource c
,zl_t_item_location d
where
a.fk_i_category_id = $cat_id
and a.pk_i_id = b.fk_i_item_id
and a.pk_i_id = c.fk_i_item_id
and a.pk_i_id = d.fk_i_item_id
GROUP BY img_id
ORDER BY
a.dt_pub_date DESC
Of course this is proprietary MySQL syntax which breaks all the rules of relational databases and will not work with most other RDBMSs.
You can have either SELECT DISTINCT <columns> or SELECT <columns> (which actually defaults to SELECT ALL <columns>). You can't apply DISTINCT to a specific column.
So, the:
SELECT a.pk_i_id ,a.i_price, b.s_title, DISTINCT c.pk_i_id ...
is invalid SQL.
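A short sketch of both behaviors: DISTINCT deduplicating whole rows, and DISTINCT in the middle of the select list being rejected. It assumes SQLite via Python's sqlite3 rather than MySQL, and item_resource is a hypothetical two-column table, not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_resource (item_id INTEGER, img_id INTEGER);
    INSERT INTO item_resource VALUES (1, 10), (1, 10), (1, 11), (2, 10);
""")

# DISTINCT compares the whole (item_id, img_id) pair, so img_id 10 can still
# appear twice as long as the item_id differs.
distinct_rows = conn.execute("""
    SELECT DISTINCT item_id, img_id
    FROM item_resource
    ORDER BY item_id, img_id
""").fetchall()

# DISTINCT between columns is a syntax error.
try:
    conn.execute("SELECT item_id, DISTINCT img_id FROM item_resource")
    mid_list_error = None
except sqlite3.OperationalError as exc:
    mid_list_error = str(exc)

print(distinct_rows, mid_list_error)
```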