Is there a way to NULL the values that are known to be the same in the rest of the dataset?
mysql> SELECT
-> `p1`.`id`,
-> `p1`.`name`,
-> `pp1`.`name` `product_property_name`,
-> `pp1`.`value` `product_property_name`
-> FROM
-> `product` `p1`
-> INNER JOIN
-> `product_property` `pp1`
-> ON
-> `p1`.`id` = `pp1`.`product_id`;
+----+------+-----------------------+-----------------------+
| id | name | product_property_name | product_property_name |
+----+------+-----------------------+-----------------------+
| 1 | Tar | foo | bar |
| 1 | Tar | foo1 | bar1 |
| 1 | Tar | foo2 | bar2 |
| 2 | Qaz | too | doo |
| 2 | Qaz | too1 | doo1 |
+----+------+-----------------------+-----------------------+
5 rows in set (0.00 sec)
In this case product is returned more than once because of INNER JOIN with product_property. I only need the first row of every product to group the results.
Therefore, the desired output:
+----+------+-----------------------+-----------------------+
| id | name | product_property_name | product_property_name |
+----+------+-----------------------+-----------------------+
| 1 | Tar | foo | bar |
| 1 | NULL | foo1 | bar1 |
| 1 | NULL | foo2 | bar2 |
| 2 | Qaz | too | doo |
| 2 | NULL | too1 | doo1 |
+----+------+-----------------------+-----------------------+
This would dramatically cut memory utilisation, especially when grouping large datasets.
You can do what you want using variables:
SELECT `p1`.`id`,
(case when `p1`.`name` <> @prevname then `p1`.`name` end) as name,
`pp1`.`name` as `product_property_name`,
`pp1`.`value` as `product_property_value`,
@prevname := `p1`.`name`
FROM `product` `p1` INNER JOIN
`product_property` `pp1`
ON `p1`.`id` = `pp1`.`product_id` cross join
(select @prevname := '') const;
Although this works in practice, it is not guaranteed to work, because MySQL does not guarantee the order in which the columns of a select statement are evaluated (so in theory the @prevname := `p1`.`name` assignment could happen first).
However, you might consider using group_concat() instead, and dispensing with the multiple rows per property:
SELECT `p1`.`id`,
`p1`.`name`,
group_concat(pp1.name separator ', ') as product_property_names,
group_concat(pp1.value separator ', ') as product_property_values
FROM `product` `p1` INNER JOIN
`product_property` `pp1`
ON `p1`.`id` = `pp1`.`product_id`
group by p1.id, p1.name;
Or, as a single pair:
SELECT `p1`.`id`,
`p1`.`name`,
group_concat(concat(pp1.name, '=', pp1.value) separator '; ') as product_property_pairs
FROM `product` `p1` INNER JOIN
`product_property` `pp1`
ON `p1`.`id` = `pp1`.`product_id`
group by p1.id, p1.name;
The latter two should reduce the space more than the first solution.
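If you are on MySQL 8.0 or later (an assumption, since the question does not state the version), a window function sidesteps the evaluation-order caveat of the variable approach. This is only a sketch: it assumes that "first row" means the row with the lowest property name within each product, which the question does not specify.

```sql
-- Sketch, assuming MySQL 8.0+ (window functions).
-- The product name is kept only on the first row of each product
-- (first = lowest property name, an assumed ordering); other rows get NULL.
SELECT p1.id,
       CASE
           WHEN ROW_NUMBER() OVER (PARTITION BY p1.id ORDER BY pp1.name) = 1
           THEN p1.name
       END       AS name,
       pp1.name  AS product_property_name,
       pp1.value AS product_property_value
FROM product p1
INNER JOIN product_property pp1
        ON p1.id = pp1.product_id
ORDER BY p1.id, pp1.name;
```

Because the NULLing is decided by ROW_NUMBER() rather than by the previous row's value, two different products that happen to share a name are still handled correctly.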
I am trying to fetch all the categories, together with a count of the products in each category, for those products that match a keyword. The query I tried doesn't give me the correct result.
I also want the parent categories up to level 1, with their counts as well.
E.g. when I search with the keyword watch, the category "watches" should be listed with some count, and so should its parent category "accessories", with the sum of its descendant categories' counts.
My table structures are:
tblProducts: each product has 5 category columns, fldCategoryId1, fldCategoryId2, fldCategoryId3, fldCategoryId4 and fldCategoryId5. fldProductStatus should be 'A'.
+-----------------------------+-------------------+
| Field | Type |
+-----------------------------+-------------------+
| fldUniqueId | bigint(20) |
| fldCategoryId1 | bigint(20) |
| fldCategoryId2 | bigint(20) |
| fldCategoryId3 | bigint(20) |
| fldCategoryId4 | bigint(20) |
| fldCategoryId5 | bigint(20) |
| fldProductStatus | enum('A','P','D') |
| fldForSearch | longtext |
+-----------------------------+-------------------+
tblCategory:
+------------------------------+-----------------------+
| Field | Type |
+------------------------------+-----------------------+
| fldCategoryId | bigint(20) |
| fldCategoryName | varchar(128) |
| fldCategoryParent | int(11) |
| fldCategoryLevel | enum('0','1','2','3') |
| fldCategoryActive | enum('Y','N') |
+------------------------------+-----------------------+
Search Query:
SELECT count( c.fldCategoryId ) AS cnt, c.fldCategoryLevel, c.fldCategoryParent, c.fldCategoryId, c.fldCategoryName, p.fldForSearch, c.fldCategoryParent
FROM tblCategory c, tblProducts p
WHERE (
c.fldCategoryId = p.fldCategoryId1
OR c.fldCategoryId = p.fldCategoryId2
OR c.fldCategoryId = p.fldCategoryId3
OR c.fldCategoryId = p.fldCategoryId4
OR c.fldCategoryId = p.fldCategoryId5
)
AND p.fldProductStatus = 'A'
AND (
MATCH ( p.fldForSearch )
AGAINST (
'+(watches watch)'
IN BOOLEAN MODE
)
)
GROUP BY c.fldCategoryId
Note: The table uses the InnoDB engine and has a FULLTEXT index on the 'fldForSearch' column.
EDIT: sample data can be found in sqlfiddle
I'm not sure what you mean by:
Also I want the parent categories till level 1 and their count as well.
But the following query will show you a count for each category (including those with 0 found products), and a general rollup:
SELECT
c.fldCategoryId,
c.fldCategoryLevel,
c.fldCategoryName,
COUNT( p.fldUniqueId ) AS cnt
FROM tblCategory c
LEFT JOIN tblProducts p ON
(c.fldCategoryId = p.fldCategoryId1
OR c.fldCategoryId = p.fldCategoryId2
OR c.fldCategoryId = p.fldCategoryId3
OR c.fldCategoryId = p.fldCategoryId4
OR c.fldCategoryId = p.fldCategoryId5)
AND p.fldProductStatus = 'A'
AND MATCH ( p.fldForSearch )
AGAINST (
'+(watches watch)'
IN BOOLEAN MODE
)
GROUP BY
c.fldCategoryId,
c.fldCategoryLevel,
c.fldCategoryName
WITH ROLLUP;
Notes:
You cannot select p.fldForSearch if you expect a count of all the products in the category. fldForSearch is a per-product value, so selecting it defeats the purpose of grouping.
I left joined with products so that the query also returns the categories with 0 products matching your keywords. If you don't want that, just remove the LEFT keyword.
I haven't checked the MATCH condition; I assume it's correct.
Start by not splaying an array (fldCategoryId...) across columns. Instead, add a new table.
Once you have done that, the queries change, such as getting rid of OR clauses.
Hopefully, any further issues will fall into place.
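As a rough sketch of what that could look like: the junction table name tblProductCategory and its column fldProductId are made up for illustration, and I am assuming that unused category slots in tblProducts are NULL (adjust the filter if they are stored as 0).

```sql
-- Hypothetical junction table: one row per (product, category) pair.
CREATE TABLE tblProductCategory (
    fldProductId  BIGINT NOT NULL,
    fldCategoryId BIGINT NOT NULL,
    PRIMARY KEY (fldProductId, fldCategoryId),
    KEY idxCategory (fldCategoryId)
);

-- One-off migration from the five columns; UNION removes duplicate pairs.
INSERT INTO tblProductCategory (fldProductId, fldCategoryId)
SELECT fldUniqueId, fldCategoryId1 FROM tblProducts WHERE fldCategoryId1 IS NOT NULL
UNION
SELECT fldUniqueId, fldCategoryId2 FROM tblProducts WHERE fldCategoryId2 IS NOT NULL
UNION
SELECT fldUniqueId, fldCategoryId3 FROM tblProducts WHERE fldCategoryId3 IS NOT NULL
UNION
SELECT fldUniqueId, fldCategoryId4 FROM tblProducts WHERE fldCategoryId4 IS NOT NULL
UNION
SELECT fldUniqueId, fldCategoryId5 FROM tblProducts WHERE fldCategoryId5 IS NOT NULL;

-- The per-category count then needs a plain equality join instead of five ORs.
SELECT c.fldCategoryId,
       c.fldCategoryName,
       COUNT(DISTINCT p.fldUniqueId) AS cnt
FROM tblCategory c
JOIN tblProductCategory pc ON pc.fldCategoryId = c.fldCategoryId
JOIN tblProducts p         ON p.fldUniqueId   = pc.fldProductId
WHERE p.fldProductStatus = 'A'
  AND MATCH (p.fldForSearch) AGAINST ('+(watches watch)' IN BOOLEAN MODE)
GROUP BY c.fldCategoryId, c.fldCategoryName;
```

The equality join can use the idxCategory index, which the OR over five columns cannot do efficiently.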
Since your category tree has a fixed height (4 levels), you can create a transitive closure table on the fly with
SELECT c1.fldCategoryId AS descendantId, c.fldCategoryId AS ancestorId
FROM tblCategory c1
LEFT JOIN tblCategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
LEFT JOIN tblCategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
JOIN tblCategory c ON c.fldCategoryId IN (
c1.fldCategoryId,
c1.fldCategoryParent,
c2.fldCategoryParent,
c3.fldCategoryParent
)
The result will look like
| descendantId | ancestorId |
|--------------|------------|
| 1 | 1 |
| 2 | 1 |
| 2 | 2 |
| ... | ... |
| 5 | 1 |
| 5 | 2 |
| 5 | 5 |
| ... | ... |
You can now use it in a subquery (derived table) to join it with products using descendantId and with categories using ancestorId. That means that a product from category X will be indirectly associated with all ancestors of X (as well as with X). For example: Category 5 is a child of 2 - and 2 is a child of 1. So all products from category 5 must be counted for categories 5, 2 and 1.
Final query:
SELECT c.*, coalesce(sub.cnt, 0) as cnt
FROM tblCategory c
LEFT JOIN (
SELECT tc.ancestorId, COUNT(DISTINCT p.fldUniqueId) AS cnt
FROM tblProducts p
JOIN (
SELECT c1.fldCategoryId AS descendantId, c.fldCategoryId AS ancestorId
FROM tblCategory c1
LEFT JOIN tblCategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
LEFT JOIN tblCategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
JOIN tblCategory c ON c.fldCategoryId IN (
c1.fldCategoryId,
c1.fldCategoryParent,
c2.fldCategoryParent,
c3.fldCategoryParent
)
) tc ON tc.descendantId IN (
p.fldCategoryId1,
p.fldCategoryId2,
p.fldCategoryId3,
p.fldCategoryId4,
p.fldCategoryId5
)
WHERE p.fldProductStatus = 'A'
AND MATCH ( p.fldForSearch )
AGAINST ( '+(watches watch)' IN BOOLEAN MODE )
GROUP BY tc.ancestorId
) sub ON c.fldCategoryId = sub.ancestorId
Result for your sample data (without level, since it seems to be wrong anyway):
| fldCategoryId | fldCategoryName | fldCategoryParent | fldCategoryActive | cnt |
|---------------|-----------------|-------------------|-------------------|-----|
| 1 | Men | 0 | Y | 5 |
| 2 | Accessories | 1 | Y | 5 |
| 3 | Men Watch | 1 | Y | 3 |
| 5 | Watch | 2 | Y | 5 |
| 6 | Clock | 2 | Y | 3 |
| 7 | Wrist watch | 1 | Y | 2 |
| 8 | Watch | 2 | Y | 4 |
| 9 | watch2 | 3 | Y | 2 |
| 10 | fastrack | 8 | Y | 3 |
| 11 | swish | 8 | Y | 2 |
| 12 | digital | 5 | Y | 2 |
| 13 | analog | 5 | Y | 2 |
| 14 | dual | 5 | Y | 1 |
Demos:
sqlfiddle
rextester
Note that the outer (left joined) subquery is logically not necessary. But from my experience MySQL doesn't perform well without it.
There is still room for performance optimisation. One option is to store the transitive closure table in an indexed temporary table. You can also persist it in a regular table if categories rarely change, and keep it up to date with triggers.
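The temporary-table idea could look roughly like this. The table name tmpCategoryClosure and the index name are placeholders I made up; the SELECT is the same closure query as above.

```sql
-- Sketch: materialise the closure once per session, with indexes on both sides.
CREATE TEMPORARY TABLE tmpCategoryClosure (
    descendantId BIGINT NOT NULL,
    ancestorId   BIGINT NOT NULL,
    PRIMARY KEY (descendantId, ancestorId),
    KEY idxAncestor (ancestorId)
);

INSERT INTO tmpCategoryClosure (descendantId, ancestorId)
SELECT DISTINCT c1.fldCategoryId, c.fldCategoryId
FROM tblCategory c1
LEFT JOIN tblCategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
LEFT JOIN tblCategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
JOIN tblCategory c ON c.fldCategoryId IN (
    c1.fldCategoryId,
    c1.fldCategoryParent,
    c2.fldCategoryParent,
    c3.fldCategoryParent
);

-- The counting subquery then joins the indexed table instead of a derived one.
SELECT tc.ancestorId, COUNT(DISTINCT p.fldUniqueId) AS cnt
FROM tblProducts p
JOIN tmpCategoryClosure tc ON tc.descendantId IN (
    p.fldCategoryId1,
    p.fldCategoryId2,
    p.fldCategoryId3,
    p.fldCategoryId4,
    p.fldCategoryId5
)
WHERE p.fldProductStatus = 'A'
  AND MATCH (p.fldForSearch) AGAINST ('+(watches watch)' IN BOOLEAN MODE)
GROUP BY tc.ancestorId;
```

Keep in mind that MySQL does not let you refer to the same temporary table more than once in a single query, which is fine here because it is only joined once.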
I have a table named tb_customer_master.
mysql> select COSTUMER_ID, NAMA, ATTENTION, IN_DATE, IN_REF, JOB_REF, LAST_CARGO FROM tb_customer_master;
+-------------+----------------------+-------------------------+------------+--------+---------+----------------------+
| COSTUMER_ID | NAMA | ATTENTION | IN_DATE | IN_REF | JOB_REF | LAST_CARGO |
+-------------+----------------------+-------------------------+------------+--------+---------+----------------------+
| 2 | Eagletainer | Ms. Joyce Ong Chong Mei | NULL | 1234 | 123 | Lube |
| 5 | APL | Test | 21-11-2015 | sgdgfa | sgfsd | FOOD |
+-------------+----------------------+-------------------------+------------+--------+---------+----------------------+
2 rows in set (0.00 sec)
I also have a master table that acts as the report master:
mysql> select REPAIR_ESTIMATE_ID, EIR_REF, COSTUMER_ID FROM tb_master_repair_estimate;
+--------------------+------------+-------------+
| REPAIR_ESTIMATE_ID | EIR_REF | COSTUMER_ID |
+--------------------+------------+-------------+
| 38 | 1545053 | 5 |
| 40 | 1545052 | 5 |
| 41 | 1545054 | 5 |
+--------------------+------------+-------------+
3 rows in set (0.00 sec)
Now, in one case, I want to combine them with subqueries like this:
mysql> SELECT
-> a.EIR_REF as EIR,
->
-> (SELECT NAMA FROM tb_customer_master c
-> WHERE a.COSTUMER_ID = c.COSTUMER_ID ) as "Name Of Customer",
->
-> (SELECT ATTENTION FROM tb_customer_master c
-> WHERE a.COSTUMER_ID = c.COSTUMER_ID ) as "ATTENTION"
->
-> FROM tb_master_repair_estimate a
->
-> WHERE a.EIR_REF = "1545052";
+------------+----------------------+-----------+
| EIR | Name Of Customer | ATTENTION |
+------------+----------------------+-----------+
| 1545052 | APL | Test |
+------------+----------------------+-----------+
1 row in set (0.00 sec)
My question: I want to simplify my last query. How can I select EIR, name of customer, attention, in_date, in_ref and so on in a simpler way, instead of selecting each column one by one with its own subquery? The command gets very long if I specify each column that way.
Any suggestions are appreciated.
UPDATE:
Thanks for the quick responses. The reason I am using subqueries is that my report master table has many foreign keys.
This is the complete report table (tb_master_repair_estimate):
mysql> select EIR_REF, NO_TANK, COSTUMER_ID, TANK_ID, TOTAL from tb_master_repair_estimate where EIR_REF = "1545052";
+------------+---------+-------------+---------+-------+
| EIR_REF | NO_TANK | COSTUMER_ID | TANK_ID | TOTAL |
+------------+---------+-------------+---------+-------+
| 1545052 | 7 | 5 | 1 | NULL |
+------------+---------+-------------+---------+-------+
And there is another table named tb_tank_type:
mysql> select * from tb_tank_type;
+---------+-----------+------------+
| TANK_ID | NAMA_TYPE | KETERANGAN |
+---------+-----------+------------+
| 1 | IM04 | NULL |
| 2 | XXXX | NULL |
+---------+-----------+------------+
2 rows in set (0.00 sec)
I am trying to combine them into one result using code like this:
mysql> SELECT
-> a.EIR_REF as EIR,
->
-> (SELECT NAMA_TYPE FROM tb_tank_type b
-> WHERE b.TANK_ID = a.TANK_ID) as "type tank",
->
-> (SELECT NAMA FROM tb_customer_master c
-> WHERE a.COSTUMER_ID = c.COSTUMER_ID ) as "Name Of Customer",
->
-> (SELECT ATTENTION FROM tb_customer_master c
-> WHERE a.COSTUMER_ID = c.COSTUMER_ID ) as "ATTENTION",
->
-> (SELECT PREFIX FROM tb_iso_tanks d
-> WHERE a.NO_TANK = d.ID_TANK) as "PREFIX",
->
-> (SELECT SERIAL_NUMBER FROM tb_iso_tanks d
-> WHERE a.NO_TANK = d.ID_TANK) as "SERIAL_NUMBER"
->
-> FROM tb_master_repair_estimate a
->
-> WHERE a.EIR_REF = "1545052";
+------------+-----------+----------------------+-----------+--------+---------------+
| EIR | type tank | Name Of Customer | ATTENTION | PREFIX | SERIAL_NUMBER |
+------------+-----------+----------------------+-----------+--------+---------------+
| 1545052 | IM04 | APL | Test | EOLU | 1234567 |
+------------+-----------+----------------------+-----------+--------+---------------+
1 row in set (0.00 sec)
Btw, thanks again.
So a case like this :
There are another table again.
JOINs are usually much faster, and simpler, than subqueries:
SELECT a.EIR_REF as EIR, c.NAMA AS `Name of Customer`, c.ATTENTION AS ATTENTION
FROM tb_master_repair_estimate a
LEFT JOIN tb_customer_master c USING (COSTUMER_ID)
WHERE a.EIR_REF = "1545052"
;
If I understand it correctly, you can JOIN your table tb_customer_master to your table tb_master_repair_estimate to simplify your query.
Example derived from your statement:
SELECT
a.EIR_REF as EIR,
c.NAMA as "Name Of Customer",
c.ATTENTION as "ATTENTION"
FROM tb_master_repair_estimate a
INNER JOIN tb_customer_master c ON a.COSTUMER_ID = c.COSTUMER_ID
WHERE a.EIR_REF = "1545052";
Another benefit: this is way more readable.
Looks like you would need to use a join in this instance:
select a.EIR_REF, c.NAMA, c.ATTENTION
from tb_master_repair_estimate a
join tb_customer_master c
on c.COSTUMER_ID = a.COSTUMER_ID
where a.EIR_REF = '1545052';
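The same idea extends to the longer query from the update. A sketch using only the tables and columns shown in the question; I am assuming each lookup key (TANK_ID, COSTUMER_ID, ID_TANK) is unique in its table, which the scalar subqueries already require. LEFT JOIN is used so that a missing lookup row yields NULL, just as the subqueries would.

```sql
-- One join per lookup table instead of one subquery per column.
SELECT a.EIR_REF       AS EIR,
       b.NAMA_TYPE     AS `type tank`,
       c.NAMA          AS `Name Of Customer`,
       c.ATTENTION     AS ATTENTION,
       d.PREFIX        AS PREFIX,
       d.SERIAL_NUMBER AS SERIAL_NUMBER
FROM tb_master_repair_estimate a
LEFT JOIN tb_tank_type       b ON b.TANK_ID     = a.TANK_ID
LEFT JOIN tb_customer_master c ON c.COSTUMER_ID = a.COSTUMER_ID
LEFT JOIN tb_iso_tanks       d ON d.ID_TANK     = a.NO_TANK
WHERE a.EIR_REF = '1545052';
```

Further columns such as c.IN_DATE or c.IN_REF can now be added to the select list without touching the FROM clause.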
Here is the table I use:
+-------------+----------+---------------------+
| sourceindex | source | pa |
+-------------+----------+---------------------+
| 0 | this | 0.13842974556609988 |
| 1 | is | 0.26446279883384705 |
| 2 | a | 0.26446279883384705 |
| 3 | book | 0.13842974556609988 |
| 4 | , | 0.26446279883384705 |
| 5 | that | 0.13842974556609988 |
I want to add a column that contains the result of log(sum(pa))/pa.
Any suggestions on how I could do that?
You can use a cross join to calculate log(sum(pa)), and in the outer query you can divide that result by each value of the pa column:
update
test t
join (select
`sourceindex`, `source`, `pa` , log_sum/pa new_col
from
test
cross join (select log(sum(pa)) log_sum
from test ) a
) t1
on (t.sourceindex= t1.sourceindex
and t.source = t1.source
and t.pa = t1.pa
)
set t.new_col = t1.new_col
Demo
But it's better to switch your logic and compute the value on the fly with a select query:
select `sourceindex`, `source`, `pa` , log_sum/pa new_col
from
test
cross join (select log(sum(pa)) log_sum
from test ) t
Demo
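If you do want to store the value, the update can probably be written more compactly by joining the aggregate directly. A sketch, assuming the table is called test as above and that the new column is named new_col and does not exist yet:

```sql
-- Add the target column (name and type are assumptions).
ALTER TABLE test ADD COLUMN new_col DOUBLE NULL;

-- The derived table s has exactly one row, so the cross join
-- simply attaches log(sum(pa)) to every row of test.
-- NULLIF guards against division by zero by producing NULL instead.
UPDATE test t
CROSS JOIN (SELECT LOG(SUM(pa)) AS log_sum FROM test) s
SET t.new_col = s.log_sum / NULLIF(t.pa, 0);
```

Note that LOG() with a single argument is the natural logarithm in MySQL; use LOG10() or LOG2() if you need a different base.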
I have a table:
id | type | subtype
How can I write a query that produces the following output?
type1 | subtype1 | count-subtype1 | count-type1
type1 | subtype2 | count-subtype2 | count-type1
type2 | subtype3 | count-subtype3 | count-type2
type2 | subtype4 | count-subtype4 | count-type2
That is, the subtotal should appear as a column in the output, without using "WITH ROLLUP".
To answer this question successfully (and this is where some answers fail), you need to know what a rollup does. If you don't want to perform a "manual" SQL rollup, there is another answer around which solves your query.
What you need is two queries: one to count the subtypes within the types and another to count the types.
First count the subtypes (let's call this query s):
select count(*) count_subtype, type, subtype from Foo group by type, subtype;
Then write another query to count the types (let's call this query t):
select count(*) count_type, type from Foo group by type;
Now you need to merge the two queries:
select t.type, s.subtype, s.count_subtype, t.count_type from
(select count(*) count_subtype, type, subtype from Foo group by type, subtype) as s
join
(select count(*) count_type, type from Foo group by type) as t
on (t.type=s.type);
Assuming that I have this structure of table:
CREATE TABLE `test` (
`id` int(11) NOT NULL auto_increment,
`type` varchar(128) default NULL,
`subtype` varchar(128) default NULL,
KEY `id` (`id`));
And this data:
INSERT INTO `test` VALUES (1,'a','1'),(2,'a','2'),(3,'a','3'),(4,'a','4'),(5,'b','4'),
(6,'c','4'),(7,'c','1'),(8,'c','2'),(9,'c','2');
I can do this:
SELECT test.type, test.subtype, count(test.subtype) as countsubtype, testbytype.counttype
FROM (test)
LEFT JOIN (SELECT type, count(type) AS counttype FROM test group by type) AS testbytype ON test.type = testbytype.type
GROUP by type, subtype;
+------+---------+--------------+-----------+
| type | subtype | countsubtype | counttype |
+------+---------+--------------+-----------+
| a | 1 | 1 | 4 |
| a | 2 | 1 | 4 |
| a | 3 | 1 | 4 |
| a | 4 | 1 | 4 |
| b | 4 | 1 | 1 |
| c | 1 | 1 | 4 |
| c | 2 | 2 | 4 |
| c | 4 | 1 | 4 |
+------+---------+--------------+-----------+
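On MySQL 8.0 or later (an assumption, the question does not say which version is in use) the self-join can likely be replaced by a window function applied on top of the grouped counts. A sketch against the same test table:

```sql
-- COUNT(*) is the per-(type, subtype) count from GROUP BY;
-- SUM(COUNT(*)) OVER (...) adds those counts up per type,
-- which gives the per-type subtotal as an extra column.
SELECT type,
       subtype,
       COUNT(*)                              AS countsubtype,
       SUM(COUNT(*)) OVER (PARTITION BY type) AS counttype
FROM test
GROUP BY type, subtype
ORDER BY type, subtype;
```

This reads the table once instead of twice, and no WITH ROLLUP is involved.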
Query:
SELECT type, subtype, sum(type), sum(subtype) from table_name GROUP BY id
I have a couple of very large tables (over 400,000 rows) that look like the following:
+---------+--------+---------------+
| ID | M1 | M1_Percentile |
+---------+--------+---------------+
| 3684514 | 3.2997 | NULL |
| 3684515 | 3.0476 | NULL |
| 3684516 | 2.6499 | NULL |
| 3684517 | 0.3585 | NULL |
| 3684518 | 1.6919 | NULL |
| 3684519 | 2.8515 | NULL |
| 3684520 | 4.0728 | NULL |
| 3684521 | 4.0224 | NULL |
| 3684522 | 5.8207 | NULL |
| 3684523 | 6.8291 | NULL |
+---------+--------+---------------+...about 400,000 more
I need to assign each row in the M1_Percentile column a value that represents "the percentage of rows with M1 values less than or equal to the current row's M1 value".
In other words, I need:
M1_Percentile = (number of rows with M1 <= the current row's M1) / (total number of rows) * 100
I implemented this successfully, but it is far too slow. If anyone could create a more efficient version of the following code, I would really appreciate it!
UPDATE myTable AS X JOIN (
SELECT
s1.ID, COUNT(s2.ID)/ (SELECT COUNT(*) FROM myTable) * 100 AS percentile
FROM
myTable s1 JOIN myTable s2 on (s2.M1 <= s1.M1)
GROUP BY s1.ID
ORDER BY s1.ID) AS Z
ON (X.ID = Z.ID)
SET X.M1_Percentile = Z.percentile;
This is the (correct but slow) result from the above query if the number of rows is limited to the ones you see (10 rows):
+---------+--------+---------------+
| ID | M1 | M1_Percentile |
+---------+--------+---------------+
| 3684514 | 3.2997 | 60 |
| 3684515 | 3.0476 | 50 |
| 3684516 | 2.6499 | 30 |
| 3684517 | 0.3585 | 10 |
| 3684518 | 1.6919 | 20 |
| 3684519 | 2.8515 | 40 |
| 3684520 | 4.0728 | 80 |
| 3684521 | 4.0224 | 70 |
| 3684522 | 5.8207 | 90 |
| 3684523 | 6.8291 | 100 |
+---------+--------+---------------+
Producing the same results for the entire 400,000 rows takes orders of magnitude longer.
I cannot test this, but you could try something like:
update myTable t
set M1_Percentile = (
(select count(*)
from myTable t1
where t1.M1 <= t.M1) * 100 / (
select count(*)
from myTable));
UPDATE:
update test t
set m1_pc = (
(select count(*) from test t1 where t1.M1 <= t.M1) * 100 /
( select count(*) from test));
This works in Oracle (the only database I have available). I do remember getting that error in MySQL. It is very annoying.
Fair warning: MySQL isn't my native environment. However, after a little research, I think the following query should be workable:
UPDATE myTable AS X
JOIN (
SELECT X.ID, (
SELECT COUNT(*)
FROM myTable X1
WHERE (X.M1, X.id) >= (X1.M1, X1.id)) as Rnk
FROM myTable as X
) AS RowRank
ON (X.ID = RowRank.ID)
CROSS JOIN (
SELECT COUNT(*) as TotalCount
FROM myTable
) AS TotalCount
SET X.M1_Percentile = RowRank.Rnk / TotalCount.TotalCount * 100;
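For what it's worth, if the server is MySQL 8.0 or later (an assumption, the question does not mention the version), the window function CUME_DIST() computes exactly "rows with a value less than or equal to the current one, divided by the total number of rows" in a single sorted pass, so the quadratic self-join should not be needed. A sketch; the index name idx_m1 is made up:

```sql
-- Optional: an index on M1 may help the sort behind the window function.
CREATE INDEX idx_m1 ON myTable (M1);

-- CUME_DIST() returns a fraction in (0, 1]; multiply by 100 for a percentage.
UPDATE myTable AS x
JOIN (
    SELECT ID,
           CUME_DIST() OVER (ORDER BY M1) * 100 AS percentile
    FROM myTable
) AS z ON z.ID = x.ID
SET x.M1_Percentile = z.percentile;
```

Rows with equal M1 values receive the same percentile, which matches the "less than or equal" definition in the question.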