Better SQL to sum same columns across multiple tables? - mysql

I am trying to get sum for some columns from across multiple mysql tables using python/sqlalchemy. The number of tables is dynamic, and each table has same schema.
Table_1
| col1 | col2| ... |
Table_2
| col1 | col2| ... |
Table_...
| col1 | col2| ... |
I studied SQLAlchemy and concluded that it may be better to generate the SQL text and execute it directly. Creating models might not be a good solution, since I suspect that would add a performance cost; I would prefer a single SQL statement.
select (t1.col1 + t2.col1 + t3.col1 + t?.col1 ...) as col1, (t1.col2 + t2.col2 + ...) as col2,
... from
(select sum(col1), sum(col2), sum(col3) ... from Table_1 as t1,
select sum(col1), sum(col2), sum(col3) ... from Table_2 as t2,
...
)
The above is the SQL I intend to generate using Python. I am not a SQL professional, so I am not sure whether it is a good statement. Is there a better solution, simpler and more efficient, than this?

Your general approach looks reasonable. Getting the SUMs from each individual table as a single row, and then combining those rows, is the most efficient approach. There are just a couple of minor fixes.
It looks like you will need to provide an alias for each of the SUM() expressions returned.
And you're going to need to wrap the SELECT from each table in a set of parens, and give each of those inline views an alias.
Also, one of the inner SUM() expressions could return NULL, in which case the addition performed in the outer query would also return NULL. One fix for that would be to wrap the inner SUM expressions in IFNULL or COALESCE, to replace a NULL with a zero, but that could introduce a zero where the overall SUM would really be NULL.
Personally, I'd avoid using the comma notation for the JOIN operation. The comma is valid, but I'd write it out using the CROSS JOIN keywords, to make it a little more readable.
But my preference would be to avoid the JOIN and the addition operations in the outer query. I'd use a SUM aggregate in the outer query, something like this:
SELECT SUM(t.col1_tot) AS col1_tot
, SUM(t.col2_tot) AS col2_tot
, SUM(t.col3_tot) AS col3_tot
FROM ( SELECT SUM(col1) AS col1_tot
, SUM(col2) AS col2_tot
, SUM(col3) AS col3_tot
FROM table1
UNION ALL
SELECT SUM(col1) AS col1_tot
, SUM(col2) AS col2_tot
, SUM(col3) AS col3_tot
FROM table2
UNION ALL
SELECT SUM(col1) AS col1_tot
, SUM(col2) AS col2_tot
, SUM(col3) AS col3_tot
FROM table3
) t
That avoids anomalies with NULL values, and makes it return the same values that would be returned if the individual tables were all concatenated together. But this isn't any more efficient than what you have.
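Since the number of tables is dynamic, the UNION ALL statement above can be generated as text. The following is only a sketch: `build_sum_sql` and `demo` are hypothetical names, and SQLite (via Python's standard `sqlite3` module) stands in for MySQL so the example runs without a server. Table and column names cannot be passed as bound parameters, so the sketch accepts only plain identifiers.

```python
import sqlite3


def build_sum_sql(tables, columns):
    """Build one statement that totals `columns` across `tables` via UNION ALL."""
    if not tables or not columns:
        raise ValueError("need at least one table and one column")
    # Identifiers cannot be bound as parameters, so accept only plain names.
    for name in list(tables) + list(columns):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    inner_cols = ", ".join(f"SUM({c}) AS {c}_tot" for c in columns)
    outer_cols = ", ".join(f"SUM(t.{c}_tot) AS {c}_tot" for c in columns)
    inner = " UNION ALL ".join(f"SELECT {inner_cols} FROM {t}" for t in tables)
    return f"SELECT {outer_cols} FROM ({inner}) t"


def demo():
    """Run the generated statement against three small tables (one is empty)."""
    conn = sqlite3.connect(":memory:")
    data = [[(1, 10), (2, 20)], [(3, 30)], []]
    for i, rows in enumerate(data, start=1):
        conn.execute(f"CREATE TABLE table{i} (col1 INTEGER, col2 INTEGER)")
        conn.executemany(f"INSERT INTO table{i} VALUES (?, ?)", rows)
    sql = build_sum_sql(["table1", "table2", "table3"], ["col1", "col2"])
    result = conn.execute(sql).fetchone()
    conn.close()
    return result
```

With SQLAlchemy, the generated string could presumably be wrapped in `text()` and executed on a connection; the empty third table contributes a NULL row that the outer SUM simply skips.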
To use the JOIN method, as in your query (if you don't mind returning a zero where a NULL would have been returned by the query above), this is what it takes for that approach to work:
SELECT t1.col1_tot + t2.col1_tot + t3.col1_tot AS col1_tot
, t1.col2_tot + t2.col2_tot + t3.col2_tot AS col2_tot
, t1.col3_tot + t2.col3_tot + t3.col3_tot AS col3_tot
FROM ( SELECT IFNULL(SUM(col1),0) AS col1_tot
, IFNULL(SUM(col2),0) AS col2_tot
, IFNULL(SUM(col3),0) AS col3_tot
FROM table1
) t1
CROSS
JOIN ( SELECT IFNULL(SUM(col1),0) AS col1_tot
, IFNULL(SUM(col2),0) AS col2_tot
, IFNULL(SUM(col3),0) AS col3_tot
FROM table2
) t2
CROSS
JOIN ( SELECT IFNULL(SUM(col1),0) AS col1_tot
, IFNULL(SUM(col2),0) AS col2_tot
, IFNULL(SUM(col3),0) AS col3_tot
FROM table3
) t3
But, again, my personal preference would be to avoid doing those addition operations in the outer query. I'd use the SUM aggregate, and UNION the results from the individual tables, rather than doing a join.
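The NULL behaviour described in this answer can be checked with a small experiment. This is a sketch under assumptions: `totals` is a hypothetical helper, only one column and two tables are used, and SQLite stands in for MySQL (both treat SUM over zero rows as NULL and propagate NULL through `+`).

```python
import sqlite3


def totals(rows1, rows2):
    """Return (plain_add, ifnull_add, union_sum) for col1 across two tables."""
    conn = sqlite3.connect(":memory:")
    for name, rows in (("table1", rows1), ("table2", rows2)):
        conn.execute(f"CREATE TABLE {name} (col1 INTEGER)")
        conn.executemany(f"INSERT INTO {name} VALUES (?)", [(v,) for v in rows])

    # Plain addition in the outer query: a NULL subtotal makes the result NULL.
    plain = conn.execute(
        "SELECT t1.s + t2.s"
        " FROM (SELECT SUM(col1) AS s FROM table1) t1"
        " CROSS JOIN (SELECT SUM(col1) AS s FROM table2) t2"
    ).fetchone()[0]

    # IFNULL replaces each NULL subtotal with 0 before the addition.
    guarded = conn.execute(
        "SELECT t1.s + t2.s"
        " FROM (SELECT IFNULL(SUM(col1), 0) AS s FROM table1) t1"
        " CROSS JOIN (SELECT IFNULL(SUM(col1), 0) AS s FROM table2) t2"
    ).fetchone()[0]

    # Outer SUM over UNION ALL skips NULL subtotals, but stays NULL if all are NULL.
    unioned = conn.execute(
        "SELECT SUM(t.s) FROM ("
        " SELECT SUM(col1) AS s FROM table1"
        " UNION ALL"
        " SELECT SUM(col1) AS s FROM table2) t"
    ).fetchone()[0]
    conn.close()
    return plain, guarded, unioned
```

With one empty table, `totals([5, 7], [])` gives `(None, 12, 12)`; with both empty, `totals([], [])` gives `(None, 0, None)`, which is the zero-versus-NULL difference between the two working variants.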

Unless you have some WHERE clauses to join those tables together, you're going to end up with a cartesian join, where every record from each table in the query is joined against all other combinations of records from the other tables. So if each of those tables has (say) 1000 records, and you've got 5 tables in the query, you're going to end up with 1000^5 = 1,000,000,000,000,000 records in the result set.
What you want is probably something more like this:
SELECT sum(col1) AS sum1, sum(col2) AS sum2, ....
FROM (
SELECT col1, col2, col3, ... FROM table1
UNION ALL
SELECT col1, col2, col3, ... FROM table2
UNION ALL
...
) a
The inner UNION ALL will take all the columns from each of those tables and turn them into a single contiguous result set. The outer query will then take each of those columns and sum up the values.
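To see the row-count difference on a small scale, here is a sketch that compares listing raw tables in FROM with stacking them via UNION ALL. `row_counts` is a hypothetical helper and SQLite stands in for MySQL. The blow-up applies when the raw tables are joined without a condition; joining one-row aggregate subqueries yields a single row.

```python
import sqlite3


def row_counts(n_rows=3, n_tables=3):
    """Return (rows from comma join, rows from UNION ALL, SUM(col1) over the union)."""
    conn = sqlite3.connect(":memory:")
    names = [f"table{i}" for i in range(1, n_tables + 1)]
    for name in names:
        conn.execute(f"CREATE TABLE {name} (col1 INTEGER)")
        conn.executemany(
            f"INSERT INTO {name} VALUES (?)", [(v,) for v in range(n_rows)]
        )

    # Comma join with no WHERE clause: n_rows ** n_tables combinations.
    joined = conn.execute(
        f"SELECT COUNT(*) FROM {', '.join(names)}"
    ).fetchone()[0]

    # UNION ALL stacks the tables: n_rows * n_tables rows.
    union_sql = " UNION ALL ".join(f"SELECT col1 FROM {n}" for n in names)
    unioned = conn.execute(f"SELECT COUNT(*) FROM ({union_sql}) a").fetchone()[0]
    total = conn.execute(f"SELECT SUM(col1) FROM ({union_sql}) a").fetchone()[0]
    conn.close()
    return joined, unioned, total
```

With the defaults, three tables of three rows each produce 27 joined rows but only 9 stacked rows.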

This may help you:
select SUM(col1),SUM(col2) from
(
select col1,col2 from Table1
union all
select col1,col2 from Table2
union all
select col1,col2 from Table3
)t

Related

SQL Union Query - Referencing to alias of derived table

I have a complicated aggregate-functions query that produces a result-set, and which has to be amended with a single row that contains the totals and averages of that result-set.
My idea is to assign an alias to the result-set, and then use that alias in a second query, after a UNION ALL statement.
But, I can't successfully use the alias, in the subsequent SELECT statement, after the UNION ALL statement.
For the sake of simplicity, I won't post the original query here, just a simplified list of the variants I've tried:
SELECT * FROM fees AS Test1 WHERE Percentage = 15
UNION ALL
(SELECT * FROM fees AS Test2 WHERE Percentage > 15)
UNION ALL
(SELECT * FROM (SELECT * FROM fees AS Test3 WHERE Percentage < 10) AS Test4)
UNION ALL
SELECT * FROM Test3
The result is:
MySQL said:
#1146 - Table 'xxxxxx.Test3' doesn't exist
The result is the same if the last query references Test1, Test2, or Test4.
So, how should I assign an alias to a result-set/derived table in earlier queries and use that same alias in later queries, all within a UNION query?
Amendment:
My primary query is:
SELECT
COALESCE(referrers.name,order_items.ReferrerID),
SUM(order_items.quantity) as QtySold,
ROUND(SUM((order_items.quantity*order_items.price+order_items.shippingcosts)/((100+order_items.vat)/100)), 2) as TotalRevenueNetto,
ROUND(100*SUM(order_items.quantity*order_items.purchasepricenet)/SUM((order_items.quantity*order_items.price+order_items.shippingcosts)/((100+order_items.vat)/100)), 1) as PurchasePrice,
ROUND(100*SUM(order_items.quantity*COALESCE(order_items.calculatedfee,0)+order_items.quantity*COALESCE(order_items.calculatedcost,0))/SUM((order_items.quantity*order_items.price+order_items.shippingcosts)/((100+order_items.vat)/100)), 1) as Costs,
ROUND(100*SUM(order_items.calculatedprofit) / SUM( (order_items.quantity*order_items.price + order_items.shippingcosts)/((100+order_items.vat)/100) ) , 1) as Profit,
COALESCE(round(100*Returns.TotalReturns_Qty/SUM(order_items.quantity),2),0) as TotalReturns
FROM order_items LEFT JOIN (SELECT order_items.ReferrerID as ReferrerID, sum(order_items.quantity) as TotalReturns_Qty FROM order_items WHERE OrderType='returns' and OrderTimeStamp>='2017-12-1 00:00:00' GROUP BY order_items.ReferrerID) as Returns ON Returns.ReferrerID = order_items.ReferrerID LEFT JOIN `referrers` on `referrers`.`referrerId` = `order_items`.`ReferrerID`
WHERE ( ( order_items.BundleItemID in ('-1', '0') and order_items.OrderType in ('order', '') ) or ( order_items.BundleItemID is NULL and order_items.OrderType = 'returns' ) ) and order_items.OrderTimestamp >= '2017-12-1 00:00:00'
GROUP BY order_items.ReferrerID
ORDER BY referrers.name ASC
I want to make a grand-total of all the rows resulting from query above with:
SELECT 'All marketplaces', SUM(QtySold), SUM(TotalRevenueNetto), AVG(PurchasePrice), AVG(Costs), AVG(Profit), AVG(TotalReturns) FROM PrimaryQuery
I want to do this with a single query.
Your query is well-written. You may be able to get a total line by using a surrounding query with a dummy GROUP BY clause and WITH ROLLUP:
SELECT
COALESCE(Referrer, 'All marketplaces'),
SUM(QtySold) AS QtySold,
SUM(TotalRevenueNetto) AS TotalRevenueNetto,
AVG(PurchasePrice) AS PurchasePrice,
AVG(Costs) AS Costs,
AVG(Profit) AS Profit,
AVG(TotalReturns) AS TotalReturns
FROM
(
SELECT
COALESCE(referrers.name,order_items.ReferrerID) AS Referrer,
SUM(order_items.quantity) AS QtySold,
...
) PrimaryQuery
GROUP BY Referrer ASC WITH ROLLUP;
I'm not entirely sure what you are attempting to solve, but I guess something like the following:
Hypothetical 'main' query:
SELECT T1.ID
, Sum(total_grade)/COUNT(subjects) as AverageGrade
FROM A_Table T1
JOIN AnotherTable T2
ON T2.id = T1.id
GROUP BY T1.ID
You want sub resultsets, without having to keep querying the same data.
Edit: I mistakenly thought the linked documentation and method mentioned below were for the current version of MySQL. It was, however, a draft for a future version, and CTEs are not supported in the current one (they arrived in MySQL 8.0).
In the absence of CTE support, I would probably just insert the resultset into a temporary table. Something like:
CREATE TABLE TEMP_TABLE(ID INT, AverageGrade DECIMAL(15, 3));
INSERT INTO TEMP_TABLE
SELECT T1.ID
, Sum(total_grade)/COUNT(subjects) as AverageGrade
FROM A_Table T1
JOIN AnotherTable T2
ON T2.id = T1.id
GROUP BY T1.ID;
SELECT ID, AverageGrade FROM TEMP_TABLE WHERE AverageGrade > 5
UNION ALL
SELECT COUNT(ID) AS TotalCount, SUM(AverageGrade) AS Total_AVGGrade FROM TEMP_TABLE;
DROP TABLE TEMP_TABLE;
(Disclaimer: I'm not too familiar with MySQL, so there may be some syntax errors here. The general idea should be clear, though.)
That is, of course, if I had to do it like this; there are probably better ways to achieve the same. See Thorsten Kettner's comments on the matter.
(Previous answer, assuming CTE is a possibility:)
A CTE approach looks like:
WITH CTE AS
(
SELECT T1.ID
, Sum(total_grade)/COUNT(subjects) as AverageGrade
FROM A_Table T1
JOIN AnotherTable T2
ON T2.id = T1.id
GROUP BY T1.ID
)
SELECT ID, AverageGrade FROM CTE WHERE AverageGrade > 5
UNION ALL
SELECT COUNT(ID) AS TotalCount, SUM(AverageGrade) AS Total_AVGGrade FROM CTE
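For what it's worth, the CTE pattern can be tried out on any engine that supports WITH. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in; the `grades` table, its data, and the `cte_with_totals` name are all made up for illustration and replace the two-table join of the hypothetical main query.

```python
import sqlite3


def cte_with_totals():
    """Define an aggregate once as a CTE and read it in both UNION ALL branches."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE grades (id INTEGER, grade REAL)")
    conn.executemany(
        "INSERT INTO grades VALUES (?, ?)",
        [(1, 3.0), (1, 5.0), (2, 8.0), (2, 6.0), (3, 2.0)],
    )
    sql = """
        WITH cte AS (
            SELECT id, AVG(grade) AS average_grade
            FROM grades
            GROUP BY id
        )
        SELECT id, average_grade FROM cte WHERE average_grade > 5
        UNION ALL
        SELECT COUNT(id), SUM(average_grade) FROM cte
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows
```

The per-id averages are 4.0, 7.0 and 2.0, so the result holds the row `(2, 7.0)` plus the totals row `(3, 13.0)`. Without an ORDER BY the order of the rows is not guaranteed.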
You get the error because each query involved in a UNION doesn't know the aliases of the others.
In your case the DB engine executes 4 queries and then combines them with the UNION operation.
Your real table is fees. Test3 is an alias used only in the third query.
If you want to process the results of the UNION operation, you must encapsulate your queries in a main query.
It looks like you need something like the following. Please try:
SELECT * FROM fees AS Test2 WHERE Percentage >= 15
UNION ALL
SELECT * FROM fees AS Test3 WHERE Percentage < 10
You can't use a table alias defined in a subquery (it is not in the scope of the other SELECTs in the union); you must repeat the code, e.g.:
SELECT * FROM fees AS Test1 WHERE Percentage = 15
UNION ALL
SELECT * FROM fees AS Test2 WHERE Percentage > 15
UNION ALL
SELECT * FROM (
SELECT * FROM fees AS Test3 WHERE Percentage < 10
) AS Test4
UNION ALL
SELECT * FROM fees AS Test3 WHERE Percentage < 10
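The scoping rule is easy to reproduce. In this sketch (hypothetical `alias_scope_demo` helper, a one-column `fees` table, SQLite standing in for MySQL) the first statement reuses the alias in a later branch and fails, while the second repeats the query and works. SQLite reports the failure as `sqlite3.OperationalError`, the counterpart of MySQL's #1146.

```python
import sqlite3


def alias_scope_demo():
    """Return (alias_reuse_failed, rows_from_repeated_query)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fees (Percentage INTEGER)")
    conn.executemany("INSERT INTO fees VALUES (?)", [(5,), (15,), (20,)])

    try:
        # Test3 is only visible inside the first SELECT, so this cannot resolve.
        conn.execute(
            "SELECT * FROM fees AS Test3 WHERE Percentage < 10"
            " UNION ALL"
            " SELECT * FROM Test3"
        )
        failed = False
    except sqlite3.OperationalError:
        failed = True

    # Repeating the full query in the second branch works.
    repeated = conn.execute(
        "SELECT * FROM fees AS Test3 WHERE Percentage < 10"
        " UNION ALL"
        " SELECT * FROM fees AS Test3 WHERE Percentage < 10"
    ).fetchall()
    conn.close()
    return failed, repeated
```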

Select and order by an union

Is it possible to:
SELECT * FROM table1 , table2 ORDER BY (a UNION)
I tried that, but it doesn't work.
I looked on Google for answers but found nothing, and I no longer know what to search for, so asking here is my last resort. Maybe one of you knows a clause I don't that would help in my case. I don't know how else to approach this query...
The union is made between two columns from two tables (or more). So I want to order every possible row by this new column made with the union. Something like (so this will be generic):
SELECT * FROM table1 , table2 ORDER BY ((SELECT col1 AS col FROM table1) UNION ALL (SELECT col2 AS col FROM table2) ORDER BY col DESC);
Try a query like this:
SELECT * FROM(
SELECT * FROM table1
UNION
SELECT * FROM table2
) as tab ORDER BY col_name
If you want to do the union and then order, you can do:
select t1.*
from table1 t1
union
select t2.*
from table2 t2
order by a;
Notes:
Use union all rather than union, unless you specifically want to incur the overhead of removing duplicates.
The use of * implies that the two tables have the same columns in the same order (and compatible types).
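A quick way to see both notes in action: a trailing ORDER BY sorts the combined result, and UNION drops duplicate rows that UNION ALL keeps. This is just a sketch with made-up data; `ordered_union` is a hypothetical helper and SQLite stands in for MySQL.

```python
import sqlite3


def ordered_union():
    """Return (UNION ALL rows, UNION rows), each sorted by column a."""
    conn = sqlite3.connect(":memory:")
    for name in ("table1", "table2"):
        conn.execute(f"CREATE TABLE {name} (a INTEGER, label TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(3, "x"), (1, "y")])
    conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(2, "z"), (3, "x")])

    # The trailing ORDER BY applies to the whole combined result.
    keep_dupes = conn.execute(
        "SELECT a, label FROM table1 UNION ALL "
        "SELECT a, label FROM table2 ORDER BY a"
    ).fetchall()

    # Plain UNION additionally removes the duplicate (3, 'x') row.
    dedup = conn.execute(
        "SELECT a, label FROM table1 UNION "
        "SELECT a, label FROM table2 ORDER BY a"
    ).fetchall()
    conn.close()
    return keep_dupes, dedup
```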

Using SQL to find all possible combinations of column variables

I have a table in SQL which has N columns. Call them "Col1", "Col2", ..., "ColN". I can find out how many unique elements there are in Col1 by the query:
select count(distinct Col1) from mytable
and I can do this, independently for each column. Assuming I have M_1 unique elements in Col1, M_2 in Col2, etc., what single command can I use to find the total number of all possible combinations for my dataset? That is, what single query would calculate (M_1*M_2*...*M_N) for me?
PS: very new to SQL here, so I'm not sure if this matters - but I am using MySQL Workbench on Windows.
SELECT COUNT(*)
FROM (SELECT DISTINCT col1 FROM YourTable) AS t1
CROSS JOIN (SELECT DISTINCT col2 FROM YourTable) AS t2
CROSS JOIN (SELECT DISTINCT col3 FROM YourTable) AS t3
...
CROSS JOIN calculates the cross product between the given tables.
Another way to write it would be:
SELECT COUNT(DISTINCT t1.col1, t2.col2, t3.col3, ...)
FROM YourTable AS t1
CROSS JOIN YourTable AS t2
CROSS JOIN YourTable AS t3
...
But probably the simplest would be:
SELECT COUNT(DISTINCT col1)*COUNT(DISTINCT col2)*COUNT(DISTINCT col3)*...
FROM YourTable
This doesn't require computing any cross-products, so it should be most efficient. If you have indexes on the columns, it won't even have to read the table data, it can all be done using the indexes.
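As a sanity check that the product of COUNT(DISTINCT ...) matches the CROSS JOIN count, here is a small sketch with invented data. `combination_counts` is a hypothetical helper and SQLite stands in for MySQL. The data contains no NULLs; with NULLs the two could differ, since COUNT(DISTINCT ...) ignores NULL while SELECT DISTINCT keeps it as a value.

```python
import sqlite3


def combination_counts():
    """Return (product of distinct counts, row count of the CROSS JOIN)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (col1 TEXT, col2 INTEGER, col3 TEXT)")
    conn.executemany(
        "INSERT INTO mytable VALUES (?, ?, ?)",
        [("a", 1, "x"), ("a", 2, "x"), ("b", 1, "y"), ("b", 2, "x"), ("a", 1, "z")],
    )

    # One pass over the table: 2 * 2 * 3 distinct values.
    product = conn.execute(
        "SELECT COUNT(DISTINCT col1) * COUNT(DISTINCT col2) * COUNT(DISTINCT col3)"
        " FROM mytable"
    ).fetchone()[0]

    # Materialise every combination of the distinct values and count the rows.
    crossed = conn.execute(
        "SELECT COUNT(*)"
        " FROM (SELECT DISTINCT col1 FROM mytable) AS t1"
        " CROSS JOIN (SELECT DISTINCT col2 FROM mytable) AS t2"
        " CROSS JOIN (SELECT DISTINCT col3 FROM mytable) AS t3"
    ).fetchone()[0]
    conn.close()
    return product, crossed
```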

Get data from multiple SELECT sub-queries for reporting from MySQL database

What I'm trying to achieve is to create one complex query consisting of a few sub-queries. The idea is to give it to a business person to run on a weekly basis to pull reporting data.
The effect would be similar to the query below, where all data from many tables are displayed in one result.
select * from table1, table2, table3
So I need something like this, but it's not working:
select
(select * from table1 where ...... ) as table1,
(select * from table2 where....... ) as table2
Manually, I could run the sub-queries separately, then manually append the results into one big excel sheet. But I want to make it easier for the business person to do this, and minimize errors.
Is this possible in MySQL?
The reason for this is that I'm converting legacy Oracle PIVOT SQL statements into their MySQL equivalents, and the sub-queries are pretty complex.
I can provide the Oracle SQL if needed.
Much appreciated as always.
After some fiddling around:
select * from
(select * from table1 where survey_user_id=4 ) as T1
,
(select * from table2 where survey_field_type_id=100 ) as T2
,
(select * from table3 ) as T3
If I understand you correctly, you just need UNION :D
(SELECT column1 AS name1, column2 AS name2 FROM table1 WHERE ...... )
UNION
(SELECT column3 AS name1, column4 AS name2 FROM table2 WHERE ...... )
UNION
....
As mentioned below in the comments,
each SELECT needs the same number of columns, in the same order; the result takes its column names from the first SELECT (aliases make that explicit).
select main.*,
(select col from tbl1 where tbl1.id=main.id) as col1,
(select col from tbl2 where tbl2.id=main.id) as col2,
(select col from tbl3 where tbl3.id=main.id) as col3
from master as main

Needs help in database query

I have two tables, Table1 and Table2. There are 10 fields in Table1 and 9 fields in Table2. There is one common column in both tables, AdateTime, which stores the Unix timestamp of user actions. I want to display records from both tables as a single result, sorted by AdateTime, with the most recent action displayed first. Sometimes there are many recent actions in Table1 but few in Table2, and vice versa. So I want to fetch a combined result set from both tables using a single query. I am using PHP and MySQL.
Try
SELECT t1.*, t2.*
FROM table1 t1 INNER JOIN table2 t2
ON t1.AdateTime = t2.AdateTime
ORDER BY t1.AdateTime
or (if tables are not related)
SELECT * FROM
(SELECT ADateTime, col1, col2, col3, col4 FROM table1
UNION
SELECT ADateTime, col1, col2, 1 AS col3, NULL AS col4 FROM table2) t2
ORDER by ADateTime
I would use UNION ALL with an inline view. So something like
select col1,col2,col3,col4,col5,col6,col7,col8,col9,AdateTime
from
(
select col1,col2,col3,col4,col5,col6,col7,col8,col9,AdateTime from Table1
UNION ALL
select col1,col2,col3,col4,col5,col6,col7,col8,null as col9,AdateTime from Table2
) t
order by t.Adatetime desc;
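The padding idea (NULL for the column the shorter table lacks, then one ORDER BY over the inline view) can be sketched as follows. The tables are cut down to a few columns, the data is invented, `merged_timeline` is a hypothetical helper, and SQLite stands in for MySQL.

```python
import sqlite3


def merged_timeline():
    """Stack two tables with different widths and sort newest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Table1 (col1 TEXT, col9 TEXT, AdateTime INTEGER)")
    conn.execute("CREATE TABLE Table2 (col1 TEXT, AdateTime INTEGER)")
    conn.executemany(
        "INSERT INTO Table1 VALUES (?, ?, ?)",
        [("t1-a", "extra", 100), ("t1-b", "extra", 300)],
    )
    conn.executemany(
        "INSERT INTO Table2 VALUES (?, ?)", [("t2-a", 200), ("t2-b", 400)]
    )
    rows = conn.execute(
        "SELECT col1, col9, AdateTime FROM ("
        " SELECT col1, col9, AdateTime FROM Table1"
        " UNION ALL"
        # Table2 has no col9, so pad it with NULL to keep the column counts equal.
        " SELECT col1, NULL AS col9, AdateTime FROM Table2"
        ") t ORDER BY t.AdateTime DESC"
    ).fetchall()
    conn.close()
    return rows
```

Rows from both tables interleave purely by timestamp, and the Table2 rows carry None in the padded column.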
Yes, you can do it. You just need to join these 2 tables with a join condition. Only rows for which the join condition matches will be displayed; from there you can write the code for any further operation. Use ORDER BY AdateTime:
select t1.column_1234,t2.column_1234
from table1 t1, table2 t2
where t1.matching_column = t2.matching_column
order by t1.AdateTime;
t1.matching_column and t2.matching_column are the primary and foreign keys of these tables (the matching column).