I would like to know the best way to write a subquery in the SELECT clause using SQLAlchemy.
Here is my database table.
mysql> select * from feedback_question;
+----+---------------------+-------------+--------------+
| id | created | question_id | feedbacktype |
+----+---------------------+-------------+--------------+
| 1 | 2022-03-31 09:53:14 | 488 | 1 |
| 2 | 2022-03-31 09:53:21 | 508 | 1 |
| 3 | 2022-03-31 09:53:27 | 607 | 2 |
| 4 | 2022-03-31 09:53:31 | 606 | 1 |
| 5 | 2022-03-31 09:53:33 | 608 | 2 |
| 6 | 2022-03-31 09:55:30 | 608 | 2 |
| 7 | 2022-03-31 09:55:33 | 606 | 1 |
| 8 | 2022-03-31 09:55:40 | 607 | 1 |
| 9 | 2022-03-31 09:58:40 | 607 | 2 |
| 10 | 2022-03-31 09:59:04 | 607 | 2 |
+----+---------------------+-------------+--------------+
10 rows in set (0.00 sec)
Here is the actual query that I would like to define using SQLAlchemy.
mysql> select T.question_id,
T.feedbacktype,
count(T.feedbacktype) AS "count",
round( count(T.feedbacktype) * 100 /
(select count(U.feedbacktype) from feedback_question as U where U.question_id=T.question_id)
) AS "countpercent"
from feedback_question as T group by question_id, feedbacktype;
+-------------+--------------+-------+--------------+
| question_id | feedbacktype | count | countpercent |
+-------------+--------------+-------+--------------+
| 488 | 1 | 1 | 100 |
| 508 | 1 | 1 | 100 |
| 606 | 1 | 2 | 100 |
| 607 | 1 | 1 | 25 |
| 607 | 2 | 3 | 75 |
| 608 | 2 | 2 | 100 |
+-------------+--------------+-------+--------------+
6 rows in set (0.00 sec)
Here is what my current statement looks like:
dbsession.query( cls.question_id, cls.feedbacktype, func.count(cls.feedbacktype).label("count"), XXXXXXXXXX.label("countpercent") ).filter(cls.question_id.in_(question_ids)).group_by(cls.question_id, cls.feedbacktype).all()
I am using SQLAlchemy 1.4 and MySQL 5.7
I reviewed the SQLAlchemy documentation about subqueries and scalars, but I am not sure how to apply them in this case.
Thanks!
You will need to use scalar_subquery and aliased. This example uses PostgreSQL, but the same statement should work on MySQL. The output is not ordered as it is in your example. There is more information in the correlated subquery section of the documentation.
code
import sys
from sqlalchemy import (
create_engine,
Integer,
DateTime
)
from sqlalchemy.schema import (
Column,
)
from sqlalchemy.sql import select, func
from sqlalchemy.orm import declarative_base, Session, aliased
Base = declarative_base()
username, password, db = sys.argv[1:4]
engine = create_engine(f"postgresql+psycopg2://{username}:{password}@/{db}", echo=False)
class FeedbackQuestion(Base):
    __tablename__ = "feedback_question"
    id = Column(Integer, primary_key=True)
    created_at = Column(DateTime)
    question_id = Column(Integer, nullable=False)
    feedbacktype = Column(Integer, nullable=False)
Base.metadata.create_all(engine)
def extract(line):
    id, created_at, question_id, feedbacktype = line.split('|')[1:-1]
    return dict(id=id, created_at=created_at, question_id=question_id, feedbacktype=feedbacktype)
lines = [extract(line) for line in """| 1 | 2022-03-31 09:53:14 | 488 | 1 |
| 2 | 2022-03-31 09:53:21 | 508 | 1 |
| 3 | 2022-03-31 09:53:27 | 607 | 2 |
| 4 | 2022-03-31 09:53:31 | 606 | 1 |
| 5 | 2022-03-31 09:53:33 | 608 | 2 |
| 6 | 2022-03-31 09:55:30 | 608 | 2 |
| 7 | 2022-03-31 09:55:33 | 606 | 1 |
| 8 | 2022-03-31 09:55:40 | 607 | 1 |
| 9 | 2022-03-31 09:58:40 | 607 | 2 |
| 10 | 2022-03-31 09:59:04 | 607 | 2 |
""".splitlines() if line.strip()]
with Session(engine) as session, session.begin():
    session.add_all([FeedbackQuestion(**line) for line in lines])
with Session(engine) as session, session.begin():
    T = aliased(FeedbackQuestion, name="T")
    U = aliased(FeedbackQuestion, name="U")
    # Correlated scalar subquery: total answers for the same question_id.
    denom = select(func.count(U.feedbacktype)).where(U.question_id == T.question_id).scalar_subquery()
    q = select(T.question_id, T.feedbacktype, func.count(T.feedbacktype).label("count"), func.round((func.count(T.feedbacktype) * 100) / denom).label("countpercent")).group_by(T.question_id, T.feedbacktype)
    print(q)
    for result in session.execute(q):
        print(result)
query
SELECT "T".question_id, "T".feedbacktype, count("T".feedbacktype) AS count, round((count("T".feedbacktype) * :count_1) / (SELECT count("U".feedbacktype) AS count_2
FROM feedback_question AS "U"
WHERE "U".question_id = "T".question_id)) AS countpercent
FROM feedback_question AS "T" GROUP BY "T".question_id, "T".feedbacktype
output
(606, 1, 2, 100.0)
(608, 2, 2, 100.0)
(488, 1, 1, 100.0)
(607, 1, 1, 25.0)
(607, 2, 3, 75.0)
(508, 1, 1, 100.0)
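If you want to sanity-check the SQL itself without a PostgreSQL or MySQL server, here is a minimal sketch that runs the same correlated scalar subquery against an in-memory SQLite database using Python's standard library. SQLite is my assumption here, not what the question uses, and the function name feedback_percentages is made up for illustration. I multiply by 100.0 so that SQLite does not fall back to integer division.

```python
import sqlite3

# Rows copied from the feedback_question table in the question.
ROWS = [
    (1, "2022-03-31 09:53:14", 488, 1),
    (2, "2022-03-31 09:53:21", 508, 1),
    (3, "2022-03-31 09:53:27", 607, 2),
    (4, "2022-03-31 09:53:31", 606, 1),
    (5, "2022-03-31 09:53:33", 608, 2),
    (6, "2022-03-31 09:55:30", 608, 2),
    (7, "2022-03-31 09:55:33", 606, 1),
    (8, "2022-03-31 09:55:40", 607, 1),
    (9, "2022-03-31 09:58:40", 607, 2),
    (10, "2022-03-31 09:59:04", 607, 2),
]

# Same shape as the query in the question: the inner SELECT is correlated
# with the outer row through U.question_id = T.question_id.
QUERY = """
SELECT T.question_id,
       T.feedbacktype,
       COUNT(T.feedbacktype) AS "count",
       ROUND(COUNT(T.feedbacktype) * 100.0 /
             (SELECT COUNT(U.feedbacktype)
                FROM feedback_question AS U
               WHERE U.question_id = T.question_id)) AS countpercent
  FROM feedback_question AS T
 GROUP BY T.question_id, T.feedbacktype
 ORDER BY T.question_id, T.feedbacktype
"""


def feedback_percentages():
    """Load the sample rows into SQLite and return the grouped percentages."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feedback_question ("
        "id INTEGER PRIMARY KEY, created TEXT, "
        "question_id INTEGER NOT NULL, feedbacktype INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO feedback_question VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in feedback_percentages():
        print(row)
```

The ORDER BY is only there to make the output deterministic; the SQLAlchemy statement above would need an explicit order_by(...) for the same effect.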
Related
I have a query like this:
SELECT * FROM my_db.my_order where product_id = 395 order by id desc;
The output of this query looks like this:
+----------+----------+----------+---------------------+------------+--------+
| order_id | buyer_Id | quantity | createdAt | product_id | status |
+----------+----------+----------+---------------------+------------+--------+
| 6232 | 89450 | 1 | 2020-05-06 17:44:41 | 395 | 1 |
| 6232 | 89450 | 1 | 2020-05-06 17:44:41 | 395 | 1 |
| 6232 | 23048 | 2 | 2020-05-06 17:44:41 | 395 | 1 |
| 6232 | 89464 | 1 | 2020-05-06 17:44:40 | 395 | 1 |
| 6232 | 89463 | 1 | 2020-05-06 17:44:40 | 395 | 1 |
| 6232 | 89463 | 2 | 2020-05-06 17:43:25 | 395 | 0 |
| 6232 | 89464 | 2 | 2020-05-06 17:43:19 | 395 | 0 |
+----------+----------+----------+---------------------+------------+--------+
I want to sum the quantity for this product_id over the rows where status = 1, so I wrote this query:
SELECT
SUM(grouped_my_order_tbl.quantity)
FROM
(
SELECT
mo.order_seller_id AS order_seller_id,
mo.quantity AS quantity,
mo.status AS status,
mo.product_id AS product_id
FROM my_order mo
) grouped_my_order_tbl
WHERE
grouped_my_order_tbl.product_id = 395
AND grouped_my_order_tbl.order_seller_id = 6232
AND grouped_my_order_tbl.status = 1;
The resulting sum is 6 instead of 5.
Where did I go wrong?
Expected result: sum = 5
Update: the sum should be 5 because one row is duplicated (same buyer_id and createdAt), so the query counts that record twice when it should be counted only once.
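One possible way to get 5 is to collapse the duplicated rows before summing. Below is a minimal sketch using Python's built-in sqlite3 module with the rows from the table above. It assumes that two rows with the same buyer_Id, createdAt and quantity are the same record, and it filters on order_id as shown in the table output (the query in the question uses order_seller_id instead). The function name total_quantity is just for illustration.

```python
import sqlite3

# (order_id, buyer_Id, quantity, createdAt, product_id, status)
ORDERS = [
    (6232, 89450, 1, "2020-05-06 17:44:41", 395, 1),
    (6232, 89450, 1, "2020-05-06 17:44:41", 395, 1),  # exact duplicate row
    (6232, 23048, 2, "2020-05-06 17:44:41", 395, 1),
    (6232, 89464, 1, "2020-05-06 17:44:40", 395, 1),
    (6232, 89463, 1, "2020-05-06 17:44:40", 395, 1),
    (6232, 89463, 2, "2020-05-06 17:43:25", 395, 0),
    (6232, 89464, 2, "2020-05-06 17:43:19", 395, 0),
]

# SELECT DISTINCT in the derived table removes the repeated record
# before the outer SUM adds up the quantities.
DEDUPED_SUM = """
SELECT SUM(d.quantity)
  FROM (SELECT DISTINCT buyer_Id, createdAt, quantity
          FROM my_order
         WHERE product_id = ? AND order_id = ? AND status = 1) AS d
"""

# The plain sum, for comparison with the result in the question.
NAIVE_SUM = """
SELECT SUM(quantity)
  FROM my_order
 WHERE product_id = ? AND order_id = ? AND status = 1
"""


def total_quantity(product_id, order_id, dedupe=True):
    """Return the summed quantity, optionally ignoring duplicated rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_order (order_id INTEGER, buyer_Id INTEGER, "
        "quantity INTEGER, createdAt TEXT, product_id INTEGER, status INTEGER)"
    )
    conn.executemany("INSERT INTO my_order VALUES (?, ?, ?, ?, ?, ?)", ORDERS)
    sql = DEDUPED_SUM if dedupe else NAIVE_SUM
    (total,) = conn.execute(sql, (product_id, order_id)).fetchone()
    conn.close()
    return total


if __name__ == "__main__":
    print(total_quantity(395, 6232, dedupe=False))
    print(total_quantity(395, 6232))
```

If duplicates can legitimately occur (the same buyer ordering the same quantity twice within one second), DISTINCT would undercount, so a real fix probably belongs at the point where the duplicate rows are inserted.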
I have a MySQL query that sometimes returns no row for some buckets. For my dashboard I'd like to fill in those missing values, but I would prefer to avoid building dummy tables if I can.
query:
SELECT COUNT(Comms_Timestamp) as call_count,DAYOFWEEK(Comms_Timestamp) as bucket
FROM tblTest GROUP BY bucket;
results in
+------------+--------+
| call_count | bucket |
+------------+--------+
| 4 | 1 |
| 7 | 2 |
| 7 | 3 |
| 1 | 5 |
| 6 | 6 |
| 1 | 7 |
+------------+--------+
In the above example you can see that bucket 4 is missing. I considered joining to a SELECT ... UNION list of values, but since both fields are aggregates, I'm not sure how to go about it.
test data is
+---------------------+
| Comms_Timestamp |
+---------------------+
| 2018-12-24 06:04:05 |
| 2018-12-24 12:18:39 |
| 2018-12-21 04:24:31 |
| 2018-12-21 08:32:44 |
| 2018-12-30 01:41:06 |
| 2018-12-30 01:53:00 |
| 2018-12-30 01:53:39 |
| 2018-12-30 02:00:01 |
| 2018-12-17 15:55:03 |
| 2018-12-17 16:04:12 |
| 2018-12-17 16:05:41 |
| 2018-12-17 16:07:43 |
| 2018-12-17 16:10:25 |
| 2018-12-18 14:03:22 |
| 2018-12-18 14:03:29 |
| 2018-12-18 14:10:19 |
| 2018-12-18 14:10:29 |
| 2018-12-18 14:10:31 |
| 2018-12-18 14:10:47 |
| 2018-12-18 14:10:55 |
| 2018-12-20 08:21:07 |
| 2018-12-28 11:03:59 |
| 2018-12-28 12:06:40 |
| 2018-12-28 12:15:01 |
| 2018-12-28 14:29:24 |
| 2019-01-05 13:33:43 |
+---------------------+
Since you are using MySQL and don't have access to the seq_ tables that MariaDB provides, here is an alternative way:
SELECT A.x AS bucket, IF(ISNULL(COUNT(t2.Comms_Timestamp)), 0, COUNT(t2.Comms_Timestamp)) AS call_count FROM
(select 1 x union select 2 union select 3 union select 4 union select 5 union select 6 union select 7) AS A
LEFT JOIN tblTest AS t2 ON DAYOFWEEK(t2.Comms_Timestamp) = A.x
GROUP BY bucket
ORDER BY bucket;
It may not be the prettiest option, but it will do what you need.
Here is a db fiddle link: db<>fiddle
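To try the idea locally, here is a rough sketch of the same LEFT JOIN against an inline list of the seven buckets, run on in-memory SQLite through Python's sqlite3 module. SQLite has no DAYOFWEEK(), so the sketch assumes strftime('%w') + 1 as a stand-in (both number Sunday as the first day). The function name calls_per_weekday is hypothetical.

```python
import sqlite3

# Test data copied from the question.
TIMESTAMPS = [
    "2018-12-24 06:04:05", "2018-12-24 12:18:39",
    "2018-12-21 04:24:31", "2018-12-21 08:32:44",
    "2018-12-30 01:41:06", "2018-12-30 01:53:00",
    "2018-12-30 01:53:39", "2018-12-30 02:00:01",
    "2018-12-17 15:55:03", "2018-12-17 16:04:12",
    "2018-12-17 16:05:41", "2018-12-17 16:07:43",
    "2018-12-17 16:10:25",
    "2018-12-18 14:03:22", "2018-12-18 14:03:29",
    "2018-12-18 14:10:19", "2018-12-18 14:10:29",
    "2018-12-18 14:10:31", "2018-12-18 14:10:47",
    "2018-12-18 14:10:55",
    "2018-12-20 08:21:07",
    "2018-12-28 11:03:59", "2018-12-28 12:06:40",
    "2018-12-28 12:15:01", "2018-12-28 14:29:24",
    "2019-01-05 13:33:43",
]

# The derived table A lists every bucket; the LEFT JOIN keeps buckets with
# no matching rows, and COUNT() over the joined column yields 0 for them.
QUERY = """
SELECT A.x AS bucket, COUNT(t.Comms_Timestamp) AS call_count
  FROM (SELECT 1 AS x UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
        UNION SELECT 5 UNION SELECT 6 UNION SELECT 7) AS A
  LEFT JOIN tblTest AS t
    ON CAST(strftime('%w', t.Comms_Timestamp) AS INTEGER) + 1 = A.x
 GROUP BY A.x
 ORDER BY A.x
"""


def calls_per_weekday():
    """Return (bucket, call_count) for all seven weekdays, zero-filled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tblTest (Comms_Timestamp TEXT)")
    conn.executemany(
        "INSERT INTO tblTest VALUES (?)", [(ts,) for ts in TIMESTAMPS]
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for bucket, call_count in calls_per_weekday():
        print(bucket, call_count)
```

Because COUNT() of a column never returns NULL, the zero for bucket 4 comes straight from the LEFT JOIN without any extra IFNULL wrapping.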
If you are using MariaDB, you can use its Sequence Storage Engine.
No CREATE statement is needed for a sequence table; however, the maximum value must be known in advance.
select version();
| version() |
| :------------------------------------------ |
| 10.3.11-MariaDB-1:10.3.11+maria~stretch-log |
create table bob (a int)
✓
insert into bob values (4),(2)
✓
select * from seq_1_to_5
| seq |
| --: |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
SELECT s.seq, bob.a
FROM seq_1_to_5 s
LEFT JOIN bob
ON bob.a = s.seq
ORDER BY s.seq
seq | a
--: | ---:
1 | null
2 | 2
3 | null
4 | 4
5 | null
db<>fiddle here
You can use the IFNULL() function in MySQL:
SELECT IFNULL(COUNT(C.Comms_Timestamp),0) as call_count,IFNULL(DAYOFWEEK(C.Comms_Timestamp),0) as bucket
FROM tblCommunication as C LEFT JOIN tblCareTeam as CT on C.id_Case = CT.id_Case
GROUP BY CT.id_Site,bucket
HAVING CT.id_Site=8;
I have 2 simple tables. First table contains the absences made by the workers.
tbl_absences
+----+-------------+------------+-----------------------+
| id | id_employee | day_number | is_covered_by_a_range |
+----+-------------+------------+-----------------------+
| 1 | 1 | 18 | true/false |
| 2 | 1 | 3 | true/false |
| 3 | 2 | 21 | true/false |
| 4 | 1 | 13 | true/false |
| 5 | 2 | 22 | true/false |
| 6 | 1 | 10 | true/false |
| 7 | 1 | 7 | true/false |
.....
The second table contains the periods during which the worker was ill and on medical leave (the days between range_start and range_end).
tbl_sick_leave
+----+-------------+-------------+-----------+
| id | id_employee | range_start | range_end |
+----+-------------+-------------+-----------+
| 1 | 1 | 4 | 8 |
| 2 | 1 | 13 | 18 |
| 3 | 1 | 15 | 21 |
| 4 | 2 | 9 | 12 |
.....
I want the column is_covered_by_a_range in tbl_absences to hold a boolean value: true if that day is covered by any range in tbl_sick_leave, and false otherwise.
To update only for worker 1, my approach is a query something like this:
update tbl_absences AS a
SET a.is_covered_by_a_range=(EXISTS(
SELECT * FROM tbl_sick_leave AS s
WHERE s.id_employee=1 AND a.day_number BETWEEN s.range_start AND s.range_end
))
WHERE a.id_employee=1
Note: tbl_sick_leave can hold more than one range that covers a specific day.
Can this be done better? I'm afraid of the performance of such a query when it comes to big tables.
TEST 1
tbl_absences
+----+-------------+------------+-----------------------+
| id | id_employee | day_number | is_covered_by_a_range |
+----+-------------+------------+-----------------------+
| 1 | 1 | 5 | -1 |
| 2 | 1 | 10 | -1 |
+----+-------------+------------+-----------------------+
tbl_sick_leave
+----+-------------+-------------+-----------+
| id | id_employee | range_start | range_end |
+----+-------------+-------------+-----------+
| 1 | 1 | 2 | 6 |
+----+-------------+-------------+-----------+
Using this query it will update both rows:
update tbl_absences AS a
SET a.is_covered_by_a_range=(EXISTS(
SELECT * FROM tbl_sick_leave AS s
WHERE s.id_employee=1 AND a.day_number BETWEEN s.range_start AND s.range_end
))
WHERE a.id_employee=1
result (correct value in is_covered_by_a_range ):
tbl_absences
+----+-------------+------------+-----------------------+
| id | id_employee | day_number | is_covered_by_a_range |
+----+-------------+------------+-----------------------+
| 1 | 1 | 5 | 1 |
| 2 | 1 | 10 | 0 |
+----+-------------+------------+-----------------------+
You can do it with a JOIN
UPDATE tbl_absences AS a
LEFT JOIN tbl_sick_leave AS s ON a.id_employee = s.id_employee AND a.day_number BETWEEN s.range_start AND s.range_end
SET a.is_covered_by_a_range = s.id IS NOT NULL
WHERE a.id_employee = 1
It's hard to predict the performance, but you should certainly start with indexes on id_employee in both tables. If that's not good enough, a composite index on (id_employee, day_number) in tbl_absences might help. If not, post the output of an EXPLAIN query.
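For experimenting with the logic and the suggested indexes, here is a small sketch on in-memory SQLite via Python's sqlite3 module, loaded with the sample rows from the question. SQLite does not accept MySQL's UPDATE ... JOIN syntax, so the sketch uses the EXISTS form from the question, correlated on id_employee so that every employee is updated in one statement. The index names and the function name flag_covered_absences are made up for illustration.

```python
import sqlite3

SCHEMA_AND_DATA = """
CREATE TABLE tbl_absences (
    id INTEGER PRIMARY KEY,
    id_employee INTEGER NOT NULL,
    day_number INTEGER NOT NULL,
    is_covered_by_a_range INTEGER NOT NULL DEFAULT -1
);
CREATE TABLE tbl_sick_leave (
    id INTEGER PRIMARY KEY,
    id_employee INTEGER NOT NULL,
    range_start INTEGER NOT NULL,
    range_end INTEGER NOT NULL
);
-- Indexes along the lines suggested above (names are hypothetical).
CREATE INDEX idx_absences_emp_day ON tbl_absences (id_employee, day_number);
CREATE INDEX idx_sick_leave_emp ON tbl_sick_leave (id_employee);

INSERT INTO tbl_absences (id, id_employee, day_number) VALUES
    (1, 1, 18), (2, 1, 3), (3, 2, 21), (4, 1, 13),
    (5, 2, 22), (6, 1, 10), (7, 1, 7);
INSERT INTO tbl_sick_leave (id, id_employee, range_start, range_end) VALUES
    (1, 1, 4, 8), (2, 1, 13, 18), (3, 1, 15, 21), (4, 2, 9, 12);
"""

# EXISTS evaluates to 1 or 0, so overlapping ranges (days 15..18 are covered
# twice for employee 1) still produce a single boolean per absence row.
UPDATE = """
UPDATE tbl_absences
   SET is_covered_by_a_range = EXISTS (
       SELECT 1
         FROM tbl_sick_leave AS s
        WHERE s.id_employee = tbl_absences.id_employee
          AND tbl_absences.day_number BETWEEN s.range_start AND s.range_end
   )
"""


def flag_covered_absences():
    """Run the update and return (id, is_covered_by_a_range) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    conn.execute(UPDATE)
    rows = conn.execute(
        "SELECT id, is_covered_by_a_range FROM tbl_absences ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in flag_covered_absences():
        print(row)
```

This only checks correctness on a toy data set; whether the indexes help on big tables still has to be confirmed with EXPLAIN on the real MySQL server.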
I have three tables:
ITEMS
+----+--------------------------+----------+
| id | nome | quantity |
+----+--------------------------+----------+
| 1 | Pantaloni beige | 10 |
| 2 | Camicia cotone e seta | 1 |
| 3 | Camicia da notte | 5 |
| 4 | Completo notte | 3 |
+----+--------------------------+----------+
TRANSACTIONS
+----+---------------------+----------+-------------+
| id | data | quantity | id_articolo |
+----+---------------------+----------+-------------+
| 1 | 2016-07-19 15:28:09 | 3 | 1 |
| 2 | 2016-07-19 15:29:50 | 1 | 1 |
| 3 | 2016-07-19 15:59:34 | 1 | 2 |
| 4 | 2016-07-19 16:00:59 | 1 | 3 |
| 5 | 2016-07-19 16:01:10 | 1 | 188 |
| 6 | 2016-07-19 16:11:15 | 1 | 193 |
| 7 | 2016-07-19 16:11:24 | 1 | 194 |
| 8 | 2016-07-19 16:11:55 | 1 | 195 |
| 9 | 2016-07-19 16:51:14 | 1 | 204 |
+----+---------------------+----------+-------------+
RETURNED_ITEMS
+----+---------+-------------+----------+
| id | id_reso | id_articolo | quantity |
+----+---------+-------------+----------+
| 1 | 54 | 1 | 6 |
| 2 | 54 | 3 | 1 |
| 3 | 54 | 392 | 1 |
| 4 | 54 | 398 | 1 |
+----+---------+-------------+----------+
joined on "transactions.id_articolo" = "returned_items.id_articolo" = "items.id"
I want to retrieve a complete list containing only available products in which
(items.quantity) - (transactions.quantity) - (returned_items.quantity) > 0
eg. In the data above
item 1 = 0 [excluded]
item 2 = 0 [excluded]
item 3 = 3 [included in the list]
item 4 = 3 [included]
Any idea?
Thanks a lot!
V.
Looks like you will need to use inline views to aggregate quantity from the transactions table and the returned items tables
SELECT i.id
, i.quantity
, IFNULL(t.quantity,0) AS t_quantity
, IFNULL(r.quantity,0) AS r_quantity
, i.quantity - IFNULL(t.quantity,0) + IFNULL(r.quantity,0) AS calc_qty
FROM items i
LEFT
JOIN ( SELECT tt.id_articolo
, SUM(tt.quantity) AS quantity
FROM transactions tt
GROUP BY tt.id_articolo
) t
ON t.id_articolo = i.id
LEFT
JOIN ( SELECT rr.id_articolo
, SUM(rr.quantity) AS quantity
FROM returned_items rr
GROUP BY rr.id_articolo
) r
ON r.id_articolo = i.id
HAVING calc_qty > 0
ORDER BY i.id
For testing, omit the HAVING clause.
Note that the expression for calc_qty in the query above includes an addition operation. Go ahead and combine the quantity values with whatever arithmetic operations you need, and use a subtraction if that satisfies the requirements.
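As a quick check of the inline-view approach with the subtraction the question asks for, here is a sketch on in-memory SQLite using Python's sqlite3 module and the sample rows from the question. Instead of MySQL's HAVING on a column alias, the sketch wraps the query in an outer SELECT with a WHERE clause, which is my substitution for portability. The function name available_items is hypothetical.

```python
import sqlite3

SCHEMA_AND_DATA = """
CREATE TABLE items (id INTEGER PRIMARY KEY, nome TEXT, quantity INTEGER);
CREATE TABLE transactions (
    id INTEGER PRIMARY KEY, data TEXT, quantity INTEGER, id_articolo INTEGER
);
CREATE TABLE returned_items (
    id INTEGER PRIMARY KEY, id_reso INTEGER, id_articolo INTEGER,
    quantity INTEGER
);
INSERT INTO items VALUES
    (1, 'Pantaloni beige', 10), (2, 'Camicia cotone e seta', 1),
    (3, 'Camicia da notte', 5), (4, 'Completo notte', 3);
INSERT INTO transactions VALUES
    (1, '2016-07-19 15:28:09', 3, 1), (2, '2016-07-19 15:29:50', 1, 1),
    (3, '2016-07-19 15:59:34', 1, 2), (4, '2016-07-19 16:00:59', 1, 3),
    (5, '2016-07-19 16:01:10', 1, 188), (6, '2016-07-19 16:11:15', 1, 193),
    (7, '2016-07-19 16:11:24', 1, 194), (8, '2016-07-19 16:11:55', 1, 195),
    (9, '2016-07-19 16:51:14', 1, 204);
INSERT INTO returned_items VALUES
    (1, 54, 1, 6), (2, 54, 3, 1), (3, 54, 392, 1), (4, 54, 398, 1);
"""

# Each inline view aggregates to one row per id_articolo first, so the
# LEFT JOINs cannot multiply rows and inflate the sums.
QUERY = """
SELECT id, calc_qty
  FROM (SELECT i.id,
               i.quantity
                 - IFNULL(t.quantity, 0)
                 - IFNULL(r.quantity, 0) AS calc_qty
          FROM items AS i
          LEFT JOIN (SELECT id_articolo, SUM(quantity) AS quantity
                       FROM transactions
                      GROUP BY id_articolo) AS t
            ON t.id_articolo = i.id
          LEFT JOIN (SELECT id_articolo, SUM(quantity) AS quantity
                       FROM returned_items
                      GROUP BY id_articolo) AS r
            ON r.id_articolo = i.id) AS stock
 WHERE calc_qty > 0
 ORDER BY id
"""


def available_items():
    """Return (item id, remaining quantity) for items still in stock."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(available_items())
```

With the sample data, items 1 and 2 come out at zero and are excluded, which matches the expected list in the question.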
I have a single table like :
mysql> select RefID,State,StartTime,EndTime from execReports limit 5;
+--------------------------------------+-----------+---------------------+---------------------+
| RefID | State | StartTime | EndTime |
+--------------------------------------+-----------+---------------------+---------------------+
| 00019a52-8480-4431-9ad2-3767c3933627 | Completed | 2016-04-18 13:45:00 | 2016-04-18 13:45:01 |
| 00038a8a-995e-4cb2-a335-cb05d5b3e92d | Aborted | 2016-05-03 04:00:00 | 2016-05-03 04:00:02 |
| 001013f8-0b86-456f-bd59-a7ef066e565f | Completed | 2016-04-14 03:30:00 | 2016-04-14 03:30:11 |
| 001f8d23-3022-4271-bba0-200494de678a | Failed | 2016-04-30 05:00:00 | 2016-04-30 05:00:02 |
| 0027ba42-1c37-4e50-a7d6-a4e24056e080 | Completed | 2016-04-18 03:45:00 | 2016-04-18 03:45:02 |
+--------------------------------------+-----------+---------------------+---------------------+
I can extract the count of exec for each state with :
mysql> select distinct State,count(StartTime) as nbExec from execReports group by State;
+-----------+--------+
| State | nbExec |
+-----------+--------+
| Aborted | 3 |
| Completed | 14148 |
| Failed | 49 |
+-----------+--------+
4 rows in set (0.02 sec)
I can extract the count of exec for each week with :
mysql> select distinct extract(week from StartTime) as Week, count(StartTime) as nbExec from execReports group by Week;
+------+--------+
| Week | nbExec |
+------+--------+
| 14 | 1317 |
| 15 | 3051 |
| 16 | 3066 |
| 17 | 3059 |
| 18 | 3059 |
| 19 | 652 |
+------+--------+
6 rows in set (0.01 sec)
But I would like to extract a cross-tabulation like this:
+------+---------+-----------+--------+---------+---------+
| Week | nbExec | Completed | Failed | Running | Aborted |
+------+---------+-----------+--------+---------+---------+
| 14 | 1317 | 1312 | 3 | 1 | 1 |
| 15 | 3051 | 3050 | 1 | 0 | 0 |
| 16 | 3066 | 3060 | 3 | 2 | 1 |
| 17 | 3059 | 3058 | 0 | 1 | 0 |
| 18 | 3059 | 3057 | 1 | 0 | 1 |
| 19 | 652 | 652 | 0 | 0 | 0 |
+------+---------+-----------+--------+---------+---------+
I've been stuck on this for a few days. Any help is appreciated.
Best regards
select extract(week from StartTime) as Week, count(StartTime) as nbExec,
sum(if(state="Completed",1,0)) Completed,
sum(if(state="Failed",1,0)) Failed,
sum(if(state="Running",1,0)) Running,
sum(if(state="Aborted",1,0)) Aborted
from execReports group by Week;
demo
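The conditional-aggregation pivot can also be tried out without a MySQL server. The sketch below uses Python's sqlite3 module with a handful of made-up rows (the RefID values and timestamps are hypothetical, not from the question). SQLite lacks IF() and EXTRACT(), so it assumes SUM(CASE WHEN ...) and strftime('%W') as stand-ins; note that strftime('%W') numbers weeks from the first Monday of the year, which may not match MySQL's default week mode.

```python
import sqlite3

# Hypothetical sample rows: (RefID, State, StartTime).
SAMPLE = [
    ("a1", "Completed", "2016-04-18 13:45:00"),
    ("a2", "Failed", "2016-04-19 05:00:00"),
    ("a3", "Completed", "2016-04-20 03:30:00"),
    ("a4", "Running", "2016-04-25 04:00:00"),
    ("a5", "Aborted", "2016-04-26 04:00:00"),
    ("a6", "Completed", "2016-04-27 03:45:00"),
]

# One pass over the table: each SUM(CASE ...) counts only the rows whose
# State matches, so every state becomes its own column per week.
QUERY = """
SELECT strftime('%W', StartTime) AS Week,
       COUNT(StartTime) AS nbExec,
       SUM(CASE WHEN State = 'Completed' THEN 1 ELSE 0 END) AS Completed,
       SUM(CASE WHEN State = 'Failed' THEN 1 ELSE 0 END) AS Failed,
       SUM(CASE WHEN State = 'Running' THEN 1 ELSE 0 END) AS Running,
       SUM(CASE WHEN State = 'Aborted' THEN 1 ELSE 0 END) AS Aborted
  FROM execReports
 GROUP BY Week
 ORDER BY Week
"""


def weekly_state_counts():
    """Return one row per week with the total and per-state counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE execReports (RefID TEXT, State TEXT, StartTime TEXT)"
    )
    conn.executemany("INSERT INTO execReports VALUES (?, ?, ?)", SAMPLE)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in weekly_state_counts():
        print(row)
```

Compared with joining the table to itself once per state, this form reads the table a single time and does not depend on RefID being unique.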
You can also join the table to itself once per state. If you need a dynamic number of columns, see: MySQL pivot row into dynamic number of columns
SELECT
extract(week from a.StartTime) as Week,
count(a.StartTime) as nbExec,
count(b1.StartTime) as Completed,
count(b2.StartTime) as Failed,
count(b3.StartTime) as Running,
count(b4.StartTime) as Aborted
FROM execReports a
LEFT JOIN execReports b1 ON a.refID = b1.refID and b1.state ='Completed'
LEFT JOIN execReports b2 ON a.refID = b2.refID and b2.state ='Failed'
LEFT JOIN execReports b3 ON a.refID = b3.refID and b3.state ='Running'
LEFT JOIN execReports b4 ON a.refID = b4.refID and b4.state ='Aborted'
GROUP BY 1