Translate MySQL query to Postgres

I can't get this query to work in Postgres: I always get either an error or wrong results. I'm trying to compare each day's count with the previous day's count.
This MySQL query works fine:
SELECT
DATE_FORMAT(crh.date, '%d-%m-%Y') AS name,
DATE_FORMAT(crh.date, '%Y-%m-%d') AS nameGroup,
COUNT(crh.id) AS turnover,
crh_.name AS nameChr,
crh_.nameGroup AS nameGroupChr,
crh_.turnover AS turnoverChr
FROM
camera_reports_history AS crh
LEFT JOIN(
SELECT DATE_FORMAT(crh_.date, '%d-%m-%Y') AS name,
DATE_FORMAT(crh_.date, '%Y-%m-%d') AS nameGroup,
COUNT(crh_.id) AS turnover
FROM
camera_reports_history AS crh_
WHERE
crh_.date >= '2018-07-09 00:00:00' AND crh_.date <= '2018-07-20 14:02:22'
GROUP BY
nameGroup
) AS crh_
ON
crh_.nameGroup = DATE_FORMAT(SUBDATE(crh.date, 1),
'%Y-%m-%d')
WHERE
crh.date >= '2018-07-10 00:00:00' AND crh.date <= '2018-07-20 14:02:22'
GROUP BY
nameGroup
Result:
"10-07-2018","2018-07-10","418","09-07-2018","2018-07-09","581"
"11-07-2018","2018-07-11","389","10-07-2018","2018-07-10","418"
"12-07-2018","2018-07-12","453","11-07-2018","2018-07-11","389"
"13-07-2018","2018-07-13","401","12-07-2018","2018-07-12","453"
...
My PostgreSQL query looks like this:
SELECT
to_char(crh."date", 'DD-MM-YYYY') AS name,
to_char(crh."date", 'YYYY-MM-DD') AS nameGroup,
COUNT(crh.id) AS turnover,
crh_.name AS nameChr,
crh_.nameGroup AS nameGroupChr,
crh_.turnover AS turnoverChr
FROM
camera_reports_history AS crh
LEFT JOIN(
SELECT to_char(crh_."date", 'DD-MM-YYYY') AS name,
to_char(crh_."date", 'YYYY-MM-DD') AS nameGroup,
COUNT(crh_.id) AS turnover
FROM
camera_reports_history AS crh_
WHERE
crh_.date >= '2018-07-09 00:00:00' AND crh_.date <= '2018-07-20 14:02:22'
GROUP BY
nameGroup,
name
ORDER BY
nameGroup
) AS crh_
ON
crh_.nameGroup = to_char(
crh."date" - INTERVAL '1 day',
'YYYY-MM-DD'
)
WHERE
crh.date >= '2018-07-10 00:00:00' AND crh.date <= '2018-07-20 14:02:22'
GROUP BY
nameGroup,
name
ORDER BY
nameGroup
It fails with:
ERROR: column `crh.date` must appear in the GROUP BY clause or be used in an aggregate function
If I add the required columns:
GROUP BY nameGroup, name, date, crh_.nameGroup, crh_.name, crh_.turnover
I get useless results.
Could someone help me, please?

I found a solution:
SELECT
DISTINCT ON(nameGroup)
to_char(crh."date", 'DD-MM-YYYY') AS name,
to_char(crh."date", 'YYYY-MM-DD') AS nameGroup,
crh__.turnover AS turnover,
crh_.name AS nameChr,
crh_.nameGroup AS nameGroupChr,
crh_.turnover AS turnoverChr
FROM
camera_reports_history AS crh
LEFT JOIN(
SELECT
to_char(crh__."date", 'YYYY-MM-DD') AS nameGroup__,
COUNT(crh__.id) AS turnover
FROM
camera_reports_history AS crh__
WHERE
crh__.date >= '2018-07-10 00:00:00' AND crh__.date <= '2018-07-20 14:02:22'
GROUP BY
nameGroup__
) AS crh__ ON crh__.nameGroup__ = to_char((crh."date"), 'YYYY-MM-DD')
LEFT JOIN(
SELECT to_char(crh_."date", 'DD-MM-YYYY') AS name,
to_char(crh_."date", 'YYYY-MM-DD') AS nameGroup,
COUNT(crh_.id) AS turnover
FROM
camera_reports_history AS crh_
WHERE
crh_.date >= '2018-07-09 00:00:00' AND crh_.date <= '2018-07-20 14:02:22'
GROUP BY
nameGroup,
name
ORDER BY
nameGroup
) AS crh_ ON crh_.nameGroup = to_char((crh."date" - INTERVAL '1 day'), 'YYYY-MM-DD')
WHERE
crh.date >= '2018-07-10 00:00:00' AND crh.date <= '2018-07-20 14:02:22'
GROUP BY
nameGroup,
name,
crh.date,
crh__.turnover,
crh_.nameGroup,
crh_.name,
crh_.turnover
ORDER BY
nameGroup
I added a second LEFT JOIN to count the results and used DISTINCT ON(nameGroup) in the main SELECT.
I don't think it's a perfect query, but it solves my problem for now.
Please feel free to optimize this query.

Transaction blocks work well in PostgreSQL, so instead of one complex compound statement that tries to sort, join, group, and filter all at once, you can use temporary tables. This lets you break the work into several simple statements and keep the intermediate data in temp tables. If you create them with ON COMMIT DROP, they disappear at the end of the transaction block; otherwise they last until the session ends.
I think you'll find this method a lot easier to debug.
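Here is a rough sketch of that two-step idea. It's not Postgres itself: it uses Python's built-in sqlite3 as a stand-in so it can run anywhere, and the sample rows are made up. The table and column names come from the question; the temp table name `daily` is my own. In PostgreSQL you would presumably wrap the steps in BEGIN ... COMMIT and write `CREATE TEMP TABLE daily ON COMMIT DROP AS SELECT ...`, with to_char() and INTERVAL instead of SQLite's date().

```python
import sqlite3

# SQLite (in-memory) stands in for PostgreSQL; the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE camera_reports_history (id INTEGER PRIMARY KEY, "date" TEXT)')
rows = [
    ("2018-07-09 08:00:00",), ("2018-07-09 09:30:00",), ("2018-07-09 17:45:00",),
    ("2018-07-10 10:00:00",), ("2018-07-10 11:15:00",),
    ("2018-07-11 12:00:00",),
]
conn.executemany('INSERT INTO camera_reports_history ("date") VALUES (?)', rows)

# Step 1: aggregate once, one row per day, into a temporary table.
conn.execute("""
    CREATE TEMP TABLE daily AS
    SELECT date("date") AS day, COUNT(id) AS turnover
    FROM camera_reports_history
    WHERE "date" >= '2018-07-09 00:00:00' AND "date" <= '2018-07-20 14:02:22'
    GROUP BY date("date")
""")

# Step 2: join each day to the day before it; no GROUP BY needed here.
result = conn.execute("""
    SELECT cur.day, cur.turnover, prev.day, prev.turnover
    FROM daily AS cur
    LEFT JOIN daily AS prev ON prev.day = date(cur.day, '-1 day')
    WHERE cur.day >= '2018-07-10'
    ORDER BY cur.day
""").fetchall()

for row in result:
    print(row)
# ('2018-07-10', 2, '2018-07-09', 3)
# ('2018-07-11', 1, '2018-07-10', 2)
```

Because the counting happens only once in step 1, the final join never needs the long GROUP BY list that caused the original error.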

Related

Redshift SQL Query Between Current Date and 7 days ago

I've been trying to filter the data to the last X days.
Each of these subqueries works on its own when I remove the time filter from the WHERE clause.
But when I join them and add a time filter to the WHERE clause, I get no results.
SELECT x.datex, Signups, Page_load FROM (SELECT CAST (mp_date AS DATE) AS datex, mp_event_name, COUNT(DISTINCT mp_device_id) AS Signups
FROM mp_master_event
WHERE mp_event_name = 'email_page_submit' AND datex between CURRENT_DATE AND CURRENT_DATE - INTERVAL '7 DAY'
GROUP BY mp_event_name, datex
ORDER BY datex DESC) x JOIN (SELECT CAST(mp_date AS DATE) AS datex, mp_event_name, COUNT(DISTINCT mp_device_id) AS Page_load
FROM mp_master_event
WHERE mp_event_name = 'home_page_load_confirm' AND datex between CURRENT_DATE AND CURRENT_DATE - INTERVAL '7 DAY'
GROUP BY datex, mp_event_name
ORDER BY datex DESC) y ON x.datex = y.datex
What should I do? Also, since the table is huge, the query takes a long time, and this is only about 10% of what I want the whole query to do. Any suggestions are appreciated.
Got it :)
datex >= DATE(dateadd(DAY,-7, current_date))
SELECT x.datex, Signups, Page_load FROM (SELECT CAST (mp_date AS DATE) AS datex, mp_event_name, COUNT(DISTINCT mp_device_id) AS Signups
FROM mp_master_event
WHERE mp_event_name = 'email_page_submit' AND datex >= DATE(dateadd(DAY,-7, current_date))
GROUP BY mp_event_name, datex
ORDER BY datex DESC) x JOIN (SELECT CAST(mp_date AS DATE) AS datex, mp_event_name, COUNT(DISTINCT mp_device_id) AS Page_load
FROM mp_master_event
WHERE mp_event_name = 'home_page_load_confirm' AND datex >= DATE(dateadd(DAY,-7, current_date))
GROUP BY datex, mp_event_name
ORDER BY datex DESC) y ON x.datex = y.datex
datex between CURRENT_DATE AND CURRENT_DATE - INTERVAL '7 DAY'
should be
datex between CURRENT_DATE - INTERVAL '7 DAY' AND CURRENT_DATE
The earlier date must come first.
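To see why the order matters: `x BETWEEN a AND b` is shorthand for `x >= a AND x <= b`, so with the bounds reversed nothing can match. A tiny check, using Python's sqlite3 as a stand-in for Redshift (the literal dates are just examples):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Later date first: no value can be >= the later date AND <= the earlier one.
reversed_bounds = conn.execute(
    "SELECT '2018-07-15' BETWEEN '2018-07-20' AND '2018-07-13'"
).fetchone()[0]

# Earlier date first: the value falls inside the range.
ordered_bounds = conn.execute(
    "SELECT '2018-07-15' BETWEEN '2018-07-13' AND '2018-07-20'"
).fetchone()[0]

print(reversed_bounds, ordered_bounds)  # 0 1
```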

sql group by with double conditions

I need to count the distinct parent_ids that meet one of the conditions below, grouped by day:
parent_ids that have both status = pending and status = processing
OR
parent_ids that have both status = canceled and status = processing.
I've tried something like:
SELECT count(parent_id) as pencan, created_at, DATE_FORMAT(a.created_at, '%Y') AS year_key, DATE_FORMAT(a.created_at, '%m-%d') as day_key
FROM sales_flat_order_status_history
where created_at BETWEEN '2010-01-01 00:00:00' AND '2013-04-30 23:59:59'
GROUP BY created_at ,parent_id
HAVING SUM(status = 'processing')
AND SUM(status IN ('pending', 'cancelling'))
I think you just need to fix the group by:
SELECT DATE(created_at), count(parent_id) as pencan
FROM sales_flat_order_status_history
where created_at >= '2010-01-01' AND
created_at < '2013-05-01'
GROUP BY DATE(created_at) , parent_id
HAVING SUM(status = 'processing') AND
SUM(status IN ('pending', 'cancelling'))
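If what you actually want is one number per day (how many parent_ids qualify), one possible approach is to wrap the grouped query in an outer query that counts its rows. This is only a sketch: it runs on Python's sqlite3 with invented rows, uses SQLite's date() where MySQL has DATE(), and assumes the status is spelled 'canceled' as in the question text. It also assumes both statuses must occur on the same day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_flat_order_status_history "
    "(parent_id INTEGER, status TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO sales_flat_order_status_history VALUES (?, ?, ?)",
    [
        (1, "pending", "2013-01-05 09:00:00"), (1, "processing", "2013-01-05 10:00:00"),
        (2, "processing", "2013-01-05 11:00:00"),  # processing only: does not qualify
        (3, "canceled", "2013-01-05 12:00:00"), (3, "processing", "2013-01-05 13:00:00"),
        (4, "pending", "2013-01-06 09:00:00"), (4, "processing", "2013-01-06 09:30:00"),
    ],
)

# Inner query: one row per (day, parent_id) that has 'processing' plus
# 'pending' or 'canceled'. Outer query: count those rows per day.
result = conn.execute("""
    SELECT day, COUNT(*) AS pencan
    FROM (
        SELECT date(created_at) AS day, parent_id
        FROM sales_flat_order_status_history
        WHERE created_at >= '2010-01-01' AND created_at < '2013-05-01'
        GROUP BY date(created_at), parent_id
        HAVING SUM(status = 'processing') > 0
           AND SUM(status IN ('pending', 'canceled')) > 0
    ) AS qualifying
    GROUP BY day
    ORDER BY day
""").fetchall()

print(result)  # [('2013-01-05', 2), ('2013-01-06', 1)]
```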

SQL Sub query with Grouping

SELECT customer_email, count(*) AS Order_Count,
MAX(created_at) as Last_Order_Date,
SUM(base_total_paid) AS Total_Lifetime_Sales,
SUM(base_total_offline_refunded+base_total_online_refunded) AS Refund_Total,
FROM mage_sales_order AS o
WHERE o.created_at > “2018-01-01”
AND
value NOT IN (Select customer_email
FROM mage_sales_order
WHERE WHERE o.created_at < “2018-10-01”)
I'm trying to exclude orders placed in the last week, but I get stuck on the WHERE ... AND part and am not sure how to proceed. Thank you for any help!
I think you want to exclude the emails of customers who ordered before '2018-01-01', so the query would be:
SELECT customer_email, count(*) AS Order_Count,
MAX(created_at) as Last_Order_Date,
SUM(base_total_paid) AS Total_Lifetime_Sales,
SUM(base_total_offline_refunded+base_total_online_refunded) AS Refund_Total
FROM mage_sales_order AS o
WHERE o.created_at > '2018-01-01' AND
customer_email NOT IN (Select customer_email
FROM mage_sales_order
WHERE created_at < '2018-10-01'
)
group by customer_email
You need a group by. The answer to your question is:
SELECT customer_email, count(*) AS Order_Count,
MAX(created_at) as Last_Order_Date,
SUM(base_total_paid) AS Total_Lifetime_Sales,
SUM(base_total_offline_refunded+base_total_online_refunded) AS Refund_Total
FROM mage_sales_order AS o
WHERE o.created_at < CURRENT_DATE - INTERVAL '1 week'
GROUP BY customer_email;
If you want to filter customers who haven't made a recent order:
SELECT customer_email, count(*) AS Order_Count,
MAX(created_at) as Last_Order_Date,
SUM(base_total_paid) AS Total_Lifetime_Sales,
SUM(base_total_offline_refunded+base_total_online_refunded) AS Refund_Total
FROM mage_sales_order AS o
GROUP BY customer_email
HAVING MAX(o.created_at) < CURRENT_DATE - INTERVAL '1 week'
Note that date/time functions differ by database, so the exact syntax might differ depending on what database you are using.
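As a quick illustration of the HAVING version, here is a sketch that runs on Python's sqlite3, so the date arithmetic uses SQLite's `date(?, '-7 days')` instead of `CURRENT_DATE - INTERVAL '1 week'`. The rows are made up, the refund columns are left out, and "today" is passed in as a fixed parameter so the result is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mage_sales_order "
    "(customer_email TEXT, created_at TEXT, base_total_paid REAL)"
)
conn.executemany(
    "INSERT INTO mage_sales_order VALUES (?, ?, ?)",
    [
        ("a@example.com", "2018-03-01", 10.0),
        ("a@example.com", "2018-10-05", 20.0),  # ordered within the last week
        ("b@example.com", "2018-02-01", 5.0),
        ("b@example.com", "2018-06-01", 7.5),
    ],
)

today = "2018-10-08"  # fixed reference date, for illustration only

# Keep only customers whose most recent order is older than one week.
result = conn.execute("""
    SELECT customer_email,
           COUNT(*)             AS Order_Count,
           MAX(created_at)      AS Last_Order_Date,
           SUM(base_total_paid) AS Total_Lifetime_Sales
    FROM mage_sales_order
    GROUP BY customer_email
    HAVING MAX(created_at) < date(?, '-7 days')
""", (today,)).fetchall()

print(result)  # [('b@example.com', 2, '2018-06-01', 12.5)]
```

The customer with a recent order drops out entirely, including their older orders, which is the difference from filtering in WHERE.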

Better performance in MySQL subqueries for timeline graph

I have a query with subqueries for a timeline widget showing participants, leads, and customers.
For example, with 15k rows in the table but only 2k in this date range (January 1st to January 28th), it takes about 40 seconds!
SELECT created_at as date,
(
SELECT COUNT(id)
FROM participant
WHERE created_at <= date
) as participants,
(
SELECT COUNT(DISTINCT id)
FROM participant
WHERE participant_type = "lead"
AND created_at <= date
) as leads,
(
SELECT COUNT(DISTINCT id)
FROM participant
WHERE participant_type = "customer"
AND created_at <= date
) as customer
FROM participant
WHERE created_at >= '2016-01-01 00:00:00'
AND created_at <= '2016-01-28 23:59:59'
GROUP BY date(date)
How can I improve the performance?
The table fields are declared as follows:
id => primary_key, INT 10, auto increment
participant_type => ENUM "lead,customer", NULLABLE, utf8_unicode_ci
created_at => TIMESTAMP, default '0000-00-00 00:00:00'
One option is to cross join the table to itself and put the conditions inside the counts (or sums):
SELECT a.created_at as date,
SUM(IF(b.created_at <= a.created_at, 1, 0)) AS participants,
COUNT(DISTINCT IF(b.participant_type = "lead" AND b.created_at <= a.created_at, b.id, NULL)) AS leads,
COUNT(DISTINCT IF(b.participant_type = "customer" AND b.created_at <= a.created_at, b.id, NULL)) AS customer
FROM participant a
CROSS JOIN participant b
WHERE a.created_at >= '2016-01-01 00:00:00'
AND a.created_at <= '2016-01-28 23:59:59'
GROUP BY date(date)
Or move the date check into the join condition:
SELECT a.created_at as date,
COUNT(b.id) AS participants,
COUNT(DISTINCT IF(b.participant_type = "lead", b.id, NULL)) AS leads,
COUNT(DISTINCT IF(b.participant_type = "customer", b.id, NULL)) AS customer
FROM participant a
LEFT OUTER JOIN participant b
ON b.created_at <= a.created_at
WHERE a.created_at >= '2016-01-01 00:00:00'
AND a.created_at <= '2016-01-28 23:59:59'
GROUP BY date(date)
I don't fully understand what you want to do with this query, but here is a possible way to optimize it.
Try this one:
SELECT
participants.day as day,
participants.total_count,
leads.lead_count,
customer.customer_count
FROM
(
SELECT created_at as day, COUNT(id) as total_count
FROM participant
WHERE created_at BETWEEN '2016-01-01 00:00:00' AND '2016-01-28 23:59:59'
GROUP BY day
) as participants
LEFT JOIN
(
SELECT created_at as day, COUNT(DISTINCT id) as lead_count
FROM participant
WHERE participant_type = "lead"
AND created_at BETWEEN '2016-01-01 00:00:00' AND '2016-01-28 23:59:59'
GROUP BY day
) as leads ON (participants.day = leads.day)
LEFT JOIN
(
SELECT created_at as day, COUNT(DISTINCT id) as customer_count
FROM participant
WHERE participant_type = "customer"
AND created_at BETWEEN '2016-01-01 00:00:00' AND '2016-01-28 23:59:59'
GROUP BY day
) as customer ON (participants.day = customer.day)
Add indexes to the table. You can run EXPLAIN on this query.
With the help of EXPLAIN, you can see where you should add indexes so that the statement executes faster by using them to find rows.
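To give a feel for the workflow, here is a small sketch using Python's sqlite3. Note that this is SQLite's `EXPLAIN QUERY PLAN`, not MySQL's `EXPLAIN`, so the output format differs, and the index name `idx_participant_created_at` is just one I picked. The idea is the same: look at the plan, add an index on the filtered column, and check that the plan now uses it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE participant "
    "(id INTEGER PRIMARY KEY, participant_type TEXT, created_at TEXT)"
)

query = """
    SELECT COUNT(id) FROM participant
    WHERE created_at >= '2016-01-01 00:00:00'
      AND created_at <= '2016-01-28 23:59:59'
"""

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

plan_before = plan(query)  # expected to describe a scan of the whole table

# Hypothetical index on the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_participant_created_at ON participant (created_at)")

plan_after = plan(query)   # expected to mention the new index

print(plan_before)
print(plan_after)
```

In MySQL the equivalent would presumably be `CREATE INDEX ... ON participant (created_at)` followed by `EXPLAIN SELECT ...`, checking the `key` column of the output.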

MYSQL UNION GROUP BY

I'm doing this select statement:
SELECT * FROM (
SELECT COUNT(t.text) as count, COUNT(DISTINCT(t.from_user_id)) as usercount, DATE_FORMAT(t.created_at,'%Y-%m-%d %H:00') datepart
FROM TABLE1 t WHERE t.created_at >= '2015-08-12 00:00:00' AND t.created_at <= '2015-08-13 18:30:00' AND t.eliminar IS NULL
GROUP BY datepart) as t
UNION ALL
SELECT * FROM (
SELECT COUNT(b.id) as count, COUNT(DISTINCT(b.from_user_id)) as usercount, DATE_FORMAT(b.created_at,'%Y-%m-%d %H:00') datepart
FROM TABLE2 b WHERE b.created_at >= '2015-08-12 00:00:00' AND b.created_at <= '2015-08-13 18:30:00' AND b.eliminar IS NULL
GROUP BY datepart) as x GROUP BY datepart
this select gets this:
I'm trying to get the results grouped by datepart but can't. Any idea what I'm doing wrong?
TABLE2 only has (id, from_user_id, eliminar), and all are NULL except created_at; in that column I have the entire year 2015 by day and hour, in the same format as TABLE1.
SOLVED:
SELECT DISTINCT * FROM (
SELECT COUNT(t.text) as count, COUNT(DISTINCT(t.from_user_id)) as usercount, DATE_FORMAT(t.created_at,'%Y-%m-%d %H:00') datepart
FROM TABLE1 t WHERE t.created_at >= '2015-08-12 00:00:00' AND t.created_at <= '2015-08-13 18:30:00' AND t.eliminar IS NULL
GROUP BY datepart
UNION ALL
SELECT COUNT(t.id) as count, COUNT(DISTINCT(t.from_user_id)) as usercount, DATE_FORMAT(t.created_at,'%Y-%m-%d %H:00') datepart
FROM TABLE2 t WHERE t.created_at >= '2015-08-12 00:00:00' AND t.created_at <= '2015-08-13 18:30:00' AND t.eliminar IS NULL
GROUP BY datepart) as x GROUP BY datepart ORDER BY datepart
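If the goal is one row per datepart with the counts from both tables added together, another option might be to SUM over the UNION ALL in the outer query instead of relying on DISTINCT. A sketch under assumptions: it runs on Python's sqlite3, so `strftime` replaces `DATE_FORMAT`, the rows are invented, and only the plain count is combined (distinct user counts from two tables can't simply be added).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (text TEXT, from_user_id INTEGER, created_at TEXT, eliminar TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER, from_user_id INTEGER, created_at TEXT, eliminar TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", [
    ("hi", 1, "2015-08-12 10:05:00", None),
    ("yo", 2, "2015-08-12 10:40:00", None),
    ("ok", 1, "2015-08-12 11:10:00", None),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?, ?)", [
    (1, None, "2015-08-12 10:00:00", None),
    (2, None, "2015-08-12 11:00:00", None),
    (3, None, "2015-08-12 12:00:00", None),
])

# Each branch counts per hour; the outer query adds the two counts per hour.
result = conn.execute("""
    SELECT datepart, SUM(cnt) AS total
    FROM (
        SELECT strftime('%Y-%m-%d %H:00', created_at) AS datepart, COUNT(text) AS cnt
        FROM table1
        WHERE created_at >= '2015-08-12 00:00:00' AND created_at <= '2015-08-13 18:30:00'
          AND eliminar IS NULL
        GROUP BY datepart
        UNION ALL
        SELECT strftime('%Y-%m-%d %H:00', created_at) AS datepart, COUNT(id) AS cnt
        FROM table2
        WHERE created_at >= '2015-08-12 00:00:00' AND created_at <= '2015-08-13 18:30:00'
          AND eliminar IS NULL
        GROUP BY datepart
    ) AS x
    GROUP BY datepart
    ORDER BY datepart
""").fetchall()

print(result)
# [('2015-08-12 10:00', 3), ('2015-08-12 11:00', 2), ('2015-08-12 12:00', 1)]
```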