Counting referrers' new members between 2 dates from the same members table - mysql

The project I'm working on right now needs a referral system (they currently have 50K members). I decided to add ref and ref_id fields to the members table.
Structure of the members table:
id (int auto),
admin (enum (1,0)),
ref (enum (1,0)),
ref_id (int),
country_id (int),
city_id(int),
town_id(int),
totalRef (int),
fullName (varchar),
registrationDate (datetime)
I would like to list the referrers who gained new members between two dates. To provide a bit more detail, I also added country, city, and town to the query. I tried the following query, but I don't think it is a good approach, since it takes a really long time to run:
SELECT m.id, m.fullName, m.country_id, m.city_id, m.town_id, m.totalRef,
(select name from country where country.id = m.country_id) as countryName,
(select name from city where city.id = m.city_id) as cityName,
(select name from town where town.id = m.town_id) as townName,
(select count(id) from members where members.ref_id = m.id AND ref_id > 0 AND registrationDate BETWEEN '2011.11.04 00:00:00' AND '2011.11.04 23:59:59') as newRef
FROM members as m
WHERE
m.country_id = '224' AND
m.city_id = '4567' AND
m.town_id = '78964' AND
m.admin = '0' AND
m.ref = '1'
ORDER BY newRef DESC
LIMIT 0, 25
I would be glad if you could help me with this problem. Thank you in advance.

Something like this -
SELECT
m.id,
m.fullName,
m.country_id,
m.city_id,
m.town_id,
m.totalRef,
cnt.name countryName,
ct.name cityName,
t.name townName,
m2.newRef
FROM members as m
LEFT JOIN country cnt
ON cnt.id = m.country_id
LEFT JOIN city ct
ON ct.id = m.city_id
LEFT JOIN town t
ON t.id = m.town_id
LEFT JOIN (
SELECT ref_id, COUNT(id) newRef FROM members
WHERE ref_id > 0 AND registrationDate BETWEEN '2011.11.04 00:00:00' AND '2011.11.04 23:59:59'
GROUP BY ref_id
) m2
ON m2.ref_id = m.id
WHERE
m.country_id = '224' AND
m.city_id = '4567' AND
m.town_id = '78964' AND
m.admin = '0' AND
m.ref = '1'
ORDER BY
newRef DESC
LIMIT
0, 25;
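The idea in the query above (aggregate the new sign-ups once per ref_id in a derived table, then LEFT JOIN that result to the referrers) can be tried out on a small scale. The following is only a sketch: it runs against an in-memory SQLite database through Python rather than MySQL, the sample rows are made up, and the table is cut down to the columns the counting needs. The COALESCE (to show 0 instead of NULL) and the half-open date range are my own additions, not part of the answer.

```python
import sqlite3

# Hypothetical miniature of the members table; only the columns the
# counting logic needs are included, and all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (
    id INTEGER PRIMARY KEY,
    ref INTEGER,
    ref_id INTEGER,
    fullName TEXT,
    registrationDate TEXT
);
INSERT INTO members VALUES
    (1, 1, 0, 'Alice', '2011-01-01 10:00:00'),
    (2, 1, 0, 'Bob',   '2011-01-02 10:00:00'),
    (3, 0, 1, 'Carol', '2011-11-04 09:00:00'),
    (4, 0, 1, 'Dave',  '2011-11-04 18:30:00'),
    (5, 0, 2, 'Erin',  '2011-11-05 08:00:00');
""")

rows = conn.execute("""
SELECT m.id, m.fullName, COALESCE(m2.newRef, 0) AS newRef
FROM members AS m
LEFT JOIN (
    -- One pass over members: count sign-ups per referrer in the date range.
    SELECT ref_id, COUNT(id) AS newRef
    FROM members
    WHERE ref_id > 0
      AND registrationDate >= '2011-11-04 00:00:00'
      AND registrationDate <  '2011-11-05 00:00:00'
    GROUP BY ref_id
) AS m2 ON m2.ref_id = m.id
WHERE m.ref = 1
ORDER BY newRef DESC, m.id
""").fetchall()

print(rows)
```

With this sample data, Alice gets a count of 2 and Bob a count of 0, because Erin registered outside the range. The derived table is evaluated once instead of once per outer row, which is the main difference from the correlated subquery in the question.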

Related

mysql: simpler and more efficient query for getting unique products

I would like to optimize my database query but I am not sure how to do this.
I want to get a list of opinions on stores' products, ordered by opinion date (from newest to oldest), but each product should appear only once.
For example, there are 3 users: U1, U2, U3.
There are 2 stores in the city:
S1 (with products P11, P12, P13, P14)
S2 (with products P21, P22, P23, P24)
Users added some opinions (the newest on the top, the oldest on the bottom):
U1: P22
U1: P13
U2: P21
U3: P13
U2: P23
U1: P23
What I want to achieve is:
U1: P22
U1: P13
U2: P21
U2: P23
The query I created is very long and a bit complicated. Could I simplify it somehow?
$sql_query = "
SELECT a.*
, b.name AS 'store_name'
, b.city AS 'store_city'
, c.name AS 'product_name'
FROM `app_products_opinion` AS a
JOIN `app_products_stores` AS b
ON a.store_ID = b.ID
JOIN `app_products` AS c
ON a.product_ID = c.ID
WHERE a.created_on IN
(
SELECT max(created_on) as created_on
FROM app_products_opinion
WHERE show_on_list='1' AND (added_by='".$_SESSION["CMSUserID"]."' OR status = '1')
GROUP by product_ID
ORDER by created_on DESC
)
AND a.show_on_list='1'
AND a.store_ID='".$id_store['ID']."' $addtosql
AND a.photo != ''
AND (a.added_by='".$_SESSION["CMSUserID"]."' OR a.status='1')
ORDER BY a.created_on DESC
";
You could try grouping by product_id and also joining by product_ID and date
(simplified code)
SELECT a.user_id, a.product_ID
from app_products_opinion a
INNER JOIN (
SELECT product_ID, max(created_on) as created_on
FROM app_products_opinion
WHERE show_on_list='1' AND (added_by='".$_SESSION["CMSUserID"]."' OR status = '1')
GROUP by product_ID
ORDER by created_on DESC
) t on a.created_on = t.created_on
AND a.product_ID = t.product_ID
I don't know if you think it's simpler (and I'm ignoring $addtosql), but you could do this...
SELECT a.*
, b.name AS 'store_name'
, b.city AS 'store_city'
, c.name AS 'product_name'
FROM `app_products_opinion` AS a
JOIN `app_products_stores` AS b
ON a.store_ID = b.ID
JOIN `app_products` AS c
ON a.product_ID = c.ID
JOIN
(
SELECT product_id
, max(created_on) created_on
FROM app_products_opinion
WHERE show_on_list = 1
AND (added_by = 'M' OR status = 1)
GROUP
by product_ID
) x
ON a.created_on = x.created_on
AND a.product_id = x.product_id
AND a.show_on_list = 1
AND a.store_ID = 'N'
AND a.photo != ''
AND (a.added_by = 'Z' OR a.status = 1)
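Both answers use the same pattern: find the latest created_on per product in a derived table, then join back to the opinions on product and date. Here is a minimal sketch of that pattern using the example data from the question. It is run in SQLite through Python rather than MySQL, the dates are invented so that the ordering matches the question, and user_name is a hypothetical column standing in for added_by. The store, photo and status filters are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_products_opinion (
    user_name TEXT,      -- hypothetical column standing in for added_by
    product_ID TEXT,
    created_on TEXT      -- invented dates, newest first as in the question
);
INSERT INTO app_products_opinion VALUES
    ('U1', 'P22', '2020-01-06'),
    ('U1', 'P13', '2020-01-05'),
    ('U2', 'P21', '2020-01-04'),
    ('U3', 'P13', '2020-01-03'),
    ('U2', 'P23', '2020-01-02'),
    ('U1', 'P23', '2020-01-01');
""")

latest = conn.execute("""
SELECT a.user_name, a.product_ID
FROM app_products_opinion AS a
JOIN (
    -- Latest opinion date per product.
    SELECT product_ID, MAX(created_on) AS created_on
    FROM app_products_opinion
    GROUP BY product_ID
) AS x
  ON a.product_ID = x.product_ID
 AND a.created_on = x.created_on
ORDER BY a.created_on DESC
""").fetchall()

print(latest)
```

This returns U1: P22, U1: P13, U2: P21, U2: P23, which is the list the question asks for. Joining on both product_ID and created_on matters: matching on the date alone (as the original IN subquery does) could pick up an older opinion of one product that happens to share a timestamp with the newest opinion of another.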

How can I optimize my sql code?

I have the following tables:
contacts
contact_id | contact_slug | contact_first_name | contact_email | contact_date_added | company_id | contact_is_active | contact_subscribed | contact_last_name | contact_company | contact_twitter
contact_campaigns
contact_campaign_id | contact_id | contact_campaign_created | company_id | contact_campaign_sent
bundle_feedback
bundle_feedback_id | bundle_id | contact_id | company_id | bundle_feedback_rating | bundle_feedback_favorite_track_id | bundle_feedback_supporting | campaign_id
bundles
bundle_id | bundle_name | bundle_created | company_id | bundle_is_active
tracks
track_id | company_id | track_title
I wrote this query, but it runs slowly. How can I optimize it to make it faster?
SELECT SQL_CALC_FOUND_ROWS c.contact_id,
c.contact_first_name,
c.contact_last_name,
c.contact_email,
c.contact_date_added,
c.contact_company,
c.contact_twitter,
concat(c.contact_first_name," ", c.contact_last_name) AS fullname,
c.contact_subscribed,
ifnull(icc.sendCampaignsCount, 0) AS sendCampaignsCount,
ifnull(round((ibf.countfeedbacks/sendCampaignsCount * 100),2), 0) AS percentFeedback,
ifnull(ibf.bundle_feedback_supporting, 0) AS feedbackSupporting
FROM contacts AS c
LEFT JOIN
(SELECT c.contact_id,
count(cc.contact_campaign_id) AS sendCampaignsCount
FROM contacts AS c
LEFT JOIN contact_campaigns AS cc ON cc.contact_id = c.contact_id
WHERE c.company_id = '876'
AND c.contact_is_active = '1'
AND cc.contact_campaign_sent = '1'
GROUP BY c.contact_id) AS icc ON icc.contact_id = c.contact_id
LEFT JOIN
(SELECT bf.contact_id,
count(*) AS countfeedbacks,
bf.bundle_feedback_supporting
FROM bundle_feedback bf
JOIN bundles b
JOIN contacts c
LEFT JOIN tracks t ON bf.bundle_feedback_favorite_track_id = t.track_id
WHERE bf.bundle_id = b.bundle_id
AND bf.contact_id = c.contact_id
AND bf.company_id='876'
GROUP BY bf.contact_id) AS ibf ON ibf.contact_id = c.contact_id
WHERE c.company_id = '876'
AND contact_is_active = '1'
ORDER BY percentFeedback DESC LIMIT 0, 25;
I have made 2 improvements:
1) Removed the contacts table, which was being joined unnecessarily a second time, and moved its conditions to the final WHERE clause.
2) Removed SQL_CALC_FOUND_ROWS, as per the question
"Which is fastest? SELECT SQL_CALC_FOUND_ROWS FROM `table`, or SELECT COUNT(*)"
SELECT c.contact_id,
c.contact_first_name,
c.contact_last_name,
c.contact_email,
c.contact_date_added,
c.contact_company,
c.contact_twitter,
concat(c.contact_first_name," ", c.contact_last_name) AS fullname,
c.contact_subscribed,
ifnull(icc.sendCampaignsCount, 0) AS sendCampaignsCount,
ifnull(round((ibf.countfeedbacks/sendCampaignsCount * 100),2), 0) AS percentFeedback,
ifnull(ibf.bundle_feedback_supporting, 0) AS feedbackSupporting
FROM contacts AS c
LEFT JOIN
(SELECT cc.contact_id,
count(cc.contact_campaign_id) AS sendCampaignsCount
FROM contact_campaigns AS cc
WHERE cc.contact_campaign_sent = '1'
GROUP BY cc.contact_id) AS icc ON icc.contact_id = c.contact_id
LEFT JOIN
(SELECT bf.contact_id,
count(*) AS countfeedbacks,
bf.bundle_feedback_supporting
FROM bundle_feedback bf
JOIN bundles b
LEFT JOIN tracks t ON bf.bundle_feedback_favorite_track_id = t.track_id
WHERE bf.bundle_id = b.bundle_id
GROUP BY bf.contact_id) AS ibf ON ibf.contact_id = c.contact_id
WHERE c.company_id = '876' and c.contact_is_active = '1'
First, you are not identifying any indexes you have to optimize the query. That said, I would ensure you have at least the following composite / covering indexes.
table              index
contacts           ( company_id, contact_is_active )
contact_campaigns  ( contact_id, contact_campaign_sent )
bundle_feedback    ( contact_id, bundle_feedback_supporting )
Next, as noted in the other answer, unless you really need to know how many rows qualified, remove the "SQL_CALC_FOUND_ROWS".
In your first left-join (icc), you do a left-join on contact_campaigns (cc), but then put "AND cc.contact_campaign_sent = '1'" in your WHERE clause, which turns that into an INNER JOIN. At the outer query level, contacts with no sent campaigns would then have no matching record and thus NULL for the percentage calculation.
In your second left-join (ibf), you are joining to the tracks table but not using anything from it. You are also joining to the bundles table without using anything from there either. Worse, if there are multiple matching rows in bundles or tracks, the join multiplies rows (a Cartesian result) and can overstate your "CountFeedbacks" value. You also do not need the contacts table, as you are not doing anything else with it, and the feedback table already has the contact ID you are querying for. Since the subquery is grouped only by contact_id, your "bf.bundle_feedback_supporting" just returns the value from an arbitrary row. If you want counts of feedback, just count from that table per contact ID and remove the rest. (Also, the join conditions should be in "ON" clauses instead of the WHERE clause, for consistency.)
Also, for your supporting feedback, the data type and values are unclear, so I assumed a Yes/No flag and used a SUM() to count how many are supporting. So, a given contact may have 100 records of which only 37 are supporting. This gives you 1 record per contact with BOTH values (100 and 37 respectively), rather than a single value taken from the first entry the GROUP BY happens to find for the contact.
I would simplify your query to the following:
SELECT
c.contact_id,
c.contact_first_name,
c.contact_last_name,
c.contact_email,
c.contact_date_added,
c.contact_company,
c.contact_twitter,
concat(c.contact_first_name," ", c.contact_last_name) AS fullname,
c.contact_subscribed,
ifnull(icc.sendCampaignsCount, 0) AS sendCampaignsCount,
ifnull(round((ibf.countfeedbacks / icc.sendCampaignsCount * 100),2), 0) AS percentFeedback,
ifnull(ibf.SupportCount, 0) AS feedbackSupporting
FROM
contacts AS c
LEFT JOIN
( SELECT
c.contact_id,
count(*) AS sendCampaignsCount
FROM
contacts AS c
JOIN contact_campaigns AS cc
ON c.contact_id = cc.contact_id
AND cc.contact_campaign_sent = '1'
WHERE
c.company_id = '876'
AND c.contact_is_active = '1'
GROUP BY
c.contact_id) AS icc
ON c.contact_id = icc.contact_id
LEFT JOIN
( SELECT
bf.contact_id,
count(*) AS countfeedbacks,
SUM( case when bf.bundle_feedback_supporting = 'Y'
then 1 else 0 end ) as SupportCount
FROM
contacts AS c
JOIN bundle_feedback bf
ON c.contact_id = bf.contact_id
WHERE
c.company_id = '876'
AND c.contact_is_active = '1'
GROUP BY
bf.contact_id) AS ibf
ON c.contact_id = ibf.contact_id
WHERE
c.company_id = '876'
AND c.contact_is_active = '1'
ORDER BY
percentFeedback DESC LIMIT 0, 25;
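The SUM(CASE ...) part of the ibf subquery is the piece that returns both the total and the supporting count per contact. A small sketch of just that aggregation follows, run in SQLite through Python rather than MySQL. The 'Y'/'N' values are the same assumption the answer makes about bundle_feedback_supporting, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bundle_feedback (
    bundle_feedback_id INTEGER PRIMARY KEY,
    contact_id INTEGER,
    bundle_feedback_supporting TEXT  -- assumed to be a 'Y'/'N' flag
);
INSERT INTO bundle_feedback (contact_id, bundle_feedback_supporting) VALUES
    (1, 'Y'), (1, 'N'), (1, 'Y'),
    (2, 'N');
""")

per_contact = conn.execute("""
SELECT contact_id,
       COUNT(*) AS countfeedbacks,
       -- Conditional aggregation: add 1 only for supporting rows.
       SUM(CASE WHEN bundle_feedback_supporting = 'Y' THEN 1 ELSE 0 END) AS SupportCount
FROM bundle_feedback
GROUP BY contact_id
ORDER BY contact_id
""").fetchall()

print(per_contact)
```

Contact 1 comes back with 3 feedback rows of which 2 are supporting, and contact 2 with 1 row and 0 supporting. If the real column holds 1/0 instead of 'Y'/'N', only the comparison inside the CASE needs to change.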

MySQL query too slow about 25 seconds

I have a MySQL query that takes about 25 seconds. There are not many rows (only about 200), so I don't understand why it takes such a long time.
Query:
SELECT *
, c.id c_id
FROM campaign c
JOIN campaign_category cc
ON c.campaign_type = cc.id
WHERE c.is_deleted = 0
AND c.status = 1
AND c.id NOT IN (SELECT campaign_id FROM user_reviews WHERE user_id = 4)
AND c.amt_req > (SELECT COUNT(id)
FROM reserved_reviews
WHERE camping_id = c.id
AND user_id != 4)
+ (SELECT COUNT(id)
FROM user_reviews
WHERE campaign_id = c.id)
Edit:
I tried with JOIN like this, but I got no result:
SELECT
*, `c`.`id` as `c_id`,COUNT(`ur`.`id`) as `total_reviewed`, COUNT(`rr`.`id`) as `total_reserved`
FROM
`campaign` `c`
JOIN `campaign_category` `cc` ON `c`.`campaign_type`=`cc`.`id`
JOIN `user_reviews` `ur` ON `ur`.`campaign_id`=`c`.`id`
JOIN `reserved_reviews` `rr` ON `rr`.`camping_id`=`c`.`id`
WHERE
`c`.`is_deleted` =0
AND
`c`.`status` = 1
AND
`ur`.`user_id` != 4
GROUP BY `c`.`id`
HAVING `c`.`amt_req` > COUNT(`ur`.`id`) + COUNT(`rr`.`id`)
Edit: Table structures: first image: user_reviews table, second image: campaign table, third image: reserved_reviews table.
http://imgur.com/GI4817B,SdnSxuz,truxHM6#0
You can improve this query with indexes:
SELECT *, c.id c_id
FROM campaign c JOIN
campaign_category cc
ON c.campaign_type = cc.id
WHERE c.is_deleted = 0 AND
c.status = 1 AND
c.id NOT IN (SELECT campaign_id FROM user_reviews WHERE user_id = 4) AND
c.amt_req > (SELECT COUNT(*)
FROM reserved_reviews
WHERE camping_id = c.id AND user_id <> 4
) +
(SELECT COUNT(id)
FROM user_reviews
WHERE campaign_id = c.id
) ;
For the outer query and joins: campaign(status, is_deleted, id, amt_req) and campaign_category(id) (you should already have the latter if it is defined as the primary key).
Then: user_reviews(user_id, campaign_id), reserved_reviews(camping_id, user_id) (using the column name from your query), and user_reviews(campaign_id).
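As for the JOIN attempt in the question: inner-joining both user_reviews and reserved_reviews drops every campaign that lacks a row in either table, and for the campaigns that remain each review row is paired with each reservation row, so both COUNTs are inflated. The sketch below illustrates this with invented data, in SQLite through Python rather than MySQL. The pre-aggregated LEFT JOIN variant is my own suggestion, not something the answer above proposes, and the tables are reduced to the columns involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaign (id INTEGER PRIMARY KEY, amt_req INTEGER);
CREATE TABLE user_reviews (id INTEGER PRIMARY KEY, campaign_id INTEGER);
CREATE TABLE reserved_reviews (id INTEGER PRIMARY KEY, camping_id INTEGER);
-- Campaign 1 has 2 reviews and 3 reservations; campaign 2 has neither.
INSERT INTO campaign VALUES (1, 10), (2, 10);
INSERT INTO user_reviews (campaign_id) VALUES (1), (1);
INSERT INTO reserved_reviews (camping_id) VALUES (1), (1), (1);
""")

# Inner joins to both child tables: campaign 2 disappears and the
# 2 x 3 row fan-out inflates both counts for campaign 1.
fan_out = conn.execute("""
SELECT c.id, COUNT(ur.id), COUNT(rr.id)
FROM campaign c
JOIN user_reviews ur ON ur.campaign_id = c.id
JOIN reserved_reviews rr ON rr.camping_id = c.id
GROUP BY c.id
""").fetchall()

# Pre-aggregate each child table, then LEFT JOIN the per-campaign totals.
pre_aggregated = conn.execute("""
SELECT c.id,
       COALESCE(ur.n, 0) AS total_reviewed,
       COALESCE(rr.n, 0) AS total_reserved
FROM campaign c
LEFT JOIN (SELECT campaign_id, COUNT(*) AS n
           FROM user_reviews GROUP BY campaign_id) ur ON ur.campaign_id = c.id
LEFT JOIN (SELECT camping_id, COUNT(*) AS n
           FROM reserved_reviews GROUP BY camping_id) rr ON rr.camping_id = c.id
ORDER BY c.id
""").fetchall()

print(fan_out, pre_aggregated)
```

The first query reports 6 and 6 for campaign 1 and omits campaign 2 entirely. The second reports 2 and 3 for campaign 1 and 0 and 0 for campaign 2.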

How to group by columns, and pick arbitrary non null other columns to display

I have a staging table with new address records and need to copy new cities into a mart table. I want to only have one entry in the mart for each city, state, zip combination, and I want to include latitude and longitude for the city. The address table has lat & long, but they could be anywhere within the city, or could be null.
The query I've got so far gets me the right data, but it's pulling one pair of lat & long arbitrarily. I'd prefer to pull from the ones that are not null.
SELECT a.city
,a.STATE
,a.country
,a.latitude
,a.longitude
FROM (
SELECT city
,STATE
,country
FROM staging2.address_daily s
WHERE NOT EXISTS (
SELECT *
FROM mart.city m
WHERE m.city_name = s.city
AND m.state_code = s.STATE
AND m.country_code = s.country
)
GROUP BY city
,STATE
,country
) sq -- This subquery groups by city, state and country
JOIN staging2.address_daily a
ON a.ID = (
SELECT ID
FROM staging2.address_daily i
WHERE i.city = sq.city
AND i.STATE = sq.STATE
AND i.country = sq.country LIMIT 1
) -- This subquery takes the group, and picks one ID.
-- The overall query is still flawed, as we're picking at random, and we should ideally pick a non-null latitude and longitude if they exist.
I'm using MySQL but would prefer to avoid things that are unique to MySQL.
From a SQL perspective this logic would work. For MySQL you might need to do some tweaks to the syntax -
SELECT sq.city
,sq.STATE
,sq.country
,a.latitude
,a.longitude
FROM (
SELECT city
,STATE
,country
FROM staging2.address_daily s
WHERE NOT EXISTS (
SELECT *
FROM mart.city m
WHERE m.city_name = s.city
AND m.state_code = s.STATE
AND m.country_code = s.country
)
GROUP BY city
,STATE
,country
) sq -- This subquery groups by city, state and country
LEFT JOIN staging2.address_daily a
ON a.ID = (
SELECT ID
FROM staging2.address_daily i
WHERE i.city = sq.city
AND i.STATE = sq.STATE
AND i.country = sq.country
AND NOT latitude IS NULL LIMIT 1
)
Can you try this to see if it works for you? (It relies on window functions, which MySQL only supports from version 8.0.)
SELECT *
FROM (SELECT s.city,
s.STATE,
s.country,
s.latitude,
s.longitude,
ROW_NUMBER()
OVER (PARTITION BY s.city, s.STATE, s.country
ORDER BY CASE WHEN s.latitude IS NULL OR s.longitude IS NULL THEN 1 ELSE 0 END)
AS rn
FROM staging2.address_daily s
WHERE NOT EXISTS
(SELECT *
FROM mart.city M
WHERE M.city_name = s.city
AND M.state_code = s.STATE
AND M.country_code = s.country)) ranked
WHERE rn = 1
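A sketch of the "number the rows per city and prefer the non-null coordinates" idea, run in SQLite through Python rather than MySQL. The table is a simplified, hypothetical stand-in for staging2.address_daily, the rows are invented, and the NOT EXISTS check against mart.city is omitted. The CASE expression in the ORDER BY is used because the default position of NULLs in a sort differs between databases, and the ID tie-breaker is my own addition to make the pick deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address_daily (
    ID INTEGER PRIMARY KEY,
    city TEXT, state TEXT, country TEXT,
    latitude REAL, longitude REAL
);
INSERT INTO address_daily (city, state, country, latitude, longitude) VALUES
    ('Austin', 'TX', 'US', NULL, NULL),
    ('Austin', 'TX', 'US', 30.27, -97.74),
    ('Reno',   'NV', 'US', NULL, NULL);
""")

cities = conn.execute("""
SELECT city, state, country, latitude, longitude
FROM (
    SELECT city, state, country, latitude, longitude,
           ROW_NUMBER() OVER (
               PARTITION BY city, state, country
               -- Rows with complete coordinates sort first (0 before 1).
               ORDER BY CASE WHEN latitude IS NULL OR longitude IS NULL
                             THEN 1 ELSE 0 END,
                        ID
           ) AS rn
    FROM address_daily
) AS ranked
WHERE rn = 1
ORDER BY city
""").fetchall()

print(cities)
```

Austin comes back with the populated coordinates even though its first row has NULLs, and Reno still appears, with NULL coordinates, because no better row exists for it.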

MySQL MAX associated with a second column

I am trying to get the highest and lowest value associated with an account for a 1 year timeframe for a country. This data is pulled from one table.
I will have the highest account return for account one and lowest account return for account two for a country. So 1 result per country.
I've got the following, but it doesn't work properly: it gives me the highest and lowest values from the wrong accounts, since it should only consider accounts that have a 1 year timeframe.
I also forgot to add that I would like to order the overall result by dpromo_one and restrict it to selected countries only, e.g. country in ('united states','united kingdom','south africa','india','australia'). It has just gotten so complex that it went way over my head.
SELECT DISTINCT acc2.account_name AS account_one, acc5.account_name AS account_two,
MAX( acc2.dpromo_rate ) AS dpromo_one, MIN( acc5.dpromo_rate ) AS dpromo_two,
acc2.deposit_term, acc2.country
FROM accounts acc2
INNER JOIN accounts acc5 ON acc2.country = acc5.country
WHERE acc2.type =2
AND acc5.type =2
AND acc2.deposit_term = '1 Year'
GROUP BY country
The overall output could look like the following (one line per country):
Country    Bank Highest    Bank Lowest
USA        BOFA 1yr 1%     Wells Fargo 1yr 0.5%
UK         HSBC 1yr 0.5%   Halifax 1yr 0.25%
Australia  CBA 1yr 0.4%    NAB 1yr 0.1%
For example, the accounts table has the following relevant fields:
account_name
country
dpromo_rate
deposit_term
Note that both accounts and their rates are shown side by side. My code does this, though incorrectly, which also explains why I have aliases for the duplicate field names.
To give you the basics in your output example:-
SELECT DISTINCT z.country, b.account_name, b.dpromo_rate, d.account_name, d.dpromo_rate
FROM accounts z
INNER JOIN (SELECT country, type, MAX(dpromo_rate) AS MaxRate FROM accounts WHERE type = 2 AND deposit_term = '1 Year' GROUP BY country, type) a
ON z.country = a.country AND z.type = a.type
INNER JOIN accounts b
ON a.country = b.country and a.MaxRate = b.dpromo_rate AND a.type = b.type AND b.deposit_term = '1 Year'
INNER JOIN (SELECT country, type, MIN(dpromo_rate) AS MinRate FROM accounts WHERE type = 2 AND deposit_term = '1 Year' GROUP BY country, type) c
ON z.country = c.country AND z.type = c.type
INNER JOIN accounts d
ON c.country = d.country and c.MinRate = d.dpromo_rate AND c.type = d.type AND d.deposit_term = '1 Year'
This is just getting the country, account name with the max rate, the actual max rate, account name with the min rate and the actual min rate.
Not sure where deposit term and provider are wanted in the output, but they would be easy to get from the b or d alias tables.
Note that this will make a mess should you have multiple accounts in a country which all share the max or min rate.
To limit it to a few countries and order by the max rate:-
SELECT DISTINCT z.country, b.account_name, b.dpromo_rate, d.account_name, d.dpromo_rate
FROM accounts z
INNER JOIN (SELECT country, type, MAX(dpromo_rate) AS MaxRate FROM accounts WHERE type = 2 AND deposit_term = '1 Year' GROUP BY country, type) a
ON z.country = a.country AND z.type = a.type
INNER JOIN accounts b
ON a.country = b.country and a.MaxRate = b.dpromo_rate AND a.type = b.type AND b.deposit_term = '1 Year'
INNER JOIN (SELECT country, type, MIN(dpromo_rate) AS MinRate FROM accounts WHERE type = 2 AND deposit_term = '1 Year' GROUP BY country, type) c
ON z.country = c.country AND z.type = c.type
INNER JOIN accounts d
ON c.country = d.country and c.MinRate = d.dpromo_rate AND c.type = d.type AND d.deposit_term = '1 Year'
WHERE z.country IN ('united states', 'united kingdom', 'south africa', 'india', 'australia')
ORDER BY b.dpromo_rate
To limit it to one per country then you can do this:-
SELECT z.country, b.account_name, b.dpromo_rate, d.account_name, d.dpromo_rate
FROM accounts z
INNER JOIN (SELECT country, type, MAX(dpromo_rate) AS MaxRate FROM accounts WHERE type = 2 AND deposit_term = '1 Year' GROUP BY country, type) a
ON z.country = a.country AND z.type = a.type
INNER JOIN accounts b
ON a.country = b.country and a.MaxRate = b.dpromo_rate AND a.type = b.type AND b.deposit_term = '1 Year'
INNER JOIN (SELECT country, type, MIN(dpromo_rate) AS MinRate FROM accounts WHERE type = 2 AND deposit_term = '1 Year' GROUP BY country, type) c
ON z.country = c.country AND z.type = c.type
INNER JOIN accounts d
ON c.country = d.country and c.MinRate = d.dpromo_rate AND c.type = d.type AND d.deposit_term = '1 Year'
WHERE z.country IN ('united states', 'united kingdom', 'south africa', 'india', 'australia')
GROUP BY z.country
ORDER BY b.dpromo_rate
Note that if 2 accounts in a country have the same rate which is the highest rate for that country then only one will be returned. Which one that is returned is not certain.
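The core of this approach (compute MAX and MIN per country in a derived table, then join back to accounts to pick up the matching account names) can be sketched in a slightly condensed form. This is an illustration only: it runs in SQLite through Python rather than MySQL, the rows are invented (loosely following the example output in the question), and it computes both aggregates in a single derived table instead of two. The join-back repeats the type and deposit_term filters so that an account with a different term but the same rate cannot match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    account_name TEXT, country TEXT, type INTEGER,
    dpromo_rate REAL, deposit_term TEXT
);
-- Invented sample rows; 'Chase' has a higher rate but the wrong term.
INSERT INTO accounts VALUES
    ('BOFA',        'united states',  2, 1.00, '1 Year'),
    ('Wells Fargo', 'united states',  2, 0.50, '1 Year'),
    ('Chase',       'united states',  2, 3.00, '5 Year'),
    ('HSBC',        'united kingdom', 2, 0.50, '1 Year'),
    ('Halifax',     'united kingdom', 2, 0.25, '1 Year');
""")

rows = conn.execute("""
SELECT r.country, hi.account_name, r.max_rate, lo.account_name, r.min_rate
FROM (
    -- Highest and lowest 1 Year rate per country.
    SELECT country, MAX(dpromo_rate) AS max_rate, MIN(dpromo_rate) AS min_rate
    FROM accounts
    WHERE type = 2 AND deposit_term = '1 Year'
    GROUP BY country
) AS r
JOIN accounts AS hi
  ON hi.country = r.country AND hi.dpromo_rate = r.max_rate
 AND hi.type = 2 AND hi.deposit_term = '1 Year'
JOIN accounts AS lo
  ON lo.country = r.country AND lo.dpromo_rate = r.min_rate
 AND lo.type = 2 AND lo.deposit_term = '1 Year'
ORDER BY r.max_rate DESC
""").fetchall()

print(rows)
```

With this data the result has one row per country: BOFA and Wells Fargo for the United States, HSBC and Halifax for the United Kingdom. As noted above, ties on the highest or lowest rate within a country would still produce extra rows.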
It is doing exactly what MySQL says it will do. Columns not included in the group by clause have arbitrary values. In most other databases, the query would simply fail with a syntax error.
Here is a trick to get the names of the accounts:
SELECT substring_index(group_concat(acc2.account_name order by acc2.dpromo_rate desc), ',', 1) AS account_one,
substring_index(group_concat(acc5.account_name order by acc5.dpromo_rate asc), ',', 1) AS account_two,
MAX( acc2.dpromo_rate ) AS dpromo_one, MIN( acc5.dpromo_rate ) AS dpromo_two,
acc2.deposit_term, acc2.country
FROM accounts acc2
INNER JOIN accounts acc5 ON acc2.country = acc5.country
WHERE acc2.type =2 AND acc5.type =2 AND acc2.deposit_term = '1 Year'
GROUP BY country
I think you can simplify the query to a simple aggregation. I don't see why you are doing a join:
SELECT substring_index(group_concat(acc.account_name order by acc.dpromo_rate desc), ',', 1) AS account_one,
substring_index(group_concat(acc.account_name order by acc.dpromo_rate asc), ',', 1) AS account_two,
MAX(acc.dpromo_rate) AS dpromo_one, MIN(acc.dpromo_rate) AS dpromo_two,
acc.deposit_term, acc.country
FROM accounts acc
WHERE acc.type = 2 and acc.deposit_term = '1 Year'
GROUP BY country;
If you intend for the deposit term to only apply to the max, then replace the max with:
max(case when acc.deposit_term = '1 Year' then acc.dpromo_rate end) as dpromo_one
I think this will give the result:
select max(dpromo_rate), min(dpromo_rate), country, account_name, provider from accounts
where deposit_term = '1 Year' group by country, provider, account_name