I have a problem constructing a MySQL query:
I have this table "tSubscriber" where I store the subscribers for my newsletter mailing list.
The table looks like this (simplified):
--
-- Table structure for tSubscriber
--
CREATE TABLE tSubscriber (
fId INT UNSIGNED NOT NULL AUTO_INCREMENT,
fSubscriberGroupId INT UNSIGNED NOT NULL,
fEmail VARCHAR(255) NOT NULL DEFAULT '',
fDateConfirmed DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
fDateUnsubscribed TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (fId),
INDEX (fSubscriberGroupId)
) ENGINE=MyISAM;
Now what I want to accomplish is to have a diagram showing the subscriptions and unsubscriptions per month per subscriber group.
So I need to extract the year and months from the fDateConfirmed, fDateUnsubscribed dates, count them and show the count sorted by month and year for a subscriber group.
I think this SQL query gets quite complex and I just can't get my head around it. Is this even possible with one query?
You will need two separate queries, one for subscriptions and the other for unsubscriptions.
SELECT COUNT(*), YEAR(fDateConfirmed), MONTH(fDateConfirmed) FROM tSubscriber GROUP BY YEAR(fDateConfirmed), MONTH(fDateConfirmed)
SELECT COUNT(*), YEAR(fDateUnsubscribed), MONTH(fDateUnsubscribed) FROM tSubscriber GROUP BY YEAR(fDateUnsubscribed), MONTH(fDateUnsubscribed)
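If a single result set per subscriber group is preferred, the two counts can be stacked with UNION ALL and then aggregated. The following is only a sketch, not tested against MySQL: it uses Python's sqlite3 module as a stand-in so it can be run anywhere, strftime('%Y-%m', ...) takes the place of YEAR()/MONTH() (in MySQL, DATE_FORMAT(col, '%Y-%m') would be the counterpart), and it assumes NULL rather than '0000-00-00 00:00:00' marks a date that is not set. The function name and the sample rows are made up for illustration.

```python
import sqlite3

def monthly_counts(conn):
    """Return (group_id, year_month, subscribed, unsubscribed) rows."""
    sql = """
        SELECT group_id, ym,
               SUM(kind = 'sub')   AS subscribed,
               SUM(kind = 'unsub') AS unsubscribed
        FROM (
            -- one row per confirmation event
            SELECT fSubscriberGroupId AS group_id,
                   strftime('%Y-%m', fDateConfirmed) AS ym,
                   'sub' AS kind
            FROM tSubscriber
            WHERE fDateConfirmed IS NOT NULL
            UNION ALL
            -- one row per unsubscription event
            SELECT fSubscriberGroupId,
                   strftime('%Y-%m', fDateUnsubscribed),
                   'unsub'
            FROM tSubscriber
            WHERE fDateUnsubscribed IS NOT NULL
        ) AS events
        GROUP BY group_id, ym
        ORDER BY group_id, ym
    """
    return conn.execute(sql).fetchall()

# Simplified stand-in for the tSubscriber table (assumption: NULL = not set).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE tSubscriber (
        fId INTEGER PRIMARY KEY,
        fSubscriberGroupId INTEGER NOT NULL,
        fEmail TEXT NOT NULL,
        fDateConfirmed TEXT,
        fDateUnsubscribed TEXT
    )
""")
conn.executemany(
    "INSERT INTO tSubscriber "
    "(fSubscriberGroupId, fEmail, fDateConfirmed, fDateUnsubscribed) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "a@example.com", "2017-01-05 10:00:00", None),
        (1, "b@example.com", "2017-01-20 11:00:00", "2017-02-03 09:00:00"),
        (2, "c@example.com", "2017-02-10 12:00:00", None),
    ],
)
rows = monthly_counts(conn)
```

Each row then carries both counts for one group and month, which is the shape a chart usually wants.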
I have a table for storing stats. Currently this is populated with about 10 million rows at the end of the day then copied to daily stats table and deleted. For this reason I can't have an auto-incrementing primary key.
This is the table structure:
CREATE TABLE `stats` (
`shop_id` int(11) NOT NULL,
`title` varchar(255) CHARACTER SET latin1 NOT NULL,
`created` datetime NOT NULL,
`mobile` tinyint(1) NOT NULL DEFAULT '0',
`click` tinyint(1) NOT NULL DEFAULT '0',
`conversion` tinyint(1) NOT NULL DEFAULT '0',
`ip` varchar(20) CHARACTER SET latin1 NOT NULL,
KEY `shop_id` (`shop_id`,`created`,`ip`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
I have a key on shop_id, created, ip but I'm not sure what columns I should use to create the optimal index to increase lookup speeds any further?
The query below takes about 12 seconds with no key and about 1.5 seconds using the index above:
SELECT DATE(CONVERT_TZ(`created`, 'UTC', 'Australia/Brisbane')) AS `date`, COUNT(*) AS `views`
FROM `stats`
WHERE `created` <= '2017-07-18 09:59:59'
AND `shop_id` = '17515021'
AND `click` != 1
AND `conversion` != 1
GROUP BY DATE(CONVERT_TZ(`created`, 'UTC', 'Australia/Brisbane'))
ORDER BY DATE(CONVERT_TZ(`created`, 'UTC', 'Australia/Brisbane'));
If there is no column (or combination of columns) that is guaranteed unique, then do have an AUTO_INCREMENT id. Don't worry about truncating/deleting. (However, if the id does not reset, you probably need to use BIGINT, not INT UNSIGNED to avoid overflow.)
Don't use id alone as the primary key; instead use PRIMARY KEY(shop_id, created, id), INDEX(id).
That unconventional PK will help with performance in 2 ways, while being unique (due to the addition of id). The INDEX(id) is to keep AUTO_INCREMENT happy. (Whether you DELETE hourly or daily is a separate issue.)
Build a Summary table based on each hour (or minute). It will contain the count for such -- 400K/hour or 7K/minute. Augment it each hour (or minute) so that you don't have to do all the work at the end of the day.
The summary table can also filter on click and/or conversion. Or it could keep both, if you need them.
If click/conversion have only two states (0 & 1), don't say != 1, say = 0; the optimizer is much better at = than at !=.
If they are two-state and you change to =, then this becomes viable and much better: INDEX(shop_id, click, conversion, created) -- created must be last.
Don't bother with TZ when summarizing into the Summary table; apply the conversion later.
Better yet, don't use DATETIME, use TIMESTAMP so that you won't need to convert (assuming you have TZ set correctly).
After all that, if you still have issues, start over on the Question; there may be further tweaks.
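The summary-table idea can be sketched roughly as follows. This is a minimal illustration rather than a tested MySQL setup: it runs on SQLite through Python's sqlite3, the table name stats_hourly and both function names are invented here, datetimes are stored as UTC text, and the time zone conversion is left out (to be applied when reading the summary, as suggested above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stats (
    shop_id    INTEGER NOT NULL,
    created    TEXT    NOT NULL,            -- UTC 'YYYY-MM-DD HH:MM:SS'
    click      INTEGER NOT NULL DEFAULT 0,
    conversion INTEGER NOT NULL DEFAULT 0
);
-- Hypothetical summary table: one row per shop, hour and flag combination.
CREATE TABLE stats_hourly (
    shop_id    INTEGER NOT NULL,
    hour       TEXT    NOT NULL,
    click      INTEGER NOT NULL,
    conversion INTEGER NOT NULL,
    views      INTEGER NOT NULL,
    PRIMARY KEY (shop_id, hour, click, conversion)
);
""")

def summarize_hour(conn, hour_start, hour_end):
    """Roll up raw rows with hour_start <= created < hour_end."""
    conn.execute(
        """
        INSERT INTO stats_hourly (shop_id, hour, click, conversion, views)
        SELECT shop_id, ?, click, conversion, COUNT(*)
        FROM stats
        WHERE created >= ? AND created < ?
        GROUP BY shop_id, click, conversion
        """,
        (hour_start, hour_start, hour_end),
    )

def daily_views(conn, shop_id):
    """Views per UTC day for one shop, read from the small summary table."""
    return conn.execute(
        """
        SELECT substr(hour, 1, 10) AS day, SUM(views)
        FROM stats_hourly
        WHERE shop_id = ? AND click = 0 AND conversion = 0
        GROUP BY substr(hour, 1, 10)
        ORDER BY day
        """,
        (shop_id,),
    ).fetchall()

conn.executemany(
    "INSERT INTO stats (shop_id, created, click, conversion) VALUES (?, ?, ?, ?)",
    [
        (17515021, "2017-07-17 08:15:00", 0, 0),
        (17515021, "2017-07-17 08:45:00", 0, 0),
        (17515021, "2017-07-17 08:50:00", 1, 0),
        (17515021, "2017-07-17 09:05:00", 0, 0),
        (99,       "2017-07-17 08:20:00", 0, 0),
    ],
)
summarize_hour(conn, "2017-07-17 08:00:00", "2017-07-17 09:00:00")
summarize_hour(conn, "2017-07-17 09:00:00", "2017-07-17 10:00:00")
views = daily_views(conn, 17515021)
```

The daily report then touches at most 24 summary rows per shop and flag combination instead of scanning the raw rows.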
In your WHERE clause, put first the column that returns the smallest set of results, then the next most selective one, and so on, and create the index in the same order.
You have
WHERE created <= '2017-07-18 09:59:59'
AND shop_id = '17515021'
AND click != 1
AND conversion != 1
If created returns a smaller result set than the other 3 columns, then you are good; otherwise put the most selective column first in your WHERE clause, pick the second column by the same reasoning, and create the index in the order of your WHERE clause.
If you think the order is fine, then create an index:
KEY created_shopid_click_conversion (created, shop_id, click, conversion);
I have a sales table in MySQL (InnoDB). It's +- 1 million records big. I would like to show some nice charts. Fetching the right data is not a problem. Fetching it fast is...
So I would like to count the number of sales in table A grouped per day (later on also per month and year) for a period from A to Z. Concretely: for the last 30 days I would like to know for each day how many sales records we have in the DB.
I would like MySQL to return the data like this:
date, count
2017-04-01, 2482
2017-04-02, 1934
2017-04-03, 2701
...
The structure of the sales table is basically like this:
CREATE TABLE `sales` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`deleted_at` timestamp NULL DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `contacts_created_at_index` (`created_at`),
KEY `contacts_deleted_at_index` (`deleted_at`),
KEY `ind_created_at_deleted_at` (`created_at`,`deleted_at`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Some days (datapoints) might not have any results, but I don't like to have gaps in the data. So I also have some 'calendar' table.
CREATE TABLE `time_dimension` (
`id` int(11) NOT NULL,
`db_date` date NOT NULL,
`year` int(11) NOT NULL,
`month` int(11) NOT NULL,
`day` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `td_ymd_idx` (`year`,`month`,`day`),
UNIQUE KEY `td_dbdate_idx` (`db_date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Fetching 30 rows (30 days) with a count per day takes 30 secs...
This is the first query I tried:
SELECT
`db_date` AS `date`,
(SELECT
COUNT(1)
FROM
sales
WHERE
DATE(created_at) = db_date) AS count
FROM
`time_dimension`
WHERE
`db_date` >= '2017-04-11'
AND `db_date` <= '2017-04-25'
ORDER BY `db_date` ASC
But like I said it's really slow (11.9 secs). I tried all kinds of other approaches, but without luck. For example:
SELECT time_dimension.db_date AS DATE,
COUNT(1) AS count
FROM sales RIGHT JOIN time_dimension ON (DATE(sales.created_at) =
time_dimension.db_date)
WHERE
(time_dimension.db_date BETWEEN '2017-03-11' AND '2017-04-11')
GROUP BY
DATE
A query for just 1 datapoint takes only 5.4ms:
SELECT COUNT(1) FROM sales WHERE created_at BETWEEN '2017-04-11 00:00:00' AND '2017-04-25 23:59:59'
I haven't checked innodb_buffer_pool_size on my local machine. I will check that as well. Any ideas on how to make queries like this fast? In the future I would even need to add WHERE clauses and joins to filter the set of sales records.
Thanks.
Nick
You could try to count sale data first, then join count result with your calendar table.
SELECT time_dimension.db_date AS date,
by_date.sale_count
FROM time_dimension
LEFT JOIN (SELECT DATE(sales.created_at) sale_date,
COUNT(1) AS sale_count
FROM sales
WHERE created_at BETWEEN '2017-03-11 00:00:00' AND
'2017-04-11 23:59:59'
GROUP BY DATE(sales.created_at)) by_date
ON time_dimension.db_date = by_date.sale_date
WHERE time_dimension.db_date BETWEEN '2017-03-11' AND '2017-04-11'
The problematic part of your query is the conversion DATE(created_at), which effectively prevents MySQL from using the index on created_at.
Your 1 datapoint query avoids that, and that is why it is fast.
To fix this, check whether created_at falls within the range of a specific day, like this:
created_at BETWEEN db_date AND DATE_ADD(db_date,INTERVAL 1 DAY)
This way MySQL will be able to make use of the index on it (do a range lookup), as appropriate.
WHERE DATE(created_at) = db_date)
-->
WHERE created_at >= db_date
AND created_at < db_date + INTERVAL 1 DAY
This avoids including midnight of the second day (as BETWEEN does).
Works for all flavors: DATE, DATETIME, DATETIME(6).
Does not hide created_at inside a function where the index cannot see it.
For time_dimension, get rid of PRIMARY KEY (id) and change UNIQUE(db_date) to the PK.
After making these changes, your original subquery may be competitive with the LEFT JOIN ( SELECT ... ). (It depends on which version of MySQL.)
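To illustrate the half-open range joined against the calendar table, here is a small sketch. It is not a MySQL benchmark: it runs on SQLite through Python's sqlite3, where date(db_date, '+1 day') stands in for db_date + INTERVAL 1 DAY, dates are stored as text, and the function name and sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (
    id         INTEGER PRIMARY KEY,
    created_at TEXT NOT NULL               -- 'YYYY-MM-DD HH:MM:SS'
);
CREATE INDEX sales_created_at ON sales (created_at);
-- Calendar table with the date itself as the primary key.
CREATE TABLE time_dimension (db_date TEXT PRIMARY KEY);
""")

def sales_per_day(conn, first_day, last_day):
    """One row per calendar day, with 0 for days without sales."""
    return conn.execute(
        """
        SELECT td.db_date, COUNT(s.id)
        FROM time_dimension AS td
        LEFT JOIN sales AS s
               ON s.created_at >= td.db_date
              AND s.created_at <  date(td.db_date, '+1 day')
        WHERE td.db_date BETWEEN ? AND ?
        GROUP BY td.db_date
        ORDER BY td.db_date
        """,
        (first_day, last_day),
    ).fetchall()

conn.executemany(
    "INSERT INTO time_dimension (db_date) VALUES (?)",
    [("2017-04-11",), ("2017-04-12",), ("2017-04-13",)],
)
conn.executemany(
    "INSERT INTO sales (created_at) VALUES (?)",
    [
        ("2017-04-11 08:00:00",),
        ("2017-04-11 23:59:59",),
        ("2017-04-13 00:00:00",),  # midnight belongs to the 13th only
    ],
)
per_day = sales_per_day(conn, "2017-04-11", "2017-04-13")
```

COUNT(s.id) counts only matched sales rows, so a day without sales comes back as 0 instead of disappearing or showing NULL.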
I'm a newbie to MySQL querying and need some assistance with the subqueries.
I am using the ASP.NET charting control, which retrieves data from MySQL. I want to display a drill-down chart and need some help with a MySQL subquery.
Below is my table:
CREATE TABLE IF NOT EXISTS `data` (
`runtime` smallint(6) NOT NULL,
`app` varchar(60) NOT NULL,
`process` varchar(40) NOT NULL,
`username` varchar(51) NOT NULL,
`time` time NOT NULL,
`date` date NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Step 1 :
Showing a pie chart of Top 10 users with highest time between 2 dates.
I get the top 10 users used between 2 dates using the below query:
SELECT username ,SUM(runtime) as Runtime,
process,ROUND(SUM(runtime/201600),2) as 'Total Time',
role ,
date
FROM data
WHERE `date` BETWEEN 'date1' AND 'date2'
GROUP BY process LIMIT 10
Step 2:
When the user clicks on an individual user in the chartArea, I want to display the top 10 apps/processes between specific dates.
I have 1 main table and two tables that hold multiple dynamic pieces of information about the first table.
The first table, called 'items', holds the main information. Then there are two tables (ratings and indexes) that hold values for a dynamic number of audiences ("auditory") and time periods.
What I want:
When I query for those items, I want the result to have additional columns taken from the ratings and indexes tables.
I have the code like this
SELECT items.*, ratings.val AS rating, indexes.val AS idx
FROM items,ratings,indexes
WHERE items.date>=1349902800000 AND items.date <=1349989199000
AND ratings.period_start <= items.date
AND ratings.period_end > items.date
AND ratings.auditory = 'kids'
AND indexes.period_start <= items.date
AND indexes.period_end > items.date
AND indexes.auditory = 'kids'
ORDER BY indexes.added, ratings.added DESC
The tables look something like this
items:
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(200) DEFAULT NULL,
`date` bigint(40) DEFAULT NULL,
PRIMARY KEY (`id`)
ratings:
`id` bigint(50) NOT NULL AUTO_INCREMENT,
`period_start` bigint(50) DEFAULT NULL,
`period_end` bigint(50) DEFAULT NULL,
`val` float DEFAULT NULL,
`auditory` varchar(200) DEFAULT NULL,
`added` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
All dates except the 'added' fields, which are simple TIMESTAMPs, are in BIGINT format: milliseconds from whatever date it is in AS3 when you do Date.getTime();
So, what is the correct way to get this accomplished?
The only thing I'm not seeing is the unique correlation of any individual ITEM to its ratings... I would think the ratings table would need an "ItemID" to link back to items. As it stands now, if you have 100 items within a given time period say 3 months... and just add all the ratings / reviews, but don't associate those ratings to the actual Item, you are stuck. Put the ItemID in and add that to your WHERE condition and you should be good to go.
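A rough sketch of that suggestion, under loud assumptions: the column item_id does not exist in the original tables and is added here purely for illustration, the table definitions are trimmed down, and the query runs on SQLite through Python's sqlite3 rather than MySQL. The function name and sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items   (id INTEGER PRIMARY KEY, name TEXT, date INTEGER);
-- item_id is the hypothetical link back to items.
CREATE TABLE ratings (id INTEGER PRIMARY KEY, item_id INTEGER,
                      period_start INTEGER, period_end INTEGER,
                      val REAL, auditory TEXT);
CREATE TABLE indexes (id INTEGER PRIMARY KEY, item_id INTEGER,
                      period_start INTEGER, period_end INTEGER,
                      val REAL, auditory TEXT);
INSERT INTO items VALUES (1, 'a', 1349902800500), (2, 'b', 1349950000000);
INSERT INTO ratings (item_id, period_start, period_end, val, auditory) VALUES
    (1, 1349900000000, 1350000000000, 4.5, 'kids'),
    (2, 1349900000000, 1350000000000, 3.0, 'kids'),
    (1, 1349900000000, 1350000000000, 2.0, 'adults');
INSERT INTO indexes (item_id, period_start, period_end, val, auditory) VALUES
    (1, 1349900000000, 1350000000000, 10.0, 'kids'),
    (2, 1349900000000, 1350000000000, 20.0, 'kids');
""")

def items_with_values(conn, date_from, date_to, auditory):
    """Each item in the date window with its own rating and index value."""
    return conn.execute(
        """
        SELECT items.id, items.name, ratings.val AS rating, indexes.val AS idx
        FROM items
        JOIN ratings ON ratings.item_id = items.id
                    AND ratings.period_start <= items.date
                    AND ratings.period_end   >  items.date
                    AND ratings.auditory = ?
        JOIN indexes ON indexes.item_id = items.id
                    AND indexes.period_start <= items.date
                    AND indexes.period_end   >  items.date
                    AND indexes.auditory = ?
        WHERE items.date >= ? AND items.date <= ?
        ORDER BY items.id
        """,
        (auditory, auditory, date_from, date_to),
    ).fetchall()

result = items_with_values(conn, 1349902800000, 1349989199000, "kids")
```

Without the item_id conditions, every rating and index whose period covers the date would pair with every item, so the two items here would come back as four rows instead of two.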
First.. here are the two tables I've created (sans irrelevant columns)..
CREATE TABLE users_history1 (
circuit tinyint(1) unsigned NOT NULL default '0',
userh_season smallint(4) unsigned NOT NULL default '0',
userh_userid int(11) unsigned NOT NULL default '0',
userh_rank varchar(2) NOT NULL default 'D',
userh_wins int(11) NOT NULL default '0',
userh_losses int(11) NOT NULL default '0',
userh_points int(11) NOT NULL default '1000',
KEY (circuit, userh_userid),
KEY (userh_season)
) ENGINE=MyISAM;
CREATE TABLE users_ladders1 (
circuit tinyint(1) unsigned NOT NULL default '0',
userl_userid int(11) unsigned NOT NULL default '0',
userl_rank char(2) NOT NULL default 'D',
userl_wins smallint(3) NOT NULL default '0',
userl_losses smallint(3) NOT NULL default '0',
userl_points smallint(4) unsigned NOT NULL default '1000',
PRIMARY KEY (circuit, userl_userid),
KEY (userl_userid)
) ENGINE=MyISAM;
Some background.. these tables hold data for a competitive ladder where players are compared against each other on an ordered standings by points. users_history1 is a table that contains records stored from previous seasons. users_ladders1 contains records from the current season. I'm trying to create a page on my site where players are ranked on the average points of their previous records and current record. Here is the main standings for a 1v1 ladder:
http://vilegaming.com/league.x/standings1/3
I want to select from the two tables an ordered list of players ranked by their average points from their users_ladders1 and users_history1 records. I really have no idea how to select from two tables in one query, but I'll try, as generically as possible, to illustrate it..
Using hyphens throughout the examples since SO renders it weird.
SELECT userh-points
FROM users-history1
GROUP BY userh-userid
ORDER BY (total userh-points for the user)
Needs the GROUP BY since some players may have played in multiple previous seasons.
SELECT userl-points
FROM users-ladders1
ORDER BY userl-points
I want to be able to combine both tables in a query so I can get the data in form of rows ordered by total points, and if possible also divide the total points by the number of unique records for the player so I can get the average.
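You'll want to use a UNION ALL inside a subquery (a plain UNION would silently drop duplicate rows, such as two seasons with identical points for the same player):
SELECT p.id, COUNT(p.id), SUM(p.points)
FROM (SELECT userh_userid AS id, userh_points AS points
FROM users_history1
UNION ALL SELECT userl_userid, userl_points
FROM users_ladders1) AS p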
GROUP BY p.id
The sub query is the important part. It will give you a single table with the results of both the current and history tables combined. You can then select from that table and do COUNT and SUM to get your averages.
My MySQL syntax is quite rusty, so please excuse it. I haven't had a chance to run this, so I'm not even sure if it executes, but it should be enough to get you started.
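For what it's worth, here is a runnable sketch of the combined query with the average and the ordering added. It is an illustration under assumptions, not the exact MySQL statement: it uses SQLite through Python's sqlite3, the tables are cut down to the columns involved, the sample rows are invented, and UNION ALL is used so that equal rows from different seasons are all kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users_history1 (
    userh_season INTEGER, userh_userid INTEGER, userh_points INTEGER
);
CREATE TABLE users_ladders1 (
    userl_userid INTEGER, userl_points INTEGER
);
-- Player 10 scored exactly 1200 in two past seasons.
INSERT INTO users_history1 VALUES (1, 10, 1200), (2, 10, 1200), (1, 20, 900);
INSERT INTO users_ladders1 VALUES (10, 1500), (20, 1100);
""")

def average_standings(conn):
    """(player id, record count, total points, average points), best first."""
    return conn.execute(
        """
        SELECT p.id,
               COUNT(*)      AS records,
               SUM(p.points) AS total,
               AVG(p.points) AS average
        FROM (SELECT userh_userid AS id, userh_points AS points
              FROM users_history1
              UNION ALL
              SELECT userl_userid, userl_points
              FROM users_ladders1) AS p
        GROUP BY p.id
        ORDER BY average DESC
        """
    ).fetchall()

standings = average_standings(conn)
```

With a plain UNION, the two identical 1200-point seasons of player 10 would collapse into one row and the average would come out wrong.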
If you want to combine two tables, selecting particular columns from one table and all columns from the other:
e.g.
Table names = test1, test2
query:
SELECT test1.column1, test1.column2, test2.* FROM test1, test2
If you want to restrict the result by a particular column:
query:
SELECT test1.column1, test1.column2, test2.* FROM test1, test2 WHERE test2.column3 = '(whatever condition you want to pass)'
SELECT col1 FROM test1 WHERE id = '1'
UNION
SELECT col1 FROM test2
This one can also be used to combine two tables; both SELECTs must return the same number of columns.