I have 1 main table and two tables that hold multiple dinamyc information about the first table.
The first table called 'items' holds main information. Then there are two tables (ratings and indexes) that holds information about some values for dinamyc count of auditories and time period.
What i want:
When I query for those items, I want result to have an additional column names from ratings and indexes tables.
I have the code like this
SELECT items.*, ratings.val AS rating, indexes.val AS idx
FROM items,ratings,indexes
WHERE items.date>=1349902800000 AND items.date <=1349989199000
AND ratings.period_start <= items.date
AND ratings.period_end > items.date
AND ratings.auditory = 'kids'
AND indexes.period_start <= items.date
AND indexes.period_end > items.date
AND indexes.auditory = 'kids'
ORDER BY indexes.added, ratings.added DESC
The tables look something like this
items:
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(200) DEFAULT NULL,
`date` bigint(40) DEFAULT NULL
PRIMARY KEY (`id`)
ratings:
`id` bigint(50) NOT NULL AUTO_INCREMENT,
`period_start` bigint(50) DEFAULT NULL,
`period_end` bigint(50) DEFAULT NULL,
`val` float DEFAULT NULL,
`auditory` varchar(200) DEFAULT NULL,
`added` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
All dates except 'added' fields which are simple TIMESTAMPS are BIGINT format - miliseconds from whatever date it is in AS3 when you do Date.getTime();
So - what is the correct way to get this acomplished?
The only thing I'm not seeing is the unique correlation of any individual ITEM to its ratings... I would think the ratings table would need an "ItemID" to link back to items. As it stands now, if you have 100 items within a given time period say 3 months... and just add all the ratings / reviews, but don't associate those ratings to the actual Item, you are stuck. Put the ItemID in and add that to your WHERE condition and you should be good to go.
Related
I have a table that has over 2.5 million rows and I would like to run the following SQL Statment to get the
select count(*)
from workflow
where action_name= 'Workflow'
and release_date >= '2019-12-01 13:24:22'
and release_date <= '2019-12-31 13:24:22'
AND project_name= 'Web'
group
by page_id
, headline
, release_full_name
, release_date
The problem is that it takes over 2.7 seconds to return 0 rows as expected. Is there a way to speed it up more? I have 6 more SQL Statements that are similiar so that will take almost (2.7 seconds * 6) = 17 seconds at least.
Here is my table schema
CREATE TABLE workflow (
id int(11) NOT NULL AUTO_INCREMENT,
action_name varchar(100) NOT NULL,
project_name varchar(30) NOT NULL,
page_id int(11) NOT NULL,
headline varchar(200) NOT NULL,
create_full_name varchar(200) NOT NULL,
create_date datetime NOT NULL,
change_full_name varchar(200) NOT NULL,
change_date datetime NOT NULL,
release_full_name varchar(200) NOT NULL,
release_date datetime NOT NULL,
reject_full_name varchar(200) NOT NULL,
reject_date datetime NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=2948271 DEFAULT CHARSET=latin1
What I'm looking for in this query is to get the count of the pages that were released last month. that have project_name = "web" and action_name = "Workflow"
This is bit bigger for comments
Using Group by with Count function doesn't make any sense. Usually you need to count actual rows in DB not after aggregation. Not sure if this is your actual requirement reason being GROUP BY causes slowness of the query.
Use composite Index on (Web, start_date) as column project seems highest selective.
For other information, Please share the explain plan.
Assuming that you need counts for groups (you had listed), better to include the group fields in select (essentially) like
select page_id, headline, release_full_name, release_date, count(*)
from ...
Adding an index with (page_id, headline) would optimize well.
I have a table for storing stats. Currently this is populated with about 10 million rows at the end of the day then copied to daily stats table and deleted. For this reason I can't have an auto-incrementing primary key.
This is the table structure:
CREATE TABLE `stats` (
`shop_id` int(11) NOT NULL,
`title` varchar(255) CHARACTER SET latin1 NOT NULL,
`created` datetime NOT NULL,
`mobile` tinyint(1) NOT NULL DEFAULT '0',
`click` tinyint(1) NOT NULL DEFAULT '0',
`conversion` tinyint(1) NOT NULL DEFAULT '0',
`ip` varchar(20) CHARACTER SET latin1 NOT NULL,
KEY `shop_id` (`shop_id`,`created`,`ip`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
I have a key on shop_id, created, ip but I'm not sure what columns I should use to create the optimal index to increase lookup speeds any further?
The query below takes about 12 seconds with no key and about 1.5 seconds using the index above:
SELECT DATE(CONVERT_TZ(`created`, 'UTC', 'Australia/Brisbane')) AS `date`, COUNT(*) AS `views`
FROM `stats`
WHERE `created` <= '2017-07-18 09:59:59'
AND `shop_id` = '17515021'
AND `click` != 1
AND `conversion` != 1
GROUP BY DATE(CONVERT_TZ(`created`, 'UTC', 'Australia/Brisbane'))
ORDER BY DATE(CONVERT_TZ(`created`, 'UTC', 'Australia/Brisbane'));
If there is no column (or combination of columns) that is guaranteed unique, then do have an AUTO_INCREMENT id. Don't worry about truncating/deleting. (However, if the id does not reset, you probably need to use BIGINT, not INT UNSIGNED to avoid overflow.)
Don't use id as the primary key, instead, PRIMARY KEY(shop_id, created, id), INDEX(id).
That unconventional PK will help with performance in 2 ways, while being unique (due to the addition of id). The INDEX(id) is to keep AUTO_INCREMENT happy. (Whether you DELETE hourly or daily is a separate issue.)
Build a Summary table based on each hour (or minute). It will contain the count for such -- 400K/hour or 7K/minute. Augment it each hour (or minute) so that you don't have to do all the work at the end of the day.
The summary table can also filter on click and/or conversion. Or it could keep both, if you need them.
If click/conversion have only two states (0 & 1), don't say != 1, say = 0; the optimizer is much better at = than at !=.
If they 2-state and you changed to =, then this becomes viable and much better: INDEX(shop_id, click, conversion, created) -- created must be last.
Don't bother with TZ when summarizing into the Summary table; apply the conversion later.
Better yet, don't use DATETIME, use TIMESTAMP so that you won't need to convert (assuming you have TZ set correctly).
After all that, if you still have issues, start over on the Question; there may be further tweaks.
In your where clause, Use the column first which will return the small set of results and so on and create the index in the same order.
You have
WHERE created <= '2017-07-18 09:59:59'
AND shop_id = '17515021'
AND click != 1
AND conversion != 1
If created will return the small number of set as compare to other 3 columns then you are good otherwise you that column at first position in your where clause then select the second column as per the same explanation and create the index as per you where clause.
If you think order is fine then create an index
KEY created_shopid_click_conversion (created,shop_id, click, conversion);.
I have the following inside my stored procedure that retrieves unique records from player names that have a faction of Neutral:
SELECT
COUNT(DISTINCT Name) into #neutcount
FROM
dim5players
WHERE
Faction ='Neutral';
UPDATE dim5stats
SET Value = #neutcount
WHERE
Function = 'Neutral';
This works find and dandy. The problem is that I have a field called Date as well.
I want to select the lastest date of the records to be listed in the count instead of a random record from the unique "Name" record.
This is a history table, and it records daily changes of the records, where Name can appear several times. I need to count only the latest records that have a faction of neutral with their latest records only. Some people change factions from time to time. I only care about their latest faction.
This is the structure:
CREATE TABLE `dim5players` (
`id` CHAR(64) NOT NULL,
`name` VARCHAR(45) NOT NULL,
`rank_name` VARCHAR(20) NULL DEFAULT NULL,
`level` INT(11) NOT NULL,
`defender_rank_id` INT(11) NULL DEFAULT NULL,
`Faction` VARCHAR(15) NOT NULL,
`Organization` VARCHAR(100) NULL DEFAULT NULL,
`Date` DATE NOT NULL,
`Updated` BIT(1) NOT NULL,
UNIQUE INDEX `id_UNIQUE` (`id`) USING HASH,
INDEX `name_index` (`name`) USING HASH,
INDEX `date_index` (`Date`) USING HASH,
INDEX `updated_index` (`Updated`) USING HASH,
INDEX `faction_index` (`Faction`) USING HASH
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB;
After a discussion with Michael i think i figured out what he needs:
"I want the Last Updated record of each name"
SELECT
name ,
MAX(Date) as last_date
FROM
dim5players
WHERE
Faction ='Neutral'
GROUP BY
name
"I just want to count the latest date on each NAME that still holds the faction of Neutral"
SELECT
COUNT(last_date)
FROM (
SELECT
name ,
MAX(Date) as last_date
FROM
dim5players
WHERE
Faction ='Neutral'
GROUP BY
name
) as tmp
#Michael : Let me know if i understood you requirements correctly
I have a problem constructing a mysql query:
I have this table "tSubscribers" were I store the subscribers for my newsletter mailing list.
The table looks like this (simplified):
--
-- Table structure for tSubscriber
--
CREATE TABLE tSubscriber (
fId INT UNSIGNED NOT NULL AUTO_INCREMENT,
fSubscriberGroupId INT UNSIGNED NOT NULL,
fEmail VARCHAR(255) NOT NULL DEFAULT '',
fDateConfirmed DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
fDateUnsubscribed TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (fId),
INDEX (fSubscriberGroupId),
) ENGINE=MyISAM;
Now what I want to accomplish is to have a diagram showing the subscriptions and unsubscriptions per month per subscriber group.
So I need to extract the year and months from the fDateConfirmed, fDateUnsubscribed dates, count them and show the count sorted by month and year for a subscriber group.
I think this sql query gets quite complex and I just can't get my head around it. Is this even possible with one query.
You will need two separate queries, one for subscriptions and other for unsubscriptions.
SELECT COUNT(*), YEAR(fDateConfirmed), MONTH(fDateConfirmed) FROM tSubscriber GROUP BY YEAR(fDateConfirmed), MONTH(fDateConfirmed)
SELECT COUNT(*), YEAR(fDateUnsubscribed), MONTH(fDateUnsubscribe ) FROM tSubscriber GROUP BY YEAR(fDateUnsubscribed), MONTH(fDateUnsubscribed)
First.. here are the two tables I've created (sans irrelevant columns)..
CREATE TABLE users_history1 (
circuit tinyint(1) unsigned NOT NULL default '0',
userh_season smallint(4) unsigned NOT NULL default '0',
userh_userid int(11) unsigned NOT NULL default '0',
userh_rank varchar(2) NOT NULL default 'D',
userh_wins int(11) NOT NULL default '0',
userh_losses int(11) NOT NULL default '0',
userh_points int(11) NOT NULL default '1000',
KEY (circuit, userh_userid),
KEY (userh_season)
) ENGINE=MyISAM;
CREATE TABLE users_ladders1 (
circuit tinyint(1) unsigned NOT NULL default '0',
userl_userid int(11) unsigned NOT NULL default '0',
userl_rank char(2) NOT NULL default 'D',
userl_wins smallint(3) NOT NULL default '0',
userl_losses smallint(3) NOT NULL default '0',
userl_points smallint(4) unsigned NOT NULL default '1000',
PRIMARY KEY (circuit, userl_userid),
KEY (userl_userid)
) ENGINE=MyISAM;
Some background.. these tables hold data for a competitive ladder where players are compared against each other on an ordered standings by points. users_history1 is a table that contains records stored from previous seasons. users_ladders1 contains records from the current season. I'm trying to create a page on my site where players are ranked on the average points of their previous records and current record. Here is the main standings for a 1v1 ladder:
http://vilegaming.com/league.x/standings1/3
I want to select from the database from the two tables an ordered list players depending on their average points from their users_ladders1 and users_history1 records. I really have no idea how to select from two tables in one query, but I'll try, as generic as possible, to illustrate it..
Using hyphens throughout the examples since SO renders it weird.
SELECT userh-points
FROM users-history1
GROUP BY userh-userid
ORDER BY (total userh-points for the user)
Needs the GROUP BY since some players may have played in multiple previous seasons.
SELECT userl-points
FROM users-ladders1
ORDER BY userl-points
I want to be able to combine both tables in a query so I can get the data in form of rows ordered by total points, and if possible also divide the total points by the number of unique records for the player so I can get the average.
You'll want to use a UNION SELECT:
SELECT p.id, COUNT(p.id), SUM(p.points)
FROM (SELECT userh_userid AS id, userh_points AS points
FROM users_history1
UNION SELECT userl_userid, userl_points
FROM users_ladders1) AS p
GROUP BY p.id
The sub query is the important part. It will give you a single table with the results of both the current and history tables combined. You can then select from that table and do COUNT and SUM to get your averages.
My MySQL syntax is quite rusty, so please excuse it. I haven't had a chance to run this, so I'm not even sure if it executes, but it should be enough to get you started.
If you want to merge to table and you want to select particular column from one table and in another table want to select all.
e.g.
Table name = test1 , test2
query:
SELECT test1.column1,test1.column2, test2.* FROM test1 ,test2
if you want to merge with particular column
query:
SELECT test1.column1,test1.column2, test2.* FROM test1 ,test2 where test2.column3='(what ever condition u want to pass)'
Select col1 from test1 where id = '1'
union
select * from table2
this one can also used for the joining to tables.