MySQL subquery between specific dates

I'm new to MySQL querying and need some assistance with subqueries.
I am using an ASP.NET charting control that retrieves data from MySQL. I want to display a drill-down chart and need some help with a MySQL subquery.
Below is my table:
CREATE TABLE IF NOT EXISTS `data` (
`runtime` smallint(6) NOT NULL,
`app` varchar(60) NOT NULL,
`process` varchar(40) NOT NULL,
`username` varchar(51) NOT NULL,
`time` time NOT NULL,
`date` date NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Step 1:
Show a pie chart of the top 10 users with the highest time between two dates.
I get the top 10 users between the two dates using the query below:
SELECT username, SUM(runtime) AS Runtime,
       process, ROUND(SUM(runtime / 201600), 2) AS 'Total Time',
       role,
       date
FROM data
WHERE `date` BETWEEN 'date1' AND 'date2'
GROUP BY process
LIMIT 10
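If the goal of Step 1 is really the top 10 users by total time, a sketch of an adjusted query is shown below; it groups by username instead of process, orders by total runtime, and keeps the same 201600 conversion factor ('date1' and 'date2' remain placeholders):
SELECT username,
       SUM(runtime) AS Runtime,
       ROUND(SUM(runtime) / 201600, 2) AS 'Total Time'
FROM data
WHERE `date` BETWEEN 'date1' AND 'date2'
GROUP BY username
ORDER BY SUM(runtime) DESC
LIMIT 10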
Step 2:
When the user clicks on an individual user in the chart area, I want to display that user's top 10 apps/processes between the specific dates.
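For the Step 2 drill-down, here is a sketch of a query for the top 10 processes of one selected user between the same dates; 'user1' is a placeholder for the username passed in from the chart click, just as 'date1' and 'date2' are placeholders for the date range:
SELECT process,
       SUM(runtime) AS Runtime,
       ROUND(SUM(runtime) / 201600, 2) AS 'Total Time'
FROM data
WHERE username = 'user1'
  AND `date` BETWEEN 'date1' AND 'date2'
GROUP BY process
ORDER BY SUM(runtime) DESC
LIMIT 10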

Related

How to get average sum for every weekday? (MySQL)

I'm struggling with one task: I need to get the average number of users for every weekday. Unfortunately, I'm stuck at this point.
SELECT dayname(DAY) as week, SUM(VISITORS_NUMBER) as vis
FROM mytable
GROUP BY week
The result of the code above looks like this: [Sum results screenshot]
From here I want the same weekday column, but with average values instead.
What can I do? I've tried subqueries, but I'm still a beginner and can't use them properly.
Edit 1:
AVG() is not working. I'm getting results like this: [AVG() results screenshot]
I checked in Excel; the average for Friday should be 572, not 53.
This is what my dataset looks like: [Data set screenshot]
Edit 2:
CREATE TABLE `mytable` (
`DAY` date NOT NULL,
`BROWSER` varchar(22) COLLATE utf8_unicode_ci DEFAULT NULL,
`PLATFORM` varchar(13) COLLATE utf8_unicode_ci DEFAULT NULL,
`VISITORS_NUMBER` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO mytable (DAY, BROWSER, PLATFORM, VISITORS_NUMBER)
VALUES ('2020-02-01','Android Webkit Browser','Android',9),
('2020-02-01','Safari','iOs',5),
('2020-02-01','Android Webkit Browser','Android',15);
Try this:
SELECT DAYNAME(`day`) as `week`, SUM(visitors_number) / COUNT(DISTINCT `day`) as avg_vis
FROM mytable
GROUP BY `week`
Plain AVG() over the raw rows averages the individual browser/platform rows instead of the daily totals, which is why you get 53 instead of 572. You need 2 levels of aggregation:
SELECT DAYNAME(t.day) AS day, AVG(t.visitors) AS vis
FROM (
SELECT day, SUM(VISITORS_NUMBER) AS visitors
FROM mytable
GROUP BY day
) t
GROUP BY DAYNAME(t.day)

Speed up a MySQL query on a huge dataset

I have a table that has over 2.5 million rows, and I would like to run the following SQL statement to get the count of pages released last month:
select count(*)
from workflow
where action_name = 'Workflow'
  and release_date >= '2019-12-01 13:24:22'
  and release_date <= '2019-12-31 13:24:22'
  and project_name = 'Web'
group by page_id, headline, release_full_name, release_date
The problem is that it takes over 2.7 seconds to return 0 rows, as expected. Is there a way to speed it up? I have 6 more SQL statements that are similar, so that will take almost (2.7 seconds * 6) = 17 seconds at least.
Here is my table schema
CREATE TABLE workflow (
id int(11) NOT NULL AUTO_INCREMENT,
action_name varchar(100) NOT NULL,
project_name varchar(30) NOT NULL,
page_id int(11) NOT NULL,
headline varchar(200) NOT NULL,
create_full_name varchar(200) NOT NULL,
create_date datetime NOT NULL,
change_full_name varchar(200) NOT NULL,
change_date datetime NOT NULL,
release_full_name varchar(200) NOT NULL,
release_date datetime NOT NULL,
reject_full_name varchar(200) NOT NULL,
reject_date datetime NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=2948271 DEFAULT CHARSET=latin1
What I'm looking for in this query is the count of the pages that were released last month and that have project_name = 'Web' and action_name = 'Workflow'.
This is a bit too big for a comment.
Using GROUP BY together with COUNT(*) doesn't make much sense here. Usually you need to count actual rows in the DB, not rows after aggregation. I'm not sure whether this is your actual requirement; the reason I mention it is that the GROUP BY is what makes the query slow.
Use a composite index on (project_name, release_date), since the project_name column seems to be the most selective.
For anything further, please share the EXPLAIN plan.
Assuming that you do need counts for the groups you listed, it is better to include the group fields in the SELECT, essentially like
select page_id, headline, release_full_name, release_date, count(*)
from ...
Adding an index on (page_id, headline) would help as well.
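One way to put these two suggestions together is sketched below; the index name is made up, and the exact column order (the two equality columns first, the range column last) is an assumption about what will filter best on this data:
CREATE INDEX idx_workflow_proj_action_release
    ON workflow (project_name, action_name, release_date);

SELECT page_id, headline, release_full_name, release_date, COUNT(*) AS released_count
FROM workflow
WHERE project_name = 'Web'
  AND action_name = 'Workflow'
  AND release_date >= '2019-12-01 13:24:22'
  AND release_date <= '2019-12-31 13:24:22'
GROUP BY page_id, headline, release_full_name, release_date;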

Improve query speed suggestions

For self-education I am developing an invoicing system for an electricity company. I have multiple time series tables with different intervals. One table represents consumption, two others represent prices, and a third price table still has to be incorporated. Now I am running calculation queries, but the queries are slow. I would like to improve the query speed, especially since these are only the initial calculations and the queries will only become more complicated. Please note that this is the first database I have created and these are my first exercises, so a simplified explanation is preferred. Thanks for any help.
I have indexed DATE, PERIOD_FROM and PERIOD_UNTIL in each table. This sped up the process from 60 seconds to 5 seconds.
The structure of the tables is the following:
CREATE TABLE `apxprice` (
`APX_id` int(11) NOT NULL AUTO_INCREMENT,
`DATE` date DEFAULT NULL,
`PERIOD_FROM` time DEFAULT NULL,
`PERIOD_UNTIL` time DEFAULT NULL,
`PRICE` decimal(10,2) DEFAULT NULL,
PRIMARY KEY (`APX_id`)
) ENGINE=MyISAM AUTO_INCREMENT=28728 DEFAULT CHARSET=latin1
CREATE TABLE `imbalanceprice` (
`imbalanceprice_id` int(11) NOT NULL AUTO_INCREMENT,
`DATE` date DEFAULT NULL,
`PTU` tinyint(3) DEFAULT NULL,
`PERIOD_FROM` time DEFAULT NULL,
`PERIOD_UNTIL` time DEFAULT NULL,
`UPWARD_INCIDENT_RESERVE` tinyint(1) DEFAULT NULL,
`DOWNWARD_INCIDENT_RESERVE` tinyint(1) DEFAULT NULL,
`UPWARD_DISPATCH` decimal(10,2) DEFAULT NULL,
`DOWNWARD_DISPATCH` decimal(10,2) DEFAULT NULL,
`INCENTIVE_COMPONENT` decimal(10,2) DEFAULT NULL,
`TAKE_FROM_SYSTEM` decimal(10,2) DEFAULT NULL,
`FEED_INTO_SYSTEM` decimal(10,2) DEFAULT NULL,
`REGULATION_STATE` tinyint(1) DEFAULT NULL,
`HOUR` int(2) DEFAULT NULL,
PRIMARY KEY (`imbalanceprice_id`),
KEY `DATE` (`DATE`,`PERIOD_FROM`,`PERIOD_UNTIL`)
) ENGINE=MyISAM AUTO_INCREMENT=117427 DEFAULT CHARSET=latin1
CREATE TABLE `powerload` (
`powerload_id` int(11) NOT NULL AUTO_INCREMENT,
`EAN` varchar(18) DEFAULT NULL,
`DATE` date DEFAULT NULL,
`PERIOD_FROM` time DEFAULT NULL,
`PERIOD_UNTIL` time DEFAULT NULL,
`POWERLOAD` int(11) DEFAULT NULL,
PRIMARY KEY (`powerload_id`)
) ENGINE=MyISAM AUTO_INCREMENT=61039 DEFAULT CHARSET=latin1
Now when running this query:
SELECT i.DATE, i.PERIOD_FROM, i.TAKE_FROM_SYSTEM, i.FEED_INTO_SYSTEM,
a.PRICE, p.POWERLOAD, sum(a.PRICE * p.POWERLOAD)
FROM imbalanceprice i, apxprice a, powerload p
WHERE i.DATE = a.DATE
and i.DATE = p.DATE
AND i.PERIOD_FROM >= a.PERIOD_FROM
and i.PERIOD_FROM = p.PERIOD_FROM
AND i.PERIOD_FROM < a.PERIOD_UNTIL
AND i.DATE >= '2018-01-01'
AND i.DATE <= '2018-01-31'
group by i.DATE
I have run the query with EXPLAIN and get the following result (one row per table; select_type is SIMPLE and partitions is NULL for all three):
a (apxprice): possible_keys = NULL, key = NULL, key_len = NULL, ref = NULL, rows = 28727, filtered = 100, Extra: Using where; Using temporary; Using filesort
p (powerload): possible_keys = NULL, key = NULL, key_len = NULL, ref = NULL, rows = 61038, filtered = 10, Extra: Using where; Using join buffer (Block Nested Loop)
i (imbalanceprice): possible_keys = DATE, key = DATE, key_len = 8, ref = timeseries.a.DATE,timeseries.p.PERIOD_FROM, rows = 1, filtered = 100, Extra: NULL
Ideally I would run a more complicated query for a whole year, grouped by month for example, with all price tables incorporated. However, this would be too slow. I have indexed DATE, PERIOD_FROM and PERIOD_UNTIL in each table. The calculation result must not change; in this case it is the quarter-hourly consumption of two meters multiplied by the hourly prices.
"Categorically speaking," the first thing you should look at is indexes.
Your clauses such as WHERE i.DATE = a.DATE ... are categorically known as INNER JOINs, and the SQL engine needs to have the ability to locate the matching rows "instantly." (That is to say, without looking through the entire table!)
FYI: Just like any index in real-life – here I would be talking about "library card catalogs" if we still had such a thing – indexes will assist both "equal to" and "less/greater than" queries. The index takes the computer directly to a particular point in the data, whether that's a "hit" or a "near miss."
Finally, the EXPLAIN verb is very useful: put that word in front of your query, and the SQL engine should "explain to you" exactly how it intends to carry out your query. (The SQL engine looks at the structure of the database to make that decision.) Although the EXPLAIN output is ... (heh) ... "not exactly standardized," it will help you to see if the computer thinks that it needs to do something very time-wasting in order to deliver your answer.
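To make that concrete, here is a sketch under the assumption that the join conditions in the question are the intended ones: add composite indexes covering the join columns on the two tables that show no usable key in EXPLAIN, write the joins explicitly, and select only the grouped column plus the aggregate (the extra non-aggregated columns in the original SELECT are dropped so the GROUP BY is unambiguous). The index names are made up:
ALTER TABLE apxprice ADD INDEX idx_apx_date_period (DATE, PERIOD_FROM, PERIOD_UNTIL);
ALTER TABLE powerload ADD INDEX idx_pl_date_period (DATE, PERIOD_FROM);

SELECT i.DATE, SUM(a.PRICE * p.POWERLOAD) AS cost
FROM imbalanceprice i
JOIN powerload p ON p.DATE = i.DATE AND p.PERIOD_FROM = i.PERIOD_FROM
JOIN apxprice a ON a.DATE = i.DATE
               AND i.PERIOD_FROM >= a.PERIOD_FROM
               AND i.PERIOD_FROM < a.PERIOD_UNTIL
WHERE i.DATE BETWEEN '2018-01-01' AND '2018-01-31'
GROUP BY i.DATE;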

How to design a Cassandra schema for a user actions log?

I have a table like this in MySQL to log user actions:
CREATE TABLE `actions` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`module` VARCHAR(32) NOT NULL,
`controller` VARCHAR(64) NOT NULL,
`action` VARCHAR(64) NOT NULL,
`date` Timestamp NOT NULL,
`userid` BIGINT(20) NOT NULL,
`ip` VARCHAR(32) NOT NULL,
`duration` DOUBLE NOT NULL,
PRIMARY KEY (`id`)
)
COLLATE='utf8mb4_general_ci'
ENGINE=MyISAM
AUTO_INCREMENT=1
I have a MySQL query like this to find the count of a specific action per day:
SELECT COUNT(*) FROM actions
WHERE actions.action = 'join' AND YEAR(date) = 2017 AND MONTH(date) = 6
GROUP BY YEAR(date), MONTH(date), DAY(date)
This takes 50-60 seconds to give me a list of days with the count of the 'join' action, even though there are only 5 million rows and indexes on date and action.
So I want to log actions using Cassandra instead. How can I design the Cassandra schema, and how should I query it so that such a request takes less than 1 second?
CREATE TABLE actions (
id timeuuid,
module varchar,
controller varchar,
action varchar,
date_time timestamp,
userid bigint,
ip varchar,
duration double,
year int,
month int,
dt date,
PRIMARY KEY ((action,year,month),dt,id)
);
Explanation:
With the above table definition,
SELECT COUNT(*) FROM actions WHERE action = 'join' AND year = 2017 AND month = 6 GROUP BY action, year, month, dt;
will hit a single partition.
The dt column holds only the date; you could change it to just the day number with int as the datatype. Since id is a timeuuid, it will be unique.
Note: GROUP BY is supported by Cassandra 3.10 and above.
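For completeness, a sketch of how a row might be written into that table, assuming the application computes year, month and dt from the event timestamp at insert time; all the concrete values below are made up for illustration:
-- values are illustrative only; year, month and dt must be derived in the application
INSERT INTO actions (action, year, month, dt, id, module, controller, date_time, userid, ip, duration)
VALUES ('join', 2017, 6, '2017-06-15', now(), 'user', 'register', '2017-06-15 10:30:00+0000', 12345, '192.168.1.10', 0.42);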

Counting year and month entries from datetime fields

I have a problem constructing a MySQL query.
I have this table "tSubscriber" where I store the subscribers for my newsletter mailing list.
The table looks like this (simplified):
--
-- Table structure for tSubscriber
--
CREATE TABLE tSubscriber (
fId INT UNSIGNED NOT NULL AUTO_INCREMENT,
fSubscriberGroupId INT UNSIGNED NOT NULL,
fEmail VARCHAR(255) NOT NULL DEFAULT '',
fDateConfirmed DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
fDateUnsubscribed TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
PRIMARY KEY (fId),
INDEX (fSubscriberGroupId)
) ENGINE=MyISAM;
Now what I want to accomplish is a diagram showing the subscriptions and unsubscriptions per month per subscriber group.
So I need to extract the year and month from the fDateConfirmed and fDateUnsubscribed dates, count them, and show the counts sorted by year and month for a subscriber group.
I think this SQL query gets quite complex and I just can't get my head around it. Is this even possible with one query?
You will need two separate queries, one for subscriptions and the other for unsubscriptions:
SELECT COUNT(*), YEAR(fDateConfirmed), MONTH(fDateConfirmed) FROM tSubscriber GROUP BY YEAR(fDateConfirmed), MONTH(fDateConfirmed)
SELECT COUNT(*), YEAR(fDateUnsubscribed), MONTH(fDateUnsubscribed) FROM tSubscriber GROUP BY YEAR(fDateUnsubscribed), MONTH(fDateUnsubscribed)
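Since the question also asks for the counts per subscriber group, sorted by year and month, here is a sketch of how the subscription query could be extended (the same change applies to the unsubscription query; the WHERE clause assumes unconfirmed rows keep the zero-date default):
SELECT fSubscriberGroupId,
       YEAR(fDateConfirmed) AS yr,
       MONTH(fDateConfirmed) AS mth,
       COUNT(*) AS subscriptions
FROM tSubscriber
WHERE fDateConfirmed <> '0000-00-00 00:00:00'
GROUP BY fSubscriberGroupId, yr, mth
ORDER BY fSubscriberGroupId, yr, mth;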