I'm sending syslog-ng output to Percona Server (MySQL). I have different logging sources filtered into different MySQL tables, and I'm trying to determine the number of logs per second, minute, and hour.
This is how the table was created:
CREATE TABLE syslog.switchlogs (
host varchar(40) DEFAULT NULL,
facility varchar(10) DEFAULT NULL,
level varchar(10) DEFAULT NULL,
tag varchar(10) DEFAULT NULL,
program varchar(50) DEFAULT NULL,
msg text,
seq bigint(20) unsigned NOT NULL AUTO_INCREMENT,
timestamp timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (seq),
KEY host (host),
KEY timestamp (timestamp),
KEY host_timestamp (host,timestamp)
) ENGINE=InnoDB;
For most of the tables I get the results I'm expecting:
Last hour:
SELECT count(seq) as thecount FROM syslog.switchlogs WHERE timestamp>=DATE_SUB(NOW(),INTERVAL 1 HOUR);
Last minute:
SELECT count(seq) as thecount FROM syslog.switchlogs WHERE timestamp>=DATE_SUB(NOW(),INTERVAL 1 MINUTE);
Last second:
SELECT count(seq) as thecount FROM syslog.switchlogs WHERE timestamp>=DATE_SUB(NOW(),INTERVAL 1 SECOND);
I get results like this:
Hour: 804
Minute: 16
Second: 1
One of my tables holds VMware logs, and the counts there look odd:
Hour: 30,180
Minute: 24,278
Second: 24,160
That's obviously wrong. If I look at the last 50 logs in the table, only 10 of them fall in the last second and 39 in the last minute. Why is the SQL above not working as expected?
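Could it be that some rows have timestamps ahead of the database clock, for example if the VMware hosts log in UTC while MySQL runs in local time? Any row stamped in the future would satisfy all three WHERE clauses at once and inflate the minute and second counts exactly like this. A quick check (the VMware table name here is only a placeholder):
SELECT COUNT(seq) AS future_rows, MAX(timestamp) AS newest_ts, NOW() AS db_now
FROM syslog.vmwarelogs  -- placeholder name for the VMware log table
WHERE timestamp > NOW();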
Related
I have a table like this in MySQL to log user actions:
CREATE TABLE `actions` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`module` VARCHAR(32) NOT NULL,
`controller` VARCHAR(64) NOT NULL,
`action` VARCHAR(64) NOT NULL,
`date` Timestamp NOT NULL,
`userid` BIGINT(20) NOT NULL,
`ip` VARCHAR(32) NOT NULL,
`duration` DOUBLE NOT NULL,
PRIMARY KEY (`id`)
)
COLLATE='utf8mb4_general_ci'
ENGINE=MyISAM
AUTO_INCREMENT=1
I have a MySQL query like this to find the count of a specific action per day:
SELECT COUNT(*)
FROM actions
WHERE actions.action = "join" AND YEAR(date) = 2017 AND MONTH(date) = 06
GROUP BY YEAR(date), MONTH(date), DAY(date)
This takes 50-60 seconds to return the list of days with the count of the "join" action, even though the table has only 5 million rows and an index on date and action.
So I want to log actions with Cassandra instead. How can I design the Cassandra schema, and how should I query it, so that such a request takes less than 1 second?
CREATE TABLE actions (
id timeuuid,
module varchar,
controller varchar,
action varchar,
date_time timestamp,
userid bigint,
ip varchar,
duration double,
year int,
month int,
dt date,
PRIMARY KEY ((action,year,month),dt,id)
);
Explanation:
With the above table definition, the query
SELECT COUNT(*) FROM actions WHERE action = 'join' AND year = 2017 AND month = 6 GROUP BY action, year, month, dt;
will hit a single partition.
The dt column holds only the date; you could also change it to just the day number with int as the datatype. And since id is a timeuuid, it will be unique.
Note: GROUP BY is supported in Cassandra 3.10 and above.
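To make the denormalisation concrete, here is a minimal CQL sketch of how a row could be written (the literal values are only placeholders); the application copies year, month and dt out of the event timestamp before the insert, so that reads can hit the (action, year, month) partition directly:
INSERT INTO actions (id, module, controller, action, date_time, userid, ip, duration, year, month, dt)
VALUES (now(), 'user', 'auth', 'join', '2017-06-15 10:30:00', 42, '203.0.113.7', 0.12, 2017, 6, '2017-06-15');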
I have a table with login data of the website users, with two DATETIME columns: one is the login time and the other is the logout time.
The table has the following columns:
user_id (the id of the user)
datetime (login start time, DATETIME)
datetime_end (logout end time, DATETIME)
From this data I need to generate reports that calculate the following:
Daily average login duration for users (e.g. 01 March 2015: 21 mins, 03 November 2016: 25 mins, etc.)
The second report is:
Current day (last 24 hours) average login duration per hour (i.e. the same as above, but averaged per hour)
Is there a way I can achieve this with a MySQL (or plain SQL) query?
The table create statement is:
CREATE TABLE IF NOT EXISTS `users_login` (
`login_id` int(11) unsigned NOT NULL,
`user_id` int(11) unsigned NOT NULL DEFAULT '0',
`last_ip` varchar(45) NOT NULL DEFAULT '0.0.0.0',
`server_ip` varchar(225) DEFAULT '0.0.0.0',
`country` varchar(2) NOT NULL DEFAULT '00',
`continent` tinytext,
`datetime` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`datetime_end` datetime NOT NULL DEFAULT '0000-00-00 00:00:00'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Something like this should work for the first one:
SELECT DATE(datetime),AVG(TIMESTAMPDIFF(MINUTE,datetime,datetime_end)) as DailyAvg
FROM users_login
GROUP BY DATE(datetime)
And for the hourly one:
SELECT DATE(datetime),extract(hour from datetime) as HourCol,
AVG(TIMESTAMPDIFF(MINUTE,datetime,datetime_end)) as HourlyAvg
FROM users_login
WHERE DATE(datetime) = date(now())
GROUP BY DATE(datetime),extract(hour from datetime)
Of course these queries come with a few caveats. You didn't explain your entire logic, so I assumed you want to group by the date and the hour of the login.
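If "Current Day" really means a rolling last-24-hours window rather than today's calendar date, a small variation of the same query (a sketch under that assumption) would be:
SELECT DATE(datetime) AS LoginDate, EXTRACT(HOUR FROM datetime) AS HourCol,
AVG(TIMESTAMPDIFF(MINUTE, datetime, datetime_end)) AS HourlyAvg
FROM users_login
WHERE datetime >= NOW() - INTERVAL 24 HOUR
GROUP BY DATE(datetime), EXTRACT(HOUR FROM datetime)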
I have a table called "Ongoing_Portfolio". Below is its structure:
CREATE TABLE `ongoing_portfolio` (
`idOngoing_Portfolio` int(11) NOT NULL AUTO_INCREMENT,
`Updated_Date` date NOT NULL,
`Investment_Value` double NOT NULL,
`Cash_Value` double NOT NULL,
`idPortfolio` int(11) NOT NULL,
PRIMARY KEY (`idOngoing_Portfolio`),
KEY `fk_Ongoing_Portfolio_Portfolio1_idx` (`idPortfolio`),
CONSTRAINT `fk_Ongoing_Portfolio_Portfolio1` FOREIGN KEY (`idPortfolio`) REFERENCES `portfolio` (`idPortfolio`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8
I need to get the first day and the last day of the year of Updated_Date. Below is my attempt:
//Get the first date
SELECT EXTRACT (YEAR FROM `Updated_Date`)
FROM Ongoing_Portfolio WHERE `idPortfolio` = 1
//Get the last date
SELECT EXTRACT (YEAR FROM `Updated_Date`)
FROM Ongoing_Portfolio WHERE `idPortfolio` = 1
I know my attempt is incomplete, and it is also incorrect: I am getting errors!
What I expect is: if Updated_Date is 2014-05-06, the first query should return 2014-01-01 and the second query should return 2014-12-31.
How can I do this in MySQL please?
You can get both the first and the last day with a single query. (MAKEDATE(year, 365) would return December 30th in leap years, so the last day is computed from the first day of the following year instead.)
SELECT MAKEDATE(YEAR(`Updated_Date`), 1) AS first_date,
MAKEDATE(YEAR(`Updated_Date`) + 1, 1) - INTERVAL 1 DAY AS last_date
FROM Ongoing_Portfolio WHERE `idPortfolio` = 1
You can then easily read first_date and last_date from the DB results.
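As a quick sanity check against the example in the question (Updated_Date = 2014-05-06), the same expressions evaluated for that year return the expected pair of dates:
SELECT MAKEDATE(2014, 1) AS first_date, -- 2014-01-01
MAKEDATE(2014 + 1, 1) - INTERVAL 1 DAY AS last_date; -- 2014-12-31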
I have a MySQL DB with a table defined like this:
CREATE TABLE `minute_data` (
`date` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`open` decimal(10,2) DEFAULT NULL,
`high` decimal(10,2) DEFAULT NULL,
`low` decimal(10,2) DEFAULT NULL,
`close` decimal(10,2) DEFAULT NULL,
`volume` decimal(10,2) DEFAULT NULL,
`adj_close` varchar(45) DEFAULT NULL,
`symbol` varchar(10) NOT NULL DEFAULT '',
PRIMARY KEY (`symbol`,`date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
It stores 1-minute data points from the stock market. The primary key is a combination of the symbol and date columns; this way I always have only one data point for each symbol at any given time.
I am wondering why the following query takes so long that I can't even wait for it to finish:
select distinct date from test.minute_data where date >= "2013-01-01"
order by date asc limit 100;
However, I can run SELECT COUNT(*) FROM minute_data; and that finishes very quickly.
I know that it must have something to do with the fact that there are over 374 million rows of data in the table, and my desktop computer is pretty far from a super computer.
Does anyone know something I can try to speed up this query? Or do I need to abandon all hope of using a MySQL table this big?
Thanks a lot!
When you have a composite index on two columns, like your (symbol, date) primary key, searching and grouping by a prefix of the key is fast. But searching by something that doesn't include the first column of the index requires scanning all rows or using some other index.
You can either change your primary key to (date, symbol), if you don't usually need to search for a symbol without a date, or you can add an additional index on date:
alter table minute_data add index (date)
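After adding the index (or changing the primary key), a quick way to confirm the difference is to look at the query plan; the key column of the EXPLAIN output should now name the date index instead of showing a full scan of all 374 million rows:
EXPLAIN SELECT DISTINCT date FROM test.minute_data WHERE date >= "2013-01-01"
ORDER BY date ASC LIMIT 100;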
Since I launched a podcast recently, I wanted to analyse our download data. But some clients seem to send multiple requests, so I want to count only one request per IP and user agent every 15 minutes. The best thing I could come up with is the following query, which counts one request per IP and user agent every hour. Any ideas how to solve this problem in MySQL?
SELECT episode, podcast, DATE_FORMAT(date, '%d.%m.%Y %k') AS blurry_date, useragent, ip
FROM downloaddata
GROUP BY ip, useragent, blurry_date
This is the table I've got
CREATE TABLE `downloaddata` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`date` datetime NOT NULL,
`podcast` varchar(255) DEFAULT NULL,
`episode` int(4) DEFAULT NULL,
`source` varchar(255) DEFAULT NULL,
`useragent` varchar(255) DEFAULT NULL,
`referer` varchar(255) DEFAULT NULL,
`filetype` varchar(15) DEFAULT NULL,
`ip` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=216 DEFAULT CHARSET=utf8;
Personally I'd recommend collecting every request, and then only taking one every 15 minutes with a DISTINCT query, or perhaps counting the number every 15 minutes.
If you are determined to throw data away so it can never be analysed, though:
Quick and simple is to store just the date plus an int column holding the 15-minute period within the day,
hour part of the time * 4 + minute part DIV 15
Date-part functions are what you want to look up. The thing is, each time you want to record a request you'll have to check whether that IP and user agent already have one in the current 15-minute period. Extra work, extra complexity, and less / lower-quality data...
MINUTE(date) DIV 15 will give you the quarter of the hour (0-3). Ensure that this, together with the date and hour, is unique (or ensure that FLOOR(UNIX_TIMESTAMP(date) / (15*60)) is unique).
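Putting that together, a sketch of a query over the existing table that counts at most one request per IP and user agent per 15-minute slot (while keeping all the raw rows) could look like this:
SELECT episode, podcast,
FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(date) / (15 * 60)) * (15 * 60)) AS slot_start,
COUNT(DISTINCT ip, useragent) AS downloads
FROM downloaddata
GROUP BY episode, podcast, slot_start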