Clearly, I am missing the forest for the trees...I am missing something obvious here!
Scenario:
I've a typical table asset_locator with multiple fields:
id, int(11) PRIMARY
logref, int(11)
unitno, int(11)
tunits, int(11)
operator, varchar(24)
lineid, varchar(24)
uniqueid, varchar(64)
timestamp, timestamp
My current challenge is to SELECT records from this table based on a date range. More specifically, a date range using the MAX(timestamp) field.
So...when selecting I need to start with the latest timestamp value and go back 3 days.
EX: I select all records WHERE the lineid = 'xyz' and going back 3 days from the latest timestamp. Below is an actual example (of the dozens) I've been trying to run.
MySQL returns a single row with all NULL values for the following:
SELECT id, logref, unitno, tunits, operator, lineid,
uniqueid, timestamp, MAX( timestamp ) AS maxdate
FROM asset_locator
WHERE 'maxdate' < DATE_ADD('maxdate',INTERVAL -3 DAY)
ORDER BY uniqueid DESC
There MUST be something obvious I am missing. If anyone has any ideas, please share.
Many thanks!
MAX() is an aggregate function, which means that without a GROUP BY your SELECT will always return a single row containing the maximum value. You could add a GROUP BY, but it looks like that's not what you need.
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html#function_max
If you need all the entries between MAX(timestamp) and 3 days before, then you need to do a subselect to obtain the max date, and after that use it in the search condition. Like this:
SELECT id, logref, unitno, tunits, operator, lineid, uniqueid, timestamp
FROM asset_locator
WHERE timestamp >= DATE_ADD( (SELECT MAX(timestamp) FROM asset_locator), INTERVAL -3 DAY)
It will still run efficiently as long as you have an index defined on the timestamp column.
Note: In your example
WHERE 'maxdate' < DATE_ADD('maxdate',INTERVAL -3 DAY)
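Here you are actually comparing the string literal 'maxdate', not the column alias, because of the single quotes. DATE_ADD() cannot parse that string as a date and returns NULL, so the condition is never true and no rows match. Because the SELECT contains MAX(), MySQL still returns one row for the empty result, and that's why you were seeing NULL for all fields.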
Edit: Oops, forgot the "FROM asset_locator" in query. It got lost at some point when writing the answer :)
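For anyone who wants to try the subselect idea without a MySQL server, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite is swapped in for MySQL here, so datetime(..., '-3 days') stands in for DATE_ADD(..., INTERVAL -3 DAY). The table is cut down to three columns and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced version of asset_locator; timestamps are stored as ISO-8601 text.
conn.execute(
    "CREATE TABLE asset_locator (id INTEGER PRIMARY KEY, lineid TEXT, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO asset_locator VALUES (?, ?, ?)",
    [
        (1, "xyz", "2012-03-01 08:00:00"),
        (2, "xyz", "2012-03-07 09:30:00"),  # 30 minutes older than the cutoff
        (3, "abc", "2012-03-09 12:00:00"),  # inside the window, wrong lineid
        (4, "xyz", "2012-03-09 18:00:00"),
        (5, "xyz", "2012-03-10 10:00:00"),  # latest timestamp in the table
    ],
)

# The scalar subquery finds the latest timestamp once; the outer query
# keeps rows no older than three days before that value.
recent_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
        FROM asset_locator
        WHERE lineid = 'xyz'
          AND timestamp >= datetime(
                (SELECT MAX(timestamp) FROM asset_locator), '-3 days')
        ORDER BY id
        """
    )
]
print(recent_ids)
```

With this sample data the cutoff is 2012-03-07 10:00:00, so only rows 4 and 5 come back.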
Related
I have a MySQL database named mydb in which I store daily share prices for
423 companies in a table named data. Table data has the following columns:
`epic`, `date`, `open`, `high`, `low`, `close`, `volume`
epic and date being primary key pairs.
I update the data table each day using a csv file which would normally have 423 rows
of data all having the same date. However, on some days prices may not be available
for all 423 companies and data for a particular epic and date pair will
not be updated. In order to determine the missing pair I have resorted
to comparing a full list of epics against the incomplete list of epics using
two simple SELECT queries with different dates and then using a file comparator, thus
revealing the missing epic(s). This is not a very satisfactory solution and so far
I have not been able to construct a query that would identify any epics that
have not been updated for any particular day.
SELECT `epic`, `date` FROM `data`
WHERE `date` IN ('2019-05-07', '2019-05-08')
ORDER BY `epic`, `date`;
Produces pairs of values:
`epic` `date`
"3IN" "2019-05-07"
"3IN" "2019-05-08"
"888" "2019-05-07"
"888" "2019-05-08"
"AA." "2019-05-07"
"AAL" "2019-05-07"
"AAL" "2019-05-08"
Where in this case AA. has not been updated on 2019-05-08. The problem with this is that it is not easy to spot a value that is not a pair.
Any help with this problem would be greatly appreciated.
You could do a COUNT on epic with a GROUP BY epic for rows in that date range, then select from that result where UpdateCount is less than 2. Forgive me if the syntax on the column names is not correct (I work in SQL Server), but the logic of the query should still work for you.
SELECT x.epic
FROM
(
SELECT COUNT(*) AS UpdateCount, epic
FROM data
WHERE date IN ('2019-05-07', '2019-05-08')
GROUP BY epic
) AS x
WHERE x.UpdateCount < 2
Assuming you only want to check the last date uploaded, the following will return every item not updated on 2019-05-08:
SELECT last_updated.epic, last_updated.`date`
FROM (
SELECT `epic`, MAX(`date`) AS `date` FROM `data`
GROUP BY `epic`
) AS last_updated
WHERE last_updated.`date` <> '2019-05-08'
ORDER BY last_updated.`epic`
;
Or, for any upload date, the following compares against the entire table, so you don't rely on '2019-05-07' having every epic row. I.e. if the epic has ever been in the database, it will show up if it was not updated:
SELECT d.epic, max(d.date)
FROM data as d
WHERE d.epic NOT IN (
SELECT d2.epic
FROM data as d2
WHERE d2.date = '2019-05-08'
)
GROUP BY d.epic
ORDER BY d.epic
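The same "which epics are missing for a given day" check can also be written with NOT EXISTS. Below is a small runnable sketch in Python with sqlite3 (SQLite swapped in for MySQL). The epics and dates are the ones from the question, the close prices are made up, and missing_epics is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (epic TEXT, date TEXT, close REAL, PRIMARY KEY (epic, date))"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        ("3IN", "2019-05-07", 1.0), ("3IN", "2019-05-08", 1.1),
        ("888", "2019-05-07", 2.0), ("888", "2019-05-08", 2.1),
        ("AA.", "2019-05-07", 3.0),  # no row for 2019-05-08
        ("AAL", "2019-05-07", 4.0), ("AAL", "2019-05-08", 4.1),
    ],
)


def missing_epics(conn, day):
    """Return every epic known to the table that has no row for `day`."""
    sql = """
        SELECT DISTINCT d.epic
        FROM data AS d
        WHERE NOT EXISTS (
            SELECT 1
            FROM data AS d2
            WHERE d2.epic = d.epic AND d2.date = ?
        )
        ORDER BY d.epic
    """
    return [row[0] for row in conn.execute(sql, (day,))]


missing = missing_epics(conn, "2019-05-08")
print(missing)
```

Passing the date as a parameter means the same query can be reused after each daily upload.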
I have a table containing thousands of records representing the temperature of a room in a certain moment. Up to now I have been rendering a client side graph of the temperature with JQuery. However, as the amount of records increases, I think it makes no sense to provide so much data to the view, if it is not going to be able to represent them all in a single graph.
I would like to know if there exists a single MySQL query that returns one out of every n records in the table. If so, I think I could get a representative sample of the temperatures measured during a certain lapse of time.
Any ideas? Thanks in advance.
Edit: add table structure.
CREATE TABLE IF NOT EXISTS `temperature` (
`nid` int(10) unsigned NOT NULL COMMENT 'Node identifier',
`temperature` float unsigned NOT NULL COMMENT 'Temperature in Celsius degrees',
`timestamp` int(10) unsigned NOT NULL COMMENT 'Unix timestamp of the temperature record',
PRIMARY KEY (`nid`,`timestamp`)
)
You could do this, where the subquery is your query, and you add a row number to it:
SET @rows=0;
SELECT * FROM (
SELECT @rows:=@rows+1 AS rowNumber, nid, temperature, `timestamp`
FROM temperature
) yourQuery
WHERE MOD(rowNumber, 5)=0;
The MOD picks every 5th row (the 5th, then the 10th, 15th, etc.). The 5 here is your n.
Not really sure what you're asking, but you have multiple options:
You can limit your results to n (n representing the amount of temperatures you want to display)
just a simple query with the limit in the end:
select * from tablename limit 1000
You could use a time/date constraint so you display only the results of the last n days.
Here is an example that uses date functions. The following query selects all rows with a date_col value from within the last 30 days:
mysql> SELECT something FROM tbl_name
-> WHERE DATE_SUB(CURDATE(),INTERVAL 30 DAY) <= date_col;
You could select an average temperature of a certain period, the shorter the period the more results you'll get. You can group by date, yearweek, month etc. to "create the periods"
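As a rough illustration of the averaging option, here is a sketch in Python with sqlite3 (SQLite swapped in for MySQL) that groups the Unix timestamps into fixed-width buckets and averages each one. The one-hour bucket width and the sample readings are assumptions made for the example. Note that in MySQL the / operator does not truncate, so there you would use DIV or FLOOR() for the bucket expression.

```python
import sqlite3

BUCKET_SECONDS = 3600  # hypothetical bucket width: one hour

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temperature ("
    " nid INTEGER NOT NULL,"
    " temperature REAL NOT NULL,"
    " timestamp INTEGER NOT NULL,"
    " PRIMARY KEY (nid, timestamp))"
)
# Made-up sample: one reading every 10 minutes for 3 hours, rising 0.5 per step.
readings = [(1, 20.0 + 0.5 * i, 600 * i) for i in range(18)]
conn.executemany("INSERT INTO temperature VALUES (?, ?, ?)", readings)

# Integer division maps every timestamp to the start of its bucket;
# each bucket then becomes a single averaged data point.
sql = """
    SELECT (timestamp / :w) * :w AS bucket_start,
           AVG(temperature)      AS avg_temp,
           COUNT(*)              AS samples
    FROM temperature
    WHERE nid = :nid
    GROUP BY bucket_start
    ORDER BY bucket_start
"""
buckets = conn.execute(sql, {"w": BUCKET_SECONDS, "nid": 1}).fetchall()
for bucket_start, avg_temp, samples in buckets:
    print(bucket_start, avg_temp, samples)
```

Here 18 readings collapse into 3 points. Widening the bucket reduces the number of points sent to the graph without dropping whole stretches of time, which LIMIT would do.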
Goal: Write the correct SQL to solve the problems below.
Part 1:
Having trouble figuring out the SQL statement on how to get the timestamp that includes the date and the hour where you have the maximum "in_bytes" for each day. See "video_hourly" table DDL code below. If there are two maximum values that have the same value in a given day just pick the first one. This data is being graphed in highcharts so there can only be one data point for each given day. You can fill the table with some sample data.
Part 2:
Another part of this problem is once you have all of the unique maximum "in_bytes" for each day then you need to sum the "in_bytes" and "out_bytes" to get one record.
To convert the UTC time from the database to local time we are using this in the queries:
SELECT time_stamp,CONVERT_TZ(time_stamp, '+00:00', '-07:00' ) as localtime
Here is the DDL SQL for the table:
CREATE TABLE video_hourly (
id bigint(20) NOT NULL AUTO_INCREMENT,
time_stamp datetime NOT NULL,
in_bytes bigint(20) UNSIGNED NOT NULL DEFAULT 0,
out_bytes bigint(20) UNSIGNED NOT NULL DEFAULT 0,
opt_pct decimal(11, 2) NOT NULL DEFAULT 0.00,
PRIMARY KEY (id)
)
ENGINE = INNODB;
Any help or advice on this would greatly be appreciated. Thank you!
See this list of datetime functions that you can use. Specifically, you can use HOUR() to get the hour value.
You can also use DATE() to get the date part of a datetime column. Once you have those, you can group them together. I will try and break it down for you.
This will return the date, hour, and the in_bytes for that hour, by grouping by day and hour.
SELECT DATE(time_stamp) AS date, HOUR(time_stamp) AS hour, SUM(in_bytes) AS totalInBytes
FROM video_hourly
GROUP BY date, hour
ORDER BY date, totalInBytes DESC, hour;
This will also put the max totalInBytes at the top of each date, because within a date it orders by that in descending order.
Also, please see this question for how to get the max value in a group, which in this case means getting the max in_bytes for each date.
Then, you can change your query to this:
SELECT CONCAT(v.date, ' ', v.hour) AS dateAndHour, v.totalInBytes
FROM(SELECT time_stamp AS fullDate, DATE(time_stamp) AS date, HOUR(time_stamp) AS hour, SUM(in_bytes) AS totalInBytes
FROM video_hourly
GROUP BY date, hour
ORDER BY date, hour, totalInBytes DESC
) v
WHERE(
SELECT COUNT(*)
FROM(SELECT DATE(time_stamp) AS date, HOUR(time_stamp) AS hour, SUM(in_bytes) AS totalInBytes
FROM video_hourly
GROUP BY date, hour
ORDER BY date, hour, totalInBytes DESC
) vh
WHERE vh.date = v.date AND vh.totalInBytes >= v.totalInBytes
) <= 1;
I can't try it without any sample data, but here is an SQL Fiddle link, if you want to try it out. I used this to make sure it would not produce any errors.
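Since there was no sample data to test against, here is a small self-contained sketch in Python with sqlite3 (SQLite swapped in for MySQL) that covers both parts of the question with made-up rows. It is a different approach from the query above: a correlated subquery picks one row per day, breaking ties by the earliest time_stamp. The CONVERT_TZ step and the opt_pct column are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE video_hourly (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        time_stamp TEXT    NOT NULL,
        in_bytes   INTEGER NOT NULL DEFAULT 0,
        out_bytes  INTEGER NOT NULL DEFAULT 0
    )
    """
)
conn.executemany(
    "INSERT INTO video_hourly (time_stamp, in_bytes, out_bytes) VALUES (?, ?, ?)",
    [
        ("2013-01-01 09:00:00", 100, 10),
        ("2013-01-01 14:00:00", 500, 50),  # daily max, first occurrence
        ("2013-01-01 20:00:00", 500, 70),  # same max, later hour: ignored
        ("2013-01-02 03:00:00", 300, 30),  # daily max
        ("2013-01-02 11:00:00", 200, 20),
    ],
)

# Part 1: for each row, ask which id is the "winner" of its day
# (highest in_bytes, earliest time_stamp on ties) and keep only winners.
DAILY_PEAK_SQL = """
    SELECT v.time_stamp, v.in_bytes, v.out_bytes
    FROM video_hourly AS v
    WHERE v.id = (
        SELECT v2.id
        FROM video_hourly AS v2
        WHERE date(v2.time_stamp) = date(v.time_stamp)
        ORDER BY v2.in_bytes DESC, v2.time_stamp ASC
        LIMIT 1
    )
    ORDER BY v.time_stamp
"""
peaks = conn.execute(DAILY_PEAK_SQL).fetchall()

# Part 2: sum in_bytes and out_bytes over the daily peak rows.
totals = conn.execute(
    f"SELECT SUM(in_bytes), SUM(out_bytes) FROM ({DAILY_PEAK_SQL}) AS p"
).fetchone()

print(peaks)
print(totals)
```

The tie on 2013-01-01 resolves to the 14:00 row, so there is exactly one data point per day for the chart.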
I have an SQL query that I need help with...
Basically I have two tables I need to work with. One contains customer accounts and the other contains a log of customer service reps' interactions with customers. I want this query to give me the id of any account that has not had a log entry (interaction) in the last 14 days. I also want to filter out a few rep accounts that are irrelevant (using the assignedto field, as you will see). Also, the date format in the log table is a funky non-standard one and I cannot change it, as software I have not written also utilizes this database.
The two tables are cm.dbs (customer accounts) and cm.log (interaction log).
This is the query I came up with, but it takes FOREVER to run. The subquery works perfectly and takes a fraction of a second, but when the main query runs with the subquery it is just impossibly slow. I'm guessing this is because the subquery is being run for every row in the main query (and it doesn't need to be), but I am kind of clueless as to how to fix this. I am not an expert in SQL; I know enough to create basic to intermediate queries, and this is not something I have done before.
Here is the query I created so far:
SELECT id FROM cm.dbs WHERE id NOT IN (SELECT filenumber FROM cm.log
WHERE STR_TO_DATE(logdate, '%m/%d/%Y')
BETWEEN DATE_SUB(NOW(), INTERVAL 14 DAY)
AND NOW()
GROUP BY filenumber)
AND assignedto != 'OLD_ACCTS'
AND assignedto != 'HOUSE_ACCOUNTS'
AND assignedto != 'PAID_ACCOUNTS';
The subquery finds all the accounts that have entries in the log table within the last two weeks. It does this job perfectly. The trick is then to get the main query to find all the accounts that do not have entries.
Note also, that the filenumber field in cm.log corresponds to id in the cm.dbs table.
I may have approached this in a completely silly way and I am not above admitting that. Any input on making this work correctly and efficiently is appreciated. I'd also love the fixes/changes anyone recommends explained. I am not simply wanting a query built for me, I want to learn what I did wrong and how to do it better so next time I can figure this out for myself. I rarely ever ask questions like this, I usually figure things out on my own but this has me stumped.
EDIT: Here is a partial schema for the relevant fields in the tables:
cm.dbs:
id int(10) UN PK AI
title varchar(45)
firstname varchar(200)
middlename varchar(200)
lastname varchar(200)
fullname varchar(200)
address varchar(200)
address2 varchar(200)
city varchar(200)
state varchar(200)
zip varchar(50)
assignedto varchar(200)
...
cm.log:
id int(10) UN PK AI
filenumber varchar(200)
agentname varchar(200)
logtime varchar(200)
logdateandtime varchar(200)
logdate varchar(200)
logmessage mediumtext
Your query looks correct to me except for the change below: since you have multiple assignedto values to exclude, use a NOT IN operator instead of chaining separate != conditions.
SELECT id FROM cm.dbs WHERE id NOT IN (SELECT filenumber FROM cm.log
WHERE STR_TO_DATE(logdate, '%m/%d/%Y')
BETWEEN DATE_SUB(NOW(), INTERVAL 14 DAY)
AND NOW()
GROUP BY filenumber)
AND assignedto NOT IN ('OLD_ACCTS','HOUSE_ACCOUNTS','PAID_ACCOUNTS');
This is the best I can do without a database schema, but should hopefully be pretty close to what you were looking for (or at least point you in the right direction):
SELECT DISTINCT dbs.id
FROM cm.dbs, cm.log
WHERE dbs.id = log.filenumber
AND STR_TO_DATE(log.logdate, '%m/%d/%Y') NOT BETWEEN DATE_SUB(NOW(), INTERVAL 14 DAY) AND NOW()
AND dbs.assignedto NOT IN ('OLD_ACCTS','HOUSE_ACCOUNTS','PAID_ACCOUNTS');
If you get a chance run EXPLAIN on your query and add the output to your question, so we can profile it better (and include the database schema).
I think you are attacking this in the wrong way. Let's break down what you're looking for.
First thing is the filenumber and max logdate:
SELECT filenumber, MAX(STR_TO_DATE(logdate, '%m/%d/%Y'))
FROM cm.log
GROUP BY filenumber
So now we just need to join it to the other table:
SELECT log.filenumber, MAX(STR_TO_DATE(log.logdate, '%m/%d/%Y')), dbs.assignedto
FROM cm.log as log
INNER JOIN cm.dbs as dbs ON log.filenumber = dbs.id
GROUP BY log.filenumber, dbs.assignedto
Now we want to apply some conditions on what we just selected (older than 2 weeks, not in those 3 groups) :
SELECT * FROM (
SELECT log.filenumber, MAX(STR_TO_DATE(log.logdate, '%m/%d/%Y')) as logdate, dbs.assignedto
FROM cm.log as log
INNER JOIN cm.dbs as dbs ON log.filenumber = dbs.id
GROUP BY log.filenumber, dbs.assignedto) t
WHERE logdate < DATE_SUB(NOW(), INTERVAL 14 DAY)
AND assignedto NOT IN ('OLD_ACCTS','HOUSE_ACCOUNTS','PAID_ACCOUNTS')
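One thing an INNER JOIN cannot return is an account that has no log entries at all. A NOT EXISTS anti-join covers that case too. Here is a runnable sketch in Python with sqlite3 (SQLite swapped in for MySQL), using made-up accounts and a fixed "today" so the result is repeatable. SQLite has no STR_TO_DATE, so the mm/dd/yyyy text is rearranged with substr(), which assumes zero-padded dates. In MySQL you would keep STR_TO_DATE(logdate, '%m/%d/%Y') from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dbs (id INTEGER PRIMARY KEY, assignedto TEXT);
    CREATE TABLE log (id INTEGER PRIMARY KEY, filenumber TEXT, logdate TEXT);
    """
)
conn.executemany(
    "INSERT INTO dbs VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "OLD_ACCTS"), (4, "alice")],
)
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?)",
    [
        (1, "1", "06/28/2013"),  # recent contact for account 1
        (2, "1", "05/01/2013"),
        (3, "2", "06/01/2013"),  # account 2: last contact a month ago
        (4, "3", "01/15/2013"),  # account 3 is filtered by assignedto
        # account 4 has no log rows at all
    ],
)

sql = """
    SELECT d.id
    FROM dbs AS d
    WHERE d.assignedto NOT IN ('OLD_ACCTS', 'HOUSE_ACCOUNTS', 'PAID_ACCOUNTS')
      AND NOT EXISTS (
          SELECT 1
          FROM log AS l
          WHERE CAST(l.filenumber AS INTEGER) = d.id
            -- rebuild 'mm/dd/yyyy' as 'yyyy-mm-dd' so text comparison sorts by date
            AND substr(l.logdate, 7, 4) || '-' || substr(l.logdate, 1, 2)
                || '-' || substr(l.logdate, 4, 2) >= date(:today, '-14 days')
      )
    ORDER BY d.id
"""
stale_ids = [row[0] for row in conn.execute(sql, {"today": "2013-07-01"})]
print(stale_ids)
```

Accounts 2 and 4 come back: one with only old entries and one with none. Because the date lives in a varchar, neither database can use an index for the date test, so a real DATE column (or an index on filenumber at least) is likely what makes the biggest difference in speed.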
I need to return two different results in a single query. When I run them independently, the first returns no rows (that's fine) and the second returns some rows (also fine). When I UNION ALL them, I get 1048 - Column "Date" cannot be null.
I need resulting rows of Date, PW, errors which I will feed a graph to show me what's going on in the system at the points in time specified by Date. In both tables, Date is of the format DateTime and must never be NULL.
SELECT `Date`, COUNT(`ID`) AS `PW`, 0 AS `errors`
FROM `systemlogins`
WHERE `Result` = 'PasswordFailure' AND `Date` >= DATE_SUB(NOW(), INTERVAL 1 DAY)
UNION ALL
SELECT `Date`, 0 AS `PW`, COUNT(`ID`) AS `errors`
FROM `systemerrors`
WHERE `Date` >= DATE_SUB(NOW(), INTERVAL 1 DAY)
GROUP BY ( 4 * HOUR( `Date` ) + FLOOR( MINUTE( `Date` )/15)) -- i.e. full quarters of an hour
ORDER BY ( 4 * HOUR( `Date` ) + FLOOR( MINUTE( `Date` )/15))
I have read that MySQL might ignore tables' NOT NULL conditions in UNIONs, causing that error. I have indeed removed the "NOT NULL" restriction on the tables and, tada, it works. Now, those restrictions have been put there for a reason and I would like to keep them while running the aforementioned query - is there any way?
Edit:
Order is the villain - removing it returns a correct result, albeit with one empty row where Date is NULL. For my purposes, I need to order the results by Date somehow.
Why are you selecting the Date column? Since you are using an aggregate function, COUNT, but the first SELECT has no GROUP BY clause, it seems to me that you do not care about the Date column there.
Try adding a GROUP BY clause, or removing the Date column from the select.
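To make the GROUP BY suggestion concrete, here is one possible shape, sketched in Python with sqlite3 (SQLite swapped in for MySQL): each SELECT is grouped by its quarter-hour bucket, the UNION ALL is wrapped in an outer query, and the ordering happens on the outer query. The sample rows are made up, the "last day" filter is left out so the output is repeatable, and merging both counts into one row per bucket is my assumption about what the graph needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE systemlogins (ID INTEGER PRIMARY KEY, Date TEXT NOT NULL, Result TEXT);
    CREATE TABLE systemerrors (ID INTEGER PRIMARY KEY, Date TEXT NOT NULL);
    """
)
conn.executemany(
    "INSERT INTO systemlogins (Date, Result) VALUES (?, ?)",
    [
        ("2014-01-01 10:03:00", "PasswordFailure"),
        ("2014-01-01 10:10:00", "PasswordFailure"),
        ("2014-01-01 10:20:00", "Success"),
        ("2014-01-01 10:47:00", "PasswordFailure"),
    ],
)
conn.executemany(
    "INSERT INTO systemerrors (Date) VALUES (?)",
    [("2014-01-01 10:05:00",), ("2014-01-01 10:31:00",)],
)

# Start of the quarter hour as text, e.g. '2014-01-01 10:45'.
BUCKET = (
    "strftime('%Y-%m-%d %H:', Date) || "
    "printf('%02d', (CAST(strftime('%M', Date) AS INTEGER) / 15) * 15)"
)

sql = f"""
    SELECT bucket, SUM(PW) AS PW, SUM(errors) AS errors
    FROM (
        SELECT {BUCKET} AS bucket, COUNT(*) AS PW, 0 AS errors
        FROM systemlogins
        WHERE Result = 'PasswordFailure'
        GROUP BY bucket
        UNION ALL
        SELECT {BUCKET} AS bucket, 0 AS PW, COUNT(*) AS errors
        FROM systemerrors
        GROUP BY bucket
    ) AS combined
    GROUP BY bucket
    ORDER BY bucket
"""
points = conn.execute(sql).fetchall()
for point in points:
    print(point)
```

Because every branch is grouped, a branch with no matching rows contributes no row at all instead of a single row with a NULL date, so the NOT NULL constraints can stay in place.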