MySQLI SELECT with multiple clauses for epoch timestamps selection - mysql

I have epoch timestamps in a "PART_EPOCH" column; the table name is "crud_mysqli".
I would like to select the "PART_ID" value associated with the next FUTURE timestamp (without searching past timestamps).
The following MySQLi query should select the MIN (next) value in the future, i.e. > now.
But it does not return anything.
It returns the expected result if I state the clauses separately;
combining the clauses as below returns no result.
Would you please tell me what is wrong here:
// Find next event PART_ID name :
// SELECT lowest (next) PART_ID value in the future (do not select within past PART_EPOCH values)
$query = "SELECT
PART_ID
FROM crud_mysqli
WHERE (PART_EPOCH = (SELECT MIN(PART_EPOCH) FROM crud_mysqli))
AND (PART_EPOCH > UNIX_TIMESTAMP(NOW()))
";

WHERE (PART_EPOCH = (SELECT MIN(PART_EPOCH) FROM crud_mysqli))
Here you say to take only the entry with the lowest timestamp, which is probably something in the past.
AND (PART_EPOCH > UNIX_TIMESTAMP(NOW()))
And here you say that it should be in the future. The two conditions exclude each other as soon as the table contains any entry with a timestamp in the past, because the overall minimum is then a past value.
So you need to put the second condition into the subquery:
SELECT
PART_ID
FROM crud_mysqli
WHERE PART_EPOCH = (
SELECT MIN(PART_EPOCH)
FROM crud_mysqli
WHERE PART_EPOCH > UNIX_TIMESTAMP(NOW())
)
That means: "take the entry with the lowest timestamp in the future"
However, you can just as well do the following:
SELECT PART_ID
FROM crud_mysqli
WHERE PART_EPOCH > UNIX_TIMESTAMP(NOW())
ORDER BY PART_EPOCH ASC
LIMIT 1
The result would only differ if you have two entries with the same timestamp. In that case the first query would return both of them - the second query only one.
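The difference between the three query shapes can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so it is an illustration under assumptions, not the asker's setup: the sample rows are invented, "now" is passed in as a parameter instead of UNIX_TIMESTAMP(NOW()), and PART_ID is added as a tie-breaker in the ORDER BY only to make the LIMIT 1 result deterministic.

```python
import sqlite3

def next_events(rows, now):
    """Run the original query and the two corrected shapes on sample rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE crud_mysqli (PART_ID TEXT, PART_EPOCH INTEGER)")
    con.executemany("INSERT INTO crud_mysqli VALUES (?, ?)", rows)
    # Original shape: overall MIN (a past value) AND "in the future" never both hold.
    original = con.execute(
        """SELECT PART_ID FROM crud_mysqli
           WHERE PART_EPOCH = (SELECT MIN(PART_EPOCH) FROM crud_mysqli)
             AND PART_EPOCH > ?""", (now,)).fetchall()
    # Future filter moved inside the MIN() subquery: returns every tied row.
    via_subquery = con.execute(
        """SELECT PART_ID FROM crud_mysqli
           WHERE PART_EPOCH = (SELECT MIN(PART_EPOCH) FROM crud_mysqli
                               WHERE PART_EPOCH > ?)
           ORDER BY PART_ID""", (now,)).fetchall()
    # ORDER BY ... LIMIT 1: returns at most one row.
    via_limit = con.execute(
        """SELECT PART_ID FROM crud_mysqli
           WHERE PART_EPOCH > ?
           ORDER BY PART_EPOCH ASC, PART_ID ASC
           LIMIT 1""", (now,)).fetchall()
    con.close()
    return ([r[0] for r in original],
            [r[0] for r in via_subquery],
            [r[0] for r in via_limit])

# Invented sample data: one past entry, two tied "next" entries, one later entry.
rows = [("past", 100), ("next_a", 300), ("next_b", 300), ("later", 900)]
original, tied, single = next_events(rows, now=200)
```

With these rows the original query yields nothing, the subquery form yields both tied entries, and the LIMIT 1 form yields a single entry.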


MySQL Matching date-based First Instance of value

I have a table containing stock market data (open, hi, lo, close prices) but in a random order of date:
Date Open Hi Lo Close
12/10/2019 313.82 314.54 312.81 313.58
11/22/2019 311.09 311.24 309.85 310.96
11/25/2019 311.98 313.37 311.98 313.37
11/26/2019 313.41 314.28 313.06 314.08
11/27/2019 314.61 315.48 314.37 315.48
11/29/2019 314.86 315.13 314.06 314.31
12/2/2019 314.59 314.66 311.17 311.64
12/3/2019 308.65 309.64 307.13 309.55
I have another value in a PHP variable (say $BaseValue), and a start date and end date ($startdt and $enddt).
1) My requirement is to pick-up the value from the HI column, if it exceeds the $BaseValue on the very FIRST date in a chronological order between the given start and end dates.
For example, if the $BaseValue=314, startdt=11/22, enddt=12/2, then I want to retrieve the Date (11/26/19) as it is the earliest date on which the Hi value (314.28) exceeded the $Basevalue within the given date range. The select statement should return both the Hi value (314.28) and the Date (11/26/19).
2) Additionally, I also need to retrieve the HIGHEST value and date from the HI column during the given date duration. In the above scenario, it should return 315.48 and corresponding date 11/27.
The table is NOT in chronological order - it's randomly filled.
I am unable to get the first query to work at all using the MAX function in its various combinations. It makes me wonder whether this is possible in SQL at all.
While the second is straightforward, I was wondering if it is more efficient and less complex to combine the two queries and get the four values in one single shot.
Any ideas on how I can approach this requirement, please?
Thanks
You could use two subqueries for filtering, one per criterion, like:
select t.*
from mytable t
where
t.date = (
select min(t1.date)
from mytable t1
where t1.date between :datedt and :enddt and t1.hi >= :basevalue
)
or t.hi = (
select max(t1.hi)
from mytable t1
where t1.date between :datedt and :enddt and t1.hi >= :basevalue
)
Another option is to union two queries with order by and limit:
(
select t.*
from mytable t
where t.date between :datedt and :enddt and t.hi >= :basevalue
order by t.date
limit 1
)
union
(
select t.*
from mytable t
where t.date between :datedt and :enddt and t.hi >= :basevalue
order by t.hi desc, t.date
limit 1
)
Please note that the two queries do not do exactly the same thing. If there are ties for the highest hi in the period, the first query will return all ties, while the second will pick the earliest one. It's up to you to decide which solution better fits your use case.
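As a rough check of the order by / limit idea against the sample rows from the question, here is a sketch using Python's sqlite3 module in place of MySQL. The assumptions are mine: dates are stored as ISO text (so they sort chronologically), only the Hi column is loaded, the parameter names :start, :end and :base are invented, and a strict > is used because the question says "exceeds".

```python
import sqlite3

PRICES = [  # (date as ISO text, hi), taken from the question's sample rows
    ("2019-12-10", 314.54), ("2019-11-22", 311.24), ("2019-11-25", 313.37),
    ("2019-11-26", 314.28), ("2019-11-27", 315.48), ("2019-11-29", 315.13),
    ("2019-12-02", 314.66), ("2019-12-03", 309.64),
]

def first_breach_and_peak(base, start, end):
    """Return (first row whose hi exceeds base, row with the highest hi) in the range."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (date TEXT, hi REAL)")
    con.executemany("INSERT INTO mytable VALUES (?, ?)", PRICES)
    params = {"base": base, "start": start, "end": end}
    # Earliest date in the range on which hi exceeds the base value.
    first = con.execute(
        """SELECT date, hi FROM mytable
           WHERE date BETWEEN :start AND :end AND hi > :base
           ORDER BY date
           LIMIT 1""", params).fetchone()
    # Highest hi in the range; the earliest date wins if there is a tie.
    peak = con.execute(
        """SELECT date, hi FROM mytable
           WHERE date BETWEEN :start AND :end
           ORDER BY hi DESC, date
           LIMIT 1""", params).fetchone()
    con.close()
    return first, peak

first, peak = first_breach_and_peak(314, "2019-11-22", "2019-12-02")
```

For the example in the question (base 314, 11/22 to 12/2) this gives 11/26 with 314.28 as the first breach and 11/27 with 315.48 as the peak.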

When selecting records between 2 dates, what impact will I encounter when I use "OR" instead of "AND"?

An example
SELECT * FROM my_table WHERE date>=01012018 OR date<=31012018
Using "AND"
SELECT * FROM my_table WHERE date>=01012018 AND date<=31012018
How will my records be affected?
It's a big impact.
When you use "condition AND condition", the query returns only rows that satisfy both conditions (in your case, only rows where the date is >= 01012018 AND <= 31012018, like 10012018, 20012018, 15012018, etc.).
If you use "condition OR condition", it returns rows where the date is >= 01012018 (like 02012018, 20012018, 15022018, 30122018, etc.) OR the date is <= 31012018 (like 01012000, 15022015, 17052017, etc.), which is basically all rows.
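A quick way to see the effect is to run both forms on a handful of rows. The sketch below uses Python's sqlite3 module rather than MySQL, and it assumes ISO-formatted text dates (invented sample values) instead of the ddmmyyyy numbers in the question, so that string comparison matches chronological order.

```python
import sqlite3

DATES = ["2017-05-17", "2018-01-01", "2018-01-15", "2018-01-31", "2018-02-15"]

def select_dates(operator):
    """Run the range filter with AND or OR between the two conditions."""
    # Only the two keywords being compared are allowed into the SQL text.
    if operator not in ("AND", "OR"):
        raise ValueError(operator)
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE my_table (date TEXT)")
    con.executemany("INSERT INTO my_table VALUES (?)", [(d,) for d in DATES])
    sql = ("SELECT date FROM my_table "
           f"WHERE date >= '2018-01-01' {operator} date <= '2018-01-31' "
           "ORDER BY date")
    result = [r[0] for r in con.execute(sql)]
    con.close()
    return result

with_and = select_dates("AND")  # only the January 2018 rows
with_or = select_dates("OR")    # every row satisfies at least one condition
```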

Generating complex sql tables

I currently have an employee logging sql table that has 3 columns
fromState: String,
toState: String,
timestamp: DateTime
fromState is either In or Out. In means employee came in and Out means employee went out. Each row can only transition from In to Out or Out to In.
I'd like to generate a temporary table in sql to keep track during a given hour (hour by hour), how many employees are there in the company. Aka, resulting table has columns HourBucket, NumEmployees.
In non-SQL code I can do this by initializing the numEmployees as 0 and go through the table row by row (sorted by timestamp) and add (employee came in) or subtract (went out) to numEmployees (bucketed by timestamp hour).
I'm clueless as how to do this in SQL. Any clues?
Use a COUNT ... GROUP BY query. Can't see what you're using toState from your description though! Also, assuming you have an employeeID field.
E.g.
SELECT fromState AS 'Status', COUNT(*) AS 'Number'
FROM StaffinBuildingTable
INNER JOIN (SELECT employeeID AS 'empID', MAX(timestamp) AS 'latest' FROM StaffinBuildingTable GROUP BY employeeID) AS LastEntry ON StaffinBuildingTable.employeeID = LastEntry.empID AND StaffinBuildingTable.timestamp = LastEntry.latest
GROUP BY fromState
The LastEntry subquery will produce a list of employeeIDs limited to the last timestamp for each employee.
The INNER JOIN will limit the main table to just the employeeIDs that match both sides.
The outer GROUP BY produces the count.
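To make the "last entry per employee" idea concrete, here is a small sketch using Python's sqlite3 module instead of MySQL. The rows are invented, employeeID is the assumed column mentioned above, and the join matches on both the employee and the latest timestamp so that only each employee's most recent row is counted.

```python
import sqlite3

ENTRIES = [  # (employeeID, fromState, timestamp); invented sample rows
    (1, "In", "2020-01-01 08:00:00"),
    (1, "Out", "2020-01-01 12:00:00"),
    (2, "In", "2020-01-01 09:00:00"),
    (3, "In", "2020-01-01 09:30:00"),
]

def latest_state_counts():
    """Count employees by the state recorded in their most recent row."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE StaffinBuildingTable "
                "(employeeID INTEGER, fromState TEXT, timestamp TEXT)")
    con.executemany("INSERT INTO StaffinBuildingTable VALUES (?, ?, ?)", ENTRIES)
    rows = con.execute(
        """SELECT s.fromState, COUNT(*)
           FROM StaffinBuildingTable AS s
           INNER JOIN (SELECT employeeID AS empID, MAX(timestamp) AS latest
                       FROM StaffinBuildingTable
                       GROUP BY employeeID) AS LastEntry
             ON s.employeeID = LastEntry.empID
            AND s.timestamp = LastEntry.latest
           GROUP BY s.fromState""").fetchall()
    con.close()
    return dict(rows)

counts = latest_state_counts()
```

Employee 1's latest row is "Out" while employees 2 and 3 end on "In", so the counts come out as two "In" and one "Out".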
SELECT HOUR(SBT.timestamp) AS 'Hour', SBT.fromState AS 'Status', COUNT(*) AS 'Number'
FROM StaffinBuildingTable AS SBT
INNER JOIN (
SELECT SBIJ.employeeID AS 'empID', MAX(timestamp) AS 'latest'
FROM StaffinBuildingTable AS SBIJ
WHERE DATE(SBIJ.timestamp) = CURDATE()
GROUP BY SBIJ.employeeID) AS LastEntry ON SBT.employeeID = LastEntry.empID AND SBT.timestamp = LastEntry.latest
GROUP BY SBT.fromState, HOUR(SBT.timestamp)
Replace CURDATE() with whatever date you are interested in.
Note this is non-optimal as it calculates the HOUR twice - once for the data and once for the group.
Again you are using the INNER JOIN to limit the number of returned rows, this time to the last timestamp on a given day.
To me your description of the FromState and ToState seems the wrong way round; I'd expect to do this based on the ToState. But assuming I'm wrong on that, the following should point you in the right direction:
First, I create a "Numbers" table containing 24 rows one for each hour of the day:
create table tblHours
(Number int);
insert into tblHours values
(0),(1),(2),(3),(4),(5),(6),(7),
(8),(9),(10),(11),(12),(13),(14),(15),
(16),(17),(18),(19),(20),(21),(22),(23);
Then for each date in your employee logging table, I create a row in another new table to contain your counts:
create table tblDailyHours
(
HourBucket datetime,
NumEmployees int
);
insert into tblDailyHours (HourBucket, NumEmployees)
select distinct
date_add(date(t.timeStamp), interval h.Number HOUR) as HourBucket,
0 as NumEmployees
from
tblEmployeeLogging t
CROSS JOIN tblHours h;
Then I update this table to contain all the relevant counts:
update tblDailyHours h
join
(select
h2.HourBucket,
sum(case when el.fromState = 'In' then 1 else -1 end) as cnt
from
tblDailyHours h2
join tblEmployeeLogging el on
h2.HourBucket >= el.timeStamp
group by h2.HourBucket
) cnt ON
h.HourBucket = cnt.HourBucket
set NumEmployees = cnt.cnt;
You can now retrieve the counts with
select *
from tblDailyHours
order by HourBucket;
The counts give the number on site at each of the times displayed; if you want the number during the hour in question, we'd need to tweak this a little.
There is a working version of this code (using not very realistic data in the logging table) here: rextester.com/DYOR23344
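The same running-count idea can be tried locally with a short sketch. It uses Python's sqlite3 module rather than MySQL, invented log rows, and a single SELECT with a LEFT JOIN in place of the insert-then-update steps, so treat it as an approximation of the approach above, not a copy of it. As in the update statement, a row with fromState 'In' counts as +1 and 'Out' as -1.

```python
import sqlite3

LOG = [  # (fromState, timeStamp); invented sample rows
    ("In", "2020-01-01 08:15:00"),
    ("In", "2020-01-01 08:40:00"),
    ("Out", "2020-01-01 09:30:00"),
    ("In", "2020-01-01 10:05:00"),
]

def hourly_headcount(day):
    """Return {HourBucket: head count at that instant} for all 24 hours of day."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tblEmployeeLogging (fromState TEXT, timeStamp TEXT)")
    con.executemany("INSERT INTO tblEmployeeLogging VALUES (?, ?)", LOG)
    con.execute("CREATE TABLE tblDailyHours (HourBucket TEXT)")
    con.executemany("INSERT INTO tblDailyHours VALUES (?)",
                    [(f"{day} {h:02d}:00:00",) for h in range(24)])
    # LEFT JOIN keeps hours with no earlier log rows; those rows contribute 0.
    rows = con.execute(
        """SELECT h.HourBucket,
                  SUM(CASE WHEN el.fromState = 'In' THEN 1
                           WHEN el.fromState = 'Out' THEN -1
                           ELSE 0 END) AS NumEmployees
           FROM tblDailyHours h
           LEFT JOIN tblEmployeeLogging el ON el.timeStamp <= h.HourBucket
           GROUP BY h.HourBucket
           ORDER BY h.HourBucket""").fetchall()
    con.close()
    return dict(rows)

headcount = hourly_headcount("2020-01-01")
```

Each bucket holds the count at the start of that hour, e.g. the 09:00 bucket includes the two arrivals before 09:00 but not the departure at 09:30.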
Original Answer (Based on a single over all count)
If you're happy to search over all rows, and want the current "head count" you can use this:
select
sum(case when t.FromState = 'In' then 1 else -1 end) as Heads
from
MyTable t
But if you know that there will always be no-one there at midnight, you can add a where clause to prevent it looking at more rows than it needs to:
where
date(t.timestamp) = curdate()
Again, on the assumption that the head count reaches zero at midnight, you can generalise that method to get a headcount at any time as follows:
where
date(t.timestamp) = "CENSUS DATE" AND
t.timestamp <= "CENSUS DATETIME"
Obviously you'd need to replace my quoted strings with code which returned the date and datetime of interest. If the headcount doesn't return to zero at midnight, you can achieve the same by removing the first line of the where clause.

SQL comparison within max function

I'm trying to get a list of 20 events grouped by their Ids and sorted by whether they are in progress, pending, or already finished. The problem is that there are events with the same id that include finished, pending, and in progress events and I want to have 20 distinct Ids in the end. What I want to do is group these events together but if one of them is in progress then sort that group by that event. So basically I want to sort by the latest end time that is also before now().
What I have so far is something like this where end and start are end/start times. I'm not sure if what is inside max() is behaving how I should expect.
select * from event_schedule as t1
JOIN (
SELECT DISTINCT(event_id) as e
from event_schedule
GROUP BY event_id
order by MAX(end < unix_timestamp(now())) asc,
MIN(start >= unix_timestamp(now())) asc,
MAX(start) desc
limit 0, 20
)
as t2 on (t1.event_id = t2.e)
This results in some running / pending events to be mixed around in order when I want them to be in the order running -> pending -> Ended.
I would suggest to first create a view in order to not get an overcomplicated SELECT statement:
CREATE VIEW v_event_schedule AS
SELECT *,
CASE
WHEN end < unix_timestamp(now())
THEN 3
WHEN start > unix_timestamp(now())
THEN 2
ELSE 1
END AS category
FROM event_schedule;
This view v_event_schedule returns an extra column, in addition to the columns of event_schedule, which represents the priority of the category (the lowest number sorts first):
1: running (in progress)
2: pending (future)
3: past
Then the following will do what you want:
SELECT a.*
FROM v_event_schedule a
INNER JOIN (
SELECT event_id,
MIN(category) category
FROM v_event_schedule
GROUP BY event_id
) b
ON a.event_id = b.event_id
AND a.category = b.category
ORDER BY a.category,
a.start DESC
LIMIT 20;
The ORDER BY can be further adapted to your needs as to how you want to sort within the same category. I added start DESC as that seemed what you were doing in your attempt.
About the original ORDER BY
You had this:
order by MAX(end < unix_timestamp(now())) asc,
MIN(start >= unix_timestamp(now())) asc,
The expressions you have there evaluate to boolean values, and both elements in the ORDER BY each divide the groups into two sections, one for false and one for true, so in total 4 groups.
The first of the two will order IDs first that have no record with an end value in the past, because only then the boolean expression is always false which is the only way to make the MAX of them false as well.
Now let's say for the same ID you have both records that have an end date in the future and records with an end date in the past. In that case the MAX aggregates to true, and so the ID will be sorted second. This is not intended, as this ID might have a "running" record.
I did not look into making your query work based on such aggregates on boolean expressions. It requires some time to understand what they are doing. A CASE WHEN to determine the category with a number really makes the SQL a lot easier to understand, at least to me.
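The misordering described above can be reproduced on three rows. This is a sketch using Python's sqlite3 module instead of MySQL, with assumptions labelled: the columns are renamed start_ts and end_ts to sidestep keyword issues, "now" is a parameter, the rows are invented, and the category numbering used here is 1 = running, 2 = pending, 3 = past.

```python
import sqlite3

SCHEDULE = [  # (event_id, start_ts, end_ts); invented rows, "now" is 1000
    ("A", 100, 200),    # past record of event A
    ("A", 900, 1100),   # running record of event A
    ("B", 2000, 2100),  # pending record of event B
]

def order_events(now):
    """Return event ids ordered by the boolean aggregates and by MIN(category)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE event_schedule "
                "(event_id TEXT, start_ts INTEGER, end_ts INTEGER)")
    con.executemany("INSERT INTO event_schedule VALUES (?, ?, ?)", SCHEDULE)
    # Boolean aggregates: one past record is enough to push the whole event back.
    by_boolean = [r[0] for r in con.execute(
        """SELECT event_id FROM event_schedule
           GROUP BY event_id
           ORDER BY MAX(end_ts < :now) ASC,
                    MIN(start_ts >= :now) ASC""", {"now": now})]
    # Numeric category per record, best (lowest) category per event.
    by_category = [r[0] for r in con.execute(
        """SELECT event_id FROM event_schedule
           GROUP BY event_id
           ORDER BY MIN(CASE WHEN end_ts < :now THEN 3
                             WHEN start_ts > :now THEN 2
                             ELSE 1 END) ASC""", {"now": now})]
    con.close()
    return by_boolean, by_category

by_boolean, by_category = order_events(1000)
```

Event A has a running record, yet the boolean ordering places pending event B ahead of it; the category ordering puts A first.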

MYSQL - find and show all duplicates within date difference criteria

This query below selects all rows that have a row with the same father registering 335 days or less since earlier registration. Is there a way to edit this query so that it does not eliminate the duplicate row in the output? I need to see all instances of the registration for that father within 335 days of each other.
SELECT * FROM ymca_reg later
WHERE NOT EXISTS (
SELECT 1 FROM ymca_reg earlier
WHERE
earlier.Father_First_Name = later.Father_First_Name
AND earlier.Father_Last_Name = later.Father_Last_Name
AND (later.Date - earlier.Date < 335) AND (later.Date > earlier.Date)
)
My current query is:
SELECT ymca_reg.* FROM ymca_reg WHERE (((ymca_reg.Year) In (SELECT Year FROM ymca_reg As Tmp
GROUP BY Year, Father_Last_Name, Father_First_Name
HAVING Count(*)>1
And Father_Last_Name = ymca_reg.Father_Last_Name
And Father_First_Name = ymca_reg.Father_First_Name)))
ORDER BY ymca_reg.Year, ymca_reg.Father_Last_Name, ymca_reg.Father_First_Name
This query does return all the duplicates for review correctly, but it's terribly slow because it doesn't use a join, and as soon as I add the date criteria it only returns the later row. Thanks.
I think you want something like this:
SELECT *
FROM ymca_reg later
WHERE EXISTS (SELECT 1
FROM ymca_reg earlier
WHERE earlier.Father_First_Name = later.Father_First_Name AND
earlier.Father_Last_Name = later.Father_Last_Name AND
abs(later.Date - earlier.Date) < 335 and
later.Date <> earlier.Date
);
This should return all records that have such duplicates. Note that "later" and "earlier" are no longer really apt descriptions, but I left the names so you can see the similarity to your query.
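A small sketch of the symmetric EXISTS check, using Python's sqlite3 module rather than MySQL. Several things here are assumptions for illustration: the id column and the sample rows are invented, dates are ISO text, and SQLite's julianday() is used to get a difference in days (in MySQL, DATEDIFF() would be the analogous function if Date is a DATE column).

```python
import sqlite3

REGISTRATIONS = [  # (id, Father_First_Name, Father_Last_Name, Date); invented rows
    (1, "John", "Smith", "2019-01-10"),
    (2, "John", "Smith", "2019-06-01"),  # 142 days after row 1
    (3, "John", "Smith", "2021-01-01"),  # more than 335 days from rows 1 and 2
    (4, "Ann", "Lee", "2019-03-01"),     # only registration for this father
]

def duplicates_within(days):
    """Return ids of rows that have another row for the same father within `days`."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE ymca_reg (id INTEGER, Father_First_Name TEXT, "
                "Father_Last_Name TEXT, Date TEXT)")
    con.executemany("INSERT INTO ymca_reg VALUES (?, ?, ?, ?)", REGISTRATIONS)
    rows = con.execute(
        """SELECT later.id
           FROM ymca_reg later
           WHERE EXISTS (SELECT 1
                         FROM ymca_reg earlier
                         WHERE earlier.Father_First_Name = later.Father_First_Name
                           AND earlier.Father_Last_Name = later.Father_Last_Name
                           AND abs(julianday(later.Date) - julianday(earlier.Date)) < ?
                           AND later.Date <> earlier.Date)
           ORDER BY later.id""", (days,)).fetchall()
    con.close()
    return [r[0] for r in rows]

dupes = duplicates_within(335)
```

Both rows of the close pair are returned, while the distant registration and the unrelated father are left out.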