Get date when two things appear at the same time (mysql query) - mysql

Is there an SQL query that can return the date when two things appear together?
Say I have a table that holds a bus schedule, with buses A and B. Bus A operates on 22 May, 24 May, and 25 May, while bus B operates on 22 May, 24 May, and 26 May. I want the most recent date on which both buses appear together, which is 24 May.

To see the dates that both buses share:
SELECT t.date
FROM YOUR_TABLE t
WHERE t.bus IN ('A', 'B')
GROUP BY t.date
HAVING COUNT(DISTINCT t.bus) = 2
To see the most recent date that both buses share:
SELECT t.date
FROM YOUR_TABLE t
WHERE t.bus IN ('A', 'B')
GROUP BY t.date
HAVING COUNT(DISTINCT t.bus) = 2
ORDER BY t.date DESC
LIMIT 1
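A minimal sketch of the query above, run against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL; the table name bus_schedule, the column names, and the year 2018 are assumptions made for illustration.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bus_schedule (bus TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO bus_schedule VALUES (?, ?)",
    [
        ("A", "2018-05-22"), ("A", "2018-05-24"), ("A", "2018-05-25"),
        ("B", "2018-05-22"), ("B", "2018-05-24"), ("B", "2018-05-26"),
    ],
)

# Keep only dates on which both A and B have a row, newest first.
SHARED_DATES = """
    SELECT t.date
    FROM bus_schedule t
    WHERE t.bus IN ('A', 'B')
    GROUP BY t.date
    HAVING COUNT(DISTINCT t.bus) = 2
    ORDER BY t.date DESC
"""

shared = [row[0] for row in conn.execute(SHARED_DATES)]
most_recent = shared[0] if shared else None
print(shared)
print(most_recent)
```

Taking the first row of the sorted list is what LIMIT 1 does in the second query.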

Assuming you have a table named bus_schedule that contains a bus_name and bus_date field, something like this should work:
select bus_schedule_a.bus_date
from bus_schedule bus_schedule_a
inner join bus_schedule bus_schedule_b
on bus_schedule_a.bus_date = bus_schedule_b.bus_date
and bus_schedule_a.bus_name = 'A'
and bus_schedule_b.bus_name = 'B'
order by bus_schedule_a.bus_date desc
limit 1

Related

HOW to Find the most frequent string in mySQL

I would like to get the most frequent car type (varchar) among owners aged 25 or older. I wrote a query, but it counts all rows together instead of counting per type.
How can I change this MySQL query so that it counts cars of the same type separately?
SELECT type, COUNT(type)
FROM `car`
INNER JOIN owner
ON car.owner = owner.id
WHERE 2018 - owner.birth_date >= 25
You have to group by type, sort the counts in descending order, and take only the top row with LIMIT 1.
SELECT type, COUNT(type) AS counter
FROM `car`
INNER JOIN owner
ON car.owner = owner.id
WHERE 2018 - owner.birth_date >= 25
GROUP BY type
ORDER BY COUNT(type) DESC
LIMIT 1
If you want the single most frequent type, you could use
SELECT type, COUNT(type)
FROM `car`
INNER JOIN owner
ON car.owner = owner.id
WHERE 2018 - owner.birth_date >= 25
GROUP BY type
ORDER BY COUNT(type) DESC, type
LIMIT 1
if you are sure there are no ties.
Also, your way of determining age isn't exact, because it ignores whether the birthday has already passed this year. If birth_date is a DATE column, you could use something like:
WHERE TIMESTAMPDIFF(YEAR, owner.birth_date, CURDATE()) >= 25
You could start by doing the date arithmetic correctly. After that, you probably just want GROUP BY and LIMIT:
SELECT c.type, COUNT(*) AS counter
FROM car c INNER JOIN
owner o
ON c.owner = o.id
WHERE o.birth_date <= curdate() - interval 25 year
GROUP BY c.type
ORDER BY counter DESC
LIMIT 1
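A small sketch of the GROUP BY / ORDER BY / LIMIT 1 pattern, using Python's built-in sqlite3 module in place of MySQL. The sample rows and the fixed reference date are made up for illustration, and SQLite's date(?, '-25 years') stands in for MySQL's curdate() - interval 25 year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner (id INTEGER PRIMARY KEY, birth_date TEXT);
    CREATE TABLE car (type TEXT, owner INTEGER);
    -- Made-up sample data: owner 3 is younger than 25 on the reference date.
    INSERT INTO owner VALUES (1, '1980-03-01'), (2, '1990-07-15'), (3, '2000-01-01');
    INSERT INTO car VALUES
        ('sedan', 1), ('sedan', 2), ('coupe', 1), ('coupe', 3), ('coupe', 3);
""")

# Count cars per type for owners born on or before (reference date - 25 years),
# then keep only the type with the highest count.
MOST_FREQUENT = """
    SELECT c.type, COUNT(*) AS counter
    FROM car c
    INNER JOIN owner o ON c.owner = o.id
    WHERE o.birth_date <= date(?, '-25 years')
    GROUP BY c.type
    ORDER BY counter DESC, c.type
    LIMIT 1
"""

reference_date = "2018-06-01"  # fixed so the result is reproducible
top = conn.execute(MOST_FREQUENT, (reference_date,)).fetchone()
print(top)
```

Without the age filter, coupe would win with three cars; with it, only the cars of owners 1 and 2 are counted.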

sql optimization: count all rows through subquery or own query / other improvements

I'm trying to improve my mysql query. At first I'm trying to optimize that simple one:
SELECT * ,
(
SELECT COUNT(id)
FROM animal
WHERE type = :type AND timestampadopt > 0 AND (date BETWEEN DATE_FORMAT(CURDATE() , '%Y-%m-%d') - INTERVAL 1 YEAR AND DATE_FORMAT(CURDATE(),'%Y-%m-%d'))
) AS countanimals
FROM animal
WHERE type = :type AND timestampadopt > 0 AND (date BETWEEN DATE_FORMAT(CURDATE() , '%Y-%m-%d') - INTERVAL 1 YEAR AND DATE_FORMAT(CURDATE(),'%Y-%m-%d'))
ORDER BY timestamp DESC
LIMIT 1, 20;
COLUMNS:
id | timestampadd | timestampadopt | dateborn | animaltype | gender | chipped | smalldescger | smalldesceng | imagepath
On the affected page I list all animals with pagination: you see 20 animals and use the Next button to get the next 20.
For the pagination I need to know how many pages to display, so I have to count how many animals there are in total. That is what the subquery does.
I measured the times with profiling and got the following results:
0.0047s for the total query,
0.0023s for the subquery
There are only 5 rows in the database!
On that page I offer some filters, such as age +/- 1 year and whether the animal has already been adopted. Because of that I need the WHERE clause in both queries, which probably costs the most performance, followed by the ORDER BY clause, which is necessary to display the newest animals first.
P.S. I need all columns from the table. I did some testing, and SELECT * had the same runtime as selecting all 10 columns manually, as some people recommend.
EDIT:
Would it be worth moving the smalltext (varchar 250) and imagepath (varchar 50) columns into their own table and inner joining them? The other columns I will probably need for filters later, but type, gender, and chipped are tinyints.
Any improvement tips for me?
Should I run the subquery as its own query outside of the main one?
Edit: 31.07
SELECT a.* , c.cnt AS countanimals
FROM animal a
JOIN (
Select a1.date AS date1, a1.tmstmpadopt AS tmstmpadopt1, a1.type AS type1, COUNT(a1.id) as cnt
FROM animal a1
GROUP BY date1, tmstmpadopt1, type1
) c on (a.date = c.date1 AND a.tmstmpadopt = c.tmstmpadopt1 AND a.type = c.type1)
WHERE a.type = 1 AND tmstmpadopt = 0 AND (date BETWEEN DATE_FORMAT(CURDATE() , '%Y-%m-%d') - INTERVAL 100 YEAR AND DATE_FORMAT(CURDATE(),'%Y-%m-%d')- INTERVAL 1 YEAR)
ORDER BY a.timestamp DESC
LIMIT 1, 20;
An inline view may help you. Try this:
SELECT a.*,c.cnt AS countanimals
FROM animal a
join (Select a1.dateborn, a1.timestampadopt, count(a1.id) as cnt
from animal a1
Where a1.timestampadopt > 0
and a1.type = :type
group by a1.dateborn, a1.timestampadopt) c on (a.dateborn = c.dateborn and a.timestampadopt = c.timestampadopt)
WHERE a.type = :type
AND a.timestampadopt > 0
AND a.dateborn BETWEEN DATE_FORMAT(CURDATE(),'%Y-%m-%d')-INTERVAL 1 YEAR AND DATE_FORMAT(CURDATE(),'%Y-%m-%d')
ORDER BY a.timestamp DESC
LIMIT 1, 20;
Why not do the count in your script? You can count the rows as you process them.
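One way to read the "own query" idea from the question is to run the total count as a separate statement and compute the number of pages in the script. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 module instead of MySQL, a cut-down animal table with made-up rows, and a hypothetical helper fetch_page. The date filter is left out for brevity.

```python
import math
import sqlite3

PAGE_SIZE = 20

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE animal ("
    "id INTEGER PRIMARY KEY, type INTEGER, timestampadopt INTEGER, timestamp INTEGER)"
)
# Made-up sample data: 45 adopted animals of type 1 plus 5 of type 2.
sample = [(1, 100 + i, 1000 + i) for i in range(45)]
sample += [(2, 100 + i, 2000 + i) for i in range(5)]
conn.executemany(
    "INSERT INTO animal (type, timestampadopt, timestamp) VALUES (?, ?, ?)", sample
)

# Shared filter so the count and the page query cannot drift apart.
FILTER = "type = ? AND timestampadopt > 0"


def fetch_page(conn, animal_type, page):
    """Return (rows, page_count) for a 1-based page number."""
    total = conn.execute(
        f"SELECT COUNT(*) FROM animal WHERE {FILTER}", (animal_type,)
    ).fetchone()[0]
    offset = (page - 1) * PAGE_SIZE  # page 1 starts at offset 0
    rows = conn.execute(
        f"SELECT * FROM animal WHERE {FILTER} "
        "ORDER BY timestamp DESC LIMIT ? OFFSET ?",
        (animal_type, PAGE_SIZE, offset),
    ).fetchall()
    return rows, math.ceil(total / PAGE_SIZE)


rows, page_count = fetch_page(conn, animal_type=1, page=3)
print(len(rows), page_count)
```

The count runs once per request instead of once per returned row, and the offset for the first page is 0. In MySQL's LIMIT offset, count form, LIMIT 1, 20 therefore skips the newest row.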

Comparing Dates in a MYSQL Subquery

I have two tables
class
-------------
id name
-------------
1 Knives
2 Pastries
class_date
-------------
get_id start_date
-------------
1 2017-10-09
1 2017-11-15
1 2017-12-03
2 2017-10-30
The class 'Knives' is a series with multiple dates. The class 'Pastries' is only offered on one date.
I want my result to be based on Oct 10, 2017 (or current date). In my search I only want results based on the first date - in this case the date of Oct 9, 2017 for 'Knives' should disqualify it from showing up in the results. 'Pastries' should show up.
I am not sure if I should do a LEFT OUTER JOIN or a Subquery. I've tried both but neither works - but I'm probably not doing it correctly.
This is what I tried:
SELECT *
FROM class, class_date WHERE
class_date.get_id = class.id &&
(SELECT DATE(start_date)
FROM class, class_date WHERE
class_date.get_id = classes.id
ORDER BY class_date.start_date ASC
LIMIT 1
) > CURDATE()
ORDER BY class_date.start_date ASC
and
SELECT *
FROM class
LEFT OUTER JOIN
class_date ON
class_date.get_id = classes.id
WHERE
class_date.start_date > CURDATE()
GROUP BY classes.class_id
ORDER BY class_dates.start_date ASC
I have a feeling that the subquery is the way to go but I get no results. If I use < instead of > I get too many results. Any help would be appreciated.
Here is one method to get the most recent record as of a particular date. This allows you to get all the rows (and you can join in class to get rows there):
select cd.*
from class_date cd
where cd.start_date = (select max(cd2.start_date)
from class_date cd2
where cd2.get_id = cd.get_id and
cd2.start_date <= '2017-10-09'
);
If you just want the maximum date for a given class:
select cd.get_id, max(cd.start_date)
from class_date cd
where cd.start_date <= '2017-10-09'
group by cd.get_id;
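For the filter described in the question (show a class only if its first date is after the reference date), a different technique from the answer above may fit better: group by class and test MIN(start_date) in a HAVING clause. This is a sketch only, run through Python's built-in sqlite3 module rather than MySQL, with the reference date passed as a parameter instead of CURDATE().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE class (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE class_date (get_id INTEGER, start_date TEXT);
    INSERT INTO class VALUES (1, 'Knives'), (2, 'Pastries');
    INSERT INTO class_date VALUES
        (1, '2017-10-09'), (1, '2017-11-15'), (1, '2017-12-03'),
        (2, '2017-10-30');
""")

# A class qualifies only if its earliest date is after the reference date.
UPCOMING = """
    SELECT c.id, c.name, MIN(cd.start_date) AS first_date
    FROM class c
    INNER JOIN class_date cd ON cd.get_id = c.id
    GROUP BY c.id, c.name
    HAVING MIN(cd.start_date) > ?
    ORDER BY first_date
"""

upcoming = conn.execute(UPCOMING, ("2017-10-10",)).fetchall()
print(upcoming)
```

'Knives' drops out because its series started on 2017-10-09, even though it has later dates.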

Using a sum of values as a condition (SQL query)

I have a table that looks roughly like this
Year Species Count
1979 A 0
1980 A 10
1981 A 4
1982 A 3
1979 B 0
1980 B 1
1981 B 2
1982 B 3
1979 C 9
1980 C 14
1981 C 2
1982 C 1
What I want is to return all Year, Species, Count rows for those species that have a total count (summed over all years) of 10 or more. So for a total count of 20, I would want it to return just
1979 C 9
1980 C 14
1981 C 2
1982 C 1
I played around with HAVING but haven't really gotten anything useful (total SQL beginner).
In MySQL, you can do this using aggregation and a join:
select t.*
from mytable t join
(select species, sum(`count`) as total
from mytable
group by species
) s
on t.species = s.species
where s.total >= 10;
This is the easiest. You already have the counts. Group on species and filter the table on the results of the subquery. You can get the same result with an EXISTS or a join as well.
SELECT
[YEAR]
,SPECIES
,[COUNT]
FROM TABLE
WHERE SPECIES IN (
SELECT SPECIES
FROM TABLE
GROUP BY SPECIES
HAVING SUM([COUNT]) >= 20
)
Adding some additional explanation for BootstrapBill:
GROUP BY "makes multiple sets", one for each unique value of the GROUP BY column. That allows the aggregate function SUM() to act on only one set of the GROUP BY values at a time. HAVING is sort of like a WHERE clause for the GROUP BY statement that allows you to apply a predicate to each group. The only fields a GROUP BY can return are the grouped column itself and the results of any aggregate function(s), so you need to join back to, or filter, the original set to get the other columns you are targeting in the query.
And I apologize, I did not see where the OP stated this was for MySQL. The core concept is the same, so I am leaving the answer. The [] are MS SQL syntax for escaping the keywords COUNT and YEAR.
You'll want to use GROUP BY with the SUM() aggregate function and HAVING clause (similar to WHERE, but for groups instead of rows), combined with a self-join:
SELECT t1.`Year`, t1.`Species`, t1.`Count`
FROM mytable t1 INNER JOIN (
SELECT `Species`, SUM(`Count`)
FROM mytable
GROUP BY `Species`
HAVING SUM(`Count`) >= 20
) t2
ON t1.`Species` = t2.`Species`
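A quick check of the SUM() plus HAVING approach against the sample data from the question, sketched with Python's built-in sqlite3 module as a stand-in for MySQL. The table name mytable follows the query above; the threshold is passed as a parameter so both 10 and 20 can be tried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (`Year` INTEGER, `Species` TEXT, `Count` INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1979, "A", 0), (1980, "A", 10), (1981, "A", 4), (1982, "A", 3),
        (1979, "B", 0), (1980, "B", 1), (1981, "B", 2), (1982, "B", 3),
        (1979, "C", 9), (1980, "C", 14), (1981, "C", 2), (1982, "C", 1),
    ],
)

# The subquery keeps species whose summed count reaches the threshold;
# the join then returns every original row for those species.
ROWS_FOR_BIG_SPECIES = """
    SELECT t1.`Year`, t1.`Species`, t1.`Count`
    FROM mytable t1
    INNER JOIN (
        SELECT `Species`
        FROM mytable
        GROUP BY `Species`
        HAVING SUM(`Count`) >= ?
    ) t2 ON t1.`Species` = t2.`Species`
    ORDER BY t1.`Species`, t1.`Year`
"""

at_least_20 = conn.execute(ROWS_FOR_BIG_SPECIES, (20,)).fetchall()
at_least_10 = conn.execute(ROWS_FOR_BIG_SPECIES, (10,)).fetchall()
print(at_least_20)
print(len(at_least_10))
```

The totals are 17 for A, 6 for B, and 26 for C, so a threshold of 20 returns only the four C rows, while 10 returns the A and C rows.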

mysql moving average of N rows

I have a simple MySQL table like below, used to compute MPG for a car.
+-------------+-------+---------+
| DATE | MILES | GALLONS |
+-------------+-------+---------+
| JAN 25 1993 | 20.0 | 3.00 |
| FEB 07 1993 | 55.2 | 7.22 |
| MAR 11 1993 | 44.1 | 6.28 |
+-------------+-------+---------+
I can easily compute the Miles Per Gallon (MPG) for the car using a select statement, but because the MPG varies widely from fillup to fillup (i.e. you don't fill the exact same amount of gas each time), I would like to compute a 'MOVING AVERAGE' as well. So for any row the MPG is MILES/GALLONS for that row, and the MOVINGMPG is the SUM(MILES)/SUM(GALLONS) for the last N rows. If fewer than N rows exist by that point, just SUM(MILES)/SUM(GALLONS) up to that point.
Is there a single SELECT statement that will fetch the rows with MPG and MOVINGMPG by substituting N into the select statement?
Yes, it's possible to return the specified resultset with a single SQL statement.
Unfortunately, MySQL versions before 8.0 do not support analytic (window) functions, which would make for a fairly simple statement. Even without that syntax, it is possible to emulate some analytic functions using MySQL user variables.
One of the ways to achieve the specified result set (with a single SQL statement) is to use a JOIN operation, assigning a unique ascending integer value (rownum, derived and assigned within the query) to each row.
For example:
SELECT q.rownum AS rownum
, q.date AS latest_date
, q.miles/q.gallons AS latest_mpg
, COUNT(1) AS cnt_rows
, MIN(r.date) AS earliest_date
, SUM(r.miles) AS rtot_miles
, SUM(r.gallons) AS rtot_gallons
, SUM(r.miles)/SUM(r.gallons) AS rtot_mpg
FROM ( SELECT @s_rownum := @s_rownum + 1 AS rownum
, s.date
, s.miles
, s.gallons
FROM mytable s
JOIN (SELECT @s_rownum := 0) c
ORDER BY s.date
) q
JOIN ( SELECT @t_rownum := @t_rownum + 1 AS rownum
, t.date
, t.miles
, t.gallons
FROM mytable t
JOIN (SELECT @t_rownum := 0) d
ORDER BY t.date
) r
ON r.rownum <= q.rownum
AND r.rownum > q.rownum - 2
GROUP BY q.rownum
Your desired value of "n" to specify how many rows to include in each rollup row is specified in the predicate just before the GROUP BY clause. In this example, up to "2" rows in each running total row.
If you specify a value of 1, you will get (basically) the original table returned.
To eliminate any "incomplete" running total rows (consisting of fewer than "n" rows), that value of "n" would need to be specified again, adding:
HAVING COUNT(1) >= 2
sqlfiddle demo: http://sqlfiddle.com/#!2/52420/2
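The row-number join can be tried without a MySQL server. The sketch below is not the statement above: it runs on Python's built-in sqlite3 module, which has no MySQL user variables, so rownum is derived with a correlated COUNT(*) instead. That swap assumes the date values are unique. The dates are stored in ISO format so they sort correctly as text.

```python
import sqlite3

N = 2  # number of rows in each moving window

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (date TEXT PRIMARY KEY, miles REAL, gallons REAL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("1993-01-25", 20.0, 3.00), ("1993-02-07", 55.2, 7.22), ("1993-03-11", 44.1, 6.28)],
)

# Each derived table numbers the rows by date; the join then pairs every
# row q with itself and up to N - 1 preceding rows r.
MOVING_MPG = """
    SELECT q.date,
           q.miles / q.gallons AS mpg,
           SUM(r.miles) / SUM(r.gallons) AS moving_mpg
    FROM (SELECT s.*, (SELECT COUNT(*) FROM mytable x WHERE x.date <= s.date) AS rownum
          FROM mytable s) q
    JOIN (SELECT t.*, (SELECT COUNT(*) FROM mytable y WHERE y.date <= t.date) AS rownum
          FROM mytable t) r
      ON r.rownum <= q.rownum
     AND r.rownum > q.rownum - ?
    GROUP BY q.date, q.miles, q.gallons
    ORDER BY q.date
"""

join_rows = conn.execute(MOVING_MPG, (N,)).fetchall()
for date, mpg, moving_mpg in join_rows:
    print(date, round(mpg, 4), round(moving_mpg, 4))
```

The first row has only itself in its window, so its moving value equals its own MPG.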
Followup:
Q: I'm trying to understand your SQL statement. Does your solution do a select of twenty rows for each row in the db? In other words, if I have 1000 rows will your statement perform 20000 selects? (I'm worried about performance)...
A: You are right to be concerned with performance.
To answer your question, no, this does not perform 20,000 selects for 1,000 rows.
The performance hit comes from the two (essentially identical) inline views (aliased as q and r). What MySQL does with these (basically) is create temporary MyISAM tables (MySQL calls them "derived tables"), which are basically copies of mytable, with an extra column, each row assigned a unique integer value from 1 to the number of rows.
Once the two "derived" tables are created and populated, MySQL runs the outer query, using those two "derived" tables as a row source. Each row from q, is matched with up to n rows from r, to calculate the "running total" miles and gallons.
For better performance, you could use a column already in the table, rather than having the query assign unique integer values. For example, if the date column is unique, then you could calculate "running total" over a certain period of days.
SELECT q.date AS latest_date
, SUM(q.miles)/SUM(q.gallons) AS latest_mpg
, COUNT(1) AS cnt_rows
, MIN(r.date) AS earliest_date
, SUM(r.miles) AS rtot_miles
, SUM(r.gallons) AS rtot_gallons
, SUM(r.miles)/SUM(r.gallons) AS rtot_mpg
FROM mytable q
JOIN mytable r
ON r.date <= q.date
AND r.date > q.date + INTERVAL -30 DAY
GROUP BY q.date
(For performance, you would want an appropriate index defined with date as a leading column in the index.)
For the first query, any predicates included (in the inline view definition queries) to reduce the number of rows returned (for example, return only date values in the past year) would reduce the number of rows to be processed, and would also likely improve performance.
Again, to your question about running 20,000 selects for 1,000 rows: a nested loops operation is another way to get the same result set. For a large number of rows, this can exhibit slower performance. (On the other hand, this approach can be fairly efficient when only a few rows are being returned.) For example:
SELECT q.date AS latest_date
, q.miles/q.gallons AS latest_mpg
, ( SELECT SUM(r.miles)/SUM(r.gallons)
FROM mytable r
WHERE r.date <= q.date
AND r.date >= q.date + INTERVAL -90 DAY
) AS rtot_mpg
FROM mytable q
ORDER BY q.date
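MySQL 8.0 and later do support window functions, so on those versions the moving ratio over the last N rows can be written without a self-join. The sketch below shows that variant through Python's built-in sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first release with window functions). The frame clause itself is standard SQL, but this has only been reasoned through against SQLite here, not run on a MySQL server.

```python
import sqlite3

N = 2  # number of rows in each moving window (current row plus N - 1 before it)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (date TEXT PRIMARY KEY, miles REAL, gallons REAL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("1993-01-25", 20.0, 3.00), ("1993-02-07", 55.2, 7.22), ("1993-03-11", 44.1, 6.28)],
)

# The frame offset is formatted in as an integer literal because a frame
# boundary has to be a constant; N is a trusted int defined above.
MOVING_MPG = f"""
    SELECT date,
           miles / gallons AS mpg,
           SUM(miles) OVER w / SUM(gallons) OVER w AS moving_mpg
    FROM mytable
    WINDOW w AS (ORDER BY date ROWS BETWEEN {int(N) - 1} PRECEDING AND CURRENT ROW)
    ORDER BY date
"""

window_rows = conn.execute(MOVING_MPG).fetchall()
for date, mpg, moving_mpg in window_rows:
    print(date, round(mpg, 4), round(moving_mpg, 4))
```

For the first rows, where fewer than N rows exist, the frame simply contains what is available, which matches the behaviour asked for in the question.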
Something like this should work:
SELECT Date, Miles, Gallons, Miles/Gallons as MilesPerGallon,
@Miles:=@Miles+Miles overallMiles,
@Gallons:=@Gallons+Gallons overallGallons,
@RunningTotal:=@Miles/@Gallons runningTotal
FROM YourTable
JOIN (SELECT @Miles:= 0) t
JOIN (SELECT @Gallons:= 0) s
Which produces the following:
DATE               MILES  GALLONS  MILESPERGALLON  RUNNINGTOTAL
January, 25 1993   20     3        6.666667        6.666666666667
February, 07 1993  55.2   7.22     7.645429        7.358121330724
March, 11 1993     44.1   6.28     7.022293        7.230303030303
--EDIT--
In response to the comment, you can add another Row Number to limit your results to the last N rows:
SELECT *
FROM (
SELECT Date, Miles, Gallons, Miles/Gallons as MilesPerGallon,
@Miles:=@Miles+Miles overallmiles,
@Gallons:=@Gallons+Gallons overallGallons,
@RunningTotal:=@Miles/@Gallons runningTotal,
@RowNumber:=@RowNumber+1 rowNumber
FROM (SELECT * FROM YourTable ORDER BY Date DESC) u
JOIN (SELECT @Miles:= 0) t
JOIN (SELECT @Gallons:= 0) s
JOIN (SELECT @RowNumber:= 0) r
) t
WHERE rowNumber <= 3
Just change your ORDER BY clause accordingly.