SQL query for related rows of data from the same table - mysql

I am trying to understand how to write SQL statements which can return related rows of data from the same table.
By way of example, I have the following table called purchases:
------------------------------
|ID | Customer | Date | Cost |
------------------------------
| 1 | 1        | Mon  | 20.0 |
| 2 | 1        | Tue  | 10.0 |
| 3 | 1        | Sat  | 23.0 |
| 4 | 2        | Thu  | 211.0|
| 5 | 2        | Mon  | 24.0 |
| 6 | 2        | Sat  | 50.0 |
| 7 | 3        | Mon  | 34.0 |
| 8 | 3        | Sat  | 200.0|
| 9 | 3        | Fri  | 90.0 |
------------------------------
I want the data for how much each customer spent on Saturday and the corresponding data for how much they spent on Monday, so that I would have the following data:
Saturday: [23.0,50.0,200.0]
Monday: [20.0,24.0,34.0]
How do I do this? I can run
SELECT cost FROM purchases WHERE Date = 'Sat'
and
SELECT cost FROM purchases WHERE Date = 'Mon'
but this doesn't seem very satisfactory, because pairing the two results depends on the order in which the database returns the rows, and a customer may not have made a purchase on both Saturday and Monday.
I investigated Joins and Unions for this purpose but they seem concerned with data from more than one table.
I'm sure there's a standard way to solve this problem.

You could join the table to itself; this gives you the data you want, but in the format
Customer, AmountMon, AmountSat
The SQL:
select p1.Customer, p1.cost as AmountMon, p2.cost as AmountSat
from purchases p1, purchases p2
where p1.customer = p2.customer
and p1.date = "Mon"
and p2.date = "Sat"
Here is a working sqlfiddle of it: http://sqlfiddle.com/#!2/367ef/1
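Note that I added a new customer (4) who only shopped on Saturday. The query still works: because the join requires both a Monday and a Saturday row for the same customer, that customer is simply left out of the result.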
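As a cross-check of the self-join above, here is a minimal sketch that runs it on the sample data using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the SQL involved is plain enough that both should behave the same way). The row for customer 4 and its cost of 15.0 are made up for illustration. The second query is an alternative, not the answer's method: conditional aggregation keeps customers who shopped on only one of the two days and reports NULL for the missing day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id INTEGER, customer INTEGER, date TEXT, cost REAL)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, 1, "Mon", 20.0), (2, 1, "Tue", 10.0), (3, 1, "Sat", 23.0),
        (4, 2, "Thu", 211.0), (5, 2, "Mon", 24.0), (6, 2, "Sat", 50.0),
        (7, 3, "Mon", 34.0), (8, 3, "Sat", 200.0), (9, 3, "Fri", 90.0),
        (10, 4, "Sat", 15.0),  # hypothetical customer who only shopped on Saturday
    ],
)

# Self-join: one row per customer that has BOTH a Monday and a Saturday purchase.
self_join = conn.execute("""
    SELECT p1.customer, p1.cost AS amount_mon, p2.cost AS amount_sat
    FROM purchases p1
    JOIN purchases p2 ON p1.customer = p2.customer
    WHERE p1.date = 'Mon' AND p2.date = 'Sat'
    ORDER BY p1.customer
""").fetchall()

# Conditional aggregation: keeps customers with only one of the two days.
pivot = conn.execute("""
    SELECT customer,
           SUM(CASE WHEN date = 'Mon' THEN cost END) AS amount_mon,
           SUM(CASE WHEN date = 'Sat' THEN cost END) AS amount_sat
    FROM purchases
    WHERE date IN ('Mon', 'Sat')
    GROUP BY customer
    ORDER BY customer
""").fetchall()

print(self_join)
print(pivot)
```

Either way, the Monday and Saturday amounts arrive paired per customer, so nothing depends on the order in which two separate queries return their rows.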

Related

SQL - select x entries within a timespan

I'm creating a database (in MySQL) with a table of measurements. For each measurement I want to store the DateTime it came in. To show plots within an app for different intervals (measurements of the day/week/month/year), I want to sample the data points I have, so that I can return e.g. 30 data points for the whole year as well as for the day/hour. This is the same as is done with stock price graphs:
[Figure: stock price plot for 1 day vs. stock price plot for 1 month]
As you can see, the number of data points is the same in both pictures.
So how can I select x entries within a timespan in MySQL via SQL?
My data looks like this:
+====+====================+=============+==========+
| id | datetime | temperature | humidity |
+====+====================+=============+==========+
| 1 | 1-15-2016 00:30:00 | 20 | 40 |
+----+--------------------+-------------+----------+
| 2 | 1-15-2016 00:35:00 | 19 | 41 |
+----+--------------------+-------------+----------+
| 3 | 1-15-2016 00:40:00 | 20 | 40 |
+----+--------------------+-------------+----------+
| 4 | 1-15-2016 00:45:00 | 20 | 42 |
+----+--------------------+-------------+----------+
| 5 | 1-15-2016 00:50:00 | 21 | 42 |
+----+--------------------+-------------+----------+
| 6 | 1-15-2016 00:55:00 | 20 | 43 |
+----+--------------------+-------------+----------+
| 7 | 1-15-2016 01:00:00 | 21 | 43 |
+====+====================+=============+==========+
Let's say, I always want two data points (in reality a lot more). So for the last half hour I want the database to return data point 1 and 4, for the last ten minutes I want it to return 6 and 7.
Thanks for helping!
PS: I'm sorry for any errors in my English
OK, assuming a very simple systematic approach, you can get the first and last entry for any defined period:
select *
from table
where mydatetime =
(select
max(mydatetime)
from table
where mydatetime between '2017-03-01' and '2017-03-15'
)
OR mydatetime =
(select
min(mydatetime)
from table
where mydatetime between '2017-03-01' and '2017-03-15'
)
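To see the first-and-last query above in action, here is a minimal sketch on the question's sample rows, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption). The table name measurements is hypothetical (table itself is a reserved word), the column is called mydatetime as in the answer, and the timestamps are rewritten in ISO format so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements "
    "(id INTEGER, mydatetime TEXT, temperature INTEGER, humidity INTEGER)"
)
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?, ?)",
    [
        (1, "2016-01-15 00:30:00", 20, 40),
        (2, "2016-01-15 00:35:00", 19, 41),
        (3, "2016-01-15 00:40:00", 20, 40),
        (4, "2016-01-15 00:45:00", 20, 42),
        (5, "2016-01-15 00:50:00", 21, 42),
        (6, "2016-01-15 00:55:00", 20, 43),
        (7, "2016-01-15 01:00:00", 21, 43),
    ],
)

# Same shape as the answer's query: keep the rows whose timestamp is the
# earliest or the latest one inside the requested window.
query = """
    SELECT id FROM measurements
    WHERE mydatetime = (SELECT MAX(mydatetime) FROM measurements
                        WHERE mydatetime BETWEEN :start AND :end)
       OR mydatetime = (SELECT MIN(mydatetime) FROM measurements
                        WHERE mydatetime BETWEEN :start AND :end)
    ORDER BY id
"""
window = {"start": "2016-01-15 00:30:00", "end": "2016-01-15 00:45:00"}
endpoint_ids = [row[0] for row in conn.execute(query, window)]
print(endpoint_ids)  # [1, 4]
```

This only ever yields the two endpoints of the window; returning more than two evenly spaced points would need a different technique.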
I believe your answer can be found at the following location:
https://stackoverflow.com/a/1891796/7176046
If you want to filter out any rows that fall outside your date/time range, your query would look like:
Select * from table where (date/time column) between (start of range) and (end of range)

Calculate the fullness of an apartment using SQL expression

I have a database which looks like this:
Reservations Table:
-------------------------------------------------
id | room_id | start | end |
1 | 1 | 2015-05-13 | 2015-05-16 |
2 | 1 | 2015-05-18 | 2015-05-20 |
3 | 1 | 2015-05-21 | 2015-05-24 |
-------------------------------------------------
Apartment Table:
---------------------------------------
id | room_id | name |
1 | 1 | test apartment |
---------------------------------------
May (month 05) has 31 days, and the database holds 3 reservations that add up to 8 days of usage. So (31 - 8) / 31 = 23 / 31 ≈ 0.742, which means the apartment is about 74.2% empty and about 25.8% in use. How can I do all of that in SQL (MySQL)?
This is my proposal:
SELECT SUM(DAY(`end`)-DAY(`start`))/EXTRACT(DAY FROM LAST_DAY(`start`)) FROM `apt`;
The LAST_DAY function returns the date of the last day of the month.
Check this
http://sqlfiddle.com/#!9/7c53b/2/0
Not the most efficient query but will get the job done.
select
sum(a.days)*100/(SELECT DAY(LAST_DAY(min(start))) from test1)
as usePercent,
100-(sum(a.days)*100/(SELECT DAY(LAST_DAY(min(start))) from test1))
as emptyPercent
FROM
(select DATEDIFF(end,start) as days from test1) a
First I get the date difference for each reservation and sum the results. Then, in a nested query, I use DAY(LAST_DAY()) to get the number of days in the month. Finally, I calculate the percentages using your logic.
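The arithmetic in these answers can be checked with a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption). SQLite has no DATEDIFF or LAST_DAY, so julianday() differences and date modifiers are swapped in for them here; the table name reservations is taken from the question's heading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "end" is a keyword in SQLite, so the column name is quoted.
conn.execute(
    'CREATE TABLE reservations (id INTEGER, room_id INTEGER, start TEXT, "end" TEXT)'
)
conn.executemany(
    "INSERT INTO reservations VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2015-05-13", "2015-05-16"),
        (2, 1, "2015-05-18", "2015-05-20"),
        (3, 1, "2015-05-21", "2015-05-24"),
    ],
)

# used_days: sum of (end - start) per reservation, like DATEDIFF(end, start).
# month_days: day-of-month of the last day of the earliest start's month,
# like DAY(LAST_DAY(MIN(start))).
used_days, month_days = conn.execute("""
    SELECT SUM(julianday("end") - julianday(start)),
           CAST(strftime('%d', date(MIN(start), 'start of month',
                                    '+1 month', '-1 day')) AS INTEGER)
    FROM reservations
""").fetchone()

use_percent = used_days * 100 / month_days
empty_percent = 100 - use_percent
print(f"{use_percent:.1f}% used, {empty_percent:.1f}% empty")
```

Like the queries above, this assumes that all reservations fall inside a single month.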

How to transpose dynamic matrix dataset in EXCEL or SQL

I have the following issue that I can't find the best solution for.
I need to align hours per date per ID from a sheet with these parameters. I tried transposing in Excel, but I only got a summary result that did not give me the rows per ID and date.
HOURS WORKSHEET
YEAR = 2015
ID | MON | TUES | WED | THU | FRI | SAT | SUN | WEEKNR
15 | 6 | 8 | 9 | - | - | - | - | 14
16 | - | - | 2 | - | 3 | - | - | 14
17 | - | 3 | 5 | - | - | 5 | - | 14
18 | 9 | - | - | 3 | - | - | - | 14
I'd like to have the ID transposed on the date with the values(hours) like this
ID | DATE | HOURS
15 | 30-3-2015 | 6
15 | 31-3-2015 | 8
15 | 1-4-2015 | 9
16 | 1-4-2015 | 2
16 | 3-4-2015 | 3
17 | 31-3-2015 | 3
17 | 1-4-2015 | 5
17 | 4-4-2015 | 5
18 | 30-3-2015 | 9
18 | 2-4-2015 | 3
Any suggestion/solution is much appreciated. SQL or Excel formula(VBA)
This would be easy in SQL Server (using a table function). In MySQL, however, the simplest way I can think of is this:
select id, date (use date_ADD() to deduce this), mon as hours from table
union all
select id, date (use date_ADD() to deduce this), tues as hours from table
union all
select id, date (use date_ADD() to deduce this), wed as hours from table
union all
select id, date (use date_ADD() to deduce this), thu as hours from table
union all
select id, date (use date_ADD() to deduce this), fri as hours from table
union all
select id, date (use date_ADD() to deduce this), sat as hours from table
union all
select id, date (use date_ADD() to deduce this), sun as hours from table
And then sort as needed.
EDIT: if you need help with the date, tell us.
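For the date part, here is a minimal sketch in Python of the same unpivot idea (one output row per ID and weekday that has hours). It assumes that WEEKNR is an ISO week number, which matches the dates in the question (week 14 of 2015 starts on Monday 30-3-2015); the names sheet and unpivot are made up for illustration, and a missing key stands for the "-" cells.

```python
from datetime import date

YEAR = 2015
DAYS = ["MON", "TUES", "WED", "THU", "FRI", "SAT", "SUN"]

# The worksheet from the question; days without hours ("-") are left out.
sheet = [
    {"ID": 15, "MON": 6, "TUES": 8, "WED": 9, "WEEKNR": 14},
    {"ID": 16, "WED": 2, "FRI": 3, "WEEKNR": 14},
    {"ID": 17, "TUES": 3, "WED": 5, "SAT": 5, "WEEKNR": 14},
    {"ID": 18, "MON": 9, "THU": 3, "WEEKNR": 14},
]


def unpivot(rows, year):
    """Turn one row per ID and week into one row per ID and date."""
    result = []
    for row in rows:
        # ISO weekdays run from 1 (Monday) to 7 (Sunday).
        for iso_weekday, day_name in enumerate(DAYS, start=1):
            hours = row.get(day_name)
            if hours is None:
                continue
            day = date.fromisocalendar(year, row["WEEKNR"], iso_weekday)
            result.append((row["ID"], day, hours))
    return result


long_rows = unpivot(sheet, YEAR)
for row_id, day, hours in long_rows:
    print(row_id, day.isoformat(), hours)
```

In the UNION ALL query, the equivalent step is to derive the Monday of the week once and add 0 to 6 days for the MON to SUN columns.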

mysql split a string in a where clause

I have an event system and for my repeat events I am using a cron like system.
Repeat Event:
+----+----------+--------------+
| id | event_id | repeat_value |
+----+----------+--------------+
| 1 | 11 | *_*_* |
| 2 | 12 | *_*_2 |
| 3 | 13 | *_*_4/2 |
| 4 | 14 | 23_*_* |
| 5 | 15 | 30_05_* |
+----+----------+--------------+
NOTE: The cron value is day_month_day of week
Event:
+----+------------------------+---------------------+---------------------+
| id | name | start_date_time | end_date_time |
+----+------------------------+---------------------+---------------------+
| 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
| 15 | Repeat yearly | 2014-05-30 07:30:00 | 2014-05-30 10:15:00 |
+----+------------------------+---------------------+---------------------+
Anyway I have a query to select the events:
SELECT *
FROM RepeatEvent
JOIN `Event`
ON `Event`.`id` = `RepeatEvent`.`event_id`
That produces:
+----+----------+--------------+----+------------------------+---------------------+---------------------+
| id | event_id | repeat_value | id | name | start_date_time | end_date_time |
+----+----------+--------------+----+------------------------+---------------------+---------------------+
| 1 | 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 2 | 12 | *_*_2 | 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 3 | 13 | *_*_4/2 | 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 4 | 14 | 23_*_* | 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
| 5 | 15 | 30_05_* | 15 | Repeat yearly | 2014-05-30 07:30:00 | 2014-05-30 10:15:00 |
+----+----------+--------------+----+------------------------+---------------------+---------------------+
However, I want to select events within a month. I will only have certain conditions: daily, weekly, every two weeks, month and yearly.
I want to put in my where clause a way to divide the string of the repeat value and if it fits any of the following conditions to show it as a result (repeatEvent is row that is being interrogated, search is the date being looked for):
array(3) = string_divide(repeat_value, '_')
daily = array(0)
month = array(1)
dayOfWeek = array(2)
if(daily == '*' && month == '*' && dayOfWeek == '*') //returns all the daily events as they will happen
    return repeatEvent
if(daily == '*' && month == '*' && dayOfWeek == search.dayOfWeek) //returns all the weekly events on a specific day of the week
    return repeatEvent
if(daily == search.date && month == '*' && dayOfWeek == '*') //returns all the monthly events on a specific day of the month
    return repeatEvent
if(contains(dayOfWeek, '/')) //every-two-weeks events such as 4/2
    array(2) = string_divide(dayOfWeek, '/')
    specificDayOfWeek = array(0)
    if(specificDayOfWeek == repeatEvent.start_date.dayNumber)
        if((timestampOf(search.timestamp) - timestampOf(repeatEvent.start_date)) / 604800 == (0 OR EVEN))
            return repeatEvent
if(daily == search.date && month == search.month && dayOfWeek == '*') //returns a single yearly event (shouldn't often crop up)
    return repeatEvent
//everything else is either an unknown format of repeat_value or not an event on this day
To summarise I want to run a query in which the repeat value is split in the where clause and I can interrogate the split items. I have looked at cursors but the internet seems to advise against them.
I could process the results of selecting all the repeat events in PHP, however, I imagine this being very slow.
Here is what I would like to see if looking at the month of April:
+----------+--------------+----+------------------------+---------------------+---------------------+
| event_id | repeat_value | id | name | start_date_time | end_date_time |
+----------+--------------+----+------------------------+---------------------+---------------------+
| 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
+----------+--------------+----+------------------------+---------------------+---------------------+
Here is what I would like to see if looking at the month of May
+----------+--------------+----+------------------------+---------------------+---------------------+
| event_id | repeat_value | id | name | start_date_time | end_date_time |
+----------+--------------+----+------------------------+---------------------+---------------------+
| 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 12 | *_*_2 | 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 13 | *_*_4/2 | 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 14 | 23_*_* | 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
| 15 | 30_05_* | 15 | Repeat yearly | 2014-05-30 07:30:00 | 2014-05-30 10:15:00 |
+----------+--------------+----+------------------------+---------------------+---------------------+
Here is what I would like to see if looking at the month of June
+----------+--------------+----+------------------------+---------------------+---------------------+
| event_id | repeat_value | id | name | start_date_time | end_date_time |
+----------+--------------+----+------------------------+---------------------+---------------------+
| 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 12 | *_*_2 | 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 13 | *_*_4/2 | 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 14 | 23_*_* | 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
+----------+--------------+----+------------------------+---------------------+---------------------+
You could put a bandaid on this, but no one would be doing you any favors to tell you that that is the answer.
If your MySQL database can be changed I would strongly advise you to split your current column with underscores day_month_day of year to three separate columns, day, month, and day_of_year. I would also advise you to change your format to be INT rather than VARCHAR. This will make it faster and MUCH easier to search and parse, because it is designed in a way that doesn't need to be translated into computer language through complicated programs... It is most of the way there already.
Here's why:
Reason 1: Your Table is not Optimized
Your table is not optimized and will be slowed regardless of what you choose to do at this stage. SQL is not built to have multiple values in one column. The entire point of an SQL database is to split values into different columns and rows.
The advantage to normalizing this table is that it will be far quicker to search it, and you will be able to build queries in MySQL. Take a look at Normalization. It is a complicated concept, but once you get it you will avoid creating messy and complicated programs.
Reason 2: Your Table could be tweaked slightly to harness the power of computer date/time functions.
Computers follow time based on Unix Epoch Time. It counts the seconds elapsed since 00:00:00 UTC on 1 January 1970 and is always running in your computer. Further, each computer and computer-based program/system has built-in, quick date and time functions. MySQL is no different.
I would also recommend storing all of these as integers. repeat_doy (day of year) can easily be a smallint or at least a standard int, and instead of putting a month and day, you can put the actual 1-365 day of the year. You can use DAYOFYEAR(NOW()) to input this into MySQL. To pull it back out as a date you can use MAKEDATE(YEAR(NOW()), repeat_doy). Instead of an asterisk to signify all, you can use either 0 or NULL.
With a cron like system you probably will not need to do that sort of calculation anyway.
Instead, it will probably be easier to just measure the day of year elsewhere (every computer and language can do this; in Unix it is just date +%j).
Solution
Split your one repeat_value into three separate values and turn them all into integers based on UNIX time values. Day is 1-7 (or 0-6 for Sunday to Saturday), Month is 1-12, and day of year is 1-365 (remember, we are not including 366 because we are basing our year on an arbitrary non-leap year).
If you want to pull information in your SELECT query in your original format, it is much easier to use concat to merge the three columns than it is to try to search and split on one column. You can also easily harness built in MySQL functions to quickly turn what you pull into real, current, days, without a bunch of effort on your part.
To implement it in your SQL database:
+----+----------+--------------+--------------+------------+
| id | event_id | repeat_day | repeat_month | repeat_doy |
+----+----------+--------------+--------------+------------+
| 1 | 11 | * | * | * |
| 2 | 12 | * | * | 2 |
| 3 | 13 | * | * | 4/2 |
| 4 | 14 | 23 | * | * |
| 5 | 15 | 30 | 5 | * |
+----+----------+--------------+--------------+------------+
Now you should be able to build one query to get all of this data together regardless of how complicated your query. By normalizing your table, you will be able to fully harness the power of relational databases, without the headaches and hacks.
Edit
Hugo Delsing made a great point in the comments below. In my initial example I provided a fix to leap years for day_of_year in which I chose to ignore Feb 29. A much better solution removes the need for a fix. Split day_of_year to month and day with a compound index. He also has a suggestion about weeks and number of weeks, but I will just recommend you read it for more details.
Try to write where condition using this:
substring_index(repeat_value,'_', 1)
instead of daily
substring_index(substring_index(repeat_value,'_', -2), '_', 1)
instead of monthly
and
substring_index(substring_index(repeat_value,'_', -1), '_', 1)
instead of dayOfWeek
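To make it easier to see what the three expressions return, here is a minimal sketch that emulates MySQL's SUBSTRING_INDEX in Python and applies the same calls to the sample repeat values. The Python function is an illustration of the documented behaviour (a positive count keeps everything left of the count-th delimiter, a negative count keeps everything right of it, counting from the end); it is not MySQL's own code.

```python
def substring_index(text, delim, count):
    """Emulate MySQL SUBSTRING_INDEX(text, delim, count)."""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])   # left of the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])   # right of the count-th delimiter from the end
    return ""                              # count = 0 yields an empty string


def split_repeat_value(repeat_value):
    """Apply the three expressions from the answer to one repeat_value."""
    daily = substring_index(repeat_value, "_", 1)
    month = substring_index(substring_index(repeat_value, "_", -2), "_", 1)
    day_of_week = substring_index(substring_index(repeat_value, "_", -1), "_", 1)
    return daily, month, day_of_week


for value in ["*_*_*", "*_*_2", "*_*_4/2", "23_*_*", "30_05_*"]:
    print(value, "->", split_repeat_value(value))
```

For example, 30_05_* splits into 30, 05 and *, and *_*_4/2 keeps 4/2 intact as the day-of-week part.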
I think you are overthinking the problem if you only want the events per month and not per day. Assuming that you always correctly fill the repeat_value, the query is very basic.
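Basically, an event occurs in every month if its repeat_value is LIKE '%_*_%', and in one particular month if it is LIKE '%_{month}_%' (with the underscores escaped, as in the code below, because _ is itself a LIKE wildcard).
Since you mention PHP, I'm assuming you are building the query in PHP, so I used it as well.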
<?php
function buildQuery($searchDate) {
//you could/should do some more checking if the date is valid if the user provides the string
$searchDate = empty($searchDate) ? date("Y-m-d") : $searchDate;
$splitDate = explode('-', $searchDate);
//zero-pad the month so that e.g. 5 matches the stored '05'
$month = str_pad($splitDate[1], 2, '0', STR_PAD_LEFT);
//Select everything that started before the search date
//the \_ is needed because otherwise the _ would match any single character.
$query = 'SELECT *
FROM RepeatEvent
JOIN `Event`
ON `Event`.`id` = `RepeatEvent`.`event_id`
WHERE `Event`.`start_date_time` < \''.$searchDate.'\'
AND
(
`RepeatEvent`.`repeat_value` LIKE \'%\_'.$month.'\_%\'
OR `RepeatEvent`.`repeat_value` LIKE \'%\_*\_%\'
)
';
return $query;
}
//show querys for all months on current day/year
for ($month = 1; $month<=12; $month++) {
echo buildQuery(date('Y-'.$month.'-d')) . '<hr>';
}
?>
Now if the repeat_value could be wrong, you could add a simple regex check to make sure the value is always like *_*_* or *_*_*/*
You can use basic regular expressions in MySQL:
http://dev.mysql.com/doc/refman/5.0/en/pattern-matching.html
For a monthly event in May (first day) you can use a pattern like this (not tested):
[0-9\*]+\_[5\*]\_1
You can generate this pattern via PHP

Place a footer row as an header row of a group of rows

I have a table like the following :
rfa_yea | rfa_idx | rfa_dsp | rfa_tpr
---------+---------+----------------------------------------------------+---------
2013 | 1 | PIGATO VERM.NO/ROSS/ORMEASCO CL75 | A
2013 | 2 | ESTATE\134134134047 BICCHIERE SING.VERDE | A
2013 | 3 | Rif. Trn. N. 17 del 17/04/2013 Cassa N. 00001 | C
2013 | 4 | BIB.RED BULL LAT.CL25 ENER.DRI | A
2013 | 5 | BIB.RED BULL LAT.CL25 ENER.DRI | A
2013 | 6 | SHOPPER 30X60 MAXI X 1000 | A
2013 | 7 | SHOPPER HD 27X50 MEDIE X 1000 | A
2013 | 8 | PIGATO VERM.NO/ROSS/ORMEASCO CL75 | A
2013 | 9 | * SCONTO SUBTOTALE | A
2013 | 10 | Rif. Trn. N. 19 del 17/04/2013 Cassa N. 00001 | C
The record whose rfa_tpr field is 'C' is the header of the group of rows that comes before it. I need to place that row at the top of its group, as a header, instead of at the bottom as a footer (separator) as it is at the moment, so I want to retrieve a result set like the following:
rfa_yea | rfa_idx | rfa_dsp | rfa_tpr
---------+---------+----------------------------------------------------+---------
2013 | 3 | Rif. Trn. N. 17 del 17/04/2013 Cassa N. 00001 | C
2013 | 1 | PIGATO VERM.NO/ROSS/ORMEASCO CL75 | A
2013 | 2 | ESTATE\134134134047 BICCHIERE SING.VERDE | A
2013 | 10 | Rif. Trn. N. 19 del 17/04/2013 Cassa N. 00001 | C
2013 | 4 | BIB.RED BULL LAT.CL25 ENER.DRI | A
2013 | 5 | BIB.RED BULL LAT.CL25 ENER.DRI | A
2013 | 6 | SHOPPER 30X60 MAXI X 1000 | A
2013 | 7 | SHOPPER HD 27X50 MEDIE X 1000 | A
2013 | 8 | PIGATO VERM.NO/ROSS/ORMEASCO CL75 | A
2013 | 9 | * SCONTO SUBTOTALE | A
Is there a solution using only SQL? The solution should work on each of these database servers: MSSQL, PostgreSQL and MySQL.
Note
I can have multiple separators(footers) rows, not only two as in the example ...
SELECT a.rfa_yea ,
a.rfa_idx ,
a.rfa_dsp ,
a.rfa_tpr
FROM table1 a
INNER JOIN table1 c ON a.rfa_idx <= c.rfa_idx AND c.rfa_tpr = 'C'
GROUP BY a.rfa_yea ,
a.rfa_idx ,
a.rfa_dsp ,
a.rfa_tpr
ORDER BY MIN(c.rfa_idx), a.rfa_tpr DESC, a.rfa_idx
SQL Server Demo
MySQL Demo
PostgreSQL Demo
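The ordering produced by the join above can be reproduced with a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the three servers named in the question (an assumption). The rfa_dsp values are placeholders rather than the real descriptions; only rfa_idx and rfa_tpr matter for the ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (rfa_yea INTEGER, rfa_idx INTEGER, rfa_dsp TEXT, rfa_tpr TEXT)"
)
# Rows 3 and 10 are the 'C' separator rows, as in the question.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(2013, idx, f"row {idx}", "C" if idx in (3, 10) else "A") for idx in range(1, 11)],
)

# Each row is joined to every 'C' row at or after it; MIN(c.rfa_idx) is then
# the separator that closes its group, and sorting rfa_tpr descending puts
# that 'C' row ahead of the 'A' rows of the same group.
query = """
    SELECT a.rfa_yea, a.rfa_idx, a.rfa_dsp, a.rfa_tpr
    FROM table1 a
    INNER JOIN table1 c ON a.rfa_idx <= c.rfa_idx AND c.rfa_tpr = 'C'
    GROUP BY a.rfa_yea, a.rfa_idx, a.rfa_dsp, a.rfa_tpr
    ORDER BY MIN(c.rfa_idx), a.rfa_tpr DESC, a.rfa_idx
"""
ordered = [(idx, tpr) for _, idx, _, tpr in conn.execute(query)]
print([idx for idx, _ in ordered])  # [3, 1, 2, 10, 4, 5, 6, 7, 8, 9]
```

Note that rows coming after the last 'C' row have no separator to join to, so the inner join drops them.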
I found a solution myself. Here is the SQL query, in case someone needs to do something similar in the future:
SELECT * FROM righfatture r
LEFT JOIN (
SELECT r1.rfa_tpr,r1.rfa_idx, COALESCE(r2.rfa_idx, -1) AS IDXP FROM righfatture r1
LEFT JOIN righfatture r2 ON r2.rfa_idx < r1.rfa_idx AND r2.rfa_tpr = r1.rfa_tpr
WHERE r1.rfa_tpr = 'C'
) j ON j.rfa_tpr = r.rfa_tpr AND r.rfa_idx = j.rfa_idx
ORDER BY CASE WHEN j.rfa_tpr IS NOT NULL THEN j.IDXP ELSE r.rfa_idx END
Solution is easy, problem is the problem...
Solution under assumption that last year should come first and rfa_idx comes behind rfa_yea.
select * from table1 order by rfa_yea desc, find_in_set(rfa_tpr, "C,A"), rfa_idx;
Problem is that you should not rely too much on increasing ids and the already mentioned design questions.
Marco
I think this is a database design issue. The "groups of rows" you are talking about have nothing in common besides the order of the rows.
I would suggest adding a field and inserting a common value for each of these groups of rows, if possible. While this is not a convenient solution, I think a database design should reflect the way you want the data to be organized.