Getting a syntax error when trying to join two tables - mysql

I have created the following tables:
USER TABLE
user_id (primary key)
account_created (date)
email (varchar)
usage_count (number)
PRODUCT TABLE
product_id (primary key)
product (varchar) (values include “iPhone”, “Android”, “Windows”)
users_supported (number) (users supported notes: some phones can support group calls up to 1000 users, some can only support normal calls of 2 users)
USAGE TABLE
usage_id (primary key)
product_id (foreign key)
user_id (foreign key)
usage_date (date)
purchase_call (number) (can be a 0, 2, 4, 6, or 10 min call)
usage_winnings (number) (when users use their minutes, sometimes they will randomly earn cash back)
computer_usage (binary value) (users can link the phone to a computer, and make calls through their computer, similar to google voice)
I want to write a select statement with the following constraints:
Time frame between 2014 and 2016
% of calls made for 2 users
% of purchased minutes used for only 2 users
Only in the first 30 days after a user created their account
In each year between 2014 and 2016, what percentage of calls and purchased calls were for only 2 users in each user's first 30 days after they created their account.
I have been practicing joins and what I have is:
SELECT COUNT(p.users_supported = 2)/COUNT(p.users_supported), SUM(CASE WHEN users_supported = 2 THEN us.purchase_call ELSE 0 END)/SUM(CASE WHEN users_supported <> 2 THEN us.purchase_call ELSE 0 END)
FROM USERS u
JOIN USAGE us ON u.user_id = us.user_id
JOIN PRODUCT p ON p.product_id = us.product_id
WHERE u.account_created >= '2014-01-01'
AND u.account_created <= '2016-12-31'
AND u.account_created <= u.account_created + 30
I have several errors right now: the percentages are not coming out correctly, and the 30-day account-creation constraint is causing an error that breaks the whole query. Any suggestions would be much appreciated!

You did not state which database you are using...
The first thing I notice is that
AND u.account_created <= u.account_created + 30
is always true, because it compares the column with itself. I think you want a condition based on the current date, like
AND u.account_created > NOW() - INTERVAL 30 DAY
If you are using SQL Server, you could use the DATEDIFF function and check for a result < 30.
Since it was pointed out that you tagged this question mysql, TIMESTAMPDIFF will work. Here is an example showing the syntax together with the NOW() function.
mysql> select TIMESTAMPDIFF(day,NOW(),'20161206');
+-------------------------------------+
| TIMESTAMPDIFF(day,NOW(),'20161206') |
+-------------------------------------+
|                                 -17 |
+-------------------------------------+
1 row in set (0.04 sec)
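If the 30-day window is meant relative to each user's own account_created date (as the question's wording suggests) rather than to today, the comparison has to involve usage_date. The following is only a sketch under that assumption: it uses Python's sqlite3 as a stand-in for MySQL, the sample rows are invented, and julianday() / strftime() are SQLite functions (in MySQL, DATEDIFF(us.usage_date, u.account_created) and YEAR(us.usage_date) would play the same roles). As far as I know USAGE is also a reserved word in MySQL, so that table name may need backticks there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users   (user_id INTEGER PRIMARY KEY, account_created TEXT);
CREATE TABLE product (product_id INTEGER PRIMARY KEY, users_supported INTEGER);
CREATE TABLE usage   (usage_id INTEGER PRIMARY KEY, product_id INTEGER,
                      user_id INTEGER, usage_date TEXT, purchase_call INTEGER);

-- Invented sample rows, for illustration only.
INSERT INTO users   VALUES (1, '2014-03-01'), (2, '2015-06-10');
INSERT INTO product VALUES (1, 2), (2, 1000);
INSERT INTO usage   VALUES
  (1, 1, 1, '2014-03-05', 4),   -- 2-user call, inside the first 30 days
  (2, 2, 1, '2014-03-20', 6),   -- group call, inside the first 30 days
  (3, 1, 1, '2014-06-01', 10),  -- outside the 30-day window, ignored
  (4, 1, 2, '2015-06-15', 2);   -- 2-user call, inside the first 30 days
""")

sql = """
SELECT strftime('%Y', us.usage_date)                  AS yr,
       100.0 * SUM(p.users_supported = 2) / COUNT(*)  AS pct_calls,
       100.0 * SUM(CASE WHEN p.users_supported = 2
                        THEN us.purchase_call ELSE 0 END)
             / SUM(us.purchase_call)                  AS pct_minutes
FROM users u
JOIN usage   us ON us.user_id = u.user_id
JOIN product p  ON p.product_id = us.product_id
WHERE us.usage_date BETWEEN '2014-01-01' AND '2016-12-31'
  -- usage must fall within 30 days after the account was created
  AND julianday(us.usage_date) - julianday(u.account_created) BETWEEN 0 AND 30
GROUP BY yr
ORDER BY yr
"""
rows = conn.execute(sql).fetchall()
print(rows)  # [('2014', 50.0, 40.0), ('2015', 100.0, 100.0)]
```

Note that the denominator here is all calls (or all purchased minutes) in the window, not only the non-2-user ones, which is what makes the result a percentage of the total.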

MySQL - SQL select query with two tables using where, count and having

There are two tables: client and contract.
client table:
client_code INT pk
status VARCHAR
A client can have 1 or more contracts. The client has a status column which specifies if it has valid contracts - the values are 'active' or 'inactive'. The contract is specified for a client with active status.
contract table:
contract_code INT pk
client_code INT pk
end_date DATE
A contract has an end date. A contract end date before today is an expired contract.
REQUIREMENT: A report requires all active clients with contracts, but with all (not some) contracts having expired date. Some example data is shown below:
Client data:
client_code   status
----------------------
1             active
2             inactive
3             active
4             active
Contract data:
contract_code   client_code   end_date
------------------------------------------
11              1             08-12-2018
12              1             09-12-2018
13              1             10-12-2018
31              3             11-31-2018
32              3             10-30-2018
41              4             01-31-2019
42              4             12-31-2018
Expected result:
client_code
-------------
1
RESULT: This client (client_code = 1) has all contracts with expired dates: 08-12-2018, 09-12-2018 and 10-12-2018.
I need some help writing a SQL query to get this result. I am not sure which constructs I have to use, so pointers on what to try are welcome. The database is MySQL 5.5.
One approach uses aggregation. We can join together the client and contract tables, then aggregate by client, checking that, for an active client, there exist no contract end dates which occur in the future.
SELECT
c.client_code
FROM client c
INNER JOIN contract co
ON c.client_code = co.client_code
WHERE
c.status = 'active'
GROUP BY
c.client_code
HAVING
SUM(CASE WHEN co.end_date > CURDATE() THEN 1 ELSE 0 END) = 0;
Demo
Note: I am assuming that your dates are appearing in M-D-Y format simply due to the particular formatting, and that end_date is actually a proper date column. If instead you are storing your dates as text, then we might have to make a call to STR_TO_DATE to convert them to dates first.
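To make the HAVING logic above easy to check, here is a small sketch that runs the same aggregation against the question's sample rows. It uses Python's sqlite3 instead of MySQL, passes "today" as a parameter in place of CURDATE(), and stores dates as ISO strings so that text comparison orders them correctly. The helper name is hypothetical, and the invalid sample date 11-31-2018 is entered as 2018-11-30, which is my assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE client   (client_code INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE contract (contract_code INTEGER PRIMARY KEY,
                       client_code INTEGER, end_date TEXT);
INSERT INTO client VALUES (1, 'active'), (2, 'inactive'),
                          (3, 'active'), (4, 'active');
INSERT INTO contract VALUES
  (11, 1, '2018-08-12'), (12, 1, '2018-09-12'), (13, 1, '2018-10-12'),
  (31, 3, '2018-11-30'),  -- 11-31-2018 in the question; assumed to mean Nov 30
  (32, 3, '2018-10-30'),
  (41, 4, '2019-01-31'), (42, 4, '2018-12-31');
""")

def clients_with_only_expired_contracts(conn, today):
    """Active clients that have contracts, none of which ends after `today`."""
    sql = """
        SELECT c.client_code
        FROM client c
        JOIN contract co ON co.client_code = c.client_code
        WHERE c.status = 'active'
        GROUP BY c.client_code
        -- count contracts that are still running; require that count to be 0
        HAVING SUM(CASE WHEN co.end_date > ? THEN 1 ELSE 0 END) = 0
        ORDER BY c.client_code
    """
    return [row[0] for row in conn.execute(sql, (today,))]

print(clients_with_only_expired_contracts(conn, "2018-10-15"))  # [1]
```

With "today" set to 2018-10-15 only client 1 qualifies, which matches the expected result in the question; an equivalent condition would be MAX(end_date) <= today.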
Is that what you're looking for?
select client.client_code
from client
join contract
on contract.client_code=client.client_code
where status='active'
group by client.client_code
having max(end_date)<curdate()

Get the detail of overlapping bookings mysql

I am developing a stable booking system where users can book and update their room bookings and choose stables from an interactive map.
My stable registration DB structure is as follows:
event_detail_stable_registrations
| id | accountId | eventId | stableId | checkInDate | checkOutDate |
| 5  | 233       | 55      | 66       | 26-06-2017  | 28-06-2017   |
| 6  | 234       | 55      | 66       | 28-06-2017  | 29-06-2017   |
When a user updates a booking without changing the checkInDate and checkOutDate, it is an easy scenario, which I have already implemented.
In the case above, if user 234 updates the booking and changes the checkInDate, the query should return 233 for stableId 66, but my query returns '234' as the accountId.
Another scenario is when a user changes the checkInDate and/or checkOutDate of the registration. Suppose user A wants to change the booking details: how can I check whether the updated checkInDate and/or checkOutDate overlap an existing booking and, if those dates are already booked, which accountId has booked them?
Right now I am running the following query, which gives me correct information about overlapping dates, but I cannot get the account that made the overlapping booking.
The query always returns the user's own accountId for the overlapping dates as well.
SET @checkInDate = '2017-04-27 14:00:00' , @checkOutDate = '2017-05-01 10:00:00' ;
SELECT a.*,
IF(b.`stableid` IS NULL,"Available","Not Available") as `status`,
IF(NOT b.`stableid` IS NULL,b.`accountId`,"") as `overLapAccount`,
IF(NOT b.`stableid` IS NULL,b.`checkInDate`,"") as `start_overlap`,
IF(NOT b.`stableid` IS NULL,b.`checkOutDate`,"") as `end_overlap`
FROM `event_detail_stable_registrations` b
LEFT JOIN `stables` a
ON a.`id` = b.`stableid` AND
(((`checkInDate` BETWEEN @checkInDate AND @checkOutDate)
OR (`checkOutDate` BETWEEN @checkInDate AND @checkOutDate))
OR ((@checkInDate BETWEEN `checkInDate` AND `checkOutDate`)
OR (@checkOutDate BETWEEN `checkInDate` AND `checkOutDate`))
)
ORDER BY a.`name`
Here is the SQL fiddle; it does not use exactly the same DB structure, but it is similar.
The output for the same stable booked by multiple accounts in the given period is fine, but with the query I am using I get NULL in the name and stableId columns.
Part 2 of your scenario:
MariaDB-10.5 Application Time Periods - WITHOUT OVERLAPS can enforce the constraints of non-overlapping bookings:
ALTER TABLE event_detail_stable_registrations
ADD period FOR booking(checkInDate, checkOutDate),
ADD PRIMARY KEY (accountId, stableId, booking WITHOUT OVERLAPS)
A user cancelling a day is just:
DELETE
FROM event_detail_stable_registrations
FOR PORTION OF booking
FROM '2017-04-29 14:00:00' TO '2017-04-30 10:00:00'
WHERE accountId=5
AND stableId=866
This splits the booking entry so that the remaining, non-deleted dates are kept as separate periods.
Any UPDATE of a booking is also checked against the constraint.
ref: dbfiddle
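For the first part of the question (finding which other account holds the conflicting booking), a plain overlap test that skips the row being edited may be enough. This is a minimal sketch, not the asker's schema: the table name registrations and the helper conflicting_bookings are hypothetical, sqlite3 stands in for MySQL, dates are stored as ISO strings, and I assume a check-out day may coincide with someone else's check-in day (as rows 5 and 6 in the question do).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE registrations (id INTEGER PRIMARY KEY, accountId INTEGER,
                            stableId INTEGER, checkInDate TEXT,
                            checkOutDate TEXT);
INSERT INTO registrations VALUES
  (5, 233, 66, '2017-06-26', '2017-06-28'),
  (6, 234, 66, '2017-06-28', '2017-06-29');
""")

def conflicting_bookings(conn, registration_id, stable_id, check_in, check_out):
    """Return other registrations of the stable that overlap the new dates."""
    sql = """
        SELECT accountId, checkInDate, checkOutDate
        FROM registrations
        WHERE stableId = ?
          AND id <> ?            -- ignore the booking that is being edited
          AND checkInDate  < ?   -- existing booking starts before the new one ends
          AND checkOutDate > ?   -- existing booking ends after the new one starts
        ORDER BY checkInDate
    """
    params = (stable_id, registration_id, check_out, check_in)
    return conn.execute(sql, params).fetchall()

# Account 234 (registration 6) tries to check in one day earlier.
print(conflicting_bookings(conn, 6, 66, "2017-06-27", "2017-06-29"))
# [(233, '2017-06-26', '2017-06-28')]
```

Two intervals overlap exactly when each one starts before the other ends, so the four BETWEEN checks in the question collapse into two comparisons; excluding the edited row by id is what keeps the user's own accountId out of the result.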

MySQL Order By and a Limit 1 subquery to return most recent record

I've noticed there are a few similar questions on StackOverflow, but nothing has worked for me so far. I'll try to keep this as short as possible.
I am building a query that needs to return a number of issues that may or may not have contracts, that may or may not be completed (completed_at would be set to a DateTime, not nil). Each row needs to include:
one row containing all the issue record's fields
the description from the budget_item
the completed_at date from the most recent contract that was completed (one budget_item could have 0 contracts, 1 contract, or 5+ contracts, and any number of them could be open (completed_at: nil) or closed (completed_at: DateTime))
This is what I have so far (it returns the correct number of rows, but it is returning the most recently created contract, not the most recently completed one):
BaseItem.issues
.joins('LEFT JOIN budget_items
ON issues.id = budget_items.issue_id
LEFT JOIN contracts
ON budget_items.id = contracts.budget_item_id')
.select('issues.*, budget_items.description, contracts.completed_at AS resolved_at')
.group('issues.id')
.order('contracts.completed_at')
The code in the models is as follows:
class BaseItem < ActiveRecord::Base
has_many :issues
...
end
class Issue < ActiveRecord::Base
belongs_to :base_item
has_many :budget_items
...
end
class BudgetItem < ActiveRecord::Base
belongs_to :issue
has_many :contracts
...
end
class Contract < ActiveRecord::Base
belongs_to :budget_item
...
end
The end result needs to be something along the line of:
There will likely be multiple issues making up the different rows. Each issue has at least four budget_items, which are used only for the budget_item.description that needs to appear in the final query, and which are then used to join each issue to its many contracts (each budget_item could have 2 or 3 contracts, so the issue could end up having 8-12 contracts). From those contracts, the query needs to order them according to their completed_at attribute and return AS resolved_at only the most recent contract's completed_at date. If there were 4 contracts and two had completed_at: nil, the query should return the most recent of the two remaining completed_at dates as the resolved_at field of that particular issue.
Any help would be greatly appreciated and please let me know if I need to provide any additional information.
-Dave
The resulting query (from a comment):
SELECT issues.*, budget_items.description, contracts.completed_at AS resolved_at
FROM issues
LEFT JOIN budget_items ON issues.id = budget_items.issue_id
LEFT JOIN contracts ON budget_items.id = contracts.budget_item_id
WHERE issues.base_item_id = 6
GROUP BY issues.id
ORDER BY contracts.completed_at DESC
LIMIT 1
You don't need to show your actual data in the sample to make it useful...
This is what I mean by it. You could have put into your question the following sample data:
issues
id  base_item_id
--  ------------
10  6
20  6
30  6
99  123
budget_items
id  issue_id  description
--  --------  -----------
1   10        'one contract, none completed'
2   20        'one contract, one completed'
3   30        'two contracts, one completed'
4   30        'three contracts, two completed'
contracts
id  budget_item_id  completed_at
--  --------------  ------------
1   1               NULL
2   2               2015-01-02
3   3               2015-01-03
4   3               NULL
5   4               2015-01-05
6   4               NULL
7   4               2015-01-07
expected result
issues.id  contracts.completed_at  budget_item.description
---------  ----------------------  -----------------------
10         NULL                    NULL
20         2015-01-02              one contract, one completed
30         2015-01-07              three contracts, two completed
Here is SQL Fiddle.
Is this what you want? Does my sample data cover all possible edge cases? If not, add more rows to it and show how they affect the result.
This is what the final query might look like. MySQL versions before 8.0.14 don't have things like CROSS APPLY or LATERAL joins, so it is less efficient than in other databases: the subquery will run twice.
I have no idea how to translate this SQL to Ruby; I have never used Ruby.
SELECT
issues.*
,(
SELECT contracts.completed_at
FROM
budget_items
INNER JOIN contracts ON contracts.budget_item_id = budget_items.id
WHERE
budget_items.issue_id = issues.id
AND contracts.completed_at IS NOT NULL
ORDER BY contracts.completed_at DESC
LIMIT 1
) AS resolved_at
,(
SELECT budget_items.description
FROM
budget_items
INNER JOIN contracts ON contracts.budget_item_id = budget_items.id
WHERE
budget_items.issue_id = issues.id
AND contracts.completed_at IS NOT NULL
ORDER BY contracts.completed_at DESC
LIMIT 1
) AS description
FROM issues
WHERE issues.base_item_id = 6
The main idea is simple. We return one row for each issue. For each issue we find the single latest contract using whatever conditions you need (like contracts.completed_at IS NOT NULL to look at completed contracts only).
If there are no completed contracts at all for an issue, it returns NULL for description and resolved_at. You can add an extra filter to the main query to remove such rows if that is what you want. Note that MySQL does not allow a select alias in WHERE, so filter on the alias in HAVING instead (WHERE issues.base_item_id = 6 HAVING resolved_at IS NOT NULL).
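The correlated-subquery approach can be checked against the sample data from this answer. The sketch below is only a verification aid: it runs under Python's sqlite3 rather than MySQL, selects issues.id instead of issues.* to keep the output short, and adds an ORDER BY on the outer query so the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE issues       (id INTEGER PRIMARY KEY, base_item_id INTEGER);
CREATE TABLE budget_items (id INTEGER PRIMARY KEY, issue_id INTEGER,
                           description TEXT);
CREATE TABLE contracts    (id INTEGER PRIMARY KEY, budget_item_id INTEGER,
                           completed_at TEXT);
INSERT INTO issues VALUES (10, 6), (20, 6), (30, 6), (99, 123);
INSERT INTO budget_items VALUES
  (1, 10, 'one contract, none completed'),
  (2, 20, 'one contract, one completed'),
  (3, 30, 'two contracts, one completed'),
  (4, 30, 'three contracts, two completed');
INSERT INTO contracts VALUES
  (1, 1, NULL), (2, 2, '2015-01-02'), (3, 3, '2015-01-03'), (4, 3, NULL),
  (5, 4, '2015-01-05'), (6, 4, NULL), (7, 4, '2015-01-07');
""")

# Both subqueries pick the same "latest completed contract" row per issue:
# one returns its completed_at, the other the owning budget item's description.
latest = """
    FROM budget_items
    JOIN contracts ON contracts.budget_item_id = budget_items.id
    WHERE budget_items.issue_id = issues.id
      AND contracts.completed_at IS NOT NULL
    ORDER BY contracts.completed_at DESC
    LIMIT 1
"""
sql = f"""
    SELECT issues.id,
           (SELECT contracts.completed_at {latest})   AS resolved_at,
           (SELECT budget_items.description {latest}) AS description
    FROM issues
    WHERE issues.base_item_id = 6
    ORDER BY issues.id
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
# (10, None, None)
# (20, '2015-01-02', 'one contract, one completed')
# (30, '2015-01-07', 'three contracts, two completed')
```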
.order('contracts.completed_at DESC LIMIT 1')
(or do I not understand what is missing from your code?)

How can I find days between different paired rows?

I've been racking my brain about how to do this in one query without PHP code.
In a nutshell, I have a table that records email activity. For the sake of this example, here is the data:
recipient_id  activity   date
1             delivered  2011-08-30
1             open       2011-08-31
2             delivered  2011-08-30
3             delivered  2011-08-24
3             open       2011-08-30
3             open       2011-08-31
The goal: I want to display to users a single number that tells how many recipients open their email within 24 hours.
E.G. "Users that open their email within 24 hours: 13 Readers"
In the case of the sample data, above, the value would be "1". (Recipient one was delivered an email and opened it the next day. Recipient 2 never opened it and recipient 3 waited 5 days.)
Can anyone think of a way to express the goal in a single query?
Reminder: In order to count, the person must have a 'delivered' tag and at least one 'open' tag. Each 'open' tag only counts once per recipient.
** EDIT ** Sorry, I'm using MySQL
Here is a version in mysql.
select count(distinct recipient_id)
from email e1
where e1.activity = 'delivered'
and exists
(select * from email e2
where e1.recipient_id = e2.recipient_id
and e2.activity = 'open'
and datediff(e2.action_date,e1.action_date) <= 1)
The basic principle is that you want to find a delivered row for a recipient that also has an open within 24 hours.
The datediff() is a good way to do the date arithmetic in mysql -- other dbs will vary on exact methods for this step. The rest of the sql will work anywhere.
SQLFiddle here: http://sqlfiddle.com/#!2/c9116/4
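The EXISTS query can be tried on the question's sample rows without a MySQL server. This sketch assumes the table and column names used in the answer (email, action_date), swaps DATEDIFF for a julianday() difference because it runs on SQLite via Python's sqlite3, and additionally requires the open to come on or after the delivery, which is my own assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE email (recipient_id INTEGER, activity TEXT, action_date TEXT)"
)
conn.executemany(
    "INSERT INTO email VALUES (?, ?, ?)",
    [
        (1, "delivered", "2011-08-30"),
        (1, "open", "2011-08-31"),
        (2, "delivered", "2011-08-30"),
        (3, "delivered", "2011-08-24"),
        (3, "open", "2011-08-30"),
        (3, "open", "2011-08-31"),
    ],
)

sql = """
    SELECT COUNT(DISTINCT d.recipient_id)
    FROM email d
    WHERE d.activity = 'delivered'
      AND EXISTS (
          SELECT 1
          FROM email o
          WHERE o.recipient_id = d.recipient_id
            AND o.activity = 'open'
            -- opened no earlier than delivery and at most one day later
            AND julianday(o.action_date) - julianday(d.action_date)
                BETWEEN 0 AND 1
      )
"""
readers = conn.execute(sql).fetchone()[0]
print(f"Users that open their email within 24 hours: {readers} Readers")
# Users that open their email within 24 hours: 1 Readers
```

Because the column holds dates without a time of day, "within 24 hours" can only be approximated as "same day or next day" here.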
Untested, but should work ;) Don't know which SQL dialect you use, so I've used TSQL DATEDIFF function.
select distinct opened.recipient_id -- or count(distinct opened.recipient_id) if you want to know number
from actions as opened
inner join actions as delivered
on opened.recipient_id = delivered.recipient_id and delivered.activity = 'delivered'
where opened.activity = 'open' and DATEDIFF(day, delivered.date, opened.date) <= 1
Edit: I'd confused opened with delivered - now replaced.
Assumptions: MySql, table is called "TABLE"
Ok, I am not 100% on this, because I don't have a copy of the table to run it against, but I think that you could do something like this:
SELECT COUNT(DISTINCT t1.recipient_id) FROM TABLE t1
INNER JOIN TABLE t2 ON t1.recipient_id = t2.recipient_id AND t1.activity != t2.activity
WHERE t1.activity in ('delivered', 'open') AND t2.activity in ('delivered', 'open')
AND ABS(DATEDIFF(t1.date, t2.date)) = 1
Basically, you are joining a table onto itself, where the activities don't match, but recipient_ids do, and the status is either 'delivered' or 'open'. What you would end up getting, is a result that looks like this:
1 delivered 2011-08-30 1 open 2011-08-31
You are then doing a diff between the two dates (with an absolute value, because we don't know which order they will be in) and making sure that it is equal to 1 (or 24 hours).

mysql update with a self referencing query

I have a table of surveys which contains (amongst others) the following columns
survey_id - unique id
user_id - the id of the person the survey relates to
created - datetime
ip_address - of the submission
ip_count - the number of duplicates
Due to the large record set, it's impractical to run this query on the fly, so I am trying to create an update statement which will periodically store a "cached" result in ip_count.
The purpose of ip_count is to show the number of duplicate ip_address survey submissions that have been received for the same user_id within a 12 month period (+/- 6 months of the created date).
Using the following dataset, this is the expected result.
survey_id  user_id  created    ip_address   ip_count  # counted duplicates (survey_id)
1          1        01-Jan-12  123.132.123  1         # 2
2          1        01-Apr-12  123.132.123  2         # 1, 3
3          2        01-Jul-12  123.132.123  0         #
4          1        01-Aug-12  123.132.123  3         # 2, 6
6          1        01-Dec-12  123.132.123  1         # 4
This is the closest solution I have come up with so far, but the query fails to take the date restriction into account and I am struggling to come up with an alternative method.
UPDATE surveys
JOIN(
SELECT ip_address, created, user_id, COUNT(*) AS total
FROM surveys
WHERE surveys.state IN (1, 3) # survey is marked as completed and confirmed
GROUP BY ip_address, user_id
) AS ipCount
ON (
ipCount.ip_address = surveys.ip_address
AND ipCount.user_id = surveys.user_id
AND ipCount.created BETWEEN (surveys.created - INTERVAL 6 MONTH) AND (surveys.created + INTERVAL 6 MONTH)
)
SET surveys.ip_count = ipCount.total - 1 # minus 1 as this query will match on its own id.
WHERE surveys.ip_address IS NOT NULL # ignore surveys where we have no ip_address
Thank you for your help in advance :)
A few (very) minor tweaks to what is shown above. Thank you again!
UPDATE surveys AS s
INNER JOIN (
SELECT x, count(*) c
FROM (
SELECT s1.id AS x, s2.id AS y
FROM surveys AS s1, surveys AS s2
WHERE s1.state IN (1, 3) # completed and verified
AND s1.id != s2.id # dont self join
AND s1.ip_address != "" AND s1.ip_address IS NOT NULL # not interested in blank entries
AND s1.ip_address = s2.ip_address
AND (s2.created BETWEEN (s1.created - INTERVAL 6 MONTH) AND (s1.created + INTERVAL 6 MONTH))
AND s1.user_id = s2.user_id # where completed for the same user
) AS ipCount
GROUP BY x
) n on s.id = n.x
SET s.ip_count = n.c
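To sanity-check the counting rule outside MySQL, here is a sketch that applies it with Python's sqlite3. It deliberately swaps the technique: instead of joining against a derived table, it uses a correlated subquery in the UPDATE, which SQLite accepts but which MySQL generally rejects when the subquery reads the table being updated (presumably why the query above goes through a join). Column names follow the question (survey_id), every row is assumed to have state = 1, and date(..., '-6 months') is SQLite's counterpart of created - INTERVAL 6 MONTH.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE surveys (survey_id INTEGER PRIMARY KEY, user_id INTEGER,
                      created TEXT, ip_address TEXT, state INTEGER,
                      ip_count INTEGER);
-- Rows from the question; state = 1 for all of them is an assumption.
INSERT INTO surveys VALUES
  (1, 1, '2012-01-01', '123.132.123', 1, NULL),
  (2, 1, '2012-04-01', '123.132.123', 1, NULL),
  (3, 2, '2012-07-01', '123.132.123', 1, NULL),
  (4, 1, '2012-08-01', '123.132.123', 1, NULL),
  (6, 1, '2012-12-01', '123.132.123', 1, NULL);
""")

conn.execute("""
    UPDATE surveys
    SET ip_count = (
        SELECT COUNT(*)
        FROM surveys AS s2
        WHERE s2.survey_id <> surveys.survey_id      -- do not count itself
          AND s2.user_id    = surveys.user_id        -- same user
          AND s2.ip_address = surveys.ip_address     -- same IP address
          AND s2.created BETWEEN date(surveys.created, '-6 months')
                             AND date(surveys.created, '+6 months')
    )
    WHERE state IN (1, 3)
      AND ip_address IS NOT NULL AND ip_address <> ''
""")

counts = dict(conn.execute("SELECT survey_id, ip_count FROM surveys"))
print(counts)  # {1: 1, 2: 2, 3: 0, 4: 2, 6: 1}
```

Unlike the join version, this form also writes 0 for surveys that have no duplicates. For survey 4 the rule as stated matches only surveys 2 and 6, so the sketch stores 2 for that row.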
I don't have your table with me, so it's hard for me to write SQL that definitely works, but I can take a shot at this and hopefully be able to help you.
First I would take the cartesian product of surveys against itself and filter out the rows I don't want:
select s1.survey_id x, s2.survey_id y from surveys s1, surveys s2 where s1.survey_id != s2.survey_id and s1.ip_address = s2.ip_address and (s1.created and s2.created fall 6 months within each other)
The output of this should contain every pair of surveys that match (according to your rules) TWICE (once for each id in the 1st position and once for it to be in the 2nd position)
Then we can do a GROUP BY on the output of this to get a table that basically gives me the correct ip_count for each survey_id
(select x, count(*) c from (select s1.survey_id x, s2.survey_id y from surveys s1, surveys s2 where s1.survey_id != s2.survey_id and s1.ip_address = s2.ip_address and (s1.created and s2.created fall 6 months within each other)) group by x)
So now we have a table mapping each survey_id to its correct ip_count. To update the original table, we need to join that against this and copy the values over
So that should look something like
UPDATE surveys s INNER JOIN (ABOVE QUERY) n ON s.survey_id = n.x SET s.ip_count = n.c
There is some pseudo code in there, but I think the general idea should work
I have never had to update a table based on the output of another query myself before. I tried to guess the right syntax for doing this from this question - How do I UPDATE from a SELECT in SQL Server?
Also, if I needed to do something like this for my own work, I wouldn't attempt to do it in a single query. It would be a pain to maintain and might have memory/performance issues. It would be best to have a script traverse the table row by row, updating a single row in a transaction before moving on to the next row. Much slower, but simpler to understand and possibly lighter on your database.