I'm having trouble writing this query. I have two tables, vote_table and click_table. In vote_table I have two fields, id and date; the date format is "12/30/11 : 14:28:36". In click_table I also have two fields, id and date; the date format is "12.30.11".
The IDs occur multiple times in both tables. I want to produce a result with three fields: id, votes, clicks. The id column should hold distinct id values, the votes column the number of times that ID has a date matching 12/30/11% in vote_table, and the clicks column the number of times that ID has the date 12.30.11 in click_table, so something like this:
ID | VOTES | CLICKS
001 | 24 | 50
002 | 30 | 45
Assuming that the types of the 'date' columns are actually either DATE or DATETIME (rather than, say, VARCHAR), the required operation is fairly straightforward:
SELECT v.id, v.votes, c.clicks
  FROM (SELECT id, COUNT(*) AS votes
          FROM vote_table AS v1
         WHERE DATE(v1.`date`) = TIMESTAMP('2011-12-30')
         GROUP BY v1.id) AS v
  JOIN (SELECT id, COUNT(*) AS clicks
          FROM click_table AS c1
         WHERE DATE(c1.`date`) = TIMESTAMP('2011-12-30')
         GROUP BY c1.id) AS c
    ON v.id = c.id
 ORDER BY v.id;
Note that this only shows ID values for which there is at least one vote and at least one click on the given day. If you need to see all the ID values which either voted or clicked or both, then you have to do more work.
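One way the "more work" could look, as a rough sketch rather than the definitive query: build the list of IDs seen on that day in either table with a UNION, then LEFT JOIN both per-ID counts onto it and turn missing counts into 0. The example runs the SQL through Python's sqlite3 module purely so it is self-contained; SQLite stands in for MySQL, the sample rows are made up, and the helper name daily_totals is hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL; dates are stored as ISO text (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vote_table  (id TEXT, `date` TEXT);
    CREATE TABLE click_table (id TEXT, `date` TEXT);
    INSERT INTO vote_table VALUES
        ('001', '2011-12-30 14:28:36'), ('001', '2011-12-30 15:00:00'),
        ('002', '2011-12-29 10:00:00'), ('003', '2011-12-30 09:00:00');
    INSERT INTO click_table VALUES
        ('001', '2011-12-30'), ('002', '2011-12-30'), ('002', '2011-12-30');
""")

def daily_totals(conn, day):
    """Return (id, votes, clicks) for every id seen on `day` in either table."""
    sql = """
        SELECT ids.id,
               COALESCE(v.votes, 0)  AS votes,
               COALESCE(c.clicks, 0) AS clicks
        FROM (SELECT id FROM vote_table  WHERE DATE(`date`) = :day
              UNION
              SELECT id FROM click_table WHERE DATE(`date`) = :day) AS ids
        LEFT JOIN (SELECT id, COUNT(*) AS votes FROM vote_table
                   WHERE DATE(`date`) = :day GROUP BY id) AS v ON v.id = ids.id
        LEFT JOIN (SELECT id, COUNT(*) AS clicks FROM click_table
                   WHERE DATE(`date`) = :day GROUP BY id) AS c ON c.id = ids.id
        ORDER BY ids.id
    """
    return conn.execute(sql, {"day": day}).fetchall()

totals = daily_totals(conn, "2011-12-30")
for row in totals:
    print(row)
```

IDs that only voted or only clicked now appear with a 0 in the other column instead of dropping out of the result.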
If you have to normalize the dates because they are VARCHAR columns, the WHERE clauses become correspondingly more complex.
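For the VARCHAR case, one simple option is to match the stored text directly: a LIKE prefix for the vote format ("12/30/11 : 14:28:36") and plain equality for the click format ("12.30.11"). The sketch below assumes exactly those two formats from the question and again uses sqlite3 as a stand-in for MySQL; varchar_day_counts is a hypothetical helper name. In MySQL itself, parsing the strings with STR_TO_DATE and a matching format string is the other route.

```python
import sqlite3
from datetime import date

# SQLite stands in for MySQL; both date columns are plain text here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vote_table  (id TEXT, `date` TEXT);
    CREATE TABLE click_table (id TEXT, `date` TEXT);
    INSERT INTO vote_table VALUES
        ('001', '12/30/11 : 14:28:36'), ('001', '12/30/11 : 09:00:00'),
        ('001', '12/29/11 : 09:00:00'), ('002', '12/30/11 : 10:00:00');
    INSERT INTO click_table VALUES
        ('001', '12.30.11'), ('002', '12.30.11'),
        ('002', '12.30.11'), ('002', '12.29.11');
""")

def varchar_day_counts(conn, day):
    """Count votes and clicks per id for one day, matching the text formats."""
    vote_prefix = day.strftime("%m/%d/%y") + "%"   # prefix of the vote format
    click_value = day.strftime("%m.%d.%y")         # exact click format
    sql = """
        SELECT v.id, v.votes, c.clicks
        FROM (SELECT id, COUNT(*) AS votes FROM vote_table
              WHERE `date` LIKE ? GROUP BY id) AS v
        JOIN (SELECT id, COUNT(*) AS clicks FROM click_table
              WHERE `date` = ? GROUP BY id) AS c ON c.id = v.id
        ORDER BY v.id
    """
    return conn.execute(sql, (vote_prefix, click_value)).fetchall()

counts = varchar_day_counts(conn, date(2011, 12, 30))
print(counts)
```

String matching like this only works for a single day; range queries over text dates in these formats would need real date parsing.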
Related
I need to create a query from 2 tables, where my company stores e-shop information.
Example of data from the first table:
currentDate: 5.5.2022 | eshopId: 1 | eshopName: test | active: true |
Table 2:
currentDate: 5.5.2022 | eshopId: 1 | orderId: 123 | attribution: direct |
From the first table, I want to get how many days in a given period the eshop was active. From the second table, I would like to count all the orders that were attributed directly to our company in the same time period as in the first table.
SELECT i.id, count(*)
from table1 as i
FULL JOIN table1 as e ON e.id= i.id
WHERE i.active = TRUE
GROUP BY i.id
I tried merging the same table twice, because after I used count to get amount of inactive dates, I could not use another variable as it was not aggregated. This still does not work. I cannot imagine how I would do this for 3 tables. Can someone please point me in the right direction on how to do this? Thanks.
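If there is one row per day per eshopId and you want the number of active days along with the number of directly attributed orders per eshopId, aggregate the orders in a subquery first and select its count in the outer query:
SELECT i.eshopId, COUNT(*) AS active_days, COALESCE(j.direct_orders, 0) AS direct_orders
FROM table1 AS i
LEFT JOIN (SELECT eshopId, COUNT(DISTINCT orderId) AS direct_orders
           FROM table2
           WHERE attribution = 'direct'
           GROUP BY eshopId) AS j ON i.eshopId = j.eshopId
WHERE i.active = TRUE
GROUP BY i.eshopId, j.direct_orders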
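To see why the orders are aggregated before the join, here is a small runnable sketch of the same pattern with a date-range filter added to both sides. It is only an illustration under assumptions: SQLite stands in for the real database, currentDate is stored as ISO text, active is stored as 1/0, the attribution value 'direct' is taken from the sample row, and eshop_summary is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (currentDate TEXT, eshopId INTEGER, eshopName TEXT, active INTEGER);
    CREATE TABLE table2 (currentDate TEXT, eshopId INTEGER, orderId INTEGER, attribution TEXT);
    INSERT INTO table1 VALUES
        ('2022-05-05', 1, 'test', 1), ('2022-05-06', 1, 'test', 1),
        ('2022-05-07', 1, 'test', 0), ('2022-05-05', 2, 'other', 1);
    INSERT INTO table2 VALUES
        ('2022-05-05', 1, 123, 'direct'),    ('2022-05-06', 1, 124, 'direct'),
        ('2022-05-06', 1, 125, 'affiliate'), ('2022-05-20', 1, 126, 'direct');
""")

def eshop_summary(conn, start, end):
    """Return (eshopId, active_days, direct_orders) for the given period."""
    sql = """
        SELECT i.eshopId,
               COUNT(*) AS active_days,
               COALESCE(j.direct_orders, 0) AS direct_orders
        FROM table1 AS i
        -- Orders are collapsed to one row per shop BEFORE joining, so the
        -- join cannot multiply the per-day rows of table1.
        LEFT JOIN (SELECT eshopId, COUNT(DISTINCT orderId) AS direct_orders
                   FROM table2
                   WHERE attribution = 'direct'
                     AND currentDate BETWEEN :start AND :end
                   GROUP BY eshopId) AS j ON j.eshopId = i.eshopId
        WHERE i.active = 1
          AND i.currentDate BETWEEN :start AND :end
        GROUP BY i.eshopId, j.direct_orders
        ORDER BY i.eshopId
    """
    return conn.execute(sql, {"start": start, "end": end}).fetchall()

summary = eshop_summary(conn, "2022-05-01", "2022-05-10")
print(summary)
```

Joining the raw order rows instead would repeat every active day once per order, which inflates COUNT(*). A third table can be handled the same way: one more pre-aggregated subquery, one more LEFT JOIN.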
I'm practicing MySQL and I'm trying to solve an exercise. I have data that contains reviews to a hotel. The data contains reviews by different users: one user can have many reviews if they have visited more than once. Each review has its own id and then review values from 1 to 5.
The reviews also have dates, and now I would like to count the average reviews of the first visits (earliest date). My problem is that the ways I have tried to retrieve the earliest date, don't actually work. By this I mean that I get the same results with and without the HAVING and WHERE methods. Is there someone that could help me with this? Thanks!
Here is my query (I have tried with the HAVING and WHERE methods)
SELECT AVG(overall_rating), AVG(rooms_rating), AVG(service_rating), AVG(location_rating), AVG(value_rating)
FROM reviews
HAVING MIN(review_date)
#WHERE review_date IN (SELECT MIN(review_date) FROM repatrons_reviews GROUP BY id)
Here is an example of the data
user_id | id | rooms_rating | service_rating | location_rating | value_rating | date
---------------------------------------------------------------------------------------
matt21 | 123 | 4 | 5 | 2 | 4 | 2007-08-20
This
SELECT ...
FROM reviews
HAVING MIN(review_date)
cannot work. Let's say the minimum date in the table is DATE '2020-01-01', then what is HAVING DATE '2020-01-01' supposed to mean?
This
SELECT ...
FROM reviews
WHERE review_date IN (SELECT MIN(review_date) FROM reviews GROUP BY id);
is close, but it's not the minimum date per ID, but the minimum date per user ID you want. And if you replace id by user_id, then there is still a problem, because what is the first date for one user can be the third date for another.
Here is this query corrected:
SELECT
AVG(overall_rating), AVG(rooms_rating),
AVG(service_rating), AVG(location_rating), AVG(value_rating)
FROM reviews
WHERE (user_id, review_date) IN
(SELECT user_id, MIN(review_date) FROM reviews GROUP BY user_id);
You can do it with NOT EXISTS:
SELECT AVG(r.overall_rating), AVG(r.rooms_rating), AVG(r.service_rating), AVG(r.location_rating), AVG(r.value_rating)
FROM reviews r
WHERE NOT EXISTS (SELECT 1 FROM reviews e WHERE e.user_id = r.user_id AND e.review_date < r.review_date)
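A quick way to convince yourself that the per-user pairing matters is to run the variants side by side on a tiny data set. The sketch below does that with sqlite3 standing in for MySQL (an assumption, as are the made-up rows, the review_date column name taken from the question's query, and the reduction to a single rating column). The second user's first visit falls on the same date as the first user's second visit, which is exactly the case where a plain IN on the date goes wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reviews (user_id TEXT, id INTEGER, overall_rating INTEGER, review_date TEXT);
    INSERT INTO reviews VALUES
        ('matt21', 123, 4, '2007-08-20'),
        ('matt21', 124, 2, '2008-01-01'),
        ('anna',   125, 5, '2008-01-01'),
        ('anna',   126, 1, '2009-03-03');
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# Date-only IN: matt21's second review slips in because its date
# happens to be anna's first date.
naive = scalar("""
    SELECT AVG(overall_rating) FROM reviews
    WHERE review_date IN (SELECT MIN(review_date) FROM reviews GROUP BY user_id)
""")

# (user, date) pair IN: only each user's own earliest review qualifies.
tuple_in = scalar("""
    SELECT AVG(overall_rating) FROM reviews
    WHERE (user_id, review_date) IN
          (SELECT user_id, MIN(review_date) FROM reviews GROUP BY user_id)
""")

# NOT EXISTS: keep a review only if the same user has no earlier one.
not_exists = scalar("""
    SELECT AVG(r.overall_rating) FROM reviews r
    WHERE NOT EXISTS (SELECT 1 FROM reviews e
                      WHERE e.user_id = r.user_id
                        AND e.review_date < r.review_date)
""")

print(naive, tuple_in, not_exists)
```

The pair-based IN and the NOT EXISTS form agree with each other; the date-only IN gives a different average on this data.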
Let's say I have a schools table (cols = "ids (int)") and a users table (cols = "id (int), school_id (int), created_at (datetime)").
I have a list of school ids saved in <school_ids>. I want to group those schools by the yearweek(users.created_at) value for the user at that school with the earliest created_at value, and for each group list the value of yearweek(users.created_at) and the number of schools.
In other words, I want to find the earliest-created user for each school, and then group the schools by the yearweek() result for that created_at date, so that I effectively have the number of schools that signed up their first user in each week.
So, I want results like
| 201301 | 22 | #meaning there are 22 schools where the earliest created_at user
#has yearweek(created_at) = "201301"
| 201302 | 5 | #meaning there are 5 schools where the earliest created_at user
#has yearweek(created_at) = "201302"
etc
As a sanity check, the total of all rows in the second column should equal the size of <school_ids>, ie the number of ids in school_ids.
Does that make sense? I can't quite figure out how to get this without doing several queries and storing values in between. I'm sure there's a one-liner. Thanks! max
You could use a subquery that returns the minimum created_at field for every school_id, and then you can group by yearweek and do the count:
SELECT
  yearweek(u.min_created_at) AS yearweek_first_user,
  COUNT(*)
FROM
  (
    SELECT school_id, MIN(created_at) AS min_created_at
    FROM users
    GROUP BY school_id
  ) u
GROUP BY
  yearweek(u.min_created_at)
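Here is a small runnable sketch of the same two-step idea (earliest user per school first, then group by week), with the restriction to a given list of school ids added inside the subquery. Assumptions to be aware of: SQLite stands in for MySQL, so strftime('%Y%W', ...) replaces yearweek() and its week numbering is similar but not identical to MySQL's; the rows and the helper name first_user_weeks are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, school_id INTEGER, created_at TEXT);
    INSERT INTO users (school_id, created_at) VALUES
        (1, '2013-01-08 10:00:00'), (1, '2013-02-01 09:00:00'),
        (2, '2013-01-09 12:00:00'),
        (3, '2013-01-15 08:00:00'), (3, '2013-01-16 08:00:00'),
        (4, '2013-01-08 11:00:00');
""")

def first_user_weeks(conn, school_ids):
    """Return (year-week label, number of schools) for the given schools."""
    placeholders = ",".join("?" * len(school_ids))
    sql = f"""
        SELECT strftime('%Y%W', u.min_created_at) AS yearweek_first_user,
               COUNT(*) AS schools
        FROM (SELECT school_id, MIN(created_at) AS min_created_at
              FROM users
              WHERE school_id IN ({placeholders})
              GROUP BY school_id) AS u
        GROUP BY yearweek_first_user
        ORDER BY yearweek_first_user
    """
    return conn.execute(sql, school_ids).fetchall()

school_ids = [1, 2, 3]          # school 4 is deliberately left out
weeks = first_user_weeks(conn, school_ids)
for label, schools in weeks:
    print(label, schools)
```

Because the inner query yields exactly one row per school that has at least one user, the counts add up to the number of such schools in the list, which matches the sanity check from the question.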
In my application, each product group has many products, and each product has one manufacturer. These relations are stored by MySQL in InnoDB tables product_groups with an id field, and products with id, product_group and manufacturer fields.
Is there a way to find the most common manufacturer in each product group, without resorting to selecting subqueries?
This is how I'm doing it currently:
SELECT product_groups.id,
(
SELECT manufacturer FROM products
WHERE product_group = product_groups.id
GROUP BY manufacturer
ORDER BY count(*) DESC
LIMIT 1
) manufacturer_mode
FROM product_groups;
Try this solution:
SELECT
  a.product_group,
  SUBSTRING_INDEX(GROUP_CONCAT(a.manufacturer ORDER BY a.occurrences DESC SEPARATOR ':::'), ':::', 1) AS manufacturer_mode
FROM
  (
    SELECT
      aa.product_group,
      aa.manufacturer,
      COUNT(*) AS occurrences
    FROM
      products aa
    GROUP BY
      aa.product_group,
      aa.manufacturer
  ) a
GROUP BY
  a.product_group
Explanation:
This still uses a form of subquery, but one which executes only once as opposed to one that executes on a row-by-row basis such as in your original example.
It works by first selecting the product_group id, the manufacturer, and the count of how many times the manufacturer appears for each particular group.
The FROM sub-select will look something like this after execution (just making up data here):
product_group | manufacturer | occurrences
---------------------------------------------------
1 | XYZ | 4
1 | Test | 2
1 | Singleton | 1
2 | Eloran | 2
2 | XYZ | 1
Now that we have the sub-select result, we need to pick out the row with the highest value in the occurrences field for each product group.
In the outer query, we group the sub-select result again, this time by the product_group field only. With that GROUP BY in place, we can use the MySQL function GROUP_CONCAT, which concatenates the manufacturers of each group in any order we want.
...GROUP_CONCAT(a.manufacturer ORDER BY a.occurrences DESC SEPARATOR ':::')...
Here we concatenate the manufacturers within each product_group. ORDER BY a.occurrences DESC makes sure that the manufacturer with the most appearances comes first in the concatenated list, and SEPARATOR ':::' puts ::: between the names. The result for product_group 1 will look like:
XYZ:::Test:::Singleton
XYZ appears first since it has the highest value in the occurrences field. We only want XYZ, so we wrap the concatenation in SUBSTRING_INDEX, which picks out the first element of the list based on the ::: delimiter.
The end result will be:
product_group | manufacturer_mode
---------------------------------------
1 | XYZ
2 | Eloran
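If it helps to see the mechanics step by step, the sketch below replays the same logic outside MySQL. The inner aggregate runs as SQL (sqlite3 is only a stand-in), and the GROUP_CONCAT ... ORDER BY plus SUBSTRING_INDEX step is imitated in plain Python with join and split. It uses the made-up data from the explanation; manufacturer_modes is a hypothetical helper, not part of the original answer.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, product_group INTEGER, manufacturer TEXT)")
sample = ([(1, "XYZ")] * 4 + [(1, "Test")] * 2 + [(1, "Singleton")]
          + [(2, "Eloran")] * 2 + [(2, "XYZ")])
conn.executemany("INSERT INTO products (product_group, manufacturer) VALUES (?, ?)", sample)

def manufacturer_modes(conn, sep=":::"):
    """Return {product_group: most common manufacturer}."""
    # Same inner aggregate as the FROM sub-select: one row per
    # (product_group, manufacturer) with its number of occurrences.
    rows = conn.execute("""
        SELECT product_group, manufacturer, COUNT(*) AS occurrences
        FROM products
        GROUP BY product_group, manufacturer
        ORDER BY product_group, occurrences DESC, manufacturer
    """).fetchall()
    modes = {}
    for group, members in groupby(rows, key=lambda r: r[0]):
        # Imitates GROUP_CONCAT(... ORDER BY occurrences DESC SEPARATOR ':::')
        concatenated = sep.join(m for _, m, _ in members)
        # Imitates SUBSTRING_INDEX(..., ':::', 1): keep the first element only
        modes[group] = concatenated.split(sep, 1)[0]
    return modes

modes = manufacturer_modes(conn)
print(modes)
```

Two things to keep in mind with the MySQL version: when two manufacturers tie for first place, which one comes first is not determined by the occurrences ordering alone, and GROUP_CONCAT output is truncated at group_concat_max_len, so the separator should be a string that cannot occur inside a manufacturer name.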
I have two tables, simplified to the following:
users:
+-----+------+-----------+
| id | name | timestamp |
+-----+------+-----------+
vouchers:
+-----+------+
| id | code |
+-----+------+
I also have a third table, containing pairs of IDs:
recipients:
+-----+------+------+
| id | u_id | v_id |
+-----+------+------+
I need to periodically insert new pairs of IDs to the recipients table when a user's row is older than two weeks (the query will be scheduled to run once a day). IDs already present within the recipients table should not be retrieved.
I am currently unable to find an effective method of returning arbitrarily paired IDs from the two initial SELECT queries:
1. SELECT id FROM users WHERE timestamp < NOW() - INTERVAL 2 WEEK AND id NOT IN (SELECT u_id FROM recipients)
2. SELECT id FROM vouchers WHERE id not in (select v_id from recipients) limit *by number of retrieved user IDs*
So far, all of my attempted JOINS have failed to achieve the desired result. I have established a solution using the two above queries, with a PHP for loop to pair the results before their insertion, but I am very aware that this is poor.
Thanks in advance,
You could create a Cartesian product and remove the combinations already present in recipients using NOT EXISTS.
Cartesian Product
INNER JOIN and , (comma) are semantically equivalent in the absence of
a join condition: both produce a Cartesian product between the
specified tables (that is, each and every row in the first table is
joined to each and every row in the second table).
SELECT *
FROM users u
   , vouchers v
WHERE u.timestamp < NOW() - INTERVAL 2 WEEK
  AND NOT EXISTS (
        SELECT *
        FROM recipients r
        WHERE r.u_id = u.id
          AND r.v_id = v.id
      )
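Note that the Cartesian product returns every remaining user/voucher combination, so each eligible user is listed once per voucher. If what you need is one unused voucher per eligible user, a different technique may fit better: number the eligible users and the unused vouchers separately with ROW_NUMBER() and join on that number. The sketch below is only an illustration of that idea under assumptions: it needs window functions (MySQL 8.0+; here SQLite 3.25+ via sqlite3 as a stand-in), the cutoff is computed in Python instead of NOW() - INTERVAL 2 WEEK, and the data and the helper name assign_vouchers are made up.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users      (id INTEGER PRIMARY KEY, name TEXT, timestamp TEXT);
    CREATE TABLE vouchers   (id INTEGER PRIMARY KEY, code TEXT);
    CREATE TABLE recipients (id INTEGER PRIMARY KEY, u_id INTEGER NOT NULL, v_id INTEGER NOT NULL);
    INSERT INTO users VALUES
        (1, 'alice', '2024-01-01 00:00:00'), (2, 'bob',  '2024-01-02 00:00:00'),
        (3, 'carol', '2024-03-01 00:00:00'), (4, 'dave', '2024-01-03 00:00:00');
    INSERT INTO vouchers VALUES (10, 'A'), (11, 'B'), (12, 'C'), (13, 'D');
    INSERT INTO recipients (u_id, v_id) VALUES (1, 10);
""")

PAIR_SQL = """
    INSERT INTO recipients (u_id, v_id)
    SELECT u.id, v.id
    FROM (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn
          FROM users
          WHERE timestamp < :cutoff
            AND id NOT IN (SELECT u_id FROM recipients)) AS u
    JOIN (SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn
          FROM vouchers
          WHERE id NOT IN (SELECT v_id FROM recipients)) AS v
      ON v.rn = u.rn
"""

def assign_vouchers(conn, now):
    """Give each user older than two weeks one unused voucher; return all pairs."""
    cutoff = (now - timedelta(weeks=2)).strftime("%Y-%m-%d %H:%M:%S")
    conn.execute(PAIR_SQL, {"cutoff": cutoff})
    conn.commit()
    return conn.execute("SELECT u_id, v_id FROM recipients ORDER BY u_id").fetchall()

pairs = assign_vouchers(conn, datetime(2024, 3, 5))
print(pairs)
```

The inner JOIN means that users are simply left unpaired when the vouchers run out, and running the statement again on the next day only picks up users and vouchers that are not yet in recipients.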