I have a table full of duplicates. I'm trying to convert them so I can put a unique constraint across two fields (say, identifier1 and identifier2).
I would like to "collapse" those duplicates into single records, but some of my records contain differing strings. In those cases I'd like to keep the last-touched value (keeping the one from the highest ID and discarding the rest).
For example, I can aggregate the startDate below with MIN(), but how do I get only the most recent location?
id | identifier1 | identifier2 | location | startDate
1 | alice | 0001 | ambridge | 2016-01-01
2 | bob | 1312 | brigadoon | 2017-01-01
3 | alice | 0001 | brigadoon | 2017-05-01
4 | bob | 9999 | brigadoon | 2015-01-01
5 | celeste | 1234 | cittegazze | 2011-01-01
Desired result:
id | identifier1 | identifier2 | location | startDate
6 | alice | 0001 | brigadoon | 2016-01-01
7 | bob | 1312 | brigadoon | 2017-01-01
8 | bob | 9999 | brigadoon | 2015-01-01
9 | celeste | 1234 | cittegazze | 2011-01-01
Try this. It takes MIN(startDate) per pair and joins back on the highest id to pick up the last-touched location:
select A.identifier1, A.identifier2, A.startDate, B.location from (
select identifier1,
identifier2,
MIN(startDate) AS startDate,
MAX(id) AS lastId
from TABLE_NAME
group by identifier1, identifier2
) AS A JOIN TABLE_NAME AS B
ON (B.id = A.lastId)
I think a more efficient query is simply:
select t.*
from t
where t.startDate = (select max(t2.startDate)
from t t2
where t2.identifier1 = t.identifier1 and t2.identifier2 = t.identifier2
);
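This approach can use an index on (identifier1, identifier2, startDate). Note that it keeps the whole row with the latest startDate for each pair, so the startDate it returns is the maximum rather than MIN(startDate).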
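To check the collapse against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database. The table name t and the helper collapse_duplicates are placeholders chosen for illustration, not names from the original posts. It takes MIN(startDate) per pair and reads the location from the row with the highest id.

```python
import sqlite3

# Sample rows from the question; the table name "t" is a placeholder.
ROWS = [
    (1, "alice", "0001", "ambridge", "2016-01-01"),
    (2, "bob", "1312", "brigadoon", "2017-01-01"),
    (3, "alice", "0001", "brigadoon", "2017-05-01"),
    (4, "bob", "9999", "brigadoon", "2015-01-01"),
    (5, "celeste", "1234", "cittegazze", "2011-01-01"),
]


def collapse_duplicates(conn):
    """One row per (identifier1, identifier2): earliest startDate,
    location taken from the row with the highest id."""
    return conn.execute(
        """
        SELECT a.identifier1, a.identifier2, b.location, a.startDate
        FROM (
            SELECT identifier1, identifier2,
                   MIN(startDate) AS startDate,
                   MAX(id) AS last_id
            FROM t
            GROUP BY identifier1, identifier2
        ) AS a
        JOIN t AS b ON b.id = a.last_id
        ORDER BY a.identifier1, a.identifier2
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, identifier1 TEXT, "
    "identifier2 TEXT, location TEXT, startDate TEXT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", ROWS)

collapsed = collapse_duplicates(conn)
for row in collapsed:
    print(row)
```

The collapsed rows can then be written to a fresh table that carries the unique constraint on (identifier1, identifier2).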
I have a script which is working, but not as desired. My aim is to select the most recently entered record in the plans table for each seller in the account_manager_sellers list.
The current issue with the script below is that it returns the oldest record rather than the newest. For example, it selects a record from 2016 rather than one with a timestamp in 2018. (Eventually I need to change the WHERE clause to get all lastsale records before 2017-01-01.)
Simple Database Samples.
plans (AKA sales list)
+----+------------------+-----------+
| id | plan_written | seller_id |
+----+------------------+-----------+
| 1 | 20/09/2016 09:12 | 123 |
| 2 | 22/12/2016 09:45 | 444 |
| 3 | 19/10/2016 09:07 | 555 |
| 4 | 02/10/2015 14:26 | 123 |
| 5 | 15/08/2016 11:06 | 444 |
| 6 | 16/08/2016 11:03 | 123 |
| 7 | 03/10/2016 10:15 | 555 |
| 8 | 28/09/2016 10:12 | 123 |
| 9 | 27/09/2016 15:12 | 444 |
+----+------------------+-----------+
account_manager_sellers (seller list)
+-----+----------+
| id | name |
+-----+----------+
| 123 | person 1 |
| 444 | person 2 |
| 555 | person 3 |
+-----+----------+
Current Code Used
SELECT p.plan_written, p.seller_id
FROM plans AS p NATURAL JOIN (
SELECT id, MAX(plan_written) AS lastsale
FROM plans
GROUP BY seller_id
) AS t
JOIN account_manager_sellers AS a ON a.id = p.seller_id
WHERE lastsale < "2018-05-08 00:00:00"
Summary
Using the code and example tables above, the query returns the 3 results below. We do expect 3 results, but MAX(plan_written) does not seem to have been applied. My guess is that it has something to do with the GROUP BY clause. Could an ORDER BY and LIMIT clause be used instead?
+--------------+------------------+
| seller_id | plan_written |
+--------------+------------------+
| 123 | 16/08/2016 11:03 |
| 444 | 15/08/2016 11:06 |
| 555 | 03/10/2016 10:15 |
+--------------+------------------+
The join condition in your query is off, and you should be restricting to the max date for each seller. Also, you don't need to join to the account_manager_sellers table to get your expected output:
SELECT p1.*
FROM plans p1
INNER JOIN
(
SELECT
seller_id, MAX(plan_written) AS max_plan_written
FROM plans
WHERE plan_written < '2018-05-08 00:00:00'
GROUP BY seller_id
) p2
ON p1.seller_id = p2.seller_id AND
p1.plan_written = p2.max_plan_written;
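As a quick check of the join above, here is a small sketch that runs the same query shape on the sample plans data with Python's sqlite3 module. Two things are assumptions on my part: the timestamps are rewritten as ISO strings (YYYY-MM-DD HH:MM) so that MAX() and < compare chronologically, and the helper name latest_plan_per_seller is made up for the example.

```python
import sqlite3

# Sample plans from the question, with dates rewritten in ISO order
# (an assumption made so that string comparison matches date order).
PLANS = [
    (1, "2016-09-20 09:12", 123),
    (2, "2016-12-22 09:45", 444),
    (3, "2016-10-19 09:07", 555),
    (4, "2015-10-02 14:26", 123),
    (5, "2016-08-15 11:06", 444),
    (6, "2016-08-16 11:03", 123),
    (7, "2016-10-03 10:15", 555),
    (8, "2016-09-28 10:12", 123),
    (9, "2016-09-27 15:12", 444),
]


def latest_plan_per_seller(conn, cutoff):
    """Newest plan per seller among plans written before `cutoff`."""
    return conn.execute(
        """
        SELECT p1.seller_id, p1.id, p1.plan_written
        FROM plans AS p1
        JOIN (
            SELECT seller_id, MAX(plan_written) AS max_plan_written
            FROM plans
            WHERE plan_written < ?
            GROUP BY seller_id
        ) AS p2
          ON p1.seller_id = p2.seller_id
         AND p1.plan_written = p2.max_plan_written
        ORDER BY p1.seller_id
        """,
        (cutoff,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plans (id INTEGER PRIMARY KEY, plan_written TEXT, seller_id INTEGER)"
)
conn.executemany("INSERT INTO plans VALUES (?, ?, ?)", PLANS)

before_2018 = latest_plan_per_seller(conn, "2018-05-08 00:00:00")
before_oct_2016 = latest_plan_per_seller(conn, "2016-10-01 00:00:00")
print(before_2018)
print(before_oct_2016)
```

With the earlier cutoff, seller 555 drops out because both of its plans were written after that date, which is the effect of filtering inside the subquery.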
I have 2 tables
Transaction table
+-----+------------+------------+
| TID | CampaignID | DATE |
+-----+------------+------------+
| 1 | 5 | 2016-01-01 |
| 2 | 5 | 2016-01-01 |
| 3 | 2 | 2016-01-01 |
| 4 | 5 | 2016-01-01 |
| 5 | 1 | 2016-01-01 |
| 6 | 1 | 2016-02-02 |
| 7 | 3 | 2016-02-02 |
| 8 | 3 | 2016-02-02 |
| 9 | 5 | 2016-02-02 |
| 10 | 4 | 2016-02-02 |
+-----+------------+------------+
Campaign Table
+------------+---------------------+----------------+
| CampaignID | DailyMaxImpressions | CampaignActive |
+------------+---------------------+----------------+
| 1 | 5 | Y |
| 2 | 5 | Y |
| 3 | 5 | Y |
| 4 | 5 | Y |
| 5 | 1 | Y |
+------------+---------------------+----------------+
What I am trying to do is get a single random campaign where the count in the transaction table is less than the daily max impressions in the campaign table. I might also be passing a date as part of the query for the transaction table.
So for CampaignID 1 there must be 4 transactions or fewer in the transaction table, and CampaignActive must be "Y".
Any help would be appreciated if this can be done in a single statement (MySQL).
Thanks in advance,
Jeff Godstein
This should get it for you. The basic query selects each campaign that is active. The inner query pre-aggregates the transaction count per campaign for the date in question. The LEFT JOIN then lets a campaign be returned either when it does not appear in the subquery at all (no transactions on that date) or when it does appear but its count is below the daily maximum. ORDER BY RAND() puts the qualifying campaigns in random order.
SELECT
c.CampaignID
from
Campaign c
LEFT JOIN
( select
t1.CampaignID,
count(*) as CampCount
from
Transaction t1
where
t1.Date = YourDateParameterValue
group by
t1.CampaignID ) as t
ON c.CampaignID = t.CampaignID
where
c.CampaignActive = 'Y'
AND ( t.CampaignID IS NULL
OR t.CampCount < c.DailyMaxImpressions )
order by
RAND()
limit 1 -- the question asks for a single random campaign
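Here is a rough, runnable sketch of the same idea on the sample data, using Python's sqlite3 module. The names are assumptions for illustration only: the tables are called transactions and campaign, the date column is txn_date, and pick_campaign is a made-up helper. SQLite spells the random function RANDOM() where MySQL uses RAND().

```python
import sqlite3

# Sample data from the question; table and column names are placeholders.
TRANSACTIONS = [
    (1, 5, "2016-01-01"), (2, 5, "2016-01-01"), (3, 2, "2016-01-01"),
    (4, 5, "2016-01-01"), (5, 1, "2016-01-01"), (6, 1, "2016-02-02"),
    (7, 3, "2016-02-02"), (8, 3, "2016-02-02"), (9, 5, "2016-02-02"),
    (10, 4, "2016-02-02"),
]
CAMPAIGNS = [(1, 5, "Y"), (2, 5, "Y"), (3, 5, "Y"), (4, 5, "Y"), (5, 1, "Y")]


def pick_campaign(conn, day):
    """Return one random active campaign that is still under its daily
    cap on `day`, or None if no campaign qualifies."""
    row = conn.execute(
        """
        SELECT c.CampaignID
        FROM campaign AS c
        LEFT JOIN (
            SELECT CampaignID, COUNT(*) AS CampCount
            FROM transactions
            WHERE txn_date = ?
            GROUP BY CampaignID
        ) AS t ON t.CampaignID = c.CampaignID
        WHERE c.CampaignActive = 'Y'
          AND (t.CampaignID IS NULL OR t.CampCount < c.DailyMaxImpressions)
        ORDER BY RANDOM()   -- RAND() in MySQL
        LIMIT 1
        """,
        (day,),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (TID INTEGER PRIMARY KEY, CampaignID INTEGER, txn_date TEXT)"
)
conn.execute(
    "CREATE TABLE campaign (CampaignID INTEGER PRIMARY KEY, "
    "DailyMaxImpressions INTEGER, CampaignActive TEXT)"
)
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", TRANSACTIONS)
conn.executemany("INSERT INTO campaign VALUES (?, ?, ?)", CAMPAIGNS)

# Campaign 5 already has 3 transactions on 2016-01-01 against a cap of 1,
# so repeated picks should only ever come from campaigns 1 to 4.
picks = {pick_campaign(conn, "2016-01-01") for _ in range(50)}
print(sorted(picks))
```

Passing the date as a bound parameter keeps the statement a single query, which is what the question asked for.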
I'm building an exercises web app and I'm working with two tables like this:
Table 1: weekly_stats
| id | code | type | date | time |
|----|--------------|--------------------|------------|----------|
| 1 | CC | 1 | 2015-02-04 | 19:15:00 |
| 2 | CC | 2 | 2015-01-28 | 19:15:00 |
| 3 | CPC | 1 | 2015-01-26 | 19:15:00 |
| 4 | CPC | 1 | 2015-01-25 | 19:15:00 |
| 5 | CP | 1 | 2015-01-24 | 19:15:00 |
| 6 | CC | 1 | 2015-01-23 | 19:15:00 |
| .. | ... | ... | ... | ... |
Table 2: global_stats
| id | exercise_number |correct | wrong |
|----|-----------------|--------|-----------|
| 1 | 138 | 1 | 0 |
| 2 | 246 | 1 | 0 |
| 3 | 988 | 1 | 10 |
| 4 | 13 | 5 | 0 |
| 5 | 5 | 4 | 7 |
| 6 | 5 | 4 | 7 |
| .. | ... | ... | ... |
What I would like is to get MAX(correct-wrong) and MIN(correct-wrong). Right now I'm working with this query:
SELECT
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MAX(correct - wrong) from global_stats)
UNION
SELECT
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MIN(correct - wrong) from global_stats);
This query works well, except for one thing: when "WHERE correct - wrong = (SELECT MIN(correct - wrong)[...]" matches more than one row, the row selected is the first one, but I would like the most recent one returned (in other words, ordered by the date and time columns). Is it possible?
Thanks!
I think you can solve it like this:
SELECT * FROM (
SELECT
1 as sort_column,
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MAX(correct - wrong) from global_stats)
ORDER BY date DESC, time DESC
LIMIT 1 ) as a
UNION
SELECT * FROM (
SELECT
2 as sort_column,
exercise_number,
date,
time
FROM weekly_stats AS w JOIN global_stats AS g
ON w.id=g.id
WHERE correct - wrong = (SELECT MIN(correct - wrong) from global_stats)
ORDER BY date DESC, time DESC
LIMIT 1) as b
ORDER BY sort_column;
Here is the documentation about how UNION works.
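To see the tie-breaking in action, here is a sketch of the same pattern run through Python's sqlite3 module. The sample rows have no tie at the minimum, so row 7 in both tables is a hypothetical addition of mine that ties with row 3 at correct - wrong = -9. The BRANCH template is just a way to avoid writing the MAX and MIN halves twice.

```python
import sqlite3

WEEKLY = [
    (1, "CC", 1, "2015-02-04", "19:15:00"),
    (2, "CC", 2, "2015-01-28", "19:15:00"),
    (3, "CPC", 1, "2015-01-26", "19:15:00"),
    (4, "CPC", 1, "2015-01-25", "19:15:00"),
    (5, "CP", 1, "2015-01-24", "19:15:00"),
    (6, "CC", 1, "2015-01-23", "19:15:00"),
    (7, "CC", 1, "2015-02-05", "19:15:00"),  # hypothetical row, not in the question
]
GLOBAL = [
    (1, 138, 1, 0), (2, 246, 1, 0), (3, 988, 1, 10),
    (4, 13, 5, 0), (5, 5, 4, 7), (6, 5, 4, 7),
    (7, 77, 1, 10),  # hypothetical row that ties with id 3 at -9
]

# One half of the UNION: the newest row whose score equals the extreme.
BRANCH = """
    SELECT sort_column, exercise_number, date, time FROM (
        SELECT {rank} AS sort_column, g.exercise_number, w.date, w.time
        FROM weekly_stats AS w JOIN global_stats AS g ON w.id = g.id
        WHERE g.correct - g.wrong =
              (SELECT {agg}(correct - wrong) FROM global_stats)
        ORDER BY w.date DESC, w.time DESC
        LIMIT 1
    )
"""
QUERY = (
    BRANCH.format(rank=1, agg="MAX")
    + " UNION ALL "
    + BRANCH.format(rank=2, agg="MIN")
    + " ORDER BY sort_column"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weekly_stats (id INTEGER PRIMARY KEY, code TEXT, "
    "type INTEGER, date TEXT, time TEXT)"
)
conn.execute(
    "CREATE TABLE global_stats (id INTEGER PRIMARY KEY, "
    "exercise_number INTEGER, correct INTEGER, wrong INTEGER)"
)
conn.executemany("INSERT INTO weekly_stats VALUES (?, ?, ?, ?, ?)", WEEKLY)
conn.executemany("INSERT INTO global_stats VALUES (?, ?, ?, ?)", GLOBAL)

best_and_worst = conn.execute(QUERY).fetchall()
print(best_and_worst)
```

Of the two rows tied at the minimum, the one dated 2015-02-05 is returned because each half sorts by date and time descending before applying LIMIT 1.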
Hi guys, I'm working on a POS app with MySQL. Here is my situation:
Table "purchased_item"
| id | name | check_id | real_price |
| 1 | iPhone5 | 0001 | 399 |
| 2 | iPhone4 | 0001 | 199 |
| 3 | iPhone5s | 0002 | 599 |
| 4 | iPhone5c | 0003 | 399 |
| 5 | iMac 21" | 0003 | 999 |
| 6 | iPod Touch | 0003 | 99 |
| 7 | iPhone5 | 0004 | 399 |
| 8 | iPhone3G | 0004 | 99 |
| 9 | iPhone6 | 0005 | 899 |
| 10 | iPhone3Gs | 0005 | 101 |
I want to know how many checks have a total greater than or equal to (>=) 1000, so what I'm doing now is running several queries. In this example, I run 5 queries and sum the results manually in the host program.
As the data grew, the queries became slow because there are tons of checks every day. So I changed to recording the totals in another table.
Table "checks"
| id | total | sales |
| 0001 | 598 | A |
| 0002 | 599 | A |
| 0003 | 1497 | B |
| 0004 | 498 | B |
| 0005 | 1000 | A |
But another problem came up later: when I need to adjust the real_price in the "purchased_item" table, I also need to maintain the "total" column in the "checks" table. It doesn't sound like a big deal, but I'd like to find a better way to solve it.
Solved:
SELECT * FROM purchased_item
GROUP BY check_id
HAVING sum(real_price) >= 1000
And the result will be:
| id | name | check_id | real_price |
| 4 | iPhone5c | 0003 | 399 |
| 9 | iPhone6 | 0005 | 899 |
Further question: If I want to count the total price for checks, how can I do it?
I found it:
SELECT check_id,sum(real_price) FROM purchased_item
GROUP BY check_id
HAVING sum(real_price) >= 1000
Try it this way
SELECT i.id, i.name, i.check_id, i.real_price
FROM
(
SELECT MIN(id) id
FROM purchased_item
GROUP BY check_id
HAVING SUM(real_price) >= 1000
) q JOIN purchased_item i
ON q.id = i.id
ORDER BY q.id DESC
Sample output:
| ID | NAME | CHECK_ID | REAL_PRICE |
|----|----------|----------|------------|
| 9 | iPhone6 | 5 | 899 |
| 4 | iPhone5c | 3 | 399 |
...I want to count how many checks's total are over 1000
For that you can just do this
SELECT COUNT(*) total
FROM
(
SELECT check_id
FROM purchased_item
GROUP BY check_id
HAVING SUM(real_price) >= 1000
) q;
Sample output:
| TOTAL |
|-------|
| 2 |
Here is SQLFiddle demo
To update total in checks after adjusting real_price in purchased_item:
UPDATE checks c JOIN
(
SELECT check_id, SUM(real_price) total
FROM purchased_item
WHERE check_id IN(5) -- whatever check(s)'s total you want to recalculate
GROUP BY check_id
) p
ON c.id = p.check_id
SET c.total = p.total;
Here is SQLFiddle demo
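For anyone who wants to try the whole flow locally, here is a sketch on the sample data using Python's sqlite3 module. The helpers big_checks and resync_total are names I made up. SQLite has no UPDATE ... JOIN, so the resync uses a correlated subquery instead of the MySQL syntax above. The price change on item 9 is only there to show the total being recalculated.

```python
import sqlite3

ITEMS = [
    (1, "iPhone5", "0001", 399), (2, "iPhone4", "0001", 199),
    (3, "iPhone5s", "0002", 599), (4, "iPhone5c", "0003", 399),
    (5, 'iMac 21"', "0003", 999), (6, "iPod Touch", "0003", 99),
    (7, "iPhone5", "0004", 399), (8, "iPhone3G", "0004", 99),
    (9, "iPhone6", "0005", 899), (10, "iPhone3Gs", "0005", 101),
]
CHECKS = [
    ("0001", 598, "A"), ("0002", 599, "A"), ("0003", 1497, "B"),
    ("0004", 498, "B"), ("0005", 1000, "A"),
]


def big_checks(conn, threshold=1000):
    """Checks whose item prices sum to at least `threshold`."""
    return conn.execute(
        """
        SELECT check_id, SUM(real_price) AS total
        FROM purchased_item
        GROUP BY check_id
        HAVING SUM(real_price) >= ?
        ORDER BY check_id
        """,
        (threshold,),
    ).fetchall()


def resync_total(conn, check_id):
    """Recompute checks.total for one check from its purchased items."""
    conn.execute(
        """
        UPDATE checks
        SET total = (SELECT SUM(real_price)
                     FROM purchased_item
                     WHERE check_id = checks.id)
        WHERE id = ?
        """,
        (check_id,),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchased_item (id INTEGER PRIMARY KEY, name TEXT, "
    "check_id TEXT, real_price INTEGER)"
)
conn.execute("CREATE TABLE checks (id TEXT PRIMARY KEY, total INTEGER, sales TEXT)")
conn.executemany("INSERT INTO purchased_item VALUES (?, ?, ?, ?)", ITEMS)
conn.executemany("INSERT INTO checks VALUES (?, ?, ?)", CHECKS)

before = big_checks(conn)
count_before = len(before)

# Adjust one price, then bring the stored total back in line.
conn.execute("UPDATE purchased_item SET real_price = 799 WHERE id = 9")
resync_total(conn, "0005")
new_total = conn.execute("SELECT total FROM checks WHERE id = '0005'").fetchone()[0]
after = big_checks(conn)
print(before, count_before, new_total, after)
```

Since the GROUP BY ... HAVING query always reads the live prices, the stored total column is only worth keeping if the aggregate is too slow for the data volume.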
Hi all and thanks in advance.
I have a small problem which I cannot resolve. I have this table, and I want to sort it by date and group it (showing only 1 row per idcat).
| id | idcat | name | date |
| 1 | 3 | xx | 2011-01-02 |
| 2 | 4 | xf | 2011-01-02 |
| 3 | 3 | cd | 2011-01-01 |
| 4 | 1 | cg | 2011-01-04 |
| 5 | 4 | ce | 2011-01-06 |
I would like the result to look like this. I have tried one way but cannot get it:
| 2 | 4 | xf | 2011-01-02 |
| 3 | 3 | cd | 2011-01-01 |
| 4 | 1 | cg | 2011-01-04 |
Ordered by ID.
Thanks to a friend, this works:
SELECT id, idcat, name, date FROM (SELECT * FROM data ORDER BY idcat, date ) m GROUP BY idcat
I can't test conveniently atm, but try this:
SELECT FIRST(id), idcat, FIRST(name), FIRST(date) AS d FROM myTable GROUP BY idcat ORDER BY d;
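Note the use of the FIRST calls to pick the first row in the table with any particular idcat. FIRST() is an MS Access aggregate and MySQL has no such function, so on MySQL the query needs a join on MIN(date) per idcat instead.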
If you are trying to get groupings by the idcat, then sorted by date:
select id, idcat, name, date from myTable group by idcat order by date;
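A more portable way to get the desired output, as far as I can tell, is to compute the earliest date per idcat in a subquery and join back to the table, rather than relying on which row a bare GROUP BY happens to return. Below is a minimal sketch on the sample rows using Python's sqlite3 module. The table name data comes from the asker's query and the helper name earliest_per_idcat is made up.

```python
import sqlite3

ROWS = [
    (1, 3, "xx", "2011-01-02"),
    (2, 4, "xf", "2011-01-02"),
    (3, 3, "cd", "2011-01-01"),
    (4, 1, "cg", "2011-01-04"),
    (5, 4, "ce", "2011-01-06"),
]


def earliest_per_idcat(conn):
    """The row with the earliest date for each idcat, ordered by id."""
    return conn.execute(
        """
        SELECT d.id, d.idcat, d.name, d.date
        FROM data AS d
        JOIN (
            SELECT idcat, MIN(date) AS first_date
            FROM data
            GROUP BY idcat
        ) AS m
          ON m.idcat = d.idcat AND m.first_date = d.date
        ORDER BY d.id
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (id INTEGER PRIMARY KEY, idcat INTEGER, name TEXT, date TEXT)"
)
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", ROWS)

result = earliest_per_idcat(conn)
for row in result:
    print(row)
```

If two rows in the same idcat share the earliest date, this returns both, so a further tie-breaker (for example the lowest id) would be needed to guarantee one row per idcat.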