I need help with a MySQL query. We have a database (~10K rows) which I have simplified down to this problem.
We have 7 truck drivers who visit 3 out of a possible 9 locations, daily. Each day they visit exactly 3 different locations and each day they can visit different locations than the previous day. Here are representative tables:
Table: Drivers
id name
10 Abe
11 Bob
12 Cal
13 Deb
14 Eve
15 Fab
16 Guy
Table: Locations
id day address driver.id
1 1 Oak 10
2 1 Elm 10
3 1 4th 10
4 1 Oak 16
5 1 4th 16
6 1 Toy 16
7 1 Toy 11
8 1 5th 11
9 1 Law 11
10 2 Oak 11
11 2 4th 11
12 2 Toy 11
.........
We have data for a full year and we need to find out how many times each "route" is visited over a year, sorted from most to least.
From my high school math, I believe there are 9!/(6!3!) route combinations, or 84 in total. I want do something like:
Get count of routes where route addresses = 'Oak' and 'Elm' and '4th'
then run again
where route addresses = 'Oak' and 'Elm' and '5th'
then again and again, etc.Then sort the route counts, descending. But I don't want to do it 84 times. Is there a way to do this?
I'd be looking at GROUP_CONCAT
SELECT t.day
, t.driver
, GROUP_CONCAT(t.address ORDER BY t.address)
FROM mytable t
GROUP
BY t.day
, t.driver
What's not clear here, if there's an order to the stops on the route. Does the sequence make a difference, and how to we tell what the sequence is? To ask that a different way, consider these two routes:
('Oak','Elm','4th') and ('Elm','4th','Oak')
Are these equivalent (because it's the same set of stops) or are they different (because they are in a different sequence)?
If sequence of stops on the route distinguishes it from other routes with the same stops (in a different order), then replace the ORDER BY t.address with ORDER BY t.id or whatever expression gives the sequence of the stops.
Some caveats with GROUP_CONCAT: the maximum length is limited by the setting of group_concat_max_len and max_allowed_packet variables. Also, the comma used as the separator... if we combine strings that contain commas, then in our result, we can't reliably distinguish between 'a,b'+'c' and 'a'+'b,c'
We can use that query as an inline view, and get a count of the the number of rows with identical routes:
SELECT c.route
, COUNT(*) AS cnt
FROM ( SELECT t.day
, t.driver
, GROUP_CONCAT(t.address ORDER BY t.address) AS route
FROM mytable t
GROUP
BY t.day
, t.driver
) c
GROUP
BY c.route
ORDER
BY cnt DESC
Related
I have been at this for a few days without much luck and I am looking for some guidance on how to get the lowest estimate from a particular group of sullpiers and then place it into another table.
I have 4 supplier estimate on every piece of work and all new estimates go into a single table, i am trying to find the lowest 'mid' price from the 4 newsest entries in the 'RECENT QUOTE TABLE' with a group id of '1' and then place that into the 'LOWEST QUOTE TABLE' as seen below.
RECENT QUOTE TABLE:
suppid group min mid high
1 1 200 400 600
2 1 300 500 700
3 1 100 300 500
[4] [1] 50 [150] 300
5 2 1000 3000 5000
6 2 3000 5000 8000
7 2 2000 4000 6000
8 2 1250 3125 5578
LOWEST QUOTE TABLE:
suppid group min mid high
4 1 50 150 300
Any help on how to structure this would be great as i have been loking for a few days and have not been able to find anything to get me moving again, im using MYSQL and the app is made in Python im open to all suggestions.
Thanks in advance.
If you really want to select only row with group 1, you can do something like
INSERT INTO lowest_quote_table
SELECT * FROM recent_quote_table
WHERE `group` = 1
ORDER BY `mid` ASC
LIMIT 1.
If you want a row with the lowest mid from every group, you can do something like
INSERT INTO lowest_quote_table
SELECT rq.* FROM recent_quote_table AS rq
JOIN (
SELECT `group`, MIN(`mid`) AS min_mid FROM recent_quote_table
GROUP BY `group`
) MQ ON rq.`group` = MQ.`group` AND rq.`mid` = MQ.min_mid
At first I would like greet all Users and apologize for my english :).
I'm new user on this forum.
I have a question about MySQL queries.
I have table Items with let say 2 columns for example itemsID and ItemsQty.
itemsID ItemsQty
11 2
12 3
13 3
15 5
16 1
I need select itemsID but duplicated as many times as indicated in column ItemsQty.
itemsID ItemsQty
11 2
11 2
12 3
12 3
12 3
13 3
13 3
13 3
15 5
15 5
15 5
15 5
15 5
16 1
I tried that query:
SELECT items.itemsID, items.itemsQty
FROM base.items
LEFT OUTER JOIN
(
SELECT items.itemsQty AS Qty FROM base.items
) AS Numbers ON items.itemsQty <=Numbers.Qty
ORDER BY items.itemsID;
but it doesn't work correctly.
Thanks in advance for help.
SQL answer - Option 1
You need another table called numbers with the numbers 1 up to the maximum for ItemsQuantity
Table: NUMBERS
1
2
3
4
5
......
max number for ItemsQuantity
Then the following SELECT statement will work
SELECT ItemsID, ItemsQty
FROM originaltable
JOIN numbers
ON originaltable.ItemsQty >= numbers.number
ORDER BY ItemsID, number
See this fiddle -> you should always set-up a fiddle like this when you can - it makes everyone's life easier!!!
code answer - option 2
MySQL probably won't do what you want 'cleanly' without a second table (although some clever person might know how)
What is wrong with doing it with script?
Just run a SELECT itemsID, ItemsQty FROM table
Then when looping through the result just do (pseudo code as no language specified)
newArray = array(); // new array
While Rows Returned from database{ //loop all rows returned
loop number of times in column 'ItemsQty'{
newArray -> add 'ItemsID'
}
}//end of while loop
This will give you a new array
0 => 11
1 => 11
2 => 12
3 => 12
4 => 12
5 => 13
etc.
Select DISTINCT items.itemsID, items.itemsQty From base.items left outer join (select items.itemsQty as Qty from base.items) As Numbers On items.itemsQty <=Numbers.Qty
order by items.itemsID;
Use DISTINCT to remove duplicates. Read more here - http://dev.mysql.com/doc/refman/5.0/en/select.html
It seems like I understood what you asked differently than everyone else so I hope I answer you question. What I would basically do is -
create a new table for those changes.
Create a mysql procedure which given a line in the original table add new lines to the new table - http://dev.mysql.com/doc/refman/5.6/en/loop.html
Run this procedure for each line in the original table.
try this to get distinct values from both columns
SELECT DISTINCT itemsID FROM items
UNION
SELECT DISTINCT itemsQty FROM items
I've a tree structure, and its subsequent assignment table for customer categories in an sql server database.
CustomerCategory (CategoryID, ParentId)
CustomerInCategory(CustomerID, CategoryID)
If a CustomerCategory has any customer assigned to it, we can't add another subcategory to it. So, Customer can only be added to the lowest level in every sub tree. In other sense, the result of this query
SELECT * FROM `CustomerCategory` WHERE `CategoryId` NOT IN
(SELECT DISTINCT `parentid` FROM `CustomerCategory` WHERE `parentid` IS NOT NULL)
would yield leaf nodes. The Other thing is that, this tree might have subtrees of different levels, and we also, don't want to bound the number of levels in anyway, however, our users won't need more than 10 levels. Consider this as an illustration
CategoryID------ParentID---------------Name
1 NULL All Customers
2 1 Domestic
3 1 International
4 2 Independent Retailers
5 2 Chain Retailers
6 2 Whole Sellers
7 5 A-Mart
8 5 B-Mart
9 4 Grocery Stores
10 4 Restaurants
11 4 Cafes
CustomerID---------CustomerName----------Category
1 Int.Customer#1 3
2 Int.Customer#2 3
3 A-Mart.Branch#1 7
4 A-Mart.Branch#2 7
5 B-Mart.Branch#1 8
6 B-Mart.Branch#2 8
7 Grocery#1 9
8 Grocery#2 9
9 Grocery#3 9
10 Restaurant#1 10
11 Restaurant#2 10
12 Cafe#1 11
13 Wholeseller#1 6
14 Wholeseller#2 6
My requirement is something like this, "Given a node in Categories, Return All the Customers attached to any node below it".
How can I do it with sql?
Obviously this can be done with a recursive call in the code, but how can we do it in t-sql (without calling a stored procedure several times or using text-based search)?
Can any body, Use a CTE to solve this problem?
I have a result set of something like this in mind
CustomerID--------Customer Name----------------CategoryId----------CAtegoryName
12 Cafe#1 11 Cafes
12 Cafe#1 4 IndependentRetailers
12 Cafe#1 2 Demoestic
12 Cafe#1 1 AllCustomers
.
.
.
4 A-Mart.Branch#2 7 A-Mart
4 A-Mart.Branch#2 5 Chain Retailers
4 A-Mart.Branch#2 2 Domestic
4 A-Mart.Branch#2 1 All Customers
.
.
.
14 Wholeseller#2 6 WholeSellers
14 Wholeseller#2 2 Domestic
14 Wholeseller#2 1 All Customers
This is not necessarily a good Idea to layout a result like this, This would consume too much space, something that might not be required, yet, a search in such result set would be very fast. If I want to find all the customers below say categoryId = 2 , I would simply query
SELECT * FROM resultset where category ID = 2
Any suggestions to improve the data model is super welcomed! If It helps solving this problem.
Once again, I'm not fixated on this result set. Any other Suggestion that solves the problem,
"Given a node in Categories, Return All the Customers attached to any node below it", is well accepted.
You can use a CTE to recursively build a table containing all the parent-child relationships and use the where clause to get only the subtree you need (in my example, everyting under CategoryId 5) :
WITH CategorySubTree AS (
SELECT cc.CategoryId as SubTreeRoot,
cc.CategoryId
FROM CustomerCategory cc
UNION ALL
SELECT cst.SubTreeRoot, cc.CategoryId
FROM CustomerCategory cc
INNER JOIN CategorySubTree cst ON cst.CategoryId = cc.parentId
)
SELECT cst.CategoryId
FROM CategorySubTree cst
WHERE cst.SubTreeRoot = 5
You can modify this query to add whatever you need, for example, to get customers linked to the category nodes in the subtree :
WITH CategorySubTree AS (
SELECT cc.CategoryId as SubTreeRoot,
cc.CategoryId
FROM CustomerCategory cc
UNION ALL
SELECT cst.SubTreeRoot, cc.CategoryId
FROM CustomerCategory cc
INNER JOIN CategorySubTree cst ON cst.CategoryId = cc.parentId
)
SELECT cst.CategoryId,cic.CustomerId
FROM CategorySubTree cst
INNER JOIN CustomerInCategory cic ON cic.CategoryId = cst.CategoryId
WHERE cst.SubTreeRoot = 5
And of course you can join further tables to get labels and other needed information.
I've the following database structure:
id idproperty idgbs
1 1 136
2 1 128
3 1 10
4 1 1
5 2 136
6 2 128
7 2 10
8 2 1
9 3 561
10 3 560
11 3 10
12 3 1
13 4 561
14 4 560
15 4 10
16 4 1
17 5 234
18 5 120
19 5 1
20 6 234
21 6 120
22 6 1
Here are the details:
The table refers idproperty with different geographic location. For example:
idgbs
1 refers to United States
10 refers to Alabama with parentid 1 (United States)
128 refers to Alabama Gulf Coast with parentid 10 (Alabama)
136 Dauphin Island with parentid 128 (Alabama Gulf Coast)
So, the structure is:
United States > Alabama > Alabama Gulf Coast > Dauphin Island
I want to delete all entries for idproperty EXCEPT the first with the set of idgbs 136, 128, 10, 1 i.e. leave atleast 1 property in all GBS and delete others.
Also, sometimes it is 4 level of geographic entries, sometimes it is 3 level.
Please share the logic & SQL query to delete all entries except one in every unique GBS.
GBS 1, 10, 128, 136 is one unique, so database should only contain 1 property id with these GBS.
After the query, the table would look like this:
id idproperty idgbs
1 1 136
2 1 128
3 1 10
4 1 1
9 3 561
10 3 560
11 3 10
12 3 1
17 5 234
18 5 120
19 5 1
Rephrasing the question:
I want to keep properties in every root level GBS i.e. there should be only ONE property in Dauphin Island.
Whew... I think I understand what you are after now. I couldn't let this one go ;-)
I had to realize that in the question, you wanted property 2 deleted, because it shared a hierarchy with property 1. Once I realized that, I got the following idea. Basically, we join to an aggregated version of self twice: the first one tells us what our "gbs hierarchy path" is, and the second one matches any previous properties with the same hierarchy. Rows which find that there are no "previous" properties that share their hierarchy are spared, the rest with that hierarchy are deleted. It's possible that this could be further tweaked, but I wanted to share this now. I have tested it with the data you showed, and I got the results you posted.
DELETE
each_row.*
FROM property_gbs AS each_row
JOIN ( SELECT
idproperty,
GROUP_CONCAT(idgbs ORDER BY idgbs DESC SEPARATOR "/") AS idgbs_path
FROM property_gbs
GROUP BY idproperty
) AS mypath
USING(idproperty)
LEFT JOIN ( SELECT
idproperty,
GROUP_CONCAT(idgbs ORDER BY idgbs DESC SEPARATOR "/") AS idgbs_path
FROM property_gbs
GROUP BY idproperty
) AS previous_property
ON mypath.idgbs_path = previous_property.idgbs_path
AND previous_property.idproperty < each_row.idproperty
WHERE previous_property.idproperty
Note that the last line is not a typo, we are just checking if there is a previous property with the same path. If there is, then delete the currently-evaluated row.
Cheers!
note for clarification
The thought here is to associate every row with it's hierarchy, even if it's a row which represents somewhere in the middle of the hierarchy (such as row: {2, 1, 128} in the question). With the first join to the aggregate, each row now "knows" what it's path is (so that row would get "136/128/10/1"). We can then use that value in the second join to find other properties with the same path, but only if they have a LOWER property id. This allows us to check for the existence of a lower-ID property with the same "path", and delete any row which represents a property which does have such a "lower-order path-sibling".
I'm really not sure. But try this.
DELETE a1 FROM table a1, table a2
WHERE a1.id > a2.id
AND a1.idgbs = a2.idgbs
AND a1.idgbs <> 1
if you want to keep the row with the lowest id.
This one was difficult #dang but I enjoyed the challenge.
;With [CTE] as (Select id ,idproperty ,idgbs ,Row_Number() Over(Partition By idgbs order by idproperty Asc) as RN From [TableGBS])
,[CTE2] as (Select * From [CTE] Where RN > 1)
,[CTE3] as (Select idproperty ,count(*) as [Count] From [CTE2] Group by idproperty)
Delete from [TableGBS] Where id in (Select a.id From [CTE] as a Left Join [CTE3] as b on a.idproperty = b.idproperty Where RN > 1 And [Count] > 2);
Since i dont think you can do a delete statement in sqlfiddle here are the rows it will delete showing in a select statement: http://sqlfiddle.com/#!3/08108/40
Edit: I use MySQL linked to Microsoft SQL Server Management Studio so this might not work for you
This technique joins the table against an aggregated version of itself, essentially matching each row in the table to the knowledge of which idproperty is the lowest for it's idgbs, and deletes the row if it does not share that idproperty (i.e. the row with the lowest idproperty, when joined to itself, will not be deleted, but the rest of the rows with that idgbs will).
DELETE
each_row.*
FROM table AS each_row
JOIN (select MIN(idproperty), idgbs FROM table GROUP BY idgbs) as lowest_id
USING(idgbs)
WHERE each_row.idproperty != lowest_id.idproperty;
I have table structure as follows
id productid ip hittime
---------------------------------------------------------------------------
1 5 1.1.1.1 2011-05-03 06:55:11
2 5 1.1.1.1 2011-05-03 06:57:11
3 6 2.2.2.2 2011-05-03 07:30:00
4 4 1.1.1.1 2011-05-03 07:32:54
5 5 2.2.2.2 2011-05-03 07:55:00
Now I need query such that, it output me total and unique hits for each product
productid totalhits uniquehits
------------------------------------------------------------------
4 1 1
5 3 2
6 1 1
Criteria for
Total Hits = all the records that belong to particular product
Unique Hits = 2 hits are identified as unique hits if (1) IP is different or (2) for same ip, there is difference of 5 mins in hittime
How can I achieve this?
rMX was extremely close with his solution, it's quite clever. He should really get the credit, I just tweaked it slightly to add in a couple missing pieces:
select productid, count(*) totalhits,
count(distinct
concat(ip,
date_format(hittime, '%Y%m%d%H'),
round(date_format(hittime, '%i') / 5) * 5)
) uniquehits
from table
group by productid
Changes I made to rMX's idea:
Changed ceil() to round() because
ceil/floor will cause edge cases to
be treated improperly
Multiply the results of the round()
by 5. I think rMX meant to do this
and just forgot to type it.
EDIT: The multiplying by 5 really isn't necessary. My brain was just muddled. Changing ceil() to round() still matters though.
UPD>
select productid, count(*) totalhits,
count(distinct
concat(ip,
date_format(hittime, '%Y%m%d%H'),
ceil(date_format(hittime, '%i') / 5))
) uniquehits
from table
group by productid
I think, this should work. Sorry, had no time to test it.