Data warehouse logical model - mysql

I have drawn a logical model for a hotel room booking OLTP system.
I'm having trouble querying it.
The question is: for each country and quarter, produce the cumulative income of 3-star rooms.
Note that RoomBand contains the number of stars for each room.
I have tried this:
SELECT Quarter, Country
FROM DT_Date, Country, RoomBand
where RoomBand = 3
The output was this:
CountyName Quarter
USA 2
Uk 1
USA 2
UK 1
As shown, USA and UK are repeated.
Why is that happening?
Thanks for the support, all.

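The rows repeat because the FROM clause lists three tables without any join condition, so every combination of their rows is returned. To remove the duplicates, try the DISTINCT keyword in your query. For example: SELECT DISTINCT column1, column2, ....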
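The repetition can be reproduced with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the tables and columns (country, dt_date, room_band) are hypothetical, since the real schema is not shown in the question. Listing tables with commas and no join condition pairs every row with every other row; DISTINCT collapses the repeats afterwards.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; the real schema is not shown in the question.
    CREATE TABLE country (country_name TEXT);
    CREATE TABLE dt_date (quarter INTEGER);
    CREATE TABLE room_band (room_id INTEGER, stars INTEGER);
    INSERT INTO country VALUES ('USA'), ('UK');
    INSERT INTO dt_date VALUES (1), (2);
    INSERT INTO room_band VALUES (101, 3), (102, 3), (103, 5);
""")

# Comma-separated tables with no join condition: a Cartesian product.
cross_rows = conn.execute("""
    SELECT country_name, quarter
    FROM dt_date, country, room_band
    WHERE stars = 3
""").fetchall()

# DISTINCT removes the repeated (country, quarter) pairs.
distinct_rows = conn.execute("""
    SELECT DISTINCT country_name, quarter
    FROM dt_date, country, room_band
    WHERE stars = 3
""").fetchall()

print(len(cross_rows))     # 8 rows: 2 quarters x 2 countries x 2 three-star rooms
print(len(distinct_rows))  # 4 rows: each pair once
```

Note that DISTINCT only hides the repetition; joining the tables on their keys avoids producing it in the first place.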

SQL: how to select where one column does not match another column for ALL records within a given group

I have a table named sales in a MySQL database that looks like this:
company   manufactured   shipped
Mercedes  Germany        United States
Mercedes  Germany        Germany
Mercedes  Germany        United States
Toyota    Japan          Canada
Toyota    Japan          England
Audi      Germany        United States
Audi      Germany        France
Audi      Germany        Canada
Tesla     United States  Mexico
Tesla     United States  Canada
Tesla     United States  United States
Here is a Fiddle: http://www.sqlfiddle.com/#!17/145ff/3
I would like to return the list of companies that ship ALL of their products internationally (that is, where the value in the manufactured column differs from the value in the shipped column for ALL records of a particular company).
Using the example above, the desired result set would be:
company
Toyota
Audi
Here is my (hackish) attempt:
WITH temp_table AS (
    SELECT
        s.company AS company
        , SUM(CASE
                  WHEN s.manufactured != s.shipped THEN 1
                  ELSE 0
              END
          ) AS count_international
        , COUNT(s.company) AS total_within_company
    FROM
        sales s
    GROUP BY
        s.company
)
SELECT
    company
FROM
    temp_table
WHERE count_international = total_within_company
Essentially, I count the instances where the columns do not match. Then I check whether the sum of those mismatched instances matches the number of records within a given group.
This approach works, but it's far from an elegant solution!
Can anyone offer advice as to a more idiomatic way to implement this query?
Thanks!
We can GROUP BY company and use a HAVING clause to require that every country in shipped differs from the country in manufactured:
SELECT company
FROM sales
GROUP BY company
HAVING COUNT(CASE WHEN manufactured = shipped THEN 1 END) = 0;
Try out here: db<>fiddle
The fiddle linked in the question uses a Postgres DB, but the question is tagged with MySQL as the DBMS.
In a MySQL DB, the above query can be simplified to:
SELECT company
FROM sales
GROUP BY company
HAVING SUM(manufactured = shipped) = 0;
In a Postgres DB, this shorthand is not possible, because a boolean cannot be summed without an explicit cast.
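As a quick check, both HAVING variants can be run against the sample data from the question. This sketch uses Python's built-in sqlite3 module in place of MySQL (an assumption); SQLite, like MySQL, evaluates a comparison to 0 or 1, so the SUM shorthand also runs there.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (company TEXT, manufactured TEXT, shipped TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Mercedes", "Germany", "United States"),
        ("Mercedes", "Germany", "Germany"),
        ("Mercedes", "Germany", "United States"),
        ("Toyota", "Japan", "Canada"),
        ("Toyota", "Japan", "England"),
        ("Audi", "Germany", "United States"),
        ("Audi", "Germany", "France"),
        ("Audi", "Germany", "Canada"),
        ("Tesla", "United States", "Mexico"),
        ("Tesla", "United States", "Canada"),
        ("Tesla", "United States", "United States"),
    ],
)

# Portable form: CASE without ELSE yields NULL, and COUNT ignores NULLs.
portable = [row[0] for row in conn.execute("""
    SELECT company
    FROM sales
    GROUP BY company
    HAVING COUNT(CASE WHEN manufactured = shipped THEN 1 END) = 0
    ORDER BY company
""")]

# Shorthand form: the comparison itself evaluates to 0 or 1.
shorthand = [row[0] for row in conn.execute("""
    SELECT company
    FROM sales
    GROUP BY company
    HAVING SUM(manufactured = shipped) = 0
    ORDER BY company
""")]

print(portable)   # ['Audi', 'Toyota']
print(shorthand)  # ['Audi', 'Toyota']
```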
You have to think in sets: you want every company that has no matching row, so find the companies that do have a match and display the rest.
SELECT DISTINCT company
FROM sales
WHERE company NOT IN (
SELECT company
FROM sales
WHERE manufactured = shipped
)
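The set-based query above can be tried the same way. This is a minimal sketch with Python's built-in sqlite3 module standing in for MySQL (an assumption) and a shortened version of the sample data. One general caveat with NOT IN: if the subquery can return a NULL, no row passes the test, so the approach relies on company never being NULL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (company TEXT, manufactured TEXT, shipped TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Mercedes", "Germany", "United States"),
        ("Mercedes", "Germany", "Germany"),
        ("Toyota", "Japan", "Canada"),
        ("Toyota", "Japan", "England"),
        ("Audi", "Germany", "France"),
        ("Tesla", "United States", "Mexico"),
        ("Tesla", "United States", "United States"),
    ],
)

# Step 1: companies with at least one domestic shipment (the "matches").
domestic = [row[0] for row in conn.execute("""
    SELECT DISTINCT company
    FROM sales
    WHERE manufactured = shipped
    ORDER BY company
""")]

# Step 2: every company that is not in that set.
international_only = [row[0] for row in conn.execute("""
    SELECT DISTINCT company
    FROM sales
    WHERE company NOT IN (
        SELECT company
        FROM sales
        WHERE manufactured = shipped
    )
    ORDER BY company
""")]

print(domestic)            # ['Mercedes', 'Tesla']
print(international_only)  # ['Audi', 'Toyota']
```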

Fetch fixed set of duplicate records from table

I want to fetch records from a table that contains duplicate records. For each set of duplicates, the output should contain only two records.
Example:
Name    Country
John    India
Mark    India
Chris   Russia
Feggy   England
Rain    Russia
Monesy  Russia
Bhumi   India
Peter   England
Bruice  England
Radhe   India
The output should contain at most two records from each set of duplicates. As the output below shows, each country appears only twice, and only the first two records of each duplicate set are kept:
Name    Country
John    India
Mark    India
Chris   Russia
Feggy   England
Rain    Russia
Peter   England
You can number the rows with a window function and select only the first N in each group.
The sort order should be chosen according to the business logic of the query.
For example:
;WITH numbered_name AS
(
    SELECT *
         , ROW_NUMBER() OVER (PARTITION BY t.Country ORDER BY t.Name) AS rn
    FROM your_table t  -- placeholder: substitute the real table name
)
SELECT Name
     , Country
FROM numbered_name
WHERE rn <= 2
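To reproduce the output shown in the question, the rows have to be numbered in their original order rather than by name. The sketch below assumes an auto-incrementing id column that records insertion order (hypothetical, as is the table name people), and uses Python's built-in sqlite3 module, which supports window functions from SQLite 3.25 onward.

```python
import sqlite3

# In-memory SQLite database; the id column and table name are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
)
conn.executemany(
    "INSERT INTO people (name, country) VALUES (?, ?)",
    [
        ("John", "India"), ("Mark", "India"), ("Chris", "Russia"),
        ("Feggy", "England"), ("Rain", "Russia"), ("Monesy", "Russia"),
        ("Bhumi", "India"), ("Peter", "England"), ("Bruice", "England"),
        ("Radhe", "India"),
    ],
)

# Number the rows per country in insertion order, then keep the first two.
first_two = conn.execute("""
    WITH numbered AS (
        SELECT id, name, country,
               ROW_NUMBER() OVER (PARTITION BY country ORDER BY id) AS rn
        FROM people
    )
    SELECT name, country
    FROM numbered
    WHERE rn <= 2
    ORDER BY id
""").fetchall()

for name, country in first_two:
    print(name, country)
# John India, Mark India, Chris Russia, Feggy England, Rain Russia, Peter England
```

Ordering by name instead, as in the query above, would keep Bhumi and John for India, which is why the sort column matters.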

How to set up crosstab queries to count days for negative stock counts?

Hello Stack Overflow (and anyone googling similar questions in the future)!
I have a dataset that regularly reports which products are absent on a warehouse stockcheck, which I am trying to use to analyse when stock is or isn’t available. I’m essentially trying to identify “Has a part been reported as missing? -> If so, count the number of days it is missing until another part in the same category is reported as missing, but the original part was not reported as missing on that date (as we can assume it’s back in stock)”.
I’ve managed to make this work in Excel, but my spreadsheet began to die from the calculation of 5 locations' worth of categories and parts, let alone across the 600+ I’m working on! As a result, I’m trying to set up a similar function in Access to analyse which parts were out of stock, and for how long.
My dataset looks something like:
Location number  Location    Category  Date reported  Part Number  Part Description  Order number
1                London      Car       03/06/2021     2021         Wheel A           1
2                London      Bus       03/06/2021     1491         Seat C            2
3                Manchester  Car       01/06/2021     2021         Wheel A           3
My assumptions are that:-
• My data is fed by individual workers who each cover a location, and check all stock for a random selection of categories each visit (with the idea that they cover all of their location’s categories within a certain number of visits) and record which parts are missing. There is no particular visit plan – it can be a random number of days between each visit. This data gets fed into a central table, which I have access to.
• As my workers may not check all categories in a location on each visit, I must assume that a previously reported missing part is OOS until they check products in the same category, but do not report that part again.
I made this work in Excel by setting up another column that concatenated my location, part number, and date reported, and then set up three tables (all of which essentially have locations, categories, and parts down the rows, and dates across the columns):-
• Table1, to look if my concatenated code was reported for each day (and if so, output 1 – essentially working in days) – essentially, was each part reported as missing for each category and location?
• Table2, to look if any parts were reported for each category, for each location – essentially, how many parts were reported for each category for each location, and a value greater than 0 means we can assume that that category at that location has been checked by my workers for that date.
• Table3, that for each location+category+day asked as a formula – IF(category was checked as per table2 = yes , pull the value of 1 for that part/location/category in table 1 , re-use yesterday’s value for this part/location/day in this table). For the 1st day in my date range, I used the values for table1 for that day as a “starting up” point.
When I look at table 3, I can visually see the runs of days that products were out of stock, and from there I can crunch the related numbers, which is what I want!
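The Table3 rule described above (take the Table1 value on days when the category was checked, otherwise re-use the previous day's value) can be sketched outside Excel. This is only an illustration in Python, not an Access solution; the sample rows and function names are hypothetical, and a part is assumed to be in stock until it is first reported.

```python
from datetime import date, timedelta

# Hypothetical sample: (location, category, date reported, part number).
reports = [
    ("London", "Car", date(2021, 6, 3), 2021),
    ("London", "Car", date(2021, 6, 5), 2021),
    ("London", "Car", date(2021, 6, 6), 2626),  # Car checked, 2021 not reported
]

def out_of_stock_flags(reports, start, end):
    """Map (location, category, part) to a list of 0/1 flags, one per day."""
    reported = set(reports)                                      # Table1
    checked = {(loc, cat, day) for loc, cat, day, _ in reports}  # Table2
    parts = {(loc, cat, part) for loc, cat, _, part in reports}
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    flags = {}
    for loc, cat, part in parts:
        row, value = [], 0  # assumption: in stock before the first report
        for day in days:
            if (loc, cat, day) in checked:
                # Category was checked: take the Table1 value for this part.
                value = 1 if (loc, cat, day, part) in reported else 0
            # Otherwise keep the previous day's value (Table3 carry-forward).
            row.append(value)
        flags[(loc, cat, part)] = row
    return flags

def running_days(row):
    """Turn 0/1 flags into a count of consecutive out-of-stock days."""
    counts, run = [], 0
    for flag in row:
        run = run + 1 if flag else 0
        counts.append(run)
    return counts

flags = out_of_stock_flags(reports, date(2021, 6, 1), date(2021, 6, 7))
wheel = flags[("London", "Car", 2021)]
print(wheel)                # [0, 0, 1, 1, 1, 0, 0]
print(running_days(wheel))  # [0, 0, 1, 2, 3, 0, 0]
```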
My initial Access plan was to set up three crosstab queries, to mirror my three Excel tables. I can make Table1 and Table2 very easily, but for the life of me can’t make Table3 work (I currently have a calculated expression that mirrors the formula I had in table 3, but something has gone amiss…).
I’m looking for a steer/advice on setting up the expression in my crosstab query, or other ideas/approaches I could use to calculate how long each part is missing for. Any help would be greatly appreciated, as I’ve lost my mind going in circles today!
Edit:-
Simplified dataset I'm working with:-
Location number  Location    Category  Date Reported  Part number  Part Description  Order number  Concatenate code  Concatenate Code 2
1                London      Car       03/06/2021     2021         Wheel             1             1443502021
1                London      Bus       03/06/2021     1491         Seat              2             1443501491
2                Manchester  Car       05/06/2021     2021         Wheel             3             2443522021
1                London      Car       05/06/2021     2021         Wheel             4             1443522021
1                London      Car       07/06/2021     2021         Wheel             5             1443542021
1                London      Bus       05/06/2021     1860         Seatbelt          6             1443521860
1                London      Bus       05/06/2021     1860         Seatbelt          7             1443521860
2                manchester  Bus       01/06/2021     1860         Seatbelt          8             2443481860
2                Manchester  Bus       06/06/2021     1860         Seatbelt          9             2443531860
2                manchester  Bus       04/06/2021     1491         Seat              10            2443511491
2                Manchester  Bus       06/06/2021     1491         Seat              11            2443531491
I'm trying to output something like (which I've made work in Excel):-
Location
Category
Part code
01/06/2021
02/06/2021
03/06/2021
04/06/2021
05/06/2021
06/06/2021
07/06/2021
1
London
Car
2021
1
1
1
1
1
London
Car
2626
1
London
Bus
1491
1
1
1
London
Bus
1860
1
1
2
Manchester
Car
2021
1
1
2
Manchester
Car
2626
2
Manchester
Bus
1491
1
1
1
2
Manchester
Bus
1860
1
1
1
1
3
Liverpool
Car
2021
3
Liverpool
Car
2626
3
Liverpool
Bus
1491
Or to return the value for how many concurrent days out of stock a part has been, like per day of this version:-
Location
Category
Part code
01/06/2021
02/06/2021
03/06/2021
04/06/2021
05/06/2021
06/06/2021
07/06/2021
1
London
Car
2021
1
2
3
4
1
London
Car
2626
1
London
Bus
1491
1
2
1
London
Bus
1860
1
2
2
Manchester
Car
2021
1
2
2
Manchester
Car
2626
2
Manchester
Bus
1491
1
2
3
2
Manchester
Bus
1860
1
2
3
1
3
Liverpool
Car
2021
3
Liverpool
Car
2626
3
Liverpool
Bus
1491
My Access SQL (that I then turned into a crosstab) to identify ordered parts per day:
SELECT DISTINCT T_stores.[Store Nos], T_stores.[Store Name], t_Stands.Brand, t_Productlookup.TPND, t_Productlookup.TITLE, t_gapdata.Quantity, t_gapdata.[Requested Date]
FROM ((T_stores
INNER JOIN t_Stands ON T_stores.[Store Nos] = t_Stands.[Store Nos])
INNER JOIN t_gapdata ON (t_Stands.[Brand] = t_gapdata.[Brand]) AND (t_Stands.[Store Nos] = t_gapdata.[Store No]))
INNER JOIN t_Productlookup ON t_gapdata.[Part Number] = t_Productlookup.[EAN];
And likewise, to identify if parts were ordered for a location's category:-
SELECT DISTINCT T_stores.[Store Nos], T_stores.[Store Name], t_Stands.Brand, t_Productlookup.TPND, t_Productlookup.TITLE, t_gapdata.Quantity, t_gapdata.[Requested Date]
FROM ((T_stores
INNER JOIN t_Stands ON T_stores.[Store Nos] = t_Stands.[Store Nos])
INNER JOIN t_gapdata ON (t_Stands.[Brand] = t_gapdata.[Brand]) AND (t_Stands.[Store Nos] = t_gapdata.[Store No]))
INNER JOIN t_Productlookup ON t_gapdata.[Part Number] = t_Productlookup.[EAN];
These first two work fine, but I'm struggling to put them together with some sort of Iif calculated field for a third query:-
SELECT First(q_gaps_per_product.[Store Nos]) AS [FirstOfStore Nos], First(q_gaps_per_product.[Store Name]) AS [FirstOfStore Name], First(q_gaps_per_product.Brand) AS FirstOfBrand, First(q_gaps_per_brand_store.[Order Id]) AS [FirstOfOrder Id], First(q_gaps_per_product.TPND) AS FirstOfTPND, First(q_gaps_per_product.TITLE) AS FirstOfTITLE, First(q_gaps_per_product.[Requested Date]) AS [FirstOfRequested Date], First(IIf([q_gaps_per_brand_store]![Requested Date]>=[q_gaps_per_product]![Requested Date],[Quantity],"PREVIOUS DAY")) AS Expr1, [q_gaps_per_product]![Store Nos] & [q_gaps_per_product]![Quantity] & [q_gaps_per_product]![TPND] AS Expr2
FROM q_gaps_per_product LEFT JOIN q_gaps_per_brand_store ON q_gaps_per_product.[Brand] = q_gaps_per_brand_store.[Brand]
GROUP BY [q_gaps_per_product]![Store Nos] & [q_gaps_per_product]![Quantity] & [q_gaps_per_product]![TPND];
Expr1 is supposed to be how many days a product is out of stock, with the idea that "PREVIOUS DAY" would return the same criteria for the previous day, to show either running gaps or that a product was in fact available as a 0, but I haven't got that far yet.
Expr2 is basically something I tried to make up to group the results by, as I had an insane number of results due to my janky table relationships.
I sort of think this query is DOA, and I need to go back to the drawing board to reproduce something like my Excel tables / how many days out of stock products have been concurrently out of stock before.
Sorry for the sheer storm of words!

How to get results from Mysql database using WHERE if there is more than 1 criterion for identification?

id points year country
-----------------------------------
1 45 1998 Mexico
2 45 2000 Germany
3 47 2010 Russia
4 45 1970 China
5 49 2010 Austria
I wonder how I can select rows that match only 2 values from the country column, for example only the records where country is Germany and Mexico. When only 1 country is the criterion, it is easy:
SELECT * FROM List WHERE Country='Mexico';
the result is:
id points year country
-----------------------------------
1 45 1998 Mexico
But when I try to get results where 2 countries are the criteria, the problems start. I tried:
SELECT * FROM List WHERE country='Mexico' AND Country='Germany';
SELECT * FROM List WHERE country='Mexico' AND 'Germany';
SELECT * FROM List WHERE country='Mexico','Germany';
SELECT * FROM List WHERE country='Mexico'AND WHERE country='Germany';
but none of them gives the desired result:
id points year country
-----------------------------------
1 45 1998 Mexico
2 45 2000 Germany
I understand that I may have made a logical error, because there is no single record where country is Mexico and Germany at the same time, and SQL may be interpreting the condition exactly that way. But how do I write this correctly in SQL: give me the records where the countries are Mexico and Germany? Thanks.
You are looking for the IN operator:
SELECT * FROM List WHERE Country in ('Mexico','Germany');
Just use OR.
So instead of
SELECT * FROM List WHERE country='Mexico' AND Country='Germany';
it would be
SELECT * FROM List WHERE country='Mexico' OR country='Germany';
IN is also a good operator to use, especially if you've got multiple values that you want to check against, but that's been covered in the other answers.
You need to use OR or IN. You have been using AND, which asks MySQL to find a row where country is both Mexico and Germany, and no such row exists.
SELECT * FROM List WHERE Country in ('Mexico','Germany');
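The difference between the three conditions can be checked against the sample table. This sketch uses Python's built-in sqlite3 module in place of MySQL (an assumption); the helper ids is hypothetical and only accepts the fixed condition strings written here.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE List (id INTEGER, points INTEGER, year INTEGER, country TEXT)"
)
conn.executemany(
    "INSERT INTO List VALUES (?, ?, ?, ?)",
    [
        (1, 45, 1998, "Mexico"),
        (2, 45, 2000, "Germany"),
        (3, 47, 2010, "Russia"),
        (4, 45, 1970, "China"),
        (5, 49, 2010, "Austria"),
    ],
)

def ids(condition):
    """Return the ids matched by a WHERE condition (fixed strings only)."""
    query = f"SELECT id FROM List WHERE {condition} ORDER BY id"
    return [row[0] for row in conn.execute(query)]

# AND: a single row would have to hold both values at once, so nothing matches.
with_and = ids("country = 'Mexico' AND country = 'Germany'")
# OR and IN: a row matches if its country is either value.
with_or = ids("country = 'Mexico' OR country = 'Germany'")
with_in = ids("country IN ('Mexico', 'Germany')")

print(with_and)  # []
print(with_or)   # [1, 2]
print(with_in)   # [1, 2]
```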
Try this:
SELECT * FROM List WHERE country='Mexico' OR Country='Germany';
SQL uses formal logic. Natural language does not.
When you say that you want the results for a list of countries, you need to specify it that way. This request corresponds to a logical OR: since the country can be one or the other, both are correct.
SELECT * FROM List WHERE Country = 'Mexico' OR Country = 'Germany'
To prevent further mistakes like these, I recommend that you look up logical operations in the docs (they are very good). The MySQL or the PostgreSQL docs should both be fine.

Searching for data that can be in two different column (query/design)

Sorry if the title is not clear. I am a bit confused about how to plan my database schema; given my database design skill level, the requirement counts as fairly advanced :) I could really use some help here. Anyway, here it goes ...
I need to track match details for teams. For the sake of simplicity, let's say I need to track the match date, result and the teams that played the match. Now, how do I design my tables so that all relevant data is returned without having to keep multiple records of the same match? I am not sure if I am explaining this clearly, so here's an example below.
match_id team1 team2 result
________ ________ ________ ________
1 Arsenal Chelsea 5-3
2 Manchester Utd Arsenal 1-0
3 Liverpool Newcastle 2-0
4 Arsenal Everton 1-0
From this data, if I search for the match_ids of matches played by Arsenal, I should get the results below:
1,2,4.
Now, in the basic designs I know of, I would normally search the team name column for the supplied team name and return the result. But here the team name can be in two different columns, and both can be relevant. So, is this something I need to decide at the design level, or something that can be done with some sort of query?
(Note: Storing teams as home/away is not an option for my requirement).
You can just query both columns, it's not a problem:
select match_id
from matches
where team1 = 'Arsenal' or team2 = 'Arsenal';
(You could also normalize this schema by placing teams in their separate table and leaving only their ids in the matches table, but that doesn't change much, you still have to query both columns. Read about database normalization, any SQL book covers this).
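As a small check of the two-column query against the sample matches, here is a sketch using Python's built-in sqlite3 module as a stand-in for the real database (an assumption). The team name is passed as a bound parameter twice, once per column.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches (match_id INTEGER PRIMARY KEY, "
    "team1 TEXT, team2 TEXT, result TEXT)"
)
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?, ?)",
    [
        (1, "Arsenal", "Chelsea", "5-3"),
        (2, "Manchester Utd", "Arsenal", "1-0"),
        (3, "Liverpool", "Newcastle", "2-0"),
        (4, "Arsenal", "Everton", "1-0"),
    ],
)

def matches_played_by(team):
    """Return the ids of matches where the team appears in either column."""
    rows = conn.execute(
        "SELECT match_id FROM matches "
        "WHERE team1 = ? OR team2 = ? ORDER BY match_id",
        (team, team),
    )
    return [row[0] for row in rows]

print(matches_played_by("Arsenal"))  # [1, 2, 4]
print(matches_played_by("Chelsea"))  # [1]
```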
If there are always two teams per match, then I think you did a good job here, and when querying for a particular team, you'll want to search for one column OR the other (SELECT match_id FROM matches WHERE team1 = "?" OR team2 = "?").
One note though: I would definitely split up the score into two columns:
match_id team1 team2 score1 score2
________ ______________ _________ ______ ______
1 Arsenal Chelsea 5 3
2 Manchester Utd Arsenal 1 0
3 Liverpool Newcastle 2 0
4 Arsenal Everton 1 0
This way you'll be able to query on scores later on, if you need it. (e.g. Big wins = SELECT match_id FROM matches WHERE ABS(score1 - score2) > 3;)
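A brief sketch of what the split score columns make possible, using Python's built-in sqlite3 module as a stand-in database (an assumption). The margin of 2 is chosen only so that the sample data returns rows; none of the four sample matches has a margin above 3.

```python
import sqlite3

# In-memory SQLite database standing in for the real one (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches (match_id INTEGER PRIMARY KEY, team1 TEXT, "
    "team2 TEXT, score1 INTEGER, score2 INTEGER)"
)
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Arsenal", "Chelsea", 5, 3),
        (2, "Manchester Utd", "Arsenal", 1, 0),
        (3, "Liverpool", "Newcastle", 2, 0),
        (4, "Arsenal", "Everton", 1, 0),
    ],
)

# With integer columns, the winning margin is plain arithmetic.
two_goal_margins = [row[0] for row in conn.execute("""
    SELECT match_id
    FROM matches
    WHERE ABS(score1 - score2) >= 2
    ORDER BY match_id
""")]

# Aggregates work too, for example total goals per match.
total_goals = dict(conn.execute(
    "SELECT match_id, score1 + score2 FROM matches"
))

print(two_goal_margins)  # [1, 3]
print(total_goals)       # {1: 8, 2: 1, 3: 2, 4: 1}
```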
The other option should be more scalable if there is a possibility of having more than two teams per match. In that case, you'd likely want to remove the uniqueness constraint on match_id and reduce the team and score columns from two each to one each:
match_id  team            score
________  ______________  _____
1         Arsenal         5
1         Chelsea         3
2         Manchester Utd  1
2         Arsenal         0
3         Liverpool       2
3         Newcastle       0
4         Arsenal         1
4         Everton         0
And of course, you're definitely going to want to take Sergio's advice in putting all this stuff into separate tables. "Teams" are likely going to have different attributes (hometown, coach name, etc.), and you're not going to want to duplicate that data.
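With one row per team per match, the search needs only a single column. The sketch below uses Python's built-in sqlite3 module as a stand-in database; the table name match_teams is hypothetical, and the scores follow the results given in the question (so Everton has 0 in match 4).

```python
import sqlite3

# In-memory SQLite database; match_teams is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE match_teams (match_id INTEGER, team TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO match_teams VALUES (?, ?, ?)",
    [
        (1, "Arsenal", 5), (1, "Chelsea", 3),
        (2, "Manchester Utd", 1), (2, "Arsenal", 0),
        (3, "Liverpool", 2), (3, "Newcastle", 0),
        (4, "Arsenal", 1), (4, "Everton", 0),  # 1-0, as in the question
    ],
)

# Finding a team's matches is now a single-column lookup.
arsenal_matches = [row[0] for row in conn.execute(
    "SELECT match_id FROM match_teams WHERE team = ? ORDER BY match_id",
    ("Arsenal",),
)]

# A self-join on match_id recovers the opponents.
opponents = [row[0] for row in conn.execute("""
    SELECT other.team
    FROM match_teams AS mine
    JOIN match_teams AS other
      ON other.match_id = mine.match_id AND other.team <> mine.team
    WHERE mine.team = ?
    ORDER BY mine.match_id
""", ("Arsenal",))]

print(arsenal_matches)  # [1, 2, 4]
print(opponents)        # ['Chelsea', 'Manchester Utd', 'Everton']
```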
This will give you the results you want but there may be a better design too.
Select *
from matches
where (team1 = 'ARSENAL' or Team2 = 'ARSENAL')
You may want to separate out the scores into columns such as Team1Score and Team2Score; otherwise you can't easily do math with them.
For a start, I would not store the team name. I think it's better to store the teams in another table and link them through an id.
Then you could create a table with the columns id, match id and team id, and just search for the team id in that table!
You can use this query; it should cause no problems for your program:
select YOUR_ID_FIELDS
from YOUR_TABLE_NAME
where YOUR_FIELD_NAME(team1) = 'Arsenal' or YOUR_FIELD_NAME(team2) = 'Arsenal';
And, for example, for Chelsea:
select YOUR_ID_FIELDS
from YOUR_TABLE_NAME
where YOUR_FIELD_NAME(team1) = 'Chelsea' or YOUR_FIELD_NAME(team2) = 'Chelsea';