Teradata Max per Partition with non-adjacent duplicates - duplicates

Okay, this one might be a little tricky, so let me start with a visual.
Here's what the data looks like:
[Image: Original Data From Source]
I'm trying to simplify it so that it looks like this:
[Image: End Result that I'm working towards]
The problem is that I have an employee who changed back to a previous manager, so when I try to partition and group the data, those two instances get combined and I end up with data that looks like this:
[Image: Actual Results]
In the image above, Manager Tom has a Start and End Date that falls within the Start and End Date of Manager Bob, which is an error. Any suggestions on how to isolate the grouping of an item that gets reintroduced at a later time? I believe this would be determined by Start Date and a rank over a partition, but I can't seem to get it to work.
Here's the query to build the sample data:
CREATE VOLATILE TABLE VT_AGENT
(
EmpID INT
,Manager VARCHAR(16)
,Director VARCHAR(16)
,Record_Start DATE
,Record_End DATE
) ON COMMIT PRESERVE ROWS;
INSERT INTO vt_agent VALUES(12345678, 'Jill M.', 'Mike B.', '2019-08-21', '2019-09-07');
INSERT INTO vt_agent VALUES(12345678, 'Jill M.', 'Mike B.', '2019-09-07', '2019-09-16');
INSERT INTO vt_agent VALUES(12345678, 'Bob S.', 'Mike B.', '2019-09-16', '2019-10-15');
INSERT INTO vt_agent VALUES(12345678, 'Bob S.', 'Mike B.', '2019-10-15', '2019-11-23');
INSERT INTO vt_agent VALUES(12345678, 'Tom A.', 'Mike B.', '2019-11-23', '2019-12-07');
INSERT INTO vt_agent VALUES(12345678, 'Tom A.', 'Mike B.', '2019-12-07', '2019-12-12');
INSERT INTO vt_agent VALUES(12345678, 'Bob S.', 'Mike B.', '2019-12-12', '2020-01-15');
INSERT INTO vt_agent VALUES(12345678, 'Bob S.', 'Mike B.', '2020-01-15', '9999-12-31');
Select * FROM VT_AGENT

Assuming your last insert had the typos mentioned in the comments, you can make use of Teradata's Period data type (and functions) to make this super simple:
SELECT NORMALIZE
empid,
manager,
director,
PERIOD(record_start, record_end) as valid_period
FROM VT_AGENT;
What this is doing is constructing a PERIOD column type from the record_start and record_end dates. Then we use the NORMALIZE keyword to compress periods where all other non-period columns are equal across more than one record. The result is a single record with the expanded period. This works only when the periods in those matching records meet (the end of one stops at the start of the next) or overlap (the end of one is after the start of the next).
With the assumed typo corrected, this outputs:
+----------+---------+----------+--------------------------+
| EmpID | Manager | Director | valid_period |
+----------+---------+----------+--------------------------+
| 12345678 | Bob S. | Mike B. | (2019-09-16, 2019-11-23) |
| 12345678 | Bob S. | Mike B. | (2019-12-12, 9999-12-31) |
| 12345678 | Jill M. | Mike B. | (2019-08-21, 2019-09-16) |
| 12345678 | Tom A. | Mike B. | (2019-11-23, 2019-12-12) |
+----------+---------+----------+--------------------------+
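To make the merge rule concrete, here is a minimal Python sketch of the behaviour described above. It is not Teradata's implementation, only an illustration of the rule; the `normalize` and `d` helpers are names invented for this example. Rows whose non-period columns are equal are merged when one period meets or overlaps the next, and a gap starts a new period.

```python
from datetime import date
from itertools import groupby


def normalize(rows):
    """Merge periods that meet or overlap when all non-period columns match.

    Each row is (key, start, end); key is a tuple of the non-period columns.
    """
    merged = []
    # Sorting by (key, start, end) puts merge candidates next to each other.
    for key, group in groupby(sorted(rows), key=lambda row: row[0]):
        periods = [(start, end) for _, start, end in group]
        cur_start, cur_end = periods[0]
        for start, end in periods[1:]:
            if start <= cur_end:  # meets (==) or overlaps (<)
                cur_end = max(cur_end, end)
            else:  # a gap: close the current period and start a new one
                merged.append((key, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((key, cur_start, cur_end))
    return merged


def d(text):
    """Hypothetical shorthand for parsing an ISO date string."""
    return date.fromisoformat(text)


emp = 12345678
rows = [
    ((emp, "Jill M.", "Mike B."), d("2019-08-21"), d("2019-09-07")),
    ((emp, "Jill M.", "Mike B."), d("2019-09-07"), d("2019-09-16")),
    ((emp, "Bob S.", "Mike B."), d("2019-09-16"), d("2019-10-15")),
    ((emp, "Bob S.", "Mike B."), d("2019-10-15"), d("2019-11-23")),
    ((emp, "Tom A.", "Mike B."), d("2019-11-23"), d("2019-12-07")),
    ((emp, "Tom A.", "Mike B."), d("2019-12-07"), d("2019-12-12")),
    ((emp, "Bob S.", "Mike B."), d("2019-12-12"), d("2020-01-15")),
    ((emp, "Bob S.", "Mike B."), d("2020-01-15"), d("9999-12-31")),
]

result = normalize(rows)
for key, start, end in result:
    print(*key, start, end)
```

Because Bob's second stint starts on 2019-12-12, after his first one ended on 2019-11-23, the two stints stay separate, which is the behaviour the question asks for.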

How to merge rows without null fields

I'm still learning MySQL and I have this table...
staff_id | Name | Monday | Tuesday
-------: | :---- | :------------ | :------------
1 | Mark | 8:00am-5:00pm | 9:00am-6:00pm
2 | Steve | 9:00am-6:00pm | 7:00am-4:00pm
I have managed to split and insert new rows into the same table using scripts like below.
INSERT INTO table (staff_id,Name,Monday)
(SELECT staff_id, Name, "8:00am-1:00pm"
FROM table
WHERE Monday= "8:00am-5:00pm");
INSERT INTO table (staff_id,Name,Monday)
(SELECT staff_id, Name, "2:00pm-5:00pm"
FROM table
WHERE Monday= "8:00am-5:00pm");
...etc...
It works, but I end up with too many rows (I'm working with thousands of rows).
Is there a way I can get a table like this below using MySQL scripts only?
staff_id | Name | Monday | Tuesday
-------: | :---- | :------------ | :------------
1 | Mark | 8:00am-1:00pm | 9:00am-2:00pm
1 | Mark | 2:00pm-5:00pm | 3:00pm-6:00pm
2 | Steve | 9:00am-2:00pm | 7:00am-12:00pm
2 | Steve | 3:00pm-6:00pm | 1:00pm-4:00pm
Other solutions suggest using aggregate functions (MAX), which unfortunately won't work in my case, and I can't figure out how to properly use JOINs for this purpose.
Any help would be really appreciated.
You have to use UPDATE and not INSERT INTO
CREATE TABLE mytable
(`staff_id` int, `Name` varchar(5), `Monday` varchar(13), `Tuesday` varchar(13))
;
INSERT INTO mytable
(`staff_id`, `Name`, `Monday`, `Tuesday`)
VALUES
(1, 'Mark', '8:00am-5:00pm', '9:00am-6:00pm'),
(2, 'Steve', '9:00am-6:00pm', '7:00am-4:00pm')
;
UPDATE mytable SET Monday = "8:00am-1:00pm"
WHERE Monday= "8:00am-5:00pm";
UPDATE mytable SET Monday = "8:00am-1:00pm"
WHERE Monday= "9:00am-6:00pm";
SELECT * FROM mytable
staff_id | Name | Monday | Tuesday
-------: | :---- | :------------ | :------------
1 | Mark | 8:00am-1:00pm | 9:00am-6:00pm
2 | Steve | 8:00am-1:00pm | 7:00am-4:00pm

MySQL - Insert only when certain value does not exist

Let's say I have this table
ID | Name | Hobby
---------------------------
1 | Alex | fishing
2 | Alex | soccer
3 | Nick | bike
4 | George | hike
ID - is unique. Hobby - is NOT unique (need to keep it as non-unique)
Inserting a record:
INSERT INTO my_table (ID, Name, Hobby) VALUES ('5', 'Christina', 'bike')
How should I modify the query so that the record is inserted only if the value bike does not exist anywhere in the Hobby column?
In other words:
VALUES ('5', 'Christina', 'bike') - would NOT be inserted as 3 | Nick | bike exists
VALUES ('5', 'Christina', 'cooking') would be inserted as cooking is not present in Hobby column at all.
Since the existing database has thousands of records, there is a risk that Hobby already contains duplicates.
But from now on, when adding new records, I want to skip the insert if the hobby already exists.
Thank you.
The easiest solution could be changing the Hobby column to unique. This way you force the database to accept only unique hobbies. Another solution could be using triggers before insert / update.
Based on "MySQL: Insert record if not exists in table", but with some corrections (to fix the duplicate errors), the following query works for me.
INSERT INTO my_table (ID, Name, Hobby)
SELECT * FROM (SELECT '5' AS ID, 'Christina' AS Name, 'cooking' AS Hobby) AS tmp
WHERE NOT EXISTS (
SELECT Name FROM my_table WHERE Hobby = 'cooking'
) LIMIT 1;
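As a rough way to check the NOT EXISTS pattern, here is a small Python sketch that runs it against an in-memory SQLite database. SQLite is used only as a stand-in for MySQL (an assumption made for this example), and `insert_if_hobby_new` is a hypothetical helper name. The values are passed as parameters rather than pasted into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (ID INTEGER PRIMARY KEY, Name TEXT, Hobby TEXT)")
conn.executemany(
    "INSERT INTO my_table (ID, Name, Hobby) VALUES (?, ?, ?)",
    [(1, "Alex", "fishing"), (2, "Alex", "soccer"), (3, "Nick", "bike"), (4, "George", "hike")],
)


def insert_if_hobby_new(conn, id_, name, hobby):
    """Insert the row only if no existing row has the same Hobby.

    Returns True when a row was inserted, False when it was skipped.
    """
    cur = conn.execute(
        "INSERT INTO my_table (ID, Name, Hobby) "
        "SELECT ?, ?, ? "
        "WHERE NOT EXISTS (SELECT 1 FROM my_table WHERE Hobby = ?)",
        (id_, name, hobby, hobby),
    )
    return cur.rowcount == 1


skipped = insert_if_hobby_new(conn, 5, "Christina", "bike")     # bike already exists
added = insert_if_hobby_new(conn, 5, "Christina", "cooking")    # cooking is new
print(skipped, added)
print(conn.execute("SELECT ID, Name, Hobby FROM my_table ORDER BY ID").fetchall())
```

Note that without a unique constraint this check is not safe against two sessions inserting the same hobby at the same moment; it only covers the single-connection case described in the question.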

Am I able to fetch several strings from MySQL DB as a single 'glued' string?

Let's say I have a table containing user names and cities:
John | New York
Aaron | New York
George | Dallas
Low | Dallas
John | Dallas
Young | Dallas
And I want to have an array like the following:
['New York'] => 'John, Aaron',
['Dallas'] => 'George, Low, John, Young'
So I think I need to concatenate strings within a GROUP BY.
Is there any solution?
You can use group_concat
select
city_name,
group_concat(user_names) as user_names
from table_name
group by city_name
SQL Fiddle
MySQL 5.5.32 Schema Setup:
CREATE TABLE Table1
(`name` varchar(6), `city` varchar(8))
;
INSERT INTO Table1
(`name`, `city`)
VALUES
('John', 'New York'),
('Aaron', 'New York'),
('George', 'Dallas'),
('Low', 'Dallas'),
('John', 'Dallas'),
('Young', 'Dallas')
;
Query 1:
select
concat('[''',city,'''] => ''',group_concat(name),'''') as array
from table1
group by city
Results:
| ARRAY |
|---------------------------------------|
| ['Dallas'] => 'George,Low,John,Young' |
| ['New York'] => 'John,Aaron' |
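If the goal is an actual array on the application side rather than a formatted string, one option is to let the database do the grouping and build the mapping in client code. The sketch below assumes Python with an in-memory SQLite database as a stand-in for MySQL. SQLite spells the separator as a second argument, whereas MySQL writes `GROUP_CONCAT(name SEPARATOR ', ')`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO Table1 (name, city) VALUES (?, ?)",
    [
        ("John", "New York"),
        ("Aaron", "New York"),
        ("George", "Dallas"),
        ("Low", "Dallas"),
        ("John", "Dallas"),
        ("Young", "Dallas"),
    ],
)

# One row per city, with the names glued together by the database.
rows = conn.execute(
    "SELECT city, GROUP_CONCAT(name, ', ') FROM Table1 GROUP BY city"
).fetchall()

# city -> 'name, name, ...'; the order of names inside a group is not guaranteed.
users_by_city = dict(rows)
for city, names in users_by_city.items():
    print(city, "=>", names)
```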

MYSQL: Querying from tables based upon Len Silverston's "The Data Model Resource Book"

I am currently developing database software for a company that I am working with. I based the tables on Len Silverston's book, as I found it to be an excellent source of information on data modeling.
Now, you do not need to be acquainted with his book to know the solution to my problem, but I could not think of any other way to word my title.
Suppose I have two tables, Persons and Person_Names:
CREATE TABLE Persons
(
party_id INT PRIMARY KEY,
birth_date DATE,
social VARCHAR(20)
);
CREATE TABLE Person_Names
(
party_id INT,
name_id INT,
person_name VARCHAR(20),
CONSTRAINT person_names_cpk
PRIMARY KEY (party_id, name_id)
);
The two tables can be joined by party_id. Also, under Person_Names, name_id = 1 correlates to the person's first name (which is stored in the field person_name) and name_id = 2 is the person's last name.
* EDIT *
Someone asked for some data, so I will add some data below:
INSERT INTO Persons VALUES
(1, '01-01-1981', '111-11-1111'),
(2, '02-02-1982', '222-22-2222'),
(3, '03-03-1983', '333-33-3333');
INSERT INTO Person_Names VALUES
(1, 1, 'Kobe'),
(1, 2, 'Bryant'),
(2, 1, 'LeBron'),
(2, 2, 'James'),
(3, 1, 'Kevin'),
(3, 2, 'Durant');
Now that I added those data, how would I query the following?
-----------------------------------------------------------------------
| Party Id | First Name | Last Name | Birthdate | Social No. |
-----------------------------------------------------------------------
| 1 | Kobe | Bryant | 01-01-1981 | 111-11-1111 |
| 2 | LeBron | James | 02-02-1982 | 222-22-2222 |
| 3 | Kevin | Durant | 03-03-1983 | 333-33-3333 |
-----------------------------------------------------------------------
Thanks for taking your time to read my question!
Quite easily. I don't know the book, but presumably it contains some material describing table joins and their application in queries such as this one:
SELECT Persons.party_id AS "Party Id",
firstname.person_name AS "First Name",
lastname.person_name AS "Last Name",
Persons.birth_date AS "Birthdate",
Persons.social AS "Social No."
FROM Persons
INNER JOIN Person_Names firstname
ON Persons.party_id = firstname.party_id
AND firstname.name_id = 1
INNER JOIN Person_Names lastname
ON Persons.party_id = lastname.party_id
AND lastname.name_id = 2
Be advised that this will return results only for those people who have both a first and a last name defined in your Person_Names table; if one or the other isn't present, the INNER JOINs' ON clause conditions will exclude those rows entirely.
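To illustrate that caveat, here is a small Python sketch using an in-memory SQLite database as a stand-in for MySQL. Party 4 with only a first name is hypothetical data added for the demonstration, and `names` is a helper invented for this example. Switching the joins to LEFT JOIN, with the name_id filter kept in the ON clause, keeps such a person and returns NULL (None in Python) for the missing name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Persons (party_id INTEGER PRIMARY KEY, birth_date TEXT, social TEXT);
CREATE TABLE Person_Names (
    party_id INTEGER, name_id INTEGER, person_name TEXT,
    PRIMARY KEY (party_id, name_id)
);
INSERT INTO Persons VALUES
    (1, '1981-01-01', '111-11-1111'),
    (4, '1984-04-04', '444-44-4444');  -- hypothetical person with no last name
INSERT INTO Person_Names VALUES
    (1, 1, 'Kobe'), (1, 2, 'Bryant'),
    (4, 1, 'Mononym');
""")

QUERY = """
SELECT Persons.party_id, firstname.person_name, lastname.person_name
FROM Persons
{join} JOIN Person_Names firstname
    ON Persons.party_id = firstname.party_id AND firstname.name_id = 1
{join} JOIN Person_Names lastname
    ON Persons.party_id = lastname.party_id AND lastname.name_id = 2
ORDER BY Persons.party_id
"""


def names(join_type):
    """Run the name query with the given join keyword ('INNER' or 'LEFT')."""
    return conn.execute(QUERY.format(join=join_type)).fetchall()


inner_rows = names("INNER")  # drops party 4, which has no last name
left_rows = names("LEFT")    # keeps party 4 with a NULL last name
print(inner_rows)
print(left_rows)
```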

mysql possible to select count(distinct(id) where col = 'value')?

I'm using car makes as an example, which fits my situation nicely.
Example query I have now, that gives a simple count per state:
SELECT
state as State,
count(distinct(idnumber)) as Total
FROM
database.table
WHERE
make IN('honda', 'toyota', 'subaru')
GROUP BY
state
ORDER BY
state
Note that this would give me the count for each of the car makes, excluding things like Ford, Chevy, etc. The list of makes would be every make.
Is there a way I can break that down to give me the count of each make by state without resorting to a sub-query? In my head it would be like having a where statement in the count(distinct(idnumber)) select, but I'm not sure that's possible.
Here's what is in my head:
SELECT
state as State,
count(distinct(idnumber)) as Total_State,
(count(distinct(idnumber)) WHERE make = 'honda') as Total_Honda
WHERE
make IN('honda', 'toyota', 'subaru')
GROUP BY
state
ORDER BY
state
You could add multiple columns to your group by:
GROUP BY
state, make
I may have misunderstood your question, but you can group by two columns, so you will get the number of Fords in CA, the number of Hondas in CA, and so on.
To be explicit, your query would be this:
SELECT
state as State,
count(distinct(idnumber)) as Total,
make as Make
FROM
database.table
WHERE
make IN('honda', 'toyota', 'subaru')
GROUP BY
state, make
ORDER BY
state
Just as a fun test, I did this:
create table `cars` (
`id` int(11),
`make` varchar(255),
`state` varchar(255)
);
insert into cars(id, make, state) values
(1, 'honda', 'ca'), (2, 'honda', 'ca'), (3, 'toyota', 'ca'),
(4, 'toyota', 'az'), (5, 'toyota', 'az'), (6, 'honda', 'az');
SELECT state as State, count(id) as Total, make as Make
FROM cars
WHERE make IN('honda', 'toyota', 'subaru')
GROUP BY state, make
ORDER BY state
And got back:
+-------+-------+--------+
| State | Total | Make |
+-------+-------+--------+
| az | 1 | honda |
| az | 2 | toyota |
| ca | 2 | honda |
| ca | 1 | toyota |
+-------+-------+--------+
Which is what I expected. Is that what you were thinking?
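If you would rather have one row per state with a column per make, which is closer to the query the question sketches, a CASE expression inside COUNT(DISTINCT ...) is one way to get there without a sub-query. The sketch below is only an illustration, run from Python against an in-memory SQLite database as a stand-in for MySQL, and it reuses the test table above. Rows where the CASE has no match yield NULL, and COUNT ignores NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id INTEGER, make TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO cars (id, make, state) VALUES (?, ?, ?)",
    [
        (1, "honda", "ca"), (2, "honda", "ca"), (3, "toyota", "ca"),
        (4, "toyota", "az"), (5, "toyota", "az"), (6, "honda", "az"),
    ],
)

rows = conn.execute("""
SELECT state,
       COUNT(DISTINCT id) AS total_state,
       -- CASE without ELSE returns NULL for other makes, which COUNT skips
       COUNT(DISTINCT CASE WHEN make = 'honda' THEN id END) AS total_honda,
       COUNT(DISTINCT CASE WHEN make = 'toyota' THEN id END) AS total_toyota
FROM cars
WHERE make IN ('honda', 'toyota', 'subaru')
GROUP BY state
ORDER BY state
""").fetchall()

for row in rows:
    print(row)
```

The trade-off is that every make needs its own column in the SELECT list, so GROUP BY state, make remains the simpler choice when the list of makes is long or not known in advance.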