I have a table x with the below fields,
n_i int(10) NOT NULL,
n_m longtext,
n_t varchar(255) DEFAULT NULL,
n_s varchar(50) DEFAULT NULL,
n_c int(10) DEFAULT NULL,
n_d datetime DEFAULT NULL
Data:
1, ABC has kept Hold rating on John Inc for target of $ 125 over a 12 month horizon from the current market price of $ 101., Hold John Inc for target of $ 125: ABC, ABC PCG, 1, 1/27/2006 22:55
2, RB Research has kept Buy rating on Johnson (New York) Inc for target of $ 80 over a 12 month horizon from the current market price of $ 64., Buy Johnson (New York) Inc for target of $ 80: RB Research, RB Research, 1, 1/27/2006 23:03
3, XYZ Research has kept Buy rating on John & John Manufacturing (USA) Inc, a subsidiary of John & John Inc for target of $ 340 from the current market price of $ 270., Buy John & John Manufacturing (USA) Inc for target of Rs.340: XYZ Research, XYZ Research, 1, 1/27/2006 23:06
4, ABCDE Research has upgraded Johnson (New York) Inc to Buy with a target of $ 1000 from the current market price of $ 750., Buy Johnson (New York) Inc for target of $ 1000: ABCDE Research, ABCDE Research, 1, 1/27/2006 23:10
5, JKL Private Client Research has kept Buy rating on John2 Inc, parent company of John & John Manufacturing (USA) Inc with a target price of $ 295 from the current market price of $ 276., Buy John2 Inc for target of $ 295: JKL Private Client Research, JKL Private CLient Research, 1, 1/27/2006 23:12
I would like to create the table y below with the data from the table above plus a new field n_sy. The search criterion is a string (a BSSN value taken from table z) which is searched for in fields n_t and n_m of table x; if it is found, field n_sy in table y should be updated with the corresponding "BSIS" value from table z. Also, the field "n_sy" should be able to hold more than one value. :-(
n_i int(10) NOT NULL,
n_m longtext,
n_t varchar(255) DEFAULT NULL,
n_s varchar(50) DEFAULT NULL,
n_c int(10) DEFAULT NULL,
n_d datetime DEFAULT NULL,
n_sy text NOT NULL
Data:
1, ABC has kept Hold rating on John Inc for target of $ 125 over a 12 month horizon from the current market price of $ 101., Hold John Inc for target of $ 125: ABC, ABC PCG, 1, 1/27/2006 22:55, ABCD12345J
2, RB Research has kept Buy rating on Johnson (New York) Inc for target of $ 80 over a 12 month horizon from the current market price of $ 64., Buy Johnson (New York) Inc for target of $ 80: RB Research, RB Research, 1, 1/27/2006 23:03, ABCD34567L
3, XYZ Research has kept Buy rating on John & John Manufacturing (USA) Inc, a subsidiary of John & John Inc for target of $ 340 from the current market price of $ 270., Buy John & John Manufacturing (USA) Inc for target of Rs.340: XYZ Research, XYZ Research, 1, 1/27/2006 23:06, "ABCD56789A, ABCD45678M"
4, ABCDE Research has upgraded Johnson (New York) Inc to Buy with a target of $ 1000 from the current market price of $ 750., Buy Johnson (New York) Inc for target of $ 1000: ABCDE Research, ABCDE Research, 1, 1/27/2006 23:10, ABCD34567L
5, JKL Private Client Research has kept Buy rating on John2 Inc, parent company of Johnson (New York) Inc with a target price of $ 295 from the current market price of $ 276., Buy John2 Inc for target of $ 295: JKL Private Client Research, JKL Private CLient Research, 1, 1/27/2006 23:12, "ABCD23456K, ABCD34567L"
Table z
`BSCe` double DEFAULT NULL,
`BSSI` text,
`BSSN` text,
`BSSt` text,
`BSGr` text,
`BSFV` int(11) DEFAULT NULL,
`BSIS` text,
`BSIn` text,
`BSInst` text,
`NSSy` text,
`NSSc` text,
`NSSer` text,
`NSDOL` text,
`NSPUV` int(4) DEFAULT NULL,
`NSMktL` int(3) DEFAULT NULL,
`NSIS` text,
`NSEFV` int(11) DEFAULT NULL
12345, ABCD, John Inc, A, B, 10, ABCD12345J, Mine, E, J12345, John, T, 10-Oct-2019, 10, 1, ABCD12345J, 10
12346, XYZ, John2 Inc, A, B, 10, ABCD23456K, Iron, E, J12346, John2, T, 11-Jan-2020, 10, 1, ABCD23456K, 10
12347, JKL, Johnson (New York) Inc, A, B, 10, ABCD34567L, Electricty, E, J12347, John3, T, 7-Dec-2019, 10, 1, ABCD34567L, 10
12348, IJK, John & John Inc, A, B, 10, ABCD45678M, Mine, E, J12348, John & John, T, 19-Apr-2019, 10, 1, ABCD45678M, 10
12349, EFGH, John & John Manufacturing (USA) Inc, A, B, 10, ABCD56789A, IT, E, J12349, John & John Manufacturing Inc, T, 29-May-2019, 10, 1, ABCD56789A, 10
Looking for your help, as the MySQL table is fairly large.
Use the queries below.
Creates table y with the structure and data of x:
create table y as select * from x;
Adds column n_sy:
alter table y add n_sy text NOT NULL;
Sets n_sy in y to BSIS from z where BSSN is found in n_t:
update y
inner join z
  on (y.n_t like concat('%', z.BSSN, '%'))
set y.n_sy = z.BSIS;
Appends matches found in n_m, so n_sy can hold more than one value (CONCAT_WS is used because || is a logical OR in MySQL unless PIPES_AS_CONCAT is enabled):
update y
inner join z
  on (y.n_m like concat('%', z.BSSN, '%'))
set y.n_sy = concat_ws(',', nullif(y.n_sy, ''), z.BSIS);
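For reference, here is the same match-and-accumulate rule sketched in Python on two of the sample rows (purely illustrative: the rows are hardcoded and abbreviated rather than read from MySQL):
# Illustration only: which BSIS values should land in n_sy for a given x row,
# using the rule "BSSN occurs anywhere in n_t or n_m".
z_rows = [
    {"BSSN": "John Inc",               "BSIS": "ABCD12345J"},
    {"BSSN": "Johnson (New York) Inc", "BSIS": "ABCD34567L"},
]
x_rows = [
    {"n_i": 1, "n_t": "Hold John Inc for target of $ 125: ABC",
               "n_m": "ABC has kept Hold rating on John Inc ..."},
    {"n_i": 4, "n_t": "Buy Johnson (New York) Inc for target of $ 1000: ABCDE Research",
               "n_m": "ABCDE Research has upgraded Johnson (New York) Inc to Buy ..."},
]
for row in x_rows:
    haystack = row["n_t"] + " " + row["n_m"]
    matches = [z["BSIS"] for z in z_rows if z["BSSN"] in haystack]
    row["n_sy"] = ", ".join(matches)   # more than one value, comma separated
    print(row["n_i"], row["n_sy"])
# 1 ABCD12345J
# 4 ABCD34567L
Note that plain substring/LIKE matching can over-match: "John Inc" is also a substring of "John & John Inc", so a row mentioning the latter would pick up ABCD12345J as well unless word boundaries are enforced.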
I am collecting sports data for a database I will be creating soon, and one of the columns I have for one of my tables is player birthdays. I am using R to scrape and collect the data, and here is what it currently looks like:
head(my_df)
    Name        BirthDate    Age                 Place Height Weight
1   Mike                      NA                           NA     NA
2 Austin  August 17, 1994 22.295 Roseville, California     73    210
3   Koby January 19, 1997 20.140      Wolfforth, Texas     70    255
4     Ty             1991 26.000       Amarillo, Texas     70    165
5   Cole                      NA                           NA     NA
6   Jeff    July 24, 1995 21.320     Boulder, Colorado     72    200
dput(head(my_df))
structure(list(Name = c("Mike", "Austin", "Koby", "Ty", "Cole",
"Jeff"), BirthDate = c("", "August 17, 1994", "January 19, 1997",
"1991", "", "July 24, 1995"), Age = c(NA, 22.295, 20.14, 26,
NA, 21.32), Place = c("", "Roseville, California", "Wolfforth, Texas",
"Amarillo, Texas", "", "Boulder, Colorado"), Height = c(NA, 73,
70, 70, NA, 72), Weight = c(NA, 210L, 255L, 165L, NA, 200L)), .Names = c("Name",
"BirthDate", "Age", "Place", "Height", "Weight"), row.names = 25:30, class = "data.frame")
The BirthDate column has two inconsistencies that make appropriate formatting difficult:
there are entirely missing values
for some rows, I have the birth year but am missing the day and month.
I'm relatively new to databases and am trying to plan ahead here. Does anybody have thoughts on the best way to handle this? To be more specific, how can I write the R code to format the column the right way?
Not a complete answer but a direction to try...
Genealogy databases, which have similar problems, will often have an extra column (a sort of meta-date column) that expresses confidence in, or quality of, the birth-date column. This can then be used at query-time to mediate the results returned. Could be "estimated", "not given", "year only" to suit your app.
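To make that concrete, a rough sketch of such a quality label (shown in Python for brevity; in R the same rules are a couple of ifelse() or dplyr::case_when() branches on BirthDate):
# Rough sketch of a "date quality" companion column.
import re

def birthdate_quality(s):
    if not s or not s.strip():
        return "not given"
    if re.fullmatch(r"\d{4}", s.strip()):
        return "year only"
    return "full date"

for s in ["", "1991", "August 17, 1994"]:
    print(repr(s), "->", birthdate_quality(s))
# '' -> not given
# '1991' -> year only
# 'August 17, 1994' -> full date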
If we intend to do calculations on the date as a date, or on year and month columns (for example, the age of the player, or how many of them were born in January), then we can separate it into 3 numeric columns.
do.call(rbind,
        lapply(my_df$BirthDate, function(i){
          x <- rev(unlist(strsplit(i, " ")))
          res <- data.frame(
            BirthDate = i,
            YYYY = as.numeric(x[1]),
            DD = as.numeric(sub(",", "", x[2], fixed = TRUE)),
            MM = match(x[3], month.name))
          res$DOB <- as.Date(paste(res$DD, res$MM, res$YYYY, sep = "/"),
                             format = "%d/%m/%Y")
          res
        }))
#          BirthDate YYYY DD MM        DOB
# 1                    NA NA NA       <NA>
# 2  August 17, 1994 1994 17  8 1994-08-17
# 3 January 19, 1997 1997 19  1 1997-01-19
# 4             1991 1991 NA NA       <NA>
# 5                    NA NA NA       <NA>
# 6    July 24, 1995 1995 24  7 1995-07-24
Need customized JSON output.
(I have two files: a text file and a schema file.)
abc.txt -
100002030,Tom,peter,eng,block 3, lane 5,california,10021
100003031,Tom,john,doc,block 2, lane 2,california,10021
100004032,Tom,jim,eng,block 1, lane 1,california,10021
100005033,Tom,trek,doc,block 2, lane 2,california,10021
100006034,Tom,peter,eng,block 6, lane 6,california,10021
abc_schema.txt (field name and position)
rollno 1
firstname 2
lastname 3
qualification 4
address1 5
address2 6
city 7
Zipcode 8
Rules:
Take the first 6 characters of rollno.
Combine (club) address1 | address2 | city into one field.
Label the combined value "Address".
Expected Output:
{"rollno":"100002","firstname":"Tom","lastname":"peter","qualification":"eng","Address":"block 3 lane 5 california","zipcode":"10021"}
{"rollno":"100003","firstname":"Tom","lastname":"john","qualification":"doc","Address":"block 2 lane 2 california","zipcode":"10021"}
{"rollno":"100004","firstname":"Tom","lastname":"jim","qualification":"eng","Address":"block 1 lane 1 california","zipcode":"10021"}
{"rollno":"100005","firstname":"Tom","lastname":"trek","qualification":"doc","Address":"block 2 lane 2 california","zipcode":"10021"}
{"rollno":"100006","firstname":"Tom","lastname":"peter","qualification":"eng","Address":"block 6 lane 6 california","zipcode":"10021"}
I do not wish to hardcode the fields but to read them from the schema file; the idea is to have reusable code, something like looping over the schema file and the text file.
A = LOAD 'abc.txt' USING PigStorage(',')
    AS (rollno:chararray, firstname:chararray, lastname:chararray, qualification:chararray,
        address1:chararray, address2:chararray, city:chararray, zipcode:chararray);
B = FOREACH A GENERATE SUBSTRING(rollno, 0, 6) AS rollno, firstname, lastname, qualification,
    CONCAT(address1, CONCAT(' ', CONCAT(address2, CONCAT(' ', city)))) AS Address, zipcode;
STORE B INTO 'first_table.json' USING JsonStorage();
Hope this helps.
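If the field names really must come from abc_schema.txt rather than being hardcoded, here is a rough sketch of that idea in plain Python instead of Pig (file names as in the question; the rollno and Address rules are applied as stated, key order in the JSON output is not significant, and Zipcode keeps the name used in the schema file):
import json

# Read the field names, in order, from the schema file ("<name> <position>").
with open("abc_schema.txt") as f:
    fields = [line.split()[0] for line in f if line.strip()]

address_parts = ("address1", "address2", "city")

with open("abc.txt") as data, open("first_table.json", "w") as out:
    for line in data:
        values = [v.strip() for v in line.rstrip("\n").split(",")]
        record = dict(zip(fields, values))       # field names come from the schema file
        record["rollno"] = record["rollno"][:6]  # first 6 characters of rollno
        record["Address"] = " ".join(record.pop(p) for p in address_parts)
        out.write(json.dumps(record) + "\n")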
I have a list (MySQL table) of people and their titles, as shown in the table below. I also have a list of titles and their categories. How do I assign the categories to each person? The problem arises when there are multiple titles for a person. What is the Pythonic way of mapping the titles to the categories and assigning them to the person?
People Table
Name Title
--------------------
John D CEO, COO, CTO
Mary J COO, MD
Tim C Dev Ops, Director
Title Category table
Title     Executive  IT  Other
------------------------------
CEO           1
COO           1
CTO           1       1
MD            1
Dev Ops               1
Director                   1
Desired output:
Name     Title              Executive  IT  Other
-------------------------------------------------
John D   CEO, COO, CTO          1      1
Mary J   COO, MD                1
Tim C    Dev Ops, Director             1     1
from functools import reduce  # needed on Python 3

name_title = (("John D", ("CEO", "COO", "CTO")),
              ("Mary J", ("COO", "MD")),
              ("Tim C", ("Dev Ops", "Director")))

title_cat = {"CEO": set(["Executive"]),
             "COO": set(["Executive"]),
             "CTO": set(["Executive"]),
             "MD": set(["Executive"]),
             "Dev Ops": set(["IT"]),
             "Director": set(["Other"])}

name_cat = [(name, reduce(lambda x, y: x | y, [title_cat[title] for title in titles]))
            for name, titles in name_title]
It would be nice if there was a union which behaved like sum on sets.
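For what it's worth, set().union() already accepts any number of iterables, so it can play that role and the reduce (and its import) can be dropped:
name_cat = [(name, set().union(*(title_cat[title] for title in titles)))
            for name, titles in name_title]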
people = ['john', 'Mary', 'Tim']
Title = [['CEO', 'COO', 'CTO'], ['COO', 'MD'], ['DevOps', 'Director']]
title_des = {'CEO': 'Executive', 'COO': 'Executive', 'CTO': 'Executive',
             'MD': 'Executive', 'DevOps': 'IT', 'Director': 'Others'}

people_des = {}
for i, x in enumerate(people):
    people_des[x] = {}
    for y in Title[i]:
        if title_des[y] not in people_des[x]:
            people_des[x][title_des[y]] = [y]
        else:
            people_des[x][title_des[y]].append(y)

print(people_des)
output:
{'Tim': {'IT': ['DevOps'], 'Others': ['Director']}, 'john': {'Executive': ['CEO', 'COO', 'CTO']}, 'Mary': {'Executive': ['COO', 'MD']}}
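If the end goal is the tabular Name / Title / category layout from the question, the dictionary above can be reshaped, for example with pandas (assumed to be installed; a plain csv.writer would work just as well):
# Sketch: reshape people_des into the Name / Title / category-flag table.
import pandas as pd

rows = []
for name, cats in people_des.items():
    titles = [t for ts in cats.values() for t in ts]
    rows.append({"Name": name,
                 "Title": ", ".join(titles),
                 "Executive": 1 if "Executive" in cats else 0,
                 "IT": 1 if "IT" in cats else 0,
                 "Others": 1 if "Others" in cats else 0})

print(pd.DataFrame(rows, columns=["Name", "Title", "Executive", "IT", "Others"]))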
Start by arranging your input data in a dictionary-of-lists form:
>>> name_to_titles = {
'John D': ['CEO', 'COO', 'CTO'],
'Mary J': ['COO', 'MD'],
'Tim C': ['Dev Ops', 'Director']
}
Then loop over the input dictionary to create the reverse mapping:
>>> title_to_names = {}
>>> for name, titles in name_to_titles.items():
        for title in titles:
            title_to_names.setdefault(title, []).append(name)
>>> import pprint
>>> pprint.pprint(title_to_names)
{'CEO': ['John D'],
'COO': ['John D', 'Mary J'],
'CTO': ['John D'],
'Dev Ops': ['Tim C'],
'Director': ['Tim C'],
'MD': ['Mary J']}
If you mean that what you actually have is the raw string, I propose this:
s = '''Name Title
--------------------
John D CEO, COO, CTO
Mary J COO, MD
Tim C Dev Ops, Director

Title Executive IT Other
-----------------------------
CEO 1
COO 1
CTO 1
MD 1
Dev Ops 1
Director 1
'''
lines = s.split('\n')
it = iter(lines)
for line in it:
    if line.startswith('Name'):
        break
next(it)  # skip the '--------------------' line
for line in it:
    if not line:
        break
    split = line.split()
    titles = split[2:]
    name = split[:2]
    print(' '.join(name), titles)

# John D ['CEO,', 'COO,', 'CTO']
# Mary J ['COO,', 'MD']
# Tim C ['Dev', 'Ops,', 'Director']
Calling all MySQL gurus!
I am in need of a complex query for MySQL but I can't get my head around it. There are 2 tables in question:
locations
(columns: location_id, parent, location)
Data is split in a hierarchical fashion into Country, Region, County and Town, thus:
1, 0, England (country)
2, 1, South West (region)
3, 1, South East (region)
4, 2, Dorset (county)
5, 4, Bournemouth (town)
6, 4, Poole (town)
7, 4, Wimborne (town)
etc up to 400+ rows of location data
profiles
(columns: profile_id, title, location_id)
Each row has one location ID which is ALWAYS a town (i.e. the last child in the chain). E.g.:
1, 'This profile has location set as Bournemouth', 5
2, 'This profile has location set as Poole', 6
etc
What I need to achieve is to return all IDs from the locations table where the location itself or its children have entries associated with it. So in the example above I would need the following location IDs returned: 1, 2, 4, 5, 6
Reasons:
1 - YES, England is parent of South West, Dorset and Bournemouth which has an entry
2 - YES, South West is parent of Dorset and Bournemouth which has an entry
3 - NO, South East has no entries under it or any of its children
4 - YES, Dorset is parent of Bournemouth which has an entry
5 - YES, Bournemouth has an entry
6 - YES, Poole has an entry
7 - NO, Wimborne has no entries
So, is this actually possible? I attempted to do it in PHP with nested SQL queries but the script timed out, so there must be a way to do this in a single SQL query?
Thanking you in advance! :)
===========
UPDATE
After reading through and playing with all these solutions I realised that I was going about the problem completely the wrong way. Instead of going through all the locations and returning those that have entries, it makes more sense, and is far more efficient, to get all the entries, return the corresponding locations, and then go up the hierarchy to get each location's parent until the root is hit.
Thank you very much for your help, it at least made me realise that what I was attempting was unnecessary.
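For anyone hitting the same problem, that walk-up idea reads roughly like this (a sketch in Python with an in-memory id -> parent map rather than SQL/PHP; the ids follow the example data above):
# Sketch of "start from the profiles and walk up to the root".
# parent_of maps location_id -> parent (0 means top level); profile_locations
# are the location_ids actually referenced by the profiles table.
parent_of = {1: 0, 2: 1, 3: 1, 4: 2, 5: 4, 6: 4, 7: 4}
profile_locations = {5, 6}          # Bournemouth, Poole

used = set()
for loc in profile_locations:
    while loc and loc not in used:  # stop at the root or at an already-visited id
        used.add(loc)
        loc = parent_of.get(loc, 0)

print(sorted(used))                 # [1, 2, 4, 5, 6]
Each location is visited at most once, so this stays cheap even with 400+ rows of location data.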
The way I have dealt with this is to do only one SQL load and then put references to the children inside the parent objects.
$locations = array();
$obj_query = "SELECT * FROM locations";
$result_resource = mysql_query($obj_query);
while ($row = mysql_fetch_assoc($result_resource)) {
    $locations[$row['location_id']] = (object) $row;
}
foreach ($locations as $location) {
    if (isset($location->parent) && isset($locations[$location->parent])) {
        $locations[$location->parent]->children[] = $location;
    }
}
Your object would then need a method such as this to find out whether a location is a descendant:
function isAncestorOf($location) {
    if (empty($this->children)) { return false; }
    if (in_array($location, $this->children, true)) {
        return true;
    } else {
        foreach ($this->children as $child) {
            if ($child->isAncestorOf($location)) {
                return true;
            }
        }
    }
    return false;
}
The fact that your script timed out would indicate an infinite loop somewhere.
Considering you're making a reference to the locations table based on the child area, plus another reference to the parent area, you probably have to use a combination of PHP and MySQL to scroll through all this; a simple JOIN statement would not work in this case, I don't think.
Also, you need to alter the table so that a top-level row has a parent of NULL, not 0. After you've done that..
$sql = "SELECT * FROM locations WHERE parent IS NULL";
$result = mysql_query($sql);
while ($country = mysql_fetch_array($result)) {
    $subsql = "SELECT * FROM locations WHERE parent='".$country['location_id']."'";
    $subresult = mysql_query($subsql);
    while ($subregion = mysql_fetch_array($subresult)) {
        $profilesql = "SELECT * FROM profiles WHERE location_id='".$subregion['location_id']."'";
        $profileresult = mysql_query($profilesql);
        echo mysql_num_rows($profileresult).' rows under '.$subregion['location'].'.<br />';
    }
}
The base code is there... does anybody have a clever idea of making it work with various sub-levels? But honestly, if this were my project, I would have made separate tables for Country, and then Regions, and then City/Town. 3 tables would make the data navigation much easier.
If your PHP code was good, you might have a nested loop (a cycle) in the [location -> parent] references. I would start there first, and just use PHP. I don't think MySQL has a recursive query function.
If you NEED a nested parent loop, you should write a mutation of a merge/union algorithm to solve this.
To find the nested loop in PHP:
$ids = array();
function nestedLoopFinder($parent)
{
    global $ids;
    $result = mysql_query("SELECT location_id FROM locations WHERE parent=$parent");
    while ($row = mysql_fetch_object($result)) {
        if (in_array($row->location_id, $ids)) {
            die("duplicate found: $row->location_id");
        }
        $ids[] = $row->location_id;
        // recurse
        nestedLoopFinder($row->location_id);
    }
}
Not sure if I fully understand your requirements but the following stored procedure example might be a good starting point for you:
Example calls (note the included column)
mysql> call location_hier(1);
+-------------+---------------------+--------------------+---------------------+-------+----------+
| location_id | location | parent_location_id | parent_location | depth | included |
+-------------+---------------------+--------------------+---------------------+-------+----------+
| 1 | England (country) | NULL | NULL | 0 | 1 |
| 2 | South West (region) | 1 | England (country) | 1 | 1 |
| 3 | South East (region) | 1 | England (country) | 1 | 0 |
| 4 | Dorset (county) | 2 | South West (region) | 2 | 1 |
| 5 | Bournemouth (town) | 4 | Dorset (county) | 3 | 1 |
| 6 | Poole (town) | 4 | Dorset (county) | 3 | 1 |
| 7 | Wimborne (town) | 4 | Dorset (county) | 3 | 0 |
+-------------+---------------------+--------------------+---------------------+-------+----------+
7 rows in set (0.00 sec)
You'd call the stored procedure from php as follows:
$startLocationID = 1;
$result = $conn->query(sprintf("call location_hier(%d)", $startLocationID));
Full script:
http://pastie.org/1785995
drop table if exists profiles;
create table profiles
(
profile_id smallint unsigned not null auto_increment primary key,
location_id smallint unsigned null,
key (location_id)
)
engine = innodb;
insert into profiles (location_id) values (5),(6);
drop table if exists locations;
create table locations
(
location_id smallint unsigned not null auto_increment primary key,
location varchar(255) not null,
parent_location_id smallint unsigned null,
key (parent_location_id)
)
engine = innodb;
insert into locations (location, parent_location_id) values
('England (country)',null),
('South West (region)',1),
('South East (region)',1),
('Dorset (county)',2),
('Bournemouth (town)',4),
('Poole (town)',4),
('Wimborne (town)',4);
drop procedure if exists location_hier;
delimiter #
create procedure location_hier
(
in p_location_id smallint unsigned
)
begin
declare v_done tinyint unsigned default 0;
declare v_depth smallint unsigned default 0;
create temporary table hier(
parent_location_id smallint unsigned,
location_id smallint unsigned,
depth smallint unsigned default 0,
included tinyint unsigned default 0,
primary key (location_id),
key (parent_location_id)
)engine = memory;
insert into hier select parent_location_id, location_id, v_depth, 0 from locations where location_id = p_location_id;
create temporary table tmp engine=memory select * from hier;
/* http://dev.mysql.com/doc/refman/5.0/en/temporary-table-problems.html */
while not v_done do
if exists( select 1 from locations c
inner join tmp on c.parent_location_id = tmp.location_id and tmp.depth = v_depth) then
insert into hier select c.parent_location_id, c.location_id, v_depth + 1, 0 from locations c
inner join tmp on c.parent_location_id = tmp.location_id and tmp.depth = v_depth;
update hier inner join tmp on hier.location_id = tmp.parent_location_id
set hier.included = 1;
set v_depth = v_depth + 1;
truncate table tmp;
insert into tmp select * from hier where depth = v_depth;
else
set v_done = 1;
end if;
end while;
update hier inner join tmp on hier.location_id = tmp.parent_location_id
set hier.included = 1;
-- include any locations that have profiles ???
update hier inner join profiles on hier.location_id = profiles.location_id
set hier.included = 1;
-- output the results
select
c.location_id,
c.location as location,
p.location_id as parent_location_id,
p.location as parent_location,
hier.depth,
hier.included
from
hier
inner join locations c on hier.location_id = c.location_id
left outer join locations p on hier.parent_location_id = p.location_id
-- where included = 1 -- filter in your php or here up to you !
order by
hier.depth;
-- clean up
drop temporary table if exists hier;
drop temporary table if exists tmp;
end #
delimiter ;
call location_hier(1);
Hope this helps :)
I would like to know if there is a query to calculate the count of used fields (columns) in a table for every row (record).
I want to update a new field in my table, called usage percentage, by calculating
(total number of used columns) / (total number of columns) * 100
for all records.
Any suggestion is appreciated. Thanks
For example:
I have a table named leads:
Name Age Designation Address
Jack 25 programmer chennai
Ram 30 ----------- ----------
Rob 35 Analyst ----------
I have added a new column called usagepercent and I want to update the new field as
Name Age Designation Address usagepercent
Jack 25 programmer chennai 100
Ram 30 ----------- ---------- 50
Rob 35 Analyst ---------- 75
------- indicates empty
Something like this should work (if the default/empty/unused value of the fields is Null):
SET #percValue=25;
UPDATE
leads
SET
usagePercent =
IF(Name IS NOT NULL, #percValue, 0) +
IF(Age IS NOT NULL, #percValue, 0) +
IF(Designation IS NOT NULL, #percValue, 0) +
IF(Address IS NOT NULL, #percValue, 0);
You'll have to change percValue according to the number of columns you have.
Edit: Adapted solution of RSGanesh:
UPDATE
leads
SET
usagePercent = (
IF(Name IS NOT NULL, 1, 0) +
IF(Age IS NOT NULL, 1, 0) +
IF(Designation IS NOT NULL, 1, 0) +
IF(Address IS NOT NULL, 1, 0)
) / 4 * 100;
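If the column list changes often, the SET expression can also be generated instead of typed by hand; a rough sketch in Python (column names assumed to be known up front, or fetched from information_schema.columns):
# Sketch: build the UPDATE statement above for an arbitrary column list, so
# the IF(...) terms and the divisor stay in sync automatically.
columns = ["Name", "Age", "Designation", "Address"]

terms = " +\n        ".join("IF(`%s` IS NOT NULL, 1, 0)" % col for col in columns)
sql = ("UPDATE leads\n"
       "SET usagePercent = (\n"
       "        %s\n"
       "    ) / %d * 100;" % (terms, len(columns)))
print(sql)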