Lahman's Baseball Database - Determine Primary Position - mysql

I'm using Lahman's Baseball Database and MySQL to determine each player's primary position. The goal is to write a query that returns the playerID and the position at which they played the most games. I know that I would probably want a query that looks something like this:
select playerID, sum(G)
from fielding
where POS = 'C'
group by playerID
order by sum(G) desc;
The above query gathers all the games each player has played as a catcher. What I want to do is take each player and compare the sum of games played at each position and find the maximum value from that.
If you are not familiar with Lahman's Baseball Database here is the download link: http://www.seanlahman.com/baseball-archive/statistics/
Also here is the create table statement for the Fielding table:
CREATE TABLE `Fielding` (
`playerID` varchar(9) NOT NULL DEFAULT '',
`yearID` int(11) NOT NULL DEFAULT '0',
`stint` int(11) NOT NULL DEFAULT '0',
`teamID` varchar(3) DEFAULT NULL,
`lgID` varchar(2) DEFAULT NULL,
`POS` varchar(2) NOT NULL DEFAULT '',
`G` int(11) DEFAULT NULL,
`GS` int(11) DEFAULT NULL,
`InnOuts` int(11) DEFAULT NULL,
`PO` int(11) DEFAULT NULL,
`A` int(11) DEFAULT NULL,
`E` int(11) DEFAULT NULL,
`DP` int(11) DEFAULT NULL,
`PB` int(11) DEFAULT NULL,
`WP` int(11) DEFAULT NULL,
`SB` int(11) DEFAULT NULL,
`CS` int(11) DEFAULT NULL,
`ZR` double DEFAULT NULL,
PRIMARY KEY (`playerID`,`yearID`,`stint`,`POS`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The Fielding table is organized by year. POS is the position, and G is the number of games played at that position in the corresponding year. That means there will be multiple entries for some players in the same year. Also, ignore rows where POS = 'OF', as these hold the sum of all games played at LF, CF, and RF in the given year.
The final output should be a row for each distinct player, with columns playerID and primaryPosition.

plan
create table showing sums of players in all positions
get maximum sum of positions from this table
join back to sums to get the corresponding primary position
query
create table psums as
(
select playerID, POS, sum(G) as sm
from Fielding
where POS <> 'OF'
group by playerID, POS
)
;
select ps.playerID, ps.POS as primaryPosition
from
(
select playerID, max(sm) mx
from psums
group by playerID
) maxs
inner join
psums ps
on maxs.playerID = ps.playerID
and maxs.mx = ps.sm
order by ps.playerID
;
[ adding limit 10 ]
output
+-----------+-----------------+
| playerID | primaryPosition |
+-----------+-----------------+
| aardsda01 | P |
| aaronha01 | RF |
| aaronto01 | 1B |
| aasedo01 | P |
| abadan01 | 1B |
| abadfe01 | P |
| abadijo01 | 1B |
| abbated01 | 2B |
| abbeybe01 | P |
| abbeych01 | P |
+-----------+-----------------+
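One thing to watch with the sum, max, and join-back plan: if a player's highest total is shared by two positions, the join returns both rows for that player. The sketch below replays the plan on a few made-up rows, using Python's built-in sqlite3 as a stand-in for MySQL. The player IDs and game counts are hypothetical and do not come from Lahman's data, and a view is used here in place of the psums table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Fielding (playerID TEXT, yearID INT, stint INT, POS TEXT, G INT)"
)
# Hypothetical rows: playerA is mostly a catcher, playerB is tied between LF and RF.
conn.executemany(
    "INSERT INTO Fielding VALUES (?, ?, ?, ?, ?)",
    [
        ("playerA", 2000, 1, "C", 100),
        ("playerA", 2001, 1, "C", 50),
        ("playerA", 2001, 1, "1B", 120),
        ("playerB", 2000, 1, "LF", 80),
        ("playerB", 2000, 1, "OF", 80),   # aggregate row, must be ignored
        ("playerB", 2001, 1, "RF", 80),
        ("playerB", 2001, 1, "OF", 80),   # aggregate row, must be ignored
    ],
)

# Step 1: total games per player and position, skipping the 'OF' aggregate.
conn.execute(
    """
    CREATE VIEW psums AS
    SELECT playerID, POS, SUM(G) AS sm
    FROM Fielding
    WHERE POS <> 'OF'
    GROUP BY playerID, POS
    """
)

# Steps 2 and 3: per-player maximum, joined back to recover the position.
primary_positions = conn.execute(
    """
    SELECT ps.playerID, ps.POS AS primaryPosition
    FROM (SELECT playerID, MAX(sm) AS mx FROM psums GROUP BY playerID) maxs
    JOIN psums ps
      ON maxs.playerID = ps.playerID
     AND maxs.mx = ps.sm
    ORDER BY ps.playerID, ps.POS
    """
).fetchall()

for row in primary_positions:
    print(row)
```

In this toy data playerB appears twice because LF and RF are tied, so a tie-breaking rule would be needed if exactly one row per player is required.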

SELECT x.*
FROM fielding x
JOIN
(
SELECT playerid
, MAX(g) max_g
FROM fielding
GROUP
BY playerid
) y
ON y.playerid = x.playerid
AND y.max_g = x.g
LIMIT 10;
...or, more likely...
SELECT x.*
FROM
( SELECT playerid
, pos
, SUM(g) sum_g
FROM fielding
WHERE pos <> 'OF'
GROUP
BY playerid
, pos
) x
JOIN
(
SELECT playerid
, MAX(sum_g) max_sum_g
FROM
( SELECT playerid
, pos
, SUM(g) sum_g
FROM fielding
WHERE pos <> 'OF'
GROUP
BY playerid
, pos
) n
GROUP
BY playerid
) y
ON y.playerid = x.playerid
AND y.max_sum_g = x.sum_g
LIMIT 10;

Related

Distinct unique value and sum others in mysql

I am trying to count the order_payment_total of each unique od_grp_id only once, but when I use SUM it gets added once for every matching subscription row.
CREATE TABLE IF NOT EXISTS `subscription` (
`id` int(11) unsigned NOT NULL,
`od_grp_id` int(11) unsigned NULL,
`user_id` int(11) NOT NULL,
`order_discount` decimal(10, 2) null,
PRIMARY KEY (`id`)
) DEFAULT CHARSET = utf8;
INSERT INTO `subscription` (
`id`, `od_grp_id`, `user_id`, `order_discount`
)
VALUES
(123994, NULL, 115, null),
(124255, NULL, 115, null),
(124703, 1647692222, 115, null),
(125788, 1647692312, 115, '25.00'),
(125789, 1647692312, 115, '5.00');
CREATE TABLE IF NOT EXISTS `online_payment_against_subscription` (
`subscription_od_grp_id` int(11) unsigned NOT NULL,
`order_payment_total` decimal(10, 2) unsigned NOT NULL,
`user_id` int(11) NOT NULL
) DEFAULT CHARSET = utf8;
INSERT INTO `online_payment_against_subscription` (
`subscription_od_grp_id`, `order_payment_total`, `user_id`
)
VALUES
(1643695200, '45.00', 115),
(1647692312, '250.00', 115),
(1647692222, '30.00', 115);
SELECT
sum(y.order_payment_total),
sum(s.order_discount)
FROM
subscription s
LEFT JOIN(
SELECT
SUM(order_payment_total) as order_payment_total,
user_id,
subscription_od_grp_id
FROM
online_payment_against_subscription
GROUP BY
subscription_od_grp_id
) y ON y.subscription_od_grp_id = s.od_grp_id
WHERE
find_in_set(
s.id, '123994,124255,124703,125788,125789'
)
group by
s.user_id
Current Output:
| sum(y.order_payment_total) |sum(s.order_discount) |
|----------------------------|-----------------------|
| 530 | 30 |
Expected Output:
| sum(y.order_payment_total) |sum(s.order_discount) |
|----------------------------|-----------------------|
| 280 | 30 |
Sql Fiddle: http://sqlfiddle.com/#!9/5628f5/1
If I understand correctly, the problem is caused by duplicate od_grp_id values in the subscription table. You can remove the duplicates before the JOIN by aggregating subscription in a subquery.
Query 1:
SELECT
SUM(order_payment_total),
SUM(order_discount)
FROM (
SELECT od_grp_id,SUM(order_discount) order_discount
FROM subscription
WHERE find_in_set(id, '123994,124255,124703,125788,125789')
GROUP BY od_grp_id
) s
LEFT JOIN online_payment_against_subscription y ON y.subscription_od_grp_id=s.od_grp_id
Results:
| SUM(order_payment_total) | SUM(order_discount) |
|--------------------------|---------------------|
| 280 | 30 |
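The difference between the two totals can be reproduced with the sample rows from the question. The sketch below uses Python's built-in sqlite3 rather than MySQL, so it is only an approximation: the column types are simplified, and `IN (...)` replaces `find_in_set`, which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subscription (
        id INTEGER PRIMARY KEY, od_grp_id INTEGER, user_id INTEGER, order_discount REAL
    );
    INSERT INTO subscription VALUES
        (123994, NULL, 115, NULL),
        (124255, NULL, 115, NULL),
        (124703, 1647692222, 115, NULL),
        (125788, 1647692312, 115, 25.00),
        (125789, 1647692312, 115, 5.00);
    CREATE TABLE online_payment_against_subscription (
        subscription_od_grp_id INTEGER, order_payment_total REAL, user_id INTEGER
    );
    INSERT INTO online_payment_against_subscription VALUES
        (1643695200, 45.00, 115),
        (1647692312, 250.00, 115),
        (1647692222, 30.00, 115);
    """
)

ids = "(123994, 124255, 124703, 125788, 125789)"

# Joining row by row: group 1647692312 has two subscriptions, so its 250.00 is added twice.
naive = conn.execute(
    f"""
    SELECT SUM(y.order_payment_total), SUM(s.order_discount)
    FROM subscription s
    LEFT JOIN online_payment_against_subscription y
           ON y.subscription_od_grp_id = s.od_grp_id
    WHERE s.id IN {ids}
    """
).fetchone()

# Collapsing subscription to one row per od_grp_id first: each payment is matched once.
deduped = conn.execute(
    f"""
    SELECT SUM(y.order_payment_total), SUM(s.order_discount)
    FROM (
        SELECT od_grp_id, SUM(order_discount) AS order_discount
        FROM subscription
        WHERE id IN {ids}
        GROUP BY od_grp_id
    ) s
    LEFT JOIN online_payment_against_subscription y
           ON y.subscription_od_grp_id = s.od_grp_id
    """
).fetchone()

print(naive, deduped)
```

This relies on each od_grp_id appearing only once in online_payment_against_subscription, as it does in the sample data. If that table could hold several rows per group, it would need to be aggregated before the join as well.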
I think you are getting this result because not every subscription has an order payment, so the join produces NULL values.
You can filter them out like this:
SELECT y.order_payment_total
FROM subscription s
LEFT JOIN(SELECT SUM(order_payment_total) AS order_payment_total, user_id, subscription_od_grp_id
FROM online_payment_against_subscription
GROUP BY subscription_od_grp_id) y ON y.subscription_od_grp_id = s.od_grp_id
WHERE FIND_IN_SET(s.id, '11258,22547,18586')
AND y.order_payment_total IS NOT NULL;
Or you can turn the NULL values into 0 if required:
SELECT COALESCE(y.order_payment_total, 0) AS order_payment_total
FROM subscription s
LEFT JOIN(SELECT SUM(order_payment_total) AS order_payment_total, user_id, subscription_od_grp_id
FROM online_payment_against_subscription
GROUP BY subscription_od_grp_id) y ON y.subscription_od_grp_id = s.od_grp_id
WHERE FIND_IN_SET(s.id, '11258,22547,18586');

Mysql join acting weird

I have a very simple query but I am a beginner at this and I couldn't understand really what the problem is as it's not working properly in second case:
SELECT a.user_name, a.password, a.id, r.role_name
FROM accounts as a
JOIN roles as r ON a.role=r.id
SELECT accounts.user_name, accounts.password, accounts.id, roles.role_name
FROM accounts
JOIN roles ON accounts.role=roles.id
SELECT *
FROM accounts as a
JOIN roles as r ON a.role=r.id
accounts.role and roles.id are linked by a foreign key. In the third query I try to select everything using *, but it returned nothing from the second table, only the columns of the first table (as in the last photo). What might the problem be?
This behaviour makes no sense: all fields must appear when you use *.
Let's do a test on SQL Fiddle
MySQL 5.6 Schema Setup:
create table t1 ( i int, a char);
insert into t1 values (1,'a');
create table t2 ( i int, b char);
insert into t2 values (1,'a');
Query 1:
select *
from t1 inner join t2 on t1.i = t2.i
Results:
| i | a | i | b |
|---|---|---|---|
| 1 | a | 1 | a |
Query 2:
select *
from t1 x inner join t2 y on x.i = y.i
Results:
| i | a | i | b |
|---|---|---|---|
| 1 | a | 1 | a |
As you can see, all fields appear every time. It may be an issue with the program you use to connect and run the queries. Double-check that you are executing the whole statement, not only the first 2 lines, and check whether there is a scroll bar hiding more data.
I've recreated the database with the following code:
$mysqli->query('
CREATE TABLE `crm`.`roles`
(
`id` TINYINT(1) NOT NULL AUTO_INCREMENT,
`role_name` VARCHAR(20) NOT NULL,
`edit` TINYINT(1) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`)
);') or die($mysqli->error);
$mysqli->query('
CREATE TABLE `crm`.`accounts`
(
`id` INT NOT NULL AUTO_INCREMENT,
`role` TINYINT(1) NOT NULL DEFAULT 1,
`user_name` VARCHAR(20) NOT NULL,
`password` VARCHAR(100) NOT NULL,
`email` VARCHAR(100) NOT NULL,
`first_name` VARCHAR(50) NOT NULL,
`last_name` VARCHAR(50) NOT NULL,
`hash` VARCHAR(32) NOT NULL,
`active` BOOL NOT NULL DEFAULT 0,
PRIMARY KEY (`id`),
FOREIGN KEY (`role`) REFERENCES roles(`id`)
);') or die($mysqli->error);
and every combination of SELECT is working fine now. I don't know what the problem was since I don't remember making any changes on the tables.

MySql Query With Pivot table or Case WHEN

I need to write a query over two tables; maybe I need a pivot query:
First table:
CREATE TABLE `pm` (
`id` int(10) NOT NULL,
`dataoperazione` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`dataprimanota` date NOT NULL,
`idpuntovendita` int(4) NOT NULL,
`idoperatore` int(4) NOT NULL
)
Second table:
CREATE TABLE `pm_azzeramentofiscale` (
`id` int(10) NOT NULL,
`idprimanota` int(10) NOT NULL,
`cassa` varchar(20) NOT NULL,
`operatore` varchar(100) NOT NULL,
`azzeramento` decimal(8,2) NOT NULL
)
This is my query:
SELECT sum(azzeramento) as incasso, p.dataprimanota as data, p.idpuntovendita
FROM pm as p, pm_azzeramentofiscale as a
WHERE a.idprimanota = p.id
AND YEAR(p.dataprimanota) = 2016
GROUP BY p.dataprimanota,p.idpuntovendita
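The result comes back in this format:
| Incasso | Data       | IdPuntovendita |
|---------|------------|----------------|
| 1231,12 | 2015-12-12 | 3              |
| 6211,12 | 2015-12-12 | 4              |
but I would like this format:
| Data       | IncassoPuntovendita3 | IncassoPuntoVendita4 |
|------------|----------------------|----------------------|
| 2015-12-12 | 1231,12              | 6211,12              |
How can I write my query?
Thanks and regards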
If you have a fixed number of idpuntovendita values, you can try it this way (group by the date only, so that each date produces a single row):
SELECT sum(case p.idpuntovendita when 3 then azzeramento else 0 end) as incasso_punto_vendita3,
sum(case p.idpuntovendita when 4 then azzeramento else 0 end) as incasso_punto_vendita4,
p.dataprimanota as data
FROM pm as p, pm_azzeramentofiscale as a
WHERE a.idprimanota = p.id
AND YEAR(p.dataprimanota) = 2016
GROUP BY p.dataprimanota;
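The conditional-aggregation pivot can be tried out on a small example. This is a minimal sketch using Python's built-in sqlite3 in place of MySQL: the tables are cut down to the columns the query needs, the rows are made up, and `strftime('%Y', ...)` stands in for MySQL's `YEAR()`. Grouping by the date alone is what yields one row per date with one column per shop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pm (id INTEGER PRIMARY KEY, dataprimanota TEXT, idpuntovendita INTEGER);
    CREATE TABLE pm_azzeramentofiscale (
        id INTEGER PRIMARY KEY, idprimanota INTEGER, azzeramento REAL
    );
    -- Hypothetical rows: two shops (3 and 4) on the first day, only shop 3 on the second.
    INSERT INTO pm VALUES (1, '2016-01-10', 3), (2, '2016-01-10', 4), (3, '2016-01-11', 3);
    INSERT INTO pm_azzeramentofiscale VALUES
        (1, 1, 100.5), (2, 1, 200.25), (3, 2, 50.0), (4, 3, 75.0);
    """
)

pivot = conn.execute(
    """
    SELECT p.dataprimanota AS data,
           SUM(CASE WHEN p.idpuntovendita = 3 THEN a.azzeramento ELSE 0 END) AS incasso_3,
           SUM(CASE WHEN p.idpuntovendita = 4 THEN a.azzeramento ELSE 0 END) AS incasso_4
    FROM pm AS p
    JOIN pm_azzeramentofiscale AS a ON a.idprimanota = p.id
    WHERE strftime('%Y', p.dataprimanota) = '2016'
    GROUP BY p.dataprimanota
    ORDER BY p.dataprimanota
    """
).fetchall()

for row in pivot:
    print(row)
```

Each additional shop needs its own `SUM(CASE ...)` column, which is why this approach only suits a fixed, known set of idpuntovendita values.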

Mysql Multiple Left Join And Sum not Working Correctly

I am new to joins.
I want to get the sum of debts and incomes in MySQL, but I am facing a problem.
The sums come out larger than they should.
Here is the query:
Select uyeler.*,
sum(uye_gider.tutar) as gider,
sum(gelir.tutar) as gelir
from uyeler
LEFT JOIN gelir on gelir.uye=uyeler.id
LEFT JOIN uye_gider on uye_gider.uye=uyeler.id
group by uyeler.id
First:
If I don't write GROUP BY, it gives me only the first row.
Is that how it is supposed to work?
Main problem:
I have:
- 2 rows in 'uyeler' (users)
- 3 rows in 'gelir' (income)
- 1 row in 'uye_gider' (debt), with a value of 25
But when I execute this query, the value of gider is 75.
I think sum(uye_gider.tutar) is being counted 3 times because of the rows in gelir.
What am I doing wrong?
------Tables------
CREATE TABLE IF NOT EXISTS `gelir` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`tarih` date NOT NULL,
`uye` int(11) NOT NULL,
`tutar` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
CREATE TABLE IF NOT EXISTS `uyeler` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`ad` varchar(15) NOT NULL,
`soyad` varchar(15) NOT NULL,
`tc` varchar(11) NOT NULL,
`dogum` date NOT NULL,
`cep` int(11) NOT NULL,
`eposta` varchar(50) NOT NULL,
`is` int(11) NOT NULL,
`daire` int(11) NOT NULL,
`kan` varchar(5) NOT NULL,
`web` varchar(12) NOT NULL,
`webpw` varchar(100) NOT NULL,
`tur` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
CREATE TABLE IF NOT EXISTS `uye_gider` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`uye` int(11) NOT NULL,
`tutar` float NOT NULL,
`gider` int(11) NOT NULL,
`aciklama` text COLLATE utf8_bin,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=1 ;
---End Tables---
first
GROUP BY groups rows according to the field you pick. If you want a total over an entire table, you want a single row, so the query should not be grouped by anything. If you want sums per value, take for example this 'animals' table:
| id | animal | food_eaten |
| 1 | dog | 10 |
| 2 | cat | 13 |
| 3 | dog | 10 |
select animal, sum(food_eaten) as total_food_eaten from animals group by animal;
will give you
| animal | total_food_eaten |
| dog | 20 |
| cat | 13 |
That is how GROUP BY works: it partitions the rows by a column of non-unique values that you pick. Without it,
select sum(food_eaten) as total_food_eaten from animals;
will give you
|total_food_eaten|
| 33 |
second
A left join returns every row from the left table regardless of matches, and attaches any rows from the right table that match. You have three income rows associated with one user row, so the first left join produces three rows for that user. When you then left join the debt row, it matches all three of those rows, since the ID is the same on each. That is why the debt is counted three times. If you only need the two sums, I suggest splitting this into separate queries, since the income and debt tables have no relationship to each other.
This is a likely answer to help you along the way.
Multiple select statements in Single query
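The row multiplication described above, and one way around it, can be seen on a toy dataset. The sketch below uses Python's built-in sqlite3 instead of MySQL and keeps only the columns involved. The row counts follow the question (2 users, 3 income rows, 1 debt row of 25), while the income amounts and names are made up. Aggregating each child table in its own subquery before joining is one option among several, not necessarily what the linked answer does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE uyeler (id INTEGER PRIMARY KEY, ad TEXT);
    CREATE TABLE gelir (id INTEGER PRIMARY KEY, uye INTEGER, tutar INTEGER);
    CREATE TABLE uye_gider (id INTEGER PRIMARY KEY, uye INTEGER, tutar REAL);
    INSERT INTO uyeler VALUES (1, 'A'), (2, 'B');
    INSERT INTO gelir VALUES (1, 1, 10), (2, 1, 20), (3, 1, 30);  -- hypothetical amounts
    INSERT INTO uye_gider VALUES (1, 1, 25);
    """
)

# Both child tables joined directly: the single debt row is repeated for each income row.
naive = conn.execute(
    """
    SELECT u.id, SUM(g.tutar) AS gelir, SUM(d.tutar) AS gider
    FROM uyeler u
    LEFT JOIN gelir g ON g.uye = u.id
    LEFT JOIN uye_gider d ON d.uye = u.id
    GROUP BY u.id
    ORDER BY u.id
    """
).fetchall()

# Each child table reduced to one row per user before joining: no multiplication.
fixed = conn.execute(
    """
    SELECT u.id, g.total AS gelir, d.total AS gider
    FROM uyeler u
    LEFT JOIN (SELECT uye, SUM(tutar) AS total FROM gelir GROUP BY uye) g ON g.uye = u.id
    LEFT JOIN (SELECT uye, SUM(tutar) AS total FROM uye_gider GROUP BY uye) d ON d.uye = u.id
    ORDER BY u.id
    """
).fetchall()

print(naive)
print(fixed)
```

The user with no income or debt rows comes back with NULL (None in Python) in both versions; wrapping the sums in COALESCE(..., 0) would turn those into zeros if that is preferred.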

Select count(*) from A update in B Super Slow

I have a query that takes 2 minutes to count rows in one table and update another table with the count result.
Every time a number in the start column of Table_B falls within a range in Table_A (readstart/readend), it should be counted in read_count in Table_A.
id | readstart | readend | read_count
1 | 2999997 | 3000097 | 0
2 | 3000097 | 3000197 | 0
3 | 3000497 | 3000597 | 0
4 | 3001597 | 3001697 | 0
5 | 3001897 | 3001997 | 0
6 | 3005397 | 3005497 | 0
7 | 3005997 | 3006097 | 0
8 | 3006397 | 3006497 | 0
9 | 3006797 | 3006897 | 0
10| 3007497 | 3007597 | 0
Here is the table I should update with the count result :
CREATE TABLE `rdc_test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`readstart` int(11) DEFAULT NULL,
`readend` int(11) DEFAULT NULL,
`read_count` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `readstart` (`readstart`),
KEY `readend` (`readend`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8;
Here is the table from which I want to count matching rows:
CREATE TABLE `1ips_chr1` (
`strand` char(1) DEFAULT NULL,
`chr` varchar(10) DEFAULT NULL,
`start` int(11) DEFAULT NULL,
`end` int(11) DEFAULT NULL,
`name` varchar(255) DEFAULT NULL,
`name2` varchar(255) DEFAULT NULL,
`id` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`),
KEY `start` (`start`),
KEY `end` (`end`)
) ENGINE=MyISAM AUTO_INCREMENT=34994289 DEFAULT CHARSET=latin1;
I did a test on 10 rows and the result was horrible: 2 minutes to select count(*) and update 10 rows. I have about 350,000 rows in Table_A to update and 35,000,000 rows in Table_B. I know that on average each count should return 30~40.
Here is my super slow query :
UPDATE rdc_test
SET rdc_test.read_count =
(
SELECT COUNT(start) as read_count
FROM 1ips_chr1
WHERE 1ips_chr1.start >= rdc_test.readstart
AND 1ips_chr1.start <= rdc_test.readend
)
Query OK, 10 rows affected (2 min 22.20 sec)
Rows matched: 10 Changed: 10 Warnings: 0
Try this :
UPDATE rdc_test t1
INNER JOIN
(
SELECT r.id AS id,
COUNT(l.start) AS read_count
FROM rdc_test r
LEFT OUTER JOIN `1ips_chr1` l
ON l.start >= r.readstart
AND l.start <= r.readend
GROUP BY r.id
) t2
ON t1.id = t2.id
SET t1.read_count = t2.read_count
Edit :
Due to the amount of data you need to update, the best way is to recreate the table instead of performing an update:
CREATE TABLE new_rdc_test AS
SELECT r.id AS id,
r.readstart AS readstart,
r.readend AS readend,
COUNT(l.start) AS read_count
FROM rdc_test r
LEFT OUTER JOIN `1ips_chr1` l
ON l.start >= r.readstart
AND l.start <= r.readend
GROUP BY r.id, r.readstart, r.readend
Does this query run fast enough ?
Try to bring the COUNT(*) to the application level (i.e. compute it in PHP/Java and store it in a variable), then do the UPDATE with that value. MySQL will then not have to calculate the count for every record you update.
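As one possible reading of the application-level suggestion, the counting could be done with a sorted list and binary search instead of one SQL COUNT per range. The sketch below is an assumption about how that might look, not code from the thread: `count_in_ranges` is a hypothetical helper, the start values are made up, and it presumes the start column fits in memory. The first three ranges are taken from the sample table in the question.

```python
from bisect import bisect_left, bisect_right


def count_in_ranges(starts, ranges):
    """Count, for each (id, readstart, readend), the starts with readstart <= start <= readend."""
    sorted_starts = sorted(starts)
    counts = {}
    for range_id, readstart, readend in ranges:
        lo = bisect_left(sorted_starts, readstart)   # first index with value >= readstart
        hi = bisect_right(sorted_starts, readend)    # first index with value > readend
        counts[range_id] = hi - lo
    return counts


# Ranges copied from the first three rows of the sample table.
ranges = [
    (1, 2999997, 3000097),
    (2, 3000097, 3000197),
    (3, 3000497, 3000597),
]
# Hypothetical start values standing in for the 1ips_chr1.start column.
starts = [2999997, 3000050, 3000097, 3000100, 3000600]

counts = count_in_ranges(starts, ranges)
print(counts)
```

In a real run the start values would be read from 1ips_chr1 once, and each resulting count written back to rdc_test.read_count with a plain `UPDATE ... WHERE id = ...`. Sorting costs O(n log n) once, after which each range takes two binary searches.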