Suppose you have the following SQL table:
A B C
2 1 4
3 4 5
3 1 1
1 4 0
5 0 1
And you want to add/show a column containing the mean (or any other aggregate function) of column A for each distinct value of column B. You want to keep all columns. So the result would look like this:
A B C avg(A)|B
2 1 4 2.5
3 4 5 2.0
3 1 1 2.5
1 4 0 2.0
5 0 1 5.0
The best way to do it in pandas, as far as I know, would be:
>>> df['avg(A)|B'] = df.groupby('B')['A'].transform('mean')
>>> df
A B C avg(A)|B
0 2 1 4 2.5
1 3 4 5 2.0
2 3 1 1 2.5
3 1 4 0 2.0
4 5 0 1 5.0
How would you do it in SQL? Can one avoid using a JOIN?
You can join to a derived table that contains the aggregate value for each grouping of b
select * from mytable t1
join (
select avg(a), b
from mytable
group by b
) t2 on t2.b = t1.b
or using a subquery
select *, (select avg(a) from mytable t2 where t2.b = t1.b)
from mytable t1
the question is tagged both mysql and psql, so I'm not sure which db you're using. But on postgres you can use window functions
select *, avg(a) over (partition by b)
from mytable
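To check that the window-function form and the derived-table join agree on the sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL/PostgreSQL here, which is an assumption (window functions need SQLite 3.25 or newer), and the alias avg_a_by_b is made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (a integer, b integer, c integer)")
conn.executemany(
    "insert into mytable values (?, ?, ?)",
    [(2, 1, 4), (3, 4, 5), (3, 1, 1), (1, 4, 0), (5, 0, 1)],
)

# Window function: every row is kept, avg(a) is computed per value of b.
window_rows = conn.execute(
    """
    select a, b, c, avg(a) over (partition by b) as avg_a_by_b
    from mytable
    order by rowid
    """
).fetchall()

# Same result via a join to a derived table with one row per b.
join_rows = conn.execute(
    """
    select t1.a, t1.b, t1.c, t2.avg_a_by_b
    from mytable t1
    join (select b, avg(a) as avg_a_by_b from mytable group by b) t2
      on t2.b = t1.b
    order by t1.rowid
    """
).fetchall()

for row in window_rows:
    print(row)
```

Both queries return the five original rows with the per-group mean appended, matching the pandas transform('mean') output in the question.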
I have table with the following:
ID TYPE ProjectID Date Rev
1 A 1 30-1-2010 500
2 B 1 28-02-2011 580
2 B 2 30-04-2011 540
2 B 3 03-04-2019 440
Results:
ID TYPE ProjectID Date Rev
1 A 1 30-1-2010 500
1 A 2 01-01-2000 0
1 A 3 01-01-2000 0
2 B 1 28-02-2011 580
2 B 2 30-04-2011 540
2 B 3 03-04-2019 440
I want to write an SQL query in which, whenever there is Type “A”, two rows should automatically be inserted with project id 2 and 3 and with default Date and Rev data.
Currently, I am using UNION function to add this data manually, but I want to do it automatically.
I am not sure how to do this in SQL.
If I understand correctly:
select it.id, it.type, p.projectid,
coalesce(t.date, '2000-01-01') as date, coalesce(t.rev, 0) as rev
from (select distinct id, type from t) it cross join
(select distinct projectid from t) p left join
t
on t.id = it.id and t.type = it.type and t.projectid = p.projectid;
This fills in the missing values, so all id/type combinations have all projects.
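A quick way to see the cross join / left join combination at work is to run it against the sample rows. This is only a sketch: it uses SQLite through Python's sqlite3 module rather than the asker's database, and it stores the dates as ISO strings (for example 2010-01-30), which is an assumption since the question shows them as dd-mm-yyyy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (id integer, type text, projectid integer, date text, rev integer)"
)
# Sample rows from the question; dates rewritten as ISO strings (assumption).
conn.executemany(
    "insert into t values (?, ?, ?, ?, ?)",
    [
        (1, "A", 1, "2010-01-30", 500),
        (2, "B", 1, "2011-02-28", 580),
        (2, "B", 2, "2011-04-30", 540),
        (2, "B", 3, "2019-04-03", 440),
    ],
)

# Build every (id, type) x projectid combination, then left join the real
# rows back in; coalesce supplies the defaults where no real row exists.
filled = conn.execute(
    """
    select it.id, it.type, p.projectid,
           coalesce(t.date, '2000-01-01') as date,
           coalesce(t.rev, 0) as rev
    from (select distinct id, type from t) it
    cross join (select distinct projectid from t) p
    left join t
      on t.id = it.id and t.type = it.type and t.projectid = p.projectid
    order by it.id, p.projectid
    """
).fetchall()

for row in filled:
    print(row)
```

The two generated rows for type A carry the default date and a rev of 0, while the existing rows are returned unchanged.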
I have a table in my database which has 3 columns: (id, business_id, name). I need to write a query that selects 10 rows with id greater than a specific value, with no more than 5 rows selected for each business_id. How can I include this criterion in the query?
so for example if we have these rows in table:
1 A JAD
2 A LPO
3 A LMN
4 A ABC
5 A QWE
6 A WER
7 B TYU
8 B POI
9 B AQZ
10 B UYT
11 C CDE
12 C XYZ
the desired result is (for id>0):
1 A JAD
2 A LPO
3 A LMN
4 A ABC
5 A QWE
7 B TYU
8 B POI
9 B AQZ
10 B UYT
11 C CDE
If you are using MySQL 8+, then ROW_NUMBER can be used here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY business_id ORDER BY id) rn
FROM yourTable
)
SELECT id, business_id, name
FROM cte
WHERE rn <= 5;
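The query above caps each business at 5 rows; the question also asks for id greater than a given value and 10 rows overall. The sketch below adds both, run on the sample rows with SQLite via Python's sqlite3 module (an assumption, standing in for MySQL 8+). The parameter name min_id is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table yourTable (id integer primary key, business_id text, name text)"
)
conn.executemany(
    "insert into yourTable values (?, ?, ?)",
    [
        (1, "A", "JAD"), (2, "A", "LPO"), (3, "A", "LMN"), (4, "A", "ABC"),
        (5, "A", "QWE"), (6, "A", "WER"), (7, "B", "TYU"), (8, "B", "POI"),
        (9, "B", "AQZ"), (10, "B", "UYT"), (11, "C", "CDE"), (12, "C", "XYZ"),
    ],
)

min_id = 0  # hypothetical lower bound on id, as in the question's example

# Filter on id first, number the surviving rows per business, keep at most
# five per business, then take the first ten rows overall by id.
picked = conn.execute(
    """
    with cte as (
        select id, business_id, name,
               row_number() over (partition by business_id order by id) as rn
        from yourTable
        where id > ?
    )
    select id, business_id, name
    from cte
    where rn <= 5
    order by id
    limit 10
    """,
    (min_id,),
).fetchall()

for row in picked:
    print(row)
```

Placing the id filter inside the CTE matters: the row numbers are then assigned only among rows that pass the filter, so each business can still contribute up to five rows.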
In older versions of MySQL, you can use:
select t.*
from t
where id > @id and
id < any (select t2.id
from t t2
where t2.business_id = t.business_id
order by id asc
limit 1 offset 4
)
limit 10;
The any is to handle the case where a business has fewer than four rows.
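If your server does not accept LIMIT inside an ANY subquery, a different technique that needs neither window functions nor LIMIT in a subquery is a correlated count: keep a row only when fewer than five qualifying rows of the same business come before it. This is a swapped-in alternative, not the query above, and it is sketched here on SQLite through Python's sqlite3 module (an assumption). The helper pick and the parameter min_id are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key, business_id text, name text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        (1, "A", "JAD"), (2, "A", "LPO"), (3, "A", "LMN"), (4, "A", "ABC"),
        (5, "A", "QWE"), (6, "A", "WER"), (7, "B", "TYU"), (8, "B", "POI"),
        (9, "B", "AQZ"), (10, "B", "UYT"), (11, "C", "CDE"), (12, "C", "XYZ"),
    ],
)

# For each candidate row, count the earlier rows of the same business that
# also pass the id filter; the row is kept while that count is below five.
QUERY = """
    select t.id, t.business_id, t.name
    from t
    where t.id > :min_id
      and (select count(*)
           from t t2
           where t2.business_id = t.business_id
             and t2.id > :min_id
             and t2.id < t.id) < 5
    order by t.id
    limit 10
"""


def pick(min_id):
    """Return up to 10 rows with id > min_id, at most 5 per business_id."""
    return conn.execute(QUERY, {"min_id": min_id}).fetchall()


for row in pick(0):
    print(row)
```

Businesses with fewer than five rows need no special handling in this form, because the count is simply smaller for them.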
I have 2 views with different numbers of columns. One of the views has been joined with another view, which is why it has additional columns.
The first view (View 2) has 113 records, while the updated view (View 1) has 130 records. I would like to find the extra records in View 1.
View 1        View 2
A|B|C|D|E     A|B|C
1 2 3 4 5     1 2 3
1 2 3 7 8
3 2 1 4 5     3 2 1
3 2 1 7 8
expected result :
1 2 3 7 8
3 2 1 7 8
Thanks.
You can get those extra records by using a 'not in' or 'not exists' condition:
select * from view1 m where not exists (
select 1 from view2 u where (m.a=u.a and m.b=u.b and m.c=u.c)
);
You can change those conditions as per your requirement.
A left join will also give the required result:
select m.* from view1 m left join view2 u
on (m.a=u.a and m.b=u.b and m.c=u.c)
where u.a is null and u.b is null and u.c is null
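To illustrate that the 'not exists' and left join forms return the same rows, here is a small sketch on SQLite via Python's sqlite3 module (an assumption). Plain tables named view1 and view2 stand in for the views, and the row (9, 9, 9, 0, 0) is hypothetical: it was added so that view1 holds a key that is absent from view2, since every (a, b, c) in the question's sample also appears in View 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Plain tables standing in for the two views (assumption).
conn.execute("create table view1 (a integer, b integer, c integer, d integer, e integer)")
conn.execute("create table view2 (a integer, b integer, c integer)")
conn.executemany(
    "insert into view1 values (?, ?, ?, ?, ?)",
    [
        (1, 2, 3, 4, 5), (1, 2, 3, 7, 8), (3, 2, 1, 4, 5), (3, 2, 1, 7, 8),
        (9, 9, 9, 0, 0),  # hypothetical row whose key is missing from view2
    ],
)
conn.executemany("insert into view2 values (?, ?, ?)", [(1, 2, 3), (3, 2, 1)])

# Anti-join with NOT EXISTS: keep view1 rows with no matching key in view2.
not_exists_rows = conn.execute(
    """
    select m.* from view1 m
    where not exists (
        select 1 from view2 u
        where m.a = u.a and m.b = u.b and m.c = u.c
    )
    """
).fetchall()

# Anti-join with LEFT JOIN: unmatched rows have NULLs on the view2 side.
left_join_rows = conn.execute(
    """
    select m.* from view1 m
    left join view2 u
      on m.a = u.a and m.b = u.b and m.c = u.c
    where u.a is null and u.b is null and u.c is null
    """
).fetchall()

print(not_exists_rows)
```

Both forms only report rows whose (a, b, c) key is missing from view2; rows that repeat an existing key are not reported by either query.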
You should probably refactor your DB schema and data logic.
But just to resolve your weird requirements you can do this:
http://sqlfiddle.com/#!9/cf2c50/2
SELECT t.a, t.b, t.c, t.d, t.e
FROM (
SELECT v1.*, IF(@idx = concat(v1.a,v1.b,v1.c),1,0) `filter`,@idx := concat(v1.a,v1.b,v1.c)
FROM v1
INNER JOIN v2
ON v1.a=v2.a AND v1.b=v2.b AND v1.c=v2.c
ORDER BY v1.a,v1.b,v1.c
) t
WHERE t.`filter`=1;
It is not the best example of query performance, but it should return the expected result.
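The same idea (flag every row whose a, b, c key repeats the previous row's key) can also be written with ROW_NUMBER instead of user variables. This is a swapped-in technique, not the query above, sketched on SQLite via Python's sqlite3 module (an assumption). Ordering by d, e inside each key is also an assumption about which row counts as the original one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Plain tables v1 and v2 stand in for the two views (assumption).
conn.execute("create table v1 (a integer, b integer, c integer, d integer, e integer)")
conn.execute("create table v2 (a integer, b integer, c integer)")
conn.executemany(
    "insert into v1 values (?, ?, ?, ?, ?)",
    [(1, 2, 3, 4, 5), (1, 2, 3, 7, 8), (3, 2, 1, 4, 5), (3, 2, 1, 7, 8)],
)
conn.executemany("insert into v2 values (?, ?, ?)", [(1, 2, 3), (3, 2, 1)])

# Number the v1 rows within each (a, b, c) key that also exists in v2;
# every row after the first one for a key is treated as "extra".
extra_rows = conn.execute(
    """
    select a, b, c, d, e
    from (
        select v1.a, v1.b, v1.c, v1.d, v1.e,
               row_number() over (
                   partition by v1.a, v1.b, v1.c
                   order by v1.d, v1.e
               ) as rn
        from v1
        where exists (
            select 1 from v2
            where v2.a = v1.a and v2.b = v1.b and v2.c = v1.c
        )
    ) ranked
    where rn > 1
    order by a, b, c
    """
).fetchall()

for row in extra_rows:
    print(row)
```

On the sample data this yields the two rows listed as the expected result in the question.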
My question is pretty similar to this one, "Auto number and reset count for each different column value",
except that I can't make it work.
I have the table record:
ID(autoINC) plate_number
1 A
2 A
3 A
4 B
5 B
6 C
7 C
I want to display something like this, adding an additional count field:
ID(autoINC) plate_number count
1 A 1
2 A 2
3 A 3
4 B 1
5 B 2
6 C 1
7 C 2
You can use a correlated subquery that counts the rows up to and including the current one within each plate_number; that count serves as a row number.
SELECT a.ID,
a.plate_number,
(
SELECT COUNT(*)
FROM tableName c
WHERE c.plate_number = a.plate_number AND
c.ID <= a.ID) AS RowNumber
FROM tableName a
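As a sanity check, the correlated count can be run on the sample rows and compared with ROW_NUMBER, which gives the same numbering where window functions are available. This is a sketch on SQLite via Python's sqlite3 module (an assumption); the table name tableName is taken from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tableName (ID integer primary key autoincrement, plate_number text)"
)
conn.executemany(
    "insert into tableName (plate_number) values (?)",
    [("A",), ("A",), ("A",), ("B",), ("B",), ("C",), ("C",)],
)

# Correlated subquery: count rows with the same plate and an ID not greater
# than the current one. That count is the row's position within its plate.
correlated = conn.execute(
    """
    select a.ID, a.plate_number,
           (select count(*)
            from tableName c
            where c.plate_number = a.plate_number
              and c.ID <= a.ID) as RowNumber
    from tableName a
    order by a.ID
    """
).fetchall()

# Window-function equivalent (needs SQLite 3.25+ or MySQL 8+).
windowed = conn.execute(
    """
    select ID, plate_number,
           row_number() over (partition by plate_number order by ID) as RowNumber
    from tableName
    order by ID
    """
).fetchall()

for row in correlated:
    print(row)
```

The counter restarts at 1 for each new plate_number in both versions.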
i have a table of this general format. it was generated via pivoting, so the number of columns is not fixed.
id c1 c2... total
10 0 2 1 1 0 4
9 0 1 0 1 0 2
8 1 2 0 0 0 3
7 0 0 0 1 0 1
6 0 1 0 1 1 3
5 1 0 0 1 2 4
4 0 1 1 0 0 2
3 0 3 0 1 1 5
2 2 2 2 0 0 6
1 1 0 1 0 0 2
what i need is to take the "total" col (the last one, on the right) and divide each of the {c1, c2, c3....} columns by their row's total. for instance, in row 10, c2=2, so c2/total = 2/4 = 0.5
just to emphasize, the number of cols is not fixed. this is a sample table.
is it possible to do this only via mysql, or is an external script needed?
many thanks
EDIT TO CLARIFY:
my inital data, pre-pivoting, looks like this:
2 2
8 1
2 2
1 5
3 1
9 1
5 3
4 1
1 2
10 5
6 4
4 5
5 2
10 3
5 4
3 1
6 1
6 3
3 4
3 1
5 4
7 3
2 5
10 1
9 3
where the first col is "id", second is "c". as shown, it needs to be transformed into a contingency table of sort. where each id has a count for each "c" {c1,c2,c3...}
is there an efficient way to code this data into the format @bobwienholt mentioned below? (i'm new to mysql, in fact i taught it to myself today for the pivoting. apologies if this is trivial).
If I were you, I would structure my table as follows:
CREATE TABLE data ( row INT, col INT, value INT );
Then you can do this:
SELECT d.row, d.col, d.value/t.total
FROM (
SELECT row, SUM(value) as total
FROM data
GROUP BY row
) t INNER JOIN data d
ON d.row = t.row
ORDER BY d.row, d.col;
It would work for any number of "rows" and "columns".
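Here is a small sketch of the long-format approach, loaded with rows 10 and 9 of the sample table and run on SQLite via Python's sqlite3 module (an assumption). Two details differ from the query above and are choices made for this example: the columns are named row_id and col_id, because row may be a reserved word in some engines, and the value is multiplied by 1.0, because SQLite would otherwise perform integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# row_id / col_id are hypothetical names replacing row / col.
conn.execute("create table data (row_id integer, col_id integer, value integer)")

# Rows 10 and 9 of the sample table, unpivoted to one record per cell.
wide = {10: [0, 2, 1, 1, 0], 9: [0, 1, 0, 1, 0]}
conn.executemany(
    "insert into data values (?, ?, ?)",
    [(r, c, v) for r, values in wide.items() for c, v in enumerate(values, start=1)],
)

# Join each cell to its row total and divide; "* 1.0" forces real division.
rows = conn.execute(
    """
    select d.row_id, d.col_id, d.value * 1.0 / t.total as share
    from (
        select row_id, sum(value) as total
        from data
        group by row_id
    ) t
    inner join data d on d.row_id = t.row_id
    order by d.row_id, d.col_id
    """
).fetchall()

# Index the result by (row, column) for easy lookup.
shares = {(r, c): s for r, c, s in rows}
print(shares[(10, 2)])  # c2 / total for row 10, i.e. 2 / 4
```

Because every cell is its own record, adding another column c6 later only means inserting more records; the query does not change.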
Ok, based on your edit... I would just import the data as is. So you would create a table like this:
CREATE TABLE data ( id INT, c INT);
Then you could import your data using LOAD DATA INFILE. You should consult the MySQL docs to learn how to use that.
Then, you would get all your c1, c2, etc counts like this:
SELECT id, c, COUNT(1) as num
FROM data
GROUP BY id, c;
That would yield results like (based on your sample data):
id c num
1 2 1
1 5 1
2 2 2
2 5 1
3 1 3
So, basically for id 3, c1 = 3... for id 2 c2=2, etc.
Your total column would be:
SELECT id, SUM(num) as total
FROM (
SELECT id, c, COUNT(1) as num
FROM data
GROUP BY id, c
) x
GROUP BY id;
In this scenario, you wouldn't have to pivot your data.
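Putting the two queries together gives the ratio the question asks for, straight from the unpivoted (id, c) pairs. The sketch below runs on SQLite via Python's sqlite3 module (an assumption) with the sample pairs from the question. The aliases n and x, the column name share and the "* 1.0" (to avoid integer division in SQLite) are choices made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table data (id integer, c integer)")

# The 25 (id, c) pairs from the question's pre-pivot sample.
pairs = [
    (2, 2), (8, 1), (2, 2), (1, 5), (3, 1), (9, 1), (5, 3), (4, 1), (1, 2),
    (10, 5), (6, 4), (4, 5), (5, 2), (10, 3), (5, 4), (3, 1), (6, 1), (6, 3),
    (3, 4), (3, 1), (5, 4), (7, 3), (2, 5), (10, 1), (9, 3),
]
conn.executemany("insert into data values (?, ?)", pairs)

# Per-(id, c) counts joined to per-id totals; share = num / total.
result = conn.execute(
    """
    select n.id, n.c, n.num, n.num * 1.0 / x.total as share
    from (
        select id, c, count(1) as num
        from data
        group by id, c
    ) n
    join (
        select id, count(1) as total
        from data
        group by id
    ) x on x.id = n.id
    order by n.id, n.c
    """
).fetchall()

for row in result[:5]:
    print(row)
```

The first five rows carry the same id, c and num values as the sample output above, with the share added as a fourth column.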