Create view from two tables with different column names - mysql

Is there a way to create a view from two tables where one of the columns differs between the two tables? The problem I am currently running into is that MySQL is telling me there is an undefined index, which makes perfect sense since, in half of the cases, the column won't exist.
Table Layout:
(post_rank_activity)
ID, post_id, ... date
(reply_rank_activity)
ID, rank_id, ... date
What I want the resulting view to look like:
ID | Post_id | Reply_id | Date
x  | x       | NULL     | x
x  | NULL    | x        | x
And the SQL:
$rankView = "Create or replace view userRank as (
select PRA.id, PRA.post_id, PRA.user_id, PRA.vote_up, PRA.rank_date
From post_rank_activity PRA)
union All
(select RRA.id, RRA.reply_id, RRA.user_id, RRA.vote_up, RRA.rank_date
from reply_rank_activity RRA)";
This is the result I'm getting: instead of returning NULL, it returns the value of "reply_id" in the "post_id" field and shifts all of the other values over. See below:
ID | Post_id   | Reply_id | Date
x  | x         | date val | x
x  | reply val | date val | x
Any ideas?

UNION matches columns by position, not by name, so every part of the union must select the same number of columns in the same order. You should explicitly select the missing column as NULL in each part of the union:
SELECT PRA.id, PRA.post_id, NULL AS reply_id, PRA.user_id, PRA.vote_up, PRA.rank_date
FROM post_rank_activity PRA
UNION All
SELECT RRA.id, NULL AS post_id, RRA.reply_id, RRA.user_id, RRA.vote_up, RRA.rank_date
FROM reply_rank_activity RRA

Your query should look like
select PRA.id, PRA.post_id, null as reply_id, PRA.rank_date
From post_rank_activity PRA
union All
select RRA.id, null as post_id, RRA.reply_id, RRA.rank_date
from reply_rank_activity RRA
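
To see the positional matching in action, here is a minimal sketch that builds the view with NULL placeholders. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, chosen only so the example runs anywhere), and the trimmed-down table definitions and sample values are made up for illustration.

```python
import sqlite3

def build_user_rank():
    # In-memory SQLite database used as a stand-in for MySQL (assumption).
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Simplified, hypothetical versions of the two tables.
        CREATE TABLE post_rank_activity  (id INTEGER, post_id  INTEGER, rank_date TEXT);
        CREATE TABLE reply_rank_activity (id INTEGER, reply_id INTEGER, rank_date TEXT);
        INSERT INTO post_rank_activity  VALUES (1, 10, '2013-01-01');
        INSERT INTO reply_rank_activity VALUES (2, 20, '2013-01-02');

        -- Each branch pads the column it lacks with NULL, so both
        -- branches expose four columns in the same order.
        CREATE VIEW userRank AS
            SELECT id, post_id, NULL AS reply_id, rank_date FROM post_rank_activity
            UNION ALL
            SELECT id, NULL AS post_id, reply_id, rank_date FROM reply_rank_activity;
    """)
    rows = conn.execute(
        "SELECT id, post_id, reply_id, rank_date FROM userRank ORDER BY id"
    ).fetchall()
    conn.close()
    return rows

print(build_user_rank())
# [(1, 10, None, '2013-01-01'), (2, None, 20, '2013-01-02')]
```

The column names of the view come from the first SELECT, which is why the aliases in the second branch are only there for readability.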

Related

Check for all null values in a GROUP BY [duplicate]

This question already has answers here:
MySQL AVG() return 0 if NULL
I have the following structure:
id val
1 ...
.
.
2 ...
.
.
3 null
3 null
3 null
4 ...
.
.
Basically, each id has multiple values, and an id has either all values as integers or all values as null.
What I want is to perform an aggregate (like AVG) on val grouped by id. If that id has null values, I want to put 5 there.
#1
SELECT id, (CASE SUM(val) WHEN null THEN 5 ELSE AVG(val) END) AS ac FROM tt GROUP BY id
> executes ELSE even for id = 3
In the CASE, there should be an aggregate function that gives null when applied to null values.
I checked SUM and MAX like this:
SELECT SUM(val) FROM tt WHERE id = 3
> null
and it gives null here, but it doesn't work in the main statement. I guess it is related to the type of equality, so I tried WHEN IS NULL, but it's a syntax error.
Also, is there a more standard way of detecting that a group of values is all null, rather than using SUM or MAX?
You can use IF():
select id, If(sum(val) is null, 5, AVG(val)) as average
FROM tt
group by id
Demo: https://dbfiddle.uk/Uso9nNTM
The exact problem with your CASE expression is that CASE SUM(val) WHEN NULL compares with equality, and in MySQL a comparison with NULL is never true; to check for null we have to use IS NULL. So use this version:
CASE WHEN SUM(val) IS NULL THEN 5 ELSE AVG(val) END
But we might as well just use COALESCE() to assign an average of 5 to those id groups that have only null values.
SELECT id, COALESCE(AVG(val), 5) AS avg_val
FROM tt
GROUP BY id;
Note that the AVG() function ignores nulls by default. Therefore, the expression AVG(val) is null only if every record in an id group has null for val.
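
The three variants can be compared side by side. This is a small sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption; the NULL comparison rules shown here are the same in both), with a made-up tt table containing one all-integer group and one all-null group.

```python
import sqlite3

def averages():
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE tt (id INTEGER, val INTEGER)")
    conn.executemany(
        "INSERT INTO tt VALUES (?, ?)",
        [(1, 2), (1, 4), (3, None), (3, None), (3, None)],  # sample data
    )
    rows = conn.execute("""
        SELECT id,
               -- simple CASE: tests SUM(val) = NULL, which never matches
               CASE SUM(val) WHEN NULL THEN 5 ELSE AVG(val) END AS simple_case,
               -- searched CASE: tests IS NULL explicitly
               CASE WHEN SUM(val) IS NULL THEN 5 ELSE AVG(val) END AS searched_case,
               -- COALESCE: falls back to 5 when AVG(val) is NULL
               COALESCE(AVG(val), 5) AS coalesced
        FROM tt
        GROUP BY id
        ORDER BY id
    """).fetchall()
    conn.close()
    return rows

print(averages())
# [(1, 3.0, 3.0, 3.0), (3, None, 5, 5)]
```

For id 3 the simple CASE falls through to ELSE and returns the NULL average, while the other two columns return 5.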

count comma-separated values from a column - sql

I want to count the number of values in a comma-separated column.
I have used this expression
(LENGTH(Col2) - LENGTH(REPLACE(Col2,",","")) + 1)
in my select query.
Demo:
id | mycolumn
1  | 2,5,8,60
2  | 4,5,1
3  | 5,Null,Null
The query result for the first two rows is correct (4 for id 1 and 3 for id 2), but for the third row it counts the Null values as well.
Here is what I believe the actual state of your data is:
id | mycolumn
1  | 2,5,8,60
2  | 4,5,1
3  | NULL
In other words, the entire value for mycolumn in your third record is NULL, likely from doing an operation involving a NULL value. If you actually had the text NULL your current query should still work.
The way to get around this would be to use COALESCE(val, "") when handling the NULL values in your strings.
A crude way of doing it is to replace the occurrences of ',Null' with nothing first:
SELECT a.id, (LENGTH(REPLACE(mycolumn, ',Null', '')) - LENGTH(REPLACE(REPLACE(mycolumn, ',Null', ''),",","")) + 1)
FROM some_table a
If the values refer to the ids of rows in another table, then you can join against that table using FIND_IN_SET and count the matches (assuming that the string 'Null' is not an id in that other table):
SELECT a.id, COUNT(b.id)
FROM some_table a
INNER JOIN id_list_table b
ON FIND_IN_SET(b.id, a.mycolumn)
GROUP BY a.id
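
The difference between the literal text 'Null' and a real NULL can be checked with a short sketch. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption; LENGTH and REPLACE behave the same way for this case), and the rows, including the extra fourth row holding a real NULL, are sample data.

```python
import sqlite3

# Counting expression from the question, parameterised on the column expression.
COUNT_EXPR = "LENGTH({c}) - LENGTH(REPLACE({c}, ',', '')) + 1"

def count_items():
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE some_table (id INTEGER, mycolumn TEXT)")
    conn.executemany(
        "INSERT INTO some_table VALUES (?, ?)",
        [(1, "2,5,8,60"), (2, "4,5,1"), (3, "5,Null,Null"), (4, None)],
    )
    plain = COUNT_EXPR.format(c="mycolumn")
    # Crude variant: strip every ',Null' before counting the commas.
    stripped = COUNT_EXPR.format(c="REPLACE(mycolumn, ',Null', '')")
    rows = conn.execute(
        f"SELECT id, {plain}, {stripped} FROM some_table ORDER BY id"
    ).fetchall()
    conn.close()
    return rows

print(count_items())
# [(1, 4, 4), (2, 3, 3), (3, 3, 1), (4, None, None)]
```

Row 3 shows the text 'Null' being counted unless it is stripped first, and row 4 shows that a real NULL makes the whole expression NULL.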

Why does this query return no result?

I have two tables. The table reseau_stream holds the information about a user's posts. A user can share someone else's post, and the table reseau_share makes that connection (the details of both tables are below). Now, if a user shares someone else's post, I have to order my query using the datetime of reseau_share.
I don't have a lot of MySQL skills, but with some help I finally ended up with the query below. It works only if reseau_share has a row in it. If reseau_share is empty, the query returns 0 results. I really don't understand why. Can anyone identify why? Cheers.
Table reseau_stream
id | user_id | content      | datetime
1  | 100     | Lorem Ipsum1 | 2013-03-04 19:35:02
2  | 100     | Lorem Ipsum2 | 2013-03-04 12:35:02
Table reseau_share
id | user_id | target_id | stream_id | datetime
-------------------- EMPTY ------------------------
The query
SELECT reseau_stream.id,
reseau_stream.user_id,
reseau_stream.content,
IF(reseau_stream.user_id = 100, reseau_stream.datetime, reseau_share.datetime) as datetime
FROM reseau_stream, reseau_share
WHERE reseau_stream.id
IN (
SELECT id
FROM reseau_stream
WHERE user_id = 100
UNION
SELECT stream_id
FROM reseau_share
WHERE user_id = 100
) ORDER BY datetime DESC;
Basically it looks like you need a LEFT JOIN on reseau_share. Right now you have a comma join, which is a CROSS JOIN (Cartesian product) and (a) is causing the zero rows, as @diegoperini has pointed out, and (b) probably isn't what you really want. It's unclear which column relates the two tables. I'll guess it's user_id:
SELECT
reseau_stream.id,
reseau_stream.user_id,
reseau_stream.content,
IF(reseau_stream.user_id = 100, reseau_stream.datetime, reseau_share.datetime) as datetime
FROM reseau_stream
LEFT JOIN reseau_share ON reseau_stream.user_id = reseau_share.user_id
WHERE reseau_stream.id
IN (
SELECT id
FROM reseau_stream
WHERE user_id = 100
UNION
SELECT stream_id -- or whatever
FROM reseau_share
WHERE user_id = 100
)
ORDER BY datetime DESC;
The Cartesian product of a non-empty set with an empty set is an empty set.
Listing multiple tables in a FROM clause joins them by this rule, which yields 0 results in your case.
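
The effect is easy to reproduce. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption), loads the two rows from the question into reseau_stream, leaves reseau_share empty, and counts the rows produced by each kind of join. The join column user_id is the same guess as in the answer above.

```python
import sqlite3

def join_counts():
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.executescript("""
        CREATE TABLE reseau_stream (id INTEGER, user_id INTEGER,
                                    content TEXT, datetime TEXT);
        CREATE TABLE reseau_share  (id INTEGER, user_id INTEGER, target_id INTEGER,
                                    stream_id INTEGER, datetime TEXT);
        INSERT INTO reseau_stream VALUES (1, 100, 'Lorem Ipsum1', '2013-03-04 19:35:02');
        INSERT INTO reseau_stream VALUES (2, 100, 'Lorem Ipsum2', '2013-03-04 12:35:02');
        -- reseau_share is intentionally left empty
    """)
    # Comma join: 2 rows x 0 rows = 0 rows.
    comma = conn.execute(
        "SELECT COUNT(*) FROM reseau_stream, reseau_share"
    ).fetchone()[0]
    # LEFT JOIN: every reseau_stream row survives, with NULLs for reseau_share.
    left = conn.execute("""
        SELECT COUNT(*)
        FROM reseau_stream
        LEFT JOIN reseau_share ON reseau_stream.user_id = reseau_share.user_id
    """).fetchone()[0]
    conn.close()
    return comma, left

print(join_counts())
# (0, 2)
```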

Why does the total from my query results not add up?

I have three queries that get stats from the database, but the numbers they return do not add up. If I do the math myself, this is what I get: 440728 / 1128 = 390.72
However, the following is what is returned by my queries:
SELECT * FROM facebook_accts
WHERE user_id IN (SELECT id FROM `user_accts` WHERE owner_id = '121')
// returns 1128
SELECT sum(friend_count) FROM facebook_accts
WHERE user_id IN
(SELECT id FROM `user_accts` WHERE owner_id = '121')
// returns 440728
SELECT avg(friend_count) FROM facebook_accts
WHERE user_id IN
(SELECT id FROM `user_accts` WHERE owner_id = '121')
// returns 392.11 (number formatted to two decimal places by php)
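This may be happening because the friend_count column has some NULL values. SUM and AVG ignore NULL values, so AVG divides by the number of non-NULL rows rather than by all 1128 rows.
I guess the 1128 rows contain NULL values (which AVG and SUM ignore). For what it's worth, 440728 / 392.11 ≈ 1124, which would be consistent with about four of the rows having a NULL friend_count.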
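
A tiny sketch makes the discrepancy visible. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption; the aggregates treat NULL the same way), with four made-up rows, one of which has a NULL friend_count.

```python
import sqlite3

def friend_stats():
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE facebook_accts (user_id INTEGER, friend_count INTEGER)")
    conn.executemany(
        "INSERT INTO facebook_accts VALUES (?, ?)",
        [(1, 100), (2, 200), (3, None), (4, 300)],  # sample data, one NULL
    )
    row = conn.execute("""
        SELECT COUNT(*),                            -- all rows, NULLs included
               COUNT(friend_count),                 -- non-NULL rows only
               SUM(friend_count),                   -- NULLs skipped
               AVG(friend_count),                   -- SUM / non-NULL count
               SUM(friend_count) * 1.0 / COUNT(*)   -- the "by hand" average
        FROM facebook_accts
    """).fetchone()
    conn.close()
    return row

print(friend_stats())
# (4, 3, 600, 200.0, 150.0)
```

Comparing COUNT(*) with COUNT(friend_count) on the real table would show directly how many rows are NULL.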

How can I return the numerical boxplot data of all results using 1 mySQL query?

[tbl_votes]
- id (unique id of the vote)
- item_id (vote belongs to item <id>)
- vote (number 1-10)
Of course we can fix this by getting:
the smallest observation (so)
the lower quartile (lq)
the median (me)
the upper quartile (uq)
and the largest observation (lo)
one by one using multiple queries, but I am wondering if it can be done with a single query.
In Oracle I can use COUNT OVER and RATIO_TO_REPORT, but this is not supported in MySQL.
For those who don't know what a boxplot is: http://en.wikipedia.org/wiki/Box_plot
Any help would be appreciated.
I've found a solution in PostgreSQL using PL/Python.
However, I leave the question open in case someone else comes up with a solution in MySQL.
CREATE TYPE boxplot_values AS (
min numeric,
q1 numeric,
median numeric,
q3 numeric,
max numeric
);
CREATE OR REPLACE FUNCTION _final_boxplot(strarr numeric[])
RETURNS boxplot_values AS
$$
x = strarr.replace("{","[").replace("}","]")
a = eval(str(x))
a.sort()
i = len(a)
return ( a[0], a[i/4], a[i/2], a[i*3/4], a[-1] )
$$
LANGUAGE 'plpythonu' IMMUTABLE;
CREATE AGGREGATE boxplot(numeric) (
SFUNC=array_append,
STYPE=numeric[],
FINALFUNC=_final_boxplot,
INITCOND='{}'
);
Example:
SELECT customer_id as cid, (boxplot(price)).*
FROM orders
GROUP BY customer_id;
cid | min | q1 | median | q3 | max
-------+---------+---------+---------+---------+---------
1001 | 7.40209 | 7.80031 | 7.9551 | 7.99059 | 7.99903
1002 | 3.44229 | 4.38172 | 4.72498 | 5.25214 | 5.98736
Source: http://www.christian-rossow.de/articles/PostgreSQL_boxplot_median_quartiles_aggregate_function.php
Here is an example that calculates the quartiles of the e256 values within each e32 group. An index on (e32, e256) is a must in this case:
SELECT
@group:=IF(e32=@group, e32, GREATEST(@index:=-1, e32)) as e32_,
MIN(e256) as so,
MAX(IF(lq_i=(@index:=@index+1), e256, NULL)) as lq,
MAX(IF(me_i=@index, e256, NULL)) as me,
MAX(IF(uq_i=@index, e256, NULL)) as uq,
MAX(e256) as lo
FROM (SELECT @index:=NULL, @group:=NULL) as init, test t
JOIN (
SELECT e32,
COUNT(*) as cnt,
(COUNT(*) div 4) as lq_i, -- lq value index within the group
(COUNT(*) div 2) as me_i, -- me value index within the group
(COUNT(*) * 3 div 4) as uq_i -- uq value index within the group
FROM test
GROUP BY e32
) as cnts
USING (e32)
GROUP BY e32;
If there is no need for grouping, the query will be slightly simpler.
P.S. test is my playground table of random values where e32 is the result of Python's int(random.expovariate(1.0) * 32), etc.
Well, I can do it in two queries.
Run the first query to get the positions of the quartiles, and then use the LIMIT clause to get the answers in the second query (the offsets 0, 5 and 8 below are hard-coded).
mysql> select (select floor(count(*)/4) from customer_data) as first_q,
(select floor(count(*)/2) from customer_data) as mid_pos,
(select floor(count(*)/4*3) from customer_data) as third_q
from customer_data order by measure limit 1;
mysql> select min(measure),
(select measure from customer_data order by measure limit 0,1) as firstq,
(select measure from customer_data order by measure limit 5,1) as median,
(select measure from customer_data order by measure limit 8,1) as last_q,
max(measure)
from customer_data;
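
The two-query idea can be sketched end to end as follows. This uses Python's sqlite3 module as a stand-in for MySQL (an assumption), the helper name five_numbers and the sample values are hypothetical, and the quartile positions are the simple count/4, count/2 and count*3/4 offsets rather than an interpolated definition.

```python
import sqlite3

def five_numbers(values):
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
    conn.execute("CREATE TABLE customer_data (measure INTEGER)")
    conn.executemany("INSERT INTO customer_data VALUES (?)", [(v,) for v in values])

    # Query 1: zero-based positions of the quartiles (integer division).
    lq_i, me_i, uq_i = conn.execute(
        "SELECT COUNT(*) / 4, COUNT(*) / 2, COUNT(*) * 3 / 4 FROM customer_data"
    ).fetchone()

    def value_at(offset):
        # Query 2 (run once per position): pick the row at that sorted offset.
        return conn.execute(
            "SELECT measure FROM customer_data ORDER BY measure LIMIT 1 OFFSET ?",
            (offset,),
        ).fetchone()[0]

    smallest, largest = conn.execute(
        "SELECT MIN(measure), MAX(measure) FROM customer_data"
    ).fetchone()
    result = (smallest, value_at(lq_i), value_at(me_i), value_at(uq_i), largest)
    conn.close()
    return result

print(five_numbers([7, 3, 12, 1, 9, 5, 11, 2, 8, 4, 10, 6]))
# (1, 4, 7, 10, 12)
```

Passing the positions as bound parameters avoids pasting the numbers from the first query into the second one by hand.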