I need to join two datasets in Contour on a column that contains NULLs.
Contour drops NULLs when performing joins, but in this case, it's important to match the NULLs across this dataset. How can I do this?
Prior to the Join, use an Expression board to replace the NULL values in the desired column with some non-NULL string value, such as "null". Repeat this for your second dataset, possibly via creating a new path in your Contour analysis with your second dataset as an input. Then the Join between these two datasets will work as desired.
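The idea can be illustrated outside Contour. The sketch below is not Contour code; it uses Python's built-in sqlite3 module and two hypothetical tables, `left_ds` and `right_ds`, to show why NULL keys fall out of an equality join and how replacing them with the string "null" on both sides brings them back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_ds  (k TEXT, a INTEGER);
    CREATE TABLE right_ds (k TEXT, b INTEGER);
    INSERT INTO left_ds  VALUES ('x', 1), (NULL, 2);
    INSERT INTO right_ds VALUES ('x', 10), (NULL, 20);
""")

# Plain equality join: NULL = NULL is not true, so the NULL-keyed rows drop out.
plain_join = conn.execute(
    "SELECT l.a, r.b FROM left_ds l JOIN right_ds r ON l.k = r.k ORDER BY l.a"
).fetchall()

# Replace NULL with the sentinel string 'null' on both sides before joining.
sentinel_join = conn.execute(
    """
    SELECT l.a, r.b
    FROM (SELECT COALESCE(k, 'null') AS k, a FROM left_ds) l
    JOIN (SELECT COALESCE(k, 'null') AS k, b FROM right_ds) r ON l.k = r.k
    ORDER BY l.a
    """
).fetchall()

print(plain_join)     # [(1, 10)]
print(sentinel_join)  # [(1, 10), (2, 20)]
```

Pick a sentinel that cannot occur as a real value in the column, otherwise genuine values and former NULLs will be matched to each other.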
Related
I'm looking to create a variable to remove unnecessary null values from my query results. I will also need that variable to create a crosstab to count the acuity levels by day. Some of the query results return a line with a numeric result and a line with a null result. Others will just return one or the other. The caveat is that I need to keep the null results if they are the only result for that patient. Please see sample data below and a sample crosstab. The crosstab is using the current acuity dimension, which includes the additional null values I don't want. The other fields on the crosstab are Tracking Date and # of Patients (see below).
=FormatDate([Start Tracking Date & Time];"MM/dd/yyyy")
=Count([Financial Number])
I have a single record which joins to N other tables, and extracts a single column from each of them. I would like to put all N of those extracted columns in a single record.
After constructing the diagram below, it seems like I can get to the second step easily, and then I should be able to use an aggregate function to filter out the NULLs. I have looked around for something like GROUP_COALESCE, but I couldn't find anything that accomplishes this.
I have a fiddle here which unfortunately works, because MySQL will let you select columns which aren't in the GROUP BY without an aggregate at your own peril http://sqlfiddle.com/#!9/304992/1/0.
Is there a way I can make sure that it always selects the column from the record, if the record exists?
The end result should be one record per group, and each column would contain the value from the only row successfully joined for that group.
If I followed you correctly, you can just use aggregate functions on the columns coming from the joined tables. Aggregate functions ignore null values, so, since you have two null values and one non-null value for each column and each group, this will return the expected output (while conforming to the ONLY_FULL_GROUP_BY option).
SELECT
group_table_id,
MAX(t1.v) t1_v,
MAX(t2.v) t2_v,
MAX(t3.v) t3_v
FROM group_table
LEFT JOIN t1 ON t1.group_id = group_table_id
LEFT JOIN t2 ON t2.group_id = group_table_id
LEFT JOIN t3 ON t3.group_id = group_table_id
GROUP BY group_table_id
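To see the "aggregates ignore NULL" behaviour in isolation, here is a small sketch using Python's sqlite3 module rather than MySQL. The `joined` table is hypothetical: it stands in for the intermediate result where each row carries a value in only one of the three columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical intermediate rows: one non-NULL value per row.
    CREATE TABLE joined (group_id INTEGER, t1_v TEXT, t2_v TEXT, t3_v TEXT);
    INSERT INTO joined VALUES
        (1, 'a',  NULL, NULL),
        (1, NULL, 'b',  NULL),
        (1, NULL, NULL, 'c'),
        (2, 'd',  NULL, NULL),
        (2, NULL, NULL, 'f');
""")

# MAX skips NULLs, so each column collapses to its single non-NULL value.
# A column with no non-NULL value in the group stays NULL (None in Python).
collapsed = conn.execute(
    """
    SELECT group_id, MAX(t1_v), MAX(t2_v), MAX(t3_v)
    FROM joined
    GROUP BY group_id
    ORDER BY group_id
    """
).fetchall()

print(collapsed)  # [(1, 'a', 'b', 'c'), (2, 'd', None, 'f')]
```

MIN would work equally well here; the choice only matters if a group can hold more than one non-NULL value per column.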
In my SSRS report, I already have a dataset A (populated by running a SQL script), and parameter P1 uses all the records in A. Now I want to get a subset of A and have another parameter, P2, refer to it.
Is it possible to get both the whole dataset and the subset at the same time while running the script only once?
I guess creating a shared dataset is one possible way, but dataset A is only for local use and shouldn't be shared.
Short answer
No, it is not possible.
Alternative
You can modify your query so that it returns one column to populate the P1 parameter and another column to populate P2. Example:
select 'Foo' P1, 'Foo' P2
union all
select 'Bar', 'Bar'
union all
select 'Foobar', null
Returns:
P1 P2
Foo Foo
Bar Bar
Foobar null
Use the P1 column to populate the P1 parameter and the P2 column to populate the P2 parameter.
Note that the subset column (P2 in my case) has fewer values than P1.
If your parameter is set to allow NULL values, it will show the NULL
option in the select list; otherwise it won't.
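As a rough illustration of the two-column idea (not of SSRS itself), the sketch below runs the same UNION ALL query through Python's sqlite3 module and derives both value lists from the single result set. The variable names `p1_values` and `p2_values` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same shape as the query above: P2 is NULL for rows outside the subset.
rows = conn.execute(
    """
    SELECT 'Foo' AS P1, 'Foo' AS P2
    UNION ALL
    SELECT 'Bar', 'Bar'
    UNION ALL
    SELECT 'Foobar', NULL
    """
).fetchall()

# Every row contributes to P1; only rows with a non-NULL P2 contribute to P2.
p1_values = sorted(p1 for p1, _ in rows)
p2_values = sorted(p2 for _, p2 in rows if p2 is not None)

print(p1_values)  # ['Bar', 'Foo', 'Foobar']
print(p2_values)  # ['Bar', 'Foo']
```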
This solution could work for you, but I am unsure whether the dataset will run only once. I think SSRS will run the query for every parameter, even if both parameters are populated from one dataset.
Let me know if this helps.
One way I've achieved this is with grouping. If Dataset A already has all the stuff you want, you can group that dataset with criterion P2 = TRUE. This splits Dataset A into two groups -- one where P2 condition is true, and the other where P2 condition is false.
For instance consider a dataset with two columns, Label and Amount. I want to subset my data where Label = "LabelNameOne". I create a group around my dataset with expression =Fields!Label.Value = "LabelNameOne", which then automatically creates a subset for me. Assuming you wanted it to filter on a user-chosen parameter at run time, you just sub in that parameter in your grouping expression: =Fields!Label.Value = Parameters!P2.Value.
Hoping I’ve not over simplified things,
I have 2 tables: tblTestA and tblTestB
Both tables are linked through their common ID fields.
I’m looking to select all records from tblTestA whose Date field is later than #2013/01/01#.
Then, from this record set, further filter by keeping only those records that have at least one non-Null value in Field1 or Field2 from tblTestB (i.e. remove double Nulls).
Is there a way to modify the following unworkable/pseudo code so that the above is achieved?
SELECT tblTestA.ID, tblTestB.Field1, tblTestB.Field2
FROM tblTestA
WHERE tblTestA.Date > #2013/01/01#
Inner Join tblTestB
On tblTestA.ID= tblTestB.ID
Where (Not IsNull(tblTestB.Field1)) Or (Not IsNull(tblTestB.Field2));
In the real scenario (due to the way the tables are structured, their size, and additional factors) querying only on the non-Null requirement takes very long. Querying only on the date greater than #2013/01/01# requirement takes very little time. So I’m thinking that if we can return the smaller date requirement result set and then use the common ID field to do the second filter on the non-Null check, then, the whole query might complete faster.
Edit:
Modifying the above to...
SELECT tblTestA.ID, tblTestB.Field1, tblTestB.Field2
FROM tblTestA
Inner Join tblTestB
On tblTestA.ID= tblTestB.ID
WHERE tblTestA.Date > #2013/01/01#
AND
(Not tblTestB.Field1 Is Not Null) Or (tblTestB.Field2 Is Not Null);
...returns records that are within the required date range but seems to also link the non-Null filter to that same date range. Entries for Field1 and Field2 may have been entered before the date range requirement filter. I've probably over simplified things from the real scenario, but I'm looking to do 2 things: 1. return records within a date range, and 2. from this result set, filter out any records that do not have at least one non-Null value in Field1 or Field2 from any date range.
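One thing worth checking in a WHERE clause of this shape is operator precedence: AND binds more tightly than OR, so without parentheses around the whole OR group the date test applies only to the first Null check. The sketch below uses Python's sqlite3 module, not Access, with made-up rows and ISO date strings instead of #...# literals, and it assumes the intended condition is "Field1 is not Null or Field2 is not Null".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblTestA (ID INTEGER, "Date" TEXT);
    CREATE TABLE tblTestB (ID INTEGER, Field1 TEXT, Field2 TEXT);
    INSERT INTO tblTestA VALUES
        (1, '2013-06-01'), (2, '2012-06-01'), (3, '2013-07-01'), (4, '2013-08-01');
    INSERT INTO tblTestB VALUES
        (1, 'x', NULL), (2, NULL, 'y'), (3, NULL, NULL), (4, NULL, 'z');
""")

BASE = """
    SELECT a.ID
    FROM tblTestA a
    INNER JOIN tblTestB b ON a.ID = b.ID
    WHERE {condition}
    ORDER BY a.ID
"""

# Without parentheses this parses as (date AND Field1) OR Field2,
# so ID 2 slips through even though its date is too early.
loose = [r[0] for r in conn.execute(BASE.format(
    condition="""a."Date" > '2013-01-01'
                 AND b.Field1 IS NOT NULL OR b.Field2 IS NOT NULL"""))]

# Parenthesising the OR group applies the date test to both Null checks.
strict = [r[0] for r in conn.execute(BASE.format(
    condition="""a."Date" > '2013-01-01'
                 AND (b.Field1 IS NOT NULL OR b.Field2 IS NOT NULL)"""))]

print(loose)   # [1, 2, 4]
print(strict)  # [1, 4]
```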
The following query returns many correct rows, but does not return a row for seed = '1985.00-Miller-13' (there are others missing too but this is just one example):
SELECT g.dam_alias "Seed"
FROM genetic g LEFT OUTER JOIN (genetic g1d)
ON (g.dam_alias = g1d.genetic_alias)
GROUP BY g1d.dam_alias , g1d.sire_alias;
However if I add a WHERE clause to the query specifying the row that I think is missing, it shows up. Here is the modified query:
SELECT g.dam_alias "Seed"
FROM genetic g LEFT OUTER JOIN (genetic g1d)
ON (g.dam_alias = g1d.genetic_alias)
WHERE g.dam_alias = '1985.00-Miller-13' -- this is the added line
GROUP BY g1d.dam_alias , g1d.sire_alias;
If my original query indeed should not have returned the row for the seed "1985.00-Miller-13", I would have expected the second query to return no rows.
At first I suspected that my keys/indexes were corrupt, so I did a db dump and rebuilt from the resulting SQL script. I have replicated the problem using MySQL v5.6 and MariaDB v10.0.17.
I have hand inspected the data and walked through the query on paper and find nothing that is inconsistent with my expected results.
Any suggestions would be greatly appreciated. I can provide any additional information/schema/data that anyone might need.
Thanks.
You're grouping on g1d.dam_alias, but selecting g.dam_alias.
Most other RDBMS products do not allow the selection of unaggregated columns from within a group, because it is ambiguous from which record within the group a value should be returned. MySQL does however permit this operation as a performance enhancement, although the documentation is clear that the results in such cases are indeterminate:
See MySQL Handling of GROUP BY (emphasis added):
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.
What's (presumably—we cannot say for certain without seeing the underlying data) happening is that g.dam_alias = '1985.00-Miller-13' exists within some groups, but different values of g.dam_alias from other records within those groups are selected instead. When you add the filter, there are no other values to select and consequently the value that is selected is guaranteed to be the one you expect.
It's difficult to make a recommendation for fixing this problem without understanding the semantics of your desired query.
You are using left outer join and the group by references the second table. These values could be NULL. Take the column from the first table:
SELECT g.dam_alias "Seed"
FROM genetic g LEFT OUTER JOIN
genetic g1d
ON g.dam_alias = g1d.genetic_alias
GROUP BY g.dam_alias, g1d.sire_alias;
---------^
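The effect of grouping on the right-hand table of a LEFT JOIN can be reproduced on a tiny data set. The sketch below uses Python's sqlite3 module rather than MySQL (SQLite also tolerates non-aggregated columns outside the GROUP BY), and the four rows in `genetic` are invented for the illustration. Every unmatched row has NULL in both g1d columns, and GROUP BY puts all of those NULLs into one group, so only one seed survives from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE genetic (genetic_alias TEXT, dam_alias TEXT, sire_alias TEXT);
    -- Only 'seed1' also exists as a genetic_alias, so only it finds a match.
    INSERT INTO genetic VALUES
        ('A', 'seed1', 's1'),
        ('B', 'seed2', 's2'),
        ('C', 'seed3', 's3'),
        ('seed1', 'gd', 'gs');
""")

JOIN = """
    SELECT g.dam_alias AS Seed
    FROM genetic g
    LEFT OUTER JOIN genetic g1d ON g.dam_alias = g1d.genetic_alias
    GROUP BY {group_cols}
"""

# Grouping on g1d columns: all unmatched rows collapse into one NULL group.
original = conn.execute(
    JOIN.format(group_cols="g1d.dam_alias, g1d.sire_alias")
).fetchall()

# Grouping on the column that is actually selected keeps every seed.
fixed = conn.execute(
    JOIN.format(group_cols="g.dam_alias, g1d.sire_alias")
).fetchall()

print(len(original))                       # 2
print(sorted(seed for (seed,) in fixed))   # ['gd', 'seed1', 'seed2', 'seed3']
```

Which seed is reported for the NULL group in the first query is up to the engine, which is why the example only counts the rows instead of naming them.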