How to check for duplicates with an added condition

I'm working with a table of addresses in Power BI. The table also has a column marking some condition; it could be anything, so I'll just label it "Condition".
I'm trying to create a column (or measure) that flags duplicate addresses. The problem is that both (or all) duplicates need to meet that other condition; rows that don't should be ignored from the start. I thought these nested IF statements would work:
Duplicate =
IF(
CALCULATE(COUNTROWS(Table),
FILTER(Table,Table[Condition]="Yes")),
IF(
CALCULATE(COUNTROWS(Table),
FILTER(Table,Table[Address]=EARLIER(Table[Address])))>1,
"Duplicate",BLANK()
)
)
But duplicate pairs where only one row meets the condition are still marked. What am I doing wrong?
I need all rows elsewhere so I can't filter the query. Also, I know I could add the condition to the concatenation, but that seems sloppy and I assume there's a more "correct" way to do it.

I don't understand how your outer IF function is supposed to work since the first argument is an integer rather than True/False.
Try this instead:
Duplicate =
IF (
    COUNTROWS (
        FILTER (
            Table,
            Table[Condition] = "Yes" &&
            Table[Address] = EARLIER ( Table[Address] )
        )
    ) > 1,
    "Duplicate",
    BLANK ()
)
Edit: As you pointed out, this didn't work exactly as intended. Try one of the following instead:
Duplicate =
IF (
    COUNTROWS (
        FILTER (
            Table,
            EARLIER ( Table[Condition] ) = "Yes" &&
            Table[Condition] = "Yes" &&
            Table[Address] = EARLIER ( Table[Address] )
        )
    ) > 1,
    "Duplicate",
    BLANK ()
)
or
Duplicate =
IF (
    Table[Condition] = "Yes" &&
    COUNTROWS (
        FILTER (
            Table,
            Table[Condition] = "Yes" &&
            Table[Address] = EARLIER ( Table[Address] )
        )
    ) > 1,
    "Duplicate",
    BLANK ()
)
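To sanity-check the rule outside Power BI, here is a minimal Python sketch of what the corrected column is meant to compute. The sample rows and the `mark_duplicates` helper are hypothetical and only illustrate the logic: a row is flagged only if it meets the condition itself and shares its address with at least one other row that also meets it.

```python
from collections import Counter

def mark_duplicates(rows):
    """Return "Duplicate" or None per row, mirroring the corrected DAX column."""
    # Count addresses among qualifying rows only (the FILTER part).
    qualifying = Counter(r["Address"] for r in rows if r["Condition"] == "Yes")
    return [
        # The current row must qualify too (the EARLIER(Table[Condition]) part).
        "Duplicate" if r["Condition"] == "Yes" and qualifying[r["Address"]] > 1 else None
        for r in rows
    ]

# Hypothetical sample data
rows = [
    {"Address": "1 Main St", "Condition": "Yes"},
    {"Address": "1 Main St", "Condition": "Yes"},
    {"Address": "2 Oak Ave", "Condition": "Yes"},
    {"Address": "2 Oak Ave", "Condition": "No"},
    {"Address": "3 Elm Rd", "Condition": "No"},
]
flags = mark_duplicates(rows)
```

The "2 Oak Ave" pair is the case from the question: only one of the two rows meets the condition, so neither row is flagged.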

SQL: always return true from `IN` condition

How can I make an IN condition always return true, just like WHERE 1? I tried NULL, but it didn't work:
WHERE X IN (NULL)
Is there a way to make IN always return true and accept all rows?
I want something like the following to return true:
where X in ("Any value here to always return true")
The best you could do is to include the column being compared:
where x in (x)
However, this does not include NULL values. In fact, there is no way you can make this return true:
where NULL in ( . . . )
You could revise this to:
where coalesce(x, '') in (coalesce(x, ''))
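The NULL behaviour is easy to confirm with a small experiment. The sketch below uses Python's built-in sqlite3 module and a hypothetical three-row table `t`; other engines follow the same three-valued logic for IN, but that is an assumption worth checking on your own database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x TEXT)")
conn.executemany("INSERT INTO t (x) VALUES (?)", [("a",), ("b",), (None,)])

# x IN (x) is true for every non-NULL x, but NULL IN (NULL) evaluates to
# NULL rather than true, so the NULL row is filtered out.
plain = conn.execute("SELECT COUNT(*) FROM t WHERE x IN (x)").fetchone()[0]

# Mapping NULL to a sentinel value on both sides keeps the NULL row as well.
coalesced = conn.execute(
    "SELECT COUNT(*) FROM t WHERE coalesce(x, '') IN (coalesce(x, ''))"
).fetchone()[0]
conn.close()
```

Here `plain` counts only the two non-NULL rows, while `coalesced` counts all three.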
One way is to use a LEFT JOIN:
select t.*
from table t left join
( . . . ) tt
on tt.x = t.x;

TYPO3 9.5 repository query to fetch elements with multiple sys_categories

I'm trying to figure out how to write the following query to fetch some elements which have multiple categories.
$query->matching(
$query->logicalAnd(
[
// the following 4 lines are the problem lines
$query->logicalAnd(
$query->in('categories.uid', $categories),
$query->in('categories.uid', $countryCategories)
),
// $query->in('categories.uid', $categories),
// $query->in('categories.uid', $countryCategories),
$query->logicalOr(
[
$query->equals('is_pinned', 0),
$query->lessThan('pinned_until', time())
]
),
]
)
);
The idea is to fetch the elements where categories.uid match at least one uid in $categories and at least one in $countryCategories. Both $categories and $countryCategories are arrays filled with category uids.
The query worked fine until the second $query->in('categories.uid' [...] line was inserted. As soon as it is added, the query result is empty. It's probably an error in the query, but neither my colleague nor I could find a working solution.
While searching I found SQL's UNION, which I've never worked with before, but I guessed it would be the way to go if I had to write the statement by hand instead of building the query.
What I would like to know is whether it is possible to fetch the elements with the "query builder" or whether it is really necessary to write a statement. If there is a solution with the query builder, could you point it out to me? If not, how would I build the query with UNION to fetch the elements as required?
If something is unclear, please do not hesitate to ask, I will try to specify further. Thanks.
EDIT
We've debugged the query too, and I executed it directly in phpMyAdmin. It worked without "AND (sys_category.uid IN ( 41, 2 ))", but with it the result is empty. The following is the debugged query:
SELECT `tx_gijakobnews_domain_model_news`.*
FROM `tx_gijakobnews_domain_model_news` `tx_gijakobnews_domain_model_news`
LEFT JOIN `sys_category_record_mm` `sys_category_record_mm` ON ( `tx_gijakobnews_domain_model_news`.`uid` = `sys_category_record_mm`.`uid_foreign`) AND (( `sys_category_record_mm`.`tablenames` = 'tx_gijakobnews_domain_model_news') AND ( `sys_category_record_mm`.`fieldname` = 'categories'))
LEFT JOIN `sys_category` `sys_category` ON `sys_category_record_mm`.`uid_local` = `sys_category`.`uid`
WHERE ((
(`sys_category`.`uid` IN ( 15, 17, 10, 11, 12, 16, 13, 14 ))
-- the following line is where the problem begins
AND (`sys_category`.`uid` IN ( 41, 2 ))
)
-- the following lines are additional restrictions
-- which have no influence on the problem
AND ((`tx_gijakobnews_domain_model_news`.`is_pinned` = 0) OR ( `tx_gijakobnews_domain_model_news`.`pinned_until` < 1560867383))
)
AND ( `tx_gijakobnews_domain_model_news`.`sys_language_uid` IN ( 0, -1) )
AND ( `tx_gijakobnews_domain_model_news`.`pid` = 31)
AND ( ( `tx_gijakobnews_domain_model_news`.`deleted` = 0)
AND ( `tx_gijakobnews_domain_model_news`.`t3ver_state` <= 0)
AND ( `tx_gijakobnews_domain_model_news`.`pid` <> -1)
AND ( `tx_gijakobnews_domain_model_news`.`hidden` = 0)
AND ( `tx_gijakobnews_domain_model_news`.`starttime` <= 1560867360)
AND ( ( `tx_gijakobnews_domain_model_news`.`endtime` = 0)
OR ( `tx_gijakobnews_domain_model_news`.`endtime` > 1560867360) ) )
AND ( ( ( `sys_category`.`deleted` = 0)
AND ( `sys_category`.`t3ver_state` <= 0)
AND ( `sys_category`.`pid` <> -1)
AND ( `sys_category`.`hidden` = 0)
AND ( `sys_category`.`starttime` <= 1560867360)
AND ( ( `sys_category`.`endtime` = 0)
OR ( `sys_category`.`endtime` > 1560867360) ) )
OR ( `sys_category`.`uid`
IS NULL) )
ORDER BY `tx_gijakobnews_domain_model_news`.`publish_date` DESC
If there's a missing bracket, I probably removed it accidentally while formatting...
I believe the problem is that the WHERE clause is applied on a per-row basis.
That means if you have a query like the following (based on your query):
SELECT *
FROM news
LEFT JOIN sys_category_record_mm mm
ON (news.uid = mm.uid_foreign) /* AND (...) */
LEFT JOIN sys_category
ON mm.uid_local = sys_category.uid
WHERE
sys_category.uid IN (1,2,3)
AND sys_category.uid IN (4,5,6)
You might have one news entry that is in category 1 and in category 4, but the joined result would contain two distinct rows:
news.uid | sys_category.uid
1 | 1
1 | 4
and the WHERE clause filters both of them out, because no single row has a sys_category.uid that is in both (1, 2, 3) and (4, 5, 6).
The way to do that at the SQL level would probably be to join sys_category twice. But I do not believe that is possible with the (rather simple) Extbase query builder.
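The per-row behaviour and the two-join idea can be tried out in isolation. The following is a rough sketch using Python's sqlite3 module with a simplified, hypothetical schema: only `news` and an `mm` relation table, where `uid_local` is taken to be the category uid, and without the TYPO3 enable-field restrictions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE news (uid INTEGER PRIMARY KEY);
CREATE TABLE mm (uid_local INTEGER, uid_foreign INTEGER);
INSERT INTO news (uid) VALUES (1), (2), (3);
INSERT INTO mm (uid_local, uid_foreign) VALUES
    (1, 1), (4, 1),   -- news 1: one category from each set
    (2, 2), (3, 2),   -- news 2: first set only
    (5, 3);           -- news 3: second set only
""")

# Single join: both IN tests are applied to the same row, so nothing matches.
single = conn.execute("""
    SELECT DISTINCT news.uid FROM news
    JOIN mm ON mm.uid_foreign = news.uid
    WHERE mm.uid_local IN (1, 2, 3) AND mm.uid_local IN (4, 5, 6)
""").fetchall()

# Two joins: each IN test gets its own relation row for the same news entry.
double = conn.execute("""
    SELECT DISTINCT news.uid FROM news
    JOIN mm AS a ON a.uid_foreign = news.uid
    JOIN mm AS b ON b.uid_foreign = news.uid
    WHERE a.uid_local IN (1, 2, 3) AND b.uid_local IN (4, 5, 6)
    ORDER BY news.uid
""").fetchall()
conn.close()
```

With this data the single-join query returns no rows, while the double-join query returns only news entry 1, the one that has a category in each set.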
Edit:
As a solution, you could use the $query->statement() method, which lets you run custom SQL queries.
$result = $query->statement('SELECT news.* FROM news');
https://docs.typo3.org/m/typo3/book-extbasefluid/master/en-us/6-Persistence/3-implement-individual-database-queries.html
You could build your own custom Query with the QueryBuilder. Something like this:
use TYPO3\CMS\Core\Database\ConnectionPool;
use TYPO3\CMS\Core\Utility\GeneralUtility;
use TYPO3\CMS\Extbase\Utility\DebuggerUtility;
$queryBuilder = GeneralUtility::makeInstance(ConnectionPool::class)
    ->getQueryBuilderForTable('table_to_select_from');
$result = $queryBuilder->select('*')
    ->from('table_to_select_from')
    ->where($queryBuilder->expr()->in('field', ['1', '2', '3']))
    ->execute()
    ->fetchAll();
DebuggerUtility::var_dump($result);
Here's the documentation:
https://docs.typo3.org/m/typo3/reference-coreapi/master/en-us/ApiOverview/Database/QueryBuilder/Index.html
I did it way simpler in the end.
Instead of adding both restrictions by the query, I looped through the results restricted by the first sys_category-condition and then removed those which didn't meet the second sys_category-restrictions.
Repository
$query->matching(
$query->logicalAnd([
$query->in('categories.uid', $categories),
$query->logicalOr(
[
$query->equals('is_pinned', 0),
$query->lessThan('pinned_until', time())
]
),
])
);
Controller
public function getRestrictedNews($news, $countryCategories) {
    $newNews = array();
    foreach ($news as $newsItem) {
        $newsCategories = $newsItem->getCategories();
        $shouldKeep = false;
        foreach ($newsCategories as $categoryItem) {
            if (in_array($categoryItem->getUid(), $countryCategories)) {
                $shouldKeep = true;
            }
        }
        if ($shouldKeep) {
            array_push($newNews, $newsItem);
        }
    }
    return $newNews;
}
It may not be the best solution, but it's one that works. :-)

How to convert calculated formula from salesforce to SQL

We have an Opportunity table in Salesforce, and this table has a calculated column called "Is_XYZ".
The calculated formula for the "Is_XYZ" column is:
calculatedFormula: IF(
AND(
OR(
AND(
UPPER(LeadSource__c) == 'XYZ',
DateValue(CreatedDate) > Date(2017,01,01)
),
AND( Is_PQR ,
IF(
Effective_Date__c <> null,
DateValue(CreatedDate) > Effective_Date__c ,
TRUE
),
IF(
Effective_Date__c <> null,
Effective_Date__c <= TODAY(),
TRUE
)
)
),
UPPER(MailingState) <> 'NY',
UPPER(Lead_Sub_Source__c) <> 'PQRS'
),
TRUE,
FALSE
)
We have created the same Opportunity table in Hive and want to write a SELECT query that calculates the "Is_XYZ" value. We converted the formula from Salesforce syntax to SQL syntax.
So the formula in SQL is:
SELECT
IF(
(
(
( UPPER(LeadSource__c) == 'XYZ' AND
CreatedDate > '2017-01-01'
)
OR
( Is_PQR AND
IF( Effective_Date__c IS NOT NULL,
CreatedDate > Effective_Date__c,
TRUE
)
AND
IF( Effective_Date__c IS NOT NULL,
Effective_Date__c <= current_date,
TRUE
)
)
)
AND (UPPER(MailingState) <> 'NY')
AND (UPPER(Lead_Sub_Source__c) <> 'PQRS')
),
TRUE,
FALSE
) as Is_XYZ
FROM Opportunity;
Can you help me confirm that the two formulas (Salesforce and SQL) are equivalent, that is, that they do the same thing?
I tested on both sides (Salesforce and Hive SQL) and they behave differently for one case. The values for that case are:
LeadSource__c = abcdef
Lead_Sub_Source__c = klmnop
CreatedDate = 2019-04-02T00:06:49.000Z
MailingState = HI
Is_PQR = true
Effective_Date__c = 2019-04-09
For the above values, Salesforce displays Is_XYZ = true and Hive displays Is_XYZ = false. Please help me identify the issue.
I can tell that the date/time arithmetic is not correct, because of the time components on the values. I don't know whether this is the issue with your particular failing example.
For instance:
DateValue(CreatedDate) > Date(2017,01,01)
is not equivalent to:
CreatedDate > '2017-01-01'
The equivalent would be:
CreatedDate >= '2017-01-02'
The issue is DateValue(), which removes the time component.
Similarly,
DateValue(CreatedDate) > Effective_Date__c ,
requires a modification.
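The difference is easiest to see with a timestamp that falls on the cutoff day itself. The sketch below uses plain Python datetimes with a hypothetical `created` value; it only illustrates the comparison arithmetic and does not reproduce Salesforce or Hive type-coercion rules.

```python
from datetime import date, datetime

created = datetime(2017, 1, 1, 9, 30)  # hypothetical timestamp on the cutoff day
cutoff = date(2017, 1, 1)

# DateValue-style: strip the time first, then compare dates.
date_only = created.date() > cutoff

# Naive translation: compare the full timestamp against midnight of the cutoff.
naive = created > datetime(2017, 1, 1)

# Equivalent timestamp comparison: at or after midnight of the following day.
shifted = created >= datetime(2017, 1, 2)
```

For this value the date-only comparison and the shifted comparison agree (both false), while the naive translation is true. The same reasoning applies when the right-hand side is Effective_Date__c instead of a fixed date.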
In the end, we found that there is a bug in the Salesforce formula.
The Salesforce formula checks "Effective_Date__c <> null", which always returns false. In Salesforce, the IsBlank() function is preferred over IsNull().
So we changed the calculated formula in Salesforce to correct this; the new formula uses IsBlank(Effective_Date__c) for that check.
I am not a Salesforce developer, so I was not able to spot this bug earlier. After discussing it with a Salesforce developer, we found the bug, which had been present in the system for the last year.
I found no way in Salesforce Workbench to test or debug this calculated field formula, which is very frustrating.

MS Access SQL - how do I conditionally set criteria to all?

I'm setting up a report that will show me personnel on each shift. I have a form with a combo box where I can select a specific shift or, by leaving it blank, show all shifts. Choosing a value from the combo box mirrors that text to a hidden text box, which is then passed to the query. Getting the report to filter by shift is the easy part. What's kicking me in the teeth right now is how to set it up so that an empty Shift Filter box shows all records, as it would if the WHERE clause were blank.
Here's my SQL code:
SELECT DISTINCTROW tblPersonnel.EmpID
,tblRank.Rank
,tblPersonnel.NameStr
,tblPersonnel.Shop
,qryShiftRosterSub.Narrative
,qryShiftRosterSubShift.CurrentShift
,qryShiftRosterSubShift.ShopName
,tblRank.ID
FROM (
(
tblPersonnel LEFT JOIN qryShiftRosterSubShift ON tblPersonnel.EmpID = qryShiftRosterSubShift.EmpID
) LEFT JOIN tblRank ON tblPersonnel.Rank = tblRank.ID
)
LEFT JOIN qryShiftRosterSub ON tblPersonnel.EmpID = qryShiftRosterSub.EmpID
WHERE (
((qryShiftRosterSubShift.CurrentShift) = IIf(Len([Forms] ! [frmNavMain] ! [NavigationSubform] ! [ShiftFilter]) = 0, 'Is Not Null', [Forms] ! [frmNavMain] ! [NavigationSubform] ! [ShiftFilter]))
AND ((tblPersonnel.DeleteFlag) = False)
);
I've got a few queries that are chained together and this is the last one before the completed dataset is sent to the report. Like I said, I can get it to show me just a specific shift easily, and by clearing the criteria from CurrentShift I can get it to show all records, but how do I get it to swap between the two based on what's in my filter box?
You can just add an OR clause that checks whether the combo box is empty. Note that you need to account for both "" (empty strings) and Null values. I prefer to check using Nz(MyComboBox) = ""
Implementation:
SELECT DISTINCTROW tblPersonnel.EmpID
,tblRank.Rank
,tblPersonnel.NameStr
,tblPersonnel.Shop
,qryShiftRosterSub.Narrative
,qryShiftRosterSubShift.CurrentShift
,qryShiftRosterSubShift.ShopName
,tblRank.ID
FROM (
(
tblPersonnel LEFT JOIN qryShiftRosterSubShift ON tblPersonnel.EmpID = qryShiftRosterSubShift.EmpID
) LEFT JOIN tblRank ON tblPersonnel.Rank = tblRank.ID
)
LEFT JOIN qryShiftRosterSub ON tblPersonnel.EmpID = qryShiftRosterSub.EmpID
WHERE (
((qryShiftRosterSubShift.CurrentShift) = IIf(Len([Forms] ! [frmNavMain] ! [NavigationSubform] ! [ShiftFilter]) = 0, 'Is Not Null', [Forms] ! [frmNavMain] ! [NavigationSubform] ! [ShiftFilter])
OR Nz([Forms] ! [frmNavMain] ! [NavigationSubform] ! [ShiftFilter]) = "")
AND ((tblPersonnel.DeleteFlag) = False)
);
You can try a condition like this:
qryShiftRosterSubShift.CurrentShift = [Forms]![frmNavMain]![NavigationSubform]![ShiftFilter]
OR Len(Nz([Forms]![frmNavMain]![NavigationSubform]![ShiftFilter],""))= 0
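The "match the filter OR the filter is empty" pattern is not specific to Access. Here is a minimal sketch of the same idea using Python's sqlite3 module, with a hypothetical `roster` table and `coalesce(:f, '') = ''` standing in for the `Nz(...) = ""` check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roster (emp TEXT, shift TEXT)")
conn.executemany(
    "INSERT INTO roster (emp, shift) VALUES (?, ?)",
    [("Ann", "Day"), ("Bob", "Night"), ("Cy", "Day")],
)

def by_shift(shift_filter):
    """Return employee names, filtered by shift unless the filter is empty or None."""
    # The second branch is true for every row when the filter is '' or NULL,
    # which effectively switches the shift criterion off.
    sql = """
        SELECT emp FROM roster
        WHERE shift = :f OR coalesce(:f, '') = ''
        ORDER BY emp
    """
    return [row[0] for row in conn.execute(sql, {"f": shift_filter})]

day = by_shift("Day")
everyone_blank = by_shift("")
everyone_null = by_shift(None)
```

A concrete filter value returns only that shift, while both an empty string and a missing value return every row.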
Why don't you append the WHERE part of the SQL only if your combo box is not empty?

How can I speed up this linq to sql function?

I have a function (called "powersearch", the irony!) that searches for a set of strings across a bunch (~5) of fields.
The words come in as one string and are separated by spaces.
Some fields can have exact matches, others should have "contains".
(Snipped for brevity)
//Start with all colors
IQueryable<Color> q = db.Colors;
//Filter by powersearch
if (!string.IsNullOrEmpty(searchBag.PowerSearchKeys)){
foreach (string key in searchBag.SplitSearchKeys(searchBag.PowerSearchKeys)
.Where(k=> !string.IsNullOrEmpty(k))){
//Make a local copy of the var, otherwise it gets overwritten
string myKey = key;
int year;
if (int.TryParse(myKey, out year) && year > 999){
q = q.Where(c => c.Company.Name.Contains(myKey)
|| c.StockCode.Contains(myKey)
|| c.PaintCodes.Any(p => p.Code.Equals(myKey))
|| c.Names.Any(n => n.Label.Contains(myKey))
|| c.Company.CompanyModels.Any(m => m.Model.Name.Contains(myKey))
|| c.UseYears.Any(y => y.Year.Equals(year))
);
}
else{
q = q.Where(c => c.Company.Name.Contains(myKey)
|| c.StockCode.Contains(myKey)
|| c.PaintCodes.Any(p => p.Code.Contains(myKey))
|| c.Names.Any(n => n.Label.Contains(myKey))
|| c.Company.CompanyModels.Any(m => m.Model.Name.Equals(myKey))
);
}
}
}
Because the UseYears count is rather large, I try to check it as little as possible by ruling out all numbers that can never make sense as a year. Similar checks are not possible on the other fields, since they can contain pretty much any conceivable string.
Currently this query takes about 15 seconds for a single, non-year string. That's too much.
Anything I can do to improve this?
--Edit--
Profiler shows me the following info for the part where the string is not a year:
exec sp_reset_connection
Audit login
exec sp_executesql N'
SELECT COUNT(*) AS [value]
FROM [dbo].[CLR] AS [t0]
INNER JOIN [dbo].[CO] AS [t1] ON [t1].[CO_ID] = [t0].[CO_ID]
WHERE
([t1].[LONG_NM] LIKE #p0)
OR ([t0].[EUR_STK_CD] LIKE #p1)
OR (EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[PAINT_CD] AS [t2]
WHERE ([t2].[PAINT_CD] LIKE #p2)
AND ([t2].[CLR_ID] = [t0].[CLR_ID])
AND ([t2].[CUSTOM_ID] = [t0].[CUSTOM_ID])
)
)OR (EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[CLR_NM] AS [t3]
WHERE ([t3].[CLR_NM] LIKE #p3)
AND ([t3].[CLR_ID] = [t0].[CLR_ID])
AND ([t3].[CUSTOM_ID] = [t0].[CUSTOM_ID])
)
) OR (EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[CO_MODL] AS [t4]
INNER JOIN [dbo].[MODL] AS [t5] ON [t5].[MODL_ID] = [t4].[MODL_ID]
WHERE ([t5].[MODL_NM] = #p4)
AND ([t4].[CO_ID] = [t1].[CO_ID])
)
)
',N'#p0 varchar(10),#p1 varchar(10),#p2 varchar(10),#p3 varchar(10),#p4 varchar(8)',#p0='%mercedes%',#p1='%mercedes%',#p2='%mercedes%',#p3='%mercedes%',#p4='mercedes'
(took 3626 msecs)
Audit Logout (3673 msecs)
exec sp_reset_connection (0msecs)
Audit login
exec sp_executesql N'
SELECT TOP (30)
[t0].[CLR_ID] AS [Id],
[t0].[CUSTOM_ID] AS [CustomId],
[t0].[CO_ID] AS [CompanyId],
[t0].[EUR_STK_CD] AS [StockCode],
[t0].[SPCL_USE_CD] AS [UseCode],
[t0].[EFF_IND] AS [EffectIndicator]
FROM [dbo].[CLR] AS [t0]
INNER JOIN [dbo].[CO] AS [t1] ON [t1].[CO_ID] = [t0].[CO_ID]
WHERE
([t1].[LONG_NM] LIKE #p0)
OR ([t0].[EUR_STK_CD] LIKE #p1)
OR (EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[PAINT_CD] AS [t2]
WHERE ([t2].[PAINT_CD] LIKE #p2)
AND ([t2].[CLR_ID] = [t0].[CLR_ID])
AND ([t2].[CUSTOM_ID] = [t0].[CUSTOM_ID])
)
)
OR (EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[CLR_NM] AS [t3]
WHERE ([t3].[CLR_NM] LIKE #p3)
AND ([t3].[CLR_ID] = [t0].[CLR_ID])
AND ([t3].[CUSTOM_ID] = [t0].[CUSTOM_ID])
)
)
OR (EXISTS(
SELECT NULL AS [EMPTY]
FROM [dbo].[CO_MODL] AS [t4]
INNER JOIN [dbo].[MODL] AS [t5] ON [t5].[MODL_ID] = [t4].[MODL_ID]
WHERE ([t5].[MODL_NM] = #p4)
AND ([t4].[CO_ID] = [t1].[CO_ID])
)
)'
,N'#p0 varchar(10),#p1 varchar(10),#p2 varchar(10),#p3 varchar(10),#p4 varchar(8)',#p0='%mercedes%',#p1='%mercedes%',#p2='%mercedes%',#p3='%mercedes%',#p4='mercedes'
(took 3368 msecs)
The database structure, sadly, is not under my control. It comes from the US and has to stay in the exact same format for compatibility reasons. Although most of the important fields are indeed indexed, they are indexed in (unnecessary) clustered primary keys. There's very little I can do about that.
Okay, let's break this down - the test case you're interested in first is a single non-year, so all we've got is this:
q = q.Where(c => c.Company.Name.Contains(myKey)
|| c.StockCode.Contains(myKey)
|| c.PaintCodes.Any(p => p.Code.Contains(myKey))
|| c.Names.Any(n => n.Label.Contains(myKey))
|| c.Company.CompanyModels.Any(m => m.Model.Name.Equals(myKey))
);
Am I right? If so, what does the SQL look like? How long does it take just to execute the SQL statement in SQL Profiler? What does the profiler say the execution plan looks like? Have you got indexes on all of the appropriate columns?
Use compiled queries.
If you don't, you can lose a factor of 5-10 in performance, because LINQ to SQL has to generate the SQL from the query expression every time you call it.
Things get worse when you use non-constants in LINQ to SQL, as getting their values is really slow.
This assumes that you already have indexes and a sane DB schema.
BTW, I am not kidding about the 5-10x part.