How can I recast a CNF expression to 3-CNF? - cnf

I have a CNF expression like this and I want recast it to 3-CNF:
(a+b+c+d)(~a)(~b+d)(a+b+~d)
Does anyone know how I can do that?

A 3-CNF is a Conjunctive Normal Form where all clauses have three or less literals. To obtain such a form for your expression, you could translate the expression into a nested Boolean expression where all operators (and, or) have two operands:
t1a = (a+b)
t1b = (c+d)
t1 = t1a + t1b
t2 = (~a)
t3 = (~b+d)
t4a = (a+b)
t4 = t4a + ~d
t5a = t1 t2
t5b = t3 t4
t5 = t5a t5b
You can directly translate this nested expression into a set of 3-CNF clauses.
Minimized solution:
(~a)(~b + d)(b + ~d)(c + d)
as suggested by WolframAlpha is a 3-CNF of your expression as it does not contain longer clauses.
For small cases, you could fill a truth table. Look at all rows with output 0 and find minterms covering all these rows. If you then invert all literals in the minterms, you have a CNF. In case clauses have more than 3 literals, they can be broken up into two or more shorter clauses by introducing intermediate variables. The widely used procedure for that is called Tseitin Transformation or Tseitin Encoding.

Related

Optimize derived table in select

I have sql query:
SELECT tsc.Id
FROM TEST.Services tsc,
(
select * from DICT.Change sp
) spc
where tsc.serviceId = spc.service_id
and tsc.PlanId = if(spc.plan_id = -1, tsc.PlanId, spc.plan_id)
and tsc.startDate > GREATEST(spc.StartTime, spc.startDate)
group by tsc.Id;
This query is very, very slow.
Explain:
Can this be optimized? How to rewrite this subquery for another?
What is the point of this query? Why the CROSS JOIN operation? Why do we need to return multiple copies of id column from Services table? And what are we doing with the millions of rows being returned?
Absent a specification, an actual set of requirements for the resultset, we're just guessing at it.
To answer your questions:
Yes, the query could be "optimized" by rewriting it to the resultset that is actually required, and do it much more efficiently than the monstrously hideous SQL in the question.
Some suggestions: ditch the old-school comma syntax for the join operation, and use the JOIN keyword instead.
With no join predicates, it's a "cross" join. Every row matched from one side matched to every row from the right side.) I recommend including the CROSS keyword as an indication to future readers that the absence of an ON clause (or, join predicates in the WHERE clause) is intentional, and not an oversight.
I'd also avoid an inline view, unless there is a specific reason for one.
UPDATE
The query in the question is updated to include some predicates. Based on the updated query, I would write it like this:
SELECT tsc.id
FROM TEST.Services tsc
JOIN DICT.Change spc
ON tsc.serviceid = spc.service_id
AND tsc.startdate > spc.starttime
AND tsc.startdate > spc.starttdate
AND ( tsc.planid = spc.plan_id
OR ( tsc.planid IS NOT NULL AND spc.plan_id = -1 )
)
Ensure that the query is making use of suitable index by looking at the output of EXPLAIN to see the execution plan, in particular, which indexes are being used.
Some notes:
If there are multiple rows from spc that "match" a row from tsc, the query will return duplicate values of tsc.id. (It's not clear why or if we need to return duplicate values. IF we need to count the number of copies of each tsc,id, we could do that in the query, returning distinct values of tsc.id along with a count. If we don't need duplicates, we could return just a distinct list.
GREATEST function will return NULL if any of the arguments are null. If the condition we need is "a > GREATEST(b,c)", we can specify "a > b AND a > c".
Also, this condition:
tsc.PlanId = if(spc.plan_id = -1, tsc.PlanId, spc.plan_id)
can be re-written to return an equivalent result (I'm suspicious about the actual specification, and whether this original condition actually satisfies that adequately. Without example data and sample of expected output, we have to rely on the SQL as the specification, so we honor that in the rewrite.)
If we don't need to return duplicate values of tsc.id, assuming id is unique in TEST.Services, we could also write
SELECT tsc.id
FROM TEST.Services tsc
WHERE EXISTS
( SELECT 1
FROM DICT.Change spc
ON spc.service_id = tsc.serviceid
AND spc.starttime < tsc.startdate
AND spc.starttdate < tsc.startdate
AND ( ( spc.plan_id = tsc.planid )
OR ( spc.plan_id = -1 AND tsc.planid IS NOT NULL )
)
)

Linq- A query body must end with a select clause or a group clause in C#

I am getting error when cross table operation performed as like this.
var rows = from a in db_data.AsEnumerable()
join b in cust_data.AsEnumerable()
on a["SERVICE_ZIP"].ToString().Trim().ToLower() equals b["Zip"].ToString().Trim().ToLower()
where
a["SSS"].ToString().Trim().ToLower() == b["SSS"].ToString().Trim().ToLower() &&
a["ttt"].ToString().Trim().ToLower() == b["ttt"].ToString().Trim().ToLower()
into g
where g.Count() > 0
select a;
DataTable merged;
if (rows.Any())
merged = rows.CopyToDataTable();
else
merged = cust_data.Clone();
Using the into clause allows LINQ comprehension expressions to be chained together, this is very powerful. But each must be a complete comprehension expression.
There is no select or group clause before the into in your code.

Inner join on a partial column in Access

I'm attempting to join two tables in MS Access 2010 where the join condition is part of(A.col1) = B.col2. I've not been able to figure out how to do this so far.
My two tables have these critical columns:
Table 1 has column ICD9 Code-Description* with values like:842.00 - Sprain/strain, wrist924.11 - Contusion, knee
Table 2 has column Dx with values like:842924.11
I have tried these two join criteria:
FROM Table1
INNER JOIN Table2 ON Table1.Replace(LTrim(Replace(Left(ICD9Code-Description],
(InStr(1,[ICD9Code-Description]," "))-1),"0"," "))," ","0")
= Table2.Dx
and
SELECT ICD9
FROM Table2 INNER JOIN
(SELECT Replace(LTrim(Replace(Left([ICD9 Code-Description],
(InStr(1,[ICD9 Code-Description]," "))-1),"0"," "))," ","0") AS ICD9
FROM Table1)
ON Diag.DX = ICD9
Neither of which Access likes.
I'd like to avoid pulling out the join criteria portion into its own column in Table1 if at all possible.
What would be the Access way of doing this?
*Don't hate me for the column name. I didn't create it, I just have to support it.
The Val() function "Returns the numbers contained in a string as a numeric value of appropriate type." (See the Val Function help topic in Access' built-in help system.)
The "neat thing" for your situation is it reads characters from the string until it encounters a character which can't be part of a valid number. But then it doesn't throw an error. It just keeps the numeric characters it's already collected and ignores the rest.
Here's two of your examples in the Immediate window ...
? Val("842.00 - Sprain/strain, wrist")
842
? Val("924.11 - Contusion, knee")
924.11
So Val() should make your JOIN much simpler, even though you would need to also apply it on the Table2.Dx strings ...
FROM
Table1 INNER JOIN Table2
ON Val(Table1.[ICD9Code-Description]) = Val(Table2.Dx)

Using the right MYSQL JOIN

I'm trying to get all the data from the match table, along with the currently signed up gamers of each type, experienced or not.
Gamers
(PK)Gamer_Id
Gamer_firstName,
Gamer_lastName,
Gamer experience(Y/N)
Gamer_matches
(PK)FK GamerId,
(PK)FK MatchId,
Gamer_score
Match
(PK)Match_Id,
ExperiencedGamers_needed,
InExperiencedGamers_needed
I've tried this query along with many others but it doesn't work, is it a bad join?
SELECT M.MatchId,M.ExperiencedGamers_needed,M.InExperiencedGamers_needed,
(SELECT COUNT(GM.GamerId)
FROM Gamers G, Gamers_matches GM
WHERE G.GamerId = GM.GamerId
AND G.experience = "Y"
AND GM.MatchId = M.MatchId
GROUP BY GM.MatchId)AS ExpertsSignedUp,
(SELECT COUNT(GM.GamerId)
FROM Gamers G, Gamers_matches GM
WHERE G.GamerId = GM.GamerId
AND G.experience = "N"
AND GM.MatchId = M.MatchId
GROUP BY GM.MatchId) AS NovicesSignedUp
FROM MATCHES M
What you've written is called a correlated subquery which forces SQL to re-execute the subquery for each row fetched from Matches. It can be made to work, but it's pretty inefficient. In some complex queries it may be necessary, but not in this case.
I would solve this query this way:
SELECT M.MatchId, M.ExperiencedGamers_needed,M.InExperiencedGamers_needed,
SUM(G.experience = 'Y') AS ExpertsSignedUp,
SUM(G.experience = 'N') AS NovicesSignedUp
FROM MATCHES M
LEFT OUTER JOIN (Gamer_matches GM
INNER JOIN Gamers G ON G.GamerId = GM.GamerId)
ON M.MatchId = GM.MatchId
GROUP BY M.MatchId;
Here it outputs only one row per Match because of the GROUP BY at the end.
There's no subquery to re-execute many times, it's just joining Matches to the respective rows in the other tables once. But I use an outer join in case a Match has zero players of eithe type signed up.
Then instead of using COUNT() I use a trick of MySQL and use SUM() with a boolean expression inside the SUM() function. Boolean expressions in MySQL always return 0 or 1. The SUM() of these is the same as the COUNT() where the expression returns true. This way I can get the "count" of both experts and novices only scanning the Gamers table once.
P.S. MySQL is working in a non-standard way to return 0 or 1 from a boolean expression. Standard ANSI SQL does not support this, nor do many other brands of RDBMS. Standardly, a boolean expression returns a boolean, not an integer.
But you can use a more verbose expression if you need to write standard SQL for portability:
SUM(CASE G.experience WHEN 'Y' THEN 1 WHEN 'N' THEN 0 END) AS ExpertsSignedUp

Dev Code - Understanding What I Am Seeing

This is where I start by saying I am not a developer and this is not my code. As the DBA though it has shown up on plate from a performance perspective. The execution plan shows me that there are CI scans for Table2 aliased as D and Table2 aliased as E. Focusing on Table 2 aliased as E. The scan is coming from the subquery in the where clause for E.SEQ_NBR =
I am also seeing far more executions than need be. I know it depends on the exact index structure on the table, but at a high level is it likely that what I am seeing is a CI scan resulting from the aggregate (min) for every match it finds. Basically it is walking the table for the min SEQ_NBR for each match on EMPLID and other fields?
If likely, is it more a result of the manner in which it is written (I would think incorporating a CTE with some ROW_NUMBER logic would help) or lack of indexing? I am trying to avoid throwing an index at it "just because". I am getting hung up on that sub query in the where clause.
SELECT
D.EMPLID
,D.JOBCODE
,D.DEPTID
,E.DUR
,SUM(D.TL_QUANTITY) 'YTD_TL_QUANTITY'
FROM
Table1 B
,Table2 D
,Table2 E
WHERE
D.TRC = B.TRC
AND B.TL_ERNCD IN ( #0, #1, #2, #3, #4, #5, #6 )
AND D.EMPLID = E.EMPLID
AND D.EMPL_RCD = E.EMPL_RCD
AND D.DUR < = E.DUR
AND D.DUR > = '1/1/' + CAST(DATEPART(YEAR, E.DUR) AS CHAR)
AND E.SEQ_NBR =
( SELECT
MIN(EX.SEQ_NBR)
FROM
Table2 EX
WHERE
E.EMPLID = EX.EMPLID
AND E.EMPL_RCD = EX.EMPL_RCD
AND E.DUR = EX.DUR
)
AND B.EFFDT = ( SELECT
MAX(B_ED.EFFDT)
FROM
Table1 B_ED
WHERE
B.TRC = B_ED.TRC
AND B_ED.EFFDT < = GETDATE()
)
GROUP BY
D.EMPLID
,D.JOBCODE
,D.DEPTID
,E.DUR
The MIN operation has nothing to do with the CL scan. A MIN or Max is calculated using a sort. The problem is most likely the number of times the subquery is being executed. It has to loop through the subquery for every record returned in the parent query. A CTE may be helpful here depending on the size of Table2, but I don't think you need to worry about finding a replacement for the MIN() ... at least not yet.
Correlated subqueries are performance killers. Remove them and replace them with CTEs and JOINs or derived tables.
Try something like this (not tested)
SELECT
D.EMPLID
,D.JOBCODE
,D.DEPTID
,E.DUR
,SUM(D.TL_QUANTITY) 'YTD_TL_QUANTITY'
FROM Table1 B
JOIN Table2 D
ON D.TRC = B.TRC AND D.EMPLID = E.EMPLID
JOIN Table2 E
ON D.EMPL_RCD = E.EMPL_RCD AND D.DUR < = E.DUR
JOIN (SELECT MIN(EX.SEQ_NBR)FROM Table2) EX
ON E.EMPLID = EX.EMPLID
AND E.EMPL_RCD = EX.EMPL_RCD
AND E.DUR = EX.DUR
JOIN (SELECT MAX(B_ED.EFFDT)
FROM Table1
WHERE B_ED.EFFDT < = GETDATE()) B_ED
ON B.TRC = B_ED.TRC
WHERE B.TL_ERNCD IN ( #0, #1, #2, #3, #4, #5, #6 )
AND D.DUR > = '1/1/' + CAST(DATEPART(YEAR, E.DUR) AS CHAR)
As far as the implicit join syntax, do not allow anyone to ever do this again. It is a poor programming technique. As a DBA you can say what you will and will not allow in the database. Code review what is coming in and do not pass it until they remove the implicit syntax.
Why is is bad? In the first place you get accidental cross joins. Further, from a maintenance perspective, you can't tell if the cross join was accidental (and thus the query incorrect) or on purpose. This means the query with a cross join in it is unmaintainable.
Next, if you have to change some of the joins later to outer joins and do not fix all the implict ones at the same time, you can get incorrect results (which may not be noticed by an inexperienced developer. In SQL Server 2008 you cannot use the implicit syntax for an outer join, but it shouldn't have been used even as far back as SQl Server 2000 because Books Online (for SQL Server 2000) states that there are cases where it is misinterpreted. In other words, the syntax in unreliable for outer joins. There is no excuse ever for using an implicit join, you gain nothing from them over using an explicit join and they can create more problems.
You need to educate your developers and tell them that this code (which has been obsolete since 1992!) is not longer acceptable.
This a quick one, but this, CAST('1/1/' + CAST(DATEPART(YEAR, E.DUR) AS CHAR) AS DATETIME), it likely causing a table scan on Table2 E because the function likely has to be evaluated against each row.