Ab Initio Join component: determining which join key field caused the mismatch

I have two files with the same format and columns, and I am comparing them by passing them into a Join component with the join key set to all of the fields.
file a:
ID DESC CODE COMMENT VALUE
1 AFAF 34 GDG 34
2 DGF 45 DGDF 45
file b:
ID DESC CODE COMMENT VALUE
1 AFAF 34 XXX 34
2 XXX 45 DGDF 45
In the Join component, I set the join key to {ID},{DESC},{CODE},{COMMENT},{VALUE}.
With the example files, both records go to the unused port.
My question is: is it possible to find out which field caused a record to be rejected?
That is, is it possible to get the output below?
1 AFAF 34 XXX 34 Comment mismatch
2 XXX 45 DGDF 45 DESC mismatch
Graph used:
Input file---->Reformat-------
|----Joiner----Output
Input file 2---->Reformat----- --Unused

It is possible, but you have to decide on your key first. It seems you want to match these files on ID as the key; add more fields if you need more keys. With that as your join key, write your transform function something like this:
out.comments :: if (string_upcase(string_lrtrim(in0.DESC)) != string_upcase(string_lrtrim(in1.DESC))) "DESC mismatch" else if (do the same on other columns);
This way you get the records that match on the chosen key, each with a comment naming the field that does not match.
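The same idea (match on ID only, then compare the remaining fields one by one) can be sketched outside Ab Initio. This is a minimal Python illustration, not Ab Initio DML; the names `FIELDS`, `normalize`, and `mismatches` are made up for the example, and the data is the sample from the question.

```python
# Hypothetical stand-in for "join on ID, then compare field by field".
FIELDS = ["ID", "DESC", "CODE", "COMMENT", "VALUE"]

file_a = [
    ["1", "AFAF", "34", "GDG", "34"],
    ["2", "DGF", "45", "DGDF", "45"],
]
file_b = [
    ["1", "AFAF", "34", "XXX", "34"],
    ["2", "XXX", "45", "DGDF", "45"],
]


def normalize(value):
    # Plays the role of string_upcase(string_lrtrim(...)) in the transform.
    return value.strip().upper()


def mismatches(rows_a, rows_b, key="ID"):
    key_idx = FIELDS.index(key)
    by_key = {row[key_idx]: row for row in rows_a}
    report = []
    for row_b in rows_b:
        row_a = by_key.get(row_b[key_idx])
        if row_a is None:
            continue  # no partner on the key: this is what would stay unused
        diffs = [
            name
            for name, a, b in zip(FIELDS, row_a, row_b)
            if normalize(a) != normalize(b)
        ]
        if diffs:
            report.append(" ".join(row_b) + " " + ", ".join(diffs) + " mismatch")
    return report


result = mismatches(file_a, file_b)
for line in result:
    print(line)
```

Because every differing field is collected, a record that differs in more than one column is reported with all of them, which a single if/else chain would not do.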

If you want to identify the records or the fields that differ, you can use dynamic compare records as well.

Related

Update a value and a table with a foreign key

ticketId(**) timeExpected timeElapsed
187 5 5
225 4 8
856 8 15
782 10 8
**primary key
*foreign key
id(**) (*)ticketId beyondTime
1 187 0
2 225 1
3 856 1
4 782 0
I need to know which tickets are out of time. I have this in mind and in my database, but I can't work it out in SQL. When a ticket is out of time, as ticket 225 is, I would like to update the other table with a 1 for "out of time" or a 0 for "good".
I don't know whether I can update the "beyondTime" table by computing "timeExpected - timeElapsed" in the first table whenever a ticket exceeds the expected time.
First, let's simplify the problem you're looking for a solution to:
Comparing two data field values for the rows in a table
I believe that's basically what you're trying to do. You can then make a decision based on the comparison and act on it, for example by updating values in another table.
But if all you're looking for is a kind of report, then all you need is a select statement with the comparison logic built in.
You can use the SQL CASE expression for this. Here is an easy guide if you want to take a look.
The select statement you need would be like this:
select ticketId, (case when timeElapsed > timeExpected then 1 else 0 end) as beyondTime from tickets
With the sample data from the question, the result would be like this:
ticketId beyondTime
187 0
225 1
856 1
782 0
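Both the select above and the update the question asks about can be tried out with a minimal sketch like the one below. It uses Python's built-in sqlite3 module rather than the asker's database, and the table name `ticket_status` is an assumption, since the question does not name the second table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tickets (
        ticketId INTEGER PRIMARY KEY, timeExpected INTEGER, timeElapsed INTEGER);
    INSERT INTO tickets VALUES (187, 5, 5), (225, 4, 8), (856, 8, 15), (782, 10, 8);
    -- "ticket_status" is a hypothetical name for the second table.
    CREATE TABLE ticket_status (
        id INTEGER PRIMARY KEY,
        ticketId INTEGER REFERENCES tickets(ticketId),
        beyondTime INTEGER);
    INSERT INTO ticket_status VALUES (1, 187, 0), (2, 225, 0), (3, 856, 0), (4, 782, 0);
""")

# Report only: the CASE expression from the answer.
report = conn.execute("""
    SELECT ticketId,
           CASE WHEN timeElapsed > timeExpected THEN 1 ELSE 0 END AS beyondTime
    FROM tickets
    ORDER BY ticketId
""").fetchall()
print(report)

# Persist the flag in the second table with a correlated subquery.
conn.execute("""
    UPDATE ticket_status
    SET beyondTime = (
        SELECT CASE WHEN t.timeElapsed > t.timeExpected THEN 1 ELSE 0 END
        FROM tickets AS t
        WHERE t.ticketId = ticket_status.ticketId
    )
""")
updated = conn.execute(
    "SELECT ticketId, beyondTime FROM ticket_status ORDER BY id"
).fetchall()
print(updated)
```

The correlated subquery keeps the update portable; in MySQL the same effect could be had with a multi-table UPDATE, but that syntax is not shown here.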
Two things to remember:
Don't get disappointed by the negative points and keep asking questions :)
Identify the issue you need help with and be specific in your question. I'd suggest you update this question.
Good luck!

Ordering MySQL 8 results by count existence in a crosswalk table

I have the following MySQL 8 tables:
[submissions]
===
id
submission_type
name
[reject_reasons]
===
id
name
[submission_reject_reasons] -- crosswalk joining the first 2 tables
===
id
submission_id
reject_reason_id
In my application, users can submit submissions, and other users can request changes to those submissions. When they request these rejections, 1+ entries get saved to the submission_reject_reasons table (which stores the ID of the submission for which rejections are requested, as well as the ID of the reason for why the rejection is being made). So a typical entry in the table might look like:
id submission_id reject_reason_id
==============================================
45 384 294
Where submission_id = 384 is the "Fizz Buzz" submission and reject_reason_id = 294 is the "Missing Required Field" reason.
I currently have a query that fetches all the reject_reasons out of the DB:
SELECT * FROM reject_reasons
I now want to modify this query to sort the results based on their usage frequency. Meaning the query might currently return:
294 | Missing Required Field
14 | Malformed Entry
1885 | Makes No Sense
etc. But let's say there are 5 entries in the submission_reject_reasons table where 294 (Missing Required Field) is the reject_reason_id, 15 entries where 1885 (Makes No Sense) is present, and 120 entries where 14 (Malformed Entry) is present. I need a query that returns all reject_reasons sorted by their count in the submission_reject_reasons (SRR) table, descending, so that the most frequently used appear earlier in the sort. Hence the result set would be:
14 | Malformed Entry --> because there are 120 instances of this in the SRR table
1885 | Makes No Sense --> because there are 15 instances in the SRR
294 | Missing Required Field --> because there are only 5 instances in the SRR
Furthermore, I need a ranking from most-used to least-used. If a reason doesn't exist in the SRR table it should have a default "count" of zero (0) but should still come back in the query. If 2+ reason counts are tied, then I don't care how they are sorted. Any ideas here? I need the final result set to only contain the rr.id and rr.name field/values.
My best attempt is not getting me anywhere:
SELECT rr.id, rr.name
FROM reject_reasons AS rr
LEFT JOIN submission_reject_reasons AS srr on rr.id = srr.reject_reason_id
GROUP BY rr.id
ORDER BY COUNT(*) DESC
Can anyone help me over the finish line here? Can anyone spot where I'm going awry? Thanks in advance!
You should be grouping by the reject reason ID. COUNT(*) is what you want to count in each group.
SELECT rr.id, rr.name
FROM reject_reasons AS rr
JOIN submission_reject_reasons AS srr on rr.id = srr.reject_reason_id
GROUP BY rr.id
ORDER BY COUNT(*) DESC
There's no need for any EXISTS check, since the INNER JOIN won't return any reject reasons that don't exist in submission_reject_reasons.
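The question also asks that reasons with no entries in the crosswalk table still come back with a count of zero, which an inner join would drop. A minimal sketch of one way to keep them, assuming a LEFT JOIN and counting a column from the crosswalk side (so unmatched reasons count as 0 rather than 1), is shown below using Python's sqlite3 module as a stand-in for MySQL 8. The reason 'Never Used' (id 7) is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reject_reasons (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE submission_reject_reasons (
        id INTEGER PRIMARY KEY, submission_id INTEGER, reject_reason_id INTEGER);
    INSERT INTO reject_reasons VALUES
        (294, 'Missing Required Field'), (14, 'Malformed Entry'),
        (1885, 'Makes No Sense'),
        (7, 'Never Used');  -- hypothetical reason with no crosswalk rows
""")

# Usage counts taken from the question's example.
usage = {14: 120, 1885: 15, 294: 5}
for reason_id, n in usage.items():
    conn.executemany(
        "INSERT INTO submission_reject_reasons (submission_id, reject_reason_id) "
        "VALUES (?, ?)",
        [(i, reason_id) for i in range(n)],
    )

# COUNT(srr.id) ignores the NULLs produced by unmatched LEFT JOIN rows,
# whereas COUNT(*) would count each unmatched reason once.
ranked = conn.execute("""
    SELECT rr.id, rr.name
    FROM reject_reasons AS rr
    LEFT JOIN submission_reject_reasons AS srr ON rr.id = srr.reject_reason_id
    GROUP BY rr.id, rr.name
    ORDER BY COUNT(srr.id) DESC
""").fetchall()
for row in ranked:
    print(row)
```

The only differences from the asker's attempt are `COUNT(srr.id)` in place of `COUNT(*)` and the extra `rr.name` in the GROUP BY, which is harmless and keeps the query valid on stricter engines.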

FileMaker - Total SubSummary Values

I have a table with records each representing an appointment. I have the name of the contact the appointment is with, and the date. In another table I have a field that contains how many appointments each contact is supposed to have during the day. There are 12 entries for each contact, because some are expected to have different numbers during different months.
I am able to call up the data for the appropriate contact for the appropriate month. It looks great in the graph when I count up the number of entries for Contact A and put next to it the expected number of entries from the related table.
The problem I'm running into now is that I need to add up all of the expected appointments between all of the entities. So:
::ContactName:: ::appointments:: ::expected::
Contact A 12 10
Contact B 33 34
Contact C 18 27
Getting the roll-up for the actual appointments is easy: a simple COUNT summary field in a sub-summary section. But what about the expected? Because Contact A had 12 appointments, there will be 12 records for them, so putting a summary field on the expected column would return 120 for Contact A. Instead, given the dataset above, I need the calculation to return 71. Does this issue make sense? Any help would be greatly appreciated.
If I am following this correctly, you need to divide the amount of expected appointments between the entries of the group, then total the result. So something like:
Sum ( Entities::Expected ) / GetSummary ( sCount ; EntityID )
(this would be easier if we knew the names of your tables and fields).
P.S. The term "entity" has a specific meaning in the context of a relational database. Consider using another term (e.g. "contacts").
Added:
Using your example data, you should see the following results in the above calculation field:
in the 1st group of 12 records: 10 / 12 = .8333333333333333
in the 2nd group of 33 records: 34 / 33 = 1.0303030303030303
in the 3rd group of 18 records: 27 / 18 = 1.5
When you sum all this up (using a summary field defined as Total of this calculation field), you should get 71 (or a number very near 71, due to rounding errors).
Note: in the above calculation, sCount is a summary field defined in the Appointments table as Count of [ any field that cannot be empty ], and EntityID is the field by which your records are sorted and grouped (and it must be a local field).
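The arithmetic behind this can be checked with a short sketch. It is plain Python rather than FileMaker, and the `groups` dictionary simply restates the example data from the question: each record carries expected / count, and summing that share over all records recovers the expected total.

```python
from fractions import Fraction

# Contact name -> (number of appointment records, expected appointments)
groups = {"Contact A": (12, 10), "Contact B": (33, 34), "Contact C": (18, 27)}

per_record = []
for name, (count, expected) in groups.items():
    share = Fraction(expected, count)    # value of the calculation field
    per_record.extend([share] * count)   # one copy per record in the group

total = sum(per_record)                  # exact arithmetic
float_total = sum(float(x) for x in per_record)  # what a decimal field approximates

print(total)
print(float_total)
```

With exact fractions the total is exactly 71; with floating-point values it is 71 up to a small rounding error, which matches the caveat in the answer.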

Iterate through a column and summarize findings

I have a table (t1) in MySQL that contains the following data:
type time full
0 11 yes
1 22 yes
0 11 no
3 13 no
I would like to create a second table (t2) from this that will summarize the information found in t1 like the following:
type time num_full total
0 11 1 2
1 22 1 1
3 13 0 1
I want to be able to iterate through the type column in order to be able to start this summary, something like a for-loop. The types can be up to a value of n, so I would rather not write n+1 WHERE statements, then have to update the code every time more types are added.
Notice how t2 skipped the type of value 2? This has also been escaping me when I try looping. I only want the types found to have rows created in t2.
While a direct answer would be nice, it would be much more helpful to be pointed to some sources where I could figure this out, or both.
This may do what you want
create table if not exists t2 as
select type, time, sum(full = 'yes') as num_full, count(*) as total
from t1
group by type, time
order by type, time;
depending on how you want to aggregate the time column.
This is a starting point for reference on the GROUP BY functions: https://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
And here is the reference for the CREATE TABLE syntax:
https://dev.mysql.com/doc/refman/5.6/en/create-table.html
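No loop over the types is needed, because GROUP BY produces one row per (type, time) combination that actually occurs, so type 2 never appears. A minimal sketch of this, run through Python's sqlite3 module instead of MySQL and assuming the full column holds the strings 'yes' and 'no', could look like this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (type INTEGER, time INTEGER, "full" TEXT);
    INSERT INTO t1 VALUES
        (0, 11, 'yes'), (1, 22, 'yes'), (0, 11, 'no'), (3, 13, 'no');

    -- One output row per (type, time) pair present in t1.
    CREATE TABLE IF NOT EXISTS t2 AS
        SELECT type, time,
               SUM(CASE WHEN "full" = 'yes' THEN 1 ELSE 0 END) AS num_full,
               COUNT(*) AS total
        FROM t1
        GROUP BY type, time;
""")

summary = conn.execute(
    "SELECT type, time, num_full, total FROM t2 ORDER BY type, time"
).fetchall()
for row in summary:
    print(row)
```

The CASE expression is the portable spelling of counting the 'yes' rows; new types added to t1 later are picked up the next time the summary is rebuilt, with no code changes.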

a query that tells the number of submits for each location along with the location's name

My users submit information for locations. I'm trying to write a query that will tell me the number of submits they've done for each location along with that location's name.
This query seems to be a start in the right direction. It returns the id for each location that the user has submitted information.
SELECT reports.location_id
FROM reports
WHERE reports.user_id = 104
ORDER BY reports.location_id
An example of the return from this query is:
location_id
99
99
99
112
115
115
For my final HTML output, however, I would like to show something more along the lines of this:
location_name number_of_submits
name_1 3
name_2 1
name_3 2
Is there a MySQL query I could use to get this? Or would I need to use PHP to iterate through my query's results (i.e. recognize 99 was returned 3 times and fetch 99's name from the locations table, then recognize 112 was returned once and fetch its name, and so on)?
Thank you...
Use GROUP BY / COUNT:
SELECT
location_name,
COUNT(*) AS number_of_submits
FROM reports
JOIN markers ON markers.id = reports.location_id
WHERE reports.user_id = 104
GROUP BY reports.location_id
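To see the grouping at work on the example from the question, here is a minimal sketch using Python's sqlite3 module in place of MySQL and PHP. The table and column names follow the answer's query (`markers`, `location_name`); the sample rows, including the extra report by user 7, are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE markers (id INTEGER PRIMARY KEY, location_name TEXT);
    CREATE TABLE reports (
        id INTEGER PRIMARY KEY, user_id INTEGER, location_id INTEGER);
    INSERT INTO markers VALUES (99, 'name_1'), (112, 'name_2'), (115, 'name_3');
    INSERT INTO reports (user_id, location_id) VALUES
        (104, 99), (104, 99), (104, 99), (104, 112), (104, 115), (104, 115),
        (7, 99);  -- a report by another user, filtered out below
""")

rows = conn.execute("""
    SELECT markers.location_name, COUNT(*) AS number_of_submits
    FROM reports
    JOIN markers ON markers.id = reports.location_id
    WHERE reports.user_id = ?
    GROUP BY reports.location_id, markers.location_name
    ORDER BY reports.location_id
""", (104,)).fetchall()

for location_name, number_of_submits in rows:
    print(location_name, number_of_submits)
```

Doing the counting in SQL means the application code only has to loop once over the finished rows when building the HTML output.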