I've been stuck on this problem for a couple of days now.
Each quote has a web stage, and each client can have multiple quotes. I need to establish which quote(s) for each client have the highest web stage (web stage is a numeric field from 1 to 6) and remove from the data the quotes that aren't at the max stage (two or more quotes could be at the same web stage).
I need to do it this way because there is some information held at the quote level that I need to show at the client level and if I let all the data in then my number of clients gets inflated.
Universe or query level solutions would be greatly appreciated.
The data structure and results I'm hoping to get look like this:
Data Structure & Results
Many thanks in advance for any help.
Tom
Couple of ways to do it, either in the universe, via a subquery in the report, or report variables. Here's the report variable method:
Create a new report variable named [IsMax], with this definition:
=If [Web Stage] = Max([Web Stage]) In ([Client ID]) Then 1 Else 0
Add a filter to the report, where [IsMax] is 1.
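For anyone who wants to check the logic outside the report, here is a minimal Python sketch of the same idea: compute the maximum web stage per client, then keep only the quotes at that maximum. The field names (client_id, quote_id, web_stage) and the sample rows are made up for illustration; this is not Web Intelligence syntax.

```python
def keep_max_stage(quotes):
    """Return only the quotes whose web stage equals their client's maximum."""
    max_stage = {}
    for q in quotes:
        client = q["client_id"]
        # Track the highest stage seen so far for this client
        max_stage[client] = max(max_stage.get(client, q["web_stage"]), q["web_stage"])
    # Equivalent of the [IsMax] = 1 filter; ties at the maximum are all kept
    return [q for q in quotes if q["web_stage"] == max_stage[q["client_id"]]]


# Hypothetical sample data
quotes = [
    {"client_id": "A", "quote_id": 1, "web_stage": 2},
    {"client_id": "A", "quote_id": 2, "web_stage": 5},
    {"client_id": "A", "quote_id": 3, "web_stage": 5},
    {"client_id": "B", "quote_id": 4, "web_stage": 1},
]
kept = keep_max_stage(quotes)
print([q["quote_id"] for q in kept])
```

Client A keeps both of its stage-5 quotes, which matches the requirement that two or more quotes can share the max stage.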
I am trying to accomplish something that is pretty easy to do in SQL, but seemingly very challenging to do in SSIS without using SQL. Basically, I need to consolidate and concatenate a field of a many-to-one relationship.
Given entities: [Contract Item] (many) to (one) [Account]
There is a field [ari_productsummary] that contains the product listed on the Contract Item entity. We want to write that value to the Account as [ari_activecontractitems]. However, an Account may have more than one Contract Item record associated to it, in which case, we want to concatenate those values. We also only want the distinct values to be concatenated (distinct rows already solved within my data flow).
This can be accomplished by writing to a temporary table, and then using a query or view to obtain the summarized results as follows. I created a SQL table called TESTTABLE that contains the [ari_productsummary] from the Contract Item entity along with the referring [accountid] to map it back to Account. I then wrote the following query as a view:
SELECT distinct accountid,
(SELECT TT2.ari_productsummary + '; '
FROM TESTTABLE TT2
WHERE TT2.accountid = TT.accountid
FOR XML PATH ('')
) AS 'ari_activecontractitems'
FROM TESTTABLE TT
Executing that Query provides me the results that I want, which I can then use for importing into the Account entity as shown below:
But how do I do this in an SSIS data flow without writing to a SQL table as a temporary placeholder for the data? I want to do the entire process inside one data flow container, without using a temporary SQL table/view. The whole summarization process needs to be done on the fly:
Does anyone have a solution that doesn't require a temporary SQL table/view/query, but is contained entirely within the data flow?
I am using VS 2017 and the KingswaySoft Dynamic CRM 365 ETL toolset to develop my solution/package.
Spitballing here, as I don't have Dynamics, nor do I have the custom components.
Data Flow 1 - Contract aggregation
The purpose of this data flow is to replicate your logic in the elegant query you provided and shove that into a Cache Connection Manager (see Notes for 2008+ at the end)
KingswaySoft Dynamics Source -> Script Component -> Cache Transform
If you want to keep the sort in there, do it before the Script Component. The implementation I'll take with the Script Component is that it's fully blocking; that is, all the rows must arrive before it can send any on. Components like the Merge Join are only partially blocking because the requirement of sorted data means that once you no longer have a match for the current item, you can send it on down the pipeline.
The Script Component is going to be an asynchronous transformation. You'll have two output columns: your key, accountid, and your new derived column, ari_activecontractitems. That column might need to be big. You'll know your data best, but if it's a blob type in Dynamics (> 4k unicode or > 8k ascii characters) then you'll have to define the data type as DT_TEXT/DT_NTEXT.
As inputs, you'll select accountid and ari_productsummary from your source.
The code should be pretty easy. We're going to accumulate the inbound data into a Dictionary.
// member variable (needs "using System.Collections.Generic;" at the top of the script)
Dictionary<string, List<string>> accumulator;
In the PreExecute method, we'll tack this in there to initialize our variable
// initialize in the PreExecute method
accumulator = new Dictionary<string, List<string>>();
In the per-row input method (Input0_ProcessInputRow by default; the name follows your input's name)
// simulate the inbound queue
// row_id and invoice are placeholders for the inbound
// accountid and ari_productsummary values
if (!accumulator.ContainsKey(row_id))
{
    // Create an empty list for this key
    accumulator.Add(row_id, new List<string>());
}
// Add the value only if we don't already have it
if (!accumulator[row_id].Contains(invoice))
{
    accumulator[row_id].Add(invoice);
}
Once you get the signal that no more data is available, that's when you start sending output rows. The auto-generated code will have placeholders for all this.
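// This is how we shove data out the pipe
foreach (var kvp in accumulator)
{
    // approximately thus; the buffer name follows your output's name
    OutputBuffer1.AddRow();
    // the two output columns defined earlier: the key and the concatenated list
    OutputBuffer1.accountid = kvp.Key;
    OutputBuffer1.ari_activecontractitems = string.Join("; ", kvp.Value);
}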
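If it helps to see the accumulate-then-emit logic on its own, here is a rough Python sketch of what the component does, outside of SSIS. The function name concat_distinct and the sample rows are invented for illustration; the real component would read from the input buffer and write to the output buffer instead.

```python
def concat_distinct(rows, sep="; "):
    """Group (account_id, product) pairs and join the distinct products per account."""
    accumulator = {}
    for account_id, product in rows:
        # Create an empty list the first time we see this key
        products = accumulator.setdefault(account_id, [])
        # Add the value only if we don't already have it
        if product not in products:
            products.append(product)
    # Emit one row per key only after every input row has been seen (fully blocking)
    return [(account_id, sep.join(products)) for account_id, products in accumulator.items()]


# Hypothetical input rows: (accountid, ari_productsummary)
rows = [
    ("acct-1", "Widget"),
    ("acct-2", "Gadget"),
    ("acct-1", "Gizmo"),
    ("acct-1", "Widget"),
]
result = concat_distinct(rows)
print(result)
```

Joining at the end, rather than appending a separator per row as the FOR XML PATH query does, avoids a trailing "; " on each value.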
We have an upcoming release that comes with a component that does exactly what you are trying to achieve without the need of writing custom code. The feature is currently under preview, please reach out to us for private access to the feature. You can find our contact information on our website.
UPDATE - June 5, 2020, we have made the components available for public access at https://www.kingswaysoft.com/products/ssis-productivity-pack/ as a result of our 2020 Release Wave 1. We have two components available that serve this kind of purpose. The Composition component takes input values and transforms them into a composite value in an SSIS column. The Decomposition component does the opposite: it takes an input value and splits it into multiple rows using either delimiter-based text splitting or XML/JSON array splitting.
Forgive me for the shoddy coding/description of the problem. I'm new to programming and this is my first question!
Anyways, I am building a simple inventory system with tkinter while using MySQL as a database. Currently, I am working on a feature that would allow a user to pick a department using an OptionMenu and then get all the items in that department. I have the items listed in one table and the departments listed in another, with a FOREIGN KEY connecting the items table to the primary key (department_string) in the departments table.
My goal is to have MySQL deliver a list of departments and then have the OptionMenu use that list for its options. I then need to query the database with the department selected in the OptionMenu to find all the items in that department. My problem is that variable.get() from the OptionMenu returns the parentheses and commas that come back from the first database query, so I cannot put the result of variable.get() directly into the query string passed to the cursor. Here is the code:
department_cursor.execute("SELECT department_string FROM departments")
department_list = department_cursor.fetchall()
variable = StringVar(search_window)
variable.set(department_list[0])
user_entry = OptionMenu(search_window, variable, *department_list)
***
cursor_b.execute("SELECT item WHERE department_string = " + "'" + str(variable.get()) + "'")
I believe the problem is that variable.get() provides the special characters, like the parentheses and comma, that came from the original MySQL query. For example, if the departments are HR, Warehouse, R&D, then MySQL returns [('HR',), ('Warehouse',), ('R&D',)]. This is then fed into the OptionMenu, and variable.get() spits out something like ('HR',), which MySQL doesn't recognize.
So far the only things I can think of are to use a for loop to delete all the special characters in what the OptionMenu returns, or to hard-code what the string for each department should be. Both seem suboptimal, and although I'm pretty new to programming, I think it's a little too much like a Rube Goldberg machine.
Anyways, if you made it this far, thank you so much for reading this! Once again, I'm brand new to all of this so any help you can give me is greatly appreciated!
.get() is going to return whatever is in the OptionMenu. If the values you put into the OptionMenu have the undesirable characters, the value you get out will too.
The database call is going to return a list (rows) of tuples (columns), and it doesn't look like you're taking that into account. Pull the first column out of each row before handing the values to the OptionMenu.
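Here is a minimal sketch of that unpacking. It uses the standard library's sqlite3 with an in-memory database purely as a stand-in so the snippet runs without a MySQL server; the table contents, the ORDER BY clauses, and the use of department_list[0] in place of variable.get() are assumptions for the demo. With a MySQL driver the parameter placeholder is normally %s rather than ?.

```python
import sqlite3

# In-memory stand-in for the MySQL database (assumption for this demo)
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE departments (department_string TEXT PRIMARY KEY)")
cur.execute("CREATE TABLE items (item TEXT, department_string TEXT)")
cur.executemany("INSERT INTO departments VALUES (?)",
                [("HR",), ("Warehouse",), ("R&D",)])
cur.executemany("INSERT INTO items VALUES (?, ?)",
                [("Stapler", "HR"), ("Forklift", "Warehouse"), ("Badge printer", "HR")])

# fetchall() returns one tuple per row, e.g. [('HR',), ('R&D',), ('Warehouse',)]
cur.execute("SELECT department_string FROM departments ORDER BY department_string")
rows = cur.fetchall()

# Take the first column of each row so the menu gets plain strings
department_list = [row[0] for row in rows]

# In the GUI this would be variable.get(); here we just pick the first entry
selected = department_list[0]

# Pass the value as a parameter instead of concatenating it into the SQL string
cur.execute("SELECT item FROM items WHERE department_string = ? ORDER BY item",
            (selected,))
items = [row[0] for row in cur.fetchall()]
print(department_list, items)
```

With plain strings in department_list, OptionMenu(search_window, variable, *department_list) and variable.set(department_list[0]) would hold 'HR' rather than ('HR',), and the parameterized query also sidesteps quoting problems with values such as R&D.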
I have three separate SPSS files with information about roughly 7500 hemicolectomy patients. One file contains the information about the hemicolectomies, the second one about other surgeries the patients have had during their lifetime, and the last one contains information about their sick leaves during their lifetime.
I have merged the files into a single SPSS document (idnumber is the common variable), but I ran into a problem with filtering out the surgeries and sick leaves that have nothing to do with the hemicolectomy. I'm quite new to SPSS, so the simplest way I could think of doing this is by somehow copying the hemicolectomy info to every case and then just using the date/time calculator to choose which sick leaves and surgeries to discard. Switching to wide format is impractical due to the large number of unrelated surgeries and sick leaves: I'd have thousands of variables.
So basically I'd like to do the following:
IF idnumber = idnumber THEN variable1=variable1 AND variable2=variable2 etc
How would I go about doing this?
All help will be appreciated!
The IF command can only be used with one transformation:
IF [condition] [transformation].
Assuming both of your files are sorted by idnumber:
UPDATE file=[master_file_reference]
/file=[secondary_file_reference]
/BY idnumber.
EXECUTE.
Each file can be referenced either by its dataset name or by its full path.
More on the UPDATE command:
https://www.ibm.com/support/knowledgecenter/en/SSLVMB_24.0.0/spss/base/syn_update_examples.html
I can't comment yet, so I'm sorry if I misunderstand the problem. I would've asked for clarification in the comments to the question... here goes...
So you have three sources of data, which have dates (?) of hemicolectomies, one for each case; dates (?) of other surgeries, multiple for each case; and sick leaves, even more for each case. Is that right?
I'd try solving the problem before matching all three files, by matching the file that contains one observation per patient (presumably hemicolectomies) to the one with the second most observations per patient (presumably other surgeries) with the /table keyword:
MATCH FILES /FILE= 'surgeries.sav' /table = 'hemicolectomies.sav'
/by idnumber.
EXECUTE.
This will "fill up" the blank cells for each patient with the hemicolectomy data.
Now use the date/time calculation to check which surgeries "belong" to the hemicolectomies, reduce your data accordingly, and then match it to the sick leave data using the /table keyword again.
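To make the idea concrete outside SPSS, here is a small Python sketch of the same two steps: a table lookup that copies the hemicolectomy date onto every surgery row for that patient, followed by a date comparison. The column names, the sample dates, and the 30-day window are all assumptions for illustration; pick whatever rule actually defines a "related" surgery in your data.

```python
from datetime import date, timedelta

# One row per patient: idnumber -> hemicolectomy date (the /table file)
hemicolectomies = {
    101: date(2015, 3, 10),
    102: date(2016, 7, 1),
}

# Many rows per patient (the /FILE file); sample values are made up
surgeries = [
    {"idnumber": 101, "surgery_date": date(2015, 3, 20)},
    {"idnumber": 101, "surgery_date": date(2010, 1, 5)},
    {"idnumber": 102, "surgery_date": date(2016, 7, 15)},
]

# Step 1: table lookup, copying the hemicolectomy date onto every matching row
merged = [
    dict(s, hemi_date=hemicolectomies[s["idnumber"]])
    for s in surgeries
    if s["idnumber"] in hemicolectomies
]

# Step 2: keep only surgeries within an assumed 30 days after the hemicolectomy
WINDOW = timedelta(days=30)
related = [
    s for s in merged
    if timedelta(0) <= s["surgery_date"] - s["hemi_date"] <= WINDOW
]
print(len(merged), len(related))
```

The same lookup can then be repeated against the sick leave rows, which is the second /table match described above.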
Seems like the easiest solution to me.
I am new to Talend Open Studio.
However, I received a task:
1. Create delimited file (.csv) metadata (one for Lead and one for Opportunity).
2. Move the files to your repository on the AWS server (the etl_process1 login).
3. Create two tables, sfdc_leads_reporting_raw and sfdc_opp_reporting_raw.
4. Load the data from the files into the tables. Ensure the data types are correctly used when creating metadata schemas and tables.
I am done up to step 4.
Now the problem is:
How do I implement logging at the end of each job to report the number of leads (count of distinct id in the leads table) and the number of opportunities created (count of opportunity id) by stage (how many converted, qualified, closed won, and dead)?
Help would be appreciated.
You can get this data using global variables, in a subjob at the end of your job. Most components provide a global variable called tComponent_NB_LINE (or _NB_LINE_INSERTED for database components) that gives you the number of lines output by the component.
For instance tFileOutputDelimited_1_NB_LINE or tOracleOutput_1_NB_LINE_INSERTED.
Using these variables you can log into console or file.
Here is a simple example. If you have a tOracleOutput_1 in your job you can do:
tPostJob -- OnComponentOk -- tFixedFlowInput -- Main -- tLogRow
Inside tFixedFlowInput you retrieve the variable
(Integer)globalMap.get("tOracleOutput_1_NB_LINE_INSERTED").
If you need to log aggregated info, you can append a tAggregateRow to your output components, and use tSetGlobalVar to get count by certain criteria.
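As a language-neutral illustration of what that aggregated log would contain, here is a short Python sketch (not Talend code) that counts distinct lead ids and opportunities per stage. The sample rows and the field names id, opportunity_id and stage are assumptions based on the question.

```python
from collections import Counter

# Hypothetical rows as they might sit in the two raw tables
leads = [{"id": "L1"}, {"id": "L2"}, {"id": "L1"}]
opportunities = [
    {"opportunity_id": "O1", "stage": "converted"},
    {"opportunity_id": "O2", "stage": "closed won"},
    {"opportunity_id": "O3", "stage": "dead"},
    {"opportunity_id": "O4", "stage": "closed won"},
]

# Count of distinct id in the leads table
distinct_leads = len({row["id"] for row in leads})

# Count of opportunities grouped by stage (what the aggregate step would produce)
by_stage = Counter(row["stage"] for row in opportunities)

# Build the log lines that would go to the console or a file at the end of the job
log_lines = [f"leads (distinct id): {distinct_leads}"]
log_lines += [f"opportunities {stage}: {count}" for stage, count in sorted(by_stage.items())]
for line in log_lines:
    print(line)
```

In the job itself, the group-by on stage would be the tAggregateRow step, and the resulting counts are what you would store with tSetGlobalVar and print in the tPostJob subjob.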
The 1,500 page Access 97 Bible (don't laugh!) that I've been given by my boss to solve his problem doesn't solve my problem of how to solve his problem, because it has no VBA code.
Let me first make clear that I've made attempts to solve this without (much) coding, and that I've coded quite a bit in VBA already, so I'm basically familiar with most things including recordsets, queries, etc etc but have problems with MS Access limits on how to form a report with data coming from VBA variables. I'm also versatile in most programming languages, but this is not a language problem but rather a "how to/what's possible" problem.
My problem right now is that dragging the query fields into the Detail subform and putting them into cells in columns setting Left and Top with VBA code are moving them alright, but each cell is on a new page. Unfortunately, there is multiple data in each cell that won't conform to the Create Report Guide options available.
So my question is simply this: Can someone point me to working examples of code that create, place, and fill with VBA variable strings, text fields at any coordinate I please on a paper size of my choice?
Edit: The above is not an option, as I understand this will prohibit the client from getting an .mde database. What remains, then, is to merely ask for some sound advice on how to get several rows GROUPed BY weekday and machine (see below) into a recordset or similar for each cell. I guess the best way is to count the number of columns in the table (machines in the sql result) and create 5 rows of these with dummy data, then go through the result rows and place the data in the relevant controls. But if you have ideas for doing this work better and faster, write them as answers.
Sorry for this, I knew there was something I wasn't understanding. Basically, I thought Access supported creating reports dynamically via VBA, ie. "generating pages with data" rather than "preparing a flow of controls connected to datasources". But Access requires that you create an ample amount of dummy, unlinked controls manually, then either fill or hide them and that's how they become "dynamic".
This is for Access 2003 on a remote server accessing local and remote ODBC SQL database tables, if relevant. The goal is to make a week schedule of n columns (n=number of machines at a certain plant) x 5 rows (weekday Mon-Fri), and put 1 or more recordset rows (=scheduled activities for that day on that machine) in each of the "n by 5 table" cells.
If you detect venting frustration in this post I can only ask your forgiveness and hope for your understanding.
There are many techniques for this:
Ex. 1) Use dynamic SQL:
'Create a function that builds the SQL query
Function MakeMySQlReport(Parameters)
    Dim strSql As String
    Dim strMyVar As String 'expression to expose as MyFieldVar1; set it before building the query
    strSql = vbNullString
    strSql = "Select " & strMyVar & " as MyFieldVar1, * from myTable where Fieldx = " & Parameters
    'MyReport stands for your report object
    MyReport.RecordSource = strSql
End Function
Ex. 2) Create a function that returns your string:
Function MyString1() As String
    MyString1 = "ABC"
End Function
Then, in your report, select the text box that should receive the value and set its Control Source to =MyString1()
I hope this helps. Do you need more examples?
Solution:
Create many objects manually (grr!)
name them systematically
put them in a Control Array (get all Me.Controls, sift out the ones you're interested in, and put them in an indexed array)
go through the array and change their properties
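The bookkeeping behind those steps can be sketched in Python (this is not VBA, just the grouping logic): collect the recordset rows per (weekday, machine) pair and map each cell to a systematically named control. The control naming scheme txtCell_<row>_<column> and the sample activities are invented for illustration.

```python
from collections import defaultdict

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]


def build_cells(activities):
    """Map each (weekday, machine) cell to the text its control should show."""
    # One column per machine found in the result rows
    machines = sorted({a["machine"] for a in activities})
    grouped = defaultdict(list)
    for a in activities:
        grouped[(a["weekday"], a["machine"])].append(a["text"])
    cells = {}
    for r, day in enumerate(WEEKDAYS):
        for c, machine in enumerate(machines):
            # Hypothetical systematic control name; empty cells get an empty string
            cells[f"txtCell_{r}_{c}"] = "\n".join(grouped.get((day, machine), []))
    return machines, cells


# Hypothetical recordset rows
activities = [
    {"weekday": "Mon", "machine": "M1", "text": "Job 17"},
    {"weekday": "Mon", "machine": "M1", "text": "Maintenance"},
    {"weekday": "Wed", "machine": "M2", "text": "Job 18"},
]
machines, cells = build_cells(activities)
print(machines, cells["txtCell_0_0"])
```

In the report, the loop over the control array would then assign each of these strings to the control with the matching name and hide the controls for columns beyond the number of machines.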