I am new to MS Access and am trying to build a database that includes a text roll up (effectively summarising staff feedback collectively to the manager). I have set it up using ConcatRelated (http://allenbrowne.com/func-concat.html) and this works as expected. The only issue is that I cannot seem to ensure that only the first instance of a value is returned, rather than every instance. For example:
The function is bringing back the top view, but I only want it to bring back the Distinct values - please note, as this is a manager view, the same value can appear multiple times in the data retrieved. As such, I almost need the text to de-dupe if it is already there.
Full disclosure - my SQL is not great, so I am using the Expression Builder from the design view
Any help will be appreciated
EDIT for more detail:
The image shows a sample of the data and the output. In this, Mary is the team leader and so is responsible for the areas of John, Steven, Erin, and Harriet. The UID refers to the responsibility area.
As you can see, the dataset has the "minimum target" referenced against the area of responsibility and the output I am getting is duplicating the comments (which I presume is because the value is returned twice), but I am trying to ensure that commentary is summarised, not duplicated.
Note, in the summary output, I am not interested in the areas of responsibility, just the commentary against the staff names.
I hope that makes sense
Build a query that retrieves distinct Comments values for each person:
SELECT DISTINCT Person, Comments FROM table;
or
SELECT Person, Comments FROM table GROUP BY Person, Comments;
Now reference that query in the ConcatRelated function as the source for the Comments values.
Be aware that calling a function like this in a query that also aggregates can be slow on a large dataset.
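To illustrate the idea outside Access, here is a minimal Python sketch of "de-dupe first, then concatenate". The sample rows, the function name concat_distinct and the separator are all made up for the example; in Access the DISTINCT query plus ConcatRelated does this work.

```python
# Hypothetical sample rows: (Person, Comments). The repeats mimic the same
# comment coming back once per area of responsibility.
rows = [
    ("John", "Below minimum target"),
    ("John", "Below minimum target"),
    ("Steven", "On track"),
    ("John", "Needs coaching"),
]


def concat_distinct(rows, separator=", "):
    """Join each person's comments, keeping only the first instance of each."""
    seen = {}
    for person, comment in rows:
        # Dict keys keep insertion order, so this de-dupes without reordering.
        seen.setdefault(person, {})[comment] = None
    return {person: separator.join(comments) for person, comments in seen.items()}


summary = concat_distinct(rows)
print(summary["John"])
```

The important part is the order of operations: the duplicates are dropped before the text is joined, which is what pointing ConcatRelated at the DISTINCT query achieves.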
I have a job in Talend that is designed to bring together some data from different databases: one is a MySQL database and the other an MSSQL database.
What I want to do is match a selection of loan numbers from the MySQL database (about 82,000 loan numbers) to the corresponding information we have housed in the MSSQL database.
However, the tables in MSSQL to which I am joining the data from MySQL are much larger (~ 2 million rows), are quite wide, and thus cost much more time to query. Ideally I could perform an inner join between the two tables based on the loan number, but since they are in different databases this is not possible. The inner join that is performed inside a tMap occurs after the Lookup input has already returned its data set, which is quite large (especially since this particular MSSQL query will execute a user-defined function for each loan number).
Is there any way to create a global variable out of the output from the MySQL query (namely, the loan numbers selected by the MySQL query) and use that global variable as an IN clause in the MSSQL query?
This should be possible. I'm not working in MySQL but I have something roughly equivalent here that I think you should be able to adapt to your needs.
I've never actually answered a Stackoverflow question and while I was typing this the page started telling me I need at least 10 reputation to post more than 2 pictures/links here and I think I need 4 pics, so I'm just going to write it out in words here and post the whole thing complete with illustrations on my blog in case you need more info (quite likely, I should think!)
As you can see, I've got some data coming out of the table and getting filtered by tFilterRow_1 to only show the rows I'm interested in.
The next step is to limit it to just the field I want to use in the variable. I've used tMap_3 rather than a tFilterColumns because the field I'm using is a string and I wanted to be able to concatenate single quotes around it, but if you're using an integer you might not need to do that. And of course, if you have a lot of repeated values, you might also want to add a tUniqRow so the duplicates are not passed along.
The next step is the one that does the magic. I've got a list like this:
'A1'
'A2'
'B1'
'B2'
etc, and I want to turn it into 'A1','A2','B1','B2' so I can slot it into my where clause. For this, I've used tAggregateRow_1, selecting "list" as the aggregate function to use.
Next up, we want to take this list and put it into a context variable (I've already created the context variable in the metadata - you know how to do that, right?). Use another tMap component, feeding into a tContextLoad component. tContextLoad always has two columns in its schema, so map the output of tAggregateRow_1 to the "value" column and enter the name of the variable in the "key" column. In this example, my context variable is called MyList.
Now your list is loaded as a text string and stored in the context variable, ready for retrieval. So open up a new input and embed the variable in the SQL code like this:
"SELECT distinct MY_COLUMN
from MY_SECOND_TABLE where the_selected_row in ("+
context.MyList+")"
It should be as easy as that, and when I whipped it up it worked first time, but let me know if you have any trouble and I'll see what I can do.
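In case it helps to see the string handling on its own, here is a rough Python sketch of what the tMap / tUniqRow / tAggregateRow steps above produce. It is not Talend code; build_in_list is a made-up helper, and doubling embedded single quotes is my own addition rather than something the job above does.

```python
def build_in_list(values):
    """Return a comma-separated list of single-quoted, de-duplicated values."""
    # Drop repeats but keep the original order (the tUniqRow step).
    unique = list(dict.fromkeys(values))
    # Wrap each value in single quotes (the tMap step); double any embedded
    # quote so the literal stays valid SQL.
    quoted = ["'" + value.replace("'", "''") + "'" for value in unique]
    # Join everything with commas (tAggregateRow with the "list" function).
    return ",".join(quoted)


my_list = build_in_list(["A1", "A2", "B1", "A1", "B2"])
query = (
    "SELECT DISTINCT MY_COLUMN FROM MY_SECOND_TABLE "
    "WHERE the_selected_row IN (" + my_list + ")"
)
print(query)
```

Since the values are pasted straight into the SQL text, this is only reasonable for values you control. With tens of thousands of loan numbers the statement gets very long, so it may be worth testing whether the target database accepts it or whether sending the list in batches works better.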
I am creating a report that displays a person's online application, so there are text boxes and multiple tables per person. I need each person to print on the same page, so a user can select either a particular application, or a date range, like all applications for today and yesterday. I currently have everything in a LIST object with a page break set on it, however the tables in the LIST would throw the "detail member with inner members" error. I found a way around that using the solution here: http://blogs.lessthandot.com/index.php/datamgmt/dbprogramming/reporting-services-error-the-tablix/ , which got rid of the error, but any multiple-row table now returns one row per page, so a person with 3 aliases will have a 3-page report. So, I am looking for a tutorial on how to keep everything on one page while still allowing my tables to return all their results on that page. Thanks.
I don't know if you are writing the query out or using a stored procedure. If you are using a stored procedure, it would be a lot easier to use in SSRS. To get the grouping correct, find a common identifier between the user and the aliases, and be sure to return that identifier from the stored procedure. Then use the table wizard to pull in all the information you want to see, and put the common identifier into the row group box. The wizard will group each person on that identifier and hopefully give you the results you are looking for.
I have a dataset that includes details of who is providing certain services. I've been asked to provide a report that splits out how many cases each provider is working on (and the assets involved). So far so good, I've got a report that is grouped by provider and sums the assets and counts the number of cases.
Now I've been asked to add in how many cases they are working as an advisor on. The cases where they are advisor will not be cases that they are working on, so I'm a little stumped on how to pull that information into the table.
All I need to do is add something like:
=sum(iif(Fields!Advisor.value = Fields!Provider.value,1,0),"DataSet1")
but the issue is that I need the Fields!Advisor to be looking at the whole dataset, and the Fields!Provider to use the current provider within the group.
I tried using ReportItems!Textbox.value to refer to the cell that contains the name of the provider, as follows:
=sum(iif(Fields!Advisor.value = ReportItems!Textbox10.value,1,0),"DataSet1")
but it gives an error that I can't use an aggregate function over a ReportItem. I don't want to use the aggregate function over the report item - I want to treat it as a constant!
Any ideas?
I am a little new to building this but have come a long way.
I have built a db using Access 2007. I have a table that shows the employees info:
Lname
Fname
Status
HireDate
TermDate
(Status: they are either inactive (potential Hires), Active or Terminated)
I can run a query that will show me all the employees by hire date or run one to show term dates.
We would like to have a query that will give us a count of how many drivers are still there within a given month.
Say Joe Smith was hired on 01/01/2008 and was terminated on 05/15/2011. If I ran a report in 2011 on May 31st how would I need to build the query to show this employee as being there in the month of May?
I have used >=Date() and others. I could use between #05/01/2011# and #06/01/2011# in the criteria under TermDate, but if there is not a date there, nothing shows up. I have even dropped down a line and added "Null" and still nothing, or I get all the employees that are still there and the ones that were terminated before the dates. I'm not sure what I am doing wrong.
I'm unsure about the logic for the filter criteria on this one. I think your goal is to identify all drivers who were on staff during any part of May 2011. My best guess is you need at least 2 conditions to identify them.
HireDate prior to June 1, 2011
TermDate either Null or >= May 1, 2011
If those conditions are sufficient, the SQL could be fairly easy.
SELECT e.Lname, e.Fname
FROM employees_info AS e
WHERE
e.HireDate < #2011-6-1#
AND
(
e.TermDate Is Null
OR
e.TermDate >= #2011-5-1#
);
It sounds like you're building the query in Design View ... which is a good and helpful feature. However, it's difficult to describe how to build that query in Design View. So I suggest you create a new query, switch to SQL View and paste in that SQL text. Replace employees_info with your actual table name, and fix any field names I misspelled.
If that query runs without error, you can flip back and forth between Design and SQL view, make a change in one, and examine how it is represented in the other view.
The SQL doesn't have to be formatted the way I wrote it. I chose that way in hopes it would make the WHERE logic clear. And if you make changes to the query from Design View, Access will reformat the SQL as it sees fit. However, the formatting change should not break the query.
I used yyyy-m-d format for the literal date values. That format avoids any possible confusion over which parts represent day and month, such as whether #05-01-2011# is intended to represent May 1st or Jan 5th. However, when you alter the query, Access may change them to mm-dd-yyyy format. (Sometimes its "helpful" impulses are annoying.)
I'm puzzled about one point. It seems you have one record per employee. If that is so, and an employee can leave for any reason and be re-hired later, it would be difficult to capture the different employment terms in a single record. If you're facing that situation, you may need to revise table designs.
If I misinterpreted your data, please show us a brief data sample, and the output you want from the query based on that sample. Good luck with this.
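If it helps to check the WHERE logic outside Access, here is a small Python sketch of the same two conditions. The function name on_staff_during is made up for the example, and it assumes HireDate is always filled in; a missing TermDate is represented as None.

```python
from datetime import date


def on_staff_during(hire_date, term_date, period_start, period_end_exclusive):
    """True if the employment overlaps [period_start, period_end_exclusive)."""
    # Condition 1: hired before the period ends (HireDate < June 1).
    hired_before_period_ends = hire_date < period_end_exclusive
    # Condition 2: no TermDate, or terminated on/after the period start.
    still_there = term_date is None or term_date >= period_start
    return hired_before_period_ends and still_there


may_start, june_start = date(2011, 5, 1), date(2011, 6, 1)

# Joe Smith from the question: hired 01/01/2008, terminated 05/15/2011.
print(on_staff_during(date(2008, 1, 1), date(2011, 5, 15), may_start, june_start))
```

Using "before June 1" instead of "on or before May 31" means the test still behaves if the date fields ever carry a time portion.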
This is quite a strange problem, wasn't quite sure how to title it. The issue I have is some data rows in an SSIS task which need to be modified depending on other rows.
Name   Location   IsMultiple
Bob    England
Jim    Wales
John   Scotland
Jane   England
A simplified dataset, with some names, their locations, and a column 'IsMultiple' which needs to be updated to show which rows share locations. (Bob and Jane's rows would be flagged 'True' in the example above).
In my situation there is much more complex logic involved, so solutions using sql would not be suitable.
My initial thought was to use an asynchronous script task, take in all the data rows, parse them, and then output them all after the very last row has been input. The only way I could think of doing this was to call row creation in the PostExecute phase, which did not work.
Is there a better way to go about this?
A couple of options come to mind for SSIS solutions. With both options you would need the data sorted by location. If you can do this in your SQL source, that would be best. Otherwise, you have the Sort component.
With sorted data as your input you can use a Script component that compares the values of adjacent rows to see whether a location appears more than once.
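As a rough illustration of the adjacent-row comparison (in Python rather than the C# or VB you would write in a Script component), something like the following would do it. The function name and the tuple layout are made up for the example.

```python
def flag_multiples_sorted(rows):
    """rows: list of (name, location) tuples already sorted by location."""
    flagged = []
    for i, (name, location) in enumerate(rows):
        # A row shares its location if either neighbour has the same value.
        same_as_prev = i > 0 and rows[i - 1][1] == location
        same_as_next = i + 1 < len(rows) and rows[i + 1][1] == location
        flagged.append((name, location, same_as_prev or same_as_next))
    return flagged


data = [("Bob", "England"), ("Jim", "Wales"), ("John", "Scotland"), ("Jane", "England")]
rows = sorted(data, key=lambda row: row[1])  # stands in for the Sort component
result = flag_multiples_sorted(rows)
for row in result:
    print(row)
```

Note that the first row of each group can only be flagged once the next row has been seen, so in a real data flow the component would have to hold rows back before emitting them.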
Another option would be to split your data path into two by adding a Multicast component. The first path would be the main path that you currently have. In the second path, add an Aggregate transformation after the Multicast component. Edit the Aggregate, select Location with the Group by operation, and select (*) with the Count all operation. The output will be one row per location with its count.
After the Aggregate, add a Merge Join component and select your first and second data paths as inputs. Your join keys should be the Location column from each path. All the columns from path 1 should be outputs, plus the count from path 2.
In a Derived Column, set the IsMultiple column with an expression that says "if count is greater than 1 then true, else false".
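The Multicast / Aggregate / Merge Join / Derived Column chain boils down to "count per location, join the count back, compare to 1". A minimal Python sketch of that logic, with a made-up function name and the sample rows from the question:

```python
from collections import Counter


def flag_multiples_by_count(rows):
    """rows: list of (name, location) tuples in any order."""
    # Aggregate step: count rows per location.
    counts = Counter(location for _, location in rows)
    # Merge Join + Derived Column steps: look the count up for each row
    # and set the flag to True when it is greater than 1.
    return [(name, location, counts[location] > 1) for name, location in rows]


rows = [("Bob", "England"), ("Jim", "Wales"), ("John", "Scotland"), ("Jane", "England")]
flagged = flag_multiples_by_count(rows)
for row in flagged:
    print(row)
```

Unlike the adjacent-row approach, this version does not care about row order, but it does need to see every row before any flag can be set.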
If possible, I might recommend doing it with pure SQL in a SQL task on your control flow prior to your data flow. A simple UPDATE query where you GROUP BY location and do a HAVING COUNT for everything greater than 1 should be able to do this. But if this is a simplified version this may not be feasible.
If the data isn't available until after the data flow is done you could place the SQL task after your data flow on your control flow.