Is it possible to print both the data and count of the data in SQL query? - mysql

I have a huge SQL query with multiple sub-tables (sub-table output is not retained) that is generating a final output in a specific format (format is fixed as it is expected by another service). I want to also generate some counts of the sub-tables in the query. A simple solution would be to re-write and run the query again, and get the counts.
However, I want to avoid double executions/computations of the same things and don't want to store the sub-tables either. I want to be able to either append these counts to the data output (will need to modify the service to ignore these count values accordingly) OR be able to write these counts to another location.
I'm using the 'unload to S3' command (https://docs.aws.amazon.com/redshift/latest/dg/t_Unloading_tables.html), where all the results are stored in an S3 location.
Is it possible to achieve this? If so, how?
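One way to picture the "append the counts to the data output" option is a single statement that defines the sub-table once as a CTE and emits both the data rows and a tagged count row. The following is only a minimal sketch run against SQLite from Python, not against Redshift; the table `orders`, the CTE `open_orders`, and the `row_type` tag are hypothetical names, and whether a given engine evaluates the CTE only once is up to its planner and should be checked there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, status TEXT);  -- hypothetical source table
    INSERT INTO orders VALUES (1, 'open'), (2, 'closed'), (3, 'open');
""")

query = """
WITH open_orders AS (              -- hypothetical sub-table, defined once
    SELECT id FROM orders WHERE status = 'open'
)
SELECT 'data' AS row_type, id AS value FROM open_orders
UNION ALL
SELECT 'count_open_orders', COUNT(*) FROM open_orders
"""

# Each row carries a tag, so a consumer can tell data rows from count rows.
rows = conn.execute(query).fetchall()
for row_type, value in rows:
    print(row_type, value)
```

The consuming service would then need to skip (or separately handle) any row whose tag is not `data`.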

Related

How can I create a table that uses an equation to average data from another table?

I have a table that contains data from repeated experiments (for example, site A has one sample, and the lab processed the sample three times, obtaining slightly different values). I need to average these results in a separate table, but what I have read on the Microsoft support site is that a query that pulls data into another table with a calculated field is not possible in Access.
Can I query multiple data points from one table into a single calculated field in another table? Thank you.
UPDATE
I ended up doing a lot of manual adjustments of the file format to create a calculated field in the existing table that averages each site's data, so my problem is, for my current purposes, solved. However, I would still like to understand. Following up with you both, I think the problem was that I had repeated non-unique IDs between rows when I probably should have made data columns with unique variable names so that I could query each variable name for an average.
So, instead of putting each site separately on the y axis, I formatted it by putting the sample number for each site on the x-axis:
I was able to at least create a calculated field using this second format in order to create an average value for each site.
Would there have been a way to write a query using the first method? Luckily, my data set was not very large, so I could handle a reformat manually, but with thousands of data entries I couldn't have done that.
Also, here is the link to the site I mentioned originally https://support.office.com/en-ie/article/add-a-calculated-field-to-a-table-14a60733-2580-48c2-b402-6de54fafbde3.
Thanks all.
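On the question of whether the first layout (one row per measurement, with the site ID repeated) can be queried directly: that shape is what an aggregate query with GROUP BY works on. The sketch below runs the idea against SQLite from Python rather than Access; the table `results` and its columns `site`, `sample_no`, and `value` are hypothetical, and the same SELECT ... AVG(...) ... GROUP BY form would need to be adapted to the real table in Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical long-format table: the site ID repeats once per lab run.
    CREATE TABLE results (site TEXT, sample_no INTEGER, value REAL);
    INSERT INTO results VALUES
        ('A', 1, 10.0), ('A', 2, 11.0), ('A', 3, 12.0),
        ('B', 1, 20.0), ('B', 2, 22.0);
""")

# One output row per site, averaging all of that site's repeated measurements.
averages = conn.execute(
    "SELECT site, AVG(value) AS avg_value "
    "FROM results GROUP BY site ORDER BY site"
).fetchall()

for site, avg_value in averages:
    print(site, avg_value)
```

Because the averaging happens in the query, the repeated IDs do not need to be reshaped into separate columns first.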

what is the best approach to find duplicates in my Db table

In my app the user can select multiple filter options. I store this in a DB table.
For example
User 1 can select filters A^B
User 2 can select filters AORC^D
and so forth.
The way it is stored in Db is
user filter_selected
user1 A^B
user2 AORC^D
Now the requirement is that no two users can have the same filters selected. So if user 3 comes and selects A^B or B^A, it should throw an error.
I am trying to come up with some smart logic to validate this in JavaScript.
One approach is to go through all the users in the DB (there can be many), sort each filter alphabetically, and check whether the results are the same. In our example, A^B and B^A both become AB^, so I can compare them that way. Is there a better approach, maybe using a MySQL command itself?
You can sort the characters of each filter rule and store the sorted form in the DB.
For example, B^A becomes AB^; to check a new filter, sort it the same way and search for the result.
If you need to keep the original filter and care more about speed than about database size, store the original in a second column as well. If you care about size, store only the original filter; when searching, select the rows whose filter has the same length as yours and sort each of those alphabetically before comparing. Another option is to store the original index of each character next to the sorted form, for example saving AB^|021 when A^B becomes AB^, but this takes extra space just like a second column, and I don't recommend it.
If your filters are always short, you also don't need to fetch every record and compare against all of them. You can generate every ordering of the new filter (for example AB^ A^B B^A BA^ ^AB ^BA) and search for those, but be careful: this creates n! strings, so it is only acceptable for very short filters. In that case it can work well when the table has a very large number of records.
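The sort-then-compare idea from this answer can be sketched as follows. This is only an illustration in Python; the function names `canonical_filter` and `is_duplicate` and the in-memory `existing` dict are hypothetical stand-ins for the real table.

```python
def canonical_filter(filter_text: str) -> str:
    """Return the filter with its characters sorted, e.g. 'B^A' -> 'AB^'."""
    return "".join(sorted(filter_text))


# Hypothetical stand-in for the rows already stored in the DB.
existing = {"user1": "A^B", "user2": "AORC^D"}

# Sorted forms of every stored filter, for constant-time membership checks.
taken = {canonical_filter(f) for f in existing.values()}


def is_duplicate(new_filter: str) -> bool:
    """True if some user already has the same characters in any order."""
    return canonical_filter(new_filter) in taken


print(is_duplicate("B^A"))  # same characters as user1's A^B
print(is_duplicate("A^C"))  # not stored yet
```

If the sorted form is kept in its own column, a unique index on that column would let the database itself reject a duplicate insert. Note that sorting characters treats every rearrangement as equal, including rearrangements that might change what the filter means.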

MS-Access Form for multiple queries

Ok here is my dilemma and I am sure more experienced Access users have an easy solution. I have been manually running a set of queries to pull data for business users, and I want to create a form so they can just run it themselves. This would be easy if it were one query, but to pull the correct data I first have to run one query that creates a table. I then use that 'new' table in a secondary query to get the data I need.
That first make-table query needs to take inputs like supplier number and date range. I then use that output table in another query that sums up the total dollar value of purchase orders. I cannot include these two steps in one query, as it creates an ambiguous outer join.
Any ideas on how I would go about creating a form for something like this?
Well after re-thinking this I think I was OVER thinking it. Instead of creating a table in the first step, I can just save it as a query and join that query to the second query. Then make a form off the two queries.
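The change described here, feeding the second query from the first query instead of from a temporary table, can be sketched like this. It uses SQLite from Python rather than Access, and the table `purchase_orders`, its columns, and the function `total_po_value` are hypothetical; in Access the inner SELECT would be the saved query and the parameters would come from the form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical purchase order table.
    CREATE TABLE purchase_orders (
        po_id INTEGER, supplier_no INTEGER, po_date TEXT, amount REAL
    );
    INSERT INTO purchase_orders VALUES
        (1, 100, '2020-01-15', 250.0),
        (2, 100, '2020-02-10', 100.0),
        (3, 100, '2021-03-01', 999.0),
        (4, 200, '2020-01-20', 75.0);
""")


def total_po_value(supplier_no, start_date, end_date):
    """Sum PO amounts for one supplier within a date range, in one statement."""
    sql = """
        SELECT SUM(filtered.amount)
        FROM (
            -- Step 1: what used to be the make-table query.
            SELECT amount FROM purchase_orders
            WHERE supplier_no = ? AND po_date BETWEEN ? AND ?
        ) AS filtered
    """
    # Step 2 (the SUM) reads step 1 directly; no intermediate table is created.
    return conn.execute(sql, (supplier_no, start_date, end_date)).fetchone()[0]


print(total_po_value(100, "2020-01-01", "2020-12-31"))
```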

Creating a global variable in Talend to use as a filter in another component

I have a job in Talend that is designed to bring together some data from different databases: one is a MySQL database and the other an MSSQL database.
What I want to do is match a selection of loan numbers from the MySQL database (about 82,000 loan numbers) to the corresponding information we have housed in the MSSQL database.
However, the tables in MSSQL to which I am joining the data from MySQL are much larger (~ 2 million rows), are quite wide, and thus cost much more time to query. Ideally I could perform an inner join between the two tables based on the loan number, but since they are in different databases this is not possible. The inner join that is performed inside a tMap occurs after the Lookup input has already returned its data set, which is quite large (especially since this particular MSSQL query will execute a user-defined function for each loan number).
Is there any way to create a global variable out of the output from the MySQL query (namely, the loan numbers selected by the MySQL query) and use that global variable as an IN clause in the MSSQL query?
This should be possible. I'm not working in MySQL but I have something roughly equivalent here that I think you should be able to adapt to your needs.
I've never actually answered a Stackoverflow question and while I was typing this the page started telling me I need at least 10 reputation to post more than 2 pictures/links here and I think I need 4 pics, so I'm just going to write it out in words here and post the whole thing complete with illustrations on my blog in case you need more info (quite likely, I should think!)
As you can see, I've got some data coming out of the table and getting filtered by tFilterRow_1 to only show the rows I'm interested in.
The next step is to limit it to just the field I want to use in the variable. I've used tMap_3 rather than a tFilterColumns because the field I'm using is a string and I wanted to be able to concatenate single quotes around it, but if you're using an integer you might not need to do that. And of course if you have a lot of repetition you might also want to get a tUniqRow in there as well to cut out the duplicates.
The next step is the one that does the magic. I've got a list like this:
'A1'
'A2'
'B1'
'B2'
etc, and I want to turn it into 'A1','A2','B1','B2' so I can slot it into my where clause. For this, I've used tAggregateRow_1, selecting "list" as the aggregate function to use.
Next up, we want to take this list and put it into a context variable (I've already created the context variable in the metadata - you know how to do that, right?). Use another tMap component, feeding into a tContextLoad component. tContextLoad always has two columns in its schema, so map the output of the tAggregateRow to the "value" column and enter the name of the variable in the "key". In this example, my context variable is called MyList.
Now your list is loaded as a text string and stored in the context variable ready for retrieval. So open up a new input and embed the variable in the SQL code like this:
"SELECT distinct MY_COLUMN
from MY_SECOND_TABLE where the_selected_row in ("+
context.MyList+")"
It should be as easy as that, and when I whipped it up it worked first time, but let me know if you have any trouble and I'll see what I can do.
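For readers who want to see the string handling outside of Talend, here is a rough Python sketch of what the quote, de-duplicate, and "list" aggregation steps produce, and how the result is spliced into the query. The helper `build_in_list` is hypothetical; it is not a Talend API, just an illustration of the text transformation.

```python
def build_in_list(values):
    """Quote each value, drop repeats, and join with commas for an IN clause."""
    unique = list(dict.fromkeys(values))  # drop repeats, keep first-seen order
    # Double any embedded single quote so the SQL literal stays well-formed.
    return ",".join("'" + v.replace("'", "''") + "'" for v in unique)


# Stand-in for the rows coming out of the first query.
my_list = build_in_list(["A1", "A2", "B1", "A1", "B2"])

# Same shape as the query in the answer, with the list spliced in.
query = (
    "SELECT distinct MY_COLUMN from MY_SECOND_TABLE "
    "where the_selected_row in (" + my_list + ")"
)
print(query)
```

Since the list is concatenated straight into the SQL text, this only makes sense when the values come from a trusted source such as your own database.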

Define same output table across multiple transformations

I have 6 different input datasets. I want to run ETL over all 6 datasets so they all get transformed to the same output table (same columns and types).
I am using Pentaho (Spoon) to do this.
Is there a way I can define an output table schema to be used by all these transformations in Pentaho? I am using MySQL as my output database.
Thanks in advance.
Sounds like you need the Select Values step. Put one of those on the last hop of each dataset's path and make the metadata for the paths all look EXACTLY the same. Then you can connect the output from each Select Values step into a Table Output. All the rows from each set will be mixed together in no particular order.
This can be more challenging than it looks. Spoon will throw errors if any of the fields aren't exactly identical to the corresponding field from all the other datasets. You'll have to find some way to get all the metadata from the datasets to be the same.
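To make "the metadata all look EXACTLY the same" concrete, here is a small Python sketch of what each path has to achieve before the rows are merged: the same field names, in the same order, with the same types. It is not Pentaho code; `TARGET_SCHEMA`, `conform`, and the sample field names are hypothetical.

```python
# Hypothetical target layout shared by all inputs: (field name, type) in order.
TARGET_SCHEMA = [("id", int), ("amount", float), ("label", str)]


def conform(row, rename):
    """Rename, reorder, and cast one input row to match TARGET_SCHEMA."""
    renamed = {rename.get(key, key): value for key, value in row.items()}
    # Only target fields are kept; a missing field raises KeyError on purpose,
    # so a mismatched input fails loudly instead of loading bad rows.
    return {name: cast(renamed[name]) for name, cast in TARGET_SCHEMA}


# Two inputs with different field names and string-typed values.
row_a = conform({"ID": "7", "amt": "3.5", "label": "x"}, {"ID": "id", "amt": "amount"})
row_b = conform({"id": 8, "amount": 2, "label": "y", "extra": 1}, {})
print(row_a)
print(row_b)
```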