R MySQL For Loop Error - mysql

I have a set of MySQL tables with the following names. They are accessed through the connection reportconn:
TableName
A1_1
A1_2
A1_3
A1_4
A1_5
A1_6
A1_7
A1_8
Each of these tables holds the data for one email campaign blast. There are 8 such tables, one for each email campaign.
The structure of each table is as follows:
C1 C2 C3
Y X Z
Y2 X2 Z2
I want the count of unique C2 values for each table (A1_1, A1_2, A1_3, etc.).
I have used the following code:
C2count <- list()
for (I in Tablenames) {
  sql2 <- paste("select count(DISTINCT BINARY C2) from ", TableName) ## SQL query
  C2count <- rbind(C2count, dbGetQuery(reportconn, sql2))
}
I am getting just a single list of values. Please help me.

Your sql2 pastes in TableName instead of I. I is the loop variable: it takes each name in Tablenames in turn, so it is the value that changes on every iteration. Hope this helps.
C2count <- list()
for (I in Tablenames) {
  sql2 <- paste("select count(DISTINCT BINARY C2) from ", I) ## SQL query
  C2count <- rbind(C2count, dbGetQuery(reportconn, sql2))
}
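The same loop pattern can be sketched in Python for readers who want to try it without a MySQL server. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for reportconn, the table contents are made up, and MySQL's BINARY keyword is dropped because SQLite already compares text case-sensitively by default.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL connection (assumption).
conn = sqlite3.connect(":memory:")
table_names = ["A1_1", "A1_2", "A1_3"]  # hypothetical subset of the tables

# Made-up sample rows with columns C1, C2, C3.
sample_rows = {
    "A1_1": [("Y", "X", "Z"), ("Y2", "X2", "Z2")],
    "A1_2": [("Y", "X", "Z"), ("Y2", "X", "Z2"), ("Y3", "X", "Z3")],
    "A1_3": [("Y", "a", "Z"), ("Y2", "b", "Z2"), ("Y3", "c", "Z3")],
}
for name in table_names:
    conn.execute(f'CREATE TABLE "{name}" (C1 TEXT, C2 TEXT, C3 TEXT)')
    conn.executemany(f'INSERT INTO "{name}" VALUES (?, ?, ?)', sample_rows[name])

# The loop variable (name), not the whole list, goes into each query.
c2_counts = {}
for name in table_names:
    sql = f'SELECT COUNT(DISTINCT C2) FROM "{name}"'
    c2_counts[name] = conn.execute(sql).fetchone()[0]

print(c2_counts)
```

Each pass builds one query for one table and stores one count, which is the behaviour the corrected R loop is aiming for.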

Related

R data.frame to SQL - preserving ordered factors

I am just starting to use MySQL to handle data that is currently in R dataframe objects. I was hoping for a simple round-trip to and from SQL that would recreate an R dataframe exactly:
library("compare",pos=2)
library("RMySQL",pos=2)
conR <- dbConnect(MySQL(),
user = '...',
password = '...',
host = '...',
dbname='r2014')
a3 <- data.frame(x=5:1,y=letters[1:5],z=ordered(c("NEVER","ALWAYS","NEVER","SOMETIMES","NEVER"),levels=c("NEVER","SOMETIMES","ALWAYS")))
a3
dbWriteTable(conn = conR, name = 'a3', value = a3)
a4 <- dbReadTable(conn = conR, name = 'a3')
compare(a3,a4)$detailedResult
a3$z
a4$z
The result shows that factors come back as strings (columns y and z), and that the level ordering of ordered factors is lost (column z):
> a3
x y z
1 5 a NEVER
2 4 b ALWAYS
3 3 c NEVER
4 2 d SOMETIMES
5 1 e NEVER
> compare(a3,a4)$detailedResult
x y z
TRUE FALSE FALSE
> a3$z
[1] NEVER ALWAYS NEVER SOMETIMES NEVER
Levels: NEVER < SOMETIMES < ALWAYS
> a4$z
[1] "NEVER" "ALWAYS" "NEVER" "SOMETIMES" "NEVER"
> a3$y
[1] a b c d e
Levels: a b c d e
> a4$y
[1] "a" "b" "c" "d" "e"
Is there some way to specify the information in the ordered factors in the creation of the table a3 in the database?
I would change the code to:
dbWriteTable(conn = conR, name = 'a3', value = a3, row.names=TRUE)
a4 <- dbReadTable(conn = conR, name = 'a3', row.names=TRUE)
The row.names of a data.frame are ordered by default, and they keep that order when stored in an SQL column. A SELECT query can use ORDER BY row_names to fetch the rows in that order.
The row.names argument of dbReadTable() can be set to NA in case the SQL table does not contain the row_names column.[2]
[1] REF: DBI::dbWriteTable
The interpretation of rownames depends on the ‘row.names’
argument, see ‘sqlRownamesToColumn()’ for details:
• If ‘FALSE’ or ‘NULL’, row names are ignored.
• If ‘TRUE’, row names are converted to a column named
"row_names", even if the input data frame only has natural
row names from 1 to ‘nrow(...)’.
• If ‘NA’, a column named "row_names" is created if the data
has custom row names, no extra column is created in the case
of natural row names.
• If a string, this specifies the name of the column in the
remote table that contains the row names, even if the input
data frame only has natural row names.
[2] REF: DBI::dbReadTable
The presence of rownames depends on the ‘row.names’ argument, see
‘sqlColumnToRownames()’ for details:
• If ‘FALSE’ or ‘NULL’, the returned data frame doesn't have
row names.
• If ‘TRUE’, a column named "row_names" is converted to row
names.
• If ‘NA’, a column named "row_names" is converted to row names
if it exists, otherwise no translation occurs.
• If a string, this specifies the name of the column in the
remote table that contains the row names.

concatenate DISTINCT string values in pentaho data integration

I am new to Pentaho Data Integration. How can I concatenate distinct string values?
bse_id values
100 A1
100 A1
100 A2
150 A1
150 B1
150 C1
150 C1
The output should be
bse_id values
100 A1,A2
150 A1,B1,C1
In MySQL, I can use
select bse_id,group_concat(distinct values) from table group by 1;
In Spoon, I have tried the Group by step and the Memory Group by step;
both produce duplicate values.
I'm getting this output:
bse_id values
100 A1,A1,A2
150 A1,B1,C1,C1
Please help me in removing the duplicates.
You can do this easily with a Group by step. Be sure the input to the step is sorted on the bse_id field, then select values as the subject of an aggregate field and set the type to 'Concatenate strings separated by ,'. That should give you exactly what you want.
You need two Group by steps.
Try the following three steps after the input:
Step 1: Sort by BOTH 'bse_id' and 'values'
Step 2: Group by BOTH 'bse_id' and 'values' (no aggregation here; this removes the duplicates)
Step 3: Group by 'bse_id'; aggregate 'values' with Type "Concatenate strings separated by ,"
Output is:
bse_id; values
100; A1, A2
150; A1, B1, C1
This should work fine.
Bye
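The sort, de-duplicate, then concatenate sequence described above can be sketched outside Spoon to see why it works. This is a minimal Python illustration using the sample rows from the question; the function name concat_distinct is hypothetical and is not part of Pentaho.

```python
from itertools import groupby

# Sample rows (bse_id, values) from the question.
rows = [
    (100, "A1"), (100, "A1"), (100, "A2"),
    (150, "A1"), (150, "B1"), (150, "C1"), (150, "C1"),
]

def concat_distinct(rows, sep=","):
    # Step 1: sort on both fields.
    ordered = sorted(rows)
    # Step 2: group on both fields with no aggregation, which drops duplicates.
    unique_pairs = [pair for pair, _ in groupby(ordered)]
    # Step 3: group on bse_id only and concatenate the remaining values.
    return {
        bse_id: sep.join(value for _, value in group)
        for bse_id, group in groupby(unique_pairs, key=lambda pair: pair[0])
    }

result = concat_distinct(rows)
print(result)
```

The middle step is what a single Group by step lacks: without it, the concatenation sees every input row, duplicates included.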

Access query where one field is LIKE another

I'm trying to query on one field in one table where it is LIKE a field in another table, but I am having trouble getting valid results.
I want to find the Pager_ID in tbl_Emergin_Current_Device_Listing_20121126 where it is like the Pager_ID in tbl_AMCOM_PROD.
Some relevant information:
Pager_ID in tbl_Emergin_Current_Device_Listing_20121126 is at most 10 characters long and is always numeric (example of a 10-character Pager_ID: 3145551212).
However, Pager_ID in tbl_AMCOM_PROD can be alphanumeric (3145551212#att.txt.com, which would be the same user).
All data is stored as text.
I want to be able to find "3145551212#att.txt.com" in tbl_Amcom_Prod.Pager_ID when "3145551212" is present in tbl_Emergin_Current_Device_Listing_20121126.Pager_ID. However, with the code below I'm only finding exact matches (EQUAL instead of LIKE).
current code:
SELECT DISTINCT tbl_emergin_current_device_listing_20121126.userrecno,
tbl_emergin_current_device_listing_20121126.username,
tbl_emergin_current_device_listing_20121126.department,
tbl_emergin_current_device_listing_20121126.carriername,
tbl_emergin_current_device_listing_20121126.protocol,
tbl_emergin_current_device_listing_20121126.pin,
tbl_emergin_current_device_listing_20121126.pager_id,
Iif([tbl_amcom_group_call_leads_and_id].[amcom listing msg id],
[tbl_amcom_group_call_leads_and_id].[amcom msg group id],
[tbl_amcom_prod].[messaging_id])
AS [Amcom Messaging or Message Group ID]
FROM ((tbl_emergin_current_device_listing_20121126
LEFT JOIN tbl_amcom_prod
ON tbl_emergin_current_device_listing_20121126.pager_id =
tbl_amcom_prod.pager_id)
LEFT JOIN tbl_amcom_group_call_leads_and_id
ON tbl_emergin_current_device_listing_20121126.pager_id =
tbl_amcom_group_call_leads_and_id.[ams group call lead])
LEFT JOIN tbl_deactivated_pager_list
ON tbl_emergin_current_device_listing_20121126.pager_id =
tbl_deactivated_pager_list.[pager number];
Sample Results:
UserRecNo UserName Department CarrierName Protocol PIN PAGER_ID Amcom Messaging or Message Group ID
43 Brown, Lewis BJH Verizon 0 3145550785 3145550785 3145550785
52 Wyman, Mel BJH Airtouch (Verizon) (SNPP) 3 3145558597 3145558597 3145558330
I'd also like to see this record, but the current code does not return it:
57 Johnson, Mick BJH AT&T 3 3145551234 3145551234#att.txt.com 3145559876
What change should I be making?
Thanks in advance!
Something like:
SELECT a.Pager_ID
FROM tbl_Amcom_Prod a
LEFT JOIN [tbl_Emergin_Current_Device_Listing_20121126] b
ON a.Pager_ID LIKE b.Pager_ID & "*"
This will only work in SQL view, not design view.
You could also use a mixture of InStr and Mid.
SELECT IIf(InStr([Pager_ID] & "",".")>0,
Mid([Pager_ID],1,InStr([Pager_ID],".")-1),[Pager_ID]) AS PID
FROM [tbl_Amcom_Prod]
WHERE IIf(InStr([Pager_ID] & "",".")>0,
Mid([Pager_ID],1,InStr([Pager_ID],".")-1),[Pager_ID])
In (SELECT Pager_ID
FROM [tbl_Emergin_Current_Device_Listing_20121126])
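A join on LIKE with a trailing wildcard amounts to a prefix match: the short numeric ID must be the start of the longer ID. The sketch below shows that matching rule in Python on the sample IDs from the question. It is not Access code, the helper name prefix_matches is hypothetical, and the ID 3145559999 is made up to show a non-match.

```python
# Short, numeric IDs as in tbl_Emergin_Current_Device_Listing_20121126.
emergin_ids = ["3145550785", "3145558597", "3145551234"]
# IDs as in tbl_Amcom_Prod; the last one is invented and matches nothing.
amcom_ids = ["3145550785", "3145558597", "3145551234#att.txt.com", "3145559999"]

def prefix_matches(short_ids, long_ids):
    """Pair each short ID with every long ID that starts with it."""
    return [
        (short, long_id)
        for short in short_ids
        for long_id in long_ids
        if long_id.startswith(short)
    ]

matches = prefix_matches(emergin_ids, amcom_ids)
for short, long_id in matches:
    print(short, "->", long_id)
```

An exact match is also a prefix match, so the rows the original equality join found are still returned.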

Concatenating multiple rows into single line in MS Access [duplicate]

Currently I have a table structure that is somewhat like this:
Name --- Cat --- Desc --- Thresh --- Perc --- Err --- BP
Bob -------C1-------Inf--------7Per--------0.05------0-----ADC2
Bob -------C1-------Inf--------7Per--------0.05------2-----BAC2
Bob -------C1-------Inf--------7Per--------0.05------0-----RBE2
Bob -------C1-------Inf--------7Per--------0.05------8-----VBE2
Bob -------C1-------Inf--------7Per--------0.05------6-----AEC2
Bob -------C1-------Inf--------7Per--------0.05------0-----PBC2
Bob -------C2-------Com------8Per--------0.45------1-----XBC4
Bob -------C2-------Com------8Per--------0.45------0-----AEC2
Bob -------C2-------Com------8Per--------0.45------0-----PBC2
Bob -------C2-------Com------8Per--------0.45------3-----ADC2
Bob -------C2-------Com------8Per--------0.45------0-----ADC2
Bob -------C2-------Com------8Per--------0.45------0-----BAC2
Joe--------C1-------Inf---------7Per--------0.05------0-----PBC2
Joe--------C1-------Inf---------7Per--------0.05------0-----ZTM2
Joe--------C1-------Inf---------7Per--------0.05------2-----QYC2
Joe--------C1-------Inf---------7Per--------0.05------0-----FLC2
Joe--------C1-------Inf---------7Per--------0.05------1-----KSC2
Joe--------C1-------Inf---------7Per--------0.05------0-----JYC2
What I'm looking to do is have one line per "Name" and "Cat" that sums all the "Err" values (per "Name" and "Cat") and concatenates only the "BP" fields into a single line, such as:
Name --- Cat --- Desc --- Thresh --- Perc --- Err --- BP
Bob -------C1-------Inf--------7Per--------0.05-----16-----BAC2, VBE2, AEC2
Bob -------C2------Com------8Per--------0.45------4------XBC4, ADC2
Joe--------C1-------Inf--------7Per--------0.05------3------QYC2, KSC2
Similar questions have been asked, but I cannot seem to apply the answers, as I am a beginner at VBA scripting. Is there any way to do all of this via SQL? If VBA scripting is the only option (i.e. creating a function), any help would be greatly appreciated. Thank you in advance.
Question part 2:
I created the function as per Allen Browne's guide. The module is saved as modConcatRelated. Now I've tried to run this query (I'm not sure if this is the correct SQL to get the result that I'm looking for):
SELECT
[Name],
[Cat],
[Desc],
[Thresh],
[Perc],
sum([Err]),
ConcatRelated("[BP]", "make_table_bp", "[Err] = " & [BP])
FROM make_table_bp
GROUP BY
[Name],
[Cat],
[Desc],
[Thresh],
[Perc],
[Err],
[BP];
It said "Error 3061. Too few parameters. Expected 1." Also it said "Undefined Function ConcatRelated." I'm looking for guidance on how to create the correct SQL statement so that I can call the ConcatRelated function correctly and yield the result as depicted above. Thanks again.
Next question:
What if the table had a unique date field added as its last column? Something like this:
Name --- Cat --- Desc --- Thresh --- Perc --- Err --- BP --- Date
Bob -------C1-------Inf--------7Per--------0.05------0-----ADC2--12/02/2011
Bob -------C1-------Inf--------7Per--------0.05------2-----BAC2--09/05/2011
Bob -------C1-------Inf--------7Per--------0.05------0-----RBE2--11/02/2011
Bob -------C1-------Inf--------7Per--------0.05------8-----VBE2--08/14/2012
Bob -------C1-------Inf--------7Per--------0.05------6-----AEC2--02/25/2009
Bob -------C1-------Inf--------7Per--------0.05------0-----PBC2--07/02/2011
Bob -------C2-------Com------8Per--------0.45------1-----XBC4--09/05/2011
Bob -------C2-------Com------8Per--------0.45------0-----AEC2--02/02/2010
Bob -------C2-------Com------8Per--------0.45------0-----PBC2--08/14/2012
Bob -------C2-------Com------8Per--------0.45------3-----ADC2--05/05/2001
Bob -------C2-------Com------8Per--------0.45------0-----ADC2--08/02/2010
Bob -------C2-------Com------8Per--------0.45------0-----BAC2--06/17/2010
Joe--------C1-------Inf---------7Per--------0.05------0-----PBC2--08/14/2012
Joe--------C1-------Inf---------7Per--------0.05------0-----ZTM2--09/05/2011
Joe--------C1-------Inf---------7Per--------0.05------2-----QYC2--05/17/2010
Joe--------C1-------Inf---------7Per--------0.05------0-----FLC2--3/19/2010
Joe--------C1-------Inf---------7Per--------0.05------1-----KSC2--09/05/2011
Joe--------C1-------Inf---------7Per--------0.05------0-----JYC2--08/14/2012
Let's say I wanted to build a query to say something like: show me all records still within this same format:
Name --- Cat --- Desc --- Thresh --- Perc --- Err --- BP
Bob -------C1-------Inf--------7Per--------0.05-----16-----BAC2, VBE2, AEC2
Bob -------C2------Com------8Per--------0.45------4------XBC4, ADC2
Joe--------C1-------Inf--------7Per--------0.05------3------QYC2, KSC2
But for a date range of 01/01/2009 to 09/30/2011.
@HansUp, could you help with this?
I used a subquery for the GROUP BY which computes the Sum of Err for each group. Then I added the ConcatRelated function (from Allen Browne) with the fields returned by the subquery. This is the query and the output (based on your sample data in make_table_bp) from the query:
SELECT
sub.[Name],
sub.Cat,
sub.[Desc],
sub.Thresh,
sub.Perc,
sub.SumOfErr,
ConcatRelated("BP",
"make_table_bp",
"[Err] > 0 AND [Name] = '" & sub.[Name]
& "' AND Cat = '"
& sub.Cat & "'",
"BP")
AS concat_BP
FROM
(SELECT
q.[Name],
q.Cat,
q.[Desc],
q.Thresh,
q.Perc,
Sum(q.[Err]) AS SumOfErr
FROM make_table_bp AS q
GROUP BY
q.[Name],
q.Cat,
q.[Desc],
q.Thresh,
q.Perc
) AS sub
ORDER BY
sub.[Name],
sub.Cat;
The query outputs this result set:
Name Cat Desc Thresh Perc SumOfErr concat_BP
Bob C1 Inf 7Per 0.05 16 AEC2, BAC2, VBE2
Bob C2 Com 8Per 0.45 4 ADC2, XBC4
Joe C1 Inf 7Per 0.05 3 KSC2, QYC2
Notice I enclosed Name, Desc, and Err with square brackets every place they were referenced in the query. All are reserved words (see Problem names and reserved words in Access). Choose different names for those fields if possible. If not, use the square brackets to avoid confusing the db engine.
But this will not work unless/until your copy of the ConcatRelated function is recognized by your database engine. I don't understand why it's not; I followed the same steps you listed for storing the function code, and this works fine on my system.
Edit: I tested that query with my version of the table, which has [Err] as a numeric data type. Sounds like yours is text instead. In that case, I'll suggest you change yours to numeric, too. I don't see the benefit of storing numerical values as text instead of actual numbers.
However if you're stuck with [Err] as text, you can adapt the query to deal with it. Change this ...
"[Err] > 0 AND [Name] = '" & sub.[Name]
to this ...
"Val([Err]) > 0 AND [Name] = '" & sub.[Name]
That change prevented the "Data type mismatch in criteria expression" error when I tested with [Err] as text data type. However, I also changed this ...
Sum(q.[Err]) AS SumOfErr
to this ...
Sum(Val(q.[Err])) AS SumOfErr
AFAICT that second change is not strictly necessary. The db engine seems willing to accept numbers as text when you ask it to Sum() them. However I prefer to explicitly transform them to numerical values rather than depend on the db engine to make the right guess on my behalf. The db engine has enough other stuff to deal with, so I try to tell it exactly what I want.
Edit2: If you want only unique values concatenated, you can modify the ConcatRelated() function. Find this section of the code ...
'Build SQL string, and get the records.
strSql = "SELECT " & strField & " FROM " & strTable
and change it to this ...
'Build SQL string, and get the records.
strSql = "SELECT DISTINCT " & strField & " FROM " & strTable
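For readers who want to check the expected result independently of Access, the grouping logic can be sketched in Python: sum Err per Name and Cat, and concatenate the distinct BP values whose Err is greater than 0. The data is the sample table from the question with the constant columns left out; the function name summarize is hypothetical and has nothing to do with ConcatRelated itself.

```python
from collections import defaultdict

# (Name, Cat, Err, BP) rows from the sample table in the question.
rows = [
    ("Bob", "C1", 0, "ADC2"), ("Bob", "C1", 2, "BAC2"), ("Bob", "C1", 0, "RBE2"),
    ("Bob", "C1", 8, "VBE2"), ("Bob", "C1", 6, "AEC2"), ("Bob", "C1", 0, "PBC2"),
    ("Bob", "C2", 1, "XBC4"), ("Bob", "C2", 0, "AEC2"), ("Bob", "C2", 0, "PBC2"),
    ("Bob", "C2", 3, "ADC2"), ("Bob", "C2", 0, "ADC2"), ("Bob", "C2", 0, "BAC2"),
    ("Joe", "C1", 0, "PBC2"), ("Joe", "C1", 0, "ZTM2"), ("Joe", "C1", 2, "QYC2"),
    ("Joe", "C1", 0, "FLC2"), ("Joe", "C1", 1, "KSC2"), ("Joe", "C1", 0, "JYC2"),
]

def summarize(rows):
    totals = defaultdict(int)  # sum of Err per (Name, Cat)
    bps = defaultdict(set)     # distinct BP values with Err > 0 per (Name, Cat)
    for name, cat, err, bp in rows:
        key = (name, cat)
        totals[key] += err
        if err > 0:
            bps[key].add(bp)
    # One output row per group, ordered by Name then Cat, BPs sorted.
    return [
        (name, cat, totals[(name, cat)], ", ".join(sorted(bps[(name, cat)])))
        for name, cat in sorted(totals)
    ]

summary = summarize(rows)
for line in summary:
    print(line)
```

Using a set for the BP values plays the same role as adding DISTINCT inside the function's SELECT.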

Simple Excel function that splits a number entered into one cell randomly and evenly over 3 others

I am looking for a simple function that will take a number entered into a single cell, say 20, and divide it evenly and randomly over three other cells. None of the values can be 0.
e.g. A1 = 20
then
B1=6
C1=8
D1=6
Thanks!!
I don't have Excel in front of me, but something like this:
B1: =ROUND(RAND()*(A1-3)+1, 0)
C1: =ROUND(RAND()*(A1-B1-2)+1, 0)
D1: =A1-B1-C1
B1 is set to a number from 1 to A1-2
C1 is set to a number from 1 to A1-B1-1
D1 is set to what's left.
I think you would have to write a macro to expand the values into B1, C1, D1 automatically, but the way I would do it would be to put the following code into B1:
=RANDBETWEEN(1, (A1-2))
The following into C1:
=RANDBETWEEN(1, (A1-B1-1))
The following into D1:
=A1-B1-C1
If you don't have the RANDBETWEEN() Function available here is how to enable it:
From the 'Tools' menu, select 'Add-Ins'
Find the reference to 'Analysis ToolPak' and put a tick in the box
Select 'OK'
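The three RANDBETWEEN formulas above can be checked with a small Python sketch that follows the same arithmetic. It is an illustration only: the function name split_three is hypothetical, and random.randint is used because, like RANDBETWEEN, it includes both endpoints.

```python
import random

def split_three(total, rng=random):
    """Split total into three positive integers, mirroring the cell formulas."""
    if total < 3:
        raise ValueError("total must be at least 3 so every part can be >= 1")
    b = rng.randint(1, total - 2)      # like =RANDBETWEEN(1, A1-2)
    c = rng.randint(1, total - b - 1)  # like =RANDBETWEEN(1, A1-B1-1)
    d = total - b - c                  # like =A1-B1-C1
    return b, c, d

print(split_three(20))
```

The upper bounds guarantee that the last part is at least 1. Note that the first part is drawn from the widest range, so on average it comes out larger than the other two; the split is random but not perfectly even.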
Without a macro, I don't see any way to get around having some temporary values shown.
Look at my illustration here which does what you are trying to achieve:
http://www.gratisupload.dk/download/43627/
A2 holds the initial value to split
Temporary values:
C2,D2,E2 are just =RAND()
Your evenly but randomly split values will appear in these cells:
C5 = A2 / (C2 + D2 + E2) * C2
D5 = A2 / (C2 + D2 + E2) * D2
E5 = A2 / (C2 + D2 + E2) * E2
Edit: Of course, you could show the temporary values (C2, D2, E2) on a separate sheet. Still, this is only to avoid the evil world of macros.
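The proportional formulas above (each part is the total scaled by one random weight over the sum of the weights) can be sketched in Python as follows. The function name proportional_split is hypothetical, and random.random() stands in for =RAND().

```python
import random

def proportional_split(total, rng=random):
    """Scale three random weights so that they add up to total."""
    weights = [rng.random() for _ in range(3)]  # like =RAND() in C2, D2, E2
    scale = total / sum(weights)                # A2 / (C2 + D2 + E2)
    return [scale * w for w in weights]         # values for C5, D5, E5

parts = proportional_split(20)
print(parts, sum(parts))
```

The parts add up to the total apart from floating-point rounding, but they are generally not whole numbers, so this variant suits cases where fractional values are acceptable.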