Scilab - for loop - finding matching data points - from matrices of different lengths - index error - data-analysis

Within Scilab I am trying to find data points (times) that match between two vectors and then index those points (a1 is continuous, a2 holds discrete events). I can then use this index to select data points from other data sets, so that I can analyse the data based on the discrete events (a2).
The code below gives me an 'index error' on the line 'if a1(i) == a2(j);'
a1 = [1,2,3,4,5,6,7,8,9,10,11,12,13]
a2 = [3,4,6,8,10,12]
x = 0
for i = x:length(a1);
    for j = 0:length(a2);
        if a1(i) == a2(j);
            disp(x)
        end
    end
end
If there are any proficient Scilab users out there to help, it would be much appreciated.

Please look at the intersect function. It does exactly what you want, and does it efficiently.
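To illustrate the idea behind an intersect-style lookup, here is a minimal sketch in Python (not Scilab, and not Scilab's implementation). The helper name `intersect_with_indices` is hypothetical. Note that Python positions are 0-based, whereas Scilab indexes from 1, and that this sketch keeps the order of the first list rather than sorting.

```python
def intersect_with_indices(a, b):
    """Return the common values plus their 0-based positions in a and in b."""
    # Remember the first position at which each value occurs in b.
    first_pos_in_b = {}
    for j, value in enumerate(b):
        first_pos_in_b.setdefault(value, j)

    values, idx_a, idx_b = [], [], []
    seen = set()
    for i, value in enumerate(a):
        # Keep each common value once, at its first occurrence in a.
        if value in first_pos_in_b and value not in seen:
            seen.add(value)
            values.append(value)
            idx_a.append(i)
            idx_b.append(first_pos_in_b[value])
    return values, idx_a, idx_b


a1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
a2 = [3, 4, 6, 8, 10, 12]
values, idx_a1, idx_a2 = intersect_with_indices(a1, a2)
print(values)  # [3, 4, 6, 8, 10, 12]
print(idx_a1)  # [2, 3, 5, 7, 9, 11]
```

The index list for a1 can then be used to pick the matching entries out of any other data set that has the same length as a1.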

I discovered the problem. The invalid index was caused by starting the loops at 0: Scilab matrices are indexed from 1, so there is no zero index.
a1 = [1,2,3,4,5,6,7,8,9,10,11,12,13]
a2 = [3,4,6,8,10,12]
x = 1;
for i = 1:13;
    for j = x:6;
        if a1(i) == a2(j);
            disp(a2(j))
            x = j
        end
    end
end

Related

searching an array in vb.net

I am creating an array by selecting data from a MySQL database. Here is where I create the array:
Dim call_costs_data(10000, 5)
While reader.Read()
counter = counter + 1
call_costs_data(counter, 1) = reader.GetString(0)
call_costs_data(counter, 2) = reader.GetString(1)
call_costs_data(counter, 3) = reader.GetString(2)
call_costs_data(counter, 4) = reader.GetString(3)
call_costs_data(counter, 5) = reader.GetString(4)
End While
number = counter
I am doing this multiple times with different tables, so that when I run my function the data is all in arrays, rather than searching the database each time in my loop (inside the function).
I am currently searching the array using a For loop like this:
For p = 1 To number
If call_costs_data(p, 1) = "some_value" Then
'do something here
End If
Next
But because I do this for so many arrays inside my loop, and there are so many records to loop through before the correct one is found, it's taking a long time.
What is the best way to change this to search the arrays?
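One common way to avoid scanning every row on each lookup is to build a keyed index once and then look rows up by key (in VB.NET a Dictionary would play this role). A minimal sketch in Python, assuming the first column is the search key; the sample rows and the helper name `find_call_cost` are hypothetical and only stand in for the data read from the database.

```python
# Hypothetical rows standing in for the result set read from the database.
rows = [
    ("key_a", "col2_a", "col3_a", "col4_a", "col5_a"),
    ("key_b", "col2_b", "col3_b", "col4_b", "col5_b"),
    ("key_c", "col2_c", "col3_c", "col4_c", "col5_c"),
]

# Build the index once: first column -> whole row.
# setdefault keeps the first row seen for a key, like the original loop
# that stops at the first match.
index = {}
for row in rows:
    index.setdefault(row[0], row)


def find_call_cost(key):
    """Return the row whose first column equals key, or None if absent."""
    return index.get(key)


print(find_call_cost("key_b"))
print(find_call_cost("missing"))
```

Building the index costs one pass over the rows; each later lookup is then a hash lookup rather than a loop over all records.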

Conversion of Foxpro code to Set-Based MySQL Query

I am trying to convert some Visual FoxPro code to a set-based MySQL query. The following is the code segment from FoxPro:
lnFound=0
IF LnFound = 0 .and. rcResult = "ALL" AND PcOpOrIp = "OP"
SELECT PFile
LcTag = ORDER()
SET ORDER TO TAG PtcntlNm
=SEEK(LcPatientNo)
SCAN WHILE PtcntlNm = LcPatientNo
IF GcMResult <= "0"
GcMResult = "1-7MAT-PTC"
ENDIF
IF MONTH(cSRa.Fromdate) = MONTH(pFile.Fromdate) ;
.AND. pFile.ThruDate >= cSRa.ThruDate
** Check From/Thru Date against pFile
IF (ABS(cSRa.totalchrg) = (pFile.BDeduct+pFile.Deduct+pFile.Coinsur)) .OR. cSRa.Tchrgs = (pFile.BDeduct+pFile.Deduct+pFile.Coinsur) .or. (ABS(cSRa.totalchrg) = pFile.Total .OR. cSRa.Tchrgs = pFile.Total)
IF lnFound = 0
gcRecid = recid
gcmResult=rcResult
ENDIF
lnFound = lnFound + 1
gcUNrECID = gcunRecid + IIF(EMPTY(gCUNreCID),Recid,[,]+recid)
ENDIF
ENDIF
ENDSCAN
SELECT PFile
SET ORDER TO &LcTag
ENDIF
I have a table named pfile which I'm trying to join with another table named csra. The main aim is to set the record ID (gcrecid) based on the conditions of three nested IF statements. After the gcrecid variable is set, the lnfound variable is set to one, so the condition of the third IF statement is false from the second iteration onwards.
Here is the MySQL stored procedure I came up with. As you can see, I have not been able to convert the code completely or efficiently.
UPDATE csra AS cs
JOIN p051331s AS p ON cs.patientno = p.ptcntlnm
SET cs.recid = p.recid
, cs.mcsult = "ALL"
, cs.lnfound = '"1"'
WHERE cs.provider = '051331'
AND cs.lnfound = "0"
AND cs.RECID IS NULL
AND month(cs.fromdate) = month(p.fromdate)
AND p.thrudate >= cs.ThruDate
AND ABS(cs.totalchrg) = (p.bdeduct+p.deduct+p.coinsur)
OR cs.tchrgs = (p.bdeduct+p.deduct+p.coinsur)
OR ABS(cs.totalchrg) = p.total OR cs.tchrgs = p.total;
Any lead in this regard will be much appreciated, as I've been working on this procedure for a couple of days with no noticeable results.
According to this partial VFP code (which is not clear about the variables it uses), there is no code to be converted to set-based at all. The corresponding MySQL, MS SQL, or any other SQL backend code would simply be "nothing", i.e. this would be equivalent:
-- Hello to mySQL or MS SQL
PS: In your attempt to convert this to an UPDATE statement, inner joining with csra is wrong. csra is not joined in the VFP code; its values are constant (pointing only to the "current row" in csra), unless there is a relation set on the fields. You would want to turn them into parameters, as with the rest of the memory variables (the code does not make clear which ones are memory variables).

More efficient active record grouping

I'm trying to check 2.5-second intervals for records and add an object to an array based on the count. This approach works, but it's far too slow. Thanks.
@tweets = Tweet.last(3000)
first_time = @tweets.first.created_at
last_time = @tweets.last.created_at
while first_time < last_time
  group = @tweets.where(created_at: (first_time)..(first_time + 2.5.seconds)).count
  if group == 0 || nil
    puts "0 or nil"
    first_id + 1
    array << {tweets: 0}
  else
    first_id += group
    array << {tweets: group}
  end
  first_time += 2.5.seconds
end
return array.to_json
end
What you really need is the group_by method on the records you've retrieved:
grouped = @tweets.group_by do |tweet|
  # Convert from timestamp to 2.5s interval number
  (tweet.created_at.to_f / 2.5).to_i
end
That returns a hash with the key being the time interval, and the values being an array of tweets.
What you're doing in your example probably has the effect of making thousands of queries. Always watch log/development.log to see what's going on in the background.
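The same in-memory bucketing idea can be sketched in Python. This is only an illustration, not the Rails code: the helper name `count_per_interval` is hypothetical, and unlike the snippet above it numbers intervals from the earliest timestamp rather than from the epoch, and fills empty intervals with a zero count so the output resembles the array the question builds.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_per_interval(timestamps, width_seconds=2.5):
    """Count timestamps per fixed-width interval, starting at the earliest one."""
    if not timestamps:
        return []
    start = min(timestamps)
    # Interval number = whole number of widths elapsed since the start.
    counts = Counter(
        int((t - start).total_seconds() // width_seconds) for t in timestamps
    )
    last_bucket = max(counts)
    # Emit every interval up to the last one, including empty ones.
    return [{"tweets": counts.get(i, 0)} for i in range(last_bucket + 1)]


base = datetime(2020, 1, 1, 12, 0, 0)  # arbitrary example start time
sample = [base + timedelta(seconds=s) for s in (0, 1, 2.4, 6)]
print(count_per_interval(sample))
# [{'tweets': 3}, {'tweets': 0}, {'tweets': 1}]
```

All the counting happens on records that are already loaded, so no query is issued per interval.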

How can I tokenize a string in MySQL?

My project is importing a sizable collection (500K+ rows) of data from flat Excel files, which are manually created by a team of people. Now the problem is that it all needs to be normalized for client searching. For example, the company field will have multiple company spellings and include branches, such as "IBM" and then "IBM Inc." and "IBM Japan" etc. Additionally, I have product names that are alphanumeric, such as "A46-Rhizonme Pentahol", which SOUNDEX alone cannot handle.
I can solve the issue in the long term by having all the data input be through a web form, with an AJAX auto-suggest. Until then however, I still need to deal with the massive collection of existing data. This brings me to what I believe is a good process, based on what I've read here:
http://msdn.microsoft.com/en-us/magazine/cc163731.aspx
Steps to create a custom Fuzzy Logic Lookup and Fuzzy Logic Grouping:
tokenize strings into keywords
calculate keyword TF-IDF (term frequency - inverse document frequency)
calculate Levenshtein distance between keywords
calculate Soundex on available alpha strings
determine context of keywords
place keywords, based on context, into separate DB tables, such as "Companies", "Products", "Ingredients"
I've been Googling, searching StackOverflow, reading over MySQL.com discussions, etc. about this issue, to attempt to find a prebuilt solution. Any ideas?
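As a rough illustration of the first and third steps listed above (tokenizing and Levenshtein distance), here is a minimal sketch in Python. It is not a prebuilt solution and does not cover TF-IDF, Soundex, or context detection; the function names `tokenize` and `levenshtein` are chosen for illustration, and splitting on every non-alphanumeric character is an assumption about what counts as a keyword.

```python
import re


def tokenize(text):
    """Split text into lowercase alphanumeric keywords."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", text.lower()) if t]


def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


print(tokenize("IBM Inc."))               # ['ibm', 'inc']
print(tokenize("A46-Rhizonme Pentahol"))  # ['a46', 'rhizonme', 'pentahol']
print(levenshtein("kitten", "sitting"))   # 3
```

Comparing the token lists of "IBM", "IBM Inc." and "IBM Japan" shows a shared first keyword, which is the kind of signal the grouping step would build on.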
So, I gave up and just made a string tokenizing function for MySQL. Here's the code:
CREATE DEFINER = `root`@`localhost` FUNCTION `strtok`(in_string VARCHAR(255), delims VARCHAR(255), str_replace VARCHAR(255))
RETURNS varchar(255)
DETERMINISTIC
BEGIN
  DECLARE str_len, delim_len, a, b, is_delim INT;
  DECLARE z, y VARBINARY(1);
  DECLARE str_out VARBINARY(256);
  SET str_len = CHAR_LENGTH(in_string), delim_len = CHAR_LENGTH(delims), a = 1, b = 1, is_delim = 0, str_out = '';
  -- get each CHARACTER
  WHILE a <= str_len DO
    SET z = SUBSTRING(in_string, a, 1);
    -- loop through the delimiters
    WHILE b <= delim_len AND is_delim < 1 DO
      SET y = SUBSTRING(delims, b, 1);
      -- search for each delimiter
      IF z = y THEN
        SET is_delim = 1;
      END IF;
      SET b = b + 1;
    END WHILE;
    -- replace a delimiter, copy any other character unchanged
    IF is_delim = 1 THEN
      SET str_out = CONCAT(str_out, str_replace);
    ELSE
      SET str_out = CONCAT(str_out, z);
    END IF;
    -- reset for the next character (SUBSTRING positions start at 1)
    SET b = 1;
    SET is_delim = 0;
    SET a = a + 1;
  END WHILE;
  RETURN str_out;
END;
It's called like this:
strtok("this.is.my.input.string",".,:;"," | ")
and will return
"this | is | my | input | string"
I hope someone else finds this useful. Cheers!
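For comparison, the same character-by-character replacement can be sketched in Python. This is only an illustration of the logic, not a replacement for the stored function; the name `replace_delimiters` is hypothetical.

```python
def replace_delimiters(in_string, delims, replacement):
    """Replace every character that appears in delims with replacement."""
    out = []
    for ch in in_string:
        # A delimiter is swapped for the replacement; anything else is copied.
        out.append(replacement if ch in delims else ch)
    return "".join(out)


result = replace_delimiters("this.is.my.input.string", ".,:;", " | ")
print(result)  # this | is | my | input | string
```

As in the SQL version, consecutive delimiters each produce their own replacement string, so the output is a delimiter-normalized string rather than a list of tokens.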
You should check out Google Refine.
Google Refine is a power tool for working with messy data, cleaning it
up, transforming it from one format into another, extending it with
web services, and linking it to databases like Freebase.

Linq to Sql - Use sum in the where clause

I'm trying to select orders that have either over or under 2000 products ordered in them, depending on other values. I need to select the information from the Orders table, but check this value in the OrdersProducts table, specifically the sum of OrdersProducts.ProductQty. I also need to do this using PredicateBuilder, because of other requirements. So far I have the code below. It uses nested lambda expressions, which I didn't know I could do; it runs, but it doesn't return the correct results.
Dim getOrders = From d In db.Orders _
Where d.Status = OrderStatus.Approved _
Select d
' Then a for loop adding parameters via Predicatebuilder...
If over2000 = True Then
' over 2000
predicate1 = predicate1.And(Function(d) (d.OrderProducts.Sum(Function(c) c.ProductQty > 2000)))
Else
' under 2000
predicate1 = predicate1.And(Function(d) (d.OrderProducts.Sum(Function(c) c.ProductQty < 2000)))
End If
basePredicate = basePredicate.Or(predicate1)
' End For loop
getOrders = getOrders.Where(basePredicate)
I removed some code for brevity, but I think that gets the point across. How can I do this? Thanks!
Try changing this:
(d.OrderProducts.Sum(Function(c) c.ProductQty > 2000))
to this:
(d.OrderProducts.Sum(Function(c) c.ProductQty) > 2000)
I haven't built this to test it, but the original appears to sum the results of a boolean comparison instead of summing the quantities and then comparing the total.
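The difference between the two placements of the comparison can be shown with a small Python analogy. This is not LINQ to SQL and the quantities are made up for illustration; it only shows why comparing inside the sum gives a different answer from comparing the total.

```python
# Hypothetical product quantities for a single order.
quantities = [1500, 400, 300]

# Comparison inside the sum: adds up True/False values, i.e. counts how many
# individual lines exceed 2000. Here none do, so the result is 0.
per_line_hits = sum(q > 2000 for q in quantities)

# Comparison outside the sum: totals the quantities first (2200), then
# compares once. This is the check the question is after.
order_is_over_2000 = sum(quantities) > 2000

print(per_line_hits)       # 0
print(order_is_over_2000)  # True
```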