I wrote the following code:
BEGIN{FS=OFS=","}
NR==FNR &&
$7{sum+=$7;
elementos++;
next}
!$7{$7=media}
{print}
ENDFILE{media=sum/elementos}
This awk script adds the average age to the empty cells on column 'age'.
Execution of the code is done as follows:
awk -f c_awk.awk train3.csv
Now I am trying to save the changes done in a new CSV file using awk. (new file: train4.csv)
I have been trying with
> ./c_awk.awk/train4.csv in the last line but it doesn't work.
awk: c_awk.awk:12: ENDFILE{media=sum/elementos}> /tmp/train4.csv
awk: c_awk.awk:12: ^ syntax error
awk: c_awk.awk:12: ENDFILE{media=sum/elementos}> /tmp/train4.csv
awk: c_awk.awk:12: ^ syntax error
The file from where I am trying to implement the changes looks like this:
PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
5,0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,,S
6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
7,0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S
The expected result is the following:
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
5,0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,,S
6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
7,0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S
Thanks.
awk -f c_awk.awk train3.csv > /tmp/train4.csv
or just change {print} to {print > "/tmp/train4.csv"}
Your script contains several issues unrelated to writing to a new file that I suspect you aren't aware of - you should post a new question with concise, testable sample input and expected output so we can help you with that as it's not entirely clear what you want that script to do
(Using Sikuli IDE -288 20/04/19 on Windows 10)
I am currently having issues with a portion of code that runs correctly the first time around, but the second time where the function is looped instead of overwriting information created in the first iteration, it is somehow using the old information.
A function called selectRewards() is called, and it takes a few screenshots of the reward region over a few seconds to gather a useable still of an animation, the file name is numerically incremented. Then the function creates a Finder using the screenshots starting with screenshot 1. The Finder, and image I want to check against is passed into a search() function where it should be using the passed finder and image to find matches. It checks for all defined images in screenshot1, screenshot2 etc. until matches are found. And the matches are selected on the screen using coordinates from the screenshot image.
This all works well in the first iteration of the selectRewards(), it cycles through the screenshots and finds the images on a stable screenshot, but when the function is called again the same exact "found" results are returned, and clicks are exactly the same even when the images don't exist in the screenshots (I've even deleted the screenshots at the end of the first loop to try and clear any incorrect information being sent to the finder.
I've tried to pull the section out to share in a cleaner way, and it still provides the same issue. Any help and advice would be deeply appreciated.
(Though currently having even stranger issues with the code, since having the main script open in a tab on the IDE and the new script in another - neither running - if I run the snippet script it will use the coordinates/image finds from a previous run of the scripts). Can there be some sort of memory issue or caching in windows? ALT+SHIFT+R to restart the IDE normally helps clear the issue.
Settings.MoveMouseDelay = 0.5
#Define Regions
rewardRegion = Region(536,429,832,207)
#Define Images
searchCoupons = Pattern("coupons.png").similar(0.85)
searchAdvanced = Pattern("2011.png").similar(0.85)
searchAdvancedFrag = Pattern("2012.png").similar(0.85)
matchesFound = False
def search(image,rewardGlimpse, descr = ""):
print ("##### searching for: (%s) %s" % (image, descr))
rewardGlimpse.findAll(image) # find all matches (using passed finder variable & image variable)
matches = [] # an empty list to store the matches
while rewardGlimpse.hasNext(): # loop as long there is a first and more matches
matches.append(rewardGlimpse.next()) # access next match and add to matches
# now we have our matches saved in the list matches
print(" Does FindAll have next? (should be false):" + str(rewardGlimpse.hasNext()))
print(" Found matches:" + str(len(matches)))
if len(matches) > 0:
global matchesFound
matchesFound = True
obtainedReward = str(descr)
print(" Match found should be true " + str(matchesFound) + ". Found: "+obtainedReward)
# we want to use our matches
for m in matches:
#Find x & y location of rewards in screenshot
matchx = m.x
matchy = m.y
#Append them to the reward region to line it up.
newx = rewardRegion.getX()+matchx
newy = rewardRegion.getY()+matchy
rewardHover = Location(newx, newy)
#click the found reward location
click(rewardHover)
wait(1)
def selectRewards():
#---- Save Incremental Screenshots
wait(1)
capture(rewardRegion,"screenshot1.png")
wait(0.5)
capture(rewardRegion,"screenshot2.png")
wait(0.5)
capture(rewardRegion,"screenshot3.png")
wait(0.5)
capture(rewardRegion,"screenshot4.png")
wait(0.5)
capture(rewardRegion,"screenshot5.png")
wait(0.5)
capture(rewardRegion,"screenshot6.png")
wait(0.5)
#----- Test the screenshots
snum = 1 #screenshot file number
while True:
global matchesFound
if matchesFound == True:
print("Rewards Found - breaking search loop")
matchesFound = False
break
else:
pass
#Start with _screenshot1.png, increment snum.
screenshotURL = "_screenshot"+str(snum)+".png"
rewardGlimpse = Finder(screenshotURL) #Setup the Finder
print("Currently searching in: " + str(screenshotURL))
#Pass along the image to search, the screenshots Finder, and description.
search(searchCoupons,rewardGlimpse, "Coupons")
search(searchAdvanced,rewardGlimpse, "Advanced Recruit Proof")
search(searchAdvancedFrag,rewardGlimpse, "Advanced Recruit Fragments")
snum = snum + 1
if snum>6:
break
while True: #All rewards available this round are collected
if exists("1558962266403.png"):
click("1558962266403.png")
#confirm
break
else:
pass
print("No reward found at this point.")
print("Matches Found at No Reward Debug: " +str(matchesFound))
#Needed matches not found, selecting random reward.
hover("1558979979033.png")
click("1558980645042.png")
#matchesFound = False #Toggle back to False
#print("Matches found: Variable Value(Should be false)" + str(matchesFound))
def main():
i = 0
SSLoops = 2
while i < 2:
print("Loop #" + str(i+1) + "/"+ str(SSLoops))
print("--------------")
if i == 1: #remove this if statement for live
click("1559251066942.png") #switches spoofed html pages to show diff rewards
selectRewards()
i = i + 1
if __name__ == '__main__':
main()
Loop one of calling selectRewards() is correct, there were 3 images in the reward region that matched the things to search for. But the second loop is incorrect, only one of the matching images were there and wasn't in the same exact position. The script clicked in the 3 locations of the previous loop during the second time around.
Message log:
====
Loop #1/2
--------------
Currently searching in: _screenshot1.png
##### searching for: (P(coupons.png) S: 0.85) Coupons
Does FindAll have next? (should be false):False
Found matches:1
Match found should be true True. Found: Coupons
[log] CLICK on L[603,556]#S(0) (586 msec)
##### searching for: (P(2011.png) S: 0.85) Advanced Recruit Proof
Does FindAll have next? (should be false):False
Found matches:1
Match found should be true True. Found: Advanced Recruit Proof
[log] CLICK on L[653,556]#S(0) (867 msec)
##### searching for: (P(2012.png) S: 0.85) Advanced Recruit Fragments
Does FindAll have next? (should be false):False
Found matches:1
Match found should be true True. Found: Advanced Recruit Fragments
[log] CLICK on L[703,556]#S(0) (539 msec)
Rewards Found - breaking search loop
[log] CLICK on L[90,163]#S(0) (541 msec)
Loop #2/2
--------------
[log] CLICK on L[311,17]#S(0) (593 msec)
Currently searching in: _screenshot1.png
##### searching for: (P(coupons.png) S: 0.85) Coupons
Does FindAll have next? (should be false):False
Found matches:1
Match found should be true True. Found: Coupons
[log] CLICK on L[603,556]#S(0) (617 msec)
##### searching for: (P(2011.png) S: 0.85) Advanced Recruit Proof
Does FindAll have next? (should be false):False
Found matches:1
Match found should be true True. Found: Advanced Recruit Proof
[log] CLICK on L[653,556]#S(0) (535 msec)
##### searching for: (P(2012.png) S: 0.85) Advanced Recruit Fragments
Does FindAll have next? (should be false):False
Found matches:1
Match found should be true True. Found: Advanced Recruit Fragments
[log] CLICK on L[703,556]#S(0) (539 msec)
Rewards Found - breaking search loop
[log] CLICK on L[304,289]#S(0) (687 msec)
====
RaiMan from SikuliX:
ok, the reason is the image caching internally based on filenames.
When a new Finder is created with an image filename already in cache, the cached version is used.
So you either might switch off caching globally:
Settings.setImageCache(0)
or add:
Image.reset()
at the beginning of selectRewards()
or add the loop count to the filename of the captured images (take care: memory is constantly increased!)
BTW: when selecting another tab in the IDE, the images from the previously selected tab are uncached automatically.
I have original dataset in json format. Let's load it in R.
library("rjson")
setwd("mydir")
getwd()
json_data <- fromJSON(paste(readLines("N1.json"), collapse=""))
uu <- unlist(json_data)
uutext <- uu[names(uu) == "text"]
And I have another dataset mydata2
mydata=read.csv(path to data/words)
I need to find the words in mydata2, only which are present in messages in json file. And then write this messages into the new document, "xyz.txt" How to do it?
chalk indirect pick reaction team skip pumpkin surprise bless ignorance
1 time patient road extent decade cemetery staircase monarch bubble abbey
2 service conglomerate banish pan friendly position tight highlight rice disappear
3 write swear break tire jam neutral momentum requirement relationship matrix
4 inspire dose jump promote trace latest absolute adjust joystick habit
5 wrong behave claim dedicate threat sell particle statement teach lamb
6 eye tissue prescription problem secretion revenge barrel beard mechanism platform
7 forest kick face wisecrack uncertainty ratio complain doubt reflection realism
8 total fee debate hall soft smart sip ritual pill category
9 contain headline lump absorption superintendent digital increase key banner second
i mean
chalk -1 number1 indirect -2 number2
template
Word1-1 number1-1; Word1-2 number 1-2; …; Word 1-10 number 1-10
Word2-1 number2-1; Word2-2 number 2-2; …; Word 2-10 number 2-10
Next time pls include real data. Simplified model:
library(data.table)
word = c("test","meh","blah")
jsonF = c("let's do test", "blah is right", "test blah", "test test")
outp <- list()
for (i in 1:length(word)) {
outp[[i]] = as.data.frame(grep(word[i],jsonF,v=T,fixed=T)) # possibly, ignore.case=T
}
qq = rbindlist(outp)
qq = unique(qq)
print(qq)
1: let's do test
2: test blah
3: test test
4: blah is right
Edit: quick and dirty paste/collapse:
library(data.table)
x = LETTERS[1:10]
y = LETTERS[11:20]
df = rbind(x,y)
L = list()
for (i in 1:nrow(df)) {
L[i] = paste0(df[i,],"-",seq(1,10)," ",i,"-",seq(1,10),collapse="; ")
}
Fin = cbind(L)
View(Fin)
Gives:
> Fin
L
[1,] "A-1 1-1; B-2 1-2; C-3 1-3; D-4 1-4; E-5 1-5; F-6 1-6; G-7 1-7; H-8 1-8; I-9 1-9; J-10 1-10"
[2,] "K-1 2-1; L-2 2-2; M-3 2-3; N-4 2-4; O-5 2-5; P-6 2-6; Q-7 2-7; R-8 2-8; S-9 2-9; T-10 2-10"
I am running a simple sentence disambiguation test. But the synset returned by nltk Lesk for the word 'cat' in the sentence "The cat likes milk" is 'kat.n.01', synsetid=3608870.
(n) kat, khat, qat, quat, cat, Arabian tea, African tea (the leaves of the shrub Catha edulis which are chewed like tobacco or used to make tea; has the effect of a euphoric stimulant) "in Yemen kat is used daily by 85% of adults"
This is a simple phrase and yet the disambiguation task fails.
And this is happening for many words in a set containing more than one sentence, for example in my test sentences, I would expect 'dog' to be disambiguated as 'domestic dog' but Lesk gives me 'pawl' (a hinged catch that fits into a notch of a ratchet to move a wheel forward or prevent it from moving backward)
Is it related to the size of the training set which is in my test only few sentences?
Here is my test code:
def test_lesk():
words = get_sample_words()
print(words)
tagger = PerceptronTagger()
tags = tagger.tag(words)
print (tags[:5])
for word, tag in tags:
pos = get_wordnet_pos(tag)
if pos is None:
continue
print("word=%s,tag=%s,pos=%s" %(word, tag, pos))
synset = lesk(words, word, pos)
if synset is None:
print('No synsetid for word=%s' %word)
else:
print('word=%s, synsetname=%s, synsetid=%d' %(word,synset.name(), synset.offset()))