directing the output to a .txt file and automatically naming the file

I'm new to Python and to writing code in general. I want to direct the output of a user's input into a .txt file (if possible), and if possible name the file after the input on line 3 (the ID number). Thank you for any help or advice.
userName = raw_input("login = ")
print "Welcome,", userName
number = raw_input("ID number = ")
weight = raw_input("Weight = ")

Writing to a file is quite easy in Python:
f = open(number + '.txt', 'w') #create a file using the given input
f.write(userName + " " + weight)
f.close()
For further reference: http://docs.python.org/2/tutorial/inputoutput.html
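A small variant of the above, using the same variables: a with block closes the file automatically, even if an exception is raised while writing, so the explicit f.close() can be dropped.
with open(number + '.txt', 'w') as f:  # file is closed automatically on exit
    f.write(userName + " " + weight)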

Related

Creating an exe file for data exchange with a server in Tcl

I am completely lost and I do not know how to approach the following problem, which my boss assigned to me.
I have to create an exe file containing code which works as follows when I run it: it sends a certain file, say file_A, to a server. When the server receives this file, it sends back a JSON file, say file_B, which contains a URL. More precisely, an attribute of the JSON file contains the URL. The code should then open the URL in a browser.
And here are the details:
The above code (one version in Tcl) must accept three parameters and a fourth optional parameter (so it is not necessary to pass a fourth parameter). The three parameters are: server, type and file.
server: this is the path to the server. For example, https://localhost:0008.
type: this is the type of the file (file_A) to be sent to the server: xdt / png
file: this is the path to the file (file_A) to be sent to the server.
The fourth optional parameter is:
wksName: if this parameter is given, then the URL should be opened with it in the browser (a hypothetical invocation is sketched below).
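For concreteness, here is a hypothetical invocation and the dictionary it produces after the key=value parsing in the Python reference script that follows (the wksName value WS01 is made up; the other values come from the question and the script's defaults):
# Hypothetical command line (illustrative values only):
#   script.py server=https://localhost:0008 type=xdt file=xdtExamples/xdtExample.gdt wksName=WS01
# which the key=value parsing below turns into:
args_dict = {
    'server': 'https://localhost:0008',
    'type': 'xdt',
    'file': 'xdtExamples/xdtExample.gdt',
    'wksName': 'WS01',  # optional; defaults to platform.node() when omitted
}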
I got an example of the above procedure written in Python. It should serve as orientation. I do not know anything about this language, but to a large extent I understand the code. It looks as follows:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import platform
import sys
import webbrowser
import requests

args_dict = {}
for arg in sys.argv[1:]:
    if '=' in arg:
        sep = arg.find('=')
        key, value = arg[:sep], arg[sep + 1:]
        args_dict[key] = value

server = args_dict.get('server', 'http://localhost:0008')
request_url = server + '/NAME_List_number'
type = args_dict.get('type', 'xdt')
file = args_dict.get('file', 'xdtExamples/xdtExample.gdt')
wksName = args_dict.get('wksName', platform.node())

try:
    with open(file, 'rb') as f:
        try:
            r = requests.post(request_url, data={'type': type}, files={'file': f})
            request_url = r.json()['url'] + '?wksName=' + wksName
            webbrowser.open(url=request_url, new=2)
        except Exception as e:
            print('Error:', e)
            print('Request url:', request_url)
except:
    print('File \'' + file + '\' not found')
As you can see, the crucial part of the above code is this:
try:
    with open(file, 'rb') as f:
        try:
            r = requests.post(request_url, data={'type': type}, files={'file': f})
            request_url = r.json()['url'] + '?wksName=' + wksName
            webbrowser.open(url=request_url, new=2)
        except Exception as e:
            print('Error:', e)
            print('Request url:', request_url)
except:
    print('File \'' + file + '\' not found')
Everything else above it is just definitions. If possible, I would like to translate the above code into Tcl. Could you please help me with this issue?
It doesn't have to be a 1:1 Tcl translation, as long as it works as described above, and is hopefully as simple as the Python version.
I am not familiar with concepts such as sending/receiving data to/from servers, reading JSON files, etc.
Any help is welcome.

Python open file not recognizing line breaks

I'm trying to fix a JSON file that was badly generated, so I need to replace all occurrences of '}{' with '},{', among other things. The problem is that Python stops recognizing '\n' as a line break, so when I read the file with readlines() it prints the whole file as one line; the same happens with ASCII escape characters, which are printed literally rather than being recognized.
My code
import unicodedata

def strip_accents(text):
    try:
        text = unicode(text, 'utf-8')
    except (TypeError, NameError):  # unicode is a default on python 3
        print('ai carai')
        pass
    text = unicodedata.normalize('NFD', text)
    text = text.encode('ascii', 'ignore')
    text = text.decode("utf-8")
    return text

file_name = 'json/get_tweets.json'
with open(file_name, 'r') as f:
    file_s = ''
    for i in f.readlines():
        print(i)
        i = i.replace('}{', '},{')
        i = strip_accents(i)
        file_s += i

file_s = '[' + file_s + ']'
My file is around 4 GB, almost impossible to paste here, so here is a screenshot.
I already tried different encodings, but with no result.
Can someone help me find a solution?
EDIT: The screenshot wasn't uploaded. Sorry.
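One possible approach, sketched below under the assumption that the file really is one giant line with records glued together as '}{' (as in the question): read it in fixed-size chunks instead of readlines(), so the whole 4 GB never has to fit in memory. The output file name and chunk size are illustrative, and the accent stripping is left out for brevity.
# Streaming sketch: fix '}{' -> '},{' chunk by chunk.
in_name = 'json/get_tweets.json'
out_name = 'json/get_tweets_fixed.json'  # hypothetical output path
with open(in_name, 'r', encoding='utf-8') as src, \
     open(out_name, 'w', encoding='utf-8') as dst:
    dst.write('[')
    tail = ''  # carry-over so a '}{' split across two chunks is still caught
    while True:
        chunk = src.read(1024 * 1024)  # 1 MiB at a time
        if not chunk:
            break
        buf = (tail + chunk).replace('}{', '},{')
        tail, buf = buf[-1:], buf[:-1]  # hold back the last char for the next round
        dst.write(buf)
    dst.write(tail)
    dst.write(']')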

How to compare a character at a specific location in a string to identify the processing path

I'm trying to identify whether the first character in a .txt/string is a "{" or a "<". Which one it is will determine how the .txt is handled.
I'm working with two systems where one takes XML and the other takes JSON. So, as a file comes from one system, it's converted and sent to the other. I've worked out the conversion for files that have the correct file extension, but now I need to be able to identify whether a file is JSON or XML based on the content of a .txt file. I don't know why this would occur, but I was asked to include it.
The best way, as far as I can tell, is based on the first character within the file. If it's "<" then it is XML; if it's "{" then it's JSON. I'm not aware of a character that occurs only in JSON or only in XML that I could search for and identify it that way.
The code below the # txt to xml and json comment searches the whole file for the string, which can give false positives, which is why I'm trying to look at just the first character.
import os
import re
import json

import xmltodict

start_path = 'fileLocation'
for path, dirs, files in os.walk(start_path):
    for fileName in files:
        filePath = os.path.join(path, fileName)
        # xml2json
        if re.match(r'.*\.xml', fileName):
            with open(filePath) as x:
                xStr = x.read()
            jStr = json.dumps(xmltodict.parse(xStr), indent=4)
            with open("jsonOutput.json", 'w') as j:
                j.write(jStr)
        # json2xml
        elif re.match(r'.*\.json', fileName):
            with open(filePath) as j:
                jStr = j.read()
            xStr = xmltodict.unparse(json.loads(jStr), pretty=True)
            with open('xmlOutput.xml', 'w') as x:
                x.write(xStr)
        # **Where I'm Having Trouble**
        # txt to xml and json
        elif re.match(r'.*\.txt', fileName):
            with open(filePath) as t:
                tStr = t.read()
            if 'xml' in tStr:
                with open('xmlOutput.xml', 'w') as x:
                    x.write(tStr)
            elif '{' in tStr:
                with open('jsonOutput.json', 'w') as j:
                    j.write(tStr)
The ideal solution would replace the 'xml' and '{' full-text searches with a check of the first character for '<' or '{'.
Any help is greatly appreciated, and thank you.
If anyone is interested, I found a solution using readline(). This reads only the first line; if '{' is found it will process the file as JSON, and if there's a '<' it will process it as XML. Thanks everyone for the help.
# unk to json & xml
# (requires: from time import strftime, gmtime)
else:
    with open(filePath) as u:
        fLine = u.readline()  # this reads only the first line
        uStr = fLine + u.read()  # re-attach the first line so it isn't dropped from the output
    if '<' in fLine:
        time = strftime('%Y%b%d %H%M', gmtime())
        fName = fileName + ' ' + time + ".xml"
        with open(fName, 'w') as x:
            x.write(uStr)
    elif '{' in fLine:
        time = strftime('%Y%b%d %H%M', gmtime())
        fName = fileName + ' ' + time + ".json"
        with open(fName, 'w') as j:
            j.write(uStr)
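A slightly stricter variant (a sketch, not the code above): compare only the first non-whitespace character instead of searching the whole first line, which also avoids a false positive when, say, a '{' happens to occur later in the first line of an XML file.
def detect_format(filePath):
    # Returns 'xml', 'json', or 'unknown' based on the first non-whitespace character.
    with open(filePath) as u:
        content = u.read()
    first = content.lstrip()[:1]  # '' if the file is empty
    if first == '<':
        return 'xml', content
    elif first == '{':
        return 'json', content
    return 'unknown', content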

import chosen columns of a JSON file in R

I can't find any solution for how to load a huge JSON file. I'm trying with the well-known Yelp dataset. It's 3.2 GB and I want to analyse 9 out of 10 columns. I need to skip importing the $text column, which will give me a much lighter file to load, probably about 70% smaller. I don't want to manipulate the file.
I tried many libraries and got stuck. I've found a solution for a data.frame using the pipe function:
df <- read.table(pipe("cut -f1,5,28 myFile.txt"))
from this thread: Ways to read only select columns from a file into R? (A happy medium between `read.table` and `scan`?)
How to do it for JSON? I'd like to do:
json <- read.table(pipe("cut -text yelp_academic_dataset_review.json"))
but this of course throws an error due to the wrong format. Is there any possibility of doing this without parsing the whole file with regex?
EDIT
Structure of one row (I can't even count them all):
{"review_id":"vT2PALXWX794iUOoSnNXSA","user_id":"uKGWRd4fONB1cXXpU73urg","business_id":"D7FK-xpG4LFIxpMauvUStQ","stars":1,"date":"2016-10-31","text":"some long text here","useful":0,"funny":0,"cool":0,"type":"review"}
SECOND EDIT
Finally, I've created a loop to convert all the data into a CSV file, omitting the unwanted column. It's slow, but I've got 150 MB (zipped) from 3.2 GB.
# files to process
filepath <- jfile1
fcsv <- "c:/Users/cp/Documents/R DATA/Yelp/tests/reviews.csv"
write.table(x = paste(colns, collapse = ","), file = fcsv, quote = F, row.names = F, col.names = F)
con = file(filepath, "r")
while (TRUE) {
  line = readLines(con, n = 1)
  if (length(line) == 0) {
    break
  }
  # regex process
  d <- NULL
  for (i in rcols) {
    pst <- paste(".*\"", colns[i], "\":*(.*?) *,\"", colns[i + 1], "\":.*", sep = "")
    w <- sub(pst, "\\1", line)
    d <- cbind(d, noquote(w))
  }
  # save on the fly
  write.table(x = paste(d, collapse = ","), file = fcsv, append = T, quote = F, row.names = F, col.names = F)
}
close(con)
It can be saved to JSON as well. I wonder if it's the most efficient way, but the other scripts I tested were slow and often had encoding issues.
Try this:
library(jsonlite)
df <- as.data.frame(fromJSON('yelp_academic_dataset_review.json', flatten=TRUE))
Then, once it is a data frame, delete the column(s) you don't need.
If you don't want to manipulate the file in advance of importing it, I'm not sure what options you have in R. Alternatively, you could make a copy of the file, then delete the text column with this script, then import the copy into R, then delete the copy.
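For that pre-processing step, a minimal sketch in Python (purely illustrative; it assumes one JSON object per line, as in the example row above, and the output file name is made up) that streams the file and drops the text field:
import json

with open('yelp_academic_dataset_review.json', encoding='utf-8') as src, \
     open('yelp_review_notext.json', 'w', encoding='utf-8') as dst:  # output name hypothetical
    for line in src:
        obj = json.loads(line)
        obj.pop('text', None)  # drop the heavy column
        dst.write(json.dumps(obj) + '\n')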

Writing Encoded JSON data to a csv using tweepy

I've been stuck on this one for a while. Right now this function writes the date, latitude, longitude, user ID, and text of a live tweet to a CSV file.
The problem is that the text of the tweet often contains letters from other alphabets, e.g. Arabic. These letters show up in this form (\u0641\u064a).
Is it possible to encode the text as a UTF-8 string and append it to the rest of the data, so that the CSV file correctly displays all characters?
def on_data(self, data):
    try:
        tweets = json.loads(data)
        with open('Data.csv', 'a', encoding='utf-8') as f:
            if tweets['coordinates'] is not None:
                coordinates_string = json.dumps(tweets["coordinates"]["coordinates"])
                val_lg = coordinates_string.split(',')[0].strip("[")
                val_lt = coordinates_string.split(',')[1].strip("]")
            else:
                val_lg = "None"
                val_lt = "None"
            text = json.dumps(tweets["text"])
            user_id = json.dumps(tweets["user"]["id_str"])
            time = json.dumps(tweets["created_at"])
            data_string = time + "," + val_lt + "," + val_lg + "," + user_id + "," + text + "\n"
            print(data_string)
            f.write(data_string)
    except:
        pass
You've got some overuse of json. Once you load the tweet, group the data fields in a list and use the csv module to write them nicely.
import json
import csv

# A guess on the data format of the tweet that was parsable by the OP's original code.
D = {'coordinates': {'coordinates': [45.6, 122.3]}, 'text': u'some text\u0641\u064a',
     'user': {'id_str': 'some id'}, 'created_at': 'some date'}
data = json.dumps(D)

tweets = json.loads(data)

# 'utf-8-sig' makes sure the output csv will open in Excel if that is a goal.
# newline='' is a requirement for csv.writer in Python 3.
with open('Data.csv', 'a', encoding='utf-8-sig', newline='') as f:
    # This forces quoting of strings like the OP got from json.dumps
    w = csv.writer(f, quoting=csv.QUOTE_NONNUMERIC)
    if tweets['coordinates'] is not None:
        val_lg = tweets['coordinates']['coordinates'][1]
        val_lt = tweets['coordinates']['coordinates'][0]
    else:
        val_lg = "None"
        val_lt = "None"
    text = tweets["text"]
    user_id = tweets["user"]["id_str"]
    time = tweets["created_at"]
    # group the fields in a list for writerow
    data = [time, val_lt, val_lg, user_id, text]
    print(data)
    w.writerow(data)
Output (UTF-8 terminal):
['some date', 45.6, 122.3, 'some id', 'some textفي']
Output (Data.csv):
"some date",45.6,122.3,"some id","some textفي"