convert json text entries to a dataframe in r - json
I have a text file with json like structure that contains values for certain variables as below.
[{"variable1":"111","variable2":"666","variable3":"11","variable4":"aaa","variable5":"0"}]
[{"variable1":"34","variable2":"12","variable3":"78","variable4":"qqq","variable5":"-9"}]
Every line is a new set of values for the same variables 1 through 5. There can be thousands of lines in a text file, but the variables always remain the same. I want to extract variables 1 through 5 along with their values and convert them into a dataframe. Currently I perform these operations in Excel using string manipulation and transpose.
How to do this in R? Much appreciated. Thanks.
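There is a package named jsonlite that you can use. Because every line of your file is a separate JSON array, parse the file line by line and bind the results together:
    library("jsonlite")
    # one JSON array per line, so read the lines first
    lines <- readLines("YourPathToTheFile")
    # each line parses to a one-row data frame; stack them
    df <- do.call(rbind, lapply(lines, fromJSON))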
You can find more info here.
Related
How can I write certain sections of text from different lines to multiple lines?
So I'm currently trying to use Python to transform large sums of data into a neat and tidy .csv file from a .txt file. The first stage is trying to get the 8-digit company numbers into one column called 'Company numbers'. I've created the header and just need to put each company number from each line into the column.
What I want to know is, how do I tell my script to read the first eight characters of each line in the .txt file (which correspond to the company number) and then write them to the .csv file? This is probably very simple but I'm only new to Python!
So far, I have something which looks like this:
    with open(r'C:/Users/test1.txt') as rf:
        with open(r'C:/Users/test2.csv','w',newline='') as wf:
            outputDictWriter = csv.DictWriter(wf,['Company number'])
            outputDictWriter.writeheader()
            rf = rf.read(8)
            for line in rf:
                wf.write(line)
My recommendation would be to 1) read the file in, 2) make the relevant transformation, and then 3) write the results to file. I don't have sample data, so I can't verify whether my solution exactly addresses your case.
    # 1) read the whole file
    with open('input.txt','r') as file_handle:
        file_content = file_handle.read()
    # 2) keep the first 8 characters of each line
    list_of_IDs = []
    for line in file_content.split('\n'):
        print("line = ",line)
        print("first 8 =", line[0:8])
        list_of_IDs.append(line[0:8])
    # 3) write the header and one ID per line
    with open("output.csv", "w") as file_handle:
        file_handle.write("Company\n")
        for line in list_of_IDs:
            file_handle.write(line+"\n")
The value of separating these steps is to enable debugging.
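The read, transform, write split described above can also be sketched with the csv module, which takes care of line endings and quoting. This is only a minimal sketch, assuming every non-empty line starts with the 8-character company number; the function names `extract_ids` and `write_ids` and the sample lines are hypothetical and not part of the original answer.

```python
import csv


def extract_ids(lines, width=8):
    """Return the first `width` characters of every non-empty line."""
    return [line[:width] for line in lines if line.strip()]


def write_ids(ids, out_path, header="Company number"):
    """Write one ID per row under a single header column."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([header])
        # each ID becomes a one-column row
        writer.writerows([company_id] for company_id in ids)


if __name__ == "__main__":
    # Hypothetical sample lines standing in for the real .txt file.
    sample = ["12345678 Acme Ltd", "87654321 Widgets plc", ""]
    print(extract_ids(sample))
```

Keeping the extraction in its own function makes it easy to check the IDs on a few sample lines before anything is written to disk.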
Zapier Code Step Model Data into CSV
I'm looking for help with some JavaScript to insert inside of a code step in Zapier. I have two inputs that are named/look like the following:
    RIDS: 991,992,993
    LineIDs: 1,2,3
Each of these should match in the quantity of items in the list. There can be 1, 2 or 100 of them. The order is significant.
What I'm looking for is a code step to model the data into one CSV matching up the positions of each. So using the above data, my output would look like this:
    991,1
    992,2
    993,3
Does anyone have code or easily know how to achieve this? I am not a JavaScript developer.
Zapier doesn't allow you to create files in a code step. You can, though, use the code step to generate text which can then be used in another step. I used Python for my example (I'm not as familiar with JavaScript, but the strategy is the same).
Create CSV file in Zapier from Raw Data
1) Code step with LineIDs and RIDs as inputs:
    import csv
    import io
    # Convert inputs into lists
    lids = input_data['LineIDs'].split(',')
    rids = input_data['RIDs'].split(',')
    # Create file-like CSV object
    csvfile = io.StringIO()
    filewriter = csv.writer(csvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
    # Write CSV rows
    filewriter.writerow(['LineID', 'RID'])
    for x in range(len(lids)):
        filewriter.writerow([lids[x], rids[x]])
    # Get CSV object value as text and set to output
    output = {'text': csvfile.getvalue()}
2) Use a Google Drive step to Create File from Text:
    File Content = Text from Step 1
    Convert to Document = no
This will create a *.txt document.
3) Use a CloudConvert step to Convert File from txt to csv.
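The pairing logic itself can be tried outside Zapier. The sketch below is a hedged illustration rather than the answerer's code: it pairs the two comma-separated inputs by position with `zip` and puts the RID first, as the question's expected output does. The function name `pair_to_csv` is hypothetical; inside a Zapier code step the strings would come from `input_data` and the result would be assigned to `output`, as in the answer above.

```python
import csv
import io


def pair_to_csv(rids, line_ids):
    """Pair two comma-separated strings by position and return CSV text."""
    rid_list = [item.strip() for item in rids.split(",")]
    line_list = [item.strip() for item in line_ids.split(",")]
    # The question states both lists always have the same length; fail loudly if not.
    if len(rid_list) != len(line_list):
        raise ValueError("RIDS and LineIDs must contain the same number of items")

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # zip walks both lists in order, so position i of one meets position i of the other
    writer.writerows(zip(rid_list, line_list))
    return buffer.getvalue()


if __name__ == "__main__":
    print(pair_to_csv("991,992,993", "1,2,3"))
```

Raising on a length mismatch is a design choice for this sketch; `zip` on its own would silently drop the unmatched trailing items.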
xlsread in octave return zero values
I am trying to read a csv file in Octave. The file contains a table with both numeric and text data. It also contains date and hour information. In addition, the first line is in a different format than the rest of the lines, since it contains titles. csvread can only read numeric data (according to the Octave help), so I tried using xlsread as follows:
    [NUMARR, TXTARR, RAWARR, LIMITS] = xlsread ('Line.csv')
I get only a matrix NUMARR with numeric values. However, all other returned variables are empty: their dimension is 0x0. How do I get all the text and all other information? Thanks!
To solve this issue, open your CSV file in Windows notepad and save it as ANSI format instead of UNICODE.
Selectively Import only Json data in txt file into R.
I have 3 questions I would like to ask, as I am relatively new to both R and the JSON format. I have read quite a bit, but I still don't quite understand.
1) Can R parse JSON data when the txt file contains other irrelevant information as well?
Assuming it can't, I loaded the text file into R and did some cleaning up so that the file would be easier to read.
    require(plyr)
    require(rjson)
    small.f.2 <- subset(small.f.1, ! V1 %in% c("Level_Index:", "Feature_Type:", "Goals:", "Move_Count:"))
    small.f.3 <- small.f.2[,-1]
This gives me a single column with all the JSON data in each line. I tried to write a new .txt file.
    write.table(small.f.3, file="small clean.txt", row.names = FALSE)
    json_data <- fromJSON(file="small.clean")
The problem was that it only converted 'x' (the first row) into a character and ignored everything else. I imagined the problem was the "x", so I took it out of the .txt file and ran it again.
    json_data <- fromJSON(file="small clean copy.txt")
    small <- fromJSON(paste(readLines("small clean copy.txt"), collapse=""))
Both times it worked and I managed to create a list, but it only takes the data from the first row and ignores the rest. This leads to my second question. I tried this:
    small <- fromJSON(paste(readLines("small clean copy.txt"), collapse=","))
    Error in fromJSON(paste(readLines("small clean copy.txt"), collapse = ",")) : unexpected character ','
2) How can I extract the rest of the rows in the .txt file?
3) Is it possible for R to read the JSON data from one row, extract only the nested data that I need, and then go on to the next row, like a loop? For example, in each array, I am only interested in the Action vectors and the State Feature vectors, but I am not interested in the rest of the data. If I can somehow extract only the information I need before moving on to the next array, then I can save a lot of memory.
I validated the array online. The .txt file as a whole is not valid JSON, though; only each individual array is.
I hope this make sense. Each row is a nested array. The data looks something like this. I have about 65 rows (nested arrays) in total. {"NonlightningIndices":[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],"LightningIndices":[],"SelectedAction":12,"State":{"Features":{"Data":[21.0,58.0,0.599999964237213,12.0,9.0,3.0,1.0,0.0,11.0,2.0,1.0,0.0,0.0,0.0,0.0]}},"Actions":[{"Features":{"Data":[4.0,4.0,1.0,1.0,0.0,3.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.12213890532609,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.13055793241076,0.0,0.0,0.0,0.0,0.0,0.231325346416068,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.949158357257511,0.0,0.0,0.0,0.0,0.0,0.369666537828737,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0851765937900996,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.223409208023677,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.698640447815897,1.69496718435102,0.0,0.0,0.0,0.0,1.42312654023416,0.0,0.38394999584831,0.0,0.0,0.0,0.0,1.0,1.22164326251584,1.30980246401454,1.00411570750454,0.0,0.0,0.0,1.44306759429513,0.0,0.00568191150434618,0.0,0.0,0.0,0.0,0.0,0.0,0.157705869690127,0.0,0.0,0.0,0.0,0.102089274086033,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.37039305683305,2.64354332879095,0.0,0.456876463171171,0.0,0.0,0.208651305680117,0.0,0.0,0.0,0.0,0.0,2.0,0.0,3.46713142511126,2.26785558685153,0.284845692694476,0.29200364444299,0.0,0.562185300773834,1.79134869431988,0.423426746571872,0.0,0.0,0.0,0.0,5.06772310533214,0.0,1.95593334724537,2.08448537685298,1.22045520912269,0.251119892385839,0.0,4.86192274732091,0.0,0.186941346075472,0.0,0.0,0.0,0.0,4.37998688020614,0.0,3.04406665275463,1.0,0.49469909818283,0.0,0.0,1.57589195190525,0.0,0.0,0.0,0.0,0.0,0.0,3.55229001446173]}},...... 
{"NonlightningIndices":[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,24],"LightningIndices":[[15,16,17,18,19,20,21,22,23]],"SelectedAction":15,"State":{"Features":{"Data":[20.0,53.0,0.0,11.0,10.0,2.0,1.0,0.0,12.0,2.0,1.0,0.0,0.0,1.0,0.0]}},"Actions":[{"Features":{"Data":[4.0,4.0,1.0,1.0,0.0,3.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.110686363475575,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.13427913742728,0.0,0.0,0.0,0.0,0.0,0.218834141070836,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.939443046803111,0.0,0.0,0.0,0.0,0.0,0.357568892126985,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0889329732996782,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.22521492930721,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.700441220022084,1.6762090551226,0.0,0.0,0.0,0.0,1.44526456614638,0.0,0.383689214317325,0.0,0.0,0.0,0.0,1.0,1.22583659574753,1.31795156033445,0.99710368703165,0.0,0.0,0.0,1.44325394830013,0.0,0.00418600599483917,0.0,0.0,0.0,0.0,0.0,0.0,0.157518319482216,0.0,0.0,0.0,0.0,0.110244186273209,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.369899973785845,2.55505143302811,0.0,0.463342609296841,0.0,0.0,0.226088384842823,0.0,0.0,0.0,0.0,0.0,2.0,0.0,3.47842109127488,2.38476342332125,0.0698115810371108,0.276804206873942,0.0,1.53514282355593,1.77391161515718,0.421465101754304,0.0,0.0,0.0,0.0,4.45530484778828,0.0,1.43798302409155,3.46965807176681,0.468528940277049,0.259853183829217,0.0,4.86988325473155,0.0,0.190659677933533,0.0,0.0,0.963116148760181,0.0,4.29930830894124,0.0,2.56201697590845,0.593423384852181,0.46165947868584,0.0,0.0,1.59497392171253,0.0,0.0,0.0,0.0,0.0368838512398189,0.0,4.24538684327048]}},...... I would really appreciate any advice here.
Python 3 code to read CSV file, manipulate then create new file....works, but looking for improvements
This is my first ever post here. I am trying to learn a bit of Python, using Python 3 and numpy. I did a few tutorials, then decided to dive in and try a little project I might find useful at work, as that's a good way to learn for me.
I have written a program that reads in data from a CSV file which has a few rows of headers. I then want to extract certain columns from that file based on the header names, then output that back to a new csv file in a particular format.
The program I have works fine and does what I want, but as I'm a newbie I would like some tips as to how I can improve my code. My main data file (csv) is about 57 columns long and about 36 rows deep, so not big.
    import csv
    import numpy as np
    #make some arrays..at least I think thats what this does
    A=[]
    B=[]
    keep_headers=[]
    #open the main data csv file 'map.csv'...need to check what 'r' means
    input_file = open('map.csv','r')
    #read the contents of the file into 'data'
    data=csv.reader(input_file, delimiter=',')
    #skip the first 2 header rows as they are junk
    next(data)
    next(data)
    #read in the next line as the 'header'
    headers = next(data)
    #Now read in the numeric data (float) from the main csv file 'map.csv'
    A=np.genfromtxt('map.csv',delimiter=',',dtype='float',skiprows=5)
    #Get the length of a column in A
    Alen=len(A[:,0])
    #now read the column header values I want to keep from 'keepheader.csv'
    keep_headers=np.genfromtxt('keepheader.csv',delimiter=',',dtype='unicode_')
    #Get the length of keep headers....i.e. how many headers I'm keeping.
    head_len=len(keep_headers)
    #Now loop round extracting all the columns with the keep header titles and
    #append them to array B
    i=0
    while i < head_len:
        #use index to find the apprpriate column number.
        item_num=headers.index(keep_headers[i])
        i=i+1
        #append the selected column to array B
        B=np.append(B,A[:,item_num])
    #now reshape the B array
    B=np.reshape(B,(head_len,36))
    #now transpose it as thats the format I want.
    B=np.transpose(B)
    #save the array B back to a new csv file called 'cmap.csv'
    np.savetxt('cmap.csv',B,fmt='%.3f',delimiter=",")
Thanks.
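You can greatly simplify your code by using more of NumPy's capabilities.
    import numpy as np
    # read everything as strings, skipping the first two rows
    A = np.loadtxt('stack.txt',skiprows=2,delimiter=',',dtype=str)
    keep_headers = np.loadtxt('keepheader.csv',delimiter=',',dtype=str)
    headers = A[0,:]
    # boolean mask of the columns whose header appears in keep_headers
    # (np.isin replaces the deprecated np.in1d)
    cols_to_keep = np.isin(headers, keep_headers)
    # np.float_ was removed in NumPy 2.0, so convert with astype
    B = A[1:,cols_to_keep].astype(float)
    np.savetxt('cmap.csv',B,fmt='%.3f',delimiter=",")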
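One difference is worth knowing about: a boolean mask returns the selected columns in the order they appear in the file, whereas the `headers.index(...)` loop in the question returns them in the order listed in keepheader.csv. If that order matters, an index lookup preserves it. The sketch below shows the idea with the standard library only; the function name `select_columns` and the in-memory sample data are hypothetical.

```python
import csv
import io


def select_columns(rows, keep_headers):
    """Keep only the named columns, in the order given by keep_headers.

    `rows` is a list of lists whose first entry is the header row.
    """
    header, *data = rows
    # index() raises ValueError if a requested header is missing
    indices = [header.index(name) for name in keep_headers]
    return [[float(row[i]) for i in indices] for row in data]


if __name__ == "__main__":
    # Hypothetical in-memory data standing in for the CSV file.
    text = "a,b,c\n1,2,3\n4,5,6\n"
    rows = list(csv.reader(io.StringIO(text)))
    print(select_columns(rows, ["c", "a"]))
```

With NumPy, the same list of indices can be used directly for column selection, for example `A[1:, indices]`.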