Altering external data like a Python list - JSON

I have a problem that seems like it should be simple in my head, but I'm struggling to figure out the simplest way of executing it.
Basically, I have 2 lists of names:
list1 = ['name1', 'name2', 'name3', 'name4', 'name5']
list2 = ['name6', 'name7', 'name8', 'name9', 'name10']
My ultimate goal is that when the Python script is run, it returns one random name from each list, but ideally the script would not return the same name from either list for the next 4 times it is run. Essentially, I want each run's pair to be random, but I want the rotation to restart every 5 runs of the script.
I think I'll need to store the data externally. Conceptually, I think I need the script to move a name from one list to another list every time it's run, and when the first list is empty, it reverses and moves the names the other way, and so on.
Should I use csv.DictReader to move names between two CSVs, or store them in a JSON file?
Forgive me if this doesn't make sense; I'm struggling to put my problem into words.

Why not just store it in a text file?
import random

list1 = ['name1', 'name2', 'name3', 'name4', 'name5']
list2 = ['name6', 'name7', 'name8', 'name9', 'name10']

# one "name,name" pair per line, recorded by previous runs
used = open("old.txt").readlines()

l1_copy = list1[:]
l2_copy = list2[:]
for line in used:
    if line := line.strip():
        a, b = line.split(",")
        l1_copy.remove(a)
        l2_copy.remove(b)

if not l1_copy or not l2_copy:
    # every name has been used: clear the file and restart the rotation
    with open("old.txt", "w"):
        pass
else:
    list1 = l1_copy
    list2 = l2_copy

choice1, choice2 = random.choice(list1), random.choice(list2)
print("Choice 1:", choice1)
print("Choice 2:", choice2)

# record this run's pair so that later runs avoid it
with open("old.txt", "a") as f:
    f.write(choice1 + "," + choice2 + "\n")
Create an empty file named old.txt in your current directory. It will store the previous runs' choices, comma-separated.
Example run:
$ python3.8 choose_from_list.py
Choice 1: name2
Choice 2: name9
$ python3.8 choose_from_list.py
Choice 1: name5
Choice 2: name10
$ python3.8 choose_from_list.py
Choice 1: name4
Choice 2: name6
$ python3.8 choose_from_list.py
Choice 1: name1
Choice 2: name7
$ python3.8 choose_from_list.py
Choice 1: name3
Choice 2: name8
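If you'd rather keep the state in JSON, as the question suggests, here is a minimal sketch of the same idea (the state.json file name is an arbitrary choice):

import json
import os
import random

list1 = ['name1', 'name2', 'name3', 'name4', 'name5']
list2 = ['name6', 'name7', 'name8', 'name9', 'name10']

STATE_FILE = "state.json"  # hypothetical file name; change as you like

# load the pairs chosen by previous runs, if any
used = json.load(open(STATE_FILE)) if os.path.exists(STATE_FILE) else []

used1 = {pair[0] for pair in used}
used2 = {pair[1] for pair in used}
remaining1 = [n for n in list1 if n not in used1]
remaining2 = [n for n in list2 if n not in used2]

if not remaining1 or not remaining2:
    # all five pairs have been used: restart the rotation
    used, remaining1, remaining2 = [], list1[:], list2[:]

choice1, choice2 = random.choice(remaining1), random.choice(remaining2)
print("Choice 1:", choice1)
print("Choice 2:", choice2)

used.append([choice1, choice2])
with open(STATE_FILE, "w") as f:
    json.dump(used, f)

The logic is the same as the text-file version; JSON just spares you the manual splitting and joining.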


How can I parse multiple entire lines of text into an Octave 'matrix'?

I want to import a lot of data from multiple files in multiple subfolders. Luckily the data is consistent in its output:
Subpro1/data apples 1
Subpro1/data oranges 1
Subpro1/data banana 1
then
Subpro2/data apples 1
Subpro2/data oranges 1
Subpro2/data banana 1
I want to have a DataFileNames array that holds the file names for each set of data I need to read. Then I can extract and store the data in a more local file, process it, and eventually compare 'sub1_apples' to 'sub2_apples'.
I have tried
fid = fopen ("DataFileNames.txt");
DataFileNames = fgets (fid)
fclose (fid);
But this only gives me the first line of the 7.
DataFileNames = dlmread('DataFileNames.txt') gives me a 7x3 array, but only 0 0 1 in each line, since it reads the breaks in the names as delimiters, and I can't change the file names.
DataFileNames = textread("DataFileNames.txt", '%s')
has all the correct information, but the delimiters still split it across multiple lines:
data
apples
1
data
oranges
1
...
Is there a format specifier (%?) that I am missing, and if so, what is it?
I want the output to be:
data apples 1
data oranges 1
data banana 1
With spaces, underscores and everything included so that I can then use this to access the data file.
You can read all lines of the file to a cell array like this:
str = fileread("DataFileNames.txt");
DataFileNames = regexp(str, '\r\n|\r|\n', 'split');
Output:
DataFileNames =
{
[1,1] = data apples 1
[1,2] = data oranges 1
[1,3] = data banana 1
}
In the first option you tried, using fgets, you are reading just one line. Also, it's better to use fgetl, which removes the line ending. To read line by line (which is longer) you need to do:
DataFileNames = {};
fid = fopen("DataFileNames.txt");
line = fgetl(fid);
while ischar(line)
  if ~isempty(line)
    DataFileNames = [DataFileNames line];
  endif
  line = fgetl(fid);
endwhile
fclose(fid);
The second option you tried, using dlmread, is not good because it is intended for reading numeric data into a matrix.
The third option you tried, with textread, is not so good because it treats all whitespace (spaces, line endings, ...) equally, so every whitespace-separated token becomes its own cell.
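If you want a scanning-style call that keeps whole lines together, one possible workaround is a sketch using textscan with its Delimiter option, splitting on line endings only:

fid = fopen("DataFileNames.txt");
C = textscan(fid, "%s", "Delimiter", "\n");  % split on line ends only, not spaces
fclose(fid);
DataFileNames = C{1};  % cell array with one complete line per cell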

Pattern matching and list comprehension with binary

Using a list comprehension, Elixir allows pattern matching like this:
iex()> for {a,2,c} = ch <- [{1,2,3},{4,5,6},3,4,5], do: c
[3]
But when I try to do something like that with a binary, I fail:
iex()> for << b1::size(2), b2::size(3), b3::size(3) >> = <<ch>> <- 'hello', do: b1
[]
Nevertheless, it matches fine when standalone:
iex(281)> << b1::size(2), b2::size(3), b3::size(3) >> = <<100>>
"d"
iex(282)> b2
4
iex(283)> b1
1
iex(284)> b3
4
It also works well when I pass the pattern-matching clause as a second parameter to for:
iex(286)> for ch <- 'hello', << b1::size(2), b2::size(3), b3::size(3) >> = <<ch>>, do: b1
[1, 1, 1, 1, 1]
I'm interested in whether it is possible to do something like the first example with a binary.
This is what fails:
<<ch>> <- 'hello'
In your very first example you do var <- list, and later you try <<var>> <- list, which is not at all the same thing.
'hello' is a list of integers in the first place. Check this:
[104,101,108,108,111]
#⇒ 'hello'
Kernel.SpecialForms.for/1 iterates through the list one element at a time. One cannot match a binary to an integer as is:
<<_::size(2), _::size(3), _::size(3)>> = 101
#⇒ ** (MatchError) no match of right hand side value: 101
And also:
<<ch>> = 101
#⇒ ** (MatchError) no match of right hand side value: 101
The latter code example from your question works because you match an integer and then explicitly tell Elixir/Erlang to treat it as a binary by wrapping it in << >>:
<<b1::size(2), _::size(3), _::size(3)>> = <<101>>
#⇒ "e"

From list of json files to data.table: partial variable list

I have a list of more than 100,000 json files from which I want to get a data.table with only a few variables. Unfortunately the files are complex. The content of each json file looks like:
Sample 1
$id
[1] "10.1"
$title
$title$value
[1] "Why this item"
$itemsource
$itemsource$id
[1] "AA"
$date
[1] "1992-01-01"
$itemType
[1] "art"
$creators
list()
Sample 2
$id
[1] "10.2"
$title
$title$value
[1] "We need this item"
$itemsource
$itemsource$id
[1] "AY"
$date
[1] "1999-01-01"
$itemType
[1] "art"
$creators
type name firstname surname affiliationIds
1 Person Frank W. Cornell. Frank W. Cornell. a1
2 Person David A. Chen. David A. Chen. a1
$affiliations
id name
1 a1 Foreign Affairs Desk, New York Times
What I need from this set of files is a table with creator names, item ids and dates. For the two sample files above:
id date name firstname lastname creatortype
"10.1" "1992-01-01" NA NA NA NA
"10.2" "1999-01-01" Frank W. Cornell. Frank W. Cornell. Person
"10.2" "1999-01-01" David A. Chen. David A. Chen. Person
What I have done so far:
library(parallel)
library(data.table)
library(jsonlite)
library(dplyr)
filelist = list.files(pattern="*.json",recursive=TRUE,include.dirs =TRUE)
parsed = mclapply(filelist, function(x) fromJSON(x),mc.cores=24)
data = rbindlist(mclapply(1:length(parsed), function(x) {
  a = data.table(item = parsed[[x]]$id, date = list(list(parsed[[x]]$date)), name = list(list(parsed[[x]]$name)), creatortype = list(list(parsed[[x]]$creatortype))) # ignoring the firstname/lastname fields here for convenience
  b = data.table(id = a$item, date = unlist(a$date), name = unlist(a$name), creatortype = unlist(a$creatortype))
  return(b)
}, mc.cores = 24))
However, on the last step, I get this error:
"Error in rbindlist(mclapply(1:length(parsed), function(x){:
Item 1 of list is not a data.frame, data.table or list"
Thanks in advance for your suggestions.
Related questions include:
Extract data from list of lists [R]
R convert json to list to data.table
I want to convert JSON file into data.table in r
How can read files from directory using R?
Convert R data table column from JSON to data table
From the error message, I suppose this basically means that one of the results from mclapply() is empty; by empty I mean either NULL or a data.table with 0 rows, or it simply encountered an error during the parallel processing.
What you could do is:
add more checks inside the mclapply() call, like tryCatch(), or check the class of b and nrow(b) to see whether b is empty or not (see the sketch after this list)
when you use rbindlist, add the argument fill = TRUE
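A minimal sketch of both suggestions combined (assuming parsed is the list produced by the question's mclapply() call):

library(parallel)
library(data.table)

results = mclapply(1:length(parsed), function(x) {
  b = tryCatch(
    data.table(id = parsed[[x]]$id, date = parsed[[x]]$date),
    error = function(e) NULL)  # swallow per-item errors instead of failing the whole run
  # drop anything that is not a non-empty data.table
  if (is.null(b) || !is.data.table(b) || nrow(b) == 0) return(NULL)
  b
}, mc.cores = 24)

# fill = TRUE tolerates items whose columns differ; NULL items are skipped
data = rbindlist(results, fill = TRUE)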
Hope this solves your problem.

Parse txt file with shell

I have a txt file containing the output from several commands executed on a piece of networking equipment. I want to parse this txt file so I can sort the data and print it on an HTML page.
What is the best/easiest way to do this? Export every command's output to an array and then print the arrays, sorted, into the HTML code?
The commands' outputs sit between ruler lines, and they're tabular data. Example:
*********************************************************************
# command 1
*********************************************************************
Object column1 column2 Total
-------------------------------------------------------------------
object 1 526 9484 10010
object 2 2 10008 10010
Object 3 0 20000 20000
*********************************************************************
# command 2
*********************************************************************
(... tabular data ...)
Can someone suggest any code or file where see how to make this work?
Thanks!
This can be easily done in Python with this example code:
f = open('input.txt')
rulers = 0
table = []
for line in f.readlines():
    if '****' in line:
        rulers += 1
        if rulers == 2:
            # second ruler: the command's data block starts here
            table = []
        elif rulers > 2:
            # third ruler: this one already belongs to the next command,
            # so print the finished table and count this ruler as the
            # next block's first one
            print(table)
            rulers = 1
        continue
    if line == '\n' or '----' in line or line.startswith('#'):
        continue  # skip blank lines, dashed rules and "# command" headers
    table.append(line.split())
print(table)  # print the last command's table
It just prints a list of lists of the tabular values, but that can be formatted into HTML or whatever other format you need.
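For example, a minimal sketch of turning one of those lists of rows into an HTML table (the table_to_html name is just for illustration):

def table_to_html(table):
    # table is a list of rows, each row a list of cell strings
    rows = "".join(
        "<tr>" + "".join("<td>%s</td>" % cell for cell in row) + "</tr>"
        for row in table
    )
    return "<table>" + rows + "</table>"

print(table_to_html([["object", "1", "526", "9484", "10010"],
                     ["object", "2", "2", "10008", "10010"]]))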
Alternatively, import the file into your spreadsheet software, export to HTML from there, and modify as needed.

Is it possible to write a table to a file in JSON format in R?

I'm making word frequency tables with R, and the preferred output format would be a JSON file, something like:
{
"word" : "dog",
"frequency" : 12
}
Is there any way to save the table directly into this format? I've been using the write.csv() function and converting the output into JSON, but this is very complicated and time-consuming.
set.seed(1)
( tbl <- table(round(runif(100, 1, 5))) )
## 1 2 3 4 5
## 9 24 30 23 14
library(rjson)
sink("json.txt")
cat(toJSON(tbl))
sink()
file.show("json.txt")
## {"1":9,"2":24,"3":30,"4":23,"5":14}
or even better:
set.seed(1)
( tab <- table(letters[round(runif(100, 1, 26))]) )
a b c d e f g h i j k l m n o p q r s t u v w x y z
1 2 4 3 2 5 4 3 5 3 9 4 7 2 2 2 5 5 5 6 5 3 7 3 2 1
sink("lets.txt")
cat(toJSON(tab))
sink()
file.show("lets.txt")
## {"a":1,"b":2,"c":4,"d":3,"e":2,"f":5,"g":4,"h":3,"i":5,"j":3,"k":9,"l":4,"m":7,"n":2,"o":2,"p":2,"q":5,"r":5,"s":5,"t":6,"u":5,"v":3,"w":7,"x":3,"y":2,"z":1}
Then validate it with http://www.jsonlint.com/ to get pretty formatting. If you have a multidimensional table, you'll have to work it out a bit...
EDIT:
Oh, now I see: you want the dataset characteristics sink-ed to a JSON file. No problem, just give us some sample data and I'll work on the code a bit. Practically, you need to get the data into the desired format, then convert it to JSON; a list should suffice. Give me a sec, I'll update my answer.
EDIT #2:
Well, time is relative... it's common knowledge... Here you go:
( dtf <- structure(list(word = structure(1:3, .Label = c("cat", "dog",
"mouse"), class = "factor"), frequency = c(12, 32, 18)), .Names = c("word",
"frequency"), row.names = c(NA, -3L), class = "data.frame") )
## word frequency
## 1 cat 12
## 2 dog 32
## 3 mouse 18
dtf here is a simple data.frame; if yours is not, coerce it! Long story short, you can do:
toJSON(as.data.frame(t(dtf)))
## [1] "{\"V1\":{\"word\":\"cat\",\"frequency\":\"12\"},\"V2\":{\"word\":\"dog\",\"frequency\":\"32\"},\"V3\":{\"word\":\"mouse\",\"frequency\":\"18\"}}"
I thought I'd need some melt for this one, but a simple t did the trick. Now you only need to deal with the column names after transposing the data.frame. t coerces a data.frame to a matrix, so you need to convert it back to a data.frame. I used as.data.frame, but you can also use toJSON(data.frame(t(dtf))) - you'll get X instead of V as the variable name. Alternatively, you can use a regexp to clean up the JSON file (if needed), but that's lousy practice; try to work it out by preparing the data.frame instead.
I hope this helped a bit...
These days I would typically use the jsonlite package.
library("jsonlite")
toJSON(mydatatable, pretty = TRUE)
This turns the data table into a JSON array of key/value pair objects directly.
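For instance, with the dtf data frame from the earlier answer (a sketch; note that jsonlite converts factor columns to strings by default):

library(jsonlite)
dtf <- data.frame(word = c("cat", "dog", "mouse"), frequency = c(12, 32, 18))
toJSON(dtf, pretty = TRUE)
## yields a JSON array like [{"word": "cat", "frequency": 12}, ...]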
RJSONIO is a package "that allows conversion to and from data in Javascript object notation (JSON) format". You can use it to export your object as a JSON file.
library(RJSONIO)
writeLines(toJSON(anobject), "afile.JSON")