Reading tab delimited file section by section - csv

My tab delimited txt file has this format:
Question 1 [tab] Answer 1
Question 2 [tab] Answer 2
Question 3 [tab] Answer 3
Etc.
I need a bash script that can read the file step by step like this:
Question 1
[Press any key to continue...]
Answer 1
[Press any key to continue...]
Question 2
[Press any key to continue...]
Answer 2
Etc.
For studying purposes.
What I got...
#!/bin/bash
filename='test.txt'
while IFS=$'\t' read -r question answer; do
printf "%b\n" "Q: ${question}"
printf "%b\n" "A: ${answer}"
read -p 'Press any key to continue...'
done < "$filename"
Not working :( Help please!
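The usual culprit here: the inner read -p shares the loop's redirected stdin, so instead of waiting for a keypress it silently consumes the next line of test.txt. A minimal sketch of a fix (not from an answer in the thread, just the standard workaround), reading the pauses from /dev/tty and alternating the question/answer pauses as described above:

#!/bin/bash
filename='test.txt'
while IFS=$'\t' read -r question answer; do
    printf '%b\n' "Q: ${question}"
    # read the keypress from the terminal, not from the redirected file
    read -rsn1 -p 'Press any key to continue...' < /dev/tty
    printf '\n%b\n' "A: ${answer}"
    read -rsn1 -p 'Press any key to continue...' < /dev/tty
    printf '\n'
done < "$filename"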

Related

"Argument list too long" while slurping JSON files [duplicate]

I have thousands of JSON files, and I want to merge them into a single one. I'm using the command below to do this.
jq -s . -- *.json > result.json
But I am getting an "Argument list too long" error, probably because of the number of files I'm trying to merge. Is there any workaround for this issue?
Built-in commands are immune to that limitation, and printf is one of them. In conjunction with xargs, it can be used to achieve this:
printf '%s\0' *.json | xargs -0 cat -- | jq -s .
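If the merged document should still end up in result.json as in the original command, the redirect can simply be appended (a small variant of the line above; nothing else changes):
printf '%s\0' *.json | xargs -0 cat -- | jq -s . > result.json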

Prettify a one-line JSON file [closed]

I downloaded a 203775480-byte (~200 MiB; the exact size matters for a later error) JSON file which has all entries on one line. Needless to say, my text editor (ViM) cannot efficiently navigate it and I'm not able to understand anything from it. I'd like to prettify it. I tried cat file.json | jq '.', jq '.' file.json, and cat file.json | python -m json.tool, but none worked. The first two commands print nothing on stdout, while the latter says Expecting object: line 1 column 203775480 (char 203775479).
I guess it's broken somewhere near the end, but of course I cannot understand where as I cannot even navigate it.
Have you got some other idea for prettifying it? (I've also tried gg=G in ViM: it did not work).
I found that the file was indeed broken: I happened to notice a '[' at the beginning of the file, so I struggled my way to the end of the file and added the missing ']' there (it took me maybe 5 minutes).
Then I re-ran cat file.json | python -m json.tool and it worked like a charm.
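For future reference, a quick way to locate this kind of breakage without opening the file, assuming jq is installed: jq's empty filter parses the input, produces no output, and reports the first syntax error with its position; once the file parses cleanly, jq can write the prettified copy to a new file.
jq empty file.json             # parse only; prints the first parse error and its position
jq . file.json > pretty.json   # prettify into a new file once the input is valid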

Get difference between two csv files based on column using bash

I have two csv files, a.csv and b.csv; both come with no headers and each value in a row is separated by \t.
a.csv:
1 apple
2 banana
3 orange
4 pear
b.csv:
apple 0.89
banana 0.57
cherry 0.34
I want to subtract these two files and get the difference between the second column in a.csv and the first column in b.csv, something like a.csv[1] - b.csv[0], which would give me another file c.csv that looks like:
orange
pear
Instead of using Python or other programming languages, I want to use a bash command to complete this task. I found out that awk would be helpful, but I'm not sure how to write the correct command. Here is another similar question, but its second answer uses awk '{print $2,$6-$13}' to get the difference between numeric values rather than a difference based on occurrence.
Thanks, and I appreciate any help.
You can easily do this with Steve's answer from the link you are referring to, with a bit of a tweak. I'm not sure the other answer using paste will solve this problem.
Create a hash-map from the file b.csv and compare it against the 2nd column in a.csv:
awk -v FS="\t" 'BEGIN { OFS = FS } FNR == NR { unique[$1]; next } !($2 in unique) { print $2 }' b.csv a.csv
To redirect the output to a new file, append > c.csv at the end of the previous command.
Set the field separators (input and output) to \t, since you are reading a tab-delimited file.
The FNR == NR { action } { action } f1 f2 pattern is a general construct found in many awk commands; it is used when you need to act on more than one file. The block right after FNR == NR is executed on the first file argument provided, and the next block within {..} runs on the second file argument.
The part unique[$1]; next builds a hash-map unique whose keys are the values in the first column of b.csv; this block runs for every line of that file.
After b.csv has been completely processed, on the next file a.csv we apply !($2 in unique), which selects the lines whose $2 is not among the keys of the unique hash-map built from the first file.
For those lines, print only the second column: { print $2 }.
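Putting it together with the redirect mentioned above, and assuming the sample a.csv and b.csv from the question, the full invocation would be:
awk -v FS="\t" 'BEGIN { OFS = FS } FNR == NR { unique[$1]; next } !($2 in unique) { print $2 }' b.csv a.csv > c.csv
c.csv should then contain orange and pear, one per line.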
Assuming your real data is sorted on the columns you care about like your sample data is:
$ comm -23 <(cut -f2 a.tsv) <(cut -f1 b.tsv)
orange
pear
This uses comm to print out the entries in the first file that aren't in the second one, after using cut to get just the columns you care about.
If not already sorted:
comm -23 <(cut -f2 a.tsv | sort) <(cut -f1 b.tsv | sort)
If you want to use Miller (https://github.com/johnkerl/miller), a clean and easy tool, the command could be
mlr --nidx --fs "\t" join --ul --np -j join -l 2 -r 1 -f a.csv then cut -f 2 b.csv
It gives you
orange
pear
It is a join that does not emit paired records and emits only the unpaired records from the left file (a.csv).

How to use Bash to create arrays with values from the same line of many files?

I have a number of files (in the same folder) all with the same number of lines:
a.txt
20
3
10
15
15
b.txt
19
4
5
8
8
c.txt
2
4
9
21
5
Using Bash, I'd like to create an array of arrays that contain the value of each line in every file. So, line 1 from a.txt, b.txt, and c.txt. The same for lines 2 to 5, so that in the end it looks like:
[
[20, 19, 2],
[3, 4, 4],
...
[15, 8, 5]
]
I'm actually using jq to get these lists in the first place, as they're originally specific values within a JSON file I download every X minutes. I used jq to get the values I needed into different files as I thought that would get me further, but now I'm not sure that was the way to go. If it helps, here is the original JSON file I download and start with.
I've looked at various questions that somewhat deal with this:
Creating an array from a text file in Bash
Bash Script to create a JSON file
JQ create json array using bash
Among others. But none of these deal with taking the value of the same line from various files. I don't know Bash well enough to do this and any help is greatly appreciated.
Here’s one approach:
$ jq -c -n '[$a,$b,$c] | transpose' --slurpfile a a.txt --slurpfile b b.txt --slurpfile c c.txt
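With the sample a.txt, b.txt and c.txt above, this should print a single compact line along these lines:
[[20,19,2],[3,4,4],[10,5,9],[15,8,21],[15,8,5]]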
Generalization to an arbitrary number of files
In the following, we'll assume that the files to be processed can be specified by *.txt in the current directory:
jq -n -c '
[reduce inputs as $i ({}; .[input_filename] += [$i]) | .[]]
| transpose' *.txt
Use paste to join the files, then read the input as raw text, splitting on the tabs inserted by paste:
$ paste a.txt b.txt c.txt | jq -Rc 'split("\t") | map(tonumber)'
[20,19,2]
[3,4,4]
[10,5,9]
[15,8,21]
[15,8,5]
If you want to gather the entire result into a single array, pipe it into another instance of jq in slurp mode. (There's probably a way to do it with a single invocation of jq, but this seems simpler.)
$ paste a.txt b.txt c.txt | jq -R 'split("\t") | map(tonumber)' | jq -sc
[[20,19,2],[3,4,4],[10,5,9],[15,8,21],[15,8,5]]
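For what it's worth, the single-invocation version hinted at above can be done by reading raw lines with -R/-n and collecting them with inputs (a sketch, not part of the original answer):
paste a.txt b.txt c.txt | jq -Rnc '[inputs | split("\t") | map(tonumber)]'
This should emit the same nested array in one pass.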
I could not come up with a simple way, but here's one way I got this to work.
1. Join files and create CSV-like file
If your machine has join, you can create joined records from two files (like the JOIN command in SQL).
To do this, make sure your files are sorted.
The easiest way, I think, is just to number each line; the line number works like a primary key in SQL.
$ cat a.txt | nl > a.txt.nl
$ cat b.txt | nl > b.txt.nl
$ cat c.txt | nl > c.txt.nl
Now you can join the sorted files into one. Note that join can only join two files at once; this is why I piped the output into the next join.
$ join a.txt.nl b.txt.nl | join - c.txt.nl > conc.txt
now conc.txt is:
1 20 19 2
2 3 4 4
3 10 5 9
4 15 8 21
5 15 8 5
2. Create JSON from the CSV-like file
This part is a little more complicated.
jq -Rsn '
[inputs
| . / "\n"
| (.[] | select((. | length) > 0) | . / " ") as $input
| [$input[1], $input[2], $input[3] ] ]
' <conc.txt
Actually I do not know the detailed syntax or usage of jq, but it seems to do the following:
split the input file by \n
split each line by spaces, then select the valid (non-empty) lines
put the split records in the appropriate positions by their index
I used this question as a reference:
https://stackoverflow.com/a/44781106/10675437
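Not part of the answer above, but a shorter variant under the same assumptions (conc.txt is space-separated with the line number in the first field) that also converts the values to numbers instead of leaving them as strings:
jq -Rn '[inputs | select(length > 0) | (. / " ")[1:] | map(tonumber)]' <conc.txt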

Search .csv file of email addresses for public pgp-keys

I have a list of email addresses of friends (.csv) and I want to see if they have public keys stored on pgp keyservers. I want to get this going on a Mac.
The pgp part is not the problem; however, I can't get my head around the for loop to go through each element in the file...
for add in {cat contacts.csv | grep @}; do gpg --search-keys $add; done
Don't write a loop just to run a single command for each line of a file; use xargs instead. cat is also not required here.
This small snippet does what you're trying to achieve:
grep @ contacts.csv | xargs -n 1 gpg --search-keys
If you insist on a loop, use the right syntax ($( ... ) runs the command in a subshell):
for add in $( grep @ contacts.csv ); do gpg --search-keys "$add"; done
I answered a similar, though not identical, question on Security Stack Exchange and on Stack Overflow; you might also get some inspiration there.
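The snippets above assume each matching line is just an address. If the CSV actually has several comma-separated fields and the address sits in, say, the second one (an assumption; adjust the field number to your file), cut can isolate it first:
cut -d, -f2 contacts.csv | grep @ | xargs -n 1 gpg --search-keys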