How do I convert the output from ls to columns or a table? (Preferably "filelessly") - multiple-columns

Say I have a folder with a range input/output files and I want to put the input-files in one column and the output-files in another column next to the first to compare them:
$ ls
input1.inp input10.inp input2.inp input3.inp input4.inp input5.inp
input6.inp input7.inp input8.inp input9.inp output1.dat output10.dat
output2.dat output3.dat output4.dat output5.dat output6.dat output7.dat
output8.dat output9.dat
And I would like the output as:
input1.inp output1.dat
input10.inp output10.dat
input2.inp output2.dat
input3.inp output3.dat
input4.inp output4.dat
input5.inp output5.dat
input6.inp output6.dat
input7.inp output7.dat
input8.inp output8.dat
input9.inp output9.dat
I assume this can be done using a combination of ls and columnsor some other programs/tools. I just can't figure out a good combination of commands and/or pipes/redirects. I figured that I could achieve this with vimdiff
vimdiff <(ls -1 *.inp) <(ls -1 *.dat)
Though I'd prefer getting the output printed directly on screen (stdout(?)).
Well, slightly embarrassed and a few "doh" moments later, I am closer but not quite there. So, using paste I am able to get pretty close to what I want:
paste <(ls -1 *.inp) <(ls -1 *.dat)
This works as long as input and output filenames correlate as in the example and are produced "in order".
input1.inp output1.dat
input10.inp output10.dat
... ...
... ...
input7.inp output7.dat
input8.inp
input9.inp
However, if there are a mismatch in the order, one column gets shifted and "matches" don't line up.
input1.inp output1.dat
input10.inp output10.dat
... ...
... ...
inputX.inp output7.dat
input7.inp
input8.inp
input9.inp
The same is obviously observed with:
diff -y <(ls -1 *.inp) <(ls -1 *.dat)
Obviously, aligning the outputs is a bigger issue than I imagined as the filenames are different with either different prefix or suffix. Hence, every "line" is different when using diff.
I am going to close this topic as I have something that works right now, just not exactly how I'd imagined it. Though I'll leave it open for a little while longer as just to see of anyone has any ideas.

Either paste <(ls -1 *.inp) <(ls -1 *.dat) or diff -y <(ls -1 *.inp) <(ls -1 *.dat) did what I was asking for. Then I ended up wanting even more...
Still, the original problem was solved using either solution and I am happy with it for now.

Related

Formatting wide output via 'column' (or similar) command(s)

This question actually asks the 'inverse' solution as the one here, namely I would like to wrap the long column (column 4) on multiple lines. In effect, the output should look like:
cat test.csv | column -s"," -t -c5
col1 col2 col3 col4 col5
1 2 3 longLineOfText 5
ThatIWantTo
InspectAndWould
LikeToWrap
(excuse the u.u.o.c. duplicated over here :) )
The solution would ideally :
make use of standard *nix text processing utilities (e.g. column, paste, pr which usually are present on any modern Linux machine nowadays, usually coming from the core-utils package);
avoid jq as it is not necessarily present on every (production) system;
don't overheat the brain: yes... am looking mainly at you awk & co. gurus :). "Normal" awk / perl / sed is fine.
as a special bonus , a solution using vim would be even more welcome (again, no brain smoke please), since that would allow for syntax-coloring as well.
The background: I want to be able to make sense of the output of docker history, so as a last resort even some Go Template-magic would suit, as would using jq.
In extreme cases (if the benefits of ease-of-remembering-and-use outweigh the inconvenience of downloading a new utilty (preferably self-contained / static linked) utility on the server - is ok, or using json processing commands (in which case using pythons json module would be preferred)
Thanks !
LE:
Please keep in mind, that dockers output has the columns separated with several spaces, which unfortunately confuses most commands :(

How to get sift to use a pager?

I recently stumbled upon sift ("A fast and powerful alternative to grep."). I quite like its feature-set, but there's one thing I'm missing: paged output (with colors).
I mean, it can be done:
$ sift 'search_string' --color | less --RAW-CONTROL-CHARS
The --color I can put into .sift.conf, but the | less -R I have to repeat for every invocation.
With ack, I can stick this into the .ackrc and forget about it (Using ack, how do you display one page at a time? / http://beyondgrep.com/documentation/).
How do I do this with sift?

Extracting CREATE TABLE definitions from MySQL dump?

I have a MySQL dump file over 1 terabyte big. I need to extract the CREATE TABLE statements from it so I can provide the table definitions.
I purchased Hex Editor Neo but I'm kind of disappointed I did. I created a regex CREATE\s+TABLE(.|\s)*?(?=ENGINE=InnoDB) to extract the CREATE TABLE clause, and that seems to be working well testing in NotePad++.
However, the ETA of extracting all instances is over 3 hours, and I cannot even be sure that it is doing it correctly. I don't even know if those lines can be exported when done.
Is there a quick way I can do this on my Ubuntu box using grep or something?
UPDATE
Ran this overnight and output file came blank. I created a smaller subset of data and the procedure is still not working. It works in regex testers however, but grep is not liking it and yielding an empty output. Here is the command I'm running. I'd provide the sample but I don't want to breach confidentiality for my client. It's just a standard MySQL dump.
grep -oP "CREATE\s+TABLE(.|\s)+?(?=ENGINE=InnoDB)" test.txt > plates_schema.txt
UPDATE
It seems to not match on new lines right after the CREATE\s+TABLE part.
You can use Perl for this task... this should be really fast.
Perl's .. (range) operator is stateful - it remembers state between evaluations.
What it means is: if your definition of table starts with CREATE TABLE and ends with something like ENGINE=InnoDB DEFAULT CHARSET=utf8; then below will do what you want.
perl -ne 'print if /CREATE TABLE/../ENGINE=InnoDB/' INPUT_FILE.sql > OUTPUT_FILE.sql
EDIT:
Since you are working with a really large file and would probably like to know the progress, pv can give you this also:
pv INPUT_FILE.sql | perl -ne 'print if /CREATE TABLE/../ENGINE=InnoDB/' > OUTPUT_FILE.sql
This will show you progress bar, speed and ETA.
You can use the following:
grep -ioP "^CREATE\s+TABLE[\s\S]*?(?=ENGINE=InnoDB)" file.txt > output.txt
If you can run mysqldump again, simply add --no-data.
Got it! grep does not support matching across multiple lines. I found this question helpul and I ended up using pcregrep instead.
pcregrep -M "CREATE\s+TABLE(.|\n|\s)+?(?=ENGINE=InnoDB)" test.txt > plates.schema.txt

How to annote a large file with mercurial?

i'm trying to find out who did the last change on a certain line in a large xml file ( ~100.00 lines ).
I tried to use the log file viewer of thg but it does not give me any results.
Using hg annote file.xml -l 95000 .. took forever and eventually died with an error message.
Is there a way to annote a single line in a large file that does not take forever ?
You can use hg grep to dig into a file if you have interest in very specific text:
hg grep "text-to-search" file.txt
You will likely need to add the --all switch to get every change that matches and then limit your results to a specific changeset range with -r firstchage:lastchange syntax.
I don't have a file on hand of the size you are working with, so this may also have trouble past a certain point, particularly if the search string matches many many lines in the file. But if you can get as specific as possible with your search string, you should be able to track it down.

DIFF utility works for 2 files. How to compare more than 2 files at a time?

So the utility Diff works just like I want for 2 files, but I have a project that requires comparisons with more than 2 files at a time, maybe up to 10 at a time. This requires having all those files side by side to each other as well. My research has not really turned up anything, vimdiff seems to be the best so far with the ability to compare 4 at a time.
My question: Is there any utility to compare more than 2 files at a time, or a way to hack diff/vimdiff so it can do multiple comparisons? The files I will be comparing are relatively short so it should not be too slow.
Displaying 10 files side-by-side and highlighting differences can be easily done with Diffuse. Simply specify all files on the command line like this:
diffuse 1.txt 2.txt 3.txt 4.txt 5.txt 6.txt 7.txt 8.txt 9.txt 10.txt
Vim can already do this:
vim -d file1 file2 file3
But you're normally limited to 4 files. You can change that by modifying a single line in Vim's source, however. The constant DB_COUNT defines the maximum number of diffed files, and it's defined towards the top of diff.c in versions 6.x and earlier, or about two thirds of the way down structs.h in versions 7.0 and up.
diff has built-in option --from-file and --to-file, which compares one operand to all others.
--from-file=FILE1
Compare FILE1 to all operands. FILE1 can be a directory.
--to-file=FILE2
Compare all operands to FILE2. FILE2 can be a directory.
Note: argument name --to-file is optional.
e.g.
# this will compare foo with bar, then foo with baz .html files
$ diff --from-file foo.html bar.html baz.html
# this will compare src/base-main.js with all .js files in git repo,
# that has 'main' in their filename or path
$ git ls-files :/*main*.js | xargs diff -u --from-file src/base-main.js
Checkout "Beyond Compare": http://www.scootersoftware.com/
It lets you compare entire directories of files, and it looks like it runs on Linux too.
if your running multiple diff's based off one file you could probably try writing a script that has a for loop to run through each directory and run the diff. Although it wouldn't be side by side you could at least compare them quickly. hope that helped.
Not answering the main question, but here's something similar to what Benjamin Neil has suggested but diffing all files:
Store the filenames in an array, then loop over the combinations of size two and diff (or do whatever you want).
files=($(ls -d /path/of/files/some-prefix.*)) # Array of files to compare
max=${#files[#]} # Take the length of that array
for ((idxA=0; idxA<max; idxA++)); do # iterate idxA from 0 to length
for ((idxB=idxA + 1; idxB<max; idxB++)); do # iterate idxB + 1 from idxA to length
echo "A: ${files[$idxA]}; B: ${files[$idxB]}" # Do whatever you're here for.
done
done
Derived from #charles-duffy's answer: https://stackoverflow.com/a/46719215/1160428
There is a simple an good way to do this = GREP.
Depending on the size of the text you can copy and paste it, or you can redirect the input of the file to the grep command. If you make a grep -vir /path to make a reverse search or a grep -ir /path. This is my way for certification exams.