Check for duplicates across two files, keep only the non-duplicates in one (Notepad++)

How do I check for duplicates across two files (file1.txt and file2.txt)?
Some lines in file1.txt also appear in file2.txt. I want those duplicate lines removed from file1.txt, so that file1.txt is left with only the non-duplicates.
Can I use Notepad++ (plugins are fine) or any other software to do this?
I hope that was clear. I guess there is an easy solution for this; my brain just isn't working right now.

You could create a third file and use the TextFX plugin to sort the lines and keep only the unique ones
(credit for the TextFX usage goes to https://www.cathrinewilhelmsen.net/2012/05/16/notepad-remove-duplicates-remove-blank-lines-sort-data/).
To combine the two text files into one, run from cmd:
type file1.txt > file3.txt
type file2.txt >> file3.txt
(note the double >> in the second command, which appends instead of overwriting)
Then open file3.txt in Notepad++ and follow the TextFX steps from the link above.
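Sorting the combined file keeps one copy of every line; it does not drop the shared lines from file1.txt. If a small script is acceptable, here is a minimal Python sketch of the removal itself, assuming exact whole-line matches and UTF-8 text. The function names are made up for illustration.

```python
from pathlib import Path


def remove_lines_found_in(lines, other_lines):
    """Return the lines that do not occur in other_lines, keeping their order."""
    seen = set(other_lines)
    return [line for line in lines if line not in seen]


def filter_file(path_a="file1.txt", path_b="file2.txt"):
    """Rewrite path_a so that it keeps only the lines absent from path_b."""
    a = Path(path_a).read_text(encoding="utf-8").splitlines()
    b = Path(path_b).read_text(encoding="utf-8").splitlines()
    kept = remove_lines_found_in(a, b)
    Path(path_a).write_text("".join(line + "\n" for line in kept), encoding="utf-8")
    return kept
```

Calling filter_file() overwrites file1.txt in place, so work on a copy first.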

Related

How to add html attributes and values for all lines quickly with vim and plugins?

My OS: Debian 8.
uname -a
Linux debian 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1+deb8u2 (2017-03-07) x86_64 GNU/Linux
Here is my base file.
home
help
variables
compatibility
modelines
searching
selection
markers
indenting
reformatting
folding
tags
makefiles
mapping
registers
spelling
plugins
etc
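I want to create an HTML file as below.
<a href="home.html" id="home">home</a>
<a href="help.html" id="help">help</a>
<a href="variables.html" id="variables">variables</a>
<a href="compatibility.html" id="compatibility">compatibility</a>
<a href="modelines.html" id="modelines">modelines</a>
<a href="searching.html" id="searching">searching</a>
<a href="selection.html" id="selection">selection</a>
<a href="markers.html" id="markers">markers</a>
<a href="indenting.html" id="indenting">indenting</a>
<a href="reformatting.html" id="reformatting">reformatting</a>
<a href="folding.html" id="folding">folding</a>
<a href="tags.html" id="tags">tags</a>
<a href="makefiles.html" id="makefiles">makefiles</a>
<a href="mapping.html" id="mapping">mapping</a>
<a href="registers.html" id="registers">registers</a>
<a href="spelling.html" id="spelling">spelling</a>
<a href="plugins.html" id="plugins">plugins</a>
<a href="etc.html" id="etc">etc</a>
Every line gets an href and an id attribute: the href value is the line content with .html appended, and the id value is the line content itself.
How can I add HTML attributes and values to all lines quickly with vim and plugins?
Solutions using sed, awk, or Sublime Text 3 are all welcome.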
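$ sed 's:.*:<a href="&.html" id="&">&</a>:' file
<a href="home.html" id="home">home</a>
<a href="help.html" id="help">help</a>
<a href="variables.html" id="variables">variables</a>
<a href="compatibility.html" id="compatibility">compatibility</a>
<a href="modelines.html" id="modelines">modelines</a>
<a href="searching.html" id="searching">searching</a>
<a href="selection.html" id="selection">selection</a>
<a href="markers.html" id="markers">markers</a>
<a href="indenting.html" id="indenting">indenting</a>
<a href="reformatting.html" id="reformatting">reformatting</a>
<a href="folding.html" id="folding">folding</a>
<a href="tags.html" id="tags">tags</a>
<a href="makefiles.html" id="makefiles">makefiles</a>
<a href="mapping.html" id="mapping">mapping</a>
<a href="registers.html" id="registers">registers</a>
<a href="spelling.html" id="spelling">spelling</a>
<a href="plugins.html" id="plugins">plugins</a>
<a href="etc.html" id="etc">etc</a>
If you want to do this in vi itself, no plug-in is necessary.
Open the file, type : and enter this line as the command:
%s:.*:<a href="&.html" id="&">&</a>:
It will make all the substitutions in the file.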
sed is the best solution (simple and pretty fast here) if you are sure of the content. If not, it needs a bit more complexity, which is better handled by awk:
awk '
{
# keep three copies of the line: original, URL-escaped, HTML-escaped
Org = URL = HTML = $0
# sample modifications
gsub( / /, "%20", URL)
gsub( /</, "%3C", HTML)
printf( "<a href=\"%s.html\" id=\"%s\">%s</a>\n", URL, Org, HTML)
}
' YourFile
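The same escaping idea can be sketched in Python, which is handy when the lines may contain spaces, quotes or angle brackets. This is only an illustration: make_anchor is a made-up name, and the choice of urllib.parse.quote for the URL part and html.escape for the attribute and text parts is an assumption about what "safe" should mean here.

```python
import html
from urllib.parse import quote


def make_anchor(line):
    """Build <a href="LINE.html" id="LINE">LINE</a> with URL and HTML escaping."""
    text = line.strip()
    href = quote(text) + ".html"  # percent-encode spaces, "<", etc. for the URL
    return '<a href="{}" id="{}">{}</a>'.format(
        html.escape(href, quote=True),  # safe inside a double-quoted attribute
        html.escape(text, quote=True),
        html.escape(text),              # safe as element content
    )


def convert(lines):
    """Convert every non-blank line to an anchor."""
    return [make_anchor(line) for line in lines if line.strip()]
```

For plain words such as the base file above, the output is identical to what the sed command produces.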
To complete this easily in Sublime Text, without any plugins added:
Open the base file in Sublime Text
Type Ctrl+Shift+P and in the fuzzy search input type syn html to set the file syntax to HTML.
In the View menu, make sure Word Wrap is toggled off.
Ctrl+A to select all.
Ctrl+Shift+L to break selection into multi-line edit.
Ctrl+C to copy selection into clipboard as multiple lines.
Alt+Shift+W to wrap each line with a tag, then tap a to convert the default <p> tag into an <a> tag (hit Esc to quit any context menus that might pop up).
Type a space, then href=". You should see this being added to every line, as they all have cursors. Note that Sublime has automatically closed your quotes for you, so you have href="" with the cursor between the quotes.
Ctrl+V: this is where the magic happens. Your clipboard contains every line's worth of contents, so it pastes the appropriate value into the quotes where each cursor is. Then simply type .html to add the extension.
Use the right arrow to move the cursors outside of the quotes for the href attribute and follow the two previous steps to similarly add an id attribute with the intended ids pasted in.
Voila! You're done.
Multi-line editing is very powerful as you learn how to combine it with other keyboard shortcuts. It has been a huge improvement in my workflow. If you have any questions please feel free to comment and I'll adjust as needed.
With a bash one-liner:
while IFS= read -r v; do printf '<a href="%s.html" id="%s">%s</a>\n' "$v" "$v" "$v"; done < file
(OR)
while IFS= read -r v; do echo "<a href=\"$v.html\" id=\"$v\">$v</a>"; done < file
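Try this:
awk '{print a$1b$1c$1d}' a='<a href="' b='.html" id="' c='">' d='</a>' file
<a href="home.html" id="home">home</a>
<a href="help.html" id="help">help</a>
<a href="variables.html" id="variables">variables</a>
<a href="compatibility.html" id="compatibility">compatibility</a>
<a href="modelines.html" id="modelines">modelines</a>
<a href="searching.html" id="searching">searching</a>
<a href="selection.html" id="selection">selection</a>
<a href="markers.html" id="markers">markers</a>
<a href="indenting.html" id="indenting">indenting</a>
<a href="reformatting.html" id="reformatting">reformatting</a>
<a href="folding.html" id="folding">folding</a>
<a href="tags.html" id="tags">tags</a>
<a href="makefiles.html" id="makefiles">makefiles</a>
<a href="mapping.html" id="mapping">mapping</a>
<a href="registers.html" id="registers">registers</a>
<a href="spelling.html" id="spelling">spelling</a>
<a href="plugins.html" id="plugins">plugins</a>
<a href="etc.html" id="etc">etc</a>
Here I have created four variables, a, b, c and d, which you can edit as you like.
OR
while read -r i; do echo "<a href=\"$i.html\" id=\"$i\">$i</a>"; done < f
<a href="home.html" id="home">home</a>
<a href="help.html" id="help">help</a>
<a href="variables.html" id="variables">variables</a>
<a href="compatibility.html" id="compatibility">compatibility</a>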
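To execute it directly in vim:
:!sed 's:.*:<a href="&.html" id="&">&</a>:' %
In awk, no regex, no nothing, just print strings around the $1s, escaping the "s:
$ awk '{print "<a href=\"" $1 ".html\" id=\"" $1 "\">" $1 "</a>"}' file
<a href="home.html" id="home">home</a>
<a href="help.html" id="help">help</a>
If you happen to have empty lines in there, just add /./ before the {:
/./{print ...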
list=$(cat basefile.txt)
for val in $list
do
echo "<a href=\"$val.html\" id=\"$val\">$val</a>" >> newfile.html
done
Using bash, you can put this in a script or type it at the command line.
This vim replacement pattern handles your base file:
%s#^\s*\(.\{-}\)\s*$#<a href="\1.html" id="\1">\1</a>#
^\s* matches any leading spaces, then
.\{-} captures everything after that, non-greedily, allowing
\s*$ to match any trailing spaces.
This avoids giving you output like <a href="home .html" id="home ">home </a>.
You can also process several base files with vim at once:
vim -c 'bufdo %s#^\s*\(.\{-}\)\s*$#<a href="\1.html" id="\1">\1</a># | saveas! %:p:r.html' some.txt more.txt
bufdo %s#^\s*\(.\{-}\)\s*$#<a href="\1.html" id="\1">\1</a># runs the replacement on each buffer loaded into vim,
saveas! %:p:r.html saves each buffer with an html extension, overwriting if necessary,
vim will open and show you the saved more.html, which you can correct as needed, and
you can use :n and :prev to visit some.html.
Something like sed is probably best for big jobs, but this lets you tweak the conversions in vim right after it has made them, use :u to undo, etc. Enjoy!
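For comparison, the trim-then-wrap step and the "save next to the original with an .html extension" step could be sketched in Python as follows. The function names are hypothetical, and the sketch assumes UTF-8 files and does no escaping of special characters.

```python
from pathlib import Path


def anchors_from_text(text):
    """Turn each non-blank line into an anchor, trimming surrounding whitespace."""
    out = []
    for line in text.splitlines():
        name = line.strip()
        if name:
            out.append(f'<a href="{name}.html" id="{name}">{name}</a>')
    return "\n".join(out) + "\n" if out else ""


def convert_files(paths):
    """Write a sibling .html file for every input path and return the new paths."""
    written = []
    for path in map(Path, paths):
        target = path.with_suffix(".html")
        source = path.read_text(encoding="utf-8")
        target.write_text(anchors_from_text(source), encoding="utf-8")
        written.append(target)
    return written
```

convert_files(["some.txt", "more.txt"]) would write some.html and more.html, overwriting existing files of those names.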

how do I open files with conflicts during git/mercurial merge in textmate/sublime

How do I open, from a terminal window, only the files with conflicts during a git/mercurial merge in the TextMate or Sublime Text 2 editors?
You can use the following to open all files with git merge conflicts in Sublime Text:
git diff --name-only | uniq | xargs subl
I wanted to add another answer. git diff --name-only gives you all files that have diffs. This is why it sometimes yields duplicate entries: it lists a file as "modified" as well as in a merge-conflict state. Piping it into uniq is a good fix for that, but git diff --name-only also includes files you may have purposely changed, so it doesn't filter down to only the files with merge conflicts. In the middle of a rebase this probably won't happen often, so I would say @StephanRodemeier's answer works in most cases.
However, you can leverage the --diff-filter option, which selects files by their status. See more in the docs:
--diff-filter=[(A|C|D|M|R|T|U|X|B)…​[*]]
Select only files that are Added (A), Copied (C), Deleted (D), Modified (M), Renamed (R), have their type (i.e. regular file, symlink, submodule, …​) changed (T), are Unmerged (U), are Unknown (X), or have had their pairing Broken (B). Any combination of the filter characters (including none) can be used. When * (All-or-none) is added to the combination, all paths are selected if there is any file that matches other criteria in the comparison; if there is no file that matches other criteria, nothing is selected.
It seems that when files are in the "both modified" state, the diff status is set to both U (Unmerged) and M (Modified), so you can filter for only Unmerged files.
git diff --diff-filter=U --name-only | xargs subl
This should work without needing to pipe into uniq.
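If you prefer a script over a pipeline, here is a rough Python sketch of the same idea. The helper names are invented, and it assumes git is on PATH and that the editor command (subl by default here) accepts a list of file paths.

```python
import subprocess


def unique_paths(output):
    """Split `git diff --name-only` output into paths, dropping blanks and repeats."""
    return list(dict.fromkeys(line for line in output.splitlines() if line))


def conflicted_files():
    """Ask git for unmerged paths; run this inside a repository mid-merge."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=U"],
        capture_output=True, text=True, check=True,
    )
    return unique_paths(result.stdout)


def open_in_editor(paths, editor="subl"):
    """Open the given paths in an editor command (assumed to be installed)."""
    if paths:
        subprocess.run([editor, *paths], check=True)
```

Typical use would be open_in_editor(conflicted_files()), run from the top of the repository, since git prints the paths relative to the repository root.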
Another thing you can consider is simply setting your editor as the difftool. For example, the VSCode documentation specifies how to do this by adding the following to your .gitconfig:
[diff]
tool = default-difftool
[difftool "default-difftool"]
cmd = code --wait --diff $LOCAL $REMOTE

match data in two notepads

I have two notepad files, each containing some data. Let's call them Notepad 1 and Notepad 2.
Notepad 1 contains: A, B, C
Notepad 2 contains: C, D, E
How can I find the data in Notepad 2 that also appears in Notepad 1? Here the answer is C. But I have lots of data in both files, so it is not practical to take each item from Notepad 1 and press Ctrl+F in Notepad 2 to find it. Is there a suitable method for this? Would converting these files into HTML pages make it possible?
This can be done with the comm(1) tool:
$ cat F1
A
B
C
$ cat F2
C
D
E
$ comm -12 F1 F2
C
$
The -1 suppresses all lines unique to the first file. -2 suppresses all lines unique to the second file. All that is left is lines common to both.
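Note that comm expects both inputs to be sorted. Where that is inconvenient, or comm is not available, the same "lines common to both" result can be sketched in Python with a set lookup. The function names are illustrative only, and matching is exact and whole-line.

```python
from pathlib import Path


def common_lines(lines_a, lines_b):
    """Return lines of lines_a that also occur in lines_b, in order, once each."""
    in_b = set(lines_b)
    seen = set()
    result = []
    for line in lines_a:
        if line in in_b and line not in seen:
            seen.add(line)
            result.append(line)
    return result


def common_lines_in_files(path_a, path_b):
    """Read two text files and return their common lines."""
    def read(path):
        return Path(path).read_text(encoding="utf-8").splitlines()
    return common_lines(read(path_a), read(path_b))
```

With the F1 and F2 files above, common_lines_in_files("F1", "F2") gives the single line C.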
You probably want to have a look at diff/merge tools. WinMerge is a free one. Another good option is Araxis Merge, which is commercial. You can also just use the Notepad++ editor with its Compare plugin. These tools are GUI based and can help if you want to see and edit the differences.
If you need to extract and automatically process the differences, you will more likely have to use console tools and scripting. The *nix diff command can be used to extract differences, and there are plenty of scripting languages suitable for text processing: sed, AWK, Perl and Python, for instance.

Mercurial ignore all files except specific file names

I have a large file system in which almost every folder has a file called content.txt
I want to track every file named content.txt and automatically ignore everything else. I want the repo to automatically track new files named content.txt so I don't want to ignore everything in the .hgignore and then manually add.
Anyone know how to do this?
It has to be regexp mode, not glob.
You will need to debug the path part of the regexp, but a draft for "all except content.txt" is re:.*\.(?!content.txt)
An alternative solution is to
* ignore everything
* add the content.txt file pattern to the commit command (-I option); see hg help commit and hg help patterns
hg commit -I '**content.txt'
Edit
re:.*/(?!content.txt)
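One way to debug such a draft before putting it in .hgignore is to try it on a few sample paths. The sketch below assumes that an unrooted regexp pattern behaves like Python's re.search on the repository-relative path; that is an assumption worth checking against hg help patterns, and Mercurial also tests directories, which this does not model. The function name and sample paths are made up.

```python
import re


def ignored(pattern, paths):
    """Return the paths on which the unrooted pattern matches somewhere."""
    rx = re.compile(pattern)
    return [p for p in paths if rx.search(p)]


samples = ["a/content.txt", "a/b/content.txt", "a/notes.txt"]
print(ignored(r".*/(?!content.txt)", samples))
```

Under that assumption the draft still matches a/b/content.txt, because the lookahead can succeed right after the first slash, so it would need further refinement.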
Try this:
syntax: regexp
\.(?!txt$)[^.]+$ # "*." is followed by "txt" and we're at the end
(?<!\.)txt$ # "txt" follows a "."
(?<!/)content\. # "content." follows path separator
(?<!content)\. # "." follows "content"
I left in the comments I made while experimenting, to make sense of it all. (That's glob syntax in the first one.)

DIFF utility works for 2 files. How to compare more than 2 files at a time?

So the utility Diff works just like I want for 2 files, but I have a project that requires comparisons with more than 2 files at a time, maybe up to 10 at a time. This requires having all those files side by side to each other as well. My research has not really turned up anything, vimdiff seems to be the best so far with the ability to compare 4 at a time.
My question: Is there any utility to compare more than 2 files at a time, or a way to hack diff/vimdiff so it can do multiple comparisons? The files I will be comparing are relatively short so it should not be too slow.
Displaying 10 files side-by-side and highlighting differences can be easily done with Diffuse. Simply specify all files on the command line like this:
diffuse 1.txt 2.txt 3.txt 4.txt 5.txt 6.txt 7.txt 8.txt 9.txt 10.txt
Vim can already do this:
vim -d file1 file2 file3
But you're normally limited to 4 files. You can change that by modifying a single line in Vim's source, however. The constant DB_COUNT defines the maximum number of diffed files, and it's defined towards the top of diff.c in versions 6.x and earlier, or about two thirds of the way down structs.h in versions 7.0 and up.
diff has built-in options --from-file and --to-file, which compare one operand to all the others.
--from-file=FILE1
Compare FILE1 to all operands. FILE1 can be a directory.
--to-file=FILE2
Compare all operands to FILE2. FILE2 can be a directory.
Note: argument name --to-file is optional.
e.g.
# this will compare foo with bar, then foo with baz .html files
$ diff --from-file foo.html bar.html baz.html
# this will compare src/base-main.js with all .js files in git repo,
# that has 'main' in their filename or path
$ git ls-files :/*main*.js | xargs diff -u --from-file src/base-main.js
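Where GNU diff is not available, the "one file against all the others" comparison can be approximated with Python's difflib. This is a sketch under assumptions: diff_against_base is a made-up name, and the inputs are already read into lists of lines without trailing newlines.

```python
import difflib


def diff_against_base(base_name, base_lines, others):
    """Yield unified-diff lines comparing the base with each other file.

    `others` maps a file name to its list of lines. Identical files
    contribute no output.
    """
    for name, lines in others.items():
        yield from difflib.unified_diff(
            base_lines, lines, fromfile=base_name, tofile=name, lineterm=""
        )


if __name__ == "__main__":
    base = ["a", "b"]
    for row in diff_against_base("foo.html", base, {"bar.html": ["a", "b"], "baz.html": ["a", "c"]}):
        print(row)
```

The output is sequential rather than side by side, like diff -u.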
Check out "Beyond Compare": http://www.scootersoftware.com/
It lets you compare entire directories of files, and it looks like it runs on Linux too.
If you're running multiple diffs based off one file, you could try writing a script with a for loop that runs through each directory and runs the diff. Although it wouldn't be side by side, you could at least compare them quickly. Hope that helps.
Not answering the main question, but here's something similar to what Benjamin Neil has suggested, diffing all files against each other:
Store the filenames in an array, then loop over the combinations of size two and diff (or do whatever you want).
files=($(ls -d /path/of/files/some-prefix.*)) # Array of files to compare
max=${#files[@]} # Take the length of that array
for ((idxA=0; idxA<max; idxA++)); do # iterate idxA from 0 to length
for ((idxB=idxA + 1; idxB<max; idxB++)); do # iterate idxB from idxA + 1 to length
echo "A: ${files[$idxA]}; B: ${files[$idxB]}" # Do whatever you're here for.
done
done
Derived from @charles-duffy's answer: https://stackoverflow.com/a/46719215/1160428
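The same "every combination of size two" loop can be sketched in Python with itertools.combinations. As an illustration it scores each pair with difflib.SequenceMatcher instead of printing a diff; the function name and the choice of a similarity ratio are assumptions, not part of the answer above.

```python
import difflib
from itertools import combinations


def pairwise_similarity(files):
    """Map each unordered pair of names to a similarity ratio between 0 and 1.

    `files` maps a file name to its list of lines.
    """
    scores = {}
    for (name_a, a), (name_b, b) in combinations(files.items(), 2):
        scores[(name_a, name_b)] = difflib.SequenceMatcher(None, a, b).ratio()
    return scores
```

For 10 files this visits 45 pairs, which is fine for short files.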
There is a simple and good way to do this: grep.
Depending on the size of the text, you can copy and paste it, or you can redirect the file into the grep command. Use grep -vir /path for an inverted search, or grep -ir /path for a regular one. This is my way for certification exams.