How to print tcpdump format like this - tcpdump

I want to print tcpdump output in the following format:
202 2018-06-25 18:53:54.051971 192.168.1.121 8.8.8.8 DNS 82 Standard query 0x3cba A weixin.spreadwin.com
I tried:
tcpdump -n -tttt -r ip_ok-name_failed_before.pcap
Which gives me the output:
2018-06-25 18:53:54.051971 IP 192.168.1.121.21497 > 8.8.8.8.53: 15546+ A? weixin.spreadwin.com. (38)
How do I achieve the expected output?
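
That target line matches Wireshark's packet-list columns (frame number, absolute time, source, destination, protocol, length, info), which tcpdump itself does not emit. A minimal sketch, assuming tshark (Wireshark's command-line counterpart) is available and reading the same capture file:
tshark -r ip_ok-name_failed_before.pcap -t ad
Here -t ad prints absolute timestamps including the date, as in the example line.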

Related

BLAST+ exits with error exit status (2) when using nextflow

I'm using Nextflow to analyse MinION data. BLAST+ terminates with an error exit status (2): Command exit status: 2 and Command output: (empty)
-HP-Z6-G4-Workstation:~/nextflow_pipelines/nf_pipeline/20221025_insect$ nextflow cat_working_nextflow.nf
N E X T F L O W ~ version 22.04.5
Launching `cat_working_nextflow.nf` [admiring_hopper] DSL1 - revision: 2916bc12af
executor > local (78)
[38/2d0584] process > concatinate (AIG363_pass_barcode01_0eb3c2c3_2.fastq) [100%] 38 of 38 ✔
[dd/3cabdf] process > fastqconvert (output.fastq) [100%] 38 of 38 ✔
[47/dab2cd] process > blast_raw (insect.fasta) [ 2%] 1 of 37, failed: 1
Error executing process > 'blast_raw (insect.fasta)'
Caused by:
Process `blast_raw (insect.fasta)` terminated with an error exit status (2)
Command executed:
blastn -query insect.fasta -db /home/blast/nt_db_20221011/nt -outfmt 11 -out blastrawreads.asn -evalue 0.1 -num_alignments 1
blast_formatter -archive blastrawreads.asn -outfmt 5 -out blastrawreads.xml
blast_formatter -archive blastrawreads.asn -outfmt "6 qaccver saccver pident length evalue bitscore stitle" -out blastrawreads_unsort.tsv
sort -n -r -k 6 blastrawreads_unsort.tsv > blastrawreads.tsv
Command exit status:
2
Command output:
(empty)
Command error:
Warning: [blastn] Examining 5 or more matches is recommended
BLAST Database error: No alias or index file found for nucleotide database [/home/blast/nt_db_20221011/nt] in search path [/home/nextflow_pipelines/nf_pipeline/20221025_insect/work/96/e885b7e53e1bcf30e33526265e9a3c::]
Work dir:
/home/nextflow_pipelines/nf_pipeline/20221025_insect/work/96/e885b7e53e1bcf30e33526265e9a3c
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
The nf file:
#!/usr/bin/env nextflow

//data_location
params.outdir = './results'
params.in = "$PWD/*.fastq"
dataset = Channel.fromPath(params.in)
params.db = "/home/blast/nt_db_20221011/nt"

process concatenate {
    tag "$x"
    publishDir "${params.outdir}", mode:'copy'

    input:
    path (x) from dataset

    output:
    path ("output.fastq") into cat_ch

    script:
    """
    cat $x > output.fastq
    """
}

process fastqconvert {
    tag "$y"
    publishDir "${params.outdir}", mode:'copy'

    input:
    path (y) from cat_ch

    output:
    path ("insect.fasta") into convert1_ch,convert2_ch,convert3_ch

    script:
    """
    seqtk seq -a $y > insect.fasta
    """
}

process blast_raw {
    tag "$z"
    publishDir "${params.outdir}", mode:'copy'

    input:
    path (z) from convert1_ch

    output:
    path ('blastrawreads.xml') into blastrawreads_xml_ch

    script:
    """
    blastn \
        -query $z -db ${params.db} \
        -outfmt 11 -out blastrawreads.asn \
        -evalue 0.1 \
        -num_alignments 1

    blast_formatter \
        -archive blastrawreads.asn \
        -outfmt 5 -out blastrawreads.xml

    blast_formatter \
        -archive blastrawreads.asn \
        -outfmt "6 qaccver saccver pident length evalue bitscore stitle" -out blastrawreads_unsort.tsv

    sort -n -r -k 6 blastrawreads_unsort.tsv > blastrawreads.tsv
    """
}
I can see that the insect.fasta file has been produced, has the appropriate permissions, and is located in the expected directory.
I used the following command to download the nt database
update_blastdb.pl --decompress nt --passive --source gcp
gcp is the Google Cloud source in Australia.
The nt database is ~26 GiB in size.
I really need Excel-readable, ASN and FASTA files from the BLAST results for downstream analysis.
Any help would be much appreciated.
BLAST Database error: No alias or index file found for nucleotide
database [/home/blast/nt_db_20221011/nt]
I think you should be able to re-create the above error independently of Nextflow using:
blastdbcmd -db /home/blast/nt_db_20221011/nt -info
Note that the db argument must be a dbname, not a path. For /home/blast/nt_db_20221011/nt to work correctly, you should be able to list your db files using: ls /home/blast/nt_db_20221011/nt.*
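For illustration only (exact file extensions and volume numbering vary with the database version and download), a healthy multi-volume nt database should show an alias file plus per-volume index files, something like:
$ ls /home/blast/nt_db_20221011/nt.*
/home/blast/nt_db_20221011/nt.nal
/home/blast/nt_db_20221011/nt.00.nhr
/home/blast/nt_db_20221011/nt.00.nin
/home/blast/nt_db_20221011/nt.00.nsq
...
If that glob matches nothing, the download is incomplete or the files live elsewhere.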
Not sure if there's a typo in your question, but the size of the nt database is about an order of magnitude larger, at approximately 250G. I wonder if simply re-downloading the database fixes the problem? Note that you can get a list of BLAST databases (showing their sizes and dates last updated) using:
update_blastdb.pl --showall pretty --source gcp
Note also that DSL1 is now end-of-life and will be removed going forward. I strongly recommend migrating to DSL2 syntax when you get a chance.
From the comments:
The problem is that when you use params to specify a path, the path or files specified will not be localized inside the process working directory when the task is run. What you want is just a second input (value) channel. For example, using DSL2 syntax:
params.db = "/home/blast/Geminiviridae_db_20221118/geminiviridae"
process blast_raw {
tag { query_fasta }
input:
path query_fasta
path db
output:
path "geminiviridae.xml"
"""
blastn \\
-query "${query_fasta}" \\
-db "${db}" \\
-max_target_seqs 10 \\
-outfmt 5 \\
-out "geminiviridae.xml"
"""
}
workflow {
db = file( params.db )
blast_raw( your_upstream_ch, db)
}

extract variable from json style output in Linux command

I'm trying to use AWS Secrets Manager on a Linux system. I can use the AWS CLI command
aws secretsmanager get-secret-value --secret-id abc_account --version-stage AWSCURRENT
to get following output
{
  "ARN": "arn:aws:secretsmanager:us-east-1:123456789:secret:abc_account-XhteiW",
  "Name": "abc_account",
  "VersionId": "89637ef4-4594-4c63-9887-3f7d2c7ccc6f",
  "SecretString": "{\"username\":\"abc_account\",\"password\":\"PASSWORD111\"}",
  "VersionStages": [
    "AWSCURRENT"
  ],
  "CreatedDate": "2021-02-08T23:57:58.325000-05:00"
}
What I need is to save the password PASSWORD111 into a variable var1 in Linux, something like:
var1=$(aws secretsmanager get-secret-value --secret-id svc_vma_insights_data_platform --version-stage AWSCURRENT | awk XXXXXX )
or
var1=$(aws secretsmanager get-secret-value --secret-id svc_vma_insights_data_platform --version-stage AWSCURRENT | grep XXXXXX )
This extracts the secret string from the JSON output, and then extracts the password from that embedded JSON:
passwd=$(aws ... | jq -r '.SecretString' | jq -r '.password')
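If you prefer a single jq process, fromjson can decode the nested JSON held in SecretString (same idea, one invocation):
var1=$(aws secretsmanager get-secret-value --secret-id abc_account --version-stage AWSCURRENT | jq -r '.SecretString | fromjson | .password')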
On Linux you may try this GNU grep:
var1=$(aws ... | grep -oP 'password\W+\K[^"\\]+')
echo "$var1"
PASSWORD111
Regex breakdown:
password\W+: Match text password followed by 1+ non-word characters
\K: Reset match info
[^"\\]+: Match 1+ of any character that is not a " and not a \

R - running commands in terminal and saving output to a dataframe

On OS X, I am using the system() function to run Terminal commands from the R console as part of a script I've written. The script requires connecting to a MySQL database through an SSH tunnel, and I type "ps aux | grep ssh" at the command line to see which tunnels I am connected to. For example, some output:
> system("ps aux | grep ssh")
Home 50915 0.0 0.0 2501204 3264 ?? S 10:32AM server info
Home 50092 0.0 0.0 2504172 3048 ?? Ss 9:35AM server2 info
Home 50090 0.0 0.0 2501372 480 ?? Ss 9:35AM server3 info
Home 1155 0.0 0.0 2544220 1368 ?? S Thu07PM server4 info
Home 51333 0.0 0.0 2434840 800 ?? S 11:00AM 0:00.00 grep ssh
Home 51331 0.0 0.0 2438508 1124 ?? S 11:00AM 0:00.00 sh -c ps aux | grep ssh
I would like to turn this output into a dataframe, but cannot. Functions like as.data.frame(system("ps aux | grep ssh")) do not work the way I would hope.
Any thoughts on this would be appreciated!
EDIT - just wanted to highlight the error from one suggestion in the comments:
> read.table(pipe("ps aux | grep ssh"))
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
line 1 did not have 34 elements
> pipe("ps aux | grep ssh")
description class mode text opened can read can write
"ps aux | grep ssh" "pipe" "r" "text" "closed" "yes" "yes"
First pipe your output to an actual text file, doing the redirection inside the shell command:
> system("ps aux | grep ssh > output.txt")
Then read in this file into R using read.table:
df.output <- read.table(file="output.txt", header=FALSE, sep="")
Note: Using sep="" (which is actually the default for read.table) will treat any type/amount of whitespace as a delimiter between columns. This should cover the output you are getting from your call to Linux.
You can get a little closer (to a character vector) with intern=TRUE:
as.data.frame(system("ps aux | grep ssh", intern=TRUE))
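Combining the two ideas, a sketch that skips the temporary file: read.table's text argument accepts the character vector captured by intern=TRUE, and fill=TRUE tolerates ragged rows (the command column contains spaces, which is likely what triggered the "line 1 did not have 34 elements" error above):
df <- read.table(text = system("ps aux | grep ssh", intern = TRUE),
                 fill = TRUE, header = FALSE)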

converting bash output to JSON / Dictionary

I am trying to create JSON-compatible output in bash that can be read by Node.js and Python:
{"link":XX,"signal":YY,"noise":ZZ}
Here's the unfiltered output:
iwconfig wlan0
wlan0     IEEE 802.11bg  ESSID:"wifi#someplace"  Nickname:"<WIFI#REALTEK>"
          Mode:Managed  Frequency:2.452 GHz  Access Point: C8:4C:75:20:B4:8E
          Bit Rate:54 Mb/s   Sensitivity:0/0
          Retry:off   RTS thr:off   Fragment thr:off
          Encryption key:A022-1191-3A   Security mode:open
          Power Management:off
          Link Quality=100/100  Signal level=67/100  Noise level=0/100
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0
But after applying my filters:
iwconfig wlan0 | grep Link | tr -d '/100' | tr '=' ' ' | xargs | awk '{printf "{\"link\":"$3",\"signal\":"$6",\"noise\":"$9"}"}'
I am getting erratic and incomplete results:
{"link":98,"signal":6,"noise":}
{"link":Signal,"signal":Noise,"noise":}
The "noise" value is never captured, and sometimes printf returns the wrong chunk.
Is there a more 'reliable' way of doing this ?
The problem with the code in your question is here:
tr -d '/100'
What that command does is simply delete all occurrences of the characters '/', '1' and '0'.
From the manual for tr,
-d, --delete
delete characters in SET1, do not translate
That's not something you'd want. What you want is to replace the entire string /100 with the empty string.
Use: sed 's/\/100//g' instead.
... | grep Link | sed 's/\/100//g' | tr '=' ' ' | awk '{printf "{\"link\":"$3",\"signal\":"$6",\"noise\":"$9"}"}'
You could restructure the output using Perl, by piping it through the following command:
perl -n -E 'if ($_ =~ qr{Link Quality=(\d+)/100.*?Signal level=(\d+)/100.*?Noise level=(\d+)/100}) { print qq({"link":$1,"signal":$2,"noise":$3}); }'
Using awk it is quite simple:
awk -F '[ =/]+' '$2=="Link"{printf "{\"%s\":%s,\"%s\":%s,\"%s\":%s}\n",
$2, $5, $6, $8, $10, $12}'
{"Link":100,"Signal":67,"Noise":0}

Only one command line in PROJ.4

I would like to know if there is a way to write only one command line to obtain the expected result. Let me explain: when you write this:
$ proj +proj=utm +zone=13 +ellps=WGS84 -f %12.6f
and you want to receive the output data:
500000.000000 4427757.218739
you must type the input data on a separate line:
-105 40
Is it possible to write this as a single concatenated command line, in this style?
$ proj +proj=utm +zone=13 +ellps=WGS84 -f %12.6f | -105 40
Thank you
I also ran into this problem and found the solution:
echo -105 40 | proj +proj=utm +zone=13 +ellps=WGS84 -f %12.6f
That should do the trick.
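The same trick extends to several points at once, since proj reads one coordinate pair per line from stdin. A small sketch (the second pair is just an example):
printf '%s\n' '-105 40' '-100 45' | proj +proj=utm +zone=13 +ellps=WGS84 -f %12.6f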
If you need to do this from within C#, for example, the command you'd use is this:
cmd.exe /c echo -105 40 | proj +proj=utm +zone=13 +ellps=WGS84 -f %12.6f
Note: you may need to double up the % characters, as the command processor interprets % as introducing a variable.