Split large directory into subdirectories - language-agnostic

I have a directory with about 2.5 million files in it, totaling over 70 GB.
I want to split this into subdirectories, each with 1000 files in them.
Here's the command I've tried using:
i=0; for f in *; do d=dir_$(printf %03d $((i/1000+1))); mkdir -p $d; mv "$f" $d; let i++; done
That command works for me on a small scale, but I can leave it running for hours on this directory and it doesn't seem to do anything.
I'm open to doing this any way via the command line: Perl, Python, etc. Just whatever way would be the fastest to get this done...

I suspect that if you checked, you'd notice your program was actually moving the files, albeit really slowly. Launching a program is rather expensive (at least compared to making a system call), and you do so three or four times per file! As such, the following should be much faster:
perl -e'
   my $base_dir_qfn = ".";
   my $i = 0;
   my $dir_qfn;

   opendir(my $dh, $base_dir_qfn)
      or die("Can'\''t open dir \"$base_dir_qfn\": $!\n");

   while (defined( my $fn = readdir($dh) )) {
      next if $fn =~ /^(?:\.\.?|dir_\d+)\z/;

      my $qfn = "$base_dir_qfn/$fn";

      if ($i % 1000 == 0) {
         $dir_qfn = sprintf("%s/dir_%03d", $base_dir_qfn, int($i/1000)+1);
         mkdir($dir_qfn)
            or die("Can'\''t make directory \"$dir_qfn\": $!\n");
      }

      rename($qfn, "$dir_qfn/$fn")
         or do {
            warn("Can'\''t move \"$qfn\" into \"$dir_qfn\": $!\n");
            next;
         };

      ++$i;
   }
'

Note: ikegami's helpful Perl-based answer is the way to go - it performs the entire operation in a single process and is therefore much faster than the Bash + standard utilities solution below.
A bash-based solution needs to avoid loops in which external utilities are called in order to perform reasonably.
Your own solution calls two external utilities and creates a subshell in each loop iteration, which means that you'll end up creating about 7.5 million processes(!) in total.
The following solution avoids loops, but, given the sheer number of input files, will still take quite a while to complete (you'll end up creating 4 processes for every 1000 input files, i.e., ca. 10,000 processes in total):
printf '%s\0' * | xargs -0 -n 1000 bash -O nullglob -c '
  dirs=( dir_*/ )
  dir=dir_$(printf %04d $(( 1 + ${#dirs[@]} )))
  mkdir "$dir"; mv "$@" "$dir"' -
printf '%s\0' * prints a NUL-separated list of all files in the dir.
Note that since printf is a Bash builtin rather than an external utility, the max. command-line length as reported by getconf ARG_MAX does not apply.
xargs -0 -n 1000 invokes the specified command with chunks of 1000 input filenames.
Note that xargs -0 is nonstandard, but supported on both Linux and BSD/OSX.
Using NUL-separated input robustly passes filenames without fear of inadvertently splitting them into multiple parts, and even works with filenames with embedded newlines (though such filenames are very rare).
bash -O nullglob -c executes the specified command string with option nullglob turned on, which means that a globbing pattern that matches nothing will expand to the empty string.
The command string counts the output directories created so far, so as to determine the name of the next output dir with the next higher index, creates the next output dir, and moves the current batch of (up to) 1000 files there.

If the directory is not in use, I suggest the following:
find . -maxdepth 1 -type f | split -l 1000 -d -a 5
This will create about 2,500 list files named x00000 through x02499 (5 digits just to be safe, although 4 would work too). You can then move the 1000 files listed in each list file into a corresponding directory.
Perhaps set -o noclobber to eliminate the risk of overwrites in case of a name clash.
To move the files, it's easier to use brace expansion to iterate over the list-file names:
for c in x{00000..02500}; do
    d="d$c"
    mkdir "$d"
    cat "$c" | xargs -I f mv f "$d"
done

Moving files around is always a challenge. IMHO all the solutions presented so far have some risk of destroying your files. This may be because the challenge sounds simple, but there is a lot to consider and to test when implementing it.
We must also not underestimate the efficiency of the solution as we are potentially handling a (very) large number of files.
Here is a script carefully and intensively tested with my own files. But of course, use it at your own risk!
This solution:
is safe with filenames that contain spaces.
does not use xargs -L because this will easily result in "Argument list too long" errors
is based on Bash 4 and does not depend on awk, sed, tr etc.
scales well with the number of files to move.
Here is the code:
if [[ "${BASH_VERSINFO[0]}" -lt 4 ]]; then
echo "$(basename "$0") requires Bash 4+"
exit -1
fi >&2
opt_dir=${1:-.}
opt_max=1000
readarray files <<< "$(find "$opt_dir" -maxdepth 1 -mindepth 1 -type f)"
moved=0 dirnum=0 dirname=''
for ((i=0; i < ${#files[#]}; ++i))
do
if [[ $((i % opt_max)) == 0 ]]; then
((dirnum++))
dirname="$opt_dir/$(printf "%02d" $dirnum)"
fi
# chops the LF printed by "find"
file=${files[$i]::-1}
if [[ -n $file ]]; then
[[ -d $dirname ]] || mkdir -v "$dirname" || exit
mv "$file" "$dirname" || exit
((moved++))
fi
done
echo "moved $moved file(s)"
For example, save this as split_directory.sh. Now let's assume you have 2001 files in some/dir:
$ split_directory.sh some/dir
mkdir: created directory some/dir/01
mkdir: created directory some/dir/02
mkdir: created directory some/dir/03
moved 2001 file(s)
Now the new reality looks like this:
some/dir contains 3 directories and 0 files
some/dir/01 contains 1000 files
some/dir/02 contains 1000 files
some/dir/03 contains 1 file
Calling the script again on the same directory is safe and returns almost immediately:
$ split_directory.sh some/dir
moved 0 file(s)
Finally, let's take a look at the special case where we call the script on one of the generated directories:
$ time split_directory.sh some/dir/01
mkdir: created directory 'some/dir/01/01'
moved 1000 file(s)
real 0m19.265s
user 0m4.462s
sys 0m11.184s
$ time split_directory.sh some/dir/01
moved 0 file(s)
real 0m0.140s
user 0m0.015s
sys 0m0.123s
Note that this test ran on a fairly slow, veteran computer.
Good luck :-)

This is probably slower than a Perl program (1 minute for 10,000 files), but it should work with any POSIX-compliant shell.
#! /bin/sh
nd=0
nf=0
/bin/ls | \
while read file
do
    case $(expr $nf % 1000) in
        0)
            nd=$(expr $nd + 1)
            dir=$(printf "dir_%04d" $nd)
            mkdir "$dir"
            ;;
    esac
    mv "$file" "$dir/$file"
    nf=$(expr $nf + 1)
done
With bash, you can use arithmetic expansion $((...)).
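For example, a minimal sketch of the same loop rewritten with bash arithmetic in place of the expr calls (an illustration under the question's 1000-files-per-directory target, not part of the original answer):
#!/bin/bash
nd=0
nf=0
ls | while IFS= read -r file
do
    if [ $((nf % 1000)) -eq 0 ]; then
        nd=$((nd + 1))                  # next directory index
        dir=$(printf "dir_%04d" "$nd")
        mkdir "$dir"
    fi
    mv "$file" "$dir/$file"
    nf=$((nf + 1))                      # count files moved so far
done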
And of course this idea can be improved by using xargs - should not take longer than ~ 45 sec for 2.5 million files.
nd=0
ls | xargs -L 1000 echo | \
while read cmd
do
    nd=$((nd+1))
    dir=$(printf "dir_%04d" $nd)
    mkdir "$dir"
    mv $cmd "$dir"
done

I would use the following from the command line:
find . -maxdepth 1 -type f | split -l 1000
for i in x*
do
    mkdir "dir$i"
    mv `cat $i` "dir$i" 2>/dev/null &
done
The key is the "&", which runs each mv statement in the background.
Thanks to karakfa for the split idea.
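If this is run from a script rather than an interactive shell, a final wait keeps it from exiting before the backgrounded mv jobs finish; a rough sketch under the same assumptions (e.g. filenames without spaces):
find . -maxdepth 1 -type f | split -l 1000
for i in x*
do
    mkdir "dir$i"
    mv `cat $i` "dir$i" 2>/dev/null &   # each 1000-file batch moves in the background
done
wait   # block until all background mv jobs have completed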

Related

How to format a TXT file into a structured CSV file in bash?

I wanted to get some information about my CPU temperatures on my Linux server (OpenSuse Leap 15.2), so I wrote a script that collects data every 20 seconds and writes it to a text file. I have now removed all the garbage data (like "CPU Temp" etc.) that I don't need.
Now I have a file like this:
47
1400
75
3800
The first two lines are one reading of the CPU temperature in C and the fan speed in RPM, respectively. The next two lines are another reading of the same measurements.
In the end I want this structure:
47,1400
75,3800
My question is: can a Bash script do this for me? I tried something with sed and awk but nothing worked perfectly for me. Furthermore, I want a CSV file so I can make a graph, but I think it isn't a problem to convert a text file into a CSV file.
You could use paste
paste -d, - - < file.txt
With pr
pr -ta2s, file.txt
With ed
ed -s file.txt <<-'EOF'
g/./s/$/,/\
;.+1j
,p
Q
EOF
You can use awk:
awk 'NR%2{printf "%s,",$0;next;}1' file.txt > file.csv
Another awk:
$ awk -v OFS=, '{printf "%s%s",$0,(NR%2?OFS:ORS)}' file
Output:
47,1400
75,3800
Explained:
$ awk -v OFS=, '{ # set output field delimiter to a comma
printf "%s%s", # using printf to control newline in output
$0, # output line
(NR%2?OFS:ORS) # and either a comma or a newline
}' file
Since you asked if a bash script can do this, here's a solution in pure bash. ;o]
c=0
while read -r line; do
    if (( c++ % 2 )); then
        echo "$line"
    else
        printf "%s," "$line"
    fi
done < file
Take a look at 'paste'. This will join multiple lines of text together into a single line and should work for what you want.
echo "${DATA}"
Name
SANISGA01CI
5WWR031
P59CSADB01
CPDEV02
echo "${DATA}"|paste -sd ',' -
Name,SANISGA01CI,5WWR031,P59CSADB01,CPDEV02

I made several attempts to create Tcl variables from file content and always failed. What could it be?

I made an outline of a basic example to simplify the scenario I am facing.
To give a picture of what is happening to me, let's first follow the line of logical reasoning.
Here is the walkthrough:
1 - I create the parent directory:
$ mkdir -p /tmp/tmp.AbiGaIl
2 - Then I create the subdirectories Children:
$ mkdir -p /tmp/tmp.AbiGaIl/A
$ mkdir -p /tmp/tmp.AbiGaIl/B
$ mkdir -p /tmp/tmp.AbiGaIl/C
3 - Now, I will populate the subdirectories with dummy files:
$ touch /tmp/tmp.AbiGaIl/A/1.png /tmp/tmp.AbiGaIl/A/2.png /tmp/tmp.AbiGaIl/A/3.png
$ touch /tmp/tmp.AbiGaIl/B/1.png /tmp/tmp.AbiGaIl/B/2.png /tmp/tmp.AbiGaIl/B/3.png
$ touch /tmp/tmp.AbiGaIl/C/1.png /tmp/tmp.AbiGaIl/C/2.png /tmp/tmp.AbiGaIl/C/3.png
4 - Anyway, we can check if everything is ok.
$ cd /tmp/tmp.AbiGaIl/
$ pwd
$ ls *
5 - Verified and confirmed, it's time to start working the script.
$ echo "A" > /tmp/.directory
$ DIR=$(cat /tmp/.directory)
$ cd $DIR
$ ls *.*
NOTE - "This last step is the main cause of my headache, not in Bash, but in Tcl." _
Ready! The bourn shell operation [sh] was a success.
Now, starting from this same line of reasoning, let's move on to the Tcl language [tclsh].
In this step we simply follow on from the previous (bash) step, so there is no need to recreate the files and other pieces; we just reuse what was done before. Follow along:
1º
set folder "/tmp/tmp.AbiGaIl"
#Return : /tmp/tmp.AbiGaIl
2º
% set entry [cd $folder]
3º
% pwd
#Return : tmp.meLBJzexGc
4º
% puts [glob -type d $entry *]
#Return : . C B A
5º
% set file [open "/tmp/.directory"]
#Return : file3
6º
% set text [read $file]
#Return : A
7º
% close $file
8º
% set view "$text"
#Return : A
9º
% puts [glob -nocomplain -type f -directory $view -tails *.png]
#Return : ?
You might expect this, but it does not enter the directory. I did this on purpose so you can understand what happens.
You will see in the lines below what happens. Continue...
So let's create a variable to get there.
% set view [cd $text]
Return : couldn't change working directory to "A": no such file or directory
There's a big problem!
How can I go there and check the directory contents if it gives me this unknown error?
% puts [glob -nocomplain -type f -directory $view -tails *.png]
If you insert the directory name directly into glob, without going through the variable that is fed by the file containing that directory's name, it works; see:
% puts [glob -nocomplain -type f -directory A -tails *.png]
#Return: 3.png 2.png 1.png
But that's not what I want. I really need this file with the subdirectory name(s).
This file is updated every time a widget button is pressed: an exec echo "A" > /tmp/.directory command is triggered with the letter of the alphabet that corresponds to the name of each group, so that group can then be accessed.
If anyone can explain to me why this gives an error when accessing the directory, please reply or even comment. Your help will be most welcome!
Firstly, I advise never using cd in scripts because it changes the interpretation of filenames. Instead, it is much simpler in the long run to use fully qualified filenames everywhere.
Secondly, cd never returns anything other than the empty string. Don't use the empty string as a filename, it confuses things (and not just Tcl!)
set folder "/tmp/tmp.AbiGaIl"
set f [open "/tmp/.directory"]
gets $f dir
close $f
set subfolder [file join $folder $dir]
set files [glob -nocomplain -dir $subfolder -type f *.png]
foreach filename $files {
    puts "found [file tail $filename] in $subfolder :: $filename"
}
If you just want the last part of the name of a particular file, use file tail when you need it. But keep the full filename around for when you are talking about the file and not just its name: it is just so much more reliable to do so. (If you ever work with GUI apps, this is vital: you don't have anything like the control over the initial directory there.)
That advice also applies (with different syntax) to every other programming language you might use for this task.
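As a rough illustration, the same idea in bash applied to the original shell walkthrough (fully qualified names, no cd; the paths mirror the example above):
folder=/tmp/tmp.AbiGaIl
read -r dir < /tmp/.directory          # e.g. "A"
subfolder="$folder/$dir"
# glob with the fully qualified path instead of cd-ing into the directory
for f in "$subfolder"/*.png; do
    [ -e "$f" ] || continue            # skip if nothing matched
    echo "found $(basename "$f") in $subfolder :: $f"
done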

Search in large csv files

The problem
I have thousands of csv files in a folder. Every file has 128,000 entries with four columns in each line.
From time to time (twice a day) I need to compare a list (10,000 entries) with all the csv files. If one of the entries is identical to the third or fourth column of one of the csv files, I need to write the whole csv row to an extra file.
Possible solutions
Grep
#!/bin/bash

getArray() {
    array=()
    while IFS= read -r line
    do
        array+=("$line")
    done < "$1"
}

getArray "entries.log"

for e in "${array[@]}"
do
    echo "$e"
    /bin/grep $e ./csv/* >> found
done
This seems to work, but it takes forever. After almost 48 hours, the script had checked only 48 of about 10,000 entries.
MySQL
The next try was to import all csv files to a mysql database. But there I had problems with my table at around 50,000,000 entries.
So I wrote a script which created a new table after 49,000,000 entries and so I was able to import all csv files.
I tried to create an index on the second column but it always failed (timeout). Creating the index before the import wasn't possible either; it slowed the import down to days instead of only a few hours.
The select statement was horrible, but it worked. It was much faster than the "grep" solution but still too slow.
My question
What else can I try to search within the csv files?
To speed things up I copied all csv files to an ssd. But I hope there are other ways.
This is unlikely to offer you meaningful benefits, but here are some improvements to your script:
use the built-in mapfile to slurp a file into an array:
mapfile -t array < entries.log
use grep with a file of patterns and appropriate flags.
I assume you want to match items in entries.log as fixed strings, not as regex patterns.
I also assume you want to match whole words.
grep -Fwf entries.log ./csv/*
This means you don't have to grep the 1000's of csv files 1000's of times (once for each item in entries.log). Actually this alone should give you a real meaningful performance improvement.
This also removes the need to read entries.log into an array at all.
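For instance, a minimal invocation under those assumptions, appending matches to the same found file used in the question; -h is added here so only the csv rows (without filename prefixes) are written:
# -F fixed strings, -w whole words, -h no filename prefix, -f read patterns from file
grep -Fwhf entries.log ./csv/* >> found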
In awk, assuming all the csv files change; otherwise it would be wise to keep track of the already-checked files. But first some test material:
$ mkdir test # the csvs go here
$ cat > test/file1 # has a match in 3rd
not not this not
$ cat > test/file2 # no match
not not not not
$ cat > test/file3 # has a match in 4th
not not not that
$ cat > list # these we look for
this
that
Then the script:
$ awk 'NR==FNR{a[$1];next} ($3 in a) || ($4 in a){print >> "out"}' list test/*
$ cat out
not not this not
not not not that
Explained:
$ awk ' # awk
NR==FNR { # process the list file
a[$1] # hash list entries to a
next # next list item
}
($3 in a) || ($4 in a) { # if 3rd or 4th field entry in hash
print >> "out" # append whole record to file "out"
}' list test/* # first list then the rest of the files
The script hashes all the list entries into a and reads through the csv files looking for 3rd and 4th field entries in the hash, outputting the whole record when there is a match.
If you test it, let me know how long it ran.
You can build a patterns file and then use xargs and grep -Ef to search for all patterns in batches of csv files, rather than one pattern at a time as in your current solution:
# prepare patterns file
while read -r line; do
printf '%s\n' "^[^,]+,[^,]+,$line,[^,]+$" # find value in third column
printf '%s\n' "^[^,]+,[^,]+,[^,]+,$line$" # find value in fourth column
done < entries.log > patterns.dat
find /path/to/csv -type f -name '*.csv' -print0 | xargs -0 grep -hEf patterns.dat > found.dat
find ... - emits a NUL-delimited list of all csv files found
xargs -0 ... - passes the file list to grep, in batches

Bash - Faster way to check for file changes than md5?

I've got a MySQL DB set up on my system for local testing, and I'm monitoring the tables to see when a change is made.
Step 1 - Go to DIR
cd /usr/local/mysql-5.7.16-osx10.11-x86_64/data/blog_atom_tables/
Step 2 - Run Script
watchDB
Where watchDB() is (slightly modified for readability)...
function watchDB() {
    declare -A aa   # Associative array of filenames and their md5 hashes
    declare k       # Holder for current md5
    prt="0"
    while true; do  # Run forever
        # Loop through all table files within directory
        for i in *.ibd; do
            k=$(sudo md5 -q $i)   # md5 of file (table)
            # If table has not been hashed yet
            if [[ ${aa[$(echo $i | cut -f 1 -d '.')]} == "" ]]; then
                aa[$(echo $i | cut -f 1 -d '.')]=$k
            # If table has been hashed, and md5 differs (i.e. table changed)
            elif [[ ${aa[$(echo $i | cut -f 1 -d '.')]} != $k ]]; then
                echo $i
                aa[$(echo $i | cut -f 1 -d '.')]=$k
            fi
        done
    done
}
TL;DR Loop through all the table files within the directory, save a copy of each md5, and continue looping through checking for a change.
I don't need to see what rows/columns have been changed, only that the table itself is different. For the most part, this works exactly as I want, but calculating the md5 for every table takes a noticeable amount of time. For only 25 tables, it takes between 3 and 5 seconds to execute each loop.
Is there a quicker way to do this, other than md5? I'd use something like cmp, but I need to save a reference of the current state of the file, so I have something to compare it against.
This is only about 1/6 of the total tables that will eventually be in there, so any improvement on speed is welcome.
While it's not really checking the content of the file, you could use file system attributes as a simple way to monitor for changes. Unless the filesystem is mounted with the timestamps disabled, you can monitor the access time and modification time timestamps:
stat -f "%m" <filename>
The filesystem driver knows when reads and writes occur and subsequently updates the timestamps.
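For example, a rough sketch of the same watch loop keyed on the modification timestamp instead of an md5 hash (it reuses the BSD stat -f "%m" shown above; GNU stat would need stat -c '%Y' instead):
function watchDB() {
    declare -A mtimes                    # table name -> last seen modification time
    while true; do
        for i in *.ibd; do
            t=$(stat -f "%m" "$i")       # modification time in seconds since the epoch
            name=${i%%.*}                # strip the .ibd extension
            if [[ -z ${mtimes[$name]} ]]; then
                mtimes[$name]=$t         # first sighting of this table: just record it
            elif [[ ${mtimes[$name]} != "$t" ]]; then
                echo "$i"                # table file changed since the last check
                mtimes[$name]=$t
            fi
        done
        sleep 1                          # avoid a busy loop
    done
}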

couldn't load file "/usr/lib/x86_64-linux-gnu/magic/tcl/tclmagic.so"

I have a problem with running magic vlsi. The problem is
couldn't load file "/usr/lib/x86_64-linux-gnu/magic/tcl/tclmagic.so": /usr/lib/x86_64-linux-gnu/magic/tcl/tclmagic.so: undefined symbol: Tk_GetCursorFromData
I think this is caused by:
/usr/lib/x86_64-linux-gnu/magic/tcl/magic.tcl
in line 13: load /usr/lib/x86_64-linux-gnu/magic/tcl/tclmagic.so
The file /usr/lib/x86_64-linux-gnu/magic/tcl/tclmagic.so does exist.
The error when running magic comes from qflow, in the shell script display.sh:
#!/bin/tcsh -f
#----------------------------------------------------------
# Qflow layout display script using magic-8.0
#----------------------------------------------------------
# Tim Edwards, April 2013
#----------------------------------------------------------
if ($#argv < 2) then
   echo Usage: display.sh [options] <project_path> <source_name>
   exit 1
endif

# Split out options from the main arguments (no options---placeholder only)
set argline=(`getopt "" $argv[1-]`)
set cmdargs=`echo "$argline" | awk 'BEGIN {FS = "-- "} END {print $2}'`
set argc=`echo $cmdargs | wc -w`

if ($argc == 2) then
   set argv1=`echo $cmdargs | cut -d' ' -f1`
   set argv2=`echo $cmdargs | cut -d' ' -f2`
else
   echo Usage: display.sh [options] <project_path> <source_name>
   echo where
   echo <project_path> is the name of the project directory containing
   echo a file called qflow_vars.sh.
   echo <source_name> is the root name of the verilog file, and
   exit 1
endif

foreach option (${argline})
   switch (${option})
      case --:
         break
   endsw
end

set projectpath=$argv1
set sourcename=$argv2
set rootname=${sourcename:h}
# This script is called with the first argument <project_path>, which should
# have file "qflow_vars.sh". Get all of our standard variable definitions
# from the qflow_vars.sh file.
if (! -f ${projectpath}/qflow_vars.sh ) then
   echo "Error: Cannot find file qflow_vars.sh in path ${projectpath}"
   exit 1
endif

source ${projectpath}/qflow_vars.sh
source ${techdir}/${techname}.sh
cd ${projectpath}
#----------------------------------------------------------
# Copy the .magicrc file from the tech directory to the
# layout directory, if it does not have one. This file
# automatically loads the correct technology file.
#----------------------------------------------------------
if (! -f ${layoutdir}/.magicrc ) then
   if ( -f ${techdir}/${magicrc} ) then
      cp ${techdir}/${magicrc} ${layoutdir}/.magicrc
   endif
endif
#----------------------------------------------------------
# Done with initialization
#----------------------------------------------------------
cd ${layoutdir}
#---------------------------------------------------
# Create magic layout (.mag file) using the
# technology LEF file to determine route widths
# and other parameters.
#---------------------------------------------------
if ($techleffile == "") then
   set lefcmd="lef read ${techdir}/${leffile}"
else
   set lefcmd="lef read ${techdir}/${techleffile}\nlef read ${techdir}/${techleffile}"
endif
# Timestamp handling: If the .mag file is more recent
# than the .def file, then print a message and do not
# overwrite.
set docreate=1
if ( -f ${rootname}.def && -f ${rootname}.mag) then
   set defstamp=`stat --format="%Y" ${rootname}.def`
   set magstamp=`stat --format="%Y" ${rootname}.mag`
   if ( $magstamp > $defstamp ) then
      echo "Magic database file ${rootname}.mag is more recent than DEF file."
      echo "If you want to recreate the .mag file, remove or rename the existing one."
      set docreate=0
   endif
endif
# The following script reads in the DEF file and modifies labels so
# that they are rotated outward from the cell, since DEF files don't
# indicate label geometry.
if ( ${docreate} == 1) then
   ${bindir}/magic -dnull -noconsole <<EOF
drc off
box 0 0 0 0
snap int
${lefcmd}
def read ${rootname}
select top cell
select area labels
setlabel font FreeSans
setlabel size 0.3um
box grow s -[box height]
box grow s 100
select area labels
setlabel rotate 90
setlabel just e
select top cell
box height 100
select area labels
setlabel rotate 270
setlabel just w
select top cell
box width 100
select area labels
setlabel just w
select top cell
box grow w -[box width]
box grow w 100
select area labels
setlabel just e
save ${sourcename}
quit -noprompt
EOF
endif
# Run magic and query what graphics device types are
# available. Use OpenGL if available, fall back on
# X11, or else exit with a message
${bindir}/magic -noconsole -d <<EOF >& .magic_displays
exit
EOF

set magicogl=`cat .magic_displays | grep OGL | wc -l`
set magicx11=`cat .magic_displays | grep X11 | wc -l`
rm -f .magic_displays
# Run magic again, this time interactively. The script
# exits when the user exits magic.
#if ( ${magicogl} >= 1 ) then magic -d OGL ${rootname}
if ( ${magicx11} >= 1) then
   magic -d X11 ${rootname}
else
   echo "Magic does not support OpenGL or X11 graphics on this host."
endif
#------------------------------------------------------------
# Done!
#------------------------------------------------------------
How can I fix this problem?
I should amend my answer, since it sounds like magic may be invoking Tcl 8.5 instead of 8.6, which may have happened if you downloaded magic as a package.
The "-lazy" option to "load" was only implemented in Tcl 8.6, so it's not going to work at all in Tcl 8.5. I would suggest getting magic from opencircuitdesign.com and compiling from source, which usually has no problems on Linux systems. The autoconf script should be able to find Tcl version 8.6.
You can also just ignore the "display" option in qflow and run magic interactively. Use the "lef read" command to read in the standard cell definitions, then "def read" to read in the routed layout from qflow.
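For example, a rough sketch of that interactive flow; the LEF path and design name below are placeholders following the pattern in the qflow script shown above:
$ magic -d X11
% lef read /path/to/techdir/stdcells.lef    ;# standard cell definitions (placeholder path)
% def read myproject                        ;# routed layout from qflow (placeholder root name)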
This is a known problem (and a very annoying one) that comes from some OS versions' runtime linker not wanting to link a file that contains a reference to an unknown routine, even if that routine is never called. In this case, Tcl is being run without Tk graphics, and so the base Tcl object library doesn't have any Tk routines defined, and the linker stops on the first such routine that shows up (Tk_GetCursorFromData). For some reason the Fedora linker (or something about my setup) doesn't do this, or else I would have fixed the problem a long time ago. As it is, I only fixed it recently, and the solution is in the most recent version of magic-8.1. Since you have Tcl/Tk 8.6, this solution should work (the solution only works with Tcl/Tk 8.6). You can update to the most recent magic version 8.1, or you can just edit the file magic.tcl (in the source, it's tcltk/magic.tcl.in, and installed, it's /usr/local/lib/magic/tcl/magic.tcl). At line 13, add the switch -lazy to the load command:
load -lazy /usr/local/lib/magic/tcl/tclmagic.so
magic- configure... : ./configure --with-tcl=/opt/ActiveTcl-8.6 --with-tk=/usr/local/lib/tk8.6
qflow- configure... : ./configure --with-magic=/usr/local/bin/magic
After installing qflow:
sudo apt-get install magic (it will download magic 7.5 without overwriting magic 8.0; we only need the .so file from there)
open /usr/local/lib/magic/tcl/magic.tcl (as sudo)
and edit line 13 to:
load -lazy /usr/lib/x86_64-linux-gnu/magic/tcl/tclmagic.so