How to save a single webpage with a Linux server - html

I'm trying to save data descriptions from the open data site with a Linux server via SSH:
https://dandelion.eu/datagems/SpazioDati/milano-grid/resource/
https://dandelion.eu/datagems/SpazioDati/milano-grid/description/
I've read the question "What's the best way to save a complete webpage on a linux server?" and tried wget -l 1 https://dandelion.eu/datagems/SpazioDati/milano-grid/description/ and wget -m https://dandelion.eu/datagems/SpazioDati/milano-grid/description/. Neither worked; all I get is an index.html. I want the files that IE/Firefox's 'Save Page' function produces with 'Web page, complete' on a PC: an HTML file and a folder containing the images, stylesheets, scripts and so on.
Is this possible on a Linux server via SSH? Thanks!
Update
This is what I want (I 'Save page' https://dandelion.eu/datagems/SpazioDati/milano-grid/description/ with Firefox):
|---Milano Grid description _ dandelion_files
| |---a
| |---css_all.css
| |---css.css
| |---dc.js
| |---fbk.jpg
| |---jquery_002.js
| |---jquery.js
| |---js_all.js
| |---js_sitebase.js
| |---Milano_GRID_4326.png
| |---milano-grid-img2.jpg
| |---mixpanel-2.js
| |---odi.png
| |---sbi.css
| |---spaziodati_black.png
| |---spaziodati_white.png
| |---telecom.png
|---Milano Grid description _ dandelion.htm
This is what I get with wget -p -k https://dandelion.eu/datagems/SpazioDati/milano-grid/description/:
|---dandelion.eu
| |---datagems
| | |---SpazioDati
| | | |---milano-grid
| | | | |---resource
| | | | | |---index.html
| |---jsi18n
| | |---index.html
| |---robots.txt
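A possible explanation, not confirmed in this thread: the robots.txt in the output above suggests wget honoured the site's robots rules and skipped the page requisites, and some assets may live on other hosts. The sketch below only assembles a wget command line with real wget options (--page-requisites, --convert-links, --adjust-extension, --span-hosts, --no-directories, --execute robots=off, --directory-prefix). The helper name build_wget_command and the output directory are hypothetical, and ignoring robots.txt is only appropriate if the site's terms allow it.

```python
import subprocess


def build_wget_command(url, out_dir):
    """Assemble a wget call that fetches one page plus its images, CSS and JS."""
    return [
        "wget",
        "--page-requisites",     # also download images, stylesheets, scripts
        "--convert-links",       # rewrite links so the local copy works offline
        "--adjust-extension",    # save HTML/CSS with .html/.css extensions
        "--span-hosts",          # allow requisites served from other hosts
        "--no-directories",      # flat layout instead of host/path folders
        "--execute", "robots=off",        # assumption: robots.txt blocked the assets
        "--directory-prefix", out_dir,    # hypothetical output folder
        url,
    ]


def save_page(url, out_dir):
    """Run wget; requires wget on the server and network access."""
    return subprocess.run(build_wget_command(url, out_dir), check=False)


command = build_wget_command(
    "https://dandelion.eu/datagems/SpazioDati/milano-grid/description/",
    "milano-grid-description",
)
print(" ".join(command))
```

Calling save_page(...) over SSH would then place the HTML file and its assets side by side in one folder, which is close to, but not identical to, the browser's "Web page, complete" layout.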


MySQL client issue - Access denied when connecting via Bash script (WSL2 Ubuntu to a localhost Windows MySQL 8.0 instance)

I seem to be having an issue when running a bash script in Windows WSL2 / Ubuntu, but have no issue if I run the mysql command line with the same parameters.
This is a fairly straightforward script, although I am new to bash scripting.
#!/bin/bash
PROPERTY_FILE="MySQLDB_glenn.properties"
SCRIPTS_DIR="/home/glenn/scripts/"
echo $SCRIPTS_DIR$PROPERTY_FILE
WSL_HOST_IP=$(ipconfig.exe | awk '/WSL/ {getline; getline; getline; getline; print substr($14, 1, length($14)-1)}')
ALT_WSL_HOST=$WSL_HOST_IP
function atestfunction()
{
    myTest=$(echo "This is a test")
    echo $myTest
}
function getProperty()
{
    PROP_KEY=$1
    PROP_VALUE=$(cat $SCRIPTS_DIR$PROPERTY_FILE | grep $PROP_KEY | cut -d'=' -f2)
    echo $PROP_VALUE
}
echo "# Reading properties from $PROPERTY_FILE = " $SCRIPTS_DIR$PROPERTY_FILE
DB_USER=$(getProperty "db.username")
DB_PASS=$(getProperty "db.password")
DB_HOST=$(getProperty "db.hostname")
DB_DEFAULT=$(getProperty "db.defaultdb")
echo $DB_USER
echo $DB_PASS
echo $DB_HOST
echo $DB_DEFAULT
echo $ALT_WSL_HOST
echo "Selecting from DB " $DB_DEFAULT
mysql -u$DB_USER -p$DB_PASS -h$ALT_WSL_HOST $DB_DEFAULT -e "use test1; select * from products;"
I set up two versions of the user account, granting each all privileges: one that can connect via localhost, and a second that can connect from 172.0.0.0/255.0.0.0, to handle the fact that WSL comes up with a different address on each reboot.
The value of the variable $ALT_WSL_HOST is 172.25.208.1 (I am debugging it as I type this).
In the Bash script debugger (VS Code), or when running the script with ./simpleDBselect.sh, I get
ERROR 1045 (28000): Access denied for user 'wsl_root
'#'172.25.220.8' (using password: YES)
When I run the same mysql command in the same session (I just invoked it in the Terminal tab of VS Code), or after quitting VS Code, I get:
mysql -uwsl_root -pxxx-xxxxx -h172.25.208.1 test1 -e "use test1; select * from products;"
mysql: [Warning] Using a password on the command line interface can be insecure.
+-----------+----------+-------------+---------------+---------+-------------+-----------+-------------+
| productID | widgetid | productName | productNumber | color | inhouseCost | listPrice | productSize |
+-----------+----------+-------------+---------------+---------+-------------+-----------+-------------+
| 1 | 101 | Widget1 | s1001 | Yellow | 25.00 | 40.00 | medium |
| 2 | 102 | Widget2 | s1002 | Black | 27.00 | 45.00 | small |
| 3 | 103 | Widget3 | s1003 | Black | 28.00 | 40.00 | medium |
| 4 | 104 | Widget4 | s1004 | Red | 21.00 | 34.00 | small |
| 5 | 105 | Widget5 | s1005 | Green | 15.00 | 26.00 | large |
| 6 | 106 | Widget6 | s1006 | Magenta | 40.00 | 75.00 | large |
| 7 | 107 | Widget7 | s1007 | Orange | 50.00 | 85.00 | medium |
| 8 | 108 | Widget8 | s1008 | Blue | 39.00 | 55.00 | small |
| 9 | 109 | Widget9 | s1009 | Gold | 189.00 | 300.00 | large |
+-----------+----------+-------------+---------------+---------+-------------+-----------+-------------+
Does anyone know what I might be missing? I have read loads of documentation and think I have the DB accounts set up correctly for the variable hosts. Note that I have also tried '%', the default wildcard. Should I just install MySQL in the Ubuntu instance running in WSL instead of using the Windows install? I am also aware that WSL runs with its own virtual IP address, which is why I'm using WSL_HOST_IP=$(ipconfig.exe | awk '/WSL/ {getline; getline; getline; getline; print substr($14, 1, length($14)-1)}').
Any help would be appreciated, as I'm at my feeble wits' end :)
Best Regards,
Glenn Firester
See the description above for my attempts; wsl_root has DBA and all grants on the tables in the test1 database.
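One possible cause, not confirmed in the post: the error message breaks onto a new line right after wsl_root, inside the quotes, which would be consistent with the property file having Windows (CRLF) line endings, so that every value read by getProperty ends in a carriage return. The Python sketch below illustrates the effect with hypothetical file content; the function name parse_properties and the sample values are assumptions made for illustration.

```python
def parse_properties(text):
    """Parse key=value lines, tolerating Windows (CRLF) line endings."""
    props = {}
    for raw_line in text.split("\n"):
        line = raw_line.strip()  # drops a trailing "\r" as well as spaces
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# Hypothetical file content, saved with CRLF endings by a Windows editor
sample = "db.username=wsl_root\r\ndb.password=secret\r\n"

# What splitting on "=" alone keeps (similar to cut -d'=' -f2): note the "\r"
naive = sample.split("\n")[0].split("=")[1]
props = parse_properties(sample)

print(repr(naive))
print(repr(props["db.username"]))
```

If this is the cause, stripping the carriage return inside getProperty (for example by piping through tr -d '\r') or converting the file to Unix line endings should make the scripted command behave like the one typed by hand.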

How can I load a JSON file in Brython?

I am working on a page with Brython, and one thing I would like to do is load a JSON file on the same server, in the same directory.
How can I do that? open()? Something else?
Thanks,
You are on the right track, here is how I did it.
In my file .bry.
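import json
from browser import console  # Brython's handle to the browser console
css_file = "static/json/css.json"
# The path is relative to the HTML page that runs this script
j = json.load(open(css_file))
# For debug
console.log(j)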
Since the script is executed from your HTML page, the path to the JSON file must be given relative to the .html file, not relative to the .py or .bry file.
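To illustrate the path rule, here is a small sketch in standard Python (not Brython-specific) that resolves the same relative path against two different base URLs. The localhost URLs are hypothetical; only the relative path comes from the answer.

```python
from urllib.parse import urljoin

relative_path = "static/json/css.json"

# Hypothetical URLs for the page and for the script it loads
page_url = "http://localhost:8000/index.html"
script_url = "http://localhost:8000/static/py/main.bry"

# Resolved against the HTML page: this is where the file really is
from_page = urljoin(page_url, relative_path)

# Resolved against the script location: a path that does not exist
from_script = urljoin(script_url, relative_path)

print(from_page)
print(from_script)
```

The first result points at the JSON file in the tree shown in this answer; the second shows the wrong location you would get by counting from the script's own folder.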
Also, check if your JSON file has the right structure ;)
Here is my tree structure.
| index.html
| README.txt
| unicode.txt
|
+---static
| +---assets
| |
| +---css
| |
| +---fonts
| |
| +---js
| |
| +---json
| | css.json
| |
| \---py
| | main.bpy
Hoping to have helped :D

How to generate (not draw/render) sequence diagrams

I'm looking for ways to generate sequence diagrams from programmed logic. I do not mean rendering them from, e.g., text. A lot of posts talk about generating, but I see that as rendering/drawing. I've searched the internet and mostly found textual tools (like PlantUML) or intuitive, paid graphical tools. I'm not looking for either. I want to program the message flow and let the system draw it based on the possible choices.
One reason is that the conditional if/then/else with 'alt' is not very useful (in my opinion) if the else triggers a whole different path. It works for one different return, but it quickly becomes very ugly (again, my opinion).
The second reason is that I'm developing such a generating tool myself, and I'm wondering whether I'm making something nobody is waiting for, apart from it being a nice hobby project. For me it made sense to build it, as it interactively creates the message diagrams, which are very helpful during development or for passing on knowledge. Perhaps it could even validate the logic for dead ends.
The third reason is that the text itself becomes complex to maintain (see the example below).
The fourth reason is that I believe in automating a process that can/should be automated: we should not be busy drawing stuff but writing logic.
So is anybody aware of tools for generating (not drawing/rendering) sequence diagrams?
In the following examples the flows diverge at the invalid/valid card check, which is difficult to capture in an if/then/else, and other choices are made along the way as well.
Invalid card, choices: electronic -> chip -> invalid card -> keep goods
Merchant             Customer         Terminal
|                    |                |
+-inform-amount      |                |
|---choose-method--->|                |
|<--chooses-terminal-|                |
+-enters-amount      |                |
|------------start-payment----------->|
|                    |<--show-amount--|
|                    +-inserts-card   |
|                    |-method-chosen->|
|                    |<--card-invalid-|
|                    +-pay-different  |
|<-----------payment-failed-----------|
+-goods-left-behind  |                |
+-customer-leaves    |                |
Valid card, choices: electronic -> chip -> valid card -> auth valid -> enough balance
Merchant              Customer           Terminal           Secure-Intf       Acc-Srv                 Acc-DB
|                     |                  |                  |                 |                       |
+-inform-amount       |                  |                  |                 |                       |
|----choose-method--->|                  |                  |                 |                       |
|<--chooses-terminal--|                  |                  |                 |                       |
+-enters-amount       |                  |                  |                 |                       |
|-------------start-payment------------->|                  |                 |                       |
|                     |<---show-amount---|                  |                 |                       |
|                     +-inserts-card     |                  |                 |                       |
|                     |--method-chosen-->|                  |                 |                       |
|                     |<----card-valid---|                  |                 |                       |
|                     +-enter-pin        |                  |                 |                       |
|                     |---validate-pin-->|                  |                 |                       |
|                     |                  |--sec:authorize-->|                 |                       |
|                     |                  |                  |--verify-login-->|                       |
|                     |                  |                  |                 |---get-login-details-->|
|                     |                  |                  |                 |<-----login-details----|
|                     |                  |                  |<-login-response-|                       |
|                     |                  |<--sec:auth-valid-|                 |                       |
|                     |                  |---sec:transfer-->|                 |                       |
|                     |                  |                  |----transfer---->|                       |
|                     |                  |                  |                 |------get-balance----->|
|                     |                  |                  |                 |<-----balance-info-----|
|                     |                  |                  |                 |-upd-checking-balance->|
|                     |                  |                  |                 |-upd-merchant-balance->|
|                     |                  |                  |                 |----commit-changes---->|
|                     |                  |                  |                 |<---changes-committed--|
|                     |                  |                  |<---transferred--|                       |
|                     |                  |<-sec:transferred-|                 |                       |
|<-----------payment-successful----------|                  |                 |                       |
+-goods-given         |                  |                  |                 |                       |
|                     |<-------paid------|                  |                 |                       |
|                     +-customer-leaves  |                  |                 |                       |
The textual code below is valid for several free drawing tools. If any changes are needed, I think it becomes difficult to maintain; I would rather generate the diagrams.
title MSG-Flow for 'Merchant-flows'
participant "Merchant" as Merchant
participant "Customer" as Customer
participant "Terminal" as Terminal
participant "Secure Intf" as Secure_Intf
participant "Acc Srv" as Acc_Srv
participant "Acc DB" as Acc_DB
note left of Merchant: inform-amount
Merchant -> Customer: choose-method
Customer -> Merchant: chooses-terminal
note left of Merchant: enters-amount
Merchant -> Terminal: start-payment
Terminal -> Customer: show-amount
note right of Customer: inserts-card
Customer -> Terminal: method-chosen
Terminal -> Customer: card-valid
note right of Customer: enter-pin
Customer -> Terminal: validate-pin
Terminal -> Secure_Intf: sec:authorize
Secure_Intf -> Acc_Srv: verify-login
Acc_Srv -> Acc_DB: get-login-details
Acc_DB -> Acc_Srv: login-details
Acc_Srv -> Secure_Intf: login-response
Secure_Intf -> Terminal: sec:auth-valid
Terminal -> Secure_Intf: sec:transfer
Secure_Intf -> Acc_Srv: transfer
Acc_Srv -> Acc_DB: get-balance
Acc_DB -> Acc_Srv: balance-info
Acc_Srv -> Acc_DB: upd-checking-balance
Acc_Srv -> Acc_DB: upd-merchant-balance
Acc_Srv -> Acc_DB: commit-changes
Acc_DB -> Acc_Srv: changes-committed
Acc_Srv -> Secure_Intf: transferred
Secure_Intf -> Terminal: sec:transferred
Terminal -> Merchant: payment-successful
note left of Merchant: goods-given
Terminal -> Customer: paid
note right of Customer: customer-leaves
So is anybody aware of generating (not draw/render) tools available for sequence diagrams?
Have you tried ZenUML (https://zenuml.com)?
It can also generate sequence diagrams from Java code (as an IntelliJ IDEA plugin).

Extracting and Constructing Tables from HTML Files using Julia

Here's a public link to an example html file. I would like to extract each set of CAN and yearly tax information (example highlighted in red in the image below) from the file and construct a dataframe that looks like the one below.
Target Fields
Example DataFrame
| Row | CAN | Crtf_NoCrtf | Tax_Year | Land_Value | Improv_Value | Total_Value | Total_Tax |
|-----+--------------+-------------+----------+------------+--------------+-------------+-----------|
| 1 | 184750010210 | Yes | 2016 | 16720 | 148330 | 165050 | 4432.24 |
| 2 | 184750010210 | Yes | 2015 | 16720 | 128250 | 144970 | 3901.06 |
| 3 | 184750010210 | Yes | 2014 | 16720 | 109740 | 126460 | 3412.63 |
| 4 | 184750010210 | Yes | 2013 | 16720 | 111430 | 128150 | 3474.46 |
| 5 | 184750010210 | Yes | 2012 | 16720 | 99340 | 116060 | 3146.17 |
| 6 | 184750010210 | Yes | 2011 | 16720 | 102350 | 119070 | 3218.80 |
| 7 | 184750010210 | Yes | 2010 | 16720 | 108440 | 125160 | 3369.97 |
| 8 | 184750010210 | Yes | 2009 | 16720 | 113870 | 130590 | 3458.14 |
| 9 | 184750010210 | Yes | 2008 | 16720 | 122390 | 139110 | 3629.85 |
| 10 | 184750010210 | Yes | 2007 | 16720 | 112820 | 129540 | 3302.72 |
| 11 | 184750010210 | Yes | 2006 | 12380 | 112760 | | 3623.12 |
| 12 | 184750010210 | Yes | 2005 | 19800 | 107400 | | 3882.24 |
Additional Information
If it is not possible to insert the CAN into each row, that is okay; I can export the CAN numbers separately and find a way to attach them to the dataframe containing the tax values. I have looked into using Beautiful Soup for Python, but I am an absolute novice with Python and the rest of the scripts I am writing are in Julia, so I would prefer to keep everything in one language.
Is there any way to achieve this? I have looked at Gumbo.jl but cannot find any detailed documentation or tutorials.
So Gumbo.jl will parse the HTML and give you a programmatic representation of the structure of the HTML file (called a DOM - Document Object Model). This is typically a tree of HTML tags, which you can traverse to extract the data you need.
To make this easier, what you really want is a way to query the DOM, so that you can extract the data you need without having to traverse the entire tree yourself. The Cascadia.jl project does this for you. It is built on top of Gumbo, and uses CSS selectors as the query language.
So for your example, you could use something like the following to extract all the CAN fields:
julia> using Gumbo
julia> using Cascadia
julia> h=parsehtml(read("/Users/aviks/Download/z1.html", String))
julia> c = matchall(Selector("td:containsOwn(\"CAN:\") + td span"), h.root)
13-element Array{Gumbo.HTMLNode,1}:
Gumbo.HTMLElement{:span}:
<span class="value">184750010210</span>
...
#print all the CAN values
julia> for x in c
           println( x.children[1].text )
       end
184750010210
186170040070
175630130020
172640020290
168330020230
156340030160
118210000020
190490040500
173480080430
161160010050
153510060090
050493000250
050470630910
Hopefully this gives you an idea of how to extract all the data you need.
The current answer is a bit out of date since the readall() function no longer exists. I'll update his answer below.
Here's a general breakdown of the package ecosystem for Julia (as of the time of writing this answer):
Requests.jl is used to download the HTML file itself (note that in avik's answer, he reads the HTML file from his local machine)
Cascadia.jl is required to search for CSS tags (e.g. the tag that you would find if you were to use Selector Gadget).
Gumbo.jl is required to parse the resulting HTML
The key thing to remember is that Gumbo stores objects in tree format as HTMLNodes or HTMLElements. So most objects have "parents" and "children." To get the data you need, it's simply a matter of filtering with the right selector (using Cascadia) and then going to the correct point in the Gumbo tree.
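The "filter, then walk the tree" idea is independent of the library. As a rough illustration only, here is a Python sketch using the standard library's xml.etree.ElementTree on a tiny, well-formed snippet. The snippet layout is an assumption inferred from the selector in the answer above (a td containing "CAN:" followed by a td with a span); the "Owner:" row and the name extract_cans are hypothetical, and a real HTML page would need a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page is larger and messier
SNIPPET = """
<table>
  <tr><td>CAN:</td><td><span class="value">184750010210</span></td></tr>
  <tr><td>Owner:</td><td><span class="value">(hypothetical)</span></td></tr>
</table>
""".strip()


def extract_cans(markup):
    """Collect the span text that follows each 'CAN:' label cell."""
    root = ET.fromstring(markup)
    cans = []
    for row in root.iter("tr"):
        cells = list(row)  # the children of this <tr>
        # Pair every cell with its next sibling, like the '+' CSS combinator
        for label, value in zip(cells, cells[1:]):
            if (label.text or "").strip() == "CAN:":
                span = value.find("span")  # child lookup inside the cell
                if span is not None:
                    cans.append(span.text)
    return cans


cans = extract_cans(SNIPPET)
print(cans)
```

The row loop plays the role of the selector, and the child lookups correspond to stepping through children in the Gumbo tree.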
An updated version of avik's answer:
using Cascadia, Gumbo
# Requests.jl (mentioned above) has since been superseded by HTTP.jl. With HTTP.jl the download would look like:
# using HTTP
# r = HTTP.get(url) # Normally, you'd put a url here, but I couldn't find a way to grab it without having to download it and read it locally
# h = parsehtml(String(r.body)) # Then normally you'd execute this
# Instead, I'm going to read in the html file as a string and give it to Gumbo
h = parsehtml(read("z1.html", String)) # readstring() was removed in Julia 1.0
# Exploring the structure of the Gumbo objects (fieldnames takes a type in Julia 1.0+):
println(fieldnames(typeof(h.root)))
println(fieldnames(typeof(h.root.children)))
println(size(h.root.children))
# aviks' code (matchall was renamed to eachmatch in later Cascadia releases):
c = eachmatch(Selector("td:containsOwn(\"CAN:\") + td span"), h.root);
for x in c
    println( x.children[1].text )
end
This particular webpage is more difficult to scrape than most, since it doesn't have a great CSS structure.
There's some nice documentation on workflow on the Cascadia README, but I still had some questions after reading it. For anyone else (like me, yesterday) who comes to this page looking for guidance on web scraping in Julia, I've created a jupyter notebook with a simple example that will hopefully help you understand the workflow in greater detail.

MacVim+NERDTree: How to open a file as a split in furthest horizontal split

I've been browsing mvim docs and have tested out the various commands, but I can't seem to find one that solves my issue.
Here is what I have:
/========================================================\
| | | |
| | | |
| | file 1 | |
| | | |
| |______________________| |
| NERDTree | | File 3 |
| | | |
| | file 2 | |
| | | |
\__________|______________________|______________________/
What I'd like to have:
/========================================================\
| | | |
| | | |
| | file 1 | File 4 |
| | | |
| |______________________|______________________|
| NERDTree | | |
| | | |
| | file 2 | File 3 |
| | | |
\__________|______________________|______________________/
I'm able to move things far right, into a new vsplit, as well as far top and far bottom.
New NERDTree files are opening by default in the File 1/File 2 vsplit.
Any help is appreciated, thanks!
It seems as though my particular setup at that time may have been the issue, and I think I understand why. First, how to do what I asked:
Open up nerdtree with :NERDTree
Open your first file with <CR> or o
Open second file in horizontal split pane with i
From each of the 2 horizontal panes, create your third and fourth panes with s. This opens the selected file in a vertical split of the last window you interacted with, splitting each pane in half.
Bear in mind that you'll need to be in the pane you'd like to split before selecting the file to open from NERDTree.
My issue arose primarily from my panes already being arranged as in the topmost diagram above. Every time I tried to create a horizontal split with File 3, the split would just wind up in the first column of files.
I think I may see why now, though. With mvim you can interact through your mouse - and that's the only way to get directly from that furthest column to NERDTree, without touching any other buffers (as far as I can tell). Whereas with regular vim, you wouldn't be able to have the furthest column as the last interacted window, and therefore would never be able to split it.