I am working on a page with Brython, and one thing I would like to do is load a JSON file on the same server, in the same directory.
How can I do that? open()? Something else?
Thanks,
You are on the right track; here is how I did it.
In my .bry file:
import json
css_file = "static/json/css.json"
j = json.load(open(css_file))
# For debug
console.log(j)
Since execution starts from your HTML page, the path to your JSON file is resolved relative to your .html file, not your .py or .bry file.
Also, check if your JSON file has the right structure ;)
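A slightly more defensive version of the snippet above, in plain Python (the path is the one from my answer; whether Brython raises exactly the same exceptions for a failed ajax fetch is an assumption worth testing in your browser):

```python
import json

css_file = "static/json/css.json"  # resolved relative to the HTML page

try:
    with open(css_file) as f:
        j = json.load(f)
except FileNotFoundError:
    # Wrong path (or, in Brython, a failed fetch): fall back to an empty dict
    j = {}
except json.JSONDecodeError as e:
    raise SystemExit(f"{css_file} is not valid JSON: {e}")

print(j)
```

The with-block also makes sure the file handle is closed, which json.load(open(...)) leaves to the garbage collector.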
Here is my tree structure.
| index.html
| README.txt
| unicode.txt
|
+---static
| +---assets
| |
| +---css
| |
| +---fonts
| |
| +---js
| |
| +---json
| | css.json
| |
| \---py
| | main.bry
Hope this helps :D
Here's a public link to an example HTML file. I would like to extract each set of CAN and yearly tax information (an example is highlighted in red in the image below) from the file and construct a dataframe that looks like the one below.
Target Fields
Example DataFrame
| Row | CAN | Crtf_NoCrtf | Tax_Year | Land_Value | Improv_Value | Total_Value | Total_Tax |
|-----+--------------+-------------+----------+------------+--------------+-------------+-----------|
| 1 | 184750010210 | Yes | 2016 | 16720 | 148330 | 165050 | 4432.24 |
| 2 | 184750010210 | Yes | 2015 | 16720 | 128250 | 144970 | 3901.06 |
| 3 | 184750010210 | Yes | 2014 | 16720 | 109740 | 126460 | 3412.63 |
| 4 | 184750010210 | Yes | 2013 | 16720 | 111430 | 128150 | 3474.46 |
| 5 | 184750010210 | Yes | 2012 | 16720 | 99340 | 116060 | 3146.17 |
| 6 | 184750010210 | Yes | 2011 | 16720 | 102350 | 119070 | 3218.80 |
| 7 | 184750010210 | Yes | 2010 | 16720 | 108440 | 125160 | 3369.97 |
| 8 | 184750010210 | Yes | 2009 | 16720 | 113870 | 130590 | 3458.14 |
| 9 | 184750010210 | Yes | 2008 | 16720 | 122390 | 139110 | 3629.85 |
| 10 | 184750010210 | Yes | 2007 | 16720 | 112820 | 129540 | 3302.72 |
| 11 | 184750010210 | Yes | 2006 | 12380 | 112760 | | 3623.12 |
| 12 | 184750010210 | Yes | 2005 | 19800 | 107400 | | 3882.24 |
Additional Information
If it is not possible to insert the CAN into each row, that is okay; I can export the CAN numbers separately and find a way to attach them to the dataframe containing the tax values. I have looked into using Beautiful Soup for Python, but I am an absolute novice with Python, and the rest of the scripts I am writing are in Julia, so I would prefer to keep everything in one language.
Is there any way to achieve this? I have looked at Gumbo.jl but cannot find any detailed documentation or tutorials.
So Gumbo.jl will parse the HTML and give you a programmatic representation of the structure of the HTML file (called a DOM - Document Object Model). This is typically a tree of HTML tags, which you can traverse to extract the data you need.
To make this easier, what you really want is a way to query the DOM, so that you can extract the data you need without having to traverse the entire tree yourself. The Cascadia.jl project does this for you. It is built on top of Gumbo, and uses CSS selectors as the query language.
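To see what a selector engine saves you from, here is the manual-traversal equivalent sketched in Python's stdlib (the HTML fragment is made up to mimic the page's CAN rows, and ElementTree only accepts well-formed markup, so this is purely illustration, not a scraping recommendation):

```python
import xml.etree.ElementTree as ET

# Made-up fragment mirroring the page's "<td>CAN:</td><td><span>...</span></td>" rows
html = """<table>
  <tr><td>CAN:</td><td><span class="value">184750010210</span></td></tr>
  <tr><td>Tax Year:</td><td><span class="value">2016</span></td></tr>
  <tr><td>CAN:</td><td><span class="value">186170040070</span></td></tr>
</table>"""

root = ET.fromstring(html)
cans = []
for tr in root.iter("tr"):
    tds = list(tr.iter("td"))
    # Walk the cells by hand: find the "CAN:" label, then read its sibling's <span>
    for i, td in enumerate(tds):
        if td.text == "CAN:" and i + 1 < len(tds):
            span = tds[i + 1].find("span")
            if span is not None:
                cans.append(span.text)

print(cans)
```

A CSS selector expresses all of that sibling bookkeeping in one string.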
So for your example, you could use something like the following to extract all the CAN fields:
julia> using Gumbo
julia> using Cascadia
julia> h=parsehtml(read("/Users/aviks/Download/z1.html", String))
julia> c = matchall(Selector("td:containsOwn(\"CAN:\") + td span"), h.root)
13-element Array{Gumbo.HTMLNode,1}:
Gumbo.HTMLElement{:span}:
<span class="value">184750010210</span>
...
#print all the CAN values
julia> for x in c
println( x.children[1].text )
end
184750010210
186170040070
175630130020
172640020290
168330020230
156340030160
118210000020
190490040500
173480080430
161160010050
153510060090
050493000250
050470630910
Hopefully this gives you an idea of how to extract all the data you need.
The current answer is a bit out of date since the readall() function no longer exists. I'll update his answer below.
Here's a general breakdown of the package ecosystem for Julia (as of the time of writing this answer):
Requests.jl is used to download the HTML file itself (note that in avik's answer, he reads the HTML file from his local machine)
Cascadia.jl is required to search for CSS tags (e.g. the tag that you would find if you were to use Selector Gadget).
Gumbo.jl is required to parse the resulting HTML
The key thing to remember is that Gumbo stores objects in tree format as HTMLNodes or HTMLElements. So most objects have "parents" and "children." To get the data you need, it's simply a matter of filtering with the right selector (using Cascadia) and then going to the correct point in the Gumbo tree.
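The parent/child filtering idea that Gumbo exposes is language-agnostic; here is a toy sketch in Python (the node structure is invented for illustration, it is not Gumbo's actual layout):

```python
def walk(node):
    """Depth-first traversal yielding every node in the tree."""
    yield node
    for child in node.get("children", []):
        yield from walk(child)

# Invented miniature DOM: a <td> label followed by a <td> holding a <span>
dom = {"tag": "html", "children": [
    {"tag": "td", "text": "CAN:", "children": []},
    {"tag": "td", "children": [
        {"tag": "span", "text": "184750010210", "children": []},
    ]},
]}

# "Filtering with the right selector" is just a predicate applied over the walk
spans = [n["text"] for n in walk(dom) if n["tag"] == "span"]
print(spans)
```

Cascadia's job is to compile a CSS selector string into exactly this kind of predicate for you.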
An updated version of avik's answer:
using Requests, Cascadia, Gumbo
# r = get(url) # Normally, you'd put a url here, but I couldn't find a way to grab it without having to download it and read it locally
# h = parsehtml(String(r.data)) # Then normally you'd execute this
# Instead, I'm going to read in the html file as a string and give it to Gumbo
h = parsehtml(readstring("z1.html"))
# Exploring with the various structure of Gumbo objects:
println(fieldnames(h.root))
println(fieldnames(h.root.children))
println(size(h.root.children))
# aviks code:
c = matchall(Selector("td:containsOwn(\"CAN:\") + td span"), h.root);
for x in c
println( x.children[1].text )
end
This particular webpage is more difficult to scrape than most, since it doesn't have a great CSS structure.
There's some nice documentation on workflow on the Cascadia README, but I still had some questions after reading it. For anyone else (like me, yesterday) who comes to this page looking for guidance on web scraping in Julia, I've created a jupyter notebook with a simple example that will hopefully help you understand the workflow in greater detail.
Really hoping someone can help me with this 2008 SSIS BIDS issue. I am very new to the software so I am hoping it will be something simple!
I am attempting to use SSIS to retrieve data from an SQL table. My data flow consists of one OLE DB Source pointing directly to one Flat File Destination.
At the moment I can successfully retrieve 4 columns, which display fine; however, when I attempt to add a fifth column (from data returned from the SQL database) it appears to jumble everything up.
To elaborate: the fifth column should contain a letter between “A” and “F”; however, instead of pulling this through from my results in the OLE DB Source, it appears to be pulling through data from columns 1-4.
E.g. before adding this extra column, things look like this:
|----------|------------|-------------|--------|
| Column 1 |Column2 |Column3 |Column4 |
|----------|------------|-------------|--------|
|444456654 |10/01/2015 |User unable |JSMIT14 |
| | |to logout of | |
| | |app without | |
| | |crashing. | |
------------------------------------------------
However, after adding the end column and mapping it accordingly, everything seems to go out of sync, with added quotation marks:
|----------|------------|-------------|--------|-----------------------|
| Column 1 |Column2 |Column3 |Column4 |Column 5 |
|----------|------------|-------------|--------|-----------------------|
|444456654 |10/01/2015 |User unable |JSMIT14 |444456654”,”User |
| | |to logout of | |unable to logout of |
| | |app without | |app without |
| | |crashing. | |crashing.”,”JSMITH14 |
-----------------------------------------------------------------------
I was expecting it to pull through the data as I had mapped it and appear as below:
|----------|------------|-------------|--------|---------|
| Column 1 |Column2 |Column3 |Column4 |Column 5 |
|----------|------------|-------------|--------|---------|
|444456654 |10/01/2015 |User unable |JSMIT14 | F |
| | |to logout of | | |
| | |app without | | |
| | |crashing. | | |
----------------------------------------------------------
I add the extra column by selecting “Flat File Connection Manager” > “Advanced” > “New”.
I then map the SQL column to my new column by selecting “Flat File Destination” > “Mappings”.
Please note if I select the data source and preview the SQL code all appears fine so I feel that something must be going wrong at the Flat File Destination stage.
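For what it's worth, the stray quotation marks in Column 5 look like the classic symptom of a text-qualifier/delimiter mismatch in the Flat File Connection Manager (an assumption on my part, since I can't see the package). The effect is easy to reproduce with Python's csv module by writing quoted fields and then reading them back with the wrong quote character:

```python
import csv
import io

# Write a row using '"' as the text qualifier, as a flat file destination would
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_ALL).writerow(
    ["444456654", "10/01/2015", "User unable to logout", "JSMIT14", "F"])

# Read it back while treating "'" (the wrong character) as the qualifier:
# the '"' marks are no longer recognised and leak into the field data
jumbled = next(csv.reader(io.StringIO(buf.getvalue()), quotechar="'"))
print(jumbled)
```

If that matches what you are seeing, check that the text qualifier defined for the new column agrees with the qualifier set on the rest of the flat file definition.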
I've been browsing mvim docs and have tested out the various commands, but I can't seem to find one that solves my issue.
Here is what I have:
/========================================================\
| | | |
| | | |
| | file 1 | |
| | | |
| |______________________| |
| NERDTree | | File 3 |
| | | |
| | file 2 | |
| | | |
\__________|______________________|______________________/
What I'd like to have:
/========================================================\
| | | |
| | | |
| | file 1 | File 4 |
| | | |
| |______________________|______________________|
| NERDTree | | |
| | | |
| | file 2 | File 3 |
| | | |
\__________|______________________|______________________/
I'm able to move things far right, into a new vsplit, as well as far top and far bottom.
New NERDTree files are opening by default in the File 1/File 2 vsplit.
Any help is appreciated, thanks!
It seems as though my particular setup at that time may have been the issue, and I think I understand why. First, how to do what I asked:
Open up nerdtree with :NERDTree
Open your first file with Enter or o
Open second file in horizontal split pane with i
From each of 2 horizontal panes create your third and fourth panes with s. This will open the selected files in vertical split of the last buffer you interacted with, splitting them each in half.
Bear in mind that you'll need to be in the pane you'd like to split before selecting the file to open from NERDTree.
My issue arose primarily from my panes already being in the orientation of the first diagram above. Every time I tried to create a horizontal split with File 3, the split would just wind up in the first column of files.
I think I may see why now, though. With mvim you can interact through your mouse - and that's the only way to get directly from that furthest column to NERDTree, without touching any other buffers (as far as I can tell). Whereas with regular vim, you wouldn't be able to have the furthest column as the last interacted window, and therefore would never be able to split it.
I'm trying to save data descriptions from the open data site with a Linux server via SSH:
https://dandelion.eu/datagems/SpazioDati/milano-grid/resource/
https://dandelion.eu/datagems/SpazioDati/milano-grid/description/
I've read the question (What's the best way to save a complete webpage on a linux server?) and tried wget -l 1 https://dandelion.eu/datagems/SpazioDati/milano-grid/description/ and wget -m https://dandelion.eu/datagems/SpazioDati/milano-grid/description/. Neither of them worked; all I get is an index.html. I want the files that IE/Firefox's 'Save Page' > 'Web page, complete' function produces on a PC: an HTML file plus a folder containing all the assets, such as images.
Is this possible on a Linux server via SSH? Thanks!
Update
This is what I want (I 'Save page' https://dandelion.eu/datagems/SpazioDati/milano-grid/description/ with Firefox):
|---Milano Grid description _ dandelion_files
| |---a
| |---css_all.css
| |---css.css
| |---dc.js
| |---fbk.jpg
| |---jquery_002.js
| |---jquery.js
| |---js_all.js
| |---js_sitebase.js
| |---Milano_GRID_4326.png
| |---milano-grid-img2.jpg
| |---mixpanel-2.js
| |---odi.png
| |---sbi.css
| |---spaziodati_black.png
| |---spaziodati_white.png
| |---telecom.png
|---Milano Grid description _ dandelion.htm
This is what I get with wget -p -k https://dandelion.eu/datagems/SpazioDati/milano-grid/description/:
|---dandelion.eu
| |---datagems
| | |---SpazioDati
| | | |---milano-grid
| | | | |---resource
| | | | | |---index.html
| |---jsi18n
| | |---index.html
| |---robots.txt
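For reference, the closest wget gets to Firefox's 'Web page, complete' is a combination of flags (a sketch, not guaranteed to work on every site; in particular, -H will also fetch assets hosted on other domains, so use it with care):

```shell
# -p   download page requisites (images, CSS, JS)
# -k   convert links in the saved HTML to point at the local copies
# -E   add .html extensions where the server omits them
# -H   span hosts, in case assets live on a CDN
# -nd  put everything in one directory instead of mirroring the host tree
# -P   target directory
wget -p -k -E -H -nd -P milano-grid \
    https://dandelion.eu/datagems/SpazioDati/milano-grid/description/
```

Pages that build their content with JavaScript will still come back incomplete, since wget does not execute scripts the way a browser does.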
I have the following requirement: I need to pass parameters from an HTML page to a batch file, which in turn passes the parameters to an XML file. I need to know how to pass parameters from the HTML page to the batch file, and from the batch file to the XML file.
Thanks
What kind of "parameters"? What kind of "html page"? What kind of "batch file"? What kind of "xml file"?
Assuming that you mean that data from a HTML form should be processed by a batch file and written to disc as XML:
Data from HTML forms is classically processed server-side using the CGI protocol, and it's possible to do that with a batch script, probably even a Windows batch file.
However, this is going to be extremely uncomfortable, error-prone and insecure. It's much better to have a language or framework specifically geared towards web applications handle the low-level CGI stuff for you.
Common choices are: PHP, Perl, Java servlets or ASP.
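To make the "low-level CGI stuff" concrete: a CGI program receives the form data as a single URL-encoded string and has to decode it itself. In Python terms (a sketch of the decoding step only; the field names and values are invented):

```python
from urllib.parse import parse_qs

# What the browser sends for a form with fields "name" and "city"
query_string = "name=m.mahesh.2000&city=Chennai%20India"

params = parse_qs(query_string)
print(params["name"][0])
print(params["city"][0])
```

A framework does this parsing (plus escaping, headers, and error handling) for you, which is exactly the drudgery a batch file would have to reimplement by hand.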
While it's possible to write XML simply by outputting strings, you're virtually guaranteed to get malformed XML eventually.
It's much better to use a real XML framework to produce the XML - there are several to choose from for pretty much any language worth using.
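As an example of that point, here is what a real XML framework buys you, sketched with Python's built-in ElementTree (the element names are invented): special characters are escaped for you, so the output is always well-formed.

```python
import xml.etree.ElementTree as ET

root = ET.Element("parameters")
param = ET.SubElement(root, "param", name="query")
param.text = 'value with <, & and "quotes"'  # would break naive string output

xml_bytes = ET.tostring(root, encoding="utf-8")
print(xml_bytes.decode())
```

Concatenating the same text into a string template would emit a bare < and &, producing XML that no conforming parser will accept.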
m.mahesh.2000, it might be worth drawing a little diagram of the various parts of the puzzle. HTML and XML files are not programs!
Consider these possible diagrams:
CGI Approach:
+--------------+ +----------------+
| Browser | | Web Server |
| | | (eg: Apache) |
| +----------+ | | +------------+ |
| |HTML | | --> | | CGI | |
| |Javascript| | | | | |
| +----------+ | | | +-------+ | |
+--------------+ | | | Perl | | |
| | +-------+ | |
| +------------+ |
+----------------+
Servlet Container Approach:
+--------------+ +------------------+
| Browser | | Tomcat |
| | | |
| +----------+ | | +-------------+ |
| |HTML | | --> | | Servlet | |
| |Javascript| | | | Container | |
| +----------+ | | | +---------+ | |
+--------------+ | | | Servlet | | |
| | +---------+ | |
| +-------------+ |
+------------------+
The browser renders your HTML, executes any JavaScript, and sends HTTP requests to your server - be it Apache, Tomcat, or something else. Do you know what kind of server you have?
Apache spawns child CGI processes to act on certain HTTP requests. CGI processes are typically PHP or Perl scripts.
Tomcat has a number of threads to act on HTTP requests. Some requests are handled by Servlet instances hosted within a Servlet container.
Either the CGI process, or the servlet, will do the work of creating your XML file on the server, and contacting your database.
Hope this helps.
Are the batch file and XML file client-side or server-side?
Either way, you will need to add some script to the HTML file - or even use server-side scripting to generate the HTML...