Usually when I download sites with HTTrack I get all the files: images, CSS, JS, etc. Today the program finished in just 2 seconds and grabbed only the index.html file, with the CSS and image references inside still pointing to the external site. I've already reset my settings to default, but that didn't help. Does anyone know how to get it working properly again?
Does the site have a robots.txt, and are you honouring it in your settings?
If it does, you can turn it off in "Options/spider/spider: Never" (according to this article)
All,
I did a bit of research but haven't found an exact thread or resolution to this issue.
I am using express in this webapp, Chrome Version 60.0.3112.113, and Win 10 Version 1703.
I am currently developing a site where I want to use a hamburger svg for mobile navigation. This is how the html sits for the "topbar"
<div id="topbar">
<img src="../images/hamburger.svg" alt="ham">
</div>
And here is the file structure:
https://puu.sh/xxDih/c842297b54.png
According to the structure, I should only need to do ../images/hamburger.svg, but when I do that, it comes up with a 404 error in the waterfall. I have run into this issue multiple times doing any sort of HTML sourcing into parent directories, but in JS files it works fine.
I'm not exactly sure what the issue is.
With an Express server, every URL it handles is either a request URL or a resource URL.
A request URL (API) is matched against the routes defined in your Express app.
A resource URL (image, JS, CSS, HTML...) is resolved relative to your static server's root directory, which is set with express.static(root_path).
That's the point I wanted to make.
I noticed that the images folder, node_modules folder, and pages folder are all in the same directory, and css is under the pages folder.
"../images/hamburger.svg" is the correct relative path from the pages folder. However (big red flag), node_modules sits at "../node_modules/", so I think the server is serving from the /pages/ folder as its root directory, meaning nothing above /pages/ is shared.
Clearly you do not want to share out ../../../windows/system32/ or the user's documents, etc. To prevent that, the highest directory you can reach from an HTML page through a browser is the folder the server is serving. I'm thinking /pages/home.html is localhost/home.html and localhost/css/ is your css directory.
Programs running on the server can access files above the served directory, but the browser cannot. "/node_modules/" should be outside of the server's root directory.
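The clamping described above can be checked with Node's built-in URL class, which follows the same resolution rules browsers use. This is a minimal sketch: the host and port (localhost:3000) and the idea that home.html is served at the site root are illustrative assumptions, not details confirmed by the question.

```javascript
// Sketch: how a browser resolves "../" against a page served at the site root.
// Assumption: the page is reachable as http://localhost:3000/home.html
// (hypothetical host and port).
const pageUrl = 'http://localhost:3000/home.html';

// Resolve a reference found in the page the same way a browser would.
function resolveRef(ref, base) {
  return new URL(ref, base).pathname;
}

// ".." cannot climb above "/", so the request never leaves the served root.
const imagePath = resolveRef('../images/hamburger.svg', pageUrl);
const escapePath = resolveRef('../../../windows/system32/', pageUrl);

console.log(imagePath);  // path the browser will actually request
console.log(escapePath); // still inside the site root
```

If the static root is the pages folder, the server would then look for an images folder inside pages, which would explain the 404 when images actually sits next to pages.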
I realized this is an issue with express itself.
If (in this case) you serve your files with express.static('./pages'), Express can't see anything above pages and treats pages as the working directory.
Coming from React (which, stupidly, was the first thing I learned, even before basic JS), I want to put all the pages in one folder, which I think makes sense.
My workaround, which may not be optimal, was to put index.html as a sibling of pages, css, and images in the src folder. index.html then contains the following meta tag to redirect to home.html: <meta http-equiv="refresh" content="0; url=./pages/home.html" />
Again, this may not be optimal, but for a kinda OCD guy like myself this makes sense.
Update:
What we ended up doing is to have index.html be a static page, and then load the individual pages in an iframe. This website is mainly for information and has no database (yet), so there won't be much to process. Here's the new file structure that works.
http://puu.sh/xy5Dw/4dbc72ec06.png
src is now our working directory (express.static('./src')) and everything is detailed within there.
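To see why the new layout works, here is a rough sketch of how a reference inside a page maps to a file once src is the static root. It uses only Node built-ins rather than Express itself; the helper name toFilePath and the localhost:3000 origin are made up for illustration, and the folder names follow the description above.

```javascript
const path = require('path');

// Hypothetical helper: map a reference inside a page to the file a static
// server rooted at `root` would look for. Not part of Express.
function toFilePath(root, pagePath, ref) {
  // Resolve the reference like a browser (the origin is an arbitrary placeholder).
  const resolved = new URL(ref, 'http://localhost:3000' + pagePath);
  // A static server then appends the URL path to its root directory.
  return path.posix.join(root, decodeURIComponent(resolved.pathname));
}

// Page served from src/pages/home.html, image stored in src/images/.
const file = toFilePath('src', '/pages/home.html', '../images/hamburger.svg');
console.log(file);
```

Because the page now lives one level below the root, "../" has somewhere to go and lands in the images folder inside src.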
Once we do include a database, it will hold at most 10 values on the server and will use very basic requests, nothing crazy.
Just wondering how/why this works: when I make a simple HTML file, link in some CSS, and then drag the HTML file into the browser, no static web server is needed for me to view it.
Why is that so?
I'm looking at my browser's network tab, and no request is made for the CSS file, yet my browser still displays it perfectly.
Is there a way to do without a static file server on the web for HTML, CSS, and JS files, like when dragging and dropping a file into a browser?
Just going back and re-questioning the basics here.
Thanks in advance!
Because the link to your CSS file is relative, and your CSS file is accessible locally. Browsers can be used to access local files, not just files on the Internet.
When working with links, you may see just the name of the file referenced, as such:
<a href="file.html">Link</a>
This is known as a relative link. file.html is relative to wherever the document is that is linking to it. In this case, the two files would be in the same folder.
There's a second type of link, known as an absolute URL, where the full path is specified.
Consider a typical absolute website link:
<a href="http://[YOUR WEBSITE]/file.html">Link</a>
With a local file, this would essentially be:
<a href="file://[YOUR WEBSITE]/file.html">Link</a>
The file protocol can be used to access local files.
Considering both the homepage (presumably index.html) and file.html would live in the same folder on both a web server and your local machine, the relative link to file.html would work in either scenario. In fact, with a relative link, the location of the second file is determined automatically from the location of the first. In my example, index.html would live at file://[YOUR WEBSITE]/index.html, so your browser is smart enough to know to look in file://[YOUR WEBSITE]/ when resolving any relative URLs.
Note that the same scenario applies to any other file! <link> and <script> tags will look for files in the exact same way -- that includes your stylesheet :)
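A quick way to see this resolution in action is Node's built-in URL class, which applies the same rules to file:// and http:// bases. This is only a sketch; the folder /home/me/site and the domain example.com are placeholders, not paths from the question.

```javascript
// Resolve the same relative reference against a local page and a hosted page.
const localPage = 'file:///home/me/site/index.html'; // placeholder local path
const hostedPage = 'http://example.com/index.html';  // placeholder domain

const localCss = new URL('style.css', localPage).href;
const hostedCss = new URL('style.css', hostedPage).href;

console.log(localCss);  // read straight from disk, no web server involved
console.log(hostedCss); // fetched over HTTP from the same folder as the page
```

In both cases the stylesheet is looked up in the folder that contains the page; only the protocol in front differs.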
Hope this helps!
Sounds like you are new to HTML and web development.
It all has to do with relative versus absolute file paths.
Check out these articles and have fun coding! Always remember that Google is your friend: improve your search-fu and you will not have to ask questions like this.
Godspeed.
http://www.geeksengine.com/article/absolute-relative-path.html
http://www.coffeecup.com/help/articles/absolute-vs-relative-pathslinks/
How to properly reference local resources in HTML?
I'm using HexoJS to create a blog. I was able to generate the static files using hexo generate. Even though there are css files and JS files generated, they are not properly linked to the index.html.
So I have to open each HTML page and correct the links in its href and src attributes one by one. I don't think this is very practical. Can anyone help?
localhost is only used to preview the website. Once the blog is published on a server, the paths will be interpreted correctly, so we don't need to change anything. What we see on http://localhost:4000 will be the same as the published website.
So, we don't have to worry about the broken paths in the public folder.
I have made my first website and in the preview in Safari and Chrome from Dreamweaver it works fine. But after uploading my files with Filezilla to 000webhost and typing in the URL, only the index page loads, links to other pages on the site don't work, images are broken and the css isn't applied.
I think it is because I haven't named the files correctly in the code, but I have no idea what to call them in order to get it right.
The folder you upload to is public_html. So I've tried http://www.webaddress/public_html/Pages/entertainment.html but it didn't change anything.
Thanks for any help!
Without code examples it's very difficult to answer this, but it's probably just that your URL format is incorrect.
For example, if you've got example.com/example/example.html and that page contains a CSS file with a location of /css/style.css, the web browser will look for example.com/css/style.css because the slash at the beginning of the URL tells it to go to the root.
In this case, your CSS file is probably actually in example.com/example/css/style.css. Remove the beginning slash so the location is css/style.css and the web browser will look for the file using the current page's location as its starting point.
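The difference the leading slash makes can be reproduced with Node's built-in URL class, which resolves references the way a browser does. A minimal sketch, using the example.com locations from the answer above:

```javascript
// Page location taken from the example above.
const page = 'http://example.com/example/example.html';

// Leading slash: resolved from the site root.
const rootRelative = new URL('/css/style.css', page).href;
// No leading slash: resolved from the folder containing the current page.
const pageRelative = new URL('css/style.css', page).href;

console.log(rootRelative);
console.log(pageRelative);
```

Running the same check against your own page URL and link values is a quick way to see which file the browser will actually request.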
We have a page that has been using a server side include for many years. Recently it stopped working. No changes have been made to the page. The include is:
<!--#include virtual="..\..\includes\nav.include" -->
It sits near the bottom of a page called contact.html.
The 'nav.include' page simply contains html for a navigation bar. No javascript. No server side scripting. Just html.
Is there some setting somewhere that needs to be set to make SSIs work in the way it is implemented here (including a file with an uncommon extension inside a html file)?
A solution that I discovered yesterday:
I duplicated and renamed all my pages to .php (retained the original html files just in case!)
I have replaced all the {<!--#include virtual="folder_name/file_name.ext" -->} with
<?php include "folder_name/file_name.ext" ; ?>
with the appropriate number of dots and slashes depending upon where the pages are in my folder hierarchy. (The {} above only marks out the code.)
Finally, I renamed the original index.html to some other name so that the index.php is picked up instead of the index.html
This seems to be working out - I am still testing out all the pages and links - a very tedious and time consuming exercise!
INCLUDES SYNTAX:
In a php file use
<?php include "..//folder_name/file_name.ext" ; ?>
In an html file use
<!--#include virtual="../folder_name/file_name.ext" -->
EXPERIMENT WITH NUMBER OF "..." AND NUMBER OF "///" IN THE ABOVE SYNTAX TO GET THE CORRECT COMBINATION!!!!
For me, all my includes are small html files in a folder ABC which is directly under the webroot.
For pages which are under sibling folders of ABC i.e. in other folders directly under webroot, "..//" is the number of dots and slashes that work.
For pages which are directly in the webroot (i.e. not in any folder inside webroot), folder_name/file_name.ext without any dots or slashes has worked.
I haven't had the time to check out the number of dots and slashes required for any other level in the hierarchy!
I hope this helps!
Are you using GoDaddy? They did the same to my site, and I found on their forums someone that said to use include file instead of include virtual.
Just switched over to Godaddy servers and my SSI stopped working. I made a .txt file with the following:
AddHandler server-parsed .html
I uploaded it to the public html folder, then renamed it .htaccess, and everything started working.
I had too many files to convert all the extensions to PHP, so I had to find another answer, if at all possible.
For me, for a little while, changing include virtual to include file seemed to help, but then it broke again after a few days. I guess GoDaddy had not finished monkeying around with the SSI configuration. o_O
The solution, as of tonight, was to convert all relative paths to absolute paths specified from the site root. For example, I had to convert:
<!--#include virtual="..\..\includes\nav.html" -->
To:
<!--#include virtual="\includes\nav.html" -->
Using this approach, I was able to include HTML files inside other HTML files.
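If many pages need this conversion, the root-relative path can be computed rather than worked out by hand. The sketch below is one way to do it with Node's path module; the helper name toRootRelative and the page location /about/team/contact.html are hypothetical, and it emits forward slashes (the usual URL separator) rather than the backslashes used in the post.

```javascript
const path = require('path');

// Hypothetical helper: turn an include path that is relative to a page into
// a path that starts at the site root.
function toRootRelative(pagePath, includePath) {
  // Treat backslashes as separators so "..\..\x" and "../../x" behave alike.
  const normalized = includePath.replace(/\\/g, '/');
  // Resolve against the folder that contains the page.
  return path.posix.join(path.posix.dirname(pagePath), normalized);
}

// Assumed page location, two folders below the site root.
const result = toRootRelative('/about/team/contact.html', '..\\..\\includes\\nav.html');
console.log(result);
```

The result could then be pasted into the virtual attribute; whether the server accepts it still depends on the host's SSI configuration.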
I discovered this on one of my pages that mixed absolute and relative path specification.
HTH
I've been seeing this problem frequently on my GoDaddy hosted site. I have to go into the Server configuration page, disable SSI, save the settings, then re-enable SSI and check "Use SSI on .HTM and .HTML files", and it starts working again.
The problem is on GoDaddy's side. For some reason, it's forgetting that it needs to parse SSI in files, until you turn off and turn on that option. Their Tier-2 support only suggested using Virtual instead of File on the Include command... which is preposterous, since not only does that not change a thing, the SSI includes work just fine most of the time... until it doesn't.
I'm also updating old .html pages to .php and replacing some of the SSI includes with PHP include statements on all pages. Along the way, some of the pages displayed [an error occurred while processing this directive].
The pages displaying the error also referenced an old .ssi file that wasn't even in the directory it pointed to. I deleted the old includes code to the non-existent .ssi file in those pages, and that fixed the error.
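This error occurs when your code contains an HTML comment used as documentation that starts with #, because the SSI parser treats <!--# as the start of a directive. For example:
<!--#My awesome documentation-->
To fix it, remove the #, like this:
<!-- My awesome documentation-->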
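To hunt for such comments across many files, a small scan can flag anything that starts with <!--# but is not followed by a directive name. This is only a sketch: the function name findSuspectComments is made up, and the directive list is a partial set of common Apache SSI elements that you would likely need to adjust for your server.

```javascript
// Partial list of SSI element names (an assumption; extend it as needed).
const KNOWN_DIRECTIVES = new Set([
  'include', 'echo', 'exec', 'config', 'fsize', 'flastmod',
  'printenv', 'set', 'if', 'elif', 'else', 'endif',
]);

// Hypothetical helper: return comments that begin with "<!--#" but whose
// first word is not a known directive. Comparing in lower case is a
// simplification made for this sketch.
function findSuspectComments(html) {
  const suspects = [];
  const pattern = /<!--#\s*([A-Za-z]*)[\s\S]*?-->/g;
  for (const match of html.matchAll(pattern)) {
    if (!KNOWN_DIRECTIVES.has(match[1].toLowerCase())) {
      suspects.push(match[0]);
    }
  }
  return suspects;
}

const sample = [
  '<!--#include virtual="/includes/nav.html" -->',
  '<!--#My awesome documentation-->',
].join('\n');

console.log(findSuspectComments(sample));
```

Anything the scan reports is a candidate for the fix above: drop the # so the server treats it as an ordinary comment.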