View HTML source while using CSR with React

I'm studying ways to develop an SEO-friendly React website with CSR.
I have read many articles pointing out that, to provide an SEO-friendly website, one should go with the SSR approach.
To my knowledge, when using the browser's view source feature with CSR, the HTML content is little more than references to JavaScript bundle files, and the actual HTML is not present, since view source only shows what was rendered on the server side. With SSR, the HTML is rendered on the server and passed to the browser, so the displayed HTML is present in the source view of the page.
However, https://divar.ir (a well-known retailer site) seems to be using CSR: upon clicking any link, the data is fetched from an API endpoint in JSON format via an AJAX call, and then the page appears to be rendered on the client side.
The thing is, when I view the source of the page, even after clicking any link, I can see the actual HTML that is being displayed.
So, to sum up: how can I use CSR in React and still see the HTML that is being displayed to the user when I view the source of a page?

Server-side rendered React applications usually pre-render only the initial page load. Subsequent navigation may still be handled and rendered entirely by the client.
The view source tool opens the code in a new tab (at least in Chrome), which triggers a fresh load of the current route from the server. If the application is server-side rendered, you receive a pre-rendered version of that route and therefore see the HTML for that route.
If you provide a sitemap of your website, a bot can discover all SEO-relevant routes by visiting the URLs listed in it. Each of those requests is an independent request to the server and will be pre-rendered, in contrast to how a real user navigates the site by clicking links.
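As a rough illustration of the sitemap idea, here is a minimal Python sketch that builds a sitemap document from a list of routes. The routes and the `build_sitemap` helper are made up for the example; a real site would generate the list from its own router or database.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical routes, for illustration only.
routes = ["https://example.com/", "https://example.com/products/42"]
sitemap_xml = build_sitemap(routes)
print(sitemap_xml)
```

A crawler that fetches each listed URL directly gets the server-rendered response for that route, which is the same thing view source shows.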

Related

Saving static HTML page generated with ReactJS

Background:
I need to allow users to create web pages for various products, with each page having a standard overall appearance. So basically, I will have a template, and based on the input data I need the HTML page to be generated for each product. The input data will be submitted via a web form, following which the data should be merged with the template to produce the output.
I initially considered using a pure templating approach such as Nunjucks, but moved to ReactJS as I have prior experience with the latter.
Problem:
Once I display the output page (by adding the user input to the template file with placeholders), I am getting the desired output page displayed in the browser. But how can I now obtain the HTML code for this specific page?
When I tried to view the source code of the page, I see the contents of 'public/index.html' stating:
This HTML file is a template.
If you open it directly in the browser, you will see an empty page.
Expectedly, the same happens when I try to save (Save As...) the html page via the browser. I understand why the above happens.
But I cannot find a solution to my requirement. Can anyone tell me how I can download/save the static source code for the output page displayed in the browser?
I have read possible solutions such as installing 'React/Redux Development Extension' etc... but these would not work as a solution for external users (who cannot be expected to install these extensions to use my tool). I need a way to do this on production environment.
p.s. Having read the "background" info of my task, do let me know if you can think of any better ways of approaching this.
Edit note:
My app is currently just a single page that accepts user data via a form and displays the output (in a full-screen dialog). I don't wish to have these output pages 'published' on the website; they are simply to be saved/downloaded for internal use. So simply being able to get the "source code" for the displayed view/page in the browser and save it to a file would solve my problem. But I am not sure if there is a way to do this?
It's recommended that you use a well-known site generator such as Gatsby or Next for your static sites, since "npx create-react-app my-app" is for single-page apps.
(ref: https://reactjs.org/docs/create-a-new-react-app.html#recommended-toolchains)
If I'm understanding correctly, you need to generate a new page link for each user. Each of your users will have their own link (http/https) to share with their users.
For example, a scheduling tool will need each user to create their own "booking page", which is a generated link (could be on your domain --> www.yourdomain.com/bookinguser1).
You'll need user profiles to store each user's custom page, a database, and such. If you're not comfortable with that, I'd use something like an e-commerce tool that will do it for you.
You can open the developer tools (F12) and go to "Elements".
Then right-click on the HTML tag and choose "Edit as HTML".
Then select everything (Ctrl+A) and copy it.
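For the pure templating route the question mentions, the merge of form data into a template does not need a browser at all. The following Python sketch uses the standard library's `string.Template` as a stand-in for a templating engine such as Nunjucks; the template text, field names, and `render_product_page` helper are assumptions for illustration.

```python
import html
from string import Template

# Hypothetical page template; $name and $description are placeholders.
PAGE_TEMPLATE = Template("""<!DOCTYPE html>
<html>
<head><title>$name</title></head>
<body>
<h1>$name</h1>
<p>$description</p>
</body>
</html>
""")


def render_product_page(fields):
    """Escape each form value and merge it into the template."""
    safe = {key: html.escape(value) for key, value in fields.items()}
    return PAGE_TEMPLATE.substitute(safe)


page = render_product_page(
    {"name": "Widget <Pro>", "description": "A sample product."}
)
print(page)
```

The resulting string is already the static HTML for the page, so it can be written to a file or offered as a download directly.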

How to emulate the behavior of a master page in .NET Core and VS Code

When building an MVC project, there's a Shared folder automatically provided, in which I have a Layout.cshtml page that's used as the holder or master page (as it was called in Web Forms). So all the banners, navbars, footers etc. go in there, while the actual pages being developed refer to it in the source code and get pasted together upon rendering. This far I'm following.
Now, I have set up an ASP.NET website using Yeoman, and the only thing I have is the wwwroot directory, in which I put the file start.html. (It's the same as index.html - I just wanted to try out whether I have full control over default files.)
I'm unsure how to proceed. I.e. I'd like the links on start.html to point to files like uno.html, duo.html etc. and read those into a designated part of the landing page (i.e. start.html).
Is it doable without using the magic of templates? I want to have full control over the rendering process.
There's no point googling it, I noticed, because everything I've found in the last two hours leads to how to create a master page, not how to emulate one.
Well, the static files middleware is just for that: static files.
You roughly have two options:
Do everything client-side, i.e. rather than having normal links, use JavaScript/AJAX calls and embed the content of the static file in your start.html using JavaScript.
It should work, but it has several downsides: it requires JavaScript to work (not a big issue these days, except for the paranoid who use no-script browser extensions), and web crawlers may still have issues properly indexing AJAX-heavy web pages.
Wait until ASP.NET Core 1.2 (scheduled for Q1-Q2 2017), which will add Razor Pages: pages rendered with the Razor template engine, but without the need for a controller.
1.2
WebSockets
SignalR
Razor Pages (Views without MVC controllers)
Web API security
If you don't want to wait, try RazorLight, which is a third-party open-source library for rendering Razor views.
But all except the first one require some "magic template engine".
You could of course write a server-side includes (SSI) middleware, based on the UseStaticFiles middleware, that parses the file and includes the HTML files server-side before returning it. There is nothing out of the box for it as far as I know.
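The core of such a server-side include step is small. Here is a language-agnostic sketch in Python (not ASP.NET middleware): it scans a page for include directives and splices in the referenced files before the page is returned. The directive syntax, the file contents, and the `expand_includes` helper are assumptions for illustration.

```python
import re

# Assumed directive syntax, modelled on classic SSI: <!--#include file="name" -->
INCLUDE_DIRECTIVE = re.compile(r'<!--#include file="([^"]+)" -->')


def expand_includes(page, read_file):
    """Replace each include directive with the content returned by read_file(name)."""
    return INCLUDE_DIRECTIVE.sub(lambda match: read_file(match.group(1)), page)


# In-memory stand-in for files under wwwroot (hypothetical content).
files = {
    "start.html": '<nav>menu</nav>\n<main><!--#include file="uno.html" --></main>',
    "uno.html": "<p>Uno</p>",
}

result = expand_includes(files["start.html"], files.__getitem__)
print(result)
```

A real middleware would also need to restrict which paths may be included and decide how to handle missing files; the sketch ignores both.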

What does #/ mean in a URL?

I am working on RoR web apps. My webpage URL looks like this:
http://dev.ibiza.jp:3000/facebook/report?advertiser_id=2102#/dashboard
I understand that advertiser_id is 2102, but I can't work out what #/dashboard is pointing to.
The portion of the URL which follows the # symbol is not normally sent to the server in the request for the page. If you open your web inspector and watch the request for the page, you will see that the #/dashboard portion is not included in the request at all.
On a normal (basic HTML) web page, the # symbol can be used to link to a section within the page, so that the browser jumps down to that section after the page loads.
In fancy JavaScript-heavy web applications, the # symbol is commonly followed by more URL paths, for example www.example.com/some-path#/other-path/etc. The other-path/etc portion of the URL is not seen by the server, but it is available for JavaScript to read in the browser and, presumably, to display something different based on that path.
So in your case, the first part of the URL is a request to the server:
http://dev.ibiza.jp:3000/facebook/report?advertiser_id=2102
and the second part of the URL could be for Javascript to display a specific view of the page once it has loaded:
#/dashboard
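The split can be checked with Python's standard URL parser. This sketch only takes the question's URL apart; the `request_target` variable is a name chosen here for the part that would appear in the HTTP request line.

```python
from urllib.parse import parse_qs, urlsplit

url = "http://dev.ibiza.jp:3000/facebook/report?advertiser_id=2102#/dashboard"
parts = urlsplit(url)

# Path plus query string: this is what the browser sends to the server.
request_target = parts.path + ("?" + parts.query if parts.query else "")
params = parse_qs(parts.query)

print(request_target)           # /facebook/report?advertiser_id=2102
print(params["advertiser_id"])  # ['2102']
print(parts.fragment)           # /dashboard (kept in the browser only)
```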
The # symbol is also used to create a Fragment Identifier and is also typically used to link to a specific piece of content within a web page (such as to cause the browser to jump down to a particular section on the page).
As others have mentioned, this has SEO implications. In order to index pages such as this, you may have to employ different techniques to allow the content that is "behind the # symbol" to be accessible to search engines.
The # symbol is called an anchor; it jumps to a specific position on the HTML page.
It's a crawling technique; you could read more here.
Providing another example
Here's a request to github for the sourcecode of a java class
https://github.com/spring-cloud/spring-cloud-consul/blob/master/spring-cloud-consul-discovery/src/main/java/org/springframework/cloud/consul/serviceregistry/ConsulServiceRegistry.java
By appending this with "#L90" the web browser will make the same request, and then scroll to line 90 and highlight the code.
https://github.com/spring-cloud/spring-cloud-consul/blob/master/spring-cloud-consul-discovery/src/main/java/org/springframework/cloud/consul/serviceregistry/ConsulServiceRegistry.java#L90
Your web browser made the same request to the github server, but in the anchored case, performed the additional action of highlighting the selected line after the response was received.
What comes after # is the hash of the location; the ! that follows is used by search engines to help index AJAX content. After that can come anything, but it is usually written to look like a path (hence the /).

Get rendered source code from web components site?

I just tried something rather trivial: get the source code of a web page (by saving it) and count how often a certain phrase occurs in the code.
Turns out, it doesn't work if that page uses Polymer / web components. Is this a browser bug?
Try the following: Go to http://www.google.com/design/icons/ and try to find star_half in the code (the last icon on the page). If you inspect the element inside of Chrome or Firefox, it will lead you to
<i class="md-icon dp48">star_half</i>
but this won't be in the source if you copy the root node or save the html to disk.
Is there a way to get the entire code?
The reason for this behavior is probably how source viewing (and source saving as well?) works in the browser, combined with the fact that shadow roots are attached to web components on the client side.
When you press Ctrl+U on a web page, the browser essentially makes a network call to the same URL again to fetch a copy of what the server returned for that URL.
In this case, when the page renders, the browser identifies the icons-layout component and then executes code to attach a shadow root to this node. All of this happens once the page has reached the client (the browser).
When you try to save this page, you are saving what the server returned, not the current state of the page. You'll see the same behavior if you fire up the Chrome console and try to save an icons-layout node.
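The difference can be illustrated with the phrase count from the question. In this Python sketch, `fetch_raw_html` does what view source does (a plain request for the URL, with no scripts run), and the two HTML strings are made-up stand-ins for the server response and for the markup after client-side code has run.

```python
from urllib.request import urlopen


def fetch_raw_html(url):
    """Fetch the HTML exactly as the server returns it; no scripts are executed."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def count_phrase(html_text, phrase):
    """Count non-overlapping occurrences of phrase in the markup."""
    return html_text.count(phrase)


# Hypothetical markup: what the server sends versus what exists after rendering.
server_html = (
    "<html><body><icons-layout></icons-layout>"
    '<script src="app.js"></script></body></html>'
)
rendered_html = server_html.replace(
    "<icons-layout></icons-layout>",
    '<icons-layout><i class="md-icon dp48">star_half</i></icons-layout>',
)

print(count_phrase(server_html, "star_half"))    # 0
print(count_phrase(rendered_html, "star_half"))  # 1
```

Calling `count_phrase(fetch_raw_html(url), phrase)` on a client-rendered page would therefore only ever count what is in the server response.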
Is there a way to get the entire code?
I don't know how to do it from the browser, but PhantomJS provides a way to save client-side rendered HTML.

How to open an HTTPS (X-Frame-Options) website in an iframe or any HTML page

I have an app which loads data from a database. In a table I'm storing some URLs, e.g. https://facebook.com. Note that these URLs are dynamic and are controlled from the admin panel.
Now I need to get the contents of these URLs and display them inside an iframe or a div within my app. The idea is that the user should not leave my app.
When I tried to load https://facebook.com, it never loaded because they have X-Frame-Options enabled.
Is there any solution for this?
You cannot tell the browser to ignore the security instructions provided by the third party site. That would defeat the object of having them in the first place.
If you want to display the content on your site, then you will have to display it from your own server (e.g. by using a server side process to read the data from the third party site and serve it from your own). Obviously, this will mean that you cannot (for example) load Facebook using the user's own credentials.
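Since the URLs come from an admin panel, one option is to check a site's response headers when a URL is saved, so the admin can be warned that it cannot be embedded. This is a minimal Python sketch that looks only at the X-Frame-Options header; the `framing_blocked` helper is hypothetical, and how the headers are obtained (for example, from a server-side request) is left out.

```python
def framing_blocked(headers):
    """Return True if X-Frame-Options tells browsers not to frame the page
    from another site (DENY or SAMEORIGIN)."""
    # Header names are case-insensitive, so normalize before looking them up.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("x-frame-options", "").strip().lower()
    return value in ("deny", "sameorigin")


# Hypothetical header sets, for illustration only.
print(framing_blocked({"X-Frame-Options": "DENY"}))        # True
print(framing_blocked({"x-frame-options": "SameOrigin"}))  # True
print(framing_blocked({"Content-Type": "text/html"}))      # False
```

Sites can also restrict framing through other headers, so a False result here is not a guarantee that the page will load in an iframe.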