Google Search Console can't index my React Webapp - google-cloud-functions

I have a web app built with React, hosted on Firebase Hosting, and served through Cloud Functions.
To serve the index.html from a Cloud Function, I added a rewrite to firebase.json:
"rewrites": [
{
"source": "**",
"function": "[functionThatReturnsIndex.html]"
}
],
I export the build and then serve index.html from a function, which works fine.
I do this so I can dynamically set the META tags for each page.
The site loads fine, and when I share a link on a site like Twitter/LinkedIn/Facebook etc., the META information those sites fetch is also correct and as expected.
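For illustration, a stripped-down sketch of the kind of function behind that rewrite (the export name, the <!--META--> placeholder, and the tag values are simplified here, not my exact code):

const functions = require('firebase-functions');
const fs = require('fs');
const path = require('path');

exports.serveIndex = functions.https.onRequest((req, res) => {
  // Load the built index.html that was exported alongside the function.
  const html = fs.readFileSync(path.join(__dirname, 'index.html'), 'utf8');

  // Swap a placeholder for per-route META tags (placeholder and values are made up).
  const meta = `<meta property="og:title" content="Title for ${req.path}">`;

  // Return a plain 200 so crawlers never see a redirect.
  res.status(200).send(html.replace('<!--META-->', meta));
});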
However, Search Console reports a redirect error when it tries to index my site.
It suggested using Lighthouse in the Chrome DevTools to test what's wrong, and I got decent results.
This is my robots.txt:
# https://www.robotstxt.org/robotstxt.html
User-agent: *
Disallow:
I'm not sure what else to try, or why it's not indexing. Any help would be appreciated.
Let me know if I can provide more information to make the solution more obvious, thank you.
Update 1:

There are two types of sitemaps: static and dynamic.
In your case, you need a dynamic sitemap so that dynamically created content in your React app is discovered automatically.
I suggest using https://www.npmjs.com/package/react-dynamic-sitemap
to implement a dynamic sitemap and submitting the resulting sitemap URL to Google.
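If you'd rather not pull in that package, a minimal hand-rolled sketch of a dynamic sitemap served from a Cloud Function could look like this (the route list and domain are placeholders):

const functions = require('firebase-functions');

// Placeholder route list; in practice this would come from your data source.
const routes = ['/', '/about', '/posts/my-first-post'];

exports.sitemap = functions.https.onRequest((req, res) => {
  const urls = routes
    .map((p) => `  <url><loc>https://your-domain.example${p}</loc></url>`)
    .join('\n');
  const xml = '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    urls + '\n</urlset>';
  res.set('Content-Type', 'application/xml');
  res.status(200).send(xml);
});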
Hope this helps you in fixing this error.

Related

How to combine Next.js SSG pages with Firebase Hosting and CSR routing for some pages

I want to publish my site on Firebase Hosting.
I'm using static generation (SSG) in Next.js to build the pages.
However, there are some pages I want to route dynamically, like a blog.
The URL looks like this: "blog/[slug]".
Reloading on a page other than the top page returned a 404, so trailingSlash: true is set in next.config.js.
When dynamic routing is used with SSG, each page is exported as a file such as "blog/[slug]/index.html".
Navigating there from the top page works fine, but
reloading directly on that URL still returns a 404.
One way to solve this is to detect that the blog has been updated, rebuild with a webhook or similar, and redeploy.
There are various write-ups on how to do this, but rebuilding every time is impractical because the update frequency is high.
Next, I'm thinking of doing CSR (client-side rendering) only for the blog part of the SSG site.
Can that part be routed with a regular React Router?
I tried to use React Router for only some pages, but I get an error because React Router doesn't run on the server side.
Is this second approach feasible? A sketch of the kind of thing I mean is below.
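For example, something along these lines is what I have in mind, using next/dynamic with ssr disabled instead of React Router (BlogClientRouter is a made-up component that would do its own client-side routing):

// A page that renders the blog entirely on the client (sketch only).
import dynamic from 'next/dynamic';

const BlogClientRouter = dynamic(() => import('../components/BlogClientRouter'), {
  ssr: false, // skip server-side/static rendering for this part
});

export default function BlogPage() {
  return <BlogClientRouter />;
}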
If you have any other solutions to this problem, please let me know.
PS: Firebase Functions cannot be used due to a cold start issue...
Best regards.
It seems it was solved just by adding the rewrite setting below.
Thank you to everyone who looked at the question.
"rewrites": [
{
"source": "/notice/**",
"destination": "/notice/[slug]/index.html"
}

WordPress wp-json not found on server -- localhost

Here's the sitch:
I downloaded and installed WampServer64 and WordPress 5.2.3.
I finally made it to my site, but I can't preview or publish pages with the new Gutenberg or block editor because something is broken! When I edit with the Classic Editor, it's all good.
Here's the notice I get from the Site Health plugin:
The REST API is one way WordPress, and other applications, communicate with the server. One example is the block editor screen, which relies on this to display, and save, your posts and pages.
The REST API call gave the following unexpected result: (404)
Not Found
The requested URL /wordpress/wp-json/wp/v2/types/post was not found on this server.
Apache/2.4.39 (Win64) PHP/7.3.5 Server at sitefolder Port 80
I have scoured the internet on how to fix this but so far nada. Help much appreciated!
You have to add index.php after your folder name in the URL in order to make WordPress API work on the localhost.
For example: localhost/trial-wordpress/index.php/wp-json/wp/v2/posts
It worked for me.
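If you want a quick way to verify, a snippet like this (run from a page on the same localhost site; the /wordpress/ folder name is taken from the error above, adjust it to yours) should log posts instead of failing with a 404 once the path is right:

// Run from the browser console on a page of the same localhost site.
fetch('http://localhost/wordpress/index.php/wp-json/wp/v2/posts')
  .then((res) => res.json())
  .then((posts) => console.log('REST API reachable,', posts.length, 'posts'))
  .catch((err) => console.error('REST API still unreachable:', err));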
If you're still working on it, try the suggestions from this post: WordPress REST API (wp-api) 404 Error. Even though it's quite old, there are some ideas you could try, like switching your permalink structure to something other than Plain, checking mod_rewrite under Apache, and so on. The Classic Editor doesn't use the REST API, which is why it shows no error.

How can I get Facebook's open graph scraper to successfully parse my single-page website hosted via GitHub Pages?

A friend of mine set up a simple one-page website and asked me to help integrate Open Graph metadata so that sharing on Facebook provides a better user experience.
Unfortunately, Facebook doesn't recognize some values, and Facebook's URL Debugger doesn't really help: it shows content from the registrar by default, and when I click the Fetch new scrape information button it fails with the error message Error parsing input URL, no data was cached, or no data was scraped. Also, when I click on See exactly what our scraper sees for your URL, I get the following error: Document returned no data.
The URL is: http://know-your-limits.com/. The registrar is Gandi and the site is hosted on GitHub. The DNS configuration is as follows:
dig know-your-limits.com +nostats +nocomments +nocmd
; <<>> DiG 9.8.3-P1 <<>> know-your-limits.com +nostats +nocomments +nocmd
;; global options: +cmd
;know-your-limits.com. IN A
know-your-limits.com. 10771 IN A 192.30.252.153
Is there something I could do to fix this via something I have control over (i.e. registrar configuration, GitHub repository updates, HTML updates), as opposed to things I don't have control over (i.e. the GitHub web server)?
Do you think it is a bug with GitHub hosting?

Driving chrome to measure page load time

I have a bunch of URLs and I am trying to see what the page load time (PLT) is for those URLs in Chrome on Windows. Now there are many ways to do this, but what I want is to automate the process so that Chrome can read the URLs I want to measure the PLT for from somewhere and output the results somewhere else, maybe to another file.
Is there any tool I can make use of here? Or perhaps write a plugin that can read from a file when I start chrome and do this job for me? I am not sure how simple or complicated this can get, since I have no experience in this.
One way I can think of is to add a plugin that can measure the PLT in chrome, write a batch file which contains commands to invoke chrome and open the URLs in separate tabs. However, with this I still have to manually look at the PLT and record them, and I wish to automate this too.
Any help would be appreciated.
""Chrome doesn't technically allow you to access the local file system, but you might be able to do it with this: https://developer.chrome.com/extensions/npapi.html.
Another approach is to send the data to another web location via an API. The Google Drive API comes to mind: https://developers.google.com/drive.
You may already be aware that analysis of the pages can be done via a content script. Simply inject the JavaScript code or libraries you need into the pages the user opens, via the manifest file, something like this:
"content_scripts": [
{
"matches" : [
"<all_urls>"
],
"js" : [
"some_content_script.js"
]
}
],
You'll also need to add "<all_urls>" to the permissions section of the manifest file.
The load time calculation could simply be accomplished with a timer that starts at the beginning of the page load (as soon as the script is injected) and ends on the window load event.
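As a rough sketch of that idea (where the measurement ends up afterwards is left open here), the content script could look something like this:

// some_content_script.js -- rough sketch of the timer idea above.
const scriptInjectedAt = performance.now();

window.addEventListener('load', () => {
  const loadTimeMs = performance.now() - scriptInjectedAt;
  // Hand the result to the rest of the extension (e.g. a background page) for logging.
  chrome.runtime.sendMessage({ url: location.href, loadTimeMs: loadTimeMs });
});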
Sounds like a pretty useful extension to be honest!
There are a couple of ways you could approach this:
Use WebPageTest: either get an API key for the public instance, or install your own private instance (http://andydavies.me/blog/2012/09/18/how-to-create-an-all-in-one-webpagetest-private-instance/).
Drive Chrome via its remote debugging API; Andrea provides an example of how to use the API to generate HAR files, but your case would be simpler: https://github.com/andydavies/chrome-har-capturer (see the sketch below).
You could also probably hack this Chrome extension to post the times to a remote site via a background window: https://chrome.google.com/webstore/detail/page-load-time/fploionmjgeclbkemipmkogoaohcdbig
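For the remote debugging route, here's a minimal sketch assuming the chrome-remote-interface npm package and Chrome started with --remote-debugging-port=9222 (the URL and the output handling are placeholders):

const CDP = require('chrome-remote-interface');

async function measure(url) {
  const client = await CDP(); // connects to localhost:9222 by default
  const { Page } = client;
  try {
    await Page.enable();
    const start = Date.now();
    await Page.navigate({ url: url });
    await Page.loadEventFired(); // resolves when the page's load event fires
    console.log(url, Date.now() - start, 'ms');
  } finally {
    await client.close();
  }
}

measure('https://example.com');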

Is it possible to configure Sinatra .erb templates for offline using cache.manifest?

I've looked around at various posts on the web, but it looks like it's all only for static .html files. Mephisto and rack-offline looked like they could be useful, but I couldn't figure out if they could help with Sinatra templates.
My views/index.erb has 3 get do's (/part1, /part2, /part3) which hold HTML output; it would be great if they could be cached for offline use. Any pointers?
I'll try to answer your question as best I can. I guess with "My views/index.erb has 3 get do's", you mean you have three routes in your application, /part1, /part2, and /part3, respectively. Those three routes are processed using ERB templates and return HTML. Now you'd like to put them into a cache manifest for offline use.
First of all: For the client, it doesn't matter if the resource behind a URL is generated dynamically or if it is a static file. You could just put part1 (notice the missing slash) into your manifest and be done.
The effect would be that a client requests /part1 just once, and then uses the cached version until you update your manifest.
Here's the catch: If you process ERB templates, you obviously have something dynamic in the response. And that's why I don't get why you'd want to cache the response.
Don't get me wrong: There might be perfectly good reasons why you want to do this. And I don't see any reason why you can't put routes to dynamic resources into your cache manifest.