How to intercept *NON* http/s requests in puppeteer

I'm trying to scrape the following html:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>click.com.cn</title>
<script >
window.location.href='weixin://dl/business/?t=111111'
</script>
</head>
<body>
</body>
</html>
Since weixin is a custom protocol, the browser cannot navigate to this URL.
However, it also causes the Puppeteer page to get stuck on the screenshot request.
I have tried to intercept this weixin:// request, but I noticed that the request interceptor only intercepts http/s requests.
My code:
await page.setRequestInterception(true);
page.on("request", async (request: Request) => {
  const url = request.url();
  // request.abort etc...
});
Is there an option in Puppeteer or in the Chrome DevTools Protocol to intercept all kinds of protocols?
I also tried to override the window.location.href property, but that failed.
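Since the interceptor described above only reports http/s requests, one possible workaround is to inspect the HTML source for custom-scheme redirects before handing the page to the browser. The following is a minimal sketch under that assumption: `findCustomSchemeRedirects` is a hypothetical helper, not part of Puppeteer or the Chrome DevTools Protocol, and it only recognises simple `location.href = '...'` assignments like the one in the page above.

```javascript
// Hypothetical helper (not a Puppeteer API): list the non-http(s) schemes
// that a page tries to redirect to via `location.href = '...'`.
function findCustomSchemeRedirects(html) {
  const pattern = /location\.href\s*=\s*['"]([a-z][a-z0-9+.-]*):\/\/[^'"]*['"]/gi;
  const schemes = [];
  for (const match of html.matchAll(pattern)) {
    const scheme = match[1].toLowerCase();
    // Keep only schemes the request interceptor would not report.
    if (scheme !== "http" && scheme !== "https") {
      schemes.push(scheme);
    }
  }
  return schemes;
}

// The redirect from the page in the question.
const sample =
  "<script>window.location.href='weixin://dl/business/?t=111111'</script>";
const found = findCustomSchemeRedirects(sample);
console.log(found); // [ 'weixin' ]
```

A regex like this will miss redirects built dynamically or set through other properties, so it is a pre-check at best, not a replacement for real interception.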

Related

ASP.NET Core 6 Web app with index.html using <base href> would not load script files

I have an ASP.NET Core 6 app and I host my Angular app with it, among other things (from Program.cs):
...
app.Use(async (context, next) =>
{
    RewriteXFrameOptionsHeader(context);
    await next();
    if (context.Response.StatusCode == 404 && !Path.HasExtension(context.Request.Path.Value))
    {
        context.Request.Path = "/";
        await next();
    }
});
app.UseDefaultFiles(new DefaultFilesOptions {DefaultFileNames = new List<string> {"index.html"}});
app.UseStaticFiles();
...
It doesn't work, because the scripts are not loaded. The Angular index.html uses base href and relative paths to the js sources, like so:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<base href="/ngxapp">
...
<script src="runtime.7cef1b4acdcbe752.js" type="module"></script><script src="polyfills.e4b5afbd657fbe4a.js" type="module"></script><script src="main.c8a9bcf210ef6760.js" type="module"></script>
</body>
</html>
Even though the script src attributes contain relative paths, the browser tries to load the scripts from the root. Here's what the dev tools Network tab shows:
Request URL: https://localhost:7101/runtime.7cef1b4acdcbe752.js
The MDN docs for base href state:
The <base> HTML element specifies the base URL to use for all relative URLs in a document.
Why, then, are the scripts being loaded from the root?
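The behaviour can be reproduced outside the browser with the standard URL constructor, which applies the same relative-URL resolution rules. This is a small sketch using the host and file name from the Network tab above; the trailing-slash variant is a hypothetical added only for comparison.

```javascript
// Resolve a relative script path against a base URL, the way a browser
// resolves it against <base href>. Host and file name come from the question.
const origin = "https://localhost:7101";
const script = "runtime.7cef1b4acdcbe752.js";

// Base without a trailing slash: "ngxapp" is treated as the last path
// segment (a file), so it is dropped and the script resolves to the root.
const withoutSlash = new URL(script, new URL("/ngxapp", origin)).href;

// Hypothetical variant with a trailing slash: "ngxapp/" is a directory,
// so the script resolves inside it.
const withSlash = new URL(script, new URL("/ngxapp/", origin)).href;

console.log(withoutSlash); // https://localhost:7101/runtime.7cef1b4acdcbe752.js
console.log(withSlash);    // https://localhost:7101/ngxapp/runtime.7cef1b4acdcbe752.js
```

The first result matches the Request URL shown in the Network tab.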

ASP.NET Core - serve different HTML file for SPA?

Question
How can I serve different HTML (entry) files for an SPA application (Vue) in ASP.NET Core?
Explanation
Depending on a condition, I would like to serve a different HTML page (much like a controller would do for a non-SPA). The page would still include the entry point for Vue apps <div id="app">, but some other changes should be done before serving the HTML.
I know I somehow have to change the startup.cs file, because that is what serves the HTML, via app.UseStaticFiles() and app.UseSpaStaticFiles().
Example
Condition 1 is fulfilled, base.html is served from client -> public -> base.html
Condition 2 is fulfilled instead, special.html is served from client -> public -> special.html
Code
The basic HTML looks something like this:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width,initial-scale=1.0">
<link rel="icon" href="<%= BASE_URL %>favicon.ico">
<title>Title</title>
</head>
<body>
<noscript>
<strong>We're sorry but this webpage doesn't work properly without JavaScript enabled. Please enable it to
continue.</strong>
</noscript>
<div id="app"></div>
<!-- built files will be auto injected -->
</body>
</html>
The important parts of startup.cs look like this:
services.AddSpaStaticFiles(configuration =>
{
configuration.RootPath = "ClientApp/dist";
});
// ....
app.UseStaticFiles();
app.UseSpaStaticFiles();
// ....
app.UseEndpoints(endpoints =>
{
endpoints.MapControllerRoute(
name: "default",
pattern: "{controller}/{action=Index}/{id?}");
if (env.IsDevelopment())
{
endpoints.MapToVueCliProxy(
"{*path}",
new SpaOptions { SourcePath = "ClientApp" },
npmScript: "serve",
regex: "Compiled successfully");
}
// Add MapRazorPages if the app uses Razor Pages. Since Endpoint Routing includes support for many frameworks, adding Razor Pages is now opt-in.
endpoints.MapRazorPages();
});
// ....
app.UseSpa(spa =>
{
spa.Options.SourcePath = "ClientApp";
});

Meaning of the value true in myRequest.open("GET", url, true)?

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Example</title>
<script type="text/javascript">
var myRequest = new XMLHttpRequest();
function loadXMLDoc(url)
{
    myRequest.onreadystatechange = onResponse;
    myRequest.open("GET", url, true);
    myRequest.send();
}
function onResponse()
{
    if(myRequest.readyState == 4 && myRequest.status == 200)
        document.getElementById("A").innerHTML = myRequest.responseXML.documentElement.getElementsByTagName("color")[1].childNodes[0].nodeValue;
}
</script>
</head>
<body>
<p>Response: <span id="A"></span></p>
<button onclick="loadXMLDoc('example.xml')">Get Color</button>
</body>
</html>
a. What is the meaning of the value true in myRequest.open("GET", url, true)?
b. What values exist for the myRequest.readyState field, and what do they mean?
c. After the Get Color button is clicked, which portion of the example.htm code is updated?
d. If the XML file being accessed to get the color data has the following code:
<color_list>
<color>Red</color>
<color>Green</color>
<color>Blue</color>
</color_list>
sketch the Web browser page display of the example.htm page after the Get Color button is pressed and the page is updated.
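As a reference for parts a, b and d: the third argument of open() is the async flag, so true requests an asynchronous call, and readyState takes the five standard values listed below. The sketch does not run a real request; the `colors` array is only a stand-in for the <color> elements of the sample XML, used to show which entry the zero-based index [1] selects.

```javascript
// The five readyState values defined for XMLHttpRequest.
const READY_STATES = {
  0: "UNSENT: open() has not been called yet",
  1: "OPENED: open() has been called",
  2: "HEADERS_RECEIVED: send() has been called and headers are available",
  3: "LOADING: the response body is being received",
  4: "DONE: the operation is complete",
};

// Stand-in for the <color> elements of the sample XML, in document order.
const colors = ["Red", "Green", "Blue"];

// getElementsByTagName("color")[1] is zero-based, so it picks the second entry.
const selected = colors[1];

console.log(READY_STATES[4]); // DONE: the operation is complete
console.log(selected);        // Green
```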

ASP Classic XMLHTTP GET JSON

I am trying to retrieve the output from a URL using XMLHTTP GET:
The output in the browser when I hit the url directly is the following:
{
"Titles": {
"resultCount": 37680,
"moreResources": true
}
}
The ASP code on test.asp I am using is:
<%@language=JScript%>
<%
var objSrvHTTP;
objSrvHTTP = Server.CreateObject ("Msxml2.ServerXMLHTTP.6.0");
objSrvHTTP.open ("GET","http://someipaddress:8080/Publisher/Titles/Paging/0,0,tc?output=json", false);
objSrvHTTP.send ();
Response.ContentType = "application/json";
Response.Write (objSrvHTTP.responseText);
%>
The result displayed in the browser when hitting test.asp is:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>something</title>
</head>
<body>
{
"Titles": {
"resultCount": 37698,
"moreResources": true
}
}
</body>
</html>
I am looking to have just the data between the body tags returned, or even better just the value for "resultCount". Any help would be much appreciated.
You need to remove the HTML markup. When reading JSON data, the response should contain nothing but valid JSON.
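If the HTML wrapper cannot be removed at the source, the JSON can still be recovered from the text between the body tags. Below is a minimal sketch of that idea in modern JavaScript; `extractResultCount` is a hypothetical helper, and it relies on `JSON.parse`, which the classic ASP JScript engine may not provide, so treat it as an illustration of the logic rather than drop-in ASP code.

```javascript
// Hypothetical helper: pull the JSON out of an HTML-wrapped response and
// return Titles.resultCount. Assumes at most one <body> element.
function extractResultCount(responseText) {
  const match = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(responseText);
  // Fall back to the raw text when the response is already bare JSON.
  const jsonText = match ? match[1] : responseText;
  return JSON.parse(jsonText).Titles.resultCount;
}

// A shortened version of the wrapped response shown in the question.
const wrapped = [
  "<html><head><title>something</title></head>",
  "<body>",
  '{ "Titles": { "resultCount": 37698, "moreResources": true } }',
  "</body></html>",
].join("\n");

console.log(extractResultCount(wrapped)); // 37698
```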

Ooyala player setembedcode

I am trying to implement Ooyala's player in my code. I was told that if I want to use buttons to switch the content of the video player, I should use the setEmbedCode function, but the examples on their site aren't very clear.
What I want is simply a link that, when clicked, changes the video to a different URL/embed code. I've tried using setQueryStringParameters:
document.getElementById('video-player'+pageNum).setQueryStringParameters({embedCode:videoURL})
All I get with that is an 'is not a function' message.
var url = 'http://player.ooyala.com/player.js?embedCode='+videoURL+'&targetReplaceId=video-player'+pageNum+'';
var tempScript = document.createElement('script');
tempScript.type = 'text/javascript';
tempScript.src = url;
When I call this it creates the video player just fine, but I'm not sure how to change the embed code once it's created.
Check this sample code from the Ooyala site. "SwitchMovie" will play a different video with a different embed code.
http://demo.ooyala.com/product-demos/playerScripting-demo.html
document.getElementById('player').setQueryStringParameters({embedCode:'8wNTqa-6MkpEB1c7fNGOpoSJytLptmm9',hide:'share,fullscreen'})
UPDATE:
The following code works perfectly for me. Try it; as I mentioned in my comments below, you need a callback function if you want to interface with the player.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Swap Video</title>
</head>
<body>
<script src="http://player.ooyala.com/player.js?callback=receiveOoyalaEvent&playerId=player&width=480&height=360&embedCode=llMDQ6rMWxVWbvdxs2yduVEtSrNCJUk1&version=2"></script>
<script>
function receiveOoyalaEvent(playerId, eventName, eventArgs) {
}
</script>
<br><br>
<button onclick="document.getElementById('player').setQueryStringParameters({embedCode:'8wNTqa-6MkpEB1c7fNGOpoSJytLptmm9',hide:'share,fullscreen'})">Switch Movie</button>
</body>
</html>