I am making changes to an existing web page in Yii2.
I had this section of code:
Html::a('Confirm!',[
'default/apply',
'confirm' => 1,
'id' => $data->id
],['class' => 'btn-primary','data-method' => 'post'])
I have moved this to a different container on the same page.
(I had to adjust it slightly, changing $data->id to $projectInfo->id, because it used to be inside an anonymous function within a widget and is now inside a foreach loop. I assume this is not relevant.)
Both before and after the change, the same line is present in the rendered HTML (but in a different part of the page):
<a class="btn-primary" href="/participant/default/apply/13/1" data-method="post">Confirm!</a>
But when the link is clicked, the HTTP request is now sent as GET instead of POST.
BEFORE: "POST /participant/default/apply/13/1 HTTP/1.1"
NOW: "GET /participant/default/apply/13/1 HTTP/1.1"
I cannot figure out why this changed or how to make the link send POST in its new location. The behaviour of this link must depend on some additional factor that I am not aware of.
A link can send a POST request only because of JavaScript in the yii.js file, which silently builds and submits a form when a link with data-method is clicked. If this script is not loaded on the page, the link behaves like a normal link and sends a GET request.
Check that yii.js is loaded (usually by registering yii\web\YiiAsset, either directly or as a dependency of another asset bundle).
I'm trying to write a spider that will automatically log in to this website. However, when I try using scrapy.FormRequest.from_response in the shell I get the error:
No <form> element found in <200 https://www.athletic.net/account/login/?ReturnUrl=%2Fdefault.aspx>
I can definitely see the form when I inspect element on the site, but it just did not show up in Scrapy when I tried finding it using response.xpath() either. Is it possible for the form content to be hidden from my spider somehow? If so, how do I fix it?
The form is created by JavaScript; it is not part of the static HTML source code. Scrapy does not execute JavaScript, so the form cannot be found in the response.
The relevant part of the static HTML (where they inject the form using Javascript) is:
<div ng-controller="AppCtrl as appC" class="m-auto pt-3 pb-5 container" style="max-width: 425px;">
<section ui-view></section>
</div>
To find issues like this, I would either:
compare the source shown by "View Source" with what "Inspect" shows
browse the web page with JavaScript disabled (when I develop scrapers I usually have one browser with JavaScript enabled for research and documentation, and another one for checking web pages without JavaScript)
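The first check can also be automated. The following is a minimal sketch using only the Python standard library; the names FormCounter and count_forms are made up for this illustration and are not part of Scrapy. It counts form elements in a piece of static HTML, here the fragment quoted above:

```python
from html.parser import HTMLParser


class FormCounter(HTMLParser):
    """Counts <form> start tags in static HTML (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.forms = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag == "form":
            self.forms += 1


def count_forms(html: str) -> int:
    """Return how many <form> elements appear in the given HTML text."""
    parser = FormCounter()
    parser.feed(html)
    parser.close()
    return parser.forms


# The static fragment from the page: the form is injected later by JavaScript.
STATIC_HTML = """<div ng-controller="AppCtrl as appC" class="m-auto pt-3 pb-5 container" style="max-width: 425px;">
<section ui-view></section>
</div>"""

print(count_forms(STATIC_HTML))  # prints 0
```

If this prints 0 for response.text while the browser inspector shows a form, the form is almost certainly generated client-side.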
In this case, you have to manually create your FormRequest for this web page. I was not able to spot any form of CSRF protection on their form, so it might be as simple as:
FormRequest(url='https://www.athletic.net/account/auth.ashx',
formdata={"e": "foo#example.com", "pw": "secret"})
However, I suspect formdata will not work here, because the endpoint appears to expect JSON. I am not sure whether FormRequest can handle this, so you probably want a standard Request with a JSON body.
Since they rely heavily on JavaScript on the front end, you cannot use the source code of the page to find these parameters either. Instead, I used the developer console of my browser and checked the request and response that occurred when I tried to log in with invalid credentials.
This gave me:
General:
Request URL: https://www.athletic.net/account/auth.ashx
[...]
Request Payload:
{e: "foo#example.com", pw: "secret"}
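To illustrate why the distinction matters, here is a small sketch using only the Python standard library (not Scrapy) that builds the same login request once as a form-encoded body and once as a JSON body. The function names and the credential values are placeholders for this example, and nothing is sent over the network:

```python
import json
import urllib.request
from urllib.parse import urlencode

LOGIN_URL = "https://www.athletic.net/account/auth.ashx"
# Placeholder credentials, not real values.
CREDENTIALS = {"e": "user@example.com", "pw": "secret"}


def build_form_request(url, fields):
    """POST with an application/x-www-form-urlencoded body (what formdata sends)."""
    body = urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def build_json_request(url, fields):
    """POST with an application/json body (what the browser sent here)."""
    body = json.dumps(fields).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


form_req = build_form_request(LOGIN_URL, CREDENTIALS)
json_req = build_json_request(LOGIN_URL, CREDENTIALS)
print(form_req.data)  # key=value pairs joined with &
print(json_req.data)  # a JSON object
```

A server that parses the body as JSON will generally reject the first variant, which is why the request payload seen in the developer console is the thing to imitate.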
Scrapy has a JsonRequest class to help with posting JSON; see the requests and responses documentation [https://docs.scrapy.org/en/latest/topics/request-response.html].
So something like the following should work:
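from scrapy.http import JsonRequest

# Inside a spider callback (e.g. parse); url is the login endpoint
data = {"password": "pword", "username": "user"}
# JSON POST to API login URL
return JsonRequest(
    url=url,
    callback=self.after_login,
    data=data,
)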
How can one send <head> contents before the controller finishes? The idea is to start loading CSS as soon as possible (don't wait for controller action).
Sample scenario:
// in the controller
sleep(5);
This gives:
blank page for 5 seconds -> display the head -> start loading CSS -> body
The flow I want to get is:
Send head -> start loading CSS -> wait for the controller -> send rest of the page (body)
The <head> is now in layout.phtml, which later includes the index controller script (index.phtml).
Maybe I could have <head> as a partial and send it somehow before the whole layout?
One approach is to create an abstract controller that all controllers extend, and in the onDispatch function render the head template and flush:
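public function onDispatch(MvcEvent $e) {
    // Render only the <head> template, before the action runs
    $renderer = $this->getServiceLocator()->get('ViewRenderer');
    $head = new ViewModel();
    $head->setTemplate('path/to/head.phtml');
    // Send the rendered head to the client right away
    echo $renderer->render($head);
    flush();
    // Continue with the normal dispatch, which runs the action
    return parent::onDispatch($e);
}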
Drawbacks to this approach:
You have no access to the headTitle, headMeta, headLink, headScript and other view helpers elsewhere in your application (it is possible in a controller or viewscript to add a style sheet and js plugin for just that page).
You will be unable to perform redirects as a response has already been sent
You can't gzip the content as well as flushing it
Some versions of Microsoft Internet Explorer will only start to display the page after they have received 256 bytes of output, so you may need to send extra whitespace before flushing to get those browsers to display the page.
In theory, you could use this approach to send all static content in the layout before echoing $this->content, such as the logo, navigation, search bar, and so on.
As I've stated, this breaks redirects, meaning helpers and plugins such as PostRedirectGet will not work.
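The early-flush idea itself is not specific to Zend Framework. As a language-neutral illustration, here is a minimal sketch of it as a Python WSGI application; the names early_head_app, slow_controller, HEAD and the demo.delay key are made up for this example. The head is yielded as its own chunk before the slow work starts, so a server that transmits chunks as they are produced can let the browser begin fetching the CSS early:

```python
import time

# Hypothetical head markup; the stylesheet path is a placeholder.
HEAD = (
    b"<!doctype html><html><head>"
    b"<link rel='stylesheet' href='/site.css'>"
    b"</head>"
)


def slow_controller(delay: float = 0.0) -> bytes:
    """Stand-in for the controller action (the sleep(5) in the question)."""
    time.sleep(delay)
    return b"<body><h1>Done</h1></body></html>"


def early_head_app(environ, start_response):
    """WSGI app that emits the head before running the slow part."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    # First chunk: the head, handed to the server before any slow work.
    yield HEAD
    # Second chunk: produced only after the controller has finished.
    yield slow_controller(float(environ.get("demo.delay", 0)))
```

The same drawbacks apply: once the first chunk has been yielded, the status line and headers are fixed, so a redirect is no longer possible, and whether the chunk actually reaches the browser early still depends on buffering in the server, proxies, and compression layers.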
According to the HTML book which I am reading (and according to here: http://www.w3schools.com/tags/att_form_action.asp) it says that in the case of forms:
<form action="/login/" method="post">
'action' specifies where to send the form data when a form is submitted. The syntax for it could be
<form action="URL">
Now, in the book I am reading, it also talks about a hidden 'next' variable, like so:
<form action="/login/" method="post">
<input type='hidden' name='next' value='/' />
<input type='submit' value='login' />
The book I am reading states that
The form contains a submit button as well as a hidden field called 'next'. This hidden variable contains a URL that tells where to redirect the user after they have logged in.
From my understanding, doesn't 'action' already specify where to redirect to after the form has been submitted? So isn't the hidden 'next' variable unnecessary, because 'action' already says where to redirect to? Which takes priority if action and next are different URLs? Does it redirect to the URL in action or the URL in next?
First up, the action attribute has nothing to do with redirection.
When you click a submit button in a form, the browser sends an HTTP request to the server, specifically to the resource named in the action attribute, using either GET or POST. What happens next depends entirely on what you are running server side.
As far as I am aware, the presence of a next field in the request has no special meaning.
Normally, when a form is submitted, one of two things will happen:
The server does some processing and then returns an HTTP response. The browser's address bar will show the action URL.
The server does some processing and then redirects the user to a new page. This is usually programmed by you, the programmer, and does not happen automagically. In PHP you would typically send a Location header with header('Location: ' . $someUrl) (http_redirect(someURL) requires the pecl_http extension). The hidden next field could be used to hold this URL, but it won't do anything by itself.
On a technical note, HTTP redirects, whether issued from ASP, PHP, or anything else, cause an additional round trip to the browser. So in the case of the redirect above, an HTTP response is sent to the browser with a header telling it to redirect and where to go. The browser then sends a new request to the new location. This is why the new address appears in the browser's address bar.
The action URL is where the browser sends the form, and therefore where it navigates to, by default. The value of next has no effect unless you have server-side code that reads it. Even if you do write such code, the request first goes to the action URL, and only then is the browser redirected to whatever URL your code chooses.
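To make the division of labour concrete, here is a minimal server-side sketch in Python using only the standard library. The function name handle_login_post is made up for this example, and the actual credential check is omitted. The request arrives at the action URL, and it is the handler's code that reads the hidden next field and answers with a redirect:

```python
from urllib.parse import parse_qs


def handle_login_post(body: bytes):
    """Handle a POST to the action URL and redirect to the 'next' field.

    Returns a (status, headers) pair. Hypothetical handler: the login
    check itself is left out of this sketch.
    """
    fields = parse_qs(body.decode("utf-8"))
    # Fall back to the site root when the hidden field is missing.
    next_url = fields.get("next", ["/"])[0]
    # Accept only local paths so the field cannot send users to another site.
    if not next_url.startswith("/") or next_url.startswith("//"):
        next_url = "/"
    return "302 Found", [("Location", next_url)]


status, headers = handle_login_post(b"next=%2Fprofile%2F&user=alice")
print(status, headers)  # 302 Found [('Location', '/profile/')]
```

Without code like this on the server, the next field is just another submitted value and the browser simply shows the response from the action URL.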
I don't think it's possible, but is there a way to get the size of a page without downloading it? (It may seem silly, but I want to ask anyway.)
You can fetch a page with curl and measure its size, but I don't want to download the page, and there is nothing useful in the headers for text/html responses.
Read the Content-Length field from the response headers.
It is defined in Section 14.13 of the HTTP/1.1 specification (RFC 2616).
Use the HEAD HTTP method instead of GET:
The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself.
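A minimal sketch of this in Python, using only the standard library; the function names are made up for this example. It sends a HEAD request and reads Content-Length if the server provides it, returning None otherwise, since servers are not obliged to send that header (for example with chunked or dynamically generated responses):

```python
import urllib.request
from typing import Optional


def content_length_from_headers(headers) -> Optional[int]:
    """Return Content-Length as an int, or None if absent or malformed."""
    value = headers.get("Content-Length")
    if value is None or not value.strip().isdigit():
        return None
    return int(value.strip())


def head_content_length(url: str, timeout: float = 10.0) -> Optional[int]:
    """Send a HEAD request and report the advertised body size, if any."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return content_length_from_headers(response.headers)
```

For example, head_content_length("http://php.net/") either returns a byte count or None, depending on what that server chooses to send.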
Even a HEAD request doesn't guarantee that you will get Content-Length in the response. Check it for yourself:
stream_context_set_default(
array(
'http' => array(
'method' => 'HEAD'
)
)
);
var_dump(get_headers("http://www.stackoverflow.com", 1));
var_dump(get_headers("http://www.google.com", 1));
var_dump(get_headers("http://php.net/", 1));
I think the best option is still to download the page using curl and then measure the size of the page (as plain text).
Is there a way to specify a form either through type or action url to not open the response? In other words I would like to send the info to the server, but not do anything on the client. I know I can use ajax and ignore the response, but I would like to avoid adding all the js to my code if possible.
Edit: I didn't mean to limit myself to the html form. In my case server side solutions were also acceptable.
Have the server return HTTP 204 (No Content) after the form submission. According to the HTTP 1.1 spec:
10.2.5 204 No Content
The server has fulfilled the request but does not need to return an entity-body, and might want to return updated metainformation. The response MAY include new or updated metainformation in the form of entity-headers, which if present SHOULD be associated with the requested variant.
If the client is a user agent, it SHOULD NOT change its document view from that which caused the request to be sent. This response is primarily intended to allow input for actions to take place without causing a change to the user agent's active document view, although any new or updated metainformation SHOULD be applied to the document currently in the user agent's active view.
The 204 response MUST NOT include a message-body, and thus is always terminated by the first empty line after the header fields.
This sounds like exactly what you want.
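As a rough server-side sketch, here is a Python WSGI handler that accepts the form post and answers with 204. The names submit_app and simulate_post are made up for this example, and the actual processing of the form data is left as a comment:

```python
import io


def submit_app(environ, start_response):
    """WSGI app: process a form POST and reply 204 with no body."""
    if environ.get("REQUEST_METHOD") != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    form_body = environ["wsgi.input"].read(length)
    # ... store or process form_body here (omitted in this sketch) ...
    start_response("204 No Content", [])
    return []  # a 204 response carries no message body


def simulate_post(body: bytes):
    """Call submit_app directly with a fake POST, without a real server."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    environ = {
        "REQUEST_METHOD": "POST",
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    chunks = list(submit_app(environ, start_response))
    return captured["status"], b"".join(chunks)


print(simulate_post(b"name=x"))  # ('204 No Content', b'')
```

With a response like this, a browser that follows the specification stays on the page containing the form after submission.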
Try targeting a hidden iframe:
<iframe id="invisible" ...
<form target="invisible" ...
I found that name attribute should be specified as well (I tested in IE11). E.g:
<iframe id="invisible" name="invisible" style="display:none;"></iframe>
<form method="post" target="invisible" action="url.com/whatever?x=y" id="fileForm" enctype="multipart/form-data">
With ASP.NET you could have a page that processes the form post and simply ends the response right away; this leaves the user on the same page.
However, giving the user no feedback at all is not the best user experience.