August'24: Kamaelia is in maintenance mode and will receive periodic updates, about twice a year, primarily targeted around Python 3 and ecosystem compatibility. PRs are always welcome. Latest Release: 1.14.32 (2024/3/24)
This component is for downloading a single file from an HTTP server. Pick up data received from the server on its "outbox" outbox.
Generally you should use SimpleHTTPClient in preference to this.
How to use it:
Pipeline(
    SingleShotHTTPClient("http://www.google.co.uk/"),
    SomeComponentThatUnderstandsThoseMessageTypes()
).run()
If you want to use it directly, note that it does not output strings but ParsedHTTPHeader, ParsedHTTPBodyChunk and ParsedHTTPEnd messages, like HTTPParser. This has the advantage of not buffering huge files in memory but outputting them as a stream of chunks. (With plain strings you would not know the contents of the headers or at what point the response had ended!)
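As an illustration of why the typed messages are useful, here is a minimal sketch of a downstream consumer that streams body chunks to a file-like object without holding the whole file in memory. The three classes are local stand-ins so the example runs without Kamaelia; their attribute names (header, bodychunk) and the helper consume() are assumptions made for this sketch, not a statement of the library's API.

```python
# Stand-ins for the message types named above (attribute names are assumed).
class ParsedHTTPHeader:
    def __init__(self, header):
        self.header = header

class ParsedHTTPBodyChunk:
    def __init__(self, bodychunk):
        self.bodychunk = bodychunk

class ParsedHTTPEnd:
    pass

def consume(messages, sink):
    """Write each body chunk to sink as it arrives; return the header at the end."""
    header = None
    for msg in messages:
        if isinstance(msg, ParsedHTTPHeader):
            header = msg.header            # headers are known before any body data
        elif isinstance(msg, ParsedHTTPBodyChunk):
            sink.write(msg.bodychunk)      # one chunk at a time, nothing buffered here
        elif isinstance(msg, ParsedHTTPEnd):
            return header                  # explicit end-of-response marker
    raise ValueError("stream ended without ParsedHTTPEnd")
```

In a real pipeline the same dispatch would live inside a component's main loop, reading the messages from its "inbox".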
SingleShotHTTPClient accepts a URL parameter at its creation (to __init__). When activated it creates an HTTPParser instance and then connects to the webserver specified in the URL using a TCPClient component. It sends an HTTP request and then any response from the server is received by the HTTPParser.
HTTPParser processes the response and outputs it in parts as:
ParsedHTTPHeader,
ParsedHTTPBodyChunk,
ParsedHTTPBodyChunk,
...
ParsedHTTPBodyChunk,
ParsedHTTPEnd
If SingleShotHTTPClient detects that the requested URL is a redirect page (using the Location header) then it begins this cycle anew with the URL of the new page, otherwise the parts of the page output by HTTPParser are sent on to "outbox".
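The redirect cycle described above can be sketched in plain Python. This is an illustration only: fetch is a hypothetical callable returning a (header dict, list of body chunks) pair, the "location" key and the max_redirects limit are assumptions of this sketch, and the real component works with messages between components rather than function calls.

```python
from urllib.parse import urljoin

def fetch_following_redirects(fetch, url, max_redirects=10):
    """Call fetch(url) repeatedly until the response carries no Location header."""
    for _ in range(max_redirects + 1):
        header, chunks = fetch(url)
        location = header.get("location")
        if location is None:
            # Not a redirect: these are the parts that would go to "outbox".
            return url, header, chunks
        # Redirect: start the cycle again with the new URL
        # (urljoin also resolves a relative Location against the current URL).
        url = urljoin(url, location)
    raise RuntimeError("gave up after %d redirects" % max_redirects)
```

The redirect limit is not something the text above describes; it is added here only so the sketch cannot loop forever.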
This component downloads the pages corresponding to HTTP URLs received on "inbox" and outputs their contents (file data) as a message, one per URL, to "outbox" in the order they were received.
Type URLs, and they will be downloaded and written, back to back, to "downloadedfile.txt":
Pipeline(
    ConsoleReader(">>> ", ""),
    SimpleHTTPClient(),
    SimpleFileWriter("downloadedfile.txt"),
).run()
SimpleHTTPClient uses the Carousel component to create a new SingleShotHTTPClient component for every URL requested. As URLs are handled sequentially, it has only one SingleShotHTTPClient child at any one time.
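The one-child-at-a-time behaviour can be modelled with a short generator. This is a rough sketch that does not use Carousel or any Kamaelia class: make_child is a hypothetical factory returning an iterable of body chunks for one URL, standing in for the per-URL child component.

```python
def download_in_order(urls, make_child):
    """Yield one complete body per URL, in the order the URLs were received."""
    for url in urls:
        child = make_child(url)      # a fresh child for every URL
        # The child is fully drained before the next one is created,
        # so at most one child exists at any time.
        yield b"".join(child)
```

Because each child is consumed before the next URL is looked at, the outputs necessarily come out in request order.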
Warning!
You should be using the inbox/outbox interface, not these methods (except construction). This documentation is designed as a roadmap as to their functionality for maintainers and new component developers.
Create and link to a carousel object
Destroy child components and send producerFinished when we quit.
Main loop.
SingleShotHTTPClient() -> component that can download a file using HTTP by URL
Arguments:
- starturl -- the URL of the file to download
- [postbody] -- data to POST to that URL; if set to None it becomes an empty body in a POST (or PUT) request
- [connectionclass] -- specify a class other than TCPClient to connect with
- [method] -- the HTTP method for the request (defaults to GET normally, or POST if postbody != "")
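The default-method rule in the argument list can be written out as a small function. This is a sketch of the rule as stated, not the component's code: choose_method is a hypothetical helper, and the defaults postbody="" and method=None are assumptions.

```python
def choose_method(postbody="", method=None):
    """Pick the HTTP method: an explicit method wins, otherwise GET or POST."""
    if method is not None:
        return method
    # An empty string means "no body", so GET; anything else,
    # including None (an empty POST body), means POST.
    return "GET" if postbody == "" else "POST"
```

Note that under this rule postbody=None selects POST, which matches the description of None as an empty request body.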
Warning!
You should be using the inbox/outbox interface, not these methods (except construction). This documentation is designed as a roadmap as to their functionality for maintainers and new component developers.
Craft an HTTP request string for the supplied URL
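For readers unfamiliar with what such a request looks like, here is a minimal sketch built with the standard library. make_request is a hypothetical helper; the exact headers the component sends are not documented here, so the choice of HTTP/1.1, "Connection: close" and Content-Length below is an assumption.

```python
from urllib.parse import urlsplit

def make_request(url, method="GET", body=b""):
    """Build the raw bytes of a simple HTTP/1.1 request for url."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError("URL has no host: %r" % url)
    # The request target is the path plus any query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    host = parts.hostname
    if parts.port is not None:
        host += ":%d" % parts.port
    lines = ["%s %s HTTP/1.1" % (method, target),
             "Host: " + host,
             "Connection: close"]
    if body or method in ("POST", "PUT"):
        lines.append("Content-Length: %d" % len(body))
    # A blank line separates the headers from the (possibly empty) body.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body
```

For example, make_request("http://www.google.co.uk/") yields a GET for "/" with a Host header of www.google.co.uk.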
Check for a redirect response and, if it is one, queue the fetching of the page it points to. Returns true if it was a redirect page and false otherwise.
Main loop.
Called repeatedly by the main loop. Checks inboxes and processes messages received. Starts fetching the new page if the current one is a redirect and has been completely fetched.
Connect to the remote HTTP server and send request
Close TCP connection and HTTP parser
-- Automatic documentation generator, 05 Jun 2009 at 03:01:38 UTC/GMT