Revert removal of Http. Witness my indecisiveness.
This commit is contained in:
parent
fb975ead23
commit
dd203f0065
14 changed files with 6923 additions and 0 deletions
180
ext/http-parser/README.md
Normal file
180
ext/http-parser/README.md
Normal file
|
@ -0,0 +1,180 @@
|
|||
HTTP Parser
|
||||
===========
|
||||
|
||||
[](https://travis-ci.org/joyent/http-parser)
|
||||
|
||||
This is a parser for HTTP messages written in C. It parses both requests and
|
||||
responses. The parser is designed to be used in performance HTTP
|
||||
applications. It does not make any syscalls nor allocations, it does not
|
||||
buffer data, it can be interrupted at anytime. Depending on your
|
||||
architecture, it only requires about 40 bytes of data per message
|
||||
stream (in a web server that is per connection).
|
||||
|
||||
Features:
|
||||
|
||||
* No dependencies
|
||||
* Handles persistent streams (keep-alive).
|
||||
* Decodes chunked encoding.
|
||||
* Upgrade support
|
||||
* Defends against buffer overflow attacks.
|
||||
|
||||
The parser extracts the following information from HTTP messages:
|
||||
|
||||
* Header fields and values
|
||||
* Content-Length
|
||||
* Request method
|
||||
* Response status code
|
||||
* Transfer-Encoding
|
||||
* HTTP version
|
||||
* Request URL
|
||||
* Message body
|
||||
|
||||
|
||||
Usage
|
||||
-----
|
||||
|
||||
One `http_parser` object is used per TCP connection. Initialize the struct
|
||||
using `http_parser_init()` and set the callbacks. That might look something
|
||||
like this for a request parser:
|
||||
|
||||
http_parser_settings settings;
|
||||
settings.on_url = my_url_callback;
|
||||
settings.on_header_field = my_header_field_callback;
|
||||
/* ... */
|
||||
|
||||
http_parser *parser = malloc(sizeof(http_parser));
|
||||
http_parser_init(parser, HTTP_REQUEST);
|
||||
parser->data = my_socket;
|
||||
|
||||
When data is received on the socket execute the parser and check for errors.
|
||||
|
||||
size_t len = 80*1024, nparsed;
|
||||
char buf[len];
|
||||
ssize_t recved;
|
||||
|
||||
recved = recv(fd, buf, len, 0);
|
||||
|
||||
if (recved < 0) {
|
||||
/* Handle error. */
|
||||
}
|
||||
|
||||
/* Start up / continue the parser.
|
||||
* Note we pass recved==0 to signal that EOF has been recieved.
|
||||
*/
|
||||
nparsed = http_parser_execute(parser, &settings, buf, recved);
|
||||
|
||||
if (parser->upgrade) {
|
||||
/* handle new protocol */
|
||||
} else if (nparsed != recved) {
|
||||
/* Handle error. Usually just close the connection. */
|
||||
}
|
||||
|
||||
HTTP needs to know where the end of the stream is. For example, sometimes
|
||||
servers send responses without Content-Length and expect the client to
|
||||
consume input (for the body) until EOF. To tell http_parser about EOF, give
|
||||
`0` as the forth parameter to `http_parser_execute()`. Callbacks and errors
|
||||
can still be encountered during an EOF, so one must still be prepared
|
||||
to receive them.
|
||||
|
||||
Scalar valued message information such as `status_code`, `method`, and the
|
||||
HTTP version are stored in the parser structure. This data is only
|
||||
temporally stored in `http_parser` and gets reset on each new message. If
|
||||
this information is needed later, copy it out of the structure during the
|
||||
`headers_complete` callback.
|
||||
|
||||
The parser decodes the transfer-encoding for both requests and responses
|
||||
transparently. That is, a chunked encoding is decoded before being sent to
|
||||
the on_body callback.
|
||||
|
||||
|
||||
The Special Problem of Upgrade
|
||||
------------------------------
|
||||
|
||||
HTTP supports upgrading the connection to a different protocol. An
|
||||
increasingly common example of this is the Web Socket protocol which sends
|
||||
a request like
|
||||
|
||||
GET /demo HTTP/1.1
|
||||
Upgrade: WebSocket
|
||||
Connection: Upgrade
|
||||
Host: example.com
|
||||
Origin: http://example.com
|
||||
WebSocket-Protocol: sample
|
||||
|
||||
followed by non-HTTP data.
|
||||
|
||||
(See http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-75 for more
|
||||
information the Web Socket protocol.)
|
||||
|
||||
To support this, the parser will treat this as a normal HTTP message without a
|
||||
body. Issuing both on_headers_complete and on_message_complete callbacks. However
|
||||
http_parser_execute() will stop parsing at the end of the headers and return.
|
||||
|
||||
The user is expected to check if `parser->upgrade` has been set to 1 after
|
||||
`http_parser_execute()` returns. Non-HTTP data begins at the buffer supplied
|
||||
offset by the return value of `http_parser_execute()`.
|
||||
|
||||
|
||||
Callbacks
|
||||
---------
|
||||
|
||||
During the `http_parser_execute()` call, the callbacks set in
|
||||
`http_parser_settings` will be executed. The parser maintains state and
|
||||
never looks behind, so buffering the data is not necessary. If you need to
|
||||
save certain data for later usage, you can do that from the callbacks.
|
||||
|
||||
There are two types of callbacks:
|
||||
|
||||
* notification `typedef int (*http_cb) (http_parser*);`
|
||||
Callbacks: on_message_begin, on_headers_complete, on_message_complete.
|
||||
* data `typedef int (*http_data_cb) (http_parser*, const char *at, size_t length);`
|
||||
Callbacks: (requests only) on_uri,
|
||||
(common) on_header_field, on_header_value, on_body;
|
||||
|
||||
Callbacks must return 0 on success. Returning a non-zero value indicates
|
||||
error to the parser, making it exit immediately.
|
||||
|
||||
In case you parse HTTP message in chunks (i.e. `read()` request line
|
||||
from socket, parse, read half headers, parse, etc) your data callbacks
|
||||
may be called more than once. Http-parser guarantees that data pointer is only
|
||||
valid for the lifetime of callback. You can also `read()` into a heap allocated
|
||||
buffer to avoid copying memory around if this fits your application.
|
||||
|
||||
Reading headers may be a tricky task if you read/parse headers partially.
|
||||
Basically, you need to remember whether last header callback was field or value
|
||||
and apply following logic:
|
||||
|
||||
(on_header_field and on_header_value shortened to on_h_*)
|
||||
------------------------ ------------ --------------------------------------------
|
||||
| State (prev. callback) | Callback | Description/action |
|
||||
------------------------ ------------ --------------------------------------------
|
||||
| nothing (first call) | on_h_field | Allocate new buffer and copy callback data |
|
||||
| | | into it |
|
||||
------------------------ ------------ --------------------------------------------
|
||||
| value | on_h_field | New header started. |
|
||||
| | | Copy current name,value buffers to headers |
|
||||
| | | list and allocate new buffer for new name |
|
||||
------------------------ ------------ --------------------------------------------
|
||||
| field | on_h_field | Previous name continues. Reallocate name |
|
||||
| | | buffer and append callback data to it |
|
||||
------------------------ ------------ --------------------------------------------
|
||||
| field | on_h_value | Value for current header started. Allocate |
|
||||
| | | new buffer and copy callback data to it |
|
||||
------------------------ ------------ --------------------------------------------
|
||||
| value | on_h_value | Value continues. Reallocate value buffer |
|
||||
| | | and append callback data to it |
|
||||
------------------------ ------------ --------------------------------------------
|
||||
|
||||
|
||||
Parsing URLs
|
||||
------------
|
||||
|
||||
A simplistic zero-copy URL parser is provided as `http_parser_parse_url()`.
|
||||
Users of this library may wish to use it to parse URLs constructed from
|
||||
consecutive `on_url` callbacks.
|
||||
|
||||
See examples of reading in headers:
|
||||
|
||||
* [partial example](http://gist.github.com/155877) in C
|
||||
* [from http-parser tests](http://github.com/joyent/http-parser/blob/37a0ff8/test.c#L403) in C
|
||||
* [from Node library](http://github.com/joyent/node/blob/842eaf4/src/http.js#L284) in Javascript
|
Loading…
Add table
Add a link
Reference in a new issue