Tarantool development patches archive
 help / color / mirror / Atom feed
* [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements
@ 2019-10-28 17:11 Ilya Kosarev
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 1/4] http: add CURLOPT_ACCEPT_ENCODING option Ilya Kosarev
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Ilya Kosarev @ 2019-10-28 17:11 UTC (permalink / raw)
  To: tarantool-patches; +Cc: tarantool-patches

This patchset introduces CURLOPT_ACCEPT_ENCODING option. It brought up
fix for CURLE_WRITE_ERROR processing and addition of
CURLE_BAD_CONTENT_ENCODING in curl request code processing, as well as
enhancement of provided error info for curl request.

Changes in v2:
- added docbot request
- fixed comments
- enhanced httpc_set_accept_encoding description
- fixed error handling for unsupported encodings

Changes in v3:
- moved error handling for unsupported encodings fix to separate commit
- added error message obtained using CURLOPT_ERRORBUFFER option
- added CURLE_BAD_CONTENT_ENCODING for curl request code processing

Changes in v4:
- fixed commit message in PATCH 2/4
- added test case for error message obtained using CURLOPT_ERRORBUFFER option

Branch: https://github.com/tarantool/tarantool/tree/i.kosarev/gh-4232-curlopt-accept-encoding
Issues: https://github.com/tarantool/tarantool/issues/4232
        https://github.com/tarantool/tarantool/issues/4578
        https://github.com/tarantool/tarantool/issues/4579

Ilya Kosarev (4):
  http: add CURLOPT_ACCEPT_ENCODING option
  http: remove redundant & incorrect case for curl_request code
  http: enrich httpc_request with curl error message buffer
  http: add CURLE_BAD_CONTENT_ENCODING case for curl_request code

 src/httpc.c                       | 24 +++++++++++++++++++-----
 src/httpc.h                       | 29 +++++++++++++++++++++++++++++
 src/lua/httpc.c                   |  9 +++++++++
 src/lua/httpc.lua                 |  2 ++
 test/app-tap/http_client.test.lua |  4 +++-
 5 files changed, 62 insertions(+), 6 deletions(-)

-- 
2.17.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Tarantool-patches] [PATCH v4 1/4] http: add CURLOPT_ACCEPT_ENCODING option
  2019-10-28 17:11 [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements Ilya Kosarev
@ 2019-10-28 17:11 ` Ilya Kosarev
  2019-10-28 23:42   ` Alexander Turenko
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 2/4] http: remove redundant & incorrect case for curl_request code Ilya Kosarev
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Ilya Kosarev @ 2019-10-28 17:11 UTC (permalink / raw)
  To: tarantool-patches; +Cc: tarantool-patches

CURLOPT_ACCEPT_ENCODING option is now supported.
This option enables automatic decompression of HTTP downloads.
See https://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html

Closes #4232

@TarantoolBot document
Title: http: CURLOPT_ACCEPT_ENCODING option
Update the documentation for curl options to reflect new
CURLOPT_ACCEPT_ENCODING option. It enables automatic
decompression of HTTP downloads by setting the contents of the
Accept-Encoding header sent in a HTTP request and enabling
decoding of a response when a Content-Encoding header is received.
This is a request, not an order; the server may or may not do it.
Servers might respond with Content-Encoding even without getting
an Accept-Encoding in the request. Servers might respond with a
different Content-Encoding than what was asked for in the request.
@param encoding specifies what encoding you'd like. This param
can be an empty string which means Accept-Encoding header will
contain all built-in supported encodings. This param can be
comma-separated list of accepted encodings, like:
"br, gzip, deflate". Bundled libcurl supports "identity",
meaning non-compressed, "deflate" which requests the server to
compress its response using the zlib algorithm and "gzip" which
requests the gzip algorithm. System libcurl also possibly
supports "br" which is brotli.
See https://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
---
 src/httpc.c       | 16 ++++++++++++++++
 src/httpc.h       | 24 ++++++++++++++++++++++++
 src/lua/httpc.c   |  5 +++++
 src/lua/httpc.lua |  2 ++
 4 files changed, 47 insertions(+)

diff --git a/src/httpc.c b/src/httpc.c
index 8d18b9966..146a6f067 100644
--- a/src/httpc.c
+++ b/src/httpc.c
@@ -361,6 +361,22 @@ httpc_set_follow_location(struct httpc_request *req, long follow)
 			 follow);
 }
 
+void
+httpc_set_accept_encoding(struct httpc_request *req, const char *encoding)
+{
+/*
+* CURLOPT_ACCEPT_ENCODING was called CURLOPT_ENCODING before
+* libcurl 7.21.6.
+*/
+#if LIBCURL_VERSION_NUM >= 0x071506
+	curl_easy_setopt(req->curl_request.easy, CURLOPT_ACCEPT_ENCODING,
+			 encoding);
+#else
+	curl_easy_setopt(req->curl_request.easy, CURLOPT_ENCODING,
+			 encoding);
+#endif
+}
+
 int
 httpc_execute(struct httpc_request *req, double timeout)
 {
diff --git a/src/httpc.h b/src/httpc.h
index 99fd8fbd4..f710b3d13 100644
--- a/src/httpc.h
+++ b/src/httpc.h
@@ -372,6 +372,30 @@ httpc_set_interface(struct httpc_request *req, const char *interface);
 void
 httpc_set_follow_location(struct httpc_request *req, long follow);
 
+/**
+ * Enable automatic decompression of HTTP downloads: set the
+ * contents of the Accept-Encoding header sent in a HTTP request
+ * and enable decoding of a response when a Content-Encoding
+ * header is received. This is a request, not an order; the
+ * server may or may not do it. Servers might respond with
+ * Content-Encoding even without getting an Accept-Encoding in the
+ * request. Servers might respond with a different
+ * Content-Encoding than what was asked for in the request.
+ * @param req request
+ * @param encoding - specify what encoding you'd like. This param
+ * can be an empty string which means Accept-Encoding header will
+ * contain all built-in supported encodings. This param can be
+ * comma-separated list of accepted encodings, like:
+ * "br, gzip, deflate". Bundled libcurl supports "identity",
+ * meaning non-compressed, "deflate" which requests the server to
+ * compress its response using the zlib algorithm and "gzip" which
+ * requests the gzip algorithm. System libcurl also possibly
+ * supports "br" which is brotli.
+ * @see https://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
+ */
+void
+httpc_set_accept_encoding(struct httpc_request *req, const char *encoding);
+
 /**
  * This function does async HTTP request
  * @param request - reference to request object with filled fields
diff --git a/src/lua/httpc.c b/src/lua/httpc.c
index a8e3e2525..6ed4eb788 100644
--- a/src/lua/httpc.c
+++ b/src/lua/httpc.c
@@ -315,6 +315,11 @@ luaT_httpc_request(lua_State *L)
 		httpc_set_follow_location(req, lua_toboolean(L, -1));
 	lua_pop(L, 1);
 
+	lua_getfield(L, 5, "accept_encoding");
+	if (!lua_isnil(L, -1))
+		httpc_set_accept_encoding(req, lua_tostring(L, -1));
+	lua_pop(L, 1);
+
 	if (httpc_execute(req, timeout) != 0) {
 		httpc_request_delete(req);
 		return luaT_error(L);
diff --git a/src/lua/httpc.lua b/src/lua/httpc.lua
index ce9bb9771..3224d8c10 100644
--- a/src/lua/httpc.lua
+++ b/src/lua/httpc.lua
@@ -296,6 +296,8 @@ end
 --          'Location' header that a server sends as part of an
 --          3xx response;
 --
+--      accept_encoding - enables automatic decompression of HTTP downloads;
+--
 --  Returns:
 --      {
 --          status=NUMBER,
-- 
2.17.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Tarantool-patches] [PATCH v4 2/4] http: remove redundant & incorrect case for curl_request code
  2019-10-28 17:11 [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements Ilya Kosarev
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 1/4] http: add CURLOPT_ACCEPT_ENCODING option Ilya Kosarev
@ 2019-10-28 17:11 ` Ilya Kosarev
  2019-10-28 23:42   ` Alexander Turenko
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 3/4] http: enrich httpc_request with curl error message buffer Ilya Kosarev
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Ilya Kosarev @ 2019-10-28 17:11 UTC (permalink / raw)
  To: tarantool-patches; +Cc: tarantool-patches

After executing curl request we need to process curl_request code. It
might be CURLE_WRITE_ERROR. We had special case for it, which assumed
diagnostic message being set and contained corresponding assert, though
it is incorrect. Better way is to handle it as any other non-standard
event.

Part of #4232
---
 src/httpc.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/src/httpc.c b/src/httpc.c
index 146a6f067..22b54d16a 100644
--- a/src/httpc.c
+++ b/src/httpc.c
@@ -445,16 +445,12 @@ httpc_execute(struct httpc_request *req, double timeout)
 	case CURLE_COULDNT_RESOLVE_PROXY:
 	case CURLE_COULDNT_RESOLVE_HOST:
 	case CURLE_COULDNT_CONNECT:
+	case CURLE_WRITE_ERROR:
 		/* 595 Connection Problem (AnyEvent non-standard) */
 		req->status = 595;
 		req->reason = curl_easy_strerror(req->curl_request.code);
 		++env->stat.failed_requests;
 		break;
-	case CURLE_WRITE_ERROR:
-		/* Diag is already set by curl_write_cb() */
-		assert(!diag_is_empty(&fiber()->diag));
-		++env->stat.failed_requests;
-		return -1;
 	case CURLE_OUT_OF_MEMORY:
 		diag_set(OutOfMemory, 0, "curl", "internal");
 		++env->stat.failed_requests;
-- 
2.17.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Tarantool-patches] [PATCH v4 3/4] http: enrich httpc_request with curl error message buffer
  2019-10-28 17:11 [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements Ilya Kosarev
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 1/4] http: add CURLOPT_ACCEPT_ENCODING option Ilya Kosarev
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 2/4] http: remove redundant & incorrect case for curl_request code Ilya Kosarev
@ 2019-10-28 17:11 ` Ilya Kosarev
  2019-10-28 23:41   ` Alexander Turenko
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 4/4] http: add CURLE_BAD_CONTENT_ENCODING case for curl_request code Ilya Kosarev
  2019-10-28 23:43 ` [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements Alexander Turenko
  4 siblings, 1 reply; 12+ messages in thread
From: Ilya Kosarev @ 2019-10-28 17:11 UTC (permalink / raw)
  To: tarantool-patches; +Cc: tarantool-patches

When processing curl request error code, we are collecting error
message using curl_easy_strerror. Now we are providing more info
in errmsg, which is obtained using curl_easy_setopt with
CURLOPT_ERRORBUFFER option. Test case to confirm it is added.

Closes #4578

@TarantoolBot document
Title: http: new field in client_object:request return table
Update the documentation for client_object:request method to reflect
new field errmsg in return table. It contains extended curl request
error message obtained using CURLOPT_ERRORBUFFER option.
See https://curl.haxx.se/libcurl/c/CURLOPT_ERRORBUFFER.html
---
 src/httpc.c                       | 1 +
 src/httpc.h                       | 5 +++++
 src/lua/httpc.c                   | 4 ++++
 test/app-tap/http_client.test.lua | 4 +++-
 4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/httpc.c b/src/httpc.c
index 22b54d16a..b493a8498 100644
--- a/src/httpc.c
+++ b/src/httpc.c
@@ -398,6 +398,7 @@ httpc_execute(struct httpc_request *req, double timeout)
 	curl_easy_setopt(req->curl_request.easy, CURLOPT_HEADERDATA, (void *) req);
 	curl_easy_setopt(req->curl_request.easy, CURLOPT_PRIVATE, (void *) &req->curl_request);
 	curl_easy_setopt(req->curl_request.easy, CURLOPT_HTTPHEADER, req->headers);
+	curl_easy_setopt(req->curl_request.easy, CURLOPT_ERRORBUFFER, req->errmsg);
 
 	++env->stat.total_requests;
 
diff --git a/src/httpc.h b/src/httpc.h
index f710b3d13..145b59929 100644
--- a/src/httpc.h
+++ b/src/httpc.h
@@ -105,6 +105,11 @@ struct httpc_request {
 	int status;
 	/** Error message */
 	const char *reason;
+	/**
+	 * Buffer for error message obtained using
+	 * CURLOPT_ERRORBUFFER option.
+	 */
+        const char errmsg[CURL_ERROR_SIZE];
 	/** buffer of headers */
 	struct region resp_headers;
 	/** buffer of body */
diff --git a/src/lua/httpc.c b/src/lua/httpc.c
index 6ed4eb788..9fb8e4801 100644
--- a/src/lua/httpc.c
+++ b/src/lua/httpc.c
@@ -335,6 +335,10 @@ luaT_httpc_request(lua_State *L)
 	lua_pushstring(L, req->reason);
 	lua_settable(L, -3);
 
+	lua_pushstring(L, "errmsg");
+	lua_pushstring(L, req->errmsg);
+	lua_settable(L, -3);
+
 	size_t headers_len = region_used(&req->resp_headers);
 	if (headers_len > 0) {
 		char *headers = region_join(&req->resp_headers, headers_len);
diff --git a/test/app-tap/http_client.test.lua b/test/app-tap/http_client.test.lua
index b85b605cf..2fac80115 100755
--- a/test/app-tap/http_client.test.lua
+++ b/test/app-tap/http_client.test.lua
@@ -171,7 +171,7 @@ local function test_cancel_and_errinj(test, url, opts)
 end
 
 local function test_post_and_get(test, url, opts)
-    test:plan(21)
+    test:plan(23)
 
     local http = client.new()
     test:ok(http ~= nil, "client is created")
@@ -227,6 +227,7 @@ local function test_post_and_get(test, url, opts)
     r = responses.absent_get
     test:is(r.status, 500, "GET: absent method http code page exists")
     test:is(r.reason, 'Unknown', '500 - Unknown')
+    test:is(type(r.errmsg), 'string', 'check error message')
     test:is(r.body, "No such method", "GET: absent method right body")
 
     r = responses.empty_post
@@ -249,6 +250,7 @@ local function test_post_and_get(test, url, opts)
     r = responses.bad_get
     test:is(r.status, 404, "GET: http page not exists")
     test:is(r.reason, 'Unknown', '404 - Unknown')
+    test:is(type(r.errmsg), 'string', 'check error message')
     test:isnt(r.body:len(), 0, "GET: not empty body page not exists")
     test:ok(string.find(r.body, "Not Found"),
                 "GET: right body page not exists")
-- 
2.17.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [Tarantool-patches] [PATCH v4 4/4] http: add CURLE_BAD_CONTENT_ENCODING case for curl_request code
  2019-10-28 17:11 [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements Ilya Kosarev
                   ` (2 preceding siblings ...)
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 3/4] http: enrich httpc_request with curl error message buffer Ilya Kosarev
@ 2019-10-28 17:11 ` Ilya Kosarev
  2019-10-28 23:42   ` Alexander Turenko
  2019-10-28 23:43 ` [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements Alexander Turenko
  4 siblings, 1 reply; 12+ messages in thread
From: Ilya Kosarev @ 2019-10-28 17:11 UTC (permalink / raw)
  To: tarantool-patches; +Cc: tarantool-patches

Currently in case of unknown encoding curl returns CURLE_WRITE_ERROR
request code. Since curl/curl#4449 CURLE_BAD_CONTENT_ENCODING request code
will be returned in this case. Therefore mentioned case has to be
added into switch clause processing curl request code.

Closes #4579
---
 src/httpc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/httpc.c b/src/httpc.c
index b493a8498..d3516049e 100644
--- a/src/httpc.c
+++ b/src/httpc.c
@@ -447,6 +447,7 @@ httpc_execute(struct httpc_request *req, double timeout)
 	case CURLE_COULDNT_RESOLVE_HOST:
 	case CURLE_COULDNT_CONNECT:
 	case CURLE_WRITE_ERROR:
+	case CURLE_BAD_CONTENT_ENCODING:
 		/* 595 Connection Problem (AnyEvent non-standard) */
 		req->status = 595;
 		req->reason = curl_easy_strerror(req->curl_request.code);
-- 
2.17.1

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 3/4] http: enrich httpc_request with curl error message buffer
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 3/4] http: enrich httpc_request with curl error message buffer Ilya Kosarev
@ 2019-10-28 23:41   ` Alexander Turenko
  2019-11-07 12:07     ` Ilya Kosarev
  0 siblings, 1 reply; 12+ messages in thread
From: Alexander Turenko @ 2019-10-28 23:41 UTC (permalink / raw)
  To: Ilya Kosarev; +Cc: tarantool-patches, tarantool-patches

Now I don't sure whether we should add the field: the cost is increased
memory footprint (it is megabytes even to 10k requests in the fly, but
this will spoil caches) and so performance. Don't sure how significant
cons and pros are.

Anyway, my comments and thoughts are below.

WBR, Alexander Turenko.

> http: enrich httpc_request with curl error message buffer

Try to fit the commit header into 50 symbols and concentrate on changes
that are visible for a user. How about this header?

 | httpc: give more information is case of an error

> When processing curl request error code, we are collecting error
> message using curl_easy_strerror. Now we are providing more info
> in errmsg, which is obtained using curl_easy_setopt with

I would say: in 'errmsg' field of a response object.

> CURLOPT_ERRORBUFFER option. Test case to confirm it is added.
> 
> Closes #4578
> 
> @TarantoolBot document
> Title: http: new field in client_object:request return table

http -> httpc

`client_object:request` is not valid Lua syntax: better use
`client_object:request()`.

> Update the documentation for client_object:request method to reflect
> new field errmsg in return table. It contains extended curl request
> error message obtained using CURLOPT_ERRORBUFFER option.

An example about, say, your case with unsupported content encoding would
shed light on why this may be needed, when a request object already
contains 'reason' field.

> diff --git a/src/httpc.h b/src/httpc.h
> index f710b3d13..145b59929 100644
> --- a/src/httpc.h
> +++ b/src/httpc.h
> @@ -105,6 +105,11 @@ struct httpc_request {
>  	int status;
>  	/** Error message */
>  	const char *reason;
> +	/**
> +	 * Buffer for error message obtained using
> +	 * CURLOPT_ERRORBUFFER option.
> +	 */
> +        const char errmsg[CURL_ERROR_SIZE];

It is indented using whitespaces rather then tabulation.

I'm tentative about placing 256 bytes buffer right in the structure.
However the whole structure is allocated with mempool_alloc(), so maybe
it is okay.

Also I don't sure whether it is okay to add 256 bytes of allocated
memory per request unconditionally: maybe we should provide an option to
disable it for applications that have tight SLA on a response processing
time.

I would give Georgy necessary context (what and how we allocate in our
http client now) and ask for his opinion on how and where allocate the
buffer.

BTW, I think that it is good to provide the field by default (in case of
an error) to allow to find a cause of an error faster during
development.

Maybe adding this field is not the good idea if it is not cheap and need
to be switched on/off case by case: AFAIR the same information can be
obtained using 'verbose' flag.

It would be good to catch Georgy's opinion on that too.

> diff --git a/src/lua/httpc.c b/src/lua/httpc.c
> index 6ed4eb788..9fb8e4801 100644
> --- a/src/lua/httpc.c
> +++ b/src/lua/httpc.c
> @@ -335,6 +335,10 @@ luaT_httpc_request(lua_State *L)
>  	lua_pushstring(L, req->reason);
>  	lua_settable(L, -3);
>  
> +	lua_pushstring(L, "errmsg");
> +	lua_pushstring(L, req->errmsg);
> +	lua_settable(L, -3);

I think we should add the field only when an error occurs. Now we add it
for successful responses and it looks a bit confusing.

More on that:

> For earlier versions if an error code was returned but there was no
> error detail then the buffer is untouched. 
> https://curl.haxx.se/libcurl/c/CURLOPT_ERRORBUFFER.html

We should handle this case and set a buffer to an empty string before a
request and verify that a code is not CURLE_OK afterwards (see the
example by the link above).

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 1/4] http: add CURLOPT_ACCEPT_ENCODING option
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 1/4] http: add CURLOPT_ACCEPT_ENCODING option Ilya Kosarev
@ 2019-10-28 23:42   ` Alexander Turenko
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Turenko @ 2019-10-28 23:42 UTC (permalink / raw)
  To: Ilya Kosarev; +Cc: tarantool-patches, tarantool-patches

I have only a few comments about texts, the code is okay.

WBR, Alexander Turenko.

> http: add CURLOPT_ACCEPT_ENCODING option

I would use 'httpc:' prefix for http client commits: we did it for
several recent patches about http client.

I would say that accept_encoding option was added or that
CURLOPT_ACCEPT_ENCODING binding is added.

>
> CURLOPT_ACCEPT_ENCODING option is now supported.
> This option enables automatic decompression of HTTP downloads.
> See https://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html
> 
> Closes #4232
> 
> @TarantoolBot document
> Title: http: CURLOPT_ACCEPT_ENCODING option

The same comments as above are applicable here too.

> Update the documentation for curl options to reflect new
> CURLOPT_ACCEPT_ENCODING option. It enables automatic

And here.

> decompression of HTTP downloads by setting the contents of the
> Accept-Encoding header sent in a HTTP request and enabling
> decoding of a response when a Content-Encoding header is received.
> This is a request, not an order; the server may or may not do it.
> Servers might respond with Content-Encoding even without getting
> an Accept-Encoding in the request. Servers might respond with a
> different Content-Encoding than what was asked for in the request.
> @param encoding specifies what encoding you'd like. This param

Doxygen-style @param here is a typo? The public http client API only has
'accept_encoding' option, not 'encoding'.

> can be an empty string which means Accept-Encoding header will
> contain all built-in supported encodings. This param can be
> comma-separated list of accepted encodings, like:
> "br, gzip, deflate". Bundled libcurl supports "identity",
> meaning non-compressed, "deflate" which requests the server to
> compress its response using the zlib algorithm and "gzip" which
> requests the gzip algorithm. System libcurl also possibly
> supports "br" which is brotli.
> See https://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html

I suggest to split this block of text by paragraph to ease reading.

> +/**
> + * Enable automatic decompression of HTTP downloads: set the
> + * contents of the Accept-Encoding header sent in a HTTP request
> + * and enable decoding of a response when a Content-Encoding
> + * header is received. This is a request, not an order; the
> + * server may or may not do it. Servers might respond with
> + * Content-Encoding even without getting an Accept-Encoding in the
> + * request. Servers might respond with a different
> + * Content-Encoding than what was asked for in the request.
> + * @param req request
> + * @param encoding - specify what encoding you'd like. This param
> + * can be an empty string which means Accept-Encoding header will
> + * contain all built-in supported encodings. This param can be
> + * comma-separated list of accepted encodings, like:
> + * "br, gzip, deflate". Bundled libcurl supports "identity",
> + * meaning non-compressed, "deflate" which requests the server to
> + * compress its response using the zlib algorithm and "gzip" which
> + * requests the gzip algorithm. System libcurl also possibly
> + * supports "br" which is brotli.
> + * @see https://curl.haxx.se/libcurl/c/CURLOPT_ACCEPT_ENCODING.html

The same: I would split it to meaningful paragraphs to ease reading.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 2/4] http: remove redundant & incorrect case for curl_request code
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 2/4] http: remove redundant & incorrect case for curl_request code Ilya Kosarev
@ 2019-10-28 23:42   ` Alexander Turenko
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Turenko @ 2019-10-28 23:42 UTC (permalink / raw)
  To: Ilya Kosarev; +Cc: tarantool-patches, tarantool-patches

This commit should land before accept_encoding, because the feature
opens more possibilities to step into the problem that is fixed by this
patch.

Also I think the the fix should land to all 1.10+ branches: please,
verify that this will be done right when the patch will land.

The code LGTM. See several comments about texts below.

WBR, Alexander Turenko.

> http: remove redundant & incorrect case for curl_request code

Please, try to fit the commit header to 50 symbols if possible. Say
(just as example):

 | httpc: fix assertion fail after a write error

Is I remember it right: any CURLE_WRITE_ERROR did lead to this assertion
fail? Or only unknown Content-Encoding?

>
> After executing curl request we need to process curl_request code. It
> might be CURLE_WRITE_ERROR. We had special case for it, which assumed
> diagnostic message being set and contained corresponding assert, though
> it is incorrect. Better way is to handle it as any other non-standard
> event.

I would add an information how do you catched this: your story about
unknown encoding. This woudl answer to the question why the patch was
made: I think it is good property, when a reader can understood it from
a commit message even when (s)he it is not much involved into a project.

> 
> Part of #4232
> ---
>  src/httpc.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/src/httpc.c b/src/httpc.c
> index 146a6f067..22b54d16a 100644
> --- a/src/httpc.c
> +++ b/src/httpc.c
> @@ -445,16 +445,12 @@ httpc_execute(struct httpc_request *req, double timeout)
>  	case CURLE_COULDNT_RESOLVE_PROXY:
>  	case CURLE_COULDNT_RESOLVE_HOST:
>  	case CURLE_COULDNT_CONNECT:
> +	case CURLE_WRITE_ERROR:
>  		/* 595 Connection Problem (AnyEvent non-standard) */
>  		req->status = 595;
>  		req->reason = curl_easy_strerror(req->curl_request.code);
>  		++env->stat.failed_requests;
>  		break;
> -	case CURLE_WRITE_ERROR:
> -		/* Diag is already set by curl_write_cb() */
> -		assert(!diag_is_empty(&fiber()->diag));
> -		++env->stat.failed_requests;
> -		return -1;
>  	case CURLE_OUT_OF_MEMORY:
>  		diag_set(OutOfMemory, 0, "curl", "internal");
>  		++env->stat.failed_requests;
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 4/4] http: add CURLE_BAD_CONTENT_ENCODING case for curl_request code
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 4/4] http: add CURLE_BAD_CONTENT_ENCODING case for curl_request code Ilya Kosarev
@ 2019-10-28 23:42   ` Alexander Turenko
  0 siblings, 0 replies; 12+ messages in thread
From: Alexander Turenko @ 2019-10-28 23:42 UTC (permalink / raw)
  To: Ilya Kosarev; +Cc: tarantool-patches, tarantool-patches

The code LGTM; see several comments about the description below.

WBR, Alexander Turenko.

> http: add CURLE_BAD_CONTENT_ENCODING case for curl_request code

I would write something like that as the commit header:

 | httpc: handle bad Content-Encoding with curl-7.67.0+

It give a reader the reason why we doing this; also it simplify schedule
the patch to proper branches; now we see that it is needed to support
more broad libcurl version range.

> Currently in case of unknown encoding curl returns CURLE_WRITE_ERROR
> request code. Since curl/curl#4449 CURLE_BAD_CONTENT_ENCODING request code

Fit a commit message body in 72 symbols.

> will be returned in this case. Therefore mentioned case has to be
> added into switch clause processing curl request code.

How about such description?

 | libcurl-7.66.0 and older returns CURLE_WRITE_ERROR when a server
 | responds with unknown or not supported Content-Encoding (see [1] and
 | [2]). This was fixed in future libcurl-7.67.0 and proper
 | CURLE_BAD_CONTENT_ENCODING code will be returned in the case.
 |
 | We should process the code in the same way as we do for
 | CURLE_WRITE_ERROR.
 |
 | [1]: issue link
 | [2]: PR link
 |
 | Closes #4579

> 
> Closes #4579
> ---
>  src/httpc.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/httpc.c b/src/httpc.c
> index b493a8498..d3516049e 100644
> --- a/src/httpc.c
> +++ b/src/httpc.c
> @@ -447,6 +447,7 @@ httpc_execute(struct httpc_request *req, double timeout)
>  	case CURLE_COULDNT_RESOLVE_HOST:
>  	case CURLE_COULDNT_CONNECT:
>  	case CURLE_WRITE_ERROR:
> +	case CURLE_BAD_CONTENT_ENCODING:
>  		/* 595 Connection Problem (AnyEvent non-standard) */
>  		req->status = 595;
>  		req->reason = curl_easy_strerror(req->curl_request.code);
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements
  2019-10-28 17:11 [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements Ilya Kosarev
                   ` (3 preceding siblings ...)
  2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 4/4] http: add CURLE_BAD_CONTENT_ENCODING case for curl_request code Ilya Kosarev
@ 2019-10-28 23:43 ` Alexander Turenko
  2019-11-07 12:07   ` Ilya Kosarev
  4 siblings, 1 reply; 12+ messages in thread
From: Alexander Turenko @ 2019-10-28 23:43 UTC (permalink / raw)
  To: Ilya Kosarev; +Cc: tarantool-patches, tarantool-patches

Hi!

Thanks for all that work on the http client.

I mostly okay with the code itself (however there are moments where I'm
tentiative in the patch re CURLOPT_ERRORBUFFER), but I would larify
messages around.

I'll comment each patch separately.

WBR, Alexander Turenko.

On Mon, Oct 28, 2019 at 08:11:11PM +0300, Ilya Kosarev wrote:
> This patchset introduces CURLOPT_ACCEPT_ENCODING option. It brought up
> fix for CURLE_WRITE_ERROR processing and addition of
> CURLE_BAD_CONTENT_ENCODING in curl request code processing, as well as
> enhancement of provided error info for curl request.
> 
> Changes in v2:
> - added docbot request
> - fixed comments
> - enhanced httpc_set_accept_encoding description
> - fixed error handling for unsupported encodings
> 
> Changes in v3:
> - moved error handling for unsupported encodings fix to separate commit
> - added error message obtained using CURLOPT_ERRORBUFFER option
> - added CURLE_BAD_CONTENT_ENCODING for curl request code processing
> 
> Changes in v4:
> - fixed commit message in PATCH 2/4
> - added test case for error message obtained using CURLOPT_ERRORBUFFER option
> 
> Branch: https://github.com/tarantool/tarantool/tree/i.kosarev/gh-4232-curlopt-accept-encoding
> Issues: https://github.com/tarantool/tarantool/issues/4232
>         https://github.com/tarantool/tarantool/issues/4578
>         https://github.com/tarantool/tarantool/issues/4579
> 
> Ilya Kosarev (4):
>   http: add CURLOPT_ACCEPT_ENCODING option
>   http: remove redundant & incorrect case for curl_request code
>   http: enrich httpc_request with curl error message buffer
>   http: add CURLE_BAD_CONTENT_ENCODING case for curl_request code
> 
>  src/httpc.c                       | 24 +++++++++++++++++++-----
>  src/httpc.h                       | 29 +++++++++++++++++++++++++++++
>  src/lua/httpc.c                   |  9 +++++++++
>  src/lua/httpc.lua                 |  2 ++
>  test/app-tap/http_client.test.lua |  4 +++-
>  5 files changed, 62 insertions(+), 6 deletions(-)
> 
> -- 
> 2.17.1
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 3/4] http: enrich httpc_request with curl error message buffer
  2019-10-28 23:41   ` Alexander Turenko
@ 2019-11-07 12:07     ` Ilya Kosarev
  0 siblings, 0 replies; 12+ messages in thread
From: Ilya Kosarev @ 2019-11-07 12:07 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches, tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 4910 bytes --]

Hi!

Thanks for your review.

In verbal discussion with Georgy he came up with an idea to provide
single error buffer for all curl requests instead of the error buffer
in each request and strdup it if needed.
However, turns out that we can't use single error buffer due to async
curl requests and inconsistent buffer filling: it might be filled in
arbitrary part of the curl request, which means it is inconsistent with
the remaining answer object body. In other words, we might get an
answer from one request and an error buffer from the other asynchronous
request, where it was already filled for some reason.
Therefore, our only option seems to be an error buffer in each request,
which doesn't seem very reasonable, as far as we already have some
error info and we can also obtain more info using @param verbose option
of client_object:request() if needed for debug.
As a conclusion for the thoughts above, I decide to exclude this patch
from the v5 of the current patchset and leave it alone until some
serious reason to reconsider this decision will come up.

>Вторник, 29 октября 2019, 2:42 +03:00 от Alexander Turenko <alexander.turenko@tarantool.org>:
>
>Now I don't sure whether we should add the field: the cost is increased
>memory footprint (it is megabytes even to 10k requests in the fly, but
>this will spoil caches) and so performance. Don't sure how significant
>cons and pros are.
>
>Anyway, my comments and thoughts are below.
>
>WBR, Alexander Turenko.
>
>> http: enrich httpc_request with curl error message buffer
>Try to fit the commit header into 50 symbols and concentrate on changes
>that are visible for a user. How about this header?
>
> | httpc: give more information is case of an error
>
>> When processing curl request error code, we are collecting error
>> message using curl_easy_strerror. Now we are providing more info
>> in errmsg, which is obtained using curl_easy_setopt with
>
>I would say: in 'errmsg' field of a response object.
>
>> CURLOPT_ERRORBUFFER option. Test case to confirm it is added.
>> 
>> Closes #4578
>> 
>> @TarantoolBot document
>> Title: http: new field in client_object:request return table
>
>http -> httpc
>
>`client_object:request` is not valid Lua syntax: better use
>`client_object:request()`.
>
>> Update the documentation for client_object:request method to reflect
>> new field errmsg in return table. It contains extended curl request
>> error message obtained using CURLOPT_ERRORBUFFER option.
>
>An example about, say, your case with unsupported content encoding would
>shed light on why this may be needed, when a request object already
>contains 'reason' field.
>
>> diff --git a/src/httpc.h b/src/httpc.h
>> index f710b3d13..145b59929 100644
>> --- a/src/httpc.h
>> +++ b/src/httpc.h
>> @@ -105,6 +105,11 @@ struct httpc_request {
>>  	int status;
>>  	/** Error message */
>>  	const char *reason;
>> +	/**
>> +	 * Buffer for error message obtained using
>> +	 * CURLOPT_ERRORBUFFER option.
>> +	 */
>> +        const char errmsg[CURL_ERROR_SIZE];
>
>It is indented using whitespaces rather then tabulation.
>
>I'm tentative about placing 256 bytes buffer right in the structure.
>However the whole structure is allocated with mempool_alloc(), so maybe
>it is okay.
>
>Also I don't sure whether it is okay to add 256 bytes of allocated
>memory per request unconditionally: maybe we should provide an option to
>disable it for applications that have tight SLA on a response processing
>time.
>
>I would give Georgy necessary context (what and how we allocate in our
>http client now) and ask for his opinion on how and where allocate the
>buffer.
>
>BTW, I think that it is good to provide the field by default (in case of
>an error) to allow to find a cause of an error faster during
>development.
>
>Maybe adding this field is not the good idea if it is not cheap and need
>to be switched on/off case by case: AFAIR the same information can be
>obtained using 'verbose' flag.
>
>It would be good to catch Georgy's opinion on that too.
>
>> diff --git a/src/lua/httpc.c b/src/lua/httpc.c
>> index 6ed4eb788..9fb8e4801 100644
>> --- a/src/lua/httpc.c
>> +++ b/src/lua/httpc.c
>> @@ -335,6 +335,10 @@ luaT_httpc_request(lua_State *L)
>>  	lua_pushstring(L, req->reason);
>>  	lua_settable(L, -3);
>> 
>> +	lua_pushstring(L, "errmsg");
>> +	lua_pushstring(L, req->errmsg);
>> +	lua_settable(L, -3);
>
>I think we should add the field only when an error occurs. Now we add it
>for successful responses and it looks a bit confusing.
>
>More on that:
>
>> For earlier versions if an error code was returned but there was no
>> error detail then the buffer is untouched. 
>>  https://curl.haxx.se/libcurl/c/CURLOPT_ERRORBUFFER.html
>
>We should handle this case and set a buffer to an empty string before a
>request and verify that a code is not CURLE_OK afterwards (see the
>example by the link above).


-- 
Ilya Kosarev

[-- Attachment #2: Type: text/html, Size: 6406 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements
  2019-10-28 23:43 ` [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements Alexander Turenko
@ 2019-11-07 12:07   ` Ilya Kosarev
  0 siblings, 0 replies; 12+ messages in thread
From: Ilya Kosarev @ 2019-11-07 12:07 UTC (permalink / raw)
  To: Alexander Turenko; +Cc: tarantool-patches, tarantool-patches

[-- Attachment #1: Type: text/plain, Size: 2380 bytes --]

Hi!

Thanks for your review.

In already provided v5 of this patchset patches 1, 2 & 4 are fixed
according to review comments.
Detailed answer for patch 3 review is also sent.

>Вторник, 29 октября 2019, 2:43 +03:00 от Alexander Turenko <alexander.turenko@tarantool.org>:
>
>Hi!
>
>Thanks for all that work on the http client.
>
>I mostly okay with the code itself (however there are moments where I'm
>tentiative in the patch re CURLOPT_ERRORBUFFER), but I would larify
>messages around.
>
>I'll comment each patch separately.
>
>WBR, Alexander Turenko.
>
>On Mon, Oct 28, 2019 at 08:11:11PM +0300, Ilya Kosarev wrote:
>> This patchset introduces CURLOPT_ACCEPT_ENCODING option. It brought up
>> fix for CURLE_WRITE_ERROR processing and addition of
>> CURLE_BAD_CONTENT_ENCODING in curl request code processing, as well as
>> enhancement of provided error info for curl request.
>> 
>> Changes in v2:
>> - added docbot request
>> - fixed comments
>> - enhanced httpc_set_accept_encoding description
>> - fixed error handling for unsupported encodings
>> 
>> Changes in v3:
>> - moved error handling for unsupported encodings fix to separate commit
>> - added error message obtained using CURLOPT_ERRORBUFFER option
>> - added CURLE_BAD_CONTENT_ENCODING for curl request code processing
>> 
>> Changes in v4:
>> - fixed commit message in PATCH 2/4
>> - added test case for error message obtained using CURLOPT_ERRORBUFFER option
>> 
>> Branch:  https://github.com/tarantool/tarantool/tree/i.kosarev/gh-4232-curlopt-accept-encoding
>> Issues:  https://github.com/tarantool/tarantool/issues/4232
>>  https://github.com/tarantool/tarantool/issues/4578
>>  https://github.com/tarantool/tarantool/issues/4579
>> 
>> Ilya Kosarev (4):
>>   http: add CURLOPT_ACCEPT_ENCODING option
>>   http: remove redundant & incorrect case for curl_request code
>>   http: enrich httpc_request with curl error message buffer
>>   http: add CURLE_BAD_CONTENT_ENCODING case for curl_request code
>> 
>>  src/httpc.c                       | 24 +++++++++++++++++++-----
>>  src/httpc.h                       | 29 +++++++++++++++++++++++++++++
>>  src/lua/httpc.c                   |  9 +++++++++
>>  src/lua/httpc.lua                 |  2 ++
>>  test/app-tap/http_client.test.lua |  4 +++-
>>  5 files changed, 62 insertions(+), 6 deletions(-)
>> 
>> -- 
>> 2.17.1
>> 


-- 
Ilya Kosarev

[-- Attachment #2: Type: text/html, Size: 3965 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-11-07 12:07 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-28 17:11 [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements Ilya Kosarev
2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 1/4] http: add CURLOPT_ACCEPT_ENCODING option Ilya Kosarev
2019-10-28 23:42   ` Alexander Turenko
2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 2/4] http: remove redundant & incorrect case for curl_request code Ilya Kosarev
2019-10-28 23:42   ` Alexander Turenko
2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 3/4] http: enrich httpc_request with curl error message buffer Ilya Kosarev
2019-10-28 23:41   ` Alexander Turenko
2019-11-07 12:07     ` Ilya Kosarev
2019-10-28 17:11 ` [Tarantool-patches] [PATCH v4 4/4] http: add CURLE_BAD_CONTENT_ENCODING case for curl_request code Ilya Kosarev
2019-10-28 23:42   ` Alexander Turenko
2019-10-28 23:43 ` [Tarantool-patches] [PATCH v4 0/4] http: add CURLOPT_ACCEPT_ENCODING option and following improvements Alexander Turenko
2019-11-07 12:07   ` Ilya Kosarev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox