Date: Fri, 5 Apr 2019 22:48:15 +0300
From: Konstantin Osipov
Subject: [tarantool-patches] Re: [PATCH 2/2] sql: make aggregate functions types more strict
Message-ID: <20190405194815.GH3789@chai>
References: <49e4ae0bc187dc02f908427692c0ddb2cc2d36a8.1554475881.git.ivan.koptelov@tarantool.org>
In-Reply-To: <49e4ae0bc187dc02f908427692c0ddb2cc2d36a8.1554475881.git.ivan.koptelov@tarantool.org>
Reply-To: tarantool-patches@freelists.org
List-Id: tarantool-patches
To: tarantool-patches@freelists.org
Cc: korablev@tarantool.org, Ivan Koptelov

* Ivan Koptelov [19/04/05 18:02]:
> +/*
> + * This structure is for keeping context during work of
> + * aggregate function.
> + */
> +struct aggregate_context {
> +	/** Value being aggregated. (e.g. current MAX or current counter value). */
> +	Mem value;
> +	/** Reference value to keep track of previous argument's type. */
> +	Mem reference_value;
> +};

Why not call this struct agg_value?
Besides, keeping a reference to the previous argument is
overkill. Why not keep a type instead: initialize it to
FIELD_TYPE_SCALAR and change it to a more specific type after the
first assignment?

> +	} else {
> +		diag_set(ClientError, ER_INCONSISTENT_TYPES,
> +			 "INTEGER or FLOAT", mem_type_to_str(argv[0]));
> +		context->fErrorOrAux = 1;
> +		context->isError = SQL_TARANTOOL_ERROR;

This message would look confusing. Could we get rid of "or" in
the message and be more specific about what is inconsistent?

> +	if (sql_type != ref_sql_type) {
> +		is_compatible = false;
> +		if ((sql_type == SQL_INTEGER || sql_type == SQL_FLOAT) &&
> +		    (ref_sql_type == SQL_INTEGER ||
> +		     ref_sql_type == SQL_FLOAT)) {
> +			is_compatible = true;

This is a very hot path, and doing so much work to check
compatibility is a) clumsy to read, b) slow, c) hard to maintain.
Please use a compatibility matrix statically defined as an 8x8
bitmap.

Besides, I guess you can get rid of this check for the most
common case, averaging a column whose values all have the same
type. So it is perhaps better to make this check a separate
opcode, not part of the main opcode, and emit it only when we're
not sure the type is going to be the same across all values. I
don't know how hard this is to do, however. Perhaps it should be
moved into a separate patch, but I'd guess detecting that the
aggregate function argument has a non-mutable type is not hard.

-- 
Konstantin Osipov, Moscow, Russia, +7 903 626 22 32
http://tarantool.io - www.twitter.com/kostja_osipov
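
For illustration, the two suggestions above (tracking a type
instead of a reference value, and a statically defined 8x8
compatibility bitmap) could be sketched roughly as follows. This
is only a sketch under assumptions: enum agg_type, agg_compat,
struct agg_state and both functions are hypothetical names, not
part of the patch or of Tarantool, and the rows simply mirror the
rule in the quoted hunk (equal types, or INTEGER mixed with
FLOAT).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical type tags for the sketch. The patch itself works
 * with SQL_INTEGER, SQL_FLOAT and friends; these stand in for them.
 */
enum agg_type {
	AGG_TYPE_SCALAR = 0,	/* nothing seen yet, any type is accepted */
	AGG_TYPE_INTEGER,
	AGG_TYPE_FLOAT,
	AGG_TYPE_TEXT,
	AGG_TYPE_BLOB,
	agg_type_MAX		/* must not exceed 8: one bit per column */
};

#define AGG_BIT(t) (1u << (t))

/*
 * 8x8 bitmap: the row is the type tracked so far, bit N of the row
 * is set when an incoming argument of type N is acceptable.
 * Rows not listed here are zero, i.e. incompatible with everything.
 */
static const uint8_t agg_compat[8] = {
	[AGG_TYPE_SCALAR]  = AGG_BIT(AGG_TYPE_INTEGER) |
			     AGG_BIT(AGG_TYPE_FLOAT) |
			     AGG_BIT(AGG_TYPE_TEXT) |
			     AGG_BIT(AGG_TYPE_BLOB),
	[AGG_TYPE_INTEGER] = AGG_BIT(AGG_TYPE_INTEGER) |
			     AGG_BIT(AGG_TYPE_FLOAT),
	[AGG_TYPE_FLOAT]   = AGG_BIT(AGG_TYPE_INTEGER) |
			     AGG_BIT(AGG_TYPE_FLOAT),
	[AGG_TYPE_TEXT]    = AGG_BIT(AGG_TYPE_TEXT),
	[AGG_TYPE_BLOB]    = AGG_BIT(AGG_TYPE_BLOB),
};

/* One array load and one mask test instead of a chain of comparisons. */
static bool
agg_type_is_compatible(enum agg_type seen, enum agg_type incoming)
{
	assert((unsigned)seen < 8 && (unsigned)incoming < 8);
	return (agg_compat[seen] & AGG_BIT(incoming)) != 0;
}

/*
 * Keeps only a type, not a whole reference value. The aggregated
 * value itself (a Mem in the patch) is left out of the sketch.
 */
struct agg_state {
	enum agg_type type;
};

/*
 * Check an incoming argument type against the tracked one. On the
 * first accepted argument the tracked type narrows from SCALAR to
 * the argument's type; afterwards it is left unchanged.
 */
static bool
agg_state_accept(struct agg_state *state, enum agg_type incoming)
{
	if (!agg_type_is_compatible(state->type, incoming))
		return false;
	if (state->type == AGG_TYPE_SCALAR)
		state->type = incoming;
	return true;
}
```

With a state initialized to AGG_TYPE_SCALAR, feeding INTEGER and
then FLOAT is accepted, while a later TEXT argument is rejected.
Whether the tracked type should widen from INTEGER to FLOAT in
that case is a design decision the sketch does not try to settle.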