<HTML><BODY><div> <blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;">Monday, 24 February 2020, 15:31 +03:00, from Georgy Kirichenko <kirichenkoga@gmail.com>:<br> <div id=""><div class="js-helper js-readmsg-msg"><style type="text/css"></style><div><div id="style_15825475081363639705_BODY"><div class="class_1582738019"><div dir="ltr"><div>Please read messages before answering. I never said that: </div><div>> You've been suggesting that filtering on the master is safer.</div><div>I said it is safer to do it on the replica side, and the replica should not rely on master correctness.</div><div> </div><div>> I pointed out it's not, there is no way to guarantee</div>(even in theory) correctness/safety of replica if master is<br>malfunctioning.<div>Excuse me, but this is demagogy: we are talking about what is safer, not about absolute safety.</div><div>>The situation is symmetrical. Both peers do not have the whole</div>>picture. You can make either of the peers responsible for the<br>>decision, then the other peer will need to supply the missing<br>>bits.<div>No, you are wrong. A master has only one information source about the stream it should send to a replica, whereas</div><div> a replica could connect to many masters to fetch proper data (from one or many masters). And we have already implemented similar logic - </div><div>a voting protocol, and you should know about it. Additionally, my approach allows collecting all the corresponding logic, such as filtering</div><div> of concurrent streams, vclock following, subscriptions and replication groups (which are not implemented yet), registration and whatever else, in one module on the replica side.</div><div><div>>I do not think the scope of this issue has ever been protecting</div>>against hacked masters. It has never been a goal of the protocol<br>>either.</div><div>A hacked master could be a master with an implementation error, and we should be able to detect such an error as soon as possible. 
But if a replica does not</div><div>check an incoming stream, there is no way to prevent fatal data loss.</div><div>>This was added for specific reasons. There is no known reason the</div>>master should send unnecessary data to replica or replica fast<br>>path should get slower.<div>I am afraid you did not understand me. I never said that I am against any optimization which could make replication faster.</div><div>I am completely against any attempt to rely on optimization logic. If a master is allowed to skip unneeded rows, the replica should not rely on the correctness of that code.</div><div> In other words, if some input stream could break a replica, the replica should protect itself against such data. This is not the responsibility of the replica's master.</div></div></div></div></div></div></div></blockquote></div><div> </div><div>Hi! I’ve just sent v4, which is closest to what Kostja wants to see,</div><div>as far as I understood, at least.</div><div>Please check it out and tell me what you think.</div><div>Sorry, but I didn’t fully understand your proposal. It looks like you stand</div><div>for v2 of the patch, where all the filtering is done on the replica side.<br>Is that right?</div><div> </div><div>--</div><div>Serge Petrenko</div><div> </div><div> </div><div><blockquote style="border-left:1px solid #0857A6; margin:10px; padding:0 0 0 10px;"><div><div class="js-helper js-readmsg-msg"><div><div><div class="class_1582738019"><div class="gmail_quote_mailru_css_attribute_postfix"><div class="mail-quote-collapse"><div class="gmail_attr_mailru_css_attribute_postfix" dir="ltr"><span data-email="kostja.osipov@gmail.com" data-name="Konstantin Osipov" data-quote-id="1699421387843403462" data-timestamp="1582539480" data-type="sender">Mon, 24 Feb 2020 
at 13:18, Konstantin Osipov <<a href="//e.mail.ru/compose/?mailto=mailto%3akostja.osipov@gmail.com" rel="noopener noreferrer">kostja.osipov@gmail.com</a>>:</span></div><div data-quote-id="1699421387843403462" data-type="body"><blockquote class="gmail_quote_mailru_css_attribute_postfix" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">* Georgy Kirichenko <<a href="//e.mail.ru/compose/?mailto=mailto%3akirichenkoga@gmail.com" rel="noopener noreferrer">kirichenkoga@gmail.com</a>> [20/02/23 12:21]:<br><br>> Please do not think you are the only person who knows about Byzantine faults.<br>> Also there is little relevance between Byzantine faults and my suggestion to<br>> enforce replica-side checking.<br><br>You've been suggesting that filtering on the master is safer. I<br>pointed out it's not, there is no way to guarantee<br>(even in theory) correctness/safety of replica if master is<br>malfunctioning.<br><br>I merely pointed out that your safety argument has no merit.<br><br>There are no other practical advantages of filtering on replica<br>either: there is a disadvantage, more traffic and more filtering work to do<br>inside tx thread (as opposed to relay/wal thread if done on<br>master).<br><br>It is also against the current responsibilities of IPROTO_SUBSCRIBE: the<br>concept of a subscription is that replica specifies what it is<br>interested in. 
Specifically, it specifies vclock components it's interested in.<br>You suggest making the replica responsible for<br>submitting its vclock, but having the master decide what to do with it -<br>this splits the decision making logic between the two, making the<br>whole thing harder to understand.<br><br>IPROTO_SUBSCRIBE responsibility layout today is typical for a<br>request-response protocol: the master, being the server, executes<br>the command as specified by the client (the replica), and the<br>replica runs the logic to decide what command to issue.<br><br>You suggest changing it because of some theoretical concerns you<br>have.<br><br>> In any case filtering on the master side is the worst thing we could do.<br>> In this case master has only one peer and has no chance to make a proper<br>> decision if replica is broken. And we have no chance to know about it (except<br>> asserts, which are excluded from release builds, or panic messages). For<br>> instance if master skipped some rows then there are no traces of the<br>> situation we could detect.<br><br>The situation is symmetrical. Both peers do not have the whole<br>picture. You can make either of the peers responsible for the<br>decision, then the other peer will need to supply the missing<br>bits. There is no way you can make it safer by changing who makes<br>the decision, but you can certainly make it more messed up by<br>splitting this logic or going against an established layout.<br><br>If you have a specific example why things will improve if done<br>otherwise - in the number of packets, or traffic, or some other<br>measurable way, you should point it out.<br><br>> In the opposite case a replica could connect to as many masters as it needs<br>> to filter out all invalid data or hacked masters. At least we could enforce<br>> replication stream meta checking.<br><br>I do not think the scope of this issue has ever been protecting<br>against hacked masters. 
It has never been a goal of the protocol<br>either.<br><br>> Two major points I would like to mention are:<br>> 1. Replica could consistently follow all vclock members and apply all<br>> transactions without gaps (I already got rid of them, I hope you remember)<br>> 2. Replica could protect itself against concurrent local writes (one was made<br>> locally, the second one is returned from master)<br><br>This was added for specific reasons. There is no known reason the<br>master should send unnecessary data to replica or replica fast<br>path should get slower.<br><br>--<br>Konstantin Osipov, Moscow, Russia<br><a href="https://scylladb.com" rel="noopener noreferrer" target="_blank">https://scylladb.com</a></blockquote></div></div></div></div></div></div></div></div></blockquote> <div> </div></div></BODY></HTML>