1 2015-11-02 00:41:44 <xMopxShell> jgarzik: fixed a thing in your lib. https://github.com/jgarzik/python-bitcoinrpc/pull/55
  2 2015-11-02 04:42:28 <Luke-Jr> hmm, jl2012's post is making me second-guess the BIP113-is-a-hardfork conclusion
  3 2015-11-02 04:45:56 <sipa> link?
  4 2015-11-02 04:48:58 <Luke-Jr> http://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-November/011648.html
  5 2015-11-02 04:49:25 <gmaxwell> I also pointed this out on the PR.
  6 2015-11-02 04:49:39 <gmaxwell> We swapped the comparison signs for some reason.
  7 2015-11-02 04:50:18 <gmaxwell> Locktimes have to be higher than the block times, not the other way around.
  8 2015-11-02 04:51:01 <Luke-Jr> eh, I'm confused now.
  9 2015-11-02 04:51:14 <Luke-Jr> locktime needs to be lower, no?
 10 2015-11-02 04:52:03 <gmaxwell> Right.
 11 2015-11-02 04:52:46 <gmaxwell> MTP = 100,  TX = 150,  Block=200.  TX is valid under current rules because 200>150.  It's not valid yet under MTP because 100<150.
 12 2015-11-02 04:54:31 <sipa> agree
 13 2015-11-02 04:54:32 <Luke-Jr> so BIP 113 is fine after all, and somehow I managed to confuse the comparison contagiously this morning. :/
 14 2015-11-02 04:54:37 <sipa> i swapped it too
 15 2015-11-02 04:56:12 <gmaxwell> Time to undo my revert.
 16 2015-11-02 04:56:26 <wumpus> I think it's too late for that
 17 2015-11-02 04:56:50 <sipa> how so?
 18 2015-11-02 04:57:00 <wumpus> apparently we're not quite sure this is correct; we shouldn't have merged it in the first place
 19 2015-11-02 04:57:07 <gmaxwell> great, I'll just never report a potential problem again. problem solved.
 20 2015-11-02 04:57:21 <sipa> i'm convinced it is correct now
 21 2015-11-02 04:58:00 <Luke-Jr> I agree it's scary that today happened, but I don't know that we can possibly get higher QA than we had, on anything.
 22 2015-11-02 04:58:03 <sipa> not sure how i misread it even; the code is obvious
 23 2015-11-02 04:58:05 <Luke-Jr> we're hitting the limits of practicality.
 24 2015-11-02 04:58:18 <wumpus> at least i think we should wait a bit before doing this again. It seems too scary
 25 2015-11-02 04:58:29 <wumpus> we almost had an inadvertent hardfork
 26 2015-11-02 04:58:40 <gmaxwell> Except we didn't.
 27 2015-11-02 04:58:51 <gmaxwell> (didn't almost, I mean)
 28 2015-11-02 04:58:52 <wumpus> yeah near-miss / near-hit ...
 29 2015-11-02 05:00:26 <gmaxwell> wumpus: luke was confused and confused sipa and I didn't question it seriously enough until later because I was so freaked out by what I thought was my error. As soon as I sat down and thought it through I realized it was wrong.
 30 2015-11-02 05:01:28 <gmaxwell> Also, even if this were wrong, it was mempool only and wouldn't have been a hardfork.
 31 2015-11-02 05:01:36 <wumpus> I understand. I have no solution for this either :-/ Just seems too risky at some point to change even anything :(
 32 2015-11-02 05:02:05 <gmaxwell> At worst it would have caused a DOS when createnewblock failed and crashed all the miners with this code. :)
 33 2015-11-02 05:02:09 <sipa> in IsFinalTx there is "if tx.nLockTime < (expression representing block time) return true;"... this PR decreases the value of that expression, so it can only stop returning true
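The comparison sipa quotes can be checked mechanically. Below is a minimal Python sketch (not Bitcoin Core code; the function name and values are illustrative) of why replacing the block-time cutoff with the smaller MTP can only make a transaction stop being final, never start:

```python
# Sketch of the IsFinalTx comparison quoted above:
# "if tx.nLockTime < (expression representing block time) return true"
def is_final(locktime, cutoff):
    return locktime < cutoff

mtp, block_time = 100, 200   # consensus guarantees MTP is below the block time
assert mtp <= block_time

# BIP113 lowers the cutoff from block_time to mtp.  Since mtp <= block_time,
# anything final under the new rule was already final under the old one,
# i.e. the change can only tighten:
for locktime in range(300):
    if is_final(locktime, mtp):
        assert is_final(locktime, block_time)

# gmaxwell's example: MTP = 100, locktime = 150, block = 200
assert is_final(150, 200)       # valid under the current rule (150 < 200)
assert not is_final(150, 100)   # not yet valid under MTP (not 150 < 100)
```

The loop is the whole argument: lowering the right-hand side of a `<` comparison can only flip results from true to false.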
 34 2015-11-02 05:02:23 <Luke-Jr> IMO lesson should be to not be so quick to revert things, especially when they have lots of qualified ACKs.
 35 2015-11-02 05:02:31 <wumpus> I mean, what if we now decide it is safe, but a day later there's yet another problem
 36 2015-11-02 05:03:07 <gmaxwell> Luke-Jr: I disagree.
 37 2015-11-02 05:03:12 <sipa> how about just seeing this as a recognition that review sometimes fails
 38 2015-11-02 05:03:31 <sipa> but we're rather overly cautious than the other way around
 39 2015-11-02 05:03:43 <Luke-Jr> gmaxwell: had we given it 24 hours to think on, you and jl2012 would have noticed before the revert happened..
 40 2015-11-02 05:03:54 <gmaxwell> Luke-Jr: I thought it was good to revert it soon because under that misunderstanding git master would randomly crash when mining.
 41 2015-11-02 05:04:11 <gmaxwell> Luke-Jr: sure, but no harm in reverting unless the consequence is that we won't put it back after more consideration.
 42 2015-11-02 05:04:11 <wumpus> right, since it was still mempool only it couldn't cause a hardfork, so it wasn't that urgent yet
 43 2015-11-02 05:04:28 <Luke-Jr> people should not be mining on git master O.o
 44 2015-11-02 05:04:32 <wumpus> but it looked like a panic so I merged it immediately
 45 2015-11-02 05:04:41 <gmaxwell> I think wumpus did right.
 46 2015-11-02 05:05:06 <gmaxwell> wumpus: yea, that's part of the motivation of mempool only; to get things in use before they are a consensus rule, so if something is wrong it isn't the end of the world.
 47 2015-11-02 05:05:07 <wumpus> I was just awake, wasn't aware of the whole context, and these things give me nightmares
 48 2015-11-02 05:05:36 <sipa> i think it's perfectly reasonable to revert given doubt
 49 2015-11-02 05:05:42 <Belxjander> is there any documented list of the consensus rules outside the codebase ?
 50 2015-11-02 05:05:49 <sipa> Belxjander: no
 51 2015-11-02 05:05:57 <gmaxwell> Luke-Jr: as far as "can possibly get a higher"; well if this had better tests my response would have been "Then how do the tests pass?"
 52 2015-11-02 05:06:04 <sipa> but it's also perfectly reasonable to revert the revert now the doubt has disappeared
 53 2015-11-02 05:06:10 <sipa> i had never reviewed this code, i think
 54 2015-11-02 05:06:27 <wumpus> I don't think this revert should mean we should never put it back after consideration, but it is a warning to be careful and not over-hasty
 55 2015-11-02 05:07:12 <gmaxwell> I complained about this code being hard to review when it went up; so when sipa said it was wrong and gave a plausible explanation, I went and saw there were no tests that would preclude that, and wrote the revert so we wouldn't end up with some genius mining on master having a bad day. :)
 56 2015-11-02 05:07:37 <wumpus> to me this is another signal that BIP113, in the current implementation, isn't ready for primetime yet
 57 2015-11-02 05:08:29 <gmaxwell> wumpus: I don't think so-- I mean the confusion here wasn't in 113 itself.
 58 2015-11-02 05:09:37 <gmaxwell> I was hasty because I thought I must not have reviewed it sufficiently because I did not like the use of max() on flags, and I was ashamed of doing a bad job and wanted to fix it ASAP.
 59 2015-11-02 05:10:20 <gmaxwell> That's basically all that went through my mind when sipa said he thought it was wrong; that sort of thing gives me nightmares too.
 60 2015-11-02 05:10:45 <Luke-Jr> it presumably received code-review ACKs from: CodeShark, btcdrak, rusty, instagibbs, jmcorgan, afk11, rubensayshi, petertodd, jtimon, and myself. I guess it could have waited for more, but that's not a trivial amount of ACKs, even if we ignore the names I don't recognise..
 61 2015-11-02 05:10:53 <wumpus> sounds like a reasonable issue - probably should have been fixed before merging
 62 2015-11-02 05:12:05 <wumpus> (though the flags were hardcoded at this point so at least that couldn't have given issues yet)
 63 2015-11-02 05:12:13 <gmaxwell> Well for things like that I dunno when I'm being picky vs it being an actual issue. But in any case, it was unrelated to the issue here. The confusion that hit sipa and Luke-Jr was just a fundamental confusion with respect to how locktime was working; it was not the fault of this code, which couldn't have possibly been more clear on this point.
 64 2015-11-02 05:14:05 <gmaxwell> wumpus: yes, the flags are just static, so that code was doing nothing. I only mentioned it because I know from experience that I review less after hitting the first nit. (usually because I expect the nit to get fixed, and then I review again.) So I thought I must have done that here.
 65 2015-11-02 05:16:19 <Luke-Jr> (FWIW, my confusion came 10 days after carefully reviewing and ACKing it, and I was biased toward finding it when I actually tried to confirm it in the code.)
 66 2015-11-02 05:17:02 <gmaxwell> In any case, as a matter of principle, we shouldn't avoid undoing an over-eager revert; because we'd rather be over-eager with reverts than not. We shouldn't give ourselves another reason to not revert something.
 67 2015-11-02 05:17:44 <wumpus> gmaxwell: yeah we're all imperfect with reviewing, the only hope is that having lots of people look at it will increase the coverage to a point where the risk is acceptable...
 68 2015-11-02 05:18:12 <gmaxwell> It's a positive sign, I guess, if we get some false positives on code being wrong.
 69 2015-11-02 05:18:24 <wumpus> at least people are looking!
 70 2015-11-02 05:18:45 <gmaxwell> not just looking but willing to call out doubts!
 71 2015-11-02 05:44:59 <remiah> thats a coinya
 72 2015-11-02 05:47:49 <phantomcircuit> gmaxwell, possibly the logic should keep the check to explicitly require the locktime be less than the block time for clarity (yes it's checking twice then)
 73 2015-11-02 08:16:25 <btcdrak> wumpus: I think the lesson learned is about being overhasty the other way. There was a group hallucination that there was a problem that didn't really exist. We can't behave as if we just narrowly escaped an inadvertent hardfork. We had a bad dream, and now we woke up from it; nothing happened.
 74 2015-11-02 08:18:20 <phantomcircuit> btcdrak, the logic is confusing, which is a bug
 75 2015-11-02 08:18:28 <phantomcircuit> it needs to be obvious that there isn't a bug
 76 2015-11-02 08:18:40 <btcdrak> I think we could probably do with some better comments in the code.
 77 2015-11-02 08:18:45 <moa> unsubscribing from the hallucination group
 78 2015-11-02 08:19:17 <btcdrak> but we should unrevert the revert and add some explicit comments, It will be useful for people in the future anyhow to understand the code.
 79 2015-11-02 08:22:50 <gmaxwell> phantomcircuit: I don't think the change itself was confusing.
 80 2015-11-02 08:29:18 <sipa> ... i still have difficulty reasoning about it
 81 2015-11-02 08:30:10 <sipa> somehow, every time i start thinking about it, it seems that by making block time stamps earlier, it's going to allow transactions to unlock sooner
 82 2015-11-02 08:30:35 <sipa> by doing the math, and looking at the code, i am convinced the change is safe
 83 2015-11-02 08:30:43 <sipa> but my intuition still says the opposite
 84 2015-11-02 08:33:07 <gmaxwell> sipa: do you also think that if I set your clock earlier that you will think it is time to go home from work earlier?
 85 2015-11-02 08:34:23 <phantomcircuit> gmaxwell, it's only obvious if you're thinking about the median time of the previous 11 blocks being the minimum block time
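The quantity phantomcircuit refers to is simple to state: MTP is the median of the previous 11 block timestamps, and consensus requires a new block's timestamp to be strictly greater than it. A toy sketch with illustrative values:

```python
# Median time past: the median of the previous 11 block timestamps.
def median_time_past(last_11_timestamps):
    s = sorted(last_11_timestamps)
    return s[len(s) // 2]   # median of 11 values = the 6th smallest

times = [100, 105, 101, 110, 99, 103, 107, 102, 108, 104, 106]
mtp = median_time_past(times)
assert mtp == 104

# Consensus requires the next block's timestamp to exceed MTP, so MTP is
# a lower bound on (and typically well below) the current block's time.
assert max(times) > mtp
```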
 86 2015-11-02 08:34:32 <CodeShark> does sipa ever go home from work? :)
 87 2015-11-02 08:34:41 <sipa> gmaxwell: are you saying there is an hour at which i can stop working whatsoever?
 88 2015-11-02 08:35:04 <gmaxwell> I didn't say anything about stopping working, I said go home.
 89 2015-11-02 08:35:16 <gmaxwell> you know, because the cleaning people show up and are distracting.
 90 2015-11-02 08:35:46 <sipa> gmaxwell: of course not; but it does require a mental "wait, why did that work again?" every time
 91 2015-11-02 08:35:52 <gmaxwell> Some have hypothesized that life continues if you stop working, but this sounds like too dangerous an experiment to me. :)
 92 2015-11-02 08:39:36 <sipa> anyway, i'm bad with time. i can't even read an analogue clock, you shouldn't trust me to review time things!
 93 2015-11-02 08:39:49 <wumpus> btcdrak: agreed - I didn't mean to say we narrowly escaped from a hard fork, but it was a good reminder that it can happen
 94 2015-11-02 08:41:37 <CodeShark> sipa: I think the key intuition here is that the locktime cannot be sandwiched between blocktime and mtp
 95 2015-11-02 08:42:30 <CodeShark> then to consider the symmetry of swapping mtp and blocktime
 96 2015-11-02 08:42:45 <sipa> CodeShark: of course it can be, but in the safe direction, not the unsafe one
 97 2015-11-02 08:42:57 <CodeShark> right :)
 98 2015-11-02 08:43:21 <wumpus> gmaxwell: hah so risk-averse
 99 2015-11-02 08:44:07 <CodeShark> in any case, it's only the cases where it's sandwiched that are interesting :)
100 2015-11-02 09:18:00 <BlueMatt> phantomcircuit: so I had never gotten around to rebasing the mutate-to-low-s branch to mempool limiting...was it you who was asking for that? anyway, its there now...
101 2015-11-02 09:18:49 <phantomcircuit> BlueMatt, yes it was, same branch?
102 2015-11-02 09:19:54 <BlueMatt> yea, "seed"
103 2015-11-02 10:26:18 <jtimon> btcdrak: +1 on group hallucination, after Luke-Jr sipa and gmaxwell said there was a problem, I was more in a hurry to understand some deployment consequences than in actually understanding why it was a problem: "I can do that tomorrow", I thought. My learned lesson is that I probably trust these guys too much :p
104 2015-11-02 10:29:33 <btcdrak> jtimon: Remember the words of Ronald Reagan, "Trust but verify!" :))
105 2015-11-02 10:30:42 <btcdrak> anyway, no harm done, and better to err on the side of caution. for me this incident gave me more confidence that people are looking deeply to find issues even post merge.
106 2015-11-02 10:33:59 <jtimon> yep, that's what I thought "sure let's revert and think about this again" when apparently there was a problem, better safe than sorry
107 2015-11-02 11:10:30 <JWU42> bitcoind: main.cpp:3882: void ProcessGetData(CNode*): Assertion `!"cannot load block from disk"' failed.
108 2015-11-02 11:10:40 <JWU42> most likely a HDD issue ?
109 2015-11-02 11:11:01 <JWU42> smart details are all OK
110 2015-11-02 11:11:11 <JWU42> TIA
111 2015-11-02 11:13:40 <JWU42> ok - google agrees - bad blockchain and/or disk issues
112 2015-11-02 11:50:34 <phantomcircuit> JWU42, operating system/
113 2015-11-02 11:50:35 <phantomcircuit> ?
114 2015-11-02 11:51:15 <JWU42> phantomcircuit: linux (ubuntu LTS)
115 2015-11-02 11:51:40 <JWU42> it is a dedicated box that has been running well for over a year but has started to show problems with corruption the last 1-2 months
116 2015-11-02 11:52:05 <JWU42> now it is this issue (after rebuilding the DB 2-3 months back)
117 2015-11-02 11:52:25 <phantomcircuit> JWU42, hardware issue for sure
118 2015-11-02 11:52:38 <JWU42> lovely
119 2015-11-02 11:52:40 <JWU42> =)
120 2015-11-02 11:52:50 <JWU42> thanks for the thoughts
121 2015-11-02 11:52:52 <phantomcircuit> lol yeah
122 2015-11-02 11:53:04 <JWU42> now to argue with the provider
123 2015-11-02 11:53:09 <JWU42> again, thanks
124 2015-11-02 11:57:36 <wumpus> what version of bitcoin core?
125 2015-11-02 11:59:10 <wumpus> it's most likely a hw issue, although in the past there was a bug where overlapping blocks could be written after a reindex, but if you're using an up-to-date version that shouldn't happen
126 2015-11-02 12:43:31 <instagibbs> I wasn't around this weekend much, but I should have realized that the tests I ran for mtp disproved the hardfork theory. Oh well.
127 2015-11-02 13:03:11 <jgarzik> instagibbs, mtp?
128 2015-11-02 13:08:51 <phantomcircuit> jgarzik, bip 113 stuff
129 2015-11-02 14:30:51 <mcelrath> No one had any comments on my proposal to validate (level)DB correctness using UTXO set commitment hashes?  Good/bad/indifferent?  Would this help us get a db implementation out of the core or e.g. @gmaxwell do you think it would still be required?
130 2015-11-02 14:32:04 <mcelrath> FWIW this kind of computation can be used to validate correctness, whether or not the hashes are broadcast in blocks.
131 2015-11-02 14:32:06 <sipa> mcelrath: we already have gettxoutsetinfo which reports such a hash
132 2015-11-02 14:33:06 <sipa> and i don't think it matters... it's an extra warning layer, but not a replacement for avoiding unnecessary risk
133 2015-11-02 14:34:26 <mcelrath> Oh interesting, didn't know about gettxoutsetinfo...
134 2015-11-02 14:35:00 <sipa> it is not usable as a commitment scheme because it's horribly slow
135 2015-11-02 14:35:23 <sipa> but it can be (and has been) used to identify corruption
136 2015-11-02 14:35:31 <mcelrath> So given multiple pluggable db backends with acceptable performance, what will happen?  Will we allow user selection at ./configure time?  Will we switch from leveldb and import an entire db codebase into the core?
137 2015-11-02 14:36:08 <sipa> i consider multiple pluggable db backends to be unnecessary risk
138 2015-11-02 14:36:21 <wumpus> unless there is overwhelming evidence that some other database works better, we'll just stick with leveldb
139 2015-11-02 14:36:28 <sipa> unless there is not one database that can perform adequately
140 2015-11-02 14:37:25 <mcelrath> I was thinking LMDB which runs in 64-bit only mode with some speed advantage, and another db for the raspberry pi users...
141 2015-11-02 14:37:30 <mcelrath> (for instance)
142 2015-11-02 14:37:41 <wumpus> have you profiled lmdb with bitcoin?
143 2015-11-02 14:38:08 <mcelrath> I'm considering throwing some time at that this week.  But I don't want to waste my time if everyone is going to put the kibosh on the idea.
144 2015-11-02 14:38:09 <wumpus> if not, please don't make statements, bitcoind's use pattern is kind of different from the average micro benchmark
145 2015-11-02 14:39:08 <sipa> if it turns out to be unreasonably much faster, i think it is something worth considering (but LMDB has other downsides too, like no checksums)
146 2015-11-02 14:39:10 <wumpus> well if you do it do it as an experiment, not with the expectation that it will be merged any time soon. It's nice to be able to compare databases.
147 2015-11-02 14:39:38 <mcelrath> Of course it's an experiment.
148 2015-11-02 14:39:53 <mcelrath> Not having db corruption so often is reason enough to proceed with the experiment.
149 2015-11-02 14:40:28 <sipa> i have never ever (as in: at all) seen leveldb corrupt on a system of mine, and i reindex a lot
150 2015-11-02 14:40:34 <wumpus> are you having db corruption often?
151 2015-11-02 14:40:57 <sipa> i've done tests that include ripping the power on a running system
152 2015-11-02 14:40:58 <wumpus> I have had corruption but it always turned out to be due to faulty hardware
153 2015-11-02 14:41:25 <sipa> the only time i have seen corruption is when i wrote a script that replaced random bytes in the middle of db files
154 2015-11-02 14:41:25 <wumpus> if you are having problems on windows help test: https://github.com/bitcoin/bitcoin/pull/6917
155 2015-11-02 14:41:35 <jgarzik> I've never had leveldb corruption that I could successfully blame on leveldb
156 2015-11-02 14:41:39 <mcelrath> I have had corruption that I was able to trace to faulty hardware.  I've had other corruption that I wasn't able to identify the source.  And windows users complain a lot about corruption from what I've seen.
157 2015-11-02 14:41:52 <jgarzik> Never at home, and occasionally on a cheap VPS where VPS is most likely culprit
158 2015-11-02 14:41:54 <wumpus> mcelrath: if you are a windows user, please test https://github.com/bitcoin/bitcoin/pull/6917
159 2015-11-02 14:42:07 <wumpus> (executables can be found in that thread, too)
160 2015-11-02 14:42:12 <mcelrath> Nice
161 2015-11-02 14:43:01 <jgarzik> After that one leveldb version fix upstream, most corruption reports externally seem to be faulty hardware
162 2015-11-02 14:43:21 <wumpus> yes, either faulty hardware or windows-and-pulled the plug
163 2015-11-02 14:43:56 <mcelrath> I'm actually pretty concerned about faulty hardware and adding methods to the core that can separate faulty hardware from blockchain forks, and inform the operator.
164 2015-11-02 14:43:57 <wumpus> of which the second problem should be solved by #6917, haven't managed to cause any leveldb corruption on a crash after that
165 2015-11-02 14:44:13 <mcelrath> wumpus: that's awesome
166 2015-11-02 14:44:15 <wumpus> mcelrath: that's exactly what leveldb does now - it detects the corruption and tells the user
167 2015-11-02 14:44:25 <wumpus> lmdb wouldn't, for example.
168 2015-11-02 14:44:41 <wumpus> leveldb checks CRCs on *everything*
169 2015-11-02 14:44:50 <mcelrath> That's a very nice feature.
170 2015-11-02 14:44:51 <wumpus> (at least in the way we use it)
171 2015-11-02 14:45:05 <wumpus> really, leveldb is good software
172 2015-11-02 14:45:16 <mcelrath> But CRC checks could be added for any db by the caller.
173 2015-11-02 14:45:20 <wumpus> you have to be really good to beat it
174 2015-11-02 14:45:27 <mcelrath> wumpus: that doesn't seem to be the balance of opinion about leveldb ;-)
175 2015-11-02 14:45:47 <wumpus> mcelrath: it's the base of many other databases and sw used in production at companies
176 2015-11-02 14:46:39 <wumpus> unless you did research in databases and completely understand the implementations and implications, I'm not really interested in balance of opinions
177 2015-11-02 14:46:42 <mcelrath> So the second complaint everyone makes about leveldb is that it's unmaintained.  What's your opinion on that?  (is it a problem)
178 2015-11-02 14:47:27 <wumpus> is that a problem?
179 2015-11-02 14:47:43 <mcelrath> yes
180 2015-11-02 14:47:56 <wumpus> depends on whether we can fix issues as they come up
181 2015-11-02 14:48:00 <wumpus> a sample size of one says: yes
182 2015-11-02 14:53:34 <wumpus> again, if you have overwhelming evidence that another specific database, which is better maintained, works better in our load patterns that would be great, switching databases at some point in the future for a good reason is open - I'm not wedded to leveldb. But if not I don't see the point of even discussing it...
183 2015-11-02 14:53:35 <sipa> i have to admit i got into an unreasonable "leveldb is terrible, we must go find a replacement" mentality myself, and started assuming some unicorn database would actually exist that solves all our problems
184 2015-11-02 14:54:30 <jgarzik> RE maintenance:  It is a problem in theory.  In reality, last time we had a Really Big problem, we banged a drum and the maintainers were willing to help with a fix.
185 2015-11-02 14:54:47 <sipa> time is better spent on actually solving issues that pop up - something we are certainly capable of to some extent
186 2015-11-02 14:55:00 <sipa> and yes, we can keep looking for replacements
187 2015-11-02 14:55:41 <jgarzik> It stands at the level of technical debt, not pressing need:  A better replacement, or a better maintained replacement, is in general preferred -- with all the "meeting a high bar" requirements that come with any replacement.
188 2015-11-02 14:55:43 <mcelrath> sipa: hash_serialized iterates over the entire leveldb.  No wonder it's slow.  My proposal would keep a running tally and would be a lot faster.  e.g. hash(utxo set) ~= hash(all txo's) - hash(spent txo's)
189 2015-11-02 14:56:27 <sipa> how is that faster?
190 2015-11-02 14:56:39 <mcelrath> It's updated with each block ingest.
191 2015-11-02 14:56:51 <mcelrath> From the previous block hashes
192 2015-11-02 14:57:13 <sipa> how would it detect database corruption?
193 2015-11-02 14:57:31 <mcelrath> You compute it twice, once on block ingest by looking at the block, and once by querying the db.
194 2015-11-02 14:57:56 <sipa> that doesn't answer my question :)
195 2015-11-02 14:58:08 <mcelrath> If the db hash is different from the ingest hash, the db has fucked up ;-)
196 2015-11-02 14:58:44 <sipa> if you don't recompute it from the data actually in thebdatabase, how will it detect corruption?
197 2015-11-02 14:59:11 <mcelrath> You have to use the database to detect corruption in the database... I'm not sure what you're getting at...
198 2015-11-02 14:59:47 <mcelrath> There's a way to compute this *without* the database, which differentiates it from hash_serialized.
199 2015-11-02 15:00:50 <sipa> mcelrath: ok i see what you're saying
200 2015-11-02 15:01:28 <sipa> it doesn't speed up corruption detection, but it does offer a faster incremental way to compute it without the database
201 2015-11-02 15:02:09 <sipa> how is the hash(set) operation implemented?
202 2015-11-02 15:02:16 <mcelrath> I need to understand how hash_serialized detects corruption.  But I agree with the second part.
203 2015-11-02 15:02:58 <jgarzik> That's interesting.  A dbm that automatically provides a stable hash for its state.
204 2015-11-02 15:03:13 <mcelrath> I think I described it well enough on the mailing list, but I can rehash it here.
205 2015-11-02 15:03:31 <sipa> mcelrath: i'm not on the mailing list, but i can read a link
206 2015-11-02 15:03:35 <mcelrath> On block ingest compute the hash of all txo's and separately any spent txo's in that block.  So there are two.
207 2015-11-02 15:04:01 <mcelrath> http://lists.linuxfoundation.org/pipermail/bitcoin-dev/2015-October/011638.html
208 2015-11-02 15:05:08 <sipa> that requires ordering
209 2015-11-02 15:05:16 <mcelrath> Yes.  Blocks specify the ordering.
210 2015-11-02 15:05:21 <sipa> the utxo set does not maintain ordering
211 2015-11-02 15:05:32 <sipa> so you can't recompute it from the utxo set
212 2015-11-02 15:05:45 <sipa> it also doesn't contain spent entries
213 2015-11-02 15:06:57 <mcelrath> When I'm ingesting a block, I query the db for a UTXO that gets spent in that block.  So it is in the db before I'm done ingesting the block.
214 2015-11-02 15:07:39 <sipa> you're explaining how to compute a hash of all txouts created and one for all txouts spent; i'm with you that far
215 2015-11-02 15:08:07 <sipa> but there is no way to recompute it from just the utxo set, so how would it provide a mechanism to prove that a particular utxo set is correct?
216 2015-11-02 15:08:23 <sipa> except by giving all blocks in history
217 2015-11-02 15:10:40 <mcelrath> There is no way to compute it from just the UTXO set, true.  One has to iterate over past blocks (to determine ordering) and the check is done per-block.  So really, you're validating the (u|s)txo's involved in that block only, not the entire db.
218 2015-11-02 15:10:57 <sipa> ok
219 2015-11-02 15:11:56 <mcelrath> Validating the entire db at once is of course a more comprehensive test of the db.  But costly.
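The running tally mcelrath describes can be made concrete with a toy sketch. Hedged: this uses a simple additive digest to illustrate the hash(utxo set) ~= hash(all txo's) - hash(spent txo's) arithmetic, omits the per-block ordering details from the mailing-list post, and an additive digest like this is illustrative only, not collision-resistant enough for consensus use:

```python
# Toy incremental UTXO set digest: each txout hashes to an integer, and
# the set digest is their sum mod 2**256, so spent outputs can be
# "subtracted" without rescanning the whole database.
import hashlib

M = 2 ** 256

def h(txo: bytes) -> int:
    return int.from_bytes(hashlib.sha256(txo).digest(), "big")

def utxo_digest(utxos) -> int:
    # From-scratch digest of a full UTXO set (the slow path).
    return sum(h(u) for u in utxos) % M

# Running tally across two blocks (hypothetical txouts):
digest = 0
block1_created = [b"txo-a", b"txo-b", b"txo-c"]
digest = (digest + sum(h(u) for u in block1_created)) % M

block2_created, block2_spent = [b"txo-d"], [b"txo-b"]
digest = (digest + sum(h(u) for u in block2_created)
                 - sum(h(u) for u in block2_spent)) % M

# The per-block tally matches a from-scratch hash of the surviving set:
assert digest == utxo_digest([b"txo-a", b"txo-c", b"txo-d"])
```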
220 2015-11-02 15:13:16 <sipa> the solution i like most so far is to compute the utxo hash by implicitly building a merkle tree of utxo entries and computing its root hash every 2016 blocks or so, and then committing to it 2016 blocks later
221 2015-11-02 15:14:01 <sipa> that doesn't need a fully tree-structured database with all intermediate hashes stored, the way a merkle-structured database would
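The "implicit" construction sipa describes amounts to recomputing a bitcoin-style merkle root from the entries of any database that can iterate them in a canonical order, with no intermediate hashes persisted. A minimal sketch (illustrative Python, not Core code):

```python
# Implicit merkle root over a set of entries: nothing is stored between
# calls; the whole tree is rebuilt from an ordered iteration.
import hashlib

def sha256d(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def merkle_root(leaves):
    level = [sha256d(l) for l in leaves]
    if not level:
        return sha256d(b"")
    while len(level) > 1:
        if len(level) % 2:              # bitcoin-style: duplicate last node
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

utxos = sorted([b"txo-a", b"txo-c", b"txo-d"])   # canonical ordering
root = merkle_root(utxos)
assert merkle_root(utxos) == root                # deterministic
assert merkle_root(sorted([b"txo-a", b"txo-c"])) != root
```

This is the "compact proofs without fast update" point made below: the root (and any branch) can be derived on demand, at the cost of iterating the full set.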
222 2015-11-02 15:14:36 <mcelrath> Yeah I've seen that discussion, mostly in the context of proving utxo's for thin wallets.  My proposal is a faster compromise that isn't really useful for thin wallets.
223 2015-11-02 15:15:05 <mcelrath> e.g. bramc's "Merkle Set"
224 2015-11-02 15:15:30 <mcelrath> Also requires ordering...
225 2015-11-02 15:15:43 <sipa> there are different use cases
226 2015-11-02 15:15:50 <mcelrath> Yes, different use cases.
227 2015-11-02 15:16:22 <mcelrath> But the latter implies the former.  If we had Merkle UTXO commitments we wouldn't need what I'm proposing, they're equivalent.
228 2015-11-02 15:16:23 <sipa> the one i care about most is being able to give someone a utxo set, and prove that the blockchain contains a commitment to it, validatable without actually seeing the full block chain
229 2015-11-02 15:16:34 <sipa> yours doesn't provide that
230 2015-11-02 15:17:11 <mcelrath> Nope.
231 2015-11-02 15:17:14 <sipa> that is doable without any merkle structure
232 2015-11-02 15:17:26 <mcelrath> But extremely computationally intensive.
233 2015-11-02 15:17:39 <sipa> not more than the current hash_serialized
234 2015-11-02 15:17:53 <mcelrath> True.
235 2015-11-02 15:18:17 <sipa> but yes, not something you want inside the block validation path
236 2015-11-02 15:19:06 <sipa> you need merkle structure for two things: compact proofs, and fast update
237 2015-11-02 15:19:39 <sipa> the former only needs an implicit merkle structure (we could implement it today with the current database)
238 2015-11-02 15:19:43 <mcelrath> So what we need is a faster way to compute the Merkle-ized UTXO commitment, or a compromise.  I'm proposing a compromise that accomplishes some goals, but I'd love to see a proposal for a faster commitment computation.
239 2015-11-02 15:20:21 <sipa> the latter needs an explicit merkle-structured database with a merkle-structured rollbackable cache, ...
240 2015-11-02 15:20:44 <sipa> to make it efficient enough, and even then, it probably is an order of magnitude more I/O
241 2015-11-02 15:20:49 <mcelrath> Does that exist?
242 2015-11-02 15:21:05 <sipa> people have been implementing those for years
243 2015-11-02 15:21:11 <sipa> not inside bitcoin core
244 2015-11-02 15:21:18 <sipa> but these are well-researched ideas
245 2015-11-02 15:21:38 <mcelrath> Can you point to some links?
246 2015-11-02 15:21:57 <mcelrath> If I'm going to fool with replacing leveldb, I'd rather try with a Merkle-ized implementation!
247 2015-11-02 15:22:18 <jgarzik> mcelrath, please keep me in the loop.  I'm already writing an implementation..
248 2015-11-02 15:22:25 <sipa> i wouldn't even replace leveldb; it can be done on top of any database
249 2015-11-02 15:22:32 <mcelrath> jgarzik: I know.  Will do.
250 2015-11-02 15:22:42 <sipa> but no matter what, i expect it to require an order of magnitude more I/O
251 2015-11-02 15:22:56 <sipa> which i doubt is acceptable overhead currently
252 2015-11-02 15:23:07 <mcelrath> Every write requires updating a Merkle branch, so yes, I see an order of magnitude there.
253 2015-11-02 15:23:28 <jgarzik> mcelrath, sipa, my current effort is already a COW database, which makes a few things easier on the recompute-hash side
254 2015-11-02 15:23:52 <sipa> you can't do this at the database layer
255 2015-11-02 15:23:59 <jgarzik> depends on how you structure the tree and tree updates...
256 2015-11-02 15:24:04 <sipa> as the hash would be over semantic data
257 2015-11-02 15:24:13 <jgarzik> don't necessarily have to go for the naive merkle approach
258 2015-11-02 15:24:40 <sipa> (you don't want bitcoin's consensus rules to depend on the db backend you chose, right?)
259 2015-11-02 15:26:43 <mcelrath> @jgarzik you're writing a Merkle-db implementation?  Or are you talking about your sqlite branch?
260 2015-11-02 15:29:18 <jgarzik> mcelrath, 1) pgdb2 will have a merkle db option, yes,     2) no, not talking about sqlite.  sqlite is an experiment that's reached an endpoint IMO.
261 2015-11-02 15:33:07 <mcelrath> Neat!  I'm willing to help.
262 2015-11-02 15:38:46 <jgarzik> mcelrath,  pgdb2 is a refresh of some earlier kernel filesystem designs of mine.  page-based copy-on-write transactional lower layer + (not yet written) higher multi-table key/value db with hash stability
263 2015-11-02 15:39:29 <jgarzik> the hope is that it is flexible enough for merkle db also, but I need more background info on use cases
264 2015-11-02 15:39:58 <mcelrath> Why not pull the hash tree out from the db and keep it separately?
265 2015-11-02 15:40:18 <jgarzik> mcelrath, that's easily doable with this layered design
266 2015-11-02 15:41:18 <jgarzik> paged file < COW inodes [multi-page runs, map-able together] < database layers
267 2015-11-02 15:41:45 <mcelrath> Which kernel filesystems did you work on?
268 2015-11-02 15:42:36 <jgarzik> core Linux VFS, ext4, and stuff of my own design.  a little bit on btrfs
269 2015-11-02 15:42:59 <jgarzik> little bits here and there in unimportant filesystems like hpfs ;p
270 2015-11-02 15:43:39 <mcelrath> Neat.  A fascination of mine but I've never actually worked on a fs.  I run btrfs on all my systems.  ;-)  COW is the only way to go.
271 2015-11-02 15:44:08 <jgarzik> COW is actually friendly to modern flash-based devices, which perform wear levelling anyway
272 2015-11-02 15:44:58 <jgarzik> mcelrath, has anyone theorized what a C/C++ merkle db api might look like?
273 2015-11-02 15:46:20 <mcelrath> I'd think it would be identical to a key-value store, with two extra methods: getroot and getbranch(key) to get the Merkle branch.
274 2015-11-02 15:47:01 <mcelrath> Insert and delete would have to be modified to update the tree all the way back to the root.  That concerns me; it's a lot more I/O.
275 2015-11-02 15:50:15 <jgarzik> mcelrath, there's a lot of hidden i/o in COW anyway, since a data update potentially updates the list of where data is stored (extent list), which potentially updates inode, which potentially updates superblock.
276 2015-11-02 15:50:33 <jgarzik> mcelrath, just make sure to stream the i/o together all at once.
277 2015-11-02 15:51:04 <mcelrath> That had occurred to me. btrfs must be updating the superblock with every write.
278 2015-11-02 15:51:11 <jgarzik> COW trades off additional i/o for less double-writing of a journal
279 2015-11-02 15:51:42 <jgarzik> (arguably COW replaces it with more-than-double-writing, but for modern SSDs who cares)
280 2015-11-02 15:52:29 <jgarzik> mcelrath, That's why I think a COW is very friendly to hash-stable apps
281 2015-11-02 15:52:48 <jgarzik> it's also nicely lock-free in many paths
282 2015-11-02 15:52:57 <mcelrath> Bundling all IO for an update of a large tree seems hard.  You need the root node and a leaf node in the same block.
283 2015-11-02 15:53:28 <mcelrath> That makes the tree inefficiently stored for traversal of any other path.
284 2015-11-02 15:53:29 <jgarzik> mcelrath, not at all. just need to write(2) them at the same time
285 2015-11-02 15:53:54 <mcelrath> In any case, it seems any update involves multiple writes.
286 2015-11-02 15:53:55 <jgarzik> mcelrath, modern OS and, underneath, modern storage handle scatter/gather just fine
287 2015-11-02 15:54:57 <mcelrath> We need bramc on this conversation.  I want some details on his Merkle Set.  ;-)
288 2015-11-02 15:56:16 <jgarzik> mcelrath, yes and no.  to over-simplify, on modern OS, the kernel bundles all writes in various locations of the file together into one bundle, to send to storage, between fsync() calls.
289 2015-11-02 15:56:49 <jgarzik> mcelrath, multiple writes get aggregated at several levels.
290 2015-11-02 15:57:43 <jgarzik> mcelrath, one key issue is seek time.  if you use SSD, then reading and writing "all over the place" is just fine, as seek time is basically zero.  if you use a rotational hard drive, seek time plays a role.
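[Editor's note: a sketch of jgarzik's "stream the i/o together" point — several writes at scattered offsets, one durability barrier at the end. Between the writes and the `fsync()`, the kernel is free to merge and reorder the dirty pages before they reach storage; offsets and page layout here are illustrative.]

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
try:
    os.ftruncate(fd, 4096 * 4)        # file spans four 4 KiB pages
    os.pwrite(fd, b"leaf", 4096 * 3)  # leaf node, far from the root
    os.pwrite(fd, b"node", 4096 * 1)  # interior node
    os.pwrite(fd, b"root", 0)         # root/superblock page
    os.fsync(fd)                      # single durability barrier
    assert os.pread(fd, 4, 0) == b"root"
finally:
    os.close(fd)
    os.unlink(path)
```

On an SSD the scattered offsets cost essentially nothing; on a rotational drive the same pattern pays a seek per extent, which is the trade-off discussed above.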
291 2015-11-02 15:58:07 <Diablo-D3> re: ssd
292 2015-11-02 15:58:14 <Diablo-D3> THIS is why I use ssd
293 2015-11-02 15:58:16 <jgarzik> mcelrath, for this merkle db application, (1) many writes + (2) recommend SSD
294 2015-11-02 15:58:34 <Diablo-D3> because db apps love to write several things concurrently that would drive a hdd into the ground via seeks
295 2015-11-02 15:58:40 <Diablo-D3> and db apps also love to read those back
296 2015-11-02 15:59:08 <Diablo-D3> and SSDs can complete o_fsync ops a shitload faster
297 2015-11-02 15:59:17 <Diablo-D3> [10:51:11] <jgarzik> COW trades off additional i/o for less double-writing of a journal
298 2015-11-02 15:59:20 <Diablo-D3> jgarzik: thats not entirely true
299 2015-11-02 15:59:34 <jgarzik> Diablo-D3, read the next line
300 2015-11-02 15:59:49 <Diablo-D3> [10:51:43] <jgarzik> (arguably COW replaces it with more-than-double-writing, but for modern SSDs who cares)
301 2015-11-02 15:59:50 <Diablo-D3> that?
302 2015-11-02 15:59:54 <Diablo-D3> depends on how COW is impl
303 2015-11-02 16:00:07 <jgarzik> yes
304 2015-11-02 16:00:14 <Diablo-D3> if you're doing a dumb write log that eventually gets baked into a canonical db
305 2015-11-02 16:00:16 <Diablo-D3> yeah its double writes
306 2015-11-02 16:00:36 <Diablo-D3> if you're just storing previous known working copies, its double storage but not necessarily double writes
307 2015-11-02 16:00:48 <Diablo-D3> though it depends on the scale of your data objects
308 2015-11-02 16:01:13 <Diablo-D3> if your db rows are tiny and you update very few at a time, yeah, the size of your data update is going to be the size of your journal update
309 2015-11-02 16:01:21 <Diablo-D3> ergo, double writing
310 2015-11-02 16:01:26 <Diablo-D3> BUT writes aren't even fatal
311 2015-11-02 16:01:33 <Diablo-D3> modern ssds do hundreds of TB before they die
312 2015-11-02 16:01:51 <Diablo-D3> like, crucial m500, m550, and mx200 (aka m600 dc but for consumers)?
313 2015-11-02 16:01:56 <Diablo-D3> all claim shit like 72TB
314 2015-11-02 16:02:02 <Diablo-D3> thats their _warranty_ value
315 2015-11-02 16:02:12 <Diablo-D3> they last at least 4x that.
316 2015-11-02 16:02:47 <mcelrath> jgarzik: There's a collection of literature on "incremental hash functions" that I wonder might be useful for this. I considered it for my post about UTXO commitments, but I realized it wasn't necessary, that idea can be done with standard hash functions.
317 2015-11-02 16:02:58 <afk11> Since people are researching DB schemes, I'm writing an SQL database to hold the data a full node would. I've been having some fun with nested sets - it implicitly lets you write simple queries for hierarchical chains, tips, etc.
318 2015-11-02 16:03:01 <Diablo-D3> 72TB works out to something like 3 years of an extremely artificial case of 24/7 writing and never reading
319 2015-11-02 16:03:20 <Diablo-D3> like, not even reading to check that your write succeeded properly
320 2015-11-02 16:03:26 <Diablo-D3> not even doing fs journaling
321 2015-11-02 16:03:40 <Diablo-D3> its extremely hard to kill a properly functioning good ssd these days
322 2015-11-02 16:04:01 <Diablo-D3> _plus_ now that pci-e 3.0 x2 interfaces over sata express are becoming the norm
323 2015-11-02 16:05:05 <Diablo-D3> doing >150k 4k write iops on highly randomized patterns on low queue depths (>1 but <=half (usually 32 on pre nvme drives)) is becoming the norm..
324 2015-11-02 16:05:36 <Diablo-D3> its like, you have the db write performance of what used to be one goddamned huge server, now in a single drive
325 2015-11-02 16:08:09 <mcelrath> afk11: Can you elaborate on nested set queries?  Any links?
326 2015-11-02 16:10:05 <bitcoin-dev480> How can I calculate miner fees when constructing raw transactions?
327 2015-11-02 16:16:32 <bitcoin-dev480> anyone?
328 2015-11-02 16:26:45 <afk11> mcelrath: http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/ explains adjacency set first, then nested sets.
329 2015-11-02 16:28:48 <afk11> so, instead of maintaining a table that contains your 'best chain', you keep them all in one and query for tips.
330 2015-11-02 16:30:22 <afk11> UTXO set is handled the same way. Store them all, and join against the chain of blocks up to a 'tip' you specify.
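[Editor's note: a hedged sketch of the nested-set model afk11 is describing, per the linked article; the table and column names are invented. Each block stores `lft`/`rgt` bounds such that an ancestor's interval contains its descendants', so "the chain leading to a tip" is one range query and tips are the leaves (`rgt = lft + 1`).]

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE blocks (hash TEXT, height INT, lft INT, rgt INT)")
# genesis -> A -> B on the best chain, plus a stale fork A' off genesis
db.executemany("INSERT INTO blocks VALUES (?,?,?,?)", [
    ("genesis", 0, 1, 8),
    ("A",       1, 2, 5),
    ("B",       2, 3, 4),
    ("A'",      1, 6, 7),
])
# tips: leaves of the block tree
tips = [r[0] for r in db.execute(
    "SELECT hash FROM blocks WHERE rgt = lft + 1 ORDER BY lft")]
# chain to tip B (lft=3, rgt=4): every block whose interval contains B's
chain = [r[0] for r in db.execute(
    "SELECT hash FROM blocks WHERE lft <= 3 AND rgt >= 4 ORDER BY lft")]
```

The catch (and sipa's objection below) is that inserting a block renumbers `lft`/`rgt` for much of the table, so writes are expensive even though chain queries are cheap.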
331 2015-11-02 16:58:32 <sipa> afk11: your performance will be horrible if you store all utxo sets
332 2015-11-02 17:11:29 <mcelrath> I think he's discovering that ;-)
333 2015-11-02 17:17:01 <gmaxwell> 06:45 < mcelrath> But CRC checks could be added for any db by the caller.
334 2015-11-02 17:17:24 <gmaxwell> not so, ... how do you CRC a key not found error (except via an expensive reimplementation of the database)
335 2015-11-02 17:17:28 <gmaxwell> ?
336 2015-11-02 17:18:34 <gmaxwell> Part of the reason there are corruption reports is because of the super extensive checking, both in leveldb itself, and at the application level... but as was said above, AFAICT the corruption reports we have are limited to these Windows-specific ones where we know the cause.
337 2015-11-02 17:19:08 <gmaxwell> On linux I left a system for a month on a remote power switch that hard cut the power then let it start back up and cut it again... and never corrupted.
338 2015-11-02 17:20:07 <gmaxwell> Not a guarantee of course, but pretty good.
339 2015-11-02 18:04:50 <mcelrath> Point taken.
340 2015-11-02 18:05:09 <mcelrath> We've had linux corruption too.  If I can prove it's not flaky hardware, I'll report it.
341 2015-11-02 18:06:44 <gmaxwell> yea, not impossible.  Unfortunately "flaky hardware" is super common. I think on linux I'm reasonably confident that flaky hardware is much more likely than a problem with leveldb.
342 2015-11-02 18:07:15 <gmaxwell> Also, errors can be in the filesystem or in the disk firmware too.
343 2015-11-02 18:10:39 <mcelrath> A bit longer term I'd really like to identify pathways prone to corruption, and trigger a re-evaluation of the computation that led to corruption.  This problem will only get worse.
344 2015-11-02 18:14:06 <sipa> re-evaluation may mean reindexing the blockchain from scratch, as the data the computation is based on may not be available anymore
345 2015-11-02 18:14:46 <mcelrath> I'd like to find some happy medium that requires anything less than a full reindex...
346 2015-11-02 18:16:42 <sipa> feel free to think about that :)
348 2015-11-02 18:26:24 <gmaxwell> mcelrath: I believe the leveldb error checking is technically overly aggressive, in that it's possible for there to be perfectly recoverable errors that it refuses to continue on.  Even if there is a "recovery" it can be hard to be absolutely sure you haven't silently lost something, and so we think it's better to suffer a reindex.
349 2015-11-02 18:26:32 <gmaxwell> Also, reindexes are currently artificially slow.
350 2015-11-02 18:29:27 <mcelrath> Yeah one would need to really prove the recovery was correct.  What keeps reindex slow?
351 2015-11-02 18:31:16 <sipa> due to a bug it revalidates historical signatures
352 2015-11-02 18:35:56 <Luke-Jr> mcelrath: I thought I had Linux corruption too, but when I went to make a sample db for wumpus to look at, the system stopped working entirely, so.. :/
353 2015-11-02 18:58:38 <bitcoin-dev415> Does bitcoin-core consider unconfirmed change as "available" balance?
354 2015-11-02 19:00:12 <mcelrath> Unconfirmed change is no different than any other kind of transaction output.  If it's unconfirmed, it's unconfirmed.  A new transaction to spend it is valid.
355 2015-11-02 19:01:07 <bitcoin-dev415> but it does not show up when using "listunspent"?
356 2015-11-02 19:02:07 <bitcoin-dev415> or I have to specify for 0 confirmations? is that really safe to use in a new transaction?
357 2015-11-02 19:04:22 <bitcoin-dev415> what happens if the previous transaction gets modified by transaction malleability?
358 2015-11-02 19:05:32 <mcelrath> It's not safe.  listunspent takes two parameters minconf, maxconf which set the minimum and maximum confirmations to filter.
359 2015-11-02 19:05:41 <mcelrath> BTW this probably belongs in #bitcoin
360 2015-11-02 19:05:54 <Luke-Jr> mcelrath: I don't think you can corrupt it by killing Bitcoin Core..
361 2015-11-02 19:06:13 <mcelrath> Luke-Jr: I don't think so either.  But it's worth a try.  ;-)
362 2015-11-02 19:08:23 <gmaxwell> mcelrath: thats not correct
363 2015-11-02 19:11:19 <gmaxwell> mcelrath: by default bitcoin core will spend its own unconfirmed change, but only if it has no other choice.
364 2015-11-02 19:11:44 <gmaxwell> This can be disabled with a config option.
365 2015-11-02 19:12:51 <mcelrath> Thanks gmaxwell.  bitcoin-dev415 the option is -spendzeroconfchange
366 2015-11-02 19:13:43 <bitcoin-dev415> Thanks, I will try that!
367 2015-11-02 19:19:01 <bitcoin-dev415> mcelrath: just what I needed, thank you!
368 2015-11-02 21:33:49 <morcos> anybody know if there are duplicate coinbases on testnet3?
369 2015-11-02 21:34:35 <morcos> i'm wondering if it's safe to skip the BIP30 check there after BIP34 activation, but that depends on whether there is still the potential to create duplicate transactions
370 2015-11-02 21:35:25 <jgarzik> morcos, well you get down to the probability of hash collision ...
371 2015-11-02 21:35:36 <jgarzik> never zero but "atoms in the universe" small
372 2015-11-02 21:36:22 <jgarzik> morcos, the by-accident consensus rule is that the later duplicate "overwrites" the visibility of the prior transaction
373 2015-11-02 21:36:24 <morcos> jgarzik: yes, but if duplicate coinbases were created before BIP34 and, unlike on the main chain, the first was spent before the second overwrote it, then there is still the possibility to create overwriting txs
374 2015-11-02 21:36:45 <morcos> jgarzik: after BIP30 the rule is you are not allowed to overwrite
375 2015-11-02 21:37:46 <morcos> i'm going to disable enforcing BIP30 on the main chain after BIP34 activation, and want to know if I can do the same on testnet
376 2015-11-02 21:37:56 <jgarzik> morcos, nod - though operationally you have people writing software that makes assumptions based on uniqueness of hashes
377 2015-11-02 21:38:10 <jgarzik> so IMO in effect the problem still remains [to a very small extent]
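[Editor's note: the BIP30/BIP34 discussion above turns on BIP34 requiring the block height as the first push of the coinbase scriptSig, which makes every coinbase, and hence its txid, unique. A hedged sketch of that height serialization (minimal little-endian script number; real implementations may use the OP_1..OP_16 shortcut pushes for heights up to 16, which this sketch ignores):]

```python
def bip34_height_push(height: int) -> bytes:
    """Serialize a block height as a minimally-encoded script push,
    as BIP34 requires at the start of the coinbase scriptSig.
    Sketch only: heights 1-16 would really use OP_N opcodes."""
    assert height > 0
    out = bytearray()
    n = height
    while n:                      # little-endian base-256 digits
        out.append(n & 0xFF)
        n >>= 8
    if out[-1] & 0x80:            # top bit set would read as negative:
        out.append(0x00)          # pad with a zero byte
    return bytes([len(out)]) + bytes(out)
```

So two coinbases at different heights can never serialize identically after BIP34, which is why morcos can consider skipping BIP30 once BIP34 is active.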
378 2015-11-02 23:59:02 <zooko> If anybody is, or knows how to communicate with, Jan Carlsson, could you have Jan contact me? zooko@LeastAuthority.com .