1 2014-06-09 00:00:03 <phantomcircuit> iirc it's CURRENT but im not sure
  2 2014-06-09 00:00:16 <phantomcircuit> wait no it's not that
  3 2014-06-09 00:00:20 <sipa> i don't think so
  4 2014-06-09 00:00:32 <phantomcircuit> fun
  5 2014-06-09 00:00:33 <sipa> but we can always introduce extra keys to indicate a particular version
  6 2014-06-09 00:00:56 <phantomcircuit> sipa, i was thinking about adding things like sequence numbers and proper checksums
  7 2014-06-09 00:01:07 <phantomcircuit> but that changes the journal format and probably the sorted tables format
  8 2014-06-09 00:01:18 <sipa> i don't want to change leveldb
  9 2014-06-09 00:01:23 <phantomcircuit> so there would need to be an indicator of whether to use them or not
 10 2014-06-09 00:01:31 <phantomcircuit> sipa, no?
 11 2014-06-09 00:01:35 <sipa> no
 12 2014-06-09 00:01:38 <btc123> ls
 13 2014-06-09 00:01:46 <sipa> i thought you were talking about what we're storing inside leveldb
 14 2014-06-09 00:01:47 <phantomcircuit> hehe @ btc123
 15 2014-06-09 00:01:53 <phantomcircuit> sipa, oh no
 16 2014-06-09 00:01:59 <sipa> i'm sure leveldb itself has version markers
 17 2014-06-09 00:02:08 <phantomcircuit> im talking about fixing the durability and consistency issues with leveldb
 18 2014-06-09 00:02:25 <phantomcircuit> they've been largely solved by fixes to handling under os x
 19 2014-06-09 00:02:28 <sipa> we don't need durability
 20 2014-06-09 00:02:48 <sipa> we do need consistency and integrity though
 21 2014-06-09 00:02:51 <phantomcircuit> sipa, no but we at least need to be able to detect when entries have gone missing
 22 2014-06-09 00:02:52 <phantomcircuit> :P
 23 2014-06-09 00:02:59 <sipa> no
 24 2014-06-09 00:03:14 <sipa> when the last changes are undone, you're just returning to a previous validation state
 25 2014-06-09 00:03:22 <sipa> and will redo whatever validation was done since then
 26 2014-06-09 00:03:33 <phantomcircuit> right im talking about entries already in a sorted table being screwed up
  27 2014-06-09 00:03:37 <gmaxwell> wumpus: thanks for tracking down the leveldb binary incompatibility w/ arm.
 28 2014-06-09 00:03:40 <phantomcircuit> not the journal
 29 2014-06-09 00:04:39 <phantomcircuit> 23:53:33-23:46:04 = (53-46)*60 + (33-4) = 449 seconds
 30 2014-06-09 00:04:57 <phantomcircuit> 2014-06-09 00:04:54 UpdateTip: new best=000000000000034a7dedef4a161fa058a2d67a173a90155f3a2fe6fc132e0ebf  height=200000  log2_work=68.741562  tx=7316696  date=2012-09-22 10:45:59 progress=0.095681
 31 2014-06-09 00:05:05 <phantomcircuit> 2014-06-09 00:04:54 UpdateTip: new best=000000000000034a7dedef4a161fa058a2d67a173a90155f3a2fe6fc132e0ebf  height=200000  log2_work=68.741562  tx=7316696  date=2012-09-22 10:45:59 progress=0.095681
 32 2014-06-09 00:05:17 <phantomcircuit> er
 33 2014-06-09 00:05:21 <phantomcircuit> 2014-06-08 23:58:45 UpdateTip: new best=000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f  height=0  log2_work=32.000022  tx=1  date=2009-01-03 18:15:05 progress=0.000000
 34 2014-06-09 00:06:05 <phantomcircuit> 351 seconds
 35 2014-06-09 00:06:30 <phantomcircuit> sipa, 21% reduction in reindex runtime to block 200k
 36 2014-06-09 00:06:32 <phantomcircuit> neat
 37 2014-06-09 00:06:41 <sipa> that's way more than i would have expected
 38 2014-06-09 00:06:58 <goosoodude> Ok, so. 1 thing to remember about the entire project I'm about to work on: I'm 14, and need to study this a lot. It's 98% thinking and brainstorming, and 2% coding. I will get it done, though!
 39 2014-06-09 00:07:04 <phantomcircuit> sipa, sha256 is ~20% of the cpu time
 40 2014-06-09 00:07:13 <phantomcircuit> it's a single threaded process
 41 2014-06-09 00:07:17 <sipa> i know
 42 2014-06-09 00:07:31 <phantomcircuit> so i'd have expected less also
 43 2014-06-09 00:07:38 <gmaxwell> phantomcircuit: how much of the remaining sha256 is merkle tree computation?
 44 2014-06-09 00:07:41 <phantomcircuit> since it didn't eliminate every call to sha256
 45 2014-06-09 00:07:47 <sipa> gmaxwell: half
 46 2014-06-09 00:08:02 <sipa> gmaxwell: as every txhash is computed once, and the merkle tree is computed once :)
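
The "half" estimate can be illustrated by counting hash calls. This is a minimal sketch, not the code being profiled: it assumes Bitcoin's usual Merkle construction (double SHA-256 over concatenated pairs, duplicating the last entry of an odd-length level), and the names `dsha256` and `merkle_root` and the placeholder leaves are made up for illustration.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin transaction and block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids: list) -> tuple:
    """Return (root, number of interior double-SHA-256 calls)."""
    if not txids:
        raise ValueError("need at least one txid")
    level = list(txids)
    calls = 0
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        calls += len(level)
    return level[0], calls


# Placeholder leaves standing in for 133 real transaction hashes.
txids = [dsha256(i.to_bytes(4, "little")) for i in range(133)]
root, calls = merkle_root(txids)
print(len(txids), calls)  # 133 leaves need 138 interior hashes
```

With 133 leaves the tree needs 138 interior hashes, roughly one per transaction, which matches the claim as a count of calls. It ignores the fact that transactions are usually longer than the 64-byte inputs hashed inside the tree.
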
 47 2014-06-09 00:08:15 <phantomcircuit> gmaxwell, http://pastebin.com/raw.php?i=W0vqmQ8V
 48 2014-06-09 00:08:21 <phantomcircuit> that's with the patch
 49 2014-06-09 00:08:26 <Luke-Jr> goosoodude: btw, please be sure to disclose that in any code you submit. we'll probably need your parents to sign a waiver for legal reasons :/
 50 2014-06-09 00:08:35 <phantomcircuit> it's now almost entirely recalculating block headers
 51 2014-06-09 00:08:46 <sipa> which are computed 10 times apparently
 52 2014-06-09 00:08:47 <gmaxwell> (the reason I ask is because the trees could get a 4-way SIMD implementation which might be a ~2x speedup or so)
 53 2014-06-09 00:08:49 <phantomcircuit> and checking hashes when reading from disk
 54 2014-06-09 00:09:02 <gmaxwell> not worth doing before getting the redundancy out there.
 55 2014-06-09 00:09:04 <gmaxwell> er though.
 56 2014-06-09 00:09:20 <phantomcircuit> gmaxwell, i dont think there is any redundancy in the merkle tree calculations now
 57 2014-06-09 00:09:39 <phantomcircuit> so that would be a nice improvement
 58 2014-06-09 00:09:46 <sipa> phantomcircuit: i count 369 seconds btw
 59 2014-06-09 00:09:59 <phantomcircuit> sipa, is my timestamp math wrong?
 60 2014-06-09 00:10:03 <phantomcircuit> that's entirely possible lol
 61 2014-06-09 00:10:27 <sipa> yes
 62 2014-06-09 00:10:30 <sipa> vs 449s
 63 2014-06-09 00:11:01 <phantomcircuit> 00:04:54-23:58:45 = (64-58) * 60 + (54-45) = 369
 64 2014-06-09 00:11:04 <phantomcircuit> ah yeah you're right
 65 2014-06-09 00:11:17 <sipa> still neat!
 66 2014-06-09 00:11:35 <phantomcircuit> ~17% reduction
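
The timestamp arithmetic above can be checked mechanically. A minimal sketch, assuming each run finishes less than 24 hours after it starts; the helper name `elapsed_seconds` is illustrative and does not come from the log.

```python
from datetime import datetime, timedelta


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds from start to end (HH:MM:SS), allowing one midnight rollover."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)  # the run crossed midnight
    return int(delta.total_seconds())


baseline = elapsed_seconds("23:46:04", "23:53:33")  # run without the patch
patched = elapsed_seconds("23:58:45", "00:04:54")   # run with the patch
reduction = (baseline - patched) / baseline
print(baseline, patched, f"{reduction:.1%}")  # 449 369 17.8%
```
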
 67 2014-06-09 00:14:53 <sipa> there are on average 133 transactions in a block
 68 2014-06-09 00:16:02 <sipa> if we can save 8 out of 10 block hash computations, that means saving 8 hashes per block, while we saved 4*133 from transactions per block
 69 2014-06-09 00:16:33 <sipa> which means we would get 0.25% reindex speed gain from that
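
One way to arrive at a figure of that size is sketched below. It assumes the gain scales linearly with the number of hash calls avoided and that every call costs about the same; neither assumption is stated in the log. The 449 s and 369 s run times are the ones measured earlier in the conversation.

```python
txs_per_block = 133
tx_hashes_saved = 4 * txs_per_block   # 532 hash calls saved per block by the patch
block_hashes_saved = 8                # 8 of the 10 block hash computations

observed_gain = (449 - 369) / 449     # measured reindex speedup, about 17.8%
estimated_gain = observed_gain * block_hashes_saved / tx_hashes_saved
print(f"{estimated_gain:.2%}")
```

Under these assumptions the estimate comes out at roughly 0.27%, the same ballpark as the 0.25% quoted.
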
 70 2014-06-09 00:16:42 <phantomcircuit> heh malloc is looking more and more expensive
 71 2014-06-09 00:17:08 <sipa> we waste dynamic memory all the time
 72 2014-06-09 00:17:22 <gmaxwell> the profiling often underreports the true cost of the heap allocations too.
 73 2014-06-09 00:17:36 <goosoodude> Ite.
 74 2014-06-09 00:18:15 <phantomcircuit> it seems pretty common for profiling tools to report things that just seem completely nonsensical
  75 2014-06-09 00:18:32 <gmaxwell> sipa: in the last 1000 blocks the median number of transactions is 314. ... yea, so this perhaps suggests that using the SIMD sha256 for it might not yet be a worthwhile exercise.
 76 2014-06-09 00:19:12 <sipa> gmaxwell: my 0.25% number was for avoiding duplicate block hash computations, not merkle tree
 77 2014-06-09 00:19:28 <sipa> i'm sure with merkle tree hash speed doubling you can gain more
 78 2014-06-09 00:21:49 <phantomcircuit> sipa, it definitely seems like it would be worth doing the same thing for CBlock & CBlockHeader (if possible)
 79 2014-06-09 00:22:13 <sipa> for 0.25% gain, imho no
 80 2014-06-09 00:22:36 <sipa> but we may find some trivial cases to fix, by just passing an extra hash around
 81 2014-06-09 00:23:09 <phantomcircuit> sipa, im guessing it's more than 0.25% though
 82 2014-06-09 00:23:40 <sipa> is my math wrong?
 83 2014-06-09 00:24:43 <phantomcircuit> hmm actually this call graph is only for the first 50k blocks
 84 2014-06-09 00:25:06 <phantomcircuit> i should let this run through a complete reindex before looking at it
 85 2014-06-09 00:25:09 <phantomcircuit> impatience :P
 86 2014-06-09 00:26:49 <phantomcircuit> sipa, http://i.imgur.com/vDhdzH5.png
 87 2014-06-09 00:26:58 <phantomcircuit> got a nice laugh at vprintf
 88 2014-06-09 00:27:52 <gmaxwell> sipa: right, ... apparently the 4-way SIMD sha256 has 3.4x more throughput than the openssl scalar sha256, assuming perfect loading. so actually the speedup sounds like it would be pretty good for even as few as 300 transactions.
 89 2014-06-09 00:28:46 <phantomcircuit> this is going to take a long long time to reindex under valgrind...
 90 2014-06-09 00:29:45 <shesek> how much fees would you estimate is needed for a tx of 15kb?
 91 2014-06-09 00:30:11 <phantomcircuit> actually the reindex could be pipelined and the consistency checks run in parallel...
 92 2014-06-09 00:43:17 <phantomcircuit> 2014-06-09 00:42:48 UpdateTip: new best=000000000001083432dadda634904778fb72b15ec6ac92ff5e00345b82120a6c  height=111570  log2_work=60.5591  tx=307097  date=2011-03-03 15:38:25 progress=0.004016
 93 2014-06-09 00:43:24 <phantomcircuit> progress bar is depressingly accurate
 94 2014-06-09 00:44:48 <goosoodude> So first off, Luke-Jr? By blockchain obfuscation, do you mean transaction anonymity, or do you mean the obfuscation that Ethereum is taking on?
 95 2014-06-09 00:45:27 <Luke-Jr> goosoodude: no, just a cheap XOR of the blockchain data on disk
 96 2014-06-09 00:45:34 <goosoodude> ok
 97 2014-06-09 00:45:40 <Luke-Jr> goosoodude: so braindead software doesn't mistake it as something else
 98 2014-06-09 00:45:49 <goosoodude> Ah, ok. I understand it now.
 99 2014-06-09 00:46:11 <goosoodude> So, mainly, keep norton and other awful anti-viruses from picking it up.
100 2014-06-09 00:46:12 <Luke-Jr> eg, right now someone can embed a virus signature in the blockchain to make antivirus delete your blockchain files
101 2014-06-09 00:46:15 <Luke-Jr> yeah
102 2014-06-09 00:46:21 <goosoodude> ok
103 2014-06-09 00:47:15 <btc123> if you just XOR it, then they'll put the XOR'd signature in and it will invert and be detected again ;p
104 2014-06-09 00:48:28 <btc123> but yes, probably need some cheap encryption/obfuscation
105 2014-06-09 00:48:33 <gmaxwell> btc123: the 'xor' would be either per host or per txid.
106 2014-06-09 00:49:09 <btc123> gmaxwell: oh good point. lol
107 2014-06-09 00:49:20 <sipa> per txid seems very hard for the blockchain file
108 2014-06-09 00:49:57 <gmaxwell> (oh I missed that the context was the blockchain file, yea, that would be per host or per block hash)
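
The per-host variant can be sketched as follows. This is an illustration of the idea only, not an existing implementation: the function name `xor_obfuscate`, the 8-byte key length, and the placeholder signature are all assumptions.

```python
import os
from itertools import cycle


def xor_obfuscate(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    if not key:
        raise ValueError("key must be non-empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Placeholder for a byte pattern a scanner might match (not a real signature).
SIGNATURE = b"HYPOTHETICAL-VIRUS-SIGNATURE"
block_data = b"\x00" * 16 + SIGNATURE + b"\x00" * 16

host_key = os.urandom(8)  # generated once per host and kept with the data
stored = xor_obfuscate(block_data, host_key)
restored = xor_obfuscate(stored, host_key)
print(restored == block_data)
```

Because the key is chosen locally and is unknown to whoever crafts the transaction, they cannot pre-XOR a signature so that it reappears on disk, which addresses btc123's objection to a single fixed XOR. A real file format would also need to key the XOR by file offset so that reads from the middle of a file decode correctly.
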
109 2014-06-09 00:52:00 <goosoodude> Would it be appropriate to figure out WHY the antiviruses are detecting it, and move on from there? Or, I mean, should I just tackle it from what I know?
110 2014-06-09 00:55:08 <gmaxwell> We know why. Because people intentionally put virus signatures in txouts.
111 2014-06-09 00:55:48 <uiop> heh
112 2014-06-09 00:55:55 <goosoodude> ok.
113 2014-06-09 00:56:15 <btc123> hah
114 2014-06-09 00:56:41 <phantomcircuit> block 135k is only 1.2% progress
115 2014-06-09 00:56:43 <phantomcircuit> lol
116 2014-06-09 00:56:44 <btc123> goosoodude: has a virus scanner picked it up from you?
117 2014-06-09 00:56:48 <phantomcircuit> this is going to take hours
118 2014-06-09 00:57:11 <uiop> i wonder how big the smallest virus signatures (that are current, whatever) are
119 2014-06-09 00:57:19 <goosoodude> no. Need to download Norton. Obviously, starting with the king of false positives.
120 2014-06-09 00:57:53 <buZz> startkeylogger ? :)
121 2014-06-09 00:58:12 <btc123> anyway, gmaxwell /sipa have a good solution, just xor the block data with the hash. problem solved
122 2014-06-09 00:58:40 <shesek> Luke-Jr, it seems like eligius's pushtx is choking on large transactions
123 2014-06-09 00:58:50 <phantomcircuit> if the progress meter is correct it's going to take me about 2 full days to completely reindex under valgrind
124 2014-06-09 00:58:54 <phantomcircuit> sigh
125 2014-06-09 00:59:07 <phantomcircuit> guess i should move this to a server
126 2014-06-09 01:00:17 <goosoodude> Who's to say it won't pick up on the xor?
127 2014-06-09 01:00:45 <phantomcircuit> goosoodude, common sense and math
128 2014-06-09 01:01:29 <uiop> oh, a virus sig is just a grep or something..
129 2014-06-09 01:02:14 <phantomcircuit> sigh
130 2014-06-09 01:02:28 <btc123> goosoodude: sounds like you have some learning to do,
131 2014-06-09 01:02:29 <phantomcircuit> with xor the data would be completely random
132 2014-06-09 01:02:36 <phantomcircuit> the only signature there is entropy
133 2014-06-09 01:02:53 <phantomcircuit> in which case the av would delete random data files also
134 2014-06-09 01:03:14 <goosoodude> I do. I got into bitcoin because of a learning experience, and I'm going to continue that :P