1 2012-08-19 12:21:01 <denisx> jgarzik: do I understand it correctly that the histtable to check for duplicates in pushpoold grows until it reaches 10000 and then it stays there?
  2 2012-08-19 12:26:56 <amiller> alright, merkle tree protocol specification and reference implementation is up: https://bitcointalk.org/index.php?topic=101734.0
  3 2012-08-19 12:37:17 <Eliel> amiller: one thought occurred to me about this UTXO merkle tree. Couldn't you cut the size needed to store the merkle tree by 50% by not storing the leaf hashes? Those can always by recalculated O(1)
  4 2012-08-19 12:37:30 <amiller> yup
  5 2012-08-19 12:40:46 <Eliel> does ultrapune database contain the UTXO list sorted by txid?
  6 2012-08-19 12:42:41 <amiller> yes
  7 2012-08-19 12:45:07 <Eliel> ok, that potentially makes reading neighboring txs fast, so you could maybe get away with not storing up to 3-levels of the merkle tree without an IO penalty on a harddrive.
  8 2012-08-19 12:46:12 <Eliel> that'd drop the space requirement for the merkle tree by 87.5% (assuming all nodes take equal space)
  9 2012-08-19 12:47:15 <Eliel> ah wait, I'm over-optimistic here
 10 2012-08-19 12:47:21 <Eliel> you can't drop all the data, just the hashes
 11 2012-08-19 12:50:33 <amiller> also full validation requires recomputing the all the new hashes at least for each block
 12 2012-08-19 12:50:42 <amiller> for example you never need to compute the intermediate root hash in between transactions within a block
 13 2012-08-19 12:51:07 <amiller> but you'd have to compute the root hash after the last block
 14 2012-08-19 12:51:28 <amiller> you'd be able to cut out some constant fraction of the hash computations this way
 15 2012-08-19 12:52:24 <Eliel> how big does the merkle tree get? excluding the UTXO database itself?
 16 2012-08-19 12:54:45 <amiller> i don't know, it got up to 2 gigabytes before block 140k, according to 'top' when I ran it, but that's with python's overhead for nested tuples, and no compression at all (the keys are mostly redundant)
 17 2012-08-19 12:57:49 <Eliel> python's overhead can be rather massive.
 18 2012-08-19 13:01:58 <amiller> my guess is that the storage requirements for the UTXO will be an order of magnitude greater than in ultraprune, but the tradeoff is that even a full-validation node doesn't even need to store the whole thing
 19 2012-08-19 13:02:14 <amiller> storage is the relatively easy part
 20 2012-08-19 13:02:39 <Joric> 'standard Satoshi bitcoin client (...) To do this, select Help -> Debug Window, and in the window, on the bottom line, enter importprivkey <privatekey>' what is this sorcery? 0.7? is there a win32 build of that?
 21 2012-08-19 13:02:41 <amiller> since this proposal only involves untrusted data, it's easy to imagine bittorrenting everything
 22 2012-08-19 13:03:33 <Joric> or maybe there is a command line key for enabling the debug window
 23 2012-08-19 13:04:29 <Eliel> amiller: I'd think it's relatively easy to estimate the size.
 24 2012-08-19 13:05:11 <amiller> sure, the total number of nodes is 2*N, and so that's the number of unique node hashes
 25 2012-08-19 13:05:20 <Eliel> amiller: is 4 bytes enough for the index?
 26 2012-08-19 13:05:24 <amiller> the leaf nodes are already hashes of keys so that includes it
 27 2012-08-19 13:05:33 <amiller> not only is 4 bytes enough, but you could like, delta compress them
 28 2012-08-19 13:05:49 <amiller> lets just say we say they're amortized zero and try to lower-bound it
 29 2012-08-19 13:06:08 <amiller> 2*N*32 bytes
 30 2012-08-19 13:06:33 <amiller> What's N? sipa said that the peak UTXO count occurs right now (i think that's probably true)
 31 2012-08-19 13:06:50 <amiller> how many UTXO's are there right now?
 32 2012-08-19 13:07:19 <Eliel> we don't need that. N should cancel itself out if we calculate the relative size
 33 2012-08-19 13:07:28 <amiller> the relative size isn't fixed
 34 2012-08-19 13:08:04 <Eliel> why not? transaction size variance?
 35 2012-08-19 13:08:13 <amiller> it's not really about transaction size variance but about number of txoutputs
 36 2012-08-19 13:08:17 <amiller> the number of txoutputs varies
 37 2012-08-19 13:08:37 <amiller> also the point is that UTXOs get removed from the set once they're spent
 38 2012-08-19 13:09:45 <amiller> this is a statistic that should be included in blockchain.info and such! but apparently it's the case that you'd need to be running sipa's ultraprune to even have a shot at calculating that
 39 2012-08-19 13:13:12 <Eliel> well, let's see how long this executes :D "select count(a.txout_id) FROM txout a LEFT OUTER JOIN txin b ON (b.txout_id=a.txout_id) WHERE b.txin_id IS NULL"
 40 2012-08-19 13:16:02 <amiller> if sipa says 80mb, and an address is roughly 32 bytes, i'd guess there are about 250,000
 41 2012-08-19 13:16:40 <amiller> that's assuming there are on average one outstanding txout per transaction
 42 2012-08-19 13:17:24 <amiller> otherwise, sipa's database would be getting an advantage and mine have to count 2*N against the full number of unique txouts, no matter how many are grouped together with a single txid
 43 2012-08-19 13:17:52 <Joric> oh cool got that version with debug window here http://luke.dashjr.org/programs/bitcoin/files/bitcoind/next/test/20120813/bitcoin-next-test-20120813-win32.zip
 44 2012-08-19 13:22:19 <jgarzik> no orphan chains last night... pynode still up
 45 2012-08-19 13:23:16 <amiller> i thought you fixed reorgs!
 46 2012-08-19 13:24:27 <Joric> i love debug window ^_^
 47 2012-08-19 13:30:42 <Joric> heh i thought gettransaction is a full blown json transaction it's just some really basic information, json-formatted
 48 2012-08-19 13:32:41 <Joric> at least getrawtransaction is fine, bw can decode those
 49 2012-08-19 13:34:57 <denisx> jgarzik: do I understand it correctly that the histtable to check for duplicates in pushpoold grows until it reaches 10000 and then it stays there?
 50 2012-08-19 13:37:44 <jgarzik> denisx: yes
 51 2012-08-19 13:39:05 <denisx> jgarzik: I think that list can be cleared when a new block is found
 52 2012-08-19 13:39:08 <Eliel> amiller: 2049000 is the count of txouts that haven't been linked to a txin in my bitcoin-abe database.
 53 2012-08-19 13:40:16 <amiller> mmm estimation for-the-win, i kind of have a streak going
 54 2012-08-19 13:41:03 <amiller> wait i was off by an order of magnitude
 55 2012-08-19 13:41:26 <amiller> (so much for my streak) i wonder if that means that many of them are grouped in transactions
 56 2012-08-19 13:41:44 <amiller> can you tell me how many unique txout's are represented in there?
 57 2012-08-19 13:42:49 <Eliel> you mean unique txouts as in how many of those txouts are the only txout in their txid?
 58 2012-08-19 13:44:17 <amiller> the number of unique txid's that currently have an unspent-txout
 59 2012-08-19 13:46:06 <amiller> this would let me relate the estimated size of the UTXO-tree, which can't take advantage of compressibility when there are few txid's compared to txouts, to the current ultraprune-UTXO, which can
 60 2012-08-19 13:46:10 <Eliel> ... ok, let me scratch my head on how to make this into SQL statement first :)
 61 2012-08-19 13:46:13 <amiller> :p
 62 2012-08-19 13:54:29 <Eliel> it's a shame it's the txin table that references txout table and not the other way around. would be a fast one table query otherwise :)
 63 2012-08-19 13:54:34 <Eliel> well, afte adding an index.
 64 2012-08-19 13:56:52 <Eliel> amiller: 1027344 txids
 65 2012-08-19 13:57:41 <genjix> jgarzik: what do you think about manned mars missions? i think they are the most important goal of the human race, so when i see this, https://bitcointalk.org/index.php?topic=1601.msg20309#msg20309 , it kills me inside.
 66 2012-08-19 13:58:35 <amiller> ohh pff, i _was_ correct with my estimate, i just copied the number wrong - i estimated 2,000,000 uto's based on 80mb ultraprune and my streak is intact.
 67 2012-08-19 14:00:00 <amiller> Eliel, here's a simple one - what's the total number of txouts, in history, not just the currently unspent ones?
 68 2012-08-19 14:02:51 <amiller> thanks by the way for running these interesting queries
 69 2012-08-19 14:03:20 <amiller> if you were to pm me a bitcoin address i would send it a bitcoin out of gratitude
 70 2012-08-19 14:07:01 <Eliel> amiller: 14072679 total txouts
 71 2012-08-19 14:09:55 <Eliel> amiller: these stats are 4 blocks old though at the moment though :)
 72 2012-08-19 14:10:19 <amiller> interesting, i know from blockchain.info that there are about 1195159 transactions
 73 2012-08-19 14:10:47 <Eliel> the cronjob only runs once an hour to update the database
 74 2012-08-19 14:11:16 <lianj> amiller: more like 6,192,500
 75 2012-08-19 14:11:40 <amiller> yeah
 76 2012-08-19 14:11:48 <amiller> 16,471,556 actually
 77 2012-08-19 14:12:08 <lianj> where does that number come from?
 78 2012-08-19 14:12:16 <amiller> http://blockchain.info/tx-index/16471556/32ef4bc10f573fa7ae4ed9ec19ac2bdae688753ea9629970082252244a0916ea
 79 2012-08-19 14:12:51 <amiller> maybe i'm misinterpreting that as the actual transaction sequence number, and it's just something blockchain.info uses?
 80 2012-08-19 14:13:10 <amiller> 6,192,500 makes much more sense :/
 81 2012-08-19 14:13:22 <amiller> an average of 2 txout per tx is what i would expect
 82 2012-08-19 14:16:35 <lianj> definitely more like 6,192,500
 83 2012-08-19 14:49:35 <Eliel> amiller: that 16,471,556 is likely the database row id number.
 84 2012-08-19 14:54:22 <genjix> https://bitcointalk.org/index.php?topic=101686.msg1113153#msg1113153
 85 2012-08-19 14:54:32 <genjix> this post is very confusing to me.
 86 2012-08-19 14:57:16 <Eliel> if that's the goal, then it really should be clearly stated on bitcoin.org website.
 87 2012-08-19 14:58:06 <Eliel> along with pointers on where to go if you're not a geeky type who knows how to keep their computer secure :P
 88 2012-08-19 15:08:57 <jgarzik> tcatm_: bitcoinwatch.com block count is wrong
 89 2012-08-19 15:14:14 <D34TH> question: would it be better to loadblocks 0001 or 0002
 90 2012-08-19 15:14:23 <D34TH> i.e. does 0002 chain 0001
 91 2012-08-19 15:20:04 <jgarzik> D34TH: blk0002.dat is the continuation of blk0001.dat.  You need both, and you want to start at blk0001.dat.
 92 2012-08-19 15:21:33 <D34TH> thanks
 93 2012-08-19 15:24:03 <jgarzik> genjix: yeah, they are sillyheads.  everyone is entitled to their opinions, and everyone has different priorities....  I think we make Earth better by exploring other planets.  Other people prefer to spend Earth money on Earth things.
 94 2012-08-19 15:25:04 <jgarzik> sipa gmaxwell: is mempool ACK-worthy?  after reverting my most recent change, mempool/getdata changes are as you remember them
 95 2012-08-19 15:25:38 <jgarzik> sipa: I think genjix' argument convinced me that making mempool part of the mandatory NODE_NETWORK protocol is the best thing to do
 96 2012-08-19 15:25:56 <jgarzik> in a year, we will all be glad it was included in the base proto
 97 2012-08-19 15:52:39 <Eliel> jgarzik: I got the impression his beef wasn't with space exploration or science but rather that the money for it is collected by force.
 98 2012-08-19 15:53:08 <Eliel> as in, it'd be fine if it was based on voluntary contributions.
 99 2012-08-19 17:39:44 <question> hellow
100 2012-08-19 17:39:52 <question> i have got a question
101 2012-08-19 17:39:57 <lianj> ask
102 2012-08-19 17:39:58 <sunshinehappy> hi
103 2012-08-19 17:40:21 <question> when my bitcoin wallet loads there is a message i ave never seen before
104 2012-08-19 17:40:46 <question> it asked me to upgrade or my nodes need to upgrade
105 2012-08-19 17:40:55 <question> i have downloaded the lastest version twice
106 2012-08-19 17:41:08 <question> can anywone help me
107 2012-08-19 17:41:21 <gmaxwell> question: what block number are you on? and is the block count going up?
108 2012-08-19 17:41:32 <question> the block count is stopped
109 2012-08-19 17:41:38 <gmaxwell> Stopped at what?
110 2012-08-19 17:41:40 <gmaxwell> ;;bc,blocks
111 2012-08-19 17:41:41 <gribble> 194678
112 2012-08-19 17:41:50 <question> i will tell you when the wallet is finished loading
113 2012-08-19 17:43:10 <question> it can take 5 , cpu ! :(
114 2012-08-19 17:47:32 <question> the message below in the wallet where normally the block counts : Warning : displayed transactions may not be correct ! you may need to upgrade , or other nodes need to upgrade
115 2012-08-19 17:47:36 <question> the block :
116 2012-08-19 17:48:02 <question> 193745
117 2012-08-19 17:50:25 <question> gmaxwell
118 2012-08-19 18:00:29 <cande> i get tons of errors when downloading the block chain
119 2012-08-19 18:01:15 <cande> i'm tailing the logfile
120 2012-08-19 18:02:20 <gmaxwell> cande: some 'errors' are normal and not errors at all.
121 2012-08-19 18:02:23 <gmaxwell> Can you pastebin.
122 2012-08-19 18:02:27 <cande> sure
123 2012-08-19 18:02:56 <jgarzik> w00t.  pynode seems to have survive the bitcoincharts reorganize torture test.
124 2012-08-19 18:03:05 <jgarzik> *survived
125 2012-08-19 18:03:09 <cande> http://pastebin.com/DLrD2xNZ
126 2012-08-19 18:03:33 <cande> gmaxwell ^
127 2012-08-19 18:03:33 <jgarzik> now downloading the rest of the chain, then for some night-long automated verification testing
128 2012-08-19 18:04:59 <cande> gmaxwell, okej, now it's accepting tons of blocks with very few errors
129 2012-08-19 18:05:28 <gmaxwell> cande: yea, those aren't actual errors. Thats expected.
130 2012-08-19 18:05:42 <gmaxwell> It's just new transactions which you can't yet validate because you aren't caught up yet.
131 2012-08-19 18:05:58 <cande> ah
132 2012-08-19 18:10:13 <cande> gmaxwell, thx
133 2012-08-19 18:58:31 <sipa> jgarzik: yes, no problems with mempool
134 2012-08-19 19:01:16 <sipa> amiller: an UTXO takes on average 23 bytes in ultraprune
135 2012-08-19 19:01:37 <sipa> but each tx has an overhead of 36 bytes or so
136 2012-08-19 19:02:43 <amiller> does your database (i assume bdb still) take advantage of the fact that the txids are only roughly 50% unique?
137 2012-08-19 19:03:40 <amiller> er, i skimmed your code and concluded you used a packed format rather than bdb but i'm not too sure
138 2012-08-19 19:04:01 <sipa> ?
139 2012-08-19 19:04:16 <sipa> it's a txid -> utxo map
140 2012-08-19 19:04:30 <sipa> the data itself does not contain txids
141 2012-08-19 19:04:32 <amiller> hm.
142 2012-08-19 19:04:50 <sipa> i use bdb
143 2012-08-19 19:05:05 <sipa> but CCoins is the utxo's of one tx
144 2012-08-19 19:05:15 <sipa> and that is serialized
145 2012-08-19 19:06:20 <sipa> but the txouts themselves are very cheap, 21 bytes for a normal pay to pubkeyhash script, and 2 bytes for the amount
146 2012-08-19 19:06:34 <sipa> it's the txids that are expensive
147 2012-08-19 19:07:52 <amiller> fair enough, i'm predicting i'm going to hit 64 bytes or so additional per txout
148 2012-08-19 19:08:47 <amiller> bah, i just realized i forgot to actually implement the "ADDR" index type
149 2012-08-19 19:12:24 <sipa> amiller: 64 bytes per txout, or per tx?
150 2012-08-19 19:12:30 <amiller> per txout
151 2012-08-19 19:13:06 <amiller> nothing _additional_ txid i don't think
152 2012-08-19 19:13:09 <amiller> per txid*
153 2012-08-19 19:27:30 <amiller> i'm looking into 'camlistore', its an a system for "content addressable storage" camlistore.org
154 2012-08-19 19:49:31 <amiller> maybe this is a closer fit: http://twistedstorage.sourceforge.net/features.html
155 2012-08-19 19:51:09 <amiller> bah, none of this stuff is widely used
156 2012-08-19 20:04:28 <sipa> amiller: i have no idea what that site is about
157 2012-08-19 20:04:35 <sipa> something about storage of documents
158 2012-08-19 20:05:55 <amiller> i was kind of hoping there would be a simple library that's like a key-value store, except only for when (key,value) = (H(value), value)
159 2012-08-19 20:06:28 <amiller> apache cassandra for example is a great key value store but it has a lot of unnecessary overhead related to collisions/consistency
160 2012-08-19 20:07:21 <amiller> i say
161 2012-08-19 20:07:35 <amiller> er, i say "a lot of overhead" but that might not be justified
162 2012-08-19 20:09:05 <ageis> distributed keystores mmm
163 2012-08-19 20:09:40 <amiller> most of the values are highly redundant, and in fact they overlap with the set of keys too - so if such a library exists, i wouldn't have to worry about it and it would compress it all for me
164 2012-08-19 20:51:40 <bdcs> It doesn't really matter, he got every block ; )
165 2012-08-19 20:51:50 <bdcs> nm, was scrolled way up
166 2012-08-19 20:58:21 <amiller> oh you know what
167 2012-08-19 20:59:34 <amiller> hrm, if you build/store the UTXO-sets as a binary tree with the same balancing algorithm, you really don't need to compute _any_ hashes
168 2012-08-19 20:59:47 <amiller> you can always compute the hashes later, whenever you feel like it
169 2012-08-19 21:00:20 <amiller> that means you could boot strap real fast.
170 2012-08-19 21:00:30 <amiller> and validate probabilistically
171 2012-08-19 21:01:38 <amiller> to validate probabilistically, you would do the full tree balancing procedure at each step, and you would recompute all the hashes at random intervals
172 2012-08-19 21:04:48 <amiller> okay, i think i suddenly understand the significance of tries vs balanced trees, but it's subtle :/
173 2012-08-19 21:06:04 <Eliel> amiller: do tell :)
174 2012-08-19 21:06:27 <amiller> the way i implement balanced trees, it's optimally efficient to store all the trees in history at once
175 2012-08-19 21:06:37 <amiller> this is known as a "persistent authenticated dictionary"
176 2012-08-19 21:06:44 <amiller> and it's why this structure would let us fork without reorgs
177 2012-08-19 21:07:59 <amiller> you wouldn't need to reorg, or at least reorgs would be simpler, you just switch your head pointer and revalidate from ther
178 2012-08-19 21:09:13 <Eliel> and the difference tries would make?
179 2012-08-19 21:09:58 <denisx> have the asics arrived?
180 2012-08-19 21:10:04 <denisx> the hashrate is exploding
181 2012-08-19 21:10:10 <amiller> erm, maybe tries can do that as well.
182 2012-08-19 21:11:44 <Eliel> denisx: I wouldn't call that exploding just yet.
183 2012-08-19 21:11:49 <Eliel> if it keeps going up, then maybe
184 2012-08-19 21:11:56 <Eliel> but it's still within normal range.
185 2012-08-19 21:12:27 <denisx> I think it is to steep for a normal range
186 2012-08-19 21:12:56 <Eliel> denisx: look at the hashrate growth during last summer. That was steeper
187 2012-08-19 21:13:45 <denisx> Eliel: I hope youre right! ;)
188 2012-08-19 21:20:03 <sipa> amiller: that's hiw roconnor implemented it in purecoin
189 2012-08-19 21:20:12 <sipa> *how
190 2012-08-19 21:20:39 <amiller> ACTION should probably look at more of pure coin
191 2012-08-19 21:21:01 <amiller> i don't remember my link to purecoin, i asked roconnor for it in private but it seemed like he didn't want it out in the open
192 2012-08-19 21:21:15 <sipa> it's on his darcs repository
193 2012-08-19 21:21:30 <gmaxwell> darcs http://r6.ca/Purecoin
194 2012-08-19 21:22:14 <amiller> i'm going to ask him if i can mirror it to github
195 2012-08-19 21:22:34 <sipa> roconnor: can amiller mirror purecoin on github?
196 2012-08-19 21:22:59 <sipa> gavinandresen: welcome back, by the way
197 2012-08-19 21:23:06 <gavinandresen> thanks
198 2012-08-19 21:25:47 <genjix> https://www.youtube.com/watch?v=AiVnMazRIII
199 2012-08-19 21:29:11 <amiller> i should just take the haskell library i used for my specification, cram hashes into it, and check it for equivalence to my current implementation.
200 2012-08-19 21:29:47 <amiller> i bet haskell's object-graphs are way more efficient
201 2012-08-19 21:31:16 <Eliel> amiller: that sounds much better than doing it in python :)
202 2012-08-19 21:31:50 <amiller> it's either going to have to recompute a ton of hashes, or it's going to have to memoize everything
203 2012-08-19 21:32:25 <amiller> i wonder if haskell does something intelligent like implement a least-recently-used cache so i can have a huge virtual memory space and it will start putting things on disk
204 2012-08-19 21:32:52 <Eliel> not without you telling it to do so by using a data structure that does that.
205 2012-08-19 21:33:25 <Eliel> actually, I'm not sure if what you want to do is possible.
206 2012-08-19 21:34:21 <Eliel> well, it probably is now that I think about it. No idea if anyone's made a library for it though
207 2012-08-19 21:34:57 <Eliel> I think the ideal solution is to memoize the more internal hashes and just recompute the rest as needed.
208 2012-08-19 21:36:30 <Eliel> otherwise you'll need to calculate 2*(2**k) hashes for each update.
209 2012-08-19 21:36:50 <Eliel> where k is the depth of the tree
210 2012-08-19 21:37:26 <Eliel> anyway, I better get some sleep. Good night.
211 2012-08-19 21:40:16 <roconnor> I think it would be a bad idea to put purecoin on github, but I cannot stop anyone from doing so.
212 2012-08-19 21:40:27 <roconnor> sipa: amiller: I think it would be a bad idea to put purecoin on github, but I cannot stop anyone from doing so.
213 2012-08-19 21:40:45 <roconnor> which reminds me I need to put an MIT licence on it
214 2012-08-19 21:41:11 <lianj> roconnor: why bad idea?
215 2012-08-19 21:41:36 <roconnor> lianj: because it isn't fully compaitble with bitcoin
216 2012-08-19 21:41:54 <lianj> ah
217 2012-08-19 21:42:06 <roconnor> and it would be bad if incompatable versions of bitcoin became popular
218 2012-08-19 21:42:27 <roconnor> granted this is going to happen anyways, so many I shouldn't fret over it
219 2012-08-19 21:43:07 <sipa> roconnor: don't worry, it's haskell, it won't be used :)
220 2012-08-19 21:43:10 <amiller> unless it's a very subtle kind of incompatible, i don't see how someone would be confused into thinking it's compatible if it isn't
221 2012-08-19 21:43:11 <roconnor> :)
222 2012-08-19 21:43:12 <lianj> in what sense is it incompatable?
223 2012-08-19 21:43:16 <lianj> sipa: hihi
224 2012-08-19 21:43:32 <sipa> "avoid success at all costs"
225 2012-08-19 21:43:36 <roconnor> lianj: in one of the million subtle ways implementations can be incompataible
226 2012-08-19 21:43:48 <roconnor> if I knew in which way, I'd fix it
227 2012-08-19 21:44:31 <roconnor> lianj: one way is that it tramples duplicate coins
228 2012-08-19 21:46:27 <Eliel> duplicate unspent txouts are invalid now
229 2012-08-19 21:46:41 <amiller> i hate to contradict your judgment, but if that's the main thing holding you back - go ahead and put an MIT license on it, and I'll take the heat for unleashing it on the world.
230 2012-08-19 21:46:50 <roconnor> lianj: I also don't disallow pushing integers larger than 4 bytes
231 2012-08-19 21:47:31 <roconnor> lianj: and probably other things
232 2012-08-19 21:49:01 <roconnor> okay, there should be an MIT licence on it now.
233 2012-08-19 21:49:26 <roconnor> amiller: I cannot stop you from putting it on github, but I think you should reconsider doing so.
234 2012-08-19 21:49:48 <roconnor> or at least be aware of the downsides
235 2012-08-19 21:50:08 <amiller> last time, i asked for specific permission to post a particular file, the set of script opcodes
236 2012-08-19 21:50:27 <amiller> mostly my interest would be in linking to specific parts i think are good/interesting ideas or that i'd use for comparison
237 2012-08-19 21:51:07 <amiller> so maybe you would prefer if i didn't push the whole thing up there but only the parts i really want to
238 2012-08-19 21:51:11 <roconnor> pasting bits and pieces is probably quite reasonable.  It is helpful to be formal when talking about things.
239 2012-08-19 21:52:03 <Eliel> you could also express your concerns in a readme file
240 2012-08-19 21:52:27 <Eliel> github shows the readme, if any, automatically
241 2012-08-19 21:52:43 <roconnor> maybe that is a good idea
242 2012-08-19 21:59:19 <amiller> i think the thing to do is to build a redblack tree using a 'zipper monad'
243 2012-08-19 21:59:46 <amiller> the reason why is that i can compose it with a 'memoize monad' which would let me implement whatever caching policy i need
244 2012-08-19 22:01:36 <amiller> i know of a haskell red-black tree using a zipper, but a) it doesn't use the monadic style so i'd have to rewrite it in order to compose with memo-monad, b) it didn't implement the "delete" half of redblack tree, and c) i'd have to add the 'digest' computation
245 2012-08-19 22:03:08 <roconnor> amiller: use a Patrica Trie
246 2012-08-19 22:03:19 <roconnor> it has a canonical shape
247 2012-08-19 22:03:24 <amiller> roconnor, do you think a patricia will work as a hash tree
248 2012-08-19 22:03:26 <roconnor> for a given data set
249 2012-08-19 22:03:38 <amiller> how would you augment a patricia trie with hashes
250 2012-08-19 22:03:51 <roconnor> I rejected balanced binary trees because they depend on what order things are inserted and deleted
251 2012-08-19 22:04:12 <amiller> etotheipi has been trying to convince me of exactly that point for weeks now.
252 2012-08-19 22:04:24 <roconnor> listen to him :)
253 2012-08-19 22:04:45 <amiller> i don't understand why that insertion order is important? and in either case, what's stopping both of us is i don't see how to implement a patricia trie with hashes efficiently
254 2012-08-19 22:05:32 <sipa> a binary patricia tree, with each node containing both hashes of the subnodes
255 2012-08-19 22:05:39 <sipa> shouldn't be too hard
256 2012-08-19 22:05:48 <sipa> as it's essentially automatically balanced
257 2012-08-19 22:05:59 <roconnor> sipa: not automatically
258 2012-08-19 22:06:16 <sipa> although it can have theoretically depth 256
259 2012-08-19 22:06:42 <roconnor> patricia tree are expected to be balanced given random keys
260 2012-08-19 22:06:57 <sipa> but in practice any depth over 2*log(2) should be vanishingly small
261 2012-08-19 22:07:14 <sipa> *probability for
262 2012-08-19 22:07:43 <amiller> that randomness is probably pretty close to what would be the typical scenario
263 2012-08-19 22:07:43 <sipa> which is as good as red-black tree
264 2012-08-19 22:08:14 <roconnor> amiller: yes, which is why you should use a patrica trie
265 2012-08-19 22:08:32 <amiller> can you explain why insertion order independence matters for bitcoin?
266 2012-08-19 22:08:41 <amiller> if you have a block, you have the hash of the previous tree as well
267 2012-08-19 22:08:54 <amiller> etotheipi has been stuck trying to explain to me why it's important
268 2012-08-19 22:09:09 <roconnor> amiller: we are talking about a tree of unspent coins, right?
269 2012-08-19 22:09:15 <amiller> yes
270 2012-08-19 22:09:30 <sipa> in case of reorgs, the order can depend on the history
271 2012-08-19 22:09:52 <sipa> as not all nodes have necessarily seen the reorg
272 2012-08-19 22:09:53 <roconnor> amiller: it is so that you will create the correct tree just given the data of the tree.
273 2012-08-19 22:10:00 <amiller> but if you have a reorg, wherever the fork point is, you have the hash of the tree at that point
274 2012-08-19 22:10:13 <amiller> you create the correct tree, given the block where the forkpoint is
275 2012-08-19 22:10:15 <roconnor> amiller: the reason is defensive programming
276 2012-08-19 22:10:42 <sipa> yes, it's just a nice property that the hash only depends on the utxo set, and nothing else
277 2012-08-19 22:12:55 <roconnor> for a protocol, especially of the nature of bitcoin, you want the number of non-essential programming choices to be as small as possible so that you are more likely to have implementor impelement the spec correctly.
278 2012-08-19 22:13:22 <roconnor> Following Murphy's law: anything that can go wrong, will go wrong.
279 2012-08-19 22:13:24 <sipa> wwe have way too many arbitrary-sounding rules already
280 2012-08-19 22:13:37 <amiller> i hope you'll let me match that somewhat-weak point with another weak-point
281 2012-08-19 22:13:41 <roconnor> when designing something you want to make sure that nothing can go wrong, because if it can go wrong, it will go wrong.
282 2012-08-19 22:13:42 <amiller> the patricia trie has a poor worst-case guarantee
283 2012-08-19 22:13:50 <amiller> and even though the utxos are likely to be random
284 2012-08-19 22:14:00 <amiller> since people have control over their addresses, it would still be vulnerable to the worst-case
285 2012-08-19 22:14:11 <amiller> and the problem with defensive programming is that you're constrained by the worst _plausible_ validation effort
286 2012-08-19 22:14:19 <sipa> attacking it costs hashing power
287 2012-08-19 22:14:31 <roconnor> e.g. make sure your accellormeters can only be inserted one way: http://r6.ca/blog/20050928T195400Z.html
288 2012-08-19 22:14:32 <amiller> someone could _tell_ you that you need to validate a very unbalanced merkle trie and you'd have to expend more effort to show it's wrong
289 2012-08-19 22:15:15 <amiller> with the red black tree, you have a firm worst-case 2*log N (or something near that) that lets you reject quickly
290 2012-08-19 22:15:47 <roconnor> well the worse case depth is only 256 checks;  I personally consider it an acceptable trade-off.  But it is probably a matter of opinion.
291 2012-08-19 22:15:58 <sipa> 160 actually
292 2012-08-19 22:16:12 <sipa> if someone manages to create a collision on 160 bits, bitcoin is dommed
293 2012-08-19 22:16:17 <sipa> *doomed
294 2012-08-19 22:16:32 <sipa> sorry, preimage
295 2012-08-19 22:16:36 <sipa> never mind
296 2012-08-19 22:16:54 <amiller> a txout is 256+8
297 2012-08-19 22:16:55 <sipa> here the problem is collisions, which is a lot easier than preimages
298 2012-08-19 22:17:09 <sipa> my utxo set is only indexed by txid
299 2012-08-19 22:17:10 <amiller> and if you want to include a search index for addresses, then the keys are 160+256+8
300 2012-08-19 22:17:44 <amiller> sipa, yes but then you would need auxiliary information that you couldn't validate just given a root hash
301 2012-08-19 22:17:56 <sipa> amiller: such as?
302 2012-08-19 22:19:13 <amiller> if all i have is the root hash, and all i learn from querying the tree is that there exists an unspent output in txid=xxxx, then i would still have to rescan the chain to see which txout(s) are unspent
303 2012-08-19 22:19:51 <sipa> the value contains the actual txouts
304 2012-08-19 22:19:59 <sipa> but those are not part of the key
305 2012-08-19 22:21:08 <phantomcircuit> DiabloD3, any chance you'd be willing to write a gpu based vanity generator for tor .onion addresses?
306 2012-08-19 22:21:19 <phantomcircuit> the existing cpu implementation leaves something to be desired
307 2012-08-19 22:21:27 <sipa> so i compromise the efficiency of looking up a utxo (by requiring a linear search of the value in the txid->utxos map), for improved storage
308 2012-08-19 22:21:47 <sipa> as some data is shared for all utxos of one tx
309 2012-08-19 22:22:24 <amiller> but it's an unnecessary compromise, you can take advantage of compressed storage anyway regardless of how the hashes are combined
310 2012-08-19 22:22:36 <sipa> how so?
311 2012-08-19 22:23:03 <amiller> well for example you can store just the bitvector for the utxos
312 2012-08-19 22:23:26 <sipa> (my utxo data so contains the height of each tx, for example)
313 2012-08-19 22:23:31 <sipa> *also
314 2012-08-19 22:24:01 <amiller> let me put it this way
315 2012-08-19 22:24:23 <amiller> hrm
316 2012-08-19 22:24:44 <amiller> basically you could store two trees, one of them just contains the binary tree of hashes, and the other just contains the trie of data without hashes
317 2012-08-19 22:25:03 <sipa> ok
318 2012-08-19 22:25:05 <sipa> yes
319 2012-08-19 22:25:21 <DiabloD3> [08:21:08] <phantomcircuit> DiabloD3, any chance you'd be willing to write a gpu based vanity generator for tor .onion addresses?
320 2012-08-19 22:25:24 <DiabloD3> there already is one
321 2012-08-19 22:25:36 <phantomcircuit> is there? i could only find the cpu one
322 2012-08-19 22:25:52 <phantomcircuit> shallot is the only one i could fine
323 2012-08-19 22:25:54 <phantomcircuit> find
324 2012-08-19 22:26:55 <amiller> yeah i don't know where to go from there, it wasn't such a clear thought
325 2012-08-19 22:27:03 <DiabloD3> phantomcircuit: someone ported shallot to gpu
326 2012-08-19 22:27:14 <DiabloD3> its on github somewhere
327 2012-08-19 22:27:28 <sipa> amiller: my design decision was made partly becaus the current database layer (bdb, or later maybe levelfb), stores keys and values completely
328 2012-08-19 22:28:12 <sipa> so i didn't want to store the txids (which are effectively like 35-40% of thr serialized data)
329 2012-08-19 22:28:19 <sipa> multiple times
330 2012-08-19 22:29:25 <amiller> that doesn't seem like a concern that should be reflected in the normative specification though
331 2012-08-19 22:29:39 <amiller> if you find a better compressing database then it won't matter
332 2012-08-19 22:29:39 <sipa> the original point of ultraprune was actually not to rewrite the validation engine, but just to have an idea of how small the utxobset would be
333 2012-08-19 22:30:16 <amiller> it's different when we're talking about committing to a particular structure in a blockchain, especially for lite clients with minimal trusted storage
334 2012-08-19 22:30:49 <amiller> like putting a trie in the normative protocol makes storage/implementation superficially easier, but it exposes lite clients to worse plausible-validation ddos attacks
335 2012-08-19 22:32:04 <sipa> with the work currently performed by the entire bitcoin network, you could have some chance at grtting a collision of 140 bits
336 2012-08-19 22:32:05 <amiller> putting a balanced binary tree in the normative protocol means you'll need to use a compression heuristic or something, but the information complexity is the same for random data, and lite clients have a better "firm reject" limit
337 2012-08-19 22:33:15 <amiller> sipa, in order for your point to be applicable, you would have to be talking about making a "hard reject rule" that doesn't check past 140 bits - that is actually weakening your security level to that number of bits
338 2012-08-19 22:33:15 <phantomcircuit> DiabloD3, lol even knowing it's there i still cant find it
339 2012-08-19 22:33:18 <sipa> well, if the balanced tree implementation is over 8 times slower than the patricia system (just a constant factor), that point becomes invalid :)
340 2012-08-19 22:33:29 <DiabloD3> phantomcircuit: weird
341 2012-08-19 22:33:55 <roconnor> in theory patrica tree lookups can be pretty fast
342 2012-08-19 22:34:14 <sipa> because 40 bit collisions will occur in a current patricia tree, and 40 levels will happen in redblack trees too
343 2012-08-19 22:34:27 <amiller> roconnor, when you say that, are you thinking about patricia "Merkle" tree lookups, including validating a hash at each bit?
344 2012-08-19 22:34:38 <roconnor> amiller: no
345 2012-08-19 22:35:49 <phantomcircuit> DiabloD3, #tor oftc people say it doesn't exist
346 2012-08-19 22:35:56 <phantomcircuit> who is right? FIGHT TO THE DEATH
347 2012-08-19 22:36:40 <sipa> amiller: so the worst case difference is at most a constant factor 6
348 2012-08-19 22:37:06 <sipa> and that is assuming a 2^256 step attack, which would kill bitcoin anyway
349 2012-08-19 22:37:12 <phantomcircuit> DiabloD3, lol the cpu based one is a fucking mess
350 2012-08-19 22:37:16 <phantomcircuit> it keeps seg faulting
351 2012-08-19 22:37:23 <phantomcircuit> race conditions everywhere
352 2012-08-19 22:37:31 <amiller> so what's the case hit to the compressibility of the data?
353 2012-08-19 22:37:36 <amiller> since that ultimately determines the storage cost
354 2012-08-19 22:37:48 <amiller> i'd argue that the compressibility of the data is unaffected
355 2012-08-19 22:37:50 <sipa> amiller: that's a more interesting question
356 2012-08-19 22:38:18 <amiller> http://www.cs.uni-paderborn.de/fileadmin/Informatik/FG-TI/Christian/Publications/HashedPatriciaTrie.pdf here's a paper on patricia hash tries, maybe i'll find some insight in there
357 2012-08-19 22:39:12 <amiller> nevermind this doesn't use hashes for security at all, it's a different idea altogether
358 2012-08-19 22:39:34 <roconnor> sipa: compressability?
359 2012-08-19 22:39:51 <amiller> i'm pretty sure that any advantage of a patricia trie melts away when you implement a merkleized version you can commit to in a block, but i'm not sure how else to defend that statement
360 2012-08-19 22:42:02 <sipa> amiller: i don't care about a particular structure, but i agree with roconnor that a deterministic system, that only allows one structure (and thus one root hash) per data set, is an important property
361 2012-08-19 22:42:31 <sipa> and more loosely defined structures will almost certainly be more efficient, as they require less rebalancing
362 2012-08-19 22:42:58 <sipa> but that difference should at most be a constant factor anyway
363 2012-08-19 22:43:53 <amiller> well that puts me entirely by myself in the balanced binary tree camp.
364 2012-08-19 22:44:24 <sipa> fully-balanced binary trees also have that property :)
365 2012-08-19 22:45:23 <amiller> i think what we would need to do to resolve this is to specify what it is that matters about having order independnece
366 2012-08-19 22:46:04 <amiller> if you attempt to state that clearly, i think you'll find that it puts the cart in front of the horse, because it begins with "given you have an O(N) set of trusted utxo data, you can compute the O(1) root hash"
367 2012-08-19 22:46:04 <sipa> it has the advantage that you can just send a list of txouts to someone and they can construct the correct tree from it
368 2012-08-19 22:46:29 <amiller> if you can send them a list of txouts, and they trust you,
369 2012-08-19 22:47:08 <amiller> then you could send them the txouts, plus another N of structure for the binary tree, and then they compute the correct tree from it just as easily
370 2012-08-19 22:47:29 <amiller> in any case, they will have to _trust_ you to send an entire list of txouts, which is not the important scenario
371 2012-08-19 22:47:41 <amiller> the important scenario is when you only want to send them O(1) if they have to trust it
372 2012-08-19 22:47:56 <roconnor> for me it is about reducing programming errors in implemenations.
373 2012-08-19 22:49:05 <sipa> i think it would just be wrong to require that for someone to verify the UTXO set, they need to know in which order it was created
374 2012-08-19 22:49:18 <sipa> that sounds not what we're trying to do
375 2012-08-19 22:49:25 <amiller> they don't need to know which order it was created, but they need to know the tree structure
376 2012-08-19 22:49:31 <sipa> agree
377 2012-08-19 22:49:33 <amiller> not even all the hashes in the tree structure, just the tree structure
378 2012-08-19 22:49:40 <amiller> they can recompute the hashes in this scenario
379 2012-08-19 22:49:48 <sipa> agree, but why require that data?
380 2012-08-19 22:50:01 <amiller> you're picking the strangest of all scenarios
381 2012-08-19 22:50:07 <amiller> you don't require all that data
382 2012-08-19 22:50:13 <amiller> because if you can send them trusted data, just send them the root hash and you're done
383 2012-08-19 22:50:47 <sipa> the trees will still have to be transferred somehow
384 2012-08-19 22:50:57 <sipa> to bootstrap, for example
385 2012-08-19 22:51:08 <amiller> the trees will not have to be trasnferred over a trsuted channel
386 2012-08-19 22:51:16 <phantomcircuit> DiabloD3, hmm so for an onion generator you need to generate a bunch of rsa key pairs and then hash the public key
387 2012-08-19 22:51:19 <amiller> once they have the root hash, they can scavenge the rest from the internet
388 2012-08-19 22:51:52 <sipa> amiller: yes, indeed, but they still have to
389 2012-08-19 22:52:01 <amiller> even then, when you say 'have to'
390 2012-08-19 22:52:23 <amiller> have to in order to do what? to be bootstrapped? if they have the O(1) hash they're already boostrapped because they can query just the parts of the tree they are interested in
391 2012-08-19 22:52:50 <sipa> in one scenario, yes
392 2012-08-19 22:53:09 <sipa> but certainly a validating node will want the entire set anyway
393 2012-08-19 22:53:21 <amiller> in the scenario where you transfer O(N) of trusted information, then the entire tree is still O(N)
394 2012-08-19 22:53:40 <amiller> in the scenario where you transfer O(1) trusted information, then you don't need the O(N) data
395 2012-08-19 22:53:42 <sipa> yes, we're just discussing constant factors anyway
396 2012-08-19 22:53:51 <sipa> in both cases
397 2012-08-19 22:54:31 <sipa> but it still feels wrong to me that the history of the utxo set has an influence on its hash
398 2012-08-19 22:56:06 <sipa> as that does force you to keep old states of the utxo tree, or undo data that includes structural information about the tree
399 2012-08-19 22:56:24 <amiller> those are needed for reorgs anyway
400 2012-08-19 22:56:24 <sipa> instead of just being able to construct that undo data from the block chain
401 2012-08-19 22:56:34 <roconnor> ah right
402 2012-08-19 22:56:38 <roconnor> that was a problem
403 2012-08-19 22:56:39 <amiller> you can't construct the undo data from the blockchain anyway
404 2012-08-19 22:56:48 <amiller> you'd have to construct the undo data from an incrementally modified utxo
405 2012-08-19 22:56:53 <sipa> right, you need the utxo set for that
406 2012-08-19 22:56:56 <amiller> unless you lookup the utxo by its hash, but that's what i'm saying you can do anyway
407 2012-08-19 22:57:15 <sipa> but you don't need the utxo tree strucure for that
408 2012-08-19 22:57:46 <amiller> if you're doing the incremental update, then the additional tree structure is no additional burden
409 2012-08-19 22:57:57 <amiller> if you aren't doing the incremental update, then what matters is that you have a root hash, not the structure underneath it
410 2012-08-19 22:58:31 <sipa> wel, this sounds trivial: if the tree structure is defined by the set, then obviously patch data for it will not require any structural information, but only set modification data
411 2012-08-19 22:59:44 <amiller> if that's what "feels wrong" then it's a constant factor
412 2012-08-19 23:00:03 <roconnor> The problem was that I wanted (the option) to store only the unspent tree at the node of the tree with the highest work, and be able to zip it down and back up during a reorg; storing only the transaction deltas at the internal nodes of the block tree.
413 2012-08-19 23:00:09 <sipa> feels wrong is not a strong argument of coutse
414 2012-08-19 23:00:22 <amiller> to me, having worst-case validation costs go up by a constant factor is a bigger priority than having incremental update costs go up
415 2012-08-19 23:01:16 <amiller> roconnor, you might enjoy my post about the Hash-Value-Highway since it sounds a bit like that
416 2012-08-19 23:01:24 <sipa> but an attacker has to spend exponentially more work, to get a linear effect on validation hardnrss
417 2012-08-19 23:01:33 <amiller> it's a skip list, it lets you zip around to the highest work blocks
418 2012-08-19 23:02:08 <amiller> sipa, i don't think that's true in this case
419 2012-08-19 23:02:34 <amiller> because when you query the tree, it's not like you validate the transactions at the bottom
420 2012-08-19 23:03:22 <amiller> i think we agree that the costs here are about one constant factor that feels bad in one location, vs the same constant factor in another location that feels bad for a different reason
421 2012-08-19 23:03:44 <sipa> the constant factor isn't what bothers me
422 2012-08-19 23:03:52 <sipa> it is the dependence on history
423 2012-08-19 23:04:32 <amiller> the hash of the current head is fully dependent on the entire history of the blockchain
424 2012-08-19 23:04:46 <sipa> i'm just arguing that the side effect for preventing this dependence on history is limited
425 2012-08-19 23:05:03 <sipa> but the order in the blockchain is somethinh relevant
426 2012-08-19 23:05:16 <sipa> the order in which utxo set uodates were performed isnt
427 2012-08-19 23:05:43 <roconnor> indeed you are no longer hashing the utxo set; you are hashing the utxo set plus some more stuff.
428 2012-08-19 23:07:00 <sipa> and you can use a fully balanced binary tree too
429 2012-08-19 23:07:16 <sipa> that is deterministic, and has perfect worst case. behaviour
430 2012-08-19 23:07:24 <sipa> but still higher constant factors
431 2012-08-19 23:07:50 <amiller> i don't think there's any fully balanced binary tree that has worst-case updates
432 2012-08-19 23:07:51 <sipa> as you can potentially have many rebalancing steps
433 2012-08-19 23:08:16 <sipa> that statement doesn't make sense?
434 2012-08-19 23:08:26 <amiller> like what i'm concerned about is whichever is worse, the worst-case query cost or the worst-case update cost
435 2012-08-19 23:09:06 <amiller> anyway i think it's interesting that you said that the constant factor isn't what bothers you
436 2012-08-19 23:09:12 <sipa> a lot depends probably on how you expect this datastructure to fit in a mkre complex system, and how it will be used
437 2012-08-19 23:09:18 <amiller> you're saying that you have an uneasy feeling about order independence that goes beyond a constant factor
438 2012-08-19 23:09:23 <sipa> yes
439 2012-08-19 23:09:28 <amiller> but every example you can give me about how the trie behaves differently
440 2012-08-19 23:09:34 <amiller> is only worse by a constant factor in cost
441 2012-08-19 23:10:48 <sipa> if i have the utxo set, then i only want to need 1 hash to know it is correct
442 2012-08-19 23:11:18 <sipa> even if i don't know the order of operations that resulted in this set
443 2012-08-19 23:11:58 <amiller> you don't. if you have the utxo set, i'm not saying you need to also know the order of operations, i'm saying you also need the 'tree structure' which is very small
444 2012-08-19 23:12:19 <amiller> it's a constant factor
445 2012-08-19 23:12:40 <amiller> the order of operations is only relevant when you don't have the utxo because you're iterating from scratch anyway
446 2012-08-19 23:13:12 <amiller> if i download your UTXO, i'm getting a BTree.
447 2012-08-19 23:13:36 <sipa> well, we're discussing an uneasy feeling, and you're not going to change that with a rational argument before i think about it more myself
448 2012-08-19 23:14:09 <amiller> whatever it is that's making you feel uneasy, roconnor etotheipi gmaxwell and you must have all eaten the same thing :/
449 2012-08-19 23:14:25 <sipa> afaik, we all have different reasons
450 2012-08-19 23:14:34 <phantomcircuit> DiabloD3, it'll be interesting to see how long it takes to generate a "bitcoin" vanity address on a i3-2100
451 2012-08-19 23:14:50 <amiller> well yours match etotheipi's very closely
452 2012-08-19 23:15:08 <sipa> hell no, i think indexing on addresses is overkill :)
453 2012-08-19 23:16:07 <sipa> or at least non-essential
454 2012-08-19 23:16:24 <amiller> your UTXO uses a btree right now right
455 2012-08-19 23:16:35 <DiabloD3> phantomcircuit: heh
456 2012-08-19 23:16:44 <amiller> suppose i could turn a bdb utxo btree into the same redblack tree that my algorithm produces
457 2012-08-19 23:16:46 <jgarzik> but but... if we indexed on addresses, we could turn bitcoin into a balance sheet database, and ditch the history!
458 2012-08-19 23:16:49 <jgarzik> ACTION runs
459 2012-08-19 23:17:15 <sipa> amiller: bdb is a b+ tree i suppose, with a large fanout
460 2012-08-19 23:17:36 <amiller> sipa, did you know that any amount of fanout is logically equivalent to a red black tree
461 2012-08-19 23:17:36 <sipa> but from the semantics point of view, i have a set and no tree
462 2012-08-19 23:17:47 <amiller> there is no resume to presume you have a set and no tree
463 2012-08-19 23:18:04 <amiller> there is never a reasonable case when you have a set and no tree
464 2012-08-19 23:18:31 <sipa> i mean: nothing in my system requires it being implemented as a tree
465 2012-08-19 23:18:48 <sipa> as soon as you need a merkle root, the structure becomes relevant
466 2012-08-19 23:19:02 <amiller> right
467 2012-08-19 23:19:10 <amiller> it's not relevant, but it's not malleable either
468 2012-08-19 23:19:23 <sipa> and a red-black tree requires a fill rate of 50-100% i suppose in your btree
469 2012-08-19 23:19:30 <sipa> to be equivalent
470 2012-08-19 23:19:56 <sipa> anyway, i need sleep
471 2012-08-19 23:20:01 <jgarzik> meh.  need to implement signature cache in pynode
472 2012-08-19 23:20:02 <sipa> but it's way too watm
473 2012-08-19 23:20:06 <sipa> warm
474 2012-08-19 23:20:13 <jgarzik> otherwise script verf will be too slow, I think
475 2012-08-19 23:20:23 <jgarzik> or require multi-processing (no threads for pynode)
476 2012-08-19 23:20:33 <sipa> jgarzik: sigcache only gains you a factor 2 anyway
477 2012-08-19 23:21:11 <jgarzik> sipa: it also spreads out the load across time, reducing the number of signatures to verf when a block arrives
478 2012-08-19 23:21:21 <sipa> agree
479 2012-08-19 23:21:30 <sipa> that's more important probably
480 2012-08-19 23:22:05 <jgarzik> it makes a big impact, since python is so much slower
481 2012-08-19 23:22:16 <jgarzik> or rather, I presume it will when it is implemented :)
482 2012-08-19 23:56:33 <amiller> http://hackage.haskell.org/package/Monatron this haskell library provides the "zippers as a monad transformers" infrastructure that i want
483 2012-08-19 23:56:54 <amiller> if i can figure out wtf i'm doing i should be able to port the zipper redblack tree to this structure
484 2012-08-19 23:57:51 <sipa> amiller: not sure you mean the same thing with zippers
485 2012-08-19 23:59:01 <amiller> it lets me abstract out details about how i compute hashes, for example using a cache/memoization table, or for example doing the version without hashes (which would be most comparable to the utxo you have now)