1 2012-09-23 00:11:36 <amiller> jgarzik, i was able to verify that it's using keep-alive
  2 2012-09-23 00:11:52 <amiller> curl -v --user rpcuser:asdflkajsdlfkj --data-binary '{"jsonrpc": "1.0", "id":"curltest", "method": "getinfo", "params": [] }' http://localhost:9332/
  3 2012-09-23 00:12:03 <amiller> * Connection #0 to host localhost left intact
  4 2012-09-23 00:23:50 <jgarzik> amiller: good enough
  5 2012-09-23 00:25:07 <jgarzik> amiller: time for a pull req?
  6 2012-09-23 00:26:04 <amiller> it's taking me a long time to catch up to the front of the chain, i'm at 140k now with pynode
  7 2012-09-23 00:26:20 <amiller> so i think i'll rebase and pull req as soon as i'm sure that's working alright
  8 2012-09-23 00:27:27 <jgarzik> amiller: yeah, it takes a while, as it does not have multi-threaded script verf :/
  9 2012-09-23 00:30:39 <amiller> while it's busy verifying, it appears to be less responsive with gevent than with asyncore, in that it takes longer to process rpc requests while its busy
 10 2012-09-23 00:52:48 <jgarzik> amiller: hrm, did everything get reformatted?
 11 2012-09-23 00:53:03 <amiller> yeah i couldn't handle the tabs
 12 2012-09-23 00:53:09 <amiller> i'm trying to split that into two commits so it looks better
 13 2012-09-23 00:53:43 <jgarzik> amiller: it's going back to tab to be consistent with the rest of the code, though
 14 2012-09-23 00:53:54 <jgarzik> amiller: a bit difficult to spot and review changes
 15 2012-09-23 00:54:14 <amiller> k let me replace it with tabs in my own commit
 16 2012-09-23 01:04:59 <amiller> https://github.com/amiller/pynode/commit/efa6a1e054fdb4aeac9ab267f0096bb3133283f2 this is probably good enough jgarzik there are a couple of unnecessary whitespace changes but not very many
 17 2012-09-23 01:05:22 <jgarzik> amiller: anything is better than 100% changed ;p
 18 2012-09-23 01:07:15 <jgarzik> amiller: yep, looks grat
 19 2012-09-23 01:07:18 <jgarzik> *great
 20 2012-09-23 01:28:07 <jgarzik> amiller: might need 'git push --force' to update the pynode pull request
 21 2012-09-23 01:28:39 <jgarzik> amiller: pull req shows commit from ~5 hours ago
 22 2012-09-23 01:29:10 <amiller> it's the timestamp from the first squashed commit, efa6a1e is my newest
 23 2012-09-23 01:30:45 <jgarzik> amiller: ah, ok
 24 2012-09-23 01:57:25 <amiller> jgarzik, here's a profile run http://i.imgur.com/khuT7.png
 25 2012-09-23 02:01:11 <amiller> not sure how reliable this is with gevent, but i think it means that gdbm is the bottleneck
 26 2012-09-23 02:02:18 <amiller> gonna try replacing gdbm with leveldb and see how that goes
 27 2012-09-23 02:03:21 <jaxtr> cygwin X server is quite handy
 28 2012-09-23 02:20:17 <jgarzik> amiller: database is absolutely a bottleneck.  it is Quite Shitty(tm) design
 29 2012-09-23 02:20:41 <jgarzik> amiller: I was intending to use https://github.com/jgarzik/pagedb
 30 2012-09-23 02:21:04 <jgarzik> amiller: but a mature, working replacement is certainly preferred
 31 2012-09-23 02:21:47 <jgarzik> amiller: right now, with multiple gdbm databases, pynode is not even transactional for a single block
 32 2012-09-23 02:22:06 <jgarzik> amiller: pagedb is my attempt to do leveldb-like transactional db
 33 2012-09-23 02:23:37 <amiller> jgarzik, you should see how sexy gevent works with ipython
 34 2012-09-23 02:23:49 <amiller> i'm used to doing this with other graphics-based eventloops like opengl, but it works just as well for distributed things
 35 2012-09-23 02:24:08 <amiller> you can modify code and restart it without having to actually restart your process
 36 2012-09-23 02:25:28 <jgarzik> amiller: know what else is a pynode bottleneck?  what shows up at the top of all profiles?  python object copying, of the CTransaction objects within a block, during SignatureHash() script verification.
 37 2012-09-23 02:25:49 <jgarzik> amiller: if you are a python guru (I'm not), optimizing that one operation will shave hours off full block chain verification
 38 2012-09-23 02:28:25 <amiller> http://i.imgur.com/VISjU.png
 39 2012-09-23 02:28:31 <amiller> here we go, look how much of a difference that makes
 40 2012-09-23 02:30:04 <jgarzik> amiller: cool
 41 2012-09-23 02:30:51 <amiller> let me see if i can rebase and meet up with you again, i have no idea what to do
 42 2012-09-23 02:31:37 <jgarzik> amiller: give me a sec
 43 2012-09-23 02:31:46 <jgarzik> amiller: still need to push out your stuff
 44 2012-09-23 02:32:42 <jgarzik> amiller: OK, pushed
 45 2012-09-23 02:47:05 <amiller> ah its faster but i'm not sure it's working correctly with gevent
 46 2012-09-23 03:01:20 <jgarzik> amiller: you really want to do a batch write, and wrap all those db updates into a single log event
 47 2012-09-23 03:03:03 <jgarzik> (if I understand leveldb correctly)
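A rough sketch of the batch-write idea discussed above: collect every update belonging to one block and hand them to the store in a single call, so the block is applied all-or-nothing. `ToyDB`, `WriteBatch` and `connect_block` are hypothetical stand-ins written for illustration (a dict-backed store, not leveldb or py-leveldb), and the key names are assumptions.

```python
class WriteBatch:
    """Collects puts and deletes so they can be applied in one step."""

    def __init__(self):
        self.ops = []

    def put(self, key, value):
        self.ops.append(("put", key, value))

    def delete(self, key):
        self.ops.append(("del", key, None))


class ToyDB:
    """Dict-backed stand-in for a key-value store (hypothetical, not leveldb)."""

    def __init__(self):
        self.data = {}
        self.log_events = 0  # counts how many separate writes were committed

    def write(self, batch):
        # Stage changes on a copy so a failure leaves self.data untouched.
        staged = dict(self.data)
        for op, key, value in batch.ops:
            if op == "put":
                staged[key] = value
            else:
                del staged[key]  # raises KeyError if the key is missing
        self.data = staged
        self.log_events += 1


def connect_block(db, block_hash, txids):
    """Record one block and its transactions as a single batch."""
    batch = WriteBatch()
    for n, txid in enumerate(txids):
        batch.put("tx:" + txid, block_hash + ":" + str(n))
    batch.put("blkmeta:" + block_hash, str(len(txids)))
    db.write(batch)
```

With this shape, a block with many transactions still produces one committed write, and a batch that fails partway changes nothing.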
 48 2012-09-23 03:06:37 <jgarzik> whee
 49 2012-09-23 03:07:30 <jgarzik> Very happy about gevent upgrade in pynode.  This sweeps away an asyncore bug so annoying that it blocked multi-peer node operations.
 50 2012-09-23 03:08:12 <jgarzik> amiller: really we should do multiprocess in pynode:  (1) network, (2) database, (3) N workers for parallel verification
 51 2012-09-23 03:08:16 <amiller> ad hoc state machines are inherently evil
 52 2012-09-23 03:09:14 <jgarzik> note I said multi-process and not multi-thread ;p
 53 2012-09-23 03:09:21 <amiller> yeah i'm good at multiprocess
 54 2012-09-23 03:09:41 <amiller> i don't understand how to do concurrent verification
 55 2012-09-23 03:10:50 <jgarzik> amiller: in theory each TX may be processed in parallel, and there is quite a bit of I/O parallelization
 56 2012-09-23 03:11:17 <jgarzik> amiller: plus, you make parallel the expensive CTransaction copying within SignatureHash that I've measured
 57 2012-09-23 03:20:01 <jgarzik> amiller: what are your plans vis a vis pybond and gevent?  Shall I go ahead and do that conversion myself?
 58 2012-09-23 03:20:14 <jgarzik> should mirror pynode
 59 2012-09-23 03:21:21 <amiller> yeah you should give that a try, i don't think there are any troubles
 60 2012-09-23 03:21:48 <amiller> i'm working on batch writes and then i'm going to try putting the db in a subprocess with a queue
 61 2012-09-23 03:22:31 <jgarzik> cool
 62 2012-09-23 03:22:40 <jgarzik> pybond DHT should go live in 24-48 hours then
 63 2012-09-23 03:24:30 <jgarzik> <amiller> ah its faster but i'm not sure it's working correctly with gevent  <<-- so I should not pull leveldb into pynode?
 64 2012-09-23 03:24:47 <amiller> yeah don't pull that yet
 65 2012-09-23 03:25:36 <amiller> jgarzik, when you start pynode fresh, do you see a lot of TX ff9ec1c4b48dd1121901e554b2c96a34db0da1596db20d9c3ddd9ee182831e93/0 no-dep 6ab5a21d401bcb99511391d034c92bfc4646d7509eb941a3c4172ec8c60caa0a
 66 2012-09-23 03:25:44 <amiller> er, i mean MemPool: Ignoring failed-sig TX
 67 2012-09-23 03:26:10 <jgarzik> amiller: yes.  "no-dep" == I have not seen that dependency, so I cannot verify the signature, so I am dropping the TX on the floor
 68 2012-09-23 03:26:40 <jgarzik> amiller: illustrates a TODO item: implement TX orphan cache, and associated processing
 69 2012-09-23 03:28:14 <jgarzik> amiller: bitcoin keeps a cache, capped at 10000 entries.  if a dependency appears, the TX is moved cache -> mempool.
 70 2012-09-23 03:28:54 <jgarzik> no-dep leads to failed-sig.
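The orphan cache described above could be sketched roughly as follows. This is an illustration, not bitcoind's or pynode's code: `OrphanCache` and its method names are hypothetical, and evicting the oldest entry when the cap is hit is an assumption (the chat only gives the 10000-entry cap).

```python
from collections import OrderedDict

MAX_ORPHANS = 10000  # cap mentioned in the chat


class OrphanCache:
    """Holds transactions whose dependencies have not been seen yet."""

    def __init__(self, max_entries=MAX_ORPHANS):
        self.max_entries = max_entries
        # txid -> set of dependency txids still missing, oldest entry first
        self.orphans = OrderedDict()

    def add(self, txid, missing_deps):
        self.orphans[txid] = set(missing_deps)
        while len(self.orphans) > self.max_entries:
            self.orphans.popitem(last=False)  # assumption: evict the oldest

    def dependency_arrived(self, dep_txid):
        """Return txids that are now complete and can move to the mempool."""
        ready = []
        for txid, missing in list(self.orphans.items()):
            missing.discard(dep_txid)
            if not missing:
                ready.append(txid)
                del self.orphans[txid]
        return ready
```

A caller would run signature verification on the returned transactions before accepting them, instead of dropping them as no-dep.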
 71 2012-09-23 03:56:39 <dwon> Hey, does anyone know who's responsible for contrib/debian/?  Was the stuff in that dir copied from Debian's bitcoin package or vice versa?
 72 2012-09-23 04:23:06 <Luke-Jr> dwon: BlueMatt
 73 2012-09-23 04:23:15 <jgarzik> amiller: Here's a partial pybond conversion, http://gtf.org/garzik/bitcoin/patch.pybond-greenlet
 74 2012-09-23 04:23:35 <dwon> Luke-Jr: Thanks.
 75 2012-09-23 04:23:45 <jgarzik> amiller: ran into lack of knowledge about how to share incoming and outgoing TCP connections in the same code, and stopped.
 76 2012-09-23 04:24:19 <amiller> if you put it into a git branch i can play with it
 77 2012-09-23 04:24:21 <jgarzik> amiller: i.e. StreamServer() does not automatically plug into NodeConn().  This is a problem that pynode must solve as well.
 78 2012-09-23 04:25:02 <jgarzik> in both pynode and pybond's cases, the P2P "server" and "client" connections must share much logic
 79 2012-09-23 04:38:23 <amiller> okay jgarzik you may want to try pulling my leveldb commit, it's pretty cool, here's the score so far http://imgur.com/z4Ebx
 80 2012-09-23 04:39:16 <amiller> write batching made it much more efficient, now it's mostly limited by the signature throughput as we'd expect
 81 2012-09-23 04:40:03 <amiller> also it no longer jams up my whole computer like gdbm did wtf
 82 2012-09-23 04:40:09 <amiller> that's kind of the important part
 83 2012-09-23 04:50:14 <amiller> ahh lol
 84 2012-09-23 04:52:31 <amiller> okay i fixed the major contention, by putting a tiny gevent.sleep() at the beginning of the 'doblock' command
 85 2012-09-23 04:58:11 <jgarzik> amiller: I bet that problem would not have existed if you were using an ad hoc state machine ;p
 86 2012-09-23 04:58:43 <jgarzik> ACTION doesn't trust any model that does not accurately reflect the way execution and blocking work in the kernel
 87 2012-09-23 04:59:14 <jgarzik> you wind up peppering your code with sleep() and yield() in odd places
 88 2012-09-23 04:59:41 <amiller> well the only reason it shows up here is that it buffers a lot of getblocks
 89 2012-09-23 04:59:53 <amiller> and then it processes all the putblocks with no yielding in between
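The contention described above, and the effect of a tiny yield at the start of each block, can be reproduced in miniature. The fix in the chat used `gevent.sleep()`; this sketch swaps in the standard library's `asyncio` so that it runs without gevent, and every function name here is hypothetical.

```python
import asyncio


async def process_blocks(blocks, log, yield_first):
    """Process buffered blocks, optionally yielding before each one."""
    for height in blocks:
        if yield_first:
            # Hand control back to the event loop so other tasks can run.
            await asyncio.sleep(0)
        log.append("block %d" % height)


async def rpc_request(log):
    """Stands in for an RPC call waiting to be served."""
    log.append("rpc")


async def run(yield_first):
    log = []
    worker = asyncio.create_task(process_blocks(range(3), log, yield_first))
    rpc = asyncio.create_task(rpc_request(log))
    await asyncio.gather(worker, rpc)
    return log


def demo(yield_first):
    return asyncio.run(run(yield_first))
```

Without the yield, the RPC task is served only after every buffered block has been processed; with it, the RPC task gets a turn before the first block.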
 90 2012-09-23 05:02:23 <jgarzik> ACTION is timing a block chain import w/ pynode + leveldb
 91 2012-09-23 05:03:15 <jgarzik> ACTION still thinks start-async-operation(my callback) more accurately reflects hardware and OS APIs
 92 2012-09-23 05:03:59 <jgarzik> emulating a blocking API, when underlying is non-blocking, just leads to more code and effort
 93 2012-09-23 05:16:37 <jgarzik> amiller: another help would be to store blocks themselves outside the database, in a flat file a la bitcoind
 94 2012-09-23 05:20:20 <midnightmagic> sipa: Any reason why I shouldn't use /ultraprune with a real wallet and spend some bitcoins with it?
 95 2012-09-23 05:24:18 <amiller> jgarzik, this is my buddy brandyn, he wrote the hadoopy library and has all the practical python knowledge, we usually work together on computer vision projects
 96 2012-09-23 05:24:22 <amiller> i only know how to use gevent because he figured it out first
 97 2012-09-23 05:25:12 <amiller> anyway if the leveldb behavior is stable, the lowest hanging fruit now is the serialization
 98 2012-09-23 05:26:18 <amiller> it's slow to use python struct, the best way is to use cython to compile a short c routine
 99 2012-09-23 06:02:26 <wumpus> midnightmagic: I've tested around a bit with ultraprune on the testnet, did not run into any problems
100 2012-09-23 06:02:33 <midnightmagic> wumpus: Thank you.
101 2012-09-23 08:52:38 <Diapolo> When I want to protect a variable because of it being used in multiple threads, do I need to lock it when it's written to AND read from?
102 2012-09-23 08:59:36 <phantomcircuit> jgarzik, cant build cpuminer on ubuntu because of missing libcurl autoconf macros
103 2012-09-23 08:59:37 <phantomcircuit> :(
104 2012-09-23 08:59:46 <Arnavion> Diapolo, yes
105 2012-09-23 08:59:54 <Arnavion> unless you can guarantee the writes will be atomic
106 2012-09-23 09:00:00 <Arnavion> and the reads
107 2012-09-23 09:00:43 <phantomcircuit> Diapolo, for reads it depends on what kind of variable it is
108 2012-09-23 09:00:57 <phantomcircuit> if it's an integer you can atomically read the value
109 2012-09-23 09:01:01 <phantomcircuit> without a lock
110 2012-09-23 09:01:19 <phantomcircuit> if it's a complex data structure then you probably want a lock for both reads and writes
111 2012-09-23 09:01:37 <Arnavion> It's not _guaranteed_ for ints either, is it?
112 2012-09-23 09:01:37 <phantomcircuit> and somebody will yell at me about the not using a lock for reads of simple variables thing too
113 2012-09-23 09:01:49 <Arnavion> Just that every arch has atomic r/w for ints?
114 2012-09-23 09:02:06 <phantomcircuit> Arnavion, in c reads of a single int is guaranteed atomic
115 2012-09-23 09:02:08 <Diapolo> As it's not a simple integer I'm fine with your answer, thanks :).
116 2012-09-23 09:02:43 <phantomcircuit> Arnavion, although i guess that depends on the specific architecture but really what arch doesn't have atomic r/w of words?
117 2012-09-23 09:02:48 <Arnavion> Yeah
118 2012-09-23 09:03:06 <Arnavion> It's not guaranteed by the spec, but it IS true of common arches
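The advice above (lock both reads and writes of anything more complex than a single word) is language-independent. The question concerned C/C++ code, so the following Python sketch is only an illustration of the pattern; `SharedStats` and `run_workers` are hypothetical names.

```python
import threading


class SharedStats:
    """A compound value guarded by one lock for both reads and writes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def record(self, key, amount):
        # Read-modify-write must happen under the lock as one unit.
        with self._lock:
            self._values[key] = self._values.get(key, 0) + amount

    def snapshot(self):
        # Readers take the lock too, so they never see a half-applied update.
        with self._lock:
            return dict(self._values)


def run_workers(n_threads=4, n_updates=1000):
    stats = SharedStats()

    def work():
        for _ in range(n_updates):
            stats.record("hits", 1)

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stats.snapshot()
```

Returning a copy from `snapshot()` keeps callers from touching the shared dictionary after the lock has been released.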
119 2012-09-23 09:10:29 <Diapolo> Nice, Secunia accepted my request to add Bitcoin-Qt into their database: http://i47.tinypic.com/3eqli.png
120 2012-09-23 09:26:22 <sharperguy> How do i get the package libdb4.8++-dev on ubuntu?
121 2012-09-23 09:32:16 <sharperguy> Ah looks like 5.1 is the new standard in ubuntu 12.04
122 2012-09-23 10:04:23 <TD> hmm
123 2012-09-23 10:04:31 <TD> mempool size seems to be increasing steadily with time
124 2012-09-23 10:08:47 <Arnavion> sharperguy If you use 5.1, then your wallet will be incompatible with clients which use 4.8
125 2012-09-23 10:10:56 <sharperguy> oh :/ Well ubuntu doesnt have that package anymore, so I don't know how to build it
126 2012-09-23 10:13:06 <Hasimir> sharperguy, compile from source, in /opt/local if it screws too much with /usr/local
127 2012-09-23 10:15:36 <sharperguy> All I need are the header files - I can't compile those from source
128 2012-09-23 10:16:30 <jdnavarro> has there been any change with the magic bytes for the testnet in version 0.7.0?
129 2012-09-23 10:22:50 <jdnavarro> I just found out with wireshark that it's now 0x0B110907 instead of 0xDAB5BFFA as it's specified in the wiki
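When comparing a magic value read from a packet capture against one written as an integer, byte order matters. The sketch below only shows the conversions, using the two values quoted above; it assumes the capture shows the bytes in wire order and that the integer form is a little-endian 32-bit value, and the helper names are hypothetical.

```python
import struct


def wire_bytes_of_le_int(value):
    """Bytes as they appear on the wire for a little-endian uint32."""
    return struct.pack("<I", value)


def le_int_of_wire_bytes(raw):
    """Integer obtained by reading four wire bytes as little-endian."""
    return struct.unpack("<I", raw)[0]


def be_int_of_wire_bytes(raw):
    """Integer obtained by reading four wire bytes left to right."""
    return struct.unpack(">I", raw)[0]


captured = bytes.fromhex("0b110907")        # bytes as seen in the capture
listed = wire_bytes_of_le_int(0xDAB5BFFA)   # bytes fa bf b5 da
```

Under those assumptions the two byte sequences differ in either byte order, so the observed value is not just a different spelling of the older one.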
130 2012-09-23 10:22:59 <jdnavarro> am I missing something?
131 2012-09-23 10:30:40 <darkhosis> anyone else experienced a gradual +memory creep?  i hadn't checked for about 72hrs and my kimsufi was swamped (was so slow had to kill the bitcoind process)....  seemed like it was eating a few more MB every few minutes
132 2012-09-23 10:51:14 <TD> darkhosis: it's not a memory leak. the memory pool is slowly expanding, though
133 2012-09-23 10:51:23 <TD> a few mb every few minutes sounds a bit extreme though
134 2012-09-23 10:53:21 <darkhosis> some tx's that never go through?
135 2012-09-23 10:53:43 <darkhosis> yeah that was exaggerating a bit i guess.  more like a dozen MB an hour probably
136 2012-09-23 10:55:20 <darkhosis> wasn't a big deal on this hetzner, but that kimsufi with the 2GB was trashed
137 2012-09-23 12:24:03 <Randy_> Hey did anyone go to the bitcoin conference in London?
138 2012-09-23 12:24:40 <sipa> yes
139 2012-09-23 12:24:55 <phantomcircuit> Randy_, no the rooms were empty
140 2012-09-23 12:24:59 <phantomcircuit> ACTION runs
141 2012-09-23 12:25:19 <Randy_> You were there?
142 2012-09-23 12:25:25 <phantomcircuit> uh yes?
143 2012-09-23 12:25:51 <Randy_> http://bitcoin2012.com/ At this one
144 2012-09-23 12:25:55 <sipa> midnightmagic: less tested, potential subtle bugs, perhaps database format will still change (though it shouldn't)
145 2012-09-23 12:26:45 <sipa> midnightmagic: if you don't mine or accept coins, it is probably safe
146 2012-09-23 12:27:07 <phantomcircuit> Randy_, there hasn't been any other conference in london so yes?
147 2012-09-23 12:28:06 <Randy_> I was surprised that it didn't work as planned.
148 2012-09-23 12:31:15 <phantomcircuit> Randy_, i was joking...
149 2012-09-23 12:31:28 <phantomcircuit> there were several hundred people there
150 2012-09-23 12:31:32 <Randy_> ?
151 2012-09-23 12:31:37 <Randy_> lol
152 2012-09-23 13:34:08 <Greee> ACTION New Mining Pool Server ( GBT Protocol , ASCI Support ) no fees , pm for details.
153 2012-09-23 13:42:38 <sipa> asci?
154 2012-09-23 13:42:47 <sipa> ah, asic!
155 2012-09-23 13:43:44 <Greee> yeah sry
156 2012-09-23 14:00:19 <Joric> huh ascii mining
157 2012-09-23 14:00:39 <Joric> fancy!
158 2012-09-23 14:25:26 <jgarzik> amiller: IMO the low hanging fruit is getting the network code ready for P2P
159 2012-09-23 14:25:47 <jgarzik> amiller: right now the gevent code is only useful for outgoing connections, and it needs to work for incoming connections too
160 2012-09-23 14:28:41 <ageis> my bitcoin-qt is segfaulting, can anyone make sense of the db.log http://ageispolis.net/test/db.txt
161 2012-09-23 14:28:58 <amiller> jgarzik, k i'll try to fix that up for you right quick now, but i'm expecting to get 50% improvement by replacing the serialization
162 2012-09-23 14:29:06 <amiller> i ran validation over night and i'm still at 175k or so
163 2012-09-23 14:29:50 <jgarzik> amiller: downloading from the network, I presume?
164 2012-09-23 14:30:31 <jgarzik> amiller: I imported from a block chain file with reorganizations inside it, and your new leveldb code does not survive, whereas the older gdbm stuff did:  http://gtf.org/garzik/bitcoin/blk0001.dat.bz2
165 2012-09-23 14:31:52 <amiller> yeah downloading from the network
166 2012-09-23 14:31:56 <sipa> ageis: what does debug.log say?
167 2012-09-23 14:32:10 <amiller> jgarzik, i see, if i try it out with that one, then i can observe reorganizations
168 2012-09-23 14:32:57 <ageis> lets see
169 2012-09-23 14:33:31 <ageis> Loading addresses...
170 2012-09-23 14:33:56 <sipa> ageis: still on 0.6.3 ?
171 2012-09-23 14:34:02 <ageis> ah yes
172 2012-09-23 14:34:15 <ageis> i'll upgrade..
173 2012-09-23 14:34:23 <sipa> wait
174 2012-09-23 14:34:33 <sipa> first try removing addr.dat
175 2012-09-23 14:34:37 <ageis> ok
176 2012-09-23 14:34:52 <sipa> (don't delete wallet.dat by all means)
177 2012-09-23 14:34:57 <ageis> that worked
178 2012-09-23 14:34:59 <ageis> i think
179 2012-09-23 14:35:11 <ageis> so my address file got corrupt somehow
180 2012-09-23 14:35:18 <sipa> yes, looks like it
181 2012-09-23 14:35:37 <sipa> but 0.7 doesn't use addr.dat anymore, among other reasons because it frequently got corrupted
182 2012-09-23 14:35:47 <ageis> hehe
183 2012-09-23 14:35:59 <ageis> what does it use?
184 2012-09-23 14:36:14 <sipa> peers.dat, which is not a database file anymore, but manually managed
185 2012-09-23 14:36:22 <ageis> gotcha
186 2012-09-23 14:39:52 <ageis> i like how it shows (out of sync) now, that's nice
187 2012-09-23 14:39:55 <sipa> gmaxwell: yes, parallel signature checking should help a lot, but my latest branch still got reproducible segfaults every few hours, so i'd rather have ultraprune be merged without it
188 2012-09-23 14:41:15 <sipa> well, not necessarily right now, it deserves an awful lot of testing, but i can't guarantee that i'll have enough time soon to fix it
189 2012-09-23 14:42:57 <amiller> jgarzik, when you say does not survive, what happens?
190 2012-09-23 14:43:18 <amiller> and can you walk me through testing this? i've had a hard time getting different forms of bitcoin block dumps to work
191 2012-09-23 14:44:09 <jgarzik> amiller: add loadblock=/path/to/blk0001.dat in pynode's configuration file
192 2012-09-23 14:44:43 <jgarzik> amiller: start with an empty database directory
193 2012-09-23 14:45:15 <jgarzik> amiller: it will say "REORGANIZE" in the log, and call ChainDb's reorganize() method
194 2012-09-23 14:45:23 <jgarzik> amiller: locally, it just kept printing REORGANIZE
195 2012-09-23 14:45:43 <jgarzik> amiller: NOTE: it is possible that this simply exposes a bug that gdbm hit
196 2012-09-23 14:46:13 <amiller> there's probably a simpler solution
197 2012-09-23 14:46:37 <amiller> which is that i had to change a bunch of logic replacing 'if x then' with 'try: x except:'
198 2012-09-23 14:46:51 <amiller> but there were also some 'if not x then's
199 2012-09-23 14:46:57 <amiller> and i probably mixed up half of them :o
200 2012-09-23 14:47:09 <amiller> i probably just haven't seen any reorgs so i didn't catch it
201 2012-09-23 14:47:29 <jgarzik> amiller: yeah, you -never- see that case when IBD'ing across the network
202 2012-09-23 14:47:41 <amiller> ibd?
203 2012-09-23 14:47:45 <jgarzik> amiller: you would need a specially prepared file (like my example), or wait until the network produces one naturally
204 2012-09-23 14:47:53 <amiller> interactive block download?
205 2012-09-23 14:47:56 <jgarzik> amiller: IBD: Initial Block Download, aka network sync
206 2012-09-23 14:50:10 <amiller> https://github.com/amiller/pynode/commit/e268e6556c7ddf726f5c8e707a493edf608e460f#L0R416 ah typo here
207 2012-09-23 14:50:20 <amiller> s/blkmeta/blkmeta:
208 2012-09-23 14:50:35 <amiller> maybe you can fix that and try it out while i download blk0001.dat
209 2012-09-23 15:03:25 <jgarzik> amiller: ok
210 2012-09-23 15:06:47 <amiller> jgarzik, can you give me a digest for blk0001.dat.bz2, i think i corrupted it during the dl
211 2012-09-23 15:07:51 <jgarzik> amiller: bzip2 -tv $file
212 2012-09-23 15:08:00 <jgarzik> amiller: the container has integrity tests too
213 2012-09-23 15:10:05 <jgarzik> amiller: s/blkmeta/blkmeta:/ fix seems to have made progress
214 2012-09-23 15:10:39 <jgarzik> amiller: sha256sum 073b0d2dfd3256ce400692b5115e1a0a5c533c26f66b0ece9a92c43635e8ae86
215 2012-09-23 15:10:49 <jgarzik> amiller: sha1sum d4d92f31e21e38b05c23ead4a26dacfa6f120696
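Checking a download against the digests above needs nothing beyond the standard library. A minimal sketch, with hypothetical helper names; the file is read in chunks so a large block file does not have to fit in memory.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches(path, expected_hex, algorithm="sha256"):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return file_digest(path, algorithm) == expected_hex.lower()
```

For example, `matches("blk0001.dat.bz2", "073b0d2d...")` with the full sha256 value quoted above would confirm an intact download, assuming the file sits in the current directory.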
216 2012-09-23 15:11:55 <jgarzik> amiller: crashed at reorg here, http://pastebin.com/HEUDyQs7
217 2012-09-23 15:12:41 <amiller> lol i missed that one too
218 2012-09-23 15:13:15 <jgarzik> amiller: what is the existence test?  self.db.Exists()?
219 2012-09-23 15:13:19 <amiller> no
220 2012-09-23 15:13:28 <jgarzik> amiller: a useless Get?
221 2012-09-23 15:13:35 <amiller> keep going :p
222 2012-09-23 15:13:53 <amiller> use a try: and just put the Get inside the try
223 2012-09-23 15:14:52 <jgarzik> amiller: yes, that is what I meant by a useless Get ;p
224 2012-09-23 15:15:00 <jgarzik> amiller: one code line becomes 3-4
225 2012-09-23 15:15:22 <amiller> no the point is you save a get
226 2012-09-23 15:15:36 <amiller> if you really want an Exists then it's a one-line
227 2012-09-23 15:15:53 <amiller> that will put you back to one line of code for the cases where you really just want an exist (and it isn't followed by a get)
228 2012-09-23 15:16:17 <amiller> def Exists(k): try: db.Get(k) return True except KeyError: return False
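Laid out over several lines, the one-liner above and the try/Get idiom it is being compared with look roughly like this. `ToyDB` is a hypothetical dict-backed stand-in whose `Get` raises `KeyError` on a missing key, which is the behaviour the chat relies on; it is not py-leveldb itself.

```python
class ToyDB:
    """Hypothetical stand-in: Get raises KeyError when the key is absent."""

    def __init__(self):
        self._data = {}

    def Put(self, key, value):
        self._data[key] = value

    def Get(self, key):
        return self._data[key]


def exists(db, key):
    """Existence test built on Get; the fetched value is discarded."""
    try:
        db.Get(key)
        return True
    except KeyError:
        return False


def get_or_default(db, key, default=None):
    """The try/Get idiom: one lookup covers both the test and the fetch."""
    try:
        return db.Get(key)
    except KeyError:
        return default
```

The trade-off argued in the chat is visible here: `exists` stays a one-line call at each usage site but fetches a value it throws away, while the try/Get form costs extra lines but avoids a second lookup when the value is needed anyway.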
229 2012-09-23 15:17:06 <jgarzik> amiller: for the try...etc. idiom, the source code LOC increases at each usage site.
230 2012-09-23 15:17:09 <jgarzik> amiller: for Exists, the underlying machinery wastes time returning a data item's key+value from the database, only to throw away that memory immediately upon return.  That's why most embedded database solutions have Exists as well as Get.
231 2012-09-23 15:17:31 <jgarzik> it matters if you do a lot of existence tests
232 2012-09-23 15:18:24 <jgarzik> or in this case, one must retrieve a key+value... in order to delete it.
233 2012-09-23 15:19:16 <amiller> so there are three cases.
234 2012-09-23 15:19:55 <amiller> 1. Exists followed by Get:  leveldb caches and so does any database with an explicit Exist, so get followed by get is still pretty quick but it's two lines of code there anyway
235 2012-09-23 15:22:33 <amiller> meh i'm not going to finish my enumeration there
236 2012-09-23 15:23:36 <amiller> py-leveldb doesn't come with an Exist
237 2012-09-23 15:23:54 <amiller> https://groups.google.com/d/msg/leveldb/2JJ4smpSC6Q/TYk7tsIQ1E8J this post suggests that there is a faster exist in leveldb that py-leveldb might be upgraded to use
238 2012-09-23 15:26:08 <jgarzik> amiller: owel, too lazy to do that ;p
239 2012-09-23 15:26:46 <amiller> as far as idioms go, i'd happily trade a line of code for a much more pleasing idiom :]
240 2012-09-23 15:26:59 <jgarzik> amiller: my hands disagree, youngster
241 2012-09-23 15:27:29 <amiller> ACTION launches spitball
242 2012-09-23 15:27:29 <jgarzik> amiller: lunchtime. we'll see if these two fixes make it past reorg...
243 2012-09-23 15:27:32 <jgarzik> ;p
244 2012-09-23 15:27:39 <amiller> k blk.dat.bz2 recovered
245 2012-09-23 16:28:47 <jgarzik> drat.  reorg test broken by a typo.
246 2012-09-23 16:29:00 <jgarzik> one of the reasons why python is inferior to C/C++ ;p
247 2012-09-23 16:30:28 <amiller> jgarzik, you should probably _try_ to run static analysis on your python code :p
248 2012-09-23 16:30:55 <jgarzik> amiller: how?
249 2012-09-23 16:31:36 <amiller> pyflakes/pylint/pychecker
250 2012-09-23 16:32:09 <amiller> i don't know how to use them effectively, and my code probably has lots of unnecessary bugs in it
251 2012-09-23 16:33:05 <amiller> (this doesn't detract from your point about C/C++)
252 2012-09-23 16:35:09 <jgarzik> I can prototype more rapidly in python, being seemingly more productive.  However, a rather large class of errors that is trivial to detect in statically typed languages is missed in python.
253 2012-09-23 16:35:25 <amiller> i'm in the strange place of being halfway between python and haskell
254 2012-09-23 16:35:44 <jgarzik> those are my type of errors, too...  I tend to make few design errors, but plenty of typos-in-obscure-places
255 2012-09-23 16:35:45 <amiller> the types are more powerful, so they express important invariants in your code, not just the trivial things like int->int
256 2012-09-23 16:36:10 <jgarzik> indeed.  and quite a bit of python code is type-static in practice
257 2012-09-23 16:36:11 <amiller> but i'm not productive in anything except python, and barely productive in python so i'm not going to give that up
258 2012-09-23 16:37:18 <jgarzik> rpython is interesting.  there are some other attempts to compile (not just VM) python that add type annotations, changing the language a bit.
259 2012-09-23 16:38:01 <jgarzik> C requires a lot of reinventing the wheel.
260 2012-09-23 16:38:16 <jgarzik> C++ requires acres of boilerplate, to get the core code looking the way you want it to
261 2012-09-23 16:38:37 <jgarzik> Python is rapid-productive, but several classes of errors are easier to achieve and hide in the code
262 2012-09-23 16:38:45 <jgarzik> and python is slow
263 2012-09-23 16:39:10 <amiller> so for when it counts
264 2012-09-23 16:39:28 <amiller> er well especially for small inner loops, i like to use cython
265 2012-09-23 16:39:43 <amiller> that's a statically checked compiled language that goes from python->C and you get a safe python module as a result
266 2012-09-23 16:40:08 <jgarzik> cute
267 2012-09-23 16:40:24 <jgarzik> for production, I would just write a CPython module in C
268 2012-09-23 16:40:29 <amiller> fair enough
269 2012-09-23 16:41:15 <jgarzik> I was thinking about anti-flooding schemes for the pybond DHT, and thinking about proof-of-work.
270 2012-09-23 16:41:29 <jgarzik> if pybond needs to generate a proof-of-work, it would almost certainly require a C module
271 2012-09-23 16:42:09 <amiller> how can you make a dht that has like an adjustable/subjective value for proof-of-work filtering
272 2012-09-23 16:42:15 <amiller> i would want it to be like publish-subscribe somehow
273 2012-09-23 16:42:22 <amiller> where you say "i'm interested in things with X much work"
274 2012-09-23 16:42:25 <amiller> and you get those
275 2012-09-23 16:42:43 <amiller> and you can give higher priority to higher difficulty things
276 2012-09-23 16:43:09 <amiller> this way you could have a combined namespace that's a dht for bitcoin blocks, as well as lower work items like pybond work
277 2012-09-23 16:43:13 <jgarzik> amiller: open to suggestions ;p    I was mainly thinking of requiring proof-of-work for NODE_ID, thus increasing the cost of generating a NODE_ID.  Then you detect flooding based on that.
278 2012-09-23 16:43:21 <amiller> i see
279 2012-09-23 16:43:23 <jgarzik> Obviously not foolproof by any means
280 2012-09-23 16:43:47 <jgarzik> DHTs are vulnerable to data flooding and Sybil attacks
281 2012-09-23 16:44:10 <amiller> hrm
282 2012-09-23 16:44:22 <jgarzik> I was thinking of requiring _some_ sort of cost, to publish data in the DHT
283 2012-09-23 16:44:29 <amiller> dues paying / membership fee
284 2012-09-23 16:44:33 <amiller> yeah
285 2012-09-23 16:44:50 <jgarzik> well, it doesn't have to be a direct monetary cost.  it could be "require PoW" somehow.
286 2012-09-23 16:44:53 <amiller> the trouble with proof of work is that it's often expected to be free, when really the point is that it should be somewhat costly
287 2012-09-23 16:44:56 <amiller> yeah i think that makes plenty of sense
288 2012-09-23 16:45:13 <amiller> how easy can you tolerate making it
289 2012-09-23 16:45:31 <amiller> like someone with a cell phone probably won't be able to perform the work directly on their phone
290 2012-09-23 16:45:39 <amiller> someone with a gpu miner can generate a vanity address overnight
291 2012-09-23 16:46:07 <jgarzik> I don't think we've solved the problem yet of:  Crowd pays $Cloud provably.  $Cloud provides decentralized service provably.  $Cloud pays providers provably and fairly.
292 2012-09-23 16:46:17 <jgarzik> so PoW is a cheap, imperfect substitute
293 2012-09-23 16:46:49 <amiller> okay so lets say each node-id is associated with a proof-of-work and a difficulty
294 2012-09-23 16:47:09 <amiller> because the nodeid is a hash, it has zeros in front, and the hash preimage of the nodeid contains the 'bid' for how difficult it's supposed to be
295 2012-09-23 16:47:18 <amiller> as a heuristic reaction to flooding attempts, you start dropping people with too little work
296 2012-09-23 16:47:36 <amiller> when the flooding is low, everyone gets through
297 2012-09-23 16:48:03 <amiller> if there's a ton of flooding affecting everyone, then packets from nodeids with low work get prioritized very low and sometimes discarded
298 2012-09-23 16:48:20 <jgarzik> or perhaps more simply... you have limited resource (capacity) of X.  sort by PoW.
299 2012-09-23 16:48:22 <jgarzik> yep
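One way to picture the scheme being sketched in the conversation above: a node ID is a hash, its work is the number of leading zero bits, and under load a node keeps only the highest-work peers that fit its capacity. This is an illustration under assumptions, not pybond code. Using SHA-256 over a public key plus a nonce is an assumption, all function names are hypothetical, and the 'bid' carried in the preimage is left out for brevity.

```python
import hashlib


def leading_zero_bits(digest):
    """Count the zero bits at the front of a byte string."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def node_id(pubkey, nonce):
    """Hypothetical node ID: hash of the public key and a nonce."""
    return hashlib.sha256(pubkey + nonce.to_bytes(8, "big")).digest()


def mine_node_id(pubkey, min_bits):
    """Search nonces until the ID has at least min_bits leading zero bits."""
    nonce = 0
    while True:
        nid = node_id(pubkey, nonce)
        if leading_zero_bits(nid) >= min_bits:
            return nonce, nid
        nonce += 1


def admit(node_ids, capacity):
    """Under flooding, keep only the `capacity` IDs with the most work."""
    ranked = sorted(node_ids, key=leading_zero_bits, reverse=True)
    return ranked[:capacity]
```

Each extra required bit doubles the expected mining effort, which is what makes fresh IDs expensive during a sustained flood.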
300 2012-09-23 16:48:36 <amiller> so if we're under a constant siege of flooding
301 2012-09-23 16:48:40 <amiller> then everyone has to work harder to get a nodeid
302 2012-09-23 16:48:44 <jgarzik> yep
303 2012-09-23 16:48:54 <amiller> it's probably possible for an asshole with a supercomputer to knock all the smartphones off the network
304 2012-09-23 16:49:00 <jgarzik> yep :(
305 2012-09-23 16:49:16 <amiller> but he can't hit the laptops or the servers
306 2012-09-23 16:49:20 <amiller> so there's really no effect
307 2012-09-23 16:49:27 <jgarzik> though that creates incentive to work a long time, to buy a long-use NODE_ID
308 2012-09-23 16:49:31 <amiller> right
309 2012-09-23 16:49:42 <jgarzik> NODE_ID reuse is not so good for privacy, though
310 2012-09-23 16:49:50 <amiller> that's the price you pay for anonymity
311 2012-09-23 16:49:56 <jgarzik> true
312 2012-09-23 16:50:16 <amiller> if you're anonymous on bitcoin-otc, you don't get the benefit of an established reputation
313 2012-09-23 16:50:30 <amiller> this is the social cost of cheap pseudonyms, it shows up anywhere you might find a sybil attack
314 2012-09-23 16:52:30 <amiller> so maybe the way to put it is that the asshole with a supercomputer can force people to reuse nodeids to a higher extent than they would otherwise
315 2012-09-23 16:52:58 <amiller> if the attacker's goal is to compromise everyone's anonymity by making it expensive to use fresh IDs.... well it's an expensive attack
316 2012-09-23 16:53:20 <jgarzik> yep
317 2012-09-23 16:53:34 <jgarzik> amiller: this seems like a useful idea to explore :)
318 2012-09-23 16:54:34 <amiller> how is priority handled in kademlia/pybond.dht
319 2012-09-23 16:54:38 <jgarzik> amiller: still have the IP-address-level Sybil attacks to worry about.  current bitcoind code, attempting IP address diversity across IP blocks, is useful.
320 2012-09-23 16:54:49 <jgarzik> amiller: priority of what?
321 2012-09-23 16:54:53 <amiller> i dunno
322 2012-09-23 16:54:54 <amiller> items in cache
323 2012-09-23 16:54:57 <amiller> how do you decide what to store
324 2012-09-23 16:55:03 <jgarzik> amiller: currently it is LRU
325 2012-09-23 16:55:12 <jgarzik> amiller: with a size limit, chosen per node
326 2012-09-23 16:55:42 <jgarzik> amiller: could adjust the LRU to consider PoW value, and remove data items first from a lower PoW
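The adjusted eviction rule suggested above might look something like this: when the cache is over its size limit, remove the item with the least proof-of-work, and fall back to least-recently-used order among equals. `PowAwareCache` is a hypothetical name, and treating work as a plain integer stored next to each value is an assumption.

```python
from collections import OrderedDict


class PowAwareCache:
    """LRU-style cache that evicts the lowest-work item first."""

    def __init__(self, max_items):
        self.max_items = max_items
        # key -> (value, work); iteration order is least recently used first
        self.items = OrderedDict()

    def get(self, key):
        value, _work = self.items[key]
        self.items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value, work):
        self.items[key] = (value, work)
        self.items.move_to_end(key)
        while len(self.items) > self.max_items:
            self._evict_one()

    def _evict_one(self):
        # min() returns the first minimal key, and iteration runs from
        # least to most recently used, so ties go to the LRU entry.
        victim = min(self.items, key=lambda k: self.items[k][1])
        del self.items[victim]
```

One consequence of this rule is that a newly stored low-work item can be evicted immediately when the cache is full of higher-work data, which is the prioritisation the conversation is aiming for.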
327 2012-09-23 16:56:27 <jgarzik> I admit we are in fresh territory, that I have not thought through ;p
328 2012-09-23 16:56:36 <amiller> i've found papers on pub/sub in a dht
329 2012-09-23 16:56:48 <amiller> i wouldn't have thought i'd be able to implement it, but watching you just bang out a kademlia has given me a bit more inspiration
330 2012-09-23 16:57:56 <midnightmagic> sipa: Thanks for the note! I'm running it right now on an EeePC 701-series (the original kind) and it's working very nicely at the moment.
331 2012-09-23 16:58:06 <jgarzik> amiller: you can move fast, when you don't have to worry about breaking existing users ;p
332 2012-09-23 16:59:23 <amiller> http://www.cs.cornell.edu/home/rvr/papers/willow.pdf this is an example of a dht with pubsub, but i don't understand it yet
333 2012-09-23 17:00:08 <jgarzik> amiller: will read...
334 2012-09-23 17:00:59 <jgarzik> amiller: ...but I am suspicious of pub/sub.  All designs seen so far have been vulnerable to amplification attacks, where you trick the network into DDoS'ing third party targets.
335 2012-09-23 17:01:16 <jgarzik> maybe PoW NODE_ID solves that
336 2012-09-23 17:01:22 <jgarzik> s/solves/mitigates/
337 2012-09-23 17:02:01 <amiller> the analog in the ordinary dht is a cache flush attack i think
338 2012-09-23 17:02:18 <amiller> where you can drive a key range out of everyone's cache
339 2012-09-23 17:02:22 <jgarzik> yep
340 2012-09-23 17:02:24 <amiller> by finding the nodes most likely to store it, and filling up their caches
341 2012-09-23 17:03:07 <jgarzik> at a minimum, PoW-based NODE_ID + signed messages might be an interesting avenue to explore.
342 2012-09-23 17:03:19 <jgarzik> Currently Kademlia starts up, and bootstraps by running a query for its own NODE_ID
343 2012-09-23 17:03:50 <jgarzik> in pybond DHT, we could augment that with a STORE of NODE_ID + public keys
344 2012-09-23 17:04:43 <jgarzik> then in pub/sub, you'd be able to authenticate membership requests and publish requests to a degree
345 2012-09-23 17:05:28 <jgarzik> pub/sub also exacerbates another DHT problem, hot spots.  you need special code to proactively avoid hot spots.
346 2012-09-23 17:08:30 <amiller> yeah, if pow node-id works alright for kademlia, then we can worry later about whether or not it also enables a pubsub of some kind
347 2012-09-23 17:10:26 <jgarzik> yep
348 2012-09-23 17:11:05 <jgarzik> for pybond it starts out with a p2p for broadcasting
349 2012-09-23 17:11:15 <jgarzik> so that's the natural avenue for pubsub
350 2012-09-23 17:14:25 <midnightmagic> why are you using a pow mechanism to build nodeid? Am I correct in thinking that it only prevents a single machine from spamming node-ids, but doesn't stop an actual botnet? or are you making the nodeid pow prohibitively hard?
351 2012-09-23 17:14:53 <midnightmagic> (and perhaps let it become a commodity)
352 2012-09-23 17:15:59 <jgarzik> Sadly bonds (and smartcoins of any sort) devolve into using the blockchain as a storage system for property tokens.  It would be nice if there was some way to move that into a temporal alt-chain, and only put a hash in each bitcoin block.  But such a secure non-bitcoin storage system is far afield (maybe gmaxwell has ideas)
353 2012-09-23 17:16:52 <jgarzik> TD[gone]'s argument is powerful, though:  enabling decentralized financial system is the gain, perhaps the cost is bearable.
354 2012-09-23 17:17:11 <jgarzik> ability to do atomic payment/property swaps is also powerful
355 2012-09-23 17:18:05 <jgarzik> midnightmagic: open questions, all
356 2012-09-23 17:18:42 <midnightmagic> jgarzik: If it were explicit that the tokens were not fungible, it would make pruning a lot simpler.
357 2012-09-23 17:18:57 <jgarzik> midnightmagic: PoW NODE_ID controls rate of data flow into DHT.  Higher value PoW NODE_ID traffic has priority over lower value PoW NODE_ID
358 2012-09-23 17:19:00 <midnightmagic> jgarzik: Are you talking about *entire* chains which were temporal?
359 2012-09-23 17:19:19 <jgarzik> midnightmagic: so a botnet could spend a lot of time generating a few high-value PoW NODE_ID's, yes
360 2012-09-23 17:19:29 <jgarzik> midnightmagic: or a botnet could generate many low-value PoW NODE_ID's
361 2012-09-23 17:19:30 <midnightmagic> rate of data flow/store requests or rate of flow of nodeid participation?
362 2012-09-23 17:20:01 <jgarzik> rate of data flow/store, and _a bit of_ nodeid participation
363 2012-09-23 17:20:07 <jgarzik> it doesn't fully address Sybil
364 2012-09-23 17:21:16 <jgarzik> midnightmagic: you could hire a GPU farm to spend a month working on a super-high-value PoW NODE_ID, and sell it on the market
365 2012-09-23 17:21:56 <jgarzik> (or botnet, if you're a criminal)
366 2012-09-23 17:21:59 <midnightmagic> i'm considering the longterm, patient types like btcexpress.  ah interesting if the nodeids are mobile.  I thought it was going to be tied somehow to the origin
367 2012-09-23 17:22:11 <jgarzik> yes, nodeids are completely mobile
368 2012-09-23 17:22:47 <jgarzik> including external IP address into hash was considered, and may wind up being required
369 2012-09-23 17:22:49 <midnightmagic> it makes sense I guess, since how else could they be anything but mobile
370 2012-09-23 17:23:00 <jgarzik> but detected external IP is difficult, so that was shelved for the moment
371 2012-09-23 17:23:05 <jgarzik> *detecting
372 2012-09-23 17:23:11 <jgarzik> yep
373 2012-09-23 17:23:16 <midnightmagic> plus how could you show that the message wasn't just opaquely proxied
374 2012-09-23 17:23:22 <jgarzik> nod
375 2012-09-23 17:23:50 <midnightmagic> i like the notion that data store in the dht itself might require pow
376 2012-09-23 17:24:35 <midnightmagic> by the by, did I ever thank you for writing pushpool?  early on it really helped me keep an eye on my miners.
377 2012-09-23 17:24:41 <midnightmagic> so, thanks!
378 2012-09-23 17:25:13 <midnightmagic> (also the whole, writing code in a way that's easy for me to modify)
379 2012-09-23 17:25:38 <jgarzik> midnightmagic: the base idea was that, in times of data flooding, you may use the numeric PoW value (hash -> 256 bit integer a la bitcoin, for example) as a priority value, to choose what data to ignore, keep cached, or replace an existing cached value
380 2012-09-23 17:26:19 <jgarzik> "capacity" being some limited resource X.  if traffic < X, store all.  if traffic >= X, ignore traffic from lower-priority PoW NODE_ID's
381 2012-09-23 17:26:32 <midnightmagic> ah neat, because there's no guarantee that the data is successfully stored anyway. i guess based on the p2p nature of it too, each node could estimate its chances of getting a key stored and adjust its pow accordingly.
382 2012-09-23 17:28:01 <jgarzik> adjust == raise, you would only ever want to seek a higher value PoW NODE_ID, AFAICS
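A minimal sketch of the PoW-weighted cache discussed above, not pybond's actual code. It assumes a NODE_ID is a 32-byte hash read as a 256-bit integer, and that "higher value" means more leading zero bits (a smaller integer, as with bitcoin block hashes). The names pow_priority and PowLruCache are hypothetical.

```python
from collections import OrderedDict


def pow_priority(node_id: bytes) -> int:
    """Leading zero bits of a 32-byte NODE_ID read as a 256-bit integer.

    Assumption: more leading zeros means more work, hence higher priority.
    """
    return 256 - int.from_bytes(node_id, "big").bit_length()


class PowLruCache:
    """LRU cache that evicts the lowest-PoW entry first when full."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.items = OrderedDict()  # key -> (priority, value), oldest first

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key][1]

    def store(self, key, value, node_id: bytes) -> bool:
        prio = pow_priority(node_id)
        if key in self.items:
            if prio < self.items[key][0]:
                return False  # weaker PoW may not replace a cached value
            self.items[key] = (prio, value)
            self.items.move_to_end(key)
            return True
        if len(self.items) >= self.capacity:
            # Victim is the lowest-priority entry; min() returns the first
            # minimum in iteration order, so ties go to the least recently used.
            victim = min(self.items, key=lambda k: self.items[k][0])
            if self.items[victim][0] > prio:
                return False  # at capacity: ignore lower-priority traffic
            del self.items[victim]
        self.items[key] = (prio, value)
        return True
```

Below capacity everything is stored. At capacity a STORE only succeeds if its NODE_ID carries at least as much work as the weakest cached entry, which is one reading of the "ignore, keep cached, or replace" rule.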
383 2012-09-23 17:28:10 <midnightmagic> so I didn't catch the primary purpose of the dht in bitcoin; is this for node discovery?
384 2012-09-23 17:28:35 <gmaxwell> midnightmagic: Put down the crackpipe. :P
385 2012-09-23 17:28:35 <jgarzik> midnightmagic: this is pybond, not bitcoin.  see financial hashmap stuff in https://bitcointalk.org/index.php?topic=92421.0
386 2012-09-23 17:29:08 <midnightmagic> well if traffic spiked for a long time, but then went away (like Art came online and started futzing with things, as he likes to do occasionally) it would be an empty, expensive network to participate in.
387 2012-09-23 17:29:15 <midnightmagic> gmaxwell: lol
388 2012-09-23 17:29:17 <gmaxwell> midnightmagic: you can't use a DHT for node discovery, well you can.. but how will you find the dht nodes? ... use another DHT for DHT node discovery? :)
389 2012-09-23 17:29:39 <midnightmagic> gmaxwell: :) i thought that might get a comment out of you. lol
390 2012-09-23 17:29:40 <jgarzik> midnightmagic: currently pybond just as a stupid, standard Kademlia network, https://github.com/jgarzik/pybond/blob/master/dht.py
391 2012-09-23 17:29:45 <jgarzik> *has
392 2012-09-23 17:30:02 <midnightmagic> jgarzik: ah!  cool.
393 2012-09-23 17:30:59 <midnightmagic> ACTION sets up a mirror
394 2012-09-23 17:31:50 <midnightmagic> whoah! bond markets!
395 2012-09-23 17:32:44 <jgarzik> midnightmagic: even more than that...  smartcoin markets
396 2012-09-23 17:32:59 <jgarzik> midnightmagic: holding a coin may represent... ownership of a house or car
397 2012-09-23 17:33:08 <midnightmagic> jgarzik: with contracts?  i wonder if that could be used to assist the WoT v2 project
398 2012-09-23 17:33:52 <jgarzik> midnightmagic: I'm just doing the base level bond stuff.  issue a bond, people buy bonds, you make payments.
399 2012-09-23 17:34:05 <jgarzik> midnightmagic: contract text can be one part of the hashed bond descriptor, sure
400 2012-09-23 17:34:40 <midnightmagic> ACTION 's mind fills up and stalls
401 2012-09-23 17:35:17 <jgarzik> amiller: well, with those two fixes, pynode at least is shitting itself in familiar ways, when it hits a reorg
402 2012-09-23 17:35:35 <jgarzik> amiller: I think we're close to merging
403 2012-09-23 17:38:09 <midnightmagic> jgarzik: Do you need alpha testers?
404 2012-09-23 17:39:27 <jgarzik> midnightmagic: not really, though you are welcome to play
405 2012-09-23 17:39:54 <midnightmagic> jgarzik: Is there a generic common network somewhere I can join up with yet?
406 2012-09-23 17:40:04 <jgarzik> midnightmagic: not yet
407 2012-09-23 17:40:07 <midnightmagic> k
408 2012-09-23 17:41:11 <sipa> midnightmagic: ah, good to hear it's working nicely
409 2012-09-23 17:41:36 <midnightmagic> sipa: the speed with which it rebuilds a fresh block set is..  so, so nice.
410 2012-09-23 17:42:01 <sipa> "rebuilds" ?
411 2012-09-23 17:42:47 <midnightmagic> sipa: yeah, start from zero, -connect=local.ip, catch up to 200k+ blocks
412 2012-09-23 17:56:44 <Greee> ACTION New Mining Pool Server ( GBT Protocol , ASIC Support ) no fees , pm for details.
413 2012-09-23 18:16:46 <amiller> maybe my isp is tampering with my http
414 2012-09-23 18:16:47 <amiller> i duno
415 2012-09-23 18:17:05 <gmaxwell> amiller: it happens.. can you cmp the files?
416 2012-09-23 18:17:14 <gmaxwell> It would be really good to know if that was going on.
417 2012-09-23 18:17:30 <amiller> i sha256sum them
418 2012-09-23 18:17:41 <amiller> i guess i could diff them
419 2012-09-23 18:18:36 <jgarzik> amiller: you've been branded by your ISP as an evil P2P'er ;p
420 2012-09-23 18:18:58 <gmaxwell> amiller: I didn't say "cmp" because I was too lazy to type compare. :P
421 2012-09-23 18:32:23 <amiller> i did run a tor exit for a while, long enough to get ip/banned by 4chan
422 2012-09-23 18:33:36 <jgarzik> amiller: any visits from the FBI or NSL letters?  ;p
423 2012-09-23 18:36:31 <amiller> apparently if i got a NSL i wouldn't even be allowed to answer that question, but no
424 2012-09-23 18:43:10 <eian> gmaxwell: what are the ways two people can create a single transaction?
425 2012-09-23 18:43:33 <eian> using the old signature types, I mean
426 2012-09-23 18:45:47 <gmaxwell> eian: I thought I wrote up a demonstration of this but I can't find it...
427 2012-09-23 18:46:22 <sipa> ACTION has his GPG key signed by RMS
428 2012-09-23 18:46:33 <gmaxwell> eian: they just pick which inputs they want to spend, e.g. one from each person.. then one of them uses createrawtransaction to draft a txn spending both of them to the agreed outputs, signs it, sends it to the other person.. they sign.. then its valid.
429 2012-09-23 18:46:40 <gmaxwell> sipa: 0_o
430 2012-09-23 18:47:16 <jgarzik> eian: this provides a concrete example of two parties building a single transaction: https://bitcointalk.org/index.php?topic=112007.msg1212356#msg1212356
431 2012-09-23 18:47:30 <gmaxwell> sipa: I have signatures from two other FSF board members, never thought to ask RMS.
432 2012-09-23 18:48:06 <jgarzik> I am glad RMS exists, and fights the fight he fights, but he is personally a bit difficult IMO
433 2012-09-23 18:48:29 <eian> thanks
434 2012-09-23 18:48:43 <jgarzik> sipa: speaking of...  we need to do a keysigning
435 2012-09-23 18:49:08 <jgarzik> unfortunately I have to go jump on a bicycle right this second
436 2012-09-23 18:49:12 <jgarzik> *poof*
437 2012-09-23 18:49:28 <sipa> cya
438 2012-09-23 19:06:20 <eian> did the createrawtransaction function exist before version 0.7?
439 2012-09-23 19:07:09 <gmaxwell> eian: 0.7 is the first released version of it... of course it's always been possible to do this.. just not easily exposed.
440 2012-09-23 19:07:39 <gmaxwell> e.g. absent 0.7 you could use a web transaction creator like joric's  or random custom software..
441 2012-09-23 19:20:01 <sipa> midnightmagic: how long does that take you?
442 2012-09-23 19:51:44 <amiller> https://gist.github.com/3516775736e2b9befd9c this is my result from diffing the two corrupted files i downloaded
443 2012-09-23 19:51:49 <amiller> only one or two segments were corrupted
444 2012-09-23 19:51:55 <amiller> different ones in each
445 2012-09-23 19:52:31 <amiller> which means that after i downloaded two corrupt versions, i could have just identified the segments that differed and tried the 8 or so ways of merging them
446 2012-09-23 19:52:35 <Luke-Jr> ACTION has a bunch of those kind of things - would be nice if there was a tool to look at 3-N files and decide what the consensus for each block was <.<
447 2012-09-23 19:52:49 <Luke-Jr> or with smart validity checking, sure
448 2012-09-23 20:26:35 <eian> what is actually being signed in a tx input script?
449 2012-09-23 20:27:36 <sipa> the transaction, with the signature itself erased, and some other processing
450 2012-09-23 20:28:05 <sipa> i believe tx input references are replaced by the output referenced
451 2012-09-23 21:06:48 <amiller> hrm
452 2012-09-23 21:07:52 <amiller> jgarzik, what would be the easiest way to make pynode function somewhat like an spv client
453 2012-09-23 21:08:25 <amiller> what i'd want it to do is skip signature validation, but validate all the work, and gradually fill in the data for the blocks and txes
454 2012-09-23 21:09:04 <amiller> how do i skip the IBD and just ask for blocks
455 2012-09-23 21:09:04 <jgarzik> amiller: one easy thing to do is add checkpoints.  don't check scripts < block height X
456 2012-09-23 21:10:27 <amiller> oh
457 2012-09-23 21:10:31 <amiller> how do i feed pynode a checkpoint
458 2012-09-23 21:10:46 <jgarzik> amiller: script verification is accomplished when ChainDb.connect_block calls self.tx_signed
459 2012-09-23 21:10:53 <amiller> i can just tell it to use the latest block from my normal bitcoind
460 2012-09-23 21:11:01 <amiller> actually for that matter
461 2012-09-23 21:11:04 <amiller> how do i tell bitcoind to do that too
462 2012-09-23 21:11:35 <jgarzik> amiller: copy the list from bitcoind at https://github.com/bitcoin/bitcoin/blob/master/src/checkpoints.cpp#L24
463 2012-09-23 21:11:49 <jgarzik> amiller: bitcoind should already do that
464 2012-09-23 21:13:17 <amiller> no i don't mean developer checkpoints
465 2012-09-23 21:13:25 <amiller> i mean i already have a bitcoind up to date on my laptop
466 2012-09-23 21:13:32 <amiller> and i want to spawn a new bitcoind somewhere else on a server
467 2012-09-23 21:13:39 <amiller> i don't want to revalidate all the txes on the new machine
468 2012-09-23 21:13:44 <amiller> but i don't want to personally transfer everything to it
469 2012-09-23 21:13:55 <amiller> so i just want to tell it about my personal current head block
470 2012-09-23 21:14:09 <amiller> basically i just want to provide my own checkpoint
471 2012-09-23 21:14:17 <amiller> bitcoind -checkpoint=<blockhash>
472 2012-09-23 21:15:10 <gmaxwell> amiller: that just isn't how it works.
473 2012-09-23 21:15:37 <jgarzik> amiller: The best you can do is copy blk*.dat from bitcoind A to bitcoind B
474 2012-09-23 21:15:41 <gmaxwell> amiller: how can you tell that a _subsequent_ block is valid without the set of unspent outputs that you didn't build.
475 2012-09-23 21:15:53 <amiller> i still need to build the indexes
476 2012-09-23 21:15:55 <amiller> and download all the data
477 2012-09-23 21:15:57 <amiller> i just skip tx validation
478 2012-09-23 21:16:05 <amiller> and i potentially download the data in a better order
479 2012-09-23 21:16:25 <gmaxwell> amiller: the validation isn't a big deal though. It's something like 15 minutes of cpu time for a full chain.
480 2012-09-23 21:16:41 <amiller> oh
481 2012-09-23 21:16:53 <amiller> it seems much slower on pynode
482 2012-09-23 21:16:59 <gmaxwell> when we talk about validation taking all the work its really the index manipulation.
483 2012-09-23 21:17:58 <gmaxwell> I like python, but sometimes I wish it (and JS and ruby and PHP) were outlawed because they screw up people's reasoning about computational complexity.
484 2012-09-23 21:18:32 <amiller> that's rubbish
485 2012-09-23 21:19:05 <amiller> bitcoind is taking much much longer than 15 minutes to validate a whole chain, so maybe i'm doing something horribly wrong
486 2012-09-23 21:19:22 <sipa> the actual sig checking for the entire chain including before the checkpoints, takes 1h cpu time on my laptop, when multithreaded
487 2012-09-23 21:19:45 <gmaxwell> sipa: ah, I was only thinking higher than the highest current checkpoint.
488 2012-09-23 21:19:46 <sipa> eh, 1h wall clock time, not cpu time
489 2012-09-23 21:20:36 <gmaxwell> amiller: most of the time stock bitcoind spends validating the chain is thrashing the databases.
490 2012-09-23 21:20:40 <sipa> and amiller: you can't measure the checking time without the indexing
491 2012-09-23 21:21:04 <amiller> i don't understand what the checkpoint saves you, you still have to do an indexing run from the beginning?
492 2012-09-23 21:21:10 <amiller> or is the checkpoint somehow a hash of an index structure?
493 2012-09-23 21:21:35 <sipa> it's just a block hash
494 2012-09-23 21:21:41 <jgarzik> amiller: checkpoint says "you may skip certain checks, if height < X"
495 2012-09-23 21:21:48 <jgarzik> script verification is one of those
496 2012-09-23 21:21:52 <gmaxwell> amiller: it just turns off ECDSA before the top checkpoint; saves some cpu. (which matters more perhaps because ECDSA and IO are not overlapped)
497 2012-09-23 21:22:07 <amiller> ecdsa and script verification then.
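A small sketch of the checkpoint rule as described here: a checkpoint is a (height, block hash) pair, a block at a checkpointed height must match the recorded hash, and script/ECDSA checks are skipped below the highest checkpoint. This is neither pynode's nor bitcoind's actual code. The function names are hypothetical and the hash in the usage note is a placeholder, not a real block hash.

```python
def last_checkpoint_height(checkpoints):
    """Height of the highest checkpoint, or -1 if there are none."""
    return max(checkpoints) if checkpoints else -1


def need_script_checks(height, block_hash, checkpoints):
    """Decide whether scripts must be verified for the block at this height.

    checkpoints maps height -> expected block hash (hex string).
    Raises ValueError if the block contradicts a checkpoint.
    """
    expected = checkpoints.get(height)
    if expected is not None and expected != block_hash:
        raise ValueError("block at height %d does not match checkpoint" % height)
    # Below the top checkpoint the chain is pinned, so skip the costly checks.
    return height >= last_checkpoint_height(checkpoints)
```

For example, with checkpoints = {193000: "aa" * 32} (placeholder hash), need_script_checks(150000, some_hash, checkpoints) is False, and every block from height 193000 onward is fully verified. Skipping the checks only saves CPU; the indexes still have to be built from the start of the chain.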
498 2012-09-23 21:23:07 <jgarzik> amiller: for pynode, checkpoints will have an enormous impact
499 2012-09-23 21:23:12 <amiller> so, what i would like to do is to be able to do bitcoind -checkpoint=<blockhash> and skip all those checks as a first pass
500 2012-09-23 21:23:18 <amiller> or checkpoint=<blockhash> in config.cfg for pynode
501 2012-09-23 21:23:28 <gmaxwell> I suspect we wouldn't have bothered with that optimization in bitcoind if we had a better handle on what was causing the slowness when it was done.
502 2012-09-23 21:23:33 <jgarzik> amiller: need hash + height
503 2012-09-23 21:23:42 <amiller> the height's included in the block header isn't it?
504 2012-09-23 21:23:47 <jgarzik> no
505 2012-09-23 21:23:57 <amiller> lol. great
506 2012-09-23 21:24:01 <amiller> so actually (hash,height) is the digest of a block
507 2012-09-23 21:24:10 <gmaxwell> amiller: it wouldn't buy you anything useful for your case.
508 2012-09-23 21:24:42 <gmaxwell> amiller: whats your current height on the node you're saying is slow?
509 2012-09-23 21:25:09 <amiller> well pynode is taking its sweet time around 178692
510 2012-09-23 21:25:20 <amiller> ;;bc,blocks
511 2012-09-23 21:25:21 <gribble> 200229
512 2012-09-23 21:25:21 <sipa> amiller: also, are you running ultraprune?
513 2012-09-23 21:25:27 <gmaxwell> oh well, I dunno what pynode does. thats another matter.
514 2012-09-23 21:25:28 <amiller> sipa, no i'm not running ultraprune for my bitcoin client
515 2012-09-23 21:26:09 <amiller> gmaxwell, on my server i'm stuck at 185025 and chugging along
516 2012-09-23 21:26:09 <sipa> if not, you're really just benchmarking bdb's filesystem syncs...
517 2012-09-23 21:26:24 <gmaxwell> Again, this is what I was saying about python screwing up reasoning about this stuff.  To get script validation (vs ecdsa) to show up in a profile run on bitcoind, even with checkpoints disabled, you have to set the profiler to show things with <1% usage.
518 2012-09-23 21:26:41 <gmaxwell> amiller: stuck? that's because the fetching logic is stupid.
519 2012-09-23 21:27:05 <gmaxwell> amiller: is it really not moving at all?
520 2012-09-23 21:27:23 <amiller> no it's not stuck i didn't mean to use that word
521 2012-09-23 21:27:30 <amiller> it's not moving very fast though
522 2012-09-23 21:27:47 <amiller> i have no idea how it's spending its time though, i've only profiled the pynode part
523 2012-09-23 21:27:53 <gmaxwell> in any case, the highest checkpoint is currently at 193000.
524 2012-09-23 21:28:14 <amiller> so if i'm running with default settings i'm not even doing signature checks below 193000
525 2012-09-23 21:28:18 <amiller> and all this time is spent indexing
526 2012-09-23 21:28:22 <amiller> but i'm not using ultraprune
527 2012-09-23 21:29:03 <gmaxwell> Right.
528 2012-09-23 21:29:35 <amiller> so i'd like to be able to set a checkpoint through an option rather than in the code if it doesn't make too much of a difference
529 2012-09-23 21:30:23 <amiller> but it doesn't make much of a difference so maybe just retract where i suggested it as a bitcoind thing
530 2012-09-23 21:30:32 <sipa> amiller: in ultraprune, i can do over 3 blocks/s in the final part of the chain, including script and ecdsa
531 2012-09-23 21:30:52 <sipa> single threaded
532 2012-09-23 21:30:54 <amiller> sipa, how fast without script/ecdsa?
533 2012-09-23 21:31:19 <sipa> i suppose a multiple of that
534 2012-09-23 21:31:27 <sipa> maybe 10/s
535 2012-09-23 21:32:19 <sipa> with parallel sig checking, close to 7/s
536 2012-09-23 21:32:35 <jgarzik> amiller: for testing purposes, just comment out tx_signed
537 2012-09-23 21:33:12 <sipa> but these are numbers from my memory, probably not from identical parts in the chain
538 2012-09-23 21:33:24 <[\\\\\\]> just a suggestion.. on bitcoin.org, instead of "#bitcoin-mining (GPU mining related)", use "#bitcoin-mining (Bitcoin mining related)"
539 2012-09-23 21:34:48 <gmaxwell> [\\\\\\]: submit pull request? :P
540 2012-09-23 21:35:07 <gmaxwell> (not that it's really required; but its useful if more people are comfortable doing them)
541 2012-09-23 21:36:24 <[\\\\\\]> Gmaxwell, it'd take me more time to submit the request than it would for somone to just make the change the next time they're doing other stuff.  Its technically not wrong asis, just less relevant. :D
542 2012-09-23 21:42:29 <amiller> jgarzik, this is the profile chart of running pynode without tx_signed http://i.imgur.com/HzH0u.png
543 2012-09-23 21:43:18 <amiller> i wonder why i'm doing calc_merkle in this stage
544 2012-09-23 21:44:20 <eian> amiller, what generated that profile chart?
545 2012-09-23 21:44:31 <eian> what tool I mean
546 2012-09-23 21:44:33 <amiller> i use ipython's profiling tool
547 2012-09-23 21:44:55 <eian> that's awesome - I wonder if there is something like that for C++
548 2012-09-23 21:46:35 <amiller> there is, gprof
549 2012-09-23 21:47:00 <eian> I've used it but only via command line
550 2012-09-23 21:47:05 <eian> Does it generate graphs like this?
551 2012-09-23 21:47:20 <amiller> oh, yeah i'm using a tool called gprof2dot to convert the gprof style output to a pretty graph
552 2012-09-23 21:48:35 <gmaxwell> eian: kcachegrind can generate plots like that from valgrind callgrind output.
553 2012-09-23 21:49:25 <amiller> so what i want to be able to do is to keep a special table of valid blocks
554 2012-09-23 21:49:26 <gmaxwell> And can give charts like http://people.xiph.org/~greg/ultraprune_profile2.png  (from an old ultraprune build)
555 2012-09-23 21:49:28 <amiller> if a block is valid, it's always valid
556 2012-09-23 21:49:32 <amiller> and all the txes in it are valid
557 2012-09-23 21:50:25 <sipa> amiller: ultraprune keeps a flag in the block database about how well it was validated
558 2012-09-23 21:50:32 <sipa> in several stages
559 2012-09-23 21:50:49 <sipa> so it never needs to redo a certain check
560 2012-09-23 21:51:22 <sipa> signature/script checking being the last stage
561 2012-09-23 21:51:51 <eian> gmaxwell, I've apparently been doing profiling like a cave man - that graphic is amazing
562 2012-09-23 21:52:18 <gmaxwell> eian: it's interactive in kcachegrind too..
563 2012-09-23 21:52:27 <amiller> that's awesome, ultraprofiled
564 2012-09-23 21:52:35 <gmaxwell> you can click to explode out any part (that graph is generated with the detail cranked up)
565 2012-09-23 21:52:52 <amiller> sipa okay i think i get it, maybe i should port ultraprune to pynode
566 2012-09-23 21:52:52 <gmaxwell> and you can flip between parent relative and absolute percentages.
567 2012-09-23 21:54:16 <amiller> what's the best writeup of ultraprune's data structures?
568 2012-09-23 21:54:32 <amiller> jgarzik also has his own db format
569 2012-09-23 21:54:46 <amiller> not db format but i mean, data organization on top of a db
570 2012-09-23 21:55:26 <amiller> actually i think you explained the whole thing to me once already
571 2012-09-23 21:56:10 <jgarzik> amiller: Just implemented pynode checkpoints, including skipping script verf if < height 193000.
572 2012-09-23 21:58:58 <sipa> amiller: i'm actually writing a document about the serialization and data structures, but it's not finished or online
573 2012-09-23 22:00:25 <jgarzik> amiller: with this checkpointing code, we can now have a true comparison between gdbm and leveldb (if that's not already good for a laugh)
574 2012-09-23 22:00:54 <sipa> basically it's a txid -> [pruned txouts] map on the one hand, and a blkid -> diskpos map otherwise
575 2012-09-23 22:01:15 <sipa> with some metadata
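A toy illustration of the two maps sipa describes: txid -> unspent outputs, and block id -> disk position. This is a sketch of the idea only, not ultraprune's serialization. The class name CoinView and the tuple layouts are assumptions.

```python
class CoinView:
    """In-memory stand-in for a txid -> [unspent txouts] index."""

    def __init__(self):
        self.coins = {}      # txid -> {output index: (value, script)}
        self.block_pos = {}  # block hash -> (file number, byte offset)

    def add_tx(self, txid, outputs):
        """Record all outputs of a newly connected transaction."""
        self.coins[txid] = dict(enumerate(outputs))

    def spend(self, txid, n):
        """Remove and return output n of txid; raise KeyError if unavailable."""
        outs = self.coins.get(txid)
        if outs is None or n not in outs:
            raise KeyError("missing or already spent: %s:%d" % (txid, n))
        out = outs.pop(n)
        if not outs:
            del self.coins[txid]  # fully spent: prune the whole entry
        return out
```

Validating a new transaction then needs only lookups in coins, never the original blocks, which is the property the later remarks about reorgs, rescans and serving depend on.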
576 2012-09-23 22:01:17 <amiller> are the blocks stored to promote seeking in some way
577 2012-09-23 22:01:22 <amiller> i guess they're stored in order
578 2012-09-23 22:01:30 <amiller> and when processing a backlog of transactions you
579 2012-09-23 22:01:34 <amiller> nevermind they're probably mostly random seeks?
580 2012-09-23 22:01:56 <jgarzik> amiller: stored in time order.  retrieval is random... but less likely to go far in the past.
581 2012-09-23 22:02:04 <sipa> you don't need the blocks for validation when you have txouts
582 2012-09-23 22:02:23 <sipa> that's what makes ultraprune fast
583 2012-09-23 22:02:36 <amiller> i thought you were hitting the blocks in ultraprune for some reason still
584 2012-09-23 22:02:47 <amiller> oh no nvm i remember what i was confused about, it's the txid
585 2012-09-23 22:02:50 <amiller> hrm
586 2012-09-23 22:02:52 <sipa> reorgs, rescans and serving
587 2012-09-23 22:03:22 <sipa> but not validation
588 2012-09-23 22:03:28 <amiller> got it
589 2012-09-23 22:03:45 <sipa> ACTION bedtime
590 2012-09-23 22:03:51 <jgarzik> sipa: we should have a NODE_VALIDATION nServices bit.  Set it now, in bitcoind
591 2012-09-23 22:04:00 <jgarzik> sipa: then remove NODE_NETWORK, if not archive node
592 2012-09-23 22:05:03 <sipa> jgarzik: something like that, indeed
593 2012-09-23 22:07:35 <gmaxwell> define a NODE_VALIDATION as being able and willing to forward txn and blocks, do full validation, offer full headers,  and be able to serve at least the last N days of blocks? (better to specify the latter with time instead of height; so a node knows who it needs to talk)
594 2012-09-23 22:07:39 <gmaxwell> oh.. hm.
595 2012-09-23 22:07:53 <gmaxwell> sipa: an ultraprune node without the blocks can't act as a server for SPV nodes, can it?
596 2012-09-23 22:08:52 <gmaxwell> (it can't generate tree fragments showing even an unspent tx was ever mined)
597 2012-09-23 22:10:19 <jgarzik> gmaxwell: issues like that are why we should avoid pruning blocks in the ref client, for a long time to come
598 2012-09-23 22:10:59 <sipa> gmaxwell: indeed, you need an archive node for that
599 2012-09-23 22:11:34 <sipa> jgarzik: i disagree, we don't need 10000 nodes serving blocks
600 2012-09-23 22:11:59 <amiller> do any rpc nodes currently answer questions of the form "what's the merkle branch and header chain showing that <txid> was mined, before block <blkhash>"
601 2012-09-23 22:12:18 <jgarzik> sipa: I rather like that we do, and think bitcoin is stronger for it ;p
602 2012-09-23 22:12:52 <jgarzik> amiller: not AFAIK
603 2012-09-23 22:13:37 <sipa> jgarzik: i think we are losing tons of full nodes in favor of non-nodes like electrum because of insisting that every node can provide syncup for every newcomer
604 2012-09-23 22:13:41 <maaku> block chain data is self-validating, there's no technical reason to have a few geographically distributed archive nodes
605 2012-09-23 22:13:50 <gmaxwell> sipa: meh, it doesn't need a full archival though, it needs an archive of the hash trees.
606 2012-09-23 22:14:04 <maaku> ...not to have...
607 2012-09-23 22:14:25 <jgarzik> gmaxwell: a full list of txid's
608 2012-09-23 22:14:59 <gmaxwell> maaku: it's very important that you be able to _get_ the data, since it's not self validating if you can't get it. Having lots of copies held by involuntary altruists keeps the cost of access low.
609 2012-09-23 22:15:08 <jgarzik> you might as well have the full txid list, and can still generate merkle branches on demand
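A minimal sketch of generating a merkle branch on demand from a block's full txid list, as suggested above. It assumes Bitcoin's tree construction (double SHA-256 of concatenated pairs, last hash duplicated on odd levels) and txids given as raw 32-byte values in internal byte order. The function names are hypothetical.

```python
import hashlib


def dsha(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_branch(txids, index):
    """Return (root, branch) proving txids[index] is in the tree."""
    if not txids:
        raise ValueError("need at least one txid")
    level = list(txids)
    branch = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])      # duplicate last hash on odd levels
        branch.append(level[index ^ 1])  # sibling of the current node
        level = [dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], branch


def verify_branch(txid, index, branch, root):
    """Recompute the root from a txid and its branch; True if it matches."""
    h = txid
    for sibling in branch:
        # An odd index means this node is the right child of its parent.
        h = dsha(sibling + h) if index & 1 else dsha(h + sibling)
        index //= 2
    return h == root
```

A node that keeps only the txid list per block could answer "was this txid mined in block B" by returning the branch. The client checks it with verify_branch against the merkle root in a header it already trusts.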
610 2012-09-23 22:15:21 <amiller> there's no reason for each individual to need to store the whole thing, you can shard the effort
611 2012-09-23 22:15:39 <jgarzik> amiller: theory != practice, there
612 2012-09-23 22:16:13 <amiller> so right now we have many people with no blockchain contribution, and some people with full blockchain contribution
613 2012-09-23 22:16:19 <jgarzik> amiller: difficult to find the right incentives for a distributed network to do that in a reliable and trustworthy manner
614 2012-09-23 22:16:34 <maaku> look at freenet
615 2012-09-23 22:16:37 <gmaxwell> amiller: I dunno about how many of the 'many' are likely to ever have any.
616 2012-09-23 22:16:38 <sipa> and efficient...
617 2012-09-23 22:18:24 <jgarzik> sipa: anyway, I agree to the point about losing full nodes in favor of non-nodes...  but it is also true that the total network count of full nodes will take an enormous nosedive, once a future bitcoin release prunes old blocks by default.
618 2012-09-23 22:18:31 <jgarzik> a red flag day
619 2012-09-23 22:18:57 <jgarzik> shift from involuntary altruists -> voluntary, prepared altruists with now the weight of bitcoin on their shoulders
620 2012-09-23 22:19:48 <Luke-Jr> ?
621 2012-09-23 22:20:16 <jgarzik> Luke-Jr: the only people remaining on the network are those who intentionally elect to be full nodes... which would no longer be default
622 2012-09-23 22:20:19 <sipa> jgarzik: no need to prune by default :)
623 2012-09-23 22:20:21 <amiller> http://i.imgur.com/VVfvW.png
624 2012-09-23 22:20:30 <jgarzik> anyway, baby bedtime, bbiab
625 2012-09-23 22:22:04 <gmaxwell> amiller: I'm concerned that elements of our model that work for rational actors may not work too well for real people, because real people aren't quite rational enough to weigh the value of running a network node over just using a thin client. So I think that one way we can help keep things functional is by using the power of defaults to encourage involuntary altruism. E.g. run full/archival nodes by default when the system can support it
626 2012-09-23 22:22:35 <gmaxwell> Of course, that means we have to get the burden from those features low enough that it doesn't mess with the user's selfish motivations and cause them to go the thinclient route.
627 2012-09-23 22:23:25 <amiller> i agree with that as a pragmatic decision, it just deserves to be highlighted since it leads to a false dichotomy
628 2012-09-23 22:23:29 <gmaxwell> so e.g. a new node should bootstrap as a SPV node, and become a full / archival / etc node in the background.  By default.. perhaps checking what sort of system it's on. (if it'll run out of space, don't bother)
629 2012-09-23 22:24:00 <amiller> the false dichotomy is about being in between a full node and an spv node
630 2012-09-23 22:24:17 <gmaxwell> well, that was the original design; we've since had a lot of powerful ideas.
631 2012-09-23 22:24:28 <amiller> not even all of them require the full merkle thing either
632 2012-09-23 22:24:45 <amiller> this is just about having a way to bootstrap a node by providing it a checkpoint from an _already validated_ blockchain
633 2012-09-23 22:25:00 <gmaxwell> amiller: just be careful that you don't assume a bunch of essential location finding stuff that _no one knows how to make attack resistant_.
634 2012-09-23 22:25:26 <amiller> i'm not, i assume that the service providers are going to be centralized and shared
635 2012-09-23 22:25:33 <amiller> like why don't we have a ton of blk0001.dat on s3
636 2012-09-23 22:25:39 <gmaxwell> amiller: what you're asking for there is simply incompatible with the current architecture of the software.
637 2012-09-23 22:25:59 <amiller> gmaxwell, it's incompatible with the current architecture of the satoshi client
638 2012-09-23 22:26:05 <gmaxwell> Yes.
639 2012-09-23 22:26:12 <gmaxwell> thats all.
640 2012-09-23 22:26:13 <amiller> but 'software' all around includes electrum non-nodes and full-nodes satoshi
641 2012-09-23 22:26:28 <amiller> there's a useful middle ground which is a validating node but yet does not have the whole burden of being a full node
642 2012-09-23 22:26:31 <gmaxwell> yes, so what?