1 2016-01-06 03:11:47 <rusty> Hmm, bitcoin-cli giving me "error: couldn't parse reply from server" under load (couple of dozen of them running). Is this a known issue?
2 2016-01-06 03:12:59 <jgarzik> rusty, Sounds like truncated JSON, due to truncated reply. What RPC?
3 2016-01-06 03:13:23 <jgarzik> rusty, and is this modern libevent http server or prehistoric crapola server?
4 2016-01-06 03:13:43 <rusty> jgarzik: nothing exotic, just localhost... hmm, recent git build.
5 2016-01-06 03:14:15 <rusty> jgarzik: sometimes they all pass, sometimes they all fail. Let me just check it's not something weird I'm doing (though I'd expect a different error if my caller were screwing up)
6 2016-01-06 03:14:39 <rusty> v0.12.99.0-7922592
7 2016-01-06 03:18:13 <jgarzik> rusty, what RPCs? large data-output RPCs like getrawmempool?
8 2016-01-06 03:18:27 <rusty> jgarzik: nope, getrawtransaction
9 2016-01-06 03:19:10 <rusty> jgarzik: stracing now...
10 2016-01-06 03:19:39 <rusty> jgarzik: ah!
11 2016-01-06 03:19:40 <rusty> "HTTP/1.1 500 Internal Server Error\r\nDate: Wed, 06 Jan 2016 03:17:47 GMT\r\nContent-Length: 25\r\nContent-Type: text/html; charset=ISO-8859-1\r\nConnection: close\r\n\r\nWork queue depth exceeded", 184
12 2016-01-06 03:20:03 <jgarzik> bitcoin-cli error reporting could be improved
13 2016-01-06 03:20:10 <rusty> jgarzik: err.. yeah :)
14 2016-01-06 03:22:05 <rusty> jgarzik: so, the answer seems to be no more than 4 requests at once by default.
15 2016-01-06 03:22:46 <jgarzik> rusty, nod, easy to fix with command line option
16 2016-01-06 03:23:11 <jgarzik> rusty, the better fix is buffering requests rather than dropping
17 2016-01-06 03:23:22 <jgarzik> that's what normal http servers do
18 2016-01-06 03:23:55 <rusty> Actually, it's the queue limit. 16 per thread.
19 2016-01-06 03:25:31 <rusty> Can also be adjusted, but I don't see why there's a limit there, really.
20 2016-01-06 03:37:21 <phantomcircuit> rusty, memory exhaustion prevention? 16 seems low though
21 2016-01-06 03:38:15 <rusty> phantomcircuit: AFAICT bitcoind's RPC doesn't have any async commands, so you can only have one command per socket fd. That seems sufficient limit to me.
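
A minimal sketch of the behaviour rusty describes above: a fixed-depth work queue that rejects new requests once it is full instead of buffering them. The class and function names are hypothetical and this is a toy model, not bitcoind's HTTP server code; the depth of 16 is the figure quoted in the chat. On the bitcoind side, the thread count and queue depth are, as far as I know, set with the `-rpcthreads` and `-rpcworkqueue` options.

```python
import queue


class BoundedWorkQueue:
    """Toy model of a server work queue with a fixed maximum depth."""

    def __init__(self, max_depth=16):
        self._queue = queue.Queue(maxsize=max_depth)

    def submit(self, request):
        """Return True if the request was queued, False if it was rejected."""
        try:
            # put_nowait raises queue.Full instead of blocking the caller.
            self._queue.put_nowait(request)
            return True
        except queue.Full:
            return False


def submit_burst(n_requests, max_depth=16):
    """Submit n_requests at once with no worker draining the queue."""
    work_queue = BoundedWorkQueue(max_depth)
    return [work_queue.submit(i) for i in range(n_requests)]


results = submit_burst(24, 16)
print(results.count(True), "queued,", results.count(False), "rejected")
```

With no worker draining the queue, every request beyond the depth limit is rejected, which is roughly what a burst of concurrent bitcoin-cli calls would see as an HTTP 500.
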
22 2016-01-06 04:08:40 <jtoomim> what's the O( ) scaling for a single UTXO lookup from disk (e.g. SSD)?
23 2016-01-06 04:09:29 <jtoomim> vs UTXO set size
24 2016-01-06 04:11:58 <petertodd> jtoomim: O(<something log something>) likely, if you're talking about the theory and physics
25 2016-01-06 04:12:07 <petertodd> jtoomim: in practice it's lumpy; it's lot a O() scaling question
26 2016-01-06 04:12:20 <petertodd> *not a
27 2016-01-06 04:13:37 <jtoomim> lumpy because of the 4 kB block size of disks?
28 2016-01-06 04:14:36 <jtoomim> my question i guess is about the leveldb backend for coins entries, how it's organized, how it's searched, etc.
29 2016-01-06 04:15:02 <petertodd> jtoomim: it's lumpy because disks are physical things with finite bandwidth
30 2016-01-06 04:15:04 <jtoomim> it seems like it should be O(log n) if you're using a tree or something like that on disk
31 2016-01-06 04:16:08 <jtoomim> do we have any indexes for where to find certain utxos on disk, or do we just search all the entries until we find it?
32 2016-01-06 04:16:17 <jtoomim> is the index stored in memory, or is it on disk?
33 2016-01-06 04:16:33 <jtoomim> if you don't know offhand, no big deal, i'll read the code when i get around to it
34 2016-01-06 04:16:36 <petertodd> jtoomim: how would you "search all the entries"...
35 2016-01-06 04:16:37 <jtoomim> just curious, really
36 2016-01-06 04:16:51 <phantomcircuit> jtoomim, with leveldb it is... O(log n) for n entries modulo a ton of implementation details
37 2016-01-06 04:17:13 <jtoomim> ok, thanks phantomcircuit
38 2016-01-06 04:17:42 <phantomcircuit> jtoomim, there's a *ton* of implementation details though
39 2016-01-06 04:18:01 <jtoomim> not surprised
40 2016-01-06 04:18:11 <jtoomim> needing atomic writes makes things tricky
41 2016-01-06 04:18:37 <petertodd> jtoomim: like I say, this isn't a O() scaling question in practice
42 2016-01-06 04:19:09 <petertodd> jtoomim: keep in mind a lot of nodes are on shit VPS's
43 2016-01-06 04:19:27 <phantomcircuit> jtoomim, for example by default there's a ton of caching layers between bitcoin and the data
44 2016-01-06 04:19:38 <jtoomim> right
45 2016-01-06 04:19:55 <jtoomim> i'm just thinking if maybe we could have a different way of caching/storing the data
46 2016-01-06 04:20:11 <petertodd> jtoomim: have you seen my txo commitments?
47 2016-01-06 04:20:19 <jtoomim> like maybe having a map in memory of address to disk index
48 2016-01-06 04:20:58 <jtoomim> to try to reduce the implementation details, and make better use of disk io
49 2016-01-06 04:20:59 <petertodd> jtoomim: random lookup of a set of things is always going to be a simple question of how many IOPs you can push, but you can make the size of that set small enough to fit into RAM
50 2016-01-06 04:21:09 <phantomcircuit> bitcoind internal cache L1, L2, L3 memory, os page cache L2, L3 memory, storage memory cache, storage itself
51 2016-01-06 04:21:10 <jtoomim> but at this point, i'm just doing shower thoughts
52 2016-01-06 04:21:14 <petertodd> jtoomim: <for some definition of "a small amount of ram">
53 2016-01-06 04:21:45 <jtoomim> petertodd: no i haven't seen txo commitments
54 2016-01-06 04:24:00 <phantomcircuit> and each is more or less an order of magnitude slower than the previous
55 2016-01-06 04:24:00 <phantomcircuit> each of those will have its own independent complexity for random access
56 2016-01-06 04:24:00 <phantomcircuit> petertodd, i suspect there's something that could be done to improve memory usage of the dbcache but only at pretty extreme cpu cycle cost
57 2016-01-06 04:24:35 <phantomcircuit> it might be worth it but only on systems where power consumption is irrelevant
58 2016-01-06 04:24:35 <phantomcircuit> jtoomim, tl;dr optimization beyond what is already being done would require knowledge of the target hardware
59 2016-01-06 04:25:16 <petertodd> jtoomim: and if you need that kind of heroics, something is very wrong with your system in a decentralized environment...
60 2016-01-06 04:26:35 <petertodd> jtoomim: it's really bad when getting into mining requires knowledge of a whole bunch of esoteric stuff like SSD optimization
61 2016-01-06 04:26:59 <petertodd> jtoomim: equally, that's a system with no safety margin
62 2016-01-06 04:27:01 <jtoomim> petertodd: i disagree on that point. I have no sympathy for lazy or incompetent miners who still want to be profitable.
63 2016-01-06 04:27:22 <phantomcircuit> petertodd, ssd optimization? psh you clearly need more memory
64 2016-01-06 04:27:34 <petertodd> phantomcircuit: heh, indeed
65 2016-01-06 04:27:51 <petertodd> phantomcircuit: which can be a pretty big fixed cost
66 2016-01-06 04:28:02 <jtoomim> like $200?
67 2016-01-06 04:28:16 <petertodd> jtoomim: we're designing a decentralized system; those miners add much needed diversity
68 2016-01-06 04:28:18 <jtoomim> compared to the cost of a single antminer S7, which is $1200?
69 2016-01-06 04:28:40 <petertodd> jtoomim: UTXO size can currently grow by gigabytes per year
70 2016-01-06 04:28:55 <petertodd> jtoomim: lots of scenarios where that happens
71 2016-01-06 04:29:17 <jtoomim> are you talking about 50 GB per year, or 1 GB per year, or something in between?
72 2016-01-06 04:29:50 <jtoomim> 32 GB costs about $200
73 2016-01-06 04:29:55 <petertodd> jtoomim: equally, in a healthy system you need anonymity to blunt attacks, which means running a full node with sufficient speed to help minres be profitable can't be unusual
74 2016-01-06 04:30:06 <jtoomim> so if we had to spend that much on RAM each year, we would be... not spending much
75 2016-01-06 04:30:38 <petertodd> jtoomim: yes, we can grow up to ~50GB per year in theory, in practice some % of that, probably low double-digits is a realistic scenario
76 2016-01-06 04:31:00 <jtoomim> ok, petertodd, at this point you're just repeating points which you should know i disagree with, and not really answering my questions earlier about how the current system works
77 2016-01-06 04:31:02 <petertodd> jtoomim: VPS's with lots of ram are expensive
78 2016-01-06 04:31:43 <jtoomim> petertodd, i think you don't have a good idea of the economics of mining.
79 2016-01-06 04:31:55 <jtoomim> my company spends over $10k per month on electricity.
80 2016-01-06 04:32:03 <petertodd> jtoomim: I think this is a case where you're designing a different system than I am
81 2016-01-06 04:32:09 <jtoomim> and we've got about 0.2% of the network hashrate or something like that.
82 2016-01-06 04:32:29 <jtoomim> and we've also got very low electricity prices.
83 2016-01-06 04:32:30 <petertodd> jtoomim: your company depends on a wider community
84 2016-01-06 04:33:00 <jtoomim> the hardware cost to run a node is basically free to us, even when we overspend
85 2016-01-06 04:33:03 <petertodd> jtoomim: btw do you mine, or point that hashing power at a pool
86 2016-01-06 04:33:16 <phantomcircuit> jtoomim, to get the absolute lowest switching time possible you currently need like 64GB of ram
87 2016-01-06 04:33:19 <jtoomim> we do p2pool as much as possible
88 2016-01-06 04:33:20 <petertodd> jtoomim: again, the hardware cost that matters isn't your node, it's everyone else's nodes
89 2016-01-06 04:33:53 <petertodd> jtoomim: what % is p2pool at right now?
90 2016-01-06 04:34:04 <jtoomim> petertodd: less than 1%
91 2016-01-06 04:34:11 <jtoomim> it's about 1.4 PH/s
92 2016-01-06 04:34:43 <jtoomim> and don't you blame that on the costs of running a node, because that has nothing to do with p2pool's problems
93 2016-01-06 04:34:46 <petertodd> jtoomim: heh, you're probably in a position where you're actually taking away shares from other p2pool users in theory, although it's probably irrelevant compared to variance reduction benefits
94 2016-01-06 04:35:18 <petertodd> jtoomim: p2pool has all the perverse latency incentive issues that bitcoin does, but in miniature
95 2016-01-06 04:35:21 <jtoomim> we operate several different nodes. we don't have all our hashrate on one node.
96 2016-01-06 04:35:35 <jtoomim> but yes, it's been a concern of mine
97 2016-01-06 04:36:06 <jtoomim> but the main concern of mine for p2pool is the fact that antminers don't work reliably with it
98 2016-01-06 04:36:09 <jtoomim> and lose hashrate
99 2016-01-06 04:36:09 <petertodd> jtoomim: what's the ping time between those nodes?
100 2016-01-06 04:36:20 <jtoomim> depends on what kind of ping you're talking about
101 2016-01-06 04:36:23 <jtoomim> p2pool is mostly cpu bound
102 2016-01-06 04:36:51 <petertodd> jtoomim: if you have lower average latency to the rest of the p2pool community you'd get more than your fair share of p2pool shares
103 2016-01-06 04:36:53 <jtoomim> so the net-stack-level ping can be hundreds of milliseconds, or about a second for the slowest node
104 2016-01-06 04:36:57 <phantomcircuit> ;;calc 10000/((1400000 * 0.25)/1000)
105 2016-01-06 04:36:58 <gribble> 28.5714285714
106 2016-01-06 04:37:17 <jtoomim> in practice, it's mostly CPU, not network
107 2016-01-06 04:37:36 <phantomcircuit> ;;calc 28.57 / 730
108 2016-01-06 04:37:37 <gribble> 0.0391369863014
109 2016-01-06 04:37:39 <jtoomim> but yeah, 0.1 ms ping on the network stack
110 2016-01-06 04:37:46 <phantomcircuit> jtoomim, not a terrible price
111 2016-01-06 04:37:53 <phantomcircuit> assuming you're running S7s
112 2016-01-06 04:38:10 <jtoomim> $0.04/kWh? that's not what we pay
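
The two gribble calculations above can be unpacked as follows. This is a reading of the inputs, not something the chat states outright: 10000 looks like the $10k monthly electricity bill, 1400000 looks like 1.4 PH/s expressed in GH/s, 0.25 appears to be an assumed power draw in watts per GH/s for an S7, and 730 is roughly the number of hours in a month. The function name is hypothetical. Note that 1.4 PH/s was the figure quoted for p2pool as a whole, which may be why jtoomim says the result is not what they pay.

```python
def implied_price_per_kwh(monthly_bill_usd, hashrate_ghs, watts_per_ghs,
                          hours_per_month=730):
    """Electricity price implied by a monthly bill and an assumed power draw."""
    kilowatts = hashrate_ghs * watts_per_ghs / 1000  # total draw in kW
    usd_per_kw_month = monthly_bill_usd / kilowatts  # first gribble result
    return usd_per_kw_month / hours_per_month        # second gribble result


price = implied_price_per_kwh(10_000, 1_400_000, 0.25)
print(f"${price:.4f} per kWh")
```
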
113 2016-01-06 04:38:34 <petertodd> jtoomim: how are you setup? I easily got lower cpu latency than that
114 2016-01-06 04:39:00 <rusty> jtoomim: last I did a rough check, UTXO tended to be very "recent first", so caching the last few blocks wins. I think it was ~25% of inputs come from last 6 blocks.
115 2016-01-06 04:39:06 <petertodd> jtoomim: and yeah, the 0.1ms between nodes is a big advantage
116 2016-01-06 04:39:09 <jtoomim> on p2pool with about 100 TH/s on each node in about a dozen different users?
117 2016-01-06 04:39:26 <phantomcircuit> jtoomim, antminers dont work reliably on p2pool?
118 2016-01-06 04:39:27 <phantomcircuit> wat
119 2016-01-06 04:39:30 <jtoomim> p2pool's code does not scale very well
120 2016-01-06 04:39:46 <petertodd> jtoomim: right, so scale out - no need to have a dozen users on a p2pool node
121 2016-01-06 04:39:53 <jtoomim> antminers do not work reliably on p2pool because (a) they drop stale work instead of submitting it to the pool, and (b) they crash a lot with very high difficulty work
122 2016-01-06 04:40:15 <petertodd> jtoomim: and TH/s should have nothing to do with it in a proper setup - crank up share diff
123 2016-01-06 04:40:27 <jtoomim> when they (b) crash, they shut the fans off immediately, but the ASICs continue to generate heat for a couple of minutes, which means that p2pool can actually destroy the hardware
124 2016-01-06 04:40:37 <jtoomim> it's ... suboptimal
125 2016-01-06 04:40:42 <phantomcircuit> rusty, the whole thing in ram consumes ~6GB today then you need more to delay flushing the write buffer... basically forever
126 2016-01-06 04:41:02 <petertodd> jtoomim: why can't the mining software do the share filtering for you?
127 2016-01-06 04:41:11 <petertodd> jtoomim: crashing on high diff work is bizarre
128 2016-01-06 04:41:11 <phantomcircuit> jtoomim, they drop stale work for a reason, believe me you dont want them to change that :P
129 2016-01-06 04:41:31 <jtoomim> with p2pool work, you have share-stale work that is still block-valid work
130 2016-01-06 04:41:41 <rusty> phantomcircuit: sure, but if you're worried long term, there's a benefit to moving to a two-level approach.
131 2016-01-06 04:41:56 <jtoomim> this is for work for which there was a nonce successfully found
132 2016-01-06 04:42:00 <phantomcircuit> jtoomim, with spi chaining you can get the same piece of work submitted to the software in an infinite loop
133 2016-01-06 04:42:02 <jtoomim> this is not flushing the work out of the system
134 2016-01-06 04:42:06 <phantomcircuit> you really dont want that going to the pool
135 2016-01-06 04:42:25 <jtoomim> phantomcircuit, i think the pool is a better determinant of that
136 2016-01-06 04:42:29 <jtoomim> especially with p2pool
137 2016-01-06 04:42:38 <jtoomim> where it's still valid work that's just for an old share
138 2016-01-06 04:42:41 <phantomcircuit> rusty, yes i agree the read caching ness of the cache needs to be improved, currently it's mostly a write cache by design
139 2016-01-06 04:42:47 <jtoomim> it could be valid blocks that you're throwing away
140 2016-01-06 04:43:12 <phantomcircuit> jtoomim, maybe but what if you have 1000 things connected to the pool all spewing nonsense?
141 2016-01-06 04:43:19 <phantomcircuit> there's a tradeoff to be made
142 2016-01-06 04:43:23 <jtoomim> phantomcircuit i don't understand the SPI chaining thing
143 2016-01-06 04:43:35 <phantomcircuit> jtoomim, hardware wizardry
144 2016-01-06 04:43:43 <jtoomim> i know what spi chaining is
145 2016-01-06 04:43:53 <jtoomim> i don't understand how you think it can result in an infinite loop
146 2016-01-06 04:44:21 <phantomcircuit> jtoomim, the logic on any 1 of the chips getting wedged can leave it stuck flagging it has a result
147 2016-01-06 04:44:30 <jtoomim> especially since this bug has been fixed in alternate firmware, which bitmain just chose not to use...
148 2016-01-06 04:44:49 <phantomcircuit> you can also fix it by detecting duplicate nonce results
149 2016-01-06 04:44:56 <phantomcircuit> (but like two lines of code man)
150 2016-01-06 04:45:14 <jtoomim> ok, so if a chip is sending back bad results, usually the firmware chooses to shut off that asic
151 2016-01-06 04:45:23 <jtoomim> ... problem solved?
152 2016-01-06 04:45:31 <phantomcircuit> jtoomim, valid but duplicate results
153 2016-01-06 04:45:48 <jtoomim> duplicate results are different from stale results
154 2016-01-06 04:45:51 <phantomcircuit> also i dont think you can really turn off the spi logic without shutting down the entire chain
155 2016-01-06 04:46:01 <phantomcircuit> oooh right
156 2016-01-06 04:46:11 <jtoomim> and this check is after the spi communication
157 2016-01-06 04:46:16 <phantomcircuit> jtoomim, just modify p2pool not to set the flush flag?
158 2016-01-06 04:46:17 <jtoomim> afaik
159 2016-01-06 04:46:26 <jtoomim> no, that would kill your share orphan rate
160 2016-01-06 04:47:46 <jtoomim> so back to the original question, any idea how many disk accesses are needed per UTXO lookup?
161 2016-01-06 04:47:58 <jtoomim> rough estimate
162 2016-01-06 04:48:47 <phantomcircuit> jtoomim, "it depends"
163 2016-01-06 04:49:19 <phantomcircuit> as little as 0 as high as thousands
164 2016-01-06 04:49:36 <jtoomim> any idea what the typical number is?
165 2016-01-06 04:50:42 <jtoomim> i'm envisioning a system in which we could get the worst case down to 1 as long as you have enough ram for 10% of the total serialized UTXO set by storing in ram the location where you could get the actual data
166 2016-01-06 04:51:08 <jtoomim> so you just have to read that one spot, instead of walking a tree or something with disk accesses
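
A minimal sketch of the idea jtoomim floats above: keep an in-memory map from key to on-disk location so that a lookup needs one seek and one read. All names and the sample keys are hypothetical, and the sketch deliberately ignores updates, deletions, atomic writes, and the caching and device behaviour that the others raise in this conversation.

```python
import os
import tempfile


def build_store(entries, path):
    """Append each value to a flat file; return {key: (offset, length)}."""
    index = {}
    with open(path, "wb") as f:
        for key, value in entries.items():
            index[key] = (f.tell(), len(value))  # position before the write
            f.write(value)
    return index


def lookup(path, index, key):
    """Fetch one value using the in-memory index: one seek, one read."""
    offset, length = index[key]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def demo():
    entries = {
        b"txid-a:0": b"coin A",
        b"txid-b:1": b"coin BB",
        b"txid-c:0": b"coin CCC",
    }
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        index = build_store(entries, path)
        return lookup(path, index, b"txid-b:1")
    finally:
        os.remove(path)


print(demo())
```

The memory cost is one index entry per stored item, which is the tradeoff jtoomim is pointing at: the index must fit in RAM even when the values do not.
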
167 2016-01-06 04:51:52 <phantomcircuit> jtoomim, i was serious
168 2016-01-06 04:51:55 <phantomcircuit> it totally depends
169 2016-01-06 04:52:11 <phantomcircuit> if you're me you can go and mess with things and use tons of memory such that it's basically always zero
170 2016-01-06 04:52:15 <phantomcircuit> otherwise
171 2016-01-06 04:52:30 <phantomcircuit> math.random
172 2016-01-06 04:52:38 <jtoomim> so for something that misses the cache?
173 2016-01-06 04:52:51 <petertodd> jtoomim: SSD's are internally rather complex these days - they're not a flat addressable memory
174 2016-01-06 04:53:11 <jtoomim> petertodd i don't think we need to know how the firmware of the SSD works
175 2016-01-06 04:53:11 <phantomcircuit> every step of caching is equally complex these days
176 2016-01-06 04:53:21 <phantomcircuit> jtoomim, oh but you do!
177 2016-01-06 04:53:23 <petertodd> jtoomim: not the firmware, the physical design
178 2016-01-06 04:53:29 <petertodd> jtoomim: (well, firmware matters too)
179 2016-01-06 04:53:31 <phantomcircuit> since it probably involves multiple disk accesses itself
180 2016-01-06 04:53:32 <jtoomim> the firmware hides the physical design from us
181 2016-01-06 04:53:42 <jtoomim> we give it an LBA address, it gives us back 4 kB
182 2016-01-06 04:53:42 <phantomcircuit> jtoomim, nah it *tries* to hide that
183 2016-01-06 04:53:43 <petertodd> jtoomim: at multiple levels you have limits on bandwidth for a given area
184 2016-01-06 04:53:45 <jtoomim> or the OS
185 2016-01-06 04:53:53 <phantomcircuit> they fail horribly at it with truly random access
186 2016-01-06 04:53:56 <petertodd> jtoomim: hell, even ram is structured that way too these days
187 2016-01-06 04:54:01 <jtoomim> and it does so in a few µs
188 2016-01-06 04:54:02 <petertodd> jtoomim: random access is a myth
189 2016-01-06 04:54:23 <phantomcircuit> jtoomim, lulz very very few ssds will actually return a random read in microseconds
190 2016-01-06 04:55:06 <jtoomim> ok, whatever, doesn't matter. the point i'm making is that we can probably do one UTXO, one disk read
191 2016-01-06 04:55:12 <jtoomim> and it sounds like we're not doing that
192 2016-01-06 04:55:19 <jtoomim> do we have a good reason for not doing that?
193 2016-01-06 04:55:44 <petertodd> jtoomim: what makes you think we aren't doing that?
194 2016-01-06 04:55:44 <phantomcircuit> jtoomim, no the point is that you *cant* do that
195 2016-01-06 04:55:55 <petertodd> jtoomim: indexes are cached in ram anyway
196 2016-01-06 04:56:16 <phantomcircuit> it's a tree, you try to shove as much of the top of the tree into ram as possible
197 2016-01-06 04:56:20 <jtoomim> petertodd because phantomcircuit said sometimes 1000 reads per utxo
198 2016-01-06 04:56:21 <petertodd> jtoomim: and the hardware physically can't do arbitrary random access
199 2016-01-06 04:56:35 <phantomcircuit> jtoomim, the magic of having to read a journal
200 2016-01-06 04:56:56 <jtoomim> you mean the filesystem journal? ext4 or whatever?
201 2016-01-06 04:56:58 <phantomcircuit> like i said
202 2016-01-06 04:57:03 <phantomcircuit> LOTS OF DETAILS
203 2016-01-06 04:57:12 <phantomcircuit> leveldb has a journal
204 2016-01-06 04:57:23 <jtoomim> ok
205 2016-01-06 04:57:43 <phantomcircuit> the bigger issue is that it really is O(log n)
206 2016-01-06 04:58:00 <phantomcircuit> but the cost of 1 operation is also variable based on system state
207 2016-01-06 04:58:35 <phantomcircuit> so yeah you had to call read() only once, but maybe that took 100 times longer than that other time you called read() 100 times
208 2016-01-06 04:58:40 <phantomcircuit> one the same hardware
209 2016-01-06 04:58:44 <phantomcircuit> on*
210 2016-01-06 04:59:04 <jtoomim> that's a leveldb read()?
211 2016-01-06 04:59:25 <phantomcircuit> jtoomim, it's a k/v db so get() ? something like that
212 2016-01-06 04:59:57 <jtoomim> a seek(key) then a GetValue(result), i think
213 2016-01-06 05:00:01 <phantomcircuit> you could switch to something that isn't tree structured but then performance would be atrocious on hard drives and only slightly better on an ssd
214 2016-01-06 05:00:22 <phantomcircuit> jtoomim, the seek is totally virtual it's not like it moves the disk or anything
215 2016-01-06 05:00:29 <jtoomim> pcursor->seek...
216 2016-01-06 05:00:55 <phantomcircuit> the cursor stuff is for iterating over multiple values
217 2016-01-06 05:01:31 <phantomcircuit> pdb->Get
218 2016-01-06 05:01:50 <phantomcircuit> ok well lets have some fun tracing things
219 2016-01-06 05:03:04 <phantomcircuit> utxo lookups go through CCoinsViewCache which checks an std::unordered_map, O(1) average, but probably causes a main memory cache miss 99% of the time
220 2016-01-06 05:03:26 <phantomcircuit> assume that misses
221 2016-01-06 05:04:28 <phantomcircuit> pdb->Get calculates which sorted table file the result will be in and then does a bisecting search of the file O(log n)
222 2016-01-06 05:05:18 <phantomcircuit> so now you're actually doing O(log n) disk seeks since they're random access
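
To make the O(log n) claim above concrete, here is a small sketch that counts the probes a bisecting search makes over a sorted list of keys. It is only an illustration of the search pattern, not leveldb's actual table format; each probe stands in for what could be a separate disk seek if nothing is cached. The function name and the key set are hypothetical.

```python
import math


def bisect_probes(sorted_keys, target):
    """Binary-search sorted_keys; return (found, number_of_probes)."""
    lo, hi = 0, len(sorted_keys)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1  # one probe = one potential random read
        if sorted_keys[mid] == target:
            return True, probes
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return False, probes


keys = list(range(0, 2_000_000, 2))  # 1,000,000 sorted even keys
bound = math.ceil(math.log2(len(keys) + 1))
found, probes = bisect_probes(keys, 1_234_566)
print(found, probes, "probes; upper bound", bound)
```
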
223 2016-01-06 05:05:38 <phantomcircuit> but wait there is the os page cache!
224 2016-01-06 05:06:16 <phantomcircuit> that's like lots of context switching and tons of main memory hits
225 2016-01-06 05:06:23 <phantomcircuit> so
226 2016-01-06 05:06:24 <phantomcircuit> yeah
227 2016-01-06 05:06:29 <phantomcircuit> jtoomim, there is no typical
228 2016-01-06 05:06:44 <phantomcircuit> and you cannot guarantee that only a single disk seek occurs
229 2016-01-06 05:06:57 <phantomcircuit> it's the nature of random access
230 2016-01-06 05:07:20 <jtoomim> if you know the exact position, to the byte, of the utxo entry, then you can do a single disk seek
231 2016-01-06 05:08:08 <petertodd> jtoomim: so? most of that process will be cached in memory anyway
232 2016-01-06 05:08:39 <jtoomim> it should be. is it?
233 2016-01-06 05:08:40 <petertodd> jtoomim: also, remember that leveldb uses b-trees, so there aren't that many levels
234 2016-01-06 05:08:49 <jtoomim> i'm curious how much worse the average case is than the typical case, etc.
235 2016-01-06 05:08:57 <phantomcircuit> jtoomim, the hard part is figuring out where it is
236 2016-01-06 05:08:59 <petertodd> jtoomim: you mean worst case?
237 2016-01-06 05:09:22 <jtoomim> worst case is interesting too
238 2016-01-06 05:09:33 <phantomcircuit> oh yeah and you can construct a worst case block
239 2016-01-06 05:09:36 <petertodd> jtoomim: what do you mean by avg and typ cases?
240 2016-01-06 05:09:38 <phantomcircuit> i had forgotten about that actually
241 2016-01-06 05:09:45 <jtoomim> deviation of average from typical is another way of describing the bad-but-common case
242 2016-01-06 05:09:47 <phantomcircuit> well you cant anymore if obfuscation is on
243 2016-01-06 05:10:03 <petertodd> phantomcircuit: +1 <censored>
244 2016-01-06 05:10:32 <phantomcircuit> jtoomim, average and typical mean the same thing, worst case is the interesting thing to analyze especially if it can be intentionally triggered
245 2016-01-06 05:10:49 <jtoomim> average = mean, typical = mode
246 2016-01-06 05:11:42 <jtoomim> i probably could have stated that more clearly earlier, sorry
247 2016-01-06 05:12:09 <petertodd> jtoomim: again, worst case is the interesting one - get that right and everything else is likely to be a wash
248 2016-01-06 05:12:16 <jtoomim> i'm curious about what portion of the total UTXO lookup times are taken up by what parts of the latency-distribution
249 2016-01-06 05:12:37 <jtoomim> i.e. is it a very heavy-tail distribution where the extreme slow lookups take up almost all of the time, etc.
250 2016-01-06 05:13:28 <petertodd> jtoomim: with obfuscation the only interesting part of that will be cached vs. uncached lookups
251 2016-01-06 05:13:41 <jtoomim> if you want to avoid the worst case being bad, then the most efficient allocation of memory is to store the top levels of the tree, and not cache any actual UTXOs
252 2016-01-06 05:14:02 <jtoomim> if you want to help the typical case, then you store the most commonly used utxos, and don't store the parts of the tree that you don't use often
253 2016-01-06 05:14:16 <jtoomim> if you want to help the average case, then you have to balance those two strategies for memory allocation
254 2016-01-06 05:14:27 <petertodd> jtoomim: the degree to which that avoids the worst-case being bad is going to be sufficiently small as to not matter - utxo growth is a *constraint*
255 2016-01-06 05:17:29 <petertodd> jtoomim: it's trivial to calculate worst case by deciding on a reference IOPs figure, dividing that by the number of UTXO's a block
256 2016-01-06 05:17:58 <petertodd> can need to lookup, and then doing the math on what orphan rate that'll end up with
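
A rough sketch of the calculation petertodd outlines above, under assumptions that are not in the chat: every lookup costs one I/O, validation time is lookups divided by IOPS, and the chance that a competing block appears during that delay follows a Poisson model with a 600-second average block interval. The function name and the example figures (10,000 lookups, 1,000 IOPS) are purely illustrative.

```python
import math


def worst_case_orphan_rate(lookups_per_block, iops, block_interval_s=600.0):
    """Return (validation_seconds, approximate orphan probability)."""
    validation_s = lookups_per_block / iops  # all lookups miss every cache
    # Assumed Poisson model: P(another block found within validation_s).
    orphan_rate = 1.0 - math.exp(-validation_s / block_interval_s)
    return validation_s, orphan_rate


seconds, rate = worst_case_orphan_rate(10_000, 1_000)
print(f"{seconds:.1f} s validation, ~{rate:.2%} orphan risk")
```
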
257 2016-01-06 05:18:45 <phantomcircuit> petertodd, even that would end up not calculating a true worst case though
258 2016-01-06 05:19:01 <petertodd> you can make a reasonable assumption that a large, well-resourced, miner can push that number to basically zero, so then decide what's the orphan rate - read revenue -> profitability - difference you're willing to accept between small and large
259 2016-01-06 05:19:20 <petertodd> phantomcircuit: sure, but that should cover the attacker triggerable worst case fairly well
260 2016-01-06 05:19:33 <phantomcircuit> since a true worst case occurs when leveldb decided to rewrite one of the tables (in theory it happens in the background and doesn't affect operating time, in practice disks have limited resources and it doesn't work that way)
261 2016-01-06 05:19:53 <petertodd> phantomcircuit: so long as an attacker can't trigger that I'm not too worried
262 2016-01-06 05:20:36 <phantomcircuit> petertodd, they probably cant without also generating a huge amount of utxos
263 2016-01-06 05:22:45 <petertodd> jtoomim: now, if you're not willing to entertain a max orphan rate difference between large/well-resourced miners and small/poorly-resourced miners, again, we just don't have anything close to the same goals
264 2016-01-06 05:23:37 <jtoomim> how large is large? how small is small?
265 2016-01-06 05:23:49 <jtoomim> are we talking small = 1%? 0.1%? 0.01%?
266 2016-01-06 05:24:12 <petertodd> jtoomim: small = 0%
267 2016-01-06 05:24:19 <petertodd> jtoomim: large probably would mean 100%
268 2016-01-06 05:24:37 <petertodd> jtoomim: like I said, assume "large" has a 0% orphan rate
269 2016-01-06 05:24:50 <jtoomim> i'm not interested in those assumptions
270 2016-01-06 05:25:03 <jtoomim> i'm not interested in a full-node miner in every home
271 2016-01-06 05:25:09 <jtoomim> i don't think that's realistic
272 2016-01-06 05:25:23 <jtoomim> and i don't think that constraining bitcoin to that requirement is good for bitcoin
273 2016-01-06 05:25:38 <petertodd> jtoomim: like I said, you're not designing a system anything like what I'm interested in designing
274 2016-01-06 05:26:13 <phantomcircuit> jtoomim, are you assuming the only users of the system with a full node are miners?
275 2016-01-06 05:26:15 <petertodd> jtoomim: granted, so you admit that bitcoin doesn't scale, and as it grows you expect the % of the users trusting others to run the system to approach 100%?
276 2016-01-06 05:26:21 <phantomcircuit> if you are then you're designing a broken system
277 2016-01-06 05:26:36 <petertodd> phantomcircuit: ...also designing a much easier to design system!
278 2016-01-06 05:26:45 <petertodd> phantomcircuit: why bank chains are trivial in comparison
279 2016-01-06 05:26:48 <phantomcircuit> petertodd, yeah in that case
280 2016-01-06 05:26:55 <phantomcircuit> PATRICK SIGNS BLOCK #1!
281 2016-01-06 05:26:57 <jtoomim> phantomcircuit, no i'm assuming the only ones who need latency in processing a block to be a few seconds for adversarial conditions are miners or companies with deep pockets
282 2016-01-06 05:27:53 <phantomcircuit> jtoomim, normal users need to be able to complete initial block synchronization in about 6 hours before they start screaming bloody murder
283 2016-01-06 05:27:56 <jtoomim> joe running a full node in his basement for his wallet can afford to have block take a while to verify in some cases, even minutes (though that is distasteful)
284 2016-01-06 05:28:18 <phantomcircuit> build me a system that runs the full validation in that time on a $500 pc and we can talk
285 2016-01-06 05:28:26 <jtoomim> i don't think normal users should be using a full node wallet. i think that is for power users.
286 2016-01-06 05:28:37 <petertodd> jtoomim: so normal users are trusting others?
287 2016-01-06 05:28:38 <phantomcircuit> then you assume a system that will fail
288 2016-01-06 05:28:40 <jtoomim> power users can be patient for IBD.
289 2016-01-06 05:29:27 <jtoomim> normal users might be paying other users for a contract giving them the data they need
290 2016-01-06 05:29:37 <jtoomim> and if there's fraud, then the full node operator is liable
291 2016-01-06 05:29:52 <petertodd> jtoomim: why bother with this mining stuff? Why not just pay a bank?
292 2016-01-06 05:29:54 <jtoomim> for most wallet users, i think that should be more than enough
293 2016-01-06 05:30:00 <petertodd> jtoomim: or hell, why not use a layer on top of bitcoin?
294 2016-01-06 05:30:17 <brg444> sorry to interject but isn't the idea to minimize trust?
295 2016-01-06 05:30:32 <petertodd> jtoomim: adversarial forced soft-forks are bad enough already - in an environment where the super majority of the economy is trusting a small group of miners... ugh
296 2016-01-06 05:30:35 <jtoomim> because you don't need to use a bank. you can run your own full node if you want to pay the hardware costs.
297 2016-01-06 05:30:43 <jtoomim> it's just a matter of convenience.
298 2016-01-06 05:30:50 <jtoomim> full nodes will never be convenient.
299 2016-01-06 05:30:53 <gijensen> brg444: There's sarcasm going around
300 2016-01-06 05:31:23 <petertodd> jtoomim: anyway, this conversation isn't going to be productive - you have fundamentally different goals than I do
301 2016-01-06 05:31:28 <jtoomim> the complete safety of full nodes is great for some use cases, but for a lot of people, they aren't a good tradeoff
302 2016-01-06 05:31:51 <jtoomim> i don't need to run a full node on my cell phone for it to be useful to me
303 2016-01-06 05:32:00 <jtoomim> yeah, fine, let's drop it
304 2016-01-06 05:32:06 <rusty> petertodd: to be fair, fraud proofs will allow a more nuanced range than "full validation" or "miner trust"
305 2016-01-06 05:32:34 <petertodd> rusty: I no longer think fraud proofs work very well
306 2016-01-06 05:33:01 <aj> petertodd: why not?
307 2016-01-06 05:33:04 <petertodd> rusty: you need the data to make a fraud proof, so we're much more likely to end up with validity challenges, where a fraud proof is simply an unmet challenge
308 2016-01-06 05:33:23 <petertodd> rusty: if I make a fraudulent block, why would I ever distribute the data necessary to prove it's fraudulent?
309 2016-01-06 05:33:27 <rusty> petertodd: sure, you withhold data.
310 2016-01-06 05:34:00 <petertodd> rusty: and if you design a system where it otherwise works in that scenario, I have strong doubts we can reliably make it work
311 2016-01-06 05:34:31 <petertodd> rusty: e.g., the converse is my client-side validation ideas in treechains, where there's no need for a fraud proof because the system doesn't work at all unless you validate (trivially ripped off)
312 2016-01-06 05:34:33 <rusty> petertodd: in pettycoin I forced you to present hashes of modified previous block's txs (back 1, 2, 4, ...), similar to the suggestions recently.
313 2016-01-06 05:34:43 <aj> petertodd: doesn't probabilistic validation prevent that? ie, each of your 10 peers randomly asks for 10% of the transactions, and considers it a failure and won't forward if you don't have the data
314 2016-01-06 05:35:02 <petertodd> aj: how do you know those peers aren't a sybil attack?
315 2016-01-06 05:35:23 <rusty> aj: naah, miners are racing, so they build on blocks they can't validate.
316 2016-01-06 05:35:24 <petertodd> aj: probabilistic verification that's consensus enforced is plausible, but not the way you described it
317 2016-01-06 05:35:25 <aj> petertodd: this assumes i'm the attacker
318 2016-01-06 05:35:50 <phantomcircuit> rusty, i used to think that widely deployed fraud proofs offered a mechanism for security that was strong enough to make the need for full nodes reduced
319 2016-01-06 05:35:56 <phantomcircuit> rusty, i no longer believe that
320 2016-01-06 05:35:57 <petertodd> rusty: sure, but that's not really related to fraud proofs
321 2016-01-06 05:36:23 <petertodd> aj: if you're the attacker, sybil attack the network with nodes that don't follow that protocol
322 2016-01-06 05:36:24 <rusty> petertodd: it's related to withholding attacks, which are the remaining avenue once you have compact fraud proofs.
323 2016-01-06 05:36:30 <aj> petertodd: that is, i'm a miner producing invalid blocks, and my peers are exchanges trying to cheaply validate my block
324 2016-01-06 05:36:57 <petertodd> aj: now for something that could work, look at linearized coin history, which uses probabilistic techniques to check for fraud, while simultaneously institutionalizing it
325 2016-01-06 05:37:07 <rusty> aj: but you only need 1 flawed tx, and you can answer 90% of the queries...
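A rough worked version of aj's sampling idea and rusty's objection. The model is an assumption made for illustration, not something stated in the discussion: each peer independently requests a uniformly random fraction of the block's transactions, and the miner withholds only the invalid ones. The function name is hypothetical.

```python
def miss_probability(peers: int, sample_fraction: float, bad_txs: int = 1) -> float:
    """Chance that no peer's sample touches any withheld (invalid) transaction.

    Simplified model: every peer hits each bad transaction independently
    with probability `sample_fraction`.
    """
    per_peer_miss = (1.0 - sample_fraction) ** bad_txs
    return per_peer_miss ** peers


# 10 peers, each asking for 10% of the transactions, one invalid transaction.
p = miss_probability(peers=10, sample_fraction=0.10)
print(f"probability the invalid tx goes unsampled: {p:.3f}")
```

Under these assumptions a single invalid transaction escapes all ten samples about a third of the time, which is the gap rusty is pointing at.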
326 2016-01-06 05:37:09 <petertodd> rusty: wait, what type of withholding attack?
327 2016-01-06 05:37:31 <rusty> petertodd: information withholding, where miner refuses to supply validation information for a block.
328 2016-01-06 05:38:14 <petertodd> rusty: right, which screws over other *honest* miners, but you still don't have a situation where users can reliably find fraud
329 2016-01-06 05:38:28 <petertodd> rusty: (without validating everything)
330 2016-01-06 05:39:08 <petertodd> rusty: fwiw, I strongly suspect that absent zkSNARKS you *must* have some amount of inflation to make the system scale with regard to validation costs
331 2016-01-06 05:39:36 <petertodd> rusty: (or put another way, with zkSNARKS validation is so easy that the inflation rate necessary to paper over fraud is zero)
332 2016-01-06 05:40:07 <rusty> petertodd: that's why you force miners to prove they knew something about prev block. But that itself becomes hard to disprove; you are back to relying on miners not trusting blocks they can't validate, which has proven a flawed assumption recently in bitcoin.
333 2016-01-06 05:40:16 <rusty> petertodd: that's quite possibly true.
334 2016-01-06 05:41:13 <petertodd> rusty: yeah, from the point of the user, those kinds of proofs are hard to check in a useful way - my segwit prev-block-proof proposal is something I know is pretty weak for instance
335 2016-01-06 05:41:55 <petertodd> rusty: equally, so what if the miners had the data? still doesn't prove all the data is valid if a majority of miners decide to change the protocol
336 2016-01-06 05:43:54 <rusty> petertodd: indeed. gmaxwell proposed a query-response scheme using fountain codes so you couldn't answer more than a very limited number of queries without revealing everything statistically, but I couldn't make sure it was robust against deliberate deceit...
337 2016-01-06 05:45:46 <petertodd> rusty: yup, that's my experience playing with naive non-interactive proofs as well
338 2016-01-06 05:45:58 <rusty> petertodd: yes, I couldn't improve on your prev-block-proof scheme, either. pettycoin uses a single byte per prev-block (top of SHA256 of <this-block-output> <prevtxs...>).
339 2016-01-06 05:47:27 <petertodd> rusty: heh, well, I guess that's a good sign :/
340 2016-01-06 05:47:27 <rusty> You still end up vulnerable to some miner helpfully supplying the values to you, to optimize your block generation...
341 2016-01-06 05:49:06 <petertodd> rusty: oh I know - it just keeps us at the current status quo, no better
342 2016-01-06 05:49:36 <petertodd> rusty: I'm certainly not claiming that solution stops validationless mining, I just want something simple that will prevent the worst of it
343 2016-01-06 05:50:16 <rusty> petertodd: certainly it increases the work required to do it, which probably won't hurt.
344 2016-01-06 05:51:09 <petertodd> rusty: I'm mostly worried about lazy miners taking shortcuts, at least in the near term. Easy to get some pretty big reorgs that way
345 2016-01-06 05:51:34 <rusty> petertodd: agreed.
346 2016-01-06 06:11:47 <jl2012> I'm lost in the discussion. Could that be a tree of H(current block coinbase outputs|previous block tx)? 1 previous block tx as 1 leaf
347 2016-01-06 06:14:15 <phantomcircuit> jl2012, which one?
348 2016-01-06 06:14:34 <jl2012> the previous block proof
349 2016-01-06 06:48:25 <jtoomim> trying to sleep, failing...
350 2016-01-06 06:48:42 <jtoomim> IIRC, an SSD and an HDD both read in 4 kiB chunks
351 2016-01-06 06:49:00 <jtoomim> so reading 1 byte is as expensive as reading 4 kiB, as long as the 4 kiB is aligned properly
352 2016-01-06 06:50:11 <jtoomim> if we want to be able to avoid more than 1 read, and if we had the utxo entries stored in an ordered fashion...
353 2016-01-06 06:50:36 <jtoomim> then we can figure out which block contains any given utxo knowing only the first utxo in that block
354 2016-01-06 06:51:18 <jtoomim> for 1 GiB of utxo / 4096 bytes per block, that means 244140 blocks, assuming they're all 100% full
355 2016-01-06 06:51:58 <jtoomim> that's something like 8 MB of ram, right?
356 2016-01-06 06:52:08 <jtoomim> i'm probably missing something, though.
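A quick check of the arithmetic above, as a sketch. The 32-byte index entry is an assumption (roughly a txid-sized key per page); the log does not say what jtoomim's "8 MB" is based on. Note that 244140 corresponds to 10^9 bytes / 4096, while a full GiB (2^30 bytes) gives 262144 pages.

```python
PAGE_SIZE = 4096  # bytes per disk page, as in the discussion
KEY_SIZE = 32     # ASSUMED bytes of RAM per page entry (one txid-sized key)


def index_cost(utxo_bytes: int, page_size: int = PAGE_SIZE, key_size: int = KEY_SIZE):
    """Return (number of pages, bytes of RAM for one key per page)."""
    pages = -(-utxo_bytes // page_size)  # ceiling division
    return pages, pages * key_size


for label, size in (("1 GiB", 2**30), ("1 GB", 10**9)):
    pages, ram = index_cost(size)
    print(f"{label}: {pages} pages, {ram / 2**20:.2f} MiB of index")
```

With these assumptions the table costs KEY_SIZE / PAGE_SIZE of the set size, a little under 1%.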
357 2016-01-06 08:41:49 <bedeho> does nTweak vary from session to session in the bloom filter?
358 2016-01-06 10:01:27 <bedeho> is the inv response to a mempool message filtered by the bloom filter?
359 2016-01-06 10:09:23 <jl2012> I'm explaining Byzantine Generals Problem to someone who has no computer science background at all. She asks whether generals with faster communication channel have any advantage. So smart. It's the whole problem for scaling bitcoin
360 2016-01-06 14:51:30 <gavinandresen> where is the latest versionbits BIP? Is it https://gist.github.com/sipa/bf69659f43e763540550 (created a year ago) ??
361 2016-01-06 14:51:38 <gavinandresen> ... draft bip ...
362 2016-01-06 15:00:15 <sipa> https://github.com/bitcoin/bips/blob/master/bip-0009.mediawiki
363 2016-01-06 15:28:05 <xabbix__> Are txids available for query (via decoderawtransaction) only when they are accepted into the mempool?
364 2016-01-06 15:28:41 <sipa> decoderawtransaction takes a raw transaction and decodes it
365 2016-01-06 15:28:46 <sipa> it doesn't need anything
366 2016-01-06 15:29:07 <sipa> getrawtransaction works only for mempool txn, and for blockchain txn if -txindex is enabled
367 2016-01-06 15:29:15 <sipa> gettransaction works on wallet txn
368 2016-01-06 15:29:23 <xabbix__> I'm running getrawtransaction and decoderawtransaction on txids I see in my debug.log (under 'got inv: tx...'), sometimes I get the info and sometimes I get No information available about transaction (code -5)
369 2016-01-06 15:29:28 <xabbix__> even though I'm running with txindex=1
370 2016-01-06 15:30:04 <sipa> what bitcoin core version?
371 2016-01-06 15:30:32 <xabbix__> v0.11.2.0-g7e27892
372 2016-01-06 15:30:40 <xabbix__> but as you say, getrawtransaction only works on mempool txs
373 2016-01-06 15:30:48 <xabbix__> So maybe that's the issue
374 2016-01-06 15:31:49 <sipa> if you have -txindex, it works on blockchain txn as well
375 2016-01-06 15:31:55 <sipa> except the genesis block
376 2016-01-06 15:32:22 <xabbix__> sipa, as soon as I see 'got inv: tx <txid>' I should be able to run getrawtransaction and decoderawtransaction and get the correct results?
377 2016-01-06 15:32:32 <xabbix__> given I'm running with txindex=1
378 2016-01-06 15:32:40 <sipa> no
379 2016-01-06 15:32:54 <sipa> having the inv does not mean you have the transaction
380 2016-01-06 15:33:20 <sipa> it may take a while to request the actual txn
381 2016-01-06 15:33:40 <sipa> and it may be a known invalid/nonconformant transaction, so we don't request it again
382 2016-01-06 15:33:40 <xabbix__> Oh, I see now. So got inv, then askfor tx and then requesting tx
383 2016-01-06 15:34:21 <xabbix__> When is it safe to assume the tx is valid and I have it to query on it? When I see the 'AcceptToMemoryPool'?
384 2016-01-06 15:34:38 <sipa> when it's relayed :)
385 2016-01-06 15:34:46 <sipa> there is already a protocol for that :p
386 2016-01-06 15:35:00 <xabbix__> :)
387 2016-01-06 15:37:30 <kefkius> Also btw instead of getrawtransaction and decoderawtransaction you can just do 'getrawtransaction <txid> 1' and it will decode it for you
388 2016-01-06 15:38:55 <xabbix__> kefkius: thanks, didn't know that!
389 2016-01-06 15:39:02 <kefkius> np :)
390 2016-01-06 16:19:35 <morcos> jtoomim: sorry i couldn't make it through all the back log, but it seems fairly obvious to me that the right thing to do is keep the whole utxo in memory (around 6GB?) but just not too infrequently also flush it to disk. or really maybe you don't even care about doing that. if that node crashes, switch to another with the utxo updated.
391 2016-01-06 16:20:00 <morcos> why would you ever be looking up the utxo on disk if you're mining
392 2016-01-06 16:20:59 <phantomcircuit> morcos, because you dont have umpteen GB of ram? :P
393 2016-01-06 16:26:37 <jtoomim> morcos, i'm not necessarily thinking just about miners right now
394 2016-01-06 16:27:02 <jtoomim> i'm just thinking that it should be fairly inexpensive to optimize the worst-case to be effectively O(1)
395 2016-01-06 16:27:10 <jtoomim> and that sounds desirable
396 2016-01-06 16:27:39 <jtoomim> the conversation got derailed a bit, so most of the conversation history is not worth reading
397 2016-01-06 16:27:54 <morcos> phantomcircuit: i agree with many of jtoomin's points that i think some of you guys have got stuck in your heads the old view of mining operations. even a very small miner these days, the cost of a full node and good hardware is minimal
398 2016-01-06 16:28:28 <jtoomim> it's jtoomim, not jtoomin, by the way
399 2016-01-06 16:28:35 <jtoomim> (jtoomin won't tag me)
400 2016-01-06 16:29:29 <jtoomim> i've written up a summary of my DB idea
401 2016-01-06 16:29:29 <morcos> jtoomim: whoops, sorry. i think there is a lot that can be done to minimize the number of disk reads still by using our in memory utxo cache even if we don't try to cache the whole utxo
402 2016-01-06 16:29:33 <jtoomim> if you want i can email it to you
403 2016-01-06 16:29:47 <jtoomim> i'd rather hold off a little before posting it to the list in case there's anything i need to tweak
404 2016-01-06 16:29:57 <jtoomim> morcos agreed
405 2016-01-06 16:30:46 <morcos> we eliminated a lot of unnecessary reads for 0.12, but there is still one left even if everything is in cache (will be merged for 0.13 hopefully) but on top of that there is talk of switching the cache structure to be individual utxo based and not tx based, and combining that with smarter logic on which utxos to keep fresh.
406 2016-01-06 16:31:44 <morcos> right now when we flush the cache to disk, the cache is flushed, that's just plain silly. we should at least keep some stuff there. i experimented with that, and found it surprisingly hard to get much improvement due to deallocation overhead of the utxo storage on the order of 10's of ms.
407 2016-01-06 16:32:34 <morcos> that should have been improved with prevectors, but there is much to go, before trying to worry about optimizing leveldb usage itself i think
408 2016-01-06 16:32:50 <jtoomim> morcos i was thinking of rewriting GBT so that it submitted the template before deallocation, but that would mess up the RPC server system a bit
409 2016-01-06 16:32:56 <jtoomim> unless a clever hack was discovered
410 2016-01-06 16:33:04 <jtoomim> but this utxo DB thing is a very different idea
411 2016-01-06 16:33:18 <jtoomim> it's basically a different DB layout
412 2016-01-06 16:33:52 <jtoomim> where you index each UTXO only to the 4 kiB page it sits in, then read that whole page to RAM, and then search the page for the actual UTXO
413 2016-01-06 16:34:09 <jtoomim> and you can do that using way less RAM than a more precise approach
414 2016-01-06 16:35:15 <jtoomim> you just need to store in RAM the starting key value (which means implicitly the ending key value) for each page
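A minimal sketch of the layout jtoomim describes: keep only the first key of each sorted page in RAM, find the page with a binary search, read that one page, then search inside it. All names here (`PageIndex`, `lookup`, `read_page`, `PAGES`) are hypothetical, and the "disk" is just a Python list; this is not how Bitcoin Core or leveldb is implemented.

```python
import bisect


class PageIndex:
    """In-RAM table holding the first key of each page, in sorted order."""

    def __init__(self, first_keys):
        self.first_keys = list(first_keys)  # must already be sorted

    def page_for(self, key):
        # The last page whose first key is <= key is the only candidate.
        i = bisect.bisect_right(self.first_keys, key) - 1
        return i if i >= 0 else None


def lookup(index, read_page, key):
    """Find `key` with at most one call to read_page (one 'disk access')."""
    page_no = index.page_for(key)
    if page_no is None:
        return None
    entries = read_page(page_no)  # sorted list of (key, value) pairs
    keys = [k for k, _ in entries]
    j = bisect.bisect_left(keys, key)
    if j < len(entries) and entries[j][0] == key:
        return entries[j][1]
    return None


# Toy stand-in for on-disk pages; real pages would be 4 KiB of serialized UTXOs.
PAGES = [
    [(b"\x01", "coin-a"), (b"\x20", "coin-b")],
    [(b"\x80", "coin-c"), (b"\xf0", "coin-d")],
]
INDEX = PageIndex(page[0][0] for page in PAGES)


def read_page(page_no):
    return PAGES[page_no]


print(lookup(INDEX, read_page, b"\xf0"))
print(lookup(INDEX, read_page, b"\x50"))
```

A miss still costs one page read in this sketch, because the index alone cannot tell whether a key inside a page's range exists.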
415 2016-01-06 16:36:17 <jtoomim> excuse me, need to go power cycle a few miners...
416 2016-01-06 16:37:30 <morcos> jtoomim: sounds maybe interesting, but think its putting our optimization efforts at not the lowest hanging fruit
417 2016-01-06 16:37:47 <sipa> jtoomim: that's kindof how leveldb already works (not quite, but it has acceleration/index structures)
418 2016-01-06 16:38:30 <sipa> jtoomim: but you still need to access disk at that point
419 2016-01-06 16:41:53 <jtoomim> sipa yes, a disk access may be unavoidable if you can't cache the UTXOs that you're interested in
420 2016-01-06 16:42:05 <jtoomim> i'm just trying to make the worst case better by guaranteeing only a single access
421 2016-01-06 16:42:09 <jtoomim> at worst
422 2016-01-06 16:42:18 <jtoomim> and zero accesses when caching works
423 2016-01-06 16:42:39 <jtoomim> and being more efficient with the RAM usage for the acceleration/index structures that leveldb uses
424 2016-01-06 16:43:16 <jtoomim> leveldb is probably designed to be functional across a broad range of entry sizes, and utxos are pretty small, so i think we can make some tradeoffs more intelligently than leveldb
425 2016-01-06 16:43:50 <jtoomim> obviously, caching all the UTXOs in RAM is going to be the fastest. I'm not disagreeing with that. that is preferable for any miner.
426 2016-01-06 16:44:50 <jtoomim> but for non-miners (e.g. IBD) and for miners who don't know that they should change the -dbcache setting, improving performance by 2x or more could be possible
427 2016-01-06 16:45:55 <jtoomim> morcos yes, maybe. I'm not looking at implementing this right now. it's just an early thought i had. i don't really understand the current db system well enough to be a good judge.
428 2016-01-06 16:49:02 <jtoomim> it would be nice if the project had a benchmark suite so we could easily and reliably compare the effects of different hardware and different algorithms...
429 2016-01-06 17:17:15 <maaku> Utxos are not necessarily small -- they are stored per tx not per output
430 2016-01-06 17:36:06 <jtoomim> maaku: noted, thanks.
431 2016-01-06 18:29:22 <gavinandresen> jtoomim: there's a benchmark framework in place, see src/bench/
432 2016-01-06 18:30:50 <jtoomim> added 3 months ago. nice.
433 2016-01-06 18:31:03 <jtoomim> looks pretty sparse though
434 2016-01-06 18:33:41 <dgenr8> jtoomim: are you thinking of a utxo lookup scenario other than a full node keeping up with validation?
435 2016-01-06 18:34:20 <jtoomim> i try to think of all of the scenarios
436 2016-01-06 18:35:01 <jtoomim> e.g. a home p2pool miner who doesn't pay much attention to optimization
437 2016-01-06 18:35:21 <jtoomim> a medium-small-scale miner in a future with 50 GiB UTXO set who only has 32 GB of ram
438 2016-01-06 18:35:30 <jtoomim> a full node keeping up with validation
439 2016-01-06 18:35:35 <jtoomim> a full node doing IBD
440 2016-01-06 18:35:48 <jtoomim> each scenario has different needs
441 2016-01-06 18:36:29 <jtoomim> and it seems to me that no scenario would suffer from dedicating about 0.8% of the UTXO's size to a RAM table to make UTXO lookup O(1) and single-access
442 2016-01-06 18:36:35 <jtoomim> and a few of them would benefit
443 2016-01-06 18:36:55 <jtoomim> i'm not sure it's worth the programming effort, but it might be
444 2016-01-06 18:36:59 <jtoomim> dgenr8
445 2016-01-06 18:38:39 <dgenr8> the scenario i like to think about is a SPV client asking the network to validate a utxo
446 2016-01-06 18:39:05 <jtoomim> ok, that's a good one too
447 2016-01-06 18:39:27 <jtoomim> eventually, there will be some limiting ratio of SPV wallets to full nodes based on the performance of that lookup
448 2016-01-06 18:39:47 <jtoomim> (there might be another limiting ratio that's smaller, of course)
449 2016-01-06 18:39:49 <dgenr8> for that, no cache needed. just indexes. and perhaps they could be distributed.
450 2016-01-06 18:41:27 <jtoomim> yeah, fermi est says SPV requests shouldn't be numerically significant
451 2016-01-06 18:42:53 <dgenr8> huh?
452 2016-01-06 18:43:00 <jtoomim> a fermi estimate
453 2016-01-06 18:43:55 <jtoomim> if the average SPV wallet requests a batch of 100 utxos once per day, that's about 1 request every 1000 seconds
454 2016-01-06 18:44:22 <jtoomim> if a full node can do about 1000 disk lookups per second, then one full node should be able to handle around 1 million SPV wallets
455 2016-01-06 18:44:42 <jtoomim> numbers accurate to within one or two orders of magnitude
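The Fermi estimate above, written out. The inputs are jtoomim's round numbers; the function name is hypothetical and the result is only meant to be good to an order of magnitude.

```python
SECONDS_PER_DAY = 86_400


def wallets_per_node(utxos_per_wallet_per_day: float, lookups_per_second: float) -> float:
    """How many SPV wallets one node can serve if each UTXO costs one disk lookup."""
    lookups_per_day = lookups_per_second * SECONDS_PER_DAY
    return lookups_per_day / utxos_per_wallet_per_day


# 100 UTXOs per wallet per day is one lookup every 864 s (rounded to ~1000 s above).
print(SECONDS_PER_DAY / 100)
# 1000 disk lookups per second per full node.
print(wallets_per_node(100, 1000))
```

The exact figure is 864,000 wallets, which rounds to the "around 1 million" quoted in the estimate.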
456 2016-01-06 18:44:48 <phantomcircuit> morcos, it's true that miners today can afford a pretty expensive server since it's cheap relative to cost of hardware
457 2016-01-06 18:45:02 <phantomcircuit> morcos, the problem is that the entry cost is pretty damned high
458 2016-01-06 18:45:06 <jtoomim> dgenr8 so probably not relevant
459 2016-01-06 18:45:28 <phantomcircuit> morcos, ie the ecosystem is super messed up right now, why would we actively make it worse?
460 2016-01-06 18:45:47 <phantomcircuit> either way the issue is more that user of the system must run full nodes
461 2016-01-06 18:45:51 <phantomcircuit> and less about miners really
462 2016-01-06 18:46:15 <phantomcircuit> but that requires a more sophisticated argument about incentives which is why the focus seems to have been on miners
463 2016-01-06 18:47:11 <dgenr8> jtoomim: with cache and no index, full node needs a lot of memory. without a distributed index, it also needs a lot of disk. unless block size is tiny of course.
464 2016-01-06 18:47:44 <jtoomim> dgenr8: distributed index?
465 2016-01-06 18:48:30 <dgenr8> partial nodes collectively route and serve requests. future idea.
466 2016-01-06 18:48:40 <phantomcircuit> <morcos> right now when we flush the cache to disk, the cache is flushed, that's just plain silly. we should at least keep some stuff there. i experimented with that, and found it surprisingly hard to get much improvement due to deallocation overhead of the utxo storage on the order of 10's of ms.
467 2016-01-06 18:48:42 <phantomcircuit> wait what?
468 2016-01-06 18:51:04 <phantomcircuit> <dgenr8> the scenario i like to think about is a SPV client asking the network to validate a utxo
469 2016-01-06 18:51:09 <phantomcircuit> eh? that's totally useless
470 2016-01-06 18:51:21 <phantomcircuit> "hello random internet people please dont lie to me"
471 2016-01-06 18:52:41 <Skender> Hello guys, I want to send some bitcoin packets over sockets but apparently I'm getting the checksum wrong. Is there something special about how bitcoin-qt calculates sha256(sha256(something))? I tried to verify the checksum of packets sent by bitcoin-qt and can't seem to get the right checksum. On the bitcoin developer reference it says that sha256(sha256(<empty string>)) should give 0x5df6e0e2 as the first 4 bytes.
472 2016-01-06 18:53:08 <phantomcircuit> Skender, iirc it's byte swapped
473 2016-01-06 18:53:20 <Skender> kk
474 2016-01-06 18:53:28 <Skender> so the full payload reversed?
475 2016-01-06 18:53:46 <phantomcircuit> Skender, no it's sha256(sha256(payload))
476 2016-01-06 18:53:51 <phantomcircuit> then reverse the result
477 2016-01-06 18:54:00 <dgenr8> phantomcircuit: that view is not surprising if you already think SPV clients are useless
478 2016-01-06 18:54:03 <Skender> okay
479 2016-01-06 18:55:21 <phantomcircuit> dgenr8, nobody has yet implemented an spv client as described in the whitepaper because nobody has implemented fraud proofs, because it turns out they dont work as well as we thought 5 years ago
480 2016-01-06 18:55:41 <phantomcircuit> the things people are using today which they call spv clients are just that clients
481 2016-01-06 18:56:07 <phantomcircuit> they are not network participants and do not provide any incentives alignment against miners arbitrarily changing the rules
482 2016-01-06 18:56:24 <phantomcircuit> like say increasing the subsidy payout in violation of the protocol
483 2016-01-06 18:56:54 <phantomcircuit> (which is exactly what some people have argued for, but in a very round about way that avoids explicitly calling for it)
484 2016-01-06 18:57:48 <Skender> Do I need to reverse the sha256 hash before hashing it again? So sha256("") -> reverse -> sha256 -> reverse? Because if I reverse it after double hashing I still don't get 0x5df6e0e2 :/
485 2016-01-06 18:59:01 <phantomcircuit> i was wrong it's not reversed
486 2016-01-06 18:59:05 <phantomcircuit> i dont know what you're doing but
487 2016-01-06 18:59:18 <phantomcircuit> sha256(sha256("")) == 0x5df6e0e2761359d30a8275058e299fcc0381534545f55cf43e41983f5d4c9456
488 2016-01-06 19:00:20 <Skender> This tool: http://www.xorbin.com/tools/sha256-hash-calculator as well as my local implementation says otherwise :/ I should maybe just use the sha256 implementation of the bitcoin-core on github.
489 2016-01-06 19:02:10 <Skender> How did you calculate the hash?
490 2016-01-06 19:02:13 <phantomcircuit> Skender, it sounds like you're calculating sha256(hex(sha256("")))
491 2016-01-06 19:02:18 <phantomcircuit> what result do you get
492 2016-01-06 19:02:29 <Skender> cd372fb85148700fa88095e3492d3f9f5beb43e555e5ff26d95f5a6adc36f8e6
493 2016-01-06 19:02:59 <dgenr8> phantomcircuit: improved SPV client and protocol is fertile ground
494 2016-01-06 19:03:01 <phantomcircuit> yup
495 2016-01-06 19:03:10 <phantomcircuit> Skender, you're calculating the hash of the hex string the second time
496 2016-01-06 19:03:19 <phantomcircuit> you need to calculate the hash of the binary result
497 2016-01-06 19:03:19 <Skender> okay ty
498 2016-01-06 19:03:52 <Skender> Got the right result :) ty
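A short sketch of the point resolved above: the message checksum is the first 4 bytes of sha256(sha256(payload)), where the second hash runs over the raw 32-byte digest, not its hex string. The function names are illustrative; `wrong_checksum` reproduces the mistake phantomcircuit diagnosed.

```python
import hashlib


def checksum(payload: bytes) -> bytes:
    """First 4 bytes of double SHA-256 over the raw payload bytes."""
    first = hashlib.sha256(payload).digest()  # 32 raw bytes
    return hashlib.sha256(first).digest()[:4]


def wrong_checksum(payload: bytes) -> bytes:
    """The bug: hashing the 64-character hex text of the first digest."""
    first_hex = hashlib.sha256(payload).hexdigest().encode("ascii")
    return hashlib.sha256(first_hex).digest()[:4]


print(checksum(b"").hex())        # compare with the 0x5df6e0e2 quoted above
print(wrong_checksum(b"").hex())  # a different value
```

No byte reversal is applied at any step, matching phantomcircuit's correction.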
499 2016-01-06 19:03:56 <phantomcircuit> dgenr8, yes and i wish... someone else all the luck in the world on trying
500 2016-01-06 23:16:33 <lorenzoasr> hello
501 2016-01-06 23:19:13 <lorenzoasr> which is the minimum overall transaction amount that is not considered as "dust" ?
502 2016-01-06 23:19:36 <sipa> depends on your relay fee, which is configurable
503 2016-01-06 23:20:20 <sipa> in recent versions of Bitcoin Core, with the exception of 0.11.2 (due to a temporary mempool bloating fix), the relay fee is by default 1 satoshi per byte
504 2016-01-06 23:20:44 <sipa> which corresponds to a few hundred satoshi as dust limit
505 2016-01-06 23:21:02 <sipa> as it would cost more to redeem the coin than leaving it
506 2016-01-06 23:28:36 <lorenzoasr> thank you sipa, so if I send a tx of 1 satoshi with 1 input and 1 output, the fee should be at least 200 satoshis as the overall weight will be around 200 bytes, is this correct?
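A sketch of how a relay fee of 1 satoshi per byte turns into "a few hundred satoshi" of dust limit. It rests on assumptions not stated in the log: that the rule is "an output is dust if its value is below 3 times the relay fee for the output's own size plus about 148 bytes for the input that later spends it", and that a typical pay-to-pubkey-hash output serializes to 34 bytes. The limit applies to each output's value and is separate from the transaction fee.

```python
SPEND_INPUT_BYTES = 148  # ASSUMED size of the input needed to spend the output
P2PKH_OUTPUT_BYTES = 34  # ASSUMED: 8-byte value + 1-byte length + 25-byte script


def dust_threshold(output_bytes: int, relay_fee_per_kb: int) -> int:
    """Smallest non-dust output value in satoshi, under the assumed 3x rule."""
    total_bytes = output_bytes + SPEND_INPUT_BYTES
    fee_to_create_and_spend = total_bytes * relay_fee_per_kb // 1000
    return 3 * fee_to_create_and_spend


# 1 satoshi per byte is 1000 satoshi per kB.
print(dust_threshold(P2PKH_OUTPUT_BYTES, relay_fee_per_kb=1000))
```

Under these assumptions a 1-satoshi output would fall below the limit no matter how large a fee is attached.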
507 2016-01-06 23:57:41 <stevenroose> does solo mining with bitcoind support longpolling?