1 2013-07-27 12:38:40 <bmcgee> hey guys, am I right in thinking that re-indexing blocks on disk takes just as long as downloading the block chain initially because the process is cpu bound and not network bound?
  2 2013-07-27 12:40:04 <Luke-Jr> more or less
  3 2013-07-27 12:43:28 <sipa> bmcgee: depends, for the early blocks it's likely network-bound
  4 2013-07-27 12:43:31 <bmcgee> yeah, figured. Sucks...
  5 2013-07-27 12:43:53 <bmcgee> my little mac mini has been going for a day now. about a 3rd of the way through :(
  6 2013-07-27 12:43:54 <sipa> or disk bound
  7 2013-07-27 12:43:59 <sipa> wut?
  8 2013-07-27 12:44:03 <bmcgee> yup
  9 2013-07-27 12:44:18 <sipa> Luke-Jr: you also reported a really slow reindex, no?
 10 2013-07-27 12:44:22 <Luke-Jr> yes
 11 2013-07-27 12:44:27 <Scrat> bad peer luck?
 12 2013-07-27 12:44:28 <bmcgee> it's not the most powerful machine on the planet but it aint slow either
 13 2013-07-27 12:44:45 <sipa> Scrat: for reindexing, your peers don't matter
 14 2013-07-27 12:45:01 <bmcgee> i would've thought re-indexing would be relatively fast...
 15 2013-07-27 12:45:23 <sipa> the hard part is maintaining the UTXO set
 16 2013-07-27 12:45:31 <sipa> you need to do that whether you validate or not, even
 17 2013-07-27 12:46:50 <bmcgee> Just checked, my mini is running with 2.26 Ghz Intel Core 2 Duo and 2Gb RAM. Fair enough the RAM is a bit low
 18 2013-07-27 12:47:23 <sipa> unless you set -dbcache high, amount of memory doesn't really matter
 19 2013-07-27 12:47:30 <sipa> that speeds up reindexing a lot though
 20 2013-07-27 12:47:33 <sipa> (for me at least)
 21 2013-07-27 12:49:38 <bmcgee> so i just stick that into bitcoin.conf
 22 2013-07-27 12:51:46 <gmaxwell> bmcgee: if you're network lucky then you're cpu bound.
 23 2013-07-27 12:52:13 <gmaxwell> though that seems to be uncommon now unless you sync from local nodes.
 24 2013-07-27 12:52:20 <bmcgee> what do you guys mean by lucky in terms of peers?
 25 2013-07-27 12:52:35 <gmaxwell> meaning you don't get a slow and far away one as your chosen fetch point.
 26 2013-07-27 12:53:38 <gmaxwell> Reindex takes me a couple hours (haven't timed it recently, but I think not more than three) but my recently timed network sync are 10 hours, 14 hours, and >18 hours (the last I aborted and switched to pull from a local node)
 27 2013-07-27 12:53:53 <bmcgee> ah k
 28 2013-07-27 12:54:01 <jgarzik> Hey, cool.  a 'lockunspent' user.  Glad to see people using that.
 29 2013-07-27 12:55:33 <bmcgee> i've added dbcache=high, lets see if it makes a dent
 30 2013-07-27 12:56:37 <sipa> bmcgee: by 'high' i mean a large number
 31 2013-07-27 12:56:45 <bmcgee> ah lol
 32 2013-07-27 12:56:46 <sipa> it's an integer representing a number in megabytes
 33 2013-07-27 12:57:10 <bmcgee> i'll try giving 3/4 of the total RAM
 34 2013-07-27 12:57:19 <sipa> what OS?
 35 2013-07-27 12:57:47 <bmcgee> Mac
 36 2013-07-27 12:57:55 <bmcgee> 2 Gb total on the box
 37 2013-07-27 12:57:55 <sipa> k
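
As sipa points out, -dbcache takes an integer number of megabytes rather than a word like "high". A minimal sketch of working out the value bmcgee proposes (three quarters of a 2 GB machine) follows. The helper name `dbcache_conf_line` is made up for illustration, and whether that fraction is a sensible setting for a given machine is not something the log settles.

```python
def dbcache_conf_line(total_ram_mb: int, fraction: float) -> str:
    """Return a bitcoin.conf line giving dbcache as an integer number of megabytes."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # dbcache is an integer, so truncate any fractional megabytes.
    return f"dbcache={int(total_ram_mb * fraction)}"


# 2 GB box, three quarters of it, as proposed in the log.
print(dbcache_conf_line(2048, 0.75))  # prints dbcache=1536
```
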
 38 2013-07-27 12:58:48 <bmcgee> i've been using it for testing a pool server, doesn't need to be massively powerful. Power cable got pulled out a few days ago, seems to have corrupted the db. Ball ache waiting for it to re-index
 39 2013-07-27 13:01:22 <gmaxwell> mac .. corrupted db. :-/
 40 2013-07-27 13:02:08 <sipa> the "corrupted at unclean shutdown" case worries me less than the "corrupted randomly during sync, and fixed by rebooting"
 41 2013-07-27 13:02:46 <sipa> though neither should happen, clearly
 42 2013-07-27 13:03:38 <gmaxwell> Both preclude deploying pruning. ... chainstate is 247mb ... perhaps we ought to be keeping a backup copy of it to recover from.
 43 2013-07-27 13:04:54 <sipa> that shouldn't be necessary
 44 2013-07-27 13:05:28 <sipa> leveldb's design should make crashes always safe... unless there is random data corruption, in which case keeping a copy won't help
 45 2013-07-27 13:07:07 <gmaxwell> Even after whatever problem is causing these issues is solved it'll be some time before we can tell if the failure rate is one per thousand operating hours or greater than one in one-hundred-thousand operating hours. The latter being what we need for pruning.
 46 2013-07-27 13:07:10 <sipa> every file in leveldb is append-only or even write-once
 47 2013-07-27 13:08:20 <gmaxwell> Doesn't it need to ripple up changes when it rewrites a level to compact it? I guess those are just appended records?
 48 2013-07-27 13:08:57 <sipa> it writes new files
 49 2013-07-27 13:09:00 <sipa> and then deletes the old ones
 50 2013-07-27 13:09:33 <gmaxwell> sipa: I know thats how it updates a level, but then how does it find the new files?
 51 2013-07-27 13:10:11 <gmaxwell> in any case, I think this corruption happens too often to be just an issue during compaction. :(
 52 2013-07-27 13:11:05 <sipa> yeah
 53 2013-07-27 13:11:18 <sipa> but especially the corruption during sync
 54 2013-07-27 13:11:34 <sipa> where even nog a single exit/start cycle is involved
 55 2013-07-27 13:11:40 <sipa> *not
 56 2013-07-27 13:11:45 <phantomcircuit> sipa, crashing ssds tend to cause corruption around sectors recently written to
 57 2013-07-27 13:11:58 <gmaxwell> phantomcircuit: yes, and the design should tolerate that.
 58 2013-07-27 13:12:42 <phantomcircuit> gmaxwell, how would you tolerate that?
 59 2013-07-27 13:12:59 <phantomcircuit> iirc leveldb doesn't checksum anything
 60 2013-07-27 13:13:05 <gmaxwell> It does.
 61 2013-07-27 13:13:09 <phantomcircuit> most databases dont...
 62 2013-07-27 13:13:16 <gmaxwell> So does postgresql.
 63 2013-07-27 13:13:31 <phantomcircuit> gmaxwell, uh postgresql only checksums the wal
 64 2013-07-27 13:13:31 <sipa> gmaxwell, Luke-Jr, bmcgee: just did a reindex until block 225430 (last checkpoint) in 19m12s
 65 2013-07-27 13:13:40 <phantomcircuit> nothing else is checksummed
 66 2013-07-27 13:13:46 <bmcgee> sipa: I hate you so much right now...
 67 2013-07-27 13:13:49 <bmcgee> ;)
 68 2013-07-27 13:13:59 <gmaxwell> In any case, so long as the corruption is contiguous, it should just tear off the last transaction. Leveldb is "all log".
 69 2013-07-27 13:14:48 <gmaxwell> phantomcircuit: yes, thats all that should need to get checksummed in the design.  If recent written areas are blown away the corruption will either be in the wal and detected, or be in the data and get detected on the WAL replay.
 70 2013-07-27 13:15:04 <sipa> gmaxwell: i don't know whether there's any "in case a checksum error is found, roll back" setting
 71 2013-07-27 13:15:12 <sipa> gmaxwell: i think it just checks, and fails if the checksum is wrong
 72 2013-07-27 13:15:13 <gmaxwell> Though if the physical sector and the FS sectors don't agree, I could imagine that scary things would happen.
 73 2013-07-27 13:15:31 <sipa> it's also not that simple that you can always roll back
 74 2013-07-27 13:15:34 <phantomcircuit> gmaxwell, im saying the problem is that you can get random corruption in sectors physically close to the last written sector
 75 2013-07-27 13:15:39 <phantomcircuit> but not necessarily logically close
 76 2013-07-27 13:16:00 <gmaxwell> phantomcircuit: thats busted. :(
 77 2013-07-27 13:16:08 <gmaxwell> "all bets are off"
 78 2013-07-27 13:16:11 <phantomcircuit> gmaxwell, agreed
 79 2013-07-27 13:16:51 <phantomcircuit> gmaxwell, there's a reason why my exchange code has a custom journal which incorporates a number of significant and very strict checksumming mechanisms
 80 2013-07-27 13:17:02 <phantomcircuit> bad stuff happens with disks all the time
 81 2013-07-27 13:17:05 <gmaxwell> The model in postgresql at least can _always_ tolerate recently written sectors being trashed. (well, so long as it doesn't expand past a logging epoch) but not when random ones get trashed too.
 82 2013-07-27 13:17:13 <phantomcircuit> mostly people dont notice because it's not relevant
 83 2013-07-27 13:18:07 <gmaxwell> sipa: hm. thats ... crappy. If it can't handle a corrupted write at the end then it can't be robust even on a well behaved disk.
 84 2013-07-27 13:18:57 <phantomcircuit> gmaxwell, the main thing is to never run with corrupted data
 85 2013-07-27 13:19:24 <phantomcircuit> or at least corrupted data that could lead to incorrect data being displayed about wallet transactions
 86 2013-07-27 13:19:31 <gmaxwell> uh.
 87 2013-07-27 13:19:37 <gmaxwell> this has nothing to do with wallets.
 88 2013-07-27 13:19:38 <phantomcircuit> that's definitely an unsolved problem honestly
 89 2013-07-27 13:19:51 <phantomcircuit> gmaxwell, i know it's the chainstate info
 90 2013-07-27 13:20:11 <phantomcircuit> bdb is probably much worse than leveldb in this regard actually :/
 91 2013-07-27 13:20:34 <gmaxwell> bdb is supposed to handle all this stuff too, and in practice it seems to be pretty robust.
 92 2013-07-27 13:21:14 <gmaxwell> (bdb has an annoying practice of _crashing_ for certain kinds of corruption however, which does not inspire confidence)
 93 2013-07-27 13:21:34 <gmaxwell> (but I've only ever triggered that by fuzzing its files)
 94 2013-07-27 13:21:37 <phantomcircuit> gmaxwell, they specifically handle a lot of bizarre edge cases with various hard drives
 95 2013-07-27 13:21:55 <phantomcircuit> to do that they end up using flush ops prodigiously
 96 2013-07-27 13:22:07 <gmaxwell> yea, I've looked at the BDB code.
 97 2013-07-27 13:22:37 <phantomcircuit> i generally assume hard drives are liars
 98 2013-07-27 13:22:48 <phantomcircuit> which is why i've got a lot of things on zfs now...
 99 2013-07-27 13:24:18 <sipa> gmaxwell: i think the log is replayed up to the part where checksums match
100 2013-07-27 13:24:25 <sipa> gmaxwell: so the log can have a corrupted end
101 2013-07-27 13:25:04 <gmaxwell> sipa: what if that leaves it in the middle of a transaction?
102 2013-07-27 13:25:22 <sipa> gmaxwell: the log is appended to atomically
103 2013-07-27 13:25:45 <phantomcircuit> sipa, that's literally impossible unless each log section is 512 bytes
104 2013-07-27 13:26:01 <sipa> i mean with a checksum for the entire record
105 2013-07-27 13:26:26 <sipa> leveldb does not have real transactions, only atomic batch updates
106 2013-07-27 13:26:44 <gmaxwell> k. that should be fine then.
107 2013-07-27 13:27:12 <sipa> though i'm not convinced anymore now that i'm looking at the code
108 2013-07-27 13:28:20 <phantomcircuit> sipa, looking at the log format... im not sure how that's possible
109 2013-07-27 13:28:35 <gmaxwell> still: our problems are not torn writes, though it's important to know what durability properties we can reasonably expect. If the chainstate is going to get irreparably corrupted on an unfortunate shutdown we do need to know that.
110 2013-07-27 13:36:19 <sipa> ok, when writing records
111 2013-07-27 13:36:33 <sipa> the batch is converted to an internal serialized batch
112 2013-07-27 13:36:45 <sipa> this batch is written to the log in one or more records
113 2013-07-27 13:37:11 <sipa> each record with its own checksum and markers whether it is the (a) first record of a batch (b) last record of a batch
114 2013-07-27 13:37:26 <sipa> when reconstructing, the entire batch is read before applying it
115 2013-07-27 13:37:31 <sipa> so that looks safe to me
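
The scheme sipa describes above (a serialized batch split into records, each with its own checksum and first/last markers, and the whole batch read back before it is applied) can be sketched roughly as follows. This is a toy model under stated assumptions, not LevelDB's actual on-disk format: the header layout, the 8-byte chunk size, and the names `write_batch` and `read_batches` are invented for illustration, `zlib.crc32` stands in for CRC32C, and stopping at the first bad record is a choice made by this sketch.

```python
import struct
import zlib

FIRST, LAST = 1, 2  # flag bits; a single-record batch carries both


def write_batch(log: bytearray, batch: bytes, chunk: int = 8) -> None:
    """Append one serialized batch to the log as one or more checksummed records."""
    pieces = [batch[i:i + chunk] for i in range(0, len(batch), chunk)] or [b""]
    for i, piece in enumerate(pieces):
        flags = (FIRST if i == 0 else 0) | (LAST if i == len(pieces) - 1 else 0)
        body = bytes([flags]) + piece
        # Header: 4-byte CRC over flags+payload, then 2-byte payload length.
        log += struct.pack("<IH", zlib.crc32(body), len(piece)) + body


def read_batches(log: bytes) -> list:
    """Return only complete batches, stopping at the first bad or truncated record."""
    batches, pending, pos = [], None, 0
    while pos + 7 <= len(log):
        crc, length = struct.unpack_from("<IH", log, pos)
        body = log[pos + 6: pos + 7 + length]
        if len(body) != 1 + length or zlib.crc32(body) != crc:
            break  # truncated or corrupted record
        flags, piece = body[0], body[1:]
        if flags & FIRST:
            pending = b""
        if pending is None:
            break  # continuation record with no FIRST before it
        pending += piece
        if flags & LAST:
            batches.append(pending)  # only now is the batch handed over
            pending = None
        pos += 7 + length
    return batches


log = bytearray()
write_batch(log, b"put k1=v1;put k2=v2")  # spans three records
write_batch(log, b"del k1")               # fits in one record
assert read_batches(bytes(log)) == [b"put k1=v1;put k2=v2", b"del k1"]
# A torn write at the end of the log drops only the unfinished batch.
assert read_batches(bytes(log[:-3])) == [b"put k1=v1;put k2=v2"]
```

Because a batch is handed over only once its LAST record has been read, a log that ends partway through a batch yields none of that batch.
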
116 2013-07-27 13:38:22 <phantomcircuit> sipa, is there a counter in the first record of how many more records there should be in the batch
117 2013-07-27 13:38:24 <phantomcircuit> im guessing no
118 2013-07-27 13:38:52 <sipa> hmmm
119 2013-07-27 13:39:11 <phantomcircuit> sipa, it's not common, but writes disappearing is a failure mode
120 2013-07-27 13:39:12 <sipa> we should start asking people for the chainstate/LOG files when they report corruption
121 2013-07-27 13:39:28 <sipa> that would mention what types of repairs have been performed
122 2013-07-27 13:39:34 <phantomcircuit> especially with ssds that have buggy firmware
123 2013-07-27 13:40:01 <sipa> huh
124 2013-07-27 13:40:14 <phantomcircuit> sipa, lol yeah writes just
125 2013-07-27 13:40:16 <phantomcircuit> disappear
126 2013-07-27 13:40:19 <phantomcircuit> *poof*
127 2013-07-27 13:40:22 <phantomcircuit> it's bizarre
128 2013-07-27 13:41:21 <sipa> when a log file record couldn't be applied, it is skipped
129 2013-07-27 13:41:26 <sipa> but the recovery is not aborted
130 2013-07-27 13:41:31 <gmaxwell> eek
131 2013-07-27 13:41:38 <phantomcircuit> sipa, ahah oh boy
132 2013-07-27 13:41:40 <phantomcircuit> that's not good
133 2013-07-27 13:41:50 <sipa> whether that is safe depends on how the checksum is calculated
134 2013-07-27 13:41:59 <sipa> if it is cumulative, further records should also fail
135 2013-07-27 13:42:01 <gmaxwell> yea, if the checksum for all the rest will fail then thats fine.
136 2013-07-27 13:42:14 <gmaxwell> easy enough to test I guess.
137 2013-07-27 13:42:42 <phantomcircuit> sipa, im pretty sure the checksum is per record
138 2013-07-27 13:43:03 <phantomcircuit> actually
139 2013-07-27 13:43:05 <phantomcircuit> scratch that
140 2013-07-27 13:43:09 <phantomcircuit> it's cumulative
141 2013-07-27 13:43:14 <phantomcircuit> uint32_t crc = crc32c::Extend(type_crc_[t], ptr, n);
142 2013-07-27 13:43:25 <sipa> it's only 32 bits though :)
143 2013-07-27 13:43:26 <phantomcircuit> sort of
144 2013-07-27 13:43:32 <phantomcircuit> it's cumulative by record type
145 2013-07-27 13:45:10 <phantomcircuit> sipa, so, corrupt middle record, followed by uncorrupted last record
146 2013-07-27 13:45:17 <phantomcircuit> you could end up applying a partial batch
147 2013-07-27 13:45:20 <phantomcircuit> (i think)
148 2013-07-27 13:47:31 <sipa> uh-oh
149 2013-07-27 13:47:40 <sipa> type_crc_ are just precomputed crc's
150 2013-07-27 13:47:44 <sipa> they are not updated afaics
151 2013-07-27 13:47:53 <sipa> so the checksums are not cumulative
152 2013-07-27 13:48:26 <sipa> this sounds very bad...
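
sipa's reading of the quoted line, that `type_crc_` only holds precomputed seeds and is never updated, can be illustrated with a small sketch. It assumes Python's `zlib.crc32` (plain CRC-32, not the CRC32C that `crc32c::Extend` computes) as a stand-in, and `TYPE_CRC` and `record_crc` are hypothetical names. Seeding the checksum with the type's CRC is equivalent to checksumming the type byte followed by the payload, and no state carries over from one record to the next.

```python
import zlib

# Precompute the CRC of each record-type byte once; these seeds are never modified.
# The number of types (5) is an arbitrary choice for this sketch.
TYPE_CRC = [zlib.crc32(bytes([t])) for t in range(5)]


def record_crc(record_type: int, payload: bytes) -> int:
    """Extend the precomputed type CRC over the payload of a single record."""
    return zlib.crc32(payload, TYPE_CRC[record_type])


first = record_crc(1, b"first payload")
second = record_crc(1, b"second payload")

# Seeding with the type CRC equals checksumming type byte + payload in one go.
assert first == zlib.crc32(bytes([1]) + b"first payload")
# Per record, not cumulative: the second checksum does not depend on the first record.
assert second == zlib.crc32(bytes([1]) + b"second payload")
```
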
153 2013-07-27 13:48:30 <phantomcircuit> lol...
154 2013-07-27 13:48:44 <CodeShark> are you talking about leveldb?
155 2013-07-27 13:48:45 <phantomcircuit> ACTION hugs his 0.7.3
156 2013-07-27 13:48:49 <sipa> CodeShark: yes
157 2013-07-27 13:49:18 <sipa> still, this does not explain the mid-sync corruptions
158 2013-07-27 13:49:33 <sipa> as log recovery is only done at startup
159 2013-07-27 13:49:34 <gmaxwell> yea, so thats not good but doesn't explain our actual problems.
160 2013-07-27 13:49:53 <gmaxwell> it also sounds easily fixed. E.g. adding a counter to the last record.
161 2013-07-27 13:50:08 <phantomcircuit> well the format here is simple enough that if people provided their db it could be analyzed
162 2013-07-27 13:50:12 <sipa> or just stopping as soon as a checksum mismatch is detected
163 2013-07-27 13:50:30 <sipa> instead of trying to continue with further records
164 2013-07-27 13:50:32 <phantomcircuit> sipa, checksum & sequence id
165 2013-07-27 13:50:40 <phantomcircuit> they protect against different failure modes
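
The failure modes discussed above can be played through on a toy log. This is a sketch of the scenario as described in the conversation, not a model of LevelDB's real recovery code: the `Record` class, `make_log`, and `recover` are invented for illustration, and `zlib.crc32` stands in for CRC32C. It shows a checksum-only reader that skips bad records applying a partial batch, stopping at the first mismatch avoiding that, and a sequence id catching a write that vanished entirely, which per-record checksums cannot see.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Record:
    seq: int        # position in the log, assigned by the writer
    first: bool     # first record of a batch
    last: bool      # last record of a batch
    payload: bytes
    crc: int        # per-record checksum of the payload


def make_log(batches):
    """Build a log from batches, each given as a list of payload pieces."""
    log, seq = [], 0
    for pieces in batches:
        for i, piece in enumerate(pieces):
            log.append(Record(seq, i == 0, i == len(pieces) - 1, piece, zlib.crc32(piece)))
            seq += 1
    return log


def recover(log, skip_bad: bool, check_seq: bool):
    """Replay the log and return the batches that would be applied."""
    applied, pending, expected = [], None, 0
    for r in log:
        ok = zlib.crc32(r.payload) == r.crc
        if check_seq and r.seq != expected:
            ok = False  # a record went missing before this one
        expected = r.seq + 1
        if not ok:
            if skip_bad:
                continue  # skip the record but keep replaying
            break         # stop at the first problem
        if r.first:
            pending = b""
        if pending is not None:
            pending += r.payload
            if r.last:
                applied.append(pending)
                pending = None
    return applied


batches = [[b"a1"], [b"b1", b"b2", b"b3"], [b"c1"]]

# Scenario 1: the middle record of batch b is corrupted on disk.
corrupt = make_log(batches)
corrupt[2].payload = b"XX"
assert recover(corrupt, skip_bad=True, check_seq=False) == [b"a1", b"b1b3", b"c1"]
assert recover(corrupt, skip_bad=False, check_seq=False) == [b"a1"]

# Scenario 2: the middle record of batch b was never written at all.
lost = make_log(batches)
del lost[2]
assert recover(lost, skip_bad=False, check_seq=False) == [b"a1", b"b1b3", b"c1"]
assert recover(lost, skip_bad=False, check_seq=True) == [b"a1"]
```

In the second scenario every surviving record has a valid checksum, so only the gap in sequence ids reveals the loss, which is the sense in which the two checks cover different failure modes.
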
166 2013-07-27 14:08:29 <bmcgee> sipa: that 19m you quoted, was that a re-index or did it include downloading from other peers?
167 2013-07-27 14:08:45 <sipa> bmcgee: reindex
168 2013-07-27 14:08:52 <sipa> bmcgee: and only until block 225430
169 2013-07-27 14:11:07 <sipa> it's around an hour for a full reindex here
170 2013-07-27 14:11:19 <sipa> but the latter part is cpu-bound, and this is on a fast cpu
171 2013-07-27 14:11:38 <bmcgee> i'm gonna try it on my macbook pro. i7 with 16gb ram...
172 2013-07-27 14:12:00 <sipa> (hexacore, 12 threads, 3.2 GHz xeon)
173 2013-07-27 14:13:42 <bmcgee> sipa: again, I hate you so much right now...
174 2013-07-27 16:14:17 <bmcgee> is it possible to install bitcoind on a mac? not bitcoin-qt
175 2013-07-27 16:14:47 <bmcgee> bitcoin-testnet-box requires the command line daemon
176 2013-07-27 16:16:44 <bmcgee> ah wait a min, being stupid. I'll just spin up vagrant
177 2013-07-27 16:19:01 <Diablo-D3> bmcgee: it works fine on osx
178 2013-07-27 16:19:14 <Diablo-D3> its the same code, just doesnt include the qt ui
179 2013-07-27 16:19:21 <bmcgee> Diablo-D3: how did you install it?
180 2013-07-27 16:19:29 <Diablo-D3> I havent
181 2013-07-27 16:19:34 <Diablo-D3> I dont use bitcoin on osx
182 2013-07-27 16:19:50 <bmcgee> ah well I'm happy enough to spin up an ubuntu vm and just run it from there
183 2013-07-27 16:20:05 <bmcgee> better in some ways anyway, less clutter on my machine
184 2013-07-27 16:24:03 <sipa> bmcgee: if you build it yourself, you'll have it :)
185 2013-07-27 16:24:16 <bmcgee> sipa: meh ;)
186 2013-07-27 18:20:39 <CheckDavid> hey guys, just a quick question
187 2013-07-27 18:20:53 <CheckDavid> Is it possible to make a decentralized system for chat?
188 2013-07-27 18:21:04 <CheckDavid> A bit like IRC, but decentralized like bitcoin?
189 2013-07-27 18:21:14 <Luke-Jr> CheckDavid: probably; some people have been discussing doing that
190 2013-07-27 18:22:48 <CheckDavid> Anything peculiar about it
191 2013-07-27 18:22:55 <CheckDavid> As in limitations or features?
192 2013-07-27 18:23:11 <Luke-Jr> as I understand it, the difficult part is moderation
193 2013-07-27 18:23:17 <Luke-Jr> and antispam
194 2013-07-27 18:23:42 <CheckDavid> Couldn't the network assign moderator roles to certain users?
195 2013-07-27 18:24:02 <Luke-Jr> then it's no longer decentralized
196 2013-07-27 18:24:08 <bmcgee> I was just gonna say that
197 2013-07-27 18:24:23 <MoALTz> why not just commit to the blockchain the hash of each message sent? (non-refutation)
198 2013-07-27 18:24:26 <CheckDavid> I don't see how it would not be decentralized.
199 2013-07-27 18:24:45 <CheckDavid> The moderation would be limited to a certain room or channel, considering IRC as an analogy.
200 2013-07-27 18:24:49 <Luke-Jr> MoALTz: ??? srsly?
201 2013-07-27 18:25:34 <MoALTz> Luke-Jr: email takes on this role in organisations (irrefutable that your boss ordered you do X)
202 2013-07-27 18:25:53 <Luke-Jr> MoALTz: and email has major spam problems
203 2013-07-27 18:26:05 <Scrat> btc backed fidelity bonds required to join, voting system for kicks/bans
204 2013-07-27 18:26:46 <Scrat> plus a decentralized WoT
205 2013-07-27 18:26:53 <Scrat> ok it's madness :/
206 2013-07-27 18:27:00 <bmcgee> lol
207 2013-07-27 18:27:02 <Luke-Jr> socks will vote to ban you
208 2013-07-27 18:28:59 <Scrat> Luke-Jr: voting will be weighed by WoT score
209 2013-07-27 18:29:06 <Luke-Jr> Scrat: from whose perspective?
210 2013-07-27 18:29:48 <gmaxwell> Scrat: my 1000 socks have all WoT++ed each other.
211 2013-07-27 18:30:05 <Scrat> a genesis block will have to exist with a few starting trusted organizations
212 2013-07-27 18:30:15 <gmaxwell> then that isn't decentralized.
213 2013-07-27 18:30:16 <Luke-Jr> so we're back to centralized
214 2013-07-27 18:30:22 <Luke-Jr> also, why the heck would we have a blockchain for this?
215 2013-07-27 18:30:36 <Scrat> forgot to add quotes
216 2013-07-27 18:30:42 <gmaxwell> if you want to have "trusted organizations" you can dispense with a lot of messy complexity, but ... not decentralized.
217 2013-07-27 18:30:59 <MoALTz> what exactly do you want your chat system to do?
218 2013-07-27 18:31:13 <Luke-Jr> MoALTz: what IRC does
219 2013-07-27 18:32:46 <MoALTz> you know what? node proposes a block that has messages. other nodes may do PoW on it depending on their judgement via bayesian scoring
220 2013-07-27 18:33:25 <MoALTz> blocks that tend not to contain spam would be more likely to have PoW applied
221 2013-07-27 18:33:35 <gmaxwell> wrong channel for this stuff btw, I'm sorry I responded above, I thought this was #bitcoin.
222 2013-07-27 18:33:39 <MoALTz> however, incentives would need  to be there
223 2013-07-27 18:33:42 <MoALTz> ok
224 2013-07-27 18:34:16 <Scrat> gmaxwell: yeah my bad for starting it, obviously I havent thought this through
225 2013-07-27 18:37:05 <Luke-Jr> if someone makes a channel for this topic in further depth, please let me know
226 2013-07-27 19:08:45 <bmcgee> not strictly on topic, but I'm guessing a fair number of you guys have considerable Make fu
227 2013-07-27 19:08:47 <bmcgee> am i right?
228 2013-07-27 19:09:52 <bmcgee> i want to tweak some of the targets for bitcoin-testnet-box
229 2013-07-27 19:10:03 <bmcgee> but i'm just a lowly Java/Scala developer...
230 2013-07-27 19:29:30 <bmcgee> scratch that, don't need to modify it in the end