{"id":4582,"date":"2013-01-20T19:57:17","date_gmt":"2013-01-21T03:57:17","guid":{"rendered":"http:\/\/www.chesnok.com\/daily\/?p=4582"},"modified":"2013-01-20T20:08:37","modified_gmt":"2013-01-21T04:08:37","slug":"setting-up-hbase-for-socorro","status":"publish","type":"post","link":"https:\/\/www.chesnok.com\/daily\/2013\/01\/20\/setting-up-hbase-for-socorro\/","title":{"rendered":"Setting up HBase for Socorro"},"content":{"rendered":"<p>Setting up HBase for use with <a href=\"http:\/\/github.com\/mozilla\/socorro\">Socorro<\/a> is a bit of a <a href=\"https:\/\/i.chzbgr.com\/maxW500\/516933888\/h7E25C981\/\">bear<\/a>!  The default Vagrant config sets up a VM with filesystem-only. For those that want to try out the HBase support, or are on a path toward setting up a production instance, these instructions might help you along the way. <\/p>\n<p>You may also be interested in Lars&#8217; recent blog posts about Socorro: <\/p>\n<ul>\n<li><a href=\"http:\/\/www.twobraids.com\/2012\/12\/socorro-modular-design.html\">Socorro Modular Design<\/a><\/li>\n<li><a href=\"http:\/\/www.twobraids.com\/2012\/12\/the-socorro-crash-storage-system.html\">Socorro Crash Storage System<\/a><\/li>\n<li><a href=\"http:\/\/www.twobraids.com\/2012\/12\/socorro-file-system-storage.html\">Socorro File System Storage<\/a><\/li>\n<li><a href=\"http:\/\/www.twobraids.com\/2013\/01\/more-socorro-file-system-storage.html\">More Socorro File System Storage<\/a><\/li>\n<li><a href=\"http:\/\/www.twobraids.com\/2013\/01\/whats-next-for-socorro-file-system.html\">What&#8217;s next for Socorro File System<\/a><\/li>\n<li><a href=\"http:\/\/www.twobraids.com\/2013\/01\/hbase-as-socorro-crash-storage.html\">HBase as Socorro Crash Storage<\/a><\/li>\n<\/ul>\n<p>Here&#8217;s how I got it all working on an Ubuntu Precise (12.04) system, along with some scripts for launching important processes and putting test crashes into the system so you can tell that it is working. 
Ultimately, my goal is to incorporate all of this into some setup scripts to help new users out.<\/p>\n<h2>Set up HBase and Thrift<\/h2>\n<p>Socorro uses the Thrift API to insert new crashes and retrieve them through the middleware layer. These <a href=\"http:\/\/hbase.apache.org\/book.html#quickstart\">Quickstart instructions<\/a> are pretty helpful for getting HBase installed.<\/p>\n<p>Then, you need to edit <code>\/etc\/hosts<\/code>: remove the &#8216;127.0.1.1&#8217; entry, and add your hostname to the localhost &#8216;127.0.0.1&#8217; line. Also, to work with the default settings, it&#8217;s helpful to add &#8216;<code>crash-stats<\/code>&#8217; and &#8216;<code>crash-reports<\/code>&#8217; as host aliases. Your final config line for localhost would look like: <\/p>\n<pre>\r\n127.0.0.1       localhost wuzetian crash-reports crash-stats\r\n<\/pre>\n<p>(where <code>wuzetian<\/code> is your hostname)<\/p>\n<p>You also need to add configuration for HBase in <code>hbase-site.xml<\/code>. Here&#8217;s an example: <\/p>\n<pre><code>\r\n&lt;?xml version=&quot;1.0&quot;?&gt;\r\n&lt;?xml-stylesheet type=&quot;text\/xsl&quot; href=&quot;configuration.xsl&quot;?&gt;\r\n&lt;configuration&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;hbase.rootdir&lt;\/name&gt;\r\n    &lt;value&gt;file:\/\/\/var\/tmp\/hbase&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n  &lt;property&gt;\r\n    &lt;name&gt;hbase.zookeeper.property.dataDir&lt;\/name&gt;\r\n    &lt;value&gt;\/var\/tmp\/zookeeper&lt;\/value&gt;\r\n  &lt;\/property&gt;\r\n&lt;\/configuration&gt;\r\n<\/code><\/pre>\n<p>That sets the locations for your HBase and ZooKeeper files. 
This setup is for testing, so I put the directories in a location I can easily clear out.<\/p>\n<p>Then, to start HBase and Thrift up: <\/p>\n<pre>\r\n\/etc\/init.d\/hadoop-hbase-master start\r\n\/etc\/init.d\/hadoop-hbase-thrift start\r\n<\/pre>\n<h2>Setting up processor tools<\/h2>\n<p>The processor that looks at raw crashes runs two tools by default: <code>minidump_stackwalk<\/code> and <code>exploitable<\/code>.<\/p>\n<p>You can build these from the Socorro source tree with: <\/p>\n<pre>\r\n<code>make minidump_stackwalk<\/code>\r\n<\/pre>\n<p>Then <code>make install<\/code> should put these files into a useful location.<\/p>\n<p>You can also just copy the binaries: one is in the <code>stackwalk\/bin<\/code> directory, and the other is <code>exploitable\/exploitable<\/code>.<\/p>\n<p>The paths for these are configured in <code>config\/processor.ini<\/code>: <code>exploitability_tool_pathname<\/code> and <code>minidump_stackwalk_pathname<\/code>.<\/p>\n<p>There&#8217;s also a symbols resolver configured, but I am not setting this up in my test.<\/p>\n<h2>Disable LZO compression for HBase (unless you have it configured)<\/h2>\n<p>Our HBase schema is configured to use LZO compression by default. Change that to &#8216;NONE&#8217; and load the schema into HBase: <\/p>\n<pre>\r\n\/bin\/cat \/home\/socorro\/dev\/socorro\/analysis\/hbase_schema | sed 's\/LZO\/NONE\/g' | \/usr\/bin\/hbase shell\r\n<\/pre>\n<h2>Set up crashmover<\/h2>\n<p>Update two lines in <code>scripts\/config\/collectorconfig.py<\/code>:<\/p>\n<pre>\r\nlocalFS.default = '\/home\/socorro\/primaryCrashStore'\r\nfallbackFS.default = '\/home\/socorro\/fallback'\r\n<\/pre>\n<p>Set those to directories where you can store crash dumps.<\/p>\n<h2>Configure processor and monitor to use HBase<\/h2>\n<p>You need to set the processor and monitor up to use HBase instead of local crash storage. <\/p>\n<p>The easiest way to do this is as follows: <\/p>\n<pre>\r\nPYTHONPATH=. 
python socorro\/processor\/processor_app.py --admin.conf=.\/config\/processor.ini --source.crashstorage_class=socorro.external.hbase.crashstorage.HBaseCrashStorage --admin.dump_conf=config\/processor2.ini\r\nPYTHONPATH=. python socorro\/processor\/monitor_app.py --admin.conf=.\/config\/monitor.ini --source.crashstorage_class=socorro.external.hbase.crashstorage.HBaseCrashStorage --admin.dump_conf=config\/monitor2.ini\r\n<\/pre>\n<p>Then edit both generated files, <code>config\/processor2.ini<\/code> and <code>config\/monitor2.ini<\/code>, to reflect your HBase configuration.<\/p>\n<h2>Starting up<\/h2>\n<p>The <a href=\"http:\/\/socorro.readthedocs.org\/en\/latest\/installation.html#run-socorro-in-dev-mode\">docs suggest starting up four daemons in screen sessions.<\/a> I mocked up <a href=\"https:\/\/gist.github.com\/4583487\">a shell script and a screenrc<\/a> to get you started.<\/p>\n<p>And that&#8217;s it! You should now have a working system, with crashes being submitted and stashed into HBase, and the monitor and processor picking up crashes as they arrive and running the stackwalk and exploitable tools against the crashes.<\/p>\n<p>Please let me know if these instructions work, or don&#8217;t work, for you.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Setting up HBase for use with Socorro is a bit of a bear! The default Vagrant config sets up a VM with filesystem-only. 
For those that want to try out the HBase support, or are on a path toward setting &hellip; <a href=\"https:\/\/www.chesnok.com\/daily\/2013\/01\/20\/setting-up-hbase-for-socorro\/\">Continue reading &rarr;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[590],"tags":[],"class_list":["post-4582","post","type-post","status-publish","format-standard","hentry","category-socorro"],"_links":{"self":[{"href":"https:\/\/www.chesnok.com\/daily\/wp-json\/wp\/v2\/posts\/4582","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.chesnok.com\/daily\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.chesnok.com\/daily\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.chesnok.com\/daily\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.chesnok.com\/daily\/wp-json\/wp\/v2\/comments?post=4582"}],"version-history":[{"count":10,"href":"https:\/\/www.chesnok.com\/daily\/wp-json\/wp\/v2\/posts\/4582\/revisions"}],"predecessor-version":[{"id":4633,"href":"https:\/\/www.chesnok.com\/daily\/wp-json\/wp\/v2\/posts\/4582\/revisions\/4633"}],"wp:attachment":[{"href":"https:\/\/www.chesnok.com\/daily\/wp-json\/wp\/v2\/media?parent=4582"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.chesnok.com\/daily\/wp-json\/wp\/v2\/categories?post=4582"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.chesnok.com\/daily\/wp-json\/wp\/v2\/tags?post=4582"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}