Grafana, Telegraf, Smokeping, oh my…

So, I’ve been working on something.  I keep seeing all of these very nice home lab dashboards on /r/homelab and I thought it would be useful to create one for myself.  I present to you, my home dashboard, which is hanging in the kitchen on an old iPad we weren’t using:

Getting to this point was not without challenges.  In fact, it was painful at times.  I’m going to try to document my setup here.  Because of all of the twists and turns along the way, I would say this is not a complete guide.  There are parts of this that you’ll have to figure out for yourself.  It also assumes some knowledge of Linux, Ubuntu in particular.  If I get comments asking about specific sections, I’ll try to update the post with current info.

So, what do we have here?  The picture you see above is made up of a number of components.  InfluxDB is a time-series database, much like RRDTool or the original MRTG.  It’s designed to take in datapoints, tag them with a timestamp, and then move on.  It might be capable of more, but we’re not using it for anything else.  Grafana is the visualization tool that creates what you see above.  Grafana is very configurable, which I’ll dive into more in a bit.  The final piece of the puzzle is data collection.  There are a number of ways to get data into InfluxDB.  I’m using Telegraf and some interesting scripting.
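To make the “datapoint plus timestamp” idea a bit more concrete, here is a minimal Python sketch of the line protocol text format that InfluxDB accepts for writes.  Telegraf does this formatting for you, so you never have to write it by hand.  The measurement, tag and field names below are made up for illustration, and escaping of special characters is left out.

```python
import time


def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Format one datapoint as an InfluxDB line protocol string.

    Simplified sketch: no escaping of spaces or commas, and only
    bool, int, float and str field values are handled.
    """
    if timestamp_ns is None:
        # InfluxDB timestamps default to nanosecond precision.
        timestamp_ns = time.time_ns()

    # Tags are appended to the measurement name, sorted by key.
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))

    def fmt(value):
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, int):
            return f"{value}i"  # integers carry an "i" suffix
        if isinstance(value, float):
            return repr(value)
        return f'"{value}"'  # strings are double-quoted

    field_part = ",".join(f"{k}={fmt(v)}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


# Hypothetical datapoint: idle CPU percentage on a host called "homelab".
point = to_line_protocol(
    "cpu", {"host": "homelab"}, {"usage_idle": 97.5}, 1480000000000000000
)
print(point)
```

Each line is one measurement name, optional tags, one or more fields, and a timestamp, which is all InfluxDB needs for this kind of dashboard.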

Let’s start by getting some links in here.  I’ll update this as I update the post.

This is where it all started for me:

https://lkhill.com/using-influxdb-grafana-to-display-network-statistics/

This was useful for the Grafana configuration:

Setup a wicked Grafana Dashboard to monitor practically anything

InfluxData, which includes InfluxDB and Telegraf:

https://www.influxdata.com/

Grafana for the visualization:

http://grafana.org/

The “SmokePing” stand-in:

https://hveem.no/visualizing-latency-variance-with-grafana

The Unraid tools:

https://lime-technology.com/forum/index.php?topic=52220.msg512346#msg512346

Ok, here we go…

First, I would start with the top link to lkhill’s instructions.  Use that to get up and running with InfluxDB and Grafana installed.  DO NOT follow that guide for the InfluxSNMP install.  Telegraf takes care of SNMP now.  If I recall, InfluxData wants your…data, in order to download InfluxDB.  It’s cool though, because they’ll send you some swanky stickers.  I believe these are still valid instructions for installing Telegraf:  https://docs.influxdata.com/telegraf/v1.1/introduction/installation/

I would suggest getting to this point with InfluxDB, Grafana and Telegraf installed and not throwing errors before you proceed with any configuration.  I know I’m skipping a lot of things that might not work without some tweaking.  Like I said, I’ll update this if I get feedback that these installations need to be detailed.  Add the data source as shown in lkhill’s instructions.

At this point you should have some data being populated for the localhost and the data source should have been available.  I would suggest diverting from lkhill’s instructions at this point.  Instead of adding a graph for SNMP stats (we have none at this point), let’s set up a graph of the local CPU utilization.  Add a new dashboard and then click on the small green square in the upper left.  Click on the “A” select statement and it’ll expand to show you options for finding the data.  Clicking on each of the fields will either give you a drop down list of options, or it might give you an X above the item.  For instance, if you click on mean() you’ll get the x above that.  Click the x to delete mean().  Clicking the + at the end of each row will give you a list of options to add from.  Try to get your selection to look like this:

Click the big X out on the right of the tab bar, past Time range, to close the edit and return to the dashboard.  Congrats, you just made your first dashboard!  Let’s get some useful data in there.

First thing to take care of is to add SNMP.  Go to /etc/telegraf/ and edit telegraf.conf.  If there’s not a conf file, there might be a template called dpkg-dist in there.  If not, you can create a new template.  I found this extremely helpful for working through Telegraf issues:  https://github.com/influxdata/telegraf  You can also go right to the SNMP readme at https://github.com/influxdata/telegraf/tree/master/plugins/inputs/snmp

You can see that Telegraf has quite a few plugins for gathering data.  SNMP is only one part of it.  Some configuration is necessary to start using Telegraf.  Near the top of the file are general settings that must be configured.  Make sure that in the output plugins section (the [[outputs.influxdb]] block) the urls, database and username/password are uncommented and correct.  The database can be called whatever you want, and Grafana can use multiple databases as separate data sources.  Find the “inputs.snmp” section and we’ll begin editing it.  Here’s mine:

# Retrieves SNMP values from remote agents
[[inputs.snmp]]
  agents = [ "192.x.x.x:161" ]
  timeout = "5s"
  version = 3

  max_repetitions = 50

  sec_name = "SNMPv3User"
  auth_protocol = "SHA" # Values: "MD5", "SHA", ""
  auth_password = "topsecret"
  sec_level = "authPriv" # Values: "noAuthNoPriv", "authNoPriv", "authPriv"

  priv_protocol = "AES" # Values: "DES", "AES", ""
  priv_password = "alsotopsecret"

  name = "nutanix"
  [[inputs.snmp.field]]
    name = "host1CPU"
    oid = "1.3.6.1.4.1.41263.9.1.6.1"
  [[inputs.snmp.field]]
    name = "host2CPU"
    oid = "1.3.6.1.4.1.41263.9.1.6.2"
  [[inputs.snmp.field]]
    name = "host3CPU"
    oid = "1.3.6.1.4.1.41263.9.1.6.3"
  [[inputs.snmp.field]]
    name = "ClusterIOPS"
    oid = "1.3.6.1.4.1.41263.506.0"
  [[inputs.snmp.field]]
    name = "Host1MEM"
    oid = "1.3.6.1.4.1.41263.9.1.8.1"
  [[inputs.snmp.field]]
    name = "Host2MEM"
    oid = "1.3.6.1.4.1.41263.9.1.8.2"
  [[inputs.snmp.field]]
    name = "Host3MEM"
    oid = "1.3.6.1.4.1.41263.9.1.8.3"

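[[inputs.snmp]]
  agents = [ "192.x.x.x:161" ]
  timeout = "5s"
  retries = 3
  version = 2
  community = "topsecret"
  max_repetitions = 10

  name = "ERX"
  # Note: in the standard IF-MIB, ...2.2.1.10 is ifInOctets and ...2.2.1.16 is
  # ifOutOctets, so swap the two names below if you want them to match the MIB.
  [[inputs.snmp.field]]
    name = "Bytes.Out"
    oid = "1.3.6.1.2.1.2.2.1.10.2"
  [[inputs.snmp.field]]
    name = "Bytes.In"
    oid = "1.3.6.1.2.1.2.2.1.16.2"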

I’ve edited the IP addresses and security info, so make sure that matches whatever you have set up.  Oh yeah, you have to enable SNMP on your devices!  A couple of key points: you can use different SNMP versions or authentication methods by adding a separate [[inputs.snmp]] block for each one.  I’m also using the full OIDs, but you can see in the template that it’s possible to reference a MIB by name as well.  Save that and exit.  You can test the file with:

telegraf --config telegraf.conf --test

This will give you lines for each device you’ve configured and show you what the response is.  If you don’t see data, something’s wrong with the SNMP config.
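If the test output gets long, a few lines of Python can tell you which of your configured devices actually returned something.  This is only a sketch: it assumes metric lines start with “> ” and that measurement names contain no escaped commas or spaces, and the sample output below is made up rather than copied from my system.

```python
def measurements_seen(test_output):
    """Return the set of measurement names found in `telegraf --test` output.

    Assumes metric lines start with "> " and that measurement names
    contain no escaped commas or spaces.
    """
    seen = set()
    for line in test_output.splitlines():
        line = line.strip()
        if not line.startswith("> "):
            continue  # skip plugin banners and log lines
        metric = line[2:]
        # The measurement name ends at the first comma (tags) or space (fields).
        candidates = (metric.find(","), metric.find(" "), len(metric))
        cut = min(i for i in candidates if i != -1)
        seen.add(metric[:cut])
    return seen


# Hypothetical sample output; real values will differ.
sample = """\
* Plugin: inputs.snmp, Collection 1
> nutanix,agent_host=192.x.x.x,host=monitor host1CPU=12i,host2CPU=9i 1480000000000000000
"""

# The names set with `name = ...` in each [[inputs.snmp]] block above.
expected = {"nutanix", "ERX"}
missing = expected - measurements_seen(sample)
print("missing:", sorted(missing))
```

Anything listed as missing is a device whose [[inputs.snmp]] block is worth a second look.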

Cobra action camera mount

Unfortunately, I haven’t posted about how my Cobra is legal now.  In the meantime, how about this.  I recently got a cheap action camera off of Amazon.  Works pretty well, and the video quality is decent.  However, I needed a way to mount it to the Cobra.  I stumbled across the bracket you see below at a Tractor Supply store.  This is meant to mount one of those obscenely bright LED light bars for trucks.  However, it fits perfectly on the FFR roll bar.  Rock solid and good mounting for the camera mount.  I think it was <$15 for the pair.

Aerohive issues

Just a quick reminder note about something I’ve run into with Aerohive a couple of times.  If you get too anxious and start changing the config and rebooting quickly, the APs will get confused and seem to go into a waiting period.  Things will behave oddly, and you’ll get error messages like “There’s an admin modifying the config”, or something to that effect.  Just be patient, and either wait for or perform a full reboot.  And then be patient.  It seems like these things just need some time to get caught up occasionally.

Also, I ran into a situation where non-Apple devices would connect fine, but all Apple devices would either say “Unable to join” or “Incorrect password”.  No rhyme or reason to it.  Eventually, after several reboots, the Apple devices magically started working.  Again, just be patient.  It’s not like applying changes to a standalone AP, or even a local controller.  There’s that Internet thing getting in the way!

Ubiquiti UAP-AC-LR in the house!

I’ve been having some trouble with my two Apple AirPort Extremes in the house.  They are both a couple of generations old and I got them both used off of eBay some years ago.  They’ve served me well and provided good throughput and signal coverage.  For some reason I can’t explain, in the last month they’ve become slow and buggy.  Maybe it was an update.  Regardless, I’ve had my eye on the new AC APs from Ubiquiti and this was a good excuse to pull the trigger.

So, I decided to get a couple of the LR models, partly because I want more coverage out in the yard, partly because they are less expensive and partly because they are readily available.  I set up the UniFi controller in a VM in Nutanix first, and installation could not have been easier.  So far, I’m very happy with the coverage and performance.  I’ve been getting good coverage in the house, and I’m still able to use them at almost 200′ away from the house.

 

Nutanix CE is operational

I’ve been running on a Nutanix CE install for about a month now.  With the November release they added some much needed GUI controls for the image service.  You can now import ISOs for install images, without having to fiddle with CLI stuff.

I’ve had virtually no problems, and the VMs are performing well.  If there’s one complaint I have with this solution it’s that the baseline memory utilization is high.  I couldn’t reduce the CVMs to less than 8GB each without running into serious problems with the cluster.  Plus, there seems to be a missing 3GB per host.  I’m assuming this is what the actual CE and KVM host requires, but that seems high.  I know I can run VMware ESXi in less than 1GB per host.  So, 11GB per host is used up right from the start.  Since I’m running this on a shoestring budget with 16GB per host, I really only have 5GB available for VMs.  That kinda sucks.

On the upside, the CVMs at 8GB work fine and the IO performance is pretty amazing.  I’ve seen upwards of 1600 IOPS at times.  This is basically a single consumer grade 240GB SSD in each host for the primary tier and a 640GB HDD for the secondary tier.  I don’t think I’m even using the secondary yet.  The 3 hosts have varying levels of i5 CPUs, but none of them current gen.

I’m pretty happy with this and I’m looking forward to seeing what Nutanix does next.

Nutanix CE challenges

The Nutanix install has been moving along.  I would not say it’s ready for more than lab use, but it’s getting there.  I’m setting up a 3 node cluster, and one of the nodes, which has an Intel motherboard, kept throwing a generic error about not being able to find the sysinfo.  Thanks to the help from the forum, I was able to hard code a product name in order to get past the install.  I don’t think it will have an impact on operation, only install, but it’s one of those little things that crops up with new software.

The link is here, if you’re able to access it:  http://next.nutanix.com/t5/Discussion-Forum/Install-failing-with-quot-unable-to-gather-basic-hardware/td-p/5034/page/2

 

Nutanix in the house

About 2 months ago Nutanix.com released a free software-only version of their magic, called Community Edition.  I got on the list for this as quickly as I could, but I haven’t been able to install it until now.  See, I wanted to have an actual cluster, what they call RF2 (Redundancy Factor 2), which would require me to blow away my existing XenServer install to get to enough compatible hardware.  I also needed to purchase SSDs for each of the nodes in the cluster.

Well, I’ve done that now.  At the moment, I’m exporting my VMs out of XenServer to OVAs, in the hope I can restore them from that.  If I can’t, well… I’m not sure then.  I may just rebuild everything from scratch.  I’d really like to figure out how to import them, though.

What I’ll have when I’m done is a 3 node RF2 cluster, with at minimum a 240GB SSD and at least a 500GB HDD in each node.  All 3 nodes are i5s, of different vintages.  Not a lot of space once you account for the Nutanix overhead, but it’ll be enough for my needs.  I’ll post some screenshots and pics once I’m up and running.

Grandstream GXV3610 disassembled

In the interest of completeness I’m posting my pictures here to show what the inside of the GXV3610 looks like. I couldn’t find this anywhere else, so here it is:


My goal was to remove the cable from the backside of the ball and run it through the small hole I had already drilled through the exterior wall for a Cat5 cable. The two sides of the ball unscrew easily enough, but you can see from the pictures that getting the cable out would have been a challenge. The wires coming out of the jacketed cable go to three different plugs. I could have made that work, although the weatherproof grommet at the back would have been a problem. The bigger issue is what the red arrow is pointing at in the second pic. That’s the mic at the front bottom of the camera. Or rather, that pair of wires goes to the back side of the mic, which is thoroughly coated with a white paste of some sort, no doubt for weather proofing. My only real option would have been to snip those wires and resolder them. Nah, that’s ok.

Instead, I went to Home Depot with the mounting ring in hand and found something like this:

Round PVC junction box

The ring does not line up with standard 4″ round electric boxes. However, it’s a perfect match with this junction box. I had to add some caulk and foam backing material to seal the gap, but it’s closed tight now. I’ll snap a picture of the mounted box and post it.

The camera is working great. Good FOV and sharp picture. I have the 720p model, not the full HD.

Logitech 700e security camera disassembly

My 700e security camera failed after a couple years of use.  I got a Grandstream replacement, which I’ll also post about, so I figured it would be good to take the 700e apart and see if I could figure out what went wrong.  Figure it out I did.  And since there’s a lack of documentation and photos for taking the thing apart I thought I’d put it up here:

First, pop the top silver cover off by slotting a screwdriver down the side to pop out the tabs:


Once the silver cover is off you need to remove the 6 screws across the top and gently insert a small flat head screwdriver here:


All you’re trying to do with that one is break the seal. After the seal is broken the darker color cover comes right off and you have this:


Bonus points if you can guess what my problem was:


Remove the screw from the metal plate at the Ethernet jack end. Carefully pry up the circuit board from that end. There’s also a slot on the plastic end plate that you can use to help it up. Once it’s out you can carefully raise the board up at an angle, although the cables going to the camera board are still attached. Here’s what I found under there:


Yep, that’s white corrosion all over the top of the Ethernet jack, the lower right corner of the board, and all around the edge. Just below the (I think) transformer on the right there’s a bright white spot. That’s actually a pile of corrosion on top of several resistors.

So, there you have it. A disassembled Logitech 700e security camera. I’m going to try an eraser on the corroded bits and see if it wants to come back to life. I’m not optimistic, but I’ll post back if it works.

Apple Watch and replaceable parts

Today I sent a note to Daring Fireball saying the following.  I think the backing for the Apple Watch will be easily removable and will support swaps at the Genius Bar.  What I mean is that the casing, which will be the expensive part, will be static and the engine, currently the S1, will be replaceable at the Apple Store.  Apple currently has the mechanism to replace screens for iPhones at the stores.  It seems to be a small stretch to support removing the backing on the Watch and replacing the smarts inside quickly.  I’m even thinking that the recent changes to triaging Genius requests have something to do with this.  Of course, we’ll see what happens over the next year.

Sweeping the Mental Dust Bunnies Under the Rug