So, I’ve been working on something.  I keep seeing all of these very nice home lab dashboards on /r/homelab and I thought it would be useful to create one for myself.  I present to you my home dashboard, which is hanging in the kitchen on an old iPad we weren’t using:

Getting to this point was not without challenges.  In fact, it was painful at times.  I’m going to try to document my setup here.  Because of all of the twists and turns along the way, this is not a complete guide.  There are parts of this that you’ll have to figure out for yourself.  It also assumes some knowledge of Linux, Ubuntu in particular.  If I get comments asking about specific sections, I’ll try to update the post with current info.

So, what do we have here?  The picture you see above is made up of a number of components.  InfluxDB is a time-series database, much like RRDTool or the original MRTG.  It’s designed to take in datapoints, tag them with a timestamp, and then move on.  It might be capable of more, but we’re not using it for anything else.  Grafana is the visualization tool that creates what you see above.  Grafana is very configurable, which I’ll dive into more in a bit.  The final piece of the puzzle is data collection.  There are a number of ways to get data into InfluxDB.  I’m using Telegraf and some interesting scripting.

Let’s start by getting some links in here.  I’ll update this as I update the post.

This is where it all started for me:

https://lkhill.com/using-influxdb-grafana-to-display-network-statistics/

This was useful for the Grafana configuration:

Setup a wicked Grafana Dashboard to monitor practically anything

InfluxData, which includes InfluxDB and Telegraf:

https://www.influxdata.com/

Grafana for the visualization:

http://grafana.org/

The “SmokePing” stand-in:

https://hveem.no/visualizing-latency-variance-with-grafana

The Unraid tools:

https://lime-technology.com/forum/index.php?topic=52220.msg512346#msg512346

Ok, here we go…

First, I would start with the top link to lkhill’s instructions.  Use that to get up and running with InfluxDB and Grafana installed.  DO NOT follow that guide for the InfluxSNMP install.  Telegraf takes care of SNMP now.  If I recall, InfluxData wants your…data, in order to download InfluxDB.  It’s cool though, because they’ll send you some swanky stickers.  I believe these are still valid instructions for installing Telegraf:  https://docs.influxdata.com/telegraf/v1.1/introduction/installation/

I would suggest getting to this point with InfluxDB, Grafana and Telegraf installed and not throwing errors before you proceed with any configuration.  I know I’m skipping a lot of things that might not work without some tweaking.  Like I said, I’ll update this if I get feedback that these installations need to be detailed.  Add the data source as shown in lkhill’s instructions.
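As a quick sanity check before moving on: the stock telegraf.conf normally enables a handful of local inputs, which is where the localhost data comes from.  Here is a minimal sketch of the CPU one, assuming the standard option names from the Telegraf cpu plugin.  Your generated file may contain more options than this.

```toml
# Sketch of the local CPU input, as usually found in a default telegraf.conf.
# If this section is commented out, no "cpu" measurement will show up in InfluxDB.
[[inputs.cpu]]
  # Report a separate series for each core
  percpu = true
  # Also report the combined total across all cores
  totalcpu = true
```

If this section is present and Telegraf is running without errors, a measurement named cpu should appear when you build your first graph.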

At this point you should have some data being populated for the localhost, and the data source should be available in Grafana.  I would suggest diverging from lkhill’s instructions here.  Instead of adding a graph for SNMP stats (we have none at this point), let’s set up a graph of the local CPU utilization.  Add a new dashboard and then click on the small green square in the upper left.  Click on the “A” select statement and it’ll expand to show you options for finding the data.  Clicking on each of the fields will either give you a drop-down list of options, or it might give you an x above the item.  For instance, if you click on mean() you’ll get the x above that.  Click the x to delete mean().  Clicking the + at the end of each row will give you a list of options to add from.  Try to get your selection to look like this:

Click the big X out on the right of the tab bar, past Time range, to close the edit and return to the dashboard.  Congrats, you just made your first dashboard!  Let’s get some useful data in there.

The first thing to take care of is adding SNMP.  Go to /etc/telegraf/ and edit telegraf.conf.  If there’s not a conf file, there might be a template with a .dpkg-dist extension in there.  If not, you can create a new template.  I found this extremely helpful for working through Telegraf issues:  https://github.com/influxdata/telegraf  You can also go right to the SNMP readme at https://github.com/influxdata/telegraf/tree/master/plugins/inputs/snmp

You can see that Telegraf has quite a few plugins for gathering data.  SNMP is only one part of it.  Some configuration is necessary to start using Telegraf.  Near the top of the file are general settings that must be configured.  Make sure that in the output plugins section (the [[outputs.influxdb]] block) the urls, database and username/password are uncommented and correct.  The database can be called whatever you want, and Grafana can use multiple databases by adding each one as its own data source.  Find the “inputs.snmp” section and we’ll begin editing it.  Here’s mine:

# Retrieves SNMP values from remote agents
[[inputs.snmp]]
  agents = [ "192.x.x.x:161" ]
  timeout = "5s"
  version = 3

  max_repetitions = 50

  sec_name = "SNMPv3User"
  auth_protocol = "SHA" # Values: "MD5", "SHA", ""
  auth_password = "topsecret"
  sec_level = "authPriv" # Values: "noAuthNoPriv", "authNoPriv", "authPriv"

  priv_protocol = "AES" # Values: "DES", "AES", ""
  priv_password = "alsotopsecret"

  name = "nutanix"
  [[inputs.snmp.field]]
    name = "host1CPU"
    oid = "1.3.6.1.4.1.41263.9.1.6.1"
  [[inputs.snmp.field]]
    name = "host2CPU"
    oid = "1.3.6.1.4.1.41263.9.1.6.2"
  [[inputs.snmp.field]]
    name = "host3CPU"
    oid = "1.3.6.1.4.1.41263.9.1.6.3"
  [[inputs.snmp.field]]
    name = "ClusterIOPS"
    oid = "1.3.6.1.4.1.41263.506.0"
  [[inputs.snmp.field]]
    name = "Host1MEM"
    oid = "1.3.6.1.4.1.41263.9.1.8.1"
  [[inputs.snmp.field]]
    name = "Host2MEM"
    oid = "1.3.6.1.4.1.41263.9.1.8.2"
  [[inputs.snmp.field]]
    name = "Host3MEM"
    oid = "1.3.6.1.4.1.41263.9.1.8.3"

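[[inputs.snmp]]
  agents = [ "192.x.x.x:161" ]
  timeout = "5s"
  retries = 3
  version = 2
  community = "topsecret"
  max_repetitions = 10

  name = "ERX"
  # Note: in the standard IF-MIB, ...2.2.1.10 is ifInOctets and ...2.2.1.16 is
  # ifOutOctets.  Swap the two names below if you want them to match
  # interface 2's own in/out direction.
  [[inputs.snmp.field]]
    name = "Bytes.Out"
    oid = "1.3.6.1.2.1.2.2.1.10.2"
  [[inputs.snmp.field]]
    name = "Bytes.In"
    oid = "1.3.6.1.2.1.2.2.1.16.2"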
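
For reference, the output side mentioned earlier might look roughly like this.  It’s only a sketch: the URL, database name and credentials are placeholders I’m assuming for illustration, so substitute whatever you set up when you installed InfluxDB.

```toml
# Sketch of the InfluxDB output section in telegraf.conf.
# All values here are placeholder assumptions, not my real settings.
[[outputs.influxdb]]
  # Where InfluxDB is listening (8086 is the usual HTTP API port)
  urls = ["http://localhost:8086"]
  # Target database; this is the name you point the Grafana data source at
  database = "telegraf"
  # Credentials, if authentication is enabled on your InfluxDB
  username = "telegraf"
  password = "changeme"
```

Whatever database name you pick here is the one to enter when you add the data source in Grafana.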

I’ve edited the IP addresses and security info, so make sure that matches whatever you have set up.  Oh yeah, you have to enable SNMP on your devices!  A couple of key points: you can use different SNMP versions or authentication methods by adding a new [[inputs.snmp]] section for each one.  I’m also using the full numeric OIDs, but you can see in the template that it’s possible to reference a MIB object by name as well.  Save that and exit.  You can test the file with:

telegraf --config telegraf.conf --test

This will give you lines for each device you’ve configured and show you what the response is.  If you don’t see data, something’s wrong with the SNMP config.
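
On the point about referencing a MIB by name instead of a numeric OID, a field can look something like the sketch below.  It assumes the IF-MIB files are installed on the machine running Telegraf so the names can be translated.  The field names are hypothetical ones I made up for the example, and the agent address and community string are the same placeholders as above.

```toml
# Sketch only: same kind of v2 agent as above, but with textual OIDs.
# Requires the relevant MIB files to be available on the Telegraf host.
[[inputs.snmp]]
  agents = [ "192.x.x.x:161" ]
  version = 2
  community = "topsecret"

  name = "ERX"
  [[inputs.snmp.field]]
    # Hypothetical field name; inbound octets on interface index 2
    name = "if2_in_octets"
    oid = "IF-MIB::ifInOctets.2"
  [[inputs.snmp.field]]
    # Hypothetical field name; outbound octets on interface index 2
    name = "if2_out_octets"
    oid = "IF-MIB::ifOutOctets.2"
```

If the test run complains that it can’t translate the names, falling back to the numeric OIDs is the quickest way to rule out a missing MIB.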
