An Introduction to OpenTSDB

In our last three posts, we covered HDFS, ZooKeeper, and the HBase cluster, all of which are needed to deploy OpenTSDB in clustered mode. Continuing the series, in this post we will finally deploy OpenTSDB.

OpenTSDB is a distributed, scalable, time series database built on top of Hadoop and HBase.

OpenTSDB can collect, store, and serve billions of data points without any loss of precision, which makes it a good fit for monitoring systems.

OpenTSDB uses HBase for time series data storage and ZooKeeper to get HBase cluster information.

OpenTSDB consists of a Time Series Daemon (TSD) and a set of command line utilities. Interaction with OpenTSDB happens primarily through one or more running TSDs. Each TSD is independent: there is no master and no shared state, so we can run as many TSDs as needed to handle whatever load we throw at them. Each TSD uses the open source database HBase to store and retrieve time series data.
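Because TSDs share no state, any of them can accept a write. As a rough sketch of what that looks like from a client, the snippet below formats a data point in OpenTSDB's telnet-style `put` line format and rotates over a list of TSDs. The host names, the metric name, and the helper names `format_put_line` and `pick_tsd` are made up for illustration; nothing is sent over the network here.

```python
import time

# Hypothetical TSD endpoints; 4242 is the port used later in this post.
TSD_ENDPOINTS = [("tsd1.example.internal", 4242), ("tsd2.example.internal", 4242)]


def format_put_line(metric, value, tags, timestamp=None):
    """Build one telnet-style line: put <metric> <timestamp> <value> <tagk=tagv> ..."""
    if not tags:
        raise ValueError("OpenTSDB requires at least one tag per data point")
    if timestamp is None:
        timestamp = int(time.time())  # seconds since the Unix epoch
    tag_part = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {timestamp} {value} {tag_part}"


def pick_tsd(endpoints, n):
    """Simple round-robin: no TSD is special, so any of them can take the n-th write."""
    return endpoints[n % len(endpoints)]


if __name__ == "__main__":
    line = format_put_line("sys.cpu.user", 42.5, {"host": "web01"}, timestamp=1356998400)
    print(pick_tsd(TSD_ENDPOINTS, 0), line)
```

In practice a load balancer in front of the TSDs usually does the rotation; the round-robin helper only illustrates that the choice of TSD does not matter.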

Properties of OpenTSDB:

  • Data is stored exactly as you give it
  • Write with millisecond precision
  • Keep raw data forever
  • Runs on Hadoop and HBase
  • Scales to millions of writes per second
  • Add capacity by adding nodes
  • Generate graphs from the GUI
  • Provides an HTTP API
  • Works with tools like Grafana for richer data visualization
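To make the HTTP API and millisecond precision points a little more concrete, here is a minimal sketch of the JSON body that the `/api/put` endpoint accepts for a single data point. The metric name, tag, and helper names are illustrative assumptions, and the snippet only builds the request data without sending it.

```python
import json


def build_put_payload(metric, timestamp, value, tags):
    """Return a dict shaped like one data point for POST /api/put."""
    return {
        "metric": metric,
        "timestamp": timestamp,  # seconds, or milliseconds (13 digits) since the epoch
        "value": value,
        "tags": dict(tags),      # at least one tag is required
    }


def put_url(host, port=4242):
    """URL of the put endpoint on a TSD (port 4242 as used in this post)."""
    return f"http://{host}:{port}/api/put"


if __name__ == "__main__":
    # Millisecond timestamp: 1356998400 seconds plus 123 ms.
    point = build_put_payload("sys.cpu.user", 1356998400123, 42.5, {"host": "web01"})
    print(put_url("<vm1 IP>"))
    print(json.dumps(point, sort_keys=True))
```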

Why HBase:

The following properties of HBase make it a good fit for OpenTSDB:

Scalable: HBase uses HDFS to store data, so if we want to store more data, we just have to add more DataNodes to our cluster.

Automatic replication: Your data is stored in HDFS, which by default keeps 3 replicas on 3 different machines. You can also enable region replication when creating the tsdb tables in HBase (in our create_tbl script, we set REGION_REPLICATION => 2).

High write throughput: The Bigtable design, which HBase follows, uses LSM trees instead of, say, B-trees, to make writes cheaper.
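The idea behind an LSM tree can be shown with a toy model: writes go into an in-memory table and are flushed as immutable sorted runs, so a write never has to rewrite data already on disk. The class below is only an illustration of that idea, not how HBase is implemented; HBase's MemStore and HFiles play roughly the roles of `memtable` and `runs`, and the write-ahead log and compaction are left out entirely.

```python
import bisect


class TinyLSM:
    """Toy LSM store: an in-memory table plus immutable sorted runs."""

    def __init__(self, memtable_limit=3):
        self.memtable = {}            # recent writes, cheap to update
        self.memtable_limit = memtable_limit
        self.runs = []                # list of (sorted_keys, values), newest last

    def put(self, key, value):
        # A write only touches memory; older runs are never modified.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one sorted, immutable run.
        if self.memtable:
            items = sorted(self.memtable.items())
            self.runs.append(([k for k, _ in items], [v for _, v in items]))
            self.memtable = {}

    def get(self, key):
        # Reads check the newest data first: memtable, then runs newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for keys, values in reversed(self.runs):
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None


if __name__ == "__main__":
    db = TinyLSM(memtable_limit=2)
    db.put("a", 1)
    db.put("b", 2)   # reaches the limit and triggers a flush
    db.put("a", 3)   # newer value shadows the flushed one
    print(db.get("a"), db.get("b"), len(db.runs))
```

The trade-off is visible in `get`: reads may have to look in several places, which is the price paid for cheap sequential writes.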

Create Tables in HBase for OpenTSDB:

OpenTSDB uses the following tables in HBase:
tsdb, tsdb-uid, tsdb-tree & tsdb-meta.

The time series data is stored in the tsdb table.

Copy the create_tbl.sh file into any of the HBase containers that we deployed in our last post. We will do this in the hbase1 container.

Deploy OpenTSDB:

To deploy OpenTSDB, we will use the open source Docker image provided by Peter Grace.

opentsdb.conf

Replace <zookeeper1 vm IP>,<zookeeper2 vm IP>,<zookeeper3 vm IP> with the respective VM IPs. OpenTSDB uses ZooKeeper to get HBase cluster information, so both HBase and OpenTSDB will use the same ZooKeeper cluster.
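As a small illustration of the substitution, the sketch below builds the ZooKeeper quorum string from a list of IPs and renders the two settings most relevant here, `tsd.network.port` and `tsd.storage.hbase.zk_quorum`. This is not the full opentsdb.conf from this post; the sample IPs are placeholders, and port 2181 is ZooKeeper's default client port, which is assumed rather than taken from the earlier setup.

```python
def zk_quorum(hosts, port=2181):
    """Join ZooKeeper hosts into a comma-separated host:port list."""
    return ",".join(f"{h}:{port}" for h in hosts)


def render_conf(hosts):
    """Render a minimal opentsdb.conf fragment (illustrative, not complete)."""
    settings = {
        "tsd.network.port": "4242",                       # port the TSD listens on
        "tsd.storage.hbase.zk_quorum": zk_quorum(hosts),  # same quorum HBase uses
    }
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


if __name__ == "__main__":
    # Placeholder addresses; use the IPs of your own ZooKeeper VMs.
    print(render_conf(["10.0.0.1", "10.0.0.2", "10.0.0.3"]))
```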

Run the OpenTSDB Docker container

You can open http://<vm1 IP>:4242/ to see OpenTSDB running.
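If you prefer to check from a script rather than a browser, one option is to call the TSD's `/api/version` endpoint, which returns a small JSON document that includes a `version` field. The sketch below is one way to do that with the standard library; the function names and the 3-second timeout are arbitrary choices for illustration.

```python
import json
import urllib.request


def version_url(host, port=4242):
    """URL of the version endpoint on a TSD."""
    return f"http://{host}:{port}/api/version"


def tsd_is_up(host, port=4242, timeout=3.0):
    """Return True if the TSD answers /api/version with JSON containing 'version'."""
    try:
        with urllib.request.urlopen(version_url(host, port), timeout=timeout) as resp:
            info = json.load(resp)
    except (OSError, ValueError):
        # OSError covers connection and HTTP errors; ValueError covers invalid JSON.
        return False
    return "version" in info


if __name__ == "__main__":
    # Replace the placeholder with the IP of the VM running the container,
    # then call tsd_is_up(host) to run the actual check.
    print(version_url("<vm1 IP>"))
```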

opentsdb.PNG

In a future post, we will discuss how to use the OpenTSDB HTTP API to push stats to OpenTSDB, and how to use TCollector to push host-level and other service stats to OpenTSDB.

Thanks for reading😊
Please like and share😊

Reference:

http://opentsdb.net/

 

 
