# Upgrading InfluxDB

## Summary

Upgrading InfluxDB has two parts: updating the cookbooks, then applying the new version to the servers. This guide walks through an upgrade from `1.3.5` to `1.3.7`.

## Cookbook Changes

### Update `sds_influxdb`

1. Download the files for the new version. The links can be found at the [InfluxDB Portal][1].

    * Navigate to a valid license.
    * Check the box for `Click to Accept Software License Subscription Agreement` to show the download links.
    * Download both the `DataNode` file and the `MetaNode` file for the `amd64.deb` variant.

        ```shell
        curl -O -J https://dl.influxdata.com/enterprise/releases/influxdb-data_1.3.7-c1.3.7_amd64.deb
        curl -O -J https://dl.influxdata.com/enterprise/releases/influxdb-meta_1.3.7-c1.3.7_amd64.deb
        ```

1. Calculate SHA-256 hashes for the new files. On macOS, use the following (on Linux, `sha256sum` produces the same digests):

    ```shell
    shasum -a 256 ./influxdb-data_1.3.7-c1.3.7_amd64.deb
    shasum -a 256 ./influxdb-meta_1.3.7-c1.3.7_amd64.deb
    ```

1. In `attributes.rb` add new checksum lines for the new version.  Use the hashes output by the above `shasum` commands.
 
    ```ruby
    default['influxdb']['checksums']['data']['debian']['1.3.7'] = '779487a83c5f2b113c4788e740c7b6d46ea1fa58fa35b7985be84d431f357337'
    default['influxdb']['checksums']['meta']['debian']['1.3.7'] = '37bf45df0994a233919a54e0d412bc903c6a0dd736c558f2a0488156bf69a0c6'
    ``` 

1. In `attributes.rb` change the default version to the new version.

    ```ruby
    default['influxdb']['version'] = '1.3.7'
    ```

1. Apply any default configuration changes required by the new version. Many version bumps will not need any.

1. Push new cookbook version.
    * Commit changes
    * Run `bump_metadata_version`
    * Run `push_metadata_version`
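
The hashing and `attributes.rb` steps above can be combined in a small script. This is a minimal sketch rather than part of the cookbook tooling: `sha256_of` and `checksum_line` are hypothetical helper names, and it assumes either `sha256sum` or `shasum` is available on the `PATH`.

```shell
#!/usr/bin/env bash

# Print the SHA-256 hex digest of a file, using whichever tool is installed
# (sha256sum on most Linux systems, shasum on macOS).
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    else
        shasum -a 256 "$1" | awk '{print $1}'
    fi
}

# Print the attributes.rb checksum line for one package.
# $1 = node type (data or meta), $2 = version, $3 = path to the .deb file
checksum_line() {
    printf "default['influxdb']['checksums']['%s']['debian']['%s'] = '%s'\n" \
        "$1" "$2" "$(sha256_of "$3")"
}
```

Calling `checksum_line meta 1.3.7` with the path to the downloaded meta `.deb` as the third argument prints a line that can be pasted into `attributes.rb`; repeat with `data` for the data package.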

### Update `app_influxdb`

1. Apply any configuration override changes as needed.

1. Push new cookbook version.
    * Run `berks update`
    * Commit changes
    * Run `bump_metadata_version`
    * Run `push_metadata_version`
    * Run `upload_and_apply`

## Apply Service Upgrade

1. Log in to all servers in the cluster via `ssh`.

1. Ensure Chef has run successfully on the nodes. Confirm the installed InfluxDB version with the following commands, checking that the `Status:` line reads `install ok installed` and the `Version:` line shows the expected version.

    On Meta Nodes

    ```shell    
    apt-cache show influxdb-meta
    Package: influxdb-meta
    Status: install ok installed
    Priority: extra
    Section: default
    Installed-Size: 28479
    Maintainer: support@influxdb.com
    Architecture: amd64
    Version: 1.3.7-c1.3.7-1
    Conffiles:
    /etc/influxdb/influxdb-meta.conf a9e07efd93f274b53f876e64c151db18
    /etc/logrotate.d/influxdb-meta 5098f781a8a9fdab4cd96d7f2a88d961
    Description: Meta service for InfluxDB, a distributed time-series database.
    Description-md5: afa320dd28af640c4c931696717df9a8
    License: Proprietary
    Vendor: InfluxData
    Homepage: https://influxdata.com
    ```
    
    On Data Nodes

    ```shell
    apt-cache show influxdb-data
    Package: influxdb-data
    Status: install ok installed
    Priority: extra
    Section: default
    Installed-Size: 60780
    Maintainer: support@influxdb.com
    Architecture: amd64
    Version: 1.3.7-c1.3.7-1
    Conffiles:
    /etc/influxdb/influxdb.conf ba442fd9d983f299d535e51492a711bc
    /etc/logrotate.d/influxdb 546afc8a077d25862a8c0a4c1220223f
    Description: Distributed time-series database.
    Description-md5: 0b6beb0ca2e0701d5fa9dd4f80fb60f3
    License: Proprietary
    Vendor: InfluxData
    Homepage: https://influxdata.com
    ```    

1. Run the following command on the `meta` nodes, and only on the `meta` nodes, to see the version each node in the cluster reports. You can run it repeatedly throughout the process to verify that nodes are back online.

    ```shell
    influxd-ctl show
    Data Nodes
    ==========
    ID    TCP Address                    Version
    6    influxdb-data-pdx-a-06edc145675dab320:8088    1.3.5-c1.3.5
    7    influxdb-data-pdx-a-0cb4db4937ade1f13:8088    1.3.5-c1.3.5
    4    influxdb-data-pdx-b-0d433580dd3cae66d:8088    1.3.5-c1.3.5
    5    influxdb-data-pdx-c-082e62f0305ccd246:8088    1.3.5-c1.3.5

    Meta Nodes
    ==========
    TCP Address                    Version
    influxdb-meta-pdx-a-05ac3af0a5407fb69:8091    1.3.5-c1.3.5
    influxdb-meta-pdx-b-0e56f8538c4251c28:8091    1.3.5-c1.3.5
    influxdb-meta-pdx-c-07fb56dd1672ddcfc:8091    1.3.5-c1.3.5
    ```

1. Starting with the `meta` nodes, restart the service one node at a time. Do not proceed to the next `meta` node until `influxd-ctl show` shows the current one as upgraded. These restarts should be reasonably fast.

    ```shell
    systemctl restart influxdb-meta
    ```

1. Next, restart the service on the `data` nodes, one node at a time.

    > **WARNING:** Do not progress from `data` node to `data` node without ensuring that the service has fully initialized.  This can be determined multiple ways, but `influxd-ctl show` is **NOT** sufficient. 

    > **WARNING:** Each server can take a significant amount of time to finish this step, depending on the volume of data on its disk. It's highly recommended to confirm that the new version is installed **before** restarting the service.


    * Restart the service and follow the logs to watch it start up:
        ```shell
        systemctl restart influxdb && journalctl -fu influxdb
        ```

    * You will see lines about opening and reading of files like these:
        ```
        Oct 30 20:51:21 influxdb-data-pdx-a-06edc145675dab320 influxd[112036]: [I] 2017-10-30T20:51:21Z reading file /mnt/influxdb_wal/wal/demeter_api_query/autogen/4416/_00047.wal, size 10489259 engine=tsm1 service=cacheloader
        Oct 30 20:51:52 influxdb-data-pdx-a-06edc145675dab320 influxd[112036]: [I] 2017-10-30T20:51:52Z /mnt/influxdb_data/data/demeter_network_request/autogen/4114 opened in 24.25091558s service=store
        ```

    * Even though `influxd-ctl show` will now report the new version, the service is **not** actually online yet. You must wait until you see HTTP traffic in the log.
        ```
        Oct 30 20:59:11 influxdb-data-pdx-a-0cb4db4937ade1f13 influxd[2781]: [httpd] 10.201.168.223 - - [30/Oct/2017:20:59:11 +0000] "GET /ping HTTP/1.1" 204 0 "-" "ELB-HealthChecker/2.0" 2dd5e37b-bdb5-11e7-8004-000000000000 24
        Oct 30 20:59:11 influxdb-data-pdx-a-0cb4db4937ade1f13 influxd[2781]: [httpd] 10.201.169.18 - - [30/Oct/2017:20:59:11 +0000] "GET /ping HTTP/1.1" 204 0 "-" "ELB-HealthChecker/2.0" 2e1a4553-bdb5-11e7-8005-000000000000 72
        Oct 30 20:59:12 influxdb-data-pdx-a-0cb4db4937ade1f13 influxd[2781]: [httpd] 10.201.170.176 - - [30/Oct/2017:20:59:12 +0000] "GET /ping HTTP/1.1" 204 0 "-" "ELB-HealthChecker/2.0" 2e9233ee-bdb5-11e7-8006-000000000000 15
        ```

    * Once the HTTP port is bound and traffic appears in the logs, it's safe to move to the next node.

1. Confirm that the cluster is on a consistent version on all nodes using the same `influxd-ctl show` command as earlier.

    ```shell
    influxd-ctl show
    Data Nodes
    ==========
    ID    TCP Address                    Version
    6    influxdb-data-pdx-a-06edc145675dab320:8088    1.3.7-c1.3.7
    7    influxdb-data-pdx-a-0cb4db4937ade1f13:8088    1.3.7-c1.3.7
    4    influxdb-data-pdx-b-0d433580dd3cae66d:8088    1.3.7-c1.3.7
    5    influxdb-data-pdx-c-082e62f0305ccd246:8088    1.3.7-c1.3.7

    Meta Nodes
    ==========
    TCP Address                    Version
    influxdb-meta-pdx-a-05ac3af0a5407fb69:8091    1.3.7-c1.3.7
    influxdb-meta-pdx-b-0e56f8538c4251c28:8091    1.3.7-c1.3.7
    influxdb-meta-pdx-c-07fb56dd1672ddcfc:8091    1.3.7-c1.3.7
    ```
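
The version and readiness checks in the steps above can also be scripted. The following is a minimal sketch, not a tested part of this runbook: `nodes_not_on_version` and `wait_for_ping` are hypothetical helper names, and the default URL `http://localhost:8086` assumes the data node serves HTTP on InfluxDB's default port, which this guide does not state. The `/ping` endpoint and its `204` response are taken from the health-check log lines shown earlier.

```shell
#!/usr/bin/env bash

# Read `influxd-ctl show` output on stdin and print every node that is NOT
# reporting the expected version, as "address version".
# Prints nothing when all nodes are on the expected version.
# $1 = expected version string, e.g. 1.3.7-c1.3.7
nodes_not_on_version() {
    awk -v want="$1" '
        {
            # Node rows are the only lines containing a host:port field.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /:[0-9]+$/ && $NF != want) {
                    print $i, $NF
                    break
                }
            }
        }'
}

# Poll a data node until GET /ping answers with HTTP 204, which is the
# same request the load balancer health checks make in the logs.
# $1 = base URL (assumed default below), $2 = seconds between polls
wait_for_ping() {
    local url="${1:-http://localhost:8086}"
    local delay="${2:-10}"
    until [ "$(curl -s -o /dev/null -w '%{http_code}' "$url/ping")" = "204" ]; do
        sleep "$delay"
    done
}
```

For example, run `wait_for_ping` on a data node right after restarting it and move on only when it returns. Once every node is done, `influxd-ctl show | nodes_not_on_version 1.3.7-c1.3.7` on a meta node should print nothing. Watching the logs as described above remains the authoritative check; the polling loop is only a convenience.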

[1]: https://portal.influxdata.com/
