Analysis pipeline


To set up DIO's analysis pipeline, it is necessary to install Elasticsearch and Kibana and to import DIO's pre-defined dashboards. The dio-pipeline folder contains a docker-compose file and Ansible playbooks that create and set up DIO's pipeline automatically.

Start by cloning DIO's repository and entering the dio-pipeline folder:

$ git clone https://github.com/dsrhaslab/dio.git
$ cd dio-pipeline

> Via Docker-compose (single-node)

The docker-compose folder contains a docker-compose.yml file that configures one container for Elasticsearch and another for Kibana. It also includes a .env file with the variables needed to set up DIO's analysis pipeline.
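The .env file might look like the following sketch. This is illustrative only: HOST_IP, KIBANA_PORT, and ELASTIC_PASSWORD are assumed variable names based on the placeholders used later in this guide, and the values are examples, so check the .env file shipped in the repository for the actual names.

```
# Hypothetical .env sketch -- variable names and values are illustrative
HOST_IP=192.168.56.100
KIBANA_PORT=5601
ELASTIC_PASSWORD=changeme
```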

  1. Update the necessary variables in the .env file according to your setup.
  2. Run docker-compose up to start the containers.
  3. Ensure that you can access and log into Kibana:
    • Access http://<HOST_IP>:<KIBANA_PORT> in your browser;
    • Log in with:
      • Username: "elastic";
      • Password: "<ELASTIC_PASSWORD>" (the elastic password defined in the .env file).
  4. Import DIO's dashboards into Kibana:
  5. $ curl -u "elastic:<ELASTIC_PASSWORD>" -X POST -k "http://<HOST_IP>:<KIBANA_PORT>/api/saved_objects/_import" -H "kbn-xsrf: true" --form file=@dio_dashboards.ndjson
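Since the placeholders in the command above come from the .env file, the import URL can be built by sourcing that file instead of typing the values by hand. A minimal sketch, assuming the .env file defines variables named HOST_IP, KIBANA_PORT, and ELASTIC_PASSWORD (these names are an assumption, not confirmed by the repository):

```shell
# Sketch: read the pipeline settings from .env (if present) and build the
# Kibana dashboard-import URL. The variable names and the defaults below
# are illustrative assumptions.
[ -f .env ] && { set -a; . ./.env; set +a; }
IMPORT_URL="http://${HOST_IP:-localhost}:${KIBANA_PORT:-5601}/api/saved_objects/_import"
echo "$IMPORT_URL"
```

The printed URL is the one passed to the curl command above, together with `-u "elastic:${ELASTIC_PASSWORD}"`.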

> Via Ansible & Kubernetes (multi-node)


The dio-pipeline folder contains Ansible playbooks that automatically create and set up a Kubernetes cluster with all the required components.

  1. Install Ansible and the required Ansible collections.
  2. $ apt install ansible
    $ ansible-galaxy collection install ansible.posix
    $ ansible-galaxy collection install kubernetes.core
    $ ansible-galaxy collection install cloud.common
    $ ansible-galaxy collection install community.general
    $ ansible-galaxy collection install community.kubernetes
  3. Update the inventory file (hosts.ini) with the information of the servers on which the pipeline should be installed. If more than one server is used, Kibana will be installed on the master node, while Elasticsearch will run on the worker nodes.
    • Add the master's information to the "[master]" group
    • Add the workers' information to the "[node]" group

    Syntax
      <hostname> ansible_host=<host_ip> ansible_python_interpreter='python3'
    Example
      [master]
      master ansible_host=192.168.56.100 ansible_python_interpreter='python3'
      [node]
      worker1 ansible_host=192.168.56.101 ansible_python_interpreter='python3'
      worker2 ansible_host=192.168.56.102 ansible_python_interpreter='python3'
      [kibana:children]
      master
      [kube_cluster:children]
      master
      node

  4. Run the run_dio_pipeline.sh script to install and configure DIO's pipeline.
  5. $ bash run_dio_pipeline.sh install_dio_pipeline
  6. Ensure that you can access and log into Kibana:
    • Access http://<MASTER_IP OR WORKER_IP>:32222 in your browser;
    • Default credentials are:
      • Username: "dio";
      • Password: "diopw";
    • Credentials can be changed in the group_vars/kube_cluster.yml file (variables dio_es_user and dio_es_pass).
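A hypothetical group_vars/kube_cluster.yml fragment overriding the defaults might look as follows. Only the variable names dio_es_user and dio_es_pass come from this guide; the values are illustrative.

```yaml
# group_vars/kube_cluster.yml (fragment) -- values are illustrative
dio_es_user: "dio"
dio_es_pass: "a-stronger-password"
```

Re-run the run_dio_pipeline.sh script after changing these values so the new credentials take effect.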

More information is available in the dio-pipeline folder.