Configuration

Bunny can be configured using environment variables. If you’re running Bunny as part of a composed stack of containers, these are set in the compose.yml. They can be found under

services:
  # other containers, if needed
  bunny:
    depends_on:
      - db
      - relay
      - rabbitmq
    build:
      context: app/bunny
      dockerfile: Dockerfile.daemon
    environment:
      - DATASOURCE_DB_USERNAME=postgres
      - DATASOURCE_DB_PASSWORD=postgres
      - DATASOURCE_DB_DATABASE=postgres
      - DATASOURCE_DB_DRIVERNAME=postgresql
      - DATASOURCE_DB_SCHEMA=public
      - DATASOURCE_DB_PORT=5432
      - DATASOURCE_DB_HOST=db
      - TASK_API_BASE_URL=http://relay:8080
      - TASK_API_USERNAME=username
      - TASK_API_PASSWORD=password
      - LOW_NUMBER_SUPPRESSION_THRESHOLD=
      - ROUNDING_TARGET=
      - POLLING_INTERVAL=5
      - COLLECTION_ID=collection_id

In development, if you’re running Bunny outside a container, you can instead set these with a .env file, as described in quickstart.

Task API

TASK_API_BASE_URL=<host>:<port>
TASK_API_USERNAME=<Task API username>
TASK_API_PASSWORD=<Task API password>

Relay as Task API host

To configure the bunny-daemon communicating with Relay, the TASK_API variables must be set. When configuring Relay , you will set a port to poll for jobs, a username and a password.

If Bunny and Relay are running in the same stack, the host should be http://relay.
If you have Relay running locally, and Bunny running locally, the host should be http://localhost
If you are not using Relay as Task API host, see below

Other Task API hosts

You need to change the TASK_API_BASE_URL value according to your actual Task API host where Bunny connected to GET the job and POST the results back.
The I/O requests from Bunny are authenticated by the TASK_API_USERNAME and TASK_API_PASSWORD, please put your Task API credentials here

Task Type

If you would like Bunny to connect directly with Rquest, you have to add one more configuration which is TASK_API_TYPE. This can only be either a (Availability), b (Distribution and PHEWAS) or c (AnalyticsGwas, AnalyticsGwasQuantitiveTrait, AnalyticsBurdenTest).

Database

The database connection is configured by the variables prefixed with DATASOURCE_DB

DATASOURCE_DB_USERNAME=<username>
DATASOURCE_DB_PASSWORD=<password>
DATASOURCE_DB_DATABASE=<database name - ignored if using Trino>
DATASOURCE_DB_DRIVERNAME=<driver name - ignored if using Trino>
DATASOURCE_DB_SCHEMA=<database schema name>
DATASOURCE_DB_PORT=<port served by database>
DATASOURCE_DB_HOST=<database host>
USE_TRINO=

If using the supplied compose.yml and running the database in the same stack, unless you modify the database configuration, the defaults will connect Bunny to your database. If using a remote database, change accordingly.

DATASOURCE_DB_DRIVERNAME

Defines the database driver to use. Currently supports PostgreSQL and SQL Server.

Valid values:

postgresql -> PostgreSQL
mssql -> SQL Server

DATASOURCE_DB_DRIVERNAME=postgres

Obfuscation

Two methods of result obfuscation are provided as below. Before building your Hutch tools stack, these can be configured in compose.yml.

Notice: The method will be applied when its related value is provided. When both values are provided, the Low Number Suppression will be applied first, then Rounding.

Low Number Suppression

There’s a data disclosure risk if low numbers are reported for a given query, so if the LOW_NUMBER_SUPPRESSION_THRESHOLD is provided, low number suppression will be applied to results, such that any result with a count below a given threshold will not be revealed.

LOW_NUMBER_SUPPRESSION_THRESHOLD=#Add your threshold here

Rounding

If the ROUNDING_TARGET is provided, results can be rounded to the nearest n

ROUNDING_TARGET=#Add your rounding target

Polling

This is set to the number of seconds provided in the environment indicating the polling interval duration. By default, Bunny will make GET request to Task API host every 5 seconds to check for jobs.

POLLING_INTERVAL=5

Collection ID

This is the collection ID you would like to query.

COLLECTION_ID=collection_id

Quickstart Deployment