Configuration
Bunny can be configured using environment variables.
If you’re running Bunny as part of a composed stack of containers, these are set in the compose.yml
.
They can be found under
services:
# other containers, if needed
bunny:
depends_on:
- db
- relay
- rabbitmq
build:
context: app/bunny
dockerfile: Dockerfile.daemon
environment:
- DATASOURCE_DB_USERNAME=postgres
- DATASOURCE_DB_PASSWORD=postgres
- DATASOURCE_DB_DATABASE=postgres
- DATASOURCE_DB_DRIVERNAME=postgresql
- DATASOURCE_DB_SCHEMA=public
- DATASOURCE_DB_PORT=5432
- DATASOURCE_DB_HOST=db
- TASK_API_BASE_URL=http://relay:8080
- TASK_API_USERNAME=username
- TASK_API_PASSWORD=password
- LOW_NUMBER_SUPPRESSION_THRESHOLD=
- ROUNDING_TARGET=
- POLLING_INTERVAL=5
- COLLECTION_ID=collection_id
In development, if you’re running Bunny outside a container, you can instead set these with a .env
file, as described in quickstart.
Task API
TASK_API_BASE_URL=<host>:<port>
TASK_API_USERNAME=<Task API username>
TASK_API_PASSWORD=<Task API password>
Relay as Task API host
To configure the bunny-daemon communicating with Relay, the TASK_API
variables must be set.
When configuring Relay , you will set a port to poll for jobs, a username and a password.
- If Bunny and Relay are running in the same stack, the host should be
http://relay
. - If you have Relay running locally, and Bunny running locally, the host should be
http://localhost
- If you are not using Relay as Task API host, see below
Other Task API hosts
- You need to change the
TASK_API_BASE_URL
value according to your actual Task API host where Bunny connected toGET
the job andPOST
the results back. - The I/O requests from Bunny are authenticated by the
TASK_API_USERNAME
andTASK_API_PASSWORD
, please put your Task API credentials here
Job Type
- If you would like Bunny to connect directly with Rquest, you have to add one more configuration which is
TASK_API_TYPE
. This can only be eithera
(Availability),b
(Distribution and PHEWAS) orc
(AnalyticsGwas, AnalyticsGwasQuantitiveTrait, AnalyticsBurdenTest).
Database
The database connection is configured by the variables prefixed with DATASOURCE_DB
DATASOURCE_DB_USERNAME=<username>
DATASOURCE_DB_PASSWORD=<password>
DATASOURCE_DB_DATABASE=<database name - ignored if using Trino>
DATASOURCE_DB_DRIVERNAME=<driver name - ignored if using Trino>
DATASOURCE_DB_SCHEMA=<database schema name>
DATASOURCE_DB_PORT=<port served by database>
DATASOURCE_DB_HOST=<database host>
USE_TRINO=
If using the supplied compose.yml
and running the database in the same stack, unless you modify the database configuration, the defaults will connect Bunny to your database.
If using a remote database, change accordingly.
Obfuscation
Two methods of result obfuscation are provided as below. Before building your Hutch tools stack, these can be configured in compose.yml
.
Notice: The method will be applied when its related value is provided. When both values are provided, the Low Number Suppression will be applied first, then Rounding.
Low Number Suppression
There’s a data disclosure risk if low numbers are reported for a given query, so if the LOW_NUMBER_SUPPRESSION_THRESHOLD
is provided, low number suppression will be applied to results, such that any result with a count below a given threshold will not be revealed.
LOW_NUMBER_SUPPRESSION_THRESHOLD=#Add your threshold here
Rounding
If the ROUNDING_TARGET
is provided, results can be rounded to the nearest n
ROUNDING_TARGET=#Add your rounding target
Polling
This is set to the number of seconds provided in the environment indicating the polling interval duration.
By default, Bunny will make GET
request to Task API host every 5 seconds to check for jobs.
POLLING_INTERVAL=5
Collection ID
This is the collection ID you would like to query.
COLLECTION_ID=collection_id