Keycloak needs an admin group set up for it to work with Holocron.
## What are Collectors?
- What kind of collectors do I need?
- How many collectors do I need?
Collectors can be thought of as background jobs that run indefinitely. They run at a set interval and collect data from sources (GitLab, Jira, etc.).
### What kind of collectors do I need?
It depends on which sources the team uses (GitLab, Jira, etc.). For example, a team might use a Jira board at one hosted Jira domain and manage source code in GitLab at one hosted GitLab domain. That team would need one collector running for each domain.
### How many collectors do I need?
It depends on how many domains exist. For example, an organization with two hosted GitLab servers would run two instances of gitlab-scm-collector, one per server. For more information on how to set up a collector, see [Collector Environmental Variables](Collector Environmetal Variables).
Every collector needs the following environmental variables for setup:
...
...
- `DB_HOST` which is the Postgres database host.
- `DB_PORT` which is the Postgres database port.
- `DB_NAME` which is the Postgres database name.
- `COLLECTOR_NAME` which is the name of a particular running instance of a collector for a particular `TARGET_URL`. It must be unique in the database across all running collectors.
- `COLLECTOR_INTERVAL_SECONDS` which is the amount of time in seconds the collector sleeps before beginning the next round of collection.
- `TARGET_URL` which is the base domain URL that requests are sent to when collecting.
- `LOOK_BACK_DAYS` which represents how far back, measured in days, the collectors should collect data (for example, issues that were updated 365 days ago).
- `GO_RUN_ENV` which represents the environment type (can be "production" or "development" only).
- `COLLECTOR_TARGETS_INTRVL_SECS` which represents the amount of time that the top-level collection (collector target collection) must wait before collecting again, ideally 24 hours.