Configuration¶
Configuration Overrides¶
The default configuration values are defined in the module’s config.py file. These defaults can be overridden in the top-level invenio.cfg file of an InvenioRDM instance or as environment variables.
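As a rough illustration of the environment-variable route: InvenioRDM instances conventionally read variables that carry an INVENIO_ prefix and evaluate their values as Python literals where possible. The helper below is a stand-alone sketch of that convention, not the actual loader; the function name load_env_overrides is hypothetical.

```python
import ast


def load_env_overrides(environ, prefix="INVENIO_"):
    """Collect config overrides from prefixed environment variables."""
    overrides = {}
    for name, raw in environ.items():
        if not name.startswith(prefix):
            continue
        key = name[len(prefix):]
        try:
            # "False" becomes False, "30" becomes 30, and so on.
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a Python literal stays a plain string.
            value = raw
        overrides[key] = value
    return overrides


env = {
    "INVENIO_COMMUNITY_STATS_ENABLED": "False",
    "INVENIO_COMMUNITY_STATS_CATCHUP_INTERVAL": "30",
    "PATH": "/usr/bin",
}
print(load_env_overrides(env))
```

Variables without the prefix (PATH here) are ignored, so only settings meant for the instance are picked up.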
Module Enable/Disable¶
The entire community stats dashboard module can be enabled or disabled using the COMMUNITY_STATS_ENABLED
configuration variable:
# Disable the module completely
COMMUNITY_STATS_ENABLED = False
When disabled:
Scheduled tasks will not run: No automatic aggregation or migration tasks
CLI commands will fail: All commands will show an error message
Services will not be initialized: No event tracking or statistics services
Menus will not be registered: No dashboard menu items
Components will not be added: No event tracking components
Note: This is a global on/off switch. When disabled, the module will not modify the instance in any way.
Scheduled Tasks Enable/Disable¶
Scheduled aggregation tasks can be controlled separately using the COMMUNITY_STATS_SCHEDULED_TASKS_ENABLED
configuration variable:
# Enable the module but disable scheduled tasks
COMMUNITY_STATS_ENABLED = True
COMMUNITY_STATS_SCHEDULED_TASKS_ENABLED = False
When scheduled tasks are disabled:
Scheduled aggregation tasks will not run: No automatic daily/weekly aggregation
CLI aggregation commands will fail: The aggregate command will show an error unless --force is used
Manual aggregation with --force: You can still run aggregation manually using invenio community-stats aggregate --force
All other functionality remains: Event tracking, migration, and other features work normally
This allows you to enable the module for manual operations while preventing automatic background tasks. The --force
flag bypasses the scheduled tasks check and allows manual aggregation even when scheduled tasks are disabled.
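The interaction between the two switches and --force can be summarised in a small decision function. This is only a sketch of the behaviour described above: the function name is hypothetical, and the assumption that both variables default to True is mine, not something the module states.

```python
def aggregation_allowed(config, force=False):
    """Return True if a manual `aggregate` run should proceed."""
    # The global switch wins: when the module is disabled, all commands fail.
    if not config.get("COMMUNITY_STATS_ENABLED", True):
        return False
    # --force bypasses only the scheduled-tasks check.
    if force:
        return True
    return config.get("COMMUNITY_STATS_SCHEDULED_TASKS_ENABLED", True)


config = {
    "COMMUNITY_STATS_ENABLED": True,
    "COMMUNITY_STATS_SCHEDULED_TASKS_ENABLED": False,
}
print(aggregation_allowed(config))              # blocked without --force
print(aggregation_allowed(config, force=True))  # allowed with --force
```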
View/Download event migration¶
The following configuration variables control the default behavior of migration commands:
STATS_DASHBOARD_REINDEXING_MAX_BATCHES = 1000 # Maximum number of batches to process per month
STATS_DASHBOARD_REINDEXING_BATCH_SIZE = 1000 # Number of events to process per batch
STATS_DASHBOARD_REINDEXING_MAX_MEMORY_PERCENT = 75 # Maximum memory usage percentage before stopping
These defaults can be overridden using the corresponding CLI options when running the migrate-events
command.
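To get a feel for how the batch limits combine, the sketch below estimates how many batches one month of events would need and how many events would be left over once the batch cap is reached. It is a back-of-the-envelope model with a hypothetical function name; it does not model the memory limit, and what the real command does with leftover events is not covered here.

```python
import math


def plan_month(total_events, batch_size=1000, max_batches=1000):
    """Return (batches to run, events left unprocessed) for one month."""
    needed = math.ceil(total_events / batch_size)
    batches = min(needed, max_batches)
    processed = min(total_events, batches * batch_size)
    return batches, total_events - processed


print(plan_month(2_500))      # small month: everything fits
print(plan_month(1_200_000))  # large month: the batch cap is reached
```

With the default values, a single pass covers at most 1000 × 1000 = 1,000,000 events per month.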
Task scheduling and aggregation¶
The following configuration variables control the scheduling and behavior of aggregation tasks:
from invenio_stats_dashboard.tasks import CommunityStatsAggregationTask
COMMUNITY_STATS_CELERYBEAT_SCHEDULE = {
    "stats-aggregate-community-record-stats": {
        **CommunityStatsAggregationTask,
    },
}
"""Celery beat schedule for aggregation tasks."""
COMMUNITY_STATS_CATCHUP_INTERVAL = 365
"""Maximum number of days to catch up when aggregating historical data."""
Aggregation task locking¶
The following configuration variables control the locking mechanism for the aggregation task:
STATS_DASHBOARD_LOCK_CONFIG = {
    "enabled": True,  # Enable/disable distributed locking
    "lock_timeout": 86400,  # Lock timeout in seconds (24 hours)
    "lock_name": "community_stats_aggregation",  # Lock name
}
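The role of lock_name and lock_timeout can be illustrated with a toy timeout-based lock. This is a minimal in-memory sketch, assuming a dict as the shared store; the class name is hypothetical, and a real distributed lock would need an atomic set-if-absent operation on a shared backend rather than a plain dict.

```python
import time


class SimpleTimeoutLock:
    """Toy lock that expires on its own after `timeout` seconds."""

    def __init__(self, store, name, timeout, clock=time.monotonic):
        self.store = store
        self.name = name
        self.timeout = timeout
        self.clock = clock

    def acquire(self):
        now = self.clock()
        expires_at = self.store.get(self.name)
        if expires_at is not None and expires_at > now:
            return False  # Another run still holds the lock.
        self.store[self.name] = now + self.timeout
        return True

    def release(self):
        self.store.pop(self.name, None)


lock_config = {
    "enabled": True,
    "lock_timeout": 86400,
    "lock_name": "community_stats_aggregation",
}
fake_time = [0]  # Controllable clock so the example is deterministic.
lock = SimpleTimeoutLock(
    {}, lock_config["lock_name"], lock_config["lock_timeout"],
    clock=lambda: fake_time[0],
)
first = lock.acquire()   # First run takes the lock.
second = lock.acquire()  # Overlapping run is refused.
fake_time[0] = 86401     # Just past the timeout.
third = lock.acquire()   # A stale lock no longer blocks new runs.
print(first, second, third)
```

The timeout acts as a safety net: if a run crashes without releasing the lock, later runs are blocked for at most lock_timeout seconds.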
Default range options¶
The following configuration variable controls the default date range options for the dashboard. The keys represent the available granularity levels for the date range selector and cannot be changed. The values represent the default date range for each granularity level.
STATS_DASHBOARD_DEFAULT_RANGE_OPTIONS = {
    "day": "30days",
    "week": "12weeks",
    "month": "12months",
    "quarter": "4quarters",
    "year": "5years",
}
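Each value combines a count with a unit. The helper below shows one way such a string could be split into its two parts; it is a hypothetical sketch for checking custom values, and the dashboard itself may interpret these strings differently.

```python
import re

RANGE_PATTERN = re.compile(r"^(\d+)(days|weeks|months|quarters|years)$")


def parse_range(value):
    """Split a range option such as "30days" into (30, "days")."""
    match = RANGE_PATTERN.match(value)
    if match is None:
        raise ValueError(f"Unrecognised range option: {value!r}")
    return int(match.group(1)), match.group(2)


print(parse_range("30days"))
print(parse_range("4quarters"))
```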
Dashboard layout and components¶
The layout and components for the dashboard are configured via the STATS_DASHBOARD_LAYOUT configuration variable. This is a dictionary that maps dashboard types (currently global and community) to layout configurations. Each layout configuration contains a list of tabs. Each tab contains a list of rows that group components together, and each component can specify its width with a “width” key.
For example, the default global layout configuration is:
STATS_DASHBOARD_LAYOUT = {
    "global": {
        "tabs": [
            {
                "name": "content",
                "label": "Content",
                "rows": [
                    {
                        "name": "date-range-selector",
                        "components": [{"component": "DateRangeSelector", "width": 16}],
                    },
                    {
                        "name": "single-stats",
                        "components": [
                            {"component": "SingleStatRecordCount", "width": 3},
                            {"component": "SingleStatUploaders", "width": 3},
                            {"component": "SingleStatDataVolume", "width": 3},
                        ],
                    },
                    {
                        "name": "charts",
                        "components": [
                            {"component": "StatsChart", "width": 8},
                        ],
                    },
                    {
                        "name": "tables",
                        "components": [
                            {"component": "ResourceTypesTable", "width": 8},
                            {"component": "AccessStatusTable", "width": 8},
                            {"component": "RightsTable", "width": 8},
                            {"component": "AffiliationsTable", "width": 8},
                        ],
                    },
                ],
            },
        ],
    },
}
If no layout configuration is provided for a dashboard type, the default “global” layout configuration will be used.
Any additional key/value pairs in the dictionary for a component will be passed to the component class as additional props. This allows for some customization of the component without having to subclass and override the component class.
The component labels used for the layout configuration are defined in the components_map.js file, where they are mapped to the component classes.
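The fallback to the “global” layout and the handling of extra props can be pictured with the sketch below. Both helper functions are hypothetical, and the "title" prop on StatsChart is a made-up example of an additional key; neither is taken from the module.

```python
def resolve_layout(layouts, dashboard_type):
    """Pick the layout for a dashboard type, falling back to "global"."""
    return layouts.get(dashboard_type) or layouts["global"]


def iter_components(layout):
    """Yield (component label, width, extra props) for every component."""
    for tab in layout.get("tabs", []):
        for row in tab.get("rows", []):
            for spec in row.get("components", []):
                extra = {
                    key: value
                    for key, value in spec.items()
                    if key not in ("component", "width")
                }
                yield spec["component"], spec.get("width"), extra


layouts = {
    "global": {
        "tabs": [
            {
                "name": "content",
                "label": "Content",
                "rows": [
                    {
                        "name": "charts",
                        "components": [
                            # "title" is a made-up extra prop for illustration.
                            {"component": "StatsChart", "width": 16, "title": "Uploads"},
                        ],
                    },
                ],
            },
        ],
    },
}

# No "community" entry exists, so the global layout is used instead.
community_layout = resolve_layout(layouts, "community")
print(list(iter_components(community_layout)))
```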
Routes¶
The routes for the dashboard are defined by the STATS_DASHBOARD_ROUTES configuration variable. This is a dictionary that maps dashboard types (currently global and community) to route strings.
For example, the default routes are:
STATS_DASHBOARD_ROUTES = {
    "global": "/stats",
    "community": "/communities/<community_id>/stats",
}
Templates¶
The templates for the dashboard are defined by the STATS_DASHBOARD_TEMPLATES configuration variable. This is a dictionary that maps dashboard types (currently global and community) to template paths, along with a macro entry for the dashboard macro template.
For example, the default templates are:
STATS_DASHBOARD_TEMPLATES = {
    "macro": "invenio_stats_dashboard/macros/stats_dashboard_macro.html",
    "global": "invenio_stats_dashboard/stats_dashboard.html",
    "community": "invenio_stats_dashboard/community_stats_dashboard.html",
}
UI Configuration¶
The UI configuration for the dashboard is defined by the STATS_DASHBOARD_UI_CONFIG configuration variable. This is a dictionary that maps dashboard types (currently global and community) to a dictionary of configuration options.
For example, the default UI configuration is:
from invenio_i18n import lazy_gettext as _  # needed for the _() calls below

STATS_DASHBOARD_UI_CONFIG = {
    "global": {
        "title": _("Statistics"),
        "description": _("This is the global stats dashboard."),
        "maxHistoryYears": 15,
        "default_granularity": "month",
        "show_title": True,
        "show_description": False,
    },
    "community": {
        "title": _("Statistics"),
        "description": _("This is the community stats dashboard."),
        "maxHistoryYears": 15,
        "default_granularity": "month",
        "show_title": True,
        "show_description": False,
    },
}
Title and description display¶
The title and description are displayed in different places on the global and community dashboards. On the global dashboard they appear in the page subheader; on the community dashboard they appear at the top of the dashboard sidebar.
The show_title and show_description options can be used to control whether the title and description are displayed for the global and community dashboards.
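When only one or two options need to change, it can be convenient to start from a copy of the defaults rather than retyping the whole dictionary. The sketch below shows that pattern with an abbreviated, plain-string copy of the defaults; the helper with_overrides is hypothetical and not part of the module.

```python
import copy

# Abbreviated stand-in for the default UI configuration shown above.
DEFAULT_UI_CONFIG = {
    "global": {"title": "Statistics", "show_title": True, "show_description": False},
    "community": {"title": "Statistics", "show_title": True, "show_description": False},
}


def with_overrides(defaults, overrides):
    """Return a copy of `defaults` with per-dashboard options replaced."""
    merged = copy.deepcopy(defaults)
    for dashboard_type, options in overrides.items():
        merged.setdefault(dashboard_type, {}).update(options)
    return merged


# Show the description on community dashboards only.
STATS_DASHBOARD_UI_CONFIG = with_overrides(
    DEFAULT_UI_CONFIG,
    {"community": {"show_description": True}},
)
print(STATS_DASHBOARD_UI_CONFIG["community"])
```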
Subcount Configuration¶
The following configuration variables control how subcount breakdowns are generated and displayed:
COMMUNITY_STATS_TOP_SUBCOUNT_LIMIT¶
This variable controls the maximum number of items returned in subcount breakdowns (e.g., “Top 20 Resource Types”). This helps prevent overwhelming the UI with too many items and improves performance.
COMMUNITY_STATS_TOP_SUBCOUNT_LIMIT = 20
COMMUNITY_STATS_SUBCOUNT_CONFIGS¶
This variable defines the configuration for different subcount breakdown types, including field mappings and display options.
COMMUNITY_STATS_SUBCOUNT_CONFIGS = {
    "by_resource_types": {
        "field": "metadata.resource_type.id",
        "label": "Resource Type",
        "display_field": "metadata.resource_type.title",
    },
    "by_subjects": {
        "field": "metadata.subjects.subject",
        "label": "Subject",
        "display_field": "metadata.subjects.subject",
    },
    # ... other subcount configurations
}
STATS_DASHBOARD_UI_SUBCOUNTS¶
This variable controls which subcount breakdowns are available in the UI and how they are displayed.
STATS_DASHBOARD_UI_SUBCOUNTS = {
    "by_resource_types": {},
    "by_subjects": {},
    "by_languages": {},
    "by_rights": {},
    "by_funders": {},
    "by_periodicals": {},
    "by_publishers": {},
    "by_affiliations": {
        "combine": ["by_affiliations_creator", "by_affiliations_contributor"]
    },
    "by_countries": {},
    "by_referrers": {},
    "by_file_types": {},
    "by_access_statuses": {},
}
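One plausible reading of the "combine" option, together with the top-item limit, is that counts from the listed breakdowns are summed per item and the result is then truncated. The sketch below illustrates that reading with invented sample counts and a hypothetical helper; the actual merging logic lives in the module and may differ.

```python
from collections import Counter


def combine_subcounts(subcounts, sources, limit=20):
    """Sum counts from several breakdowns and keep the top `limit` items."""
    total = Counter()
    for name in sources:
        total.update(subcounts.get(name, {}))
    return total.most_common(limit)


# Invented sample data for illustration only.
subcounts = {
    "by_affiliations_creator": {"CERN": 5, "MIT": 2},
    "by_affiliations_contributor": {"CERN": 1, "ETH": 4},
}
top = combine_subcounts(
    subcounts,
    ["by_affiliations_creator", "by_affiliations_contributor"],
    limit=2,
)
print(top)
```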
Test Data Mode¶
STATS_DASHBOARD_USE_TEST_DATA¶
This variable enables test data mode for development and testing purposes. When enabled, the dashboard will use synthetic data instead of making API calls to the statistics service.
STATS_DASHBOARD_USE_TEST_DATA = True
Note: This should be set to False in production environments to ensure real statistics data is displayed.
Event Processing Configuration¶
STATS_EVENTS¶
This variable defines the event types and their configurations for statistics processing. It controls which events are tracked and how they are processed.
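# Simplified illustration. The processors are listed under a single key because a
# repeated key in a dict literal keeps only its last value. The "params" and
# "preprocessors" names assume the invenio-stats layout for event configuration.
STATS_EVENTS = {
    "file-download": {
        "params": {
            "preprocessors": [
                "invenio_stats.processors.flag_robots",
                "invenio_stats.processors.flag_machines",
                "invenio_stats.processors.anonymize_user",
            ],
        },
    },
    "record-view": {
        "params": {
            "preprocessors": [
                "invenio_stats.processors.flag_robots",
                "invenio_stats.processors.flag_machines",
                "invenio_stats.processors.anonymize_user",
            ],
        },
    },
}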
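Conceptually, the processors named in STATS_EVENTS are applied to each event one after another. The sketch below shows that chaining pattern with two toy stand-ins; flag_robots_demo and anonymize_user_demo are hypothetical simplifications written for this example, not the invenio_stats implementations.

```python
def run_preprocessors(event, preprocessors):
    """Pass an event through each processor; stop if one drops it."""
    for preprocess in preprocessors:
        event = preprocess(event)
        if event is None:
            return None
    return event


def flag_robots_demo(event):
    """Toy robot check based on the user agent string."""
    agent = event.get("user_agent", "").lower()
    event["is_robot"] = "bot" in agent
    return event


def anonymize_user_demo(event):
    """Toy anonymiser that strips identifying fields."""
    event.pop("user_id", None)
    event.pop("ip_address", None)
    return event


event = {"user_agent": "ExampleBot/1.0", "user_id": 7, "ip_address": "192.0.2.1"}
result = run_preprocessors(event, [flag_robots_demo, anonymize_user_demo])
print(result)
```

Order matters in such a chain: a processor that needs identifying fields has to run before the one that removes them.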
Auto-Generated Configuration¶
The following configuration variables are automatically generated by the module and typically do not need manual configuration:
COMMUNITY_STATS_AGGREGATIONS¶
This variable contains the aggregation configurations for all statistics aggregators. It is automatically populated by the register_aggregations() function and includes configurations for record counts, usage statistics, and other metrics.
COMMUNITY_STATS_QUERIES¶
This variable contains the query configurations for accessing statistics data. It is automatically populated and includes configurations for different types of statistics queries.
Configuration Reference¶
The following table provides a complete reference of all available configuration variables:
Variable | Default | Description |
---|---|---|
COMMUNITY_STATS_ENABLED | | Enable/disable the entire module |
COMMUNITY_STATS_SCHEDULED_TASKS_ENABLED | | Enable/disable scheduled aggregation tasks |
COMMUNITY_STATS_CELERYBEAT_SCHEDULE | {...} | Celery beat schedule for aggregation tasks |
COMMUNITY_STATS_CATCHUP_INTERVAL | 365 | Maximum days to catch up when aggregating historical data |
COMMUNITY_STATS_AGGREGATIONS | | Aggregation configurations (auto-generated) |
COMMUNITY_STATS_QUERIES | | Query configurations (auto-generated) |
COMMUNITY_STATS_TOP_SUBCOUNT_LIMIT | 20 | Maximum number of items to return in subcount breakdowns |
COMMUNITY_STATS_SUBCOUNT_CONFIGS | {...} | Configuration for subcount breakdowns and field mappings |
STATS_DASHBOARD_UI_SUBCOUNTS | {...} | UI subcount configuration for different breakdown types |
STATS_DASHBOARD_LOCK_CONFIG | {...} | Distributed locking configuration for aggregation tasks |
STATS_DASHBOARD_TEMPLATES | {...} | Template paths for dashboard views |
STATS_DASHBOARD_ROUTES | {...} | URL routes for dashboard pages |
STATS_DASHBOARD_UI_CONFIG | {...} | UI configuration for dashboard appearance and behavior |
STATS_DASHBOARD_DEFAULT_RANGE_OPTIONS | {...} | Default date range options for different granularities |
STATS_DASHBOARD_LAYOUT | {...} | Dashboard layout and component configuration |
 | | Enable/disable menu integration |
 | | Menu item text |
 | | Menu item order |
 | | Menu item endpoint |
 | | Custom menu registration function |
STATS_DASHBOARD_USE_TEST_DATA | | Enable/disable test data mode for development |
STATS_DASHBOARD_REINDEXING_MAX_BATCHES | 1000 | Maximum batches per month for migration |
STATS_DASHBOARD_REINDEXING_BATCH_SIZE | 1000 | Events per batch for migration. Note: OpenSearch has a hard limit of 10,000 documents for search results, so this value cannot exceed 10,000. |
STATS_DASHBOARD_REINDEXING_MAX_MEMORY_PERCENT | 75 | Maximum memory usage percentage before stopping migration |
STATS_EVENTS | {...} | Event type configurations for statistics processing |
Note: Variables marked with {...} contain complex configuration objects that are documented in detail in the sections above.