A cookiecutter template for data science projects within His Majesty's Government and wider public sector.
Office for National Statistics (ONS)
Other
https://www.gov.uk/government/organisations/office-for-national-statistics
Stars of active repositories
1,059
Active repositories
390
Live repositories
1,923
Unavailable repositories
392
Languages of active repositories
- 1. Python (32%)
- 2. Go (22%)
- 3. Java (12%)
Active: currently on GitHub, not archived, and pushed to within 180 days. Live: currently on GitHub. Unavailable: previously on GitHub but not currently found.
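The three statuses defined above can be read as a simple classification rule. The following is a minimal sketch of that rule, not the site's actual implementation: the `Repo` record, its field names, and the inclusive 180-day boundary are all assumptions made for illustration. It also assumes that active repositories are a subset of live ones, which the definitions suggest.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# "pushed to within 180 days" (treating the boundary as inclusive is an assumption)
ACTIVE_WINDOW = timedelta(days=180)


@dataclass
class Repo:
    """Hypothetical record for a repository that has been seen on GitHub at some point."""
    found_on_github: bool                 # is it currently found on GitHub?
    archived: bool = False                # has it been archived?
    pushed_at: Optional[datetime] = None  # time of the most recent push, if known


def classify(repo: Repo, now: datetime) -> set[str]:
    """Return the status labels that apply to a repository."""
    if not repo.found_on_github:
        # Previously on GitHub but not currently found.
        return {"unavailable"}
    labels = {"live"}
    recently_pushed = (
        repo.pushed_at is not None and now - repo.pushed_at <= ACTIVE_WINDOW
    )
    if not repo.archived and recently_pushed:
        labels.add("active")
    return labels


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    example = Repo(found_on_github=True, pushed_at=now - timedelta(days=30))
    print(sorted(classify(example, now)))
```

Under this reading, an archived repository that is still on GitHub counts as live but not active, however recently it was pushed to.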
GitHub accounts
ONSdigital, ONS-Innovation, best-practice-and-impact, datasciencecampus, ONS-OpenData, ONSBigData (inactive), GSS-Cogs (inactive)
Repositories
Showing all 390 active repositories, sorted by stars
Guidance for quality assurance of code for civil service researchers and analysts.
R package: generate best-practice stats spreadsheets for publication
Good Practice Tables - an XlsxWriter wrapper to write consistently formatted statistical tables to Excel.
Development website for collecting and disseminating UK data for the Sustainable Development Goal global indicators.
afcharts is an R package for creating accessible plots by the Government Analysis Function.
A suite of PySpark, Pandas, and general pipeline utils for ONS projects.
General information and standards for the ONS software engineering community of practice
A Python package for text-classification tasks associated with the production of official statistics, utilising semantic search and Retrieval Augmented Generation
The official repository of all publication ready ONS charts.
A tool for exploring neighbourhood level Census data tables on a map.
Statistical Methods Library
A Python package for creating accessible charts following Government Analysis Function guidance
Django Wagtail CMS for managing and publishing content for the Office for National Statistics (ONS)
This repository contains CROW, the Clerical Resolution Online Widget, an open-source project designed to help data linkers with their clerical matching needs!
Statistical Methods Library for Python Pandas methods used in SPP.
ONS Dissemination development team documentation.
Repository to store code for outputs that are produced from the Subnational Statistics and Analysis division.
Python and PySpark implementation of Goldstein et al.'s Scalelink method of data linkage.
Starter page templates using ONS svelte-components.
An application for authoring eQ questionnaires.
Quarto book site for the second edition of the AQuA Book. The book is now published. This site will be used for maintenance and revision of the book in future.
For any specific queries or detailed feedback, please contact us at [email protected]
User interface for Respondent Account Services
Website cataloguing Statistical Methods Library
This microservice manages surveys.
This data set family supports the Explore Subnational Statistics initiative of the Office for National Statistics and provides low level geography data across many indicators for the United Kingdom's four nations. For any specific queries or detailed feedback, please contact us at [email protected]
Address Index Matching Service Spark Job
This microservice manages collection exercises.
Resources and templates for documenting analytical workflows, assumptions and limitations.
Secure Messaging Service for Respondent Account Services.
Dataset API for customise my data
Automated Data Checker for Python
The ONS respondent account service 'collection instrument' micro service.
A package for cleaning, validating, extracting and contextualising addresses with PySpark.
ONS SDC Method for Cell Key Perturbation in R
Digital publishing resumable file upload service that handles on-the-fly encryption and writing to S3. It updates images through the CMS.
Go module implementing a Kafka client wrapper, abstracting messages to Go channels
ONS Address Index Matching Service containerised deployment for external users
Creates a frequency table and performs cell-key perturbation
Calculating statistics for national and regional research and development expenditure
A basic Python template to jump start a Python project
A template for scrolly projects with a .docx and ArchieML parser
A Flask application, designed to gather and report information on technologies used within ONS.
RAP minimum standard guidance for the Office for National Statistics
Search API for the ONS website search engine
Template that can be used when creating a new prototype-kit project.
An internally developed Digital Landscape platform that presents the organisation's repositories, projects, technology stack and GitHub Copilot usage.
Controller for handling datasets on the ONS website
Routes website requests to the appropriate apps
A collection of logging libraries to implement standardised logging
A Python package containing a set of functions used to expedite and streamline the data linkage process.
A MongoDB library for Go.
This microservice manages cases.
Reads CMD dataset dimensions and dimension options from the Dataset API and writes them to the graph DB.
A helpful command line tool to make Digital Publishing developers' lives better
Common code for SDC services that use JWE
An API for the Tech Audit Tool, designed to gather and report information on technologies used within ONS.
An AWS Lambda Function to collect historic GitHub Copilot usage metrics from the GitHub API.
Python script to collect data on repositories within the organisation to be used on the Digital Landscape.
Get data from the Land Registry SPARQL endpoint and process it
API for managing access control permissions for Digital Publishing API resources
The Survey Assist API - Used to access backend data processing services such as classification
SIC Classification Utilities
Python client for interacting with our Blaise REST API.
React component implementations of the ONS Design System
A prototyping kit that uses the ONS Design System
Docker containers for Concourse CI
A dashboard that displays useful information from multiple GCP projects.
Test components in isolation using Godog / Gherkin
A web application that provides insights into GitHub organisations to make managing them easier.
User documentation for the ONS Statistical Methods Library
An API for validating survey schemas
Digital Publishing controller for managing cookies.
Event driven nomad deployer service.
An ONS branded theme for MkDocs based on Material for MkDocs
SOC Classification Utilities
A dashboard to monitor SDC survey response rates
An HTTP service for controlling search queries
Repository for work-in-progress and infrequently used chart templates.
Common resources for ONS charts
A benchmarking tool for EQ Survey Runner
A collection of API clients written in Go and used for the ONS website
A service for managing the retrieval of files from public and private s3 buckets
An API used to manage the authorisation of users accessing data publishing services.
A project to assist in composing multiple DP services
Posts Slack alerts from status feeds.
Library of classification functionality associated with UK SOC (Standard Occupational Classification)
A tool to automatically populate the GDD CPD log from Cambridge Sparks OTJ log.
Static File Publisher
This microservice manages the survey sample.
An API used by Florence and internal microservices to create import datasets, which are then updated by other microservices
Schema definitions for the ONS system ecosystem
A Python package used to interact with the GitHub RESTful API
This is a Lambda function for the weekly retrieval of GitHub usernames and Organisation verified email addresses.
An application to track an organisation's compliance with ONS Usage Policy
AI Assist PoC (initially for TLFS)
Response rate dashboard for the Coding in Analysis and Research Survey (CARS) wave 3.
Fork of WorldPop's repo for Malawi census support
Bootstrap module to support the use of terraform template
Data processing for the ONS Visual Journalism catalogue of releases.
Repository holding tools and code snippets relating to all units on the Analysis for Action platform
GitHub Actions to format CPI items data
Frontend service for deploying questionnaire packages to Blaise
Library of classification functionality associated with UK SIC (Standard Industrial Classification)
The new version of the Secure Messaging Service for Respondent Account Services.
This repository is for the Python driven Management UI application for the CIR Converter service
UI for the TO to launch into CATI questionnaire cases.
PoC using the themeFinder package from i.Ai to analyse responses
Website to showcase the work of ONS
Repo for the eQ team to create prototypes using the ONS Design System
Enables greater flexibility in creating journeys through the website
Renders a table in multiple formats, given data in json format
The dis-bundle-api is a backend service for managing and publishing datasets and content as bundles, similar to Florence's collections.
User interface that complements survey-assist-themes
Copy of Census 2021 UI Prototype inside ONS Digital Org
A repository for the code supporting the paper "Are Multilateral Methods the Holy Grail of Price Measurement? A Critical Examination of Their Promises and Practical Performance in Dynamic Datasets", covering the perturbation test used in the paper.
An API for handling file meta data
PII validation library for Survey Assist
Utilities used as part of Survey Assist API or UI
This project identifies outliers in small-area Gross Disposable Household Income (GDHI) data at the Lower Layer Super Output Area (LSOA) level and applies adjustments to ensure the statistics are reliable and suitable for publication.
Welcome to the Dissemination Engineering Standards repository
Learn RAP principles through practical Python exercises. This repo showcases best practices for Reproducible Analytical Pipelines, with descriptions of components and exercises to help users build confidence in applying RAP techniques.
Code to identify potential linkage bias by comparing features of linked vs unlinked data
User interface for accessing Survey Assist backend
The Reproducible Analytical Pipeline (RAP) used to produce the weekly deaths dataset.
Component that breaks down the 'Databakered' CSV into individual rows and writes them to a Kafka queue.
Takes a CMD filter job and produces a filtered dataset CSV.
Resource api for customise my data
Supply Cantabular metadata for Florence metadata journey.
Service that consumes CMD observations to import from Kafka and writes them to the Graph DB.
Tracks CMD dataset imports, recording the completion of the import steps to the Dataset API and then updating the import API job state to completed.
Dynamic population modelling using Template Model Builder
Generates the static HTML for https://developer.ons.gov.uk/
A Python utility used to archive old, unused GitHub repositories from an organisation.
Analysis for the coding in analysis and research survey.
WIP for a Python package that wraps govcookiecutter templates into zip files, making it easier to update versions in locked-down environments
A workflow to automate module updates in terraform projects
Python pipeline classifying free-text survey responses into ISCO/ISIC schemes
ONS Data Scientist Learning Pathway
Proof of concept for auto-documentation of code using LLMs.
A benchmarking tool for Census EQ Questionnaire Runner
Get mortgage interest rates from the Bank of England api
An API for validating survey schemas
Action to scrape CPI weights for use in the ONS inflation calculator
A generic interface to a vector store used for SIC classification
A Go API for data migration.
A generic vector store and api used by survey assist. This can deploy either an industry, occupation or search as you type vector store
Azure Pipeline to backup Blaise users from local SQLite DB
A Python client for the blaise-data-delivery-status API
Cloud Function to perform point-in-time restore for specific questionnaire data.
Jupyter Notebook for GCP log investigation
Services for interacting with Totalmobile's APIs.
Lightweight Node.js provider for Google IAP ID tokens
This repository contains functionality for checking questionnaire survey days and automating daybatch creation.
Node.js client for our Blaise UAC Service
Node.js client for our Blaise REST API
Slack alerts for data delivery issues
Cloud Function for processing receipts from NiFi
Publish file metadata to a Pub/Sub topic when uploaded to a bucket.
Functions to copy questionnaire data from NISRA SFTP to GCP storage bucket.
A registry of questionnaire schemas for eq-questionnaire-runner.
Scripts for translating eq-questionnaire-runner surveys
A registry of census questionnaire schemas for census31-eq-questionnaire-runner.
Definitions and specifications for the 2031 Census EQ interfaces
This project is a demo of the design system implemented in Python and Flask.
A Go Service to redirect legacy URLs requested by users
A Go API for management of redirects.
HTML dashboard produced using Quarto displaying data from the Office for National Statistics' Health Insight Survey, commissioned by NHS England.
REST API for managing cache control information for pages within the legacy CMS
Periodic nomad job for checking integrity of zebedee workspace
Proxy for handling the cache for pages within the legacy CMS
Digital Publishing image management API
Provides a GET endpoint to generate and download files
Controller for handling homepage on ONS website
Release Calendar frontend controller
API for managing the release calendar
A generic interface to a vector store used for SOC classification
Service to retrieve data to update search index
Service to store searchable content into elasticsearch
Search Reindexing Batch Job
A Golang microservice that expands industry codes, output area codes, identifies survey codes and years in the search query and returns them separately
This contains the code for the dp feedback api
This is a demo R pipeline that uses the `crvs.pkg` functionality to process and analyse data into publication-ready tables.
To allow user feedback on the ONS website
R package for the production of vital statistics from civil registration data. The package handles data validation, processing/derivation, aggregation and analysis.
Public repository for the ESPRESSO Python pipeline
A questionnaire launcher for census31-eq-survey-runner
This project is the Collection Instrument Registry. It will manage the storage and versioning of Electronic Questionnaires used by the EQ services.
Census 2031 FWMT: shared Maven parent BOM
Microservice to automatically process sample jobs
User Interface for respondents to access ONS Survey Data Collection questionnaires and services
Automated quality assurance methods for demographic outputs
RM Census acceptance tests
A Java client library for the dp-image-api
A Java client library for sending Slack messages
Data pipeline for ONS Beta website
Census 2031 FWMT: census31-fwmt-job-service (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-outcome-service (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-tm-mock (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-storage-utils (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-performance-tests (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-perf-msg-builder (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-gateway-version-tracker (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-fulfillment-event-service (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-events (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-csv-service (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-canonical (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-acceptance-tests (seeded from 2021, FMT-4)
Census 2031 FWMT: census31-fwmt-common (seeded from 2021, FMT-4)
A Java JSON Web Token (JWT) verification library
Encrypted HTTP Multipart Upload
Access Java resources as .properties, File, XML or String
Brutally opinionated REST micro-framework.
Address Index Acceptance tests
A service for loading data into the Supplementary Data Service (SDS).
A scheduler application for automating the purging of cache for paths to be published.
Spring Boot Microservice for AIMS bulk match solution
A simple FastAPI to mock the SDS service required by eq-runner.
A simple FastAPI to mock the CIR service required by eq-runner
Terraform infrastructure to generate load test against eq-survey-runner
A wrapper for govulncheck to allow exclusions
tidysheet takes messy Excel data and converts it to tidy data with a single value per row.
Generate accessible ODS spreadsheets within a web browser or using Node.js
A questionnaire launcher for eq-survey-runner
Proof of concept for using svelteplot library to replace svelte-charts for advanced data visualisation products
Shared/common classes which are used to validate sample data for Survey Collection & Ingestion
Fork of davidcarboni/cryptolite
This repository holds the code for the data-api bundle scheduler, which is used to automate the publication of datasets included within a bundle.
This repository contains the locust performance testing components for the SDS application.
Fake upstream search stub to mimic upstream services using a generic search contract to integrate with our search stack
A working-level exercise in data linkage, demonstrating how to clean and link data in Python
An interactive linkage notebook. This goes through an entire linkage pipeline, encouraging participants to alter match thresholds and variable weighting to maximise linkage quality.
Frontend svelte app for ONS visual journalism's searchable release catalogue
This project is a demo of the design system implemented in Python and Flask.
sds-common is a Python library for common functionality used to interact with SDS
This repo contains the Terraform to set up Author in GCP.
A new auth service stub to ensure we can test all actions associated with login/authentication
Dissemination client library for Redis or similar protocols.
A Java Client Library for the dp-files-api
A rendering library for Dissemination frontend go microservices. dis-design-system-go contains template, localisations, model structs, css and javascript that are core to all dissemination frontend services.
A Java client for the dp-dataset-api service
Feasibility research looking into the use of administrative data to predict disability status
Learn RAP principles through practical R exercises. This repo showcases best practices for Reproducible Analytical Pipelines, with descriptions of components and exercises to help users build confidence in applying RAP techniques.
Frontend service to host filter and flexing of datasets
Search Test bed
upload imf files
Helps you cache data for dp-renderer in your frontends.
elasticsearch client
A central repository for the digital publishing import process.
Extracts dimensions and dimension options from the uploaded CMD CSV dataset files and writes them to the Dataset API
dp-cantabular-metadata-exporter
Consumes a Kafka message (cantabular-export-start) to begin the process of retrieving the counts from Cantabular.
REST API providing querying of CMD dataset observations with limited filtering capability.
Begins the process of retrieving categories from the Cantabular API to start populating dimensions in our datastore
Consumes error messages from the CMD import process and reports them to the Dataset API
Extracts dimension options (aka categories) from the Cantabular server and writes them to the dataset API.
A new version of `eq-publisher` for translating questionnaires built in Author into Runner V3 format
The Pandemic Preparedness Toolkit (PPT) is a five-year project (2023 to 2028), funded by Wellcome, which aims to co-create a sustainable, online Toolkit that will build capacity for infectious disease surveillance in National Statistical Offices (NSOs).
Repository for Google Cloud Functions used with SDS
POC middleware to forward GitHub Dependabot alerts
A REST API wrapper around the Cantabular Extended GraphQL API allowing querying of information about observations, variables, and categories.
A registry for supplementary dataset schemas for the SDS service.
A series of functions used to check adherence to the GitHub Usage Policy.
A template repository for all new KEH repositories.
Flask survey from config PoC
test to raise Dependabot PRs on template repo
check Dependabot updates
A test repository to collect the team's efforts to solve some of the puzzles from last year's Advent of Code (2024) using Backstage.