This is used to build tools which process and standardize data.
youainti 123fe3b5e4 Merge completed: Merged working versions from home and office PCs 3 years ago

ClinicalTrialsDataProcessing

This represents my

Prerequisites

Python >= 3.10 (requires match statement)
Docker >= 20.10
Curl >= 7
Just >= 1.9

Usage

Basic usage

Check prerequisites

just check-status
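As a rough illustration of what a prerequisite check involves, the sketch below looks for the required command-line tools on PATH. It is not the implementation of the check-status recipe; the function name find_missing and the tool list are assumptions made for this example, taken from the Prerequisites section.

```python
import shutil

# Assumption: these executable names mirror the Prerequisites list above;
# the real check-status recipe may test different things (e.g. versions).
REQUIRED_TOOLS = ("docker", "curl", "just")


def find_missing(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

This only confirms that the executables exist; it does not verify the minimum versions listed above.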

Set up the underlying AACT database, including downloading both the AACT dump and the historical data.

just create
just select-trials
just count=1000 get-histories

Replace the 1000 in count=1000 with the number of trials you want to download.

Advanced Usage

If you need to reset the database without downloading the AACT dump again:

just rebuild
just select-trials
just count=1000 get-histories

Description of all the just recipes

Background information

This is designed to run on a Linux machine with bash. If you use a shell other than bash, you should know what is needed to run all of this under bash.

If any of the discussions below don't make sense, talk to your sysadmin, a local Linux user, or reach out to the author.

Just installation

I use the command runner just to automate and simplify setting up the Docker containers and running many of the Python scripts. It is similar to make in many ways but is designed to do less.

Just can be installed from https://github.com/casey/just/

Python installation

This requires Python 3.10 or above due to the use of match-case statements in the HTML parser.

Check which version of Python you have by typing python --version. If you do not have the required version, I would recommend installing the conda Python manager and setting up a conda environment with Python 3.10. Instructions for doing so are available online.

Docker and Postgres

Docker is a tool to manage and run OCI containers. For this project, that means Docker makes it easy to set up the containers it relies on.

Install Docker following the instructions for your Linux distribution. I use podman (an alternative from Red Hat) because it allows running containers without root permissions.

Docker networking

It is helpful to create an external Docker network by running

docker network create network_name

and then referencing that network, declared as external, in the docker-compose.yaml.

Environment Variables (.env file)

I use a single .env file to set up the Docker containers and pass configuration variables to the Python scripts. I would suggest changing the default values in sample.env to match your needs. If you need to think about the security of your database, I would recommend starting by changing these.
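To illustrate how a Python script might read such a file, here is a minimal sketch of a KEY=VALUE parser. It is an assumption about the general approach, not the code used in this repository, and the variable names in SAMPLE are hypothetical rather than taken from sample.env.

```python
def parse_env(text):
    """Parse KEY=VALUE lines into a dict, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes; no escape or interpolation handling.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical variable names, used only for illustration.
SAMPLE = """
# database settings
POSTGRES_USER=example_user
POSTGRES_PASSWORD="change-me"
POSTGRES_PORT=5432
"""

config = parse_env(SAMPLE)
print(config["POSTGRES_USER"])
```

Docker's own .env handling supports more syntax than this (for example variable interpolation), so a sketch like this is only suitable for simple files.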