This is used to build tools which process and standardize data.
youainti 9850f4c677 setting up the extraction tools and including test cases. 4 years ago
AACT_downloader updated table definitions for views so that they deliver the correct information 4 years ago
Orangebook Adding attempts at orangbook processing. The CID parser stuff is in case I need to parse the pdfs 4 years ago
Parser setting up the extraction tools and including test cases. 4 years ago
assorted Reorganized folder system. added an assorted section to help keep track of useful sql etc. 4 years ago
classifications Reorganized folder system. added an assorted section to help keep track of useful sql etc. 4 years ago
history_downloader Added rate limiting functionality. Not tested as things ran just fine for the 1700 trials I downloaded. 4 years ago
.gitignore Housekeeping: Renamed folder to distinguish between aact and history downloader. Updated .gitignore to handle orangebook data 4 years ago
README.md began rewriting README 4 years ago

README.md

ClinicalTrialsDataProcessing

This is used to build tools which process and standardize the data.

More data later.

Outline

Directory Tree

AACT_downloader

Key files index

Background on Docker

Docker uses the following flow

  1. configuration using docker-compose.yaml or a Dockerfile
  2. docker build . to generate an image
  3. docker run xxxxxx to take the image and create a container.
    • When the container is created, it starts and runs the commands configured in the Dockerfile.
    • Consequently, when the AACT database image is run, it must first initialize the Postgres database and then run the initialization steps.
    • Here is where bind mounts come into play.
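The flow above can be sketched with a small Dockerfile. This is only an illustration, not this repo's actual configuration: the `postgres:13` tag, the `init_aact.sql` file name, and the `aact_db` image tag are assumptions. It also uses COPY rather than a bind mount to place the initialization script, so that the whole example fits in one Dockerfile.

```dockerfile
# Sketch only: the base image tag, init_aact.sql, and the aact_db tag used in
# the commands below are assumptions, not taken from this repo.

# Step 1: configuration. Start from the official postgres image, which already
# defines the command that initializes and starts the database.
FROM postgres:13

# The official postgres image runs *.sql and *.sh files placed in this
# directory the first time it initializes an empty data directory.
COPY init_aact.sql /docker-entrypoint-initdb.d/

# Step 2: build the image from the directory containing this Dockerfile:
#   docker build -t aact_db .
# Step 3: create and start a container from that image:
#   docker run -e POSTGRES_PASSWORD=<password> aact_db
```

Because the script is baked in at build time, changing `init_aact.sql` would require rebuilding the image, which is one reason to mount the file at run time instead.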

Multistage builds

https://stackoverflow.com/questions/53659993/docker-multi-stage-how-to-split-up-into-multiple-dockerfiles

https://docs.docker.com/develop/develop-images/multistage-build/

Basically

Dockerfile vs docker-compose.yaml

A Dockerfile is used to create images.

A docker-compose.yaml is used to automate the deployment of containers.

Types of storage

COPY/ADD (Dockerfile)

In a Dockerfile, this adds a file permanently to the image.

The copy is one-way: files go from the build context into the image when the image is built, and later changes on either side do not propagate to the other.
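As a rough sketch of the two instructions, assuming a hypothetical `schema.sql` and `fixtures.tar.gz` sitting next to the Dockerfile (neither file is part of this repo):

```dockerfile
# Sketch only: the base image and both file names are assumptions.
FROM alpine:3.19
WORKDIR /data

# COPY takes a file from the build context and writes it into the image.
# It happens once, at build time; editing schema.sql on the host afterwards
# has no effect until the image is rebuilt.
COPY schema.sql /data/schema.sql

# ADD copies as well, and additionally unpacks a local tar archive
# into the destination directory.
ADD fixtures.tar.gz /data/fixtures/
```

COPY is usually the clearer choice when plain copying is all that is needed, since ADD's extra behavior can be surprising.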

Volumes (docker-compose.yaml && Dockerfile)

Usable in both docker-compose.yaml files and Dockerfiles, a volume creates persistent storage. It can be managed by Docker or stored in a particular location.

Good for longer term storage such as databases.
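A minimal sketch of the Dockerfile side, assuming a hypothetical `/data` path and an `aact_data` volume name (neither comes from this repo):

```dockerfile
# Sketch only: the base image, the /data path, and the names used in the
# command below are assumptions.
FROM alpine:3.19

# Declare /data as a volume mount point. If no volume is supplied at run
# time, Docker creates an anonymous volume for this path and manages it.
VOLUME /data

# Append a timestamp to a log inside the volume each time a container
# starts, then print the log.
CMD ["sh", "-c", "date >> /data/runs.log && cat /data/runs.log"]

# Attach a Docker-managed named volume so the data outlives the container:
#   docker run --rm -v aact_data:/data <image>
```

Running the container twice with the same named volume should print one more line the second time, since `runs.log` lives in the volume rather than in the container's own writable layer.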

Bind mounts (docker-compose.yaml)

Bind mounts are used to make a host filesystem resource available