ClinicalTrialsDataProcessing/Parser/parser.txt

Design decisions
There are roughly two ways to [parse XML](https://simplabs.com/blog/2020/12/31/xml-and-rust/).
The first is to use a streaming parser, which reads the string in and extracts the required data as it moves along.
The second is to unpack the XML into an in-memory representation (a DOM) and access it there.
For what I need, I believe a streaming parser will work best as I want to read through the file once and extract the details needed.
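As a rough sketch of the streaming approach, the snippet below uses `iterparse` from Python's standard `xml.etree.ElementTree` module. The sample document, the element names (`nct_id`, `brief_title`, `overall_status`) and the helper `stream_extract` are assumptions made for illustration, not the real file layout.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; the real files may use different element names.
SAMPLE = b"""<clinical_study>
  <id_info><nct_id>NCT00000000</nct_id></id_info>
  <brief_title>Example study</brief_title>
  <overall_status>Completed</overall_status>
</clinical_study>"""

# Hypothetical set of fields to pull out in a single pass.
WANTED = {"nct_id", "brief_title"}


def stream_extract(source, wanted):
    """Read the XML once and collect the text of the wanted elements."""
    found = {}
    # "end" events fire when an element's closing tag has been read,
    # so its text is complete at that point.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag in wanted:
            found[elem.tag] = (elem.text or "").strip()
        # Drop the element's contents so memory stays low on large files.
        elem.clear()
    return found


result = stream_extract(io.BytesIO(SAMPLE), WANTED)
print(result)
```

The `elem.clear()` call is what keeps this closer to a one-pass reader than to a full in-memory tree.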
I may be able to add that to PL/pgSQL or do it in PL/pgSQL directly.
Alternatively, after attempts with Python's xml module and Rust's quick_xml, I think I might be better off with a manual approach: find the line in the string that matches a pattern, then parse the data by hand.
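A minimal sketch of that manual approach, assuming each wanted element opens and closes on a single line and carries no attributes. The sample text, the element names and the helper `scan_lines` are hypothetical.

```python
import html
import re

# Hypothetical sample; the real files may use different element names.
SAMPLE = """<clinical_study>
  <id_info><nct_id>NCT00000000</nct_id></id_info>
  <brief_title>Drug A &amp; Drug B</brief_title>
  <overall_status>Completed</overall_status>
</clinical_study>"""

# Matches <tag>text</tag> for the listed tags only, all on one line.
LINE_PATTERN = re.compile(r"<(nct_id|brief_title)>(.*?)</\1>")


def scan_lines(text):
    """Find lines containing a wanted element and pull out its text by hand."""
    found = {}
    for line in text.splitlines():
        match = LINE_PATTERN.search(line)
        if match:
            # Undo XML entity escaping such as &amp; in the captured text.
            found[match.group(1)] = html.unescape(match.group(2).strip())
    return found


fields = scan_lines(SAMPLE)
print(fields)
```

This is not a real XML parser: it will miss elements that span several lines, carry attributes, or wrap their text in CDATA, so it only suits files with a very regular layout.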
I still need to give a DOM parser a try.
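For comparison, a sketch of what the in-memory approach could look like in Python. It uses `xml.etree.ElementTree`, which builds a full tree in RAM but is not a strict W3C DOM. The sample document, the element paths and the helper `load_study` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real files may use different element names.
SAMPLE = """<clinical_study>
  <id_info><nct_id>NCT00000000</nct_id></id_info>
  <brief_title>Example study</brief_title>
  <overall_status>Completed</overall_status>
</clinical_study>"""


def load_study(xml_text):
    """Parse the whole document into a tree, then look fields up by path."""
    root = ET.fromstring(xml_text)
    return {
        # findtext returns None when the path does not exist.
        "nct_id": root.findtext("id_info/nct_id"),
        "brief_title": root.findtext("brief_title"),
        "overall_status": root.findtext("overall_status"),
    }


study = load_study(SAMPLE)
print(study)
```

The trade-off is memory: the whole document is held at once, in exchange for being able to look fields up by path in any order.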