Amtrak Data

This repository contains the data extracted from Amtrak's monthly reports in various formats. The repository is structured like so:

root
├───data
│   ├───csv
│   │   └───{year}
│   │       └───{month}
│   │           ├───operatingResults.csv
│   │           ├───capitalResults.csv
│   │           ├───keyPerformanceIndicators.csv
│   │           ├───sourcesAndUsesAccount.csv
│   │           └───routeLevelResults.csv
│   ├───json
│   │   └───{year}
│   │       └───{month}
│   │           ├───operatingResults.json
│   │           ├───capitalResults.json
│   │           ├───keyPerformanceIndicators.json
│   │           ├───sourcesAndUsesAccount.json
│   │           └───routeLevelResults.json
│   └───xml
│       └───{year}
│           └───{month}
│               ├───operatingResults.xml
│               ├───capitalResults.xml
│               ├───keyPerformanceIndicators.xml
│               ├───sourcesAndUsesAccount.xml
│               └───routeLevelResults.xml
├───reports
│   └───{year}
│       └───{month}
│           ├───report.pdf
│           └───unprocessed
│               ├───tabula-report-0.csv
│               ├───tabula-report-1.csv
│               ├───tabula-report-2.csv
│               ├───tabula-report-3.csv
│               └───tabula-report-4.csv
├───scripts
│   ├───parseRawToJSON.js
│   └───parseJSONtoOthers.js
└───[various config files, including package.json]

I used Tabula to extract the table data from PDFs. This will take a bit of manual work and also may not be in the right format for my scripts to process. If I haven't updated this data yet and a report has been published, you email me (on my website) and I will knock it out on one of my lunch breaks.

To run the scripts yourself, clone the repo, make sure you have some modern level of node installed (i think node 14+, not sure tho) and run ye old:

npm install

and to run

cd scripts
node parseRawToJSON.js
node parseJSONtoOthers.js

All I am doing currently is extracting the data into other formats to allow for further analasys. I do not plan on doing any further analysis within these scripts.

Just a little key for personal needs when extracting the tables from the pdfs:

0: operatingResults
1: capitalResults
2: keyPerformanceIndicators
3: sourcesAndUsesAccount
4: routeLevelResults

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
data		data
reports		reports
scripts		scripts
.gitignore		.gitignore
package-lock.json		package-lock.json
package.json		package.json
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Amtrak Data

About

Releases

Packages

Languages

piemadd/amtrak-data

Folders and files

Latest commit

History

Repository files navigation

Amtrak Data

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages