
Cell coverage checker

A small tool to check which cell towers are in reach of a given location. The coverage data is provided and licensed by the OpenCellID project.

Screenshot of the end user frontend

This tool was created for the lecture “Big Data” held by @marcelmittelstaedt at DHBW Stuttgart.

Starting the service locally

The service can be started through Docker Compose or a compatible tool.

# Make sure you are inside the project root
docker-compose up
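
If you prefer to keep the terminal free, Docker Compose can also run the stack in the background and tear it down again when you are done:

# Start the services in detached mode
docker-compose up -d
# Stop and remove the containers again
docker-compose down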

To add your authentication credentials, create a credential file and copy it to /credentials/airflow inside the nginx container.

# Create a credential file with a password for the user "admin"
htpasswd -c credentials admin
# Make sure the credentials directory exists inside the reverse proxy container
docker exec dhbw-big-data-reverse-proxy-1 mkdir -p /credentials
# Copy the credential file into the container as /credentials/airflow
docker cp credentials dhbw-big-data-reverse-proxy-1:/credentials/airflow
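
To verify that the file ended up where nginx expects it, you can list it inside the container:

# The credential file should now be present in the reverse proxy container
docker exec dhbw-big-data-reverse-proxy-1 ls -l /credentials/airflow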

After this, the user application, Airflow, the Hadoop task overview, and the Hadoop file viewer should be reachable.
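
A quick way to check that the reverse proxy is up and the credentials are accepted; the hostname, port, and path used here are assumptions about the local setup:

# Request the Airflow page through the reverse proxy (hostname, port, and path are assumptions)
curl -I -u admin http://localhost/airflow/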

If you need to access other services, such as the frontend application database, apply the development service configuration as well:

docker-compose -f docker-compose.yml -f docker-compose.dev.yml up
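
To see which services the combined configuration defines without starting anything, Docker Compose can list them:

# List all services defined by the combined configuration
docker-compose -f docker-compose.yml -f docker-compose.dev.yml config --services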

Data collection workflow

The ETL process is implemented in Airflow. The general idea is to check the current state of the data within the Hadoop file system and perform the appropriate actions. For example, if the initial full OpenCellID export has already been downloaded, only the remaining difference files are downloaded. The state of the data is tracked with a separate timestamp file, which records when the data was last gathered.
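
A minimal sketch of that branching check; the HDFS path is an assumption, and in the actual DAG this is an Airflow branch task rather than a shell script:

# Check whether the full export already exists in HDFS (path is an assumption)
if hdfs dfs -test -e /user/hadoop/opencellid/raw/cell_towers.csv; then
  echo "download-remaining-diffs"   # data already present, only fetch the difference files
else
  echo "download-initial-data"      # no data yet, fetch the full export
fi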

OpenCellID provides additional attributes for each entry that are not relevant for the frontend application but are required to identify the entries when merging data states. Those extra attributes are therefore reduced to a hash, which serves as a unique identifier. The final dataset contains only this identifier along with the location and range of each station.
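
As an illustration of that reduction, here is a rough shell sketch; the column order and which columns form the identifier are assumptions, and the real transformation happens inside the Airflow tasks:

# Read the export line by line, skipping the header (column order is an assumption)
tail -n +2 cell_towers.csv | while IFS=, read -r radio mcc net area cell unit lon lat range _; do
  # Reduce the identifying attributes to a single hash value
  id=$(printf '%s,%s,%s,%s,%s' "$radio" "$mcc" "$net" "$area" "$cell" | md5sum | cut -d' ' -f1)
  # Keep only the identifier, the location, and the range of the station
  printf '%s,%s,%s,%s\n' "$id" "$lon" "$lat" "$range"
done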

Graph of the tasks performed by the workflows

Task: Description
create-tmp-dir: Create a directory to stage the downloaded files in.
clear-tmp-dir: Delete existing files within the download staging directory, if any exist.
check-for-initial-data: Check whether a full OpenCellID export has already been downloaded.
download-initial-data: Label the initial download branch, which is executed if the previous check fails.
create-timestamp-file: Create the data timestamp file within the download staging directory.
download-complete-data: Download the full OpenCellID export archive.
unzip-complete-data: Extract the full OpenCellID export from its archive.
create-raw-target-directories: Create the final destination for the downloaded files within the Hadoop file system.
put-initial-data: Upload the full OpenCellID export to the Hadoop file system.
put-initial-timestamp: Upload the timestamp file to the Hadoop file system.
parse-initial-data: Transform the initial data and save it to the Hadoop file system and the frontend application database.
download-remaining-diffs: Label the difference download branch, which is executed if the previous check succeeds.
get-timestamp: Download the timestamp of the current data state from the Hadoop file system.
download-diffs: Download all difference files published by OpenCellID since the data timestamp.
put-diffs: Upload the downloaded difference files to the Hadoop file system.
update-timestamp: Overwrite the timestamp file with the new data timestamp.
delete-tmp-previous: Delete the temporary data storage if it exists.
move-previous: Move the data storage to the temporary directory for processing.
parse-diffs: Merge the existing data with the new data from the difference files and overwrite the existing data.
delete-left-overs: Clear the final data location. Executed if the previous step failed.
restore-previous: Move the previous data back to its final destination. Executed if the previous step succeeded.
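
Several of the tasks above interact with the Hadoop file system directly. A rough sketch of the kind of commands involved, with assumed paths:

# create-raw-target-directories: create the final destination (path is an assumption)
hdfs dfs -mkdir -p /user/hadoop/opencellid/raw
# put-initial-data: upload the full export from the download staging directory
hdfs dfs -put /tmp/opencellid/cell_towers.csv /user/hadoop/opencellid/raw/
# put-initial-timestamp: upload the timestamp file alongside it
hdfs dfs -put /tmp/opencellid/timestamp /user/hadoop/opencellid/raw/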