This project focuses on cleaning and processing datasets using Shell scripts. It is part of the Fundamentals of Informatics course (2022-23) and involves handling movie and show data to create cleaned and filtered datasets for further analysis.

luciarevaliente/Shell_script_data_cleaning

Fundamentals of Informatics: Cleaning a Dataset

This repository contains the first practical assignment of the Fundamentals of Informatics course (2022-23): cleaning a dataset.

Project Description

The goal of this assignment is to learn how to handle and clean datasets using Shell scripts. Several CSV files with movie and show data were provided, and the scripts in this repository filter and clean that data, producing final files that are smaller and easier to use for further analysis.

Repository Contents

  • Movies.csv: Original file with movie data.
  • Movies_columna12.csv to Movies_columna16.csv: Files with specific columns extracted from the original dataset.
  • Movies_f.csv and Movies_net.csv: Files with filtered and cleaned movie data.
  • Shows.csv: Original file with show data.
  • Shows_columna12.csv to Shows_columna15.csv: Files with specific columns extracted from the original dataset.
  • Shows_f.csv and Shows_net.csv: Files with filtered and cleaned show data.
  • practica1.sh: Script used for data cleaning and processing.
  • prova.txt and prova_script_pas4: Test files used during the development of the practice.
  • titles.cvs: File with titles of movies and shows.
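
The file names above suggest two kinds of step: extracting a single column into its own file, and filtering out unusable rows. The sketch below shows one way such steps can be written with `cut` and `awk`. It is not taken from practica1.sh: the sample data, the column numbers, the `sample_*` file names, and the "drop rows with an empty field" rule are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sample data; the real Movies.csv columns are not described here.
workdir=$(mktemp -d)
cat > "$workdir/sample.csv" <<'EOF'
id,title,year
m1,Alpha,1999
m2,,2005
m3,Gamma,2011
EOF

# Extract one column (column 2 here) into its own file,
# in the spirit of the *_columnaN.csv files.
cut -d',' -f2 "$workdir/sample.csv" > "$workdir/sample_columna2.csv"

# Keep the header plus every row in which no field is empty
# (an assumed cleaning rule, in the spirit of the *_f.csv files).
awk -F',' 'NR == 1 { print; next }
           { for (i = 1; i <= NF; i++) if ($i == "") next; print }' \
    "$workdir/sample.csv" > "$workdir/sample_f.csv"
```

Splitting on bare commas only works when no field contains a quoted comma; a dataset with free-text fields such as descriptions would likely need a CSV-aware approach instead.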

Instructions

  1. Clone the repository:

    git clone https://github.com/luciarevaliente/fon_info_practica1.git
    cd fon_info_practica1
  2. Run the cleaning script:

    ./practica1.sh
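
If `./practica1.sh` fails with a "Permission denied" error, the executable bit may not have survived the clone. A minimal sketch of a wrapper that handles this case follows; `run_script` is a hypothetical helper name, not something defined in the repository.

```shell
# run_script PATH: make the script executable if needed, then run it.
# Hypothetical helper for illustration only.
run_script() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "not found: $script" >&2
        return 1
    fi
    # Add the executable bit only when it is missing.
    [ -x "$script" ] || chmod +x "$script"
    "$script"
}
```

From the repository directory it would be called as `run_script ./practica1.sh`. Running `sh practica1.sh` avoids the permission issue as well, but only works if the script does not rely on features of another shell.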

Contributions

This project is part of an academic course and does not accept external contributions.

License

This project does not have a specific license and is for educational purposes only.
