This repository processes and analyzes U.S. Immigration and Customs Enforcement (ICE) data released pursuant to FOIA requests by the University of Washington Center for Human Rights.
The datasets analyzed here were released by ICE's Enforcement and Removal Operations (ERO) Law Enforcement Systems and Analysis Division (LESA); the datasets represent person-by-person, facility-by-facility detention history records from 2011-10-01 through 2024-01-04.
2022-ICFO-09022: We are seeking person-by-person, facility-by-facility detention history records from the ERO LESA Statistical Tracking Unit of all people in immigration detention nationwide from 10/1/2011 to date, in XLS, XLSX, or CSV spreadsheet format; including but not limited to the following fields and including any related definitions, legends, or codebooks:
- Unique subject identifier: Non-personally identifiable sequence number or other designation to identify records relating to the same subject. (Such information was previously released pursuant to FOIA 2015-ICFO-95379.)
- Detention Stay Book In Date
- Book In Date And Time
- Book Out Date And Time
- Detention Stay Book Out Date
- Birth Country
- Citizenship Country
- Race
- Ethnicity
- Gender
- Age at Book In
- Entry Date
- Entry Status
- LPR Yes No
- Most Serious Criminal Conviction (MSCC)
- MSCC Code
- MSCC Conviction Date
- MSCC Sentence Days
- MSCC Sentence Months
- MSCC Sentence Years
- Aggravated Felon
- Aggravated Felon Type
- Rc Threat Level
- Apprehension COL
- 287(g) Arrest
- Border Patrol Arrest or Arresting Agency
- Book In After Detainer
- Apprehension Program
- Initial Detention Facility Code
- Initial Detention Facility
- History Detention Facility Code
- History Detention Facility
- Order of Detention
- History Book In DCO
- History Book Out Date And Time
- History Release Reason
- Detainer Prepare Date
- Detainer Prior to Bookin Date (Yes/No)
- Detainer Threat Level
- Detainer Detention Facility
- Detainer Detention Facility Code

We are not providing third party consent forms for all those whose data would be included and therefore understand that as a result, personally-identifiable information will be redacted to protect their privacy. However, the FOIA requires that all segregable information be provided to requesters, and personally-identifiable information is segregable from the remainder of this information.
Such information was previously released pursuant to FOIA 2015-ICFO-95379 and FOIA 2019-ICFO-10844.
Large data files are excluded from this repository; data associated with this repository can be obtained here: https://drive.google.com/drive/folders/1Guhtpv80sh2FJ90-t1GyCtNzSa0Jsvzr?usp=drive_link
To execute tasks in this repository, first download the data files linked above and ensure they are stored in the indicated directory within the Git repository: original, untransformed datasets are stored in import/input/; compressed, CSV-formatted files are stored in import/frozen/.
Final datasets with minimal cleaning and standardization are stored/generated in export/output/. Users interested in reviewing the final datasets without executing the code contained in this repository can find export datasets as of Oct. 25, 2024 at the following link: https://drive.google.com/drive/folders/1OQLU7IzhbodsrD2wZm-5fV57UIsnOW4x?usp=drive_link
This project uses "Principled Data Processing" techniques and tools developed by @HRDAG; see for example "The Task Is A Quantum of Workflow."
- import/: Convenience task for file import; original Excel files in input/ are saved as compressed CSV files in frozen/.
- concat/: Concatenates individual input files, standardizes column names, drops records missing anonymized_identifier and a trivial number of duplicated records, logging stats to output/concat.log; adds hash record and stay identifiers, and record sequence.
- unique-stays/: Performs various calculations per placement, individual, and stay and adds relevant fields to facilitate calculations which require unique stay records (e.g. Average Length of Stay).
- headcount/: Calculates daily detention headcount by given characteristic, e.g. per facility, by gender/nationality. Slow when applied to full dataset, could likely be optimized/improved.
- export/: Convenience task; final datasets in output/.
- share/: Resources potentially used by multiple tasks but not created or transformed in this repo.
- write/: Generates descriptive notebooks for publication.
- docs/: Descriptive notebooks published at: https://uwchr.github.io/ice-detain/
- analyze/: Exploratory analysis notebooks; contents are not final and should be considered speculative.
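
The daily headcount calculation described for headcount/ can be sketched as follows. This is a minimal illustration using only the Python standard library, not the repository's implementation: the function name `daily_headcount`, the `as_of` parameter, the sample records, and the convention that a person is counted on each date from book-in up to (but not including) book-out are all assumptions.

```python
from collections import Counter
from datetime import date, timedelta

def daily_headcount(placements, start, end, as_of):
    """Count people held per (day, facility) for days in [start, end].

    placements: iterable of (facility_code, book_in, book_out) tuples of
    date values; book_out is None for an ongoing placement.
    as_of: hypothetical cutoff; ongoing placements are counted through it.
    Assumption: a placement counts on each day d with book_in <= d < book_out.
    """
    counts = Counter()
    for facility, book_in, book_out in placements:
        last_day = as_of if book_out is None else book_out - timedelta(days=1)
        day = max(book_in, start)
        while day <= min(last_day, end):
            counts[(day, facility)] += 1
            day += timedelta(days=1)
    return counts

# Invented sample placements for illustration only.
sample = [
    ("FAC_A", date(2023, 1, 1), date(2023, 1, 4)),
    ("FAC_A", date(2023, 1, 2), None),
    ("FAC_B", date(2023, 1, 3), date(2023, 1, 5)),
]
counts = daily_headcount(
    sample, date(2023, 1, 1), date(2023, 1, 5), as_of=date(2023, 1, 5)
)
```

Iterating day by day over every placement is simple to verify but scales with the total number of detained person-days, which is consistent with the note above that this step is slow on the full dataset.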
Each row represents an individual detention placement record per person per facility. Consecutive records represent successive detention placements in an overall detention stay of one or more placements. Individual people can experience one or more detention stays. In some cases, an individual's stay_book_in_date_time does not coincide with the detention_book_in_date_and_time of the individual's first detention placement; this is most common in records from the earlier period of the data (FY2011 and prior). Records with missing stay_book_out_date_time and detention_release_reason/stay_release_reason values represent individuals whose detention stays were ongoing at the time the dataset was generated.
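
The row structure described above can be illustrated with a short sketch that groups placement rows into stays and picks out ongoing stays. The field names come from the dataset; the sample values, the timestamp format, and the use of an empty string to stand in for a missing value are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: field names follow the dataset, values are
# invented, and an empty string stands in for a missing value.
rows = [
    {"anonymized_identifier": "a1",
     "stay_book_in_date_time": "2023-01-01 10:00",
     "detention_facility_code": "FAC_A",
     "stay_book_out_date_time": "2023-02-01 09:00"},
    {"anonymized_identifier": "a1",
     "stay_book_in_date_time": "2023-01-01 10:00",
     "detention_facility_code": "FAC_B",
     "stay_book_out_date_time": "2023-02-01 09:00"},
    {"anonymized_identifier": "a1",
     "stay_book_in_date_time": "2023-12-15 08:00",
     "detention_facility_code": "FAC_A",
     "stay_book_out_date_time": ""},
]

# Group placement rows into stays keyed on person and stay book-in time.
stays = defaultdict(list)
for row in rows:
    key = (row["anonymized_identifier"], row["stay_book_in_date_time"])
    stays[key].append(row)

# Treat a stay as ongoing when its stay book-out value is missing.
ongoing = [
    key for key, placements in stays.items()
    if not placements[0]["stay_book_out_date_time"]
]
```

Here one person has two stays: a completed stay made up of two placements, and a second stay that was still ongoing when the data was generated.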
This dataset lacks information regarding detention facility characteristics such as precise location (other than ICE area_of_responsibility) or facility type which may be relevant for detailed analysis.
Data was released without any data dictionary or field definitions; therefore we have had to infer significance of some values.
- stay_book_in_date_time: Detention stay start date
- detention_book_in_date_and_time: Detention placement start date (per facility)
- detention_book_out_date_time: Detention placement end date; missing values represent current placement at time of release of data
- stay_book_out_date_time: Detention stay end date; missing values represent current stay at time of release of data
- birth_country_per: Individual's country of birth; unclear how this differs from birth_country_ero
- birth_country_ero: Individual's country of birth; unclear how this differs from birth_country_per
- citizenship_country: Individual's country of citizenship
- race: Individual's race (largely missing)
- ethnic: Individual's ethnicity (largely missing)
- gender: Individual's gender
- birth_date: Redacted
- birth_year: Individual's year of birth
- entry_date: Individual's entry date
- entry_status: Individual's entry status
- most_serious_conviction_(msc)_criminal_charge_category: Most serious conviction category
- msc_charge: Most serious conviction charge
- msc_charge_code: Most serious conviction charge code
- msc_conviction_date: Most serious conviction date
- msc_sentence_days: Most serious conviction sentence length (days)
- msc_sentence_months: Most serious conviction sentence length (months)
- msc_sentence_years: Most serious conviction sentence length (years)
- msc_crime_class: Most serious conviction crime class
- case_threat_level: Redacted
- apprehension_threat_level: Redacted
- final_program: Appears to represent DHS division responsible for decision to detain
- detention_facility_code: Detention facility code
- detention_facility: Detention facility full title
- area_of_responsibility: ICE field office responsible for detention facility
- docket_control_office: ICE docket control office
- detention_release_reason: Missing values indicate ongoing detention
- stay_release_reason: Missing values indicate ongoing detention
- alien_file_number: Redacted
- anonymized_identifier: Anonymized unique individual identifier

- filename: Original data filename
- recid: Unique record identifier based on original data fields
- stayid: Unique stay identifier based on anonymized_identifier and stay_book_in_date_time
- rowseq: Record sequence across input files
- file_rowseq: Record sequence within input file
- stay_length: Length of stay (missing for ongoing stays)
- placement_length: Length of placement (missing for ongoing placement)
- stay_length_min: Minimum length of stay (as of 2024-01-04, date of generation of dataset)
- placement_length_min: Minimum length of placement (as of 2024-01-04, date of generation of dataset)
- total_stays: Total detention stays per individual
- total_placements: Total detention placements per individual
- current_stay: Does row relate to a current detention stay?
- current_placement: Does row relate to a current detention placement?
- stay_count: Consecutive identifier per stay per person
- placement_count: Consecutive identifier per placement per person
- stay_placements: Total detention placements per stay
- first_facil: Stay book-in facility
- last_facil: Stay book-out facility
- longest_placement_facil: Longest placement facility per stay
- last_placement: Is this final placement of current stay?
- longest_placement: Is this longest placement of current stay?
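
As a rough illustration of how fields such as stayid, stay_length, and stay_length_min could be derived, the sketch below uses only the Python standard library. It is not the repository's code: the helper names `make_stayid` and `stay_lengths`, the choice of SHA-1, the `|` separator, the timestamp format, the use of days as the unit, and the assumption that stay_length_min equals stay_length for completed stays are all hypothetical.

```python
import hashlib
from datetime import datetime

AS_OF = datetime(2024, 1, 4)   # dataset generation date stated above
FMT = "%Y-%m-%d %H:%M:%S"      # assumed timestamp format

def make_stayid(anonymized_identifier, stay_book_in_date_time):
    """Hash the two key fields into one identifier (SHA-1 is an assumption)."""
    key = f"{anonymized_identifier}|{stay_book_in_date_time}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()

def stay_lengths(book_in, book_out):
    """Return (stay_length, stay_length_min) in days.

    stay_length is None for an ongoing stay (empty book_out); in that case
    stay_length_min is the time elapsed between book-in and AS_OF.
    """
    start = datetime.strptime(book_in, FMT)
    if book_out:
        end = datetime.strptime(book_out, FMT)
        days = (end - start).total_seconds() / 86400
        return days, days
    return None, (AS_OF - start).total_seconds() / 86400

# Invented example values.
closed = stay_lengths("2023-01-01 00:00:00", "2023-01-11 12:00:00")
ongoing = stay_lengths("2023-12-25 00:00:00", "")
example_id = make_stayid("a1", "2023-01-01 00:00:00")
```

Keeping a separate minimum-length value lets ongoing stays be included in length-of-stay summaries as lower bounds rather than being dropped.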
UWCHR is grateful to Prof. David Hausman and the ACLU of California for obtaining and sharing a previous version of this dataset; and to Prof. Abraham Flaxman for assistance in analyzing a previous version of this dataset.
- Bring in ICE detention facility characteristics and related notes, analyze how many facilities here are represented
- Resolve problems noted in issue #3.
- Instead of generating separate dataset in unique_stays, flag final placement per stay in full dataset for simple filtering.
- Create docs/ and associated tasks
- Create stayid key value for record blocs representing unique stays (combination of anonymized_identifier, stay_book_in_date_time).
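
One possible shape for the "flag final placement per stay" item is sketched below. It is an illustration only: the sample rows are invented, the timestamp format is assumed to sort chronologically as text, and ordering placements by detention_book_in_date_and_time within each stayid is an assumption about how the final placement would be determined.

```python
from itertools import groupby

# Hypothetical placement rows; field names follow the dataset, values invented.
rows = [
    {"stayid": "s1", "detention_book_in_date_and_time": "2023-01-05 09:00",
     "detention_facility_code": "FAC_B"},
    {"stayid": "s1", "detention_book_in_date_and_time": "2023-01-01 10:00",
     "detention_facility_code": "FAC_A"},
    {"stayid": "s2", "detention_book_in_date_and_time": "2023-03-01 12:00",
     "detention_facility_code": "FAC_A"},
]

# Sort by stay, then by placement book-in time, so groupby sees each stay
# as one contiguous, chronologically ordered run of rows.
rows.sort(key=lambda r: (r["stayid"], r["detention_book_in_date_and_time"]))
for _, group in groupby(rows, key=lambda r: r["stayid"]):
    placements = list(group)
    for row in placements:
        # Flag only the last placement in each stay.
        row["last_placement"] = row is placements[-1]

# Filtering on the flag yields one row per stay without a separate dataset.
unique_stays = [r for r in rows if r["last_placement"]]
```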