
PiT Data Cleaning and Analysis for Everyone Counts 2024
This presentation provides guidance on cleaning and analyzing Point-in-Time (PiT) Count data, covering topics such as cleaning enumeration and survey data, creating sociodemographic variables, conducting crosstabulations, visualizing survey data, and investigating trends over time. It includes information on submitting enumeration data, adding overnight locations, disaggregating data, and analyzing methodology for reporting trends in homelessness data.
Presentation Transcript
PiT Count Data Cleaning and Analysis
Everyone Counts 2024 Office Hours
December 11, 2024

Introduction
This presentation provides guidance for cleaning and analyzing your PiT Count data. We will discuss:
- Cleaning and reviewing enumeration data;
- Cleaning and reviewing survey data;
- Where to look for resources on data cleaning based on HIFIS version and for non-HIFIS users;
- Creating variables for key sociodemographic results;
- Crosstabulations and visualizations of PiT survey data;
- Investigating trends over time.
2025-03-18

Data Cleaning: Enumeration Data
To submit your enumeration data for Everyone Counts 2024, complete the Post-Count report in MS Forms.
Unsheltered:
- Unsheltered excluding encampments (Surveyed)
- Unsheltered excluding encampments (Observed)
- Encampments (Surveyed)
- Encampments (Observed)
Shelters:
- Emergency shelters (including extreme weather response shelters and hotel/motel programs that operate as emergency shelters)
- DV shelters
Transitional housing
Health and correctional systems [optional]

Data Cleaning: Enumeration Data
Adding overnight locations: If you enumerated individuals experiencing hidden homelessness, this may be reflected in your local report. However, due to lack of consistency in data collection across Canada, these locations are not included in the national enumeration.
Disaggregating overnight locations: You may find that reporting subsets of core location types separately (e.g., unsheltered homelessness and encampments, or emergency shelters and DV shelters) adds valuable insight into the experiences of homelessness in your community. Similarly, you may wish to disaggregate health and correctional systems in your local report or disaggregate severe weather shelters from emergency shelters.
Unknown overnight locations: Some individuals may have declined to answer the screening question regarding overnight location or may have indicated they do not know where they will be staying. If they responded to the follow-up screening question to indicate that they do not have access to a permanent residence where they can safely stay as long as they want, they should be screened into the survey but should not be included in the enumeration of people experiencing homelessness.

Data Analysis: Enumeration Data
- Review and compare your methodology to that used in previous Counts. Changes in methodology should be taken into account when reporting trends over time (% changes).
- The current enumeration in absolute terms should use all the data available.
- When calculating percentage changes in homelessness, include only overnight locations for which data were collected in both years. For example, if enumeration data were collected in DV shelters or health and correctional systems for the first time in 2024, we recommend reporting percentage changes in enumeration based on the locations for which data are available in both years.
- Consider which locations you want to include in your Total number. You may want to report the sum of the core locations, with other locations listed separately.

Enumeration Data Analysis: Example
Enumeration for Community A (Count)

Location                          2024   2021
Unsheltered (Surveyed)              61     48
Unsheltered (Observed)              20      -
Encampments (Surveyed)              22      9
Encampments (Observed)               8      -
Emergency Shelters                  76     92 (includes DV shelters)
DV Shelters                         31     (counted under Emergency Shelters)
Transitional Shelters/Housing       44     39
Systems [optional]                  16      -
Total enumeration                  278    188

For reporting: 2024 total enumeration = 278
For calculating % change: 2024 enumeration for comparison with 2021 = 234 (278 minus the observed counts of 20 and 8 and the systems count of 16, none of which were collected in 2021)
Increase in enumeration from 2021 to 2024 = (234-188)/188 = 24.5%*

Notes for Interpretation
2024:
- Included observed homelessness.
- Improved knowledge of encampment locations and outreach between 2021 and 2024.
- Engaged with more DV shelters and disaggregated sheltered count.
- Included systems data from a hospital.
2021:
- Did not include observed homelessness or systems in enumeration.
- Some DV shelters participated, but these were included in the general sheltered count.
*Due to improved coverage of encampments and DV shelters in 2024, the increase in homelessness may be slightly overestimated.
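
The Community A arithmetic can be reproduced with a short script. This is only a sketch: the dictionary layout, the function names, and the choice to combine emergency and DV shelters into one "Shelters" entry (because the 2021 count did not separate them) are assumptions made for illustration, not part of the Post-Count report.

```python
# Community A counts from the slide. None means no data were collected that year.
# Emergency and DV shelters are combined here (an assumption) because the 2021
# count included DV shelters in the general sheltered figure.
counts_2024 = {
    "Unsheltered (Surveyed)": 61,
    "Unsheltered (Observed)": 20,
    "Encampments (Surveyed)": 22,
    "Encampments (Observed)": 8,
    "Shelters (Emergency + DV)": 76 + 31,
    "Transitional Shelters/Housing": 44,
    "Systems": 16,
}
counts_2021 = {
    "Unsheltered (Surveyed)": 48,
    "Unsheltered (Observed)": None,
    "Encampments (Surveyed)": 9,
    "Encampments (Observed)": None,
    "Shelters (Emergency + DV)": 92,
    "Transitional Shelters/Housing": 39,
    "Systems": None,
}


def comparable_totals(current, previous):
    """Sum only the locations that have data in both years."""
    shared = [
        loc for loc in current
        if current.get(loc) is not None and previous.get(loc) is not None
    ]
    return (
        sum(current[loc] for loc in shared),
        sum(previous[loc] for loc in shared),
    )


def percent_change(current_total, previous_total):
    """Percentage change from the previous total to the current total."""
    return (current_total - previous_total) / previous_total * 100


# Total for reporting uses every location with data in 2024.
total_2024 = sum(v for v in counts_2024.values() if v is not None)
# Totals for the trend use only locations present in both years.
comparison_2024, comparison_2021 = comparable_totals(counts_2024, counts_2021)
change = percent_change(comparison_2024, comparison_2021)

print(f"2024 total for reporting: {total_2024}")
print(f"Comparable totals: {comparison_2024} vs {comparison_2021}")
print(f"Change: {change:.1f}%")
```

Keeping the reporting total and the comparison total as separate values makes it harder to accidentally report a percentage change that mixes in locations counted for the first time.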

Investigating Trends over Time
If your community participated in the nationally-coordinated PiT Counts in 2016 and/or 2018, or if your community conducted provincial or local PiT Counts in other years, these data points can be used to investigate changes in the rate and experiences of homelessness within your community over time.

Enumeration for Community A (Count)

Location                            2024   2023   2021   2018
Unsheltered (Surveyed)                61     57     48     32
Unsheltered (Observed)                20     16      -      -
Encampments (Surveyed)                22     18      9      -
Encampments (Observed)                 8     10      -      -
Emergency Shelters                    76     80     92*   101*
DV Shelters                           31     26      *      *
Transitional Shelters/Housing         44     35     39     36
Systems [optional]                    16     12      -      -
Total enumeration                    278    254    188    169
Total enumeration for comparison     234    216    188    169
*Reported as a single combined shelter figure for that year.

Chart: Total enumeration for comparison by year (2018 to 2024), by location type

Series                                           2018   2021   2023   2024
Unsheltered Locations (including encampments)      32     57     75     83
Shelters (including DV shelters)                  101     92    106    107
Transitional Shelters/Housing                      36     39     35     44
Total enumeration                                 169    188    216    234

Key Resources for Survey Data Cleaning
For all communities:
- A data cleaning guide is available on GCcollab, under Files > Phase 4: Post-Count. It will be published on the Homelessness Learning Hub in the coming weeks.
For non-HIFIS users:
- A data submission template and data dictionary in Excel are available on GCcollab, under Files > Phase 4: Post-Count.
For HIFIS users:
- A data dictionary for all HIFIS export files is available on the Homelessness Learning Hub.
- Annex A in the data cleaning guide includes specific data cleaning considerations for communities that are not using HIFIS 4.0.60.4.3 or HIFIS Lite.

Reviewing Completed and Incomplete Surveys
"Unclear/Blank Response" is used when a surveyor skipped a question or a survey ended early.
- Reviewing the frequency of "Unclear/Blank Response" for each question, in order, can be used to assess the drop-off rate and inform revisions to the length of the survey questionnaire.
"Decline to Answer" is used to assess the response rate of a survey question (willingness to respond).
- Reviewing the frequency of "Decline to Answer" can be used to identify opportunities for improving the survey design (e.g., assessing inclusivity and cultural sensitivity of language/content, privacy, dignity, and personal security of respondents).
"Don't Know" is used to assess the response rate of a survey question (ability to provide a response).
- Reviewing the frequency of "Don't Know" can be used to identify opportunities for improving the survey design (e.g., assessing accessibility of language, relevance of content).
When calculating the prevalence of characteristics or experiences, these entries should be excluded from the denominator.
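
As an illustration of excluding these entries from the denominator, the sketch below computes a prevalence and the share of each non-response type for one question. The function names and the exact response strings are assumptions; match them to the labels in your own export.

```python
from collections import Counter

# Response labels treated as non-responses (assumed spellings; adjust to your data).
NON_RESPONSES = {"Unclear/Blank Response", "Decline to Answer", "Don't Know"}


def prevalence(responses, target):
    """Share of valid responses equal to `target`, excluding non-responses
    from the denominator. Returns None if there are no valid responses."""
    valid = [r for r in responses if r not in NON_RESPONSES]
    if not valid:
        return None
    return sum(1 for r in valid if r == target) / len(valid)


def non_response_rates(responses):
    """Share of all entries falling under each non-response label."""
    total = len(responses)
    if total == 0:
        return {}
    counts = Counter(responses)
    return {label: counts[label] / total for label in NON_RESPONSES}


example = [
    "Yes", "No", "Yes", "Decline to Answer",
    "Don't Know", "Unclear/Blank Response", "No", "Yes",
]
print(prevalence(example, "Yes"))
print(non_response_rates(example))
```

Reporting the non-response rates alongside the prevalence keeps the size of the excluded group visible to readers.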

Reviewing Screening Questions and Survey Eligibility
IF the response to screening question C (overnight location) is one of the following:
- Someone else's house;
- Hotel/motel self-funded;
- Hospital;
- Treatment centre;
- Jail, prison, remand centre.
AND the response to optional screening question C1 (access to a safe and permanent residence) is one of the following:
- Yes;
- Decline to answer.
This respondent should have been screened out of the survey.
Filtering for the above responses to the c1_permanent residence variable and dropping these rows from the survey dataset will ensure only individuals eligible to complete the survey are included in the analysis. Moving the dropped rows to a new file permits review of the responses that were screened out.
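
The screening rule above could be applied along these lines. This is a minimal sketch: the dictionary keys `c_overnight_location` and `c1_permanent_residence` are hypothetical column names, and the response strings are assumed to match the options listed on the slide exactly.

```python
# Responses to screening question C that trigger the C1 check (from the slide).
SCREEN_OUT_LOCATIONS = {
    "Someone else's house",
    "Hotel/motel self-funded",
    "Hospital",
    "Treatment centre",
    "Jail, prison, remand centre",
}
# Responses to C1 that mean the respondent should have been screened out.
SCREEN_OUT_C1 = {"Yes", "Decline to answer"}


def split_by_eligibility(rows):
    """Return (eligible, screened_out) lists without discarding any row,
    so the screened-out responses can be saved to a separate file for review."""
    eligible, screened_out = [], []
    for row in rows:
        should_screen_out = (
            row.get("c_overnight_location") in SCREEN_OUT_LOCATIONS
            and row.get("c1_permanent_residence") in SCREEN_OUT_C1
        )
        (screened_out if should_screen_out else eligible).append(row)
    return eligible, screened_out


surveys = [
    {"id": 1, "c_overnight_location": "Hospital", "c1_permanent_residence": "Yes"},
    {"id": 2, "c_overnight_location": "Hospital", "c1_permanent_residence": "No"},
    {"id": 3, "c_overnight_location": "Encampment", "c1_permanent_residence": ""},
]
eligible, screened_out = split_by_eligibility(surveys)
print([r["id"] for r in eligible], [r["id"] for r in screened_out])
```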

Addressing Duplicate Observations
- The screening question ("Have you answered a survey with a person with this (identifier)?") can be used to filter out people who have already responded.
- A unique identifier (e.g., randomly generated, or non-randomly generated using first two letters of first name, first two letters of last name, day of birth) helps to flag potential duplicate observations using basic personal information while maintaining anonymity.
- If your community is collecting names through its PiT Count (e.g., to create or update a Unique Identifier List or By-Name List), please ensure that the data submitted to the Government of Canada is anonymized, such that no individual person is identifiable.
- Check your survey dataset for overly similar observations to flag suspected duplicates using the following fields: Age; Gender; Racial identity; and Age of first homeless experience.
- When deciding which row to keep, consider the survey dates/times and completeness of survey responses. Do not combine answers from separate observations.
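
One way to flag suspected duplicates on the four fields named above is to group rows that share identical values. The sketch below uses exact matching, which is a simplification of "overly similar", and the field names (`survey_id`, `age`, `gender`, `racial_identity`, `age_first_homeless`) are hypothetical. It only flags groups for manual review and does not delete or merge anything.

```python
from collections import defaultdict

# Hypothetical column names for the four comparison fields.
MATCH_FIELDS = ("age", "gender", "racial_identity", "age_first_homeless")


def flag_suspected_duplicates(rows):
    """Group survey IDs whose comparison fields are all identical.
    Rows with a missing comparison field are skipped, since blanks
    would otherwise match each other spuriously."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row.get(field) for field in MATCH_FIELDS)
        if any(value in (None, "") for value in key):
            continue
        groups[key].append(row["survey_id"])
    return [ids for ids in groups.values() if len(ids) > 1]


surveys = [
    {"survey_id": "A1", "age": 34, "gender": "Man",
     "racial_identity": "White", "age_first_homeless": 19},
    {"survey_id": "B7", "age": 34, "gender": "Man",
     "racial_identity": "White", "age_first_homeless": 19},
    {"survey_id": "C3", "age": 52, "gender": "Woman",
     "racial_identity": "Black", "age_first_homeless": 40},
]
print(flag_suspected_duplicates(surveys))
```

Each flagged group still needs a human decision about which row to keep, based on survey date/time and completeness.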

Overnight Locations
- Review write-in responses and re-categorize under pre-existing overnight locations, if possible. Review text input responses for exact or approximate/logical matches to response options from screening question C. Do not remove the write-in response column.
- Create a standardized location variable based on responses to the screening question.

Response in c_overnight location                                                                       Standardized response in Location
HOMELESS SHELTER (e.g. emergency, family or domestic violence shelter, warming centre, drop-in)        Sheltered
UNSHELTERED IN A PUBLIC SPACE (e.g. street, park, bus shelter, forest, or abandoned building)          Unsheltered
VEHICLE (e.g. car, van, recreational vehicle (RV), truck, boat)                                        Unsheltered
ENCAMPMENT (e.g. group of tents, makeshift shelters, or other long-term outdoor settlement)            Encampments
TRANSITIONAL SHELTER/HOUSING                                                                           Transitional Housing
HOTEL/MOTEL FUNDED BY CITY OR HOMELESS PROGRAM                                                         Hotel/Motel
HOSPITAL                                                                                               Systems
TREATMENT CENTRE                                                                                       Systems
JAIL, PRISON, REMAND CENTRE                                                                            Systems
SOMEONE ELSE'S PLACE                                                                                   Hidden Homelessness
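
A lookup table is one simple way to build the standardized location variable. In the sketch below, the pairing of VEHICLE with "Unsheltered" and of HOSPITAL, TREATMENT CENTRE, and JAIL with "Systems" is inferred from the slide layout and should be checked against the data cleaning guide. The function name and the prefix-matching approach are assumptions.

```python
# Mapping from the leading label of the screening response to the standardized
# location. The VEHICLE and Systems pairings are inferred from the slide.
LOCATION_MAP = {
    "HOMELESS SHELTER": "Sheltered",
    "UNSHELTERED IN A PUBLIC SPACE": "Unsheltered",
    "VEHICLE": "Unsheltered",
    "ENCAMPMENT": "Encampments",
    "TRANSITIONAL SHELTER/HOUSING": "Transitional Housing",
    "HOTEL/MOTEL FUNDED BY CITY OR HOMELESS PROGRAM": "Hotel/Motel",
    "HOSPITAL": "Systems",
    "TREATMENT CENTRE": "Systems",
    "JAIL, PRISON, REMAND CENTRE": "Systems",
    "SOMEONE ELSE'S PLACE": "Hidden Homelessness",
}


def standardize_location(response):
    """Return the standardized location, or None when the response is blank
    or unrecognized (e.g. a write-in that still needs manual review)."""
    if not response:
        return None
    # Drop the "(e.g. ...)" examples and normalize case before the lookup.
    label = response.split("(")[0].strip().upper()
    return LOCATION_MAP.get(label)


print(standardize_location("VEHICLE (e.g. car, van, recreational vehicle (RV), truck, boat)"))
print(standardize_location("Hospital"))
print(standardize_location("Couch at a friend's"))
```

Returning None for unmatched text, rather than guessing, leaves write-in responses visible for the manual re-categorization step.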

Locations in the Past Year
To analyze active and recent shelter use, filter the screening question for the response "Homeless Shelter (Emergency, Family or Domestic Violence Shelter)" and ensure that the variable for location in the past year reflects that the question was answered and that q01_homeless shelter variable = 1.
It is recommended to repeat the above steps for all locations (unsheltered, encampments, transitional, etc.), so that the respondent's location on the night of the count is reflected accordingly.

Sociodemographic variables: Connecting Families
Family composition can be categorized into the following types:
- Single individual;
- Couple/multiple adults without child(ren);
- Single parent/guardian; and
- Couple/multiple adults with child(ren).
Create a binary variable to indicate rows where pet(s) are associated with the survey number of the respondent or the family head of the respondent.
During data entry, the survey number of the Family Head should be indicated for all family members, including the Family Head themselves.
If multiple parents/guardians within a family unit reported their children and their ages, it is important to ensure only one row is added for each child.
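
Once family members are linked by the Family Head survey number, the four composition types follow from the number of adults and children in each unit. The function below is a sketch under that assumption; how adults and children are counted from your dataset is not specified here and would need to follow the data cleaning guide.

```python
def family_composition(n_adults, n_children):
    """Classify a family unit into one of the four composition types,
    given counts of adults and children linked to the same Family Head."""
    if n_adults < 1:
        raise ValueError("a family unit needs at least one adult respondent")
    if n_children > 0:
        if n_adults == 1:
            return "Single parent/guardian"
        return "Couple/multiple adults with child(ren)"
    if n_adults == 1:
        return "Single individual"
    return "Couple/multiple adults without child(ren)"


print(family_composition(1, 0))
print(family_composition(2, 3))
```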

Crosstabulations of Survey Data
Crosstabulations (or pivot tables in Excel) can be used to examine relationships between two or more variables and to find patterns within the data.
These are tables of two or more dimensions that record the frequency of respondents broken down by the characteristics described by the variables.
Representing these frequencies as a percentage of the column or row total highlights differences in the composition of each group in terms of the other variable.
A crosstabulation is a descriptive statistic, meaning that it can only be used to make statements about the sample of people who responded to the survey. That is, the relationships between variables cannot be generalized to the full enumeration.

Example visualization: Age group distribution, by overnight location

Overnight location   Youth (13-24)   Adult (25-49)   Older Adult (50-64)   Senior (65+)
Unsheltered                     7%             68%                   22%             3%
Shelters                       11%             53%                   29%             6%
Transitional                   22%             49%                   23%             6%
Hotels/Motels                   7%             56%                   30%             7%
Systems                        10%             73%                   15%             2%
Hidden                         17%             57%                   22%             4%
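
Outside of Excel, the same kind of table can be built with a few lines of code. The sketch below counts respondents by two variables and expresses each cell as a percentage of its column total (here, the share of each age group within an overnight location). The function name and the `age` and `loc` keys are hypothetical, and non-responses are assumed to have been filtered out beforehand.

```python
from collections import Counter


def crosstab_percent(rows, row_var, col_var):
    """Return {(row_value, col_value): percent of column total}, rounded to 1 decimal."""
    counts = Counter((r[row_var], r[col_var]) for r in rows)
    col_totals = Counter()
    for (_, col_value), n in counts.items():
        col_totals[col_value] += n
    return {
        (row_value, col_value): round(100 * n / col_totals[col_value], 1)
        for (row_value, col_value), n in counts.items()
    }


respondents = [
    {"age": "Youth", "loc": "Unsheltered"},
    {"age": "Adult", "loc": "Unsheltered"},
    {"age": "Adult", "loc": "Unsheltered"},
    {"age": "Adult", "loc": "Unsheltered"},
    {"age": "Adult", "loc": "Shelters"},
    {"age": "Senior", "loc": "Shelters"},
]
table = crosstab_percent(respondents, "age", "loc")
for (age, loc), pct in sorted(table.items()):
    print(f"{loc:12s} {age:8s} {pct}%")
```

Swapping the two variable names gives percentages of the other dimension's totals, which answers a different question (for example, where youth are staying rather than who is staying in shelters).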

Visualizations of Survey Data
Example visualization: Age group distribution, by overnight location

Federal Reports Using PiT Data
To view how HICC represents nationally-aggregated PiT data from 2020-2022, please refer to the following publications:
- Everyone Counts 2020-2022 - Results from the Third Nationally-Coordinated Point-in-Time Counts of Homelessness
- Homelessness data snapshot: Mental health, substance use, and homelessness in Canada
- Homelessness data snapshot: Homelessness among racialized populations
- Homelessness data snapshot: Youth homelessness in Canada

Adding Context to Enumeration Data
Data sources: PiT surveys, administrative data
Coverage:
- Leveraging relationships and knowledge of outreach teams;
- Considering the potential influence of police or bylaw authorities in displacement of people leading up to the Counts;
- Leveraging GIS data, heat mapping; and
- Sampling methods, extrapolation and data quality checks (e.g., use of volunteers as control persons to estimate % of unsheltered population missed by surveyors).
Family composition: Using enumeration data to investigate the prevalence of family homelessness can inform a better understanding of program needs (family vs. individual supports).
Aggregate occupancy information:
- By service provider type (emergency shelter [general/family/youth], extreme weather shelter, DV shelter, transitional shelter/housing);
- Number of facilities;
- Number of beds; and
- Number of beds occupied on enumeration night.

Sociodemographic variables: Age
For the purposes of analysis, respondents are grouped according to the following mutually-exclusive age ranges:
- children (aged 0-12);
- youth (aged 13-24);
- adults (aged 25-49);
- older adults (aged 50-64); and
- seniors (aged 65+).
To generate an AgeRange variable in Excel, create a column with a formula:
    =IF(AO3="","",IF(AO3<13,"Children (0-12)",IF(AO3<25,"Youth (13-24)",IF(AO3<50,"Adults (25-49)",IF(AO3<65,"Older Adults (50-64)",IF(AO3>=65,"Seniors (65+)"))))))
Create a crosstabulation (or pivot table in Excel) to compare the q03aageyears variable and AgeRange. Confirm that the age ranges are inclusive of all responses, mutually exclusive, and aligned with the age ranges specified for reporting.
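
For communities working outside Excel, the same grouping can be written as a small function. This sketch mirrors the formula above, including returning an empty string for blank ages. The function name is an assumption, and ages are assumed to be stored as numbers or numeric text.

```python
def age_range(age):
    """Return the reporting age range for an age in years, or "" if blank."""
    if age is None or age == "":
        return ""
    age = float(age)
    if age < 13:
        return "Children (0-12)"
    if age < 25:
        return "Youth (13-24)"
    if age < 50:
        return "Adults (25-49)"
    if age < 65:
        return "Older Adults (50-64)"
    return "Seniors (65+)"


# Checking the boundary ages confirms the ranges are exhaustive and do not overlap.
for boundary in (12, 13, 24, 25, 49, 50, 64, 65):
    print(boundary, age_range(boundary))
```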

Sociodemographic variables: Gender and Sexual Identity
A 2SLGBTQI+ variable can be created using a combination of responses from the questions regarding gender and sexual identity.
- If the Gender variable = "Gender Diverse", compute 2SLGBTQI+ variable = 1.
- Respondents who report a sexual orientation of Gay, Lesbian, Bisexual, Two-Spirit, Pansexual, Asexual, Queer, Questioning, or Not Listed are categorized as 2SLGBTQI+. (In this case, compute 2SLGBTQI+ variable = 1.)
- Respondents who indicate a gender identity of Man or Woman and a sexual orientation of Heterosexual are categorized as non-2SLGBTQI+. (In this case, compute 2SLGBTQI+ variable = 0.)
- Respondents do not need to respond to both questions in order to meet the criteria for 2SLGBTQI+ = 1. However, both questions must be completed in order to meet the criteria for 2SLGBTQI+ = 0.

Responses from q12_gender identity                                           Gender Variable
Man                                                                          Man
Woman                                                                        Woman
Two-Spirit, Trans Woman, Trans Man, Non-Binary (Genderqueer), Not Listed     Gender Diverse
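
The rules above can be sketched as a function that returns 1, 0, or no value. The response strings are taken from the slide, but the function name, the use of None for "cannot be determined", and the assumption that responses arrive as exact text matches are illustrative only.

```python
# Gender responses recoded to the Gender variable, per the slide's table.
GENDER_MAP = {
    "Man": "Man",
    "Woman": "Woman",
    "Two-Spirit": "Gender Diverse",
    "Trans Woman": "Gender Diverse",
    "Trans Man": "Gender Diverse",
    "Non-Binary (Genderqueer)": "Gender Diverse",
    "Not Listed": "Gender Diverse",
}
# Sexual orientation responses categorized as 2SLGBTQI+.
QUEER_ORIENTATIONS = {
    "Gay", "Lesbian", "Bisexual", "Two-Spirit", "Pansexual",
    "Asexual", "Queer", "Questioning", "Not Listed",
}


def two_slgbtqi_plus(gender_response, orientation_response):
    """Return 1, 0, or None (not enough information to classify)."""
    gender = GENDER_MAP.get(gender_response)
    # Either question alone is enough to assign 1.
    if gender == "Gender Diverse" or orientation_response in QUEER_ORIENTATIONS:
        return 1
    # Both questions must be answered to assign 0.
    if gender in ("Man", "Woman") and orientation_response == "Heterosexual":
        return 0
    return None


print(two_slgbtqi_plus("Trans Man", None))
print(two_slgbtqi_plus("Woman", "Heterosexual"))
print(two_slgbtqi_plus("Man", "Decline to Answer"))
```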

Sociodemographic variables: Newcomer Experience
The term "newcomer" applies to respondents who have identified as having the experience of arriving in Canada under one of the following statuses: Immigrant; Refugee; Asylum claimant; Temporary foreign worker; Other work permit; Student/Study permit; Temporary resident; or Other.
- If one of these responses is indicated, compute NewcomerExperience variable = 1.
- If the response to this question is "No", compute NewcomerExperience = 0.
If a respondent indicates having an experience as a newcomer, indicates the duration of time since their arrival, AND selected "Always been here" as a response to question 7 regarding time in the community, "Always been here" can be replaced by the duration indicated in the response to question 6.
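
A minimal sketch of both rules follows. The function and parameter names are hypothetical, the status strings are assumed to match the slide's wording exactly, and any other response (for example a non-response) is left unclassified as None.

```python
# Arrival statuses that count as newcomer experience (from the slide).
NEWCOMER_STATUSES = {
    "Immigrant", "Refugee", "Asylum claimant", "Temporary foreign worker",
    "Other work permit", "Student/Study permit", "Temporary resident", "Other",
}


def newcomer_experience(response):
    """Return 1 for a newcomer status, 0 for "No", otherwise None."""
    if response in NEWCOMER_STATUSES:
        return 1
    if response == "No":
        return 0
    return None


def clean_time_in_community(newcomer_flag, time_since_arrival, time_in_community):
    """Replace "Always been here" with the time since arrival in Canada when
    the respondent is a newcomer and reported that duration (question 6)."""
    if (
        newcomer_flag == 1
        and time_since_arrival not in (None, "")
        and time_in_community == "Always been here"
    ):
        return time_since_arrival
    return time_in_community


print(newcomer_experience("Refugee"))
print(clean_time_in_community(1, "3 years", "Always been here"))
```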

Sociodemographic variables: Migration and Time in Community
This question asks respondents whether they moved from another community and how long they have been in the current community where the PiT Count is taking place.
If a respondent indicates "Always been here" but then provides the name of a previous city, clean the data as follows:
1. Ensure "Answered" is marked for variable q07_answered name of previous community.
2. Replace the "Always been here" response and mark it as "Unclear/Blank Response" for variable q07_answered duration in community.
If possible, it is also recommended to go through all the responses provided for the name of previous communities and ensure that the Country and Province fields are entered as well. This will help when conducting analyses of interprovincial migration, migration within a province, and migration into Canada from another country.

Sociodemographic variables: Indigenous Status and Racial Identity
Each racial identity is represented as a binary column in the HIFIS output and Excel data template for non-HIFIS users.
- If a respondent identifies only as White, they are categorized in the "Non-racialized group".
- If a respondent identifies as Indigenous (First Nations, Métis, or Inuit) and does not identify as a member of a racialized group in 8b, they are categorized as "Indigenous only".
- If a respondent identifies as a member of one or more racialized groups in 8b, they are categorized as "Racialized group".
Creating a binary Indigenous identity variable permits analysis of the intersection of Indigenous and racial identity.
To align with the Census of Population conducted by Statistics Canada, Indigenous identity and racial identity are assessed through two separate questions. In accordance with the Employment Equity Act, respondents who identify as Indigenous are categorized as "Not a visible minority". However, when representing all population groups, they are included in a distinct category labeled "Indigenous peoples".

Sociodemographic variables: Indigenous Status and Racial Identity
Some additional data cleaning considerations:
1. If an Indigenous Identity is chosen but the Racial Identity question is unanswered, the blank response for Racial Identity should be changed to "Indigenous Only".
2. If a respondent indicates having Indigenous Ancestry and left the Racial Identity question unanswered, the response to question 8b should remain as an "Unclear/Blank Response".
3. If possible, it is recommended to review the responses that are set as "Not listed", which includes the write-in responses, to check whether they can be categorized into existing response options. This can be done by searching the responses in the q08b_Not listed variable for exact or approximate/logical matches to each of the pre-existing racial identities. The re-categorized responses should not be deleted.
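
The categorization rules and the first two cleaning considerations can be combined into one function, sketched below. Several things here are assumptions rather than guidance from the slides: the function name, representing question 8b as a set of selected responses, treating every selection other than "White" and the non-response labels as a racialized group (including "Not listed" entries that have not yet been re-categorized), and checking for racialized groups before Indigenous identity.

```python
# Labels that do not count as a racial identity selection (assumed spellings).
NON_RESPONSES = {"Unclear/Blank Response", "Decline to Answer", "Don't Know"}


def population_group(identifies_indigenous, racial_identities):
    """Categorize a respondent from a binary Indigenous identity flag and the
    set of responses selected in question 8b."""
    selected = set(racial_identities) - NON_RESPONSES
    racialized = selected - {"White"}
    if racialized:
        return "Racialized group"
    if identifies_indigenous:
        # Covers Indigenous respondents who left 8b blank or selected only White.
        return "Indigenous only"
    if selected == {"White"}:
        return "Non-racialized group"
    # Includes respondents with Indigenous ancestry (but not identity) and a blank 8b.
    return "Unclear/Blank Response"


print(population_group(False, {"White"}))
print(population_group(True, set()))
print(population_group(True, {"South Asian"}))
print(population_group(False, set()))
```

Keeping the binary Indigenous identity flag as its own column, alongside this category, preserves the ability to analyze respondents who are both Indigenous and members of a racialized group.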

Sociodemographic variables: Veteran Status
The response options are: "Yes, Military", "Yes, RCMP", "Both Military and RCMP", "No", "Don't Know", and "Decline to Answer".
For the purposes of analysis, the first three responses are aggregated into a binary variable where 1 = Veteran and 0 = Non-veteran.
In Excel, a formula to calculate this might look something like the following (it codes "Don't Know" and "Decline to Answer" as 3 and leaves blank responses empty):
    =IF(BX3="Unclear / Blank Response","",IF(BX3="","",IF(BX3="Yes, Military",1,IF(BX3="Yes, RCMP",1,IF(BX3="Both Military and RCMP",1,IF(BX3="No",0,IF(BX3="Don't Know",3,IF(BX3="Decline to Answer",3))))))))
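
An equivalent recode outside Excel could look like the sketch below. It reproduces the codes used in the formula (1, 0, 3, and empty). The function name is an assumption, and returning an empty string for any unrecognized text is a choice made here, whereas the Excel formula would return FALSE in that case.

```python
# Codes taken from the Excel formula: 1 = Veteran, 0 = Non-veteran,
# 3 = Don't Know / Decline to Answer.
VETERAN_CODES = {
    "Yes, Military": 1,
    "Yes, RCMP": 1,
    "Both Military and RCMP": 1,
    "No": 0,
    "Don't Know": 3,
    "Decline to Answer": 3,
}


def veteran_status(response):
    """Return the veteran code for a response, or "" for blank,
    "Unclear / Blank Response", or unrecognized text (an assumption)."""
    if response in (None, "", "Unclear / Blank Response"):
        return ""
    return VETERAN_CODES.get(response, "")


for answer in ("Yes, RCMP", "No", "Don't Know", ""):
    print(repr(answer), "->", repr(veteran_status(answer)))
```

When calculating the prevalence of veteran status, rows coded 3 or left empty would be excluded from the denominator, consistent with the guidance on non-responses.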