Studying Workers' Travel Patterns to Employment Centers Using Census Data

>> Coordinator: Welcome and thank you for standing by. Currently all participants are in listen-only mode. Today's webinar is being recorded and the recording will be posted publicly. If you have any objections, you may disconnect at this time. Now I'd like to turn the call over to your host for today, Earlene Dowell.

>> Earlene Dowell: Thank you, Lisa. Before we begin, a few housekeeping items. We will save all our questions to the end of the presentation. Please add your questions into the Q and A box in the lower right-hand corner and we will read them out loud after the presentation. The chat has been disabled for our attendees at this time.

Good afternoon, everyone, and thank you for joining us for the first Local Employment Dynamics, or LED, webinar of 2024. On behalf of the U.S. Census Bureau and the LED partnership, in collaboration with the Council for Community and Economic Research and the Labor Market Information Institute, it is my pleasure to welcome Sadra Sharifi, Emmanuel Lucban, and Liang Tian as they present Studying Workers' Travel Patterns to Employment Centers Using Census Data. This study explores workers' historical travel patterns utilizing the Longitudinal Employer-Household Dynamics Origin-Destination Employment Statistics, or LODES, data to gain insight into the relationship between demographic characteristics and primary employment centers in the San Diego region. Results of this study will help local policy makers and key stakeholders to institute data-driven programs and policies to improve regional transportation systems, leading to reduced congestion and emissions.

Sadra Sharifi is an associate data scientist on the San Diego Association of Governments, or SANDAG, data science team. He has over five years of experience working on various data science and modeling projects. Sadra holds a PhD degree in transportation engineering from Utah State University with concentrations on transportation data analysis, transportation
model developments, and statistical modeling.

Emmanuel Lucban is an associate data scientist at SANDAG. Emmanuel has years of experience in developing data science related production applications, data engineering processes, and IT management. Emmanuel holds a master of science degree in applied data science from the University of San Diego and a bachelor's degree in data science and computer science from the University of California, Berkeley, with concentrations in data analysis, data engineering, and machine learning.

Liang Tian is a principal overseeing data science management and analytical work program areas at SANDAG. Liang has led cross-functional teams to deliver high-performing predictive science and analytic solutions in various application domains across global markets. Liang has published 20-plus journal articles and book chapters and is the first named inventor on a U.S. patent application in the financial risk management domain. Liang holds a PhD in computer science with a specialization in computational intelligence methods and its applications. Now I'd like to turn the call over to Liang.

>> Liang Tian: Thank you. Thank you so much, Earlene, for the warm introduction. Really appreciate that. So first of all, we are so glad to be here and truly appreciate the organizing committee for giving us this opportunity to share our original research from a local planning agency perspective. I'm also excited to have my two colleagues, Sadra Sharifi and Emmanuel Lucban, join me today to cover the presentation. So our topic today is the research study on workers' travel and commute patterns to major employment centers in the San Diego region. Next slide please.

So we will start with a brief introduction to SANDAG and to the San Diego region, and also introduce what we do in the data science team. And we will cover two parts in this presentation. So part one will be focusing more on the historical trend analysis and part two will cover the more
extended analysis and also the behavior analysis using the most recent data.

All right. So the San Diego Association of Governments, SANDAG for short, is a federally designated metropolitan planning organization. And it's also a state-designated transportation planning agency. So as a metropolitan planning organization and a council of governments, SANDAG is bringing together local decision makers, jurisdictions, and also the public to develop solutions to regional issues including improving transportation, air quality, sustainable energy, economic development, and also enhancing transportation across our binational region, across the U.S. and Mexico. Public health, public safety, housing, and so much more.

All right. So the San Diego region is a very diverse and growing place to live and work. According to the U.S. Census Bureau and the California Employment Development Department, the population and employment in the region have increased significantly in the last 10 years. So at SANDAG we use data as the foundation of pretty much everything we do: for planning, for building, and for preserving resources. Sadra, Emmanuel, and I are part of the data science department. Our entire data science team is truly passionate about data, passionate about analytics, and we are developing products and performing research by leveraging our skill set to really help make better informed decisions around regional issues.

Understanding where people live and how people commute to work and to major employment centers has been and always will be a foundation of our regional planning effort. Further, we should also focus on understanding how people are making different travel decisions. Right? Is it by walking or biking, or driving by yourself, or carpooling with someone else? Right? Or taking a bus, freeways, or side streets? So those are the questions we all really need to understand better. Okay. So with that I'm going to turn it over to Emmanuel to cover part one of the presentation. Emmanuel.

>> Emmanuel Lucban: Can you hear me? All right. Thank you, Liang. So hello everyone. My name's Emmanuel Lucban. I am an associate data scientist for SANDAG. And I'm going to go over part one of the analysis, which is going to be historical trends.

Okay. So for this study one can gain insight into the following. So, you know, first of all, what are employment centers in the region? So employment centers are essentially areas of high density
employers. And then, what are the trends or changes over time for job growth, gender, worker age distributions, and income? And finally, have there been any changes to workplace proximities over time?

Okay. So let me give you a quick introduction to what SANDAG refers to as employment centers. So on the map on the right you see these colored portions. The colored portions on the map denote areas with a significant density of employers. So how did we identify these areas? Using SANDAG's employment estimates and other data, along with a density-based spatial clustering method, SANDAG was able to identify 79 clusters that we refer to as employment centers. These employment centers were then organized into four tiers based on the number of employees within those employment centers. So if you take a look at the table, you see tier one centers are centers that have over 75,000 employees. These tend to be large companies and corporate headquarters, and there are three employment centers that fall into this tier. Tier 2 employment centers are employment centers with 25,000 to 75,000 employees, and there are 10 employment centers that fall into this tier. Tier 3 employment centers have 15,000 to 25,000 employees, and there are 15 employment centers that fall into this tier. And finally tier 4 is employment centers with 2,500 to 15,000 employees, and there are 51 employment
centers in this tier.

So let's examine the underlying sectors that make up the four tiers of employment centers. For tier one employment centers we can see that the predominant sector by far, at 26.9%, is the professional, scientific, and technical sector, followed by accommodation and food and then the healthcare sector. For tier two it's predominantly healthcare, closely followed by retail and accommodation and food. And then here we see for tier three it's predominantly manufacturing, retail, and then accommodation and food. And then for tier four we see it's predominantly healthcare followed by accommodation and food and then retail.

Okay then, so using the Longitudinal Employer-Household Dynamics, or LEHD, data, let's examine some of the characteristics of the workers that work in these employment centers and their travel patterns. So for this study we're using the LEHD Origin-Destination Employment Statistics, or LODES, version 7.5 data, and we're covering the time period from 2012 to 2019. We're going to look at employment and residential areas, so we're using the origin-destination tables and also the work area characteristics tables, and we're going to be looking at the census block level. We decided to use job type JT03, which is all primary private jobs, to better align with the objective of the study.

Okay. So exploring historical patterns. We want to obtain insights into the relationships between socio-demographic characteristics and the employment center tiers. Some of the characteristics we'll be looking at over time using the LEHD LODES data are job growth, gender, age, income distribution, educational attainment, and proximity. And I just want to make a note. We mentioned proximity, and in this case we want to make a distinction between travel distance and proximity. For part one we define proximity as the direct distance from origin to destination. So think of a straight line rather than actual travel distance going through a transportation network.

So the first characteristic that we'll be examining is job growth. On the top bar chart we have the year-over-year growth in the number of jobs from 2012 to 2019 for all employment center tiers. And on the bottom we have a line chart, and it's a breakdown of job growth by each individual tier. As you can see from the top chart, job growth for all employment centers experienced a steady increase, with the largest year-over-year job growth occurring in 2016 at 3.1%. In the bottom chart we see that tier one employment centers see significant, steady job growth across all years, while tier three and tier four job growth flattens out starting around 2016.

And examining gender distributions for each tier, the left graph shows the proportion of female workers by tier over time and the right graph shows the proportion of male workers by tier over time. We can see that the proportion of male to female workers in each tier remained stable, relatively unchanged, over time. We do observe that tier four employment centers have the smallest difference, with an almost balanced proportion of male to female workers. Tier three employment centers are observed to have the largest difference in the proportion of male to female workers. And tier one and tier two differences in proportions are between tier four
and tier three, but still have a higher male to female worker ratio. So one of the potential contributors to the observed proportional differences in gender could be the predominant sectors that comprise the employment center tiers. If you recall, tier four's top three sectors are healthcare, accommodation and food, and then retail, while tier three's top three sectors are manufacturing, retail, and then accommodation and food.

Okay. So looking at the worker age distributions, we have four condensed stacked bar charts, one for each tier. The charts were condensed for readability. Over time we observed an increase in the proportion of workers aged 55 and older in all tiers, while observing a decrease in the proportion of workers under the age of 30, excluding tier 1 employment centers, which show a very marginal increase in that age group. We can also see that tier 4 employment centers employ the largest proportion of workers under the age of 30 compared to other tiers, while tier 1 employment centers have the largest proportion of workers aged 30 to 54.

Okay. So comparing income distributions from 2012 versus 2019. The top donut charts are income distributions for tier one and the bottom ones for tier two. The left side is going to be 2012 and the right side 2019. The green part of the chart shows the proportion of workers making over $3,333 per month, and we'll just call this the highest income group. So we can see
that the green part shows that tier 1 has the largest proportion of this group compared to other tiers and has increased 8% from 2012 to 2019. For tier 2 and tier 3 employment centers, next slide please, they both showed a 9% increase in the highest paid group from 2012 to 2019. For tier 4 employment centers, which have the smallest proportion of this highest paid group, we still observed a 7% increase, from 36% to 43%.

Okay. So let's look at educational attainment. For educational attainment, again we have condensed bar charts for each employment center tier. Over time we observed this interesting trend: for all employment center tiers, we have observed a steady proportional decrease of workers with a college or advanced degree. From SANDAG's economic analysis we have been observing an increase in jobs that don't require college degrees, which may be a contributing factor to this trend. Even with an overall decrease, all employment center tiers still have a larger proportion of workers with a college or advanced degree than those without, with tier one employment centers having the largest proportion of college-educated workers among all tiers. This is most likely attributable to the sectors that comprise tier one employment centers, which are predominantly professional, scientific, and technical.

Okay. So next we'll examine worker proximities to employment centers. As we described before, we define proximity as an origin block to
destination block Euclidean distance rather than travel distance on the transportation network. In this animated map we want to give you an idea of the changes in weighted average proximity of workers to tier one employment centers over time. The black outline denotes the location of the tier one employment centers. As we can observe from 2012 to 2019, there are pretty visible origin block changes, particularly for those with average proximities of 30-plus miles, which are the blocks in red. We can see that these origin blocks are concentrated in the northern part of San Diego over time.

So for a deeper dive into proximity, the bar charts here display the proximity trends of each employment center for the last five years available in the LEHD data. We can see that the proportion of workers with a proximity of 10-plus miles is decreasing overall. Especially in the tier 1 employment centers, we see a steady increase over time, and they proportionally have the largest percentage of workers with a proximity of 10-plus miles and the smallest percentage of workers with a proximity of 0 to 5 miles compared to other tiers. So workers in tier 1 and certainly in tier 3 employment centers tend to reside farthest from their workplaces. And workers in tier 2 and tier 4 employment centers have very similar proximity characteristics.

Okay. So that concludes part one, the historical analysis of employment center tiers using
LEHD LODES 7.5. So I'm going to pass it off to Sadra, who's going to give a more in-depth analysis of the travel patterns of select employment centers.

>> Sadra Sharifi: Thank you, Emmanuel. Hello, everyone. In this part I'm going to talk about our extended analysis of the employment centers. In this analysis we explored the primary origins of employees traveling to their employment centers. This will help us understand where employees are distributed over the region. We examined whether there are significant differences in commute distances among different age groups. We analyzed travel pattern variations among employees with different levels of income. This will help us identify potential disparities in our transportation systems. We identified the preferred mode of transportation selected by employees to commute to their workplaces. And finally we investigated how travel distances vary across different employment centers. For this part we used 2019
LEHD version 8, which represents the normal travel behavior before the COVID period. And we also used two additional data sources, the SANDAG activity-based model and Replica. These two sources provide travel-related data such as travel distances on the transportation network and the mode choice selected by employees. This helps us couple these sources with LEHD data and do deeper analysis. Similar to the first part, we focused on private primary jobs.

For our analysis we selected four diverse employment centers, one representing each tier. Sorrento Valley, located in the northern part of San Diego, is a technology and innovation hub. It's home to many tech companies, startups, and research institutions such as UC San Diego. Mission Valley, located in the central part of the San Diego region, is an urban hub with a mix of commercial, retail, and residential spaces, and there's also a transit center within this employment center. Chula Vista, in the southern part of the region, is a diverse community with a growing manufacturing sector. And Santee, on the eastern side, is a suburban community with a mix of businesses and industries.

These two maps show the residence locations of employees working in Sorrento Valley and Mission Valley, obtained from the LEHD data source. Each point on the map represents one employee, and you can clearly observe that employees working in Sorrento Valley, on the left map, are living all over the region: they're distributed across the northern, central, and southern parts. But for Mission Valley, the majority of employees are living in the central and southern parts of the region.

These are two similar maps for Chula Vista and Santee. In Chula Vista a major part of the employees are living in the southern part of the region, and the concentration of employees living within the employment center is very high. And for Santee, most employees are living very close to the employment center in the eastern part of the region.

Using LEHD data coupled with Replica, we can examine the travel distances to different employment centers. This plot shows the cumulative distribution of travel distance to the employment centers. The horizontal axis shows the travel distance and the vertical axis shows the percentage of employees. From this graph we can extract the 50th percentile, or median, of travel distances to these employment centers, and we can clearly see that employees are experiencing longer travel distances to Sorrento Valley and Mission Valley compared to Chula Vista and Santee. Specifically, the median travel distance to Sorrento Valley is more than two times that to Chula Vista. And also, looking at these curves, we can find that the behaviors for Chula Vista and Santee are very similar up to the 50th percentile, but after that level the curve for Santee moves toward those for Mission Valley and Sorrento Valley. Knowing this, we can understand commuting behaviors much better, and potentially we can optimize our transportation network to be efficient for all employees.

This plot here shows the travel distances by age group. The horizontal axis shows the travel distances on the transportation network in five-mile bins, and the colored bars show the proportion of employees in each age group. The green bars are for employees aged 29 or younger, the purple ones for ages 30 to 54, and the orange ones for ages 55 and over. Using these two graphs, first we can compare the overall distributions, and we can observe that a higher portion of employees are traveling more than 20 miles going to Sorrento Valley compared to Mission Valley. And also, comparing the age groups, you can see that a higher portion of employees aged 55 and over are experiencing longer distances compared to other age groups.

These are similar plots for Chula Vista and Santee. The plot for Chula Vista is very heavily concentrated toward the left, meaning that the average travel distance to this employment center is the lowest among all employment centers. And comparing the age groups, we can see that the 55-plus age group is experiencing very short travel distances, up to 5 miles, compared to other age groups. But this observation is reversed in Santee. About 31% of employees aged 29 or younger are experiencing travel distances up to 5 miles, but this value is 27% for other age groups. Knowing this will help us define more efficient policies considering age group needs.

This is the travel distance distribution by income group. In the LEHD data we have three income groups, and for this analysis we decided to aggregate the upper levels and compare them with the lower income group. Here we can observe that the distributions for these two groups are very similar for all employment centers. However, there are cases where the magnitudes are significantly different. For example, in Chula Vista about 39% of low-income people are traveling up to 5 miles, but this value is 27% for higher-income people. And this also applies to Santee: about 30% of low-income people are traveling within 5 miles, but 26% of higher-income people are traveling within this distance. For Sorrento Valley and Mission Valley this observation is reversed. Lower-income people are experiencing longer distances, and we believe that this is related to the affordable housing close to Chula Vista and Santee compared to Sorrento Valley and Mission Valley.

Another analysis that will help us understand commuting behavior much better is transportation mode choice analysis. This table displays the mode shares of different transportation modes selected by employees traveling to each employment center. We got this from our SANDAG activity-based model. As expected, the highest share is for private cars, but there are some variations. About 95% of employees going to Santee are using private cars, including drive alone and carpool, but about 87% of employees going to Chula Vista are using their own cars. Mission Valley stands out for a higher percentage of transit use, and as I mentioned earlier, there's a transit center in Mission Valley serving different bus lines and trolley lines. So we believe that explains that high percentage. And also Chula Vista stands out for a high percentage of active modes, including biking and walking. Specifically, 4% of employees are walking to work in the Chula Vista employment center. And this is because a major part of the employees are living within the employment center, as we showed on the residence map. If you recall, in those maps there was a very high percentage of points concentrated in the employment center.

Summarizing the main points covered in this presentation: in the first part we tracked the historical trends of socio-demographic variables, and we tracked the changes year over year. Using that, potentially we can identify employment center tiers that have drastic changes over time. And in the second part we coupled LEHD data with other sources and focused more on the commuting behavior of employees going to four representative employment centers. As for the application of this study, we can potentially use the findings to improve the accessibility of transportation systems for all employees regardless of their age, education, and ethnicity. We can address the specific needs of low-income people, who rely more on public transit systems. And we can set more efficient policies considering age-group-specific needs. Recently SANDAG initiated a policy to offer free transit services to students, and our observations showed that it was a really efficient policy to promote sustainable transportation systems such as transit. So we think that potentially we could use all of these findings to provide more efficient plans for our region.

We are excited to share that SANDAG has launched our open data portal. In this portal users can download the data sets, and interact with and visualize them on the platform. Recently we published our new version of employment centers, and we encourage you to check it out at opendata.sandag.org and let us know if you have any follow-up questions. This concludes our presentation. We would like to thank the committee again for giving us the opportunity to present our research, and we look forward to the Q&A session. Thank you all for your time.

>> Thank you, speakers. Before we go to the questions, the presentation will be accessible on the Census Academy website at census.gov/academy under the webinar tab in a couple of weeks. The link has been added to the chat. Again, please type all of your questions in the Q&A and keep your questions pertaining to the presentation. The chat has been disabled for all attendees. And now for the first question, from Tamara Shales. Can you further explain how you identified the employment centers and what are the base geographies, for example census block groups?

>> Liang Tian: Emmanuel, do you want to answer that?

>> Emmanuel Lucban: So we do actually provide a methodology for this on the sandag.org website, but essentially, I believe the underlying data was EDD. So it was point-level data that was aggregated using, was it quarter-mile hexbins, and then a spatial clustering method was applied to that. And then from those clusters we overlaid them onto a geography that we use internally at SANDAG called MGRA, which is a master geographic reference area. So it's kind of akin to a census block group. And then we used those to define the boundaries for those employment centers.

>> The next question is from Rosanna Santana. Does the data show where people who cross over from Mexico go to work?

>> Liang Tian: Can you repeat that again? Sorry.

>> Sure. Does the data show where people who cross over from Mexico go to work?

>> Sadra Sharifi: Not really, because LEHD data doesn't cover commutes from Mexico to the U.S.

>> Thank you. Next question from Mark G. What data is available to apply your analysis methods to other geographies? Can you repeat it again?

>> Liang Tian: Yes please.

>> What data is available to apply, excuse me, what data is available to apply your analysis methods to other geographies?

>> Liang Tian:
I think I can start. I think we use a lot of different data sources for this, right? We have a lot of publicly available data, for example from Census LEHD. We also have the California EDD data as well. And then there's also our SANDAG modeling data. We basically used a combination of different data sources and applied it to this study. I'm not sure if that answered the question here. I'm not sure also, Sadra, if you want to add something to this.

>> Sadra Sharifi: Yeah. Basically, I believe that we can use LEHD data, which is publicly available. But Replica is not publicly available, so technically you could use Replica, but you will need to have a license to use it.

>> Thank you. Next question's from Stephanie Benson. It's how do you access the LEHD data? Excuse me. If possible, I will provide that link in the chat. The next question is from Phillip D. His question is, would you elaborate on the additional data sources and how they, excuse me, how they are added to the study? Again, would you elaborate on the additional data sources and how they are added to the study?

>> Sadra Sharifi: Yeah. If I understand correctly, the question is about Replica and the SANDAG activity-based model. So basically, in the first part we used the direct distances from each origin to the workplaces, but in the second part we wanted to examine the actual travel distances on the transportation network. And this is something that we extracted from the Replica data source, because it doesn't exist in the LEHD data source. So we coupled Replica with LEHD to relate actual travel distances to other socio-demographic variables. And we also used the SANDAG activity-based model just for the mode choice analysis, since we don't have any similar data in LEHD.

>> The next question is from Stephanie Benson. How recently was the data updated?

>> Liang Tian: All right. Emmanuel, do you want to talk about the LEHD data?

>> Emmanuel Lucban: So the first part of the analysis, this is actually an analysis we did a little over a year ago. So at the time we were using LEHD 7, I think it was, yeah, 7.5. But for the second part of the analysis we were using LODES version 8, or I think

>> Sadra Sharifi: Version eight. Yeah.

>> Emmanuel Lucban: I think up to 2021. But we decided to stick with 2019 because we wanted to give an analysis that was before COVID or, you know, without the effects of COVID.

>> Sadra Sharifi: Yeah, but the latest LEHD I think is 2020, if I'm correct.

>> Liang Tian: Yeah. That's right, Sadra.

>> Earlene Dowell: 2021.

>> Sadra Sharifi: 2021. Yeah.

>> Okay. Our next question is from Edward Sullivan. Has SANDAG used LODES data to analyze patterns of travel to the employment centers by industrial group?

>> Liang Tian: Is it by different industrial sectors? Or can you repeat the question?

>> Has SANDAG used LODES data to analyze patterns of travel to the employment centers by industrial groups?

>> Sadra Sharifi: Yeah. That's a regular suggestion. No. We didn't differentiate between industries yet.

>> Liang Tian: But I think, Sadra, on some of the slides we did show the difference by different tiers. Right? And different tiers have a different level of concentration by industry groups. Right? So I think we can possibly infer some information from there, which again is also on the
SANDAG open data portal. And we do have analysis by different tiers and by each different center. So I would suggest maybe going there, taking a look, and then if you still have further questions, email us and we can definitely help you understand that better.

>> That's the last of our questions.

>> Earlene Dowell: Okay. Great. So thank you everyone for joining us this afternoon, and thank you to our speakers from SANDAG for their excellent presentation. Emmanuel, Sadra, and Liang, is there anything you would like to say before we close?

>> Liang Tian: Sadra? Emmanuel?

>> Sadra Sharifi: Not really. Thank you again.

>> Emmanuel Lucban: Just wanted to say thank you for the opportunity to present for the

>> Earlene Dowell: Great. Thank you, guys.

>> Liang Tian: Same here. Yeah. And always check out our SANDAG resources online and reach out to us with any questions.

>> Earlene Dowell: Great. So this concludes the February LED webinar. Please join us next
month on the third Wednesday of the month, March 20, at 1:30 PM Eastern time, when Nidaal Jubran presents New Enhancements of the Census Business Builder. Be sure to register early for this webinar. Upon exiting this webinar, you will receive a pop-up for an evaluation before you log off. We would appreciate it if you take the time to answer the questions so we may better serve you in future webinars. On behalf of the U.S. Census Bureau and the LED partnership, thank you for joining us. Until next time, enjoy the rest of your afternoon and thank you for spending your time with us.

>> Coordinator: This concludes today's webinar. Thank you for your participation. You may disconnect at this time.
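
The tier thresholds and the straight-line proximity measure described in part one of the talk can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SANDAG's implementation: the function names are hypothetical, the treatment of values that fall exactly on a tier boundary is a guess, the use of job counts as weights for the weighted average is assumed, and the haversine great-circle formula stands in for the block-to-block Euclidean distance mentioned by the speakers.

```python
import math

# Lower bounds (employees) for each tier, as quoted in the talk.
# Assumption: a center sitting exactly on a bound goes to the higher tier.
TIER_LOWER_BOUNDS = [(75_000, 1), (25_000, 2), (15_000, 3), (2_500, 4)]


def assign_tier(employees):
    """Return the tier (1-4) for an employment center, or None if too small."""
    for lower_bound, tier in TIER_LOWER_BOUNDS:
        if employees >= lower_bound:
            return tier
    return None


def haversine_miles(lat1, lon1, lat2, lon2):
    """Straight-line (great-circle) distance in miles between two points.

    Assumption: this stands in for the Euclidean origin-block to
    destination-block distance described in the talk.
    """
    earth_radius_miles = 3958.8
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * earth_radius_miles * math.asin(math.sqrt(min(1.0, a)))


def weighted_mean_proximity(pairs):
    """Job-weighted mean distance.

    pairs is a list of (distance_miles, job_count) tuples, one per
    origin-destination block pair. Returns None when there are no jobs.
    """
    total_jobs = sum(jobs for _, jobs in pairs)
    if total_jobs == 0:
        return None
    return sum(dist * jobs for dist, jobs in pairs) / total_jobs
```

With block centroids and job counts from an origin-destination table, one would compute a distance per block pair with `haversine_miles` and then pass the resulting list to `weighted_mean_proximity` for each employment center or tier.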
