ALTERNATE UNIVERSE DEV

Mentoring Developers

Episode 91

Greg turned to data science after finishing a Ph.D. in philosophy and not finding the academic work he wanted. He joined a data science boot camp and then landed a job teaching data science at the Flatiron School. Watch this interview as he describes his experience.

YouTube: https://youtu.be/vbSbKGwCUcA

Greg’s Bio:

Greg Damico is a Lecturer in Data Science at the Flatiron School and has been working there since March 2019. He hails originally from Columbus, OH, but has been in Seattle since 2013. He has an extensive background in academia and has taught and studied various subjects in addition to data science at a number of schools both in the Midwest and on the West Coast. He turned to data science for good in 2016. He is passionate about data science because of its ever-growing role in our daily lives, and he is passionate about education because of the way it empowers people to change their lives.

Episode Highlights

Audio Transcription

Episode Announcement:

You’re listening to Mentoring Developers, Episode 91. Let’s go.

Intro:

Welcome to Mentoring Developers, the podcast for new and aspiring software developers, where we discuss your struggles, anxieties, and career choices. And now, here’s your host, Arsalan Ahmed.

Guest Bio:

In this episode of Mentoring Developers, I’ll be talking to Greg D’Amico. Greg is a lecturer in data science at the Flatiron School, and he’s been there for a long time. This interview was recorded in 2021. He’s originally from the Midwest, in Ohio, and he’s settled out West in Seattle, so he has experience in both the Midwest and the West Coast. He’s taught everywhere. His passion is data science. Today we’ll be talking to him about data science, about the Flatiron School and other boot camps, and about what it’s like to be in a boot camp as a mentor, as a teacher, and also as a student. I think this is a great little interview for people who are interested in boot camps, who are interested in data science, and who want to see what it’s like to be in a boot camp learning data science, especially from the point of view of a person on the other side of the aisle. I think you’re going to enjoy this. Let’s get ready and get started.

Arsalan Ahmed:

How are you doing, Greg?

Greg D’Amico:

I’m doing well, Arsalan. Thanks for having me on the show.

Arsalan Ahmed:

How did you decide to become a data scientist?

Greg D’Amico:

It turned out to be a very significant letter for me. I was in philosophy graduate school. I did finish my PhD, but I wasn’t finding the work that I wanted after that was done. It’s very difficult in academia to get the high-powered job that a lot of people are looking for. I moved to Seattle, and after I got to Seattle, I decided that I wanted to go back to school. I did have an old physics degree rusting on my shelf somewhere. I thought, “Well, I’ve got some math skills. I think I could go do this applied mathematics program at the University of Washington.” Shortly after I started that, an old friend of mine from philosophy grad school, Mike, said, “Hey, have you studied any R?” I started looking into it and I thought, “Wow, this R language looks really cool.” I don’t have a lot of programming background. I’ve done a little bit here and there. But I started looking into R. I took a course on R. I realized immediately its potential for data analysis and data science. That was my entry. I started reading a bunch of things about data science. I jumped into a boot camp for data science. Before long, I had finished the boot camp and I was able to land a job teaching data science at the Flatiron School. That was my entry into the tech world. It’s never too late to start. It’s never too late to jump into tech. I think sometimes people are scared to make that transition. Obviously, teaching a boot camp in data science, I meet a lot of people who are scared. They’re making these large career changes. They’ve been doing one thing and they’re thinking, “Well, data science seems like something I could do. Maybe my background is quite different and I’m scared to do it, but I feel like I’ve got a chance.” They really do. There’s a lot you can learn in a short time. There are lots of resources about R, lots of resources about Python. Python is actually the language of choice at Flatiron, so I work mostly in Python these days. But it’s one step at a time.
There’s always a lot to learn, but you don’t have to learn a lot to be able to make some cool things, to be able to start on some cool projects. Once you can do that, well, then you can get in touch with other people who are working on things. You can share ideas, you can share your own work, and then you’re right there in the community.

Arsalan Ahmed:

A question that a lot of our listeners would have right now is: is data science really that useful? Why should I learn it?

Greg D’Amico:

I think data science has arisen largely because of lots of technological improvements to do with data. We are now incredibly good at storing data, at producing data, at having accurate data recordings. You can look online and find stats about exactly how much data is produced every day. I think it’s on the order of quintillions of bytes or something like that, just huge amounts of data. Lots of companies have their own data to worry about these days. Take Amazon, for example. Part of the reason that Amazon is successful is that they have lots and lots and lots of data about their customers, about the things that they’ve bought, about other things that they’ve bought, about “Here’s a bunch of things that people just like you have bought. Maybe you’re interested in that, too.” If you have a bunch of data about your customers and you can access it quickly, then that proves its business value pretty quickly. I think of data science as having an immediate business angle, but there’s also this technological aspect to it. As all of these technologies have gotten better, it’s natural that we have a lot of data. We have to have some understanding of how we can store that data, how we can access it, how we can share it, all that stuff. For data science itself as a discipline, it certainly helps to have some understanding of things like databases and where data lives and how to access it and so on. But it’s also a bit of coding, it’s a bit of mathematics, it’s a bit of statistics. When we do our boot camps, we try to cover all those bases, at least a little bit.

Arsalan Ahmed:

What does it take to be a data scientist? Can anyone be a data scientist? Do you need some certain skills, certain aptitudes? What do you think?

What I want to know about data science is this. Data science is a separate discipline in its own right, and you’re talking about using data to get some meaningful results. So if, for instance, you’re a university, you have data about your students: the students themselves, the courses they register for, how frequently they do it, in which semester, and which courses are more popular. Say you want to predict, for next fall, which courses might exceed capacity, where you may need an extra teacher because there’s so much demand, and you need to know ahead of time so you can schedule it. That’s a thing. So your data goes into some data warehouse where it’s stored in some structure, maybe tables that are probably not normalized, meaning I would expect them to have lots and lots of columns with everything in there, so you don’t have to do lots of joins, because that’s faster to read. Then you would be able to ask your system some intelligent questions and it answers them. That’s one aspect that I can think of. The other one is, “Hey, Mr. Data Scientist, I have these 100 terabytes of data. Go give me some insights. I don’t know what I’m looking for.” Which way does data science fall here?

Greg D’Amico:

I think it’s both things. Very often I think about data science as trying to solve problems with data, and those problems can take lots of forms. Sometimes they’re just straightforwardly financial things, like how can our business make more money? Sometimes they’re really more investigative things, like: I’ve got a bunch of customers and I want to do some sort of customer segmentation, because I want to have a targeted marketing campaign. I want some ads to go to some people who are likely to pay attention to those ads, and other ads to go to other people who are likelier to pay attention to that sort of advertising. So there are lots of different problems, but very often what’s happening is that the data scientist will build some sort of model, some sort of predictor. And you’re right, very often the data that you start with has some sort of tabular form. Maybe I’ve got a bunch of columns of data and a bunch of rows. Each row will represent one record, one observation: one customer, maybe, or one house up for sale. Each of the columns will be some feature of those records. So if I’ve got a bunch of houses for sale, maybe I’ve got a column that represents the number of bedrooms. Or if my rows are students at a college, then maybe one of my columns is grades for a particular quarter, or something like that. So the general idea is, I’ve got all these columns, and one of my columns is sort of privileged. One of my columns is the thing that I’m trying to predict, the thing that I’m trying to model. I’m going to use all my other columns, all the information that I have there, to try to make accurate predictions about what’s in the column of interest. If I’ve got thousands of rows or millions of rows, then it’s very difficult for a human being to pick up on the patterns that might be there.
But a computer is really fast. I just show a computer, well, here are the values that I get for these rows in these particular columns, and here are the values that I get for the column of interest. Then I say, okay, now here are some rows that you haven’t seen before, and they have these values in these columns. What do you think they’re likely to have in the column of interest? The computer builds a model and is then able to use that model to make predictions on the unseen data that is new. Then you can evaluate that model: is it good, is it accurate, and so on. That sort of thing is very often what’s at the heart of a lot of data science problems.
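
As a rough illustration of the workflow described in this answer (a sketch, not something shown in the episode): the rows below are made-up houses, the feature columns are bedrooms and square feet, and the column of interest is price. The "model" is a very simple nearest-neighbours average, chosen only because it fits in a few lines of standard-library Python; the function name predict_price and all the numbers are hypothetical.

```python
from math import dist

# Rows the model has already seen (made-up data):
# features are (bedrooms, square_feet); the target column is price in thousands.
seen_features = [(2, 900), (3, 1400), (3, 1600), (4, 2100), (5, 2600)]
seen_prices = [250, 340, 365, 450, 540]


def predict_price(features, k=2):
    """Predict the target by averaging the k most similar seen rows."""
    ranked = sorted(
        zip(seen_features, seen_prices),
        key=lambda row: dist(row[0], features),  # Euclidean distance between rows
    )
    nearest = ranked[:k]
    return sum(price for _, price in nearest) / k


# Rows the model has not seen. We happen to know the true prices,
# so we can evaluate how good the predictions are.
unseen_features = [(3, 1500), (4, 2000)]
unseen_prices = [350, 440]

predictions = [predict_price(row) for row in unseen_features]
errors = [abs(p - actual) for p, actual in zip(predictions, unseen_prices)]
mean_abs_error = sum(errors) / len(errors)

print("predictions:", predictions)
print("mean absolute error:", mean_abs_error)
```

Because the features are not rescaled, square footage dominates the distance here; a real project would normally scale the columns and use far more rows.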

Arsalan Ahmed:

So if I have to build that model, some piece of software has to do that. As a data scientist, are you writing custom code, starting from scratch and just building a model, or do you use a tool and maybe script it a little bit?

Greg D’Amico:

There are lots of tools. I think Python is probably the number one language right now for data science, and much of the reason for that has to do with the fact that, well, first of all, it’s open source. It’s an open source project with lots of people contributing, and there are lots of tools, lots of libraries that you can just import into your own workspace that already do lots of really cool data science things. So if I want to build a random forest model, well, there are random forest tools that I can just import right into my own workspace, introduce to my own particular data, and they’ll build predictions for me straight out of the box. I don’t have to create the model from scratch. Because of all these different libraries that are available, Python is a really powerful, flexible tool, and you can get models up and running with really just a few lines of code because of all the work that’s already been done.
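
To make "a few lines of code" concrete, here is a minimal sketch using scikit-learn, one of the Python libraries that provides a ready-made random forest. The episode does not name a specific library or data set, so both are assumptions: the example uses the small iris data set bundled with scikit-learn as a stand-in for "my own particular data".

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Feature columns (X) and the column of interest (y) from a bundled sample data set.
X, y = load_iris(return_X_y=True)

# Hold back a quarter of the rows so the model can be checked on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Import the tool, introduce it to the data, and ask for predictions.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy on unseen rows: {accuracy:.2f}")
```

The fit, predict, and score steps mirror the build, predict, and evaluate loop described earlier in the conversation.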

Arsalan Ahmed:

That’s great. Python is a great language for beginners, but also for people who are doing data science. But I wonder, why Python? I don’t know what Python is doing that is so different from, say, Ruby or Java. There must be something. Maybe there are some built-in libraries that do certain things, I’m assuming some math functions, that others don’t have.

Greg D’Amico:

Yeah, I think that’s right. For example, there are certain Python libraries that we introduce to our students in the first week because they will use them all the time. There’s a package called NumPy, numerical Python, and it’s basically a tool for scientific computing. It’s a tool for doing sophisticated mathematics, but it’s really fast. You can do these vectorized operations: you can add arrays together lightning quick, you can multiply them. If you want to, you can do scientific notation, you can do complex numbers, you can do trigonometry, you can do all sorts of stuff. Another tool is pandas. The name, I think, comes from something like “panel data,” and there used to be this object type called a panel. It’s not really used much anymore. But anyway, it’s a really powerful tool for manipulating tables of data. The technical term in pandas is a DataFrame. A DataFrame is basically just a big table of data, and there are lots of really powerful tools for manipulating them quickly. Adding a column is just a line of code. Multiplying a column by a number is just a line of code. Adding columns together, whatever you want to do. Filtering your data, say I’m only interested in rows that have this value, and so on. It’s just really fast [unclear].
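
The operations mentioned in this answer look roughly like this in practice. This is an illustrative sketch with made-up numbers and hypothetical column names, not material from the episode.

```python
import numpy as np
import pandas as pd

# NumPy: vectorized arithmetic on whole arrays, with no explicit loop.
a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])
total = a + b                        # element-wise addition
product = a * b                      # element-wise multiplication
z = np.array([1 + 2j]) * (3 - 1j)    # complex numbers are supported
angles = np.sin(np.array([0.0, np.pi / 2]))  # trigonometry on an array

# pandas: a DataFrame is a table with named columns (made-up houses).
houses = pd.DataFrame(
    {
        "bedrooms": [2, 3, 4],
        "bathrooms": [1, 2, 2],
        "price": [250_000, 340_000, 450_000],
    }
)

houses["rooms"] = houses["bedrooms"] + houses["bathrooms"]  # add columns together
houses["price_cents"] = houses["price"] * 100               # multiply a column by a number
large = houses[houses["bedrooms"] >= 3]                     # keep only rows matching a condition

print(total, product, z, angles)
print(large)
```

Each DataFrame manipulation is a single line, which is the point being made about speed of development.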

Arsalan Ahmed:

Really good. I can imagine there is a reason why everybody is gravitating towards Python. You can build really good websites, MVC websites, in Python. You could start off with it as your first programming language, and it’s easy enough, but it has these advanced features like this numerical library you’re referring to. That’s good. And the good news is it’s all free. If you want to get started, it’s free to get started, and the libraries are probably also free. This is the beauty of this open source ecosystem. Okay, that’s all good, but what kind of jobs can I do? If someone is listening right now, they may be thinking, “Okay, that sounds interesting but complicated, and I don’t want to commit to something where I may not see the return.” So what’s the return here? What kind of jobs are available, and what kind of experience do they need to have?

Greg D’Amico:

Yeah, it’s a good question. Because data science is still relatively young, I think there are lots of job titles that might be relevant. Most obviously, things like data analyst and data scientist, but maybe also business analyst, data engineer, quantitative researcher or something like that, statistician, applied statistician, machine learning engineer. So there are lots of different titles available. I think any boot camp that’s worth its salt should at least prepare students for a data analyst role. If you’re tackling a data analyst role, probably you’re looking at a healthy amount of data visualization and a healthy amount of interacting with databases using SQL, maybe some NoSQL databases as well if you have some unstructured data. But all that stuff, I think, is really within reach. It’s a matter of learning the fundamentals of some of these tools, which I think you can do in 15 weeks if you’re dedicated to the study of these things. Learning the fundamentals can go a long way, and you can do it pretty fast.

Arsalan Ahmed:

That’s good to know. So, in other words, say I know databases and I know my way around: I can write queries, and I’m comfortable with SQL databases, let’s say SQL Server or Oracle or Postgres or MySQL, one of those. I also know a little bit about MongoDB or NoSQL, the unstructured sort of databases, which scale better. I kind of know both; I’m not an expert, but I know enough. Then I learn Python, just the ins and outs of a few basic concepts: the syntax, the loops and so on, how to make a function, how to make something reusable, things like that. Then I look at the numerical libraries and some other very data-science-specific libraries, how to call them, what types of errors I get, and so on and so forth. Now, I know that you would probably do a lot more in an actual Flatiron data science boot camp, but overall you can get a few things in. So now I’ve graduated, I’ve done my 15 weeks. Can I actually get a job right now, or do I need to do further work? Do I need to work for free for a while, in internships? How do I actually get a paid job?

Greg D’Amico:

Yeah, so I certainly can’t report that everybody who graduates from Flatiron has a job within a few weeks. That’s not reality. But I can say that we do see graduates getting jobs, and usually it’s on the order of three months, something like that, maybe three to six months, sometimes faster. Sometimes students will get a paid internship or temporary work; they’ll have contract jobs. That’s certainly always possible. So I think it’s realistic. We have a Slack channel devoted to students getting jobs, and we’re constantly updating it and constantly remarking on it, like, “Hey, so-and-so landed a job, that’s really great.” It’s really our North Star. We’re in the business of making sure students get jobs. We’re not interested in giving students a hard time or giving them lots of work to do. We’re interested in getting them a job.

Arsalan Ahmed:

So obviously there is demand for data science. The reason we are even having this conversation, and the reason you got the job as an instructor, is that there are people who want to learn, and it is very, very relevant these days, when we have data everywhere, as you said, and we want to get some insights out of it. Some large companies definitely use it, but other, smaller players probably could use it as well. That’s why it’s growing: people are waking up to the reality that this could be useful or a competitive advantage, and they don’t want their competitors to have all the advantage. So it’s going to grow; it has to grow. But as a young person, I know that there are jobs as software developers, just straight-up full stack developers. I know there are jobs in infrastructure and networking, in database management, even testing. Lots of jobs are out there. So what type of person, or young person, should pick data science over, let’s say, full stack development? Do you think you need to be a certain type of person, with a certain way of working or learning, to succeed in data science?

Greg D’Amico:

Yeah, I think I would say that there obviously is quite a bit of overlap in the skills required between data science and other types of software engineering or web development roles. I think data science especially caters to people who just have a natural curiosity about things. They’re interested in solving problems. Netflix works in part because it solved a problem. How many people watch a TV show and then wonder, “Well, I finished that. What should I watch next?” Netflix has a really good way of answering that, because they can say, “Well, here are all these other people that we know about. They watched that show that you just watched. They also watched all these other things. Probably you’d be interested in those too.” Then it’s easy for you to say, “Oh, well, yeah, I guess I’ll give that a shot.” And it’s really effective, because tastes are often not as fine-grained as the thousands or millions of movies and TV shows that exist. It’s like, I have a taste for westerns. So if I liked that western, and all these other people who watched that western also watched this other western, I should give that a shot. So if you have an interest in, or a sympathy for, solving that kind of problem and working with that sort of issue, then I think data science is for you.
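
The "people who watched this also watched that" idea can be sketched in a few lines of Python. This is only a toy illustration of the reasoning in the answer above, with made-up viewers and titles and a hypothetical function name (also_watched); it is not a description of how Netflix's actual recommendation system works.

```python
from collections import Counter

# Made-up viewing histories: viewer -> set of titles watched.
watch_history = {
    "ana": {"Western A", "Western B", "Drama X"},
    "ben": {"Western A", "Western B"},
    "cal": {"Western A", "Comedy Y"},
    "dee": {"Drama X", "Comedy Y"},
}


def also_watched(title, history):
    """Count what else was watched by everyone who watched `title`."""
    counts = Counter()
    for titles in history.values():
        if title in titles:
            counts.update(titles - {title})  # every other title this viewer watched
    return counts.most_common()  # most frequently co-watched titles first


recommendations = also_watched("Western A", watch_history)
print(recommendations)
```

Here the other western comes out on top because two of the three people who watched "Western A" also watched it.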

Arsalan Ahmed:

Okay, so good. I’m imagining a situation where someone starts off as a developer and it’s getting boring, not exciting anymore. They can say, “Let me try data science. I know Python already, I know databases. I don’t know anything about data science, but I’ll just go study on the side, on nights and weekends. I’ll find some resources, blogs, books, whatever, and I’ll do something for six months, and then I’ll try my luck. Maybe I’ll get a job as a data scientist.” Or somebody started as a data scientist, they just didn’t like the work, and they would say, “Well, I know Python now, I know a little bit about databases. Let me see if I can be a full stack developer.” So you could switch between those, right?

Greg D’Amico:

Definitely, definitely. Again, I do think there’s a fair bit of overlap there. Certainly some of the lectures for the data science students here would be useful for full stack students, and vice versa. So yeah, I think once you’ve started opening your mind up to all the different tools that tech has to offer, then it shouldn’t be a big deal to try to switch between the different levels.

Arsalan Ahmed:

Well, with boot camps, what if someone doesn’t have the money right now, but they are interested and they want to pursue this? What are some free resources that they can use?

Greg D’Amico:

Yeah, one of my favorites is called codewars.com. I don’t know if people know this, but it’s basically just coding practice.

Arsalan Ahmed:

I thought it was a first-person shooter game, but no. Okay, so it’s like LeetCode.

Greg D’Amico:

Yeah, it’s got this kind of Asian martial arts theme to it, but basically it’s a bunch of problems, like “write a function that does blah blah,” and you can do this in just about any language that you can think of. I forget how many languages are supported, but I think it’s about 50 or something at this point. So you can click on Python, you can click on R, you can click on other data science languages or something else. You could click on C if you want to get closer to the machine, as it were. There are lots of cool problems. You can just work on building up your understanding of syntax, and if you don’t get it right, it’ll show you what sort of problems you’re getting. It’ll print out error messages for you. So it’s a really useful, really nice way to level up your coding ability. I think there are also lots of e-books and other resources online about things like Python. Stack Overflow, probably everybody says this, but Stack Overflow is a really nice resource. Lots of people post questions like, “Hey, I’m getting this sort of error,” or, “Why is it that this seems to work and this thing doesn’t?” and lots of smart people answer. There are lots of really good resources there too. So I think if somebody were really interested in data science and was really starting from scratch, you could just Google “data science 101” or something like that, and you would find some good resources. There’s just so much out there on the web right now. I think it’s pretty easy to get started.

Arsalan Ahmed:

Yeah, I can imagine. One of the best ways of switching careers, or moving up, or making a kind of sideways transition, is to find something else that is happening in your own company right now. Say you’re working for a large corporation and you’re doing QA, which is fun, but you want to do something more exciting. Maybe you want to go into data science and do some research. Maybe there is a data science department. People who are already working at the company have an easier time getting in. A lot of times you may not be able to get the job that you really want, but you could shadow people, you could make friends there, you could start having lunches or coffee breaks over there, and you can pick things up. They may be able to refer you to another company that is hiring, or maybe just let you work on something so you can put something on your resume, because I think that would be the most important thing. In general software development this is key: you need lots of projects on your resume. That shows not only that you have done some work and that you have finished projects, but also that you can handle it. The problem with data science, it seems to me, is that you need big data, and if you’re in high school, you don’t have big data, right? You’re not Procter & Gamble, which needs to sell to consumers. So that would be the problem. I think it would not necessarily be super easy to get working on it and actually produce and show amazing results unless you’re working in that kind of environment. I don’t really know, but I’m assuming there might be some open source data sets or something that people can just use to hone their skills a little bit.

Greg D’Amico:

Yeah, there are some good projects that I’ve seen that were done on small data sets. It’s still possible if the data sets are interesting enough, definitely. So that’s definitely a possibility. There are also lots of cloud services that people can use to handle big data. So maybe I can’t hold this whole data set on my machine, but I can tap into virtual machines courtesy of Google or Amazon or something.

Arsalan Ahmed:

Do you think this is really tied to machine learning? I hear a lot about this, where you can show a picture to a program and it can show you similar pictures, because you have trained it using artificial intelligence, neural networks, and all of that good stuff. It can see patterns once it knows what you’re looking for; once you train it, it can see patterns. Would you call that data science?

Greg D’Amico:

Oh yeah, I’d say that’s a branch of it too, because ultimately it’s really just another type of model. A neural network is a model; it’s just a really fancy sort of model. I’ve introduced the machine to thousands of images, maybe millions of images; there are networks like this around. And I feed it labels for the different images. I say, “I just gave you an image of a cat. That’s a cat. Here’s an image of a dog. Here are 10,000 more images of cats. Here are 10,000 images of dogs and airplanes and houses and whatever else.” Well, there are lots of ways that you could depict a cat. But if you have a nice diversity of cats, where some of them are standing up, some of them are lying down, some are awake, some are asleep, some are black, some are white, et cetera, then what the computer is going to learn about what constitutes a cat is not just some particular arrangement of pixels. It’s some very complicated pattern. It’s, “Well, it might look sort of like that, or maybe they’re more like this.” The pixels of the image are basically themselves the columns of data that I’m building my model on, and then the same principles apply. Pretty cool.
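
One way to picture "the pixels are the columns" is to flatten each image into a single row of a table. The sketch below uses NumPy with randomly generated stand-in images and made-up labels; it shows only the reshaping idea and does not train a network. In practice, image models often keep the two-dimensional layout instead of flattening it.

```python
import numpy as np

# Four made-up 8x8 grayscale "images" (random pixel values from 0 to 255).
rng = np.random.default_rng(seed=0)
images = rng.integers(0, 256, size=(4, 8, 8), dtype=np.uint8)

# One label per image: this is the column of interest.
labels = np.array(["cat", "dog", "cat", "airplane"])

# Flatten each image so that every pixel becomes one feature column.
table = images.reshape(len(images), -1)

print(table.shape)   # one row per image, one column per pixel
print(labels.shape)  # one label per row
```

After this step the data has the same rows-and-columns shape as the house or student tables discussed earlier, which is why the same modelling principles carry over.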

Arsalan Ahmed:

All right, people will want to get in touch with you, so how can they do that?

Greg D’Amico:

Yeah, you can certainly find me on LinkedIn if you look for Greg Damico at Flatiron. I’m also g-a-d-a-m-i-c-o on GitHub. So I’m happy to have people reach out to me.

Arsalan Ahmed:

Thank you so much for being part of the show and helping us.

Greg D’Amico:

Thank you so much, appreciated.

Arsalan Ahmed:

So, I have been working on an online space, an online academy, where you get mentorship and a really good technical and non-technical education, but also a caring environment where people care about you and you feel safe, where you feel like you will make it because there are people who actually have empathy toward you. They’re going to help you succeed because they have all been there. So stay tuned for that. This is going to be pretty big in 2023. Hopefully, we will start with a nice, good cohort of people, building up steam as we go. All right, people, this has been a pleasure. See you later.

Outro:

For show notes and transcripts, visit us at mentoringdevelopers.com.

Show Notes:

List of topics
  • The reason behind the success of Amazon.
  • How Data Science takes businesses to the next level.
  • How Data Science helps to improve education for students.
  • Which tools do Data Scientists use for their work?
  • Why are people gravitating toward Python?
  • Can a programmer become a Data Scientist?

Important Links

Flatiron School: https://flatironschool.com/
Codewars: https://codewars.com
My GitHub: https://github.com/gadamico
My LinkedIn: https://linkedin.com/in/greg-damico
Stack Overflow: https://stackoverflow.com

YouTube Channel: https://www.youtube.com/@itsarsalan

Thanks for Listening!

Do you have some feedback or some advice for us or our audience? Please give us a review on iTunes, Spotify, Google Podcasts, or Stitcher and share your thoughts.

If you found this episode useful, please go ahead and share it with your friends and family. You can also listen directly and give your feedback on the website.

You can subscribe to Mentoring Developers via iTunes, Stitcher Radio, Spotify, or Google Podcasts.

Episode source