
Data Scientist | Analytics Nerd | Pythonista | Professional Question Asker |

A beginner’s guide including mini-tutorial in Python.

It’s project week here at the Flatiron School data science bootcamp and we’re pulling US Census data!

There are about a gajillion (give or take) websites, links, resources, webinars, and raw data files covering census data, and it would take you days to go through them all. Lucky for you, I spent that time so you don’t have to, and I’ve compiled my favorite resources to get you started! My goal is to make the process as simple as possible so you can get to the good stuff: playing with some data!

As a newbie (literally new-born) Python programmer and data science…

A tutorial for beginners setting up their first SQL environment

Yes, you need to learn SQL

You’ve heard it before, I’m saying it again: If you’ve studied to be a Data Scientist, there’s a good chance you’ll need SQL to break into your first job. There are tons of resources for learning SQL (referred to by its letters S-Q-L or pronounced “sequel”), but I kept hearing about a book titled Sams Teach Yourself SQL in 10 Minutes a Day* by Ben Forta to get started. It shows up on practically every “best of” SQL book list, and is one of the best-selling SQL books of all time (1/4 of a million copies sold holds some…

How to web scrape without getting blocked

Before you begin your first web scraping mission, we should talk about a few things to keep in mind, especially if you’re thinking of scraping a ton of data.

Some websites aren’t terribly keen on the idea of web scrapers sweeping through and gathering all of their data, so they may have anti-scraping mechanisms in place. This could result in your IP address being blocked, or in your user credentials being flagged and your account locked out. While there are articles to address this, most have an overwhelming amount of information, and not many with specific…
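One common courtesy that lowers the odds of tripping anti-scraping mechanisms is pausing between requests and identifying your scraper honestly. The block below is a minimal sketch of that idea, not necessarily the approach the article goes on to describe. The names `polite_delay`, `fetch_page`, `scrape_politely`, and the `USER_AGENT` string are hypothetical, and the two-to-three-second wait is an assumed starting point rather than a rule.

```python
import random
import time
import urllib.request

# Hypothetical identifier; replace with something that describes your project.
USER_AGENT = "example-scraper/0.1 (contact: you@example.com)"


def polite_delay(base_seconds=2.0, jitter_seconds=1.0):
    """Return base_seconds plus a random extra of up to jitter_seconds."""
    return base_seconds + random.uniform(0.0, jitter_seconds)


def fetch_page(url, user_agent=USER_AGENT):
    """Download one page, sending an explicit User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read()


def scrape_politely(urls, fetch=fetch_page, sleep=time.sleep):
    """Fetch each URL in order, waiting a randomized interval between requests."""
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # Wait before every request except the first one.
            sleep(polite_delay())
        pages.append(fetch(url))
    return pages
```

Calling `scrape_politely(list_of_urls)` would fetch the pages one at a time. The `fetch` and `sleep` parameters exist only so the loop can be tried out with stand-in functions before pointing it at a real site.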

What’s a Modal?

A modal is a dialog box or popup window on a website that is displayed on top of the current page. Modals aren’t present on every page, but you’ve definitely seen them before. They’re what appears when you click a “Read more” button or the little caret that reveals a popup or hidden element.

In web scraping, modals aren’t terribly tricky, but they can be annoying if you’ve never seen one and don’t know how to deal with them. For one, the site’s url doesn’t change when you click on them, so it’s not like you can point the driver to a unique url…

Gathering variables for a data frame

When web scraping (or for any project, for that matter), I always like to tackle one piece at a time. This makes debugging your code so much easier. That is why I’ve broken down this web scraping tutorial into several parts. In this article, we will walk through the process of retrieving variables with our web scraper and saving them into a data frame.

This article explains the code starting at line #29 of my web scraping function (referenced at the end of this article), beginning at “Gathering Variables — Main Page.” …

Getting started: gathering a list of links.

<summoning my best podcast voice> Hello! And welcome back to another edition of my web scraping series where we learn how to web scrape mission statements and ratings from from! Last week I did an overview of the web scraping function that I used to gather information from 3000+ company URLs. This week I’d like to start breaking it down into bite-sized chunks. This article will explain lines 1–28 of the function, which handle creating a list of links.

Overview including code along in Python

In response to a summer of unrest surrounding racial injustice in 2020, more companies began promising the public that they would “do better” in terms of diversity. This made me wonder: Is there a way to look at how a company describes itself and determine whether that company lives up to its self-proclaimed diversity standards? As a data scientist, I can’t help but want to explore quantifiable ways to prove or disprove statements such as these.

In order to answer this question of accountability, I went to, a website you’re probably familiar with if you’ve ever wanted to…

Including Python tutorial using cv2

Have you ever wondered how a self-driving car can “see” in order to autopilot? It relies on image segmentation, a task that can be tackled with machine learning clustering methods. Let’s break down some simple concepts that make the algorithm possible.


First, some terminology:

Clustering is a technique for grouping data points with similar characteristics so that natural groups in the data can be identified. This can be useful for data analysis, recommender systems, search engines, spam filters, and image segmentation, just to name a few.

A centroid is the point at the center of a cluster. It is the average of the cluster’s members, so it need not be one of the original data points.

K-Means is a clustering method…
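To make the terms above concrete, here is a minimal sketch of K-Means on a handful of 2-D points in plain Python. It is an illustration under stated assumptions, not the cv2 code from the tutorial: the helper names (`squared_distance`, `mean_point`, `k_means`), the sample `points`, the iteration cap, and the random seed are all made up for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points (the centroid)."""
    count = len(points)
    return tuple(sum(coords) / count for coords in zip(*points))


def k_means(points, k, iterations=20, seed=0):
    """Group points into k clusters; returns (centroids, clusters)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: squared_distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # Nothing moved, so the assignment is stable.
        centroids = new_centroids
    return centroids, clusters


# Two visibly separate blobs of made-up sample data.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
```

For image segmentation the same loop would run over pixel color values instead of 2-D coordinates, with each pixel then recolored to match its cluster’s centroid.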

Anyone who has even thought about learning to code has already heard it a thousand times over: Learning to code is like learning a new language. Ok sure, but what does that even MEAN? Obviously we call Python, JavaScript, and Ruby coding languages, but is there more to it than that? Well yes, my young Padawan. As you may have guessed from this article’s title, the expression can be taken quite literally.

In the very beginning, learning a new language is so exciting! After all, you’re learning a new way to communicate. …

A glimpse of my beginnings into the world of Data Science.

Hi there! My name is Cierra, and I am a numbers-loving, problem-solving communications expert by nature.

First, a little background:

I’ve always been motivated by helping people, and so long ago in a galaxy far away, I went to school to become a language teacher (more about that in a later blog!). Unfortunately, the year I graduated, education budgets in my home state of Texas were being cut. Teachers were being let go, not hired. …

Cierra Andaur
